Jun 7

Functional Form Misspecification in Causal Inference

Jamilla Cooiman, Founder Causal Academy

In predictive modelling, the role of functional form misspecification is usually quite intuitive. We have an outcome we want to predict, and we use input variables to build a model that predicts that outcome as accurately as possible. We may believe that the relationship between the outcome and the input variables is linear, nonlinear, smooth, discontinuous, or driven by interactions. The model we choose imposes some structure on that relationship.

If the structure we impose is wrong, our predictions may become less accurate. For example, if the true relationship between customer tenure and monthly spend is highly nonlinear, but we force the model to represent it as a straight line, the model may systematically underpredict spend for some tenure groups and overpredict it for others. In that setting, the consequence of functional form misspecification is relatively easy to understand: the model does not describe the relationship between the inputs and the outcome well enough, and the predictions suffer.

In causal inference, the role of functional form misspecification can feel less straightforward.

The reason is that our goal is usually not prediction for its own sake. We are not trying to predict an outcome as accurately as possible. We are trying to estimate a causal effect. We want to know what would happen to an outcome if we changed a treatment, such as a product price, a ranking rule, a delivery option, a service level, or a credit limit.

At the same time, many causal analyses still rely on predictive models in the estimation part of the analysis. We may model the outcome as a function of the treatment and covariates. We may model the probability of receiving the treatment given certain covariates. Or we may use machine learning models as part of a larger causal estimation procedure. This can make the role of functional form misspecification harder to understand. If prediction is not the final goal, but predictive models are still part of the analysis, then when does misspecification actually matter for the causal effect estimate?

In this blog post, I want to go deeper into that question.

The Role of Models in Causal Inference

Let’s start with the bigger idea: why and when do we use models in causal inference?

In causal inference, the first step is not to choose a model. The first step is to define the causal quantity we want to estimate and think about whether it can be identified from the data we have. Only after that does the modelling question arise: what role, if any, does a model need to play in estimating that causal quantity?

A useful distinction here is the distinction between randomized and observational settings.

In a randomized setting, the treatment is assigned randomly. For example, a company may randomly assign some customers to receive a marketing email and others not to receive it. Because the treatment is randomized, the different treatment groups are comparable on average. This means that, in principle, a simple comparison of average outcomes between the groups can estimate the causal effect for the experimental population.

Nevertheless, this does not mean models are never used in randomized experiments. In practice, we may still estimate the treatment effect using a regression model, such as a linear or logistic regression, where we include additional variables besides the treatment. For example, we may regress a spending outcome on whether someone received a marketing email, alongside (or interacted with) pre-treatment covariates such as customer tenure, historical spend, previous engagement, region, or device type.

In this setting, the role of the model is usually not to remove confounding bias, because randomization already handled that. Instead, the model is often used to improve statistical precision, account for chance imbalances, or estimate effects for specific subgroups.

Observational settings are different. In these settings, the treatment is not randomly assigned. Instead, it occurs through business rules, customer choices, operational decisions, or natural behaviour. Customers may receive a certain service level because they are high-value customers. Products may receive a different ranking because they already have strong sales performance. Accounts may receive extra attention because they show signs of risk. Regions may adopt a new process earlier because they differ in size, maturity, or local demand.

In these cases, units with different treatment levels may already differ before the treatment is applied. If those pre-existing differences are also related to the outcome, a simple comparison between units with different treatment levels will not isolate the causal effect. The observed association between the treatment and the outcome may partly reflect the effect of the treatment, but it may also reflect differences between the kinds of units that end up at each treatment level.

This is where models often become important.

In observational analyses, we rely on an assumption such as unconfoundedness. This means that, after conditioning on a sufficient set of observed variables, units with different treatment levels are comparable. Under this assumption, models are often used to adjust for those observed variables, so that the remaining association between the treatment and the outcome can be interpreted causally.

There are several ways this can happen.

One approach is to model the conditional expectation of the outcome directly. Here, we estimate the expected outcome given the treatment and the set of covariates we need to adjust for. In practice, this often means fitting some form of regression model of the outcome on the treatment and those covariates.

Another approach is to model treatment assignment. For example, in the case of a binary treatment, we estimate the probability that each unit receives the treatment given its covariates. We can then use these estimated probabilities for weighting, matching, or other forms of adjustment.

A third approach is to model both the outcome and the treatment assignment process. For example, double machine learning procedures often estimate outcome relationships and treatment assignment relationships as intermediate steps. These intermediate estimates are then combined in a way that estimates the causal effect of interest.

So models can enter causal inference in different ways. Sometimes we model the conditional expectation of the outcome given the treatment and covariates. In randomized experiments, this is often done to improve precision or estimate more detailed effects. In observational settings, this kind of model is often part of the adjustment strategy used to remove confounding. Sometimes we model the probability of receiving the treatment given covariates, for example to construct weights or match units together. And sometimes we use procedures that estimate both outcome relationships and treatment assignment relationships as intermediate steps. In all of these cases, our modelling choices determine how these relationships are represented in the estimation procedure.

How Functional Form Misspecification Affects Causal Estimation

Now we can return to the main question: how does functional form misspecification affect causal effect estimation?

The answer depends on which relationship we are modelling, why we are modelling it, and how the model is used in the causal estimation procedure.

Let’s start with the case where we model the outcome directly. This is what happens when we estimate the conditional expectation of the outcome given the treatment and a set of covariates. Empirically, this often looks like a regression model where the outcome is modelled as a function of the treatment and the variables we want to adjust for.

Once we have fitted this kind of model, we can use it to estimate treatment effects by comparing predicted outcomes under different treatment levels. So, for the same covariate values, we can predict what the outcome would be under one treatment level versus another. The difference between these two predicted outcomes gives us a model-based treatment contrast for units with those covariates. We can then average these contrasts over the relevant population to estimate an average treatment effect.
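The predict-and-average logic above can be sketched in a few lines. This is a minimal illustration on simulated data, not a recommended workflow: the variable names (`tenure`, `email`, `spend`), the data-generating process, and the choice of a linear model with one interaction are all assumptions made for the example.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 20_000

# Simulated data; every name and number here is an illustrative assumption.
tenure = rng.uniform(0, 10, n)                        # covariate X
p_treat = 1 / (1 + np.exp(-(tenure - 5) / 2))         # treatment depends on X
email = rng.binomial(1, p_treat)                      # binary treatment T
effect = 2 + 0.6 * tenure                             # unit-level effect varies with X
spend = 20 + 3 * tenure + effect * email + rng.normal(0, 2, n)

def design(t, x):
    """Outcome-model design matrix: intercept, T, X, and the T-by-X interaction."""
    return np.column_stack([np.ones_like(x), t, x, t * x])

# Fit E[Y | T, X] by ordinary least squares.
beta, *_ = np.linalg.lstsq(design(email, tenure), spend, rcond=None)

# Predict every unit's outcome under both treatment levels, holding X fixed.
y1_hat = design(np.ones(n), tenure) @ beta
y0_hat = design(np.zeros(n), tenure) @ beta

# Unit-level contrasts, then their average over the sample.
contrasts = y1_hat - y0_hat
ate_hat = contrasts.mean()
true_ate = effect.mean()
print(f"estimated ATE: {ate_hat:.2f}, true sample ATE: {true_ate:.2f}")
```

Here the fitted model happens to match the simulated data-generating process, so the averaged contrast lands close to the true average effect. The rest of the discussion is about what happens when that match fails.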

This logic depends on the model representing the relevant conditional expectation function well enough. The model is supposed to capture how the outcome varies with the covariates and then isolate the remaining difference in expected outcomes across treatment levels. Under the right causal assumptions, that remaining difference can be interpreted causally.

Functional form misspecification means that the structure imposed by the model does not match the true structure of this conditional expectation function. For example, the model may assume that a covariate has a linear relationship with the outcome when the true relationship is nonlinear. Or it may ignore an interactive relationship that is actually important.

This can matter because a misspecified model may not correctly account for how the outcome varies with the covariates. If the part of the outcome variation that the model misses is also related to treatment assignment, then the estimated treatment contrast can pick up more than the causal effect. It may partly reflect systematic differences between units with different treatment levels that were not properly adjusted away.

This is the important nuance. Misspecification is not problematic simply because the model predicts the outcome imperfectly. It becomes problematic when the model fails to capture outcome variation that is associated with the adjustment variables, and that missed outcome variation is also systematically related to treatment status. In that case, the model may attribute part of that remaining outcome difference to the treatment, even though it is actually due to covariate-related differences that were not properly adjusted away.

This also explains why functional form misspecification does not automatically imply bias. Suppose a model fails to capture some nonlinear relationship between the adjustment variables and the outcome. That may produce worse outcome predictions. But if the missed outcome variation is not systematically related to treatment status, then it does not necessarily contaminate the treatment effect estimate. In that case, the misspecification may hurt prediction or precision, but it does not necessarily bias the estimated effect.
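For one simple special case, this condition can be written down explicitly. The following is a sketch under assumptions that are not part of the general argument: the treatment effect is a constant $\tau$, the true outcome equation is additive, and the outcome model is an ordinary least squares regression of $Y$ on $T$ and $X$ with $X$ entering linearly. Write $\ell(X)$ for the best linear approximation of $g(X)$, $r(X)$ for the part the linear model misses, and $\tilde{T}$ for the residual of $T$ after linearly projecting it on a constant and $X$.

```latex
\begin{align*}
Y &= \tau T + g(X) + \varepsilon, \qquad \mathbb{E}[\varepsilon \mid T, X] = 0, \\
g(X) &= \ell(X) + r(X), \\
\hat{\tau}_{\text{OLS}} &\xrightarrow{p} \tau + \frac{\operatorname{Cov}\big(\tilde{T},\, r(X)\big)}{\operatorname{Var}(\tilde{T})}.
\end{align*}
```

The second term is the asymptotic bias. It is nonzero only when the missed outcome structure $r(X)$ covaries with the part of the treatment that the linear adjustment leaves unexplained. If $r(X) \neq 0$ but is uncorrelated with $\tilde{T}$, the model is misspecified and yet the coefficient on the treatment is not biased.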

Randomized experiments are the clearest example. In a randomized experiment, treatment assignment is unrelated to pre-treatment covariates in expectation. We may still model the conditional expectation of the outcome given the treatment and covariates, either to improve precision or to estimate more detailed effects. For example, we may regress spending on treatment status, customer tenure, historical spend, region, device type, and other pre-treatment variables. If this model does not capture the true relationship between those covariates and spending, then the outcome model is misspecified.

However, in many cases, misspecification does not translate into bias for a quantity such as the overall Average Treatment Effect. The reason is that treatment status was randomized. As a result, treatment status is not systematically related to the pre-treatment covariates. So even if the model fails to capture part of how those covariates relate to the outcome, that missed part is often not systematically related to differences in the treatment level. The model may predict less well, and it may not deliver the precision gain we hoped for, but it often does not turn the overall comparison into a biased comparison.
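A small simulation makes this concrete. It is a sketch under assumed numbers: spending depends on tenure through a strongly nonlinear (quadratic) term, the email is assigned by a fair coin flip, and the regression deliberately includes tenure only linearly. None of these specifics come from a real dataset.

```python
import numpy as np

rng = np.random.default_rng(1)
true_ate = 5.0

def one_experiment(n=2_000):
    """Simulate one randomized experiment and return the adjusted effect estimate."""
    tenure = rng.uniform(0, 10, n)
    email = rng.binomial(1, 0.5, n)          # randomized: independent of tenure
    # Outcome is nonlinear in tenure (illustrative assumption).
    spend = 20 + 0.5 * (tenure - 5) ** 2 + true_ate * email + rng.normal(0, 2, n)
    # Misspecified outcome model: tenure enters only as a straight line.
    X = np.column_stack([np.ones(n), email, tenure])
    beta, *_ = np.linalg.lstsq(X, spend, rcond=None)
    return beta[1]                           # coefficient on the treatment

estimates = np.array([one_experiment() for _ in range(500)])
mean_estimate = estimates.mean()
print(f"mean estimate over 500 experiments: {mean_estimate:.2f} (truth: {true_ate})")
```

Across repeated experiments the estimates centre on the true effect even though the model misses the curvature in tenure entirely. What the misspecification costs here is precision: the unexplained curvature stays in the residual and widens the spread of the individual estimates.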

This does not mean that functional form misspecification is never relevant in randomized experiments. If we use the model to estimate more detailed quantities, such as how the treatment effect differs across covariate levels, functional form choices become more important. In that case, the model is no longer only helping us estimate one overall average effect. It is also representing treatment effect heterogeneity. If the model imposes the wrong interaction structure or smooths over meaningful differences between groups, the estimated heterogeneous effects can be misleading.

Randomization protects experimental estimates of overall average effects from this kind of bias. In observational analyses, we usually do not have that protection. Treatment assignment is no longer random, and it often depends on the same covariates that also help explain the outcome. This means that if the outcome model misses part of the relationship between those covariates and the outcome, that missed part is more likely to be systematically related to treatment status. If that happens, the treatment effect estimate can become biased.
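The observational case can be illustrated with a simulated example in which treatment assignment depends on the same nonlinear feature of tenure that drives spending. All names and numbers are assumptions for illustration. The sketch compares an outcome model that includes tenure only linearly with one that also includes its square.

```python
import numpy as np

rng = np.random.default_rng(2)
n = 50_000
true_ate = 5.0

tenure = rng.uniform(0, 10, n)
# Treatment is more likely at the extremes of tenure (illustrative assumption),
# so it depends on the same curvature that drives the outcome.
p_treat = 0.2 + 0.6 * ((tenure - 5) / 5) ** 2
email = rng.binomial(1, p_treat)
spend = 20 + 0.5 * (tenure - 5) ** 2 + true_ate * email + rng.normal(0, 2, n)

def treatment_coef(covariate_columns):
    """OLS of spend on an intercept, the treatment, and the given covariate columns."""
    X = np.column_stack([np.ones(n), email] + covariate_columns)
    beta, *_ = np.linalg.lstsq(X, spend, rcond=None)
    return beta[1]

linear_fit = treatment_coef([tenure])                    # misses the curvature
quadratic_fit = treatment_coef([tenure, tenure ** 2])    # captures the curvature

print(f"truth: {true_ate}")
print(f"linear adjustment:    {linear_fit:.2f}")
print(f"quadratic adjustment: {quadratic_fit:.2f}")
```

With the linear adjustment, the curvature the model fails to capture is correlated with treatment status, so part of it is attributed to the treatment and the estimate overshoots. Adding the squared term removes that leftover structure and the estimate returns to the neighbourhood of the truth.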

This is why, when outcome models are used for adjustment in observational causal analyses, we usually try to approximate the conditional expectation function as well as we reasonably can. The aim is not predictive accuracy for its own sake, but to make sure that the model captures the outcome relationships needed for the adjusted treatment comparison to be credible.

A similar logic applies when we model treatment assignment instead of the outcome. For example, if we estimate the probability of treatment given covariates and use these probabilities for weighting or matching, the purpose is to make treated and untreated units comparable in terms of the covariates we need to adjust for. Functional form misspecification becomes problematic when the estimated treatment probabilities are wrong in a way that leaves important covariate imbalance after weighting or matching. By important, I mean imbalance in covariates that are also related to the outcome. In that case, the adjusted comparison is still partly comparing different types of units, and the causal effect estimate can remain biased.

At the same time, not every remaining imbalance is equally problematic. If the treatment model fails to balance a variable that is not related to the outcome, or a variable that was not needed for adjustment in the first place, this may not bias the treatment effect estimate. It may indicate that the treatment model is imperfect, but the imperfection only matters for bias if it prevents us from balancing the variables needed for a valid causal comparison.
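The treatment-model version of the argument can be sketched with inverse probability weighting. The example below is illustrative only: the data are simulated, the logistic regression is a bare-bones Newton-Raphson fit written for this sketch rather than a library routine, and the weighting uses normalized weights. It compares a treatment model that is linear in tenure with one that also includes the squared distance from mid-tenure, which is the feature that actually drives both assignment and spending in this simulation.

```python
import numpy as np

rng = np.random.default_rng(3)
n = 50_000
true_ate = 5.0

tenure = rng.uniform(0, 10, n)
v2 = (tenure - 5) ** 2                                  # nonlinear feature of tenure
p_true = 1 / (1 + np.exp(-(-1.5 + 0.15 * v2)))          # assignment depends on v2
email = rng.binomial(1, p_true)
spend = 20 + 0.5 * v2 + true_ate * email + rng.normal(0, 2, n)

def fit_logistic(X, t, n_iter=25):
    """Newton-Raphson logistic regression; returns fitted treatment probabilities."""
    w = np.zeros(X.shape[1])
    for _ in range(n_iter):
        p = 1 / (1 + np.exp(-X @ w))
        grad = X.T @ (t - p)
        hess = (X * (p * (1 - p))[:, None]).T @ X
        w += np.linalg.solve(hess, grad)
    return 1 / (1 + np.exp(-X @ w))

def ipw(p_hat):
    """Normalized IPW estimate of the ATE, plus the weighted imbalance in v2."""
    w1, w0 = email / p_hat, (1 - email) / (1 - p_hat)
    ate = np.sum(w1 * spend) / np.sum(w1) - np.sum(w0 * spend) / np.sum(w0)
    imbalance = np.sum(w1 * v2) / np.sum(w1) - np.sum(w0 * v2) / np.sum(w0)
    return ate, imbalance

ones = np.ones(n)
ate_lin, imb_lin = ipw(fit_logistic(np.column_stack([ones, tenure]), email))
ate_quad, imb_quad = ipw(fit_logistic(np.column_stack([ones, tenure, v2]), email))

print(f"linear treatment model:    ATE {ate_lin:.2f}, imbalance in v2 {imb_lin:.2f}")
print(f"quadratic treatment model: ATE {ate_quad:.2f}, imbalance in v2 {imb_quad:.2f}")
```

The misspecified treatment model leaves the weighted groups imbalanced on a feature that matters for the outcome, and the effect estimate is biased as a result. Checking weighted balance on outcome-relevant features, including nonlinear ones, is therefore more informative than checking how well the treatment model predicts treatment.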

A similar logic applies to double machine learning procedures, although the details are more delicate. These methods are designed to reduce sensitivity to certain errors in the intermediate models, for example through orthogonalization and sample splitting. But this does not mean that they are immune to misspecification. If the nuisance models do not estimate the relevant outcome and treatment relationships well enough, the residualized outcome and residualized treatment may still contain covariate-related structure that should have been removed. If that remaining structure is related across the two residuals, the final treatment effect estimate can still be biased.
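The residual-on-residual idea can be sketched as follows. This is a simplified illustration, not a full double machine learning implementation: it assumes a partially linear model with a constant effect and a continuous treatment, uses two-fold cross-fitting, and replaces the machine learning nuisance models with plain polynomial least squares so that the degree of misspecification is easy to control. All names and numbers are assumptions.

```python
import numpy as np

rng = np.random.default_rng(4)
n = 20_000
true_effect = 5.0

x = rng.uniform(-5, 5, n)
t = 0.2 * x ** 2 + rng.normal(0, 1, n)                  # treatment depends on x^2
y = true_effect * t + 0.5 * x ** 2 + rng.normal(0, 2, n)

def features(x, degree):
    """Polynomial features 1, x, ..., x^degree used by both nuisance models."""
    return np.column_stack([x ** d for d in range(degree + 1)])

def partialling_out(degree, n_folds=2):
    """Cross-fitted residual-on-residual estimate with polynomial nuisance models."""
    folds = rng.permutation(n) % n_folds
    y_res, t_res = np.empty(n), np.empty(n)
    for k in range(n_folds):
        train, test = folds != k, folds == k
        F_train, F_test = features(x[train], degree), features(x[test], degree)
        # Nuisance models for E[Y | X] and E[T | X], fitted on the other fold.
        b_y, *_ = np.linalg.lstsq(F_train, y[train], rcond=None)
        b_t, *_ = np.linalg.lstsq(F_train, t[train], rcond=None)
        y_res[test] = y[test] - F_test @ b_y
        t_res[test] = t[test] - F_test @ b_t
    # Final stage: regress the outcome residual on the treatment residual.
    return np.sum(t_res * y_res) / np.sum(t_res ** 2)

est_linear = partialling_out(degree=1)      # both nuisance models miss x^2
est_quadratic = partialling_out(degree=2)   # both nuisance models capture x^2

print(f"truth: {true_effect}")
print(f"linear nuisance models:    {est_linear:.2f}")
print(f"quadratic nuisance models: {est_quadratic:.2f}")
```

When both nuisance models miss the same quadratic structure, that structure survives in both residuals and is correlated across them, which pushes the final estimate away from the truth. Cross-fitting does not help here, because the problem is what the nuisance models cannot represent, not overfitting.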

Conclusion

Overall, the role of functional form misspecification in causal inference is delicate.

The main point is that it depends on both the outcome side and the treatment side of the estimation problem. In many causal estimators, we are trying to compare outcome variation and treatment variation after accounting for the covariates that matter for the causal comparison. We may do this through outcome regression, weighting, matching, double machine learning, or another procedure.

Functional form misspecification becomes problematic when it leaves behind systematic variation that should have been adjusted away, and that remaining variation is still connected across the outcome and treatment sides. In that case, the final effect estimate can partly reflect pre-existing differences between treated and untreated units, rather than only the causal effect of the treatment. But when the misspecified part does not affect the treatment comparison in this way, it may not translate into bias.

For a more detailed treatment of this topic, there is a free notebook on Causal Academy called “On the Different Sensitivity of ATEs and CATEs to Model Misspecification in Experimental Settings.” You can access it by creating a free account.

I also cover this topic in more depth in my latest course, Causal Inference with Linear Regression: A Modern Approach, Part II, where I work through misspecification with simulation examples and also discuss double machine learning.
