Why the exclusion restriction in IV regression cannot be tested


(In progress - incomplete)

Instrumental variables are a common econometric technique, taught in most undergraduate econometrics curricula, for identifying the causal effect of a variable $X$ on a variable $Y$. The core idea is that the technique can be used in settings where a source of exogenous variation in $X$ cannot be found directly. Instead, a third variable called the instrument, $Z$, is used, where $Z$ affects $Y$ only by operating through $X$. Formally, suppose we want a consistent estimate of $\beta_2$ in the model $Y = \beta_1 + \beta_2 X + \epsilon$. Unfortunately, our variation in $X$ is not exogenous, i.e. $Cov(X, \epsilon) \neq 0$, which is why a simple linear regression of $Y$ on $X$ will not recover $\beta_2$. In order to use an instrument $Z$, it must satisfy the following two conditions:

  • Inclusion restriction: $Cov(Z,X) \neq 0$
  • Exclusion restriction: $Cov(Z, \epsilon) = 0$
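
To see where each condition is used, it can help to take the covariance of both sides of the model with $Z$. This is a sketch of the standard single-instrument argument under the linear model above, not a general treatment:

$$
\begin{aligned}
Cov(Z, Y) &= Cov(Z, \beta_1 + \beta_2 X + \epsilon) \\
          &= \beta_2 \, Cov(Z, X) + Cov(Z, \epsilon) \\
\Rightarrow \quad \beta_2 &= \frac{Cov(Z, Y) - Cov(Z, \epsilon)}{Cov(Z, X)} = \frac{Cov(Z, Y)}{Cov(Z, X)} \quad \text{when } Cov(Z, \epsilon) = 0 .
\end{aligned}
$$

The inclusion restriction keeps the denominator away from zero, and the exclusion restriction removes the one term, $Cov(Z, \epsilon)$, that cannot be computed from data.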

In undergrad econometrics, we learn that the exclusion restriction, $Cov(Z, \epsilon) = 0$, cannot be empirically verified, and that we must instead make arguments about the underlying economic mechanisms. Oftentimes the discussion ends there. But how can we show that the exclusion restriction cannot be empirically verified? That is the subject of this note.

What is the exclusion restriction?

As stated before, the exclusion restriction is a necessary condition that an instrument must satisfy for the IV strategy to produce consistent estimates. Mathematically, the restriction is $Cov(Z, \epsilon) = 0$. Intuitively, it says that the instrument is not correlated with any component of $Y$ once the $X$ component is removed. In other words, there are no omitted variables in our model through which $Z$ is related to $Y$.
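
As an aside, one obstacle can be sketched directly for the single-instrument case. The error $\epsilon$ is never observed, and its natural sample stand-in, the IV residual $\hat\epsilon$, is uncorrelated with $Z$ by construction. Here $\widehat{Cov}$ denotes a sample covariance and hats denote IV estimates; this notation is introduced only for this illustration:

$$
\begin{aligned}
\hat\beta_2 &= \frac{\widehat{Cov}(Z, Y)}{\widehat{Cov}(Z, X)}, \qquad \hat\epsilon = Y - \hat\beta_1 - \hat\beta_2 X, \\
\widehat{Cov}(Z, \hat\epsilon) &= \widehat{Cov}(Z, Y) - \hat\beta_2 \, \widehat{Cov}(Z, X) = 0 .
\end{aligned}
$$

The last line holds whether or not the true $Cov(Z, \epsilon)$ is zero, so checking the residuals against $Z$ carries no information about the restriction.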

A hypothesis for empirically testing the exclusion restriction

When I took econometrics, a couple of my friends hypothesized that there is a way to test the exclusion restriction. In particular, testing whether $Cov(Z,Y \vert X)=0$ seemed like a plausible way to check if $Z$ was associated with $Y$ through a channel other than $X$.

It turns out that the previous hypothesis is incorrect for somewhat subtle reasons that are not usually taught in econometrics class.

Why the proposed method fails


Let’s consider an example where we want to estimate the relationship between going to college, denoted by $X$, and wages, $Y$. Assume that the distance from your home to college, $Z$, is a valid instrument for college attendance. Then we have $Cov(Z, X) < 0$ (living further away makes attendance less likely) and $Cov(Z, \epsilon) = 0$, where $Y = \beta_1 + \beta_2X + \epsilon$.

Now let us examine $Cov(Z, Y \vert X)$. We know that $Z$ does not affect $Y$ through any channel other than $X$. Conditioning on $X$ can be thought of as examining individuals who have the same college attendance. Does knowing someone’s distance from college still give us information on wages? Yes! If someone lives further away from college but still attends, then it is reasonable to expect they had other qualities, such as college-educated parents ($W$), that make college attendance more likely and potentially also increase wages. Therefore, holding $X$ constant induces a positive relationship between distance from college and wages through a separate channel. The subtle aspect of this relationship is that even if $Cov(Z, W) = 0$, we can still have $Cov(Z, W \vert X) \neq 0$, because $X$ depends on both $Z$ and $W$. Since $W$ is part of $\epsilon$, we can find $Cov(Z, Y \vert X) \neq 0$ even though the exclusion restriction holds, so a nonzero conditional covariance does not show that the instrument is invalid.
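
The college example can be checked with a small simulation. The sketch below is only an illustration: the data-generating process, the coefficient values, and all variable names are assumptions made up for this purpose, not estimates from real data. By construction $Z$ is independent of $W$ and of $\epsilon$, so the exclusion restriction holds, yet the covariance between $Z$ and $Y$ within each attendance group is not zero.

```python
import numpy as np

rng = np.random.default_rng(0)
n = 200_000

# Hypothetical data-generating process (all numbers are illustrative assumptions).
z = rng.normal(size=n)  # distance to college, standardized
w = rng.normal(size=n)  # unobserved parental background, independent of z
u = rng.normal(size=n)  # other unobserved wage shocks

# Attendance is less likely for people who live far away (high z)
# and more likely for people with high w.
x = (-z + w + rng.normal(size=n) > 0).astype(float)

beta1, beta2 = 1.0, 2.0
eps = w + u                      # structural error: contains w, independent of z
y = beta1 + beta2 * x + eps


def cov(a, b):
    """Sample covariance between two 1-D arrays."""
    return np.cov(a, b)[0, 1]


# The exclusion restriction holds in this simulated population.
cov_z_eps = cov(z, eps)

# Single-instrument IV estimate and the naive OLS slope, for comparison.
beta2_iv = cov(z, y) / cov(z, x)
beta2_ols = cov(x, y) / np.var(x, ddof=1)

# The proposed "test": covariance of Z and Y within each level of X.
cov_zy_x0 = cov(z[x == 0], y[x == 0])
cov_zy_x1 = cov(z[x == 1], y[x == 1])

print(f"Cov(Z, eps)      = {cov_z_eps:.3f}")
print(f"IV estimate      = {beta2_iv:.3f} (true beta2 = {beta2})")
print(f"OLS estimate     = {beta2_ols:.3f}")
print(f"Cov(Z, Y | X=0)  = {cov_zy_x0:.3f}")
print(f"Cov(Z, Y | X=1)  = {cov_zy_x1:.3f}")
```

Under these assumed coefficients, the sample value of $Cov(Z, \epsilon)$ is close to zero and the IV estimate is close to the true $\beta_2$, while both within-group covariances are clearly positive. A researcher applying the proposed test would therefore reject an instrument that is valid by construction.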

Probabilistic Graphical Models

To provide more mathematical rigor to the above example, a tool from the statistics literature called the probabilistic graphical model is useful.