Causality: Granger Causality

Earlier this year, I reviewed one definition of causality: the Bradford Hill criteria. Now I move from a medical definition of causality to one developed by an economist (in fact, an economist from my alma mater). Granger causality basically identifies a variable as causal if: (i) it occurs before the outcome of interest and (ii) including past values of the variable improves the prediction of the outcome of interest in a statistically significant way. Scholarpedia provides a useful example:

The basic “Granger Causality” definition is quite simple. Suppose that we have three terms, $X_t$, $Y_t$, and $W_t$, and that we first attempt to forecast $X_{t+1}$ using past terms of $X_t$ and $W_t$. We then try to forecast $X_{t+1}$ using past terms of $X_t$, $Y_t$, and $W_t$. If the second forecast is found to be more successful, according to standard cost functions, then the past of $Y$ appears to contain information helping in forecasting $X_{t+1}$ that is not in past $X_t$ or $W_t$. In particular, $W_t$ could be a vector of possible explanatory variables. Thus, $Y_t$ would “Granger cause” $X_{t+1}$ if (a) $Y_t$ occurs before $X_{t+1}$; and (b) it contains information useful in forecasting $X_{t+1}$ that is not found in a group of other appropriate variables.

Consider a linear example with two time series processes.

$$X_1(t) = \sum_{j=1}^{p} A_{11,j} X_1(t-j) + \sum_{j=1}^{p} A_{12,j} X_2(t-j)$$

$$X_2(t) = \sum_{j=1}^{p} A_{21,j} X_1(t-j) + \sum_{j=1}^{p} A_{22,j} X_2(t-j)$$

One can test whether variable $X_2$ Granger-causes $X_1$ by testing whether the coefficient vector $A_{12} = 0$, using an F-test. The F-test examines the joint null hypothesis that all coefficients in the vector are zero; rejecting it means that the lagged values of $X_2$ jointly improve the prediction of $X_1$. The magnitude of the Granger-causality (a.k.a. G-causality) can be estimated as the logarithm of the corresponding F-statistic. To determine the appropriate number of lag terms $p$ in the time series, one could use the Akaike Information Criterion (AIC) or Bayesian Information Criterion (BIC) and choose the value of $p$ that minimizes the criterion, which trades off model fit against the number of parameters.
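To make the test concrete, here is a minimal sketch in Python using only NumPy. It simulates two series in which X2 drives X1 (the coefficients 0.3, 0.5 and 0.8 are made up for illustration), fits the X1 equation by ordinary least squares with and without the lagged X2 terms, and compares the two residual sums of squares with an F-statistic. The helper names (`lagged`, `design`, `granger_f`, `pick_lag_bic`) are my own, and the intercept and noise terms are assumptions added to the equations above so that the regression is well behaved.

```python
import numpy as np

def lagged(series, p, start):
    """Columns are the series lagged by 1..p, aligned with targets series[start:]."""
    T = len(series)
    return np.column_stack([series[start - j:T - j] for j in range(1, p + 1)])

def design(x1, x2, p, start, include_x2=True):
    """Regressors for the x1 equation: intercept, p lags of x1, optionally p lags of x2."""
    cols = [np.ones((len(x1) - start, 1)), lagged(x1, p, start)]
    if include_x2:
        cols.append(lagged(x2, p, start))
    return np.hstack(cols)

def rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def granger_f(x1, x2, p):
    """F-statistic for H0: all p coefficients on lagged x2 are zero in the x1 equation."""
    y = x1[p:]
    X_r = design(x1, x2, p, p, include_x2=False)  # restricted: own lags only
    X_u = design(x1, x2, p, p)                    # unrestricted: adds lags of x2
    rss_r, rss_u = rss(y, X_r), rss(y, X_u)
    dof = len(y) - X_u.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)

def pick_lag_bic(x1, x2, max_p):
    """Return the p in 1..max_p with the lowest BIC for the unrestricted x1 equation."""
    y = x1[max_p:]  # same sample for every p so the BIC values are comparable
    n = len(y)
    bic = {}
    for p in range(1, max_p + 1):
        X = design(x1, x2, p, max_p)
        bic[p] = n * np.log(rss(y, X) / n) + X.shape[1] * np.log(n)
    return min(bic, key=bic.get)

# Simulated data (illustrative only): x2 drives x1 with a one-period lag.
rng = np.random.default_rng(0)
T = 500
x1, x2 = np.zeros(T), np.zeros(T)
for t in range(1, T):
    x2[t] = 0.5 * x2[t - 1] + rng.normal()
    x1[t] = 0.3 * x1[t - 1] + 0.8 * x2[t - 1] + rng.normal()

best_p = pick_lag_bic(x1, x2, max_p=4)
f_21 = granger_f(x1, x2, p=2)  # does x2 Granger-cause x1?
f_12 = granger_f(x2, x1, p=2)  # does x1 Granger-cause x2?
print(f"BIC lag: {best_p}, F(x2 -> x1) = {f_21:.1f}, F(x1 -> x2) = {f_12:.1f}")
```

In this setup the F-statistic for X2 → X1 should come out far larger than the one for the reverse direction. To turn either statistic into a p-value, compare it against an F distribution with p and (number of observations minus number of regressors) degrees of freedom.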

G-causality can be readily extended to the n-variable case, where $n > 2$, by estimating an n-variable autoregressive model. In this case, $X_2$ G-causes $X_1$ if lagged observations of $X_2$ help predict $X_1$ when lagged observations of all other variables $X_3, \dots, X_n$ are also taken into account.
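The following sketch illustrates why conditioning on the other variables matters. It is a toy simulation under my own assumptions, not a standard dataset: a third series X3 drives X2 after one period and X1 after two periods, so X2 looks like it predicts X1 even though it has no direct effect. The function `conditional_granger_f` is a hypothetical helper that adds the lags of the other variables to both the restricted and the unrestricted regression.

```python
import numpy as np

def lags(series, p):
    """Columns are the series lagged by 1..p, aligned with targets series[p:]."""
    T = len(series)
    return np.column_stack([series[p - j:T - j] for j in range(1, p + 1)])

def rss(y, X):
    """Residual sum of squares from an OLS fit of y on X."""
    beta, *_ = np.linalg.lstsq(X, y, rcond=None)
    resid = y - X @ beta
    return float(resid @ resid)

def conditional_granger_f(target, source, others, p):
    """F-statistic for 'source G-causes target', controlling for lags of `others`."""
    y = target[p:]
    base = [np.ones((len(y), 1)), lags(target, p)] + [lags(z, p) for z in others]
    X_r = np.hstack(base)                      # restricted: no lags of source
    X_u = np.hstack(base + [lags(source, p)])  # unrestricted: adds lags of source
    rss_r, rss_u = rss(y, X_r), rss(y, X_u)
    dof = len(y) - X_u.shape[1]
    return ((rss_r - rss_u) / p) / (rss_u / dof)

# Toy data (illustrative only): x3 is a common driver of x1 and x2.
rng = np.random.default_rng(1)
T = 500
x3 = rng.normal(size=T)
x1, x2 = np.zeros(T), np.zeros(T)
for t in range(2, T):
    x2[t] = 0.9 * x3[t - 1] + 0.5 * rng.normal()
    x1[t] = 0.9 * x3[t - 2] + 0.5 * rng.normal()

f_bivariate = conditional_granger_f(x1, x2, others=[], p=2)
f_conditional = conditional_granger_f(x1, x2, others=[x3], p=2)
print(f"F ignoring x3: {f_bivariate:.1f}, F conditioning on x3: {f_conditional:.1f}")
```

With only two variables in the model, X2 appears to G-cause X1 because it carries information about past X3. Once lagged X3 is included, the F-statistic for X2 should drop sharply, which is the behavior the n-variable definition is designed to capture.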