Vector Error Correction Models

MacroEconometricModels.jl provides full-featured Vector Error Correction Model (VECM) estimation for cointegrated $I(1)$ systems. The VECM decomposes multivariate dynamics into long-run equilibrium relationships and short-run adjustment, making it the canonical framework for modeling nonstationary variables that share common stochastic trends.

  • Estimation: Johansen (1991) reduced-rank MLE with automatic rank selection, and Engle-Granger (1987) two-step for bivariate systems
  • Rank selection: Trace and maximum eigenvalue tests at user-specified significance levels
  • Deterministic specification: None, constant, or linear trend in the cointegrating relation
  • VAR conversion: Automatic conversion to VAR in levels for full structural analysis (Cholesky, sign restrictions, ICA, and all 18 identification methods)
  • Forecasting: Direct VECM iteration preserving cointegrating relationships, with bootstrap and simulation confidence intervals
  • Granger causality: Short-run, long-run, and strong (joint) causality decomposition
  • TimeSeriesData dispatch: Pass TimeSeriesData objects directly –- variable names propagate automatically
using MacroEconometricModels, Random
Random.seed!(42)
qd = load_example(:fred_qd)
Y = log.(to_matrix(qd[:, ["GDPC1", "PCECC96", "GPDIC1"]]))
Y = Y[all.(isfinite, eachrow(Y)), :]
<< @setup-block not executed in draft mode >>

Quick Start

Recipe 1: Estimate with automatic rank selection

# Automatic rank via Johansen trace test
vecm = estimate_vecm(Y, 2)
report(vecm)
<< @example-block not executed in draft mode >>

Recipe 2: Explicit rank and deterministic specification

vecm = estimate_vecm(Y, 2; rank=1, deterministic=:constant)
report(vecm)
<< @example-block not executed in draft mode >>

Recipe 3: Impulse responses via VAR conversion

vecm = estimate_vecm(Y, 2; rank=1)
irfs = irf(vecm, 20; method=:cholesky)
<< @example-block not executed in draft mode >>
plot_result(irfs)

Recipe 4: Forecast with bootstrap confidence intervals

vecm = estimate_vecm(Y, 2; rank=1)
fc = forecast(vecm, 10; ci_method=:bootstrap, reps=50, conf_level=0.95)
report(fc)
<< @example-block not executed in draft mode >>

Recipe 5: Granger causality decomposition

vecm = estimate_vecm(Y, 2; rank=1)
g = granger_causality_vecm(vecm, 1, 2)  # GDP -> Consumption
report(g)
<< @example-block not executed in draft mode >>

Recipe 6: TimeSeriesData dispatch

ts = qd[:, ["GDPC1", "PCECC96", "GPDIC1"]]

# Pass TimeSeriesData directly --- variable names propagate
vecm = estimate_vecm(ts, 2; rank=1)
report(vecm)
<< @example-block not executed in draft mode >>

Model Specification

The VECM reparameterizes a VAR(p) in levels to separate long-run equilibrium relationships from short-run dynamics. Consider a VAR(p) for an $n$-dimensional $I(1)$ vector $y_t$:

\[y_t = c + A_1 y_{t-1} + A_2 y_{t-2} + \cdots + A_p y_{t-p} + u_t\]

where:

  • $y_t$ is the $n \times 1$ vector of endogenous variables at time $t$
  • $A_i$ are $n \times n$ coefficient matrices for lag $i = 1, \ldots, p$
  • $c$ is the $n \times 1$ intercept vector
  • $u_t \sim N(0, \Sigma)$ are i.i.d. innovations

When the variables are cointegrated, the Granger representation theorem (Engle & Granger 1987) implies the system admits a Vector Error Correction representation:

\[\Delta y_t = \alpha \beta' y_{t-1} + \Gamma_1 \Delta y_{t-1} + \cdots + \Gamma_{p-1} \Delta y_{t-p+1} + \mu + u_t\]

where:

  • $\Pi = \alpha \beta'$ is the $n \times n$ long-run matrix with rank $r$
  • $\alpha$ is the $n \times r$ matrix of adjustment coefficients (loading matrix)
  • $\beta$ is the $n \times r$ matrix of cointegrating vectors
  • $\Gamma_i = -(A_{i+1} + \cdots + A_p)$ are the $n \times n$ short-run dynamics matrices
  • $\mu$ is the $n \times 1$ intercept vector
  • $u_t \sim N(0, \Sigma)$ are i.i.d. innovations

The Cointegrating Relationship

Each column $\beta_j$ of $\beta$ defines a stationary linear combination of the $I(1)$ variables:

\[z_{j,t} = \beta_j' y_t \sim I(0), \quad j = 1, \ldots, r\]

where:

  • $z_{j,t}$ is the $j$-th error correction term (deviation from long-run equilibrium)
  • $\beta_j$ is the $j$-th cointegrating vector

The corresponding column $\alpha_j$ of $\alpha$ governs the speed of adjustment: $\alpha_{ij}$ measures how quickly variable $i$ responds to deviations from the $j$-th equilibrium. The cointegrating rank $r$ determines the number of independent long-run equilibrium relationships. When $r = 0$, there is no cointegration and the system reduces to a VAR in first differences. When $r = n$, all variables are stationary in levels.

Phillips Normalization

The package applies Phillips normalization to $\beta$ so that the first $r$ rows form an identity matrix. This ensures unique identification of the cointegrating vectors and makes the $\alpha$ coefficients directly interpretable as adjustment speeds toward each equilibrium.


Estimation

Johansen Maximum Likelihood

The Johansen (1991) reduced-rank regression procedure estimates $\alpha$ and $\beta$ jointly via maximum likelihood. The algorithm proceeds in four steps:

  1. Concentrate out short-run dynamics by regressing $\Delta Y$ and $Y_{t-1}$ on lagged differences $Z = [\Delta Y_{t-1}, \ldots, \Delta Y_{t-p+1}, \mu]$
  2. Compute moment matrices $S_{00}$, $S_{11}$, $S_{01}$ from the concentrated residuals
  3. Solve the generalized eigenvalue problem $|\lambda S_{11} - S_{10} S_{00}^{-1} S_{01}| = 0$
  4. Extract $\beta$ from the first $r$ eigenvectors and compute $\alpha = S_{01} \beta (\beta' S_{11} \beta)^{-1}$

The eigenvalues $\lambda_1 \geq \lambda_2 \geq \cdots \geq \lambda_n$ correspond to the canonical correlations between $\Delta Y$ and $Y_{t-1}$ after removing the short-run dynamics. The cointegrating rank $r$ equals the number of statistically significant eigenvalues, determined by the trace test:

\[\text{LR}_{\text{trace}}(r_0) = -T \sum_{i=r_0+1}^{n} \ln(1 - \hat{\lambda}_i)\]

where:

  • $T$ is the effective sample size
  • $\hat{\lambda}_i$ is the $i$-th largest eigenvalue
  • $r_0$ is the null hypothesis rank
# Automatic rank selection via Johansen trace test
vecm = estimate_vecm(Y, 2)
report(vecm)

# Explicit rank specification
vecm = estimate_vecm(Y, 2; rank=1)

# Different deterministic specifications
vecm_none = estimate_vecm(Y, 2; rank=1, deterministic=:none)      # No deterministic terms
vecm_const = estimate_vecm(Y, 2; rank=1, deterministic=:constant)  # Constant (default)
vecm_trend = estimate_vecm(Y, 2; rank=1, deterministic=:trend)     # Linear trend
<< @example-block not executed in draft mode >>

The Johansen method estimates $\alpha$ and $\beta$ jointly, producing efficient estimates of the cointegrating space regardless of the number of cointegrating vectors. The trace test selects the rank at the 5% significance level by default.

Rank Selection

The select_vecm_rank function provides fine-grained control over rank determination using either the trace test or the maximum eigenvalue test:

r_trace = select_vecm_rank(Y, 2; criterion=:trace, significance=0.05)
r_max = select_vecm_rank(Y, 2; criterion=:max_eigen)
<< @example-block not executed in draft mode >>
KeywordTypeDefaultDescription
criterionSymbol:traceTest statistic: :trace or :max_eigen
significanceReal0.05Significance level for critical values
deterministicSymbol:constantDeterministic specification for Johansen test

Engle-Granger Two-Step

For bivariate systems with a single cointegrating relationship ($r = 1$), the Engle-Granger (1987) two-step estimator provides a simpler alternative:

  1. Step 1: Estimate the cointegrating vector via static OLS regression of $y_{1,t}$ on $y_{2,t}, \ldots, y_{n,t}$
  2. Step 2: Estimate the VECM equations using the OLS residuals as the error correction term
vecm_eg = estimate_vecm(Y, 2; method=:engle_granger, rank=1)
report(vecm_eg)
<< @example-block not executed in draft mode >>

The Engle-Granger estimator is consistent but less efficient than Johansen MLE for multivariate systems. The static OLS regression in step 1 produces superconsistent estimates of $\beta$ (Stock 1987), but the two-step procedure does not jointly optimize the likelihood.

Engle-Granger supports rank=1 only

The Engle-Granger method estimates a single cointegrating vector via static OLS. For systems with multiple cointegrating relationships, use the Johansen method.

Keyword Arguments

KeywordTypeDefaultDescription
rankUnion{Symbol,Int}:autoCointegrating rank; :auto selects via trace test
deterministicSymbol:constantDeterministic terms: :none, :constant, :trend
methodSymbol:johansenEstimation method: :johansen or :engle_granger
significanceReal0.05Significance level for automatic rank selection
varnamesVector{String}["y1", ...]Variable display names

Return Value

estimate_vecm returns a VECMModel{T} with the following fields:

FieldTypeDescription
YMatrix{T}Original data in levels ($T_{obs} \times n$)
pIntUnderlying VAR order
rankIntCointegrating rank $r$
alphaMatrix{T}Adjustment coefficients ($n \times r$)
betaMatrix{T}Cointegrating vectors ($n \times r$), Phillips-normalized
PiMatrix{T}Long-run matrix $\alpha\beta'$ ($n \times n$)
GammaVector{Matrix{T}}Short-run dynamics matrices $[\Gamma_1, \ldots, \Gamma_{p-1}]$
muVector{T}Intercept vector
UMatrix{T}Residuals ($T_{eff} \times n$)
SigmaMatrix{T}Residual covariance ($n \times n$)
aic, bic, hqicTInformation criteria
loglikTLog-likelihood
deterministicSymbolDeterministic specification
methodSymbolEstimation method used
johansen_resultJohansenResult{T}Johansen test result (if applicable)
varnamesVector{String}Variable display names

VAR Conversion

The to_var function converts a VECM back to a VAR in levels, enabling all structural analysis methods. The mapping from VECM to VAR coefficients is:

\[A_1 = \Pi + I_n + \Gamma_1, \quad A_i = \Gamma_i - \Gamma_{i-1} \text{ for } i = 2, \ldots, p-1, \quad A_p = -\Gamma_{p-1}\]

where:

  • $A_i$ is the $i$-th VAR coefficient matrix in levels
  • $\Pi = \alpha\beta'$ is the long-run matrix
  • $I_n$ is the $n \times n$ identity matrix
  • $\Gamma_i$ are the short-run dynamics matrices
vecm = estimate_vecm(Y, 2; rank=1)
var_model = to_var(vecm)
report(var_model)
<< @example-block not executed in draft mode >>

This conversion is critical because it enables all 18 identification methods (Cholesky, sign restrictions, ICA, narrative, etc.) to work automatically with VECM models. The irf, fevd, and historical_decomposition functions dispatch through to_var() internally, so VECMModel objects can be passed directly.


Innovation Accounting

All structural analysis functions accept VECMModel objects directly. The conversion to VAR in levels is handled automatically via to_var().

vecm = estimate_vecm(Y, 2; rank=1)

# Impulse response functions (Cholesky identification)
irfs = irf(vecm, 20; method=:cholesky)

# Forecast error variance decomposition
decomp = fevd(vecm, 20)

# Historical decomposition
T_eff = effective_nobs(to_var(vecm))
hd = historical_decomposition(vecm, T_eff)
<< @example-block not executed in draft mode >>
plot_result(irfs)
plot_result(decomp)
plot_result(hd)

The Cholesky-identified IRFs trace the dynamic effects of orthogonalized shocks on the level variables. Because the VAR is estimated in levels (via conversion), the IRFs capture both the transitory short-run dynamics and the permanent effects that cointegration implies. For a system with $r < n$ cointegrating vectors, exactly $n - r$ shocks have permanent effects on the levels, corresponding to the common stochastic trends. For details on identification methods, see Innovation Accounting.


Forecasting

VECM forecasting iterates the VECM equations directly in levels, preserving the cointegrating relationships in the forecast path. This approach is preferable to forecasting from the converted VAR because the error correction mechanism operates explicitly during each forecast step, pulling the system toward the long-run equilibrium:

\[\hat{y}_{T+h} = \hat{y}_{T+h-1} + \alpha\beta'\hat{y}_{T+h-1} + \sum_{i=1}^{p-1} \Gamma_i \Delta\hat{y}_{T+h-i} + \mu\]

where:

  • $\hat{y}_{T+h}$ is the $h$-step-ahead forecast in levels
  • $\hat{y}_{T+h-1}$ is the previous forecast (or last observed value for $h = 1$)
  • $\Delta\hat{y}_{T+h-i}$ are lagged forecast differences
vecm = estimate_vecm(Y, 2; rank=1)

# Point forecast
fc = forecast(vecm, 10)
report(fc)

# Bootstrap confidence intervals
fc = forecast(vecm, 10; ci_method=:bootstrap, reps=50, conf_level=0.95)
report(fc)

# Simulation-based confidence intervals
fc = forecast(vecm, 10; ci_method=:simulation, reps=50)
report(fc)
<< @example-block not executed in draft mode >>
plot_result(fc)

The forecast preserves cointegrating relationships by iterating the VECM equations in levels rather than converting to VAR form. Bootstrap CIs resample residuals with replacement; simulation CIs draw from $N(0, \hat{\Sigma})$. Both methods generate $R$ replicate forecast paths and extract pointwise quantiles.

Keyword Arguments

KeywordTypeDefaultDescription
ci_methodSymbol:noneConfidence interval method: :none, :bootstrap, :simulation
repsInt500Number of bootstrap or simulation replications
conf_levelReal0.95Confidence level for intervals

Return Value

forecast returns a VECMForecast{T} with the following fields:

FieldTypeDescription
levelsMatrix{T}Forecasts in levels ($h \times n$)
differencesMatrix{T}Forecasts in first differences ($h \times n$)
ci_lowerMatrix{T}Lower confidence interval bounds ($h \times n$)
ci_upperMatrix{T}Upper confidence interval bounds ($h \times n$)
horizonIntForecast horizon
ci_methodSymbolCI method used
conf_levelTConfidence level

Granger Causality

VECM Granger causality tests (Toda & Phillips 1993) decompose causal channels into short-run (through lagged differences $\Gamma$) and long-run (through the error correction term $\alpha\beta'y_{t-1}$) components. The strong test combines both channels in a single joint test.

vecm = estimate_vecm(Y, 2; rank=1)

# Test: does GDP (var 1) Granger-cause Consumption (var 2)?
g = granger_causality_vecm(vecm, 1, 2)
report(g)
<< @example-block not executed in draft mode >>

The three Wald tests are constructed as follows:

TestNull hypothesisMechanism
Short-run$\Gamma_i[\text{effect}, \text{cause}] = 0$ for all $i$Causality through lagged differences
Long-run$\alpha[\text{effect}, :] = 0$Causality through error correction
StrongJoint test of both restrictionsCombined short-run and long-run causality

Each test reports a Wald $\chi^2$ statistic, degrees of freedom, and p-value. A significant short-run test indicates that past changes in the cause variable predict current changes in the effect variable. A significant long-run test indicates that the effect variable adjusts to deviations from the cointegrating equilibrium –- the error correction channel is active for that variable.

Return Value

granger_causality_vecm returns a VECMGrangerResult{T} with the following fields:

FieldTypeDescription
short_run_statTWald $\chi^2$ for short-run test
short_run_pvalueTP-value for short-run test
short_run_dfIntDegrees of freedom for short-run test
long_run_statTWald $\chi^2$ for long-run test
long_run_pvalueTP-value for long-run test
long_run_dfIntDegrees of freedom for long-run test
strong_statTWald $\chi^2$ for joint test
strong_pvalueTP-value for joint test
strong_dfIntDegrees of freedom for joint test
cause_varIntIndex of the cause variable
effect_varIntIndex of the effect variable

Complete Example

This example demonstrates the full VECM workflow: cointegration testing, estimation, structural analysis, forecasting, and Granger causality.

# Step 1: Test for cointegration
joh = johansen_test(Y, 2)
report(joh)

# Step 2: Estimate VECM with rank 1
vecm = estimate_vecm(Y, 2; rank=1)
report(vecm)

# Step 3: Impulse responses (Cholesky identification)
irfs = irf(vecm, 20; method=:cholesky)

# Step 4: Forecast with bootstrap CIs
fc = forecast(vecm, 10; ci_method=:bootstrap, reps=50)
report(fc)

# Step 5: Granger causality --- does GDP Granger-cause Consumption?
g = granger_causality_vecm(vecm, 1, 2)
report(g)

# Step 6: Convert to VAR for FEVD
var_model = to_var(vecm)
decomp = fevd(var_model, 20)
<< @example-block not executed in draft mode >>
plot_result(irfs)
plot_result(fc)
plot_result(decomp)

The cointegrating vector $\beta$ identifies the long-run equilibrium between GDP, consumption, and investment. A cointegrating relationship $\beta'y_t \sim I(0)$ implies these variables share common stochastic trends, consistent with balanced growth path theory. The adjustment coefficients $\alpha$ reveal how each variable responds to disequilibrium –- a negative $\alpha$ for consumption indicates it contracts when the system overshoots the equilibrium ratio. The Granger causality test decomposes predictive content into short-run (through lagged differences $\Gamma$) and long-run (through the error correction term $\alpha\beta'y_{t-1}$) channels, providing a richer picture than standard VAR-based Granger tests.


Common Pitfalls

  1. Incorrect cointegrating rank: Specifying $r$ too high introduces near-unit-root stationary components that contaminate the short-run dynamics. Specifying $r$ too low discards genuine equilibrium relationships. Always run johansen_test first and examine both the trace and maximum eigenvalue statistics before fixing $r$ manually.

  2. I(2) data passed without differencing: The VECM framework assumes all variables are $I(1)$. Passing $I(2)$ data (e.g., price levels that need double differencing) produces spurious cointegrating vectors. Test each series with adf_test or kpss_test before estimation and difference any $I(2)$ variables once to bring them to $I(1)$.

  3. Too many lags in levels: The underlying VAR order $p$ determines the number of lagged differences $p - 1$ in the VECM. Over-parameterization wastes degrees of freedom and inflates estimation uncertainty, especially in small samples. Use select_lag_order on the levels data or compare aic/bic across candidate orders.

  4. Misinterpreting the Johansen trace test: The sequential testing procedure starts from $r_0 = 0$ and increments until the trace statistic falls below the critical value. Rejecting $r_0 = 0$ but not $r_0 = 1$ implies exactly one cointegrating vector. The trace test has well-known size distortions in small samples; the Bartlett correction (not yet implemented) mitigates this, or use a more conservative significance level.

  5. Engle-Granger with multiple cointegrating vectors: The Engle-Granger two-step method estimates only a single cointegrating vector via static OLS. Applying it to a system with $r > 1$ recovers at most one linear combination and discards the remaining equilibrium relationships. Use the Johansen method for systems with multiple cointegrating vectors.

  6. Forgetting VAR conversion for structural analysis: irf, fevd, and historical_decomposition dispatch through to_var() automatically, so passing a VECMModel directly works. However, if you need the VAR coefficient matrices explicitly (e.g., for custom identification schemes), call to_var(vecm) and work with the resulting VARModel.


References

  • Johansen, S. (1991). Estimation and Hypothesis Testing of Cointegration Vectors in Gaussian Vector Autoregressive Models. Econometrica, 59(6), 1551-1580. DOI

  • Engle, R. F., & Granger, C. W. J. (1987). Co-Integration and Error Correction: Representation, Estimation, and Testing. Econometrica, 55(2), 251-276. DOI

  • Toda, H. Y., & Phillips, P. C. B. (1993). Vector Autoregressions and Causality. Econometrica, 61(6), 1367-1393. DOI

  • Stock, J. H. (1987). Asymptotic Properties of Least Squares Estimators of Cointegrating Vectors. Econometrica, 55(5), 1035-1056. DOI

  • Lutkepohl, H. (2005). New Introduction to Multiple Time Series Analysis. Berlin: Springer. ISBN 978-3-540-40172-8.