
Time Series

Lecture 8

What is time series data?

A single unit observed repeatedly over time.
  • Indexed by t = 1, 2, ..., T. Order matters; this is not a random sample.
  • Examples: quarterly GDP, monthly unemployment, daily stock prices, annual crime rates.
The key challenge: observations are not independent.
  • Today's value is typically correlated with yesterday's value (autocorrelation).
  • OLS remains unbiased under autocorrelation, but the usual standard errors are wrong, just as under heteroskedasticity.
Two additional complications beyond cross-section.
  • Trends: many economic series drift upward over time, creating spurious correlations.
  • Seasonality: systematic within-year patterns (retail sales, weather-sensitive output).
Stationarity is the time-series analogue of having a stable population to draw from.

Stationary

Mean, variance, and autocovariances do not depend on t. The series fluctuates around a fixed level. Classical inference applies.

Non-stationary

Mean or variance drifts over time. Series may wander without bound. Regressions can be spurious. Standard t-statistics are unreliable.

Weak (covariance) stationarity requires: (1) E[Yt] = μ for all t, (2) Var(Yt) = σ² for all t, (3) Cov(Yt, Yt−k) depends only on k, not on t.
Most asymptotic theory for time series requires stationarity (or a transformation to achieve it). Always plot your series and check before running regressions.
Autocorrelation: the correlation between a series and its own past values.
ρk = Corr(Yt, Yt−k) = Cov(Yt, Yt−k) / Var(Yt)
k is the lag. The autocorrelation function (ACF) plots ρk against k. A slowly decaying ACF signals a non-stationary or long-memory series.
Consequences for OLS: With autocorrelated errors, the OLS estimator is still unbiased and consistent, but the usual standard errors are biased downward (under positive autocorrelation), so t-statistics are too large, leading to over-rejection of the null.
Breusch-Godfrey test: regress the OLS residuals êt on Xt and p lags of the residual. The nR² statistic is asymptotically χ²(p). Preferred to the Durbin-Watson test because it remains valid with lagged dependent variables.
The AR(1) is the simplest time series model.
Yt = ρYt−1 + et,   et ~ i.i.d.(0, σ²)
ρ is the autocorrelation coefficient. It controls how much today's value depends on yesterday's.
  • If |ρ| < 1: the series is stationary. Shocks dissipate over time.
  • If ρ = 1: unit root. The series is a random walk, shocks are permanent.
  • If |ρ| > 1: explosive. The series diverges. Rare in economic data.
An AR(p) model includes p lags: Yt = ρ1Yt−1 + … + ρpYt−p + et. Use AIC/BIC to select p.
A random walk has no tendency to revert to any long-run mean.
Yt = Yt−1 + et   ⇒   Yt = Y0 + e1 + e2 + … + et
Variance grows without bound: Var(Yt) = tσ² (taking Y0 as fixed). This violates stationarity. Any shock is permanently embedded in all future values.
Economic examples of likely unit roots: GDP, price levels, nominal exchange rates, asset prices. Their log-differences (growth rates, returns) are typically stationary.
A random walk with drift adds a constant: Yt = μ + Yt−1 + et. The series trends upward (or downward) on average while still wandering.

If two series both have unit roots, regressing one on the other can produce a high R2 and a significant t-statistic even when they are completely unrelated.

Spurious regression: non-stationary series that share a common trend look correlated, but the relationship is meaningless.
Granger and Newbold (1974) showed that regressing one independent random walk on another typically produces a sizable R2 and |t| > 2 even when the true coefficient is zero. As T → ∞, the t-statistic diverges rather than converging to its null distribution.
Warning
A high R2 with non-stationary variables is a red flag, not evidence of a real relationship. Always test for unit roots before regressing time series on each other.
The classic example: regressing the number of Nicolas Cage films on US drowning deaths yields a strikingly high correlation. The two series just happen to move together by coincidence.
Remedy: either difference the series to induce stationarity, or test for cointegration before treating the levels regression as meaningful.
The Augmented Dickey-Fuller (ADF) test tests the null hypothesis of a unit root.
Rewrite the AR(1) by subtracting Yt−1 from both sides:
ΔYt = δYt−1 + et,   δ = ρ − 1
H0: δ = 0 (unit root)  vs.  H1: δ < 0 (stationary).
The augmented version adds lags of ΔYt to absorb serial correlation in the residuals. The test statistic does not follow a standard t-distribution under H0; use Dickey-Fuller critical values.
Variants: include a constant (drift), or a constant and a trend. Match the specification to the behavior of the series.
KPSS test reverses the null: H0 is stationarity. Using both ADF and KPSS together gives more confidence in the conclusion.

Differencing transforms a non-stationary series into a stationary one.

First difference: ΔYt = Yt − Yt−1.
  • If Yt is a random walk, then ΔYt = et, white noise, stationary.
  • We say Yt is integrated of order 1, or I(1).
  • GDP in levels is I(1), GDP growth is I(0) (stationary).
Log-differencing.
  • Δ log Yt ≈ (Yt − Yt−1) / Yt−1, the growth rate.
  • Stabilizes variance and linearizes multiplicative trends. Preferred for price and output series.
Cost of differencing: you lose the long-run relationship.
  • Differenced regressions identify short-run dynamics only.
  • If two series are cointegrated, differencing is inefficient. Use an error-correction model instead.
Cointegration: two I(1) series that share a common stochastic trend.
Even though Xt and Yt individually wander, the linear combination Yt − βXt is stationary. They are “tied together” in the long run.
Examples: consumption and income, prices in different markets (law of one price), money supply and price level.
Testing: Engle-Granger two-step, regress Yt on Xt in levels, then test the residuals for a unit root. If the residuals are stationary, the series are cointegrated.
Error-correction model (ECM): if cointegrated, short-run dynamics plus error-correction term:
ΔYt = α(Yt−1 − βXt−1) + γΔXt + et
α < 0 is the speed of adjustment back to the long-run equilibrium.
When errors are autocorrelated, use HAC (heteroskedasticity and autocorrelation consistent) standard errors.
Newey and West (1987) proposed a covariance estimator that is robust to both heteroskedasticity and serial correlation of unknown form:
V̂HAC = (XᵀX)−1 Ŝ (XᵀX)−1
where Ŝ is the Newey-West long-run variance estimate, which uses a kernel (Bartlett weights) to down-weight distant lags. The number of lags included is the bandwidth.
Bandwidth choice: a common rule of thumb is m = 0.75·T^(1/3). Too few lags: SEs remain too small. Too many: the variance estimate becomes noisy.
HAC SEs are the time-series counterpart of HC robust SEs in cross-section. Use them by default whenever you are unsure whether errors are serially correlated.

Distributed lag models capture delayed effects of X on Y.

Finite distributed lag (FDL) model.
  • Yt = α + β0Xt + β1Xt−1 + … + βqXt−q + ut
  • β0 = impact multiplier: immediate effect of a unit increase in X.
  • β0 + … + βq = long-run multiplier: total cumulative effect after q periods.
Autoregressive distributed lag (ARDL) model.
  • Adds lags of Y on the right-hand side. Parsimonious way to capture rich dynamics.
  • Nests the FDL model and the error-correction model as special cases.
Multicollinearity in lag models.
  • Lags of the same variable are highly correlated. Individual β estimates can be imprecise even when the long-run sum is well-estimated.

Controlling for trends and seasonality.

Linear time trend.
  • Include t as a regressor. Removes a deterministic linear trend from both Y and X.
  • Equivalent to Frisch-Waugh: regressing detrended Y on detrended X.
Quadratic or log trend.
  • Include t2 or log(t) for series with non-linear trends.
  • But trends can also be stochastic (unit root) rather than deterministic. Distinguish by unit root testing.
Seasonal dummies.
  • Include Q − 1 dummy variables for quarterly data (or 11 dummies for monthly).
  • Alternatively, use seasonally adjusted data from official sources (BLS, BEA).
Failing to detrend leads to spurious correlations.
  • Two trending series look correlated even when the detrended residuals are not.

Forecasting with time series models.

One-step-ahead forecast from AR(1).
  • ŶT+1|T = ρ̂YT. Forecast uncertainty grows as the horizon extends.
Model selection for forecasting.
  • Use AIC (Akaike) or BIC (Schwarz/Bayesian) to choose lag length.
  • BIC penalizes extra parameters more heavily and tends to select more parsimonious models.
Evaluate forecast performance out-of-sample.
  • In-sample fit is not forecast accuracy. A complex model can overfit.
  • Use RMSE or MAE on a held-out test set. A common benchmark is the random-walk forecast: ŶT+1 = YT.
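A minimal sketch of an out-of-sample comparison against the random-walk benchmark, re-estimating the AR(1) coefficient at each forecast origin so no future data leaks in (simulated series, split point, and seed are illustrative):

```python
import numpy as np

# Simulate a stationary AR(1) with rho = 0.5 (illustrative).
rng = np.random.default_rng(7)
T = 500
y = np.zeros(T)
for t in range(1, T):
    y[t] = 0.5 * y[t - 1] + rng.standard_normal()

# One-step-ahead forecasts over a held-out window, time-ordered (no shuffling).
split = 300
err_ar, err_rw = [], []
for t in range(split, T - 1):
    hist = y[: t + 1]
    rho_hat = hist[:-1] @ hist[1:] / (hist[:-1] @ hist[:-1])
    err_ar.append(y[t + 1] - rho_hat * y[t])  # AR(1) forecast error
    err_rw.append(y[t + 1] - y[t])            # random-walk benchmark error

rmse_ar = float(np.sqrt(np.mean(np.square(err_ar))))
rmse_rw = float(np.sqrt(np.mean(np.square(err_rw))))
print(rmse_ar, rmse_rw)  # AR(1) should beat the benchmark on this DGP
```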
Diebold-Mariano test.
  • Tests whether two forecasting models have equal predictive accuracy based on their forecast error sequences.

Common mistakes in time series analysis.

Regressing non-stationary series in levels without testing.
  • High R2 and significant t-stats can be entirely spurious. Run ADF first.
Using OLS standard errors with autocorrelated errors.
  • Even a small autocorrelation in errors can severely distort t-statistics. Use HAC SEs.
Differencing away cointegrated series.
  • If two series are cointegrated, differencing discards the long-run relationship. Use ECM.
Evaluating forecast models in-sample.
  • Always assess forecast accuracy on data not used in estimation. Time-ordered splitting matters, never shuffle time series data.
Time series introduces new hazards: autocorrelation, non-stationarity, and spurious regression.
  • Autocorrelated errors do not bias OLS, but they invalidate standard errors. Use HAC (Newey-West) SEs.
  • Non-stationary (I(1)) series must be differenced or tested for cointegration before regression.
  • Always run an ADF test before regressing trending series on each other.