The foundation of statistical inference in time series analysis is the concept of weak stationarity.
A stationary series is:
roughly horizontal
constant variance
no patterns predictable in the long term
Are these financial time series stationary?
```r
library(tidyverse)
library(tidyquant)
library(fpp2)

tsfe::indices %>%
  select(date, `RUSSELL 2000 - PRICE INDEX`) %>%
  rename(r2000 = `RUSSELL 2000 - PRICE INDEX`) %>%
  drop_na() %>%
  tq_transmute(select = r2000, mutate_fun = periodReturn, type = 'log') -> monthly_r2002r

ts(monthly_r2002r$monthly.returns, start = c(1988, 1)) -> r2000r_m_ts

autoplot(r2000r_m_ts) +
  ylab("Log returns") +
  xlab("Year") +
  labs(title = "Figure 2: Monthly log returns of the Russell 2000 Price Index",
       subtitle = "from March 1988 to December 2019")
```
```r
autoplot(tsfe::carnival_eps_ts) +
  xlab("Year") +
  ylab("Earnings") +
  labs(title = "Figure 3:",
       subtitle = "Quarterly earnings per share for Carnival Plc from the first quarter of 1994 to the fourth quarter of 2019")
```
Inference and stationarity
The monthly log returns of the Russell 2000 index vary around zero over time.
If we divide up the data into subperiods we would expect each sample mean to be roughly zero.
Furthermore, except for the recent financial crisis (2007-2009), the range of the log returns is approximately \([-0.2, 0.2]\).
Statistically, the mean and the variance are constant over time, that is, they are time invariant.
Put together, these two time-invariant properties characterise a weakly stationary series.
Weak stationarity and prediction
Weak form stationarity provides a basic framework for prediction.
For the monthly log returns of the Russell 2000 we can predict with reasonable confidence:
Future monthly returns \(\approx 0\), varying within \([-0.2, 0.2]\)
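A minimal sanity check of these heuristics, assuming the r2000r_m_ts series created in the earlier chunk:

```r
# Check the weak-stationarity prediction heuristics numerically.
mean(r2000r_m_ts)    # sample mean: roughly zero
range(r2000r_m_ts)   # roughly [-0.2, 0.2], widest during the crisis period
```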
Inference and nonstationarity
Consider quarterly earnings for Carnival Plc.
```r
library(fpp2)
tsfe::carnival_eps_ts |> autoplot()
```
Inference and nonstationarity
If the timespan is divided into subperiods, the sample mean and variance for each period show an increasing pattern.
Earnings are not weakly stationary.
There do exist models and methods for modelling such nonstationary series.
A seasonally differenced series is closer to being stationary.
Any remaining non-stationarity can be removed with a further first difference.
If \(y'_t = y_t - y_{t-12}\) denotes a seasonally differenced series, then the twice-differenced series is
\[y''_t = y'_t - y'_{t-1} = (y_t - y_{t-12}) - (y_{t-1} - y_{t-13}).\]
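A minimal sketch of both differencing steps in R, using the quarterly Carnival EPS series from earlier (so the seasonal lag is 4 rather than the 12 used for monthly data in the formula above):

```r
# Seasonal difference, then a further first difference.
library(forecast)
d_seasonal <- diff(tsfe::carnival_eps_ts, lag = 4)  # quarterly seasonal difference
d_both <- diff(d_seasonal, lag = 1)                 # further first difference
autoplot(d_both)
```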
When both seasonal and first differences are applied, it makes no difference which is done first; the result will be the same.
If seasonality is strong, we recommend that seasonal differencing be done first, because sometimes the resulting series will be stationary and there will be no need for a further first difference.
It is important that if differencing is used, the differences are interpretable.
Interpretation of differencing
first differences are the change between one observation and the next;
seasonal differences are the change from one year to the next.
But taking lag 3 differences for yearly data, for example, results in a model which cannot be sensibly interpreted.
Unit root tests
Statistical tests to determine the required order of differencing
Augmented Dickey Fuller test: null hypothesis is that the data are non-stationary and non-seasonal.
Kwiatkowski-Phillips-Schmidt-Shin (KPSS) test: null hypothesis is that the data are stationary and non-seasonal.
Other tests available for seasonal data.
KPSS test
```r
library(urca)
summary(ur.kpss(ftse_m_ts))
```
```
#######################
# KPSS Unit Root Test #
#######################

Test is of type: mu with 3 lags.

Value of test-statistic is: 0.3983

Critical value for a significance level of:
                10pct  5pct 2.5pct  1pct
critical values 0.347 0.463  0.574 0.739
```
Since the test statistic (0.3983) is below the 5% critical value (0.463), the null hypothesis of stationarity is not rejected at the 5% level, although it would be rejected at the 10% level (critical value 0.347).
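Relatedly, the forecast package wraps repeated KPSS tests in a helper that suggests how many first differences are needed; a minimal sketch:

```r
# ndiffs() applies successive unit root tests (KPSS by default)
# to estimate the number of first differences required.
library(forecast)
ndiffs(ftse_m_ts)
```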
Moving average (MA) models: \[y_t = c + \varepsilon_t + \theta_1\varepsilon_{t-1} + \dots + \theta_q\varepsilon_{t-q},\] where \(\varepsilon_t\) is white noise. This is a multiple regression with past errors as predictors. Don't confuse this with moving average smoothing!
The point estimates are visualised as banded limits at the 80% and 95% forecast levels.
```r
fit %>% forecast(h = 10) %>% autoplot(include = 80)
```
What is the Correct Interpretation of a 95% Confidence Interval for a Population Mean?
A. There is a 95% probability that the population mean falls within the calculated confidence interval.
B. If we were to draw 100 different samples and compute a 95% confidence interval for each sample, we would expect about 95 of these intervals to contain the population mean.
C. The population mean is 95% likely to be the center point of the calculated confidence interval.
D. There is a 95% chance that any given sample mean falls within the calculated confidence interval.
E. The calculated confidence interval captures the range within which 95% of the population data falls.
Understanding ARIMA models
If \(c=0\) and \(d=0\), the long-term forecasts will go to zero.
If \(c=0\) and \(d=1\), the long-term forecasts will go to a non-zero constant.
If \(c=0\) and \(d=2\), the long-term forecasts will follow a straight line.
If \(c\ne0\) and \(d=0\), the long-term forecasts will go to the mean of the data.
If \(c\ne0\) and \(d=1\), the long-term forecasts will follow a straight line.
If \(c\ne0\) and \(d=2\), the long-term forecasts will follow a quadratic trend.
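A minimal simulated sketch of two of these cases (the data and orders here are illustrative, not from the lecture):

```r
library(fpp2)
set.seed(123)
y <- arima.sim(list(ar = 0.8), n = 200)                     # stationary, mean zero
fit0 <- Arima(y, order = c(1, 0, 0), include.mean = FALSE)  # c = 0, d = 0
rw <- cumsum(rnorm(200, mean = 0.5))                        # random walk with drift
fit1 <- Arima(rw, order = c(0, 1, 0), include.drift = TRUE) # c != 0, d = 1
autoplot(forecast(fit0, h = 50))   # long-term forecasts decay to zero
autoplot(forecast(fit1, h = 50))   # long-term forecasts follow a straight line
```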
Understanding ARIMA models
Forecast variance and \(d\)
The higher the value of \(d\), the more rapidly the prediction intervals increase in size.
For \(d=0\), the long-term forecast standard deviation will go to the standard deviation of the historical data.
Understanding ARIMA models
Cyclic behaviour
For cyclic forecasts, \(p\ge2\) and some restrictions on coefficients are required.
If \(p=2\), we need \(\phi_1^2+4\phi_2<0\). Then the average length of the stochastic cycles is
\[\frac{2\pi}{\text{arc cos}\left(-\phi_1(1-\phi_2)/(4\phi_2)\right)}.\]
This formula has important uses in estimating business and economic cycles. (See Example 2.3 in Tsay (2010).)
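A minimal helper implementing the formula above (the coefficient values in the call are hypothetical):

```r
# Average stochastic cycle length for an AR(2) with complex roots,
# valid when phi1^2 + 4 * phi2 < 0.
avg_cycle_length <- function(phi1, phi2) {
  stopifnot(phi1^2 + 4 * phi2 < 0)
  2 * pi / acos(-phi1 * (1 - phi2) / (4 * phi2))
}
avg_cycle_length(1.2, -0.7)  # hypothetical coefficients; about 8.3 periods
```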
Model building
Maximum likelihood estimation (MLE)
Having identified the model order, we need to estimate the parameters \(c, \phi_1,\dots,\phi_p, \theta_1,\dots,\theta_q\).
MLE is very similar to least squares estimation, obtained by minimizing \(\sum_{t=1}^T e_t^2\).
The Arima() command allows estimation by MLE or by conditional sum of squares (CSS).
Non-linear optimization must be used in either case.
Different software will give different estimates.
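For example, a minimal sketch (on simulated data) fitting the same model by full MLE and by conditional sum of squares:

```r
library(forecast)
set.seed(42)
y <- arima.sim(list(ar = 0.5, ma = 0.3), n = 200)
fit_ml  <- Arima(y, order = c(1, 0, 1), method = "ML")   # maximum likelihood
fit_css <- Arima(y, order = c(1, 0, 1), method = "CSS")  # conditional sum of squares
coef(fit_ml)
coef(fit_css)   # estimates typically differ slightly between methods
```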
Partial autocorrelations
Partial autocorrelations measure the relationship between \(y_{t}\) and \(y_{t-k}\) when the effects of the intermediate lags \(1, 2, 3, \dots, k-1\) are removed.
For an MA(\(q\)) process:
the PACF dies out in an exponential or damped sine-wave manner;
the ACF has all zero spikes beyond the \(q\)th spike.
So we have an MA(\(q\)) model when
the PACF is exponentially decaying or sinusoidal, and
there is a significant spike at lag \(q\) in the ACF, but none beyond \(q\).
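A simulated sketch of this identification pattern (the MA(2) coefficients here are illustrative):

```r
library(forecast)
set.seed(7)
y <- arima.sim(list(ma = c(0.6, -0.4)), n = 300)  # simulated MA(2)
ggtsdisplay(y)  # expect ACF spikes at lags 1-2 only, with a decaying PACF
```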
Information criteria for model selection
In advanced financial modelling we use information criteria, grounded in information theory, to determine which of our candidate models best balances fit against complexity.
Akaike’s Information Criterion (AIC):
\(\text{AIC} = -2 \log(L) + 2(p+q+k+1),\) where \(L\) is the likelihood of the data, \(k=1\) if \(c\ne0\) and \(k=0\) if \(c=0\).
Corrected AIC: \(\text{AICc} = \text{AIC} + \frac{2(p+q+k+1)(p+q+k+2)}{T-p-q-k-2}.\)
Bayesian Information Criterion: \(\text{BIC} = \text{AIC} + [\log(T)-2](p+q+k+1).\)
Good models are obtained by minimizing the AIC, AICc or BIC. My preference is to use the AICc.
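In the forecast package the AICc of a fitted model is stored on the fit object, so candidate models can be compared directly; a sketch on simulated data:

```r
library(forecast)
set.seed(11)
y <- arima.sim(list(ar = c(0.5, 0.2)), n = 300)
fit_a <- Arima(y, order = c(2, 0, 0))
fit_b <- Arima(y, order = c(1, 0, 1))
c(AR2 = fit_a$aicc, ARMA11 = fit_b$aicc)  # prefer the smaller AICc
```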
Powerful non-stationary model in finance
In financial time series, an important class of non-stationary time series models is the random walk.
A random walk can be defined as \(y_t = y_{t-1} + \varepsilon_t\), or with drift as \(y_t = c + y_{t-1} + \varepsilon_t\).
Simulation 1
\(y_t = 10 + 0.99y_{t-1}+ \varepsilon_t\)
```r
set.seed(1)
autoplot(10 + arima.sim(list(ar = 0.99), n = 100)) +
  ylab("") +
  ggtitle("Is this a random walk with drift?")
```
Simulation 1
```r
set.seed(2)
S0 <- 10
n <- 100
chgs <- rnorm(n - 1, 1.001, 0.01)
rw <- ts(cumprod(c(S0, chgs)))
autoplot(rw) +
  ylab("") +
  ggtitle("Is this a random walk with drift?")
```
AI automation for ARIMA models
auto.arima()
For a non-seasonal ARIMA process we first need to select appropriate orders: \(p,q,d\)
Setting both stepwise and approximation arguments to FALSE will slow the automation down but provides a more exhaustive search for the appropriate model.
The auto.arima function then searches over the possible models within the order constraints, estimating each by MLE and selecting the model with the smallest AICc.
See help(auto.arima) for more details.
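A minimal sketch of the exhaustive search, applied to the Russell 2000 monthly log returns from earlier:

```r
library(forecast)
fit.auto <- auto.arima(r2000r_m_ts, stepwise = FALSE, approximation = FALSE)
summary(fit.auto)
```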
Human Vs Algo: Residual Diagnostics
Human choice
```r
checkresiduals(fit, test = FALSE)
```
Human forecasting
```r
fit %>% forecast(h = 252) %>% autoplot()
```
Algorithm
```r
checkresiduals(fit.auto, test = FALSE)
```
Algo forecasting
```r
fit.auto %>% forecast(h = 252) %>% autoplot()
```
Modelling procedure with Arima
This is sometimes referred to as the Box-Jenkins approach.
Plot the data. Identify any unusual observations.
If necessary, transform the data (using a Box-Cox transformation) to stabilize the variance.
If the data are non-stationary: take first differences of the data until the data are stationary.
Examine the ACF/PACF: Is an AR(p) or MA(q) model appropriate?
Try your chosen model(s), and use the AICc to search for a better model.
Check the residuals from your chosen model by plotting the ACF of the residuals, and doing a portmanteau test of the residuals. If they do not look like white noise, try a modified model.
Once the residuals look like white noise, calculate forecasts.
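A compact sketch of this workflow in R (the candidate order chosen here is illustrative, and the series is the one from the earlier example):

```r
library(fpp2)
y <- r2000r_m_ts                         # series from the earlier chunk
ggtsdisplay(y)                           # plot the data and the ACF/PACF
fit <- Arima(y, order = c(3, 0, 1))      # try a candidate model
checkresiduals(fit)                      # residual ACF plus Ljung-Box test
fit %>% forecast(h = 12) %>% autoplot()  # forecasts once residuals look fine
```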
Modelling procedure with auto.arima
Plot the data. Identify any unusual observations.
If necessary, transform the data (using logs) to stabilize the variance.
Use auto.arima to select a model.
Check the residuals from your chosen model by plotting the ACF of the residuals, and doing a portmanteau test of the residuals. If they do not look like white noise, try a modified model.
Once the residuals look like white noise, calculate forecasts.
The time plot shows sudden changes, particularly big movements in 2007/2008 due to the financial crisis. Otherwise there is nothing unusual and no need for data adjustments.
Little evidence of changing variance, so no log transformation needed.
Data are clearly stationary, so no differencing required.
Project based example
```r
ggtsdisplay(r2000r_m_ts)
```
The PACF is suggestive of an AR(5), so the initial candidate model is an ARIMA(5,0,0). There are no other obvious candidates.
Fit ARIMA(5,0,0) model along with variations: ARIMA(4,0,0), ARIMA(3,0,0), ARIMA(4,0,1), etc. ARIMA(3,0,1) has smallest AICc value.
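A sketch of the preferred fit (assuming the r2000r_m_ts series from earlier); the summary output reports the coefficients and the standard errors quoted next:

```r
library(forecast)
fit <- Arima(r2000r_m_ts, order = c(3, 0, 1))
summary(fit)   # coefficients, standard errors, and AICc
```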
The standard errors are 0.13, 0.06, 0.05, 0.12 and 0.002, respectively.
This suggests that only the AR1 coefficient and the constant (mean) are more than 2 SEs away from zero and thus statistically significant.
The significance of \(\phi_0\) in this entertained model implies that the expected mean return of the series is positive.
In fact, \(\hat{\mu}=0.006/(1-(1.047-0.094-0.003)) = 0.12\), which is small but has long-term implications.
Using the multi-period return definition from the financial data lecture, an annualised log return is simply \(\sum_1^{12} y_t \approx 1.44\) per annum.
Frequentist prediction intervals
\[\hat{y}_{T+h|T} \pm 1.96\sqrt{v_{T+h|T}}\] where \(v_{T+h|T}\) is estimated forecast variance.
\(v_{T+1|T}=\hat{\sigma}^2\) for all ARIMA models regardless of parameters and orders.
Multi-step prediction intervals for ARIMA(0,0,\(q\)):
\[\hat{y}_{T+h|T} \pm 1.96\sqrt{\hat{\sigma}^2\left[1+\sum_{i=1}^{h-1}\theta_i^2\right]}, \quad \text{for } h=2,3,\dots\]
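In practice these intervals come directly from forecast(); a minimal sketch for an illustrative MA(2) fit to the Russell 2000 returns:

```r
library(forecast)
fit_ma <- Arima(r2000r_m_ts, order = c(0, 0, 2))  # illustrative MA(2)
forecast(fit_ma, h = 5, level = c(80, 95))        # multi-step prediction intervals
```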