Introduction
ARIMA, which stands for AutoRegressive Integrated Moving Average, is a powerful model for forecasting time series data with a complex nature. Unlike basic regression models, which may struggle to capture that complexity with just a few variables (usually because of autocorrelation, trend and seasonality), ARIMA handles autocorrelation, trend and seasonality directly, with trend removed through built-in differencing (the "Integrated" part of ARIMA); this removes the need to specify explanatory variables or apply extra transformations. Examples of application include finance, weather prediction, and even anticipating website traffic.
Define the model
Let's build the ARIMA model from scratch. As discussed, the ARIMA model is made of three parts: an autoregressive part, an integrated part and a moving average part.
For example, given some time series $X_t$, the AutoRegressive (AR) component says that an observation at a certain point in time $t$ can be described as a linear combination of its lagged observations (i.e. prior time points):

$$X_t = c + \varphi_1 X_{t-1} + \varphi_2 X_{t-2} + \dots + \varphi_p X_{t-p} + \varepsilon_t \tag{1}$$

where $\varphi_1, \dots, \varphi_p$ are the parameters or coefficients of such a regression, $c$ is a constant, and $\varepsilon_t$ is an error term.
(Note: if you find the terms linear combination and linear regression confusing, a linear regression is a model that outputs predictions based on a linear combination of input features.)
Using the lag operator $L$ (where $L^i X_t = X_{t-i}$), you can also express the above as:

$$X_t = c + \sum_{i=1}^{p} \varphi_i L^i X_t + \varepsilon_t$$

Part of optimising the AR component of an ARIMA model is choosing the number of previous terms, that is, the number of time lags to be included in the linear combination. We call $p$ the order of the autoregressive model.
For example, when we say the AR part has an order of two, we symbolise it as $\text{AR}(2)$, and express it as:

$$X_t = c + \varphi_1 X_{t-1} + \varphi_2 X_{t-2} + \varepsilon_t$$
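As an illustration, the AR(2) recursion can be sketched in a few lines of NumPy. This is only a sketch: the coefficients and noise scale below are made up for demonstration, not fitted to any data.

```python
import numpy as np

# Sketch of an AR(2) process: X_t = c + phi1*X_{t-1} + phi2*X_{t-2} + eps_t
# (c, phi1, phi2 and the noise scale are illustrative, not fitted values)
rng = np.random.default_rng(42)
c, phi1, phi2 = 0.5, 0.6, -0.2

n = 200
x = np.zeros(n)
for t in range(2, n):
    x[t] = c + phi1 * x[t - 1] + phi2 * x[t - 2] + rng.normal(scale=0.1)

# One-step-ahead forecast: a linear combination of the last two observations
forecast = c + phi1 * x[-1] + phi2 * x[-2]
```

Note that for the simulated series to be stationary the coefficients must satisfy certain constraints; the values above do.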
Now, equivalently, $X_t$ can also be expressed as a combination of previous error terms. This is described by the Moving Average (MA) part of the ARIMA model:

$$X_t = \mu + \varepsilon_t + \theta_1 \varepsilon_{t-1} + \theta_2 \varepsilon_{t-2} + \dots + \theta_q \varepsilon_{t-q} \tag{2}$$
where:

- $\theta_1, \dots, \theta_q$ are the parameters/coefficients of the linear combination of the error terms
- $\varepsilon_t, \varepsilon_{t-1}, \dots, \varepsilon_{t-q}$ are the error terms at times $t, t-1, \dots, t-q$
- $q$ is the order of this moving average
- $\mu$ is the mean of the series
Note that the concept of moving average here can be misleading: in this context it simply refers to a moving window of the previous errors; it does not take an average of anything.
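To make the "moving window of previous errors" idea concrete, here is a minimal MA(2) simulation in NumPy (the mean and coefficients are illustrative, not fitted values):

```python
import numpy as np

# Sketch of an MA(2) process: X_t = mu + eps_t + theta1*eps_{t-1} + theta2*eps_{t-2}
# (mu, theta1, theta2 are illustrative, not fitted values)
rng = np.random.default_rng(0)
mu, theta1, theta2 = 1.0, 0.4, 0.3

n = 500
eps = rng.normal(size=n)                 # white-noise error terms
x = np.empty(n)
for t in range(n):
    e1 = eps[t - 1] if t >= 1 else 0.0   # window of the two most recent
    e2 = eps[t - 2] if t >= 2 else 0.0   # past errors, not an average
    x[t] = mu + eps[t] + theta1 * e1 + theta2 * e2
```

Each observation depends only on the current error and the two most recent past errors, so the series simply fluctuates around its mean $\mu$.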
Since both (1) and (2) describe $X_t$, we can add them together:

$$X_t = c + \varepsilon_t + \sum_{i=1}^{p} \varphi_i X_{t-i} + \sum_{i=1}^{q} \theta_i \varepsilon_{t-i}$$

This is often expressed in lag-operator form as:

$$\left(1 - \sum_{i=1}^{p} \varphi_i L^i\right) X_t = c + \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t$$
We say that, given time series data $X_t$, where $t$ is an integer index and the $X_t$ are real numbers, an $\text{ARIMA}(p, d, q)$ model is given by:

$$\left(1 - \sum_{i=1}^{p} \varphi_i L^i\right) (1 - L)^d X_t = c + \left(1 + \sum_{i=1}^{q} \theta_i L^i\right) \varepsilon_t$$

where $d$ is the number of times the series is differenced, i.e. the "Integrated" part of the model.
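The $(1 - L)^d$ factor is simply differencing applied $d$ times. The sketch below (the `difference` helper is written here just for illustration) shows how differencing removes a deterministic trend:

```python
import numpy as np

def difference(x, d=1):
    """Apply (1 - L)^d, i.e. take the first difference d times."""
    for _ in range(d):
        x = x[1:] - x[:-1]  # X_t - X_{t-1}
    return x

# A quadratic trend is flattened to a constant by differencing twice
trend = np.arange(10, dtype=float) ** 2
print(difference(trend, d=2))  # -> [2. 2. 2. 2. 2. 2. 2. 2.]
```

This is why the "Integrated" part lets ARIMA handle non-stationary series without extra transformations: $d$ rounds of differencing strip out polynomial trend before the AR and MA parts are applied.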
Why do we need both an AR component and an MA component?
The AR(p) component is a combination of the past p values, and the values of the series depend on their own past as we move along the time series; hence it captures the trend and autocorrelation in the series.
Meanwhile, the MA(q) part is a combination of the past 'q' forecast errors, so it captures the "shock" (unexpected changes) to the model instead. The impact of a shock in the series decreases over time (called "shock decay"). Thus, the effect of an error made at a particular point in time diminishes as we move further away from that point.
In short, the AR(p) part describes the overall, long-term behaviour, whereas the MA(q) part describes the short-term changes. Together, they allow the ARIMA model to capture a wide range of time series patterns.
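One way to see this difference is to trace a single unit shock through an AR(1) and an MA(1) model side by side (the coefficients below are made up for the demonstration):

```python
import numpy as np

# Impulse response: one unit shock at t = 0, zero errors afterwards
phi, theta = 0.6, 0.6   # illustrative AR(1) / MA(1) coefficients
eps = np.zeros(10)
eps[0] = 1.0

ar = np.zeros(10)
ma = np.zeros(10)
for t in range(10):
    ar[t] = (phi * ar[t - 1] if t > 0 else 0.0) + eps[t]
    ma[t] = eps[t] + (theta * eps[t - 1] if t > 0 else 0.0)

# ar decays geometrically: 1.0, 0.6, 0.36, ...  (long memory)
# ma cuts off after lag 1: 1.0, 0.6, 0.0, ...   (short memory)
```

The AR response shrinks by a factor of $\varphi$ each step but never quite vanishes, while the MA response is exactly zero once the shock leaves the q-step window, which is the "shock decay" behaviour described above.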