The past is often the best predictor of the future. AR models formalize this intuition by treating previous data points as regression features.
1. Regression on Self
An Autoregressive (AR) model predicts the current value of a series as a weighted sum of its own previous values. The formula is $Y(t) = \beta_0 + \beta_1 Y(t-1) + \dots + \beta_p Y(t-p) + \epsilon$, where $p$ is the number of lags (the order of the model), the $\beta$ values are the weights to be estimated, and $\epsilon$ is a random error term. This is essentially linear regression, where the 'features' are simply lagged versions of the target itself. This makes AR models well suited to data that exhibits Momentum or Mean Reversion.
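To make the "regression on lagged values" idea concrete, here is a minimal sketch that fits an AR(p) model with ordinary least squares using only NumPy. The helper name `fit_ar` and the simulated AR(2) series (with weights 0.6 and -0.3 and intercept 1.0) are assumptions chosen for illustration, not part of any library.

```python
import numpy as np

def fit_ar(y, p):
    """Estimate AR(p) weights by least squares on lagged copies of y."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    # Column k holds Y(t-k) for every t from p to n-1.
    lags = np.column_stack([y[p - k : n - k] for k in range(1, p + 1)])
    X = np.column_stack([np.ones(n - p), lags])  # intercept + lag features
    target = y[p:]                               # Y(t)
    beta, *_ = np.linalg.lstsq(X, target, rcond=None)
    return beta  # [beta_0, beta_1, ..., beta_p]

# Simulate a hypothetical AR(2) series with known weights.
rng = np.random.default_rng(0)
n = 5000
y = np.zeros(n)
for t in range(2, n):
    y[t] = 1.0 + 0.6 * y[t - 1] - 0.3 * y[t - 2] + rng.normal()

beta = fit_ar(y, p=2)
print("estimated [beta_0, beta_1, beta_2]:", np.round(beta, 3))
```

With a few thousand observations, the estimated weights should land close to the values used in the simulation, which is a quick sanity check that the lagged feature matrix is aligned correctly.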
2. The Stationary Standard
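Statistical models like AR assume that the rules governing the data don't change over time. This is called Stationarity. A stationary series has a constant mean and a constant variance, and the correlation between two points depends only on how far apart they are, not on when they occur. If your data has a trend (its level drifts up or down) or seasonality (a pattern repeats at a fixed period), it is Non-Stationary. You usually need to 'transform' it first, most commonly by Differencing (subtracting the previous value from the current one, $Y(t) - Y(t-1)$), so that it is approximately stationary before the AR model can work correctly.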
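As a rough illustration of differencing, the sketch below builds a hypothetical series with a linear trend and compares the mean of its first and second halves before and after taking first differences. The series, the slope of 0.5, and the helper `half_means` are assumptions made up for this example; comparing half-sample means is only an informal check, not a formal stationarity test.

```python
import numpy as np

rng = np.random.default_rng(1)
t = np.arange(400)
# Hypothetical series: linear trend plus noise, so the mean drifts upward.
trended = 0.5 * t + rng.normal(scale=1.0, size=t.size)

# First difference: D(t) = Y(t) - Y(t-1). One observation is lost.
diffed = np.diff(trended)

def half_means(x):
    """Return the mean of the first half and of the second half of x."""
    mid = len(x) // 2
    return x[:mid].mean(), x[mid:].mean()

raw_first, raw_second = half_means(trended)
diff_first, diff_second = half_means(diffed)
print(f"raw series:  first half {raw_first:.2f}, second half {raw_second:.2f}")
print(f"differenced: first half {diff_first:.2f}, second half {diff_second:.2f}")
```

The raw series has very different means in its two halves, while the differenced series hovers around the slope of the trend in both. Note that forecasts made on the differenced series have to be cumulatively summed to get back to the original scale.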
3. Finding 'p' with PACF
How many lags should you use? A common guide is the Partial Autocorrelation Function (PACF) plot. Unlike a standard autocorrelation plot, the PACF shows the correlation between $Y(t)$ and $Y(t-k)$ *after removing the influence of all intermediate lags*. If the PACF 'cuts off' after 3 lags (the values at lag 4 and beyond are not distinguishable from zero), it suggests that an AR(3) model is a reasonable choice. This helps you avoid adding redundant lag features that would overfit the model.
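One way to see what the PACF measures is to compute it by hand: the partial autocorrelation at lag k can be estimated as the weight on $Y(t-k)$ when regressing $Y(t)$ on lags 1 through k. The sketch below does this with NumPy on a simulated AR(3) series. The helper name `pacf_ols`, the simulated weights, and the approximate 95% band of $1.96/\sqrt{n}$ are assumptions for illustration; in practice a library routine such as `plot_pacf` from statsmodels is the more usual route.

```python
import numpy as np

def pacf_ols(y, max_lag):
    """Estimate the PACF at lags 1..max_lag via successive regressions."""
    y = np.asarray(y, dtype=float)
    n = len(y)
    values = []
    for k in range(1, max_lag + 1):
        # Regress Y(t) on Y(t-1), ..., Y(t-k) plus an intercept.
        lags = np.column_stack([y[k - j : n - j] for j in range(1, k + 1)])
        X = np.column_stack([np.ones(n - k), lags])
        beta, *_ = np.linalg.lstsq(X, y[k:], rcond=None)
        values.append(beta[-1])  # weight on Y(t-k) is the lag-k PACF
    return np.array(values)

# Simulate a hypothetical AR(3) series, so the PACF should cut off after lag 3.
rng = np.random.default_rng(2)
n = 5000
y = np.zeros(n)
for t in range(3, n):
    y[t] = 0.4 * y[t - 1] - 0.3 * y[t - 2] + 0.2 * y[t - 3] + rng.normal()

pacf = pacf_ols(y, max_lag=6)
band = 1.96 / np.sqrt(n)  # rough 95% band under the no-correlation hypothesis
for lag, value in enumerate(pacf, start=1):
    label = "outside band" if abs(value) > band else "inside band"
    print(f"lag {lag}: {value:+.3f} ({label})")
```

For this simulated series, lags 1 to 3 should sit clearly outside the band and later lags should stay near zero, although any single lag can cross the band by chance, so the cut-off is a guide rather than a rule.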
