ARIMA and SARIMA Models

ARIMA and SARIMA: Classical Forecasting

Before jumping into complex deep learning models like LSTMs or Transformers, a data scientist must understand the statistical foundations. ARIMA and SARIMA provide interpretable, robust baselines for univariate time series forecasting.

The Foundation: Stationarity

ARIMA-type models assume the series is stationary, meaning its mean, variance, and autocorrelation structure do not change over time. If a stock price drifts steadily upward (a trend), its mean is changing, so the series is not stationary. The Integrated (I) component of ARIMA handles this by differencing the data, usually once or twice, until it is stationary.
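To make the effect of differencing concrete, here is a minimal sketch using NumPy. The trending series is synthetic and purely illustrative (the slope, noise level, and the helper name `half_means` are assumptions, not part of the original material). It compares the mean of the first and second halves of the series before and after one round of differencing.

```python
import numpy as np

# Hypothetical data: a linear upward trend plus Gaussian noise.
rng = np.random.default_rng(0)
t = np.arange(200)
series = 0.5 * t + rng.normal(scale=1.0, size=t.size)

# First difference: y'_t = y_t - y_{t-1}. This corresponds to d = 1.
diffed = np.diff(series, n=1)

def half_means(x):
    """Return the mean of the first half and of the second half of x."""
    mid = len(x) // 2
    return x[:mid].mean(), x[mid:].mean()

raw_first, raw_second = half_means(series)
diff_first, diff_second = half_means(diffed)

print("raw series, mean of each half:        ", raw_first, raw_second)
print("differenced series, mean of each half:", diff_first, diff_second)
```

The raw series has very different means in its two halves, while the differenced series hovers around the slope of the trend in both. Note that each round of differencing shortens the series by one observation.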

Anatomy of ARIMA(p, d, q)

AR (p) - AutoRegressive: Models the current observation as a linear combination of its p most recent lagged values.
I (d) - Integrated: Differences the raw observations d times to make the time series stationary.
MA (q) - Moving Average: Models the current observation as a linear combination of the q most recent forecast errors (residuals).

Adding Seasonality: SARIMA

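Real-world data often has cycles: retail sales spike in December, electricity usage peaks in summer. Plain ARIMA has no terms for such repeating patterns. SARIMA extends ARIMA by adding four seasonal parameters, (P, D, Q, s), where P, D, and Q are the seasonal counterparts of p, d, and q, and s is the number of observations per seasonal cycle (for example, 12 for monthly data with a yearly pattern).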

Frequently Asked Questions

How do I determine p and q in ARIMA?

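Use the Autocorrelation Function (ACF) and Partial Autocorrelation Function (PACF) plots of the stationary (differenced) series. For a pure AR process, the PACF cuts off after lag p while the ACF tails off gradually, so the PACF cutoff suggests the AR term (p). For a pure MA process, the ACF cuts off after lag q while the PACF tails off, so the ACF cutoff suggests the MA term (q). When both plots tail off, the process is likely mixed, and candidate orders are best compared by fitting several models.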

What is the Augmented Dickey-Fuller (ADF) test?

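The ADF test is a statistical test used to determine whether a given time series is stationary. The null hypothesis states that the series has a unit root (is non-stationary). If the p-value is less than a chosen threshold (e.g., 0.05), we reject the null hypothesis and conclude that the series is stationary. A p-value above the threshold does not prove non-stationarity; it only means the test found insufficient evidence against a unit root.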

SARIMA vs ARIMA: When to use which?

Use standard ARIMA when your data exhibits trends but no repeating cyclical patterns. Use SARIMA when your data has clear periodic fluctuations (like daily temperature or quarterly earnings) because it explicitly models the seasonal lag interactions.

Modeler's Lexicon

Stationarity

A property where the statistical characteristics of a series (mean, variance, autocorrelation) are constant over time.

Differencing

Subtracting the previous observation from the current observation to stabilize the mean of a time series.

ACF (Autocorrelation)

Measures the correlation between a time series and a lagged copy of itself, as a function of the lag.

PACF (Partial Autocorr.)

Measures the correlation between a time series and its lagged version, controlling for the values of the intermediate lags.

statsmodels

The primary Python library used for estimating classical statistical models, including ARIMA and SARIMA.
