Time series forecasting in 2026 has a richer toolkit than ever: classical methods remain strong baselines, ML approaches dominate many business forecasts, and neural foundation models enable zero-shot forecasting for new series. Method selection depends on data characteristics. This post offers practical guidance on which method fits which problem.
Classical methods
ARIMA / SARIMA. AutoRegressive Integrated Moving Average. Mature, well-understood, strong baselines.
Exponential smoothing (ETS). Trend and seasonality via weighted averages. Simple and effective.
State-space models. Kalman filter and extensions. Handle missing data, incorporate external covariates.
When they win. Few series, long history, clear seasonality. This is the traditional business forecasting use case.
Libraries. statsmodels (Python), forecast (R). Mature, battle-tested.
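To make the "weighted averages" idea behind exponential smoothing concrete, here is a minimal pure-Python sketch of simple exponential smoothing (level only, no trend or seasonality). The function name, the choice to initialise the level with the first observation, and the sample numbers are illustrative assumptions; a library such as statsmodels would normally estimate alpha from the data rather than take it as an argument.

```python
from typing import List, Sequence


def simple_exp_smoothing(y: Sequence[float], alpha: float, horizon: int) -> List[float]:
    """Flat forecast from simple exponential smoothing (no trend, no seasonality)."""
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    if not y:
        raise ValueError("y must be non-empty")
    level = y[0]  # assumption: initialise the level with the first observation
    for obs in y[1:]:
        # New level is a weighted average of the latest observation and the old level,
        # so older observations receive exponentially decaying weight.
        level = alpha * obs + (1.0 - alpha) * level
    # Without trend or seasonality, every future step gets the same value.
    return [level] * horizon


history = [10.0, 12.0, 11.0, 13.0]
forecast = simple_exp_smoothing(history, alpha=0.5, horizon=3)
print(forecast)
```

A larger alpha reacts faster to recent changes; a smaller alpha smooths more heavily. ETS extends the same recursion with trend and seasonal components.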
ML-based methods
Gradient boosting (LightGBM, XGBoost). Feature-engineering heavy: lags, rolling stats, calendar features, categorical encodings.
Multi-series efficiency. Train one global model across thousands of series, with series-level and hierarchy-level identifiers (for example store, category, region) as features. Much more data-efficient than fitting a classical model per series.
Non-linear relationships. Capture interactions classical methods miss.
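M4 and M5 competitions. Gradient boosting (LightGBM in particular) dominated the top of the M5 leaderboard. M4 was won by a hybrid of exponential smoothing and a recurrent network, with a combination of statistical methods weighted by gradient boosting in second place; pure ML entries did poorly there. ML is well validated for business forecasting with many related series, less so for isolated univariate series.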
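The lag and rolling-window features that gradient boosting relies on can be sketched in plain Python. This is an illustration under stated assumptions, not a production pipeline: the function name, the feature names, and the toy series are hypothetical, and in practice the rows from many series would be stacked (with a series identifier column) before being passed to a model such as LightGBM.

```python
from statistics import mean
from typing import Dict, List, Sequence


def make_lag_features(
    y: Sequence[float], lags: Sequence[int], window: int
) -> List[Dict[str, float]]:
    """Turn one series into supervised-learning rows of lag and rolling-mean features."""
    rows: List[Dict[str, float]] = []
    # Skip the first few points, where not every lag or the full window is available.
    start = max(max(lags), window)
    for t in range(start, len(y)):
        row = {f"lag_{k}": y[t - k] for k in lags}
        # The rolling mean uses only values strictly before t, so the target never leaks in.
        row[f"roll_mean_{window}"] = mean(y[t - window:t])
        row["target"] = y[t]
        rows.append(row)
    return rows


series = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0]
rows = make_lag_features(series, lags=[1, 2], window=3)
for row in rows:
    print(row)
```

The key design point is that every feature at time t is computed from observations before t; a rolling statistic that includes y[t] itself is a common source of leakage.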
Neural and foundation models
N-BEATS, N-HiTS. Specialized neural architectures for time series. Strong on univariate forecasting.
Temporal Fusion Transformer (TFT). Handles multivariate, external covariates, interpretability features.
PatchTST, iTransformer. Transformer variants adapted for time series.
Chronos (Amazon), TimeGPT (Nixtla), Moirai (Salesforce). Foundation models — pretrained on many series; zero-shot forecasting possible.
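When they win. Cold-start scenarios with little history (zero-shot models need no training on the new series, but they still condition on a context window of recent observations); complex multivariate; non-stationary patterns.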
Method selection
Few series (1-100), long history (years). Classical methods often win. Simple, interpretable.
Many series (1000s), rich features. ML-based (LightGBM) typically best. Efficient at scale.
Cold-start, little history. Foundation models viable. TimeGPT, Chronos for zero-shot forecasts from a short context window.
Production constraints. Classical and ML easier to serve; foundation models require more infrastructure.
Always benchmark. Run multiple methods on your data; pick what works.
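As a rough illustration of "always benchmark", the sketch below holds out the last few points of a series and scores two simple baselines on them. The helper names, the assumed seasonal period of 4, and the toy data are all hypothetical; any candidate model that maps (training history, horizon) to a list of forecasts could be added to the same dictionary.

```python
from typing import Callable, Dict, List, Sequence

Forecaster = Callable[[Sequence[float], int], List[float]]


def naive(train: Sequence[float], horizon: int) -> List[float]:
    """Repeat the last observed value."""
    return [train[-1]] * horizon


def seasonal_naive(train: Sequence[float], horizon: int, season: int = 4) -> List[float]:
    """Repeat the value from one season earlier (season=4 is an assumed period)."""
    return [train[-season + (h % season)] for h in range(horizon)]


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Mean absolute error."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def benchmark(
    y: Sequence[float], horizon: int, methods: Dict[str, Forecaster]
) -> Dict[str, float]:
    """Hold out the last `horizon` points and score every method on them."""
    train, test = y[:-horizon], y[-horizon:]
    return {name: mae(test, method(train, horizon)) for name, method in methods.items()}


# Toy series with a clear period-4 pattern and a slow upward drift.
series = [10.0, 20.0, 30.0, 40.0, 11.0, 21.0, 31.0, 41.0, 12.0, 22.0, 32.0, 42.0]
scores = benchmark(series, horizon=4, methods={"naive": naive, "seasonal_naive": seasonal_naive})
best = min(scores, key=scores.get)
print(scores, best)
```

A single holdout like this is only a first check. Any method worth deploying should at least beat these baselines, ideally across several backtest windows.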
Practical workflows
Data preparation. Handle missing values; detect outliers; ensure consistent frequency.
Feature engineering (for ML). Lags, rolling stats, calendar features, holiday indicators, external regressors.
Train/test split. Time-based, not random. Respect temporal order.
Cross-validation. Expanding window or sliding window CV. Standard random CV is inappropriate for time series because it lets the model train on observations that come after the ones it is tested on.
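The time-based splitting described above can be sketched as a small index generator. The function name and parameters are illustrative assumptions; scikit-learn's TimeSeriesSplit offers a comparable ready-made splitter. Leaving max_train_size unset gives an expanding window, and setting it gives a sliding window.

```python
from typing import Iterator, List, Optional, Tuple


def time_series_splits(
    n_obs: int,
    initial_train: int,
    horizon: int,
    step: Optional[int] = None,
    max_train_size: Optional[int] = None,
) -> Iterator[Tuple[List[int], List[int]]]:
    """Yield (train_indices, test_indices) pairs that respect temporal order."""
    step = horizon if step is None else step
    train_end = initial_train
    while train_end + horizon <= n_obs:
        # Expanding window starts at 0; sliding window drops the oldest points.
        train_start = 0 if max_train_size is None else max(0, train_end - max_train_size)
        train_idx = list(range(train_start, train_end))
        test_idx = list(range(train_end, train_end + horizon))
        yield train_idx, test_idx
        train_end += step


expanding = list(time_series_splits(n_obs=10, initial_train=4, horizon=2))
sliding = list(time_series_splits(n_obs=10, initial_train=4, horizon=2, max_train_size=4))
for train_idx, test_idx in expanding:
    print(train_idx, test_idx)
```

Every test fold lies strictly after its training fold, which is the property random K-fold CV breaks.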
Evaluation
MAPE (Mean Absolute Percentage Error). Interpretable; problems with near-zero actuals.
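RMSE, MAE. Standard regression metrics applied to forecasts. Scale-dependent, so not comparable across series of different magnitude.
MASE (Mean Absolute Scaled Error). Scale-independent; divides the forecast MAE by the in-sample MAE of a naive (or seasonal naive) forecast, so values below 1 beat that baseline.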
Quantile loss. For probabilistic forecasts; evaluates distribution predictions.
Business metrics. Inventory cost, stockout frequency, revenue impact. Technical metrics inform; business metrics validate.
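The technical metrics above are short enough to write out directly. The following is a minimal sketch with hypothetical function names and toy numbers; it assumes equal-length inputs and, for MAPE, refuses zero actuals rather than returning an infinite value.

```python
from math import sqrt
from typing import Sequence


def mae(actual: Sequence[float], predicted: Sequence[float]) -> float:
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def rmse(actual: Sequence[float], predicted: Sequence[float]) -> float:
    return sqrt(sum((a - p) ** 2 for a, p in zip(actual, predicted)) / len(actual))


def mape(actual: Sequence[float], predicted: Sequence[float]) -> float:
    """Percentage error; undefined when any actual value is zero."""
    if any(a == 0 for a in actual):
        raise ValueError("MAPE is undefined for zero actuals")
    return 100.0 * sum(abs((a - p) / a) for a, p in zip(actual, predicted)) / len(actual)


def mase(actual, predicted, train: Sequence[float], season: int = 1) -> float:
    """Forecast MAE scaled by the in-sample MAE of a (seasonal) naive forecast."""
    naive_errors = [abs(train[t] - train[t - season]) for t in range(season, len(train))]
    scale = sum(naive_errors) / len(naive_errors)
    return mae(actual, predicted) / scale


def pinball_loss(actual, predicted_quantile, q: float) -> float:
    """Quantile loss for a forecast of the q-th quantile (0 < q < 1)."""
    losses = [max(q * (a - p), (q - 1.0) * (a - p)) for a, p in zip(actual, predicted_quantile)]
    return sum(losses) / len(losses)


train = [10.0, 12.0, 14.0, 16.0]
actual = [18.0, 20.0]
predicted = [17.0, 22.0]
results = {
    "mae": mae(actual, predicted),
    "rmse": rmse(actual, predicted),
    "mape": mape(actual, predicted),
    "mase": mase(actual, predicted, train),
    "pinball_0.9": pinball_loss(actual, predicted, q=0.9),
}
print(results)
```

Pinball loss is asymmetric on purpose: for q = 0.9, under-forecasting costs nine times as much per unit as over-forecasting, which is what pulls the prediction toward the 90th percentile.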
Probabilistic forecasts
Point forecasts are often insufficient for decision-making; many decisions need the predictive distribution, not just its center.
Quantile regression, Bayesian methods, conformal prediction. Various paths to uncertainty.
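Prediction intervals. What range should contain the actual value 80% or 95% of the time? Note these are prediction intervals for future observations, not confidence intervals for a model parameter.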
Required for inventory optimization, risk management, capacity planning.
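Of the paths to uncertainty listed above, split conformal prediction is the simplest to sketch: take absolute errors on a held-out calibration set, pick a finite-sample-corrected quantile of them, and use it as a symmetric half-width around any new point forecast. The function names and numbers below are hypothetical. The coverage guarantee assumes exchangeable errors, which time series often violate, so treat this as an approximation rather than a guarantee.

```python
from math import ceil
from typing import Sequence, Tuple


def conformal_half_width(
    cal_actual: Sequence[float], cal_predicted: Sequence[float], coverage: float
) -> float:
    """Half-width of a split-conformal interval from calibration residuals."""
    scores = sorted(abs(a - p) for a, p in zip(cal_actual, cal_predicted))
    n = len(scores)
    # Finite-sample correction: use rank ceil((n + 1) * coverage) instead of n * coverage.
    rank = ceil((n + 1) * coverage)
    if rank > n:
        raise ValueError("not enough calibration points for this coverage level")
    return scores[rank - 1]


def prediction_interval(point_forecast: float, half_width: float) -> Tuple[float, float]:
    """Symmetric interval around a point forecast."""
    return point_forecast - half_width, point_forecast + half_width


# Toy calibration set: forecasts of 100 with absolute errors 1, 2, ..., 9.
cal_predicted = [100.0] * 9
cal_actual = [100.0 + e for e in range(1, 10)]

half_width = conformal_half_width(cal_actual, cal_predicted, coverage=0.8)
interval = prediction_interval(50.0, half_width)
print(half_width, interval)
```

With only nine calibration points, a 95% interval is not attainable and the function raises an error; higher coverage levels need more calibration data.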
Tools
Darts (Unit8); StatsForecast and NeuralForecast (Nixtla). Python libraries with unified APIs across many methods.
Prophet (Meta, formerly Facebook). Popular, especially for business forecasting.
sktime. Scikit-learn-compatible time series library.
Commercial. Amazon Forecast (closed to new customers since mid-2024), Google Vertex AI Forecasting, Azure ML Forecasting.