Anomaly detection serves different purposes: fraud, infrastructure monitoring, quality control, security. Each has different data characteristics and a different tolerance for false positives, so the right technique depends on context. This post surveys the approaches in use in 2026 and offers guidance on which fits which use case.
Statistical approaches
Z-score. How many standard deviations from the mean? Simple, interpretable, fast. Assumes a roughly Gaussian distribution.
IQR (interquartile range). Outliers are points below Q1 - 1.5×IQR or above Q3 + 1.5×IQR. More robust to non-Gaussian data.
MAD (median absolute deviation). Robust alternative to the standard deviation. Less sensitive to the outliers it is meant to detect.
Limitations. Univariate; doesn't handle seasonality well; struggles with complex patterns.
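The three statistical rules above are short enough to sketch with the standard library. This is a minimal illustration, not a reference implementation: the function names, the thresholds (3.0 for the z-score, 3.5 for the MAD-based score), and the sample readings are assumptions chosen for the example. The 0.6745 factor rescales MAD so the score is comparable to a z-score under normality.

```python
import statistics

def zscore_outliers(values, threshold=3.0):
    """Flag points more than `threshold` standard deviations from the mean."""
    mean = statistics.fmean(values)
    std = statistics.pstdev(values)
    if std == 0:
        return [False] * len(values)
    return [abs(v - mean) / std > threshold for v in values]

def iqr_outliers(values, k=1.5):
    """Flag points below Q1 - k*IQR or above Q3 + k*IQR."""
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]

def mad_outliers(values, threshold=3.5):
    """Flag points whose MAD-based robust score exceeds `threshold`."""
    med = statistics.median(values)
    mad = statistics.median(abs(v - med) for v in values)
    if mad == 0:
        return [False] * len(values)
    # 0.6745 makes the score comparable to a z-score for Gaussian data.
    return [0.6745 * abs(v - med) / mad > threshold for v in values]

# Hypothetical sensor readings with one obvious outlier at the end.
data = [10.1, 9.8, 10.3, 10.0, 9.9, 10.2, 10.0, 9.7, 10.1, 25.0]

z_flags = zscore_outliers(data)
iqr_flags = iqr_outliers(data)
mad_flags = mad_outliers(data)
```

On this small sample the z-score rule does not flag the 25.0 reading, because the outlier inflates the standard deviation it is measured against, while the IQR and MAD rules both flag it. That is the robustness difference in practice.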
ML-based approaches
Isolation Forest. Random tree-based approach; anomalies easy to isolate. Fast, handles high-dimensional data.
LOF (Local Outlier Factor). Density-based. Compares point density to neighbors. Good for local anomalies.
Autoencoders. Train to reconstruct normal data; anomalies have high reconstruction error.
Multivariate patterns. Where the statistical approaches above treat each variable in isolation, ML methods capture interactions between features.
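As a rough sketch of the first two methods, the snippet below runs scikit-learn's IsolationForest and LocalOutlierFactor on synthetic two-dimensional data. The data, the four injected points, and the parameter values are assumptions for illustration; real features would need scaling and tuning.

```python
import numpy as np
from sklearn.ensemble import IsolationForest
from sklearn.neighbors import LocalOutlierFactor

# Synthetic data: one dense cluster plus four injected far-away points.
rng = np.random.default_rng(0)
normal = rng.normal(loc=0.0, scale=1.0, size=(200, 2))
injected = np.array([[8.0, 8.0], [-8.0, 8.0], [8.0, -8.0], [-8.0, -8.0]])
X = np.vstack([normal, injected])

# Isolation Forest: anomalies are separated by fewer random splits.
iso = IsolationForest(n_estimators=100, contamination="auto", random_state=0)
iso_labels = iso.fit_predict(X)  # -1 = anomaly, 1 = normal

# LOF: compares each point's local density with that of its neighbors.
lof = LocalOutlierFactor(n_neighbors=20)
lof_labels = lof.fit_predict(X)  # -1 = anomaly, 1 = normal

print("Isolation Forest labels for injected points:", iso_labels[-4:])
print("LOF labels for injected points:", lof_labels[-4:])
```

Both estimators return -1 for points they consider anomalous. With `contamination="auto"` some ordinary points near the edge of the cluster may be flagged as well, which is a small preview of the false positive problem.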
Time series approaches
Prophet (Meta, formerly Facebook). Decomposes into trend, seasonality, holidays. Good baseline for business time series; points outside the forecast's uncertainty interval can be flagged.
STL decomposition. Classic statistical approach; trend + seasonal + residual. Flag anomalies in residual.
Neural time series. Transformer-based models such as PatchTST. Better on complex patterns; more expensive.
Foundation models. Pretrained models such as TimeGPT and Chronos work zero-shot on time series. Useful for cold-start scenarios.
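The "flag anomalies in the residual" idea can be sketched without any library. The snippet below is not STL: it is a simplified stand-in that assumes a known period and no trend, estimates the seasonal profile as the per-phase median, and applies a MAD-based score to what is left. The function name, the weekly profile, and the injected spike are all hypothetical.

```python
import statistics

def seasonal_residual_anomalies(series, period, threshold=3.5):
    """Remove a per-phase median profile, then flag large residuals."""
    # Seasonal profile: median of all observations sharing the same phase.
    profile = [statistics.median(series[phase::period]) for phase in range(period)]
    residuals = [v - profile[i % period] for i, v in enumerate(series)]

    # Robust score on the residuals (MAD instead of standard deviation).
    med = statistics.median(residuals)
    mad = statistics.median(abs(r - med) for r in residuals)
    if mad == 0:
        return [False] * len(series)
    return [0.6745 * abs(r - med) / mad > threshold for r in residuals]

# Hypothetical daily metric: weekday/weekend pattern over 8 weeks,
# with small deterministic noise in the range -2..2.
weekly = [100, 120, 125, 123, 118, 60, 55]
series = [weekly[i % 7] + ((i * 37) % 5 - 2) for i in range(56)]
series[30] += 40  # injected spike

flags = seasonal_residual_anomalies(series, period=7)
anomalous_indices = [i for i, f in enumerate(flags) if f]
```

Here the weekend dips are absorbed by the seasonal profile, so only the injected spike at index 30 stands out. A plain threshold on the raw values would instead react to every weekend.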
Use case fit
Fraud detection. ML-based dominates. Rich features (device, location, behavior patterns); high stakes; imbalanced classes.
Infrastructure monitoring. Time series. Seasonal patterns (workday vs weekend); known baselines; drift over time.
Quality control. Hybrid. Statistical thresholds on known metrics; ML for complex multivariate patterns.
Network security. ML-based; looks for anomalies in packet patterns and flow characteristics.
Financial time series. Classical (GARCH-family) still competitive; neural approaches on specific patterns.
Practical considerations
False positive rate. Anomaly detection often has a high false positive rate. Downstream handling (triage, investigation) must scale.
Labeled data. Rare. Most anomaly detection is unsupervised. Semi-supervised approaches are useful when some labels exist.
Concept drift. What's anomalous changes over time. Models need retraining or adaptation.
Interpretability. You often need to explain why something is anomalous. Statistical and rule-based approaches are more interpretable.
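One simple form of adaptation to concept drift is a baseline computed over a sliding window instead of a fixed training period. The sketch below contrasts the two on a synthetic stream with a level shift; the window size, threshold, and function names are assumptions, and production systems usually need something more careful (for example, excluding flagged points from the baseline).

```python
import statistics
from collections import deque

def static_zscore_flags(stream, train_size, threshold=3.0):
    """Baseline frozen on the first `train_size` points."""
    mean = statistics.fmean(stream[:train_size])
    std = statistics.pstdev(stream[:train_size])
    return [std > 0 and abs(x - mean) / std > threshold for x in stream]

def rolling_zscore_flags(stream, window=20, threshold=3.0, min_history=5):
    """Baseline recomputed from the most recent `window` points."""
    history = deque(maxlen=window)
    flags = []
    for x in stream:
        if len(history) >= min_history:
            mean = statistics.fmean(history)
            std = statistics.pstdev(history)
            flags.append(std > 0 and abs(x - mean) / std > threshold)
        else:
            flags.append(False)  # not enough history to judge yet
        history.append(x)  # the new point joins the baseline either way
    return flags

# Hypothetical metric that shifts from around 10 to around 20 at index 30.
stream = [9 if i % 2 == 0 else 11 for i in range(30)]
stream += [19 if i % 2 == 0 else 21 for i in range(30)]

static_flags = static_zscore_flags(stream, train_size=30)
rolling_flags = rolling_zscore_flags(stream)
```

The frozen baseline keeps flagging every point after the shift, while the rolling baseline flags the change when it happens and then settles on the new level. Which behavior is wanted depends on whether the shift is itself the incident or the new normal.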
Evaluation
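Precision and recall. Precision: what fraction of flagged items are real anomalies? Recall: what fraction of real anomalies are flagged?
ROC AUC. Summarizes the trade-off between sensitivity and specificity across all thresholds. With heavily imbalanced classes, precision-recall curves are often more informative.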
Cost-weighted metrics. False positives and false negatives have different costs. Weight accordingly.
Time to detection. For monitoring, how quickly is an anomaly flagged after it starts?
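Most of these metrics reduce to counting, as the sketch below shows for precision, recall, a cost-weighted score, and time to detection. The labels, the 1:10 cost ratio, and the function names are illustrative assumptions; real costs have to come from the business context.

```python
def confusion_counts(y_true, y_pred):
    """Return (true positives, false positives, false negatives)."""
    tp = sum(1 for t, p in zip(y_true, y_pred) if t and p)
    fp = sum(1 for t, p in zip(y_true, y_pred) if not t and p)
    fn = sum(1 for t, p in zip(y_true, y_pred) if t and not p)
    return tp, fp, fn

def precision_recall(y_true, y_pred):
    tp, fp, fn = confusion_counts(y_true, y_pred)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall

def weighted_cost(y_true, y_pred, fp_cost=1.0, fn_cost=10.0):
    """Total cost when a miss is assumed to be 10x worse than a false alarm."""
    _, fp, fn = confusion_counts(y_true, y_pred)
    return fp * fp_cost + fn * fn_cost

def time_to_detection(y_true, y_pred):
    """Steps from the first true anomaly to the first flag at or after it."""
    start = next((i for i, t in enumerate(y_true) if t), None)
    if start is None:
        return None  # no anomaly in the labels
    for i in range(start, len(y_pred)):
        if y_pred[i]:
            return i - start
    return None  # anomaly never detected

# Hypothetical labels: an incident spans steps 3 to 5.
y_true = [0, 0, 0, 1, 1, 1, 0, 0, 0, 0]
y_pred = [0, 1, 0, 0, 1, 1, 0, 0, 0, 0]

precision, recall = precision_recall(y_true, y_pred)
cost = weighted_cost(y_true, y_pred)
delay = time_to_detection(y_true, y_pred)
```

In this example the detector has one false alarm and misses the first step of the incident, so precision and recall are both 2/3, the weighted cost is 11.0, and the detection delay is one step.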
Tools
scikit-learn. IsolationForest, LocalOutlierFactor, EllipticEnvelope. Baseline tools.
PyOD. Broader anomaly detection library. Many algorithms.
Prophet, statsmodels. Classical time series approaches.
Datadog, Dynatrace, Splunk. Observability platforms with built-in anomaly detection.
Specialized. Fraud detection (Sift, Kount), anti-money laundering (ComplyAdvantage), industrial (Seeq, OSIsoft PI, now part of AVEVA).