[Avg. reading time: 15 minutes]
Anomaly Detection
Anomaly detection and predictive maintenance are important parts of the IoT upper stack. They help analyze device and sensor data to detect unusual behavior early and reduce the chance of equipment failure.
Anomaly Detection in IoT
Anomaly detection identifies data points or patterns that do not match normal system behavior.
In IoT systems, this is useful for:
- detecting abnormal sensor readings
- identifying device malfunctions
- spotting unusual operational behavior
- triggering alerts before failures become serious
This is especially valuable in industrial IoT, smart manufacturing, healthcare, logistics, and other environments where sensor data arrives continuously.
Common Approaches
Statistical Methods
Statistical approaches define a baseline of normal behavior and flag values that deviate significantly from it.
Examples:
- mean and standard deviation
- z-score
- moving averages
- seasonal thresholds
These methods are simple and fast, but they may struggle when the data is complex or changes over time.
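As a sketch of the statistical approach, the z-score method can be implemented with the standard library alone. The readings and the threshold of 2.5 are illustrative assumptions, not fixed rules:

```python
import statistics

def zscore_anomalies(values, threshold=2.5):
    """Return the values whose z-score exceeds the threshold."""
    mean = statistics.mean(values)
    stdev = statistics.pstdev(values)  # population standard deviation
    return [v for v in values if abs(v - mean) / stdev > threshold]

# Hypothetical temperature readings with one spike
readings = [20.1, 20.3, 19.8, 20.0, 20.2, 19.9, 20.1, 20.0, 19.7, 35.0]
print(zscore_anomalies(readings))  # the 35.0 spike stands out
```

Note that a single large outlier inflates the standard deviation itself, which is one reason fixed z-score thresholds struggle on small or drifting datasets.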
Machine Learning Techniques
Machine learning models learn patterns from historical data and identify points that do not fit those patterns.
Examples:
- Isolation Forest
- One-Class SVM
- Local Outlier Factor
- clustering-based approaches
These methods are useful when normal behavior is not easy to define with simple rules.
Deep Learning Models
Deep learning models can detect anomalies in high-dimensional or sequential IoT data.
Examples:
- autoencoders
- LSTM-based sequence models
- transformer-based time-series models
These models are powerful, but they usually require more data, more tuning, and more compute.
Isolation Forest
Isolation Forest is one of the most practical algorithms for anomaly detection.
Unlike many other methods, it does not rely on distance or density. Instead, it works on a simple idea:
Anomalies are few and different, so they are easier to isolate than normal points.
Core Idea
Isolation Forest builds many random trees.
In each tree:
- a feature is selected randomly
- a split value is selected randomly
- the data is repeatedly divided until individual points become isolated
A point that gets isolated quickly is more likely to be an anomaly.
A point that needs more splits to isolate is more likely to be normal.
Why It Works
Normal points usually belong to dense regions of the dataset, so they take more splits to separate.
Anomalies are often far away from the bulk of the data, so they get isolated in fewer steps.
That is why:
- shorter path length → more anomalous
- longer path length → more normal
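The isolation idea can be sketched in a few lines of plain Python. This is a simplified 1-D illustration, not the full algorithm (which subsamples the data and caps tree depth):

```python
import random

def path_length(x, data, rng):
    """Count random splits until x is alone in its partition."""
    part, depth = list(data), 0
    while len(part) > 1:
        split = rng.uniform(min(part), max(part))               # random split value
        part = [v for v in part if (v < split) == (x < split)]  # keep x's side
        depth += 1
    return depth

def average_path_length(x, data, trees=200, seed=0):
    """Average the isolation depth of x over many random trees."""
    rng = random.Random(seed)
    return sum(path_length(x, data, rng) for _ in range(trees)) / trees

data = [-100, 2, 11, 13, 100]
# The extreme points are isolated in fewer splits on average
print({x: round(average_path_length(x, data), 2) for x in data})
```

Running this shows the extremes (-100 and 100) with noticeably shorter average paths than the interior points, which is exactly the signal the algorithm exploits.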
Simple Example
Dataset: [-100, 2, 11, 13, 100]
In practice, Isolation Forest builds many trees (100 or more); here we show only 4 trees for illustration.
Tree 1

```
                     Root
                      |
             [Split at value = 7]
               /            \
        [-100, 2]      [11, 13, 100]
            |                 |
[Split at value = -49]  [Split at value = 56]
      /       \             /        \
  [-100]     [2]       [11, 13]    [100]
```

Path lengths:
- -100 → 2
- 2 → 2
- 11 → 3 (one further split separates [11, 13])
- 13 → 3 (one further split separates [11, 13])
- 100 → 2
Tree 2

```
          Root
           |
  [Split at value = 1]
     /          \
 [-100]   [2, 11, 13, 100]
                 |
        [Split at value = 50]
           /          \
    [2, 11, 13]      [100]
```

Approx path lengths:
- -100 → 1
- 100 → 2
- 2, 11, 13 → 3 to 4
Tree 3

```
                     Root
                      |
            [Split at value = 12]
               /            \
       [-100, 2, 11]     [13, 100]
             |                |
[Split at value = -40]  [Split at value = 57]
      /       \            /       \
  [-100]   [2, 11]      [13]     [100]
```

Path lengths:
- -100 → 2
- 2 → 3
- 11 → 3
- 13 → 2
- 100 → 2
Tree 4

```
               Root
                |
      [Split at value = 80]
         /            \
 [-100, 2, 11, 13]   [100]
         |
 [Split at value = -50]
      /       \
  [-100]   [2, 11, 13]
```

Approx path lengths:
- 100 → 1
- -100 → 2
- others → 3+
Average Path Length
- -100 → (2 + 1 + 2 + 2) / 4 = 1.75
- 2 → (2 + 3 + 3 + 3) / 4 = 2.75
- 11 → (3 + 3 + 3 + 3) / 4 = 3.00
- 13 → (3 + 3 + 2 + 3) / 4 = 2.75
- 100 → (2 + 2 + 2 + 1) / 4 = 1.75
Anomaly Score
s(x, n) = 2^(-E[h(x)] / c(n))
Where:
- E[h(x)] = average path length of x over all trees
- c(n) = normalization factor: the average path length of an unsuccessful search in a binary search tree built on n points
Score meaning:
- close to 1 → anomaly
- well below 0.5 → normal
- around 0.5 → no clear distinction
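With n = 5 and the average path lengths from the four example trees (note that the average for 100 is (2 + 2 + 2 + 1) / 4 = 1.75), the score can be computed directly. The formula for c(n) below follows the original Isolation Forest paper: c(n) = 2H(n−1) − 2(n−1)/n, where H(i) ≈ ln(i) + 0.5772 (Euler–Mascheroni constant):

```python
import math

EULER_GAMMA = 0.5772156649  # Euler–Mascheroni constant

def c(n):
    """Average path length of an unsuccessful BST search on n points."""
    if n <= 1:
        return 0.0
    harmonic = math.log(n - 1) + EULER_GAMMA  # H(n-1) approximation
    return 2.0 * harmonic - 2.0 * (n - 1) / n

def anomaly_score(avg_path, n):
    """s(x, n) = 2^(-E[h(x)] / c(n))"""
    return 2.0 ** (-avg_path / c(n))

# Average path lengths from the four example trees (n = 5)
for x, h in [(-100, 1.75), (2, 2.75), (11, 3.0), (13, 2.75), (100, 1.75)]:
    print(x, round(anomaly_score(h, 5), 3))
```

The extremes score around 0.59 while the interior points score around 0.41–0.44, matching the interpretation below.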
Interpretation
The extreme values (-100 and 100) are isolated faster than the middle values.
That means:
- -100 and 100 → anomalies
- 2, 11, 13 → normal points
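The same conclusion can be checked with scikit-learn's IsolationForest, assuming scikit-learn and NumPy are installed. Here contamination=0.4 encodes the assumption that 2 of the 5 points are anomalies:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

X = np.array([-100, 2, 11, 13, 100], dtype=float).reshape(-1, 1)

model = IsolationForest(n_estimators=200, contamination=0.4, random_state=42)
labels = model.fit_predict(X)  # -1 = anomaly, +1 = normal
print(labels)                  # the extremes are flagged with -1
```

On real data the contamination rate is rarely known in advance, which is why threshold selection is listed as a limitation below.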
Key Points
- anomalies are few and different
- random splits isolate anomalies faster
- path length determines anomaly likelihood
- ensemble of trees improves reliability
- no distance calculations required
- scales well for large datasets
Advantages
- simple and intuitive
- fast and scalable
- works with high-dimensional data
- no need for distance calculations
- good for unsupervised learning
Limitations
- struggles with clustered anomalies
- sensitive when anomalies are near normal data
- randomness can cause variation in small datasets
- threshold selection is use-case dependent
Isolation Forest in IoT
Used for:
- temperature anomalies
- vibration anomalies
- pressure irregularities
- device failure prediction
- real-time alerting
Applications:
- predictive maintenance
- fault detection
- industrial monitoring
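A common deployment pattern for predictive maintenance is to fit the model on a history of normal readings and score new readings as they arrive. The data here is synthetic and the parameters (sensor range, contamination rate) are illustrative assumptions:

```python
import numpy as np
from sklearn.ensemble import IsolationForest

# Synthetic history of normal bearing temperatures (°C)
rng = np.random.default_rng(0)
history = rng.normal(loc=40.0, scale=1.5, size=(500, 1))

# Train on history; contamination sets the expected anomaly rate
detector = IsolationForest(n_estimators=100, contamination=0.01, random_state=0)
detector.fit(history)

# Score readings as they stream in; -1 triggers an alert
incoming = np.array([[40.2], [41.3], [75.0]])
preds = detector.predict(incoming)
for reading, label in zip(incoming.ravel(), preds):
    if label == -1:
        print(f"ALERT: abnormal reading {reading:.1f} °C")
```

The model is typically refit periodically so the learned baseline tracks slow drift in normal operating conditions.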