[Avg. reading time: 18 minutes]
ML Models quick intro
Supervised Learning
In supervised learning, classification and regression are two distinct types of tasks, differing primarily in the nature of their output and the problem they solve. Both require labeled historical data (e.g., sensor readings paired with timestamps of past failures).
Classification
Predicts discrete labels (categories or classes).
Example:
Binary: Failure (1) vs. No Failure (0).
Multi-class: Type of failure (bearing_failure, motor_overheat, lubrication_issue).
Regression
Predicts continuous numerical values.
Example:
Remaining Useful Life (RUL): 23.5 days until failure.
Time-to-failure: 15.2 hours.
Use Cases in Predictive Maintenance
Classification:
- Answering yes/no questions: Will this motor fail in the next week? Is the current vibration pattern abnormal?
- Identifying the type of fault (e.g., electrical vs. mechanical).
Regression:
Quantifying degradation:
- How many days until the turbine blade needs replacement?
- What is the current health score (0–100%) of the compressor?
Algorithms
| Category | Algorithm | Description |
|---|---|---|
| Classification | Logistic Regression | Models probability of class membership. |
| Classification | Random Forest | Ensemble of decision trees for classification. |
| Classification | Support Vector Machines (SVM) | Maximizes margin between classes. |
| Classification | Neural Networks | Learns complex patterns and nonlinear decision boundaries. |
| Category | Algorithm | Description |
|---|---|---|
| Regression | Linear Regression | Models linear relationship between features and target. |
| Regression | Decision Trees (Regressor) | Tree-based model for predicting continuous values. |
| Regression | Gradient Boosting Regressors | Ensemble of weak learners (e.g., XGBoost, LightGBM). |
| Regression | LSTM Networks | Recurrent neural networks for time-series regression. |
Evaluation Metrics
Classification:
- Accuracy: % of correct predictions.
- Precision/Recall: Trade-off between false positives and false negatives.
- Precision: TP/(TP+FP)
- Recall: TP/(TP+FN)
- F1-Score: Harmonic mean of precision and recall.
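The formulas above can be checked directly from raw confusion-matrix counts. The counts here are illustrative (imagine "will this motor fail?" predictions scored against what actually happened):

```python
# Sketch: precision, recall, and F1 from raw confusion counts.
tp, fp, fn = 8, 2, 4  # true positives, false positives, false negatives

precision = tp / (tp + fp)                        # TP/(TP+FP)
recall = tp / (tp + fn)                           # TP/(TP+FN)
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(f"precision={precision:.2f} recall={recall:.2f} f1={f1:.2f}")
# precision=0.80 recall=0.67 f1=0.73
```

Note the trade-off: lowering the decision threshold raises recall (fewer missed failures) at the cost of precision (more false alarms).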
Regression:
- Mean Absolute Error (MAE): Average absolute difference between predicted and actual values.
- Mean Squared Error (MSE): Penalizes larger errors.
- R² Score: How well the model explains variance in the data.
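The regression metrics follow the same pattern. A toy example with made-up days-to-failure values, standard library only:

```python
# Sketch: MAE, MSE, and R^2 for a toy days-to-failure regression.
actual = [3.0, 5.0, 7.0]      # true days-to-failure
predicted = [2.0, 5.0, 9.0]   # model output (illustrative)

n = len(actual)
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n

mean_actual = sum(actual) / n
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
ss_tot = sum((a - mean_actual) ** 2 for a in actual)
r2 = 1 - ss_res / ss_tot      # fraction of variance explained

print(mae, mse, r2)  # 1.0 1.666... 0.375
```

MSE's squaring is what "penalizes larger errors": the single 2-day miss contributes four times as much to MSE as two 1-day misses would.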
Unsupervised Learning
In unsupervised learning, clustering and anomaly detection serve distinct purposes and address different problems.
Primary Objective
Clustering
Goal: Group data points into clusters based on similarity.
- Assigns each data point to a cluster (e.g., Cluster 1, Cluster 2).
- Outputs are groups of similar instances.
- Focuses on discovering natural groupings or patterns in the data.
Example: Segmenting customers into groups for targeted marketing.
Anomaly Detection
Goal: Identify rare or unusual data points that deviate from the majority.
- Labels each data point as normal or anomalous.
- Outputs scores or probabilities indicating how "outlier-like" a point is.
- Focuses on detecting outliers or unexpected patterns.
Example: Flagging fraudulent credit card transactions.
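A quick way to see the "score how outlier-like a point is" idea is a robust statistical baseline using the median absolute deviation (MAD). This is deliberately simpler than the dedicated algorithms listed below; the readings and the commonly used 3.5 cut-off are illustrative:

```python
# Sketch: MAD-based anomaly detection (a robust statistical baseline).
from statistics import median

readings = [10, 11, 9, 10, 12, 10, 11, 100]  # one obvious outlier

med = median(readings)
mad = median(abs(x - med) for x in readings)

def modified_z(x):
    # 0.6745 rescales MAD so the score is comparable to a normal z-score.
    return 0.6745 * (x - med) / mad

anomalies = [x for x in readings if abs(modified_z(x)) > 3.5]
print(anomalies)  # [100]
```

Median and MAD are preferred over mean and standard deviation here because the outlier itself would inflate the mean-based statistics and mask its own score.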
Algorithms
| Category | Algorithm | Description |
|---|---|---|
| Clustering | K-Means | Partitions data into k spherical clusters. |
| Clustering | Hierarchical Clustering | Builds nested clusters using dendrograms. |
| Clustering | DBSCAN | Groups dense regions and identifies sparse regions as outliers. |
| Clustering | Gaussian Mixture Models (GMM) | Probabilistic clustering using a mixture of Gaussians. |
| Anomaly Detection | Isolation Forest | Isolates anomalies using random decision trees. |
| Anomaly Detection | One-Class SVM | Learns a boundary around normal data to detect outliers. |
| Anomaly Detection | Autoencoders | Reconstructs input data; anomalies yield high reconstruction error. |
| Anomaly Detection | Local Outlier Factor (LOF) | Detects anomalies by comparing local density of data points. |
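The K-Means entry in the table can be illustrated with a minimal one-dimensional sketch of Lloyd's algorithm. In practice you would call a library such as scikit-learn; this toy version (made-up points, naive initialisation, fixed iteration budget) just shows the two alternating steps:

```python
# Minimal 1-D K-Means sketch (Lloyd's algorithm): repeatedly assign each
# point to its nearest centroid, then recompute centroids as cluster means.
points = [1.0, 2.0, 3.0, 10.0, 11.0, 12.0]
centroids = [points[0], points[-1]]   # naive initialisation, k = 2

for _ in range(10):                   # fixed iteration budget
    clusters = [[] for _ in centroids]
    for p in points:
        nearest = min(range(len(centroids)),
                      key=lambda i: abs(p - centroids[i]))
        clusters[nearest].append(p)
    centroids = [sum(c) / len(c) for c in clusters]

print(centroids)  # [2.0, 11.0] -> the two natural groupings
```

Note there are no labels anywhere: the grouping emerges purely from distances, which is what makes this unsupervised.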
Time Series
In time-series analysis, forecasting and anomaly detection are two fundamental but distinct tasks, differing in their objectives, data assumptions, and outputs. Commonly used forecasting models:
| Model | Type | Strengths | Limitations |
|---|---|---|---|
| ARIMA/SARIMA | Classical | Simple, interpretable, strong for univariate, seasonal data | Requires stationary data, manual tuning |
| Facebook Prophet | Additive model | Easy to use, handles holidays/seasonality, works with missing data | Slower for large datasets, limited to trend/seasonality modeling |
| Holt-Winters (Exponential Smoothing) | Classical | Lightweight, works well with level/trend/seasonality | Not good with irregular time steps or complex patterns |
| LSTM (Recurrent Neural Network) | Deep Learning | Learns long-term dependencies, supports multivariate | Requires lots of data, training is resource-intensive |
| XGBoost + Lag Features | Machine Learning | High performance, flexible with engineered features | Requires feature engineering, not “true” time series model |
| NeuralProphet | Hybrid (Prophet + NN) | Better performance than Prophet, supports regressors/events | Heavier than Prophet, still maturing |
| Temporal Fusion Transformer (TFT) | Deep Learning | SOTA for multivariate forecasts with interpretability | Overkill for small/medium IoT data, very heavy |
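The Holt-Winters row can be made concrete with its simplest member: simple exponential smoothing, which tracks only the level (full Holt-Winters adds trend and seasonality terms on top of this recursion). The readings and the smoothing factor `alpha` are illustrative:

```python
# Sketch: simple exponential smoothing, the level-only core of the
# Holt-Winters family: level_t = alpha * x_t + (1 - alpha) * level_{t-1}.
def ses_forecast(series, alpha=0.5):
    """Return the one-step-ahead forecast for a list of readings."""
    level = series[0]
    for x in series[1:]:
        level = alpha * x + (1 - alpha) * level
    return level

temps = [10.0, 12.0, 11.0, 13.0]  # e.g. hourly sensor readings
print(ses_forecast(temps))  # 12.0
```

A small `alpha` smooths aggressively (slow to react, good for noisy sensors); an `alpha` near 1 tracks the latest reading almost exactly. This lightness is why exponential smoothing sits in the edge layer of the deployment table below.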
A common edge/fog/cloud split for deploying these models:

| Layer | Model(s) | Why |
|---|---|---|
| Edge | Holt-Winters, thresholds, micro-LSTM (TinyML), Prophet (inference) | Extremely lightweight, low latency |
| Fog | Prophet, ARIMA, Isolation Forest, XGBoost | Moderate compute, supports both real-time + near-real-time |
| Cloud | LSTM, TFT, NeuralProphet, Prophet (training), XGBoost | Can handle heavy training, multivariate data, batch scoring |
To try these models hands-on, clone the demo repository:

```bash
git clone https://github.com/gchandra10/python_iot_ml_demo.git
```
<span id='footer-class'>Ver 6.0.5</span>
<footer id="last-change">Last change: 2026-02-05</footer>