[Avg. reading time: 18 minutes]

ML Models quick intro

Supervised Learning

In supervised learning, classification and regression are two distinct types of tasks, differing primarily in the nature of their output and the problem they solve.

Both tasks require labeled historical data (e.g., sensor readings paired with timestamps of past failures).

Classification

Predicts discrete labels (categories or classes).

Example:

Binary: Failure (1) vs. No Failure (0).

Multi-class: Type of failure (bearing_failure, motor_overheat, lubrication_issue).

Regression

Predicts continuous numerical values.

Example:

Remaining Useful Life (RUL): 23.5 days until failure.

Time-to-failure: 15.2 hours.
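To make the difference concrete, here is a minimal sketch showing the same sensor reading feeding a classifier (discrete label) and a regressor (continuous value). The threshold and the linear degradation model are hypothetical, chosen only to illustrate the two output types:

```python
# Hedged sketch: discrete vs. continuous outputs on one vibration reading.
# The 7.0 mm/s threshold and the linear RUL model are hypothetical.
def classify_failure(vibration_mm_s, threshold=7.0):
    # Binary classification: returns a discrete label.
    return 1 if vibration_mm_s > threshold else 0  # 1 = Failure, 0 = No Failure

def predict_rul_days(vibration_mm_s):
    # Regression: returns a continuous value (toy linear degradation model).
    return max(0.0, 40.0 - 3.5 * vibration_mm_s)

reading = 4.7
print(classify_failure(reading))   # discrete label: 0 (No Failure)
print(predict_rul_days(reading))   # continuous estimate: about 23.55 days of RUL
```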

Use Cases in Predictive Maintenance

Classification:

Answering yes/no or categorical questions:

  • Will this motor fail in the next week?
  • Is the current vibration pattern abnormal?
  • What type of fault is it (e.g., electrical vs. mechanical)?

Regression:

Quantifying degradation:

  • How many days until the turbine blade needs replacement?
  • What is the current health score (0–100%) of the compressor?

Algorithms

| Category | Algorithm | Description |
|---|---|---|
| Classification | Logistic Regression | Models the probability of class membership. |
| | Random Forest | Ensemble of decision trees for classification. |
| | Support Vector Machines (SVM) | Maximizes the margin between classes. |
| | Neural Networks | Learns complex patterns and nonlinear decision boundaries. |

| Category | Algorithm | Description |
|---|---|---|
| Regression | Linear Regression | Models a linear relationship between features and target. |
| | Decision Trees (Regressor) | Tree-based model for predicting continuous values. |
| | Gradient Boosting Regressors | Ensemble of weak learners (e.g., XGBoost, LightGBM). |
| | LSTM Networks | Recurrent neural networks for time-series regression. |

Evaluation Metrics

Classification:

  • Accuracy: % of correct predictions.
  • Precision/Recall: Trade-off between false positives and false negatives.
    • Precision: TP/(TP+FP)
    • Recall: TP/(TP+FN)
  • F1-Score: Harmonic mean of precision and recall.
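These formulas can be checked directly in a few lines; the TP/FP/FN counts below are hypothetical values for illustration:

```python
# Hedged sketch: classification metrics from hypothetical confusion-matrix counts.
def precision(tp, fp):
    return tp / (tp + fp)

def recall(tp, fn):
    return tp / (tp + fn)

def f1_score(p, r):
    # Harmonic mean of precision and recall.
    return 2 * p * r / (p + r)

p = precision(tp=80, fp=20)       # 0.8
r = recall(tp=80, fn=40)          # about 0.667
print(round(f1_score(p, r), 3))   # → 0.727
```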

Regression:

  • Mean Absolute Error (MAE): Average absolute difference between predicted and actual values.
  • Mean Squared Error (MSE): Penalizes larger errors.
  • R² Score: How well the model explains variance in the data.
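The same can be done for the regression metrics; the RUL values below are toy data invented for illustration:

```python
# Hedged sketch: regression metrics computed by hand on toy RUL predictions.
def mae(y_true, y_pred):
    return sum(abs(t - p) for t, p in zip(y_true, y_pred)) / len(y_true)

def mse(y_true, y_pred):
    # Squaring penalizes larger errors more heavily than MAE.
    return sum((t - p) ** 2 for t, p in zip(y_true, y_pred)) / len(y_true)

def r2(y_true, y_pred):
    mean = sum(y_true) / len(y_true)
    ss_res = sum((t - p) ** 2 for t, p in zip(y_true, y_pred))
    ss_tot = sum((t - mean) ** 2 for t in y_true)
    return 1 - ss_res / ss_tot

y_true = [23.5, 15.2, 30.0, 8.4]   # actual days until failure (toy data)
y_pred = [25.0, 14.0, 28.5, 10.0]
print(mae(y_true, y_pred))          # about 1.45 days
print(mse(y_true, y_pred))          # about 2.125
print(round(r2(y_true, y_pred), 3)) # about 0.968
```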

Unsupervised Learning

In unsupervised learning, clustering and anomaly detection serve distinct purposes and address different problems.

Primary Objective

Clustering

  • Assigns each data point to a cluster (e.g., Cluster 1, Cluster 2).
  • Outputs are groups of similar instances.

Goal: Group data points into clusters based on similarity.

  • Focuses on discovering natural groupings or patterns in the data.

Example: Segmenting customers into groups for targeted marketing.
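As a sketch of the idea, here is a minimal 1-D k-means (Lloyd's algorithm) run on hypothetical monthly-spend figures; production work would use a library implementation on multi-dimensional features:

```python
# Hedged sketch: minimal 1-D k-means (k=2) on hypothetical customer spend.
def kmeans_1d(points, centroids, iters=10):
    for _ in range(iters):
        # Assignment step: each point joins its nearest centroid's cluster.
        clusters = [[] for _ in centroids]
        for x in points:
            nearest = min(range(len(centroids)), key=lambda i: abs(x - centroids[i]))
            clusters[nearest].append(x)
        # Update step: move each centroid to the mean of its cluster.
        centroids = [sum(c) / len(c) if c else centroids[i]
                     for i, c in enumerate(clusters)]
    return centroids, clusters

spend = [12, 15, 14, 90, 95, 88]
centroids, clusters = kmeans_1d(spend, centroids=[10, 100])
print(centroids)   # two cluster centers: low spenders vs. high spenders
```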

Anomaly Detection

  • Labels data points as normal or anomalous (a binary decision, but learned without labeled examples).
  • Outputs are scores or probabilities indicating how “outlier-like” a point is.

Goal: Identify rare or unusual data points that deviate from the majority.

  • Focuses on detecting outliers or unexpected patterns.

Example: Flagging fraudulent credit card transactions.
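A simple illustration of outlier scoring is z-score thresholding on toy transaction amounts; real fraud systems use richer models such as Isolation Forest, but the "how far from the bulk of the data?" intuition is the same:

```python
# Hedged sketch: z-score outlier flagging on toy transaction amounts.
import statistics

def zscore_anomalies(values, threshold=3.0):
    mu = statistics.mean(values)
    sigma = statistics.stdev(values)
    # A point is "outlier-like" when it sits many standard deviations from the mean.
    return [x for x in values if abs(x - mu) / sigma > threshold]

amounts = [20, 25, 22, 19, 24, 21, 23, 500]  # 500 is the injected anomaly
print(zscore_anomalies(amounts, threshold=2.0))  # → [500]
```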

Algorithms

| Category | Algorithm | Description |
|---|---|---|
| Clustering | K-Means | Partitions data into k spherical clusters. |
| | Hierarchical Clustering | Builds nested clusters using dendrograms. |
| | DBSCAN | Groups dense regions and identifies sparse regions as outliers. |
| | Gaussian Mixture Models (GMM) | Probabilistic clustering using a mixture of Gaussians. |
| Anomaly Detection | Isolation Forest | Isolates anomalies using random decision trees. |
| | One-Class SVM | Learns a boundary around normal data to detect outliers. |
| | Autoencoders | Reconstructs input data; anomalies yield high reconstruction error. |
| | Local Outlier Factor (LOF) | Detects anomalies by comparing the local density of data points. |

Time Series

In time-series analysis, forecasting and anomaly detection are two fundamental but distinct tasks, differing in their objectives, data assumptions, and outputs.

| Model | Type | Strengths | Limitations |
|---|---|---|---|
| ARIMA/SARIMA | Classical | Simple, interpretable, strong for univariate, seasonal data | Requires stationary data, manual tuning |
| Facebook Prophet | Additive model | Easy to use, handles holidays/seasonality, works with missing data | Slower for large datasets, limited to trend/seasonality modeling |
| Holt-Winters (Exponential Smoothing) | Classical | Lightweight, works well with level/trend/seasonality | Not good with irregular time steps or complex patterns |
| LSTM (Recurrent Neural Network) | Deep Learning | Learns long-term dependencies, supports multivariate data | Requires lots of data, training is resource-intensive |
| XGBoost + Lag Features | Machine Learning | High performance, flexible with engineered features | Requires feature engineering, not a "true" time-series model |
| NeuralProphet | Hybrid (Prophet + NN) | Better performance than Prophet, supports regressors/events | Heavier than Prophet, still maturing |
| Temporal Fusion Transformer (TFT) | Deep Learning | SOTA for multivariate forecasts with interpretability | Overkill for small/medium IoT data, very heavy |

| Layer | Model(s) | Why |
|---|---|---|
| Edge | Holt-Winters, thresholds, micro-LSTM (TinyML), Prophet (inference) | Extremely lightweight, low latency |
| Fog | Prophet, ARIMA, Isolation Forest, XGBoost | Moderate compute, supports both real-time and near-real-time |
| Cloud | LSTM, TFT, NeuralProphet, Prophet (training), XGBoost | Can handle heavy training, multivariate data, batch scoring |
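As a taste of the lightest option above, here is simple (single) exponential smoothing — the level-only core that Holt-Winters extends with trend and seasonality terms. The smoothing factor `alpha` is a hypothetical choice; higher values react faster to recent readings:

```python
# Hedged sketch: simple exponential smoothing, the "level" component of
# Holt-Winters. alpha=0.5 is a hypothetical smoothing factor.
def exponential_smoothing(series, alpha=0.5):
    level = series[0]
    smoothed = [level]
    for x in series[1:]:
        # New level = weighted blend of the latest reading and the old level.
        level = alpha * x + (1 - alpha) * level
        smoothed.append(level)
    return smoothed

readings = [10.0, 12.0, 11.0, 13.0, 30.0]  # toy sensor readings; 30.0 is a spike
print(exponential_smoothing(readings))     # → [10.0, 11.0, 11.0, 12.0, 21.0]
```

Note how the final smoothed value (21.0) lags the 30.0 spike — exactly the damping that makes smoothing useful as a lightweight edge-side baseline.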
```shell
git clone https://github.com/gchandra10/python_iot_ml_demo.git
```
<span id='footer-class'>Ver 6.0.5</span>
<footer id="last-change">Last change: 2026-02-05</footer>