[Avg. reading time: 17 minutes]

ML Models quick intro

Supervised Learning

In supervised learning, classification and regression are two distinct types of tasks, differing primarily in the nature of their output and the problem they solve.

Both rely on labeled historical data (e.g., sensor readings paired with timestamps of past failures).

Classification

Predicts discrete labels (categories or classes).

Example:

Binary: Failure (1) vs. No Failure (0).

Multi-class: Type of failure (bearing_failure, motor_overheat, lubrication_issue).

Regression

Predicts continuous numerical values.

Example:

Remaining Useful Life (RUL): 23.5 days until failure.

Time-to-failure: 15.2 hours.

Use Cases in Predictive Maintenance

Classification:

Answering yes/no or which-class questions:

  • Will this motor fail in the next week?
  • Is the current vibration pattern abnormal?
  • Identifying the type of fault (e.g., electrical vs. mechanical).

Regression:

Quantifying degradation:

  • How many days until the turbine blade needs replacement?
  • What is the current health score (0–100%) of the compressor?

Algorithms

| Category | Algorithm | Description |
|---|---|---|
| Classification | Logistic Regression | Models probability of class membership. |
| | Random Forest | Ensemble of decision trees for classification. |
| | Support Vector Machines (SVM) | Maximizes margin between classes. |
| | Neural Networks | Learns complex patterns and nonlinear decision boundaries. |

| Category | Algorithm | Description |
|---|---|---|
| Regression | Linear Regression | Models linear relationship between features and target. |
| | Decision Trees (Regressor) | Tree-based model for predicting continuous values. |
| | Gradient Boosting Regressors | Ensemble of weak learners (e.g., XGBoost, LightGBM). |
| | LSTM Networks | Recurrent neural networks for time-series regression. |
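As a concrete sketch of the first row above, logistic regression can be trained with plain gradient descent. The synthetic "vibration" readings, class means, and learning rate below are made up for illustration; this is a minimal sketch, not a production model.

```python
import math
import random

# Toy logistic regression trained by gradient descent (stdlib only).
# Feature: one synthetic vibration reading; label: 1 = failure, 0 = healthy.
random.seed(0)
data = [(random.gauss(0.3, 0.1), 0) for _ in range(50)] + \
       [(random.gauss(0.8, 0.1), 1) for _ in range(50)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

w, b, lr = 0.0, 0.0, 0.5
for _ in range(2000):
    gw = gb = 0.0
    for x, y in data:
        p = sigmoid(w * x + b)
        gw += (p - y) * x   # gradient of log-loss w.r.t. w
        gb += (p - y)       # gradient of log-loss w.r.t. b
    w -= lr * gw / len(data)
    b -= lr * gb / len(data)

# Probability of failure for a low and a high vibration reading
p_low, p_high = sigmoid(w * 0.3 + b), sigmoid(w * 0.8 + b)
print(round(p_low, 2), round(p_high, 2))
```

The model outputs a probability of class membership, which is then thresholded (typically at 0.5) to produce the discrete failure / no-failure label.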

Evaluation Metrics

Classification:

  • Accuracy: % of correct predictions.
  • Precision/Recall: Trade-off between false positives and false negatives.
    • Precision: TP/(TP+FP)
    • Recall: TP/(TP+FN)
  • F1-Score: Harmonic mean of precision and recall.
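These formulas compute directly from confusion-matrix counts; the counts below are hypothetical.

```python
# Precision, recall, and F1 from raw confusion-matrix counts.
# The counts are made up for illustration.
tp, fp, fn = 40, 10, 5

precision = tp / (tp + fp)  # TP / (TP + FP)
recall = tp / (tp + fn)     # TP / (TP + FN)
f1 = 2 * precision * recall / (precision + recall)  # harmonic mean

print(precision, round(recall, 3), round(f1, 3))
```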

Example:

Will the temperature exceed 90°F in the next 10 minutes?

Positive: yes, it will cross 90°F. Negative: it will not cross 90°F.

True Positive

Model: temperature will cross 90°F. Actual: it did cross 90°F.

Result: correct prediction, and we are prepared.

False Positive

Model: temperature will cross 90°F. Actual: it did not cross.

Result: predicted heat that never happened.

True Negative

Model: temperature will stay below 90°F. Actual: it stayed below 90°F.

Result: correct prediction, and nothing to do.

False Negative

Model: temperature will stay below 90°F. Actual: it went above 90°F.

Result: missed issue.

In IoT, false negatives are risky (a missed failure), while false positives are merely an annoyance (an unnecessary alert).
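The four outcomes above can be tallied from paired predictions and actuals; the readings below are hypothetical.

```python
# Tally TP/FP/TN/FN for the "will temperature exceed 90°F?" example.
# The predicted/actual sequences are made up for illustration.
predicted = [1, 1, 0, 0, 1, 0]   # 1 = model says temp will cross 90°F
actual    = [1, 0, 0, 1, 1, 0]   # 1 = temp actually crossed 90°F

tp = sum(p == 1 and a == 1 for p, a in zip(predicted, actual))
fp = sum(p == 1 and a == 0 for p, a in zip(predicted, actual))
tn = sum(p == 0 and a == 0 for p, a in zip(predicted, actual))
fn = sum(p == 0 and a == 1 for p, a in zip(predicted, actual))

print(tp, fp, tn, fn)  # the false negatives are the risky missed alerts
```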

Regression:

  • Mean Absolute Error (MAE): Average absolute difference between predicted and actual values.
  • Mean Squared Error (MSE): Penalizes larger errors.
  • R² Score: How well the model explains variance in the data.
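A quick sketch of all three metrics on a made-up Remaining Useful Life example (both the actual and predicted values are invented):

```python
# MAE, MSE, and R² for a hypothetical RUL prediction (values in days).
actual    = [23.5, 15.2, 30.0, 8.1]
predicted = [25.0, 14.0, 28.5, 10.0]

n = len(actual)
mae = sum(abs(a - p) for a, p in zip(actual, predicted)) / n
mse = sum((a - p) ** 2 for a, p in zip(actual, predicted)) / n

# R² = 1 - (residual sum of squares / total sum of squares)
mean_a = sum(actual) / n
ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
ss_tot = sum((a - mean_a) ** 2 for a in actual)
r2 = 1 - ss_res / ss_tot

print(round(mae, 3), round(mse, 4), round(r2, 3))
```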

Unsupervised Learning

In unsupervised learning, clustering and anomaly detection serve distinct purposes and address different problems.

Primary Objective

Clustering

  • Assigns each data point to a cluster (e.g., Cluster 1, Cluster 2).
  • Outputs are groups of similar instances.

Goal: Group data points into clusters based on similarity.

  • Focuses on discovering natural groupings or patterns in the data.

Example: Segmenting devices into groups based on usage.

| Room | Temp | Humidity | CO₂ | Occupancy |
|---|---|---|---|---|
| R1 | 22 | 40 | 500 | Low |
| R2 | 23 | 42 | 520 | Low |
| R3 | 28 | 60 | 900 | High |
| R4 | 29 | 65 | 950 | High |

Cluster 1 - R1 and R2, Cluster 2 - R3 and R4
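The grouping above can be reproduced with a minimal k-means (k = 2). This is a bare-bones sketch over the four rows of the table, not a production clustering pipeline:

```python
import math

# Room readings from the table: (Temp, Humidity, CO₂).
rooms = {
    "R1": (22, 40, 500),
    "R2": (23, 42, 520),
    "R3": (28, 60, 900),
    "R4": (29, 65, 950),
}

# Initialize centroids from two points, then iterate assign/update.
centroids = [rooms["R1"], rooms["R3"]]
for _ in range(10):
    clusters = {0: [], 1: []}
    for name, point in rooms.items():
        k = min((0, 1), key=lambda i: math.dist(point, centroids[i]))
        clusters[k].append(name)
    # Recompute each centroid as the mean of its assigned points.
    centroids = [
        tuple(sum(rooms[n][d] for n in clusters[k]) / len(clusters[k])
              for d in range(3))
        for k in (0, 1)
    ]

print(clusters)
```

With this data, the low-occupancy rooms (R1, R2) and the high-CO₂ rooms (R3, R4) fall into separate clusters without any labels being provided.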

Anomaly Detection

  • Labels each data point as normal or anomalous (a binary outcome).
  • Outputs are scores or probabilities indicating how “outlier-like” a point is.

Goal: Identify rare or unusual data points that deviate from the majority.

Focuses on detecting outliers or unexpected patterns.

Example: Flagging fraudulent credit card transactions.

Algorithms

| Category | Algorithm | Description |
|---|---|---|
| Clustering | K-Means | Partitions data into k spherical clusters. |
| | Hierarchical Clustering | Builds nested clusters using dendrograms. |
| | DBSCAN | Groups dense regions and identifies sparse regions as outliers. |
| | Gaussian Mixture Models (GMM) | Probabilistic clustering using a mixture of Gaussians. |
| Anomaly Detection | Isolation Forest | Isolates anomalies using random decision trees. |
| | One-Class SVM | Learns a boundary around normal data to detect outliers. |
| | Autoencoders | Reconstructs input data; anomalies yield high reconstruction error. |
| | Local Outlier Factor (LOF) | Detects anomalies by comparing local density of data points. |
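Before reaching for any of the algorithms in the table, a z-score threshold makes a useful baseline. Note this is a simpler statistical method, not one of the listed algorithms, and the sensor readings below are hypothetical:

```python
import statistics

# Hypothetical temperature readings with one obvious outlier.
readings = [21.0, 21.5, 22.0, 21.8, 22.2, 21.7, 35.0, 21.9, 22.1]

mu = statistics.mean(readings)
sigma = statistics.stdev(readings)

# Flag anything more than 2 standard deviations from the mean.
anomalies = [x for x in readings if abs(x - mu) / sigma > 2]
print(anomalies)
```

A z-score assumes roughly Gaussian "normal" behavior; when the data is multimodal or high-dimensional, the table's algorithms (Isolation Forest, LOF, autoencoders) are better suited.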

Time Series

Forecasting and Anomaly Detection are two fundamental but distinct tasks, differing in their objectives, data assumptions, and outputs.

| Model | Type | Strengths | Limitations |
|---|---|---|---|
| ARIMA/SARIMA | Classical | Simple, interpretable, strong for univariate, seasonal data | Requires stationary data, manual tuning |
| Facebook Prophet | Additive model | Easy to use, handles holidays/seasonality, works with missing data | Slower for large datasets, limited to trend/seasonality modeling |
| Holt-Winters (Exponential Smoothing) | Classical | Lightweight, works well with level/trend/seasonality | Not good with irregular time steps or complex patterns |
| LSTM (Recurrent Neural Network) | Deep Learning | Learns long-term dependencies, supports multivariate | Requires lots of data, training is resource-intensive |
| XGBoost + Lag Features | Machine Learning | High performance, flexible with engineered features | Requires feature engineering, not a "true" time-series model |
| NeuralProphet | Hybrid (Prophet + NN) | Better performance than Prophet, supports regressors/events | Heavier than Prophet, still maturing |
| Temporal Fusion Transformer (TFT) | Deep Learning | SOTA for multivariate forecasts with interpretability | Overkill for small/medium IoT data, very heavy |

| Layer | Model(s) | Why |
|---|---|---|
| Edge | Holt-Winters, thresholds, micro-LSTM (TinyML), Prophet (inference) | Extremely lightweight, low latency |
| Fog | Prophet, ARIMA, Isolation Forest, XGBoost | Moderate compute, supports both real-time and near-real-time |
| Cloud | LSTM, TFT, NeuralProphet, Prophet (training), XGBoost | Can handle heavy training, multivariate data, batch scoring |
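As an example of an Edge-layer forecaster, here is a stripped-down Holt's linear method (level and trend only; the seasonal term of full Holt-Winters is omitted). The series and smoothing constants are illustrative:

```python
# Holt's linear exponential smoothing: track a level and a trend,
# then forecast by extrapolating the trend line. Cheap enough for edge devices.
series = [10.0, 10.5, 11.1, 11.4, 12.0, 12.6, 13.1, 13.5]
alpha, beta = 0.5, 0.3   # level / trend smoothing factors (hypothetical)

level, trend = series[0], series[1] - series[0]
for x in series[1:]:
    prev_level = level
    level = alpha * x + (1 - alpha) * (level + trend)
    trend = beta * (level - prev_level) + (1 - beta) * trend

# h-step-ahead forecast is a straight line from the last level and trend.
forecast = [level + h * trend for h in (1, 2, 3)]
print([round(f, 2) for f in forecast])
```

Because the input series trends upward, each forecast step extends past the last observation; a full Holt-Winters adds a seasonal component on top of this same level/trend recursion.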
Demo code for these models is available:

git clone https://github.com/gchandra10/python_iot_ml_demo.git

#ml #iot #edge

Ver 6.0.23

Last change: 2026-04-16