Ver 6.0.23

Last change: 2026-04-16

Disclaimer


Required Tools

Windows

Mac

Common Tools (Windows & Mac)

Remote Development

Configure Env using Dev Container

Go to Terminal / Command Prompt

git clone https://github.com/gchandra10/workspace-iot-upperstack.git
  • Make sure Docker is running
  • Open VSCode
  • Go to File > Open Workspace from File
  • Go to the cloned workspace-iot-upperstack folder and choose the workspace.
  • When VS Code prompts to “Reopen in Container”, click it.

If VS Code doesn’t prompt, click the “Remote Connection” button at the bottom left of the screen.

Cloud Tools


Overview of IoT

  1. Introduction
  2. IoT Use Cases
  3. Jobs
  4. Computing Types
  5. Evolution of IoT
  6. Protocols
  7. IoT Stack Overview
  8. Lower Stack
  9. Upper Stack
  10. Puzzle


Introduction

What is IoT

The Internet of Things is a system where physical objects are equipped with sensors, software, and network connectivity so they can collect data, communicate over the network, and trigger actions without continuous human involvement.

IoT is not just the device.

IoT is devices + data + connectivity + action.


Why IoT Matters

Operational Efficiency

  • Automates repetitive and time sensitive tasks
  • Reduces manual monitoring and human error
  • Enables real time visibility into systems

Data Driven Decisions

  • Sensors generate continuous time series data
  • Decisions shift from intuition to measurable signals
  • Analytics and ML sit on top of IoT, not the other way around

Quality of Life

  • Healthcare monitoring, smart homes, traffic systems
  • Problems are detected earlier, not after failure
  • Convenience is a side effect, reliability is the real win

Economic Impact

  • New products, new services, new pricing models
  • Hardware vendors become data companies
  • Entire industries move from reactive to predictive

What is not IoT

Devices that work only locally

  • A USB temperature sensor dumping values to a laptop
  • An electronic thermostat controlling temperature locally
  • No network, no IoT

Systems with no outward data flow

  • Hardware that performs an action but emits no telemetry
  • If data never leaves the device, it is automation, not IoT

What MUST exist for something to be IoT

  • Continuous or event based data generation
  • Network communication
  • Backend ingestion
  • Storage, usually time series oriented
  • Processing or decision making
  • Optional but important feedback or control loop

Examples

Watch vs Smart Watch

CO Detector vs Smart CO Detector

  • Senses CO locally
  • Triggers a buzzer or alarm
  • Operates entirely offline

vs

  • Transmits CO readings or alarm events
  • Uses a network to communicate
  • Notifies an external system such as a phone app, home hub, or fire department

Read more: Smart Fridge


Local intelligence is embedded systems. Networked intelligence is IoT.

#IOT #Importance #smart #network

Use Cases

Every IoT use case follows the same pattern

sense → transmit → store → decide → act
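The five stages can be sketched end to end in a few lines; every name here (the sensor ID, the 30-degree rule, the fan action) is hypothetical, chosen only to make each stage visible.

```python
# Sketch of the sense → transmit → store → decide → act loop.
# In a real system each step is a sensor read, a network call,
# a database write, a rule engine, and an actuator command.

def sense() -> float:
    """Pretend to read a temperature sensor."""
    return 31.5  # degrees Celsius

def transmit(reading: float) -> dict:
    """Package the reading as a message (in practice: MQTT or HTTP)."""
    return {"sensor": "temp-01", "value": reading}

store: list[dict] = []  # in practice: a time series database

def decide(message: dict, threshold: float = 30.0) -> bool:
    """Apply a rule: act when the reading crosses the threshold."""
    return message["value"] > threshold

def act() -> str:
    """Trigger an action (in practice: turn on a fan, raise an alert)."""
    return "fan_on"

message = transmit(sense())
store.append(message)                        # store
action = act() if decide(message) else "no_action"
```

The same skeleton underlies every use case below; only the sensor, the rule, and the action change.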


1. Smart Homes

Use Case: Home automation for comfort, security, and energy efficiency.

Example: Smart thermostats like Nest adjust temperature based on occupancy and behavior. Smart locks and cameras like Ring stream events and alerts.

Temperature or motion sensed > data sent > rule applied > device reacts.

2. Healthcare

Use Case: Remote patient monitoring and early intervention.

Example: Wearables such as Fitbit and Apple Watch track vitals and activity and trigger alerts.

Vitals sensed > transmitted > analyzed > alert raised.

3. Industrial IoT (IIoT)

Use Case: Predictive maintenance and factory automation.

Example: Sensors monitor vibration, temperature, and pressure to predict failures before they occur using platforms like GE Predix.

Machine signals sensed > streamed > modeled > maintenance action triggered.

Similar patterns appear in smart shelves for inventory updates, Amazon Go stores, Tesla cars, smart meters, air quality monitoring, and so on.


Why IoT Works Across All Fields

  • Sensors are cheap
  • Networks already exist
  • Storage is inexpensive
  • Compute and analytics are mature

#iotusecases #logistics #environmental

Jobs

| Role | What They Actually Do | Core Skills |
|---|---|---|
| IoT Application Developer | Build web or mobile apps that display IoT data and trigger actions | APIs, REST, MQTT, web or mobile frameworks |
| IoT Solutions Architect | Design the full IoT system from devices to cloud and apps | Architecture, cloud IoT services, security |
| Cloud Integration Engineer | Connect devices to cloud storage, pipelines, and services | AWS or Azure, MQTT, REST, data pipelines |
| IoT Data Analyst | Analyze sensor and event data to extract insights | Python, SQL, time series data, dashboards |
| IoT Product Manager | Decide what gets built and why from a business angle | Product thinking, requirements, communication |
| IoT Security Specialist | Secure data, APIs, devices, and cloud integrations | Encryption, auth, IAM, threat modeling |
| IoT Test Engineer | Validate reliability, scale, and failure scenarios | Testing, automation, system validation |
| IoT Support or Operations | Keep systems running and debug failures | Monitoring, logs, troubleshooting |

#jobs #iotdevelopers #iotarchitects #dataecosystem

Computing Types

Modern software systems use different computing approaches depending on where computation happens, how systems are structured, and when decisions are made.

There is no single “best” computing model. Each type exists to solve a specific class of problems related to scale, latency, reliability, cost, and complexity.

As systems evolved from single machines to globally distributed platforms and IoT systems, computing models also evolved:

  • From centralized to distributed
  • From monolithic to microservices
  • From cloud-only to edge and fog
  • From reactive to proactive

Understanding these computing types helps you:

  • Choose the right architecture for a problem
  • Understand why IoT systems cannot rely on cloud alone
  • See how modern data and IoT platforms fit together

Centralized Computing

Single computer or location handles all processing and storage. All resources and decisions are managed from one central point.

Characteristics

  • Single point of control
  • Centralized decision making
  • Consistent data
  • Simpler security
  • Easier maintenance

Examples

  • Traditional banking systems
  • Library systems
  • School management systems

Typical setup

  • Central server or mainframe
  • All branches connect to HQ
  • Single database
  • Centralized processing
  • One place for updates

Major drawback

  • Single point of failure

Distributed Computing

Multiple computers work together as one logical system. Processing, storage, and management are spread across multiple machines or locations.

Characteristics

  • Shared resources
  • Fault tolerance
  • High availability
  • Horizontal scalability
  • Load balancing

Example

  • Google Search
    • Multiple data centers
    • Distributed query processing
    • Replication and redundancy

Monolithic

Single application where all functionality is packaged into one codebase.

Characteristics

  • One deployment unit
  • Shared database
  • Tightly coupled components
  • Single technology stack
  • All-or-nothing scaling

Advantages

  • Simple to build
  • Easy to deploy
  • Good performance
  • Lower initial cost

Disadvantages

  • Hard to scale selectively
  • Technology lock-in

Examples

  • WordPress
  • Early-stage applications (many start monolithic)

Microservices

Application built as independent, small services that communicate via APIs.

Characteristics

  • Independent services
  • Separate databases (often)
  • Loosely coupled
  • Different tech stacks possible
  • Individual scaling

Advantages

  • Scale only what is needed
  • Team autonomy
  • Technology flexibility

Disadvantages

  • Operational overhead
  • Higher complexity
  • Latency and distributed failures
  • Tooling sprawl if unmanaged

Cloud Computing

Cloud computing provides compute resources (servers, storage, databases, networking, software) over the internet with pay-as-you-go pricing.

Benefits

  • Cost savings
    • No upfront infrastructure
    • Pay for usage
    • Reduced maintenance
  • Scalability
    • Scale up or down on demand
    • Handle traffic spikes
  • Accessibility
    • Access from anywhere
    • Global reach
  • Reliability
    • Backups and disaster recovery
    • Multi-region options
  • Automatic updates
    • Security patches
    • Managed services reduce ops work

Examples

  • Cloud storage
  • OTT streaming platforms

Service Models

  • SaaS (Software as a Service)
    • Ready-to-use apps
    • Examples: Gmail, Dropbox, Slack
  • PaaS (Platform as a Service)
    • App runtime and developer platforms
    • Examples: Heroku, Google App Engine
  • IaaS (Infrastructure as a Service)
    • Compute, network, storage building blocks
    • Examples: AWS EC2, Azure VMs

Edge Computing

Edge computing moves computation and storage closer to where data is generated, near or on IoT devices.

Benefits

  • Lower latency
  • Works with limited internet
  • Reduces bandwidth cost
  • Better privacy (data stays local)

Simple examples

  • Smart camera doing motion detection locally
  • Smart thermostat adjusting temperature locally
  • Factory robot making real-time decisions from sensors

Examples

  • Smart Home Security
    • Local video processing
    • Only sends alerts or clips to cloud
  • Tesla cars
    • Local sensor fusion and obstacle detection
    • Split-second decisions on device

Fog Computing

Fog computing places an intermediate layer of compute, typically gateways or local servers, between edge devices and the cloud.

What it does

  • Aggregates data from multiple edge devices
  • Provides more compute than individual devices
  • Filters and enriches data before sending to cloud
  • Keeps latency lower than cloud-only systems

Examples

  • Smart building local server processing many sensors
  • Factory gateway analyzing multiple machines
  • Farm gateway coordinating multiple sensors and controllers

Cloud vs Edge vs Fog

| Aspect | Cloud | Edge | Fog |
|---|---|---|---|
| Location | Central data centers | On/near device | Local network |
| Latency | High | Very low | Medium |
| Compute | Very high | Low | Medium |
| Storage | High | Very limited | Limited |
| Internet dependency | Required | Optional | Local network required |
| Data scope | Global | Single device | Multiple local devices |
| Typical use | Analytics, long-term storage | Real-time decisions | Aggregation, coordination |
| Example | AWS | Smart camera | Factory gateway |

Computing Evolution

Manual Computing

Calculations and decisions performed by humans.

Drawbacks

  • Slow
  • Error-prone
  • Not scalable

Automated Computing

Computers execute workflows with minimal human involvement.

  • Faster processing
  • Higher accuracy
  • Efficient resource use

Reactive Computing

System responds after events happen.

Examples

  • Incident response
  • Support tickets
  • After-the-fact troubleshooting

Proactive Computing

System predicts and acts before failures happen.

Examples

  • Predictive maintenance
  • Capacity planning
  • Anomaly detection

Idea

  • Prevention is better than cure

#iot #computing #centralized

Evolution of IoT

IoT evolved from isolated device communication to distributed, event-driven systems where intelligence is shared across edge, fog, and cloud.


Early Phase (2000–2010): Machine-to-Machine Era

Characteristics

  • Direct device-to-system communication
  • Mostly industrial use cases
  • Proprietary protocols
  • Vendor-locked implementations

Limitations

  • No standardization
  • Poor interoperability
  • High cost
  • Difficult to scale

Example: OnStar Vehicle Communication

  • Direct vehicle to control-center connection
  • Proprietary cellular network
  • Centralized command system

Capabilities

  • Emergency alerts
  • Vehicle tracking
  • Remote diagnostics

Limitations

  • Closed ecosystem
  • Single-vendor dependency
  • High operational cost

Implementation: General Motors’ OnStar system (2000s)


Initial IoT Phase (2010–2015): Three-Layer Architecture

Architecture Layers

Perception Layer

  • Sensors and actuators
  • Data collection from physical world

Network Layer

  • Connectivity
  • Data transmission

Application Layer

  • Basic analytics
  • Visualization
  • User interfaces

Key Advances

  • Cloud computing adoption
  • Open protocols emerge
  • Improved interoperability

Example 1: Nest Learning Thermostat (1st Generation)

  • Temperature and motion sensors
  • Wi-Fi connectivity
  • Cloud-backed mobile application

Impact

  • Mainstream smart home adoption
  • Remote monitoring and automation

Intermediate Phase (2015-2018): Five-Layer Architecture

The five-layer model emerged because cloud-only processing could not meet latency, scale, and enterprise integration needs.

Additional Layers

  • Transport Layer: reliable data movement
  • Processing Layer: analytics and rule engines
  • Business Layer: enterprise integration and monetization

Improvements

  • Better security models
  • Edge computing introduced
  • Improved scalability
  • Structured data management

Example: Smart City - Barcelona

Architecture

  • City-wide sensor networks
  • High-speed transport networks
  • Central data platforms
  • Multiple city applications
  • Business and governance layer

Results

  • Reduced water consumption
  • Improved traffic flow
  • Optimized waste management

Modern Phase (2018-Present): Service-Oriented Architecture

Core Characteristics

  • Microservices-based systems
  • Edge–Cloud continuum
  • Event-driven architecture
  • Zero-trust security
  • AI and ML integration

Key Capabilities

Distributed Intelligence

  • Edge processing
  • Fog computing
  • Autonomous decision-making

Advanced Integration

  • API-first design
  • Event mesh
  • Digital twins

Security

  • Identity-based access
  • End-to-end encryption
  • Continuous threat detection

Scalability

  • Containers
  • Serverless computing
  • Auto-scaling

Example: Tesla Vehicle Platform

Architecture

  • Edge computing inside vehicles
  • Cloud-based OTA updates
  • AI-driven autopilot
  • Digital vehicle twins

Impact

  • Continuous improvement
  • Predictive maintenance
  • Fleet-level intelligence

Example: Amazon Go Stores

Technologies

  • Computer vision
  • Sensor fusion
  • Edge AI
  • Deep learning

Results

  • Cashierless retail
  • Reduced operational cost
  • Improved customer experience

Autonomous IoT

  • Self-healing systems
  • Self-optimizing networks
  • Cognitive decision-making

Sustainable IoT

  • Energy-efficient design
  • Green computing
  • Resource optimization

Resilient IoT

  • Fault tolerance
  • Disaster recovery
  • Business continuity

Example: Smart Agriculture

  • Autonomous machinery
  • Drone integration
  • Soil and weather sensors
  • Precision farming

Example: Smart Grids

  • Grid sensors
  • Smart meters
  • Edge intelligence
  • Automated fault recovery
  • Demand response

Key Architectural Shifts Over Time:

  • From Centralized → Distributed
  • From Monolithic → Microservices
  • From Cloud-centric → Edge-centric
  • From Static → Dynamic
  • From Manual → Automated
  • From Reactive → Proactive

Impact on Design Considerations

Scalability

  • Vertical → Horizontal
  • Static → Elastic

Security

  • Perimeter-based → Zero trust
  • Reactive → Preventive

Integration

  • Point-to-point → Event-driven
  • Tight coupling → Loose coupling

Operations

  • Manual → Automated
  • Centralized → Distributed

#iot #evolution

Protocols

A protocol in the context of computing and communications refers to a set of rules and conventions that dictate how data is transmitted and received over a network. Protocols ensure that different devices and systems can communicate with each other reliably and effectively. They define the format, timing, sequencing, and error checking mechanisms used in data exchange.


Importance of Protocols

Interoperability: Allows different systems and devices from various manufacturers to work together.

Reliability: Ensures data is transmitted accurately and efficiently.

Standardization: Provides a common framework that developers can follow, leading to consistent implementations.


Commonly used Protocols

HTTP (HyperText Transfer Protocol): Used for transmitting web pages over the internet.

FTP (File Transfer Protocol): Used for transferring files between computers.

TCP/IP (Transmission Control Protocol/Internet Protocol): A suite of communication protocols used to interconnect network devices on the internet.

UDP (User Datagram Protocol): A connectionless protocol in the Internet Protocol Suite used to send short messages known as datagrams with minimal protocol mechanisms. Used in VoIP and live streaming.

Key Characteristics of Protocols

Syntax:

Defines the structure or format of the data.

Example: How data packets are formatted or how headers are structured.

Semantics:

Describes the meaning of each section of bits in the data.

Example: What specific bits represent, such as addressing information or control flags.

Timing:

Controls the sequencing and speed of data exchange.

Example: When data should be sent, how fast it should be sent, and how to handle synchronization.
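Syntax and semantics can be made concrete with a made-up 4-byte sensor packet; the layout and field meanings below are invented purely for illustration.

```python
import struct

# "Syntax": a fixed layout — 1-byte device ID, 1-byte flags, 2-byte reading.
# "Semantics": what each field means — flags bit 0 is an alarm flag, and the
# reading is temperature in tenths of a degree Celsius.
PACKET_FORMAT = ">BBH"  # big-endian: uint8, uint8, uint16

def encode(device_id: int, alarm: bool, temp_c: float) -> bytes:
    flags = 0b0000_0001 if alarm else 0
    return struct.pack(PACKET_FORMAT, device_id, flags, int(temp_c * 10))

def decode(packet: bytes) -> dict:
    device_id, flags, raw = struct.unpack(PACKET_FORMAT, packet)
    return {"device_id": device_id, "alarm": bool(flags & 1), "temp_c": raw / 10}

packet = encode(7, alarm=True, temp_c=21.5)  # always exactly 4 bytes
fields = decode(packet)
```

Timing is the part a format cannot capture: the protocol additionally specifies when such packets may be sent and how fast.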


1. Bluetooth

Description: A short-range wireless technology standard used for exchanging data between fixed and mobile devices. It’s a key protocol in the IoT ecosystem.

Use Cases:

  • Wearable devices (e.g., fitness trackers, smartwatches)
  • Wireless peripherals (e.g., keyboards, mice, headphones)
  • Home automation (e.g., smart locks, lighting control)
  • Health monitoring devices

2. Zigbee

Description: A low-power, low-data-rate wireless mesh network standard ideal for IoT applications. It can handle larger networks with thousands of nodes, compared to Bluetooth’s practical limit of roughly 5 to 30 devices, and offers lower latency than Bluetooth. It needs a hub or controller to communicate (e.g., Google Nest, Apple HomePod).

Use Cases:

  • Smart home devices (e.g., smart bulbs, thermostats, security systems)
  • Industrial automation
  • Smart energy applications (e.g., smart meters)
  • Wireless sensor networks

3. NFC (Near Field Communication)

Description: A direct peer-to-peer communication system. A set of communication protocols for communication between two electronic devices over a distance of 4 cm (1.6 in) or less. No pairing or controller is needed.

Use Cases:

  • Contactless payments (e.g., Apple Pay, Google Wallet)
  • Access control (e.g., NFC-enabled door locks, Yubi Keys)
  • Data exchange (e.g., transferring contacts, photos)
  • Smart posters and advertising

Payment Terminal

Phone → Terminal (direct)
Terminal → Payment processor (separate connection)

Door Access

Card → Reader (direct)
Reader → Access control system (separate connection)


4. LoRaWAN (Long Range Wide Area Network)

Description: A low-power, long-range wireless protocol designed for IoT applications.

Use Cases:

  • Smart cities (e.g., parking sensors, street lighting)
  • Agriculture (e.g., soil moisture sensors)
  • Asset tracking
  • Environmental monitoring

5. MQTT (Message Queuing Telemetry Transport)

Description: A lightweight messaging protocol for small sensors and mobile devices optimized for high-latency or unreliable networks.

  • It’s a lightweight messaging protocol designed for devices with limited resources
  • Works like a postal service for IoT devices
  • Uses a publish/subscribe model instead of direct device-to-device communication
  • Perfect for IoT because it’s:
    • Low bandwidth
    • Battery efficient
    • Reliable even with poor connections

Use Cases:

  • Home automation (e.g., smart home controllers)
  • Industrial automation.
  • Telemetry data collection.
  • Remote monitoring.

6. CoAP (Constrained Application Protocol)

Description: A specialized web transfer protocol for use with constrained nodes and networks in the IoT.

Key Characteristics

  • It’s a specialized web transfer protocol for resource-constrained IoT devices
  • Works similarly to HTTP but optimized for IoT needs
  • Uses UDP (User Datagram Protocol) instead of TCP, making it lighter and faster
  • Built for machine-to-machine (M2M) applications

Use Cases:

  • Smart energy and utility metering
  • Building automation
  • Environmental monitoring
  • Resource-constrained devices

Main Features

  • Built-in Resource Discovery
  • Support for multicast and broadcast messages
  • Simple proxy and caching capabilities
  • Low overhead and parsing complexity
  • Asynchronous message exchange
  • URI support similar to HTTP (coap://endpoint/path)
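CoAP’s low overhead starts with its fixed 4-byte header, defined in RFC 7252: version (2 bits), type (2 bits), token length (4 bits), code (8 bits), and message ID (16 bits). A sketch of packing it with Python’s struct module; only a couple of type and code constants are shown.

```python
import struct

# RFC 7252 fixed 4-byte CoAP header:
#   byte 0: version (2 bits) | type (2 bits) | token length (4 bits)
#   byte 1: code (e.g., 0.01 = GET)
#   bytes 2-3: message ID (big-endian)
COAP_VERSION = 1
TYPE_CON = 0      # confirmable message
CODE_GET = 0x01   # method code 0.01 = GET

def coap_header(msg_type: int, code: int, message_id: int, token_len: int = 0) -> bytes:
    first = (COAP_VERSION << 6) | (msg_type << 4) | token_len
    return struct.pack(">BBH", first, code, message_id)

header = coap_header(TYPE_CON, CODE_GET, 0x1234)
```

Compare this with HTTP, where the request line and headers alone are typically hundreds of bytes of text.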

Apart from these, there are a few more protocols such as Z-Wave, LTE-M, and RFID.

#protocol #http #mqtt

IoT Protocol Stack Overview

Many IoT protocols span multiple layers.
This stack is a conceptual view used to understand responsibilities, not a strict OSI mapping.

| Layer | Purpose | Examples |
|---|---|---|
| Physical Layer | Handles hardware-level transmission such as sensors, actuators, radios, and modulation. | LoRa, BLE (PHY), Zigbee (PHY), Wi-Fi, Cellular (NB-IoT, LTE-M) |
| Data Link Layer | Manages MAC addressing, framing, error detection, and local delivery. | IEEE 802.15.4, BLE Link Layer, LoRaWAN |
| Network Layer | Handles addressing and routing across networks (IP or adapted IP). | IPv6, 6LoWPAN, RPL |
| Transport Layer | Provides end-to-end data delivery and reliability where required. | UDP, TCP |
| Security Layer | Ensures encryption, authentication, and integrity. | DTLS, TLS |
| Application Layer | Defines messaging, device interaction, and application semantics. | MQTT, CoAP, HTTP, LwM2M, AMQP |

IoT Stack Preferred Languages

| Stack Layer | Preferred Languages | Why |
|---|---|---|
| Lower Stack (Firmware / Device) | C / C++ / Rust (emerging) | Direct hardware access, deterministic performance, low memory footprint, real-time constraints, zero-cost abstractions. |
| Middle Stack (Gateway / Edge) | Rust / Python | Protocol translation, buffering, edge analytics, balance of performance and developer productivity. |
| Upper Stack (Cloud / Data) | Rust / Python | Large-scale data processing, APIs, stream processing, ML orchestration, cloud-native services. |

#protocol #stack

Layers of IoT - Lower Stack

IoT architecture typically consists of several layers, each serving a specific function in the overall system. These layers can be broadly divided into the lower stack and the upper stack.

The lower stack focuses on the physical and network aspects of IoT systems. It includes the following layers:

Physical Devices and Sensors:

Devices and sensors that collect data from the environment. Examples: Smart thermostats, industrial sensors, wearable health monitors.

Device Hardware and Firmware:

Microcontrollers, processors, and firmware that manage device operations. Ensures proper functioning and communication of IoT devices.

Connectivity and Network Layer:

Communication protocols (Wi-Fi, Bluetooth, Zigbee, LoRaWAN, etc.) that transmit data. Network hardware like routers and gateways that facilitate data transmission.

Edge Computing:

Edge devices that process data locally to reduce latency and bandwidth usage. Edge analytics for real-time decision-making without relying on cloud processing.

Power Management:

Battery technologies and energy harvesting methods to power IoT devices. Ensures prolonged operational life of remote and portable devices.

#lowerstack #physicaldevices

Layers of IoT - Upper Stack

IoT architecture typically consists of several layers, each serving a specific function in the overall system. These layers can be broadly divided into the lower stack and the upper stack.

The upper stack deals with application, data processing, and user interaction aspects of IoT systems. It includes the following layers:

Data Ingestion Layer

  • Different Data formats (JSON, Binary)
  • Message Brokers and queuing systems (RabbitMQ, Apache Kafka)

Data Processing & Storage

  • Time Series Databases like InfluxDB / TimescaleDB.
  • Hot vs Cold storage strategies.
  • Data aggregation techniques.
  • Edge vs Cloud processing decisions.
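One common hot vs cold strategy: keep recent raw points queryable, and aggregate older points into per-window averages before archiving them to cheaper storage. A minimal sketch; the readings, the 1-hour window, and the function names are all illustrative.

```python
from statistics import mean

# (timestamp_seconds, value) pairs — recent raw readings stay "hot"
readings = [
    (0, 20.0), (600, 20.4), (1200, 21.0),
    (3600, 22.0), (4200, 22.6),
]

def downsample(points, window_s=3600):
    """Aggregate raw points into one average per time window."""
    buckets: dict[int, list[float]] = {}
    for ts, value in points:
        buckets.setdefault(ts // window_s, []).append(value)
    return {window: round(mean(values), 2) for window, values in buckets.items()}

# Older data is downsampled before moving to "cold" storage:
cold = downsample(readings)
```

Time series databases such as InfluxDB and TimescaleDB offer this kind of downsampling as built-in retention/aggregation policies rather than application code.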

Analytical Layer

  • Real-time analytics
  • Visualization frameworks and tools
  • Anomaly detection systems

Application Interface / Enablement

  • API (RESTful services)
  • User authentication / authorization

Enterprise Integration

  • Data transformation and mapping
  • Integration with legacy systems

#upperstack #data #integrationlayer

Puzzle

1. For each of the following IoT components, identify whether it belongs to the upper stack or the lower stack and explain why.

  • 1.1. A mobile app that allows users to control their home lighting system.

  • 1.2. A sensor that measures soil moisture levels in a farm.

  • 1.3. A gateway that translates Zigbee protocol data to Wi-Fi for transmission to the cloud.

  • 1.4. A cloud-based analytics platform that processes data from smart meters.

  • 1.5. Firmware running on a smart thermostat that controls HVAC systems.


2. Determine whether the following statements are true or false.

  • 2.1 Edge computing is part of the upper stack in IoT systems.

  • 2.2 User authentication and data encryption are important aspects of the lower stack.

  • 2.3 A smart refrigerator that sends notifications to your phone about expired food items involves both upper and lower stack components.

  • 2.4 Zigbee and Bluetooth are commonly used for high-bandwidth IoT applications.

  • 2.5 Predictive maintenance in industrial IoT primarily utilizes data from the upper stack.

#puzzle #iot

Data Processing

  1. Application Layer
    1. MQTT

    2. JSON

    3. CBOR

    4. XML

    5. TCP-UDP

    6. MessagePack

    7. Protocol Buffers

    8. HTTP & REST API

      1. Statefulness
      2. Statelessness
      3. REST API
  2. CPU Architecture
  3. Containers
    1. VMs or Containers
    2. What Container does
    3. Docker
    4. Docker Examples
    5. Container in IOT
  4. Python Environment
    1. Code Quality & Safety
    2. Error Handling
    3. Faker
    4. Logging
  5. Time Series Databases
    1. InfluxDB
    2. InfluxDB Demo
  6. Data Visualization libraries
    1. Grafana

Application Layer

Application Protocols

Lightweight protocols designed for IoT communication:

MQTT (Message Queuing Telemetry Transport):

Device → MQTT Broker → Server
Publish-subscribe model over TCP/IP.
Ideal for unreliable networks (e.g., remote sensors).

CoAP (Constrained Application Protocol):

RESTful, UDP-based protocol for low-power devices.
Features: Observe mode, resource discovery, DTLS security.

HTTP/HTTPS:

Used for cloud integration (less efficient than CoAP/MQTT).

LwM2M (Lightweight M2M):

Device management protocol built on CoAP.

Data Formats

JSON: Human-readable format for APIs and web services.

CBOR (Concise Binary Object Representation): Binary format for efficiency (used with CoAP).

XML: Less common due to larger payload size.
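A quick way to see why binary formats matter for constrained devices: encode the same reading as JSON text and as a fixed binary layout. Here struct stands in for CBOR or MessagePack, which are self-describing rather than fixed-layout but achieve similar size savings.

```python
import json
import struct

reading = {"id": 7, "temp": 21.5}

# JSON: human-readable, but every key and digit costs bytes on the wire
text_payload = json.dumps(reading).encode()

# Binary: 1-byte ID + 4-byte float = 5 bytes total
binary_payload = struct.pack(">Bf", reading["id"], reading["temp"])
```

For a sensor publishing every few seconds over a metered or low-power link, a 4–5x payload reduction directly translates into bandwidth and battery savings.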


APIs and Services

RESTful APIs: Enable integration with cloud platforms (e.g., AWS IoT, Azure IoT).

WebSocket: Real-time bidirectional communication.

Device Management: Firmware updates, remote configuration (via LwM2M).


Security Mechanisms

DTLS (Datagram TLS): Secures CoAP communications.

TLS/SSL: Used for MQTT and HTTP.

Authentication: OAuth, API keys, X.509 certificates.


Why the Application Layer Matters

Efficiency: Protocols like CoAP minimize overhead for low-power devices.

Scalability: Supports thousands of devices in large-scale deployments.

Interoperability: Enables integration with existing web infrastructure (e.g., HTTP).

Security: Ensures data integrity and confidentiality in sensitive applications.


Challenges in IoT Application Layers

Fragmentation: Multiple protocols (CoAP, MQTT, HTTP) complicate interoperability.

Resource Constraints: Limited compute/memory on devices restricts protocol choices.

Latency: Real-time applications require optimized data formats and protocols.

#applicationlayer #protocols #formats #api #services

MQTT - Message Queuing Telemetry Transport

MQTT is one of the most widely used messaging protocols in the Internet of Things (IoT).

It was originally developed by IBM in 1999 and later standardized by OASIS. MQTT became popular in IoT because it is simple, lightweight, and designed for unreliable networks.

MQTT works well on:

  • Low bandwidth networks
  • High latency connections
  • Intermittent or unreliable connectivity

Unlike HTTP, MQTT uses a binary message format, making it far more efficient for constrained devices such as sensors and embedded systems.


Why MQTT Exists

Traditional request–response protocols like HTTP are inefficient for IoT devices.

MQTT was designed to:

  • Minimize network usage
  • Reduce device CPU and memory consumption
  • Support asynchronous, event-driven communication
  • Work reliably even when devices disconnect frequently

Core MQTT Concepts

Publish–Subscribe Model

MQTT uses a publish–subscribe architecture:

  • Devices publish messages to a broker
  • Devices subscribe to topics they are interested in
  • The broker routes messages to matching subscribers

Devices never communicate directly with each other.

MQTT Components

MQTT Broker

The broker is the central message hub. Think of it like a post office:

  • Receives messages from publishers
  • Filters messages by topic
  • Delivers messages to subscribers

Common brokers:

  • Open source: Mosquitto
  • Commercial: HiveMQ

Register with HiveMQ Cloud

Publishers

Devices that send data

Example:

  • Temperature sensor publishing readings
  • Garage door device publishing open or close status

Subscribers

Devices that receive data

Example:

  • Mobile app receiving temperature updates
  • Backend system monitoring device health

Topics

Topics are hierarchical strings used to route messages.

Example:

home/livingroom/temperature

  • Publishers send messages to a topic
  • Subscribers subscribe to topics of interest
  • The broker matches topics and delivers messages

Topic Wildcards

MQTT supports topic wildcards for flexible subscriptions.

Single-level wildcard

  • Matches exactly one level

Example:

home/+/temperature

Multi-level wildcard

Matches all remaining levels

Example:

home/#
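The broker’s wildcard matching can be sketched as a small function. This is an illustrative re-implementation of the matching rules, not any real broker’s code.

```python
def topic_matches(subscription: str, topic: str) -> bool:
    """Match an MQTT topic against a subscription with + and # wildcards.

    + matches exactly one level; # matches all remaining levels.
    """
    sub_levels = subscription.split("/")
    topic_levels = topic.split("/")
    for i, sub in enumerate(sub_levels):
        if sub == "#":                       # multi-level: matches the rest
            return True
        if i >= len(topic_levels):           # topic ran out of levels
            return False
        if sub != "+" and sub != topic_levels[i]:
            return False
    return len(sub_levels) == len(topic_levels)
```

So home/+/temperature matches home/livingroom/temperature but not home/a/b/temperature, while home/# matches any topic under home.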

Key Features of MQTT

  1. Lightweight and Efficient
     • Small packet size
     • Minimal protocol overhead
     • Ideal for constrained devices
  2. Bidirectional Communication
     • Devices can both publish and subscribe
     • Enables real-time updates and control
  3. Highly Scalable
     • Supports thousands to millions of devices
     • Widely used in large IoT and IIoT deployments
  4. Configurable Reliability
     • Supports different Quality of Service levels
     • Lets you trade reliability for performance
  5. Session Persistence and Buffering
     • Brokers can store messages when clients disconnect
     • Messages are delivered when clients reconnect
  6. Security Support
     • MQTT itself has no built-in security
     • Security is added using:
       • TLS encryption
       • Client authentication
       • Access control at the broker

GitHub Example Code


graph LR
    B[MQTT Broker]
    CD1[Client Device]
    CD2[Client Device]
    CD3[Client Device]
    CD4[Client Device]
    CD5[Client Device]

    CD1 -->|Topic 2| B
    CD1 -->|Topic 1| B
    CD2 -->|Topic 2| B
    
    B -->|Topic 2| CD3
    B -->|Topic 1 & Topic 3| CD4
    B -->|Topic 3| CD5


Quality of Service (QoS)

MQTT defines three QoS levels for message delivery. QoS is coordinated by the broker.

QoS 0 – At most once

  • No acknowledgment
  • Messages may be lost
  • Lowest latency
  • Use when message loss is acceptable
  • Example: Temperature sensor every 2 seconds. High volume of data.

QoS 1 – At least once

  • Message delivery is acknowledged
  • Messages may be duplicated
  • Commonly used in IoT
  • Use when message loss is unacceptable and duplicate messages can be handled
  • Duplicates are detected using the packet (message) ID
  • Example: Smart meter readings. Door open/close.

QoS 2 – Exactly once

  • Guarantees single delivery
  • Highest overhead
  • Increased latency
  • Use only when message loss and duplication are both unacceptable.
  • Example: control commands, critical alerts, factory machine shutdown.

Higher QoS levels consume more network and compute resources.

The broker downgrades delivery to the lower of the publish QoS and the subscription QoS:

Pub QoS 1, Sub QoS 0 → delivered as QoS 0
Pub QoS 2, Sub QoS 1 → delivered as QoS 1
Pub QoS 0, Sub QoS 2 → delivered as QoS 0
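
The downgrade behavior is just a minimum; a one-line Python sketch (the function name is for illustration only):

```python
def effective_qos(publish_qos: int, subscribe_qos: int) -> int:
    """The broker delivers at the lower of the publish and subscription QoS levels."""
    return min(publish_qos, subscribe_qos)

# Reproduces the three cases above:
assert effective_qos(1, 0) == 0   # Pub QoS 1, Sub QoS 0 -> QoS 0
assert effective_qos(2, 1) == 1   # Pub QoS 2, Sub QoS 1 -> QoS 1
assert effective_qos(0, 2) == 0   # Pub QoS 0, Sub QoS 2 -> QoS 0
```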

Message Persistence

Message persistence ensures messages are not lost when clients disconnect.

Non-persistent (Default)

  • Messages are not stored
  • Lost if subscriber is offline
  • Suitable for non-critical data

Queued Persistent

  • Broker stores messages for offline clients
  • Messages delivered when client reconnects

Similar to: Emails waiting on a server until you connect

Persistent with Acknowledgment

  • Messages stored until acknowledged
  • Messages resent until confirmation

Used when: Guaranteed processing is required

Persistent Session Stores

When persistence is enabled, brokers may store:

  • Client ID
  • Subscription list
  • Unacknowledged QoS messages
  • Queued messages

CONN Car Company

Vehicles are shifting from hardware-defined to software-defined vehicles (for example, EVs like Tesla).

MQTT is used for:

  • Telemetry streaming
  • Remote diagnostics
  • Over-the-air updates
  • Feature enablement

EV companies use MQTT to connect vehicles, cloud systems, and mobile apps reliably.


MQTT doesn’t stop here

MQTT integrates with:

  • Cloud platforms
  • Data pipelines
  • Streaming systems
  • Analytics and monitoring tools

Source YouTube Links

(https://www.youtube.com/watch?v=brUsw_H9Gq8)

(https://www.youtube.com/watch?v=k103_LhF05w)


Advanced Learning about Brokers

https://www.hivemq.com/blog/mqtt-brokers-beginners-guide/

Download the Open Source Broker to learn more https://mosquitto.org/

#mqtt #http #broker #publisher #subscriber


1: http://hivemq.com

JSON

JSON (JavaScript Object Notation) is a lightweight, text-based data format that’s easy to read for both humans and machines. It was derived from JavaScript but is now language-independent, making it one of the most popular formats for data exchange between applications.

What is JSON Used For?

  • Storing configuration settings
  • Exchanging data between web servers and browsers
  • APIs (Application Programming Interfaces)
  • Storing structured data in files or databases
  • Mobile app data storage

JSON Data Types:

Strings: Text wrapped in double quotes

{"name": "Rachel Green"}

Numbers: Integer or floating-point

{"age": 27, "height": 5.5}

Booleans: true or false

{"isStudent": true}

null: Represents no value

{"middleName": null}

Arrays: Ordered lists of values

{
  "hobbies": ["shopping", "singing", "swimming"]
}

Objects: Collections of key-value pairs

{
  "address": {
    "street": "123 Main St",
    "city": "NYC",
    "zipCode": "10001"
  }
}

Important Rules:

  • All property names must be in double quotes
  • Values can be strings, numbers, objects, arrays, booleans, or null
  • Commas separate elements in arrays and properties in objects
  • No trailing commas allowed
  • No comments allowed in JSON
  • Must use UTF-8 encoding
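
These rules are enforced by any standards-compliant parser. A quick sketch using Python's built-in json module shows a valid document parsing and two common violations being rejected:

```python
import json

# A valid document parses cleanly.
data = json.loads('{"name": "Rachel Green", "age": 27}')

# Rule violations raise json.JSONDecodeError.
trailing_comma_rejected = False
try:
    json.loads('{"name": "Rachel",}')      # trailing comma: not allowed
except json.JSONDecodeError:
    trailing_comma_rejected = True

single_quotes_rejected = False
try:
    json.loads("{'name': 'Rachel'}")       # single quotes: not allowed
except json.JSONDecodeError:
    single_quotes_rejected = True
```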

Example

{
  "studentInfo": {
    "firstName": "Monica",
    "lastName": "Geller",
    "age": 22,
    "isEnrolled": true,
    "courses": [
      {
        "name": "Web Development",
        "code": "CS101",
        "grade": 95.5
      },
      {
        "name": "Database Design",
        "code": "CS102",
        "grade": 88.0
      }
    ],
    "contact": {
      "email": "monica.g@friends.com",
      "phone": null,
      "address": {
        "street": "456 College Ave",
        "city": "Columbia",
        "state": "NY",
        "zipCode": "13357"
      }
    }
  }
}

Don’ts with JSON

  • Using single quotes instead of double quotes
  • Not enclosing property names in quotes
  • Adding trailing commas
  • Missing closing brackets or braces
  • Using undefined or functions (not allowed in JSON)
  • Adding comments (not supported in JSON)

Best Practices

  • Always validate JSON using a JSON validator tool
  • Pay attention to proper nesting of objects and arrays
  • Ensure all opening brackets/braces have matching closing ones
  • Check for proper use of commas

camelCase (e.g., firstName):

  • Most popular in JavaScript/JSON
  • Easy to read and type
  • Matches JavaScript convention

Example:

{
  "firstName": "John",
  "lastLoginDate": "2024-12-20",
  "phoneNumber": "555-0123"
}

snake_case (underscores, e.g., first_name):

  • Popular in Python and SQL
  • Very readable
  • Clear word separation

Example:

{
  "first_name": "John",
  "last_login_date": "2024-12-20",
  "phone_number": "555-0123"
}

kebab-case (hyphens, e.g., first-name):

  • Common in URLs and HTML attributes
  • NOT recommended for JSON
  • Can cause issues because JavaScript interprets the hyphen as the subtraction operator
  • Requires bracket notation to access in JavaScript

Example of why it’s problematic:

// This won't work
data.first-name  // JavaScript interprets as data.first minus name

// Must use bracket notation
data["first-name"]  // Works but less convenient
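
For comparison, converting between the two recommended conventions is mechanical; a small Python sketch (the helper names are made up for this example):

```python
import re

def camel_to_snake(name: str) -> str:
    # Insert an underscore before each interior capital letter, then lowercase.
    return re.sub(r"(?<!^)(?=[A-Z])", "_", name).lower()

def snake_to_camel(name: str) -> str:
    # Capitalize every word after the first and join them back together.
    first, *rest = name.split("_")
    return first + "".join(word.capitalize() for word in rest)
```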

#json #dataformat

CBOR (Concise Binary Object Representation)

CBOR is a compact binary data format designed for efficiency, speed, and low overhead. It keeps JSON’s simplicity while delivering 30–50% smaller payloads and faster serialization, making it ideal for IoT, embedded systems, and high-throughput APIs.

https://cbor.dev

Why CBOR

JSON is human-friendly but wasteful for machines.

CBOR is Binary

  • Binary encoding instead of text
  • Smaller payloads
  • Faster parsing
  • Native binary support
  • Better fit for constrained environments

Use CBOR when:

  • Bandwidth is expensive
  • Latency matters
  • Devices are constrained
  • Message rates are high

Key Features

Binary Format

  • Compact payloads
  • Lower bandwidth usage
  • Faster transmission

Self-Describing

  • Encodes type information directly
  • No external schema required to decode

Schema-Less (Schema Optional)

  • Works like JSON
  • Supports validation using CDDL (Concise Data Definition Language)

Fast Serialization & Parsing

  • No expensive string parsing
  • Lower CPU overhead

Extensible

  • Supports semantic tags for:
  • Date / Time
  • URIs
  • Application-specific meanings

Data Types & Structure

CBOR natively supports JSON-like data structures:

Primitive Types:

  • Integers (positive, negative)
  • Byte strings (bstr)
  • Text strings (tstr)
  • Floating-point numbers (16-, 32-, and 64-bit)
  • Booleans (true, false)
  • null and undefined values

Composite Types:

  • Arrays (ordered lists)
  • Maps (key-value pairs, similar to JSON objects)

Semantic Tags:

  • Optional tags to add meaning (e.g., Tag 0 for date/time strings, Tag 32 for URIs).

Example: CBOR vs. JSON

JSON Object

{
  "id": 123,
  "name": "Temperature Sensor",
  "value": 25.5,
  "active": true
}

CBOR to/from JSON

cbor.williamchong.cloud

CBOR Playground

cbor.me

CBOR Encoding (Hex Representation)

B9 0004                                 # map(4)
   62                                   # text(2)
   6964                                 # "id"
   18 7B                                # unsigned(123)
   64                                   # text(4)
   6E616D65                             # "name"
   72                                   # text(18)
   54656D70657261747572652053656E736F72 # "Temperature Sensor"
   65                                   # text(5)
   76616C7565                           # "value"
   FB 4039800000000000                  # primitive(4627870829588250624)
   66                                   # text(6)
   616374697665                         # "active"
   F5                                   # primitive(21)
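
The hex dump can be reproduced with a toy encoder for just two CBOR major types (unsigned integers and text strings), following RFC 8949's header rule: 3 bits of major type plus 5 bits of additional info. This is an illustrative sketch, not a full encoder; use a library such as cbor2 in practice:

```python
def encode_header(major: int, value: int) -> bytes:
    """CBOR item header: (major type << 5) | additional info."""
    if value < 24:                 # value fits in the 5 additional-info bits
        return bytes([(major << 5) | value])
    if value < 0x100:              # additional info 24: one-byte value follows
        return bytes([(major << 5) | 24, value])
    if value < 0x10000:            # additional info 25: two-byte big-endian value follows
        return bytes([(major << 5) | 25]) + value.to_bytes(2, "big")
    raise ValueError("toy encoder: value too large")

def encode_uint(n: int) -> bytes:  # major type 0: unsigned integer
    return encode_header(0, n)

def encode_text(s: str) -> bytes:  # major type 3: UTF-8 text string
    data = s.encode("utf-8")
    return encode_header(3, len(data)) + data

# Reproduces bytes from the hex dump above:
assert encode_text("id") == bytes.fromhex("626964")    # 62 6964
assert encode_uint(123) == bytes.fromhex("187b")       # 18 7B
assert encode_text("value") == bytes.fromhex("2976616c7565".replace("29", "65"))  # 65 76616C7565
```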

Size Comparison:

  • JSON: ~70 bytes.
  • CBOR: ~45 bytes (35% smaller)

| Feature | CBOR | JSON/XML |
|---|---|---|
| Payload Size | Compact binary encoding (~30-50% smaller) | Verbose text-based encoding |
| Parsing Speed | Faster (no string parsing) | Slower (text parsing required) |
| Data Types | Rich (supports bytes, floats, tags) | Limited (no native byte strings) |
| Schema Flexibility | Optional schemas (CDDL) | Often requires external schemas |
| Human Readability | Requires tools to decode | Easily readable |

Limitations

Human-Unreadable: Requires tools (e.g., CBOR Playground) to decode.

Schema Validation: While optional, validation requires external tools like CDDL (Concise Data Definition Language).

When to Use CBOR

  • Low-bandwidth networks (e.g., IoT over LoRaWAN or NB-IoT).

  • High-performance systems needing fast serialization.

  • Interoperability between devices and web services.

Demo Code

git clone https://github.com/gchandra10/python_cbor_examples

CBOR + MQTT = Perfect Match

CBOR is ideal for MQTT payloads

The repository also demonstrates how CBOR can be used with MQTT.

#cbor #dataformat

XML

XML (eXtensible Markup Language) is moderately popular in IoT.

With JSON gaining popularity, XML is still used in legacy systems and regulated environments such as government/military systems.

It uses XSD (XML Schema Definition) to enforce strict data validation, ensuring integrity in critical applications like healthcare.

Legacy systems built on SOAP-based web services (newer ones use REST APIs) often use XML, requiring IoT devices to adopt XML for compatibility.

<sensorData>
    <deviceId>TEMP_SENSOR_01</deviceId>
    <location>living_room</location>
    <reading>
        <temperature>23.5</temperature>
        <unit>Celsius</unit>
        <timestamp>2025-01-29T14:30:00</timestamp>
    </reading>
</sensorData>

Limitations of XML in IoT

  • Verbosity: Larger payloads increase bandwidth and storage costs.
  • Processing Overhead: Parsing XML can strain low-power IoT devices.
  • Modern Alternatives: JSON and binary formats (e.g., Protocol Buffers) are more efficient for most IoT use cases.

XML vs. JSON trade-offs:

| Factor | XML | JSON |
|---|---|---|
| Payload Size | Verbose (larger files) | Compact (better for low-bandwidth IoT) |
| Parsing Speed | Slower (complex structure) | Faster (lightweight parsing) |
| Validation | Mature (XSD) | Growing (JSON Schema) |
| Adoption in New Projects | Rare (outside legacy/regulated use cases) | Dominant (preferred for new IoT systems) |

#xml #dataformat

TCP & UDP

  • Transmission Control Protocol
  • User Datagram Protocol

TCP and UDP are transport protocols. Their only job is to decide how data moves across the network.

Common IoT problems

  • Sensors generate data continuously
  • Networks are unreliable
  • Devices are constrained
  • Some data loss is acceptable and some is not

UDP

  • Sends data without confirmation
  • No retries
  • No ordering
  • No connection
  • Very low overhead

UDP Use Cases in IoT

  • Battery powered devices
  • High frequency telemetry
  • Small payloads
  • Occasional loss is acceptable
  • Speed matters more than accuracy

Typical IoT usage

  • CoAP
  • Device discovery
  • Heartbeats
  • Periodic environmental sampling

Example

Smart street lighting

  • Each lamp sends a heartbeat every 5 to 10 seconds
  • Payload: device_id, status, battery, signal strength
  • If ‘n’ heartbeats are missed, mark lamp as offline
  • Losing one packet changes nothing.

Vehicle Telematics

  • Fleet vehicles send location and health pings
  • One ping every few seconds
  • Next ping overrides the previous
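
The fire-and-forget nature of a UDP heartbeat can be sketched with Python's socket module. This sends one datagram over loopback; the device ID and payload fields are made up for the example:

```python
import json
import socket

# Receiver: bind to an ephemeral localhost port (stands in for the monitoring service).
receiver = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
receiver.bind(("127.0.0.1", 0))
receiver.settimeout(5)
port = receiver.getsockname()[1]

# Sender: the "lamp" fires one datagram. No handshake, no ack, no retry.
sender = socket.socket(socket.AF_INET, socket.SOCK_DGRAM)
heartbeat = {"device_id": "lamp-42", "status": "on", "battery": 87, "rssi": -71}
sender.sendto(json.dumps(heartbeat).encode(), ("127.0.0.1", port))

# On loopback this arrives; on a real network the packet may simply be lost.
data, addr = receiver.recvfrom(1024)
msg = json.loads(data)

sender.close()
receiver.close()
```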

TCP

  • Confirms delivery
  • Retries lost data
  • Preserves order
  • Maintains a connection
  • Higher overhead

TCP use cases in IoT

  • Data must not be lost
  • Order matters
  • Sessions last minutes or hours

Typical IoT usage

  • MQTT
  • HTTP
  • HTTPS
  • TLS secured pipelines

With MQTT

  • Ordered messages
  • Delivery guarantees using QoS
  • Persistent sessions
  • Broker side buffering
  • Fan out to many subscribers

UDP vs TCP

| Question | UDP | TCP |
|---|---|---|
| Is delivery guaranteed | No | Yes |
| Is ordering preserved | No | Yes |
| Is it lightweight | Yes | No |
| Does MQTT use it | No | Yes |
| Does CoAP use it | Yes | No |
| Best for battery devices | Yes | Sometimes |
| Best for critical data | No | Yes |

          SENSOR
            |
            |
     -----------------
     |               |
   UDP Path         TCP Path
     |               |
 No confirmation   Confirmed delivery
 No retry          Retry on failure
 Possible loss     Ordered messages
     |               |
   CoAP           MQTT Broker
                     |
               Persistent sessions
                     |
                 Cloud Applications

MessagePack

A compact binary data interchange format

What is MessagePack

MessagePack is an efficient binary serialization format designed for fast and compact data exchange between systems.

Core properties

  • Compact compared to text formats like JSON
  • Fast serialization and deserialization
  • Cross-language support across many ecosystems
  • Flexible data model with optional extensions

Why MessagePack

MessagePack solves a very specific problem:

  • JSON is easy to read but inefficient on the wire
  • IoT and distributed systems care about bytes, latency, and CPU
  • MessagePack keeps JSON-like simplicity but removes text overhead

In short, JSON Data model with Binary efficiency.


Key Use Cases

  1. IoT telemetry and device data
  2. Edge gateways aggregating high-frequency events
  3. Microservice-to-microservice communication
  4. Caching layers like Redis and Memcached
  5. Distributed systems logging and checkpoints

MessagePack vs JSON

  • Binary and compact
  • Faster to parse
  • Smaller payloads for most data
  • Not human-readable
  • Debugging requires tooling

MessagePack vs CBOR

  • MessagePack is simpler and lighter
  • CBOR supports semantic tags like datetime and URI
  • CBOR supports deterministic encoding for hashing and signatures
  • Size differences are workload-dependent, not guaranteed

Comparison with Similar Formats

| Feature | MessagePack | JSON | CBOR |
|---|---|---|---|
| Encoding | Binary | Text | Binary |
| Human-readable | No | Yes | No |
| Data Size | Small (varies) | Large | Small (varies) |
| Schema Required | No | No | No |
| Standardization | Community | RFC 8259 | RFC 8949 |
| Binary Data Support | Native | Base64 | Native |
| Semantic Tags | No | No | Yes |
| Deterministic Encoding | No | No | Yes |

Key Differences:

  • vs JSON: 20-30% smaller payloads, faster parsing, but not human-readable
  • vs CBOR: More compact for simple types, CBOR has better semantic tagging

Basic Operations

  • packb() converts Python objects to MessagePack bytes
  • unpackb() converts MessagePack bytes back to Python objects
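
packb() and unpackb() come from the msgpack package on PyPI. To show why the payload shrinks, here is a toy encoder covering only the smallest MessagePack types (positive fixint, fixstr, fixmap); it is a sketch of the wire format, not a replacement for the real library:

```python
import json

def pack_small(obj) -> bytes:
    """Toy MessagePack encoder for positive fixint, fixstr, and fixmap only."""
    if isinstance(obj, bool):                     # check bool before int: bool subclasses int
        return b"\xc3" if obj else b"\xc2"
    if isinstance(obj, int) and 0 <= obj <= 0x7F:
        return bytes([obj])                       # positive fixint: the byte IS the value
    if isinstance(obj, str) and len(obj.encode()) <= 31:
        data = obj.encode()
        return bytes([0xA0 | len(data)]) + data   # fixstr: 101XXXXX length prefix
    if isinstance(obj, dict) and len(obj) <= 15:
        out = bytes([0x80 | len(obj)])            # fixmap: 1000XXXX pair count
        for key, value in obj.items():
            out += pack_small(key) + pack_small(value)
        return out
    raise ValueError("toy encoder: unsupported type or size")

msgpack_bytes = pack_small({"id": 123})           # 5 bytes on the wire
json_bytes = json.dumps({"id": 123}).encode()     # 11 bytes as text
```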

Python Example

git clone https://github.com/gchandra10/python_messagepack_examples.git

MessagePack in IoT and Edge Systems

  • Commonly used in edge gateways and ingestion pipelines
  • Efficient for short, frequent telemetry messages
  • Suitable for MQTT payloads where the broker is payload-agnostic
  • Rarely used directly in regulated firmware layers

Important:

  • MQTT does not care about payload format
  • MessagePack is an application-layer choice, not a protocol requirement

Summary

When to Choose MessagePack

  • Bandwidth or memory is constrained
  • JSON is too verbose
  • Binary data is common
  • Speed matters more than readability
  • Schema flexibility is acceptable

What MessagePack Does Not Do

  • No schema enforcement
  • No backward compatibility guarantees
  • No semantic meaning for fields
  • No built-in validation
  • No deterministic encoding

Devices like the Apple Watch and Fitbit use Protocol Buffers where strict schema enforcement is required, for example for FDA-regulated health data.

#dataformat #messagepack

Protocol Buffers

What are Protocol Buffers

  • A method to serialize structured data into binary format
  • Created by Google
  • It’s like JSON, but smaller and faster.
  • Protocol Buffers are more commonly used in industrial IoT scenarios.

Why Protobuf is great for IoT

  • Smaller size: Uses binary format instead of text, saving bandwidth
  • Faster processing: Binary format means less CPU usage on IoT devices
  • Strict schema: Helps catch errors early
  • Language neutral: Works across different programming languages
  • Great for limited devices: Uses less memory and battery power
  • Extensibility: Add new fields to your message definitions without breaking existing code.

Industrial Use Cases

  • Bridge structural sensors (vibration, stress)
  • Factory equipment monitors
  • Power grid sensors
  • Oil/gas pipeline monitors
  • Wind turbine telemetry
  • Industrial HVAC systems

Why Industries prefer Protobuf:

  • High data volume (thousands of readings per second)
  • Need for efficient bandwidth usage
  • Complex data structures
  • Multiple systems need to understand the data
  • Long-term storage requirements
  • Cross-platform compatibility needs
graph LR
    subgraph Bridge["Bridge Infrastructure"]
        S1[Vibration Sensor] --> GW
        S2[Strain Gauge] --> GW
        S3[Temperature Sensor] --> GW
        subgraph Gateway["Linux Gateway (Solar)"]
            GW[Edge Gateway]
            DB[(Local Storage)]
            GW --> DB
        end
    end
    
    subgraph Communication["Communication Methods"]
        GW --> |4G/LTE| Cloud
        GW --> |LoRaWAN| Cloud
        GW --> |Satellite| Cloud
    end
    
    Cloud[Cloud Server] --> DA[Data Analysis]
    
    style Bridge fill:#87CEEB,stroke:#333,stroke-width:2px,color:black
    style Gateway fill:#90EE90,stroke:#333,stroke-width:2px,color:red
    style Communication fill:#FFA500,stroke:#333,stroke-width:2px,color:black
    style Cloud fill:#4169E1,stroke:#333,stroke-width:2px,color:white
    style DA fill:#4169E1,stroke:#333,stroke-width:2px,color:white
    style GW fill:#000000,stroke:#333,stroke-width:2px,color:white
    style DB fill:#800020,stroke:#333,stroke-width:2px,color:white
    
    classDef sensor fill:#00CED1,stroke:#333,stroke-width:1px,color:black
    class S1,S2,S3 sensor

Consumer IoT devices (in general)

  • Use simpler formats (JSON, proprietary)
  • Have lower data volumes
  • Work within closed ecosystems (Google Home, Apple HomeKit)
  • Don’t need the optimization Protobuf provides

Data Types in Protobufs

Scalar Types:

int32, int64, uint32, uint64, sint32, sint64, fixed32, fixed64, sfixed32, sfixed64, float, double, bool, string, bytes

Composite Types:

  • message: Defines a structured collection of other fields.
  • enum: Defines a set of named integer constants.

Collections:

  • repeated: Defines a list of values of the same type, similar to an array.

Steps involved in creating a Proto Buf data file.

Step 1: Define the data structure in a .proto text file. Ex: my_data.proto

syntax = "proto3";

message MyData {
  int32 id = 1;
  string name = 2;
  float value = 3;
}

Step 2: Compile the .proto file into a Python class or Java class (.java) using the protoc compiler.

protoc --python_out=. my_data.proto

Generates my_data_pb2.py

Install Protoc

Step 3: Use the Generated Python Class file and use it to store data.

Note: The protoc --version should match (or be close to) the minor version of the protobuf package from PyPI.

In my setup, protoc --version = 29.3 and PyPI protobuf = 5.29.2; the protobuf minor version is 29.2, which is close to 29.3.

See example.
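
Under the generated classes, protobuf serializes each field as a key (field number plus wire type) followed by a varint-encoded value. A minimal sketch of that wire format, for illustration only; protoc-generated code handles all of this for you:

```python
def encode_varint(n: int) -> bytes:
    """Protobuf varint: 7 data bits per byte, high bit set on all but the last byte."""
    out = bytearray()
    while True:
        byte = n & 0x7F
        n >>= 7
        if n:
            out.append(byte | 0x80)   # more bytes follow
        else:
            out.append(byte)          # final byte
            return bytes(out)

def field_key(field_number: int, wire_type: int) -> bytes:
    # Each field starts with a key: (field_number << 3) | wire_type.
    return encode_varint((field_number << 3) | wire_type)

# MyData.id = 123 (field 1, varint wire type 0) serializes to 08 7B:
payload = field_key(1, 0) + encode_varint(123)
```

Note how a small integer costs two bytes total (key + value), which is where protobuf's compactness comes from.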

Demo Script

git clone https://github.com/gchandra10/python_protobuf_demo
flowchart LR
    subgraph Sensor["Temperature/Humidity Sensor"]
        S1[DHT22/BME280]
    end
    
    subgraph MCU["Microcontroller"]
        M1[ESP32/Arduino]
    end
    
    subgraph Gateway["Gateway/Edge Device"]
        G1[Raspberry Pi/\nIntel NUC]
    end
    
    subgraph Cloud["Cloud Server"]
        C1[AWS/Azure/GCP]
    end
    
    S1 -->|Raw Data 23.5°C, 45%| M1
    M1 -->|"JSON over MQTT {temp: 23.5,humidity: 45}"| G1
    G1 -->|Protocol Buffers\nover HTTPS| C1

#protobuf #google

HTTP Basics


HTTP (HyperText Transfer Protocol) is the foundation of data communication on the web, used to transfer data (such as HTML files and images).

GET - Navigate to a URL or click a link in real life.

POST - Submit a form on a website, like a username and password.

Popular HTTP Status Codes

200 Series (Success): 200 OK, 201 Created.

300 Series (Redirection): 301 Moved Permanently, 302 Found.

400 Series (Client Error): 400 Bad Request, 401 Unauthorized, 404 Not Found.

500 Series (Server Error): 500 Internal Server Error, 503 Service Unavailable.
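
Python's standard library already models these codes; a quick sketch using http.HTTPStatus:

```python
from http import HTTPStatus

# Each member carries the numeric code and its reason phrase.
ok = HTTPStatus.OK                          # 200
created = HTTPStatus.CREATED                # 201
not_found_phrase = HTTPStatus(404).phrase   # "Not Found"

# The series is just the hundreds digit: 2xx success, 4xx client error, 5xx server error.
def series(code: int) -> int:
    return code // 100

client_error_series = series(HTTPStatus.BAD_REQUEST)
server_error_series = series(HTTPStatus.SERVICE_UNAVAILABLE)
```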

We already learned about Monolithic and Microservices.

#http #status #monolithic #microservices

Statefulness


The server stores information about the client’s current session in a stateful system. This is common in traditional web applications. Here’s what characterizes a stateful system:

Session Memory: The server remembers past interactions and may store session data like user authentication, preferences, and other activities.

Server Dependency: Since the server holds session data, the same server usually handles subsequent requests from the same client. This is important for consistency.

Resource Intensive: Maintaining state can be resource-intensive, as the server needs to manage and store session data for each client.

Example: A web application where a user logs in, and the server keeps track of their authentication status and interactions until they log out.

sequenceDiagram
    participant C as Client
    participant LB as Load Balancer
    participant S1 as Server 1
    participant S2 as Server 2
    
    Note over C,S2: Initial Session Establishment
    C->>LB: Initial Request
    LB->>S1: Forward Request
    S1-->>LB: Response (Session ID)
    LB-->>C: Response (Session ID)
    
    rect rgb(255, 255, 200)
        Note over C,S2: Sticky Session Established
    end
    
    Note over C,S2: Session Continuation
    C->>LB: Subsequent Request (with Session ID)
    LB->>S1: Forward Request (based on Session ID)
    S1-->>LB: Response (Data)
    LB-->>C: Response (Data)
    
    rect rgb(255, 255, 200)
        Note over C,S2: Session Continues on Server 1
    end
    
    Note over C,S2: Session Termination
    C->>LB: Logout Request
    LB->>S1: Forward Logout Request
    S1-->>LB: Confirmation
    LB-->>C: Confirmation
    
    rect rgb(255, 255, 200)
        Note over C,S2: Session Ended
    end
    
    rect rgb(255, 255, 200)
        Note right of S2: Server 2 remains unused due to stickiness
    end

Stickiness (Sticky Sessions)

Stickiness or sticky sessions are used in stateful systems, particularly in load-balanced environments. It ensures that requests from a particular client are directed to the same server instance. This is important when:

Session Data: The server needs to maintain session data (like login status), and it’s stored locally on a specific server instance.

Load Balancers: In a load-balanced environment, without stickiness, a client’s requests could be routed to different servers, which might not have the client’s session data.

Trade-off: While it helps maintain session continuity, it can reduce the load balancing efficiency and might lead to uneven server load.

Methods of Implementing Stickiness

Cookie-Based Stickiness: The most common method, where the load balancer uses a special cookie to track the server assigned to a client.

IP-Based Stickiness: The load balancer routes requests based on the client’s IP address, sending requests from the same IP to the same server.

Custom Header or Parameter: Some load balancers can use custom headers or URL parameters to track and maintain session stickiness.
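
IP-based stickiness can be sketched as a hash of the client IP onto a fixed server pool. The server names are hypothetical; real load balancers use the same idea with more robust hashing:

```python
import hashlib

SERVERS = ["server-1", "server-2", "server-3"]   # hypothetical backend pool

def pick_server(client_ip: str) -> str:
    """Map a client IP onto the pool deterministically.

    The same IP always hashes to the same server, which is exactly
    what IP-based stickiness requires."""
    digest = hashlib.sha256(client_ip.encode()).digest()
    return SERVERS[int.from_bytes(digest[:4], "big") % len(SERVERS)]
```

One known trade-off of this scheme: if the pool size changes, most IPs remap to a different server, losing their session data.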

#statefulness

Statelessness


In a stateless system, each request from the client must contain all the information the server needs to fulfill that request. The server does not store any state of the client’s session. This is a crucial principle of RESTful APIs. Characteristics include:

No Session Memory: The server remembers nothing about the user once the transaction ends. Each request is independent.

Scalability: Stateless systems are generally more scalable because the server doesn’t need to maintain session information. Any server can handle any request.

Simplicity and Reliability: The stateless nature makes the system simpler and more reliable, as there’s less information to manage and synchronize across systems.

Example: An API where each request contains an authentication token and all necessary data, allowing any server instance to handle any request.

sequenceDiagram
    participant C as Client
    participant LB as Load Balancer
    participant S1 as Server 1
    participant S2 as Server 2
    
    C->>LB: Request 1
    LB->>S1: Forward Request 1
    S1-->>LB: Response 1
    LB-->>C: Response 1
    
    C->>LB: Request 2
    LB->>S2: Forward Request 2
    S2-->>LB: Response 2
    LB-->>C: Response 2
    
    rect rgb(255, 255, 200)
        Note over C,S2: Each request is independent
    end

In this diagram:

Request 1: The client sends a request to the load balancer.

Load Balancer to Server 1: The load balancer forwards Request 1 to Server 1.

Response from Server 1: Server 1 processes the request and sends a response back to the client.

Request 2: The client sends another request to the load balancer.

Load Balancer to Server 2: This time, the load balancer forwards Request 2 to Server 2.

Response from Server 2: Server 2 processes the request and responds to the client.

Statelessness: Each request is independent and does not rely on previous interactions. Different servers can handle other requests without needing a shared session state.

Token-Based Authentication

Common in stateless architectures, this method involves passing a token for authentication with each request instead of relying on server-stored session data. JWT (JSON Web Tokens) is a popular example.
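
The core idea can be sketched with Python's stdlib hmac module. This is a simplified signed token, not a real JWT (a JWT also encodes a header, standard claims such as expiry, and uses a specific three-part layout); any server holding the secret can verify the token without stored session state:

```python
import base64
import hashlib
import hmac
import json

SECRET = b"demo-secret-key"   # hypothetical shared secret for the sketch

def issue_token(payload: dict) -> str:
    # Encode the payload, then sign it so tampering is detectable.
    body = base64.urlsafe_b64encode(json.dumps(payload).encode()).decode()
    signature = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    return body + "." + signature

def verify_token(token: str):
    # Recompute the signature; constant-time compare avoids timing leaks.
    body, signature = token.rsplit(".", 1)
    expected = hmac.new(SECRET, body.encode(), hashlib.sha256).hexdigest()
    if not hmac.compare_digest(signature, expected):
        return None   # tampered or forged token
    return json.loads(base64.urlsafe_b64decode(body))
```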

#statelessness

REST API


REpresentational State Transfer is a software architectural style developers apply to web APIs.

REST APIs provide simple, uniform interfaces because they can be used to make data, content, algorithms, media, and other digital resources available through web URLs. Essentially, REST APIs are the most common APIs used across the web today.

Use of a uniform interface

HTTP Methods

GET: This method allows the server to find the data you requested and send it back to you.

POST: This method permits the server to create a new entry in the database.

PUT: If you perform the ‘PUT’ request, the server will update an entry in the database.

DELETE: This method allows the server to delete an entry in the database.

Sample REST API

https://api.zippopotam.us/us/08028

http://api.tvmaze.com/search/shows?q=friends

https://jsonplaceholder.typicode.com/posts

https://jsonplaceholder.typicode.com/posts/1

https://jsonplaceholder.typicode.com/posts/1/comments

https://reqres.in/api/users?page=2

https://reqres.in/api/users/2

More examples

http://universities.hipolabs.com/search?country=United+States

https://itunes.apple.com/search?term=pop&limit=1000

https://www.boredapi.com/api/activity

https://techcrunch.com/wp-json/wp/v2/posts?per_page=100&context=embed

CURL

Install curl (Client URL)

curl is a CLI application available for all OS.

https://curl.se/windows/

brew install curl

Usage

curl https://api.zippopotam.us/us/08028
curl https://api.zippopotam.us/us/08028 -o zipdata.json

Browser based

https://httpie.io/app

VS Code based

Get Thunder Client

Using Python

git clone https://github.com/gchandra10/python_read_restapi
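
Alongside the repo, here is a minimal stdlib-only sketch using urllib (no requests package needed). The live call is left as a comment so the example runs without network access; the JSON decoding step is shown on a canned body:

```python
import json
import urllib.request

def fetch_json(url: str) -> dict:
    """GET a URL and decode the JSON body using only the standard library."""
    with urllib.request.urlopen(url, timeout=10) as resp:
        return json.loads(resp.read().decode("utf-8"))

# Live call (requires network): fetch_json("https://api.zippopotam.us/us/08028")

# Decoding works identically on a canned response body:
sample_body = '{"post code": "08028", "country": "United States"}'
record = json.loads(sample_body)
```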

Summary

Definition: REST (Representational State Transfer) API is a set of guidelines for building web services. A RESTful API is an API that adheres to these guidelines and allows for interaction with RESTful web services.

How It Works: REST uses standard HTTP methods like GET, POST, PUT, DELETE, etc. It is stateless, meaning each request from a client to a server must contain all the information needed to understand and complete the request.

Data Format: REST APIs typically exchange data in JSON or XML format.

Purpose: REST APIs are designed to be a simple and standardized way for systems to communicate over the web. They enable the backend services to communicate with front-end applications (like SPAs) or other services.

Use Cases: REST APIs are used in web services, mobile applications, and IoT (Internet of Things) applications for various purposes like fetching data, sending commands, and more.

#restapi #rest

CPU Architecture Fundamentals

Introduction

CPU architecture defines:

  • The instruction set a processor understands
  • Register structure
  • Memory addressing model
  • Binary format

It determines what machine code can run on a processor.

If software is compiled for one architecture, it cannot run on another without translation.


Major CPU Architectures

In today’s world, two architectures dominate.

1. amd64 (x86_64)

  • Designed by AMD, adopted by Intel
  • Dominates desktops and traditional servers
  • Common in enterprise data centers
  • Most Windows laptops
  • Intel-based Macs

Characteristics:

  • High performance
  • Higher power consumption

2. arm64 (aarch64)

  • Designed for power efficiency
  • Common in embedded systems and mobile devices
  • Raspberry Pi
  • Apple Silicon (M*)
  • Many IoT gateways

Characteristics:

  • Energy efficient
  • Dominant in IoT and edge computing

Mac/Linux

uname -m

Windows

echo %PROCESSOR_ARCHITECTURE%
systeminfo | findstr /B /C:"System Type"

In IoT environments:

  • Edge devices: usually arm64
  • Cloud: often amd64 (ARM growing fast)
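
The same check works from Python via the stdlib platform module:

```python
import platform

# Reports the machine architecture the interpreter is running on,
# e.g. 'x86_64' on amd64, 'arm64' or 'aarch64' on ARM.
arch = platform.machine()
print(arch)
```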

How Programming Languages Relate to Architecture

                +----------------------+
                |     Source Code      |
                |  (C, Rust, Python)   |
                +----------+-----------+
                           |
                           v
                +----------------------+
                |     Compiler /       |
                |     Interpreter      |
                +----------+-----------+
                           |
         +-----------------+-----------------+
         |                                   |
         v                                   v
+---------------------+          +----------------------+
|  amd64 Binary       |          |  arm64 Binary       |
|  (x86_64 machine    |          |  (ARM machine       |
|   instructions)     |          |   instructions)     |
+----------+----------+          +----------+-----------+
           |                                |
           v                                v
+---------------------+          +----------------------+
|  Intel / AMD CPU    |          |  ARM CPU            |
|  (Laptop, Server)   |          |  (Raspberry Pi,     |
|                     |          |   IoT Gateway)      |
+---------------------+          +----------------------+

Compiled Languages

Examples: C, C++, Rust, Go

When compiled, they produce native machine code.

Compile on Windows - produces an amd64 binary.

Compile on Raspberry Pi or new Mac - produces an arm64 binary.

That binary cannot run on a different architecture.

Interpreted Languages

Examples: Python, Node.js

Source code is architecture-neutral. Interpreter handles it.

The interpreter (Python, Node) is architecture-specific

Native extensions are architecture-specific.

Java and Bytecode

            +------------------+
            |   Java Source    |
            +--------+---------+
                     |
                     v
            +------------------+
            |    Bytecode      |
            |   (.class file)  |
            +--------+---------+
                     |
         +-----------+-----------+
         |                       |
         v                       v
+------------------+     +------------------+
| JVM (amd64)      |     | JVM (arm64)      |
+--------+---------+     +--------+---------+
         |                        |
         v                        v
   Intel CPU                ARM CPU

Java uses a different model.

Compile: javac MyApp.java

Produces: MyApp.class

This is bytecode, not native machine code.

Bytecode runs on the JVM (Java Virtual Machine).

The JVM is architecture-specific.

Same bytecode runs on amd64 JVM

Same bytecode runs on arm64 JVM

Java achieves portability through a virtual machine layer.

Cross Compilation

It is possible to cross compile for a different architecture than your current architecture.

Developer Laptop (amd64)
        |
        | build
        v
   amd64 binary
        |
        | deploy
        v
Raspberry Pi (arm64)
        |
        X  Fails (architecture mismatch)
Developer Laptop
        |
        | cross-build for arm64
        v
   arm64 binary
        |
        v
Raspberry Pi (runs successfully)

Architecture in IoT Upper Stack

Layer               Typical Architecture
Microcontroller     ARM (32-bit or 64-bit)
Edge Gateway        arm64
Cloud VM            amd64 or arm64
Personal Machines   amd64 or arm64

#architecture #arm #amdVer 6.0.23

Last change: 2026-04-16

[Avg. reading time: 7 minutes]

Containers

World before containers

Physical Machines

Traditional Physical Stack

  • 1 Physical Server
  • 1 Host Machine (say some Linux)
  • 3 Applications installed

Limitation:

  • Need for a dedicated physical server.
  • Version dependencies between the host OS and its applications.
  • Patches "hopefully" not affecting applications.
  • All apps must work with the same host OS.

Multiple Physical Stack

  • 3 physical servers
  • 3 host machines (different OSes)
  • 3 applications installed

Limitation:

  • Need for multiple physical servers.
  • Version dependencies between each host OS and its applications.
  • Patches "hopefully" not affecting applications.
  • Maintenance of 3 machines.
  • Networking all three so they work together.

Virtual Machines

Virtual Machines

  • Virtual machines emulate a real computer by virtualizing it, running on top of a physical host to execute applications.

  • To emulate a real computer, virtual machines use a Hypervisor to create a virtual computer.

  • On top of the Hypervisor runs a Guest OS, a virtualized operating system in which we can run isolated applications.

  • Applications that run in virtual machines have access to binaries and libraries on top of that guest operating system.

( + ) Full isolation, full virtualization
( - ) Too many layers, heavy-duty servers

Key Benefits

  • Better resource utilization than separate physical servers
  • Strong isolation between applications
  • Ability to run different OS environments
  • Easier backup and snapshot capabilities

Limitations

  • Better than a single OS but still has overhead
  • Each VM requires its own OS resources
  • Slower startup times compared to containers
  • Higher memory usage due to multiple OS instances

Containers

Containers

Containers are lightweight, portable environments that package an application with everything it needs to run—like code, runtime, libraries, and system tools—ensuring consistency across different environments. They run on the same operating system kernel and isolate applications from each other, which improves security and makes deployments easier.

  • Containers are isolated processes that share resources with their host and, unlike VMs, don’t virtualize the hardware and don’t need a Guest OS.

  • Containers share resources with other Containers in the same host.

  • This gives more performance than VMs (no separate guest OS).

  • Container Engine in place of Hypervisor.

Pros

  • Isolated Process
  • Mounted Files
  • Lightweight Process

Cons

  • Same Host OS
  • Security

#containers #dockerVer 6.0.23

Last change: 2026-04-16

[Avg. reading time: 3 minutes]

VMs or Containers

VMs are great for running multiple, isolated OS environments on a single hardware platform. They offer strong security isolation and are useful when applications need different OS versions or configurations.

Containers are lightweight and share the host OS kernel, making them faster to start and less resource-intensive. They’re perfect for microservices, CI/CD pipelines, and scalable applications.

Smart engineers focus on the right tool for the job rather than getting caught up in “better or worse” debates.

Use them in combination to make life better.

Docker: The most widely used container platform, known for its simplicity, portability, and extensive ecosystem.

Podman: A daemonless container engine that’s compatible with Docker but emphasizes security, running containers as non-root users.

We will be using Docker for this course.

#vm #container #dockerVer 6.0.23

Last change: 2026-04-16

[Avg. reading time: 1 minute]

What container does

It brings to us the ability to create applications without worrying about their environment.

What container does

  • Docker turns “my machine” into the machine
  • Docker is not a magic wand.
  • It only guarantees the environment is identical
  • Correctness still depends on what you build and how you run it.

#worksforme #container #dockerVer 6.0.23

Last change: 2026-04-16

[Avg. reading time: 6 minutes]

Docker Basics

At a conceptual level, Docker is built around two core abstractions:

  • Images – what you build
  • Containers – what you run

Everything else in Docker exists to build, store, distribute, and execute these two artifacts.

Images

  • An image is an immutable, layered filesystem snapshot
  • Built from a Dockerfile
  • Each instruction creates a new read-only layer
  • Images are content-addressed via SHA256 digests

Image is a versioned, layered blueprint

Key properties:

  • Immutable
  • Reusable
  • Cached aggressively
  • Portable across environments

Container

A container is a running instance of an image

  • A writable layer on top of image layers
  • Namespaces for isolation (PID, USER)
  • Containers are processes, not virtual machines
  • When the main process exits, the container stops

Image vs Container

Aspect       Image        Container
Nature       Static       Dynamic
Mutability   Immutable    Mutable
Lifecycle    Build-time   Runtime
Role         Artifact     Instance
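The image/container relationship is often compared to classes and instances in a language like Python. This is only an analogy, not Docker API code:

```python
class Image:
    """Static blueprint: a build-time artifact."""
    def __init__(self, name, tag):
        self.name, self.tag = name, tag

class Container:
    """Running instance with its own mutable state."""
    def __init__(self, image):
        self.image = image
        self.state = "created"

    def start(self):
        self.state = "running"

img = Image("python", "3.12-slim")
c1 = Container(img)   # many containers can share one image
c2 = Container(img)
c1.start()
print(c1.state, c2.state)  # running created
```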

Where Do Images Come From?

Docker Hub

https://hub.docker.com/

  • Default public container registry
  • Hosts official and community images
  • Supports tags, digests, vulnerability scans
  • Docker Hub is default, not mandatory

Apart from Docker Hub, there are a few other common registries:

AWS ECR

GCP Artifact Registry

Azure Container Registry

GitHub Container Registry

Private / On-Prem Registries

Harbor

JFrog Artifactory

Enterprises widely use on-prem or private registries. JFrog Artifactory is extremely common in regulated environments.

#docker #container #repositories #hubVer 6.0.23

Last change: 2026-04-16

[Avg. reading time: 16 minutes]

Docker Examples

  • Lists images available on the local machine
docker image ls
  • To get a specific image
docker image pull <imagename>
docker image pull python:3.12-slim
  • To inspect the downloaded image
docker image inspect python:3.12-slim

Check the architecture, ports open etc..

  • Create a container
docker create \
    --name edge-http \
    -p 8000:8000 \
    python:3.12-slim \
    python -m http.server 

List the Image and container again

  • Start the container
docker start edge-http

Open a browser and check http://localhost:8000 — it lists the container's internal file structure.

docker inspect edge-http
  • Shows all running containers
docker container ls
  • Shows all containers
docker container ls -a
  • Disk usage by images, containers, volumes
docker system df
  • Logs Inspection
docker logs edge-http
docker inspect edge-http
  • Stop and remove
docker stop edge-http
docker rm edge-http

docker run is a wrapper for docker pull, docker create, docker start

Run an MQTT Broker

MQTT broker typically runs at edge or cloud.

  • Create a new container
docker run -d \
  --name mqtt-broker \
  -p 1883:1883 \
  eclipse-mosquitto:2.0
  • Verify
docker container ls
docker logs mqtt-broker
  • Stop and Delete
docker stop mqtt-broker
docker rm mqtt-broker

Deploy MySQL Database using Containers

Create the following folder

Linux / Mac

mkdir -p container/mysql/data
cd container/mysql

Windows

md container
cd container
md mysql
cd mysql
mkdir data

Note: If you already have MySQL Server installed in your machine then please change the port to 3307 as given below.

-p 3307:3306 \

Run the container


docker run --name mysql -d \
    -p 3306:3306 \
    -e MYSQL_ROOT_PASSWORD=root-pwd \
    -e MYSQL_ROOT_HOST="%" \
    -e MYSQL_DATABASE=mydb \
    -e MYSQL_USER=remote_user \
    -e MYSQL_PASSWORD=remote_user-pwd \
    -v ./data:/var/lib/mysql \
    docker.io/library/mysql:8.4.4

-d : detached (background mode)
-p 3306:3306 : maps MySQL's default port 3306 to the host machine's port 3306
    (use 3307:3306 to map container port 3306 to host port 3307 instead)

-e MYSQL_ROOT_HOST="%" : allows logging in to MySQL from outside the container, e.g. with MySQL Workbench

Login to MySQL Container

docker exec -it mysql bash
mysql -u root -p
(enter root-pwd when prompted)

CREATE DATABASE IF NOT EXISTS iot_telemetry;
USE iot_telemetry;

CREATE TABLE telemetry (
  id BIGINT AUTO_INCREMENT PRIMARY KEY,
  device_id VARCHAR(64),
  temperature_c FLOAT,
  humidity_pct FLOAT,
  event_ts TIMESTAMP DEFAULT CURRENT_TIMESTAMP
);

INSERT INTO telemetry (device_id, temperature_c, humidity_pct)
VALUES
('esp32-001', 24.1, 51.2),
('esp32-002', 23.4, 49.8);

SELECT * FROM telemetry;

List all the Containers

docker container ls -a

Stop MySQL Container

docker stop mysql

Delete the container

docker rm mysql

Build your own Image


mkdir -p container
cd container

Calculator Example

Follow the README.md

Fork & Clone

git clone https://github.com/gchandra10/docker_mycalc_demo.git

Docker Compose

Docker Compose is a tool that lets you define and run multi-container Docker applications using a single YAML file.

Instead of manually running multiple docker run commands, you describe:

  • Services (containers)
  • Networks
  • Volumes
  • Environment variables
  • Dependencies between services

…all inside a docker-compose.yml file.

Sample docker-compose.yaml

version: "3.9"

services:
  app:
    build: .
    ports:
      - "5000:5000"
    depends_on:
      - db

  db:
    image: postgres:15
    environment:
      POSTGRES_PASSWORD: example
docker compose up -d
docker compose down

Use cases

  • Reproducible environments
  • Clean dev setups
  • Ideal for microservices
  • Great for IoT stacks like broker + processor + DB

MQTT Python Docker Compose Example

https://github.com/gchandra10/docker-compose-mqtt-demo

Web App Demo

Fork & Clone

git clone https://github.com/gchandra10/docker_webapp_demo.git

Publish Image to Docker Hub

Login to Docker Hub

  • Create a Repository “my_faker_calc”
  • Under Account Settings
    • Personal Access Token
    • Create a PAT token with Read/Write access for 1 day

Replace gchandra10 with your Docker Hub username.

docker login

enter userid
enter PAT token

Then build the Image with your userid

docker build -t gchandra10/my_faker_calc:1.0 .
docker image ls

Copy the ImageID of gchandra10/my_faker_calc:1.0

Tag the ImageID with necessary version and latest

docker image tag <image_id> gchandra10/my_faker_calc:latest

Push the Images to Docker Hub (version and latest)

docker push gchandra10/my_faker_calc:1.0 
docker push gchandra10/my_faker_calc:latest

Image Security

Trivy

Open Source Scanner.

https://trivy.dev/latest/getting-started/installation/

trivy image python:3.12-slim

# Focus on high-risk vulnerabilities only

trivy image --severity HIGH,CRITICAL python:3.12-slim

# Show only vulnerabilities that already have a fix
trivy image --ignore-unfixed python:3.12-slim

trivy image gchandra10/my_faker_calc

trivy image gchandra10/my_faker_calc --severity CRITICAL,HIGH --format table

trivy image gchandra10/my_faker_calc --severity CRITICAL,HIGH  --output result.txt

Grype

Open Source Scanner

grype python:3.12-slim

Common Mitigation Rules

  • Upgrade the base image
    • Move to a newer Python version if 3.12 has issues
  • Minimize OS packages
    • Check how many layers of packages are installed
  • Pin library versions
    • In requirements.txt, pin library versions so vulnerable releases are easy to detect
  • Run as non-root
    • Create a local user instead of running as root
  • Don't share secrets
    • Don't copy .env or any other secrets into your script or application

#docker #container #dockerhubVer 6.0.23

Last change: 2026-04-16

[Avg. reading time: 5 minutes]

Containers in IoT Architecture

Where Containers Exist

Runtime Layers

  • Microcontrollers (ESP32, STM32)

    • Bare metal / RTOS / MicroPython
    • No Docker
  • Edge Gateway (Raspberry Pi, Industrial PC)

    • Linux-based
    • Docker runs here
    • Hosts broker + processing services
  • Cloud Infrastructure

    • Scalable ingestion, storage, APIs

Containers live above firmware.

What Runs in Containers at the Edge

Typical IoT gateway stack:

Edge Gateway
 ├── MQTT Broker (mosquitto)
 ├── Data Processor (Python service)
 ├── Local Buffer (SQLite / lightweight DB)
 └── Forwarder to Cloud

Each service:

  • Built as an image
  • Run as an isolated container
  • Independently restartable

Why Containers Matter at Edge

  • Service isolation
  • Independent restart
  • Controlled upgrades
  • Version pinning
  • Reduced “works on my machine” problems

IoT systems must be deterministic.

Never use

mosquitto:latest

Always Pin versions

mosquitto:2.0.18

Resource Constraints at Edge

IoT is not cloud.

Resource Limits

Edge gateways have:

  • Limited RAM
  • Limited CPU
  • Limited storage
docker run \
  --memory=256m \
  --cpus=1 \
  --restart=always \
  eclipse-mosquitto:2.0

Containers consume real hardware resources.

Persistence Matters

Edge devices lose power. Without volumes, state is lost.

Use volumes to preserve:

  • Logs
  • Broker sessions
  • Buffered sensor data
docker run \
  -v mosq_data:/mosquitto/data \
  eclipse-mosquitto:2.0

Networking and Security

  • Use internal Docker networks
  • Expose only required ports
  • Avoid running containers as root
  • Use minimal base images
  • Scan for vulnerabilities

A compromised gateway equals a compromised fleet.

Deployment Flow in IoT

  • Build image
  • Push to private registry
  • Gateway pulls image
  • Run container with restart policy
  • Monitor and update safely

Containers are how software moves from developer laptop to physical infrastructure.

Summary

  • Firmware generates signals.
  • Containers turn signals into systems.

Containers are the operational layer of the IoT upper stack.

#docker #iotVer 6.0.23

Last change: 2026-04-16

[Avg. reading time: 24 minutes]

Python Environment

PEP

A PEP, or Python Enhancement Proposal, is a design document that proposes a new feature, process, or convention for Python. The best known, PEP 8, is the official style guide: it provides conventions and recommendations for writing readable, consistent, and maintainable Python code.

PEP Conventions

  • PEP 8 : Style guide for Python code (most famous).
  • PEP 20 : “The Zen of Python” (guiding principles).
  • PEP 484 : Type hints (basis for MyPy).
  • PEP 517/518 : Build system interfaces (basis for pyproject.toml, used by Poetry/UV).
  • PEP 572 : Assignment expressions (the := walrus operator).
  • PEP 440 : Version identifiers and dependency specifiers for Python packages.
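PEP 517/518 and PEP 440 come together in pyproject.toml. A minimal sketch (the project name, backend, and dependency are made up for illustration):

```toml
[build-system]
requires = ["hatchling"]
build-backend = "hatchling.build"

[project]
name = "iot-demo"
version = "0.1.0"
requires-python = ">=3.12"
dependencies = ["paho-mqtt>=2.0"]
```

The dependency line uses a PEP 440 version specifier (>=2.0), and the version field itself follows the PEP 440 numbering scheme.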

Indentation

  • Use 4 spaces per indentation level
  • Continuation lines should align with opening delimiter or be indented by 4 spaces.

Line Length

  • Limit lines to a maximum of 79 characters.
  • For docstrings and comments, limit lines to 72 characters.

Blank Lines

  • Use 2 blank lines before top-level functions and class definitions.
  • Use 1 blank line between methods inside a class.

Imports

  • Imports should be on separate lines.
  • Group imports into three sections: standard library, third-party libraries, and local application imports.
  • Use absolute imports whenever possible.
# Correct
    import os
    import sys

# Wrong
    import sys, os

Naming Conventions

  • Use snake_case for function and variable names.
  • Use CamelCase for class names.
  • Use UPPER_SNAKE_CASE for constants.
  • Avoid single-character variable names except for counters or indices.
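The naming rules above in one small, hypothetical sketch:

```python
MAX_RETRIES = 3              # constant: UPPER_SNAKE_CASE

class SensorReader:          # class: CamelCase
    def read_value(self):    # method: snake_case
        current_value = 42   # variable: snake_case
        return current_value

print(SensorReader().read_value())
```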

Whitespace

  • Don’t pad inside parentheses/brackets/braces.
  • Use one space around operators and after commas, but not before commas.
  • No extra spaces when aligning assignments.

Comments

  • Write comments that are clear, concise, and helpful.
  • Use complete sentences and capitalize the first word.
  • Use # for inline comments, but avoid them where the code is self-explanatory.

Docstrings

  • Use triple quotes (""") for multiline docstrings.
  • Describe the purpose, arguments, and return values of functions and methods.

Code Layout

  • Keep function definitions and calls readable.
  • Avoid writing too many nested blocks.

Consistency

  • Consistency within a project outweighs strict adherence.
  • If you must diverge, be internally consistent.

PEP 20 - The Zen of Python

https://peps.python.org/pep-0020/

Simple is better than complex

Complex

result = (lambda x: (x*x + 2*x + 1))(5)

Simple

x = 5
result = (x + 1) ** 2

Readability counts

No Good

a=10;b=20;c=a+b;print(c)

Good

first_value = 10
second_value = 20
sum_of_values = first_value + second_value
print(sum_of_values)

Errors should never pass silently

No Good

try:
    x = int("abc")
except:
    pass

Good

try:
    x = int("abc")
except ValueError as e:
    print("Conversion failed:", e)

PEP 572

Walrus Operator :=

Assignment within Expression Operator

Old Way

inputs = []
current = input("Write something ('quit' to stop): ")
while current != "quit":
    inputs.append(current)
    current = input("Write something ('quit' to stop): ")

Using Walrus

inputs = []
while (current := input("Write something ('quit' to stop): ")) != "quit":
    inputs.append(current)

Another Example

Old Way

import re

m = re.search(r"\d+", text)
if m:
    print(m.group())

New Way

import re

if (m := re.search(r"\d+", text)):
    print(m.group())

Linting

Linting is the process of automatically checking your Python code for:

  • Syntax errors

  • Stylistic issues (PEP 8 violations)

  • Potential bugs or bad practices

Benefits:

  • Keeps your code consistent and readable.

  • Helps catch errors early before runtime.

  • Encourages team-wide coding standards.


# Incorrect
import sys, os

# Correct
import os
import sys
# Bad spacing
x= 5+3

# Good spacing
x = 5 + 3

Ruff : Linter and Code Formatter

Ruff is a fast, modern tool written in Rust that helps keep your Python code:

  • Consistent (follows PEP 8)
  • Clean (removes unused imports, fixes spacing, etc.)
  • Correct (catches potential errors)

Install

uv add ruff

Verify

ruff --version 
ruff --help

example.py

import os, sys 

def greet(name): 
  print(f"Hello, {name}")

def message(name): print(f"Hi, {name}")

def calc_sum(a, b): return a+b

greet('World')
greet('Ruff')
message('Ruff')

uv run ruff check example.py
uv run ruff check example.py --fix
uv run ruff format example.py --check
uv run ruff check example.py

PEP 484 - MyPy : Type Checking Tool

Python is a dynamically typed programming language, meaning both of these assignments are valid:

x = 26
x = "hello"

mypy was introduced to bring static type checking to Python.

mypy is a static type checker for Python. It checks your code against the type hints you provide, ensuring that the types are consistent throughout the codebase.

It primarily focuses on type correctness—verifying that variables, function arguments, return types, and expressions match the expected types.

What mypy checks:

  • Variable reassignment types
  • Function arguments
  • Return types
  • Expressions and operations
  • Control flow narrowing

What mypy does not do:

  • Runtime validation
  • Performance checks
  • Logical correctness

Install

uv add mypy

or

pip install mypy

Example 1 : sample.py

x = 1
x = 1.0
x = True
x = "test"
x = b"test"

print(x)

uv run mypy sample.py

or

mypy sample.py

Example 2: Type Safety

def add(a: int, b: int) -> int:
    return a + b

print(add(100, 123))
print(add("hello", "world"))

Example 3: Return Type Violation

def divide(a: int, b: int) -> int:
    if b == 0:
        return "invalid"
    return a // b

Example 4: Optional Types

from typing import Optional

def get_username(user_id: int) -> Optional[str]:
    if user_id == 0:
        return None
    return "admin"

name = get_username(0)
print(name.upper())

What is wrong in this? name can also be None, and None has no upper() method, so mypy flags the call.
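One way to satisfy mypy here is to narrow the Optional before calling .upper(). A sketch of the fix:

```python
from typing import Optional

def get_username(user_id: int) -> Optional[str]:
    if user_id == 0:
        return None
    return "admin"

name = get_username(0)

# Narrowing: inside this branch mypy knows name is str, not None.
if name is not None:
    print(name.upper())
else:
    print("no user found")
```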

#mypy #pep #ruff #lintVer 6.0.23

Last change: 2026-04-16

[Avg. reading time: 15 minutes]

Code Quality & Safety

Type Hinting/Annotation

Type Hint

A type hint is a notation that suggests what type a variable, function parameter, or return value should be. It provides hints to developers and tools about the expected type but does not enforce them at runtime. Type hints can help catch type-related errors earlier through static analysis tools like mypy, and they enhance code readability and IDE support.

Type Annotation

Type annotation refers to the actual syntax used to provide these hints. It involves adding type information to variables, function parameters, and return types. Type annotations do not change how the code executes; they are purely for informational and tooling purposes.

Benefits

  • Improved Readability: Code with type annotations is easier to understand.

  • Tooling Support: IDEs can provide better autocompletion and error checking.

  • Static Analysis: Tools like mypy can check for type consistency, catching errors before runtime.

Basic Type Hints

age: int = 25
name: str = "Rachel"
is_active: bool = True
price: float = 19.99

Here, age is annotated as an int, and name is annotated as a str.

Collections

from typing import List, Set, Dict, Tuple

# List type hints
numbers: List[int] = [1, 2, 3]
names: List[str] = ["Alice", "Bob"]

# Set type hints
unique_ids: Set[int] = {1, 2, 3}

# Dictionary type hints
user_scores: Dict[str, int] = {"Alice": 95, "Bob": 87}

# Tuple type hints
point: Tuple[float, float] = (2.5, 3.0)

Function Annotations

def calculate_discount(price: float, discount_percent: float) -> float:
    """Calculate the final price after applying a discount."""
    return price * (1 - discount_percent / 100)

def get_user_names(user_ids: List[int]) -> Dict[int, str]:
    """Return a mapping of user IDs to their names."""
    return {uid: f"User {uid}" for uid in user_ids}

Advanced Type Hints

from typing import Optional, Union

def process_data(data: Optional[str] = None) -> str:
    """Process data with an optional input."""
    if data is None:
        return "No data provided"
    return data.upper()

def format_value(value: Union[int, float, str]) -> str:
    """Format a value that could be integer, float, or string."""
    return str(value)

Best Practices

  • Consistency: Apply type hints consistently across your codebase.
  • Documentation: Type hints complement but don’t replace docstrings.
  • Type Checking: Use static type checkers like mypy.
# Run mypy on your code
mypy your_module.py

Secret Management

Proper secret management is crucial for application security. Secrets include API keys, database credentials, tokens, and other sensitive information that should never be hardcoded in your source code or committed to version control.

Either export them in your shell or store them in a .env file.

Shell

export SECRET_KEY='your_secret_value'

Windows Users

Go to Environment Variables via the GUI and create one.

pip install python-dotenv

Create an empty file named .env

.env

SECRET_KEY=your_secret_key
DATABASE_URL=your_database_url

main.py


from dotenv import load_dotenv
import os

# Load environment variables from .env file
load_dotenv()

# Access the environment variables
secret_key = os.getenv("SECRET_KEY")
database_url = os.getenv("DATABASE_URL")

print(f"Secret Key: {secret_key}")
print(f"Database URL: {database_url}")

Best Practices

Never commit secrets to version control

  • Use .gitignore to exclude .env files
  • Regularly audit git history for accidental commits

Sample .gitignore

# .gitignore
.env
.env.*
!.env.example
*.pem
*.key
secrets/

Create a .env.example file with dummy values:

# .env.example
SECRET_KEY=your_secret_key_here
DATABASE_URL=postgresql://user:password@localhost:5432/dbname
API_KEY=your_api_key_here
DEBUG=False

Access Control

  • Restrict environment variable access to necessary processes
  • Use separate environment files for different environments (dev/staging/prod)

Secret Rotation

  • Implement procedures for regular secret rotation
  • Use separate secrets for different environments

Production Environments

Consider using cloud-native secret management services:

  • AWS Secrets Manager
  • Google Cloud Secret Manager
  • Azure Key Vault
  • HashiCorp Vault

PDOC

Python Documentation

pdoc is an automatic documentation generator for Python libraries. It builds on top of Python's built-in __doc__ attributes and type hints to create comprehensive API documentation. pdoc automatically extracts documentation from docstrings and generates HTML or Markdown output.

Docstring (Triple-quoted string)

def add(a: float, b: float) -> float:
    """
    Add two numbers.

    Args:
        a (float): The first number to add.
        b (float): The second number to add.

    Returns:
        float: The sum of the two numbers.

    Example:
        >>> add(2.5, 3.5)
        6.0
    """
    return a + b


def divide(a: float, b: float) -> float:
    """
    Divide one number by another.

    Args:
        a (float): The dividend.
        b (float): The divisor, must not be zero.

    Returns:
        float: The quotient of the division.

    Raises:
        ValueError: If the divisor (`b`) is zero.

    Example:
        >>> divide(10, 2)
        5.0
    """
    if b == 0:
        raise ValueError("The divisor (b) must not be zero.")
    return a / b

uv add pdoc
uv run pdoc filename.py -o ./docs
  • pdoc.config.json allows customization
{
    "docformat": "google",
    "include": ["your_module"],
    "exclude": ["tests", "docs"],
    "template_dir": "custom_templates",
    "output_dir": "api_docs"
}

#codequality #safety #pdocVer 6.0.23

Last change: 2026-04-16

[Avg. reading time: 8 minutes]

Error Handling

Python uses try/except blocks for error handling.

The basic structure is:

try:
    # Code that may raise an exception
except ExceptionType:
    # Code to handle the exception
finally:
    # Code executes all the time

Uses

Improved User Experience: Instead of the program crashing, you can provide a user-friendly error message.

Debugging: Capturing exceptions can help you log errors and understand what went wrong.

Program Continuity: Allows the program to continue running or perform cleanup operations before terminating.

Guaranteed Cleanup: Ensures that certain operations, like closing files or releasing resources, are always performed.

Some key points

  • You can catch specific exception types or use a bare except to catch any exception.

  • Multiple except blocks can be used to handle different exceptions.

  • An else clause can be added to run if no exception occurs.

  • A finally clause will always execute, whether an exception occurred or not.


Without Try/Except

x = 10 / 0 

Basic Try/Except

try:
    x = 10 / 0 
except ZeroDivisionError:
    print("Error: Division by zero!")

Generic Exception

try:
    file = open("nonexistent_file.txt", "r")
except:
    print("An error occurred!")

Find the exact error

try:
    file = open("nonexistent_file.txt", "r")
except Exception as e:
    print(str(e))

Raise - Else and Finally

try:
    x = -10
    if x <= 0:
        raise ValueError("Number must be positive")
except ValueError as ve:
    print(f"Error: {ve}")
else:
    print(f"You entered: {x}")
finally:
    print("This will always execute")

try:
    x = 10
    if x <= 0:
        raise ValueError("Number must be positive")
except ValueError as ve:
    print(f"Error: {ve}")
else:
    print(f"You entered: {x}")
finally:
    print("This will always execute")

Nested Functions


def divide(a, b):
    try:
        result = a / b
        return result
    except ZeroDivisionError:
        print("Error in divide(): Cannot divide by zero!")
        raise  # Re-raise the exception

def calculate_and_print(x, y):
    try:
        result = divide(x, y)
        print(f"The result of {x} divided by {y} is: {result}")
    except ZeroDivisionError as e:
        print(str(e))
    except TypeError as e:
        print(str(e))

# Test the nested error handling
print("Example 1: Valid division")
calculate_and_print(10, 2)

print("\nExample 2: Division by zero")
calculate_and_print(10, 0)

print("\nExample 3: Invalid type")
calculate_and_print("10", 2)

#error #tryVer 6.0.23

Last change: 2026-04-16

[Avg. reading time: 22 minutes]

Faker

Faker: A Python Library for Generating Fake Data

Faker is a powerful Python library that generates realistic fake data for various purposes. It’s particularly useful for:

  • Testing: Populating databases, testing APIs, and stress-testing applications with realistic-looking data.

  • Development: Creating sample data for prototyping and demonstrations.

  • Data Science: Generating synthetic datasets for training and testing machine learning models.

  • Privacy: Anonymizing real data for sharing or testing while preserving data structures and distributions.

Key Features:

  • Wide Range of Data Types: Generates names, addresses, emails, phone numbers, credit card details, dates, companies, jobs, texts, and much more.

  • Customization: Allows you to customize the data generated using various parameters and providers.

  • Locale Support: Supports multiple locales, allowing you to generate data in different languages and regions.

  • Easy to Use: Simple and intuitive API with clear documentation.

from faker import Faker

fake = Faker()

print(fake.name())  # Output: A randomly generated name
print(fake.email())  # Output: A randomly generated email address
print(fake.address())  # Output: A randomly generated address
print(fake.date_of_birth())  # Output: A randomly generated date of birth

Using Faker in Data World

Data Exploration and Analysis: Generate synthetic datasets with controlled characteristics to explore data analysis techniques and algorithms.

Data Visualization: Create sample data to visualize different data distributions and patterns.

Data Cleaning and Transformation: Test data cleaning and transformation pipelines with realistic-looking dirty data.

Data Modeling: Build and test data models using synthetic data before applying them to real-world data.

Using Faker in IoT World

IoT Device Simulation: Simulate sensor data from various IoT devices, such as temperature, humidity, and location data.

IoT System Testing: Test IoT systems and applications with realistic-looking sensor data streams.

IoT Data Analysis: Generate synthetic IoT data for training and testing machine learning models for tasks like anomaly detection and predictive maintenance.

IoT Data Visualization: Create visualizations of simulated IoT data to gain insights into system behavior.

Luhn Algorithm (pronounced as Loon)

Used to detect accidental errors in data entry or transmission, particularly single-digit errors and transposition of adjacent digits.

The Luhn algorithm, also known as the modulus 10 or mod 10 algorithm, is a simple checksum formula used to validate a variety of identification numbers, such as credit card numbers, IMEI numbers and so on.

  • Step 1: Starting from the rightmost digit, double the value of every second digit.
  • Step 2: If doubling produces a two-digit number, add its digits to get a single digit.
  • Step 3: Sum all the resulting digits.
  • Step 4: If the sum is divisible by 10, the number is valid.

Example: 4532015112830366
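The four steps above can be sketched directly in Python; `luhn_is_valid` is a hypothetical helper name:

```python
def luhn_is_valid(number: str) -> bool:
    digits = [int(d) for d in number]
    total = 0
    # Walk from the rightmost digit; double every second digit
    for i, d in enumerate(reversed(digits)):
        if i % 2 == 1:
            d *= 2
            if d > 9:
                d -= 9  # same result as adding the two digits
        total += d
    return total % 10 == 0

print(luhn_is_valid("4532015112830366"))  # True
```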

Key Features

  • Can detect 100% of single-digit errors
  • Can detect around 98% of transposition errors
  • Simple mathematical operations (addition and multiplication)
  • Low computational overhead

Limitations

  • Not cryptographically secure
  • Cannot detect all possible errors
  • Some error types (like multiple transpositions) might go undetected

Common Use Cases

  • Device Authentication: Validating device identifiers
  • Asset Tracking: Verifying equipment serial numbers
  • Smart Meter Reading Validation: Ensuring meter readings are transmitted correctly
  • Sensor Data Integrity: Basic error detection in sensor data transmission

git clone https://github.com/gchandra10/python_faker_demo.git

Damm Algorithm

The Damm Algorithm is a check digit algorithm created by H. Michael Damm in 2004. It uses a checksum technique intended to identify mistakes in data entry or transmission, especially when it comes to number sequences.

Perfect Error Detection:

  • Detects all single-digit errors
  • Detects all adjacent transposition errors
  • No false positives or false negatives

To check whether 234 is a valid number:

Start: interim = 0

First digit (2):
- Row = 0 (current interim)
- Column = 2 (current digit)
- table[0][2] = 1
- New interim = 1

Second digit (3):
- Row = 1 (current interim)
- Column = 3 (current digit)
- table[1][3] = 2
- New interim = 2

Third digit (4):
- Row = 2 (current interim)
- Column = 4 (current digit)
- table[2][4] = 8
- Final interim = 8 (this becomes check digit)

Since the final interim value is not zero, 234 is not a valid number per the Damm algorithm.

Damm operation table (row = current interim digit, column = next input digit):

    [0, 3, 1, 7, 5, 9, 8, 6, 4, 2],
    [7, 0, 9, 2, 1, 5, 4, 8, 6, 3],
    [4, 2, 0, 6, 8, 7, 1, 3, 5, 9],
    [1, 7, 5, 0, 9, 8, 3, 4, 2, 6],
    [6, 1, 2, 3, 0, 4, 5, 9, 7, 8],
    [3, 6, 7, 4, 2, 0, 9, 5, 8, 1],
    [5, 8, 6, 9, 7, 2, 0, 1, 3, 4],
    [8, 9, 4, 5, 3, 6, 2, 0, 1, 7],
    [9, 4, 3, 8, 6, 1, 7, 2, 0, 5],
    [2, 5, 8, 1, 4, 3, 6, 7, 9, 0]

Let's try 57240 (a valid number) and suppose someone entered 57340.
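The table-lookup procedure above can be coded directly; `damm_is_valid` is a hypothetical helper name:

```python
DAMM_TABLE = [
    [0, 3, 1, 7, 5, 9, 8, 6, 4, 2],
    [7, 0, 9, 2, 1, 5, 4, 8, 6, 3],
    [4, 2, 0, 6, 8, 7, 1, 3, 5, 9],
    [1, 7, 5, 0, 9, 8, 3, 4, 2, 6],
    [6, 1, 2, 3, 0, 4, 5, 9, 7, 8],
    [3, 6, 7, 4, 2, 0, 9, 5, 8, 1],
    [5, 8, 6, 9, 7, 2, 0, 1, 3, 4],
    [8, 9, 4, 5, 3, 6, 2, 0, 1, 7],
    [9, 4, 3, 8, 6, 1, 7, 2, 0, 5],
    [2, 5, 8, 1, 4, 3, 6, 7, 9, 0],
]

def damm_is_valid(number: str) -> bool:
    interim = 0
    for ch in number:
        # row = current interim digit, column = next input digit
        interim = DAMM_TABLE[interim][int(ch)]
    return interim == 0

print(damm_is_valid("57240"))  # True
print(damm_is_valid("57340"))  # False - adjacent digit error detected
```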

Luhn is like a spell checker; Damm is like a grammar checker.

IoT Use Cases with Algorithms

Use Case | Algorithm Used | Description
Smart Metering (Electricity, Water, Gas) | Luhn | Consumer account numbers and meter IDs can use the Luhn algorithm to validate input during billing and monitoring.
IoT-based Credit Card Transactions | Luhn | When smart vending machines or POS terminals process card payments, Luhn ensures credit card numbers are valid.
IMEI Validation in Smart Devices | Luhn | IoT-enabled mobile and tracking devices use Luhn to validate IMEI numbers for device authentication.
Smart Parking Ticketing Systems | Luhn | Parking meters with IoT sensors can validate vehicle plate numbers or digital parking tickets using the Luhn algorithm.
Industrial IoT (IIoT) Sensor IDs | Damm | Factory sensors and devices generate unique IDs with the Damm algorithm to prevent ID entry errors and misconfigurations.
IoT-based Asset Tracking | Damm | Logistics and supply chain IoT devices use Damm to ensure tracking codes are error-free and resistant to transposition mistakes.
Connected Health Devices (Wearables, ECG Monitors) | Damm | Unique patient monitoring device IDs use Damm for error-free identification in hospital IoT systems.
IoT-enabled Vehicle Identification | Damm | Vehicle chassis numbers and VINs in IoT-based fleet management use Damm for better error detection.

Feature | Luhn Algorithm | Damm Algorithm
Type | Modulus-10 checksum | Noncommutative quasigroup checksum
Use Case | Credit card numbers, IMEI, etc. | Error detection in numeric sequences
Mathematical Basis | Weighted sum with modulus 10 | Quasigroup operations
Error Detection | Detects single-digit errors and most transpositions | Detects all single-digit and adjacent transposition errors
Processing Complexity | Simple addition and modulus operation | More complex due to quasigroup operations
Strengths | Simple and widely adopted | Stronger error detection capabilities
Weaknesses | Cannot detect all double transpositions | Less widely used and understood
Performance | Efficient for real-time validation | Slightly more computationally intensive

For firmware updates, use hashing algorithms such as SHA-256 or SHA-512 to verify integrity.

#faker #damm #luhn

Last change: 2026-04-16

[Avg. reading time: 7 minutes]

Logging

Python’s logging module provides a flexible framework for tracking events in your applications. It’s used to log messages to various outputs (console, files, etc.) with different severity levels like DEBUG, INFO, WARNING, ERROR, and CRITICAL.

Use Cases of Logging

  • Debugging: Identify issues during development.
  • Monitoring: Track events in production to monitor behavior.
  • Audit Trails: Capture what has been executed for security or compliance.
  • Error Tracking: Store errors for post-mortem analysis.
  • Rotating Log Files: Prevent logs from growing indefinitely using size- or time-based rotation.

Python Logging Levels

Level | Usage | Numeric Value | Description
DEBUG | Detailed information for diagnosing problems. | 10 | Useful during development and debugging stages.
INFO | General information about program execution. | 20 | Highlights normal, expected behavior (e.g., program start, process completion).
WARNING | Indicates something unexpected but not critical. | 30 | Warns of potential problems or events to monitor (e.g., deprecated functions, nearing limits).
ERROR | An error occurred that prevented some part of the program from working. | 40 | Represents recoverable errors that might still allow the program to continue running.
CRITICAL | Severe errors indicating a major failure. | 50 | Marks critical issues requiring immediate attention (e.g., system crash, data corruption).

INFO

import logging

logging.basicConfig(level=logging.INFO)  # Set the logging level to INFO

logging.debug("This is a debug message.")
logging.info("This is an info message.")
logging.warning("This is a warning message.")
logging.error("This is an error message.")
logging.critical("This is a critical message.")

Error

import logging

logging.basicConfig(level=logging.ERROR)  # Set the logging level to ERROR

logging.debug("This is a debug message.")
logging.info("This is an info message.")
logging.warning("This is a warning message.")
logging.error("This is an error message.")
logging.critical("This is a critical message.")

import logging

logging.basicConfig(
    level=logging.DEBUG,
    format="%(asctime)s - %(name)s - %(levelname)s - %(message)s"
)

logging.debug("This is a debug message.")
logging.info("This is an info message.")
logging.warning("This is a warning message.")

More Examples

git clone https://github.com/gchandra10/python_logging_examples.git

#logging #info #debug

Last change: 2026-04-16

[Avg. reading time: 11 minutes]

Time Series Database (TSDB)

A Time Series Database (TSDB) is a type of database designed specifically to store and query time-stamped data.

In many modern systems, data is continuously generated with a timestamp attached to every event. Examples include sensor readings from IoT devices, system metrics from servers, financial price movements, or application performance metrics. Traditional databases can store this data, but they are not optimized for the access patterns that time-based data requires.

A TSDB is built to efficiently handle large volumes of sequential, time-ordered data and make it easy to analyze trends, patterns, and changes over time.


Key Characteristics

Time-centric design

Data is stored with time as the primary dimension.

Queries typically ask questions like:

  • What happened in the last 5 minutes?
  • What is the average CPU usage per minute today?
  • How did temperature change over the last 24 hours?

Because of this, TSDBs are optimized for time-range queries and chronological data access.


High ingestion rates

Many time-series systems generate data very frequently.

Examples:

  • IoT sensors publishing readings every few seconds
  • Servers emitting metrics every few milliseconds
  • Stock markets generating price ticks continuously

TSDBs are optimized to ingest large volumes of data points efficiently without slowing down.


Efficient storage

Time-series data often contains repeating patterns or slowly changing values.

To optimize storage, TSDBs commonly use:

  • Compression techniques
  • Column-oriented storage
  • Time-based partitioning

These techniques reduce storage costs while maintaining fast query performance.


Optimized time-series queries

TSDBs support operations commonly used when analyzing time-based data.

Filtering

Selecting data within a time range or based on tags/labels.

Aggregation

Calculating metrics such as average, sum, min, or max over time intervals.

Downsampling

Reducing high-resolution data into summarized intervals.
For example converting per-second data into hourly averages.

These capabilities allow efficient analysis of both recent high-resolution data and long-term trends.
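Downsampling can be sketched with pandas on synthetic data (the column name and values are illustrative):

```python
import pandas as pd

# Synthetic per-second temperature readings covering 10 minutes
idx = pd.date_range("2024-01-01", periods=600, freq="s")
raw = pd.DataFrame({"temperature": [20.0 + i * 0.001 for i in range(600)]}, index=idx)

# Downsample: one averaged value per minute instead of 60 raw points
per_minute = raw.resample("1min")["temperature"].mean()
print(per_minute)
```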


Common Use Cases

IoT Systems

Devices such as sensors, wearables, smart meters, and industrial machines continuously generate timestamped measurements.

Examples:

  • Temperature readings
  • Pressure measurements
  • Energy consumption data

System Monitoring

Monitoring platforms collect metrics from infrastructure and applications.

Examples:

  • CPU usage
  • Memory utilization
  • Network throughput
  • Request latency

Financial Markets

Market data is inherently time-based.

Examples:

  • Stock prices
  • Trading volume
  • Tick-level market events

Scientific and Research Data

Many experiments produce sequential measurements over time.

Examples:

  • Climate data
  • Astronomy observations
  • Simulation outputs

Popular Time Series Databases

InfluxDB

A widely used open-source TSDB designed specifically for high-throughput time-series workloads.


TimescaleDB

A PostgreSQL extension that adds efficient time-series capabilities while retaining the SQL ecosystem.


Prometheus

An open-source monitoring system that includes its own time-series database for collecting and querying metrics.


Apache Cassandra

Although not a dedicated TSDB, Cassandra is often used for time-series workloads due to its distributed architecture and scalability.


Summary

Time-series data is everywhere in modern systems. IoT platforms, monitoring systems, financial markets, and scientific experiments all generate large volumes of timestamped data.

A Time Series Database provides a specialized architecture to:

  • efficiently ingest high-frequency data
  • store time-ordered events compactly
  • query trends and patterns quickly

Because of these optimizations, TSDBs have become an important component in observability platforms, IoT pipelines, and real-time analytics systems.

#tsdb #influxdb #prometheus

Last change: 2026-04-16

[Avg. reading time: 17 minutes]

InfluxDB

InfluxDB is a high-performance Time Series Database (TSDB) designed to store and analyze large volumes of timestamped data. It is commonly used in systems where data arrives continuously, such as IoT devices, monitoring platforms, telemetry pipelines, and financial market feeds.

InfluxDB is optimized for workloads where the primary queries involve time ranges, trends, aggregations, and real-time metrics.

With the release of InfluxDB 3, the platform has evolved significantly. Earlier versions relied on custom storage engines and specialized query languages, but InfluxDB 3 adopts a modern analytics architecture built on open standards.

The latest version uses:

  • Apache Arrow for in-memory analytics
  • Parquet for columnar storage
  • DataFusion as the SQL query engine
  • Object storage as the persistent storage layer

This architecture improves performance, scalability, and interoperability with modern data platforms.


Key Features

High Ingestion Performance

InfluxDB is designed to ingest millions of time-series data points per second, making it suitable for systems that generate high-frequency telemetry data.

Examples include:

  • IoT sensor streams
  • application monitoring metrics
  • infrastructure telemetry
  • financial tick data

Time-Series Optimized Storage

Data is stored using columnar formats (Parquet) which allow efficient compression and fast scanning of time-based data.

This significantly improves performance for queries such as:

  • time-range filtering
  • aggregations over time intervals
  • trend analysis

SQL Querying

InfluxDB 3 introduces standard SQL as the primary query language.

This change allows developers and analysts to query time-series data using familiar SQL tools rather than learning a specialized query language.

Example:

SELECT
  date_bin(INTERVAL '5 minutes', time) AS bucket,
  AVG(temperature)
FROM sensor_data
WHERE time > now() - INTERVAL '1 hour'
GROUP BY bucket
ORDER BY bucket;

Scalability with Object Storage

InfluxDB 3 separates compute from storage and stores data in object storage systems such as cloud storage.

Benefits include:

  • virtually unlimited storage
  • lower storage costs
  • improved scalability for large datasets

Built-in Visualization and Management

InfluxDB provides tools for:

  • data exploration
  • dashboards
  • monitoring metrics
  • administrative tasks

These tools help users quickly analyze real-time data streams.

Data Model

InfluxDB uses a simple time-series data model consisting of four main components.

Measurement

A measurement represents a logical category of data, similar to a table in relational databases.

Examples:

  • temperature
  • cpu_usage
  • network_latency

Tags

Tags are indexed key-value pairs used to describe metadata and enable fast filtering.

Examples:

  • location = kitchen
  • host = server01
  • device = sensor12

Because tags are indexed, queries that filter by tags perform efficiently.

Fields

Fields contain the actual measured values.

Examples:

  • temperature = 22.5
  • cpu_usage = 65
  • humidity = 40

Fields are not indexed to allow faster write performance.

Timestamp

Every data point includes a timestamp, which records when the event occurred.

Time is the primary dimension for storing and querying data in InfluxDB.

Common Use Cases

  • IoT Sensor Data
  • Infrastructure Monitoring (CPU, Memory, Network, Disk I/O)
  • Observability & DevOps
  • Financial Time-Series Data (Stocks, Trading, Market Indicators)

Feature | InfluxDB 1.x | InfluxDB 2.x | InfluxDB 3.x
Query Language | InfluxQL | Flux | SQL
Storage Engine | Custom TSDB engine | Custom TSDB engine | Arrow + Parquet
Data Container | Database + Retention Policy | Bucket | Database
Storage Backend | Local storage | Local storage | Object storage
Query Engine | InfluxQL engine | Flux engine | DataFusion
Architecture | Single node / cluster | Improved platform with UI | Modern analytics architecture
Ecosystem Integration | Limited | Moderate | Strong integration with modern data stack

InfluxDB 3 uses DataFusion

SQL
  │
DataFusion Query Engine
  │
Apache Arrow
  │
Parquet files
  │
Object Storage

InfluxDB 3 UI

In InfluxDB 3, the user interface is separated from the core database engine. Unlike earlier versions where the UI was bundled with the database, the new architecture treats the UI as a separate service.

This change aligns with the overall design philosophy of InfluxDB 3, where storage, compute, and management tools are decoupled.

Why the UI is separated

Independent scaling

The database engine focuses purely on data ingestion, storage, and query execution, while the UI handles visualization and user interaction.
Separating them allows each component to scale independently.


Cleaner architecture

By separating the UI from the database engine, the system becomes more modular. The core database can remain lightweight and optimized for high-performance time-series workloads, while the UI evolves independently.


Flexible deployment

Users are not required to run the UI if they do not need it. Many production deployments interact with InfluxDB through:

  • APIs
  • SQL clients
  • monitoring tools
  • custom applications

The UI becomes an optional management layer rather than a required component.


Faster development

Because the UI is no longer tightly coupled with the database engine, improvements to dashboards, visualization, and management features can be released independently without impacting the database core.


What the UI provides

The InfluxDB UI helps users:

  • explore and query time-series data
  • build dashboards and visualizations
  • monitor metrics
  • manage databases and ingestion

It acts as a convenient interface for interacting with InfluxDB, while the core database focuses on performance and scalability.


InfluxDB with IoT

Time Format

Epoch time represents the number of time units elapsed since:

1970-01-01 00:00:00 UTC

Epoch Time
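Converting between a UTC datetime and epoch time can be done with the standard library (the sample datetime is arbitrary):

```python
import datetime

# A UTC datetime and its epoch representation
dt = datetime.datetime(2025, 3, 5, 12, 0, 0, tzinfo=datetime.timezone.utc)
epoch_s = int(dt.timestamp())          # seconds since 1970-01-01 00:00:00 UTC
epoch_ns = epoch_s * 1_000_000_000     # InfluxDB line protocol defaults to nanoseconds

print(epoch_s)  # 1741176000
print(datetime.datetime.fromtimestamp(epoch_s, tz=datetime.timezone.utc))
```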

#influxdb #tsdb #telegraf #sql

Last change: 2026-04-16

[Avg. reading time: 25 minutes]

InfluxDB Demo

Software

Cloud

Influxdata Cloud

Via Docker

mkdir influxdb
cd influxdb
touch docker-compose.yml

docker-compose.yml

name: influxdb3
services:
  influxdb3-core:
    container_name: influxdb3-core
    image: influxdb:3-core
    ports:
      - 8181:8181
    command:
      - influxdb3
      - serve
      - --node-id=node0
      - --object-store=file
      - --data-dir=/var/lib/influxdb3/data
      - --plugin-dir=/var/lib/influxdb3/plugins
    volumes:
      - ./.influxdb3/core/data:/var/lib/influxdb3/data
      - ./.influxdb3/core/plugins:/var/lib/influxdb3/plugins
    restart: unless-stopped

  influxdb3-explorer:
    image: influxdata/influxdb3-ui:latest
    container_name: influxdb3-explorer
    ports:
      - "8888:80"
    volumes:
      - ./.influxdb3-ui/db:/db:rw
      - ./.influxdb3-ui/config:/app-root/config:ro
    environment:
      SESSION_SECRET_KEY: "${SESSION_SECRET_KEY:-$(openssl rand -hex 32)}"
    restart: unless-stopped
    command: ["--mode=admin"]

Launch the containers

docker compose up -d

Create Token

docker exec influxdb3-core influxdb3 create token --admin

create a file at

./.influxdb3-ui/config/config.json

Add the following contents

{
  "DEFAULT_INFLUX_SERVER": "http://influxdb3-core:8181",
  "DEFAULT_API_TOKEN": "",
  "DEFAULT_SERVER_NAME": "InfluxDB3 - Docker"
}

Restart Docker

docker compose restart

Load Data

Login via UI

http://localhost:8888

  • Create a Database, if prompted set retention period.
  • Load data via Line Protocol, CSV, JSON, or programmatically

Line Protocol

Line protocol is InfluxDB’s text-based format for writing time series data into the database. It’s designed to be both human-readable and efficient for machine parsing.

Format of Sample Data

In InfluxDB, a “measurement” is a fundamental concept that represents the data structure that stores time series data. You can think of a measurement as similar to a table in a traditional relational database.

Note:

  • Use singular form for measurement names (e.g., “temperature” not “temperatures”)
  • Be consistent with tag and field names
  • Consider using a naming convention (e.g., snake_case or camelCase)


Example 1

temperature,location=kitchen value=22.5

  • temperature : measurement
  • location=kitchen : tag
  • value=22.5 : field
  • If the timestamp is omitted, InfluxDB assigns the current server timestamp

Example 2

temperature,location=kitchen,sensor=thermometer value=22.5 1614556800000000000

Example 3

Multiple Tags and Multiple Fields

temperature,location=kitchen,sensor=thermometer temp_c=22.5,humidity_pct=45.2
  • location=kitchen,sensor=thermometer : Tags
  • temp_c=22.5,humidity_pct=45.2 : Field

Example 4

temperature,location=kitchen,sensor=thermometer reading=22.5,battery_level=98,type="smart",active=true
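Points can also be assembled programmatically. A minimal sketch (`to_line_protocol` is a hypothetical helper; real line protocol additionally marks integer fields with an `i` suffix, which this sketch omits):

```python
def to_line_protocol(measurement, tags, fields, timestamp_ns=None):
    """Build a simplified line-protocol string: measurement,tags fields [timestamp]."""
    tag_str = ",".join(f"{k}={v}" for k, v in tags.items())

    def fmt(v):
        if isinstance(v, bool):          # check bool before numbers (bool is an int subclass)
            return "true" if v else "false"
        if isinstance(v, str):
            return f'"{v}"'              # string field values are quoted
        return str(v)

    field_str = ",".join(f"{k}={fmt(v)}" for k, v in fields.items())
    line = f"{measurement},{tag_str} {field_str}"
    if timestamp_ns is not None:
        line += f" {timestamp_ns}"
    return line

print(to_line_protocol("temperature",
                       {"location": "kitchen", "sensor": "thermometer"},
                       {"temp_c": 22.5, "humidity_pct": 45.2}))
# temperature,location=kitchen,sensor=thermometer temp_c=22.5,humidity_pct=45.2
```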

Copy each section into the Line Protocol window separately. Bulk copying assigns the same timestamp to every point, so later points with the same measurement and tags simply overwrite earlier ones.

temperature,location=kitchen value=22.5
temperature,location=living_room value=21.8
temperature,location=bedroom value=20.3

temperature,location=kitchen value=23.1
temperature,location=living_room value=22.0
temperature,location=bedroom value=20.7

temperature,location=kitchen value=22.8
temperature,location=living_room value=21.5
temperature,location=bedroom value=20.1

temperature,location=kitchen value=23.5
temperature,location=living_room value=21.9
temperature,location=bedroom value=19.8

temperature,location=kitchen value=24.2
temperature,location=living_room value=22.3
temperature,location=bedroom value=20.5

temperature,location=kitchen value=23.7
temperature,location=living_room value=22.8
temperature,location=bedroom value=21.0

temperature,location=kitchen value=22.9
temperature,location=living_room value=22.5
temperature,location=bedroom value=20.8

humidity,location=kitchen value=45.2
humidity,location=living_room value=42.8
humidity,location=bedroom value=48.3

humidity,location=kitchen value=46.1
humidity,location=living_room value=43.5
humidity,location=bedroom value=49.1

humidity,location=kitchen value=45.8
humidity,location=living_room value=42.3
humidity,location=bedroom value=48.7

humidity,location=kitchen value=46.5
humidity,location=living_room value=44.2
humidity,location=bedroom value=49.8

humidity,location=kitchen value=47.2
humidity,location=living_room value=45.1
humidity,location=bedroom value=50.2

humidity,location=kitchen value=46.8
humidity,location=living_room value=44.8
humidity,location=bedroom value=49.6

humidity,location=kitchen value=45.9
humidity,location=living_room value=43.7
humidity,location=bedroom value=48.5

co2_ppm,location=kitchen value=612
co2_ppm,location=living_room value=578
co2_ppm,location=bedroom value=495

co2_ppm,location=kitchen value=635
co2_ppm,location=living_room value=582
co2_ppm,location=bedroom value=510

co2_ppm,location=kitchen value=621
co2_ppm,location=living_room value=565
co2_ppm,location=bedroom value=488

co2_ppm,location=kitchen value=642
co2_ppm,location=living_room value=595
co2_ppm,location=bedroom value=502

co2_ppm,location=kitchen value=658
co2_ppm,location=living_room value=612
co2_ppm,location=bedroom value=521

co2_ppm,location=kitchen value=631
co2_ppm,location=living_room value=586
co2_ppm,location=bedroom value=508

co2_ppm,location=kitchen value=618
co2_ppm,location=living_room value=572
co2_ppm,location=bedroom value=491

Demo how to Query

Sample queries (MySQL-style):

create database my_db;
CREATE DATABASE my-weather RETENTION 30d;
ALTER DATABASE my-weather SET RETENTION 30d;
select * from system.databases;
show tables;

InfluxDB SQL

Write CSV data

Set the measurement as csv_measurement

time,location,value
1741176000,kitchen,22.5
1741176000,living_room,21.8
1741176000,bedroom,20.3
1741176060,kitchen,23.1
1741176060,living_room,22.0
1741176060,bedroom,20.7
1741176120,kitchen,22.8
1741176120,living_room,21.5
1741176120,bedroom,20.1

Write JSON data

Set the measurement as json_measurement

[
  {"time":1741176000,"location":"kitchen","value":22.5},
  {"time":1741176000,"location":"living_room","value":21.8},
  {"time":1741176000,"location":"bedroom","value":20.3},

  {"time":1741176060,"location":"kitchen","value":23.1},
  {"time":1741176060,"location":"living_room","value":22.0},
  {"time":1741176060,"location":"bedroom","value":20.7},

  {"time":1741176120,"location":"kitchen","value":22.8},
  {"time":1741176120,"location":"living_room","value":21.5},
  {"time":1741176120,"location":"bedroom","value":20.1}
]

Login to Client CLI

docker exec -it influxdb3-core bash

Inside Container

export DEFAULT_TOKEN=""

influxdb3 query --database my-db "select * from yourmeasurement" --token $DEFAULT_TOKEN

Telegraf

Telegraf, a server-based agent, collects and sends metrics and events from databases, systems, and IoT sensors. Written in Go, Telegraf compiles into a single binary with no external dependencies and requires minimal memory.

Install Telegraf CLI

Telegraf Plugins

Telegraf Plugins Github

Add your host details

Mac / Linux

export MQTT_HOST_NAME=""
export MQTT_PORT=
export MQTT_USER_NAME=""
export MQTT_PASSWORD=""
export INFLUX_TOKEN=""
export INFLUX_DB_BUCKET=""

Windows

set MQTT_HOST_NAME=""
set MQTT_PORT=
set MQTT_USER_NAME=""
set MQTT_PASSWORD=""
set INFLUX_TOKEN=""
set INFLUX_DB_BUCKET=""

telegraf.conf

# Global agent configuration
[agent]
  interval = "5s"
  flush_interval = "10s"
  omit_hostname = true

# MQTT Consumer Input Plugin
[[inputs.mqtt_consumer]]
  servers = ["ssl://${MQTT_HOST_NAME}:${MQTT_PORT}"]
  username = "${MQTT_USER_NAME}"
  password = "${MQTT_PASSWORD}"

  # Set custom measurement name
  name_override = "my_python_sensor_temp"
  
  # Topics to subscribe to
  topics = [
    "sensors/temp",
  ]
  
  # Connection timeout
  connection_timeout = "30s"
  
  # TLS/SSL configuration
  insecure_skip_verify = true
  
  # QoS level
  qos = 1
  
  # Client ID
  client_id = "telegraf_mqtt_consumer"
  
  # Data format
  data_format = "value"
  data_type = "float"

# InfluxDB v2 Output Plugin
[[outputs.influxdb_v2]]
  # URL for your local InfluxDB
  urls = ["http://localhost:8181"]
  
  # InfluxDB token
  token = "${INFLUX_TOKEN}"
  
  # Organization name
  organization = ""
  
  # Destination bucket
  bucket = "${INFLUX_DB_BUCKET}"

  # Add tags - match the location from your MQTT script
  [outputs.influxdb_v2.tags]
    location = "room1"

Run Telegraf

telegraf --config telegraf.conf --debug

Storing output in InfluxDB and S3

export MQTT_HOST_NAME=""
export MQTT_PORT=
export MQTT_USER_NAME=""
export MQTT_PASSWORD=""
export INFLUX_TOKEN=""
export INFLUX_DB_ORG=""
export INFLUX_DB_BUCKET=""
export S3_BUCKET=""
export AWS_REGION=""
export AWS_ACCESS_KEY_ID=""
export AWS_SECRET_ACCESS_KEY=""

telegraf.conf

# Global agent configuration
[agent]
  interval = "5s"
  flush_interval = "10s"
  omit_hostname = true

# MQTT Consumer Input Plugin
[[inputs.mqtt_consumer]]
  servers = ["ssl://${MQTT_HOST_NAME}:${MQTT_PORT}"]
  username = "${MQTT_USER_NAME}"
  password = "${MQTT_PASSWORD}"

  # Set custom measurement name
  name_override = "my_python_sensor_temp"
  
  # Topics to subscribe to
  topics = [
    "sensors/temp",
  ]
  
  # Connection timeout
  connection_timeout = "30s"
  
  # TLS/SSL configuration
  insecure_skip_verify = true
  
  # QoS level
  qos = 1
  
  # Client ID
  client_id = "telegraf_mqtt_consumer"
  
  # Data format
  data_format = "value"
  data_type = "float"

# InfluxDB v2 Output Plugin
[[outputs.influxdb_v2]]
  # URL for your local InfluxDB
  urls = ["http://localhost:8181"]
  
  # InfluxDB token
  token = "${INFLUX_TOKEN}"
  
  # Organization name
  organization = ""
  
  # Destination bucket
  bucket = "${INFLUX_DB_BUCKET}"

  # Add tags - match the location from your MQTT script
  [outputs.influxdb_v2.tags]
    location = "room1"

# S3 Output Plugin with CSV format
[[outputs.remotefile]]
  remote = 's3,provider=AWS,access_key_id=${AWS_ACCESS_KEY_ID},secret_access_key=${AWS_SECRET_ACCESS_KEY},region=${AWS_REGION}:${S3_BUCKET}'
 
  # File naming
  files = ['{{.Name}}-{{.Time.Format "2006-01-02"}}']  # Go layouts use the reference date 2006-01-02

InfluxDB University

Free Training

#telegraf #docker #measurement

Last change: 2026-04-16

[Avg. reading time: 0 minutes]

Data Visualization libraries

Popular tools

  • Grafana
  • Tableau
  • PowerBI
  • StreamLit
  • Python MatplotLib
  • Python Seaborn

#grafana #dataviz

Last change: 2026-04-16

[Avg. reading time: 8 minutes]

Grafana

Grafana is an open-source analytics and visualization platform that allows you to query, visualize, alert on, and understand your metrics from various data sources through customizable dashboards.

  • Provides real-time monitoring of IoT device data through intuitive dashboards
  • Supports visualization of time-series data (which is common in IoT applications)
  • Offers powerful alerting capabilities for monitoring device health and performance
  • Enables custom dashboards that can display metrics from multiple IoT devices in one view.
  • InfluxDB is optimized for storing and querying time-series data generated by IoT sensors.
  • The combination provides high-performance data ingestion for handling large volumes of IoT telemetry.
  • InfluxDB’s data retention policies help manage IoT data storage efficiently.
  • Grafana can easily visualize the time-series data stored in InfluxDB through simple queries.
  • Both tools are lightweight enough to run on edge computing devices for local IoT monitoring.

Deploy InfluxDB/Grafana

Create a network

  • Isolation and security - The dedicated network isolates your containers from each other and from the host system, reducing the attack surface.
  • Container-to-container communication - Containers in the same network can communicate using their container names (like “myinflux” and “mygrafana”) as hostnames, making connections simpler and more reliable.
  • Port conflict prevention - You avoid potential port conflicts on the host, as multiple applications can use the same internal port numbers within their isolated network.
  • Simpler configuration - Services can reference each other by container name instead of IP addresses, making configuration more maintainable.

Updated docker-compose.yml

Stop the previous containers

docker compose down

docker-compose.yml

name: influxdb3
services:
  influxdb3-core:
    container_name: influxdb3-core
    image: influxdb:3-core
    ports:
      - 8181:8181
    command:
      - influxdb3
      - serve
      - --node-id=node0
      - --object-store=file
      - --data-dir=/var/lib/influxdb3/data
      - --plugin-dir=/var/lib/influxdb3/plugins
    volumes:
      - ./.influxdb3/core/data:/var/lib/influxdb3/data
      - ./.influxdb3/core/plugins:/var/lib/influxdb3/plugins
    restart: unless-stopped

  influxdb3-explorer:
    image: influxdata/influxdb3-ui:latest
    container_name: influxdb3-explorer
    ports:
      - "8888:80"
    volumes:
      - ./.influxdb3-ui/db:/db:rw
      - ./.influxdb3-ui/config:/app-root/config:ro
    environment:
      SESSION_SECRET_KEY: "${SESSION_SECRET_KEY:-$(openssl rand -hex 32)}"
    restart: unless-stopped
    command: ["--mode=admin"]

  grafana:
    image: grafana/grafana-oss:latest
    container_name: grafana
    ports:
      - "3000:3000"
    volumes:
      - ./.grafana:/var/lib/grafana
    depends_on:
      - influxdb3-core
    restart: unless-stopped

docker compose up -d

InfluxDB UI

http://localhost:8888

Grafana

http://localhost:3000

userid/pwd: admin/admin

InfluxDB Host: http://influxdb3-core:8181 (all three services are on the same Docker network)

Demo

Write SQL - Build Dashboards - Alerts

#influxdb #grafana #sql

Last change: 2026-04-16

[Avg. reading time: 0 minutes]

Machine Learning with IoT

  1. IoT Data Characteristics
  2. Feature Engineering
  3. Predictive Maintenance
  4. Anomaly Detection
  5. ML with IoT

[Avg. reading time: 5 minutes]

IoT Data Characteristics

What is IoT Data?

IoT data is generated continuously from sensors and devices interacting with the physical world.

Unlike traditional datasets:

  • It is time-dependent
  • It arrives as a continuous stream
  • It reflects real-world conditions, not controlled inputs

Examples

  • Temperature readings every second
  • Machine vibration signals
  • GPS location streams


Key Characteristics of IoT Data

1. Time-Series Nature

  • Data is ordered by time
  • Past values influence future values

Example

  • Temperature at 10:01 depends on 10:00

2. High Frequency & Volume

  • Data generated every second (or faster)
  • Quickly becomes large-scale

3. Noisy Data

  • Sensors are imperfect
  • External conditions introduce fluctuations

Example

  • Temperature spikes due to environment, not actual issue

4. Missing Data

  • Network issues
  • Device downtime
  • Transmission failures

5. Outliers & Spikes

  • Sudden jumps or drops
  • Could be real events OR sensor errors

6. Correlated Signals

  • Multiple sensors interact

Example

  • Temperature ↑ → Pressure ↑ → Humidity ↓

7. Continuous & Streaming

  • Data is not static
  • Always flowing

Data Quality Challenges in IoT

1. Missing Values

  • Gaps in data streams
  • Need interpolation or handling strategies

2. Duplicate Data

  • Common with MQTT QoS1 (at-least-once delivery)

3. Out-of-Order Data

  • Events may arrive late
  • Timestamp handling becomes critical

4. Sensor Drift

  • Sensors degrade over time
  • Gradual deviation from true values

5. Noise vs Signal Problem

  • Hard to distinguish real events from random fluctuations

Why This Matters for ML

Raw IoT data:

  • Is not directly usable
  • Leads to poor model performance
  • Causes false alerts and missed predictions

Before applying ML, we must transform raw data into meaningful signals using Feature Engineering.

#iotdata #noise

Last change: 2026-04-16

[Avg. reading time: 8 minutes]

Feature Engineering

Feature engineering is the process of transforming raw IoT sensor data into meaningful signals that machine learning models can understand.

Raw sensor data is:

  • noisy
  • incomplete
  • difficult to interpret

Feature engineering converts it into:

  • trends
  • patterns
  • changes over time

One-line takeaway

  • Models don’t learn from raw data, they learn from engineered signals.

Why Feature Engineering is Critical in IoT

IoT data is fundamentally different from traditional datasets:

  • continuous streams
  • time-dependent
  • affected by environment

Without feature engineering:

  • models produce false alerts
  • important patterns are missed
  • predictions become unstable

Core Feature Types

1. Rolling / Window Features

Capture short-term behavior over a time window.

  • rolling mean
  • rolling standard deviation
  • rolling min/max

Example

  • average temperature over last 5 minutes

Purpose

  • smooth noise
  • identify stability vs fluctuation

| hr | temp |
|------|------|
| 1    | 20   |
| 2    | 21   |
| 3    | 35   |
| 4    | 22   |

Rolling Window (window = 2)

| hr | temp | rolling_mean_2 |
|----|------|----------------|
| 1  | 20   | 20             |
| 2  | 21   | 20.5           |
| 3  | 35   | 28             |
| 4  | 22   | 28.5           |

Rolling Window (window = 3)

| hr | temp | rolling_mean_3 |
|----|------|----------------|
| 1  | 20   | 20             |
| 2  | 21   | 20.5           |
| 3  | 35   | 25.3           |
| 4  | 22   | 26             |

window = 2 : current + previous value
window = 3 : current + last 2 values

A small window preserves spikes.

A large window smooths the data.
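The rolling means in the tables above can be reproduced with a few lines of plain Python. This is a minimal sketch: for the first rows, where a full window is not yet available, it averages however many values exist so far, which matches the tables.

```python
# Rolling mean over the temperature readings from the tables above.
temps = [20, 21, 35, 22]

def rolling_mean(values, window):
    out = []
    for i in range(len(values)):
        # For early rows, fall back to however many values exist so far.
        chunk = values[max(0, i - window + 1):i + 1]
        out.append(round(sum(chunk) / len(chunk), 1))
    return out

print(rolling_mean(temps, 2))  # [20.0, 20.5, 28.0, 28.5]
print(rolling_mean(temps, 3))  # [20.0, 20.5, 25.3, 26.0]
```

Notice how the hr=3 spike (35) is visible in the window-2 column but already dampened in the window-3 column.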


2. Lag Features

Use past values of a signal.

  • temp(t-1), temp(t-5), temp(t-10)

Purpose

  • help models learn trends
  • capture temporal dependencies

| hr | temp | lag_1 |
| -- | ---- | ----- |
| 1  | 20   | -     |
| 2  | 21   | 20    |
| 3  | 35   | 21    |
| 4  | 22   | 35    |

3. Rate of Change (Delta)

Measure how fast a signal changes.

  • temp(t) - temp(t-1)
  • pressure change per second

Purpose

  • detect sudden spikes
  • highlight abnormal behavior

Raw Data

| hr | temp |
|------|------|
| 1    | 20   |
| 2    | 21   |
| 3    | 35   |
| 4    | 22   |

Feature Engineering

| hr | temp | rolling_mean | delta |
|------|------|--------------|-------|
| 1    | 20   | 20           | -     |
| 2    | 21   | 20.5         | +1    |
| 3    | 35   | 25.3         | +14   |
| 4    | 22   | 26           | -13   |

Insight

  • spike at hr=3
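The lag and delta columns above can be derived directly from the raw list; a minimal sketch in plain Python:

```python
temps = [20, 21, 35, 22]

# lag_1: the previous reading (undefined for the first row).
lag_1 = [None] + temps[:-1]

# delta: rate of change between consecutive readings.
delta = [None] + [b - a for a, b in zip(temps, temps[1:])]

print(lag_1)  # [None, 20, 21, 35]
print(delta)  # [None, 1, 14, -13]
```

The +14 / -13 swing in `delta` is exactly the hr=3 spike the table highlights.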

4. Aggregation Features

Summarize behavior over time.

  • average over 10 minutes
  • count of spikes
  • max/min values

Purpose

  • capture overall system behavior

5. Time-Based Features

Incorporate time context.

  • hour of day
  • day of week

Purpose

  • capture seasonality patterns
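Hour-of-day and day-of-week can be pulled straight from a reading's timestamp; a minimal sketch using only the standard library (the timestamp is hypothetical):

```python
from datetime import datetime

# Hypothetical reading timestamp.
ts = datetime(2026, 4, 16, 14, 30)

features = {
    "hour_of_day": ts.hour,       # 0-23, captures daily cycles
    "day_of_week": ts.weekday(),  # 0 = Monday ... 6 = Sunday
}
print(features)  # {'hour_of_day': 14, 'day_of_week': 3}
```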

6. Cross-Sensor Features

Combine multiple sensor readings.

  • temperature + humidity
  • pressure vs vibration

Purpose

  • capture relationships between signals
  • improve model accuracy

How Feature Engineering Connects to ML in IoT

Predictive Maintenance

  • uses trends and long-term patterns
  • detects gradual degradation

Anomaly Detection

  • uses delta and rolling statistics
  • identifies sudden spikes and instability

Classification

  • uses patterns of behavior
  • distinguishes device states (normal vs faulty)

Key Principle

Feature engineering bridges the gap between:

  • raw sensor data
  • intelligent ML decisions

#featureengineering #datacleaning

Last change: 2026-04-16

[Avg. reading time: 5 minutes]

Predictive Maintenance

Predictive maintenance uses IoT telemetry to anticipate equipment failure before it happens, enabling intervention at the right time instead of reacting after breakdowns.

This shifts operations from reactive → preventive → predictive.


Core Components

  • Sensor Integration
    Capture continuous signals like vibration, temperature, pressure, and acoustic patterns from equipment.

  • Data Processing
    Clean, normalize, and time-align high-frequency sensor streams for downstream use.

  • Condition Monitoring
    Track real-time metrics against thresholds or baseline behavior to detect deviations.

  • Failure Prediction Models
    Apply statistical or ML models (regression, classification, anomaly detection) trained on historical failure patterns.


Implementation Architecture

  • Edge Layer
    Perform lightweight filtering and anomaly detection close to the device to reduce latency and bandwidth.

  • Fog Layer
    Aggregate multiple devices, run near-real-time analytics, and coordinate localized decisions.

  • Cloud Layer
    Train models, store long-term telemetry, and run deeper analysis across fleets.

  • Visualization & Alerting
    Dashboards, alerts, and automated triggers for maintenance teams.


Why This Matters in Data Engineering

  • Sensor data is high volume, high velocity, time-series heavy
  • Requires streaming pipelines (MQTT → Kafka → TSDB / Lakehouse)
  • Needs schema evolution + late arriving data handling
  • Models depend on feature engineering over time windows (rolling stats, lag features)
  • Poor design leads to unreliable predictions

Benefits

  • Reduced Downtime
    Failures are prevented, not reacted to

  • Cost Optimization
    Avoid unnecessary scheduled maintenance

  • Extended Asset Life
    Early detection prevents irreversible damage

  • Improved Safety
    Reduces risk of catastrophic failures


git clone https://github.com/gchandra10/python_iot_workflow_predictive_demo.git

Real Example

  • Motor vibration increases gradually over time
  • Edge detects anomaly spike
  • Fog aggregates patterns across similar machines
  • Cloud model predicts failure in ~5 days
  • Alert triggered → maintenance scheduled
  • Downtime avoided
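The motor example above can be sketched as a simple trend extrapolation: fit a line to recent vibration readings and estimate when it crosses a failure threshold. All numbers below (readings, threshold) are hypothetical; a real system would use proper time-series models.

```python
# (day, vibration mm/s) - hypothetical telemetry showing gradual degradation.
readings = [(0, 1.0), (1, 1.2), (2, 1.5), (3, 1.7), (4, 2.0)]
threshold = 3.0  # assumed failure threshold

# Ordinary least-squares fit of vibration vs. day.
n = len(readings)
sx = sum(d for d, _ in readings)
sy = sum(v for _, v in readings)
sxy = sum(d * v for d, v in readings)
sxx = sum(d * d for d, _ in readings)

slope = (n * sxy - sx * sy) / (n * sxx - sx * sx)
intercept = (sy - slope * sx) / n

# Day at which the fitted line crosses the threshold.
crossing_day = (threshold - intercept) / slope
print(round(crossing_day, 1))  # ~8.1, i.e. about 4 days after the last reading
```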

#predictive #iot #edge #fog #timeseries

Last change: 2026-04-16

[Avg. reading time: 15 minutes]

Anomaly Detection

Anomaly detection and predictive maintenance are important parts of the IoT upper stack. They help analyze device and sensor data to detect unusual behavior early and reduce the chance of equipment failure.

Anomaly Detection in IoT

Anomaly detection identifies data points or patterns that do not match normal system behavior.

In IoT systems, this is useful for:

  • detecting abnormal sensor readings
  • identifying device malfunctions
  • spotting unusual operational behavior
  • triggering alerts before failures become serious

This is especially valuable in industrial IoT, smart manufacturing, healthcare, logistics, and other environments where sensor data arrives continuously.

Common Approaches

Statistical Methods

Statistical approaches define a baseline of normal behavior and flag values that deviate significantly from it.

Examples:

  • mean and standard deviation
  • z-score
  • moving averages
  • seasonal thresholds

These methods are simple and fast, but they may struggle when the data is complex or changes over time.
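A minimal z-score sketch using only the standard library (the sample readings are hypothetical):

```python
import statistics

readings = [20, 21, 22, 21, 20, 35, 22, 21]

mean = statistics.mean(readings)
sd = statistics.stdev(readings)

# Flag readings more than 2 standard deviations from the mean.
anomalies = [x for x in readings if abs(x - mean) / sd > 2]
print(anomalies)  # [35]
```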

Machine Learning Techniques

Machine learning models learn patterns from historical data and identify points that do not fit those patterns.

Examples:

  • Isolation Forest
  • One-Class SVM
  • Local Outlier Factor
  • clustering-based approaches

These methods are useful when normal behavior is not easy to define with simple rules.

Deep Learning Models

Deep learning models can detect anomalies in high-dimensional or sequential IoT data.

Examples:

  • autoencoders
  • LSTM-based sequence models
  • transformer-based time-series models

These models are powerful, but they usually require more data, more tuning, and more compute.

Isolation Forest

Isolation Forest is one of the most practical algorithms for anomaly detection.

Unlike many other methods, it does not rely on distance or density. Instead, it works on a simple idea:

Anomalies are few and different, so they are easier to isolate than normal points.

Core Idea

Isolation Forest builds many random trees.

In each tree:

  • a feature is selected randomly
  • a split value is selected randomly
  • the data is repeatedly divided until individual points become isolated

A point that gets isolated quickly is more likely to be an anomaly.

A point that needs more splits to isolate is more likely to be normal.

Why It Works

Normal points usually belong to dense regions of the dataset, so they take more splits to separate.

Anomalies are often far away from the bulk of the data, so they get isolated in fewer steps.

That is why:

  • shorter path length → more anomalous
  • longer path length → more normal

Simple Example

Dataset: [-100, 2, 11, 13, 100]

In practice, Isolation Forest builds many trees (100+).
Here we show only 4 trees for understanding.

Tree 1

                Root
                 |
        [Split at value = 7]
        /                 \
    [-100, 2]        [11, 13, 100]
        |                  |
[Split at value = -49]  [Split at value = 56]
    /         \          /         \
[-100]       [2]    [11, 13]      [100]

Path lengths:
- -100 → 2
- 2 → 2
- 11 → 3
- 13 → 3
- 100 → 2

Tree 2

                Root
                 |
        [Split at value = 1]
        /                 \
    [-100]          [2, 11, 13, 100]
                        |
                  [Split at value = 50]
                  /                 \
            [2, 11, 13]           [100]

Approx path lengths:
- -100 → 1
- 100 → 2
- 2, 11, 13 → 3 to 4

Tree 3


                Root
                 |
        [Split at value = 12]
        /                 \
[-100, 2, 11]         [13, 100]
        |                  |
[Split at value = -40]  [Split at value = 57]
    /         \          /         \
[-100]     [2, 11]     [13]       [100]

Path lengths:
- -100 → 2
- 2 → 3
- 11 → 3
- 13 → 2
- 100 → 2

Tree 4

                Root
                 |
        [Split at value = 80]
        /                 \
[-100, 2, 11, 13]        [100]
        |
[Split at value = -50]
    /         \
[-100]    [2, 11, 13]

Approx path lengths:
- 100 → 1
- -100 → 2
- others → 3+

Average Path Length

  • -100 → (2 + 1 + 2 + 2) / 4 = 1.75
  • 2 → (2 + 3 + 3 + 3) / 4 = 2.75
  • 11 → (3 + 3 + 3 + 3) / 4 = 3.00
  • 13 → (3 + 3 + 2 + 3) / 4 = 2.75
  • 100 → (2 + 2 + 2 + 1) / 4 = 1.75

Anomaly Score

s(x, n) = 2^(-E[h(x)] / c(n))

Where:

  • E[h(x)] = average path length
  • c(n) = normalization factor

Score meaning:

  • closer to 1 → anomaly
  • closer to 0 → normal

Interpretation

The extreme values (-100 and 100) are isolated faster than the middle values.

That means:

  • -100 and 100 → anomalies
  • 2, 11, 13 → normal points
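The isolation idea can be sketched in plain Python: split at random values and count how many splits it takes to isolate each point. This is a teaching sketch of path length only, not the full algorithm; in practice you would use a library implementation such as scikit-learn's `IsolationForest`.

```python
import random

def path_length(values, x, depth=0):
    # Split at a random value until x is alone in its partition.
    if len(values) <= 1:
        return depth
    lo, hi = min(values), max(values)
    if lo == hi:
        return depth
    split = random.uniform(lo, hi)
    # Keep only the values on the same side of the split as x.
    side = [v for v in values if (v < split) == (x < split)]
    return path_length(side, x, depth + 1)

random.seed(42)
data = [-100, 2, 11, 13, 100]

for x in data:
    avg = sum(path_length(data, x) for _ in range(1000)) / 1000
    print(x, round(avg, 2))  # extremes isolate in fewer splits on average
```

Running this, -100 and 100 consistently show shorter average path lengths than 2, 11, and 13, matching the hand-worked trees above.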

Key Points

  • anomalies are few and different
  • random splits isolate anomalies faster
  • path length determines anomaly likelihood
  • ensemble of trees improves reliability
  • no distance calculations required
  • scales well for large datasets

Advantages

  • simple and intuitive
  • fast and scalable
  • works with high-dimensional data
  • no need for distance calculations
  • good for unsupervised learning

Limitations

  • struggles with clustered anomalies
  • sensitive when anomalies are near normal data
  • randomness can cause variation in small datasets
  • threshold selection is use-case dependent

Isolation Forest in IoT

Used for:

  • temperature anomalies
  • vibration anomalies
  • pressure irregularities
  • device failure prediction
  • real-time alerting

Applications:

  • predictive maintenance
  • fault detection
  • industrial monitoring

#anomaly #predictivemaintenance

Last change: 2026-04-16

[Avg. reading time: 17 minutes]

ML Models quick intro

Supervised Learning

In supervised learning, classification and regression are two distinct types of tasks, differing primarily in the nature of their output and the problem they solve.

Labeled historical data (e.g., sensor readings with timestamps of past failures).

Classification

Predicts discrete labels (categories or classes).

Example:

Binary: Failure (1) vs. No Failure (0).

Multi-class: Type of failure (bearing_failure, motor_overheat, lubrication_issue).

Regression

Predicts continuous numerical values.

Example:

Remaining Useful Life (RUL): 23.5 days until failure.

Time-to-failure: 15.2 hours.

Use Cases in Predictive Maintenance

Classification:

Answering yes/no questions:

  • Will this motor fail in the next week?
  • Is the current vibration pattern abnormal?
  • Identifying the type of fault (e.g., electrical vs. mechanical).

Regression:

Quantifying degradation:

  • How many days until the turbine blade needs replacement?
  • What is the current health score (0–100%) of the compressor?

Algorithms

| Category | Algorithm | Description |
|----------|-----------|-------------|
| Classification | Logistic Regression | Models probability of class membership. |
| | Random Forest | Ensemble of decision trees for classification. |
| | Support Vector Machines (SVM) | Maximizes margin between classes. |
| | Neural Networks | Learns complex patterns and nonlinear decision boundaries. |

| Category | Algorithm | Description |
|----------|-----------|-------------|
| Regression | Linear Regression | Models linear relationship between features and target. |
| | Decision Trees (Regressor) | Tree-based model for predicting continuous values. |
| | Gradient Boosting Regressors | Ensemble of weak learners (e.g., XGBoost, LightGBM). |
| | LSTM Networks | Recurrent neural networks for time-series regression. |

Evaluation Metrics

Classification:

  • Accuracy: % of correct predictions.
  • Precision/Recall: Trade-off between false positives and false negatives.
    • Precision: TP/(TP+FP)
    • Recall: TP/(TP+FN)
  • F1-Score: Harmonic mean of precision and recall.

Example:

Will the temperature exceed 90F in 10 mins?

Positive: Will cross 90F
Negative: Will not cross 90F

True Positive

Model: Temp will cross 90
Actual: It did cross 90

Result: Correct and we are prepared.

False Positive

Model: Temp will cross 90
Actual: Didn’t cross

Result: Predicted heat but it never happened.

True Negative

Model: Temp will be less than 90
Actual: Temp stayed less than 90

Result: Predicted low and it happened.

False Negative

Model: Temp will be less than 90
Actual: Temp went above 90

Result: Missed issue.

In IoT, false negatives are risky; false positives are an annoyance.
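With the temperature example, the metrics fall out of the four counts directly. The counts below are hypothetical, just to make the arithmetic concrete:

```python
# Hypothetical counts from the 90F prediction example.
tp, fp, tn, fn = 8, 2, 85, 5

precision = tp / (tp + fp)  # of the "will cross 90F" alerts, how many were right
recall = tp / (tp + fn)     # of the actual crossings, how many did we catch
f1 = 2 * precision * recall / (precision + recall)

print(precision)         # 0.8
print(round(recall, 3))  # 0.615
print(round(f1, 3))      # 0.696
```

Here the recall of 0.615 is the number to worry about: roughly 4 in 10 real crossings are missed, and in IoT those false negatives are the risky ones.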

Regression:

  • Mean Absolute Error (MAE): Average absolute difference between predicted and actual values.
  • Mean Squared Error (MSE): Penalizes larger errors.
  • R² Score: How well the model explains variance in the data.
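A quick sketch of MAE and MSE on hypothetical remaining-useful-life predictions:

```python
actual = [10, 12, 14]      # actual days to failure (hypothetical)
predicted = [11, 11, 16]   # model predictions

errors = [p - a for a, p in zip(actual, predicted)]

mae = sum(abs(e) for e in errors) / len(errors)  # (1 + 1 + 2) / 3
mse = sum(e * e for e in errors) / len(errors)   # (1 + 1 + 4) / 3

print(round(mae, 2))  # 1.33
print(mse)            # 2.0
```

Note how MSE already weights the 2-day miss more heavily than the two 1-day misses.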

Unsupervised Learning

In unsupervised learning, clustering and anomaly detection serve distinct purposes and address different problems.

Primary Objective

Clustering

  • Assigns each data point to a cluster (e.g., Cluster 1, Cluster 2).
  • Outputs are groups of similar instances.

Goal: Group data points into clusters based on similarity.

  • Focuses on discovering natural groupings or patterns in the data.

Example: Segmenting devices into groups based on usage.

| Room | Temp | Humidity | CO₂ | Occupancy |
|------|------|----------|-----|-----------|
| R1 | 22 | 40 | 500 | Low |
| R2 | 23 | 42 | 520 | Low |
| R3 | 28 | 60 | 900 | High |
| R4 | 29 | 65 | 950 | High |
Cluster 1 - R1 and R2, Cluster 2 - R3 and R4

Anomaly Detection

  • Labels data points as normal or anomalous (binary classification).
  • Outputs are scores or probabilities indicating how “outlier-like” a point is.

Goal: Identify rare or unusual data points that deviate from the majority.

Focuses on detecting outliers or unexpected patterns.

Example: Flagging fraudulent credit card transactions.

Algorithms

| Category | Algorithm | Description |
|----------|-----------|-------------|
| Clustering | K-Means | Partitions data into k spherical clusters. |
| | Hierarchical Clustering | Builds nested clusters using dendrograms. |
| | DBSCAN | Groups dense regions and identifies sparse regions as outliers. |
| | Gaussian Mixture Models (GMM) | Probabilistic clustering using a mixture of Gaussians. |
| Anomaly Detection | Isolation Forest | Isolates anomalies using random decision trees. |
| | One-Class SVM | Learns a boundary around normal data to detect outliers. |
| | Autoencoders | Reconstructs input data; anomalies yield high reconstruction error. |
| | Local Outlier Factor (LOF) | Detects anomalies by comparing local density of data points. |

Time Series

Forecasting and Anomaly Detection are two fundamental but distinct tasks, differing in their objectives, data assumptions, and outputs.

| Model | Type | Strengths | Limitations |
|-------|------|-----------|-------------|
| ARIMA/SARIMA | Classical | Simple, interpretable, strong for univariate, seasonal data | Requires stationary data, manual tuning |
| Facebook Prophet | Additive model | Easy to use, handles holidays/seasonality, works with missing data | Slower for large datasets, limited to trend/seasonality modeling |
| Holt-Winters (Exponential Smoothing) | Classical | Lightweight, works well with level/trend/seasonality | Not good with irregular time steps or complex patterns |
| LSTM (Recurrent Neural Network) | Deep Learning | Learns long-term dependencies, supports multivariate | Requires lots of data, training is resource-intensive |
| XGBoost + Lag Features | Machine Learning | High performance, flexible with engineered features | Requires feature engineering, not a “true” time series model |
| NeuralProphet | Hybrid (Prophet + NN) | Better performance than Prophet, supports regressors/events | Heavier than Prophet, still maturing |
| Temporal Fusion Transformer (TFT) | Deep Learning | SOTA for multivariate forecasts with interpretability | Overkill for small/medium IoT data, very heavy |

| Layer | Model(s) | Why |
|-------|----------|-----|
| Edge | Holt-Winters, thresholds, micro-LSTM (TinyML), Prophet (inference) | Extremely lightweight, low latency |
| Fog | Prophet, ARIMA, Isolation Forest, XGBoost | Moderate compute, supports both real-time + near-real-time |
| Cloud | LSTM, TFT, NeuralProphet, Prophet (training), XGBoost | Can handle heavy training, multivariate data, batch scoring |

git clone https://github.com/gchandra10/python_iot_ml_demo.git

#ml #iot #edge

Last change: 2026-04-16

[Avg. reading time: 1 minute]

Security

  1. Introduction
  2. Application Layer
  3. Data Layer
  4. Communication Layer
  5. Number Systems
  6. Encryption
  7. IoT Privacy
  8. Auditing in IoT


[Avg. reading time: 9 minutes]

Introduction to IoT Security Challenges

IoT security is not just theory; it is real.

News articles

MS Azure blocks Largest DDoS Attack

Govt CISA Replace EOL Edge Devices

US DOJ Botnets


Why IoT Is Hard to Secure

| Reason | Explanation |
|--------|-------------|
| Resource Constraints | Limited CPU, memory → hard to run strong security controls |
| Scale & Diversity | Thousands of devices, different vendors → hard to manage |
| Physical Exposure | Devices can be accessed or tampered with in the field |
| Long Lifespan | Devices run for years with poor or no updates |
| Insecure Defaults | Weak passwords, open ports, outdated firmware |
| Inconsistent Standards | Security exists, but not applied consistently |

What This Means in Practice

  • You cannot rely on one layer
  • You cannot patch easily
  • You must assume devices are compromised

Security Layers in IoT

| Layer | Focus | Key Concerns |
|-------|-------|--------------|
| Device-Level | Hardware + firmware | Secure boot, tampering, firmware integrity |
| Upper Stack | Data, APIs, cloud | Auth, encryption, APIs, IAM |

Reality

If device layer fails, upper layers receive fake but valid-looking data.

Your dashboards will lie.

Upper Stack Attack Surfaces

Application

  • Insecure APIs
  • Weak authentication
  • Poor input validation
  • No rate limiting

Attack Example: Attacker sends 10,000 fake requests → API crashes → system unavailable

Data

  • No encryption (in transit / at rest)
  • Public cloud storage
  • Weak access control

Attack Example: Open S3 bucket -> attacker downloads sensitive sensor data

Communication

  • MITM attacks (MQTT, HTTP)
  • Replay attacks
  • Weak TLS/cert handling

Attack Example: Captured MQTT message replayed → system thinks event happened again

Fake Publisher Attack

[ Device ]     [ Attacker ]
     \             /
      \           /
       ---> [ MQTT Broker ] ---> [ Cloud ] ---> [ Dashboard ]

Man-in-the-Middle (MITM)

[ Device ] ---> ❌ Attacker ---> [ MQTT Broker ]

Lower Stack Attack Surfaces

Device

  • Firmware tampering
  • Debug port access
  • Insecure boot

Attack Example: Attacker plugs into device → flashes modified firmware → device becomes a bot

Network

  • No segmentation
  • Open ports
  • Weak local protocols (BLE, Zigbee)

Attack Example: Compromise one device -> scan network -> take over others

Supply Chain

  • Malicious firmware
  • Vulnerable libraries
  • Fake/cloned devices

Attack Example: Cheap cloned sensor sends manipulated data from day 1

Summary

  • One weak layer breaks everything
  • Device -> Network -> Cloud -> App (all connected)
  • Example: weak device auth → attacker sends fake data → corrupts analytics

#security #firmware

Last change: 2026-04-16

[Avg. reading time: 7 minutes]

Application Layer

Insecure APIs

Problem: APIs are the control plane of IoT systems. If they are weak, the entire system is exposed.

Common failures:

  • No authentication or weak auth
  • Over-permissive endpoints
  • No encryption (HTTP instead of HTTPS)
  • No rate limiting

Real-World Use Case:

  • CloudPets breach (2017)
    • API had no authentication
    • Exposed millions of voice recordings
    • Attackers accessed data directly from backend storage

Mitigation:

  • Enforce strong auth:
    • OAuth2 / JWT / Mutual TLS
  • Authorization per endpoint (RBAC)
  • Always use HTTPS
  • Hide internal APIs behind gateways
  • Add API gateway (rate limit + auth + logging)

Demo

git clone https://github.com/gchandra10/python_api_auth_demo.git

Poor Session Management

Sessions are often:

  • Long-lived
  • Reused across devices
  • Stored insecurely

This leads to session hijacking or replay attacks.

Real-World Use Case:

  • Smart thermostat app reused same session token
  • Attacker reused token -> controlled devices remotely

Mitigation:

  • Short-lived access tokens
  • Refresh tokens with rotation
  • Store tokens securely:
    • Avoid localStorage
    • Use HTTP-only cookies
  • Invalidate sessions:
    • Logout
    • Password change
  • Bind session to device/IP if possible

Weak Input Validation (XSS, Injection)

Without validation -> injection attacks:

  • XSS
  • SQL Injection
  • Command Injection

Real-World Example

  • Smart fridge dashboard
  • Attacker injected script -> executed on admin panel
  • Stole session cookies

Mitigation

  • Validate input schema strictly
  • Sanitize inputs
  • Escape outputs (HTML/JS)
  • Use parameterized queries
  • Never trust device-originated data
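A minimal sketch with Python's built-in sqlite3 showing why parameterized queries block injection (the table and values are hypothetical):

```python
import sqlite3

conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE readings (device TEXT, temp REAL)")
conn.execute("INSERT INTO readings VALUES ('sensor1', 25.0)")

# Hostile device-originated input.
device = "sensor1'; DROP TABLE readings; --"

# The ? placeholder makes the driver treat the input strictly as data.
rows = conn.execute(
    "SELECT temp FROM readings WHERE device = ?", (device,)
).fetchall()

print(rows)  # [] - no match, and no injection executed
```

Had the query been built with string concatenation, the same input could have dropped the table.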

No Rate Limiting or Abuse Detection

Without rate limiting

  • Brute force attacks succeed
  • APIs get abused
  • Devices become botnet nodes

Using botnets, attackers can:

  • Build massive DDoS networks
  • Cause Internet outages with millions of compromised IoT devices

Mitigation

  • Rate limit by IP / user / MAC address
  • Detect anomalies such as repeated failures or sudden spikes
  • Temporarily pause access

Limit each client to 5 requests per 60-second window:

from fastapi import FastAPI, Request, HTTPException
from time import time

app = FastAPI()

requests = {}  # ip -> list of recent request timestamps

def rate_limit(ip):
    now = time()
    window = 60  # seconds
    limit = 5    # max requests per window
    requests.setdefault(ip, [])
    # Keep only timestamps inside the current window.
    requests[ip] = [t for t in requests[ip] if now - t < window]
    if len(requests[ip]) >= limit:
        return False
    requests[ip].append(now)
    return True

@app.get("/login")
def login(req: Request):
    ip = req.client.host
    if not rate_limit(ip):
        raise HTTPException(status_code=429)
    return {"ok": True}

#xss #ratelimiting #insecureapi

Last change: 2026-04-16

[Avg. reading time: 9 minutes]

Data Layer

Data in Transit (No Encryption)

Devices send data over MQTT, CoAP, or HTTP without encryption. Anyone on the network can read or modify it.

Real-World Use Case: A smart water meter system in a municipality was transmitting usage data over plain HTTP. Attackers intercepted and altered readings, affecting billing.

Mitigation:

  • Use TLS (HTTPS, MQTT over TLS)
  • Use DTLS for UDP-based protocols (CoAP)
  • Enforce certificate validation and pinning
  • Disable plaintext endpoints completely

import ssl
import paho.mqtt.client as mqtt

client = mqtt.Client()
client.tls_set(ca_certs="ca.crt",
               certfile="client.crt",
               keyfile="client.key",
               tls_version=ssl.PROTOCOL_TLS)

client.connect("broker.hivemq.com", 8883)
client.publish("iot/sensor", "secure message")
client.loop_start()

  • ca.crt : Certificate Authority (CA) used to trust broker (on device) AND trust devices (on broker)
  • client.crt : device identity (sent to broker)
  • client.key : proof device owns that identity

Data at Rest (Unencrypted Databases)

Problem:

  • Data stored on devices, gateways, or cloud is not encrypted.
  • Anyone with access can extract it.

Real-World Use Case: In 2020, a smart door lock vendor left unencrypted SQLite DBs in devices. Attackers extracted access logs and user PINs directly from flash memory.

  • Credential theft
  • Sensitive data exposure
  • Device compromise

Mitigation:

  • Enable AES-based encryption for device-side storage
  • Use full-disk encryption on gateways or fog nodes
  • Enforce encryption at rest (e.g., AWS KMS, Azure SSE) in cloud databases

Online Encrypt / Decrypt

from cryptography.fernet import Fernet

key = Fernet.generate_key()  # symmetric key; keep it secret
cipher = Fernet(key)

data = b"temperature=25"
encrypted = cipher.encrypt(data)       # ciphertext, safe to store
decrypted = cipher.decrypt(encrypted)  # round-trips back to the original
assert decrypted == data

Insecure Cloud Storage (e.g., Public S3 Buckets)

Problem: Cloud object storage like AWS S3 or Azure Blob often gets misconfigured as public, leaking logs, firmware, or user data.

Real-World Use Case: A fitness tracker company exposed terabytes of GPS and health data by leaving their S3 bucket public and unprotected — affecting thousands of users.

Mitigation:

  • Use least privilege IAM roles for all cloud resources
  • Audit and scan for public buckets (AWS Macie, Prowler)
  • Enable object-level encryption and access logging
  • Set up guardrails and policies (e.g., SCPs, Azure Blueprints)

Lack of Data Integrity Checks

Problem: Without integrity checks, even if data is encrypted, an attacker can alter it in transit or at rest without detection.

Real-World Use Case: A smart agriculture system relied on soil sensor readings to trigger irrigation. An attacker tampered with packets to falsify dry-soil readings, wasting water.

Mitigation:

  • Use Hash-based Message Authentication Code (HMAC) or digital signatures with shared secrets
  • Implement checksums or hashes (SHA-256) on stored data
  • Validate data consistency across nodes/cloud with audit trails

import hmac, hashlib

secret = b"key"
message = b"sensor_data=25"

signature = hmac.new(secret, message, hashlib.sha256).hexdigest()

# verify
valid = hmac.compare_digest(
    signature,
    hmac.new(secret, message, hashlib.sha256).hexdigest()
)

print(valid)

Sender:

  • Generates HMAC using secret key
  • Sends: message + signature

Receiver:

  • Recomputes HMAC using same key
  • Compares

#dataintransit #dataatrest #dataintegrity

Last change: 2026-04-16

[Avg. reading time: 5 minutes]

Communication Layer

MITM on MQTT / CoAP

Problem

MQTT and CoAP are lightweight protocols and are often deployed without strong encryption or authentication.

This makes them vulnerable to Man-in-the-Middle (MITM) attacks, where an attacker intercepts, reads, or alters traffic between the device and the broker/server.

Example Scenario

A smart lighting system uses MQTT over plain TCP without TLS.
An attacker on the same network spoofs the broker and sends fake commands, causing all lights to turn off remotely.

Mitigation

  • Use MQTT over TLS on port 8883
  • Use CoAP over DTLS
  • Enable mutual authentication using client and server certificates
  • Verify broker/server identity before accepting a connection
  • Use certificate pinning where appropriate
  • Disable anonymous access on MQTT brokers

Replay Attacks Due to Lack of Freshness

Problem

Some IoT systems do not check whether a message is fresh.
If timestamps, nonces, or sequence numbers are missing, an attacker can capture a valid message and replay it later.

Example Scenario

A smart lock accepts an unlock command without checking whether the message is new.
An attacker records a valid unlock message and replays it later to gain unauthorized access.

Mitigation

  • Add a timestamp, nonce, or message counter to each request
  • Reject duplicate or expired messages
  • Track recently used nonces or counters
  • Use challenge-response for critical actions
  • Use short-lived tokens with expiration checks

Example

{
  "device_id": "lock01",
  "command": "unlock",
  "nonce": "839275abc123",
  "timestamp": "2025-04-01T10:23:00Z"
}
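A freshness check over a message like the one above can be sketched as follows. The field names follow the JSON example; the in-memory nonce set and epoch timestamp (the JSON example uses ISO 8601) are simplifications for illustration.

```python
import time

seen_nonces = set()

def is_fresh(msg, max_age_seconds=30):
    # Reject replays: the nonce must be unseen and the timestamp recent.
    if msg["nonce"] in seen_nonces:
        return False
    if abs(time.time() - msg["timestamp"]) > max_age_seconds:
        return False
    seen_nonces.add(msg["nonce"])
    return True

msg = {
    "device_id": "lock01",
    "command": "unlock",
    "nonce": "839275abc123",
    "timestamp": time.time(),  # epoch seconds for simplicity
}

print(is_fresh(msg))  # True  - first delivery is accepted
print(is_fresh(msg))  # False - the replayed message is rejected
```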

#communicationlayer #mitm

Last change: 2026-04-16

[Avg. reading time: 8 minutes]

Number Systems

Binary

0 and 1

Octal

0-7

Decimal

The standard base-10 number system (digits 0-9).

Hex

0 to 9 and A to F
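Python's built-ins make it easy to check conversions between these bases:

```python
n = 45

print(bin(n))  # 0b101101  (binary)
print(oct(n))  # 0o55      (octal)
print(hex(n))  # 0x2d      (hex)

# And back to decimal:
print(int("101101", 2))  # 45
print(int("2d", 16))     # 45
```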

Base36

A-Z & 0-9

Great for generating short unique IDs. Packs more information into fewer characters.

An epoch timestamp 1602374487561 (13 characters long) converts to the 8-character Base36 string “kg4cebk9”


Popular Use Cases:

Base 36 is used for Dell Express Service Codes and many other applications which have a need to minimize human error.

Base 36 Converter

Example : Processing 1 billion rows each hour for a day

Billion rows x 13 = 13 billion bytes = 13 GB x 24 hrs = 312 GB
Billion rows x 8 = 8 billion bytes = 8 GB x 24 hrs = 192 GB

pip install base36

import base36

base36.dumps(1602374487561)               # 'kg4cebk9'
base36.loads('kg4cebk9') == 1602374487561  # True

Base 64:

Base64 encoding schemes are commonly used when binary data needs to be stored or transferred over media designed to handle textual data. This ensures that the data remains intact, without modification, during transport.

Base64 is a way to encode binary data into an ASCII character set known to pretty much every computer system, in order to transmit the data without loss or modification of the contents itself.

2 power 6 = 64

So each Base64 character represents six bits, not 8 bits.

Base64 encoding converts every three bytes of data (three bytes is 3*8=24 bits) into four base64 characters.

Example:

Convert Hi! to Base64

Character - Ascii - Binary

H= 72 = 01001000

i = 105 = 01101001

! = 33 = 00100001

Hi! = 01001000 01101001 00100001

010010 000110 100100 100001 = S G k h

https://www.base64encode.org/

How about converting Hi to Base64

010010 000110 1001

Pad with zeros at the end so the last group is 6 bits long

010010 000110 100100

Base 64 is SGk=

= is the padding character, so the result length is always a multiple of 4.

Another Example

convert f to Base64

102 = 01100110

011001 100000

Zg==
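The worked examples above can be verified with Python's standard base64 module:

```python
import base64

# "Hi!" is exactly 3 bytes -> 4 Base64 characters, no padding
print(base64.b64encode(b"Hi!").decode())  # SGkh

# "Hi" is 2 bytes -> one '=' pad
print(base64.b64encode(b"Hi").decode())   # SGk=

# "f" is 1 byte -> two '=' pads
print(base64.b64encode(b"f").decode())    # Zg==

# decoding reverses the transformation
print(base64.b64decode("SGk=").decode())  # Hi
```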

Consider sending an image (binary) inside JSON: raw binary won't work, but the Base64-encoded form transfers cleanly.

Image to Base64

https://elmah.io/tools/base64-image-encoder/

View Base64 online

https://html.onlineviewer.net/

#octal #decimal #hex #base64

Encryption in IoT Upper Stack

Two foundational concepts that help protect data are hashing and encryption.

Hashing

Hashing is like creating a digital fingerprint of data. It takes input (e.g., a message or file) and produces a fixed-length hash value.

  • One-way function: You can’t reverse a hash to get the original data.
  • Deterministic: Same input = same hash.
  • Common use: Password storage, data integrity checks.

Use-case: When sending firmware updates to IoT devices, the server also sends a hash. The device re-hashes the update and compares — if it matches, the data wasn’t tampered with.

import hashlib
print(hashlib.sha256(b"iot-data").hexdigest())

Online Hash Generator

Encryption

Encryption transforms readable data (plaintext) into an unreadable format (ciphertext) using a key. Only those with the key can decrypt it back.

Two Types

Symmetric

  • Same key to encrypt and decrypt. Example: AES

Asymmetric

  • Public key to encrypt, private key to decrypt. Example: RSA

Use-case: Secure communication between sensors and cloud, protecting sensitive telemetry, encrypting data at rest.
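To make the symmetric idea concrete (the same key encrypts and decrypts), here is a toy standard-library sketch. It only illustrates the principle; it is NOT real AES and must never be used in production.

```python
import hashlib

def keystream(key: bytes, length: int) -> bytes:
    # toy keystream from chained SHA-256 blocks (NOT cryptographically sound)
    out, block = b"", key
    while len(out) < length:
        block = hashlib.sha256(block).digest()
        out += block
    return out[:length]

def xor_cipher(key: bytes, data: bytes) -> bytes:
    # XOR with the keystream; applying it twice with the same key restores data
    ks = keystream(key, len(data))
    return bytes(a ^ b for a, b in zip(data, ks))

key = b"shared-secret"
ct = xor_cipher(key, b"temp=28.5")  # "encrypt"
pt = xor_cipher(key, ct)            # same key "decrypts"
print(pt)  # b'temp=28.5'
```

Real deployments use AES (typically via TLS); the point here is only that one shared key works in both directions.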


sequenceDiagram
    participant Sensor
    participant Network
    participant Cloud

    Sensor->>Network: Temp = 28.5 (Plaintext)
    Network-->>Cloud: Temp = 28.5

    Note over Network: Data can be intercepted

    Sensor->>Network: AES(TLS): Encrypted Payload
    Network-->>Cloud: Encrypted Payload (TLS)
    Cloud-->>Cloud: Decrypt & Store

Encryption plays a critical role in securing IoT systems beyond the device level. Here’s how it applies across the upper layers of the stack:


  • Data in Transit
  • Data at Rest

Cloud & IAM Layer – Secrets and Identity

Purpose: Secure identity tokens, secrets, and environment variables.

Best Practices:

  • Encrypt secrets using cloud-native KMS (e.g., AWS KMS, Azure Key Vault)
  • Use tools like HashiCorp Vault to manage secrets
  • Apply token expiration and rotation policies

#encryption #hashing #secrets

IoT Data Privacy

  • IoT devices continuously collect highly sensitive data
    • Location, biometrics, behavior, health signals
  • Data collection is often passive and invisible
    • Users lack control, visibility, and consent clarity
  • The risk is not theoretical
    • Regulatory fines, legal exposure, reputation damage

GDPR (EU)

Applies when processing personal data of individuals in the EU.

Focus: Consent, Right to access/erase, Data minimization, Security by design, Data portability.

HIPAA (USA)

Applies to Protected Health Information (PHI).

Focus: Confidentiality, Integrity, Availability of electronic health data.

Requires Business Associate Agreements if third parties handle data.


How to Implement Privacy in IoT Systems

Privacy by Design

  • Collect only necessary data
  • Anonymize/pseudonymize where possible
  • Use edge processing to reduce data sent to cloud

Security Practices

  • Encrypted storage & transport (TLS 1.3)
  • Mutual authentication (cert-based, JWT)
  • Secure boot & firmware validation

User Controls

  • Explicit opt-in for data collection
  • Transparent data usage policies
  • Easy delete/download of personal data

Audit & Monitoring

  • Logging access to sensitive data
  • Regular privacy impact assessments

What Industry is Doing Now

| Company/Platform | What They Do |
|---|---|
| Apple | Local processing for Siri; minimal cloud usage |
| Google Nest | Centralized cloud with opt-out data sharing |
| AWS IoT Core | Fine-grained access policies, audit logging |
| Azure IoT | GDPR-compliant SDKs; data residency controls |
| Fitbit (Google) | HIPAA-compliant services for health data |

Pros & Cons of IoT Privacy Measures

| Pros | Cons |
|---|---|
| Builds trust with users | May increase latency (edge compute) |
| Avoids fines & legal issues | Higher infra cost (storage, encryption) |
| Enables secure ecosystems | Limits on innovation using personal data |
| Competitive differentiator | Complex to manage cross-border compliance |

Data Masking

This is about obfuscating sensitive info during storage, transit, or access.

Types

  • Static masking: Permanent (e.g., obfuscating device ID at ingestion)
  • Dynamic masking: At query time (e.g., show only last 4 digits to analysts)
  • Tokenization: Replacing values with reversible tokens
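A hypothetical Python sketch of the three styles (function names and formats are illustrative):

```python
import hashlib
import secrets

def static_mask(device_id: str) -> str:
    # static masking: irreversible, applied once at ingestion
    return hashlib.sha256(device_id.encode()).hexdigest()[:12]

def dynamic_mask(serial: str) -> str:
    # dynamic masking: applied at query time, show only the last 4 characters
    return "*" * (len(serial) - 4) + serial[-4:]

_token_vault: dict[str, str] = {}

def tokenize(value: str) -> str:
    # tokenization: reversible, mapping kept in a secure vault
    token = secrets.token_hex(8)
    _token_vault[token] = value
    return token

def detokenize(token: str) -> str:
    return _token_vault[token]

print(dynamic_mask("SN-8842-9913"))  # ********9913
```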

Use Cases

  • Sharing data with 3rd parties without exposing PII
  • Minimizing insider threats
  • Compliance with HIPAA/GDPR

Tools & Approaches

  • Telegraf Preprocessor modules (Static Masking)
  • SQL-level masking (e.g., MySQL, SQL Server)
  • API gateways that redact fields
  • Custom middleware that masks data at stream-level (e.g., MQTT → InfluxDB)
[ IoT Device ]
    |  (Sensor Data)
    |  + TLS + Cert Auth
    v
[ Edge Layer ]
    - Filtering
    - Aggregation
    - Static Masking
    - Anonymization
    |
    v
[ Message Broker (MQTT/Kafka) ]
    - Encrypted Transport (TLS)
    - AuthN/AuthZ
    |
    v
[ Stream Processing Layer ]
    - Data Validation
    - Tokenization
    - Enrichment
    |
    v
[ Storage Layer ]
    - Encrypted Storage
    - Partitioned Data
    - Masked Fields
    |
    v
[ Access Layer ]
    - Dynamic Masking
    - Role-Based Access
    |
    v
[ Applications / Dashboard ]
    - Limited Views
    - User Consent Controls

#privacy #hipaa #masking

Auditing in IoT

Auditing in IoT means recording who did what, when, from where, and to which device or data so incidents can be investigated and compliance requirements can be met.

Why Auditing Matters

IoT environments are hard to trust because devices are distributed, long-lived, and often remotely managed.
Without proper audit trails, you cannot reliably answer:

  • Who accessed sensitive data
  • Who changed device configuration
  • Which API triggered a device action
  • Whether a firmware update was authorized
  • How an incident spread across systems

What to Audit

Device Activity

  • Device boot and shutdown events
  • Sensor status changes
  • Configuration changes
  • Local authentication attempts
  • Connectivity loss and recovery
  • Error and fault conditions

Data Access

  • Who accessed sensitive data
  • What data was accessed
  • When it was accessed
  • Whether it was viewed, exported, modified, or deleted

API Usage

  • Authentication attempts
  • Token usage
  • Read and write operations
  • Bulk exports
  • Failed requests
  • Rate limit violations

Firmware and Remote Control

  • Firmware update start and completion
  • Firmware version changes
  • Update source and signature verification result
  • Remote commands issued to devices
  • Command success or failure

Best Practices

Use Tamper-Resistant Logging

  • Store logs in append-only or write-once storage
  • Restrict log deletion and modification
  • Digitally sign critical audit records where needed

Standardize Time

  • Sync systems with NTP
  • Use UTC timestamps consistently
  • Record time with enough precision for investigations

Add Correlation IDs

  • Attach a correlation ID to each request or workflow
  • Propagate that ID across device, broker, API, processing, and dashboard layers
  • This makes incident tracing much easier
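A minimal sketch of an audit record carrying a correlation ID (field names are illustrative):

```python
import json
import time
import uuid

def audit_event(action: str, actor: str, correlation_id: str) -> str:
    # one JSON line per security-relevant action
    record = {
        "ts": time.time(),                 # epoch seconds; sync clocks via NTP
        "correlation_id": correlation_id,  # same ID across every layer
        "actor": actor,
        "action": action,
    }
    return json.dumps(record)

cid = str(uuid.uuid4())  # generated once at the API edge, then propagated
print(audit_event("firmware.update.start", "device-42", cid))
print(audit_event("firmware.update.done", "device-42", cid))  # same ID
```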

Log Enough, But Not Everything

  • Capture security-relevant actions
  • Avoid dumping unnecessary personal data into logs
  • Mask or hash sensitive values when possible

Separate Audit Logs from Application Logs

  • Application logs help debugging
  • Audit logs support accountability, forensics, and compliance
  • Do not mix them carelessly

Common Tools

ELK Stack

  • Elasticsearch for indexing and search
  • Logstash for ingest and transformation
  • Kibana for dashboards and investigation

Good for:

  • Large-scale search
  • Centralized log analytics
  • Security investigations

Grafana

  • Lightweight alternative for log aggregation and visualization (commonly paired with Loki for log storage)
  • Often simpler to operate than a full ELK stack

Good for:

  • Smaller teams
  • Cost-conscious environments
  • Fast operational dashboards

Retention Policies

Retention should balance:

  • Compliance needs
  • Security investigation needs
  • Storage cost
  • Privacy risk

Example Retention Guidelines

| Data Type | Retention Period |
|---|---|
| Raw sensor data | 7 to 30 days |
| Aggregated metrics | 6 to 12 months |
| User consent logs | 5 to 7 years |
| Health-related regulated data | 6+ years, depending on policy and law |

Storage Strategy

Use tiered storage so data moves through stages over time:

  • Hot for recent searchable data
  • Warm for less frequently accessed data
  • Cold for long-term retention
  • Delete after policy expiry

Enforcement Mechanisms

  • Object storage lifecycle policies
  • Blob storage lifecycle rules
  • Database TTL where supported
  • Scheduled archival and purge jobs

InfluxDB and TTL

For time-series workloads, TTL-style retention is useful because raw IoT telemetry grows fast.

Typical pattern:

  • Keep high-resolution raw data for a short period
  • Downsample into hourly or daily aggregates
  • Retain aggregates much longer
  • Expire raw data automatically

This reduces:

  • Storage cost
  • Query load
  • Compliance risk from over-retention
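The keep-raw-briefly, retain-aggregates-longer pattern above can be sketched in pure Python (the 7-day TTL is an assumed value):

```python
from collections import defaultdict

RAW_TTL_SECONDS = 7 * 24 * 3600  # assumed: keep raw telemetry for 7 days

def expire_raw(points, now):
    # points: list of (epoch_seconds, value); drop anything past the TTL
    return [(ts, v) for ts, v in points if now - ts < RAW_TTL_SECONDS]

def downsample_hourly(points):
    # collapse raw readings into hourly averages, retained much longer
    buckets = defaultdict(list)
    for ts, v in points:
        buckets[ts // 3600].append(v)
    return {hour * 3600: sum(vs) / len(vs) for hour, vs in buckets.items()}

pts = [(0, 10.0), (1800, 20.0), (3600, 30.0)]
print(downsample_hourly(pts))  # {0: 15.0, 3600: 30.0}
```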

#auditing #ttl #influxdb

Edge Computing

  1. Introduction
  2. Edge Decision Patterns
  3. Edge Data & Consistency Challenges
  4. Edge System Design Checklist

Introduction

Edge computing enables data processing closer to the data source, reducing latency, bandwidth usage, and dependency on centralized cloud systems.

It is increasingly critical in systems that require real-time decision-making, offline capability, and AI inference at the edge.


Use Cases

Autonomous Vehicles
Process sensor data locally for real-time decisions (braking, steering), while periodically syncing models and telemetry with the cloud.

Smart Cities
Traffic lights and surveillance systems process data locally to reduce latency, while aggregated insights are sent to the cloud for planning.

Industrial Automation
Machines perform real-time monitoring and anomaly detection at the edge, with cloud used for long-term analytics and optimization.

Healthcare
Wearables and medical devices analyze patient vitals locally for immediate alerts, reducing reliance on continuous connectivity.

Agriculture
IoT sensors process soil and weather data locally to trigger irrigation decisions, minimizing cloud dependency in remote areas.

Supply Chain / Warehousing
Edge systems track inventory and movement in real time, while cloud systems handle forecasting and optimization.


Edge vs Cloud Responsibility

| Layer | Responsibility |
|---|---|
| Edge | Real-time processing, filtering, immediate decisions |
| Fog | Aggregation, intermediate processing |
| Cloud | Storage, analytics, model training, long-term insights |

  • AWS Greengrass
  • Azure IoT Edge
  • K3s (Lightweight Kubernetes for edge clusters)
  • NVIDIA Jetson (Edge AI hardware)
  • TensorFlow Lite / ONNX Runtime (Edge ML inference)
  • Apache IoTDB (Time-series storage for IoT)

Challenges in Edge Computing

Security Risks
Devices are physically exposed and harder to secure than centralized systems.

Device Management
Firmware updates, patching, and lifecycle management across thousands of devices is complex.

Scalability
Coordinating distributed edge nodes requires robust orchestration.

Interoperability
Heterogeneous devices and protocols complicate integration.

Observability
Monitoring and debugging distributed edge systems is difficult.

Network Reliability
Systems must handle intermittent connectivity and operate offline.

Model Drift (AI Systems)
Edge-deployed models can degrade over time without proper retraining and updates.

#edgecomputing

Edge Decision Patterns

Edge systems are not just about processing data; they must also decide what to handle locally and what to send to the cloud.

Patterns

Filter at Edge

  • Send only important data
  • Example: Send temperature only if > 50°C

Aggregate at Edge

  • Combine data before sending
  • Example: Send hourly average instead of raw stream

Act at Edge

  • Immediate action without cloud
  • Example: Turn off machine if overheating

Forward to Cloud

  • Send raw or enriched data for analytics
  • Example: ML training data
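The four patterns can be sketched for a single temperature stream (thresholds and the 3-reading "hour" are illustrative):

```python
THRESHOLD_C = 50.0
buffer: list[float] = []

def handle_reading(temp_c: float):
    actions = []
    if temp_c > 90.0:
        actions.append("shutdown-machine")   # act at edge: immediate, no cloud
    if temp_c > THRESHOLD_C:
        actions.append(f"send:{temp_c}")     # filter at edge: only important data
    buffer.append(temp_c)                    # aggregate at edge
    if len(buffer) >= 3:                     # pretend this window is "hourly"
        avg = sum(buffer) / len(buffer)
        actions.append(f"send-avg:{avg:.1f}")  # forward summary to cloud
        buffer.clear()
    return actions

print(handle_reading(25.0))  # []
print(handle_reading(60.0))  # ['send:60.0']
print(handle_reading(95.0))  # ['shutdown-machine', 'send:95.0', 'send-avg:60.0']
```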

Why it matters

  • Reduces bandwidth
  • Improves latency
  • Avoids cloud dependency

Offline-First Edge Systems

Edge systems must assume network failure is normal.

Key Concepts

Local Buffering

  • Store data locally when network is down

Retry Mechanisms

  • Send data when connection is restored

Eventual Sync

  • Edge and cloud will sync later

Example

A delivery truck loses connectivity:

  • Continues tracking locally
  • Syncs data when back online

Risk

  • Data duplication
  • Out-of-order events
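Local buffering with retry-on-reconnect can be sketched like this (the class name and capacity are illustrative):

```python
import collections

class EdgeBuffer:
    def __init__(self, capacity=1000):
        # bounded local storage; oldest events drop if capacity is exceeded
        self.queue = collections.deque(maxlen=capacity)

    def record(self, event):
        self.queue.append(event)  # works with or without network

    def flush(self, send):
        # send() may fail mid-flush; unsent events stay queued for the next retry
        while self.queue:
            if not send(self.queue[0]):
                break
            self.queue.popleft()

buf = EdgeBuffer()
buf.record({"id": 1})
buf.record({"id": 2})
sent = []
buf.flush(lambda e: sent.append(e) is None)  # connectivity restored
print(len(sent), len(buf.queue))  # 2 0
```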

Data Reduction at Edge

Sending all raw data to cloud is expensive and unnecessary.

Techniques

Sampling

  • Send every Nth record

Thresholding

  • Send only when values cross limits

Compression

  • Reduce payload size

Feature Extraction

  • Send insights instead of raw data
  • Example: send “anomaly detected” instead of full signal

Benefit

  • Lower bandwidth cost
  • Faster processing

Edge AI (Inference at Edge)

Edge devices can run ML models locally.

What runs at Edge

  • Image classification
  • Anomaly detection
  • Voice recognition

What stays in Cloud

  • Model training
  • Heavy computation
  • Model updates

Example

Security camera:

  • Detects person locally
  • Sends alert instead of full video

Challenge

  • Limited compute power
  • Model updates across devices

Edge Failure Scenarios

Edge systems fail differently than cloud systems.

Common Failures

Device Failure

  • Hardware crash

Network Loss

  • No connectivity to cloud

Data Loss

  • Buffer overflow or corruption

Clock Drift

  • Incorrect timestamps

Design Considerations

  • Retry logic
  • Local storage
  • Idempotent processing
  • Time synchronization

Edge vs Fog vs Cloud

Edge

  • Closest to device
  • Real-time decisions
  • Limited compute

Fog

  • Intermediate layer
  • Aggregation and coordination

Cloud

  • Centralized
  • Storage, analytics, ML training

Example

Smart factory:

  • Edge: machine sensor detects anomaly
  • Fog: aggregates factory data
  • Cloud: long-term analytics

Event-Driven Edge Systems

Edge systems are typically event-driven.

What is an Event?

A change or trigger:

  • Temperature exceeds threshold
  • Motion detected
  • Device status change

Flow

Device → Event → Edge Processing → Action / Cloud

Example

Motion sensor:

  • Detects movement
  • Triggers camera recording
  • Sends alert

Benefit

  • Efficient processing
  • Real-time response

#patterns

Edge Data & Consistency Challenges

Edge systems introduce unique challenges in how data is generated, transmitted, and synchronized across distributed environments.

Unlike centralized systems, edge devices operate independently and may not always be connected to the cloud.


Latency vs Consistency Tradeoff

Edge systems prioritize low latency over strict consistency.

  • Decisions must be made instantly at the edge
  • Cloud may receive delayed or stale data

Example

Smart thermostat:

  • Adjusts temperature immediately (edge)
  • Cloud dashboard updates later

Key Insight

You cannot have both:

  • Real-time responsiveness
  • Perfect global consistency

Time and Ordering Issues

Edge-generated data may arrive out of order.

Why it happens

  • Network delays
  • Offline buffering
  • Device clock differences

Example

Sensor readings:

  • Event at 10:05 arrives first
  • Event at 10:01 arrives later

Impact

  • Incorrect analytics
  • Misleading dashboards

Approach

  • Use event time instead of arrival time
  • Apply windowing or reordering logic
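A small sketch of ordering by embedded event time rather than arrival order:

```python
# readings listed in arrival order; each carries its own event timestamp
readings = [
    {"event_time": "10:05", "value": 103},  # arrived first
    {"event_time": "10:01", "value": 101},  # arrived late (buffered offline)
]

# sort by event time before analytics, not by when the data showed up
ordered = sorted(readings, key=lambda r: r["event_time"])
print([r["event_time"] for r in ordered])  # ['10:01', '10:05']
```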

Idempotency and Duplicate Handling

Edge systems often retry sending data, leading to duplicates.

Why duplicates occur

  • Network retries
  • Device reconnects
  • Message acknowledgment failures

Problem

  • Same event processed multiple times

Solution

  • Use unique event IDs
  • Ensure operations are idempotent

Example

Inventory update should not be applied twice for the same scan.
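The inventory example can be made duplicate-safe with a seen-ID set (a minimal sketch; names are illustrative):

```python
stock = {"apple": 10}
processed_ids: set[str] = set()

def apply_scan(event_id: str, item: str, delta: int):
    if event_id in processed_ids:
        return stock[item]        # duplicate delivery: no second application
    processed_ids.add(event_id)
    stock[item] += delta
    return stock[item]

apply_scan("evt-1", "apple", -3)
apply_scan("evt-1", "apple", -3)  # network retry resends the same event ID
print(stock["apple"])  # 7
```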


State Management at Edge

Edge devices maintain local state that may differ from cloud state.

Types of State

Transient State

  • Buffers, queues, temporary storage

Persistent State

  • Device configuration
  • Local logs

Challenge

  • Keeping edge and cloud in sync

Example

Warehouse scanner:

  • Updates stock locally
  • Syncs later with central system

Offline Data Synchronization

Edge systems must handle delayed synchronization with the cloud.

Behavior

  • Store data locally
  • Sync when connectivity is restored

Risks

  • Duplicate data
  • Conflicts between edge and cloud state

Strategy

  • Conflict resolution rules
  • Versioning or timestamps

Data Integrity and Loss

Data can be lost or corrupted at the edge.

Causes

  • Power failure
  • Storage limits
  • Device crashes

Mitigation

  • Local persistence
  • Checkpointing
  • Retry mechanisms

Summary

Edge systems require careful handling of:

  • Inconsistent data
  • Out-of-order events
  • Duplicate messages
  • Local vs global state

Designing for these challenges is critical for building reliable edge architectures.

#edgeconsistency #challenge

Edge System Design Checklist

Designing edge systems requires balancing latency, reliability, cost, and complexity.
This checklist provides a structured way to evaluate and design edge architectures.


1. Define the Objective

  • What decision needs to be made at the edge?
  • What is the acceptable latency?
  • What happens if the system is offline?

Example

  • Real-time alert → must run at edge
  • Daily report → can be handled in cloud

2. Decide What Runs Where

Clearly separate responsibilities across layers.

| Layer | Responsibility |
|---|---|
| Edge | Real-time processing, filtering, immediate action |
| Fog | Aggregation, coordination |
| Cloud | Storage, analytics, model training |

Key Question

  • Does this require immediate action?
    • Yes → Edge
    • No → Cloud

3. Handle Offline Scenarios

Assume network failure is normal.

  • Can the system operate without cloud?
  • How long can data be stored locally?
  • What happens when storage is full?

Design Patterns

  • Local buffering
  • Retry with backoff
  • Eventual synchronization

4. Design for Data Flow

Define how data moves through the system.

  • What data is filtered at edge?
  • What is aggregated?
  • What is sent to cloud?

Checklist

  • Avoid sending raw high-volume data
  • Send only meaningful events or summaries

5. Plan for Failures

Edge systems fail frequently and unpredictably.

Common Failures

  • Device crash
  • Network loss
  • Data corruption

Design Requirements

  • Retry logic
  • Local persistence
  • Graceful degradation

6. Ensure Idempotency

Duplicate events are unavoidable.

  • Can the same message be processed multiple times safely?
  • Are unique IDs used for events?

Rule

  • Every operation should be safe to repeat

7. Handle Time and Ordering

Data may arrive out of order.

  • Are you using event time or arrival time?
  • Can late-arriving data be handled?

Approach

  • Use timestamps
  • Allow reordering or windowing

8. Manage State

Edge devices maintain local state.

  • What state is stored locally?
  • How is it synced with the cloud?

Considerations

  • State conflicts
  • Versioning
  • Recovery after restart

9. Design for Security

Edge devices are exposed and vulnerable.

  • Is data encrypted in transit?
  • Are devices authenticated?
  • Can devices be compromised physically?

Minimum Requirements

  • Secure communication (TLS)
  • Device identity
  • Access control

10. Plan Observability

You cannot fix what you cannot see.

  • Can you monitor device health?
  • Are logs available centrally?
  • Can failures be traced?

Metrics to Track

  • Device uptime
  • Data throughput
  • Error rates

11. Consider Cost Tradeoffs

Edge shifts cost from cloud to devices.

  • Is edge hardware justified?
  • Is bandwidth reduction significant?

Example

  • Video streaming → process at edge, send alerts only

12. Think About Scale

Edge systems grow fast.

  • Can you manage thousands of devices?
  • How are updates deployed?

Challenges

  • Firmware updates
  • Configuration management
  • Fleet monitoring

Final Thought

A good edge system is not just about processing data locally.
It is about designing for:

  • Unreliable networks
  • Distributed state
  • Continuous failure

The best designs assume things will break and still work.

#edgedesign #checklist

IoT Cloud Computing

  1. Introduction
  2. Consistency Models
  3. Cloud Services
  4. IoT Cloud Services
  5. High Availability
  6. Disaster Recovery
  7. Pros and Cons
  8. IFTTT


IoT Cloud Computing

Definitions

Hardware: physical computer / equipment / devices

Software: programs such as operating systems, Word, Excel

Web Site: Read-only web pages, such as company pages, portfolios, newspapers

Web Application: Read-write pages, such as online forms, Google Docs, email, Google apps


Advantages of Cloud for IoT

| Category | Advantage | Description |
|---|---|---|
| Scalability | Elastic infrastructure | Easily handle millions of IoT devices and sudden traffic spikes |
| Storage | Virtually unlimited data storage | Ideal for time-series sensor data, logs, images, video streams |
| Processing Power | High compute availability | Offload heavy ML, analytics, and batch processing to cloud |
| Integration | Seamless with APIs, services | Easily connect to AI/ML tools, databases, event processing |
| Cost Efficiency | Pay-as-you-go model | No upfront infra cost; optimize for usage |
| Global Reach | Edge zones and regional data centers | Connect globally distributed devices with low latency |
| Security | Built-in IAM, encryption, monitoring | Token-based auth, TLS, audit logs, VPCs |
| Rapid Development | PaaS tools and SDKs | Build, test, deploy faster using managed services |
| Maintenance-Free | No server management | Cloud handles uptime, patches, scaling |
| Disaster Recovery | Redundancy and backup | Automatic replication and geo-failover |
| Data Analytics | Integrated analytics platforms | Use BigQuery, Databricks, AWS Athena etc. for deep insights |
| Updates & OTA | Easy over-the-air updates to devices | Roll out firmware/software updates via cloud |
| Digital Twins | Model, simulate, and control remotely | Create cloud-based digital representations of devices/systems |

Types of Cloud Computing in IoT Context

Public Cloud (AWS, Azure, GCP, etc.)

Usage: Most common for IoT startups, scale-outs, and global deployments.

  • Easy to onboard devices via managed IoT hubs
  • Global reach with edge zones
  • Rich AI/ML toolsets (SageMaker, Azure ML, etc.)

Example: A smart home company using AWS IoT Core + DynamoDB.

Private Cloud

Usage: Enterprises with strict data policies (e.g., manufacturing, healthcare).

  • More control over data residency
  • Can comply with HIPAA, GDPR, etc.
  • Custom security and network setups

Example: A hospital managing patient monitoring devices on their private OpenStack cloud.

Hybrid Cloud

Usage: Popular in industrial IoT (IIoT) and smart infrastructure.

  • Store sensitive data on-prem (private), offload non-critical analytics to cloud (public)
  • Low latency control at the edge, cloud for training ML models

Example: A smart grid using on-prem SCADA + Azure for demand prediction.

Cloud Types in IoT – Comparison

| Cloud Type | Description | IoT Example | Advantages | Ideal For |
|---|---|---|---|---|
| Public Cloud | Hosted by providers like AWS, Azure, GCP | Smart home devices using AWS IoT Core | Scalable, global reach, pay-as-you-go | Startups, large-scale consumer IoT |
| Private Cloud | Dedicated infra for one org (e.g., on-prem OpenStack) | Hospital storing patient monitoring data securely | More control, security, compliance | Healthcare, government, regulated industries |
| Hybrid Cloud | Mix of public + private with data/apps moving between | Factory with local control + cloud analytics | Flexibility, optimized costs, lower latency | Industrial IoT, utilities, smart cities |

#cloud #aws #azure #gcp

Consistency Models

Eventual Consistency

A model where updates to data propagate across distributed nodes asynchronously. Temporary inconsistencies are allowed, but all replicas will eventually converge to the same state.

Example A smart vehicle updates its GPS location while offline. The cloud reflects the update once connectivity is restored.

Use Cases

  • Smart home devices
  • Vehicle tracking systems
  • Environmental monitoring

Limitations

  • Not suitable for financial systems or real-time critical decisions

Read-Your-Writes (RYW)

Once a client performs a write, all subsequent reads by that client must reflect that write.

Example A user turns OFF a smart light and immediately sees the updated OFF state in the app.

Use Cases

  • Device control systems
  • User-facing dashboards

Limitations

  • Requires session or client-level tracking

Monotonic Reads

Once a value is read, subsequent reads should never return an older value.

Example

| Time | Reading |
|---|---|
| 10:00 | 102 kWh |
| 10:01 | 103 kWh |
| 10:02 | 101 kWh ❌ |
| 10:03 | 104 kWh |

Use Cases

  • Energy meters
  • GPS tracking
  • Time-series monitoring

Limitations

  • Requires ordering guarantees across replicas

Causal Consistency

Ensures that causally related operations are observed in the correct order across the system.

Example Door opened > Alarm disabled
If reversed, system behavior becomes incorrect.

Use Cases

  • Security systems
  • Workflow-based automation

Limitations

  • Harder to implement than eventual consistency

Last Write Wins (LWW)

When multiple updates occur, the update with the most recent timestamp overwrites previous values.

Example Two users control the same smart light. The latest command determines the final state.

Use Cases

  • Smart home controls
  • IoT dashboards

Limitations

  • Risk of losing valid updates due to clock skew
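A minimal LWW sketch; in real systems, clock skew between devices can let a genuinely older write carry the "newest" timestamp:

```python
state = {"light": ("OFF", 0)}  # value, timestamp

def lww_update(key, value, ts):
    # keep only the write with the most recent timestamp
    _, current_ts = state[key]
    if ts >= current_ts:
        state[key] = (value, ts)

lww_update("light", "ON", 100)
lww_update("light", "OFF", 90)  # older write arrives late: ignored
print(state["light"][0])  # ON
```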

Optimistic Concurrency

Allows multiple updates without locking resources. Conflicts are detected after execution, and one operation may need to retry.

Example

| item_id | item_nm | stock |
|---|---|---|
| 1 | Apple | 10 |

Two users update simultaneously:

  • +5 and -3 applied concurrently
  • Conflict detected > one retries
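The detect-then-retry flow above can be sketched as a simplified version check (a toy compare-and-set, not a real database API):

```python
row = {"stock": 10, "version": 1}

def try_update(read_version, delta):
    # commit only if nobody else changed the row since we read it
    if row["version"] != read_version:
        return False              # conflict: another update committed first
    row["stock"] += delta
    row["version"] += 1
    return True

v = row["version"]
try_update(v, +5)                 # first writer commits
ok = try_update(v, -3)            # second writer conflicts...
if not ok:
    try_update(row["version"], -3)  # ...re-reads and retries
print(row["stock"])  # 12
```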

Use Cases

  • Low-conflict environments
  • User-driven updates

Limitations

  • Not suitable for high-frequency concurrent writes

Strong Consistency

All reads return the most recent write immediately across all nodes.

Example Bank transaction reflects instantly across all systems.

Use Cases

  • Financial systems
  • Critical control systems

Limitations

  • Higher latency
  • Reduced availability in distributed systems

Session Consistency

Guarantees consistency within a single session but not across different clients.

Example A user sees consistent device state during a session, but another user may see stale data.

Use Cases

  • Mobile apps
  • User-specific IoT dashboards

Limitations

  • Not globally consistent

Bounded Staleness

Allows reads to lag behind writes by a defined time or number of versions.

Example A dashboard may show data up to 5 seconds old.

Use Cases

  • Monitoring dashboards
  • Analytics systems

Limitations

  • Requires defining acceptable staleness window

Term Mapping in IoT Context

| Concept | Relevance in IoT |
|---|---|
| Eventual Consistency | Edge devices syncing after offline periods |
| Read-Your-Writes | Immediate feedback for device control |
| Monotonic Reads | Prevents backward movement in sensor readings |
| Causal Consistency | Maintains correct event order in automation |
| Last Write Wins | Resolves conflicting device updates |
| Optimistic Concurrency | Handles rare update conflicts |
| Strong Consistency | Required for critical operations |
| Session Consistency | Ensures stable user experience |
| Bounded Staleness | Balances freshness and performance |

#consistency #eventual

Cloud Services

SaaS – Software as a Service

SaaS provides ready-to-use cloud applications. Example: Google Docs, Gmail. In IoT, it offers real-time dashboards, alerts, and analytics.

Pros

  • No infrastructure management
  • Fast deployment
  • Built-in analytics and alerts

Cons

  • Limited customization
  • Possible vendor lock-in
  • Data stored in vendor cloud

PaaS – Platform as a Service

PaaS provides the tools and services to build and deploy IoT apps, including SDKs, APIs, device management, rules engines, and ML pipelines.

Example: HiveMQ (MQTT)

Pros

  • Scalable and customizable
  • Device lifecycle and security handled
  • Integration with ML, analytics tools

Cons

  • Learning curve
  • Requires cloud expertise
  • Still dependent on vendor ecosystem

IaaS – Infrastructure as a Service

IaaS gives you virtual machines, storage, and networking. In IoT, it lets you build fully custom pipelines from scratch.

Example: Virtual Machine

Pros

  • Full control over environment
  • Highly customizable
  • Can install any software

Cons

  • You manage everything: scaling, patching, backups
  • Not beginner-friendly
  • Higher ops burden

FaaS – Function as a Service

FaaS lets you run small pieces of code (functions) in response to events, like an MQTT message or sensor spike. Also called serverless computing.

Example: AWS Lambda, Azure Functions

When a temperature sensor sends a value > 90°C to MQTT, a Lambda function triggers an alert and stores the value in a DB.
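A hypothetical handler sketch for that flow; the event shape, field names, and return value are illustrative, not the actual AWS Lambda or IoT Core API:

```python
def handler(event, context=None):
    # event-driven function: runs once per incoming MQTT message
    temp = float(event["payload"]["temperature"])
    if temp > 90.0:
        # signal an alert and hand the value off for DB storage
        return {"alert": True, "store": {"temperature": temp}}
    return {"alert": False}

print(handler({"payload": {"temperature": 95.2}}))
```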

Pros

  • No need to manage servers
  • Scales automatically
  • Event-driven and cost-effective

Cons

  • Cold start delays
  • Limited execution time and memory
  • Stateless only

#faas #saas #paas #iaas

IoT Cloud Services

BaaS – Backend as a Service

BaaS provides backend features like authentication, real-time databases, and cloud functions, useful for mobile or lightweight IoT apps.

Example: Firebase. To some extent OAuth services like Google.

Pros

  • Easy to integrate with mobile/web apps
  • Realtime sync and authentication
  • Fast prototyping

Cons

  • Not designed for heavy industrial use
  • Vendor limitations on structure/storage
  • Less control over backend logic

DaaS – Device as a Service

DaaS bundles hardware devices with software, support, and cloud services, often with subscription billing.

A logistics company rents connected GPS trackers from a provider, who also offers a dashboard and device monitoring as part of the plan.

Analogy: renting (a house, a car, etc.)

Pros

  • No hardware management
  • Subscription model (OpEx > CapEx)
  • Full-stack support

Cons

  • Ongoing cost
  • Tied to specific hardware/software ecosystem
  • Less flexibility

Edge-aaS – Edge-as-a-Service

Edge-aaS enables local processing at the edge, closer to IoT devices. It reduces latency and bandwidth usage by handling logic locally.

Example: AWS Greengrass

Run Everything Locally

  • Camera sends input to a Raspberry Pi
  • A Greengrass Lambda function processes it in real time
  • The result (e.g., “object: person”) can be:
    • Logged locally
    • Sent to AWS via MQTT
    • Used to trigger a notification

Pros

  • Low latency, offline capable
  • Reduces cloud traffic and cost
  • Supports on-device inference

Cons

  • More complex deployment
  • Device resource limitations
  • Must sync carefully with cloud

DTaaS – Digital Twin as a Service

DTaaS offers cloud-hosted platforms to create, manage, and simulate digital replicas of physical systems (machines, buildings, etc.).

Example: Siemens MindSphere

A manufacturing firm models its conveyor system using MindSphere to monitor, predict failures, and optimize throughput using simulated conditions.

For intuition: think of a flight simulator or a video game simulator.
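Stripped to its core idea, a digital twin is a cloud-side object that mirrors the last reported state of a physical asset and exposes the gap between desired and reported state. A toy sketch, not the MindSphere API:

```python
class DigitalTwin:
    """Minimal twin: mirrors reported state, tracks desired state."""
    def __init__(self):
        self.reported = {}  # last state the device sent
        self.desired = {}   # state operators want the device in

    def report(self, **state):
        self.reported.update(state)

    def delta(self):
        """Settings the device still has to apply."""
        return {k: v for k, v in self.desired.items()
                if self.reported.get(k) != v}

belt = DigitalTwin()
belt.desired = {"speed_rpm": 1200}
belt.report(speed_rpm=900, temp_c=41.0)
print(belt.delta())  # {'speed_rpm': 1200} -> conveyor must speed up
```

Real platforms layer simulation and analytics on top, but this reported/desired split is the same pattern device shadows use.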

Pros

  • Powerful simulation and monitoring
  • Real-time mirroring of assets
  • Integrates well with AI/ML

Cons

  • Complex to model accurately
  • Requires continuous data flow
  • Can be costly at scale

Cloud Service Models for IoT

| Service Model | Full Form | IoT-Specific Role / Usage | Examples |
|---|---|---|---|
| SaaS | Software as a Service | Ready-to-use IoT dashboards, analytics, asset tracking | Ubidots, ThingSpeak, AWS SiteWise, Azure IoT Central |
| PaaS | Platform as a Service | Build, deploy, manage IoT apps with SDKs and device APIs | Azure IoT Hub, AWS IoT Core, Google Cloud IoT (legacy), Kaa IoT |
| IaaS | Infrastructure as a Service | Run VMs, store raw sensor data, scale infra | AWS EC2, Azure VMs, GCP Compute Engine |
| FaaS | Function as a Service | Event-driven micro-processing (e.g., react to MQTT events) | AWS Lambda, Azure Functions, Google Cloud Functions |
| DaaS | Device as a Service | Subscription-based hardware + cloud updates | Cisco DaaS, HP DaaS |
| BaaS | Backend as a Service | Auth, DB, messaging backend for IoT apps | Firebase, Parse Platform |
| Edge-aaS | Edge-as-a-Service | Run ML + logic at the edge, sync with cloud | AWS Greengrass, Azure IoT Edge, ClearBlade |
| DTaaS | Digital Twin as a Service | Simulate, monitor, and control physical devices virtually | Siemens MindSphere, PTC ThingWorx |

#saas #paas #iaas #faas #baas


High Availability

High Availability refers to how much uptime (availability) a system guarantees over a period — usually per year.

It’s expressed using “nines” — like 99%, 99.9%, etc. More 9’s = Less downtime.


Availability Formula

  • Availability = (Total Time - Downtime) / Total Time

This formula is used in SLAs and monitoring systems to measure system reliability.
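The formula translates directly into a small helper that also converts “nines” into allowed downtime. The function names are illustrative:

```python
def availability(total_hours: float, downtime_hours: float) -> float:
    """Availability = (Total Time - Downtime) / Total Time."""
    return (total_hours - downtime_hours) / total_hours

def allowed_downtime_minutes_per_year(pct: float) -> float:
    """Convert an availability percentage into allowed yearly downtime."""
    year_minutes = 365 * 24 * 60  # 525,600 minutes in a non-leap year
    return year_minutes * (1 - pct / 100)

# 8.76 hours of downtime in a 8,760-hour year -> 99.9% ("three nines")
print(round(availability(8760, 8.76) * 100, 1))            # 99.9
# "Four nines" allows roughly 52.6 minutes of downtime per year
print(round(allowed_downtime_minutes_per_year(99.99), 1))  # 52.6
```

These two numbers match the “three nines” and “four nines” rows in the table below.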


High Availability – Nines and Downtime

| Availability | Name | Allowed Downtime per Year | Per Month | Use Case Example |
|---|---|---|---|---|
| 99% | “Two nines” | ~3.65 days | ~7.2 hours | Small apps, dev/test environments |
| 99.9% | “Three nines” | ~8.76 hours | ~43.8 mins | Basic web services |
| 99.99% | “Four nines” | ~52.6 minutes | ~4.38 mins | Payment systems, APIs |
| 99.999% | “Five nines” | ~5.26 minutes | ~26.3 seconds | Medical, Telco, IoT control loops |
| 99.9999% | “Six nines” | ~31.5 seconds | ~2.63 seconds | Mission-critical systems |

How High Availability is Achieved

  • Redundancy (multiple servers or instances)
  • Failover mechanisms (automatic switching)
  • Load balancing
  • No single point of failure
  • Multi-region deployments
  • Continuous monitoring and auto-recovery
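A failover mechanism — the second bullet above — can be sketched in a few lines: try the primary endpoint, and automatically switch to a standby when it fails. The endpoints and handlers here are made up for illustration:

```python
def send_with_failover(payload, endpoints):
    """Try each endpoint in priority order; return the first success."""
    errors = []
    for name, handler in endpoints:
        try:
            return name, handler(payload)
        except ConnectionError as exc:
            errors.append((name, exc))  # record, fall through to standby
    raise RuntimeError(f"all endpoints failed: {errors}")

def primary(payload):   # simulate a failed primary region
    raise ConnectionError("primary region down")

def standby(payload):   # standby region accepts the write
    return f"stored:{payload}"

used, result = send_with_failover("temp=21.5",
                                  [("primary", primary), ("standby", standby)])
print(used, result)  # standby stored:temp=21.5
```

Production systems push this logic into load balancers and DNS, but the priority-ordered retry is the same idea.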

For IoT

  • Smart Home Light Bulb → 99% is okay (a few hours of downtime is fine)
  • Smart Grid Control System → 99.999% is essential (every second counts)
  • Medical IoT (e.g., Heart Monitor) → Needs high availability

Beyond Just Nines

| Concept | Why It Matters in IoT + Cloud |
|---|---|
| Redundancy | Backup sensors, edge nodes, and cloud instances keep the system running if one fails |
| Failover Systems | Automatically switch to standby components during failure |
| Load Balancing | Spreads traffic across devices or cloud zones to prevent overload |
| Latency vs Availability | A service may be “up” but still slow — availability ≠ performance |
| Disaster Recovery (DR) | Ensures systems and data can recover from outages or disasters |
| Geographic Distribution | Spreading across regions/availability zones improves uptime and resilience |
| SLA (Service Level Agreement) | Understand what cloud vendors promise and how much downtime you’re actually allowed |
| Edge Processing | Enables critical operations to continue even if the cloud is unreachable (e.g., AWS Greengrass) |
| Monitoring & Alerting | Detect and respond to failures fast using tools like CloudWatch, Datadog, Prometheus |
| Cost vs HA Tradeoff | Higher availability usually means higher cost — design smart based on the use case |

Fun Discussion Pointers

For each system below, decide: does it need edge computing or fog computing, should it go to the cloud, and if so, how many 9’s does it need?

  • How many 9’s do we need for a smart light switch at home?
  • How many 9’s do we need for a smart light switch at a bank ATM?
  • A temperature sensor on a cold-storage truck is sending data to the cloud. How many 9’s does it need?
  • You’re designing an IoT wearable for elderly patients that detects falls. What should the design be?
  • What happens if the MQTT broker goes down? How would you make it fault-tolerant?
  • A weather station publishes sensor data every 15 minutes. Does it need a highly available system?

#ha #sla


Disaster Recovery

What is Disaster Recovery in IoT?

Disaster Recovery (DR) in IoT refers to the process of restoring devices, communication, and data pipelines after failures affecting both physical and digital components.

These failures include:

  • Device crashes or firmware corruption
  • Network outages (edge ↔ cloud disconnect)
  • Gateway / Fog node failures
  • Cloud region outages
  • Cyberattacks (e.g., ransomware, botnets)

Disaster Recovery vs High Availability (HA)

  • High Availability (HA)
    Focuses on preventing downtime
    Systems continue running with minimal interruption

  • Disaster Recovery (DR)
    Focuses on recovering after failure
    Accepts downtime but minimizes recovery impact

Simple View:

  • HA = Avoid failure
  • DR = Recover from failure

Why Disaster Recovery is Important in IoT

  • Physical Impact
    Failures can affect real-world systems
    Example: Smart grid, healthcare devices

  • Device State Recovery
    Requires restoring firmware, configs, and device identity

  • Connectivity Constraints
    Devices may go offline frequently

  • Data Integrity
    Missing telemetry can impact analytics and ML models


Types of Disaster Recovery Strategies

1. Backup and Restore

  • Periodic backups of data and configurations
  • Systems restored after failure

Pros:

  • Low cost
  • Simple implementation

Cons:

  • High recovery time
  • Possible data loss

Example:
Smart home system restoring device configs from cloud backup

2. Pilot Light

  • Minimal system always running in another region
  • Scaled up during disaster

Pros:

  • Faster recovery than backup
  • Cost-efficient

Cons:

  • Requires scaling during recovery

Example:
IoT backend with minimal services active in secondary region


3. Warm Standby

  • Fully functional but scaled-down system running

Pros:

  • Faster recovery
  • Moderate cost

Cons:

  • Not instant failover

Example:
Industrial monitoring system with standby cloud environment


4. Active-Active (Multi-Region)

  • Systems run simultaneously across regions

Pros:

  • Near-zero downtime
  • High resilience

Cons:

  • High cost
  • Complex architecture

Example:
Healthcare IoT system monitoring patients in real time


IoT-Specific Recovery Layers

Device-Level Recovery

  • Local buffering of data
  • Firmware rollback
  • Auto-reconnect mechanisms

Example:
Sensor stores readings locally during outage and syncs later
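The store-and-forward pattern in this example can be sketched as a small buffer: readings accumulate locally while the link is down, then sync in order once it returns. Class and method names are illustrative:

```python
class StoreAndForward:
    """Buffer readings while offline; flush them in order when back online."""
    def __init__(self):
        self.buffer = []

    def record(self, reading, online, send):
        if online:
            send(reading)
        else:
            self.buffer.append(reading)  # outage: keep the reading locally

    def resync(self, send):
        while self.buffer:               # replay oldest first
            send(self.buffer.pop(0))

uplink, sensor = [], StoreAndForward()
sensor.record({"t": 1, "c": 4.0}, online=False, send=uplink.append)
sensor.record({"t": 2, "c": 4.2}, online=False, send=uplink.append)
sensor.resync(uplink.append)             # connectivity restored
print([r["t"] for r in uplink])          # [1, 2] - no data lost, order kept
```

On a real device the buffer would live in flash (with a size cap and an eviction policy), but the offline-first behavior is the same.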


Edge / Fog Recovery

  • Redundant gateways
  • Local processing fallback
  • Sync to cloud after recovery

Example:
Factory continues operations using edge analytics


Cloud Recovery

  • Multi-region deployment
  • Broker failover (MQTT cluster)
  • Stream processing recovery

Example:
Traffic rerouted to secondary region after outage


End-to-End Recovery

  • Restore full pipeline (Device → Edge → Cloud)
  • Replay missed data
  • Restore dashboards and alerts

Example:
Fleet tracking system reconstructs missed routes


Key Concepts

RTO (Recovery Time Objective)

  • Maximum acceptable time to restore system

Examples:

  • Smart home: Minutes
  • Healthcare device: Seconds

RPO (Recovery Point Objective)

  • Maximum acceptable data loss

Examples:

  • Weather station: Few minutes acceptable
  • ICU monitor: Near zero

Backup Types

  • Full Backup – Entire dataset and configurations
  • Incremental Backup – Changes since last backup
  • Differential Backup – Changes since last full backup
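The three backup types differ only in *what* they copy. A toy example with per-day change sets (file names invented) makes the distinction concrete:

```python
# Files changed each day after Sunday's full backup
changed = {
    "mon": {"a.cfg"},
    "tue": {"b.db"},
    "wed": {"a.cfg", "c.log"},
}

# Incremental on Wednesday: changes since the LAST backup (Tuesday's)
incremental_wed = changed["wed"]

# Differential on Wednesday: changes since the last FULL backup (Sunday's)
differential_wed = changed["mon"] | changed["tue"] | changed["wed"]

print(sorted(incremental_wed))   # ['a.cfg', 'c.log']
print(sorted(differential_wed))  # ['a.cfg', 'b.db', 'c.log']
```

The tradeoff follows directly: incrementals are smaller to take but require the full chain to restore, while a differential restores from just the full backup plus the latest differential.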

Replication

  • Synchronous Replication
    Data written to multiple locations simultaneously
    Low data loss, higher latency

  • Asynchronous Replication
    Data replicated with delay
    Faster, but risk of data loss
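The tradeoff between the two modes shows up directly in code: a synchronous write returns only after every replica has stored the value, while an asynchronous write returns after the primary and leaves replicas to catch up. A simplified single-process sketch:

```python
def write_sync(value, replicas):
    """Return only after ALL replicas stored the value (slower, safer)."""
    for r in replicas:
        r.append(value)
    return "acked-by-all"

def write_async(value, primary, replicas):
    """Return after the primary stores it; replicas lag behind."""
    primary.append(value)
    pending = [(r, value) for r in replicas]  # queued; lost if we crash now
    return "acked-by-primary", pending

primary, backup = [], []
status, pending = write_async(42, primary, [backup])
print(status, backup)  # acked-by-primary []  <- backup has not caught up yet
```

The empty `backup` list at the moment the async write returns is exactly the window where data loss can occur.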


Disaster Recovery in Cloud for IoT

  • Multi-region deployments
  • Managed IoT services and brokers
  • Automated backups
  • Infrastructure as Code (IaC)

Example:

  • Primary region processes IoT data
  • Secondary region maintains backup/standby system

Common Challenges

  • Device firmware inconsistencies
  • Offline data conflicts during sync
  • Broker single point of failure
  • Data consistency issues
  • Human error during recovery

Best Practices

  • Define clear RTO and RPO targets
  • Design offline-first devices
  • Implement edge buffering and replay mechanisms
  • Use multi-region deployments
  • Maintain device state/shadow in cloud
  • Automate backups and recovery
  • Regularly test disaster recovery plans

Summary

Disaster Recovery in IoT ensures systems can recover across:

  • Devices
  • Communication layers
  • Data pipelines
  • Cloud infrastructure

A strong DR strategy minimizes downtime, protects data, and maintains continuity of real-world operations.

#dr #RTO #RPO


IoT Cloud – Pros and Cons

Pros

1. Scalability

Cloud platforms can automatically scale to handle millions of devices and events.

Example:
A smart city traffic system can scale from 1,000 sensors to 1 million without redesigning infrastructure.


2. Data Storage & Processing

Virtually unlimited storage with built-in analytics and processing capabilities.

Example:
A fleet management system stores years of GPS and telemetry data to analyze driving patterns and fuel efficiency.


3. Integrated Services

Cloud providers offer ready-made services like ML, streaming, APIs, and dashboards.

Example:
An IoT healthcare app uses cloud ML services to detect anomalies in patient heart rate data in real time.


4. Rapid Development

Developers can build and deploy solutions quickly without managing infrastructure.

Example:
A startup builds a smart irrigation system using managed MQTT brokers and serverless functions within days.


5. Remote Access

Devices and data can be accessed from anywhere.

Example:
A factory manager monitors machine health across multiple plants using a centralized dashboard.


6. Built-in Security Features

Cloud platforms provide encryption, IAM, monitoring, and compliance tools.

Example:
Devices authenticate using certificates, and all data is encrypted using TLS before reaching the cloud.


7. Disaster Recovery & Reliability

Cloud systems offer high availability, backups, and failover mechanisms.

Example:
If one region fails, IoT data pipelines automatically switch to another region with minimal downtime.


Cons

1. Latency

Cloud communication introduces delays, especially for real-time or critical operations.

Example:
An autonomous vehicle cannot rely on cloud decisions for braking due to network delay.


2. Connectivity Dependency

IoT systems depend heavily on stable internet connectivity.

Example:
A smart home system fails to respond if the internet goes down.


3. Privacy Concerns

Sensitive data is transmitted and stored externally, increasing exposure risk.

Example:
Wearable devices sending health data to cloud servers may raise compliance concerns (HIPAA/GDPR).


4. Recurring Costs

Cloud usage incurs ongoing costs for storage, compute, and data transfer.

Example:
A video surveillance system streaming continuously to the cloud results in high monthly bills.


5. Vendor Lock-In

Heavy reliance on a specific cloud provider makes migration difficult.

Example:
Using proprietary IoT services (like device twins or rules engine) makes switching providers complex.


6. System Complexity

Managing distributed systems across device, edge, and cloud increases architectural complexity.

Example:
Debugging data loss across device → gateway → cloud pipeline can be challenging.


7. Data Transfer Costs

Frequent data movement between devices and cloud can become expensive.

Example:
Streaming raw sensor data every second instead of aggregating at the edge increases bandwidth costs significantly.
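The savings from edge aggregation are easy to quantify: averaging one-second samples into one summary per minute cuts the message count 60x. Sample values and rates below are illustrative:

```python
raw = [20.0 + 0.01 * i for i in range(60)]  # one reading per second, one minute

# Cloud-first: ship all 60 raw readings
messages_raw = len(raw)

# Edge-first: aggregate locally, ship one summary per minute
summary = {"min": min(raw), "max": max(raw), "avg": sum(raw) / len(raw)}
messages_agg = 1

print(messages_raw // messages_agg, "x fewer messages")
print(round(summary["avg"], 3))
```

The min/max fields matter: keeping only the average would hide short spikes, so pick the summary statistics to match what the downstream alerts actually need.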


Summary

| Pros | Cons |
|---|---|
| Scalability | Latency |
| Data Storage | Connectivity Dependency |
| Integrated Services | Privacy Concerns |
| Rapid Development | Recurring Costs |
| Remote Access | Vendor Lock-In |
| Security Features | Complexity |
| Disaster Recovery | Data Transfer Costs |

#cloud #pros #cons


IFTTT

If This Then That


IFTTT (If This Then That) is primarily a cloud-based automation platform that connects web services and devices so users can create simple conditional statements, known as applets. An applet lets one service or device trigger actions in another, automating workflows across platforms.

IFTTT facilitates communication between cloud services and edge devices, enabling users to create automations that leverage both cloud-based processing and local edge computing capabilities. However, the core functionality of IFTTT itself remains cloud-centric.
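The usual way device code talks to IFTTT is the Webhooks service: an HTTP POST to a per-user trigger URL fires an applet. A sketch using only the standard library — the event name and key are placeholders, and the actual request is commented out:

```python
import json
from urllib import request

def ifttt_trigger_url(event: str, key: str) -> str:
    """IFTTT Webhooks trigger endpoint for a named event."""
    return f"https://maker.ifttt.com/trigger/{event}/with/key/{key}"

url = ifttt_trigger_url("temp_alert", "YOUR_WEBHOOK_KEY")
payload = json.dumps({"value1": "28.5C"}).encode()

# Firing the applet would be:
# req = request.Request(url, data=payload,
#                       headers={"Content-Type": "application/json"})
# request.urlopen(req)

print(url)
```

Whatever applet is bound to the `temp_alert` event on the IFTTT side (a phone notification, a smart plug toggle, etc.) then runs in the cloud.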

#ifttt


Good Reads

ESP32 - MicroPython : https://github.com/gchandra10/esp32-demo

IoT Arduino Projects

Projecthub

Autodesk

Hackster / Seeed Studio


MQTT Explorer

GUI desktop tool for inspecting MQTT topics & messages

mqtt-explorer

Wokwi

Online Arduino + ESP32 simulator. No hardware needed. VSCode / JetBrains supported.

wokwi

Node Red

Visual flow-based tool for IoT logic and automation

nodered


Career Path

RoadMap

Example: RoadMap for Python Learning


Cloud Providers

Run and code Python in the cloud. Free and affordable plans, good for demonstrations during interviews.

Python Anywhere


Cheap/Affordable GPUs for AI Workloads

RunPod


AI Tools

NotebookLM


Notebooks vs IDE

| Feature | Notebooks (.ipynb) | Python Scripts (.py) |
|---|---|---|
| Use Case - DE | Quick prototyping, visualizing intermediate steps | Production-grade ETL, orchestration scripts |
| Use Case - DS | EDA, model training, visualization | Packaging models, deployment scripts |
| Interactivity | High – ideal for step-by-step execution | Low – executed as a whole |
| Visualization | Built-in (matplotlib, seaborn, plotly support) | Needs explicit code to save/show plots |
| Version Control | Harder to diff and merge | Easy to diff/merge in Git |
| Reusability | Lower, unless modularized | High – can be organized into functions, modules |
| Execution Context | Cell-based execution | Linear, top-to-bottom |
| Production Readiness | Poor (unless using tools like Papermill, nbconvert) | High – standard for CI/CD, Airflow, etc. |
| Debugging | Easy with cell-wise changes | Needs breakpoints/logging |
| Integration | Jupyter, Colab, Databricks Notebooks | Any IDE (VSCode, PyCharm), scheduler integration |
| Documentation & Teaching | Markdown + code | Docstrings and comments only |
| Unit Tests | Not practical | Easily written using pytest, unittest |
| Package Management | Ad hoc, via %pip, %conda | Managed via requirements.txt, poetry, pipenv |
| Using Libraries | Easy for experimentation, auto-reloads supported | Cleaner imports, better for dependency resolution |

Assignments

Note 1: LinkedIn Learning is Free for Rowan Students.

Note 2: Submission should be LinkedIn Learning Certificate URLs. (No Screenshots or Google Docs or Drives)


Assignment 1 - Python


Assignment 2 - Ethical Hacking IoT Devices


Assignment 3 - Learning Git and GitHub


Assignment 4 - Raspberry Pi

Raspberry Pi Essential Training


Assignment 5 - Cloud


Extra Credit Choices (Optional)

(Extra credit should be submitted before the Finals.)


Answers

Chapter 1

  1. For each of the following IoT components, identify whether it belongs to the upper stack or the lower stack and explain why.
  • 1.1 Upper stack - It deals with user interaction and control applications.
  • 1.2 Lower stack - It involves data collection from the environment.
  • 1.3 Lower stack - It handles data transport and protocol translation.
  • 1.4 Upper stack - It focuses on data processing and analytics.
  • 1.5 Lower stack - It manages device operations and hardware control.
  2. Determine whether the following statements are true or false.
  • 2.1 False - Edge computing is generally considered part of the lower stack.
  • 2.2 False - These are aspects of the upper stack.
  • 2.3 True - It involves hardware (lower stack) and application (upper stack) components.
  • 2.4 False - They are used for low-bandwidth, short-range communication.
  • 2.5 True - Predictive maintenance uses processed data and analytics from the upper stack.

Tags

amd

/Data Processing/CPU Architecture

anomaly

/Machine Learning with IoT/Anomaly Detection

api

/Data Processing/Application Layer

applicationlayer

/Data Processing/Application Layer

architecture

/Data Processing/CPU Architecture

arm

/Data Processing/CPU Architecture

auditing

/Security/Auditing in IoT

aws

/IoT Cloud Computing/Introduction

azure

/IoT Cloud Computing/Introduction

baas

/IoT Cloud Computing/IoT Cloud Services

base64

/Security/Number Systems

broker

/Data Processing/Application Layer/MQTT

cbor

/Data Processing/Application Layer/CBOR

centralized

/IoT Introduction/Computing Types

challenge

/Edge Computing/Edge Data & Consistency Challenges

checklist

/Edge Computing/Edge System Design Checklist

cloud

/IoT Cloud Computing/Introduction

/IoT Cloud Computing/Pros and Cons

codequality

/Data Processing/Python Environment/Code Quality & Safety

communicationlayer

/Security/Communication Layer

computing

/IoT Introduction/Computing Types

cons

/IoT Cloud Computing/Pros and Cons

consistency

/IoT Cloud Computing/Consistency Models

container

/Data Processing/Containers/Docker

/Data Processing/Containers/Docker Examples

/Data Processing/Containers/VMs or Containers

/Data Processing/Containers/What Container does

containers

/Data Processing/Containers

damm

/Data Processing/Python Environment/Faker

data

/IoT Introduction/Upper Stack

dataatrest

/Security/Data Layer

datacleaning

/Machine Learning with IoT/Feature Engineering

dataecosystem

/IoT Introduction/JOBS

dataformat

/Data Processing/Application Layer/CBOR

/Data Processing/Application Layer/JSON

/Data Processing/Application Layer/MessagePack

/Data Processing/Application Layer/XML

dataintegrity

/Security/Data Layer

dataintransit

/Security/Data Layer

dataviz

/Data Processing/Data Visualization libraries

debug

/Data Processing/Python Environment/Logging

decimal

/Security/Number Systems

docker

/Data Processing/Containers

/Data Processing/Containers/Container in IOT

/Data Processing/Containers/Docker

/Data Processing/Containers/Docker Examples

/Data Processing/Containers/VMs or Containers

/Data Processing/Containers/What Container does

/Data Processing/Time Series Databases/InfluxDB Demo

dockerhub

/Data Processing/Containers/Docker Examples

dr

/IoT Cloud Computing/Disaster Recovery

edge

/Machine Learning with IoT/ML with IoT

/Machine Learning with IoT/Predictive Maintenance

edgecomputing

/Edge Computing/Introduction

edgeconsistency

/Edge Computing/Edge Data & Consistency Challenges

edgedesign

/Edge Computing/Edge System Design Checklist

encryption

/Security/Encryption

environmental

/IoT Introduction/IoT Use Cases

error

/Data Processing/Python Environment/Error Handling

eventual

/IoT Cloud Computing/Consistency Models

evolution

/IoT Introduction/Evolution of IOT

faas

/IoT Cloud Computing/Cloud Services

/IoT Cloud Computing/IoT Cloud Services

faker

/Data Processing/Python Environment/Faker

featureengineering

/Machine Learning with IoT/Feature Engineering

firmware

/Security/Introduction

fog

/Machine Learning with IoT/Predictive Maintenance

formats

/Data Processing/Application Layer

gcp

/IoT Cloud Computing/Introduction

google

/Data Processing/Application Layer/Protocol Buffers

grafana

/Data Processing/Data Visualization libraries

/Data Processing/Data Visualization libraries/Grafana

ha

/IoT Cloud Computing/High Availability

hashing

/Security/Encryption

hex

/Security/Number Systems

hipaa

/Security/IoT Privacy

http

/Data Processing/Application Layer/HTTP & REST API

/Data Processing/Application Layer/MQTT

/IoT Introduction/Protocols

hub

/Data Processing/Containers/Docker

iaas

/IoT Cloud Computing/Cloud Services

/IoT Cloud Computing/IoT Cloud Services

ifttt

/IoT Cloud Computing/IFTTT

importance

/IoT Introduction/Introduction

influxdb

/Data Processing/Data Visualization libraries/Grafana

/Data Processing/Time Series Databases

/Data Processing/Time Series Databases/InfluxDB

/Security/Auditing in IoT

info

/Data Processing/Python Environment/Logging

insecureapi

/Security/Application Layer

integrationlayer

/IoT Introduction/Upper Stack

iot

/Data Processing/Containers/Container in IOT

/IoT Introduction/Computing Types

/IoT Introduction/Evolution of IOT

/IoT Introduction/Introduction

/IoT Introduction/Puzzle

/Machine Learning with IoT/ML with IoT

/Machine Learning with IoT/Predictive Maintenance

iotarchitects

/IoT Introduction/JOBS

iotdata

/Machine Learning with IoT/IoT Data Characteristics

iotdevelopers

/IoT Introduction/JOBS

iotusecases

/IoT Introduction/IoT Use Cases

jobs

/IoT Introduction/JOBS

json

/Data Processing/Application Layer/JSON

lint

/Data Processing/Python Environment

logging

/Data Processing/Python Environment/Logging

logistics

/IoT Introduction/IoT Use Cases

lowerstack

/IoT Introduction/Lower Stack

luhn

/Data Processing/Python Environment/Faker

masking

/Security/IoT Privacy

measurement

/Data Processing/Time Series Databases/InfluxDB Demo

messagepack

/Data Processing/Application Layer/MessagePack

microservices

/Data Processing/Application Layer/HTTP & REST API

mitm

/Security/Communication Layer

ml

/Machine Learning with IoT/ML with IoT

monolithic

/Data Processing/Application Layer/HTTP & REST API

mqtt

/Data Processing/Application Layer/MQTT

/IoT Introduction/Protocols

mypy

/Data Processing/Python Environment

network

/IoT Introduction/Introduction

noise

/Machine Learning with IoT/IoT Data Characteristics

octal

/Security/Number Systems

paas

/IoT Cloud Computing/Cloud Services

/IoT Cloud Computing/IoT Cloud Services

patterns

/Edge Computing/Edge Decision Patterns

pdoc

/Data Processing/Python Environment/Code Quality & Safety

pep

/Data Processing/Python Environment

physicaldevices

/IoT Introduction/Lower Stack

predictive

/Machine Learning with IoT/Predictive Maintenance

predictivemaintenance

/Machine Learning with IoT/Anomaly Detection

privacy

/Security/IoT Privacy

prometheus

/Data Processing/Time Series Databases

pros

/IoT Cloud Computing/Pros and Cons

protobuf

/Data Processing/Application Layer/Protocol Buffers

protocol

/IoT Introduction/IOT Stack Overview

/IoT Introduction/Protocols

protocols

/Data Processing/Application Layer

publisher

/Data Processing/Application Layer/MQTT

puzzle

/IoT Introduction/Puzzle

ratelimiting

/Security/Application Layer

repositories

/Data Processing/Containers/Docker

rest

/Data Processing/Application Layer/HTTP & REST API/REST API

restapi

/Data Processing/Application Layer/HTTP & REST API/REST API

rpo

/IoT Cloud Computing/Disaster Recovery

rto

/IoT Cloud Computing/Disaster Recovery

ruff

/Data Processing/Python Environment

saas

/IoT Cloud Computing/Cloud Services

/IoT Cloud Computing/IoT Cloud Services

safety

/Data Processing/Python Environment/Code Quality & Safety

secrets

/Security/Encryption

security

/Security/Introduction

services

/Data Processing/Application Layer

sla

/IoT Cloud Computing/High Availability

smart

/IoT Introduction/Introduction

sql

/Data Processing/Data Visualization libraries/Grafana

/Data Processing/Time Series Databases/InfluxDB

stack

/IoT Introduction/IOT Stack Overview

statefulness

/Data Processing/Application Layer/HTTP & REST API/Statefulness

statelessness

/Data Processing/Application Layer/HTTP & REST API/Statelessness

status

/Data Processing/Application Layer/HTTP & REST API

subscriber

/Data Processing/Application Layer/MQTT

telegraf

/Data Processing/Time Series Databases/InfluxDB

/Data Processing/Time Series Databases/InfluxDB Demo

timeseries

/Machine Learning with IoT/Predictive Maintenance

try

/Data Processing/Python Environment/Error Handling

tsdb

/Data Processing/Time Series Databases

/Data Processing/Time Series Databases/InfluxDB

ttl

/Security/Auditing in IoT

upperstack

/IoT Introduction/Upper Stack

vm

/Data Processing/Containers/VMs or Containers

worksforme

/Data Processing/Containers/What Container does

xml

/Data Processing/Application Layer/XML

xss

/Security/Application Layer