Introduction
If you're designing an MQTT communication architecture for industrial systems, the typical "just publish and subscribe" approach won't get you very far. Whether you're working with autonomous vehicles, manufacturing equipment, or any complex IoT system, you need a structure that can scale across multiple sites, handle hundreds of assets, support multiple subscribers, and remain maintainable as your system grows.
Through my experience working with MQTT technology in industrial environments, I've found that combining ISA-95 equipment hierarchy, Unified Namespace (UNS) principles, and MQTT 5 features creates a robust, self-documenting architecture that optimizes bandwidth, improves discoverability, and supports enterprise-scale growth.
In this article, I'll share the best practices and design patterns I've learned. The examples throughout come from autonomous vehicle and industrial control scenarios, because that is the industry I work in and where I implement MQTT solutions. Whether you're building a manufacturing IoT platform, managing autonomous fleets, or designing any distributed system, these patterns will help you create scalable and maintainable architectures.
The Problem with Flat MQTT Topics
Many MQTT implementations start with simple, flat topics like:
sim/telemetry/localization
av/commands/navigation
This approach seems fine at first, but quickly reveals limitations:
- No scalability: How do you support multiple simulation instances or sites?
- Poor discoverability: New developers struggle to understand the data flow
- Limited filtering: Subscribers can't efficiently filter by asset type or location
- No standards: Custom patterns mean reinventing the wheel
- Metadata bloat: Stuffing metadata into payloads wastes bandwidth
A structured, hierarchical approach solves these challenges systematically.
Design Principles: The Foundation
A robust MQTT architecture should be built on three core principles:
1. ISA-95 Equipment Hierarchy Model
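The ISA-95 standard defines an equipment hierarchy model that is widely used across manufacturing and industrial IoT. A simplified five-level view is enough here (the numbers are depth labels for this article, not ISA-95's functional levels 0 to 4):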
Level 4: Enterprise
Level 3: Site (Manufacturing Operations)
Level 2: Area (Production Unit)
Level 1: Cell (Work Center)
Level 0: Equipment (Process/Device)
For an industrial IoT system, these levels can be mapped as follows:
- Enterprise: Company-wide namespace
- Site: Simulation/deployment instance (e.g., sim-instance-001, production-site-01)
- Area: System component (Simulation Engine, Control System)
- Line: Asset management domain (assets, infrastructure)
- Cell: Asset type (truck, crane, agv, robot)
- Equipment: Asset instance (specific vehicle/equipment ID)
2. Unified Namespace (UNS) Principles
UNS creates a single source of truth for all data:
- Hierarchical Organization: Clear, logical structure
- Publish Once, Subscribe Many: Data producers publish to canonical topics
- Self-Documenting: Topic structure reveals data meaning
- Event-Driven: Real-time data propagation
- Context Preservation: Location and metadata embedded in topic hierarchy
3. MQTT 5 Native Features
Instead of reinventing the wheel, leverage MQTT 5's built-in capabilities:
- Request-Response Pattern: Built-in correlation mechanism
- User Properties: Extensible metadata without payload bloat
- Content Type: Standard content negotiation
- Topic Aliases: Bandwidth optimization for long topics
- Shared Subscriptions: Load balancing for scalability
Key Takeaway: Don't fight the protocol. MQTT 5 already solved many of your problems—use its features!
The 8-Level Topic Hierarchy
Here's a recommended topic structure that follows ISA-95 and UNS principles:
enterprise/
{simulationId | siteId}/
{area}/
assets/
{assetType}/
{assetId}/
{dataCategory}/
{messageType}
Breaking Down Each Level
Level | Component | Description | Example Values |
---|---|---|---|
1 | Enterprise | Company namespace | enterprise |
2 | Simulation ID | Unique instance | sim-instance-001, production-site-01 |
3 | Area | System component | simulation-engine, control-system |
4 | Line | Domain category | assets, infrastructure |
5 | Cell | Asset type | truck, crane, agv, robot |
6 | Equipment | Asset instance ID | vehicle-001, crane-alpha |
7 | Data Category | Message category | telemetry, commands, events |
8 | Message Type | Specific message | localization, navigation, safety |
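To make the hierarchy concrete, here is a minimal Python sketch that assembles a topic from its eight levels. The `AssetTopic` class and its field names are hypothetical, chosen only for illustration; the defaults for levels 1 and 4 follow the example values in the table above.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class AssetTopic:
    """One topic in the 8-level hierarchy (class and field names are illustrative)."""
    site_id: str                    # level 2, e.g. "sim-instance-001"
    area: str                       # level 3, e.g. "simulation-engine"
    asset_type: str                 # level 5, e.g. "truck"
    asset_id: str                   # level 6, e.g. "vehicle-001"
    data_category: str              # level 7, e.g. "telemetry"
    message_type: str               # level 8, e.g. "localization"
    enterprise: str = "enterprise"  # level 1
    line: str = "assets"            # level 4

    def path(self) -> str:
        levels = [
            self.enterprise, self.site_id, self.area, self.line,
            self.asset_type, self.asset_id, self.data_category,
            self.message_type,
        ]
        for level in levels:
            # A published topic name may not contain separators or wildcards
            # inside a level, and empty levels are rejected here as well.
            if not level or any(ch in level for ch in "/+#"):
                raise ValueError(f"invalid topic level: {level!r}")
        return "/".join(levels)


topic = AssetTopic("sim-instance-001", "simulation-engine",
                   "truck", "vehicle-001", "telemetry", "localization")
print(topic.path())
```

Building topics through one helper like this, rather than by string concatenation scattered across services, is one way to keep every publisher on the same level order.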
Why 8 Levels? ISA-95 + MQTT Extensions
ISA-95 provides the first 5 levels (organizational hierarchy). Three additional levels address MQTT-specific needs:
Extension 1: Line Level (assets/)
- Reason: Organizational flexibility
- Allows separation of assets/ from future categories like infrastructure/, fleet/, etc.
- Example: enterprise/sim-001/simulation-engine/infrastructure/broker/status
Extension 2: Data Category (telemetry/, commands/, events/)
- Reason: MQTT topic organization
- Enables efficient wildcard subscriptions
- Clear separation between data types
- Example: enterprise/sim-001/simulation-engine/assets/+/+/telemetry/#
Extension 3: Message Type (localization, perception, etc.)
- Reason: Message type specificity
- Allows highly detailed topic filtering
- Example: enterprise/sim-001/simulation-engine/assets/truck/+/telemetry/localization
Practical Topic Examples
Telemetry Topics (Data from Simulation)
enterprise/sim-instance-001/simulation-engine/assets/truck/vehicle-001/telemetry/localization
enterprise/sim-instance-001/simulation-engine/assets/truck/vehicle-002/telemetry/perception
enterprise/sim-instance-001/simulation-engine/assets/truck/vehicle-001/telemetry/vehicle_feedback
enterprise/sim-instance-001/simulation-engine/assets/crane/crane-alpha/telemetry/localization
Command Topics (Commands to Simulation)
enterprise/sim-instance-001/simulation-engine/assets/truck/vehicle-001/commands/navigation
enterprise/sim-instance-001/simulation-engine/assets/truck/vehicle-001/commands/control
enterprise/sim-instance-001/simulation-engine/assets/truck/vehicle-001/commands/handle
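On the subscriber side, the same structure lets you recover context from the topic alone. The following is a small sketch, assuming every asset topic has exactly eight levels; `TopicParts` and `parse_topic` are hypothetical names introduced for this example.

```python
from typing import NamedTuple


class TopicParts(NamedTuple):
    """Named view of the eight topic levels (names are illustrative)."""
    enterprise: str
    site_id: str
    area: str
    line: str
    asset_type: str
    asset_id: str
    data_category: str
    message_type: str


def parse_topic(topic: str) -> TopicParts:
    """Split an 8-level topic into named parts; raise if the depth differs."""
    levels = topic.split("/")
    if len(levels) != 8:
        raise ValueError(f"expected 8 levels, got {len(levels)}: {topic!r}")
    return TopicParts(*levels)


parts = parse_topic(
    "enterprise/sim-instance-001/simulation-engine/assets/"
    "truck/vehicle-001/telemetry/localization"
)
print(parts.asset_id, parts.data_category, parts.message_type)
```

A handler can then branch on `parts.data_category` or `parts.asset_type` without looking inside the payload.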
Understanding Topic Organization by Destination
Important: Topics are organized by where the data lives (the system that owns the asset), not by who sends the message.
Example:
enterprise/sim-001/simulation-engine/assets/truck/vehicle-001/commands/navigation
                   ↑                                          ↑
                   |                                          |
                   Target system                              Command being sent
                   (Simulation Engine                         to Simulation Engine's
                   owns these assets)                         asset
Who publishes vs who subscribes:
Topic | Publisher | Subscriber | Why this area? |
---|---|---|---|
enterprise/.../simulation-engine/assets/truck/vehicle-001/telemetry/localization | Simulation Engine | Control System | Simulation Engine's asset telemetry |
enterprise/.../simulation-engine/assets/truck/vehicle-001/commands/navigation | Control System | Simulation Engine | Commands for Simulation Engine's asset |
The source of the message is captured in MQTT 5 User Properties:
User-Properties:
source-system: control-system ← on a command: the Control System sent it
source-system: simulation-engine ← on telemetry: the Simulation Engine sent it
Wildcard Subscription Patterns
MQTT supports powerful wildcard subscriptions:
- + (single-level wildcard): Matches exactly one level
- # (multi-level wildcard): Matches zero or more levels
Common patterns:
Use Case | Pattern | Description |
---|---|---|
All data for one asset | enterprise/sim-001/simulation-engine/assets/truck/vehicle-001/# | All messages for vehicle-001 |
All trucks in simulation | enterprise/sim-001/simulation-engine/assets/truck/+/# | All messages from all trucks |
All localization data | enterprise/sim-001/simulation-engine/assets/+/+/telemetry/localization | Localization from all assets |
All telemetry | enterprise/sim-001/simulation-engine/assets/+/+/telemetry/# | All telemetry from all assets |
All commands to trucks | enterprise/sim-001/simulation-engine/assets/truck/+/commands/# | Commands to any truck |
All safety events | enterprise/sim-001/simulation-engine/assets/+/+/events/safety | Safety events from all assets |
Entire simulation | enterprise/sim-001/# | Everything in simulation |
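If you want to check which topics a pattern would catch before deploying it, the matching rules are simple enough to sketch in a few lines of Python. This is an illustrative helper (`topic_matches` is a made-up name), not a broker implementation; it ignores the special handling of topics that start with `$`.

```python
def topic_matches(topic_filter: str, topic: str) -> bool:
    """Return True if an MQTT topic filter matches a concrete topic name."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            # '#' is only valid as the last level; it matches the parent
            # level and everything below it.
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    # Without '#', filter and topic must have the same depth.
    return len(filter_levels) == len(topic_levels)


topic = ("enterprise/sim-001/simulation-engine/assets/"
         "truck/vehicle-001/telemetry/localization")
print(topic_matches("enterprise/sim-001/simulation-engine/assets/+/+/telemetry/#", topic))
print(topic_matches("enterprise/sim-001/simulation-engine/assets/crane/+/#", topic))
```

The first call matches because both `+` levels accept any asset type and ID; the second does not, because the asset type level is fixed to `crane`.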
MQTT 5 Message Properties: Eliminating Payload Bloat
A key best practice in MQTT 5 is moving metadata from payloads to message properties.
The Payload Bloat Problem
Traditionally, all metadata travels inside the payload, which bloats every message:
{
"id": "550e8400-e29b-41d4-a716-446655440000", // 36 bytes
"correlationId": "prev-msg-123", // 12 bytes
"type": "localization", // 12 bytes
"timestamp": {"Ns": 1234567890, "Utc": "2025-10-20..."}, // 50 bytes
"source": {"name": "simulation-engine", "role": "sim", "assetId": "vehicle-001"}, // 45 bytes
"payload": { // ACTUAL DATA STARTS HERE
"payloadType": "localization",
"towHeadPose": {"x": 12.3, "y": 4.5, "z": 0.0}
}
}
155 bytes of metadata + 45 bytes of actual data = 200 bytes total
Using MQTT v5 User Properties – Streamlined Payload:
{
"payloadType": "localization",
"towHeadPose": {"x": 12.3, "y": 4.5, "z": 0.0}
}
45 bytes of actual data + metadata in MQTT properties
Why This Matters
High-frequency telemetry at 20 Hz:
- v1.0 (metadata in payload): 200 bytes × 20 messages/sec = 4,000 bytes/sec per asset
- v2.0 (metadata in MQTT properties): ~80 bytes × 20 messages/sec = 1,600 bytes/sec per asset
- ~60% reduction in bandwidth!
For 100 assets, that's 240,000 bytes/sec (≈ 0.23 MiB/sec) saved.
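A quick way to sanity-check figures like these is to compute them directly. The sketch below uses the 200-byte and ~80-byte message sizes quoted above as inputs; the constant names are my own.

```python
RATE_HZ = 20
ASSETS = 100
BYTES_METADATA_IN_PAYLOAD = 200  # v1.0: metadata embedded in the JSON payload
BYTES_WITH_PROPERTIES = 80       # v2.0: estimated size with metadata in properties

before = BYTES_METADATA_IN_PAYLOAD * RATE_HZ  # bytes/sec per asset
after = BYTES_WITH_PROPERTIES * RATE_HZ       # bytes/sec per asset
reduction = (before - after) / before
fleet_saving = (before - after) * ASSETS      # bytes/sec across all assets

print(f"per asset: {before} -> {after} B/s ({reduction:.0%} less)")
print(f"fleet saving: {fleet_saving} B/s = {fleet_saving / 2**20:.2f} MiB/s")
```

Swapping in your own measured message sizes and publish rates gives a more reliable estimate than any rule of thumb.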
Standard MQTT 5 Properties
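The metadata can be carried as shown in the table below. Content-Type and Message-Expiry-Interval map to built-in MQTT 5 properties; the remaining keys are User Properties, which travel in the packet's properties section as length-prefixed UTF-8 key/value pairs. They still cost bytes on the wire, so keep keys and values short.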
Property Key | Type | Required | Description | Example |
---|---|---|---|---|
Content-Type | String | Yes | Payload format | application/json |
Message-Expiry-Interval | UInt32 | Yes | TTL in seconds | 30 (commands), 5 (telemetry) |
schema-version | String | Yes | Schema semantic version | 2.0.0 |
source-system | String | Yes | Publishing system | simulation-engine, control-system |
source-instance | String | No | Instance ID for multi-node | sim-node-01 |
timestamp-ns | String | Yes | Nanosecond timestamp | 1234567890123456789 |
timestamp-utc | String | Yes | ISO-8601 UTC timestamp | 2025-10-20T14:30:00.123Z |
asset-type | String | Yes | Type of asset | truck, crane |
asset-id | String | Yes | Specific asset instance | vehicle-001 |
message-id | String | Yes | Unique message identifier | UUID/ULID |
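It helps to know what these properties cost on the wire. In MQTT 5, each User Property is encoded as a one-byte property identifier followed by the key and the value, each as a UTF-8 string with a two-byte length prefix. The sketch below estimates that size for a sample set of properties taken from the table; the function name and the selection of properties are illustrative.

```python
def user_property_size(key: str, value: str) -> int:
    """Bytes one User Property occupies in an MQTT 5 packet.

    Layout: 1-byte property identifier (0x26), then key and value,
    each as a UTF-8 string with a 2-byte length prefix.
    """
    return 1 + 2 + len(key.encode("utf-8")) + 2 + len(value.encode("utf-8"))


# Sample values from the table above; this subset is only for illustration.
user_properties = [
    ("schema-version", "2.0.0"),
    ("source-system", "simulation-engine"),
    ("timestamp-ns", "1234567890123456789"),
    ("asset-type", "truck"),
    ("asset-id", "vehicle-001"),
    ("message-id", "550e8400-e29b-41d4-a716-446655440000"),
]

total = sum(user_property_size(k, v) for k, v in user_properties)
print(f"{len(user_properties)} user properties: {total} bytes")
```

Running an estimate like this on your own property set is worthwhile, because string keys and values add up quickly. Shorter keys, and dropping properties the topic already implies (such as asset-type and asset-id), are two ways to keep the overhead down.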
Request-Response Pattern with MQTT 5
MQTT 5's native request-response eliminates custom correlation logic.
Request Message (Control System → Simulation Engine)
Topic: enterprise/sim-001/simulation-engine/assets/truck/vehicle-001/commands/navigation
MQTT Properties:
Response Topic: enterprise/sim-001/control-system/responses/vehicle-001
Correlation Data: <binary-correlation-id>
Content Type: application/json
User Properties:
schema-version: 2.0.0
source-system: control-system
message-id: 550e8400-e29b-41d4-a716-446655440000
Payload:
{
"payloadType": "nav_command",
"mode": "waypoints",
"route": [
{
"pose": {
"x": 10.0,
"y": 20.0,
"z": 0.0,
"qx": 0,
"qy": 0,
"qz": 0.707,
"qw": 0.707
}
}
],
"behavior": {
"targetSpeedMps": 4.5
}
}
Response Message (Simulation Engine → Control System)
Topic: enterprise/sim-001/control-system/responses/vehicle-001 (from Response Topic)
MQTT Properties:
Correlation Data: <same-binary-correlation-id>
Content Type: application/json
User Properties:
schema-version: 2.0.0
source-system: simulation-engine
message-id: 660e8400-e29b-41d4-a716-446655440111
Payload:
{
"payloadType": "command_ack",
"status": "accepted",
"message": "Navigation command accepted and queued"
}
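The requester still needs a little bookkeeping to pair each response with the request that caused it. Here is a minimal, library-independent sketch of that bookkeeping; `PendingRequests` is a hypothetical helper, and the actual publish and subscribe calls depend on your MQTT client library, so they are left out.

```python
import uuid
from typing import Callable, Dict


class PendingRequests:
    """Requester-side bookkeeping keyed by MQTT 5 Correlation Data."""

    def __init__(self) -> None:
        self._callbacks: Dict[bytes, Callable[[bytes], None]] = {}

    def register(self, on_response: Callable[[bytes], None]) -> bytes:
        """Create correlation data to attach to an outgoing request."""
        correlation_data = uuid.uuid4().bytes  # 16 random bytes
        self._callbacks[correlation_data] = on_response
        return correlation_data

    def resolve(self, correlation_data: bytes, payload: bytes) -> bool:
        """Route a response to its callback; False if unknown or already handled."""
        callback = self._callbacks.pop(correlation_data, None)
        if callback is None:
            return False
        callback(payload)
        return True


pending = PendingRequests()
received = []
corr = pending.register(received.append)

# The responder copies Correlation Data unchanged into its reply and
# publishes that reply on the request's Response Topic.
pending.resolve(corr, b'{"payloadType": "command_ack", "status": "accepted"}')
print(received)
```

In a real system you would also expire entries that never receive a response, for example by storing a deadline next to each callback.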
Message Schema Design Philosophy
Following these principles for all payloads creates cleaner, more maintainable systems:
- Minimal Payload: Only business data in payload; metadata in MQTT properties
- Strongly Typed: Use payloadType field for message discrimination
- Flat Structure: Avoid deep nesting where possible
- Extensibility: Optional fields for future enhancement
- Validation: JSON Schema available for all message types
Base Payload Structure
All payloads follow this minimal structure:
{
"payloadType": "<message-type>",
"<message-specific-fields>": "..."
}
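With a `payloadType` discriminator in every payload, a consumer can route messages through a simple lookup table. The following is a rough sketch of that idea; the handler functions and the `HANDLERS` table are hypothetical, and real handlers would validate fields against the JSON Schema for each type.

```python
import json
from typing import Any, Callable, Dict


def on_localization(msg: Dict[str, Any]) -> str:
    pose = msg["towHeadPose"]
    return f"localization at x={pose['x']}, y={pose['y']}"


def on_heartbeat(msg: Dict[str, Any]) -> str:
    return f"heartbeat status={msg['status']}"


# payloadType value -> handler (handler names are illustrative).
HANDLERS: Dict[str, Callable[[Dict[str, Any]], str]] = {
    "localization": on_localization,
    "heartbeat": on_heartbeat,
}


def dispatch(raw: bytes) -> str:
    """Decode a JSON payload and route it by its payloadType field."""
    msg = json.loads(raw)
    if not isinstance(msg, dict) or "payloadType" not in msg:
        raise ValueError("payload must be a JSON object with a payloadType")
    handler = HANDLERS.get(msg["payloadType"])
    if handler is None:
        raise ValueError(f"unsupported payloadType: {msg['payloadType']!r}")
    return handler(msg)


print(dispatch(
    b'{"payloadType": "localization", "towHeadPose": {"x": 12.3, "y": 4.5, "z": 0.0}}'
))
```

Rejecting unknown types explicitly, instead of silently ignoring them, makes schema drift between publishers and subscribers visible early.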
Examples - Telemetry Messages
1. Localization
Topic: enterprise/{simId}/simulation-engine/assets/{assetType}/{assetId}/telemetry/localization
Frequency: 20 Hz
QoS: 0
{
"payloadType": "localization",
"towHeadPose": {
"x": 12.3,
"y": 4.5,
"z": 0.0,
"qx": 0.0,
"qy": 0.0,
"qz": 0.707,
"qw": 0.707
},
"trailerPose": {
"x": 10.1,
"y": 4.3,
"z": 0.0,
"qx": 0.0,
"qy": 0.0,
"qz": 0.707,
"qw": 0.707
},
"covariance": null,
"frameId": "map"
}
2. Perception
Topic: enterprise/{simId}/simulation-engine/assets/{assetType}/{assetId}/telemetry/perception
Frequency: 10 Hz
QoS: 0
{
"payloadType": "perception",
"objects": [
{
"trackId": "trk-42",
"class": "truck",
"globalPosition": {
"x": 18.1,
"y": 6.2,
"z": 0.0,
"qx": 0.0,
"qy": 0.0,
"qz": 0.0,
"qw": 1.0
},
...
}
],
...
}
3. Vehicle Feedback
Topic: enterprise/{simId}/simulation-engine/assets/{assetType}/{assetId}/telemetry/vehicle_feedback
Frequency: 10 Hz
QoS: 0
{
"payloadType": "vehicle_feedback",
"speedMps": 2.7,
"steerAngleDeg": 1.2,
"gear": "D",
"throttlePct": 0.12,
"brakePct": 0.0,
"lights": {
"head": true,
"brake": false,
"leftIndicator": false,
"rightIndicator": false,
"hazard": false
},
"faults": []
}
4. Heartbeat
Topic: enterprise/{simId}/simulation-engine/assets/{assetType}/{assetId}/telemetry/heartbeat
Frequency: 10 Hz
QoS: 0
Retained: true (optional, for last-will monitoring)
{
"payloadType": "heartbeat",
"simTimeNs": 1234567890123456789,
"paused": false,
"status": "running"
}
Sample Command Messages
Navigation Command
Topic: enterprise/{simId}/simulation-engine/assets/{assetType}/{assetId}/commands/navigation
Frequency: On-demand / 10 Hz
QoS: 1
Response Required: Yes
{
"payloadType": "nav_command",
"mode": "waypoints",
"route": [
{
"pose": {
"x": 25.0,
"y": 30.0,
"z": 0.0,
...
},
"speedLimitMps": 5.0
}
],
"behavior": {
...
},
"hardStop": false
}
Status Event (Acknowledgment)
Topic: enterprise/{simId}/simulation-engine/assets/{assetType}/{assetId}/events/status
Frequency: On-demand
QoS: 1
{
"payloadType": "status_event",
"status": "acknowledged",
"referenceMessageId": "550e8400-e29b-41d4-a716-446655440000",
"message": "Navigation command accepted"
}
Quality of Service (QoS) Strategy
Choosing the right QoS level for each message type is critical for performance.
QoS Level Selection
Message Category | QoS | Rationale |
---|---|---|
High-frequency telemetry (localization, perception, vehicle_feedback) | 0 | Low latency priority; acceptable message loss; next message arrives quickly |
Heartbeat | 0 | Periodic; missing one heartbeat acceptable |
Commands (navigation, control, handle) | 1 | Must be delivered at least once; duplicates are possible, so handlers should be idempotent or deduplicate by message-id |
Events (safety, status) | 1 | Critical information; must be delivered |
Configuration changes | 2 | Must be delivered exactly once (if needed) |
Key Takeaway: Don't use QoS 1 or 2 for high-frequency telemetry. The acknowledgment overhead will kill your performance.
Retention Policy
Topic Pattern | Retained | Reason |
---|---|---|
*/telemetry/# | No | High-frequency, time-sensitive data |
*/telemetry/heartbeat | Yes (optional) | Last-will indicator for monitoring |
*/commands/# | No | Time-sensitive, obsolete quickly |
*/events/# | No | Point-in-time events |
*/state/# | Yes | Reflects current state |
Message Expiry
Set the Message Expiry Interval property based on data type:
- Telemetry: 5 seconds (data goes stale quickly)
- Commands: 30 seconds (time-sensitive actions)
- Events: 60 seconds (important but not time-critical)
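One way to keep QoS, retention, and expiry consistent across publishers is to derive all three from the data-category level of the topic. Here is a minimal sketch, assuming the 8-level topic layout; `DeliveryPolicy` and `policy_for` are hypothetical names, and the QoS and expiry chosen for `state` topics are my assumptions because the tables above do not specify them.

```python
from typing import NamedTuple, Optional


class DeliveryPolicy(NamedTuple):
    qos: int
    retain: bool
    expiry_seconds: Optional[int]  # None means no Message Expiry Interval


# telemetry, commands and events follow the values given in this section.
# The "state" entry's QoS and missing expiry are assumptions for illustration.
POLICIES = {
    "telemetry": DeliveryPolicy(qos=0, retain=False, expiry_seconds=5),
    "commands": DeliveryPolicy(qos=1, retain=False, expiry_seconds=30),
    "events": DeliveryPolicy(qos=1, retain=False, expiry_seconds=60),
    "state": DeliveryPolicy(qos=1, retain=True, expiry_seconds=None),
}


def policy_for(topic: str) -> DeliveryPolicy:
    """Look up publish settings from the data-category level (level 7)."""
    levels = topic.split("/")
    if len(levels) != 8:
        raise ValueError(f"expected 8 levels, got {len(levels)}: {topic!r}")
    category = levels[6]
    if category not in POLICIES:
        raise ValueError(f"no delivery policy for category {category!r}")
    return POLICIES[category]


print(policy_for(
    "enterprise/sim-001/simulation-engine/assets/truck/vehicle-001/commands/navigation"
))
```

Exceptions such as the optionally retained heartbeat can be handled with a small override table keyed by message type.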
Security and Access Control
Security is implemented through Access Control Lists (ACLs) at the broker level.
Simulation Engine Permissions
PUBLISH:
enterprise/+/simulation-engine/assets/+/+/telemetry/#
enterprise/+/simulation-engine/assets/+/+/events/#
enterprise/+/control-system/responses/#
SUBSCRIBE:
enterprise/+/simulation-engine/assets/+/+/commands/#
Control System Permissions
PUBLISH:
enterprise/+/simulation-engine/assets/+/+/commands/#
enterprise/+/control-system/assets/+/+/state/#
SUBSCRIBE:
enterprise/+/simulation-engine/assets/+/+/telemetry/#
enterprise/+/simulation-engine/assets/+/+/events/#
enterprise/+/control-system/responses/#
Monitoring/Dashboard Permissions
SUBSCRIBE:
enterprise/+/#
PUBLISH:
(none)
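Before loading rules like these into a broker, it can be useful to test them offline. The sketch below checks publish permissions against the lists above using standard MQTT filter matching. The client names used as dictionary keys and the helper functions are hypothetical, and real brokers each have their own ACL file format.

```python
from typing import Dict, List


def topic_matches(topic_filter: str, topic: str) -> bool:
    """MQTT filter matching: '+' is one level, a trailing '#' is the rest."""
    filter_levels = topic_filter.split("/")
    topic_levels = topic.split("/")
    for i, level in enumerate(filter_levels):
        if level == "#":
            return i == len(filter_levels) - 1
        if i >= len(topic_levels):
            return False
        if level != "+" and level != topic_levels[i]:
            return False
    return len(filter_levels) == len(topic_levels)


# Publish rules copied from the permission lists above; the client names
# used as keys are assumptions made for this example.
PUBLISH_ACL: Dict[str, List[str]] = {
    "simulation-engine": [
        "enterprise/+/simulation-engine/assets/+/+/telemetry/#",
        "enterprise/+/simulation-engine/assets/+/+/events/#",
        "enterprise/+/control-system/responses/#",
    ],
    "control-system": [
        "enterprise/+/simulation-engine/assets/+/+/commands/#",
        "enterprise/+/control-system/assets/+/+/state/#",
    ],
    "dashboard": [],
}


def may_publish(client: str, topic: str) -> bool:
    """Deny by default: unknown clients and unmatched topics are rejected."""
    return any(topic_matches(f, topic) for f in PUBLISH_ACL.get(client, []))


command = ("enterprise/sim-001/simulation-engine/assets/"
           "truck/vehicle-001/commands/navigation")
print(may_publish("control-system", command))
print(may_publish("dashboard", command))
```

Only publish checks are covered here. Subscribe checks are more involved, because the requested filter may itself contain wildcards and has to be compared against the allowed filters.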
TLS/SSL Configuration
- Protocol: TLS 1.3
- Mutual Authentication: Required for production
- Certificate Management: Auto-renewal with Let's Encrypt or internal CA
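Why TLS 1.3?
- Faster: A full handshake takes 1 round trip vs 2 for TLS 1.2
- Secure: Drops legacy options with known weaknesses (RC4, CBC-mode ciphers, static RSA key exchange, compression)
- Future-proof: Will be supported for years
- Compliance: Accepted by current mainstream security standards (check the specific requirements that apply to you)
TLS 1.2: ClientHello → ServerHello → client key exchange/Finished → server Finished (4 steps, 2 round trips)
TLS 1.3: ClientHello → ServerHello/Finished (2 steps, 1 round trip)
MQTT connections are long-lived, so the handshake cost is paid once per connection rather than per message. The faster handshake matters most when many clients reconnect at once, for example after a broker restart or a network drop; steady-state 20 Hz telemetry sees little difference between TLS 1.2 and TLS 1.3.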
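On the client side, enforcing this takes only a few lines with Python's standard `ssl` module. This is a minimal sketch: the function name and its parameters are my own, and the certificate paths you pass in depend on how your CA is set up. How the resulting context is handed to an MQTT client depends on the library you use.

```python
import ssl
from typing import Optional


def make_client_context(
    ca_file: Optional[str] = None,
    cert_file: Optional[str] = None,
    key_file: Optional[str] = None,
) -> ssl.SSLContext:
    """Build a client-side context that refuses anything older than TLS 1.3."""
    # create_default_context enables certificate and hostname verification.
    # With ca_file=None it falls back to the system's trusted CA store.
    context = ssl.create_default_context(ssl.Purpose.SERVER_AUTH, cafile=ca_file)
    context.minimum_version = ssl.TLSVersion.TLSv1_3
    if cert_file is not None:
        # Client certificate and private key for mutual authentication.
        context.load_cert_chain(certfile=cert_file, keyfile=key_file)
    return context


context = make_client_context()
print(context.minimum_version.name, context.verify_mode.name)
```

For production, pass the internal CA bundle as `ca_file` and the client's certificate and key as `cert_file` and `key_file`, so the broker can authenticate each client.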
Key Lessons from Working with MQTT at Scale
1. Don't Reinvent MQTT 5 Features
I've seen many projects build custom correlation logic, custom metadata envelopes, and custom content negotiation—all of which are already built into MQTT 5. Learn the protocol features first before building custom solutions!
2. Standards Matter
Using ISA-95/UNS principles makes your architecture:
- Easier to explain to new team members
- Compatible with third-party tools
- Ready for organizational growth
- Self-documenting
3. Pay Attention to QoS
Using QoS 1 for 20 Hz telemetry can be disastrous. The acknowledgment overhead crushes performance. QoS 0 for high-frequency data is almost always the right choice.
4. Metadata in Properties, Not Payload
Leveraging MQTT 5 User Properties for metadata:
- Reduces bandwidth significantly
- Makes payloads cleaner and easier to parse
- Separates concerns properly
- Enables better caching and routing
5. Think Multi-Tenant from Day One
Even if you only have one instance today, design for multiple from the start. Adding instance/site ID to the topic hierarchy early prevents painful refactoring later.
Topic Naming Conventions
Here are practical conventions that work well in industrial MQTT systems:
- Use lowercase: enterprise/sim-001/simulation-engine not Enterprise/Sim-001/SimulationEngine
- Use hyphens for multi-word: vehicle-001 not vehicle_001 or vehicle001
- Be consistent: Same pattern across all levels
- Keep it readable: telemetry not tlm, localization not loc
- No special characters: Only alphanumeric and hyphens
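Conventions like these are easiest to keep when they are checked automatically, for example in a unit test or a CI step. Here is a small sketch of such a check; the regular expression is my reading of the rules above (lowercase alphanumerics, with single hyphens between words), and the function names are illustrative.

```python
import re
from typing import List

# Lowercase letters and digits, optionally joined by single hyphens.
SEGMENT_PATTERN = re.compile(r"[a-z0-9]+(?:-[a-z0-9]+)*")


def is_valid_segment(segment: str) -> bool:
    """True if one topic level follows the naming conventions."""
    return SEGMENT_PATTERN.fullmatch(segment) is not None


def invalid_segments(topic: str) -> List[str]:
    """Return the levels of a topic that break the conventions."""
    return [s for s in topic.split("/") if not is_valid_segment(s)]


print(invalid_segments("enterprise/sim-001/simulation-engine"))
print(invalid_segments("Enterprise/Sim-001/SimulationEngine"))
```

Note that a check this strict also rejects underscore names such as `vehicle_feedback`, so decide up front whether message type names are exempt or should be renamed with hyphens.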
Conclusion
Building an industrial-grade MQTT architecture isn't just about publishing and subscribing. It's about creating a system that:
- Scales gracefully from development to production
- Documents itself through topic structure
- Optimizes bandwidth without sacrificing functionality
- Follows industry standards for interoperability
- Provides flexibility for future growth
These best practices—combining ISA-95 hierarchy, UNS principles, and MQTT 5 features—come from my hands-on experience working with MQTT in industrial environments. When applied properly, this approach can reduce bandwidth consumption significantly, improve discoverability, and create architectures that remain maintainable as systems grow.
Whether you're building an autonomous vehicle system, a manufacturing IoT platform, or any other industrial system, these patterns provide a solid foundation for scalable and maintainable MQTT communication.
What Would You Do Differently?
I'd love to hear your thoughts:
- Have you implemented similar patterns?
- What challenges did you face with MQTT at scale?
- What would you change in this architecture?
Drop a comment below or reach out—I'm always learning!
Thanks for reading! If you found this helpful, please give it a ❤️ and share with your team.