applekoiot

Guide — Simulating Supply Chain Incidents with GPT12-X Test Data

Who this is for: platform/data engineers working on shipment visibility, cold-chain monitoring, alerting and dashboards.

What you’ll get: a reproducible way (GPT12-X) to synthesize telemetry, inject incidents, and either publish to MQTT or save as NDJSON/CSV, so you can validate alert rules and dashboards quickly and safely.

Why simulate?

Real data is expensive, slow to obtain and hard to control. Synthetic data lets you:

  • Reproduce delays/route deviations/temperature breaches/door events/etc. on demand
  • Regression-test alert/dashboard changes in minutes
  • Cover edge cases (GPS jamming/drift, sudden battery drop, humidity spikes)
  • Exercise the whole pipeline: MQTT → streaming jobs → alerts/dashboards, or NDJSON/CSV → lakehouse/BI

Incident types we simulate

  • DELAY — prolonged standstill (warehouse/traffic)
  • ROUTE_DEVIATION — geofence/corridor deviation
  • TEMP_EXCURSION — cold-chain breach
  • SHOCK — handling impact
  • DOOR_OPEN — unauthorized door event
  • BATTERY_DRAIN — abnormal battery drop
  • GPS_JAMMING — missing/invalid GNSS fix
  • HUMIDITY_ANOMALY — humidity out of range
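
As a rough illustration of how each incident type could map onto the telemetry fields, here is a minimal sketch. The function name `applyIncident`, the `INCIDENT_EFFECTS` table and every magnitude (12 °C, 6 g, a 20-point battery drop, a 0.05° offset) are assumptions made for this example, not values taken from GPT12-X.

```javascript
// Hypothetical mapping from incident type to its effect on one record.
// All magnitudes are illustrative assumptions.
const INCIDENT_EFFECTS = {
  DELAY: (r) => ({ ...r, speed_kph: 0 }),
  // Shift latitude by 0.05 degrees (roughly 5.5 km) to leave the corridor.
  ROUTE_DEVIATION: (r) => ({ ...r, lat: r.lat === null ? null : r.lat + 0.05 }),
  TEMP_EXCURSION: (r) => ({ ...r, temp_c: 12 }),
  SHOCK: (r) => ({ ...r, shock_g: 6 }),
  DOOR_OPEN: (r) => ({ ...r, door_open: true }),
  BATTERY_DRAIN: (r) => ({
    ...r,
    battery_pct: r.battery_pct === null ? null : Math.max(0, r.battery_pct - 20),
  }),
  // A jammed receiver reports no fix, so position and speed become null.
  GPS_JAMMING: (r) => ({ ...r, lat: null, lon: null, speed_kph: null }),
  HUMIDITY_ANOMALY: (r) => ({ ...r, humidity: 95 }),
};

// Returns a new record with the incident applied and its name appended
// to `events`; the input record is not mutated.
function applyIncident(rec, type) {
  if (!Object.hasOwn(INCIDENT_EFFECTS, type)) {
    throw new Error(`Unknown incident type: ${type}`);
  }
  const changed = INCIDENT_EFFECTS[type](rec);
  return { ...changed, events: [...rec.events, type] };
}
```

Keeping the effects in a lookup table makes it easy to add a new incident type without touching the injection logic.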

Unified data model

{
  device_id: string;
  shipment_id: string;
  ts: string;              // ISO timestamp
  lat: number | null;
  lon: number | null;
  speed_kph: number | null;
  temp_c: number | null;
  humidity: number | null;
  shock_g: number | null;
  door_open: boolean | null;
  battery_pct: number | null;
  events: string[];        // e.g. ["TEMP_EXCURSION"]
  meta?: { route: string; step: number; }  // helpful for replay/debug
}

Sample record

{
  "device_id": "ELK-SIM-204913",
  "shipment_id": "SHP-1736389123456-0",
  "ts": "2025-09-08T06:02:00.000Z",
  "lat": 34.0643,
  "lon": -118.2519,
  "speed_kph": 42.7,
  "temp_c": 5.1,
  "humidity": 67,
  "shock_g": 0,
  "door_open": false,
  "battery_pct": 82,
  "events": ["DELAY"],
  "meta": { "route": "US-LA-CHI-NYC", "step": 2 }
}
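
Before feeding generated records into alert rules, it can help to check them against the unified model. The following is a minimal sketch of such a check. The field names come from the model above, while `validateRecord` and the range limits (latitude, longitude, battery percentage) are assumptions for illustration.

```javascript
// Hypothetical validator for the unified record shape.
// Returns a list of human-readable problems; an empty list means "valid".
const NULLABLE_NUMBERS = [
  "lat", "lon", "speed_kph", "temp_c", "humidity", "shock_g", "battery_pct",
];

function validateRecord(rec) {
  const errors = [];
  for (const key of ["device_id", "shipment_id", "ts"]) {
    if (typeof rec[key] !== "string" || rec[key] === "") {
      errors.push(`${key} must be a non-empty string`);
    }
  }
  if (typeof rec.ts === "string" && rec.ts !== "" && Number.isNaN(Date.parse(rec.ts))) {
    errors.push("ts must be a parseable ISO timestamp");
  }
  for (const key of NULLABLE_NUMBERS) {
    const v = rec[key];
    if (v !== null && !(typeof v === "number" && Number.isFinite(v))) {
      errors.push(`${key} must be a finite number or null`);
    }
  }
  if (rec.door_open !== null && typeof rec.door_open !== "boolean") {
    errors.push("door_open must be a boolean or null");
  }
  if (!Array.isArray(rec.events) || rec.events.some((e) => typeof e !== "string")) {
    errors.push("events must be an array of strings");
  }
  // Assumed plausibility ranges, not part of the model itself.
  if (typeof rec.lat === "number" && Math.abs(rec.lat) > 90) {
    errors.push("lat out of range");
  }
  if (typeof rec.lon === "number" && Math.abs(rec.lon) > 180) {
    errors.push("lon out of range");
  }
  if (typeof rec.battery_pct === "number" && (rec.battery_pct < 0 || rec.battery_pct > 100)) {
    errors.push("battery_pct out of range");
  }
  return errors;
}
```

Returning a list instead of throwing lets a test harness report every problem in a record at once.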

GPT12-X at a glance

GPT12-X is a single Node.js CLI script that generates tracks along predefined routes and injects incidents at configurable rates, then:

  • publishes live to MQTT (for streaming consumption), or
  • writes NDJSON/CSV for offline analysis/replay.
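
To make "generates tracks along predefined routes" concrete, here is a minimal sketch of one possible approach: walk a list of waypoints at a constant pace and interpolate linearly within each leg. The waypoint coordinates are approximate city centers chosen for this example, and `haversineKm`, `positionAt` and `generateTrack` are hypothetical names. The actual script may define and traverse its routes differently.

```javascript
// Approximate city coordinates, used only for this illustration.
const DEMO_WAYPOINTS = [
  { lat: 34.05, lon: -118.25 }, // Los Angeles
  { lat: 41.88, lon: -87.63 },  // Chicago
  { lat: 40.71, lon: -74.01 },  // New York
];

const EARTH_RADIUS_KM = 6371;

// Great-circle distance between two {lat, lon} points.
function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Position after covering `fraction` (0..1) of the total route length.
// Interpolation inside a leg is linear in lat/lon, which is a simplification.
function positionAt(waypoints, fraction) {
  const legs = [];
  let total = 0;
  for (let i = 1; i < waypoints.length; i++) {
    const km = haversineKm(waypoints[i - 1], waypoints[i]);
    legs.push(km);
    total += km;
  }
  let target = Math.min(Math.max(fraction, 0), 1) * total;
  for (let i = 0; i < legs.length; i++) {
    if (target <= legs[i] || i === legs.length - 1) {
      const t = legs[i] === 0 ? 0 : Math.min(target / legs[i], 1);
      const a = waypoints[i];
      const b = waypoints[i + 1];
      return { lat: a.lat + (b.lat - a.lat) * t, lon: a.lon + (b.lon - a.lon) * t };
    }
    target -= legs[i];
  }
  return { ...waypoints[0] }; // single-waypoint route: stay put
}

// One point per sampling interval, from route start to route end.
function generateTrack(waypoints, minutes, intervalSec) {
  const steps = Math.floor((minutes * 60) / intervalSec);
  const track = [];
  for (let step = 0; step <= steps; step++) {
    const fraction = steps === 0 ? 0 : step / steps;
    track.push({ step, ...positionAt(waypoints, fraction) });
  }
  return track;
}
```

With `--minutes 180 --interval 60`, `generateTrack(DEMO_WAYPOINTS, 180, 60)` yields 181 points, including both endpoints.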

Prereqs: Node.js ≥ 18

CLI options (quick reference)

| Option | Type | Default | Description |
|---|---|---|---|
| --route | string | US-LA-CHI-NYC | Predefined route (e.g. US-LA-CHI-NYC, CN-SZ-SH) |
| --minutes | number | 180 | Total duration (minutes) |
| --interval | number | 60 | Sampling / MQTT publish interval (seconds) |
| --shipments | number | 1 | Concurrent shipments |
| --coldchain | boolean | true | 2–8 °C baseline (if false → ambient) |
| --incident-rate | number | 0.18 | Incident intensity (per hour) |
| --mqtt | string | | Broker URL, mqtt(s)://host:port |
| --topic | string | sim/telemetry | MQTT topic |
| --out | string | | Output NDJSON path |
| --csv | string | | Output CSV path |
| --username / --password | string | | MQTT auth |
| --insecure | boolean | false | Allow self-signed cert (test only) |

Tips

  • In production tests, keep a fixed --interval for steady event cadence.
  • Interpret --incident-rate as a per-hour average (Poisson-like): 0.25 ≈ one incident every 4 hours on average, and the default 0.18 ≈ one every 5.5 hours.
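
One way to turn a per-hour rate into per-sample decisions is to treat incident starts as a Poisson process: the chance of at least one start within a sampling interval of Δt hours is 1 − exp(−rate × Δt). The sketch below assumes that reading. `stepProbability` and `simulateIncidentSteps` are hypothetical names, and the small seeded generator (mulberry32) is used here only to keep runs reproducible. GPT12-X may sample incidents differently.

```javascript
// Small seeded PRNG (mulberry32) so that a given seed replays the same run.
function mulberry32(seed) {
  let a = seed >>> 0;
  return function next() {
    a = (a + 0x6d2b79f5) >>> 0;
    let t = a;
    t = Math.imul(t ^ (t >>> 15), t | 1);
    t ^= t + Math.imul(t ^ (t >>> 7), t | 61);
    return ((t ^ (t >>> 14)) >>> 0) / 4294967296;
  };
}

// Probability that at least one incident starts within one sampling interval,
// assuming a Poisson process with `ratePerHour` events per hour.
function stepProbability(ratePerHour, intervalSec) {
  return 1 - Math.exp(-ratePerHour * (intervalSec / 3600));
}

// Returns the step indices at which an incident starts.
function simulateIncidentSteps(ratePerHour, intervalSec, steps, seed) {
  const rand = mulberry32(seed);
  const p = stepProbability(ratePerHour, intervalSec);
  const starts = [];
  for (let step = 0; step < steps; step++) {
    if (rand() < p) starts.push(step);
  }
  return starts;
}

// Example: a 24-hour run at 60-second sampling and 0.25 incidents per hour.
const demoStarts = simulateIncidentSteps(0.25, 60, 24 * 60, 1);
console.log(`incident starts in 24 h: ${demoStarts.length}`);
```

Over a long run the number of starts should average about rate × hours, so roughly 6 for the 24-hour example. Any single run will vary around that figure.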

Quickstart (copy & run)

# Two shipments, 3 hours, 60-second interval; write NDJSON
node gpt12x-sim.js --minutes 180 --interval 60 --shipments 2 --out gpt12x.ndjson

# Publish to local MQTT (topic sim/telemetry) for 60 minutes
node gpt12x-sim.js --mqtt mqtt://localhost:1883 --topic sim/telemetry --minutes 60

# Generate CSV on the CN South → East route
node gpt12x-sim.js --route CN-SZ-SH --minutes 120 --csv gpt12x.csv

Use it to validate alerts & dashboards

Rules

  • Temperature breach (TEMP_EXCURSION): trigger immediately; confirm notification within SLA; clear when temp returns to safe range for N minutes.
  • Route deviation: compare against geofence/corridor; require M consecutive deviations to escalate.
  • Delay/stall: near-zero speed & minimal positional delta for T minutes.
  • Door: open outside authorized stops triggers alert (combine with geofence).
  • Battery: sudden drops or slope above threshold → warn; below lower bound → escalate.
  • GPS quality: mark GPS_JAMMING/drift, trigger self-check and data-quality flags.
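
Two of the rules above can be sketched as small functions over a time-ordered list of records for one device: a temperature alert that raises immediately and clears only after N minutes back in range, and a stall detector based on speed and positional drift. All names and thresholds here (the 2–8 °C band, 5 minutes to clear, 2 km/h, 50 m of drift, 15 minutes) are assumptions for illustration and should be tuned to your own rules.

```javascript
const EARTH_RADIUS_KM = 6371;

function haversineKm(a, b) {
  const toRad = (deg) => (deg * Math.PI) / 180;
  const dLat = toRad(b.lat - a.lat);
  const dLon = toRad(b.lon - a.lon);
  const h =
    Math.sin(dLat / 2) ** 2 +
    Math.cos(toRad(a.lat)) * Math.cos(toRad(b.lat)) * Math.sin(dLon / 2) ** 2;
  return 2 * EARTH_RADIUS_KM * Math.asin(Math.sqrt(h));
}

// Raise on the first out-of-range reading; clear once readings have stayed
// in range for `clearMinutes`. Null readings leave the state unchanged.
function tempAlertTransitions(records, { minC = 2, maxC = 8, clearMinutes = 5 } = {}) {
  const transitions = [];
  let active = false;
  let safeSince = null; // epoch ms of the first in-range reading after a breach
  for (const rec of records) {
    if (rec.temp_c === null) continue;
    const t = Date.parse(rec.ts);
    const inRange = rec.temp_c >= minC && rec.temp_c <= maxC;
    if (!inRange) {
      safeSince = null;
      if (!active) {
        active = true;
        transitions.push({ ts: rec.ts, type: "RAISED" });
      }
    } else if (active) {
      if (safeSince === null) safeSince = t;
      if (t - safeSince >= clearMinutes * 60000) {
        active = false;
        safeSince = null;
        transitions.push({ ts: rec.ts, type: "CLEARED" });
      }
    }
  }
  return transitions;
}

// A stall is a run of records with near-zero speed that stays within
// `maxDriftKm` of where the run began and lasts at least `minMinutes`.
// Records without a GPS fix or speed end the current run.
function detectStalls(records, { maxSpeedKph = 2, maxDriftKm = 0.05, minMinutes = 15 } = {}) {
  const stalls = [];
  let anchor = null; // first record of the candidate stall
  let last = null;   // most recent record still inside it
  const flush = () => {
    if (anchor && Date.parse(last.ts) - Date.parse(anchor.ts) >= minMinutes * 60000) {
      stalls.push({ start: anchor.ts, end: last.ts });
    }
    anchor = null;
  };
  for (const rec of records) {
    const usable = rec.speed_kph !== null && rec.lat !== null && rec.lon !== null;
    const slow = usable && rec.speed_kph <= maxSpeedKph;
    if (slow && (anchor === null || haversineKm(anchor, rec) <= maxDriftKm)) {
      if (anchor === null) anchor = rec;
      last = rec;
    } else {
      flush();
      if (slow) { anchor = rec; last = rec; } // slow but drifted: new candidate
    }
  }
  flush();
  return stalls;
}
```

Feeding both functions a generated run with injected TEMP_EXCURSION and DELAY incidents gives a quick check that each injected incident produces the expected transition or stall window.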

Visualization

  • Map polyline + incident bubbles; link with timeline brushing.
  • Time series (temp/speed/battery) with colored anomalies.
  • Incident histograms by type/time/route/device.
  • Data-quality tiles: GNSS accuracy, gaps, deviation rate, incident coverage, etc.
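
For the incident histograms, a minimal aggregation over NDJSON text could look like the sketch below. It assumes each line follows the unified data model, and the function name `incidentHistogram` is hypothetical. It counts records that carry an event flag, not distinct incidents, so an incident spanning ten samples contributes ten.

```javascript
// Count event flags by type and by (type, UTC hour) from NDJSON text.
function incidentHistogram(ndjsonText) {
  const byType = {};
  const byTypeAndHour = {};
  for (const line of ndjsonText.split(/\r?\n/)) {
    if (line.trim() === "") continue; // tolerate blank lines and a trailing newline
    const rec = JSON.parse(line);
    const hour = rec.ts.slice(0, 13); // "YYYY-MM-DDTHH" prefix of the ISO timestamp
    for (const ev of rec.events) {
      byType[ev] = (byType[ev] || 0) + 1;
      const key = `${ev}|${hour}`;
      byTypeAndHour[key] = (byTypeAndHour[key] || 0) + 1;
    }
  }
  return { byType, byTypeAndHour };
}
```

The same loop extends to route or device buckets by adding `rec.meta.route` or `rec.device_id` to the key.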

Pipelines

  • Streaming: MQTT → Flink/Spark/Kafka Streams → alert service/metrics store (Influx/TSDB/ClickHouse).
  • Batch: NDJSON/CSV → Lakehouse (Iceberg/Hudi/Delta) → BI/Notebook.

FAQ

Q: Synthetic ≠ real. How to close the gap?

A: Parameterize the generator with distributions from real devices (speed/dwell, temp drift, congestion windows, etc.).

Q: Can I mix synthetic with real?

A: Yes. Tag synthetic records (e.g., meta.synthetic=true) so they can be filtered out later, then use the mixed stream to load-test throughput/latency and to measure alert false negatives and false positives.
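
A minimal sketch of that idea follows: tag generated records, then compare the incidents you injected against the alerts your system fired. The `meta.synthetic` flag comes from the answer above. The function names and the shape of a fired alert (`device_id`, `ts`, `type`) are assumptions for this example, and matching on exact timestamps is a simplification.

```javascript
// Return a copy of the record marked as synthetic.
function tagSynthetic(rec) {
  return { ...rec, meta: { ...(rec.meta || {}), synthetic: true } };
}

// Compare injected incidents of one type against fired alerts.
// Only synthetic records are scored, because their ground truth is known.
function scoreAlerts(records, firedAlerts, type) {
  const keyOf = (x) => `${x.device_id}|${x.ts}`;
  const synthetic = records.filter((r) => r.meta && r.meta.synthetic === true);
  const syntheticKeys = new Set(synthetic.map(keyOf));
  const expected = new Set(synthetic.filter((r) => r.events.includes(type)).map(keyOf));
  const fired = new Set(
    firedAlerts
      .filter((a) => a.type === type && syntheticKeys.has(keyOf(a)))
      .map(keyOf)
  );
  let truePositives = 0;
  let falseNegatives = 0;
  let falsePositives = 0;
  for (const k of expected) {
    if (fired.has(k)) truePositives++;
    else falseNegatives++;
  }
  for (const k of fired) {
    if (!expected.has(k)) falsePositives++;
  }
  return { truePositives, falseNegatives, falsePositives };
}
```

In practice an alert usually fires some time after the triggering sample, so a real comparison would match within a time window instead of on identical timestamps.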

Q: More realistic routes?

A: Import multi-segment Polyline/GeoJSON or use road-network APIs (OSRM/Valhalla/Mapbox Directions) and add congestion models.
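
As a starting point for GeoJSON import, the sketch below converts a LineString (bare or wrapped in a Feature) into `{ lat, lon }` waypoints. GeoJSON stores coordinates as [longitude, latitude], so the order has to be swapped. The function name `waypointsFromGeoJSON` and the waypoint shape are assumptions for this example.

```javascript
// Convert a GeoJSON LineString (or a Feature wrapping one) to waypoints.
function waypointsFromGeoJSON(geojson) {
  const geometry = geojson && geojson.type === "Feature" ? geojson.geometry : geojson;
  if (!geometry || geometry.type !== "LineString") {
    throw new Error("Expected a LineString geometry or a Feature containing one");
  }
  // GeoJSON positions are [lon, lat, (optional elevation)].
  return geometry.coordinates.map(([lon, lat]) => ({ lat, lon }));
}

// Example with two approximate points (Los Angeles, Chicago).
const demoFeature = {
  type: "Feature",
  properties: { name: "demo" },
  geometry: {
    type: "LineString",
    coordinates: [[-118.25, 34.05], [-87.63, 41.88]],
  },
};
console.log(waypointsFromGeoJSON(demoFeature));
```

Responses from road-network APIs often carry route geometry in a similar form, but the exact field names differ per service, so check each API's documentation before mapping them.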

Extensions

  • More sensors: light, CO₂, tilt, vibration spectrum
  • Road-network & congestion modeling by POI/time-of-day
  • Statistical control of intensity/duration (Poisson, Exponential, Gaussian mixture)
  • Mixed fleets across routes/time-zones/holidays
  • Blend with real devices for stress tests

License & Disclaimer

  • Code and examples are provided for educational/testing use under the MIT license.
  • TLV/payload examples are demonstrative, not any vendor’s production protocol/spec.

CTA

Want the full script (or a Git/Gist link), plus ready-made routes and incident-distribution templates?

Tell me where to host it and I’ll add it to this post (or as an Appendix).
