Farhan Munir

Posted on Apr 8

Build Log: Shipping a Lean Python Telemetry Agent (CPU, Memory, Disk)

#python #monitoring #linux #devops

Build Log (April 8, 2026)

Today I implemented the first production-ready telemetry collectors for heka-insights-agent and wired them into the main polling loop.

What I built

Added an optimized CPUCollector in src/collectors/cpu.py
Added a MemoryCollector in src/collectors/memory.py
Added a DiskCollector in src/collectors/disk.py
Wired all collectors into src/main.py with a shared loop
Added environment-based poll interval support via CPU_POLL_INTERVAL_SECONDS
Added python-dotenv in requirements.txt
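
A minimal sketch of how the poll interval might be read from the environment. Only the variable name CPU_POLL_INTERVAL_SECONDS comes from the list above; the function names, the 5-second default, and the fall-back-on-bad-input behavior are assumptions for illustration, not the agent's actual code.

```python
import math
import os

# Assumed default; the post does not state the real one.
DEFAULT_POLL_INTERVAL_SECONDS = 5.0


def load_env_file():
    """Load a .env file into os.environ if python-dotenv is installed."""
    try:
        from dotenv import load_dotenv
    except ImportError:
        return False  # keep the sketch usable without the package
    return load_dotenv()


def read_poll_interval(env=None, default=DEFAULT_POLL_INTERVAL_SECONDS):
    """Return the poll interval in seconds, or `default` if unset or invalid."""
    env = os.environ if env is None else env
    raw = env.get("CPU_POLL_INTERVAL_SECONDS", "")
    try:
        value = float(raw)
    except ValueError:
        return default
    # Reject zero, negative, NaN, and infinite values.
    if not math.isfinite(value) or value <= 0:
        return default
    return value
```

In this sketch, load_env_file() would be called once at startup, before read_poll_interval(), so that values from a .env file are visible in os.environ.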

CPU collector design

I built CPU collection around a single data source: psutil.cpu_times(...) snapshots plus delta math between consecutive snapshots, instead of calling both cpu_percent and cpu_times_percent every cycle.
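
The snapshot-and-delta idea can be sketched roughly as follows. This is not the actual CPUCollector: the class name, the output keys, and the definition of "busy" as everything except idle and iowait are assumptions. The snapshot reader is injected so the example runs without psutil; in the agent it would presumably be psutil.cpu_times.

```python
from collections import namedtuple

# On Linux, guest time is already counted inside user/nice, so these fields
# are left out of the total to avoid double counting.
_ALREADY_COUNTED = ("guest", "guest_nice")


def cpu_deltas_percent(prev, curr):
    """Share of elapsed CPU time per field between two snapshots, in percent."""
    deltas = {
        # Clamp at zero in case a counter steps backwards.
        field: max(0.0, getattr(curr, field) - getattr(prev, field))
        for field in curr._fields
    }
    total = sum(d for f, d in deltas.items() if f not in _ALREADY_COUNTED)
    if total <= 0:
        return {field: 0.0 for field in deltas}
    return {field: 100.0 * d / total for field, d in deltas.items()}


class CpuDeltaSketch:
    """Hypothetical collector: one snapshot per cycle, delta against the last."""

    def __init__(self, read_times):
        self._read_times = read_times  # e.g. psutil.cpu_times
        self._prev = None

    def collect(self):
        curr = self._read_times()
        prev, self._prev = self._prev, curr
        if prev is None:
            return None  # first cycle: baseline only, nothing to report yet
        percents = cpu_deltas_percent(prev, curr)
        # "Busy" here means everything except idle and iowait (an assumption).
        idle = percents.get("idle", 0.0) + percents.get("iowait", 0.0)
        return {"usage_percent": 100.0 - idle, "times_percent": percents}


# Fake snapshots standing in for psutil.cpu_times() results.
FakeTimes = namedtuple("FakeTimes", "user system idle")
```

Because every percentage is derived from the same pair of snapshots, the per-field values and the overall usage figure stay consistent with each other within a cycle.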

Key design points:

No thread offloading (to_thread) for this workload
First cycle is a warm-up by design (it only records a baseline snapshot, since a delta needs two samples)
Supports basic and detailed output modes
Optional per-core output
Uses MonotonicTicker to keep fixed cadence without drift
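
To illustrate what "fixed cadence without drift" can mean, here is a small ticker built on time.monotonic. It is a sketch of the general technique, not the real MonotonicTicker, whose internals the post does not show. The class name, the wait() method, and the skip-missed-ticks policy are assumptions; clock and sleep are injectable only to make the behavior easy to test.

```python
import time


class TickerSketch:
    """Hypothetical fixed-cadence ticker; deadlines sit on a fixed grid."""

    def __init__(self, interval, clock=time.monotonic, sleep=time.sleep):
        if interval <= 0:
            raise ValueError("interval must be positive")
        self._interval = interval
        self._clock = clock
        self._sleep = sleep
        # Deadlines are start + k * interval, independent of work duration.
        self._next = clock() + interval

    def wait(self):
        """Block until the next deadline, then schedule the following one."""
        now = self._clock()
        if now < self._next:
            self._sleep(self._next - now)
            self._next += self._interval
        else:
            # The cycle overran: return at once and realign to the first
            # grid point after `now` instead of firing a burst of ticks.
            behind = now - self._next
            skipped = int(behind // self._interval) + 1
            self._next += skipped * self._interval
```

The key point is that each deadline is computed from the previous deadline rather than from "now plus interval", so the time spent collecting does not accumulate as drift. A loop would call the collectors and then wait() once per cycle.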

Memory collector design

Memory collection is intentionally lightweight:

One call each to psutil.virtual_memory() and psutil.swap_memory()
basic mode returns compact key fields
detailed mode returns full psutil fields
Raw byte values are preserved (server-side compute handles transformations)
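
A rough sketch of how the two output modes could be shaped. The function name, the payload keys, and the choice of "basic" fields are assumptions; the post does not list which fields the compact mode keeps. The inputs are expected to be the named tuples returned by psutil.virtual_memory() and psutil.swap_memory(), passed in so the example stays runnable without psutil.

```python
# Assumed "basic" field sets; the real MemoryCollector may pick others.
BASIC_VIRTUAL_FIELDS = ("total", "available", "used", "free")
BASIC_SWAP_FIELDS = ("total", "used", "free")


def memory_payload(vm, swap, mode="basic"):
    """Build a memory payload from two psutil-style named tuples.

    Values are passed through untouched, so byte counts stay raw bytes.
    """
    vm_all = vm._asdict()
    swap_all = swap._asdict()
    if mode == "detailed":
        # Every field the platform reports, unfiltered.
        return {"virtual": dict(vm_all), "swap": dict(swap_all)}
    if mode != "basic":
        raise ValueError(f"unknown mode: {mode!r}")
    return {
        # Skip fields a platform does not provide instead of failing.
        "virtual": {k: vm_all[k] for k in BASIC_VIRTUAL_FIELDS if k in vm_all},
        "swap": {k: swap_all[k] for k in BASIC_SWAP_FIELDS if k in swap_all},
    }
```

With psutil installed, a call would look like memory_payload(psutil.virtual_memory(), psutil.swap_memory(), mode="detailed").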

Disk collector design

For disk, I chose cumulative I/O counters (not rates) because rates are computed centrally on the server side.

Uses psutil.disk_io_counters(perdisk=True)
Returns aggregate and per-disk counters
Filters to physical devices only
Excludes partitions from per-disk payload
Added device-name cache with periodic refresh to reduce repeated filtering overhead
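
The filtering and caching described above might look something like this. Everything here is an assumption made for illustration: the names, the 60-second refresh interval, the Linux-only /sys/block heuristic for "physical" devices, and the choice to sum the aggregate over the filtered devices only. The counters argument is expected to be the dict returned by psutil.disk_io_counters(perdisk=True).

```python
import os
import time


def list_physical_devices(sys_block="/sys/block"):
    """Linux-only heuristic: whole disks backed by a device node.

    Partitions are not top-level entries in /sys/block, and virtual devices
    such as loop or ram disks normally have no "device" entry.
    """
    try:
        names = os.listdir(sys_block)
    except OSError:
        return set()
    return {
        name for name in names
        if os.path.exists(os.path.join(sys_block, name, "device"))
    }


class DeviceNameCache:
    """Hypothetical cache that re-lists device names every `ttl` seconds."""

    def __init__(self, lister=list_physical_devices, ttl=60.0,
                 clock=time.monotonic):
        self._lister = lister
        self._ttl = ttl
        self._clock = clock
        self._names = None
        self._loaded_at = 0.0

    def names(self):
        now = self._clock()
        if self._names is None or now - self._loaded_at >= self._ttl:
            self._names = frozenset(self._lister())
            self._loaded_at = now
        return self._names


def disk_payload(counters, physical):
    """Keep cumulative counters for physical devices and sum them."""
    per_disk = {
        name: c._asdict() for name, c in counters.items() if name in physical
    }
    aggregate = {}
    for fields in per_disk.values():
        for key, value in fields.items():
            aggregate[key] = aggregate.get(key, 0) + value
    return {"aggregate": aggregate, "per_disk": per_disk}
```

Each cycle would then call disk_payload(psutil.disk_io_counters(perdisk=True), cache.names()), so the filesystem is only walked when the cache expires and hot-plugged disks are picked up on the next refresh.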