Marcin Firmuga
Code Autopsy #1: How 30 Lines Turned System Monitoring Into A Conversation

Part of the PC_Workman build-in-public series. Code Autopsy drops every Wednesday.


The Problem: Numbers Without Answers

You open Task Manager.

"CPU: 87%"

Cool.

But WHY 87%?

Is that normal? Should you worry? What process caused it? When did it start?

Task Manager doesn't answer. HWMonitor doesn't answer. MSI Afterburner doesn't answer.

They show you WHAT is happening. Never WHY.

That's the gap PC_Workman fills.


PC Workman 1.6.8 - hck_GPT in action. Service Setup: quick access to disable services you don't use (Bluetooth, Print, Fax). Today Report: confirmation that session data is being collected correctly, daily usage averages, and alerts for suspected temperature or voltage spikes.

The Solution: EventDetector

After 800 hours building PC_Workman (most of it on a laptop that peaks at 94°C), I realized: users don't need more data. They need context.

So I built EventDetector.

30 lines of Python that turn monitoring into a conversation.

Here's how it works.


Step 1: Track YOUR Baseline (Not Generic Averages)

Most tools compare against hardcoded thresholds:

  • "50% CPU is normal"
  • "60% RAM is high"
  • "80°C is warm"

Problem: Your normal isn't my normal.

A gaming PC idling at 30% CPU? Normal.

A lightweight laptop idling at 30% CPU? Something's wrong.

EventDetector tracks YOUR baseline from the last 10 minutes:

def _get_baseline(self, now):
    """Get recent baseline averages from minute_stats.
    Cached for 60 seconds to avoid excessive queries.
    """
    if now - self._baseline_cache_time < 60 and self._baseline_cache:
        return self._baseline_cache  # reuse the cached baseline

    cutoff = now - SPIKE_BASELINE_WINDOW  # 10 minutes

    # conn is the detector's SQLite connection
    row = conn.execute("""
        SELECT AVG(cpu_avg) AS cpu_avg,
               AVG(ram_avg) AS ram_avg,
               AVG(gpu_avg) AS gpu_avg,
               AVG(cpu_temp) AS cpu_temp,
               AVG(gpu_temp) AS gpu_temp
        FROM minute_stats
        WHERE timestamp >= ?
    """, (cutoff,)).fetchone()

    self._baseline_cache = {
        'cpu_avg': row[0], 'ram_avg': row[1], 'gpu_avg': row[2],
        'cpu_temp': row[3], 'gpu_temp': row[4],
    }
    self._baseline_cache_time = now
    return self._baseline_cache

Key insight: The baseline is YOU. Not everyone. Just you.


PC Workman 1.6.8 - the EventDetector behind hck_GPT insights, based on long-term monitoring of CPU, GPU, and RAM. The code is shown with highlights on baseline, delta, rate limiting, and severity.

Step 2: Calculate Delta (Current vs YOUR Normal)

Once we have YOUR baseline, detecting spikes is simple math:

def _check_metric(self, now, metric_name, current_val, 
                  baseline_val, threshold, description):
    """Check if a metric exceeds its threshold above baseline"""

    delta = current_val - baseline_val

    if delta < threshold:
        return  # No spike - you're within YOUR normal range

Example:

  • Your CPU baseline (last 10 min): 42%
  • Current CPU: 87%
  • Delta: +45%
  • Threshold: 20%

Result: Spike detected. But we're not done yet.
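That check boils down to one comparison. As a standalone sketch (the function name is illustrative; the threshold constant mirrors the excerpt above):

```python
SPIKE_THRESHOLD_CPU = 20  # percentage points above YOUR baseline

def is_spike(current, baseline, threshold):
    """A spike means current usage exceeds the personal baseline by threshold."""
    return (current - baseline) >= threshold

# The example from the text: baseline 42%, current 87%
print(is_spike(87, 42, SPIKE_THRESHOLD_CPU))  # True  (delta +45 >= 20)
print(is_spike(50, 42, SPIKE_THRESHOLD_CPU))  # False (delta +8 < 20)
```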


Step 3: Rate Limiting (No Alert Spam)

Early versions of EventDetector had a problem: alert spam.

Chrome spikes CPU every 30 seconds? You'd get 120 alerts per hour.

Useless.

Solution: Rate limiting.

# Rate limiting: {metric_name: last_event_timestamp}
self._last_event_time = {}

def _check_metric(self, ...):
    # ... delta calculation ...

    # Rate limiting
    last_time = self._last_event_time.get(metric_name, 0)
    if now - last_time < SPIKE_COOLDOWN:  # 5 minutes
        return  # Too soon since last alert

    # Log the event
    self._last_event_time[metric_name] = now

Result: Max 1 alert per metric per 5 minutes. No spam.
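The same idea as a self-contained sketch (the `RateLimiter` class is my illustration; in PC_Workman the timestamp dictionary lives on the detector itself):

```python
import time

SPIKE_COOLDOWN = 300  # 5 minutes, in seconds

class RateLimiter:
    """Allow at most one event per metric per cooldown window."""

    def __init__(self):
        self._last_event_time = {}  # {metric_name: last_event_timestamp}

    def allow(self, metric_name, now=None):
        now = now if now is not None else time.time()
        last = self._last_event_time.get(metric_name, 0)
        if now - last < SPIKE_COOLDOWN:
            return False  # too soon since the last alert for this metric
        self._last_event_time[metric_name] = now
        return True

rl = RateLimiter()
print(rl.allow('cpu', now=1000))  # True  - first alert goes through
print(rl.allow('cpu', now=1100))  # False - only 100 s since the last one
print(rl.allow('cpu', now=1400))  # True  - 400 s >= 300 s cooldown
```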


Step 4: Severity Levels (Critical vs Warning vs Info)

Not all spikes are equal.

CPU spiking 21% above baseline? Worth noting.

CPU spiking 60% above baseline? Drop everything.

EventDetector categorizes:

# Determine severity
if delta >= threshold * 2:
    severity = 'critical'  # 🔴
elif delta >= threshold * 1.5:
    severity = 'warning'   # ⚠️
else:
    severity = 'info'      # ℹ️

Example thresholds:

  • CPU threshold: 20%
  • Delta 40%+: Critical
  • Delta 30%+: Warning
  • Delta 20-29%: Info

Result: Alerts match urgency.
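Plugging the example CPU threshold into those rules gives (a standalone sketch; `classify` is an illustrative helper, not PC_Workman's API):

```python
def classify(delta, threshold):
    """Map a delta above baseline to a severity level."""
    if delta >= threshold * 2:
        return 'critical'  # 2x the threshold: drop everything
    elif delta >= threshold * 1.5:
        return 'warning'
    return 'info'

# With a CPU threshold of 20 percentage points:
print(classify(45, 20))  # critical (45 >= 40)
print(classify(32, 20))  # warning  (32 >= 30)
print(classify(22, 20))  # info     (20 <= 22 < 30)
```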


The Final Output: Context, Not Just Numbers

Here's what you see in PC_Workman when a spike happens:

Before (Task Manager):

CPU: 87%

After (PC_Workman):

⚠️ CPU spike: 87% (baseline: 42%, delta: +45%)
Chrome.exe - started 3 hours ago

Same data. Different story.

One gives you anxiety. The other gives you action.
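The alert line itself is just string formatting over values the detector already has. A hypothetical sketch (`format_alert` and its exact wording are mine, not PC_Workman's):

```python
def format_alert(metric, current, baseline, process=None, started=None):
    """Build a context-rich alert line from a spike's raw numbers."""
    delta = current - baseline
    msg = f"⚠️ {metric} spike: {current}% (baseline: {baseline}%, delta: {delta:+}%)"
    if process and started:
        msg += f"\n{process} - started {started}"
    return msg

print(format_alert('CPU', 87, 42, 'Chrome.exe', '3 hours ago'))
# ⚠️ CPU spike: 87% (baseline: 42%, delta: +45%)
# Chrome.exe - started 3 hours ago
```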


PC Workman 1.6.8 - My PC, the center of actions. Stats & Alerts: long-term monitoring of component and process usage, plus time-travel temperature and voltage alerts for spikes and suspect moments. Optimization & Services: tools to improve your PC's performance. First Setup & Drivers: everything for setting up a new device or OS. Stability Tests: checks that PC Workman and its database are working correctly. Your Account Details: coming soon :)

Implementation Notes

Handles 5 Metrics With Same Logic

The beauty of this design: reusable.

Same _check_metric function handles:

  • CPU usage
  • RAM usage
  • GPU usage
  • CPU temperature
  • GPU temperature

def check_and_log_spike(self, cpu_avg, ram_avg, gpu_avg,
                        cpu_temp=None, gpu_temp=None):
    now = time.time()
    baseline = self._get_baseline(now)

    # Check each metric with same logic
    self._check_metric(now, 'cpu', cpu_avg, 
                      baseline['cpu_avg'], 
                      SPIKE_THRESHOLD_CPU, 'CPU usage')

    self._check_metric(now, 'ram', ram_avg, 
                      baseline['ram_avg'],
                      SPIKE_THRESHOLD_RAM, 'RAM usage')

    # ... and so on

Clean. Maintainable. Scalable.

Performance: Cached Baselines

Baseline queries hit SQLite. Could be slow.

Solution: 60-second cache.

if now - self._baseline_cache_time < 60 and self._baseline_cache:
    return self._baseline_cache  # Use cached data

Result: Query once per minute, not once per second.
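The caching pattern in isolation (simplified; the real detector caches the SQLite baseline query rather than a lambda):

```python
import time

class CachedBaseline:
    """Recompute an expensive value at most once per TTL seconds."""

    def __init__(self, compute, ttl=60):
        self._compute = compute
        self._ttl = ttl
        self._cache = None
        self._cache_time = 0.0
        self.queries = 0  # count how often we actually hit the source

    def get(self, now=None):
        now = now if now is not None else time.time()
        if now - self._cache_time < self._ttl and self._cache is not None:
            return self._cache  # fresh enough: skip the query
        self.queries += 1
        self._cache = self._compute()
        self._cache_time = now
        return self._cache

cb = CachedBaseline(lambda: {'cpu_avg': 42.0})
for t in range(120):        # poll once per second for two minutes
    cb.get(now=float(t))
print(cb.queries)  # 2 - one query per minute, not one per second
```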

Storage: SQLite Events Table

All events logged to database:

INSERT INTO events
(timestamp, event_type, severity, metric, value, 
 baseline, process_name, description)
VALUES (?, ?, ?, ?, ?, ?, ?, ?)

Benefits:

  • Historical tracking (what spiked last week?)
  • Pattern detection (Chrome spikes every Tuesday?)
  • Exportable data
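A minimal sketch of that table in action, with column types assumed from the INSERT statement above (the real schema may differ):

```python
import sqlite3
import time

# Schema inferred from the INSERT shown above; types are assumptions.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE events (
        timestamp REAL, event_type TEXT, severity TEXT, metric TEXT,
        value REAL, baseline REAL, process_name TEXT, description TEXT
    )
""")

conn.execute(
    "INSERT INTO events (timestamp, event_type, severity, metric, value, "
    "baseline, process_name, description) VALUES (?, ?, ?, ?, ?, ?, ?, ?)",
    (time.time(), 'spike', 'warning', 'cpu', 87.0, 42.0, 'Chrome.exe',
     'CPU usage spike'),
)

# Historical tracking: what spiked recently, and how far above baseline?
rows = conn.execute(
    "SELECT metric, severity, value, baseline FROM events "
    "WHERE severity IN ('warning', 'critical') ORDER BY timestamp DESC"
).fetchall()
print(rows)  # [('cpu', 'warning', 87.0, 42.0)]
```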

What I Learned Building This

1. Users Don't Need More Data

Early versions of PC_Workman showed 20+ metrics.

Users ignored them all.

Lesson: Context, not quantity.

2. Rate Limiting Is User Experience

First version: no rate limiting.

Result: 500 alerts per hour. Unusable.

Lesson: Silence is a feature.

3. Personalization Beats Generic Thresholds

"50% CPU is high" works for nobody.

YOUR 50% vs MY 50% = different stories.

Lesson: Baselines must be personal.

PC Workman 1.6.8 - hck_GPT Insights

The Numbers

EventDetector stats:

  • ~30 lines core logic
  • Handles 5 metrics
  • Max 1 alert per metric per 5 min
  • Baseline cached 60 sec
  • 3 severity levels

PC_Workman stats:

  • 800+ hours development
  • Built on 94°C laptop
  • v1.6.8 current (v2.0 -> Microsoft Store, Q3 2026)
  • 60+ downloads
  • 17 stars
  • Open source, MIT licensed

Try It Yourself

PC_Workman is open source.

EventDetector is in hck_stats_engine/events.py.

Download, run, break it, improve it.

GitHub: github.com/HuckleR2003/PC_Workman_HCK
The file covered in this post: PC_Workman_HCK/hck_stats_engine/events.py

Building in public. Code Autopsy every Wednesday.

Follow the journey.


Next Wednesday: Code Autopsy #2

Topic: ProcessAggregator - how PC_Workman tracks which apps eat your CPU without destroying performance.
See you Wednesday.


Questions? Comments? Roasts? I'm building in public. Feedback welcome.

About the Author

I’m Marcin Firmuga. Solo developer and founder of HCK_Labs.

I created PC Workman, an open-source, AI-powered PC resource monitor built entirely from scratch on dying hardware during warehouse shifts in the Netherlands.

This is the first time I’ve given one of my projects a real, dedicated home.

Before this: game translations, PC technician internships, warehouse operations in multiple countries, and countless failed projects I never finished.

But this one? This one stuck.
800+ hours of code. 4 complete UI rebuilds. 16,000 lines deleted.
3 AM all-nighters. Energy drinks and toast.

And finally, an app I wouldn’t close in 5 seconds.
That’s the difference between building and shipping.

PC_Workman is the result.
