DEV Community

Ertugrul


Part 1 — Data Gathering on Raspberry Pi Pico for Edge AI Trend Alarm

🌍 Motivation

Most SCADA-like monitoring systems rely on absolute thresholds (e.g., "trigger alarm if temperature > 80 °C").
But in predictive maintenance, engineers often care more about trends: if a motor, battery, or CPU is heating too quickly, they want to know before it hits the limit.

That’s exactly what we aim to do in this project:

  • Build a tiny trend-based early alarm system on Raspberry Pi Pico.
  • Train a simple logistic regression model on collected data.
  • Deploy the model back on Pico for real-time inference.

But before any ML magic can happen, we need data.


⚙️ Why Raspberry Pi Pico and its Internal Sensor?

  • RP2040 has a built-in temperature sensor connected to ADC4.
  • Accuracy is low (±2–4 °C), resolution is coarse (~0.468 °C per LSB).
  • Despite this, it is perfect for a demo:

    • No external hardware needed.
    • Demonstrates typical “noisy industrial sensor” conditions.
    • Good enough to show trends (°C per second), which is all we care about.
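
The ~0.468 °C per LSB figure follows from the RP2040 datasheet conversion formula for the on-chip sensor. Here is a minimal sketch of that arithmetic, assuming a 3.3 V ADC reference and the datasheet's typical values (0.706 V at 27 °C, 1.721 mV per °C); the function names are illustrative and not taken from the project's firmware.

```python
# Typical values from the RP2040 datasheet; real chips vary, hence the low accuracy.
ADC_VREF = 3.3            # assumed ADC reference voltage in volts
ADC_COUNTS = 4096         # 12-bit ADC -> 2**12 codes
V_AT_27C = 0.706          # typical sensor voltage at 27 °C
SLOPE_V_PER_C = 0.001721  # sensor voltage falls by ~1.721 mV per °C


def raw_to_celsius(raw: int) -> float:
    """Convert a raw 12-bit ADC4 reading to degrees Celsius."""
    volts = raw * ADC_VREF / ADC_COUNTS
    return 27.0 - (volts - V_AT_27C) / SLOPE_V_PER_C


def celsius_per_lsb() -> float:
    """Temperature change represented by one ADC count."""
    return (ADC_VREF / ADC_COUNTS) / SLOPE_V_PER_C


print(round(celsius_per_lsb(), 3))  # 0.468
```

One ADC count is about 0.8 mV, and the sensor only moves 1.721 mV per °C, so a single count already spans almost half a degree. That is why the raw readings look like a staircase.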

🛠️ System Setup

We split the setup into two roles:

  1. Pico firmware (C++)
  • Initializes ADC4 and the internal temperature sensor.
  • Prints data in CSV format over USB Serial.
  • Supports command-controlled phase switching: PC can send L0 for idle or L1 for load.
  • Uses Utils::heavy_work_ms() to generate CPU heat.
  • Default sampling ~1 Hz (either via sleep_ms(1000) or ~1 s compute load).
  2. PC-side logger (Python)
  • Opens serial port and writes rows into pico_log_20min.csv.
  • Automatically inserts a phase transition (after 10 minutes, switches to load by sending L1).
  • Later analysis (e.g. in plot_data.py) computes slope using a rolling OLS window.

Example dataset:

uptime_ms,temp_c,load
9504,24.798,1
10507,24.798,1
11511,24.798,1
...


🔍 Challenges in Data Gathering

1. Sensor Sensitivity

  • Problem: temperature barely moves under small workloads.
  • Fix: increase load intensity (heavy_work_ms(1000, 50000)) → measurable heating.

2. Quantization Noise

  • Problem: ADC jumps in 0.468 °C steps.
  • Fix: implemented oversampling and averaging in read_temp_oversampled().
  • Further smoothing via exponential moving average on the PC side.
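
The PC-side exponential moving average mentioned above can be sketched as follows. This is an illustration only: the smoothing factor `alpha` and the function name `ema` are assumptions, since the post does not state the value or the code actually used.

```python
def ema(values, alpha=0.1):
    """Exponential moving average: each output blends the new sample with the previous output.

    alpha close to 0 -> heavy smoothing, slow response
    alpha close to 1 -> light smoothing, fast response
    """
    if not 0.0 < alpha <= 1.0:
        raise ValueError("alpha must be in (0, 1]")
    smoothed = []
    state = None
    for v in values:
        # Seed the filter with the first sample, then update recursively.
        state = v if state is None else alpha * v + (1.0 - alpha) * state
        smoothed.append(state)
    return smoothed


# A reading that hops between two adjacent ADC steps (one LSB apart).
noisy = [24.798, 25.266, 24.798, 25.266, 24.798, 25.266]
print([round(x, 3) for x in ema(noisy, alpha=0.2)])
```

A smaller `alpha` hides more of the 0.468 °C staircase but also delays the response to a real temperature rise, so the value is a trade-off worth tuning against the logged data.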

3. Timing Control

  • Target: 1 Hz logs.
  • Issue: Load phase consumes CPU for ~1 s.
  • Fix: Balanced sleep_ms in idle phase and compute load in active phase to maintain near-constant sampling rate.
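
One way to confirm that the rate really stays near 1 Hz is to look at the gaps between consecutive `uptime_ms` values in the log. The sketch below uses the three rows from the example dataset; the helper names are hypothetical and not part of the project.

```python
def sample_intervals_ms(uptimes_ms):
    """Gaps between consecutive samples, in milliseconds."""
    return [b - a for a, b in zip(uptimes_ms, uptimes_ms[1:])]


def mean_rate_hz(uptimes_ms):
    """Average sampling rate over the whole series."""
    intervals = sample_intervals_ms(uptimes_ms)
    if not intervals:
        raise ValueError("need at least two samples")
    return 1000.0 * len(intervals) / sum(intervals)


uptimes = [9504, 10507, 11511]  # uptime_ms column from the example dataset
print(sample_intervals_ms(uptimes))     # [1003, 1004]
print(round(mean_rate_hz(uptimes), 3))  # 0.997
```

The few extra milliseconds per sample come from the work done besides the 1 s wait (reading the ADC, printing). Because the slope is fitted against the recorded timestamps rather than the sample index, this small drift does not bias the feature.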

4. CSV Consistency

  • Observation: firmware originally printed header with 2 columns, but data had 3 (uptime_ms,temp_c,load).
  • Fix: corrected header line for downstream compatibility.
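
A mismatch like this is easy to catch automatically before analysis. Below is a small sketch that compares every row's column count against the header; the function name and the two-column header string are assumptions made for illustration, since the post does not show the original header.

```python
import csv
import io


def find_inconsistent_rows(csv_text):
    """Return (header, line_numbers) for rows whose width differs from the header."""
    reader = csv.reader(io.StringIO(csv_text))
    header = next(reader, None)
    if header is None:
        raise ValueError("empty CSV")
    # Line 1 is the header, so data rows start at line 2.
    bad = [n for n, row in enumerate(reader, start=2) if len(row) != len(header)]
    return header, bad


fixed = "uptime_ms,temp_c,load\n9504,24.798,1\n10507,24.798,1\n"
broken = "uptime_ms,temp_c\n9504,24.798,1\n10507,24.798,1\n"  # hypothetical old header

print(find_inconsistent_rows(fixed)[1])   # []
print(find_inconsistent_rows(broken)[1])  # [2, 3]
```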



📜 Firmware Snippet (C++)

// Init ADC
adc_init();
adc_set_temp_sensor_enabled(true);
adc_select_input(4);

// Print header (fixed)
printf("uptime_ms,temp_c,load\n");

// In loop
float T = read_temp_oversampled(128, 1500);
uint32_t ms = to_ms_since_boot(get_absolute_time());
// uint32_t is unsigned long on the Pico toolchain, so cast and print with %lu
printf("%lu,%.3f,%d\n", (unsigned long)ms, T, (int)load_flag);

// Load vs idle
if (load_flag) {
    Utils::heavy_work_ms(1000); // stress core ~1s
} else {
    sleep_ms(1000); // idle ~1s
}



🐍 Logger Snippet (Python)

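import csv
import time

import serial  # pyserial

PORT = "COM6"  # adjust for your system
BAUD = 115200
HEADER = ["uptime_ms", "temp_c", "load"]

with serial.Serial(PORT, BAUD, timeout=1) as ser, \
        open("pico_log_20min.csv", "w", newline="") as f:
    writer = csv.writer(f)
    writer.writerow(HEADER)
    start = time.time()
    load_sent = False
    while True:
        # switch to load once, after 10 min
        if not load_sent and time.time() - start > 600:
            ser.write(b"L1\n")
            load_sent = True
        line = ser.readline().decode(errors="ignore").strip()
        if not line:
            continue
        row = line.split(",")
        # skip the firmware's own header line and any malformed rows
        if row == HEADER or len(row) != len(HEADER):
            continue
        writer.writerow(row)
        f.flush()  # keep the file usable if the logger is interrupted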

📊 Example Analysis

Using plot_data.py we:

  • Apply a 120-second sliding window.
  • Fit an OLS regression for temperature vs. time.
  • Extract the slope (°C/s) as the main feature.

Results:

  • Idle phase: slope ≈ 0.000 °C/s (flat).
  • Load phase: slope ≈ 0.02–0.03 °C/s (rising).

This clean separation is exactly what makes a simple linear model feasible.
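
The sliding-window slope described above can be sketched in plain Python. This is not the actual plot_data.py, whose code is not shown in this post; the function names and the choice to return None until a full window of history exists are assumptions.

```python
def ols_slope(times_s, temps_c):
    """Least-squares slope of temperature vs. time, in °C per second."""
    n = len(times_s)
    if n < 2:
        raise ValueError("need at least two points")
    mean_t = sum(times_s) / n
    mean_y = sum(temps_c) / n
    sxx = sum((t - mean_t) ** 2 for t in times_s)
    if sxx == 0:
        raise ValueError("all timestamps are identical")
    sxy = sum((t - mean_t) * (y - mean_y) for t, y in zip(times_s, temps_c))
    return sxy / sxx


def rolling_slopes(uptime_ms, temps_c, window_s=120.0):
    """Slope over the trailing window_s seconds for each sample (None until enough history)."""
    times_s = [ms / 1000.0 for ms in uptime_ms]
    slopes = []
    start = 0
    for end, t_end in enumerate(times_s):
        # Drop samples that are older than window_s relative to the newest one.
        while t_end - times_s[start] > window_s:
            start += 1
        if t_end - times_s[0] < window_s or end == start:
            slopes.append(None)
        else:
            slopes.append(ols_slope(times_s[start:end + 1], temps_c[start:end + 1]))
    return slopes


# Synthetic check: a steady 0.025 °C/s ramp sampled at 1 Hz for 5 minutes.
ms = [i * 1000 for i in range(300)]
ramp = [25.0 + 0.025 * i for i in range(300)]
print(round(rolling_slopes(ms, ramp)[-1], 4))  # 0.025
```

Fitting against the recorded `uptime_ms` values rather than the row index keeps the slope in °C/s even when the sampling interval drifts slightly from 1 s.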



🚀 Why This Matters

This dataset is the foundation of our trend-alarm system:

  • By splitting into idle vs load, we obtain two labeled classes.
  • By focusing on slope (°C/s), we turn a noisy, low-resolution sensor into a useful predictive feature.
  • With these features, even a simple logistic regression can separate “normal” vs “heating” behavior.

In Part 2, we will:

  • Train a Logistic Regression model on extracted slopes.
  • Export model weights into C++ (model_params.hpp).
  • Run real-time slope estimation and early warning directly on Pico.
