DEV Community

mikebains41-debug

Title: I Built a Production GPU Energy Optimizer in One Day — From My Phone

Not from a MacBook. Not from a cloud VM. From my Android phone,
using Termux.

Here's what shipped by end of day:

  • Real-time GPU energy dashboard
  • DESYNC & GHOST power anomaly detection
  • Support for 17 cloud providers
  • Per-user API keys
  • Time-series metrics scaling to 100+ GPUs
  • 18/18 smoke tests passing
  • 60-second Docker install

The Problem

GPU providers lie. Not intentionally — but telemetry desync is real.

Two failure modes kill your energy budget:

DESYNC — GPU drawing 420W but reporting 8% utilization.
You're paying full price for a GPU doing nothing useful.

GHOST power — GPU reporting 98% utilization at 40W draw.
Physically impossible. Your scheduler is making decisions on
fake data.
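
To make the two failure modes concrete, here is a minimal sketch of how a check like this could look. The thresholds and the names `GpuSample` and `classify` are assumptions picked to match the examples above, not the values or identifiers the project actually uses.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class GpuSample:
    """One telemetry reading for a single GPU."""
    power_watts: float      # reported board power draw
    utilization_pct: float  # reported utilization, 0-100


# Hypothetical thresholds, chosen only to fit the examples in this post.
HIGH_POWER_WATTS = 300.0
LOW_POWER_WATTS = 60.0
LOW_UTIL_PCT = 15.0
HIGH_UTIL_PCT = 90.0


def classify(sample: GpuSample) -> Optional[str]:
    """Return "DESYNC", "GHOST", or None when power and utilization agree."""
    if (sample.power_watts >= HIGH_POWER_WATTS
            and sample.utilization_pct <= LOW_UTIL_PCT):
        return "DESYNC"  # high draw, little reported work
    if (sample.utilization_pct >= HIGH_UTIL_PCT
            and sample.power_watts <= LOW_POWER_WATTS):
        return "GHOST"  # high reported work, implausibly low draw
    return None


print(classify(GpuSample(power_watts=420, utilization_pct=8)))   # DESYNC
print(classify(GpuSample(power_watts=40, utilization_pct=98)))   # GHOST
print(classify(GpuSample(power_watts=350, utilization_pct=95)))  # None
```

In practice the thresholds would likely depend on the GPU model, since idle and peak power differ a lot between cards.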

We found both in the wild across AWS and Vast.ai during testing.

The Solution

An open validation stack that:

  1. Detects DESYNC and GHOST anomalies automatically
  2. Works across 17 GPU cloud providers
  3. Evicts bad workloads via Kubernetes or Run:ai
  4. Alerts via Slack
  5. Stores time-series data for 100+ GPUs
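
As an illustration of the alerting step, a Slack incoming webhook accepts a JSON body with a `text` field. The sketch below only builds the request and does not send it. The function name `build_slack_alert`, the message format, and the placeholder URL are assumptions, not the project's actual code.

```python
import json
import urllib.request


def build_slack_alert(webhook_url: str, gpu_id: str, anomaly: str,
                      power_watts: float,
                      utilization_pct: float) -> urllib.request.Request:
    """Build (but do not send) a POST request for a Slack incoming webhook."""
    text = (f"{anomaly} on {gpu_id}: {power_watts:.0f} W at "
            f"{utilization_pct:.0f}% reported utilization")
    body = json.dumps({"text": text}).encode("utf-8")
    return urllib.request.Request(
        webhook_url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


# Placeholder URL; a real webhook URL comes from your Slack workspace.
req = build_slack_alert(
    "https://hooks.slack.com/services/EXAMPLE", "gpu-0", "DESYNC", 420, 8)
print(req.data.decode("utf-8"))

# To actually send the alert:
# urllib.request.urlopen(req, timeout=5)
```

Keeping the request builder separate from the network call makes the message format easy to test without touching Slack.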

What We Built

Components:

  • CEI Formal Specification
  • Grafana Dashboard
  • GPU Agent Script
  • Per-user API Keys
  • Time-series DB
  • 17-Provider Validator
  • Smoke Test 18/18
  • Docker one-liner
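
For a rough idea of what the time-series piece involves, here is a small sketch that uses SQLite from the Python standard library as a stand-in. The post does not say which database the project uses, so the schema, the table name `gpu_metrics`, and the helpers `record` and `average_power` are all assumptions.

```python
import sqlite3

# In-memory database as a stand-in for the real time-series store.
conn = sqlite3.connect(":memory:")
conn.execute("""
    CREATE TABLE gpu_metrics (
        ts              INTEGER NOT NULL,  -- seconds since start of capture
        gpu_id          TEXT    NOT NULL,
        power_watts     REAL    NOT NULL,
        utilization_pct REAL    NOT NULL,
        PRIMARY KEY (gpu_id, ts)
    )
""")


def record(ts, gpu_id, power_watts, utilization_pct):
    """Store one reading for one GPU."""
    conn.execute(
        "INSERT INTO gpu_metrics VALUES (?, ?, ?, ?)",
        (ts, gpu_id, power_watts, utilization_pct),
    )


def average_power(gpu_id, since_ts):
    """Average power for one GPU from since_ts onward, or None if no rows."""
    row = conn.execute(
        "SELECT AVG(power_watts) FROM gpu_metrics "
        "WHERE gpu_id = ? AND ts >= ?",
        (gpu_id, since_ts),
    ).fetchone()
    return row[0]


# Simulate 120 GPUs with three readings each.
for gpu in range(120):
    for ts in (0, 60, 120):
        record(ts, f"gpu-{gpu}", 300.0 + gpu, 50.0)

print(average_power("gpu-7", since_ts=60))
```

The composite primary key on `(gpu_id, ts)` keeps per-GPU range queries cheap, which is the access pattern a dashboard tends to need.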

Why Termux

No laptop. No cloud IDE. Just an Android phone with Termux.

This matters because it proves the stack is lightweight enough
to run anywhere. If it builds and runs on a phone, it runs on
any bare metal server, VPS, or edge node.

60-Second Install


```bash
# Install Docker (skip if already installed)
curl -fsSL https://get.docker.com | sh

# Clone and run
git clone https://github.com/mikebains41-debug/ai-gpu-energy-optimizer-
cd ai-gpu-energy-optimizer-
docker-compose up
```