I Built a Production GPU Energy Optimizer in One Day — From My Phone
Not from a MacBook. Not from a cloud VM. From my Android phone,
using Termux.
Here's what shipped by end of day:
- Real-time GPU energy dashboard
- DESYNC & GHOST power anomaly detection
- Support for 17 cloud providers
- Per-user API keys
- Time-series metrics scaling to 100+ GPUs
- 18/18 smoke tests passing
- 60-second Docker install
The Problem
GPU providers lie. Not intentionally — but telemetry desync is real.
Two failure modes kill your energy budget:
DESYNC — GPU drawing 420W but reporting 8% utilization.
You're paying full price for a GPU doing nothing useful.
GHOST power — GPU reporting 98% utilization at 40W draw.
Physically impossible. Your scheduler is making decisions on
fake data.
We found both in the wild across AWS and Vast.ai during testing.
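The two failure modes above boil down to a simple consistency check between reported power draw and reported utilization. Here is a minimal sketch of such a classifier; the function name and thresholds are illustrative assumptions, not the project's actual CEI spec values:

```python
def classify_reading(power_w: float, util_pct: float) -> str:
    """Classify one (power, utilization) telemetry sample.

    Thresholds below are illustrative assumptions only.
    """
    # DESYNC: heavy power draw while telemetry claims the GPU is idle.
    if power_w > 300.0 and util_pct < 15.0:
        return "DESYNC"
    # GHOST: high reported utilization at a physically implausible draw.
    if util_pct > 90.0 and power_w < 80.0:
        return "GHOST"
    return "OK"
```

Run against the two examples from above: a 420W / 8% sample classifies as DESYNC, and a 40W / 98% sample classifies as GHOST.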
The Solution
An open validation stack that:
- Detects DESYNC and GHOST anomalies automatically
- Works across 17 GPU cloud providers
- Evicts bad workloads via Kubernetes or Run:ai
- Alerts via Slack
- Stores time-series data for 100+ GPUs
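For the Slack alerting path, the agent only needs to POST a JSON payload to an incoming-webhook URL. A minimal sketch of building that payload; the message format here is an assumption, not the stack's actual alert schema:

```python
import json

def build_slack_alert(gpu_id: str, anomaly: str,
                      power_w: float, util_pct: float) -> str:
    """Build a Slack incoming-webhook payload for one anomaly.

    Uses Slack's simple {"text": ...} payload shape; the message
    wording is illustrative.
    """
    text = (f":warning: {anomaly} on {gpu_id}: "
            f"{power_w:.0f}W at {util_pct:.0f}% utilization")
    return json.dumps({"text": text})
```

The resulting string can be POSTed to the webhook URL with any HTTP client.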
What We Built
| Component | Status |
|---|---|
| CEI Formal Specification | ✅ |
| Grafana Dashboard | ✅ |
| GPU Agent Script | ✅ |
| Per-user API Keys | ✅ |
| Time-series DB | ✅ |
| 17-Provider Validator | ✅ |
| Smoke Tests (18/18 passing) | ✅ |
| Docker one-liner | ✅ |
Why Termux
No laptop. No cloud IDE. Just an Android phone with Termux.
This matters because it proves the stack is lightweight enough
to run anywhere. If it builds and runs on a phone, it runs on
any bare metal server, VPS, or edge node.
60-Second Install
```bash
# Install Docker (skip if already installed)
curl -fsSL https://get.docker.com | sh

# Clone and run
git clone https://github.com/mikebains41-debug/ai-gpu-energy-optimizer-
cd ai-gpu-energy-optimizer-
docker-compose up
```