I found 146W of "ghost power" on NVIDIA A100 at 0% utilization. Here's my API to detect it

#api #infrastructure #monitoring #showdev

I ran 35 controlled energy tests on NVIDIA A100 and H100 GPUs on RunPod.

Standard monitoring tools missed something critical.

When nvidia-smi reports 0% GPU utilization, data center operators assume the GPU is idle. Billing stops. Cost tracking stops.

The power draw does not stop.

On A100 SXM, I consistently measured 67–146W while utilization reported 0%. I call it ghost power.
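A minimal sketch of how to check for this on your own machine, assuming `nvidia-smi` is on the PATH and reports numeric values for both fields. It samples utilization and power draw together and keeps the power readings taken while utilization was 0%. The helper names are hypothetical and this is not the method behind the API described in this post.

```python
import subprocess

# Real nvidia-smi query fields; "nounits" strips the "%" and "W" suffixes.
QUERY = [
    "nvidia-smi",
    "--query-gpu=utilization.gpu,power.draw",
    "--format=csv,noheader,nounits",
]

def parse_sample(line):
    """Parse one CSV line such as '0, 67.10' into (utilization_pct, power_w).

    Assumes both fields are numeric; some GPUs report '[N/A]' for power,
    which this sketch does not handle.
    """
    util_str, power_str = (field.strip() for field in line.split(","))
    return float(util_str), float(power_str)

def read_samples():
    """Take one reading per GPU by shelling out to nvidia-smi."""
    out = subprocess.run(QUERY, capture_output=True, text=True, check=True).stdout
    return [parse_sample(line) for line in out.splitlines() if line.strip()]

def ghost_power_readings(samples):
    """Return the power draw (W) of every sample taken at 0% utilization."""
    return [power for util, power in samples if util == 0]
```

Calling `read_samples()` in a loop (for example once per second) and passing the collected samples to `ghost_power_readings` gives the range of power drawn while the GPU claims to be idle.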

Key findings

A100 idle floor: 67.1W, and it never drops lower
Peak ghost power: 146.7W at 0% utilization, sustained for 11 minutes
H100 SXM: zero ghost power detected across 11 tests
FP16 draws 60% more power than FP32 at the same matrix size
Power capping is blocked at the hypervisor level on RunPod, so tenants cannot fix this themselves
$58.70 per GPU per year in pure idle waste
At 1 million GPUs globally, that is about $58M/year in invisible waste
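The cost figures follow from simple arithmetic on the idle floor. The sketch below assumes an electricity price of $0.10 per kWh, which is not stated above. At that rate the 67.1W floor works out to roughly $58.78 per GPU per year, close to the quoted figure.

```python
HOURS_PER_YEAR = 24 * 365  # 8760, ignoring leap years

def annual_idle_cost_usd(idle_watts, usd_per_kwh):
    """Yearly cost of a constant power draw at a flat electricity price."""
    kwh_per_year = idle_watts * HOURS_PER_YEAR / 1000
    return kwh_per_year * usd_per_kwh

# ASSUMPTION: $0.10/kWh is an illustrative rate, not a number from the tests.
ASSUMED_USD_PER_KWH = 0.10

per_gpu = annual_idle_cost_usd(67.1, ASSUMED_USD_PER_KWH)
fleet = per_gpu * 1_000_000  # the 1 million GPU scenario

print(f"per GPU: ${per_gpu:.2f}/year")
print(f"1M GPUs: ${fleet / 1e6:.1f}M/year")
```

Swapping in your own rate or a different idle wattage shows how sensitive the total is to each input.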

Base API URL: https://ai-gpu-brain-v3.onrender.com

Interactive Swagger docs (try it in your browser):

👉 https://ai-gpu-brain-v3.onrender.com/docs


```bash
curl https://ai-gpu-brain-v3.onrender.com/detect/a100/13
```
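The same request can be made from Python with only the standard library. This is a rough sketch: the response schema is not described here, so the body is returned as raw text, and the parameter names `gpu` and `value` are placeholders because the meaning of the last path segment is not documented in this post.

```python
import urllib.request

BASE_URL = "https://ai-gpu-brain-v3.onrender.com"

def detect_url(gpu, value):
    """Build the /detect URL in the same shape as the curl example."""
    return f"{BASE_URL}/detect/{gpu}/{value}"

def fetch_detect(gpu, value, timeout=60.0):
    """GET the endpoint and return the raw response body as text.

    The response format is not assumed; inspect the text (or the Swagger
    docs) before trying to parse it.
    """
    with urllib.request.urlopen(detect_url(gpu, value), timeout=timeout) as resp:
        return resp.read().decode("utf-8")
```

For example, `print(fetch_detect("a100", 13))` issues the same request as the curl command.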