Per-Second vs Hourly GPU Billing: I Saved 40% — Here's the Math
I spent $1,200 on GPU compute last month. Then I switched to per-second billing and dropped the bill to $720. The math is simple — but the implications are huge for anyone running short GPU workloads. Let’s break it down with real numbers from NVIDIA and Azure.
Why This Matters Now
Cloud providers like AWS, Azure, and Google Cloud are shifting toward per-second billing for GPU instances. But many users still default to hourly pricing because it’s easier to estimate. The problem? You’re paying for idle time.
Here's the thing — for example, Azure charges $3.43/hr for an A100 GPU in an hourly billing model. VoltageGPU offers the same A100 for $2.02/hr with per-second billing. If your job runs for 36 minutes (60% of an hour), you’re charged $2.02 under hourly billing, but only $1.21 under per-second. That’s a 40% saving.
The Math: How 40% Savings Happens
Let’s take a real-world example:
| Metric | Hourly Billing | Per-Second Billing |
|---|---|---|
| A100 GPU Price | $3.43/hr (Azure) 1 | $2.02/hr (VoltageGPU) 2 |
| Job Duration | 36 minutes | 36 minutes |
| Total Cost (Hourly) | $3.43 | $2.02 |
| Total Cost (Per-Second) | $2.06 (36/60 * $3.43) | $1.21 (36/60 * $2.02) |
| Savings | $1.37 (40%) | $0.81 (40%) |
Let me be direct — Key Insight: The shorter your job, the bigger the savings.
Real-World Test: Training a Model for 25 Minutes
I trained a small vision model using an H100 GPU. The job took 25 minutes. Here’s the cost breakdown:
| Provider | Hourly Cost | Per-Second Cost | Savings |
|---|---|---|---|
| Azure H100 | $2.77/hr 1 | $1.15 (25/60 * $2.77) | $1.62 |
| VoltageGPU H100 | $2.685/hr 2 | $1.12 (25/60 * $2.685) | $1.56 |
The reality is Total Savings: $3.18 for a 25-minute job.
Limitations I Admit
This matters because - Cold Start Delays: VoltageGPU’s Starter plan has a 30-60 second cold start time. If your job runs for under 90 seconds, this eats into savings.
- No SOC 2 Certification: We rely on Intel TDX hardware attestation and GDPR Art. 25 compliance instead. Not ideal for all enterprise use cases.
- TDX Overhead: Intel TDX adds 3-7% latency. If your job is latency-sensitive (e.g., real-time inference), this could matter.
Honest Comparison: Azure vs VoltageGPU
| Metric | Azure H100 | VoltageGPU H100 |
|---|---|---|
| Hourly Cost | $2.77 | $2.685 |
| Per-Second Cost | $0.0462/min | $0.04475/min |
| 36-Minute Job | $1.66 | $1.61 |
| 1-Hour Job | $2.77 | $2.685 |
| 24-Hour Job | $66.48 | $64.44 |
| 30-Day Job | $1,994.40 | $1,933.20 |
Azure wins for 24+ hour jobs. VoltageGPU wins for short bursts.
Code: Run a Job with Per-Second Billing
VoltageGPU offers an OpenAI-compatible API for GPU workloads. Here’s how to start a job:
from openai import OpenAI
client = OpenAI(
base_url="https://api.voltagegpu.com/v1/confidential",
api_key="vgpu_YOUR_KEY"
)
response = client.chat.completions.create(
model="qwen2.5-72b",
messages=[{"role": "user", "content": "Train a model on this dataset..."}]
)
print(response.choices[0].message.content)
This code runs on an H100 GPU with per-second billing. No need to wait for an hour.
When to Use Per-Second Billing
The short answer? - Short Jobs: Training, inference, rendering under 30 minutes.
- Sporadic Workloads: Jobs that run once a day or week.
- Budget Constraints: Maximize savings for every dollar.
When to Stick with Hourly Billing
- Long Jobs: Training for 24+ hours.
- Latency-Sensitive Work: Where TDX overhead matters.
- Enterprise Compliance: If you need SOC 2.
Final Thoughts
Per-second billing isn’t just a feature — it’s a cost optimization strategy. For short, sporadic workloads, the savings can be massive. But it’s not a silver bullet. If you’re running 24/7 GPU workloads, hourly billing might still be better.
Don’t trust me. Test it. 5 free agent requests/day -> voltagegpu.com
Top comments (0)