Most AI infrastructure decisions get made on hourly GPU rates. That's the wrong input variable.
Where your data lives determines what your AI costs. A 50TB dataset sitting in S3 doesn't move to CoreWeave for free — and the cost of moving it can exceed the compute savings before you've run a single training job.
We built the AI Gravity & Placement Engine to make that friction calculable before the architecture is committed.
What It Does
The engine calculates Token TCO, the total cost of ownership per 1M tokens, for running Llama 3 70B at BF16 precision across six infrastructure tiers:
- AWS (p5.48xlarge — 8x H100)
- GCP (A3-High — 8x H100)
- CoreWeave HGX (bare-metal InfiniBand)
- Lambda H100
- Nutanix AHV (H100, 36-mo CapEx amortized)
- Cisco UCS M7 (H100, 36-mo CapEx amortized)
All providers are normalized to cost-per-GPU-hour at the 8-GPU BF16 configuration. On-prem providers use 36-month CapEx amortization plus a configurable OpEx Adder (default 20%) for power, cooling, and maintenance.
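The on-prem normalization can be sketched as straight-line amortization with a percentage uplift. This is a minimal illustration, not the engine's code: the 730-hour month, the multiplicative treatment of the OpEx Adder, the function name, and the $400,000 node price are all assumptions made for the example.

```python
HOURS_PER_MONTH = 730  # assumption: average month length (8760 hours / 12)


def amortized_gpu_hour_rate(capex_usd, gpus=8, months=36, opex_adder=0.20):
    """Return $/GPU-hour for owned hardware.

    CapEx is spread evenly over every GPU-hour in the amortization window,
    then uplifted by the OpEx Adder (power, cooling, maintenance).
    """
    if capex_usd < 0 or gpus <= 0 or months <= 0 or opex_adder < 0:
        raise ValueError("inputs must be non-negative, with gpus and months > 0")
    base_rate = capex_usd / (months * HOURS_PER_MONTH * gpus)
    return base_rate * (1 + opex_adder)


# Hypothetical price for an 8x H100 node, used only to exercise the function.
example_capex = 400_000
rate_default = amortized_gpu_hour_rate(example_capex)
rate_older_site = amortized_gpu_hour_rate(example_capex, opex_adder=0.35)

print(f"20% OpEx Adder: ${rate_default:.2f}/GPU-hr")
print(f"35% OpEx Adder: ${rate_older_site:.2f}/GPU-hr")
```

Because the rate is computed over every hour in the window, it is a fully utilized rate: idle hours are still paid for, which is why duty cycle matters when comparing it with cloud pricing.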
Why BF16 — Not INT4
BF16 requires roughly 140GB of VRAM for the Llama 3 70B model weights alone (about 70 billion parameters at 2 bytes each). That forces a multi-GPU configuration on every provider and reveals which platforms have the high-speed interconnects (InfiniBand, NVLink, or equivalent) needed to bridge those GPUs without introducing latency penalties.
An INT4-quantized copy of the same model needs only about 35GB for weights and fits on a single 48GB GPU. BF16 tells you what the architecture actually costs at production fidelity, and which providers can handle it without fabric limitations.
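A back-of-envelope version of that sizing argument, counting weights only. The helper names are illustrative, the parameter count is rounded to 70 billion, and published figures for the weights vary slightly with the exact count and runtime overhead; KV cache and activations add to the total in practice.

```python
import math

# Bytes stored per parameter at each precision.
BYTES_PER_PARAM = {"bf16": 2.0, "int4": 0.5}


def weight_footprint_gb(params_billion, precision):
    """Approximate weight size in GB: billions of params x bytes per param."""
    return params_billion * BYTES_PER_PARAM[precision]


def min_gpus_for_weights(params_billion, precision, gpu_vram_gb):
    """Smallest GPU count whose combined VRAM holds the weights alone."""
    return math.ceil(weight_footprint_gb(params_billion, precision) / gpu_vram_gb)


bf16_gb = weight_footprint_gb(70, "bf16")
int4_gb = weight_footprint_gb(70, "int4")
bf16_gpus = min_gpus_for_weights(70, "bf16", gpu_vram_gb=80)  # 80GB H100
int4_gpus = min_gpus_for_weights(70, "int4", gpu_vram_gb=48)

print(f"BF16: {bf16_gb:.0f} GB of weights, at least {bf16_gpus} x 80GB GPUs")
print(f"INT4: {int4_gb:.0f} GB of weights, at least {int4_gpus} x 48GB GPU")
```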
The Data Gravity Score
This is the differentiator. The Gravity Score (G) measures egress cost as a fraction of monthly compute cost:
G = (Dataset Size in GB × Egress Rate) ÷ Monthly Compute Cost
- G > 0.5: Egress exceeds 50% of compute cost. The data is too heavy to move economically. Verdict: Stay Put or Full Repatriation.
- G < 0.1: Data is effectively weightless. Cheapest compute wins. Verdict: Hybrid Burst.
- Between 0.1 and 0.5: The architectural decision space — where provider selection actually matters.
At 50TB with AWS egress at $0.09/GB, the Gravity Score against AWS compute lands around 0.196 (19.6%). GCP's higher egress rate ($0.12/GB) pushes its score to 0.342 (34.2%) on the same dataset. Both sit in the middle band. CoreWeave's near-zero egress ($0.01/GB) drops its score to 0.014 (1.4%), making the data effectively weightless even though CoreWeave has the highest per-GPU-hour rate in the table.
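The scores above can be reproduced with a few lines. This is a sketch rather than the engine's implementation: it assumes 1TB = 1,000GB, a 730-hour month, and monthly compute cost taken as the per-GPU-hour rate times 8 GPUs running the full month. The rates come from the provider table in this post; the function and band names are illustrative.

```python
HOURS_PER_MONTH = 730  # assumption: average month length
GPUS = 8               # the 8-GPU BF16 configuration used for normalization
GB_PER_TB = 1000       # assumption: decimal terabytes

# (unit rate in $/GPU-hr, egress in $/GB), taken from the provider table.
PROVIDERS = {
    "AWS": (3.93, 0.09),
    "GCP": (3.00, 0.12),
    "CoreWeave": (6.16, 0.01),
}


def gravity_score(dataset_tb, egress_per_gb, rate_per_gpu_hr):
    """G = (dataset size in GB x egress rate) / monthly compute cost."""
    egress_cost = dataset_tb * GB_PER_TB * egress_per_gb
    monthly_compute = rate_per_gpu_hr * GPUS * HOURS_PER_MONTH
    return egress_cost / monthly_compute


def gravity_band(g):
    """Map a score onto the three bands described in the list above."""
    if g > 0.5:
        return "too heavy to move"
    if g < 0.1:
        return "effectively weightless"
    return "decision space"


scores = {
    name: gravity_score(50, egress, rate)
    for name, (rate, egress) in PROVIDERS.items()
}
for name, g in scores.items():
    print(f"{name}: G = {g:.3f} ({gravity_band(g)})")
# AWS: G = 0.196 (decision space)
# GCP: G = 0.342 (decision space)
# CoreWeave: G = 0.014 (effectively weightless)
```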
Provider Table (April 2026, Normalized)
| Provider | Unit Rate ($/GPU-hr) | Egress/GB | Note |
|---|---|---|---|
| AWS (p5.48xlarge) | $3.93 | $0.09 | On-demand US-East-1 |
| GCP (A3-High) | $3.00 | $0.12 | Post-2025 price reduction |
| CoreWeave HGX | $6.16 | $0.01 | Bare-metal InfiniBand |
| Lambda H100 | $2.99 | $0.00* | *Bandwidth caps apply |
| Nutanix AHV | $2.15 | $0.00 | 36-mo amort + 20% OpEx |
| Cisco UCS M7 | $2.45 | $0.00 | 36-mo amort + 20% OpEx |
The Placement Verdict
The output is not a table. It's a verdict:
- Stay Put — data gravity makes migration economically irrational
- Hybrid Burst — keep data on-prem, burst compute to cloud for training
- Full Repatriation — steady-state 24/7 inference favors CapEx ownership
Each verdict includes reasoning against your specific inputs and an Architect Tip — the Day 2 operational consideration the cost comparison alone doesn't surface.
For example, with a 50TB dataset at a steady-state 100% duty cycle, the verdict is Full Repatriation to Nutanix AHV at $125.56 per 1M tokens versus $274.51 on AWS. The Architect Tip: configure Nutanix Metro Availability on Cisco UCS to match cloud-native SLA expectations without the hyperscaler dependency.
Additional Controls
- OpEx Adder — adjustable from 20% to 35% for older facilities or full staff allocation
- Sovereign Mode — excludes all public cloud providers, constrains verdict to Nutanix and Cisco only
- Duty Cycle — model burst training (20–40%) vs steady-state inference (100%)
Below 70% duty cycle, on-prem CapEx begins losing its cost advantage versus elastic cloud pricing. The engine identifies that crossover dynamically.
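One simplified way to see where a crossover near 70% could come from: treat the amortized on-prem rate as paid for every hour regardless of use, and the cloud rate as paid only for hours actually run. Under that assumption the break-even duty cycle is the ratio of the two rates. The sketch below ignores egress, reserved pricing, and minimum commitments, and is not the engine's documented crossover logic; the rates are taken from the provider table.

```python
def breakeven_duty_cycle(onprem_rate, cloud_rate):
    """Duty cycle at which always-on on-prem cost equals pay-per-use cloud cost.

    On-prem monthly cost is onprem_rate * hours (fixed).
    Cloud monthly cost is cloud_rate * hours * duty_cycle.
    Setting them equal gives duty_cycle = onprem_rate / cloud_rate.
    A result of 1.0 or more means on-prem never wins in this model.
    """
    if onprem_rate < 0 or cloud_rate <= 0:
        raise ValueError("rates must be positive")
    return onprem_rate / cloud_rate


NUTANIX_RATE = 2.15  # $/GPU-hr, 36-mo amortization + 20% OpEx
CLOUD_RATES = {"AWS": 3.93, "GCP": 3.00, "CoreWeave": 6.16, "Lambda": 2.99}

crossovers = {
    name: breakeven_duty_cycle(NUTANIX_RATE, rate)
    for name, rate in CLOUD_RATES.items()
}
for name, duty in crossovers.items():
    print(f"Nutanix AHV vs {name}: cloud is cheaper below {duty:.0%} duty cycle")
```

In this model the crossover against the lower-priced clouds (GCP and Lambda) sits at about 72%, while pricier on-demand rates push it well below that.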
Try It
Free, no signup, runs entirely in the browser.
Tool: https://gpe.rack2cloud.com
Methodology + full breakdown: https://www.rack2cloud.com/ai-gravity-placement-engine/
The providers.json file and the Gravity Score formula are documented on the landing page for anyone who wants to validate or adapt the model.
