Originally published on AI Tech Connect.
The headline numbers, and the trap

Two numbers landed on the desks of AI infra leads this week, and at first glance they appear to contradict each other. The first comes from NVIDIA and SemiAnalysis: a GB300 NVL72 system running agentic AI workloads delivers up to 35× lower cost-per-token than the Hopper generation it replaces, alongside 50× higher throughput per megawatt. The second comes from the cloud price boards: on-demand B300 capacity has risen from roughly $5.00/hr in November 2025 to $9.16/hr on certain providers — an 83% jump in six months. If the chip is so much cheaper to run, why has renting one become so much more expensive? That question sits at the heart of every infrastructure decision being taken in Bengaluru, Hyderabad, London and Manchester right now. Cost-per-…
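The apparent contradiction dissolves once you separate the price of an hour from the price of a token. A minimal sketch of the arithmetic, using the article's hourly rates but entirely hypothetical throughput figures (the real per-workload numbers vary widely and are not given here):

```python
# Illustrative sketch of how hourly price and cost-per-token can diverge.
# The $5.00 and $9.16 hourly rates come from the article; the throughput
# figures below are hypothetical placeholders, NOT measured numbers.

def cost_per_million_tokens(hourly_price_usd: float, tokens_per_second: float) -> float:
    """USD per one million generated tokens at a given sustained throughput."""
    tokens_per_hour = tokens_per_second * 3600
    return hourly_price_usd / tokens_per_hour * 1_000_000

# Hypothetical: older-generation GPU at the November 2025 price point.
older_gen = cost_per_million_tokens(hourly_price_usd=5.00, tokens_per_second=1_000)

# Hypothetical: B300 at the higher on-demand price, but with a large
# throughput gain on the same agentic workload.
b300 = cost_per_million_tokens(hourly_price_usd=9.16, tokens_per_second=20_000)

print(f"older gen: ${older_gen:.3f} per 1M tokens")
print(f"B300:      ${b300:.3f} per 1M tokens")
print(f"ratio: {older_gen / b300:.1f}x cheaper per token")
```

Under these assumed numbers, an 83% higher hourly price still yields roughly an order-of-magnitude lower cost per token, because the throughput gain dwarfs the price increase. The hourly sticker price and the unit economics move independently.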