Pascal CESCATO

Posted on Jun 28

1%

#ai #geopolitics #hardware #fiction

Speculative look at hardware sanctions

Santa Clara, March 14, 2029.

Four objects on Jensen's desk.

An NSA report, face down.

Kai Chen's badge.

An iPhone, screen lit: "The White House."

And on the wall, framed since 2019, a GeForce 256 signed by hand: "The one that started it all — 1999."

Jensen isn't looking at the frame.

He's looking at the number on the screen in front of him.

1%.

Global inference datacenter market share. One percent.

12% domestic — captive hyperscalers, federal contracts, enterprise stacks buried too deep in CUDA to move.

Iran.

He thinks back to November 2023. The Senate. His own voice: "These restrictions will only accelerate the development of their own chips."

Nobody listened.

Neither did he, in the end. He had known. He had chosen silence — the right relationships, the right slice of the pie. The silence that paid well.

From the screen left on in the hallway, a familiar voice:

"We're winning. We've always been winning. Everything else is fake news."

Jensen stands. Turns off the screen on his way out.

He picks up Kai Chen's badge.

Twelve years at nVidia. Lead architect of the Hopper inference engine. His departure message, brief: "New opportunity." LinkedIn said Chengdu.

The badges had started piling up faster than the new hires could replace them. Kai wasn't the first. He wasn't the last.

From Washington, the voice, a few weeks earlier:

"Traitors. Losers. They'll come crawling back."

None of them had.

Jensen sets the badge down.

He looks at the NSA report.

Doesn't turn it over.

The HX-9 Pro had shipped in October 2028. 18 ARM cores. 512GB of soldered xGDDR8, a memory variant co-developed for LLM inference workloads while standard GDDR8 was still aimed at graphics rendering. 210 watts TDP. $7,800. Built in Chengdu, assembled in Penang, sold everywhere.

Everywhere but here.

Three weeks earlier, a Tier-2 datacenter in Ohio had quietly swapped its last nVidia rack. The migration had taken a weekend.

Somewhere in San Francisco, this morning, a CTO had opened Signal.

"What are you paying?"

Berlin replied twenty seconds later.

"811,000."

The quote on the CTO's screen read $4,032,000. Same compute capacity. Same workload. Same output.

The 500% tariffs on Asian components weren't protecting anything anymore. They were just taxing Americans.

Intel and AMD had their own versions — ARM, xGDDR8, NPU clusters, open stack. Competitive on paper. Built in Penang and Taiwan. Caught in the same tariffs on their own components, manufactured offshore in fabs Washington no longer really controlled. 19% and 23% domestic market share — ahead of nVidia, but for reasons nobody in their boardrooms found particularly glorious.

It was all there.

Jensen stops in front of the GeForce 256.

1999. The GPU that started everything — transform and lighting in hardware, for the first time. The competition had smiled. 3dfx. S3. Matrox. They were still smiling six months later when nVidia had buried them one by one.

He knew this story.

He had lived it from the other side.

2026. The Commerce Department publishes restrictions on Fable 5 and Mythos 5. A Friday. 5:21 PM.

Thirty hours later, Zhipu releases GLM-5.2. 744 billion parameters. One-million-token context window. MIT license. Open weights. No geographic restrictions.

Thirty hours.

Sakana AI ships Fugu the following week — an orchestrator aggregating the best available models behind a single API. Performance comparable to Fable 5 on engineering benchmarks. Twenty dollars a month.

From Washington, the voice thundered across every screen:

"Our technologies are the best in the world. The best. Nobody can catch us. Nobody."

GLM-5.2 had been live for eighteen hours.

In the boardrooms of Santa Clara, the information was noted.

Nothing changed.

It was all there.

2025. Apple ships the M4 Ultra. 512GB unified memory. 200 watts. 70B inference locally, without breaking a sweat. The blueprint for what an ARM accelerator with massive memory could be — designed for Final Cut Pro and Xcode, used to run frontier models on a desktop.

Apple hadn't meant to prove anything to nVidia. It was a side effect.

That same year, Moffett AI publishes its MLPerf Inference results. The S30: twice the H100 throughput. One third of the power draw. Built in China, on sparsification architectures that nobody in Santa Clara was taking seriously.

ROCm 9.x was there too. Open source, MIT, native PyTorch. 90% of H100 inference performance — on nVidia hardware. 600% on an HX-9 Pro.

Microsoft had smiled too, in 1998, reading the first Linux server benchmarks.

It was all there.

2023. DeepSeek releases R1. Training cost: $6 million. Not $100 million. Six million — on chips they had been refused, optimizing the algorithm where they had been blocked from optimizing the hardware.

Constraint produces innovation.
Abundance produces dependency.

In a situation room in Washington, an analyst traces a curve.

"They're slowing down."

Nobody disagrees.

The curve would keep climbing for eighteen months.

Jensen had known. He had said it in front of the Senate. In the room, a national security advisor had murmured to his neighbor: "If we do nothing, they get immediate access to the best chips. If we act, we risk accelerating their own industry. Either way, the risk is real."

The senators had nodded and voted for the restrictions anyway.

Jensen had flown back to Santa Clara. Had sold H100s at $40,000 a unit. Margins no other industry would have dared post.

Rational. Short term.

It was all there.

2022. Bureau of Industry and Security. Entity lists. H100, A100, H800 — progressively locked down. ASML barred from delivering EUV machines. ARM pressured into restricting its licensees.

The intent: maintain technological hegemony. Preserve the lead. Consolidate control.

The precedent existed. It had a name.

AMD.

1982: IBM forces Intel to license x86. Intel agrees, convinced AMD will remain a follower indefinitely. 1993: the K5 starts to worry them. 1999: the Athlon K7 crushes the Pentium III across integer benchmarks. Intel's internal memos from that period still exist — engineers had been raising the alarm since 1996. NetBurst was a known dead end before it was ever announced publicly. 2003: Opteron. The early architectures that would lead to Zen — and to server market dominance. Intel recovers in 2006 with the Core architecture — three years too late to salvage the image, ten years too late to recover the server market share.

AMD hadn't needed to invent a new architecture. It had taken the existing one, optimized it where Intel had stopped looking, and sold it cheaper to everyone Intel's pricing had excluded.

The playbook was known. Taught. Documented.

It was all there.

Santa Clara, March 14, 2029.

Jensen walks to the window.

The 101 below, clear at this hour.

He thinks of an engineer somewhere in Chengdu, in 2025, looking at the M4 Ultra specs and understanding exactly what needed to be built.

The iPhone has stopped vibrating.

The NSA report is still face down on the desk.

The GeForce 256 still hangs on the wall.

"The one that started it all."

On the screen, the cursor blinks.

The Taiwan Strait.

He doesn't finish the thought.

1%.

Top comments (72)

Aryan Choudhary • Jun 28

What I found most interesting wasn't even the geopolitical prediction, it was the recurring line, "It was all there."

It made the story feel less like a warning about one company and more like a pattern that keeps repeating. Someone notices the signals early, the information exists, but incentives make acting on it much harder than seeing it.

I also liked how the piece was structured backwards. Starting with the 1% and then slowly uncovering how every decision led there made it feel more like reading an investigation than a prediction.

Really enjoyed this one.