Pascal CESCATO

1%

Speculative look at hardware sanctions

Santa Clara, March 14, 2029.

Four objects on Jensen's desk.

An NSA report, face down.

Kai Chen's badge.

An iPhone, screen lit: "The White House."

And on the wall, framed since 2019, a GeForce 256 signed by hand: "The one that started it all — 1999."

Jensen isn't looking at the frame.

He's looking at the number on the screen in front of him.

1%.

Global inference datacenter market share. One percent.

12% domestic — captive hyperscalers, federal contracts, enterprise stacks buried too deep in CUDA to move.

Iran.

He thinks back to November 2023. The Senate. His own voice: "These restrictions will only accelerate the development of their own chips."

Nobody listened.

Neither did he, in the end. He had known. He had chosen silence — the right relationships, the right slice of the pie. The silence that paid well.

From the screen left on in the hallway, a familiar voice:

"We're winning. We've always been winning. Everything else is fake news."

Jensen stands. Turns off the screen on his way out.


He picks up Kai Chen's badge.

Twelve years at nVidia. Lead architect of the Hopper inference engine. His departure message, brief: "New opportunity." LinkedIn said Chengdu.

The badges had started piling up faster than the new hires could replace them. He wasn't the first. He wasn't the last.

From Washington, the voice, a few weeks earlier:

"Traitors. Losers. They'll come crawling back."

None of them had.

Jensen sets the badge down.


He looks at the NSA report.

Doesn't turn it over.

The HX-9 Pro had shipped in October 2028. 18 ARM cores. 512GB of soldered xGDDR8 — a memory variant co-developed for LLM inference workloads, where standard GDDR8 still aimed at graphics rendering. 210 watts TDP. $7,800. Built in Chengdu, assembled in Penang, sold everywhere.

Everywhere but here.

Three weeks earlier, a Tier-2 datacenter in Ohio had quietly swapped its last nVidia rack. The migration had taken a weekend.

Somewhere in San Francisco, this morning, a CTO had opened Signal.

"What are you paying?"

Berlin replied twenty seconds later.

"811,000."

The quote on the CTO's screen read $4,032,000. Same compute capacity. Same workload. Same output.

The 500% tariffs on Asian components weren't protecting anything anymore. They were just taxing Americans.

Intel and AMD had their own versions — ARM, xGDDR8, NPU clusters, open stack. Competitive on paper. Built in Penang and Taiwan. Caught in the same tariffs on their own components, manufactured offshore in fabs Washington no longer really controlled. 19% and 23% domestic market share — ahead of nVidia, but for reasons nobody in their boardrooms found particularly glorious.

It was all there.


Jensen stops in front of the GeForce 256.

1999. The GPU that started everything — transform and lighting in hardware, for the first time. The competition had smiled. 3dfx. S3. Matrox. They were still smiling six months later when nVidia had buried them one by one.

He knew this story.

He had lived it from the other side.

2026. The Commerce Department publishes restrictions on Fable 5 and Mythos 5. A Friday. 5:21 PM.

Thirty hours later, Zhipu releases GLM-5.2. 744 billion parameters. One million token context window. MIT license. Open weights. No geographic restrictions.

Thirty hours.

Sakana AI ships Fugu the following week — an orchestrator aggregating the best available models behind a single API. Performance comparable to Fable 5 on engineering benchmarks. Twenty dollars a month.

From Washington, the voice thundered across every screen:

"Our technologies are the best in the world. The best. Nobody can catch us. Nobody."

GLM-5.2 had been live for eighteen hours.

In the boardrooms of Santa Clara, the information was noted.

Nothing changed.

It was all there.


2025. Apple ships the M4 Ultra. 512GB unified memory. 200 watts. 70B inference locally, without breaking a sweat. The blueprint for what an ARM accelerator with massive memory could be — designed for Final Cut Pro and Xcode, used to run frontier models on a desktop.

Apple hadn't meant to prove anything to nVidia. It was a side effect.

That same year, Moffett AI publishes its MLPerf Inference results. The S30: twice the H100 throughput. One third of the power draw. Built in China, on sparsification architectures that nobody in Santa Clara was taking seriously.

ROCm 9.x was there too. Open source, MIT, native PyTorch. 90% of H100 inference performance — on nVidia hardware. 600% on an HX-9 Pro.

Microsoft had smiled too, in 1998, reading the first Linux server benchmarks.

It was all there.


2023. DeepSeek releases R1. Training cost: $6 million. Not $100 million. Six million — on chips they had been refused, optimizing the algorithm where they had been blocked from optimizing the hardware.

Constraint produces innovation.
Abundance produces dependency.

In a situation room in Washington, an analyst traces a curve.

"They're slowing down."

Nobody disagrees.

The curve would keep climbing for eighteen months.

Jensen had known. He had said it in front of the Senate. In the room, a national security advisor had murmured to his neighbor: "If we do nothing, they get immediate access to the best chips. If we act, we risk accelerating their own industry. Either way, the risk is real."

The senators had nodded and voted for the restrictions anyway.

Jensen had flown back to Santa Clara. Had sold H100s at $40,000 a unit. Margins no other industry would have dared post.

Rational. Short term.

It was all there.


2022. Bureau of Industry and Security. Entity lists. H100, A100, H800 — progressively locked down. ASML barred from delivering EUV machines. ARM pressured into restricting its licensees.

The intent: maintain technological hegemony. Preserve the lead. Consolidate control.

The precedent existed. It had a name.

AMD.

1982: IBM forces Intel to license x86. Intel agrees, convinced AMD will remain a follower indefinitely. 1993: the K5 starts to worry them. 1999: the Athlon K7 crushes the Pentium III across integer benchmarks. Intel's internal memos from that period still exist — engineers had been raising the alarm since 1996. NetBurst was a known dead end before it was ever announced publicly. 2003: Opteron. The early architectures that would lead to Zen — and to server market dominance. Intel recovers in 2006 with the Core architecture — three years too late to salvage the image, ten years too late to recover the server market share.

AMD hadn't needed to invent a new architecture. It had taken the existing one, optimized it where Intel had stopped looking, and sold it cheaper to everyone Intel's pricing had excluded.

The playbook was known. Taught. Documented.

It was all there.


Santa Clara, March 14, 2029.

Jensen walks to the window.

The 101 below, clear at this hour.

He thinks of an engineer somewhere in Chengdu, in 2024, looking at the M4 Ultra specs and understanding exactly what needed to be built.

The iPhone has stopped vibrating.

The NSA report is still face down on the desk.

The GeForce 256 still hangs on the wall.

"The one that started it all."

On the screen, the cursor blinks.

The Taiwan Strait.

He doesn't finish the thought.

1%.

Top comments (28)

xulingfeng

This is the macro version of what I've been writing at the corporate level — same pattern, different scale. Signals everywhere, nobody reading them, and by the time they do it's too late.
The "it was all there" refrain hit harder every time. Curious — did you build this from the AMD precedent outward, or did the 2029 image come first?

Pascal CESCATO

The 2029 image came first. Jensen at the desk, four objects, a number on the screen. The AMD precedent came in to answer the question the image raised: how does a company that invented the playbook end up on the wrong side of it?
The refrain came last — or rather, it emerged. Once the structure was backwards chronology, "it was all there" became the only honest thing to say at each layer. Not a rhetorical device. More like a verdict.
Your corporate version sounds like the same mechanism at a different zoom level. The signals are always readable in hindsight. The question that keeps me up is whether they're actually unreadable in real time — or just inconvenient.

xulingfeng

"Inconvenient" — that's the one. 17 stories in, every single one had someone who saw it coming. Not in hindsight. Right there, in the room. Couldn't make it stick because the person who needed to hear it didn't want to.
That's why your "it was all there" lands so hard. It's not irony. It's just how decisions actually work.
Good to know the image came first — explains why the piece feels so tight. Might steal that order next time.

Pascal CESCATO

"Couldn't make it stick because the person who needed to hear it didn't want to."
That's the sentence I was circling around the whole time. Jensen knew. Said it out loud, in front of cameras. It changed nothing — because knowing and acting are separated by something that has nothing to do with information.
Steal the order. Image first, precedent second, refrain last. Works because the ending is already written when you start — you're just building the archaeology backwards.
Would read those 17 stories.

xulingfeng

Means a lot coming from someone who wrote that piece. Here's the series — same pattern, different zoom level:
AI, Ego & Regret
Fair warning: 17 is what's published. There's a backlog. This format is hard to stop once you start digging.

Pascal CESCATO

17 published and a backlog. That's the tell — when the format starts pulling you forward instead of you pushing it.
Reading.

xulingfeng

Curious which one you'll pick. I've got my guesses.

Pascal CESCATO

Honest answer: I read most of the series before you dropped the link. No single standout — which is probably the point. The pattern holds across all of them, and that consistency is harder to pull off than one strong piece.
Curious about your guess though.

xulingfeng

Re-read all 17. The $4.2M one is closest to yours.
Someone says it out loud → blocked by something easier to trust → engineer keeps receipts on the side → "it was all there." VP hiding behind VoidSentinel is Washington hiding behind "we're the best." Same thing, different zoom level.
My first guess was off. Yours is political — the others are technical. That's the difference.

Pascal CESCATO

The $4.2M one is the right pick. "Engineer keeps receipts on the side" — that's Jensen with the Senate transcript he'll never use.
"Yours is political — the others are technical" is the most useful thing anyone's said about the piece. Political means the incentive distortion is systemic, not individual. Nobody needs to be stupid or corrupt. The system just prices in the wrong things.
That's harder to fix than a bad VP.

xulingfeng

Exactly. A bad VP you can fire. A system that prices in the wrong thing — that's what I keep writing about. One story per angle, because nobody sees it until you show them from enough sides.

Pascal CESCATO

"One story per angle, because nobody sees it until you show them from enough sides."
That's the method. Keep the backlog going.

UnitBuilds

Very insightful and very true. During the mining boom, manufacturers were incentivized not to sell to China, so China built their own mining-capable chips. Fast forward, they built almost everything from scratch, because Western companies didn't want competition... Now they have competition they can't hope to beat. The difference is that Nvidia and AMD built chips for 30 years; China built them over 10. Technical debt, environmental constraints, backwards compatibility, an overfocus on ecosystem: they locked themselves into oblivion, while China built their own operating system and their own chips, modernized and optimized for each workload they faced. The difference? Cheaper. China manufactured 60% performance chips at unmatched scale; they reverse engineered and even upgraded legacy Nvidia chips to support memory types they were never designed for, and rewrote firmware for those chips to support more VRAM than they were ever spec'd for. Fast forward some more, they have competitive chips in every sector, with a deep vendetta against the US firms and nation for blocking their progress in fear of competition.

While at peace, the US and China are fighting a shadow war on the compute front. And to the public, the US is winning, they're always winning... In reality, they know they lost the day they passed those sanctions. Starved of frontiers, engineers seek new horizons and often come across some pretty interesting stuff along the way. Take V.E.L.O.C.I.T.Y. OS's NDA-KV and NDA 2-bit quantization, optimizations driven by necessity, not desire. Written on a laptop, but designed for data centers. Not because I could, but because of a back injury, I couldn't sit at my PC. Restrictions breed intrigue, and intrigue always finds loopholes and unexplored avenues that often fix the fundamental flaws hiding behind "it's good enough", because what's good enough on a modern high-end GPU is not good enough on an aging MX250, and it shows.

Great piece of writing @pascal_cescato_692b7a8a20 love the hint at the annoying orange with "we're winning, it's fake news" 😂

Pascal CESCATO

"Written on a laptop, but designed for data centers. Not because I could, but because of a back injury."
That's the sentence. Constraint doesn't care about your setup — it just removes the option of "good enough" and sees what's left.
The mining boom angle is one I didn't put in the piece but probably should have — it's an earlier iteration of the exact same loop. Blocked, built, outpaced. The vendetta part is real too, and it's the variable nobody in the boardrooms seems to price in.
Glad the annoying orange landed. Caricature writes itself sometimes.

UnitBuilds

The mining boom is very important not to miss, because it was the origin of the whole race. Due to the restrictions and the vast amount of profits to be had in the crypto boom, China invested heavily into R&D for systems that could mine, not just bitcoin (SHA-256), but also Ethereum, which required a GPU, so they had no choice but to build optimized GPUs, modify old ones to support more and newer memory chips, so they could get every last penny out of it.

Another interesting angle is the seizure of cryptominers' hardware in China when the legislation passed. In Western nations, such a seizure would be well documented and the fate of the hardware made public; in China it essentially dropped into a black hole. (My theory) It's possible that after the seizure, the CCP ran those mining rigs themselves, to fund their national R&D into faster, more advanced systems and to supplement their datacenters, which were falling behind after the restrictions. The reality is that in blocking the crypto miners from buying GPUs, they also blocked datacenters from buying theirs... So one hand washed the other: the seized crypto cards were used both to fund R&D and to maintain infrastructure. The mining was just the face of it. The shadow war at the time was the start of what would become the AI war. Western nations accelerated their own R&D with early-revision AIs; China was starved of these advancements, so they built their own. Now we have models like Qwen, DeepSeek, Kimi, all from China, all because they had to, not because they wanted to. Thankfully, not all companies were as restrictive, though they still safeguarded against reverse engineering, e.g. Meta with their Llama models: open for use, but no source on the data used to train them, meaning you could use it, but you couldn't learn from it.

Pascal CESCATO

The seized rigs theory is unverifiable — and exactly the kind of thing that would never be verified. Which is its own kind of signal.
But the Meta/Llama distinction is the one that sticks. Open weights, closed training data. You can use the car, you can't learn to build the engine. It's a more sophisticated form of the same lock-in instinct — just dressed as openness.
The crypto boom as origin story for the AI hardware race is the thread I left out. You're right that it's not optional — it's where the incentive structure crystallized. Mining → seizure → R&D → Qwen, DeepSeek, Kimi. The line is straight once you draw it.
You should write this.

UnitBuilds

Agreed on the unverifiable nature of it. Though one could probably take a known miner's wallet from back then and trace the account hops to see if they ever lead to any purchases to individuals for compute hardware... It wouldn't show the seized hardware being used, but it could paint a picture of what the seized crypto was used for?

Cyber sleuths assemble! Find that Crypto.

The car and engine is a good analogy. In software it would be akin to MS Office: you can write an Excel file, you can even use the API, but you can't read the source code, so you can't build your own from their assembled code.

The thing with the seizure is that it's impossible the systems were decommissioned... No company, let alone nation, would willingly throw away hundreds of millions of dollars' worth of hardware, let alone modern high-end hardware, amidst sanctions that prevent them from buying their own... So let the individuals source the high-end hardware, then seize it, along with the crypto it produced. If they reveal their channels for acquiring the hardware, let them keep the crypto, in exchange for acting as a sourcing agent.

In theory it makes the most sense out of any outcome from the seizure, yet proving it is impossible, unless one of those seized accounts can be traced.

Pascal CESCATO

The wallet trace angle is the only one that could produce anything verifiable — and even then you'd need someone with the time, the tools, and no better use for a weekend.
The sourcing agent theory is elegant. Too elegant, maybe — but "too elegant to be false" is exactly how these things work when nobody's incentivized to look.
The MS Office analogy is sharper than the car one. You can interoperate, you can't learn. That's the real lock.
Someone should write the investigation version of this. Not me — I write fiction.

Daniel Nwaneri

This landed differently after spending today writing about the infrastructure gap from a developer's side — the diesel generators, the latency, what that 1% means if you're building in Lagos right now.

Pascal CESCATO

That angle didn't make it into the text — and it should have.
The 1% from Santa Clara is a postmortem. The 1% from Lagos is something else entirely: the moment the infrastructure gap stops being a gap and starts being an advantage. No legacy stack. No CUDA debt. Just the HX-9 Pro and a generator that already knows how to run lean.
Would read that piece.

Daniel Nwaneri

Pascal, the lean infrastructure angle is the one I keep coming back to. The generator that already runs lean isn't a workaround.

It's a design constraint that produces different instincts. Developers here already know how to build for unreliable conditions. That's not nothing when the HX-9 Pro lands and everyone else is migrating off CUDA.

The piece is half-written already. Might be the next one.

Pascal CESCATO

"The generator that already runs lean isn't a workaround. It's a design constraint that produces different instincts."
That's the thesis. Right there.
The developers who spent years optimizing for MX250s and diesel generators don't need to unlearn anything when the HX-9 Pro lands. They're already thinking in constraints. The CUDA crowd has to migrate — mentally before they even touch the hardware.
Write it. I'll read it.

Daniel Nwaneri

Pascal, "mentally before they even touch the hardware" is the line. That's the migration nobody is measuring, and it's already happened here by necessity.
Writing it...

Pascal CESCATO

Waiting for it.

Aryan Choudhary

What I found most interesting wasn't even the geopolitical prediction, it was the recurring line, "It was all there."

It made the story feel less like a warning about one company and more like a pattern that keeps repeating. Someone notices the signals early, the information exists, but incentives make acting on it much harder than seeing it.

I also liked how the piece was structured backwards. Starting with the 1% and then slowly uncovering how every decision led there made it feel more like reading an investigation than a prediction.

Really enjoyed this one.

Pascal CESCATO

"Incentives make acting on it much harder than seeing it."
That's the line I was looking for when I wrote it and couldn't quite land. You got there in one sentence.
The backwards structure was the only honest way to tell it — a prediction pretends to know the future, an investigation just follows the evidence back. The evidence was always there. That's what makes it uncomfortable.
Glad it landed.

EmberNoGlow

1%

0.99%
ughh..

good post!

Pascal CESCATO

0.99% — and falling.
Thanks.