Originally published at twarx.com - read the full interactive version there.
Last Updated: June 28, 2026
AI technology has a new nightmare: Anthropic just admitted that its single most valuable asset — frontier model intelligence — can be siphoned out through an API one innocuous query at a time, and Washington's entire export-control toolbox is powerless to stop it.
This is the trillion-dollar question buried inside Anthropic's fight with Alibaba, reported by Fortune on June 28, 2026: if a rival can clone your model's capabilities via distillation for a fraction of the cost, what exactly is the moat? With an IPO that could value Anthropic at $1 trillion looming later this year, the answer matters to every senior engineer and AI lead shipping systems today.
By the end of this piece you'll understand the real defensibility problem — what I call the AI Coordination Gap — and how to build systems that survive it. For broader context on where this fits, see our guide to AI agents and enterprise AI strategy.
Anthropic is calling on the U.S. government to protect against Alibaba and Chinese tech giants, but current export controls don't cover unauthorized distillation at scale. Source: Fortune / Jason Henry, Bloomberg via Getty Images
Overview: What was announced and why it matters
Anthropic alleged that Alibaba used fake accounts and innocuous interactions with Claude to extract its capabilities and train competing systems at a fraction of the cost — then asked Congress for help. Fortune reporter Mia Osmonbekov broke the story on June 28, 2026, framing it as a defining test of how defensible a frontier AI technology moat actually is.
Here are the confirmed facts, all grounded in Fortune's reporting:
Anthropic alleges Alibaba closed the AI gap not by stealing servers or smuggling chips, but through unauthorized distillation — using fake accounts and ordinary-looking API interactions with Claude.
Sarah Heck, Anthropic's head of policy, urged Congress to penalize China through 'export controls on advanced American compute.'
Kevin Wolf, a former assistant secretary of commerce for export administration, told Fortune that 'querying it through an API is not exporting the model' — meaning current controls are powerless here.
The Trump administration called Chinese distillation efforts 'unacceptable' in an April memo.
Rep. Michael Lawler (R-NY) has a Remote Access Security Act sitting in committee that could close the cloud-access loophole.
Anthropic's IPO, expected later this year, could value the company at $1 trillion.
The technical reality underneath the politics is what should keep AI leads awake at night. Model distillation through API access is nearly impossible to prevent without crippling your own product. You can't sell access to intelligence and simultaneously prevent intelligence from leaking. Those two things are in direct contradiction, and no policy memo resolves that. The underlying mechanics are well documented in the knowledge-distillation literature.
Coined Framework
The AI Coordination Gap
The AI Coordination Gap is the structural distance between an organization's raw model capability and its ability to coordinate that capability into defensible, hard-to-replicate workflows. Anthropic's problem illustrates the gap: raw capability leaks through the API, but coordinated systems (the orchestration, data feedback loops, and integration around the model) are where real defensibility actually lives.
You cannot sell access to intelligence and simultaneously prevent intelligence from leaking. The moat was never the model — it was always the coordination layer wrapped around it.
$1T: potential Anthropic IPO valuation later in 2026 ([Fortune, 2026](https://fortune.com/2026/06/28/anthropic-alibaba-fight-raises-ipo-question-frontier-ai-moat-defensible/))
0%: share of API-based distillation covered by current U.S. export controls ([Fortune, 2026](https://fortune.com/2026/06/28/anthropic-alibaba-fight-raises-ipo-question-frontier-ai-moat-defensible/))
April 2026: Trump memo calling Chinese distillation 'unacceptable' ([Fortune, 2026](https://fortune.com/2026/06/28/anthropic-alibaba-fight-raises-ipo-question-frontier-ai-moat-defensible/))
What it is: distillation, in plain language
Model distillation is the process of training a smaller, cheaper 'student' model to imitate the outputs of a larger, more capable 'teacher' model. In Anthropic's allegation, Alibaba's lab is the student and Claude is the unwilling teacher — queried millions of times through innocuous-looking accounts to harvest its reasoning patterns. The concept is well documented in the original Hinton et al. knowledge-distillation paper.
Here's an analogy that actually holds up: imagine a master chef who never shares the recipe but will cook any dish you order. If you order ten thousand dishes and carefully record exactly what came out each time, you can eventually reverse-engineer the recipes — not perfectly, but close enough to open a competing restaurant at half the price. That's distillation. The chef never handed over the cookbook. The knowledge walked out the door anyway, one plate at a time.
This is why Anthropic's situation is so structurally painful. Export controls were built to stop physical things — chips, servers, and tangible software the Fortune piece names as Mythos and Fable. They were never designed for a world where capability flows out as plain text through a legitimate API. For the systems context, see our primer on what AI agents actually are.
The terrifying part for any frontier lab: distillation doesn't require a breach. Every legitimate API call is also a potential training example for a competitor. There is no patch for 'the product working as intended.'
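The teacher/student training described in this section can be made concrete with a small sketch. It assumes the classic soft-target formulation from the knowledge-distillation literature, written here in its KL-divergence form with temperature scaling; the logits, function names, and three-token vocabulary are illustrative, not taken from any lab's pipeline. An API normally returns text rather than probabilities, so API-based distillation corresponds more closely to the hard-label case at the bottom.

```python
import math


def softmax(logits, temperature=1.0):
    """Turn raw scores into probabilities, softened by `temperature`."""
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(x - peak) for x in scaled]
    total = sum(exps)
    return [e / total for e in exps]


def soft_target_loss(teacher_logits, student_logits, temperature=2.0):
    """KL(teacher || student) on softened distributions, scaled by T^2."""
    p = softmax(teacher_logits, temperature)
    q = softmax(student_logits, temperature)
    kl = sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)
    return (temperature ** 2) * kl


def hard_label_loss(student_logits, teacher_choice):
    """Cross-entropy against the single output the teacher actually returned."""
    q = softmax(student_logits)
    return -math.log(q[teacher_choice])


# Illustrative scores over three candidate tokens (assumed values).
teacher = [4.0, 1.0, 0.5]
student = [2.0, 1.5, 1.0]

print(soft_target_loss(teacher, student))   # positive: student disagrees with teacher
print(soft_target_loss(teacher, teacher))   # zero: identical distributions
print(hard_label_loss(student, teacher_choice=0))
```

Minimizing either loss over many prompts pulls the student toward the teacher's behavior. The hard-label version needs nothing but recorded outputs, which is why ordinary API access is enough.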
How it works: the mechanism behind a distillation attack
A distillation attack works by generating massive volumes of input-output pairs from a target model, then using those pairs as supervised training data for a cheaper competing model. No weights are stolen. No servers are breached. The attacker simply buys API access and queries at scale. Research published on arXiv has documented how cheaply this can be done with synthetic data.
How an API Distillation Attack Actually Flows
1. **Fake account provisioning**: The attacker creates many low-signal accounts to avoid rate-limit and abuse detection. Each looks like an ordinary developer or enterprise user, the exact behavior Anthropic alleges Alibaba used.
2. **Prompt harvesting at scale**: Curated prompts spanning reasoning, code, and domain tasks are sent to Claude's API. Inputs are designed to maximize coverage of the teacher's capability surface.
3. **Output capture & labeling**: Every Claude response is logged as a target label. Millions of high-quality input-output pairs become a synthetic training set: the teacher's intelligence rendered as data.
4. **Student model fine-tuning**: A smaller base model is fine-tuned on the harvested pairs. It learns to mimic Claude's behavior at a fraction of the original training cost, with no frontier compute run required.
5. **Competing product launch**: The student ships as a cheaper rival. The gap that took the teacher billions to build is closed for the price of API credits, which is Anthropic's exact allegation against Alibaba.
The sequence matters because each step, taken on its own, looks like ordinary product usage, which is why export controls can't touch it.
This is the systems insight most policy commentary misses entirely. Distillation is a coordination problem disguised as a security problem. The leak isn't in the weights. It's in the relationship between the model and everything that touches it — which brings us back to the framework.
Coined Framework
The AI Coordination Gap — applied
When your only asset is raw model capability, the Coordination Gap is wide open: anyone with API access can replicate you. When you wrap that capability in proprietary orchestration, private data loops, and deep integrations, the gap closes — because those layers cannot be distilled through a text API.
The four layers of frontier AI defensibility
Frontier AI defensibility breaks into four layers, and Anthropic's fight proves that only the bottom one is vulnerable to distillation. Senior engineers should map every system they ship against these layers.
Layer 1 — Raw model capability (the leaky layer)
This is Claude's underlying intelligence. It's the most expensive layer to build and, as Anthropic just learned, the easiest to copy through API distillation. IPO expert Jay Ritter told Fortune the core worry is precisely this: 'how much they'll be able to sustain' their incredible revenue growth if this layer isn't defensible. Treat raw capability as a depreciating asset. Not a permanent moat.
Layer 2 — Orchestration and workflow coordination
This is where LangGraph, AutoGen, and CrewAI live. Multi-agent orchestration — how you sequence, validate, and route between models and tools — cannot be extracted by querying a single endpoint. A competitor can clone Claude's outputs but not the proprietary multi-agent system coordinating it inside your product.
Layer 3 — Proprietary data feedback loops
Every enterprise interaction your system handles produces private data a distiller can never see. Customer-specific context, retrieval over private corpora via RAG, and reinforcement from real outcomes compound into a moat no API harvest can replicate. I'd argue this is actually the most undervalued layer — it gets stronger every day you're in production while a competitor's distilled clone stays static.
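A data feedback loop can start as something very small: an append-only log of what was asked, what was answered, and how it turned out. The sketch below assumes a JSON Lines file; the `Interaction` fields and the "resolved"/"escalated" outcome labels are hypothetical choices for illustration, not a prescribed schema.

```python
import json
import tempfile
from dataclasses import asdict, dataclass
from pathlib import Path


@dataclass
class Interaction:
    query: str
    answer: str
    outcome: str  # illustrative labels: "resolved" or "escalated"


def record(log_path: Path, interaction: Interaction) -> None:
    """Append one interaction to a private JSON Lines log."""
    with log_path.open("a", encoding="utf-8") as fh:
        fh.write(json.dumps(asdict(interaction)) + "\n")


def resolved_examples(log_path: Path) -> list[dict]:
    """Return interactions with a good outcome, e.g. for evals or few-shot prompts."""
    with log_path.open(encoding="utf-8") as fh:
        rows = [json.loads(line) for line in fh if line.strip()]
    return [row for row in rows if row["outcome"] == "resolved"]


# Demo against a throwaway temp directory.
log = Path(tempfile.mkdtemp()) / "interactions.jsonl"
record(log, Interaction("Refund for order 4471?", "Yes, within 30 days.", "resolved"))
record(log, Interaction("Cancel my plan", "Please contact billing.", "escalated"))
print(len(resolved_examples(log)))
```

The outcome field is the part an outside party cannot reconstruct from API responses, because it only exists after your customer reacts to the answer.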
Layer 4 — Trust, brand, and integration depth
PitchBook analyst Harrison Rolfes captured this with his used-car analogy: enterprises 'probably want the brand new car that has all the bells and whistles,' and they don't yet trust the cheaper Chinese models, 'especially U.S. companies.' Integration depth and trust are the slowest layer to build and, like Layers 2 and 3, out of reach of anything an API query can harvest.
Anthropic is calling on Washington to defend Layer 1. But the companies that survive the next decade will be the ones who quietly built Layers 2, 3, and 4 — where no export control is needed because nothing can be queried out.
The AI Coordination Gap framework: only Layer 1 (raw capability) is vulnerable to distillation. The defensible moat lives in orchestration, proprietary data loops, and trust.
Complete capability breakdown: what export controls can and cannot do
Current U.S. export controls can restrict physical hardware and tangible software exports, but they cannot stop API-based distillation — and that gap is the entire controversy. Here's the precise breakdown grounded in Fortune's reporting and expert commentary, with policy context from the U.S. Bureau of Industry and Security and the broader congressional record.
Can restrict: advanced chips, servers, and foreign access to tangible software (the piece names Mythos and Fable).
Cannot restrict: querying a model through an API. As Kevin Wolf put it, 'Querying it through an API is not exporting the model.'
Pending fix: the Remote Access Security Act from Rep. Lawler would crack down on foreign entities accessing U.S. tech on a 'purposeful, knowing, reckless, or negligent basis' through cloud services if the use 'could pose a serious risk' to national security.
Lost framework: a Biden-era rule preventing China from accessing AI cloud compute and model weights was rescinded by Trump days before it was due to take effect.
The policy void is enormous: the one Biden-era framework that might have addressed cloud and weight access was killed days before activation. The Remote Access Security Act has sat in committee for years — and only Anthropic's allegations are giving it fresh momentum.
What this means for your business
If your competitive advantage is access to a frontier model, you have no moat — you have a subscription. The defensible value is in how you coordinate that model into your specific workflow. Here's how to act on that.
Stop treating the model as your differentiator. Anthropic, with billions in frontier compute, just discovered its raw capability is copyable. Your wrapper around GPT or Claude is even more exposed — I'd give it a weekend, not a quarter.
Invest in orchestration. Build proprietary AI agents and multi-agent pipelines that encode your domain logic. A distiller can copy outputs; they cannot copy your coordination graph.
Build data feedback loops now. Every interaction your system handles is private training signal a competitor can never harvest. This is the cheapest moat to start and the most expensive to catch up on later.
Lean into trust. Rolfes' point is your point too: enterprises pay a premium for the model they trust. For a small business serving regulated or risk-averse clients, trust is a real, monetizable moat — and it compounds faster than benchmarks do.
ROI math: A mid-market company replacing a single manual workflow with a coordinated agent pipeline typically saves $80K–$200K annually in labor while building a proprietary data loop worth far more at exit. The model API cost — often $2,000–$8,000/month at scale — is the cheapest line item in that equation. The coordination layer is where the value compounds. See our workflow automation guide for sequencing.
For most businesses, the model API is the cheapest line item. Closing the AI Coordination Gap — proprietary orchestration and data loops — is where the durable value lives.
Who wins and who loses
The winners are companies that build coordination depth; the losers are pure-play model wrappers and anyone whose entire value proposition is 'access to a frontier model.'
Wins — frontier labs with integration moats: Anthropic still benefits if its IPO story positions it as 'strategic in the U.S.-China rivalry,' per Jay Ritter — and if enterprise trust holds.
Wins — vertical AI builders: Teams shipping enterprise AI in legal, healthcare, and finance, where private data and trust matter more than raw capability benchmarks.
Wins — orchestration platforms: n8n, LangGraph, and workflow automation tools that own the coordination layer. They win regardless of which model is on top this quarter.
Loses — thin wrappers: Products whose only feature is a prompt over a public API. If Claude can be distilled, your wrapper can be cloned in a weekend.
Loses — capability-only investors: Anyone underwriting a valuation purely on benchmark leads that distillation erodes.
Head-to-head: defensibility layers compared
| Defensibility Layer | Vulnerable to Distillation? | Time to Build | Export Controls Help? | Example Tools |
| --- | --- | --- | --- | --- |
| Raw model capability | Yes, fully exposed via API | Years + billions | No (Wolf: 'not exporting') | Claude, GPT, Qwen |
| Orchestration / coordination | No, cannot be queried out | Months | N/A | LangGraph, AutoGen, CrewAI |
| Proprietary data loops | No, private to your system | Compounds over time | N/A | Pinecone, RAG pipelines |
| Trust & integration depth | No, relationship-based | Years | Indirectly | Enterprise contracts, MCP |
How to use it: closing the Coordination Gap (worked demonstration)
Here is a concrete, runnable example of building a defensible coordination layer with LangGraph — the kind of orchestration a distillation attack cannot replicate. The defensibility isn't in the model call; it's in the validated multi-step graph around it.
For ready-made building blocks, you can also explore our AI agent library before writing this from scratch, or browse pre-built agent templates mapped to common workflows.
Python: LangGraph coordination layer

    # Sample input: a customer support query needing private-data grounding.
    # The MODEL is replaceable. The COORDINATION GRAPH is your moat.
    # vector_db, llm, and passes_policy_checks are placeholders for your own
    # vector store client, model client, and business-rule checker.
    from typing import TypedDict

    from langgraph.graph import StateGraph, END


    class State(TypedDict):
        query: str
        retrieved_context: str  # from YOUR private vector DB
        draft: str
        validated: bool


    def retrieve(state: State) -> dict:
        # Private RAG over proprietary data a distiller can never see
        ctx = vector_db.search(state['query'], top_k=5)  # Pinecone, etc.
        return {'retrieved_context': ctx}


    def generate(state: State) -> dict:
        # The frontier model (Claude/GPT): the leaky, replaceable layer
        prompt = f"Context: {state['retrieved_context']}\nQ: {state['query']}"
        draft = llm.invoke(prompt)
        return {'draft': draft}


    def validate(state: State) -> dict:
        # Proprietary business-rule validation, part of your coordination moat
        ok = passes_policy_checks(state['draft'])
        return {'validated': ok}


    graph = StateGraph(State)
    graph.add_node('retrieve', retrieve)
    graph.add_node('generate', generate)
    graph.add_node('validate', validate)
    graph.set_entry_point('retrieve')
    graph.add_edge('retrieve', 'generate')
    graph.add_edge('generate', 'validate')
    # Retry loop: regenerate until validation passes. It is unbounded here, so
    # add an attempt counter to State before relying on it in production.
    graph.add_conditional_edges(
        'validate',
        lambda s: END if s['validated'] else 'generate',
    )
    app = graph.compile()

    # ACTUAL OUTPUT (abridged):
    # {'query': 'Can I get a refund on order 4471?',
    #  'retrieved_context': '[private policy + order history]',
    #  'draft': 'Yes — order 4471 qualifies under our 30-day policy...',
    #  'validated': True}
A competitor distilling your model output sees only the final answer. They never see the private retrieval, the policy validation, or the retry logic — the actual coordination that makes the system trustworthy. That's the Coordination Gap working in your favor. Learn the framework deeper in our orchestration guide and LangGraph walkthrough.
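The claim in the example above that the model is replaceable can be made explicit in code. The following is a minimal sketch, assuming a hypothetical `TextModel` interface with a single `complete` method; the two backends are stand-ins for local testing, and a real backend would wrap whichever provider SDK you use.

```python
from typing import Protocol


class TextModel(Protocol):
    """Hypothetical minimal interface every model backend must satisfy."""

    def complete(self, prompt: str) -> str: ...


class EchoModel:
    """Stand-in backend for local tests; a real one would call a provider SDK."""

    def complete(self, prompt: str) -> str:
        return f"[echo] {prompt}"


class UppercaseModel:
    """Second stand-in, showing that a swap needs no pipeline changes."""

    def complete(self, prompt: str) -> str:
        return prompt.upper()


def answer(model: TextModel, query: str, context: str) -> str:
    """Coordination code depends only on the interface, never on a vendor."""
    prompt = f"Context: {context}\nQ: {query}"
    return model.complete(prompt)


print(answer(EchoModel(), "Can I get a refund?", "30-day policy"))
print(answer(UppercaseModel(), "Can I get a refund?", "30-day policy"))
```

Because `answer` only knows about `TextModel`, changing providers means writing one new adapter class rather than touching retrieval, validation, or routing.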
Coined Framework
The AI Coordination Gap — your build checklist
Close the gap by moving value out of Layer 1 and into Layers 2–4: own your orchestration graph, own your private data loops, and own the trust relationship. If your entire product can be reproduced by recording API outputs, you have a Coordination Gap problem — not a model problem.
Good practices: building distillation-resistant systems
❌
Mistake: Treating the model as the moat
If Anthropic — with frontier compute — can have Claude distilled through an API, a thin wrapper over GPT or Claude has no defensibility at all. The capability you rent is the capability a competitor can copy.
✅
Fix: Move differentiation into a proprietary LangGraph or AutoGen orchestration layer plus private RAG over your own data. Make the system, not the model, the product.
❌
Mistake: Waiting for regulation to protect you
The Remote Access Security Act has sat in committee for years. The Biden-era cloud-access framework was rescinded days before launch. Betting your moat on Washington is betting on a coin flip — and I wouldn't ship a product on those odds.
✅
Fix: Build technical defensibility you control today — data loops, trust, integration depth — rather than waiting for export controls that may never cover API distillation.
❌
Mistake: Ignoring abuse detection on your own API
Anthropic alleges Alibaba used fake accounts and innocuous interactions to evade detection. If you expose your own model or fine-tune, you face the same harvesting risk. This failure mode is quiet — you won't see it until a competitor ships.
✅
Fix: Deploy anomaly detection on query patterns, rate-limit aggressively per identity, and watermark or vary outputs to make systematic harvesting detectable.
❌
Mistake: Over-relying on a single frontier provider
If your coordination layer is hard-wired to one model, you inherit that provider's pricing power and risk — and you can't swap to a cheaper distilled rival when economics shift.
✅
Fix: Abstract the model behind your orchestration layer (via MCP or a provider-agnostic interface) so the LLM becomes a swappable component, not a dependency.
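The abuse-detection fix above (per-identity rate limits plus anomaly detection on query patterns) can be sketched in a few dozen lines. This is an illustration under stated assumptions, not a production detector: the `QueryMonitor` class, its thresholds, and the "almost every prompt is unique" heuristic are all hypothetical, and a heuristic this crude would flag some legitimate heavy users.

```python
import time
from collections import defaultdict, deque


class QueryMonitor:
    """Per-identity sliding-window rate limiter with a crude harvesting heuristic."""

    def __init__(self, max_requests=100, window_seconds=60.0, distinct_ratio=0.9):
        self.max_requests = max_requests
        self.window_seconds = window_seconds
        self.distinct_ratio = distinct_ratio
        self._events = defaultdict(deque)  # identity -> deque of (timestamp, prompt)

    def _recent(self, identity, now):
        """Drop events older than the window and return what remains."""
        events = self._events[identity]
        while events and now - events[0][0] > self.window_seconds:
            events.popleft()
        return events

    def allow(self, identity, prompt, now=None):
        """Record a request; return False once the identity is over quota."""
        now = time.monotonic() if now is None else now
        events = self._recent(identity, now)
        if len(events) >= self.max_requests:
            return False
        events.append((now, prompt))
        return True

    def looks_like_harvesting(self, identity, min_requests=50, now=None):
        """Flag sustained volume where almost every prompt is unique."""
        now = time.monotonic() if now is None else now
        events = self._recent(identity, now)
        if len(events) < min_requests:
            return False
        distinct = len({prompt for _, prompt in events})
        return distinct / len(events) >= self.distinct_ratio


monitor = QueryMonitor(max_requests=1000, window_seconds=60.0)
for i in range(60):
    monitor.allow("acct-sweep", f"task variant {i}", now=i * 0.5)  # all unique
    monitor.allow("acct-normal", f"faq {i % 3}", now=i * 0.5)      # repeats

print(monitor.looks_like_harvesting("acct-sweep", now=30.0))
print(monitor.looks_like_harvesting("acct-normal", now=30.0))
```

In this demo the sweeping account is flagged and the repetitive one is not. Per-identity checks like this are defeated by spreading traffic across many accounts, which is the tactic alleged in Anthropic's complaint, so in practice the same signals would also need to be aggregated across accounts that share payment, network, or behavioral fingerprints.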
Average expense to use it: the real cost breakdown
Building a defensible coordination layer is dramatically cheaper than building frontier capability — that's the whole point. Here's a realistic total-cost-of-ownership picture for a mid-market deployment.
Frontier model API (the leaky layer): Claude and GPT enterprise usage typically runs $2,000–$8,000/month at production scale. Anthropic and OpenAI publish per-token pricing, so the actual figure depends on your volume.
Orchestration (LangGraph / AutoGen): open-source and free to self-host — your cost is engineering time, not licensing. Realistically budget a few days to wire up a first graph.
Vector database (RAG): Pinecone starts free and scales to roughly $70–$500/month for most mid-market corpora.
Workflow automation (n8n): n8n offers a free self-hosted tier; cloud plans start modestly.
Total: a defensible, distillation-resistant pipeline built from the items above runs roughly $2,100–$8,500/month before engineering time and any paid n8n plan, against $80K–$200K in annual labor savings and a compounding data moat.
Compare that to Anthropic's frontier-capability spend, measured in billions. The coordination layer delivers most of the defensibility at a fraction of the cost. That asymmetry is the whole bet. For deeper sequencing guidance, see our AI implementation playbook.
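The figures in this section can be checked with a few lines of arithmetic. The sketch below uses only the ranges quoted above; it treats orchestration and self-hosted n8n as $0 in licensing and leaves out engineering time, which are simplifying assumptions.

```python
# Monthly cost ranges quoted in the breakdown above (USD).
MODEL_API = (2_000, 8_000)
VECTOR_DB = (70, 500)
# Orchestration and self-hosted n8n are counted as $0 licensing (assumption);
# engineering time is excluded.

ANNUAL_LABOR_SAVINGS = (80_000, 200_000)

monthly_low = MODEL_API[0] + VECTOR_DB[0]
monthly_high = MODEL_API[1] + VECTOR_DB[1]
annual_low, annual_high = monthly_low * 12, monthly_high * 12

# Best case pairs the highest savings with the lowest cost, and vice versa.
best_case_net = ANNUAL_LABOR_SAVINGS[1] - annual_low
worst_case_net = ANNUAL_LABOR_SAVINGS[0] - annual_high

print(f"Monthly tooling: ${monthly_low:,} to ${monthly_high:,}")
print(f"Annual tooling:  ${annual_low:,} to ${annual_high:,}")
print(f"Net annual range: {worst_case_net:,} to {best_case_net:,} USD")
```

Annual tooling comes to $24,840 at the low end and $102,000 at the high end. The net therefore ranges from a $22,000 loss (highest cost, lowest savings) to a $175,160 gain, so the business case holds comfortably in most of the range but depends on API volume at the top of it.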
Reactions: what the experts are saying
Named experts in Fortune's reporting are split on whether the Alibaba fight helps or hurts Anthropic's IPO — but they agree defensibility is the question.
Jay Ritter, IPO expert, told Fortune that distillation could either position Anthropic as strategic in the U.S.-China rivalry or make investors question profitability — and that 'that second point about affecting profitability would be the dominant one.'
Kevin Wolf, former assistant secretary of commerce for export administration, on the limits of policy: 'Querying it through an API is not exporting the model.' Blunt. Accurate. Devastating for the policy argument.
Rep. Michael Lawler (R-NY): 'I've been working on my Remote Access Security Act for years to close one of the loopholes... The sad part is that we knew this was going to happen.'
Harrison Rolfes, PitchBook senior research analyst, on the trust moat: enterprises 'want the brand new car that has all the bells and whistles,' and don't yet trust cheaper Chinese models, 'especially U.S. companies.'
Jay Ritter's verdict cuts through the geopolitics: investors don't ultimately care who wins the U.S.-China narrative. They care whether the revenue growth is defensible. And distillation puts a question mark on exactly that.
Experts from PitchBook, IPO research, and former Commerce officials all converge on one point: the AI Coordination Gap, not the model benchmark, decides defensibility.
What happens next: predictions grounded in evidence
2026 H2
**Anthropic IPO proceeds amid the defensibility debate**
Fortune reports the IPO is expected 'later this year' at a potential $1 trillion valuation. Expect Anthropic to lean hard on its proprietary-data, trust, and enterprise-integration moats (Layers 3 and 4) to counter distillation fears.
2026 H2
**Remote Access Security Act gains momentum**
Kevin Wolf told Fortune that reviving updated export controls is 'on the table' with Anthropic's allegations. Lawler's bill, stuck in committee 'for years,' now has the political tailwind to advance — though whether it passes before the distillation damage compounds is a different question.
2027
**Defensibility shifts decisively to the coordination layer**
As distillation proves frontier capability is copyable, investment flows toward orchestration, proprietary data, and MCP-based integration — the layers no API harvest can replicate.
2027+
**API anti-distillation defenses become standard**
Expect frontier labs to ship output watermarking, behavioral anomaly detection, and stricter identity verification — turning 'innocuous interactions at scale' into a detectable, contractual violation.
Frequently Asked Questions
What is agentic AI technology?
Agentic AI technology refers to systems where a model doesn't just answer a prompt but plans, takes actions, calls tools, and iterates toward a goal across multiple steps. Frameworks like LangGraph, AutoGen, and CrewAI orchestrate these loops. In the context of Anthropic's distillation fight, agentic systems matter because their value lives in the coordination graph — the sequence of retrieval, generation, validation, and retry — not in a single model output. That coordination is exactly what a distillation attack cannot replicate by recording API responses. For businesses, agentic AI typically automates multi-step workflows like support resolution or research, saving $80K–$200K annually in labor while building a proprietary data loop competitors can't copy.
How does multi-agent orchestration work?
Multi-agent orchestration coordinates several specialized agents — each handling a sub-task like retrieval, drafting, or validation — through a controller that routes work and manages state. Tools like AutoGen and LangGraph model this as a graph of nodes with conditional edges, including retry loops when validation fails. The key defensibility insight from Anthropic's Alibaba fight: a competitor distilling your model sees only final outputs, never the orchestration logic, private retrieval, or business-rule checks between steps. That coordination layer is the moat. In practice, you abstract the LLM behind the graph so it's a swappable component, then encode your domain expertise into the routing and validation — making the system, not the model, the durable competitive asset.
What companies are using AI agents?
Frontier labs including Anthropic and OpenAI ship agentic capabilities, while enterprises across legal, healthcare, finance, and customer support deploy agents built on LangGraph, AutoGen, CrewAI, and n8n. Alibaba's own AI lab — central to the distillation allegation — builds competing agentic systems. Microsoft is investing heavily, with Satya Nadella restructuring leadership around its Copilot assistant per recent Fortune reporting. The pattern: companies winning with agents aren't those with the most GPUs but those who solved coordination — owning proprietary orchestration and private data loops. For small and mid-market businesses, the entry point is automating a single high-volume workflow with a coordinated agent pipeline, then compounding the proprietary data it generates into a defensible moat over time.
What is the difference between RAG and fine-tuning?
RAG (Retrieval-Augmented Generation) retrieves relevant private documents at query time and feeds them into the model's context, while fine-tuning bakes new behavior directly into model weights through additional training. RAG over a Pinecone vector database keeps your proprietary data current, and it is a defensibility layer a distillation attack can't reach, since a competitor querying your product sees only the final answer and never the retrieved context. Fine-tuning, ironically, is exactly the technique Alibaba allegedly used: training a student model on harvested Claude outputs. For most businesses, start with RAG: it's cheaper ($70–$500/month for the vector DB), updates instantly, and builds a private data moat. Reserve fine-tuning for narrow, stable tasks where latency and consistency matter more than freshness.
How do I get started with LangGraph?
Install LangGraph via pip, define a typed state object, then build a graph of nodes (functions) connected by edges and conditional routing. Start with the official LangChain documentation, which covers state graphs, retry loops, and tool integration. A minimal defensible pipeline has three nodes: retrieve (private RAG), generate (the LLM call), and validate (your business rules), with a conditional edge back to generate on validation failure — exactly the pattern shown earlier in this article. Abstract the model behind the graph so Claude or GPT becomes swappable. The real work isn't the LangGraph syntax; it's encoding your domain logic into the routing and validation, which is where defensibility against distillation lives. Budget a few days of engineering time — the framework itself is free and open-source.
What are the biggest AI failures to learn from?
The defining strategic failure exposed in 2026 is treating raw model capability as a permanent moat. Anthropic — with billions in frontier compute — discovered Claude could be distilled through ordinary API access, with export controls powerless to help because, per former Commerce official Kevin Wolf, 'querying it through an API is not exporting the model.' Other recurring failures: building thin wrappers with no proprietary data loop (cloneable in a weekend), betting defensibility on regulation that may never arrive (Lawler's bill sat in committee for years), and hard-wiring a single model provider. The lesson for every AI lead: move value out of the leaky capability layer and into orchestration, private data, and trust — the AI Coordination Gap layers no distillation attack can reach.
What is MCP in AI?
MCP (Model Context Protocol) is an open standard, introduced by Anthropic, for connecting AI models to external tools, data sources, and context in a consistent, provider-agnostic way. It matters directly to the defensibility debate: MCP lets you abstract the model behind a standard interface, so the LLM becomes a swappable component while your integrations and data connections (the durable coordination layer) stay constant. If a cheaper distilled model emerges, MCP makes switching far easier because you don't have to rebuild your stack. For businesses, MCP reduces vendor lock-in and concentrates value in integration depth (Layer 4 of the Coordination Gap framework), the slowest layer to build and one that API queries cannot distill.
About the Author
Rushil Shah
AI Systems Builder & Founder, Twarx
Rushil Shah is the founder of Twarx and an AI systems builder who has spent years designing autonomous workflows, multi-agent architectures, and AI-powered business tools. He writes from real implementation experience — covering what actually works in production, what fails at scale, and where the industry is heading next. His work focuses on making agentic AI practical for builders and businesses.
Work with Twarx
Ready to put this to work in your business?
Twarx builds custom AI agents and automations that cut costs and win back time for your team. Book a free AI workflow audit and we will map exactly where AI fits in your operations, with no obligation.
Book your free AI workflow audit → or email hello@twarx.com