<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Harry Floyd</title>
    <description>The latest articles on DEV Community by Harry Floyd (@harryfloyd).</description>
    <link>https://dev.to/harryfloyd</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3933548%2Fa644e757-fdc0-4213-a2d0-37774cbe6730.png</url>
      <title>DEV Community: Harry Floyd</title>
      <link>https://dev.to/harryfloyd</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/harryfloyd"/>
    <language>en</language>
    <item>
      <title>What Proves You Can Think?</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Tue, 02 Jun 2026 20:46:39 +0000</pubDate>
      <link>https://dev.to/harryfloyd/what-proves-you-can-think-4hmh</link>
      <guid>https://dev.to/harryfloyd/what-proves-you-can-think-4hmh</guid>
      <description>&lt;p&gt;AI did not just make output cheap. It broke the old contract between effort, competence, and trust.&lt;/p&gt;

&lt;p&gt;For developers this is not abstract. When anyone can generate a clean PR, a plausible code review, a working API endpoint, or a competent-looking architecture diagram in seconds, the artefact stops proving what it used to prove. A good solution no longer implies someone wrestled with the problem.&lt;/p&gt;

&lt;p&gt;The question underneath the productivity debate is harder: if the work no longer proves I can think, what does?&lt;/p&gt;

&lt;h3&gt;
  
  
  The old proof contract
&lt;/h3&gt;

&lt;p&gt;Every institution runs on proof contracts. A school asks for essays and exams. A company asks for CVs, interviews, and work samples. A market asks for traction and retention.&lt;/p&gt;

&lt;p&gt;None of these signals were ever pure. The CV was always a marketing document. The interview was always distorted by nerves and charm. The portfolio could hide how much help the person had. But they worked well enough because polished surfaces were costly to produce. Cost created friction. Friction created signal.&lt;/p&gt;

&lt;p&gt;AI attacks the "costly to produce" part. It compresses the cost of appearing competent. That is enough to break the systems that relied on that cost as a proxy.&lt;/p&gt;

&lt;h3&gt;
  
  
  The move
&lt;/h3&gt;

&lt;p&gt;When output gets cheap, output quality becomes the opening bid, not the final proof. The important question moves upward:&lt;/p&gt;

&lt;p&gt;What does this output prove about the person, team, or system behind it?&lt;/p&gt;

&lt;p&gt;Sometimes the answer is: not much. A clean PR may prove someone had access to a good model and enough taste not to paste the first result. A strong CV may prove the candidate knows how hiring filters work.&lt;/p&gt;

&lt;p&gt;The useful response is not to ban AI. It is to stop treating AI-polished output as the proof object. The next proof system asks what happened before, during, and after the artefact.&lt;/p&gt;

&lt;h3&gt;
  
  
  Five questions that separate judgement from output
&lt;/h3&gt;

&lt;p&gt;These work on code, PRs, architecture decisions, interview responses, and your own work.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. What problem was chosen, and what easier problem was rejected?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The first proof of thought. Bad work often starts with accepting the first fluent frame. Good work usually contains a buried refusal: someone saw the tempting version and did not take it. In a codebase, this looks like choosing the harder but more maintainable abstraction instead of the one-liner that will break in six months.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. What tradeoff was made under constraint?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Intelligence becomes visible at the boundary. Anyone can claim they value quality, speed, safety, and maintainability. Real judgement appears when not all of them can be maximised at once. The developer who can explain why they chose correctness over latency for this specific endpoint, and what evidence would make them reverse that choice, is showing something the output alone cannot.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. What did you check that the output itself could not prove?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the verification question. It separates people who use AI as a generator from people who use AI inside a judgement loop. The generated code can compile. That is not verification. Verification is the external thing that makes the claim answerable: the edge case test, the production data check, the integration test that proves it works with the real system.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;4. What changed after feedback or contact with reality?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Revision is underrated because it is less glamorous than creation. But in an AI world, revision becomes a higher-status signal. The first surface is cheap. The changed surface after a code review, a production incident, or a colleague pointing out the flaw is where more truth appears.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;5. Who owns the consequence if this is wrong?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Accountability is the signal machines cannot carry. A model can produce. A person must decide what they are willing to stand behind. The developer who says "I shipped this, I own the pager duty for it, I will be awake if it breaks" is operating in a different category from the one who lets the AI output speak for itself.&lt;/p&gt;

&lt;h3&gt;
  
  
  What this changes in practice
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Hiring:&lt;/strong&gt; Stop asking only for work samples. Give candidates a plausible AI-generated solution and ask what is wrong, what would break in production, and what they would check before deploying it.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Code review:&lt;/strong&gt; Stop treating clean diffs as sufficient evidence. Ask what was not generated. Ask which tradeoff the author is defending. Ask what verification would prove the solution wrong.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Your own work:&lt;/strong&gt; Stop trying to prove value only through polished output. Keep the polish, but attach judgement to it. Show the problem you chose. Show the tradeoff. Show the verification. Show the revision. Show what you will own.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This piece was originally published on The Durability Curve, a newsletter about what lasts when the surface gets cheap. &lt;a href="https://harryfloyd.substack.com/p/what-proves-you-can-think" rel="noopener noreferrer"&gt;Read the full article&lt;/a&gt; for the deeper argument, including the research on algorithmic anxiety that frames why this matters beyond engineering.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>career</category>
      <category>architecture</category>
    </item>
    <item>
      <title>What 128GB Unified Memory Changes for Local AI Development</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Mon, 01 Jun 2026 12:28:22 +0000</pubDate>
      <link>https://dev.to/harryfloyd/what-128gb-unified-memory-changes-for-local-ai-development-kam</link>
      <guid>https://dev.to/harryfloyd/what-128gb-unified-memory-changes-for-local-ai-development-kam</guid>
      <description>

&lt;p&gt;Yesterday at Computex, NVIDIA announced the RTX Spark superchip: an Arm CPU paired with a Blackwell GPU and up to &lt;strong&gt;128GB of unified LPDDR5X memory&lt;/strong&gt;. Most of the coverage is focusing on the Arm chip or the "agentic OS" branding. The real story for developers is the memory.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Constraint That Just Got Removed
&lt;/h2&gt;

&lt;p&gt;If you've run local models, you know the bottleneck. An RTX 4090 has 24GB of VRAM. That fits a 13B parameter model at 8-bit or a 30B model at 4-bit, with nothing else. No embedding model. No vector database. No room for the application itself in GPU memory.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# With 24GB VRAM (RTX 4090):
# - 30B model at Q4_K_M: ~20GB
# - KV cache for 4096 context: ~2GB  
# - Remaining: ~2GB
# - Can't fit an embedding model. Can't fit a vector index.
# - CPU offloading would be needed, which is 10-100x slower.
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;128GB unified memory changes this because the CPU and GPU share one pool. You're not choosing between VRAM for the model and system RAM for everything else. The GPU can directly access the full 128GB.&lt;/p&gt;

&lt;p&gt;For context, a 70B parameter model at FP4 (4-bit) needs about 40-45GB in practice, with quantization overhead and KV cache included. That leaves roughly 83-88GB for the rest of your stack.&lt;/p&gt;
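&lt;p&gt;The 40-45GB figure can be rebuilt from first principles. This is a minimal sketch, not a measurement: the 10% quantization overhead and the 4GB KV cache are assumed values chosen for illustration, and the function names are hypothetical.&lt;/p&gt;

```python
def weight_memory_gb(params_billions: float, bits_per_weight: float) -> float:
    """Memory for the weights alone, in GB (1 GB = 1e9 bytes)."""
    return params_billions * bits_per_weight / 8


def total_memory_gb(params_billions: float, bits_per_weight: float,
                    overhead_fraction: float, kv_cache_gb: float) -> float:
    """Weights, plus a fractional quantization overhead, plus KV cache."""
    weights = weight_memory_gb(params_billions, bits_per_weight)
    return weights * (1 + overhead_fraction) + kv_cache_gb


weights_only = weight_memory_gb(70, 4)
# Assumed for illustration: 10% overhead, 4 GB of KV cache.
estimate = total_memory_gb(70, 4, overhead_fraction=0.10, kv_cache_gb=4.0)

print(f"weights only: {weights_only:.1f} GB")                # 35.0 GB
print(f"with overhead and KV cache: {estimate:.1f} GB")      # 42.5 GB
print(f"left from 128 GB: {128 - estimate:.1f} GB")          # 85.5 GB
```

&lt;p&gt;Longer contexts grow the KV cache term, so treat the result as a floor for planning rather than a guarantee.&lt;/p&gt;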

&lt;h2&gt;
  
  
  What You Can Actually Build Now
&lt;/h2&gt;

&lt;p&gt;Here's a concrete workflow that goes from impossible to straightforward with 128GB:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Running a local RAG pipeline with a 70B model:&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Components that now fit on one machine:
# 1. 70B LLM at FP4: ~42GB
# 2. Embedding model (e.g., bge-large-en-v1.5): ~1.5GB
# 3. Vector index (10M embeddings at 1024d, float32): ~41GB
# 4. Application runtime + buffer: ~8GB
# Total: ~92.5GB, which fits with ~35GB to spare
# On a 4090 24GB: the 70B model alone doesn't fit
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
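&lt;p&gt;The vector index line is the easiest one in this budget to check by hand, because raw embedding storage is just vectors × dimensions × bytes per value. A minimal sketch, assuming uncompressed storage and ignoring index-structure overhead. The helper name &lt;code&gt;raw_vector_gb&lt;/code&gt; is illustrative.&lt;/p&gt;

```python
def raw_vector_gb(num_vectors: int, dims: int, bytes_per_value: int = 4) -> float:
    """Raw storage for the embeddings alone, in GB (1 GB = 1e9 bytes)."""
    return num_vectors * dims * bytes_per_value / 1e9


for dims in (768, 1024):
    for label, width in (("float32", 4), ("int8", 1)):
        size = raw_vector_gb(10_000_000, dims, width)
        print(f"10M x {dims}d {label}: {size:.1f} GB")
# 10M x 768d float32: 30.7 GB
# 10M x 768d int8: 7.7 GB
# 10M x 1024d float32: 41.0 GB
# 10M x 1024d int8: 10.2 GB
```

&lt;p&gt;Quantized or compressed indexes will come in below these raw figures, so the number you budget for depends on the index type you actually run.&lt;/p&gt;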



&lt;p&gt;Or a multi-agent setup where you run three specialised models simultaneously:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# Multi-model orchestration on one machine:
# - 70B orchestrator model at FP4: ~42GB
# - 30B code specialist at Q4_K_M: ~20GB  
# - 7B verification model at Q8: ~7GB
# - Shared KV cache: ~4GB
# Total: ~73GB — comfortable fit
# On 24GB VRAM: you'd need 3 separate machines
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
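&lt;p&gt;The same tally works for any stack you are planning. A minimal sketch using the component sizes quoted above. The function name &lt;code&gt;memory_budget&lt;/code&gt; is hypothetical, and the sizes are the article's estimates, not measurements.&lt;/p&gt;

```python
def memory_budget(total_gb: float, components: dict[str, float]) -> tuple[float, float]:
    """Return (used, headroom) in GB for a set of resident components."""
    used = sum(components.values())
    return used, total_gb - used


multi_model = {
    "70B orchestrator (FP4)": 42.0,
    "30B code specialist (Q4_K_M)": 20.0,
    "7B verifier (Q8)": 7.0,
    "shared KV cache": 4.0,
}

used, headroom = memory_budget(128.0, multi_model)
print(f"used: {used:.0f} GB, headroom: {headroom:.0f} GB")  # used: 73 GB, headroom: 55 GB

fits_24gb = 24.0 >= used
print(f"fits in 24 GB: {fits_24gb}")  # fits in 24 GB: False
```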



&lt;p&gt;This isn't theoretical. The RTX Spark runs Windows on Arm, and NVIDIA's NemoClaw agent framework already supports it. The software stack (llama.cpp, Ollama, NVIDIA's own AI Enterprise suite) supports the NVLink C2C architecture.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Memory Bandwidth Question
&lt;/h2&gt;

&lt;p&gt;128GB of LPDDR5X at 300 GB/s is the spec worth checking. Compare this to:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;RTX 4090: 24GB GDDR6X at 1,008 GB/s&lt;/li&gt;
&lt;li&gt;Mac M5 Max: 128GB unified at ~800 GB/s&lt;/li&gt;
&lt;li&gt;RTX Spark: 128GB LPDDR5X at 300 GB/s&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The RTX Spark has about 5x the capacity but less than a third of the bandwidth of a 4090. This means token generation and other bandwidth-bound workloads will be slower than on a 4090 for any model that fits in 24GB. But keeping several models resident, switching between them without reloading, and running them simultaneously are all limited by memory capacity, not bandwidth. Those will be dramatically better.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The bandwidth is enough for interactive inference.&lt;/strong&gt; A 70B model generates ~30 tokens/second on an M5 Max at 800 GB/s. At 300 GB/s, you'd expect roughly 10-15 tokens/second. Slower but usable for most development workflows. For production batch inference, you'd still want a datacenter GPU.&lt;/p&gt;
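&lt;p&gt;The 10-15 tokens/second estimate comes from scaling the quoted M5 Max figure linearly with memory bandwidth. A minimal sketch, assuming decode speed is purely bandwidth-bound and everything else is equal. That is an idealisation: real throughput also depends on the runtime, the quantization format, and context length.&lt;/p&gt;

```python
def scale_tokens_per_second(reference_tps: float,
                            reference_bw_gbs: float,
                            target_bw_gbs: float) -> float:
    """Scale a reference decode speed linearly with memory bandwidth."""
    return reference_tps * target_bw_gbs / reference_bw_gbs


# Reference point quoted in the text: ~30 tokens/s at 800 GB/s.
estimate = scale_tokens_per_second(30.0, 800.0, 300.0)
print(f"{estimate:.2f} tokens/second")  # 11.25 tokens/second
```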

&lt;h2&gt;
  
  
  What This Means for Local AI Development
&lt;/h2&gt;

&lt;p&gt;The practical takeaway for developers: &lt;strong&gt;128GB unified memory changes the threshold question.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Before RTX Spark, the question was: "Does my model fit in 24GB?" If not, you couldn't run it locally at interactive speeds. You needed cloud GPUs or CPU offloading, and offloading is impractically slow for any interactive use.&lt;/p&gt;

&lt;p&gt;After RTX Spark, the question becomes: "Does my multi-model workflow fit in 128GB?" For most development setups, including a large model, an embedding service, a vector index, and some agent tooling, the answer is yes.&lt;/p&gt;

&lt;p&gt;This doesn't replace cloud infrastructure for production. But it changes the economics of development iteration. Running a local dev environment with production-scale models means faster feedback cycles, no inference API costs during development, and the ability to test multi-model interactions without distributed system complexity.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Structural Change
&lt;/h2&gt;

&lt;p&gt;The Arm chip is interesting. The agentic OS pitch is marketing. The unified memory pool is the actual structural change, a discontinuity in what a single consumer PC can hold in memory for AI workloads.&lt;/p&gt;

&lt;p&gt;If your work involves models above 30B parameters locally, this is the spec that matters. Everything else, including clock speeds, core counts, and TOPS ratings, is secondary to whether your working set fits in memory.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;NVIDIA's RTX Spark announcement at Computex 2026. Tom's Hardware has the full spec breakdown &lt;a href="https://www.tomshardware.com/laptops/nvidia-unveils-rtx-spark-superchip-at-computex-2026-new-platform-promises-to-turn-windows-into-an-agentic-ai-os-with-arm-cpu-blackwell-gpu-and-128gb-unified-memory" rel="noopener noreferrer"&gt;here&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>machinelearning</category>
      <category>llm</category>
      <category>programming</category>
    </item>
    <item>
      <title>The Same AI Model Can Perform 6x Better: Here's Why</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Sat, 30 May 2026 21:39:59 +0000</pubDate>
      <link>https://dev.to/harryfloyd/the-same-ai-model-can-perform-6x-better-heres-why-440o</link>
      <guid>https://dev.to/harryfloyd/the-same-ai-model-can-perform-6x-better-heres-why-440o</guid>
      <description>&lt;p&gt;A &lt;a href="https://arxiv.org/abs/2603.28052" rel="noopener noreferrer"&gt;Stanford and Tsinghua paper&lt;/a&gt; ran a controlled experiment earlier this year. Same model. Same task. Different harness architecture.&lt;/p&gt;

&lt;p&gt;The result: a 6x performance gap driven entirely by the system built &lt;em&gt;around&lt;/em&gt; the model. Not the model itself.&lt;/p&gt;

&lt;p&gt;This is not a prompt engineering insight. It is a systems architecture insight, and it changes where developers should invest their time when building agentic systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  The 6x Gap
&lt;/h2&gt;

&lt;p&gt;Meta-Harness tested Claude Opus 4.6 across two harness configurations on TerminalBench-2. The only variable was the scaffold: the code that manages tool calls, context windows, error recovery, and state persistence.&lt;/p&gt;

&lt;p&gt;One version scored at baseline. The other, with structured tool orchestration and context management, scored 18.4 points higher. Same inference cost. Same model. Different architecture.&lt;/p&gt;

&lt;p&gt;This pattern replicates across multiple independent studies:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/langchain-ai/deepagents" rel="noopener noreferrer"&gt;LangChain DeepAgents&lt;/a&gt; (2026):&lt;/strong&gt; Same GPT-5.2-Codex model. Harness-only changes moved it from Top 30 to Top 5. That is a 13.7-point gain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://blog.can.ac/2026/02/12/the-harness-problem/" rel="noopener noreferrer"&gt;Can Bölük&lt;/a&gt; (Hashline, 2026):&lt;/strong&gt; Same model, same task. Changed the edit tool format. Performance went from 6.7% to 68.3%. That is a 10x improvement with 61% fewer tokens.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://vercel.com/blog/we-removed-80-percent-of-our-agents-tools" rel="noopener noreferrer"&gt;Vercel's d0 agent&lt;/a&gt;:&lt;/strong&gt; A production agent had 16 tools. Stripping it back to a single tool (bash) took success rate from 80% to 100%. The bottleneck was not capability. It was decision surface.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why This Matters Practically
&lt;/h2&gt;

&lt;p&gt;The cheapest Haiku call with an optimised harness (37.6% on TerminalBench-2) outperformed the most expensive Opus call with a default harness (58.0%). That is at 1/50th the inference cost.&lt;/p&gt;

&lt;p&gt;Most teams are optimising at the wrong layer. They swap models, tune prompts, add retrieval. The structural leverage is in how the system manages tool calls, handles state, and recovers from failure.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Changes
&lt;/h2&gt;

&lt;p&gt;The practical takeaway for anyone building with AI agents:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Audit your tool surface.&lt;/strong&gt; Every tool your agent can call is a decision it must make. Vercel found 16→1 tool reduction improved everything. Fewer tools, better decisions.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Measure harness, not just model.&lt;/strong&gt; Track task completion rate per harness configuration, not just per model. The harness is the variable that moved 6x.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Cost is architecture-dependent, not model-dependent.&lt;/strong&gt; Haiku with a good harness beat Opus with a bad harness. Test harness variations before upgrading to a more expensive model.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
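&lt;p&gt;Tracking completion rate per harness configuration needs nothing more than grouping run logs by model and harness. A minimal sketch: the record fields (&lt;code&gt;model&lt;/code&gt;, &lt;code&gt;harness&lt;/code&gt;, &lt;code&gt;success&lt;/code&gt;) and the sample runs are made up for illustration and do not come from any of the studies above.&lt;/p&gt;

```python
from collections import defaultdict


def completion_rates(runs):
    """Task completion rate keyed by (model, harness)."""
    totals = defaultdict(int)
    successes = defaultdict(int)
    for run in runs:
        key = (run["model"], run["harness"])
        totals[key] += 1
        successes[key] += 1 if run["success"] else 0
    return {key: successes[key] / totals[key] for key in totals}


# Illustrative data only: same model, two harness configurations.
runs = [
    {"model": "model-a", "harness": "default", "success": False},
    {"model": "model-a", "harness": "default", "success": True},
    {"model": "model-a", "harness": "structured", "success": True},
    {"model": "model-a", "harness": "structured", "success": True},
]

for (model, harness), rate in sorted(completion_rates(runs).items()):
    print(f"{model} / {harness}: {rate:.0%}")
# model-a / default: 50%
# model-a / structured: 100%
```

&lt;p&gt;Keeping the model fixed in the key is the point: it isolates the harness as the variable being measured.&lt;/p&gt;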

&lt;p&gt;The full analysis (12 verified claims, evidence tables, production case studies, and falsification criteria) is on Substack:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://harryfloyd.substack.com/p/harness-engineering-same-model-different-product" rel="noopener noreferrer"&gt;Harness Engineering: Same Model, Different Product →&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;It covers the Claude Code 1,421-line state machine, the Codex CLI vs Claude Code architecture comparison (77.3% vs 65.4%, 4.2x token efficiency difference), and why this is a Law IV (Instruments Over Theory) and Law I (Bottleneck Migration) structural play.&lt;/p&gt;


</description>
      <category>ai</category>
      <category>architecture</category>
      <category>productivity</category>
      <category>programming</category>
    </item>
    <item>
      <title>The Decision Subtraction Framework: How to Evaluate Any AI Tool</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Thu, 28 May 2026 10:39:34 +0000</pubDate>
      <link>https://dev.to/harryfloyd/the-decision-subtraction-framework-how-to-evaluate-any-ai-tool-1o1l</link>
      <guid>https://dev.to/harryfloyd/the-decision-subtraction-framework-how-to-evaluate-any-ai-tool-1o1l</guid>
      <description>&lt;p&gt;Last week someone asked me which AI tools they should be using. The question hides a problem that costs real money: there are more capable AI tools available than any single person can evaluate.&lt;/p&gt;

&lt;p&gt;ChatGPT Plus at $20/month. Claude at $20. Grok at $30. Cursor at $20. Copilot at $10. Each with a $100, $200, or $300 variant underneath. Each claims to earn its place.&lt;/p&gt;

&lt;p&gt;The real question is not which tool is best. The real question is: which tools subtract more decisions than they add?&lt;/p&gt;

&lt;h2&gt;
  
  
  The Three Lenses
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Replacement Ratio
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Formula:&lt;/strong&gt; decisions replaced by the tool ÷ decisions it creates.&lt;/p&gt;

&lt;p&gt;List every decision the tool makes for you. Then list every new decision it forces you to make. Divide the first by the second.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Thresholds:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;≥ 2.0 → Keep&lt;/li&gt;
&lt;li&gt;1.0–2.0 → Evaluate&lt;/li&gt;
&lt;li&gt;&amp;lt; 1.0 → Drop&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;&lt;strong&gt;Example:&lt;/strong&gt; A code completion tool that writes a function body (replaces 5 decisions about syntax, structure, naming) but requires review (adds 2 decisions about correctness) has a ratio of 2.5. It passes.&lt;/p&gt;

&lt;p&gt;A meeting summariser that replaces 1 decision (should I re-listen?) but creates 3 (verify accuracy, add context, decide what to share) has a ratio of 0.33. It fails.&lt;/p&gt;
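&lt;p&gt;The ratio and its thresholds fit in a few lines. A minimal sketch using the two examples above. Treating a tool that creates zero new decisions as an infinite ratio is my assumption, not part of the framework as stated.&lt;/p&gt;

```python
def replacement_ratio(replaced: int, created: int) -> float:
    """Decisions replaced by the tool divided by decisions it creates."""
    if created == 0:
        return float("inf")  # assumption: no new decisions means an unbounded ratio
    return replaced / created


def verdict(ratio: float) -> str:
    """Apply the Keep / Evaluate / Drop thresholds."""
    if ratio >= 2.0:
        return "keep"
    if ratio >= 1.0:
        return "evaluate"
    return "drop"


code_completion = replacement_ratio(5, 2)
meeting_summariser = replacement_ratio(1, 3)

print(code_completion, verdict(code_completion))                  # 2.5 keep
print(round(meeting_summariser, 2), verdict(meeting_summariser))  # 0.33 drop
```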

&lt;h3&gt;
  
  
  2. Friction Delta
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Formula:&lt;/strong&gt; time without the tool ÷ time with the tool.&lt;/p&gt;

&lt;p&gt;Include onboarding time amortised over your first 10 uses. A tool that saves 30 minutes per use but took 2 hours to learn breaks even at 4 uses. After that, it is pure gain.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Threshold:&lt;/strong&gt; Break-even within 5 uses.&lt;/p&gt;
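&lt;p&gt;The break-even count is onboarding time divided by time saved per use, rounded up. A minimal sketch with the numbers from the example above. Returning &lt;code&gt;None&lt;/code&gt; for a tool that saves no time is my own convention.&lt;/p&gt;

```python
import math


def break_even_uses(onboarding_minutes: float, saved_minutes_per_use: float):
    """Number of uses needed to pay back onboarding time, or None if never."""
    if saved_minutes_per_use > 0:
        return math.ceil(onboarding_minutes / saved_minutes_per_use)
    return None  # a tool that saves nothing never breaks even


uses = break_even_uses(onboarding_minutes=120, saved_minutes_per_use=30)
print(uses)                             # 4
print(uses is not None and 5 >= uses)   # True: inside the 5-use threshold
```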

&lt;p&gt;&lt;strong&gt;Catch:&lt;/strong&gt; This lens breaks for tools that enable tasks you could not do at all before. A drug discovery simulation has infinite Friction Delta because the alternative is impossible. Score those as "can't evaluate on this lens" and rely on the others.&lt;/p&gt;

&lt;h3&gt;
  
  
  3. Attention ROI
&lt;/h3&gt;

&lt;p&gt;&lt;strong&gt;Formula:&lt;/strong&gt; output quality ÷ attention consumed.&lt;/p&gt;

&lt;p&gt;Estimate cognitive load per use on a simple scale: 1 (fire and forget) to 4 (full attention required). Track whether it goes up or down over 10 uses.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Threshold:&lt;/strong&gt; Attention per use should decrease over time. If you need to watch it more closely after ten uses than after one, something is wrong.&lt;/p&gt;
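&lt;p&gt;One simple way to track the trend is to log a 1-4 score after each use and compare the early uses with the later ones. A minimal sketch: comparing the mean of the first half with the mean of the second half is my choice of method, and the sample log is invented.&lt;/p&gt;

```python
from statistics import mean


def attention_trend(scores: list[int]) -> str:
    """Compare mean attention in the first half of the log with the second half."""
    if 2 > len(scores):
        raise ValueError("need at least two scores to see a trend")
    half = len(scores) // 2
    early, late = mean(scores[:half]), mean(scores[half:])
    if late > early:
        return "rising"
    if early > late:
        return "falling"
    return "flat"


# Invented log of ten uses, scored 1 (fire and forget) to 4 (full attention).
log = [4, 4, 3, 3, 3, 2, 2, 2, 1, 1]
print(attention_trend(log))  # falling
```

&lt;p&gt;A "rising" or "flat" result after ten uses is the signal to look harder at the tool.&lt;/p&gt;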

&lt;h2&gt;
  
  
  Where This Framework Lies to You
&lt;/h2&gt;

&lt;p&gt;I tested this framework against the hardest cases I could find. It failed in five ways. Knowing them makes it useful:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Decision quality matters more than quantity.&lt;/strong&gt; One high-stakes judgment (should I deploy?) outweighs 10 trivial picks (camelCase or snake_case?). Weight strategically.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Friction Delta can't measure capability expansion.&lt;/strong&gt; If a tool lets you do something new rather than just faster, skip this lens.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Attention ROI rewards deskilling.&lt;/strong&gt; The descending attention threshold is a Goodhart target — it rewards tools that train you to rubber-stamp.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Erasure cost is invisible.&lt;/strong&gt; The framework never asks: if I use this for a year, what can I no longer do without it?&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Error asymmetry is invisible.&lt;/strong&gt; Two tools can score identically while producing catastrophically different results when they fail.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  The Fourth Lens: Erasure Cost
&lt;/h2&gt;

&lt;p&gt;Ask: "If I use this tool for six months and then stop, what skill will I have lost?"&lt;/p&gt;

&lt;p&gt;Score it: 1 (nothing lost) to 4 (core competency outsourced). Score 1-2 is safe. Score 3 is a deliberate trade. Score 4 is dependency, not tooling.&lt;/p&gt;

&lt;h2&gt;
  
  
  How to Apply: Monday Morning
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;List every AI tool you have used in the last 30 days&lt;/li&gt;
&lt;li&gt;Score Replacement Ratio and Friction Delta for each&lt;/li&gt;
&lt;li&gt;Both pass → Keep. One fails → 7-day trial. Both fail → Cancel&lt;/li&gt;
&lt;li&gt;Score Erasure Cost for the survivors&lt;/li&gt;
&lt;li&gt;When evaluating a new tool: score it before subscribing&lt;/li&gt;
&lt;/ol&gt;
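&lt;p&gt;Step 3 of the list above is a small decision table. A minimal sketch: the first two entries follow the worked examples in this article, and the third tool is hypothetical, included only to show the one-fail branch.&lt;/p&gt;

```python
def triage(ratio_passes: bool, friction_passes: bool) -> str:
    """Map the two lens results to keep / 7-day trial / cancel."""
    passed = int(ratio_passes) + int(friction_passes)
    return {2: "keep", 1: "7-day trial", 0: "cancel"}[passed]


tools = {
    "ChatGPT Plus": (True, True),
    "Meeting summariser": (False, False),
    "Hypothetical note-taker": (True, False),  # illustrative only
}

for name, (ratio_ok, friction_ok) in tools.items():
    print(f"{name}: {triage(ratio_ok, friction_ok)}")
# ChatGPT Plus: keep
# Meeting summariser: cancel
# Hypothetical note-taker: 7-day trial
```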

&lt;h2&gt;
  
  
  Worked Examples
&lt;/h2&gt;

&lt;h3&gt;
  
  
  ChatGPT Plus ($20/month)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Replacement Ratio:&lt;/strong&gt; 3.5. Replaces research lookups, drafting, formatting. Creates verification and prompt decisions. Pass.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Friction Delta:&lt;/strong&gt; Breakeven in 2-3 uses. Shallow learning curve. Pass.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Attention ROI:&lt;/strong&gt; Decreasing. Gets faster as you learn its patterns. Pass.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Erasure Cost:&lt;/strong&gt; 2. The underlying skill (structuring an argument) is reinforced, not replaced.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verdict:&lt;/strong&gt; Keep.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Cursor Pro ($20/month)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Replacement Ratio:&lt;/strong&gt; 4.0. Replaces syntax lookups, boilerplate, function structure. Creates code review decisions. Pass.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Friction Delta:&lt;/strong&gt; Breakeven in 1-2 uses. Tab completion is instant. Pass.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Attention ROI:&lt;/strong&gt; Steeply decreasing. Pass.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Erasure Cost:&lt;/strong&gt; 3. Heavy users report difficulty writing syntax without it after 3+ months. A deliberate trade worth making.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verdict:&lt;/strong&gt; Keep for daily coding. Monitor erasure.&lt;/li&gt;
&lt;/ul&gt;

&lt;h3&gt;
  
  
  Meeting Summariser ($20/month, anonymised)
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Replacement Ratio:&lt;/strong&gt; 0.33. Replaces 1 decision. Creates 3. Fails.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Friction Delta:&lt;/strong&gt; Never breaks even. Still attend meetings, still verify. Fails.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Attention ROI:&lt;/strong&gt; Flat. Must check every summary at same level. Fails.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Erasure Cost:&lt;/strong&gt; 2. Minor skill atrophy.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Verdict:&lt;/strong&gt; Cancel.&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;This framework connects to a deeper structural principle: a tool's value is the difficulty it removes. If it creates new difficulty of a different kind, it is not a tool. It is a job.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Full framework with diagram: &lt;a href="https://telegra.ph/The-Decision-Subtraction-Framework-How-to-Evaluate-Any-AI-Tool-05-28" rel="noopener noreferrer"&gt;https://telegra.ph/The-Decision-Subtraction-Framework-How-to-Evaluate-Any-AI-Tool-05-28&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>productivity</category>
      <category>frameworks</category>
      <category>decisionmaking</category>
    </item>
    <item>
      <title>The Verification Bottleneck: Why Testing AI Agents Is Harder Than Building Them</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Sat, 23 May 2026 01:54:26 +0000</pubDate>
      <link>https://dev.to/harryfloyd/the-verification-bottleneck-why-testing-ai-agents-is-harder-than-building-them-3716</link>
      <guid>https://dev.to/harryfloyd/the-verification-bottleneck-why-testing-ai-agents-is-harder-than-building-them-3716</guid>
      <description>

&lt;p&gt;The AI industry has a supply problem. Not with chips, not with models, not with capital. It is a verification problem.&lt;/p&gt;

&lt;p&gt;Every week, another agent framework launches. Every month, another company announces autonomous task completion. The build rate is accelerating. But the verification rate, the speed at which we can determine whether an agent’s output is correct, safe and worth trusting, is not keeping pace.&lt;/p&gt;

&lt;p&gt;This gap between build speed and verify speed is structural. It is not going to close on its own. And it is the bottleneck the market is not yet pricing.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Build-Verify Asymmetry
&lt;/h2&gt;

&lt;p&gt;Building an AI agent is getting cheaper. Open-weight models, orchestration frameworks like LangGraph and Semantic Kernel, and managed inference APIs have all pushed in the same direction. The cost of standing up a functional agent pipeline has dropped by an order of magnitude in 18 months.&lt;/p&gt;

&lt;p&gt;Verification has not followed the same curve.&lt;/p&gt;

&lt;p&gt;To verify a single agent run, you need:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Ground-truth data for the task domain&lt;/li&gt;
&lt;li&gt;A mechanism for comparing structured and unstructured outputs&lt;/li&gt;
&lt;li&gt;Edge-case coverage for the long tail of user inputs&lt;/li&gt;
&lt;li&gt;Human review loops for uncertain cases&lt;/li&gt;
&lt;li&gt;Regression test suites that survive model updates&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each of these is expensive to build and maintain. And unlike the agent’s inference cost (which drops predictably with hardware improvements), verification cost is labour-proportional. Human review scales with headcount. Ground-truth data requires domain expertise to curate.&lt;/p&gt;

&lt;p&gt;This creates an asymmetry that compounds with scale:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Dimension&lt;/th&gt;
&lt;th&gt;Building&lt;/th&gt;
&lt;th&gt;Verifying&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Cost trajectory&lt;/td&gt;
&lt;td&gt;Dropping (compute + models)&lt;/td&gt;
&lt;td&gt;Flat or rising (labour + data)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Scaling method&lt;/td&gt;
&lt;td&gt;More compute&lt;/td&gt;
&lt;td&gt;More humans or better instruments&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Automation potential&lt;/td&gt;
&lt;td&gt;High (the agent itself)&lt;/td&gt;
&lt;td&gt;Low (ground truth is domain-specific)&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Failure mode&lt;/td&gt;
&lt;td&gt;No agent&lt;/td&gt;
&lt;td&gt;Wrong agent output that looks correct&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;The fourth row is the dangerous one. A false negative in verification (approving an incorrect agent output) has no visible failure signal until the downstream damage is done. A false positive (rejecting a correct output) creates friction and user frustration straight away. Verification systems therefore get tuned against the error people can see, the false positive, while the costlier error, the false negative, stays invisible.&lt;/p&gt;
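&lt;p&gt;Making the invisible side measurable starts with labelling a sample of verifier decisions against ground truth and counting both kinds of error. A minimal sketch, using this article's definitions (false negative: an incorrect output that was approved; false positive: a correct output that was rejected). The record layout and the sample data are invented.&lt;/p&gt;

```python
def verifier_error_rates(records):
    """records: list of (output_is_correct, verifier_approved) pairs.

    Returns (false_negative_rate, false_positive_rate).
    """
    incorrect = [r for r in records if not r[0]]
    correct = [r for r in records if r[0]]
    missed = sum(1 for _, approved in incorrect if approved)
    rejected = sum(1 for _, approved in correct if not approved)
    fn_rate = missed / len(incorrect) if incorrect else 0.0
    fp_rate = rejected / len(correct) if correct else 0.0
    return fn_rate, fp_rate


# Invented sample: 4 incorrect outputs (1 approved), 6 correct outputs (3 rejected).
records = (
    [(False, True)] + [(False, False)] * 3
    + [(True, False)] * 3 + [(True, True)] * 3
)

fn_rate, fp_rate = verifier_error_rates(records)
print(f"false negative rate: {fn_rate:.0%}")  # 25%
print(f"false positive rate: {fp_rate:.0%}")  # 50%
```

&lt;p&gt;The expensive part is the first column: someone with domain knowledge has to decide whether each output was actually correct.&lt;/p&gt;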

&lt;h2&gt;
  
  
  What the Market Is Missing
&lt;/h2&gt;

&lt;p&gt;When developers talk about agent reliability, the conversation usually lands on one of three things: chain-of-thought prompting, retrieval-augmented generation quality, or model fine-tuning. These are useful but they are not verification. They are attempts to make the agent less likely to produce wrong outputs in the first place.&lt;/p&gt;

&lt;p&gt;Verification is a separate problem. It is the instrument that detects whether the output, regardless of how it was produced, is correct.&lt;/p&gt;

&lt;p&gt;This is a Law IV problem. Law IV of the Durability Curve framework says that hidden structure stays hidden until you build the instrument to observe it. The hidden structure in AI agents is their failure modes. We are deploying agents without instruments to observe failures at scale because those instruments do not exist yet in any reliable form.&lt;/p&gt;

&lt;p&gt;The companies that build those instruments will capture value that currently sits unclaimed.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Verification Infrastructure Looks Like
&lt;/h2&gt;

&lt;p&gt;The verification problem decomposes into layers:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Structural verification.&lt;/strong&gt; Does the agent’s output conform to a known schema? JSON parsers, Pydantic models, and structured output constraints handle this today. This layer is the most mature but only catches format errors, not semantic ones.&lt;/p&gt;
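&lt;p&gt;A structural check can be this small. The sketch below uses only the standard library as a stand-in for what a Pydantic model would do. The schema (&lt;code&gt;action&lt;/code&gt;, &lt;code&gt;target&lt;/code&gt;, &lt;code&gt;confidence&lt;/code&gt;) is hypothetical.&lt;/p&gt;

```python
import json

# Hypothetical schema for an agent's tool-call output.
REQUIRED_FIELDS = {"action": str, "target": str, "confidence": float}


def structurally_valid(raw: str) -> tuple[bool, str]:
    """Check that raw is a JSON object with the required fields and types."""
    try:
        payload = json.loads(raw)
    except json.JSONDecodeError as exc:
        return False, f"not JSON: {exc.msg}"
    if not isinstance(payload, dict):
        return False, "top level is not an object"
    for field, expected in REQUIRED_FIELDS.items():
        if field not in payload:
            return False, f"missing field: {field}"
        if not isinstance(payload[field], expected):
            return False, f"wrong type for {field}"
    return True, "ok"


good = '{"action": "delete", "target": "prod-db", "confidence": 0.99}'
bad = '{"action": "delete"}'

print(structurally_valid(good))  # (True, 'ok')
print(structurally_valid(bad))   # (False, 'missing field: target')
```

&lt;p&gt;Note what the passing example says: it is perfectly well-formed, and nothing in this layer can tell you whether deleting that target was the right call.&lt;/p&gt;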

&lt;p&gt;&lt;strong&gt;Semantic verification.&lt;/strong&gt; Does the output mean what we think it means? This is where the hard problems live. For a code-generating agent, does the produced code actually solve the user’s problem? For a document-analysis agent, are the extracted facts correct? This requires a second model, a verifier, running in evaluation mode.&lt;/p&gt;
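&lt;p&gt;For the code-generation case, the simplest semantic check is not a verifier model but a set of ground-truth cases run against the generated function. The sketch below swaps in that simpler technique. The generated function and the expected values (including the choice that an empty list should average to 0.0) are assumptions made for illustration.&lt;/p&gt;

```python
def semantic_check(candidate, cases):
    """Run candidate against (args, expected) pairs and collect failures."""
    failures = []
    for args, expected in cases:
        try:
            actual = candidate(*args)
        except Exception as exc:
            failures.append((args, f"raised {type(exc).__name__}"))
            continue
        if actual != expected:
            failures.append((args, f"got {actual!r}, expected {expected!r}"))
    return failures


# Hypothetical agent output: reads fine, but mishandles empty input.
def generated_average(values):
    return sum(values) / len(values)


# Ground truth supplied by a human; the empty-list expectation is an assumption.
cases = [
    (([1, 2, 3],), 2.0),
    (([],), 0.0),
]

print(semantic_check(generated_average, cases))
# [(([],), 'raised ZeroDivisionError')]
```

&lt;p&gt;The check is only as good as the cases, which is exactly why this layer stays labour-proportional: someone has to know what the right answer is.&lt;/p&gt;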

&lt;p&gt;&lt;strong&gt;Behavioural verification.&lt;/strong&gt; Does the agent behave appropriately across a distribution of inputs? Not just single-shot accuracy but conversation-level coherence, safety boundary adherence, and refusal calibration.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Observability.&lt;/strong&gt; Can you trace what the agent did, why it did it, and where it went wrong? This is the instrumentation layer: the traces that tool calls, prompts and agent steps leave behind. Datadog and ServiceNow are building in this space, but the landscape is fragmented and the standards are immature.&lt;/p&gt;

&lt;p&gt;The market currently prices the first layer (structural verification) as solved, which it mostly is. It ignores the existence of the second and third layers. And it treats the fourth layer as a monitoring problem rather than a verification problem.&lt;/p&gt;
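
&lt;p&gt;The gap between the first two layers can be sketched with the Python standard library alone. Everything named here is hypothetical: the invoice schema, the field names and the sample output are invented for illustration. The semantic check is a deterministic stand-in; in practice that layer would more likely be a second model acting as verifier.&lt;/p&gt;

```python
import json

# Hypothetical schema for an agent that extracts invoice totals.
REQUIRED_FIELDS = {"invoice_id": str, "line_items": list, "total": float}


def verify_structure(raw):
    """Layer 1: parse the output and check field presence and types."""
    try:
        data = json.loads(raw)
    except json.JSONDecodeError:
        return None
    if not isinstance(data, dict):
        return None
    for name, expected_type in REQUIRED_FIELDS.items():
        if not isinstance(data.get(name), expected_type):
            return None
    return data


def verify_semantics(data):
    """Layer 2 stand-in: recompute the total from the line items.

    A production verifier would likely be a second model; a deterministic
    cross-check is used here only to keep the sketch runnable.
    """
    recomputed = sum(item["amount"] for item in data["line_items"])
    return 0.005 > abs(recomputed - data["total"])


# Well-formed but wrong: passes layer 1, fails layer 2.
output = '{"invoice_id": "A-17", "line_items": [{"amount": 40.0}, {"amount": 2.5}], "total": 45.0}'
parsed = verify_structure(output)
structurally_valid = parsed is not None
semantically_valid = structurally_valid and verify_semantics(parsed)
print(structurally_valid, semantically_valid)
```

&lt;p&gt;The sample output is valid JSON with every required field, so a schema-only pipeline would accept it even though the total does not match the line items.&lt;/p&gt;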

&lt;h2&gt;
  
  
  The Instrument-Making Opportunity
&lt;/h2&gt;

&lt;p&gt;The history of technology markets suggests a pattern: the layer that controls verification captures a disproportionate share of value.&lt;/p&gt;

&lt;p&gt;In software, the testing and observability tools (New Relic, Datadog, Selenium) created markets larger than many of the products they tested. In hardware, the inspection equipment market (KLA, ASML’s metrology) rivals the fabrication equipment market.&lt;/p&gt;

&lt;p&gt;The same dynamic is unfolding in AI. The companies building agent-verification infrastructure (whether through evaluation frameworks, structured-output tooling or agent-observability platforms) are building instruments for a structure the market does not yet see clearly.&lt;/p&gt;

&lt;p&gt;The falsification condition for this thesis is straightforward: if existing evaluation approaches (benchmarks, human review, test suites) prove sufficient for production agent deployment, the verification bottleneck does not materialise. But signals from production deployments, including the proliferation of guardrails, the emergence of dedicated evaluation teams at frontier labs and the growing literature on agent failure modes, suggest the opposite.&lt;/p&gt;

&lt;h2&gt;
  
  
  What Developers Should Watch
&lt;/h2&gt;

&lt;p&gt;Three signals indicate whether the verification layer is becoming load-bearing:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Signal one: agent deployment velocity vs. incident rate.&lt;/strong&gt; If agents are deployed faster but incident rates are not rising proportionally, verification is keeping pace. If incidents are accelerating faster than deployment, the bottleneck is tightening.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Signal two: emergence of dedicated agent-evaluation roles.&lt;/strong&gt; The first companies to hire “agent verifier” or “AI evaluation engineer” as a distinct role, outside QA, are signalling that verification is not a subset of testing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Signal three: consolidation around evaluation standards.&lt;/strong&gt; If the ecosystem converges on one or two evaluation frameworks (beyond simple benchmark suites) within the next 12 months, the instrument-making phase is accelerating.&lt;/p&gt;
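
&lt;p&gt;Signal one is the easiest of the three to track numerically. A minimal sketch, assuming you already count deployments and incidents per period; the function names and the figures are made up for illustration.&lt;/p&gt;

```python
def growth(previous, current):
    """Fractional growth between two period counts."""
    return (current - previous) / previous


def bottleneck_tightening(deployments, incidents):
    """True when incidents grow faster than deployments.

    Each argument is a (previous_period, current_period) pair of counts.
    """
    return growth(*incidents) > growth(*deployments)


# Made-up quarterly counts, for illustration only.
deployments = (120, 180)   # +50%
incidents = (10, 18)       # +80%
print(bottleneck_tightening(deployments, incidents))
```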

&lt;p&gt;The build-verify asymmetry is not permanent. It is a market inefficiency that will correct, either through better verification infrastructure or through a pullback in agent deployment when undetected failures accumulate. Which correction path wins depends on whether the instrument makers move faster than the failure curve.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>architecture</category>
      <category>analysis</category>
      <category>infrastructure</category>
    </item>
    <item>
      <title>NVIDIA's $81.6B Quarter Confirms the Networking Bottleneck — Here's What Developers Should Know</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Thu, 21 May 2026 08:39:53 +0000</pubDate>
      <link>https://dev.to/harryfloyd/nvidias-816b-quarter-confirms-the-networking-bottleneck-heres-what-developers-should-know-5hal</link>
      <guid>https://dev.to/harryfloyd/nvidias-816b-quarter-confirms-the-networking-bottleneck-heres-what-developers-should-know-5hal</guid>
      <description>&lt;p&gt;NVIDIA reported $81.6 billion in revenue for Q1 FY2027 yesterday. That's an 85% year-over-year increase. Non-GAAP EPS of $1.87 beat consensus by 6%. Q2 guidance of $91 billion is 4% above what analysts expected.&lt;/p&gt;

&lt;p&gt;By every headline measure, this was a clean beat and raise. The stock closed up 1.8% and was flat after hours.&lt;/p&gt;

&lt;p&gt;This is now the fifth quarter out of six in which NVIDIA beat the numbers and the stock did not rally. The headline numbers are priced in before the print. The signal is in the &lt;em&gt;composition&lt;/em&gt; of revenue, not the total.&lt;/p&gt;

&lt;h2&gt;
  
  
  The networking number that changes the story
&lt;/h2&gt;

&lt;p&gt;Data Center networking revenue reached &lt;strong&gt;$14.8 billion&lt;/strong&gt; — a record, up &lt;strong&gt;199% year-over-year&lt;/strong&gt; and 35% sequentially. Compare that to Data Center compute revenue of $60.4 billion, which grew 77% year-over-year.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The networking segment is growing at 2.6x the rate of the compute segment.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;This is the clearest signal yet that the bottleneck in AI training is migrating from GPU FLOPs to inter-GPU bandwidth. As clusters scale past 50,000 devices, the wall-clock constraint shifts from "how fast can the GPU multiply matrices" to "how fast can the network move gradients between GPUs."&lt;/p&gt;

&lt;p&gt;Two years ago, networking was roughly 12% of Data Center revenue. It is now 20% and accelerating.&lt;/p&gt;
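
&lt;p&gt;The two ratios above can be reproduced from the quoted figures. The only assumption in this sketch is that Data Center revenue is the sum of the networking and compute lines.&lt;/p&gt;

```python
# Figures quoted in this article (USD billions and year-over-year growth in %).
networking_revenue = 14.8
compute_revenue = 60.4
networking_growth = 199
compute_growth = 77

growth_ratio = networking_growth / compute_growth
# Assumes Data Center revenue = networking + compute.
networking_share = networking_revenue / (networking_revenue + compute_revenue)

print(f"growth ratio: {growth_ratio:.1f}x")
print(f"networking share of Data Center: {networking_share:.1%}")
```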

&lt;h2&gt;
  
  
  The full-stack moat is protecting margins
&lt;/h2&gt;

&lt;p&gt;GAAP gross margin reached &lt;strong&gt;74.9%&lt;/strong&gt; — up from 60.6% a year ago. This directly contradicts the narrative that Blackwell volume production would compress margins through CoWoS packaging and HBM memory costs.&lt;/p&gt;

&lt;p&gt;Margins are expanding because the full stack (CUDA + NVLink + Spectrum-X + Blackwell silicon) creates pricing power that chip-design-alone cannot match. No competitor currently replicates this combination of scale and margin.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key number:&lt;/strong&gt; Operating income of ~$53.3 billion, up 147% year-over-year. GAAP net income tripled to $58.3 billion.&lt;/p&gt;

&lt;h2&gt;
  
  
  $48.6 billion in free cash flow — in one quarter
&lt;/h2&gt;

&lt;p&gt;That's a 60% FCF margin. Only about 15 companies in the world generate more net profit in an entire year than NVIDIA generates in cash in three months.&lt;/p&gt;

&lt;p&gt;Management's confidence shows in capital returns: dividend from $0.01 to $0.25 per share (25x increase), new $80 billion buyback authorization, and ~$20 billion returned to shareholders during the quarter.&lt;/p&gt;

&lt;h2&gt;
  
  
  Vera Rubin timeline de-risks the transition
&lt;/h2&gt;

&lt;p&gt;Confirmed: on track for H2 2026, starting Q3, volume ramp in Q4. Architectural transitions are the highest-risk moment for any semiconductor company. This quarter's confirmation substantially de-risks the handoff.&lt;/p&gt;

&lt;p&gt;Blackwell demand is so strong it's driving up secondary-market prices for older Hopper and Ampere GPUs.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why the bottleneck matters for infrastructure engineers
&lt;/h2&gt;

&lt;p&gt;If you're designing training clusters, the implication is direct: the binding constraint on throughput is shifting from GPU selection to network topology. Spectrum-X Ethernet and InfiniBand choices matter more per dollar than GPU generation increments at the margin.&lt;/p&gt;

&lt;p&gt;For inference workloads, the bottleneck migration has different dynamics — memory bandwidth and model distribution across nodes become the constraint. But the pattern holds: the easy gains are no longer at the compute layer.&lt;/p&gt;

&lt;h2&gt;
  
  
  The one risk
&lt;/h2&gt;

&lt;p&gt;Data Center now accounts for 92% of revenue. This is not a company risk; it is a regime risk. If hyperscaler capex turns cyclical, 92% of revenue faces the same headwind.&lt;/p&gt;

&lt;h2&gt;
  
  
  All falsification triggers are green
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Q2 guide below $85B → Guided $91B&lt;/li&gt;
&lt;li&gt;Vera Rubin delayed beyond Q3 → On track Q3&lt;/li&gt;
&lt;li&gt;Gross margin below 73% → 74.9%&lt;/li&gt;
&lt;li&gt;Networking growth &amp;lt; compute growth → 199% vs 77%&lt;/li&gt;
&lt;/ul&gt;
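
&lt;p&gt;The numeric triggers in this list can be checked mechanically. The sketch below is one possible encoding, not a published methodology: the tuple layout and the &lt;code&gt;status&lt;/code&gt; helper are hypothetical, and the Vera Rubin timing trigger is left out because it is qualitative.&lt;/p&gt;

```python
# Each trigger: (name, threshold, observed). The thesis is challenged
# when the observed value falls below the threshold.
triggers = [
    ("Q2 revenue guide, $B", 85.0, 91.0),
    ("Gross margin, %", 73.0, 74.9),
    ("Networking growth vs compute growth, %", 77.0, 199.0),
]


def status(threshold, observed):
    """Return 'green' if the observed value is at or above the threshold."""
    return "green" if observed >= threshold else "red"


results = {name: status(threshold, observed) for name, threshold, observed in triggers}
for name, colour in results.items():
    print(f"{name}: {colour}")
```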

&lt;p&gt;The thesis is intact and the data is strengthening it.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://durability-curve.pages.dev/blog/nvda-q1-fy2027-the-networking-number-that-changes-the-story/" rel="noopener noreferrer"&gt;The Durability Curve&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>infrastructure</category>
      <category>architecture</category>
      <category>investing</category>
    </item>
    <item>
      <title>The Indium Phosphide Bottleneck: What the Market Missed</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Wed, 20 May 2026 11:19:09 +0000</pubDate>
      <link>https://dev.to/harryfloyd/the-indium-phosphide-bottleneck-what-the-market-missed-44lc</link>
      <guid>https://dev.to/harryfloyd/the-indium-phosphide-bottleneck-what-the-market-missed-44lc</guid>
      <description>&lt;p&gt;A UK semiconductor company lost a quarter of its value in a single day this week. The market assumed Huawei exposure was catastrophic. The Huawei exposure is under five percent of revenue.&lt;/p&gt;

&lt;p&gt;The panic was about the wrong thing.&lt;/p&gt;

&lt;h2&gt;
  
  
  Where Indium Phosphide Fits in the AI Stack
&lt;/h2&gt;

&lt;p&gt;Every 1.6T optical transceiver shipping into an AI cluster relies on lasers made from indium phosphide (InP). Silicon cannot lase. InP can.&lt;/p&gt;

&lt;p&gt;The supply chain:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Raw indium (zinc byproduct)
  → InP boule → sliced into substrates (AXT Inc)
    → Epitaxial wafer growth (IQE)
      → Laser fabrication (Lumentum, Coherent, MACOM)
        → Transceiver assembly
          → AI GPU clusters
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;IQE sits at the third step. It is the only independent large-scale epiwafer foundry with qualified InP capacity globally.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Switching Is Hard
&lt;/h2&gt;

&lt;p&gt;Qualifying a new epiwafer supplier takes 12-24 months of sample testing, device fabrication trials and reliability qualification. Once a supplier is qualified, switching costs are enormous. There are approximately five qualified InP epiwafer suppliers worldwide, and IQE is the only independent one at scale.&lt;/p&gt;

&lt;h2&gt;
  
  
  The MACOM Signal
&lt;/h2&gt;

&lt;p&gt;In April 2026, MACOM invested £81M into IQE at 19.8p per share. MACOM received 11.5% ownership and two board seats, plus a long-term strategic supply agreement. The message: they cannot secure InP capacity without IQE.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why Silicon Photonics Does Not Kill This
&lt;/h2&gt;

&lt;p&gt;The common objection: silicon photonics will replace InP. This misunderstands the technology. Even silicon photonic ICs need InP lasers, integrated heterogeneously. Monolithic lasers-on-silicon are not expected before 2030. InP is needed regardless of which platform wins.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Tension
&lt;/h2&gt;

&lt;p&gt;IQE is not a clean story. Gross margins were under 4% last year on £118M revenue because reactors run at roughly half capacity. The wireless segment (57% of revenue) is declining. Three CEOs in five years. The current CEO also serves as CFO.&lt;/p&gt;

&lt;p&gt;But the underlying structural thesis is clean: the InP bottleneck is real and tightening. AXT Inc is doubling capacity for 2027. Lumentum reported 90% revenue growth. The optical interconnect buildout for AI is a capital expenditure cycle backed by the largest technology companies in the world.&lt;/p&gt;

&lt;p&gt;The pattern has historical precedent. As Marc Levinson documents in &lt;a href="https://www.amazon.co.uk/dp/0691136408/?tag=giftfndr0d8-21" rel="noopener noreferrer"&gt;The Box&lt;/a&gt;, the shipping container created a modular interface that collapsed freight costs — the same structural pattern happening now in optical interconnects.&lt;/p&gt;

&lt;p&gt;The market panicked about Huawei and ignored the structural story underneath. That is the pattern: value migrates upward as lower layers commoditise, and the market is slow to update on where the new bottleneck sits.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://telegra.ph/The-Indium-Phosphide-Bottleneck-05-20" rel="noopener noreferrer"&gt;Telegraph&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>infrastructure</category>
      <category>analysis</category>
      <category>architecture</category>
    </item>
    <item>
      <title>The Earnings Report Is Not the Signal</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Wed, 20 May 2026 09:15:00 +0000</pubDate>
      <link>https://dev.to/harryfloyd/the-earnings-report-is-not-the-signal-3l6f</link>
      <guid>https://dev.to/harryfloyd/the-earnings-report-is-not-the-signal-3l6f</guid>
      <description>&lt;p&gt;NVDA reports $80B+ tonight. The headlines will focus on the number. Here is why it will not tell you what you need to know about the AI buildout paradigm.&lt;/p&gt;

&lt;p&gt;Every earnings season, the same pattern plays out. A company reports a number. The market moves. Analysts revise targets. By the next morning, everyone is asking the wrong question.&lt;/p&gt;

&lt;p&gt;The question is not whether NVIDIA beat or missed. The question is what the structure of the quarter reveals about the migration path of the bottleneck.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Earnings Number Is a Lagging Indicator
&lt;/h2&gt;

&lt;p&gt;A revenue beat tells you what happened in the past ninety days. The structural variables live in the forward-looking text, not the headline number.&lt;/p&gt;

&lt;p&gt;Consider three data points from NVIDIA's last quarter:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Revenue of $68 billion. A beat. Priced in hours.&lt;/li&gt;
&lt;li&gt;Supply commitments doubled to $95 billion. A signal. Took weeks to fully price.&lt;/li&gt;
&lt;li&gt;A $500 million investment in a glass company. A map. Months later, it is still being understood.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Three Kinds of Data in Every Report
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Signals&lt;/strong&gt; — Forward-looking structural data that changes the probability distribution. Supply commitments. Lead times. Capacity expansion timelines. Customer concentration shifts. Rare. Worth a position.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Noise&lt;/strong&gt; — Beats and misses within expected range. Drive the overnight move. Mean nothing for the thesis.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Echoes&lt;/strong&gt; — Lagging confirmation of a known trend. Useful for calibration. Add no new information.&lt;/p&gt;

&lt;h2&gt;
  
  
  What to Watch Tonight
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;The Data Center narrative&lt;/strong&gt; — is the mix shifting from training to inference? That is Law I migration.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supply chain language&lt;/strong&gt; — optical interconnects and glass substrates in prepared remarks are structural signals.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Supply commitment growth rate&lt;/strong&gt; — if it exceeds revenue growth, NVIDIA is building against a constraint they see coming.&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Why This Framework Matters
&lt;/h2&gt;

&lt;p&gt;The durability of an investment thesis is determined by how well you distinguish signal from noise from echo. The market is designed to make everything look equally important. The structure that separates durable value from temporary noise is invisible to the real-time feed.&lt;/p&gt;

&lt;p&gt;The report is not the signal. The structure behind the report is the signal.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Originally published on &lt;a href="https://telegra.ph/The-Earnings-Report-Is-Not-the-Signal-05-20" rel="noopener noreferrer"&gt;The Durability Curve&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>infrastructure</category>
      <category>analysis</category>
      <category>architecture</category>
    </item>
    <item>
      <title>3 Infrastructure Bottlenecks That Exist Beyond Any Single Earnings Report</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Tue, 19 May 2026 09:25:37 +0000</pubDate>
      <link>https://dev.to/harryfloyd/3-infrastructure-bottlenecks-that-exist-beyond-any-single-earnings-report-2nd0</link>
      <guid>https://dev.to/harryfloyd/3-infrastructure-bottlenecks-that-exist-beyond-any-single-earnings-report-2nd0</guid>
      <description>&lt;p&gt;NVIDIA reports Q1 FY2027 earnings tomorrow. The consensus sits at roughly $79B in revenue with a 7% options move priced in. Every analyst note and headline will be about whether the number clears the bar.&lt;/p&gt;

&lt;p&gt;None of that matters for the three bottlenecks that will define compute infrastructure investment for the next three years.&lt;/p&gt;

&lt;p&gt;These constraints are structurally decoupled from NVIDIA’s quarterly variance. They exist whether the beat is 3% or 8%. They exist because of a first-principles property of large-scale systems: value migrates upward as lower layers commoditise.&lt;/p&gt;

&lt;p&gt;The GPU layer is the lower layer. These three sit above it.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. The Grid Transformer: 128 Weeks and No Substitute
&lt;/h2&gt;

&lt;p&gt;The electrical transformer that steps voltage from transmission lines to data center distribution levels has a procurement lead time of 80 to 128 weeks globally. This is a structural ceiling on how fast AI infrastructure can physically be built.&lt;/p&gt;

&lt;p&gt;This constraint exists regardless of GPU supply. You can have every Blackwell GPU allocated and paid for. If the transformer cabinet is not bolted to a concrete pad with an energized feed from the grid, those GPUs are room-temperature silicon.&lt;/p&gt;

&lt;p&gt;The market is slowly noticing. ABB, Siemens Energy, and Schneider Electric have rerated upward. But the real asymmetry sits two layers deeper.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. The Optical Link: Data Movement, Not Compute
&lt;/h2&gt;

&lt;p&gt;Inside every large AI training cluster, data moves between GPUs at terabit speeds. As clusters scale to 100,000+ GPUs, the distance between nodes forces a fundamental migration from electrical to optical interconnects.&lt;/p&gt;

&lt;p&gt;NVIDIA spent approximately $4B securing supply from Lumentum and Coherent. The photonics supply chain is tight through at least 2027.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. The Evaluation Layer: Trust as Infrastructure
&lt;/h2&gt;

&lt;p&gt;As AI agents move from demos into production workflows, the binding constraint shifts from “can the model do this?” to “can we prove it did it correctly?”&lt;/p&gt;

&lt;p&gt;An estimated 2-3% of AI-generated code passes its tests but still contains subtle errors. Catching that minority requires evaluation infrastructure that most teams do not have.&lt;/p&gt;

&lt;p&gt;This market barely exists today. No dominant platform for AI evaluation exists. That absence is itself the pattern: the layer below is commoditising, and the layer above is where value forms next.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why These Three
&lt;/h2&gt;

&lt;p&gt;Each bottleneck sits at a layer above the GPU in the infrastructure stack. The GPU is being efficiently priced by the market. The constraints above it are where the market’s attention has not yet reached.&lt;/p&gt;

&lt;p&gt;NVIDIA will beat or miss tomorrow. By Thursday the financial media will have moved on. These three bottlenecks will still be tightening.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;36-page NVIDIA earnings research report: &lt;a href="https://harryfloyd.gumroad.com/l/nvda-q1-fy2027-earnings-research-report?utm_source=devto" rel="noopener noreferrer"&gt;Gumroad&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Free weekly analysis: &lt;a href="https://harryfloyd.substack.com?utm_source=devto" rel="noopener noreferrer"&gt;The Durability Curve&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>infrastructure</category>
      <category>architecture</category>
      <category>analysis</category>
    </item>
    <item>
      <title>The 4 Hidden Bottlenecks in the GLP-1 Supply Chain</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Mon, 18 May 2026 10:08:17 +0000</pubDate>
      <link>https://dev.to/harryfloyd/the-4-hidden-bottlenecks-in-the-glp-1-supply-chain-3i8l</link>
      <guid>https://dev.to/harryfloyd/the-4-hidden-bottlenecks-in-the-glp-1-supply-chain-3i8l</guid>
      <description>&lt;p&gt;Ozempic, Wegovy, Mounjaro, Zepbound. Everyone knows the drugs. Most investors know the pharma companies that sell them: Novo Nordisk, Eli Lilly. A smaller group knows the CDMOs that manufacture the active ingredients: Bachem, PolyPeptide, CordenPharma.&lt;/p&gt;

&lt;p&gt;Almost nobody knows the other three layers of the supply chain. Each layer is an independent bottleneck with multi-year qualification timelines. And right now, all four are binding simultaneously.&lt;/p&gt;

&lt;p&gt;The GLP-1 supply chain is not one bottleneck. It is four. Stacked. Each one capable of throttling the entire market.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 1: Peptide CDMO Capacity
&lt;/h2&gt;

&lt;p&gt;The drug substance itself. GLP-1s are peptides — chains of 30+ amino acids synthesised through solid-phase chemistry at metric-ton scale. Before 2022, this was research-grade production. The world woke up and realised nobody had built industrial-scale capacity.&lt;/p&gt;

&lt;p&gt;Bachem, PolyPeptide, and CordenPharma form the Western core. Combined capacity expansions exceed $5 billion between 2024 and 2027. CordenPharma alone committed EUR 900 million for 30,000 litres of capacity. All operate near full utilisation with backlogs exceeding available supply. New entrants need 3-5 years for GMP certification.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 2: Glass Containment and Elastomer Components
&lt;/h2&gt;

&lt;p&gt;Every injectable GLP-1 needs a physical container: glass vials, cartridges, pre-filled syringes, rubber stoppers, plunger assemblies. These components are not commodities you can switch overnight.&lt;/p&gt;

&lt;p&gt;West Pharmaceutical Services (WST) has dominated elastomer components for 90+ years with 30,000+ component variants. Stevanato Group (STVN) makes the glass packaging. Regulatory qualification for a new supplier takes 3 to 5 years — the same timeline as the CDMO layer.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Key insight:&lt;/strong&gt; If both Layer 1 and Layer 2 constrain simultaneously, neither additional API production nor additional device assembly can relieve the shortage. The binding constraint is whichever layer has the longer lead time.&lt;/p&gt;
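
&lt;p&gt;The serial-layer logic can be sketched in a few lines: output is capped by the smallest layer, and when several layers bind together, relief waits for the slowest of them to add capacity. All capacities and lead times below are hypothetical placeholders, not estimates.&lt;/p&gt;

```python
# Hypothetical annual capacity per layer, in millions of finished doses.
capacity = {"peptide_cdmo": 400, "containment": 400, "device_assembly": 500}
# Hypothetical years needed to qualify new capacity in each layer.
lead_time_years = {"peptide_cdmo": 4, "containment": 5, "device_assembly": 2}


def binding_layers(capacity_by_layer):
    """Return the throughput cap and every layer sitting at that cap."""
    cap = min(capacity_by_layer.values())
    layers = [name for name, value in capacity_by_layer.items() if value == cap]
    return cap, layers


cap, layers = binding_layers(capacity)
# With several layers binding at once, relief waits for the slowest one.
years_to_relief = max(lead_time_years[name] for name in layers)
print(cap, layers, years_to_relief)
```

&lt;p&gt;Raising &lt;code&gt;device_assembly&lt;/code&gt; in this toy model changes nothing, which is the point of the paragraph above: capacity added outside the binding layers does not relieve the shortage.&lt;/p&gt;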

&lt;h2&gt;
  
  
  Layer 3: Autoinjector and Pen Device Assembly
&lt;/h2&gt;

&lt;p&gt;The physical device that delivers the drug. The global autoinjector market reached $107 billion in 2025 with projections of $300 billion by 2031. Becton Dickinson has secured long-term contracts including two next-generation GLP-1 programmes with major pharma.&lt;/p&gt;

&lt;p&gt;This layer serves all injectable biologics across rheumatology, oncology, and autoimmune indications, not only GLP-1s. That diversification makes it the most resilient layer — and the longest-duration investment thesis.&lt;/p&gt;

&lt;h2&gt;
  
  
  Layer 4: Oligonucleotide CDMO Capacity
&lt;/h2&gt;

&lt;p&gt;Oligonucleotides — ASOs, siRNAs, gRNAs — share the same solid-phase synthesis chemistry as peptides. The industry calls this TIDES for a reason. The oligo CDMO market sits at $2.55 billion in 2024, growing at 22% CAGR toward $6.73 billion by 2029.&lt;/p&gt;
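
&lt;p&gt;As a quick consistency check, the two endpoints quoted above imply a compound growth rate close to, but slightly below, the stated 22%. The gap may come from rounding or from a different base year in the underlying forecast; that explanation is an assumption, not something the source states.&lt;/p&gt;

```python
start_value = 2.55   # USD billions, 2024 (from the article)
end_value = 6.73     # USD billions, 2029 (from the article)
years = 2029 - 2024

# Compound annual growth rate implied by the two endpoints.
implied_cagr = (end_value / start_value) ** (1 / years) - 1
print(f"implied CAGR: {implied_cagr:.1%}")
```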

&lt;p&gt;Bachem has held a strategic oligo manufacturing collaboration with Eli Lilly since 2022. This layer is earlier in its lifecycle: lower catalyst density, same structural physics.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Paired-Position Framework
&lt;/h2&gt;

&lt;p&gt;These four layers have different falsification profiles. The CDMO layer faces direct risk from oral non-peptide GLP-1s like orforglipron (recently approved). The device and containment layers do not, because injectable volume grows in absolute terms regardless of oral market share — obesity penetration is still below 5%.&lt;/p&gt;

&lt;p&gt;This creates a natural paired-position framework: higher-risk CDMO exposure hedged by longer-duration device and containment exposure within the same end market.&lt;/p&gt;

&lt;p&gt;The generic semaglutide wave that began in March 2026 (patent expired, 50+ manufacturers launched at 80% price reduction) is the stress test. It reveals which layers bind first under volume pressure.&lt;/p&gt;




&lt;p&gt;&lt;strong&gt;More from the Durability Curve:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harryfloyd.substack.com?utm_source=devto" rel="noopener noreferrer"&gt;Subscribe on Substack&lt;/a&gt; for free weekly analysis&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harryfloyd.gumroad.com/l/peptide-bottleneck?utm_source=devto" rel="noopener noreferrer"&gt;The Peptide Bottleneck report&lt;/a&gt; — full multi-layer supply chain analysis with ticker-level breakdowns&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>biotech</category>
      <category>pharma</category>
      <category>supplychain</category>
      <category>analysis</category>
    </item>
    <item>
      <title>The 3 Numbers That Matter More Than NVIDIA's $80 Billion Quarter</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Mon, 18 May 2026 08:33:58 +0000</pubDate>
      <link>https://dev.to/harryfloyd/the-3-numbers-that-matter-more-than-nvidias-80-billion-quarter-24bh</link>
      <guid>https://dev.to/harryfloyd/the-3-numbers-that-matter-more-than-nvidias-80-billion-quarter-24bh</guid>
      <description>&lt;p&gt;NVIDIA reports Q1 FY2027 earnings on May 20, 2026. The headline revenue number for the quarter ending April 2026 is expected to land around $78-80 billion — a roughly 70% year-over-year increase. Every financial outlet will lead with that number. It will be called a "blowout" or a "disappointment" based on whether it beats the whisper number by $1 billion or $3 billion.&lt;/p&gt;

&lt;p&gt;That number tells you what already happened. It tells you nothing about the structural shifts that will determine whether NVIDIA is worth $2 trillion or $5 trillion 18 months from now.&lt;/p&gt;

&lt;p&gt;Three metrics buried in the report and the earnings call will tell that story. Most coverage will miss them.&lt;/p&gt;

&lt;h2&gt;
  
  
  1. The Optical Attach Rate
&lt;/h2&gt;

&lt;p&gt;In the last 10 weeks, roughly $4.7 billion has been committed to optical and photonics supply chain capacity across three independent layers:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Corning (GLW):&lt;/strong&gt; $500 million in warrants for AI-grade fiber optic cabling, with three new US plants planned.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Lumentum (LITE):&lt;/strong&gt; $2 billion in equity and purchase commitments for lasers, transceivers, and silicon photonics.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coherent (COHR):&lt;/strong&gt; $2 billion in structured placement for co-packaged optics and EML lasers.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ayar Labs:&lt;/strong&gt; ~$155 million from NVIDIA specifically for GPU scale-up optics.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;This is not a coincidence. The supply chain is signalling that optical interconnects have become a non-negotiable cost of scaling AI clusters. The three publicly traded companies (Corning, Lumentum and Coherent) each beat revenue estimates in their most recent quarter, and all three sold off post-earnings on valuation compression, not demand weakness.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Listen for:&lt;/strong&gt; Does Jensen mention optical supply chain investments on the call? Any reference to Corning, Lumentum, or Coherent validates the thesis that bandwidth is replacing compute as the binding constraint in AI infrastructure.&lt;/p&gt;

&lt;h2&gt;
  
  
  2. The Q2 Revenue Guide Range
&lt;/h2&gt;

&lt;p&gt;The headline "beat" matters less than the forward guidance. Consensus expects roughly $80 billion for Q2. The question is whether NVIDIA guides $82 billion (indicating sustained acceleration) or $78 billion (signalling the ramp is hitting constraints or hyperscalers are pausing).&lt;/p&gt;

&lt;p&gt;The Q2 guide is where the market learns whether Blackwell's ramp is accelerating smoothly or bumping against CoWoS capacity and power delivery constraints. The difference between $78 billion and $82 billion guidance is a $200+ billion swing in market cap.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Listen for:&lt;/strong&gt; The shape of the guide matters more than the number. A narrow range signals confidence. A wide range signals uncertainty about demand visibility or supply constraints.&lt;/p&gt;

&lt;h2&gt;
  
  
  3. The Gross Margin Trajectory on Blackwell
&lt;/h2&gt;

&lt;p&gt;The market will react if gross margins compress below 73%. Blackwell is a new architecture — early ramp margins are always lower, especially on a transition this large. The Street expects roughly 72-73% GAAP gross margin for Q1.&lt;/p&gt;

&lt;p&gt;The number itself matters less than the commentary. If management says "margins normalize to 75%+ by Q3," that signals production efficiency and demand density. If they hedge or push normalization to H2 2026, that signals design complexity, yield issues, or pricing pressure from hyperscalers.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Listen for:&lt;/strong&gt; Attach any margin commentary to specific products. Blackwell Ultra margins vs Hopper tail demand tells you whether the product transition is accretive or dilutive to unit economics.&lt;/p&gt;




&lt;p&gt;These three numbers tell you more about NVIDIA's structural trajectory than the $80 billion headline. The revenue number is the rear-view mirror. The attach rate, the guide, and the margin trajectory are the windshield.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;More from the Durability Curve:&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;a href="https://harryfloyd.substack.com?utm_source=devto" rel="noopener noreferrer"&gt;Subscribe on Substack&lt;/a&gt; for free weekly AI infrastructure analysis&lt;/li&gt;
&lt;li&gt;
&lt;a href="https://harryfloyd.gumroad.com?utm_source=devto" rel="noopener noreferrer"&gt;Full research reports on Gumroad&lt;/a&gt; — in-depth investment research&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>ai</category>
      <category>investing</category>
      <category>infrastructure</category>
      <category>analysis</category>
    </item>
    <item>
      <title>NVIDIA Q1 FY2027 Earnings Preview — 5 Signals the Market May Be Missing</title>
      <dc:creator>Harry Floyd</dc:creator>
      <pubDate>Sun, 17 May 2026 11:17:20 +0000</pubDate>
      <link>https://dev.to/harryfloyd/nvidia-q1-fy2027-earnings-preview-5-signals-the-market-may-be-missing-2j79</link>
      <guid>https://dev.to/harryfloyd/nvidia-q1-fy2027-earnings-preview-5-signals-the-market-may-be-missing-2j79</guid>
      <description>&lt;p&gt;NVIDIA reports Q1 FY2027 earnings on &lt;strong&gt;May 20, 2026&lt;/strong&gt;, after market close. The consensus expects approximately &lt;strong&gt;$78.1-78.8 billion in revenue and $1.74 EPS&lt;/strong&gt;, with Citi and Wells Fargo running slightly higher at ~$80B and $1.79 respectively.&lt;/p&gt;

&lt;p&gt;The stock closed at ~$225 on the most recent trading day, roughly a 27x forward P/E on FY2027 estimates. This is not a distressed entry point. It is a thesis-testing moment.&lt;/p&gt;

&lt;p&gt;Below are 5 signals to watch that go beyond the headline beat-or-miss narrative. Each maps to a specific structural claim about NVIDIA's position in the AI infrastructure stack, and each has a falsification trigger that would challenge that claim.&lt;/p&gt;




&lt;h2&gt;
  
  
  1. Purchase Commitments: Are They Still Rising?
&lt;/h2&gt;

&lt;p&gt;NVIDIA's supply-related purchase commitments rose from &lt;strong&gt;$50.3 billion to $95.2 billion&lt;/strong&gt; between Q3 and Q4 FY2026, nearly doubling in a single quarter. This is not optional inventory building. It is NVIDIA aggressively locking in component supply against constraints it believes are structural.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to watch:&lt;/strong&gt; If commitments rise further in Q1, NVIDIA is deepening its supply chain lock-in through at least 2027. If they flatten, either supply is easing (bullish for margins) or suppliers have hit allocation limits (bearish for revenue growth).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Falsification trigger:&lt;/strong&gt; A flat or declining commitment trajectory would suggest NVIDIA sees peak demand behind it.&lt;/p&gt;




&lt;h2&gt;
  
  
  2. Optical Interconnect Mentions
&lt;/h2&gt;

&lt;p&gt;On May 6, NVIDIA announced a &lt;strong&gt;$500 million partnership with Corning (GLW)&lt;/strong&gt; — three new US optical factories, 10x capacity increase, and a warrant structure giving NVIDIA up to 15 million shares at $180. This follows &lt;strong&gt;$2B+ purchase commitments each to Lumentum (LITE)&lt;/strong&gt; (which reported $808M revenue, +90% YoY on May 5) and &lt;strong&gt;Coherent (COHR)&lt;/strong&gt; ($1.81B, +21% YoY on May 6).&lt;/p&gt;

&lt;p&gt;Combined, &lt;strong&gt;$4.7B+ was committed to the optical supply chain in 10 weeks&lt;/strong&gt; across three independent layers: passive fiber (Corning), active components (LITE, COHR), and co-packaged optics (Ayar Labs, ~$155M NVIDIA portion).&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to watch:&lt;/strong&gt; If management references Corning, Lumentum, or Coherent by name on the call, it validates the thesis that optical interconnect is the next binding constraint beyond HBM memory. Silence on optical supply is a missed signal.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Falsification trigger:&lt;/strong&gt; If optical supply is described as "secured" or "no longer a concern," the bottleneck thesis for the photonics layer weakens meaningfully.&lt;/p&gt;




&lt;h2&gt;
  
  
  3. Blackwell Margin Trajectory
&lt;/h2&gt;

&lt;p&gt;The Street is fixated on gross margins during the Blackwell ramp. The concern is that the more complex B200/B300 packaging (CoWoS-L) compresses margins compared to the simpler H100/H200 designs.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to watch:&lt;/strong&gt; The directional trajectory matters more than the absolute number. Sequential margin expansion indicates the ramp is absorbing complexity costs. Compression suggests the packaging premium is permanent.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The deeper signal:&lt;/strong&gt; NVIDIA's gross margin has been the most-watched metric for 8 consecutive quarters. The market has already priced in margin compression, so an in-line or better margin print removes a major overhang. A miss amplifies the ASIC-competition narrative.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Context:&lt;/strong&gt; NVIDIA pre-booked approximately 60% of TSMC's total 2026 CoWoS output (per Morgan Stanley), with demand of ~700,000 wafers. CoWoS is the packaging constraint. If margins hold despite this capacity scramble, the GPU economics thesis is intact.&lt;/p&gt;




&lt;h2&gt;
  
  
  4. Inference Mix and Agent Workload Commentary
&lt;/h2&gt;

&lt;p&gt;The AI market narrative shifted in 2026 from "training is everything" to "inference is the growth vector." Jensen Huang's GTC keynote emphasized &lt;strong&gt;AI factories&lt;/strong&gt; as long-running inference infrastructure, not just training clusters. Agentic workloads -- autonomous systems that chain multiple model calls per task -- compound inference demand beyond what chat-era projections captured.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to watch:&lt;/strong&gt; Any qualitative commentary about inference workload growth, token demand trajectories, or agentic infrastructure spend. If management quantifies inference as a growing share of data center revenue, it supports the thesis that AI compute demand has structural legs beyond model training.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Falsification trigger:&lt;/strong&gt; If inference growth is described as "migrating to edge devices" or "handled by CPU-based systems," the GPU-inference thesis weakens. If there is silence on this topic entirely, the market may be overestimating inference demand relative to what the company sees.&lt;/p&gt;




&lt;h2&gt;
  
  
  5. ASIC Competition Framing
&lt;/h2&gt;

&lt;p&gt;The most credible competitive threat to NVIDIA is custom hyperscaler ASICs: Google's &lt;strong&gt;TPU 8t/8i&lt;/strong&gt; (April 22 launch, Anthropic committed to 1M TPU v7 chips), Amazon's &lt;strong&gt;Trainium 2&lt;/strong&gt;, and Meta's &lt;strong&gt;MTIA&lt;/strong&gt; (accelerating with Broadcom through 2029). Industry analysts project custom ASIC shipments growing significantly faster than GPU shipments in 2026 as hyperscalers vertically integrate.&lt;/p&gt;

&lt;p&gt;However, every ASIC still needs &lt;strong&gt;HBM memory, optical interconnect, and CoWoS packaging&lt;/strong&gt; -- all of which NVIDIA has pre-booked at scale. ASIC growth at the margin does not necessarily mean NVIDIA loses revenue. It means the total AI compute pie is growing, and NVIDIA captures the GPU slice while participating in the broader ecosystem via supply chain positioning.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;What to watch:&lt;/strong&gt; How management frames ASIC competition. If they dismiss it as irrelevant, that would suggest they are not tracking the custom silicon trend. If they acknowledge it and frame NVIDIA's counter-position (CUDA moat, NVLink, ecosystem), that signals clear-eyed strategy.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Falsification trigger:&lt;/strong&gt; A hyperscaler announcing that its internal chip has reached &amp;gt;50% utilization across its own AI workloads would signal structural erosion. This is not expected this quarter.&lt;/p&gt;




&lt;h2&gt;
  
  
  Putting It Together
&lt;/h2&gt;

&lt;p&gt;The standard earnings framework (beat, miss, guide, P/E) tells you how the market feels about NVIDIA today. The 5 signals above test whether NVIDIA's structural position is &lt;strong&gt;improving or eroding.&lt;/strong&gt;&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Signal&lt;/th&gt;
&lt;th&gt;What It Tests&lt;/th&gt;
&lt;th&gt;Bullish&lt;/th&gt;
&lt;th&gt;Bearish&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Purchase commitments&lt;/td&gt;
&lt;td&gt;Supply chain conviction&lt;/td&gt;
&lt;td&gt;Rising&lt;/td&gt;
&lt;td&gt;Flat/falling&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Optical mentions&lt;/td&gt;
&lt;td&gt;Bottleneck migration thesis&lt;/td&gt;
&lt;td&gt;Named by management&lt;/td&gt;
&lt;td&gt;Not discussed&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Blackwell margins&lt;/td&gt;
&lt;td&gt;Ramp economics&lt;/td&gt;
&lt;td&gt;Expanding&lt;/td&gt;
&lt;td&gt;Compressing&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Inference mix&lt;/td&gt;
&lt;td&gt;Demand durability&lt;/td&gt;
&lt;td&gt;Quantified growth&lt;/td&gt;
&lt;td&gt;Silent/edge-focused&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;ASIC framing&lt;/td&gt;
&lt;td&gt;Competitive awareness&lt;/td&gt;
&lt;td&gt;Acknowledged + countered&lt;/td&gt;
&lt;td&gt;Dismissed&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;The Durability Curve framework&lt;/strong&gt; rates NVIDIA as a Law I (Bottleneck Migration) and Law II (Difficulty Is Load-Bearing) play. The falsification triggers above test both laws. Through this lens, May 20 is not about whether NVIDIA beats by $1B. It is about whether the structural evidence supports or challenges the durability thesis.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;This analysis is derived from the &lt;strong&gt;Durability Curve&lt;/strong&gt; research framework, a systematic approach to identifying AI infrastructure bottlenecks before they are priced. The full 36-page NVIDIA Q1 FY2027 earnings research report with detailed falsification triggers, supply chain signal verification across all 5 layers, and options positioning framework is available at:&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://harryfloyd.gumroad.com/l/nvda-q1-fy2027-earnings-research-report?utm_source=devto" rel="noopener noreferrer"&gt;📄 NVIDIA Q1 FY2027 Earnings Research Report (36 pages, £9)&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Follow &lt;a href="https://techhub.social/@durabilitycurve" rel="noopener noreferrer"&gt;@durabilitycurve&lt;/a&gt; on Mastodon for real-time signal monitoring during the earnings call. Free weekly analysis at &lt;a href="https://harryfloyd.substack.com?utm_source=devto" rel="noopener noreferrer"&gt;harryfloyd.substack.com&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Not financial advice. All data points verified against public sources as of May 17, 2026. Verify independently before making investment decisions.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>infrastructure</category>
      <category>investing</category>
      <category>analysis</category>
    </item>
  </channel>
</rss>
