<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Hideki Mori</title>
    <description>The latest articles on DEV Community by Hideki Mori (@hidekimori).</description>
    <link>https://dev.to/hidekimori</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3903757%2F5d1a2986-7f25-4c35-b5e8-4d489fc18a94.png</url>
      <title>DEV Community: Hideki Mori</title>
      <link>https://dev.to/hidekimori</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/hidekimori"/>
    <language>en</language>
    <item>
      <title>Dynamic isn't enough. Operations is the other half.</title>
      <dc:creator>Hideki Mori</dc:creator>
      <pubDate>Mon, 18 May 2026 13:00:00 +0000</pubDate>
      <link>https://dev.to/hidekimori/dynamic-isnt-enough-operations-is-the-other-half-2d8f</link>
      <guid>https://dev.to/hidekimori/dynamic-isnt-enough-operations-is-the-other-half-2d8f</guid>
      <description>&lt;p&gt;If you've been writing software for a few years, you eventually start designing for change.&lt;/p&gt;

&lt;p&gt;A content delivery system: a new format will appear, a new player will appear, so let's not hard-code formats. An e-commerce site: a new payment method will be requested, so let's not hard-code billing logic. A document pipeline: a new processing engine will appear, so let's not hard-code engines.&lt;/p&gt;

&lt;p&gt;This is good instinct. You read books like &lt;em&gt;Analysis Patterns&lt;/em&gt;, you internalize the idea that the model should be flexible, you push concrete decisions out of code and into data. New format? Insert a row. New payment? Insert a row. New engine? Insert a row.&lt;/p&gt;

&lt;p&gt;It's a beautiful pattern. It really does extend the lifespan of a system.&lt;/p&gt;

&lt;p&gt;But after enough years of running systems built this way, here's the part that doesn't show up in the books:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;Dynamic isn't enough. Operations is the other half — and that other half is usually paid by someone else.&lt;/p&gt;
&lt;/blockquote&gt;




&lt;h2&gt;
  
  
  The "wait, maybe..." moment
&lt;/h2&gt;

&lt;p&gt;Reality always overshoots the model. Always.&lt;/p&gt;

&lt;p&gt;You build a content system that handles formats dynamically, and then someone asks for a format whose delivery rules don't fit the existing shape. You build a payment system that takes new methods through configuration, and then someone wants a billing model where the "amount" isn't even computed at the same time as the "charge." You build an engine registry, and then a new engine doesn't conform to the lifecycle every other engine has assumed.&lt;/p&gt;

&lt;p&gt;Each time, the moment looks the same. You stare at the request, and your first thought is: &lt;em&gt;that's outside the model&lt;/em&gt;. And your second thought, a few minutes later, is:&lt;/p&gt;

&lt;p&gt;&lt;em&gt;Wait, maybe... if I interpret this field a little differently, and if I add one column here, and if I let this branch handle a special case... it fits. Sort of. It doesn't break anything.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;So you do that.&lt;/p&gt;

&lt;p&gt;The system keeps running. No downtime. No coordinated migration. No painful conversation with the integration partners. From your seat as the designer, the dynamic structure absorbed the change. The pattern worked.&lt;/p&gt;

&lt;p&gt;That's the part designers love. That's the part that gets written about.&lt;/p&gt;




&lt;h2&gt;
  
  
  Who pays for the other half
&lt;/h2&gt;

&lt;p&gt;Here's what's harder to see from the designer's seat: every "wait, maybe..." moment leaves a small residue somewhere in operations.&lt;/p&gt;

&lt;p&gt;The operator now has to remember that for &lt;em&gt;this&lt;/em&gt; particular case, the workflow is slightly different. The reporting tool needs a footnote. The on-call playbook gets one more conditional. The next person to onboard needs another paragraph of context. The integrating system has to send a value in a slightly unexpected place.&lt;/p&gt;

&lt;p&gt;None of these are catastrophic. None of them break the system. Each one, on its own, is a small inconvenience.&lt;/p&gt;

&lt;p&gt;But the small inconveniences accumulate. And the people accumulating them are usually not the people who designed the dynamic structure in the first place.&lt;/p&gt;

&lt;p&gt;The designer experiences each "wait, maybe..." as a successful absorption. The operator experiences it as one more thing to remember. Same event, two ledgers. Only the designer's ledger gets reviewed.&lt;/p&gt;

&lt;p&gt;This is where engineering satisfaction can become quietly self-serving. The system didn't break. The pattern held. Therefore the design is good — except &lt;em&gt;good for whom&lt;/em&gt;, exactly?&lt;/p&gt;




&lt;h2&gt;
  
  
  The road I didn't take
&lt;/h2&gt;

&lt;p&gt;There is, of course, an alternative. You can stop the system, ship version 2, ask everyone to migrate, and start clean.&lt;/p&gt;

&lt;p&gt;A scheduled 24-hour outage. A v2 documentation packet sent to every integrating partner. A coordination meeting. A cutover.&lt;/p&gt;

&lt;p&gt;I have, for twenty-four years, mostly chosen &lt;em&gt;not&lt;/em&gt; to do this. I have preferred quiet absorption — interpretation, gentle reshaping, "wait, maybe..." — over coordinated rebuild.&lt;/p&gt;

&lt;p&gt;Whether that was right, I genuinely don't know. There were probably operators along the way who would have been happier with one painful migration than with years of small accommodations. There were probably integrating systems whose engineers would have preferred a clean v2 over a v1 that kept quietly bending.&lt;/p&gt;

&lt;p&gt;I chose to keep the shape. That choice has a cost. The cost is paid in operator-hours, in playbook footnotes, in the small extra cognitive load of working with a system that has absorbed more than its original model anticipated.&lt;/p&gt;

&lt;p&gt;I'm not saying I was wrong. I'm saying: I'm still not sure.&lt;/p&gt;




&lt;h2&gt;
  
  
  Generic underneath, specific on top
&lt;/h2&gt;

&lt;p&gt;There's one thing I do believe, with more confidence than the rest:&lt;/p&gt;

&lt;p&gt;When the model gets very generic, the UI usually has to get very specific to compensate.&lt;/p&gt;

&lt;p&gt;A maximally flexible data model — the kind that absorbs new formats, new payment methods, new engines — is, almost by definition, awkward to interact with directly. Direct interaction with a generic model means the human has to supply all the missing context. That's exhausting.&lt;/p&gt;

&lt;p&gt;The good outcome is: the model stays generic underneath, and the UI is built specific to each use case on top. Each surface is custom. Each surface is rigid in exactly the way that makes it usable.&lt;/p&gt;

&lt;p&gt;This is a hard decision to make, because every specific UI you build feels temporary. You're going to throw it away when the use case shifts. You build it knowing it has a shorter half-life than the model below it. That asymmetry is uncomfortable.&lt;/p&gt;

&lt;p&gt;But I think the discomfort is the right discomfort. A long-lived generic core, with shorter-lived specific surfaces on top, is — based on what I've watched survive over twenty-four years — closer to how systems actually age well.&lt;/p&gt;




&lt;h2&gt;
  
  
  Three observations
&lt;/h2&gt;

&lt;p&gt;None of these are conclusions:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;Dynamic structure extends a system's life. It does not, on its own, make operating that system pleasant.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Every "wait, maybe..." absorption is a real engineering achievement &lt;em&gt;and&lt;/em&gt; a small tax on someone downstream. Both of these are true.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;If you make the model very generic, give the operators a UI that isn't generic.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;Dynamic isn't enough. Operations is the other half — and that other half is usually paid by someone else.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;After twenty-four years of this, that's the only thing I feel sure of.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Earlier in this series:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="https://dev.to/hidekimori/the-accordion-pattern-why-i-stopped-writing-one-fat-llm-prompt-18mb"&gt;The Accordion Pattern: Why I stopped writing one fat LLM prompt&lt;/a&gt;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Nobody knows when a job will finish. I'd still like to report it accurately.&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;What survives when you build alone for 24 years.&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>api</category>
      <category>architecture</category>
      <category>backend</category>
      <category>devops</category>
    </item>
    <item>
      <title>What survives when you build alone for 24 years</title>
      <dc:creator>Hideki Mori</dc:creator>
      <pubDate>Mon, 11 May 2026 13:00:00 +0000</pubDate>
      <link>https://dev.to/hidekimori/what-survives-when-you-build-alone-for-24-years-4e7d</link>
      <guid>https://dev.to/hidekimori/what-survives-when-you-build-alone-for-24-years-4e7d</guid>
      <description>&lt;p&gt;In December 2002 I joined a company that was nearly bankrupt.&lt;/p&gt;

&lt;p&gt;I was twenty-five. A software engineer with no title beyond that. The company was burning cash, and the next product launch had to happen in January — about a month after I started.&lt;/p&gt;

&lt;p&gt;By January 2003, the demo shipped.&lt;/p&gt;

&lt;p&gt;It was an HTML browser for mobile phones — with packet reduction built in. It rendered HTML, played back MIDI files linked from &lt;code&gt;&amp;lt;a&amp;gt;&lt;/code&gt; tags, ran simple background animations. I wrote it in Java.&lt;/p&gt;

&lt;p&gt;The server was PHP. There was no engineer for the server-side compression part, so I did that too — also in Java, called from PHP over localhost.&lt;/p&gt;

&lt;p&gt;A strange shape, but the deadline was real.&lt;/p&gt;

&lt;p&gt;I remember being a little angry. I also remember being much happier than angry. I had shipped something, in a month.&lt;/p&gt;

&lt;p&gt;That's how I started designing things alone. Not as a plan. Just: someone had to, and nobody else moved.&lt;/p&gt;




&lt;h2&gt;
  
  
  The first thing that wasn't a plan
&lt;/h2&gt;

&lt;p&gt;Looking back twenty-three years later, here's something I didn't notice at the time:&lt;/p&gt;

&lt;p&gt;I had no responsibility. I was a junior engineer with a deadline. If the demo had failed, the company would have collapsed faster — but the failure wouldn't have been mine to own.&lt;/p&gt;

&lt;p&gt;That made it easy to ship.&lt;/p&gt;

&lt;p&gt;After that first month, I never really had another hard deadline. Not on the projects I owned. I'd just answer "as soon as I can" and try to deliver on that.&lt;/p&gt;

&lt;p&gt;By 2005 I was a CTO. The work didn't get lighter; the responsibility arrived. And here's the thing I didn't expect: &lt;strong&gt;the shape of my code didn't change.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I kept writing things by myself. I kept ending up alone on the deep parts. The org chart rearranged around me, but at the bottom — when something had to actually work — I was still just there, writing.&lt;/p&gt;

&lt;p&gt;Productivity and accountability don't mix easily. The shape I learned at twenty-five — &lt;em&gt;write it because nobody else will&lt;/em&gt; — was the shape that kept showing up after the title changed.&lt;/p&gt;




&lt;h2&gt;
  
  
  A shape I didn't notice was a shape (2003–2005)
&lt;/h2&gt;

&lt;p&gt;After the packet-reduction tool, I built a music distribution platform for mobile phones. This was the carrier-billed era. Feature phones. Dropped connections in elevators and tunnels.&lt;/p&gt;

&lt;p&gt;The carrier had a billing rule that taught me something I wouldn't articulate for twenty years.&lt;/p&gt;

&lt;p&gt;The rule was: &lt;strong&gt;don't bill until the user's device confirms the download finished.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;So the flow looked like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;User taps "buy"&lt;/li&gt;
&lt;li&gt;The phone starts downloading (with HTTP Range, because connections drop)&lt;/li&gt;
&lt;li&gt;The server delivers in chunks&lt;/li&gt;
&lt;li&gt;When the device finishes, it sends a hidden completion signal&lt;/li&gt;
&lt;li&gt;
&lt;em&gt;Then&lt;/em&gt; the server bills&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;If the train went into a tunnel, the user wasn't charged. The server kept holding the work for as long as the device was alive. Eventually either the download completed and we billed, or it didn't and we ate the cost.&lt;/p&gt;

&lt;p&gt;Twenty years later I'd phrase the rule like this:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Nobody knows when a job will finish. But when it does, I want to report it accurately. So I make users wait — and in exchange, I retry as hard as I can until I have a result.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That was already the rule for the music platform in 2003. I just didn't know it was a rule yet.&lt;/p&gt;




&lt;h2&gt;
  
  
  The shape kept showing up (2006–2018)
&lt;/h2&gt;

&lt;p&gt;In 2006 I started on an e-book distribution platform. Four months from blank page to production. I maintained it for eight years, alone.&lt;/p&gt;

&lt;p&gt;The shape stayed the same:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Direct delivery, no intermediary platforms in the way&lt;/li&gt;
&lt;li&gt;Real-time settlement back to publishers&lt;/li&gt;
&lt;li&gt;HTTP Range support, because phones still dropped connections&lt;/li&gt;
&lt;li&gt;Tar archives generated &lt;em&gt;per request&lt;/em&gt; — buttons, branding, rights metadata stitched in at the moment of delivery, not pre-built and stored&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;I stepped off in 2014. The platform kept running.&lt;/p&gt;

&lt;p&gt;By 2018, with someone else maintaining it, it was distributing tens of billions of yen in transactions per year.&lt;/p&gt;

&lt;p&gt;It ran for seven years after I left, and then, in 2021, it was discontinued.&lt;/p&gt;

&lt;p&gt;Eight years on my watch. Seven more without me. Then it stopped.&lt;/p&gt;

&lt;p&gt;Not a tragedy. Not a triumph. Just what happened.&lt;/p&gt;




&lt;h2&gt;
  
  
  Same shape, bigger numbers (2021–)
&lt;/h2&gt;

&lt;p&gt;In 2021 I started on what's now called &lt;a href="https://gw.portal.ldxhub.io/introduction" rel="noopener noreferrer"&gt;LDX hub&lt;/a&gt; — a document processing platform for translation, OCR, structured extraction, and AI-assisted refinement.&lt;/p&gt;

&lt;p&gt;Different domain. Same shape.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Year   Documents processed   Characters processed
2021   2,200                 ~20M
2022   48,500                ~810M
2023   85,000                ~2.7B
2024   163,000               ~3.5B
2025   351,000               ~8.3B
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The numbers grew. I didn't change anything fundamental about how the system is shaped. Jobs are submitted, the API holds the work, the API retries through whatever needs retrying, the API reports when there's a real result to report.&lt;/p&gt;

&lt;p&gt;Same as the carrier-billed music platform. Same as the e-book distribution. Same as that strange first month with two languages talking through localhost.&lt;/p&gt;

&lt;p&gt;I didn't decide to keep this shape. I just never found a better one.&lt;/p&gt;




&lt;h2&gt;
  
  
  Productivity and accountability, again
&lt;/h2&gt;

&lt;p&gt;Looking back at twenty-four years, here's the only honest thing I can say:&lt;/p&gt;

&lt;p&gt;The shape I learned with no responsibility — the shape that lets one person ship something in a month — turned out to be the shape that kept showing up after responsibility arrived.&lt;/p&gt;

&lt;p&gt;Not because I planned for that. Because it was the only shape I knew how to write, and I kept writing it, and nothing about the larger numbers ever required me to write differently.&lt;/p&gt;

&lt;p&gt;Whether that means anything for your system, I genuinely don't know.&lt;/p&gt;

&lt;p&gt;I haven't found a different shape worth writing.&lt;/p&gt;

&lt;p&gt;For twenty-four years, I keep ending up here.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Earlier in this series:&lt;/em&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;&lt;em&gt;&lt;a href="https://dev.to/hidekimori/the-accordion-pattern-why-i-stopped-writing-one-fat-llm-prompt-18mb"&gt;The Accordion Pattern: Why I stopped writing one fat LLM prompt&lt;/a&gt;&lt;/em&gt;&lt;/li&gt;
&lt;li&gt;&lt;em&gt;Nobody knows when a job will finish. I'd still like to report it accurately. (scheduled for May 4)&lt;/em&gt;&lt;/li&gt;
&lt;/ul&gt;

</description>
      <category>api</category>
      <category>architecture</category>
      <category>career</category>
      <category>solo</category>
    </item>
    <item>
      <title>Nobody knows when a job will finish. I'd still like to report it accurately.</title>
      <dc:creator>Hideki Mori</dc:creator>
      <pubDate>Mon, 04 May 2026 13:00:00 +0000</pubDate>
      <link>https://dev.to/hidekimori/nobody-knows-when-a-job-will-finish-id-still-like-to-report-it-accurately-26nn</link>
      <guid>https://dev.to/hidekimori/nobody-knows-when-a-job-will-finish-id-still-like-to-report-it-accurately-26nn</guid>
      <description>&lt;p&gt;Most async APIs commit to one thing: starting your job. They return &lt;code&gt;202 Accepted&lt;/code&gt;, hand you a job ID, and that's where the contract ends. The rest is your problem.&lt;/p&gt;

&lt;p&gt;I do something different. I make one promise:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When your job is done, I'll tell you accurately. Until then, I'll keep retrying.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;That's the entire contract for everything I've ever shipped. It sounds small. In practice, it's the only thing I actually do.&lt;/p&gt;




&lt;h2&gt;
  
  
  The shape every job in my system shares
&lt;/h2&gt;

&lt;p&gt;You hand me work.&lt;/p&gt;

&lt;p&gt;You wait.&lt;/p&gt;

&lt;p&gt;I retry as hard as I can.&lt;/p&gt;

&lt;p&gt;I report when it's done.&lt;/p&gt;

&lt;p&gt;That's it. Whether the job is OCR on a scanned PDF, structured extraction from a long document, or refining the translation of an XLIFF file — the shape is identical. You give me an input. You don't watch the screen. I come back when I have something honest to report.&lt;/p&gt;

&lt;p&gt;This sounds obvious until you try to actually deliver it.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why "started" is easier than "finished"
&lt;/h2&gt;

&lt;p&gt;Returning &lt;code&gt;202 Accepted&lt;/code&gt; is easy. The hard part starts right after that.&lt;/p&gt;

&lt;p&gt;Real jobs hit things like:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Vendor APIs that occasionally throw 503. No reason. Just sometimes.&lt;/li&gt;
&lt;li&gt;Native binaries that core dump. Twice in a row, then fine for a week.&lt;/li&gt;
&lt;li&gt;Subprocesses that go zombie. Not crashed. Not finished. Just defunct. The OS still holds them.&lt;/li&gt;
&lt;li&gt;Disks that fill up with stale debug files because something somewhere wrote them and forgot.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;If you ship "started, here's a job ID, good luck" and call that an API, you're outsourcing all of the above to your user.&lt;/p&gt;

&lt;p&gt;I'm not willing to do that. So I take the work back inside.&lt;/p&gt;




&lt;h2&gt;
  
  
  What that looks like in code
&lt;/h2&gt;

&lt;p&gt;I'm not going to name any vendor. They don't matter. What matters is the shape. The code below is a simplified sketch — the production version handles a lot more (PDF library version quirks, fallback engines when the first one rejects the input, demo-mode page limits, and a long list of vendor-specific error codes that mean "retry," "skip," or "stop"). The shape is what survives.&lt;/p&gt;

&lt;p&gt;Here's a sketch of the inside of one of my conversion services:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight java"&gt;&lt;code&gt;&lt;span class="kd"&gt;public&lt;/span&gt; &lt;span class="nc"&gt;JobResult&lt;/span&gt; &lt;span class="nf"&gt;runJob&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;Input&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="kd"&gt;throws&lt;/span&gt; &lt;span class="nc"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="no"&gt;MAX_RETRIES&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="o"&gt;++)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
        &lt;span class="nc"&gt;Process&lt;/span&gt; &lt;span class="n"&gt;child&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;ProcessBuilder&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;
                &lt;span class="s"&gt;"java"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"-cp"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;classpath&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="nc"&gt;EngineMain&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;class&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;getName&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;redirectErrorStream&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="kc"&gt;true&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt;
            &lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;start&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="n"&gt;passInputToStdin&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;

        &lt;span class="kt"&gt;long&lt;/span&gt; &lt;span class="n"&gt;started&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentTimeMillis&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="k"&gt;while&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isAlive&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="nc"&gt;System&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;currentTimeMillis&lt;/span&gt;&lt;span class="o"&gt;()&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="n"&gt;started&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="no"&gt;MAX_RUNTIME_MS&lt;/span&gt;&lt;span class="o"&gt;)&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;destroyForcibly&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;isDefunct&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="o"&gt;))&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;reap&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
                &lt;span class="k"&gt;break&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt;
            &lt;span class="o"&gt;}&lt;/span&gt;
            &lt;span class="n"&gt;sweepStaleCoreFiles&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;workDir&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="no"&gt;MAX_CORE_AGE_MS&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="nc"&gt;Thread&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;sleep&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;POLL_INTERVAL_MS&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;

        &lt;span class="nc"&gt;ChildOutcome&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;readOutcome&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;child&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isTransientError&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="k"&gt;continue&lt;/span&gt;&lt;span class="o"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// retry&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;isIrrelevantError&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="o"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;info&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"irrelevant error, treating as success: {}"&lt;/span&gt;&lt;span class="o"&gt;,&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toSuccessResult&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
        &lt;span class="o"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;hasResult&lt;/span&gt;&lt;span class="o"&gt;())&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;outcome&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;toResult&lt;/span&gt;&lt;span class="o"&gt;();&lt;/span&gt;
    &lt;span class="o"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nc"&gt;JobResult&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="na"&gt;failedAfterRetries&lt;/span&gt;&lt;span class="o"&gt;(&lt;/span&gt;&lt;span class="no"&gt;MAX_RETRIES&lt;/span&gt;&lt;span class="o"&gt;);&lt;/span&gt;
&lt;span class="o"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A few things in there are worth pointing at.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;new ProcessBuilder("java", ..., EngineMain.class.getName())&lt;/code&gt;.&lt;/strong&gt; Not "call a library function." Not "use the SDK." I literally re-enter &lt;code&gt;main&lt;/code&gt; from another process. The reason is that the underlying engine, in its native form, is unreliable enough that I want process-level isolation. If it dies, only the child dies.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;if (isDefunct(child)) { reap(child); break; }&lt;/code&gt;.&lt;/strong&gt; Native binaries don't always exit cleanly. Sometimes they're not crashed and not running — they're stuck. The parent has to notice, decide, and clean up.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;sweepStaleCoreFiles(workDir, MAX_CORE_AGE_MS)&lt;/code&gt;.&lt;/strong&gt; When a child crashes hard, the OS dumps a core file. That file is huge. If you don't sweep it, the disk fills up. There is no clever solution here. You sweep.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;outcome.isTransientError()&lt;/code&gt; → &lt;code&gt;continue&lt;/code&gt;.&lt;/strong&gt; Some vendor errors come and go. The fix is to wait and try again. If you don't try again, your user sees &lt;code&gt;failed&lt;/code&gt;. If you do try again, your user sees "took a bit longer." I pick the second one.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;code&gt;outcome.isIrrelevantError()&lt;/code&gt; → log and return success.&lt;/strong&gt; This is the part that surprises people. Some errors aren't actually errors for the use case. They're noise the engine emits. Knowing which is which takes years, and is most of the actual product.&lt;/p&gt;

&lt;p&gt;None of this is elegant. None of it shows up in an architecture diagram. It all lives in the gap between "the job was submitted" and "the job is done, here's the result."&lt;/p&gt;

&lt;p&gt;That gap is what I do.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I gave up
&lt;/h2&gt;

&lt;p&gt;I don't promise low latency. I can't. The thing I'm waiting on isn't predictable.&lt;/p&gt;

&lt;p&gt;I don't promise the job will always succeed. Sometimes the input is genuinely broken. Then I report that, accurately, instead of pretending.&lt;/p&gt;

&lt;p&gt;I don't promise streaming partial results. I keep the user out of the loop until I have something stable to hand back. The cost is they wait. The benefit is they don't see noise.&lt;/p&gt;

&lt;p&gt;These trade-offs aren't sophisticated. They're just consistent.&lt;/p&gt;




&lt;h2&gt;
  
  
  I didn't design this. It survived.
&lt;/h2&gt;

&lt;p&gt;Looking back, this is how every job-shaped API I've ever built has worked. I didn't sit down one day and decide on a contract. I kept ending up here.&lt;/p&gt;

&lt;p&gt;Each time I tried to ship something where the API said &lt;code&gt;started&lt;/code&gt; and stopped caring, the user came back asking what happened. So I started caring. Each time I tried to surface every transient error to the user, the user got scared. So I started absorbing them. Each time I tried to make jobs faster by skipping the cleanup, the disks filled up. So I started sweeping.&lt;/p&gt;

&lt;p&gt;After enough years of this, what's left is a single rule:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;When the job is done, I'll tell you accurately. Until then, I'll keep retrying.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;Whether that's the right contract for your system, I genuinely don't know. It's just the only one I've found that survives.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;Earlier in this series: &lt;a href="https://dev.to/hidekimori/the-accordion-pattern-why-i-stopped-writing-one-fat-llm-prompt-18mb"&gt;The Accordion Pattern: Why I stopped writing one fat LLM prompt&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>api</category>
      <category>architecture</category>
      <category>backend</category>
      <category>devops</category>
    </item>
    <item>
      <title>The Accordion Pattern: Why I stopped writing one fat LLM prompt</title>
      <dc:creator>Hideki Mori</dc:creator>
      <pubDate>Wed, 29 Apr 2026 07:51:20 +0000</pubDate>
      <link>https://dev.to/hidekimori/the-accordion-pattern-why-i-stopped-writing-one-fat-llm-prompt-18mb</link>
      <guid>https://dev.to/hidekimori/the-accordion-pattern-why-i-stopped-writing-one-fat-llm-prompt-18mb</guid>
      <description>&lt;p&gt;Most structured-extraction tutorials look the same. Take a document, write one big prompt that says "extract A, B, C, D, E, F", get JSON back. Done.&lt;/p&gt;

&lt;p&gt;This works on short inputs.&lt;/p&gt;

&lt;p&gt;It quietly breaks on long ones.&lt;/p&gt;

&lt;p&gt;After running this in production for a while, I stopped doing it. Here's what I switched to and why.&lt;/p&gt;




&lt;h2&gt;
  
  
  The fat prompt problem
&lt;/h2&gt;

&lt;p&gt;Say you have a 50-page report and you want a structured summary out of it. The natural first move is something like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="err"&gt;Extract:&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;title&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;sections&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;(with&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;headings)&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;purpose&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;mentioned&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;services&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;acceptance&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;criteria&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;-&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="err"&gt;Return&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;JSON&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;in&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;this&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;shape:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;...&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;You hand the whole document to the model. It returns JSON. It looks fine on the first try.&lt;/p&gt;

&lt;p&gt;Then you scale it up and three things happen:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Quality drifts.&lt;/strong&gt; The model "forgets" mid-document. Later sections are summarized worse than earlier ones, or fields go missing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;One bad field poisons the whole call.&lt;/strong&gt; If "acceptance criteria" hallucinates, you don't just lose that field — the whole record gets quarantined for review.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Latency goes up, parallelism goes down.&lt;/strong&gt; A single 30k-token call takes what it takes. You can't shard it.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;You can fight this with longer prompts, more examples, stricter formatting rules. I did. It buys you maybe 10% more reliability and costs you a lot of prompt-engineering time.&lt;/p&gt;

&lt;p&gt;The structural problem doesn't go away.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I do now: split it
&lt;/h2&gt;

&lt;p&gt;The pattern I use looks like an accordion that expands:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[ document ]
     │
     ▼
[ Stage 1: segment ]   ← one prompt, one job: produce a list
     │
     ▼
[ array of segments ]
     │
     ▼ (fan out)
[ Stage 2: extract ]   ← one prompt, runs per segment
     │
     ▼
[ structured records ]
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Stage 1 reads the whole document and returns a clean array of segments — sections, paragraphs, line items, whatever the right unit is for the task.&lt;/p&gt;

&lt;p&gt;Stage 2 takes one segment at a time and extracts the structured fields you actually want.&lt;/p&gt;

&lt;p&gt;Two prompts, each doing one thing.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why this works better
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Each prompt has a single job.&lt;/strong&gt;&lt;br&gt;
Stage 1 is "find the boundaries". Stage 2 is "extract the schema". Neither prompt has to hold both ideas at once. You can write each one tightly. Examples are shorter and more on-point.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Errors localize.&lt;/strong&gt;&lt;br&gt;
If Stage 2 fails on segment 7, you re-run segment 7. You don't redo the whole document. Bad fields get isolated to one record instead of contaminating the whole batch.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stage 2 parallelizes naturally.&lt;/strong&gt;&lt;br&gt;
The output of Stage 1 is an array. Fan it out. Run 50 small extractions in parallel instead of one big one. Total wall-clock time drops, and so does the variance.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cache hits go up.&lt;/strong&gt;&lt;br&gt;
If the same segment shows up twice (templates, standard headers, repeated forms), Stage 2 sees the same input and you can cache. The fat-prompt version sees the entire document as one unique input every time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Long documents stop being scary.&lt;/strong&gt;&lt;br&gt;
The hard limit on a fat prompt is the model's context window. The accordion pattern doesn't have that ceiling. Stage 1 still has to read the whole document, but its output is small. Stage 2 only ever sees one segment.&lt;/p&gt;


&lt;h2&gt;
  
  
  What it costs
&lt;/h2&gt;

&lt;p&gt;It's not free.&lt;/p&gt;

&lt;p&gt;You're making more LLM calls — one for Stage 1 plus N for Stage 2 instead of one. On short inputs that's wasteful. The accordion pattern is for documents long enough that fat prompts start failing, not for two-paragraph emails.&lt;/p&gt;

&lt;p&gt;You also need to think a little harder about what a "segment" is for your task. Sometimes it's a section heading. Sometimes it's a row in a table. Sometimes it's a logical unit that doesn't map to any visible boundary. That's a design decision and it matters.&lt;/p&gt;


&lt;h2&gt;
  
  
  When to use it
&lt;/h2&gt;

&lt;p&gt;Reach for the accordion when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The document is long enough that you've seen the model lose the thread mid-way.&lt;/li&gt;
&lt;li&gt;The output schema has more than ~5 fields and they don't all care about the same context.&lt;/li&gt;
&lt;li&gt;You need to retry failed records without redoing successful ones.&lt;/li&gt;
&lt;li&gt;You want parallelism.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Stick with one fat prompt when:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;The input is short and the schema is small.&lt;/li&gt;
&lt;li&gt;The fields are tightly coupled (extracting one needs context from another).&lt;/li&gt;
&lt;li&gt;You're prototyping and don't care yet.&lt;/li&gt;
&lt;/ul&gt;


&lt;h2&gt;
  
  
  A small concrete example
&lt;/h2&gt;

&lt;p&gt;I run this on a service called StructFlow. The shape of the calls is roughly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Stage 1: segment&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://gw.ldxhub.io/structflow/jobs &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "google/gemini-3-flash-preview",
    "system_prompt": "Split this document into logical sections. Return one JSON record per section.",
    "example_output": { "section_title": "...", "section_text": "..." },
    "inputs": [{ "id": "doc1", "data": { "text": "..." } }]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The response gives you back an array. Then Stage 2:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="c"&gt;# Stage 2: extract (one call per segment, run in parallel)&lt;/span&gt;
curl &lt;span class="nt"&gt;-X&lt;/span&gt; POST https://gw.ldxhub.io/structflow/jobs &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-H&lt;/span&gt; &lt;span class="s2"&gt;"Authorization: Bearer &lt;/span&gt;&lt;span class="nv"&gt;$KEY&lt;/span&gt;&lt;span class="s2"&gt;"&lt;/span&gt; &lt;span class="se"&gt;\&lt;/span&gt;
  &lt;span class="nt"&gt;-d&lt;/span&gt; &lt;span class="s1"&gt;'{
    "model": "google/gemini-3-flash-preview",
    "system_prompt": "From this section, extract: purpose, mentioned services, acceptance criteria.",
    "example_output": { "purpose": "...", "mentioned_services": [], "acceptance_criteria": [] },
    "inputs": [{ "id": "sec1", "data": { "section_text": "..." } }]
  }'&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Two calls, each focused. One returns segments. The other turns each segment into structured fields.&lt;/p&gt;

&lt;p&gt;That's the whole pattern.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why I'm posting this
&lt;/h2&gt;

&lt;p&gt;I built &lt;a href="https://gw.portal.ldxhub.io" rel="noopener noreferrer"&gt;LDX hub&lt;/a&gt; partly to make this pattern easy to run — one API, async jobs, file-based input/output so Stage 1's output is directly usable as Stage 2's input. But the pattern itself doesn't depend on any specific tool. You can do it with raw OpenAI calls, Anthropic calls, anything that takes a prompt and returns text.&lt;/p&gt;

&lt;p&gt;The takeaway isn't "use my API". It's: &lt;strong&gt;if your structured extraction is getting flaky on long inputs, the answer probably isn't a longer prompt. It's two prompts.&lt;/strong&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;If you've tried something similar — or if you've got a case where this falls apart — I'd be curious to hear it.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>llm</category>
      <category>ai</category>
      <category>architecture</category>
      <category>api</category>
    </item>
  </channel>
</rss>
