<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: arif</title>
    <description>The latest articles on DEV Community by arif (@arif).</description>
    <link>https://dev.to/arif</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.us-east-2.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F264601%2F069d88a1-9b73-4129-856c-a2ddd59fc0f2.JPG</url>
      <title>DEV Community: arif</title>
      <link>https://dev.to/arif</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/arif"/>
    <language>en</language>
    <item>
      <title>i built a cad editor in the browser, then taught an llm to use it</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Fri, 03 Jul 2026 14:27:04 +0000</pubDate>
      <link>https://dev.to/arif/i-built-a-cad-editor-in-the-browser-then-taught-an-llm-to-use-it-1l92</link>
      <guid>https://dev.to/arif/i-built-a-cad-editor-in-the-browser-then-taught-an-llm-to-use-it-1l92</guid>
      <description>&lt;p&gt;the demo moment i did not plan: i asked my app "how many doors and windows are there?" and the ai answered the counts, then added, unprompted:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;note D3 is only 300 mm wide, likely a mis-detected door. want me to check it?&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;it was right. my extraction pipeline had turned a piece of geometry into a 30 cm door. no human had noticed. the model read the quantity take-off, saw a door narrower than a shoebox, and flagged it.&lt;/p&gt;

&lt;p&gt;that little moment is the payoff of a longer story: parsing one of the most hostile file formats in existence (autocad dwg), reconstructing real building models from thousands of anonymous line segments, building a 2d cad editor from scratch in canvas, and then handing the whole thing to claude as a set of tools it can call.&lt;/p&gt;

&lt;p&gt;this is how i built it, including the parts that went wrong.&lt;/p&gt;

&lt;h2&gt;
  
  
  what the thing does
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;          upload                durable etl job                     viewer + editor
 browser ────────▶ fastapi ───────────────────▶ dwg → dxf → model ────▶ react + three.js
 (react)           + dbos        (libredwg + ezdxf, by layer)           2d cad editor
                                       │                                    ▲
                                       ▼                                    │
                                 entity rows in db  ◀──── ai copilot ───────┘
                                 (walls, doors, rooms)     (claude + tools)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;you drop a dwg floor plan in the browser. a durable background job converts it, extracts the geometry, and reconstructs a structured building model: walls as centerlines, doors and windows placed on their host walls, rooms as polygons with names and areas. you get a 3d viewer, a full 2d cad editor (snapping, dimensions, undo, print to scale), quantity take-offs, and export back to dwg.&lt;/p&gt;

&lt;p&gt;and then there is a chat panel where an ai agent can read the model and edit it. for real. the edits land in the database, show up live in 2d and 3d, and sit in your undo history like any manual edit.&lt;/p&gt;

&lt;p&gt;the important thing to understand before any of that: a dwg file does not contain walls. it contains lines. everything interesting in this project is the distance between those two sentences.&lt;/p&gt;

&lt;h2&gt;
  
  
  part 1: dwg is a format that hates you
&lt;/h2&gt;

&lt;p&gt;dwg is a closed binary format with about 30 years of versions. there is no official spec. the open source world has exactly one serious answer: libredwg, which can read basically everything and write reliably to one version (r2000).&lt;/p&gt;

&lt;p&gt;two decisions saved me weeks here:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;run libredwg as a subprocess, not a library.&lt;/strong&gt; it is gpl, my code is not, and a subprocess boundary keeps the licenses clean. it also means a segfault in a 30-year-old format parser kills a child process, not my server. the converter turns dwg into dxf, which is the same data model as text, and from there ezdxf (a superb python library) takes over.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;never trust the file.&lt;/strong&gt; my favorite example: dwg files carry a &lt;code&gt;$INSUNITS&lt;/code&gt; header that declares the drawing units. it lies constantly. i had a file that declared "metres" while every coordinate was clearly millimetres. trust it and your apartment renders a thousand times too big.&lt;/p&gt;

&lt;p&gt;the fix is embarrassingly simple: ignore the header, look at the numbers.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;infer_metres_per_unit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;spans&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# a building floor plan is 5 to 100 metres across. if the raw
&lt;/span&gt;    &lt;span class="c1"&gt;# coordinate span is ~8000, the drawing is in millimetres,
&lt;/span&gt;    &lt;span class="c1"&gt;# whatever the header claims.
&lt;/span&gt;    &lt;span class="n"&gt;span&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;spans&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;1.0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.001&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.01&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.0254&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mf"&gt;0.3048&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;metres&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;span&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;plausible_extent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;metres&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;kind&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;scale&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="mf"&gt;1.0&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;same paranoia everywhere else. real files come structurally broken: truncated sections, invalid handles, entities the strict parser refuses. when &lt;code&gt;ezdxf.recover&lt;/code&gt; gives up, i do not: a dxf is at heart a stream of &lt;code&gt;(group code, value)&lt;/code&gt; pairs, so i drop to a raw tag scanner that walks the text and pulls out every &lt;code&gt;LINE&lt;/code&gt;, &lt;code&gt;LWPOLYLINE&lt;/code&gt;, &lt;code&gt;CIRCLE&lt;/code&gt; and &lt;code&gt;TEXT&lt;/code&gt; it can find, geometry only, no object model. you lose elegance, you keep the floor plan. and a converter that emits latin-1 diagnostics into a utf-8 pipe cannot blow up the job either (capture bytes, decode with &lt;code&gt;errors="replace"&lt;/code&gt;, a real crash i hit on 7 of my first 35 test files).&lt;/p&gt;

&lt;p&gt;to prove the pipeline was solid i collected 231 real dwg files from the internet: architectural, structural, mechanical, electrical, every version from r1.4 to 2025. a smoke test harness runs each file through the full pipeline in an isolated subprocess with a timeout, so one hang cannot take down the run.&lt;/p&gt;

&lt;p&gt;result after fixing everything the corpus surfaced: &lt;strong&gt;231 files, 0 crashes.&lt;/strong&gt; graceful "i could not reconstruct much from this" is allowed. a traceback is not.&lt;/p&gt;

&lt;h2&gt;
  
  
  part 2: from thousands of line segments to actual walls
&lt;/h2&gt;

&lt;p&gt;extraction gives you a soup. blocks (reusable symbols like door leafs and window frames) get exploded recursively into their primitive entities, arcs get sampled into chords, polylines get split into edges. what comes out is tens of thousands of bare segments, each carrying exactly one piece of metadata: its layer name.&lt;/p&gt;

&lt;p&gt;layer names are the only semantics a dwg has, and they are a free-text field filled by humans. german architects write &lt;code&gt;WAND&lt;/code&gt; or &lt;code&gt;MAUER&lt;/code&gt;, english ones write &lt;code&gt;A-WALL&lt;/code&gt; or &lt;code&gt;WALLS&lt;/code&gt;, and everyone abbreviates differently. so classification is humble substring matching over a multilingual hint list:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;_WALL_PATS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wall&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wand&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;mauer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gebäude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;gebaeude&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;_DOOR_PATS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;door&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tür&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;tuer&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;porte&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;_WIN_PATS&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;window&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;fenster&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;wind&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;the same trick, with a different hint list (flurstück, parcel, gelände, straße...), decides whether the whole drawing is a building floor plan or a site plan, because the two need completely different reconstruction. and when a file has no recognizable wall layer at all (everything dumped on layer &lt;code&gt;0&lt;/code&gt;, a classic), the pipeline falls back to "every segment that is not obviously a door, window, text or dimension" and attaches a warning saying results are approximate. honesty over confidence.&lt;/p&gt;

&lt;p&gt;now the real problem. on the wall layer, a wall is never one line. it is drawn as parallel pairs, and each line is broken into fragments wherever a door or window interrupts it, often drawn in opposite directions by different commands years apart:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; what the dwg contains on the wall layer:

   ──────▶      ◀──────────        ──▶        four fragments, mixed directions

 what the model needs:

   ══════════════════════════════════         one wall centerline
         └─ 900mm gap ─┘   └─ 1200mm gap ─┘   with its gaps remembered
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;merging those fragments is a grouping problem: which segments lie on the same infinite line? i hash each segment by its quantized angle (4 degree buckets, modulo 180 so direction does not matter) plus its perpendicular offset from the origin (50 mm buckets). segments that share a hash are collinear neighbours.&lt;/p&gt;

&lt;p&gt;the bug that taught me the most lived right here. the offset is computed along the line's normal vector, and a segment drawn right-to-left has its normal pointing the opposite way from its left-to-right twin, which flips the sign of the offset, which puts the two halves of the same wall into different hash buckets. the fix is one canonicalization:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="c1"&gt;# flip the normal into one half-plane so segments drawn in opposite
# directions on the SAME line (two halves of a wall split by a door,
# extremely common in real dwgs) hash together
&lt;/span&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ny&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ny&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="n"&gt;nx&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ny&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;nx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="n"&gt;ny&lt;/span&gt;
&lt;span class="n"&gt;offset&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;round&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;nx&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;ny&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mf"&gt;50.0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;before that fix, half my test corpus reconstructed twice as many walls as the buildings had.&lt;/p&gt;

&lt;p&gt;within each group the merge becomes one-dimensional: project every fragment onto the group's direction, sort the intervals, and sweep. touching or overlapping intervals fuse. gaps up to 2600 mm (a generous door width) get bridged and recorded, because a door-sized hole in a wall line is not noise, it is a door. gaps bigger than that split the wall in two. the recorded gaps come out the other side as opening candidates for free.&lt;/p&gt;

&lt;p&gt;with centerlines in hand, classification is geometric. a wall whose &lt;em&gt;midpoint&lt;/em&gt; hugs the bounding box perimeter is external (230 mm brick by default), everything else is a partition (100 mm block). testing the midpoint rather than the endpoints matters: an interior wall whose ends touch the facade would otherwise get promoted to external.&lt;/p&gt;

&lt;p&gt;doors and windows then snap to their hosts: take each segment from a door or window layer, find the nearest wall within a 500 mm perpendicular tolerance, and project the midpoint onto the wall's axis. that projection &lt;em&gt;is&lt;/em&gt; the door's position, stored as one number, &lt;code&gt;center_mm&lt;/code&gt; along the wall. openings with the same type and size (rounded to 50 mm) share a schedule mark, which is how D1, D2, W1 get assigned exactly the way a human drafter would.&lt;/p&gt;

&lt;p&gt;every dimension nobody drew (storey height 3000, door height 2100, window sill 900) is a stated construction default, appended to the model's warnings so the ui can show "assumed" instead of pretending the file said so.&lt;/p&gt;

&lt;h2&gt;
  
  
  part 3: rooms are faces of a planar graph
&lt;/h2&gt;

&lt;p&gt;rooms are the part that feels like magic and is actually graph theory.&lt;/p&gt;

&lt;p&gt;once you have wall centerlines, a floor plan is a planar subdivision: the lines carve the plane into faces, and the bounded faces are the rooms. extracting them is a classic computational geometry exercise:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; 1. split every segment at every         2. walk the half-edges, always
    intersection, snap nodes                taking the sharpest clockwise
    to a 60mm grid                          turn at each node

    ┌──────┬────────┐                       ┌──────┬────────┐
    │      │        │                       │  R1  │   R2   │
    ├──────┴───┬────┤          ──▶          ├──────┴───┬────┤
    │          │    │                       │    R3    │ R4 │
    └──────────┴────┘                       └──────────┴────┘

 3. every minimal face gets traced exactly once.
    one face is the unbounded outside: drop it.
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;step 2 is the elegant bit. from every directed edge, keep turning as sharply clockwise as possible and you trace exactly one minimal face, and every interior face of the plan gets traced exactly once. no recursion, no flood fill, just angles.&lt;/p&gt;

&lt;p&gt;step 3 hid my favorite geometry bug. you get all the rooms &lt;em&gt;plus&lt;/em&gt; one extra polygon: the outer boundary of the whole building, traced from the outside. my first heuristic dropped any face bigger than some fraction of the bounding box. worked on every rectangular test building, then an l-shaped floor plan arrived. the outer face of an l-shape is nowhere near its bounding box area, so it slipped under the threshold and appeared in the ui as a giant phantom room covering the entire footprint. the correct rule is topological, not metric: in a connected arrangement the unbounded face encloses everything else, so it is always the single &lt;em&gt;largest&lt;/em&gt; face. drop the max, keep the rest. no threshold to tune, no shape that breaks it.&lt;/p&gt;

&lt;p&gt;naming the survivors is almost an afterthought: floor plans carry text labels ("kitchen", "büro 2.13"), so run point-in-polygon of each label against each room face and the names fall into place. rooms without a label become "room 7", which is at least honest.&lt;/p&gt;

&lt;p&gt;it is worth pausing on what happened across these two parts. the input was an unordered pile of anonymous line segments. the output is: 7 walls with types and materials, 6 doors and 3 windows that know which wall they sit on and where, 4 rooms with names, areas and perimeters, and a structural grid derived from the dominant wall axes. nothing in the file said any of that. it was all latent in the geometry.&lt;/p&gt;

&lt;h2&gt;
  
  
  part 4: the model is rows, not a blob
&lt;/h2&gt;

&lt;p&gt;the easy design would be: etl emits a json file, viewer reads the json file. done.&lt;/p&gt;

&lt;p&gt;i did not do that, and it turned out to be the single most important decision in the project. the extraction writes the model into the database as entity rows:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="n"&gt;walls&lt;/span&gt;        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;thickness_mm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height_mm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;material&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;openings&lt;/span&gt;     &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;mark&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="k"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wall_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;center_mm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;width_mm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height_mm&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;sill_mm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;rooms&lt;/span&gt;        &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;polygon&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;floor_finish&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wall_finish&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;dimensions&lt;/span&gt;   &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;x2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;y2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;offset_mm&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;labels&lt;/span&gt;       &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_kind&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;target_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;text&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ty&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;one endpoint assembles the rows into the model json the viewers consume. another validates an edited model against pydantic schemas and writes it back to the rows.&lt;/p&gt;

&lt;p&gt;notice what that buys you. editing is just row updates. dwg export is just reading rows and emitting dxf entities (real &lt;code&gt;DIMENSION&lt;/code&gt; and &lt;code&gt;TEXT&lt;/code&gt; entities, mitered wall outlines). and, foreshadowing, &lt;strong&gt;an ai tool that edits the model is just another caller of the same validated save path.&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;a door is not a hole in a picture. it is a row that knows which wall it lives on and how far along it sits. everything downstream falls out of that.&lt;/p&gt;

&lt;h2&gt;
  
  
  part 5: a cad editor is mostly three problems
&lt;/h2&gt;

&lt;p&gt;i researched the buy option first. commercial browser cad sdks either cannot actually edit dwg (they are viewers with markup) or start around $7,500 a year for a general-dwg scope i did not need. i only needed to edit &lt;em&gt;my&lt;/em&gt; reconstructed model. so i built it on a raw canvas 2d context.&lt;/p&gt;

&lt;p&gt;a cad editor that feels professional is mostly three problems: rendering speed, snapping, and undo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;rendering.&lt;/strong&gt; my first version redrew everything on every mousemove. on a 963-wall plan that was 1154 ms per frame. one frame per second. the fix is the oldest trick in graphics: split the scene by how often it changes.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt; layer 3: overlay canvas    cursor, snap markers, drag previews   every mousemove
 ─────────────────────────────────────────────────────────────────────────────────
 layer 2: model cache       walls, rooms, doors, dimensions       on commit
 ─────────────────────────────────────────────────────────────────────────────────
 layer 1: base cache        grid + original dwg linework          on pan/zoom
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;layers 1 and 2 are offscreen canvases, redrawn only when their content changes. mousemove touches only the overlay. walls get batched into two &lt;code&gt;Path2D&lt;/code&gt; objects (one per wall type) so the browser does two fill calls instead of a thousand.&lt;/p&gt;

&lt;p&gt;1154 ms became 82 ms under a software rasterizer, and comfortable 60 fps on a real gpu. no webgl, no framework, just not repainting things that did not change.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;snapping.&lt;/strong&gt; autocad muscle memory is real: endpoints beat intersections beat midpoints beat grid. i keep two r-tree spatial indexes (rbush), one for segments and one for points, query a small box around the cursor, and pick the winner by priority:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;endpoint(10) &amp;gt; intersection(9) &amp;gt; midpoint(8) &amp;gt; perpendicular(7) &amp;gt; on-line(5) &amp;gt; grid(2)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;one detail that separates "toy" from "correct": a strong object snap overrides ortho lock. if you are drawing orthogonally but hover an endpoint that is 2 degrees off axis, real cad snaps to the endpoint. get that wrong and anyone who has used autocad feels it instantly.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;undo.&lt;/strong&gt; i used snapshots, not command objects. before any transaction the store clones the model (&lt;code&gt;structuredClone&lt;/code&gt;, about 200 kb for 1000 walls, 80 snapshots deep). drags mutate the live model freely so every side panel updates in real time, and if you press escape the snapshot rolls back.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;mutate&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;BuildingModel&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="k"&gt;void&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;begin&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;          &lt;span class="c1"&gt;// clone the model&lt;/span&gt;
  &lt;span class="nf"&gt;mutate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;    &lt;span class="c1"&gt;// do anything&lt;/span&gt;
  &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;commit&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;         &lt;span class="c1"&gt;// snapshot -&amp;gt; undo stack, autosave kicks in&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;boring, memory-hungry, and completely immune to the classic command-pattern bug where one action forgets to implement its inverse. for models of this size it is the right trade.&lt;/p&gt;

&lt;p&gt;saving is a debounced single-flight put of the whole model, with an abort controller so a newer save supersedes an in-flight older one. writes cannot land out of order.&lt;/p&gt;

&lt;h2&gt;
  
  
  part 6: give the model tools, not a text box
&lt;/h2&gt;

&lt;p&gt;then my company announced an ai hackathon, and i had claude opus 4.8 available through an azure ai foundry endpoint (foundry serves the native anthropic api, so the official sdk works as-is, just point &lt;code&gt;base_url&lt;/code&gt; at it).&lt;/p&gt;

&lt;p&gt;the tempting demo is rag: "chat with your floor plan." i wanted the other thing. an agent that &lt;em&gt;does&lt;/em&gt; work: place the door, label the rooms, run the take-off, fix the storey height.&lt;/p&gt;

&lt;p&gt;the architecture is a tool-use loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json"&gt;&lt;code&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;user:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"add a 1500mm window to the longest external wall"&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;▼&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="err"&gt;claude&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;tool:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;list_elements(kind=walls)&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;──▶&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;backend&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;runs&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;it&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;──▶&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;json&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;result&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;                                                                  &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;◀──────────────────────────────────────────────────────────────────┘&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="s2"&gt;"W1 and W3 are both 10m. W1 already has a door and a
   │    window, so W3 it is."&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;▼&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="err"&gt;claude&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;──&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;tool:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;add_openings(&lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="err"&gt;wallId:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"W3"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;type:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"window"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;                            &lt;/span&gt;&lt;span class="err"&gt;width_mm:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;1500&lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="err"&gt;)&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;▼&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="err"&gt;backend:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;mutate&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;model&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;validate&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;save&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;rows&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;stream&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;result&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;│&lt;/span&gt;&lt;span class="w"&gt;
   &lt;/span&gt;&lt;span class="err"&gt;▼&lt;/span&gt;&lt;span class="w"&gt;
 &lt;/span&gt;&lt;span class="err"&gt;claude:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"done. added window W3 (1500x1200, sill 900) centred on
          wall W3, with a leader label."&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;thirteen tools total. three read (summary, element listing, quantity take-off), ten write (walls, openings, rooms, annotations, storey, rename, delete with cascade).&lt;/p&gt;

&lt;p&gt;three rules made this safe instead of terrifying:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;1. every mutation goes through the same validated path the ui uses.&lt;/strong&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_save&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="nf"&gt;save_edited_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;project_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;   &lt;span class="c1"&gt;# pydantic-validated, same as PUT /model
&lt;/span&gt;    &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ToolError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;edit rejected by model validation: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;the agent physically cannot persist a model the editor and the dwg exporter would not accept. i did not write a second, special, "for the ai" write path. that is the whole trick, and it only works because part 4 made the model rows with a single schema-checked door in front of them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;2. tool errors are recoverable, not fatal.&lt;/strong&gt; a &lt;code&gt;ToolError&lt;/code&gt; ("no wall with id W99", "a 5000 mm door does not fit on a 3200 mm wall") is fed back to the model as an error result. the model reads it, adjusts, and tries something else. the stream never dies because the agent guessed wrong once. watching it recover mid-conversation is half the magic of the demo.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;3. domain guardrails live in the tools, not the prompt.&lt;/strong&gt; default door sizes, auto-assigned marks, fit checks, cascade deletes (kill a wall, its doors and labels go with it). prompts drift. code does not.&lt;/p&gt;

&lt;p&gt;the loop itself is about 60 lines: stream a response, forward text deltas to the browser as server-sent events, and when the model stops with &lt;code&gt;tool_use&lt;/code&gt;, execute the tools, append the results, and continue, up to 12 rounds.&lt;/p&gt;

&lt;h2&gt;
  
  
  the part nobody talks about: ai edits and your undo stack
&lt;/h2&gt;

&lt;p&gt;here is the ux problem that makes or breaks a copilot in an editor. the ai edits the database server-side, but the user is holding a live client-side editing session with its own undo history. if those drift apart you get the worst bug class in collaborative software.&lt;/p&gt;

&lt;p&gt;my solution has two halves.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;flush before the agent reads.&lt;/strong&gt; the editor autosaves on a 900 ms debounce, so when you send a chat message there may be an unsaved wall on your screen. before every agent turn the frontend cancels the pending timer and force-saves. the agent always sees your latest state, and, just as important, a stale queued autosave can no longer fire mid-run and silently overwrite what the agent wrote.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;fold the result into undo as one step.&lt;/strong&gt; when the agent reports that it changed something, the frontend refetches the model and applies it through the normal store transaction:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;applyServerModel&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;async &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;res&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;api&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;project&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;fresh&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;structuredClone&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;model&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;store&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;execute&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nb"&gt;Object&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;assign&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;fresh&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;   &lt;span class="c1"&gt;// one commit -&amp;gt; one undo entry&lt;/span&gt;
  &lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="p"&gt;};&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;whatever the agent did, five tool calls, twenty labels, becomes exactly one entry in the undo stack. ctrl+z reverts the entire ai edit, and the autosave writes the revert back to the database. i have an end-to-end browser test that asserts the full round trip: agent adds a door, db count goes up by one, ctrl+z, db count comes back down.&lt;/p&gt;

&lt;p&gt;an ai copilot without undo is a liability. with undo it is just a very fast colleague.&lt;/p&gt;

&lt;h2&gt;
  
  
  two bugs worth telling
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;the invisible stream.&lt;/strong&gt; my first browser test passed the "request completed" check but showed an empty chat bubble. the backend streamed sse frames, curl showed them fine, the browser parsed nothing. cause: my parser split frames on &lt;code&gt;\n\n&lt;/code&gt;, but the sse library (sse-starlette) terminates lines with &lt;code&gt;\r\n&lt;/code&gt;. curl's terminal output hides the difference completely. the fix is one regex:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// frames end with a blank line; the server uses \r\n line endings&lt;/span&gt;
&lt;span class="k"&gt;while &lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;match&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="se"&gt;\r?\n\r?\n&lt;/span&gt;&lt;span class="sr"&gt;/&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;frame&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="nx"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;m&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="c1"&gt;// parse "event:" and "data:" lines with split(/\r?\n/)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;if you ever hand-parse sse from a &lt;code&gt;fetch&lt;/code&gt; body: it is always the line endings.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;the minus one millimetre door.&lt;/strong&gt; i wrote a fuzz harness that reconstructs models from the whole dwg corpus and fires every tool at them with valid, edge-case, and garbage inputs, asserting after every single mutation that the model still validates and still exports to dxf. it found that &lt;code&gt;add_openings&lt;/code&gt; with &lt;code&gt;width_mm: -1&lt;/code&gt; sailed straight through my fit check:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;width&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;100&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;length&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;   &lt;span class="c1"&gt;# -1 + 100 = 99, "fits" on any wall
&lt;/span&gt;    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;ToolError&lt;/span&gt;&lt;span class="p"&gt;(...)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;a door with negative width, persisted to the database. would claude ever send that? probably not, the tool schema declares minimums. but "probably not" is not an invariant, and the boundary has to hold on its own. range checks went in, and the fuzz harness lives in the repo and runs against the corpus like the etl smoke test does.&lt;/p&gt;

&lt;p&gt;fuzz your tool layer. the llm is a fuzzer with excellent grammar, so meet it with one that has none.&lt;/p&gt;

&lt;h2&gt;
  
  
  what i learned
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;the hard part was never the ai.&lt;/strong&gt; it was turning anonymous line segments into walls, doors and rooms. i spent 90% of the time on extraction, geometry and the editor, and the ai feature took days, not weeks, precisely because that foundation existed.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;never trust file headers.&lt;/strong&gt; measure the geometry. the units, the layers, the structure: verify everything against what is actually drawn.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;geometry bugs hide in the shapes you did not test.&lt;/strong&gt; rectangular buildings passed for weeks; one l-shaped plan broke the room detection. topological rules (drop the largest face) beat metric thresholds (drop faces over x%) every time.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;one validated write path.&lt;/strong&gt; the ui, the import pipeline, and the ai all write through the same schema-checked door. every safety property you enforce there is enforced everywhere, forever.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;let the agent fail cheaply.&lt;/strong&gt; recoverable tool errors turn "the ai crashed" into "the ai corrected itself", which users read as intelligence.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;the ai needs an undo story.&lt;/strong&gt; refetch and fold into one transaction is simple and it works. ship nothing agentic without it.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;smoke test against reality.&lt;/strong&gt; 231 hostile files taught me more than any spec. the corpus, not my imagination, decided what "robust" means.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;the stack, for the curious: fastapi, dbos durable workflows, libredwg + ezdxf, sqlmodel on sqlite (postgres-ready), react + vite + typescript, three.js for 3d, raw canvas 2d for the editor, rbush for spatial indexing, and the anthropic python sdk pointed at claude opus 4.8 on azure ai foundry.&lt;/p&gt;

&lt;p&gt;the whole thing runs on one small box. the most satisfying part is still that first conversation: asking a floor plan a question, getting a correct answer, and watching a door it drew for you appear in the drawing, with its swing arc, its mark tag, and a ctrl+z that takes it away again.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>webdev</category>
      <category>python</category>
      <category>typescript</category>
    </item>
    <item>
      <title>Your SSH key is already an account: building multiplayer apps over SSH</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Fri, 26 Jun 2026 15:59:36 +0000</pubDate>
      <link>https://dev.to/arif/your-ssh-key-is-already-an-account-building-multiplayer-apps-over-ssh-1fd3</link>
      <guid>https://dev.to/arif/your-ssh-key-is-already-an-account-building-multiplayer-apps-over-ssh-1fd3</guid>
      <description>&lt;p&gt;Almost every app starts with the same chore. Before a user can do anything, you build a way to know who they are: an email field, a password hash, an OAuth redirect, a confirmation link, a session cookie. It is real work, and none of it is the thing you set out to build.&lt;/p&gt;

&lt;p&gt;SSH has handled this since the 1990s, and almost nobody builds on it.&lt;/p&gt;

&lt;p&gt;When a client connects over SSH, the handshake cryptographically proves the client holds the private key for the public key it presents. By the time your code runs, the connection has already answered the question every login form exists to answer. The key fingerprint is a stable, unique, verified id for that user. No email, no password, no signup, no third party. You have an account before the first byte of your app runs.&lt;/p&gt;

&lt;p&gt;So I pulled the common parts into a small Go framework called &lt;a href="https://github.com/doganarif/wharf" rel="noopener noreferrer"&gt;wharf&lt;/a&gt;, built on Charm's &lt;code&gt;wish&lt;/code&gt; and &lt;code&gt;bubbletea&lt;/code&gt;.&lt;/p&gt;

&lt;h2&gt;
  
  
  TL;DR
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;The SSH key fingerprint is a free, verified, zero-signup user id.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;wharf&lt;/code&gt; gives you identity, presence, multiplayer rooms, and per-user persistence. You write a Bubble Tea model, it does the rest.&lt;/li&gt;
&lt;li&gt;Each room is a single goroutine, so broadcasting is one line and there are no locks.&lt;/li&gt;
&lt;li&gt;Storage is a tiny interface with adapters for memory, SQLite, Postgres, Redis, and bbolt, each its own module so the core has no database dependency.&lt;/li&gt;
&lt;li&gt;Go, MIT. Repo: github.com/doganarif/wharf&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The smallest app
&lt;/h2&gt;

&lt;p&gt;An app is a function from a session to a Bubble Tea model. That is the whole contract.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;wharf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;":2222"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;App&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"shout"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;
        &lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;shout&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;wharf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Session&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;tea&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;me&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;          &lt;span class="c"&gt;// verified identity, for free&lt;/span&gt;
        &lt;span class="n"&gt;room&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"shout"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="c"&gt;// a shared room&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;&lt;code&gt;ssh shout@your.host&lt;/code&gt; and you are in. The room traffic shows up as normal messages in your &lt;code&gt;Update&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="n"&gt;tea&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Msg&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;tea&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tea&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Cmd&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;type&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;tea&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;KeyMsg&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Type&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;tea&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;KeyEnter&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;room&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Broadcast&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Who&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;me&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ShortID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Text&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;})&lt;/span&gt;
            &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;wharf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Event&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="c"&gt;// someone broadcast to the room&lt;/span&gt;
        &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Payload&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;wharf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Presence&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="c"&gt;// the roster changed (join or leave)&lt;/span&gt;
        &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;roster&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;msg&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Roster&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That is multiplayer. No socket handling, no membership bookkeeping, no auth.&lt;/p&gt;

&lt;h2&gt;
  
  
  Identity is free, and persistent
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;s.User&lt;/code&gt; is derived from the public key. The fingerprint is the real id, and &lt;code&gt;wharf&lt;/code&gt; turns it into a stable handle and color so the same key is always the same &lt;code&gt;brave-otter&lt;/code&gt; in the same shade. Two strangers who never signed up now have distinct, durable identities.&lt;/p&gt;

&lt;p&gt;Use the fingerprint as a key into a store and you get per-user state with no login:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;b&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Bucket&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"visits"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="c"&gt;// namespaced to this app&lt;/span&gt;
&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;User&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fingerprint&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="c"&gt;// read/unread, bookmarks, "continue where you left off", a history&lt;/span&gt;
&lt;span class="c"&gt;// that is simply there when they reconnect tomorrow.&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It is anonymous and persistent at once. You never learn anyone's email, so you are not holding data you have to protect, but you recognize them perfectly across visits.&lt;/p&gt;

&lt;h2&gt;
  
  
  A room is a single goroutine
&lt;/h2&gt;

&lt;p&gt;The naive multiplayer design is a shared map of members behind a mutex, and it races the moment people join and leave while messages fly.&lt;/p&gt;

&lt;p&gt;&lt;code&gt;wharf&lt;/code&gt; models each room as an actor. One goroutine owns the member set. Join, leave, and broadcast are messages sent to it over channels. Because exactly one goroutine ever touches that map, there are no locks and no data races by construction. Concurrency is message passing, not shared memory.&lt;/p&gt;

&lt;p&gt;A nice second-order effect: since the room already serializes every mutation, it is the natural home for shared state. &lt;code&gt;wharf&lt;/code&gt; has stateful rooms that hold a value and a reducer, hand each newcomer a snapshot on join, and broadcast the new state after every action. A live poll or a small game board becomes a reducer and a render, with the hard part solved one layer down.&lt;/p&gt;

&lt;h2&gt;
  
  
  Persistence without forcing a database on you
&lt;/h2&gt;

&lt;p&gt;It would have been easy to import a Postgres driver into the core and make everyone carry it. Instead, persistence is a five-method interface. The in-memory implementation lives in the core and nothing else. Every real backend is its own Go module:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="s"&gt;"github.com/doganarif/wharf/store/sqlite"&lt;/span&gt;

&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sqlite&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"wharf.db"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="n"&gt;wharf&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;":2222"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Store&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;st&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;App&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"shout"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;shout&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Importing the core pulls in no database. Importing an adapter pulls in exactly one. SQLite, Postgres, Redis, and an embedded single-file option all pass one shared conformance suite, so swappable is something I can prove rather than claim. The Postgres and Redis adapters are verified against live servers, and per-key history that works in memory survives a restart unchanged when you point it at a file or a server.&lt;/p&gt;

&lt;h2&gt;
  
  
  A bug that made the design click
&lt;/h2&gt;

&lt;p&gt;Early on, presence was sometimes wrong: one of two connected users would not see the other arrive. Broadcasts worked, identity worked, only the join ordering was off, intermittently.&lt;/p&gt;

&lt;p&gt;The cause was timing. Before rendering, the terminal layer asks the client about its background color and waits for a reply. Real terminals answer in milliseconds. A client that never answers makes the code wait for the timeout, and in my first version that wait happened before the session joined its room. A slow client would join late, after someone who connected after it, and events arrived out of order.&lt;/p&gt;

&lt;p&gt;The fix was to join the room before negotiating anything about rendering. The lesson generalized: the moment you stop assuming "connected" and "in the room" happen at the same instant, a class of ordering bugs disappears.&lt;/p&gt;

&lt;h2&gt;
  
  
  What you can build, and the one catch
&lt;/h2&gt;

&lt;p&gt;&lt;code&gt;wharf&lt;/code&gt; is not a chat library, even though chat is the obvious demo. The primitives (identity, presence, rooms, shared state, storage) fit a lot of things. The examples I shipped are a collaborative drawing canvas where you watch other cursors move, a keyless chat room, and a live poll. The same shapes cover guestbooks, small multiplayer games, status boards, internal tools your team reaches with keys they already have, and an assistant you talk to in your terminal.&lt;/p&gt;

&lt;p&gt;The honest catch: SSH is pull-only. Nobody re-runs a command on a Tuesday for no reason, so an SSH app cannot tap someone on the shoulder. The experience is great and the retention is near zero unless you add a way back in, usually a small web mirror for discovery or a notification when there is an actual reason to return. The terminal is the experience, not the reminder.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try it
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;go run github.com/doganarif/wharf/cmd/wharf@latest
&lt;span class="c"&gt;# then, in two terminals:&lt;/span&gt;
ssh &lt;span class="nt"&gt;-p&lt;/span&gt; 2222 canvas@localhost
ssh &lt;span class="nt"&gt;-p&lt;/span&gt; 2222 chat@localhost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Repo (Go, MIT): &lt;a href="https://github.com/doganarif/wharf" rel="noopener noreferrer"&gt;https://github.com/doganarif/wharf&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you build something odd with it, I would like to see it.&lt;/p&gt;

</description>
      <category>go</category>
      <category>ssh</category>
      <category>opensource</category>
      <category>showdev</category>
    </item>
    <item>
      <title>Shipping 100,000 construction PDFs a month: what actually breaks</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Thu, 18 Jun 2026 22:07:44 +0000</pubDate>
      <link>https://dev.to/arif/shipping-100000-construction-pdfs-a-month-what-actually-breaks-3icn</link>
      <guid>https://dev.to/arif/shipping-100000-construction-pdfs-a-month-what-actually-breaks-3icn</guid>
      <description>&lt;p&gt;After a year running a document processing pipeline through hundreds of thousands of construction documents (tender packs, permit applications, site surveys, BIM exports, drawing sets at A0 and larger), I can tell you what actually breaks.&lt;/p&gt;

&lt;p&gt;It is not the PDFs.&lt;/p&gt;

&lt;p&gt;That is the thing most people get wrong about systems like this. Every PDF tutorial focuses on the parser: which library, which model, which extraction service. After a year, the PDFs themselves rank third on the list of things that break. The first is the error taxonomy. The second is coordination between documents. The actual content of the files is, mostly, a tractable engineering problem with off-the-shelf tools.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;One pipeline run per document, not per upload.&lt;/strong&gt; Per-document isolation pays for itself the first time a corrupt PDF lands in a 2,000-file archive.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PDF malformation has a bounded long tail.&lt;/strong&gt; Five categories cover most of what production traffic produces. Handle the categories, not the cases.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vision LLMs are not the bottleneck. Coordination is.&lt;/strong&gt; The completion fence is two SQL queries and it caused more incidents than the model layer ever did.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Large-format pages require tiling before submission.&lt;/strong&gt; A0 drawings at 300 DPI exceed every vision API's pixel budget. Detect at the geometry level, subdivide into a grid, and batch-submit tiles through a separate quota pool.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;The most effective intervention against hallucinated numbers is not a better model. It is grounding.&lt;/strong&gt; Feed the model the deterministically-extracted text alongside the image and it becomes a layout mediator rather than a content generator.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;A hundred thousand documents a month is one new document every twenty-six seconds on average, in bursts of several thousand at a time. Most are uneventful. The ones that are not built this pipeline.&lt;/p&gt;

&lt;h2&gt;
  
  
  The shape of the pipeline
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;ZIP upload ──► extract recursively ──► fan-out (one run per file)
                                            │
                                            ▼
                            ┌─────────────────────────────┐
                            │ format normalization        │
                            │   (Office → PDF, flatten)   │
                            ├─────────────────────────────┤
                            │ extraction                  │
                            │   (parser / OCR / vision)   │
                            ├─────────────────────────────┤
                            │ enrichment                  │
                            │   (classify, embed)         │
                            ├─────────────────────────────┤
                            │ persistence                 │
                            │   (vector + search + DB)    │
                            └─────────────────────────────┘
                                            │
                                            ▼
                            project-wide completion fence
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The architecture is conventional. The interesting parts live in the seams between stages and in the orchestration around them.&lt;/p&gt;

&lt;h2&gt;
  
  
  One run per document
&lt;/h2&gt;

&lt;p&gt;The first instinct is to batch. Receive a ZIP of two thousand files, process them together, commit them together. Batching minimises overhead. It also couples receipt, processing, and commit into a single failure domain, which is the opposite of what you want when input variance is this high. A single corrupt PDF inside the batch fails the batch. Retries become coarse. Observability collapses into "this upload did not finish." The user sees an error on a project that contains one bad file and 1,999 good ones.&lt;/p&gt;

&lt;p&gt;We run one pipeline run per document.&lt;/p&gt;

&lt;p&gt;The coordinator receives a ZIP, expands it recursively to a configured nesting limit, uploads each file to object storage, and dispatches one child pipeline run per file. The launch is fire-and-forget through the orchestrator's task API. The per-file descriptor passed between stages is intentionally small: a storage key, a filename, a relative archive path, an extension, a size, a nesting flag. No bytes. Each child worker fetches its own file from storage.&lt;/p&gt;

&lt;p&gt;This decouples extraction throughput from processing memory. The coordinator's working set stays flat regardless of whether the archive contains fifty documents or fifty thousand. Child pipeline runs execute on uniformly sized container tasks (4 vCPU, 30 GiB RAM, 100 GiB ephemeral disk), so capacity planning is one number multiplied by concurrency.&lt;/p&gt;

&lt;p&gt;Within a child run, the stages are strictly sequential. Threading inside a run would muddle error attribution, complicate the status state machine, and burn file descriptors. Parallelism lives between runs, not inside them.&lt;/p&gt;

&lt;p&gt;Every stage is wrapped by the same helper. It opens a trace span, awaits the stage coroutine, and on failure calls one of two status writers before re-raising:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;_run_step&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;coro&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="nf"&gt;trace_span&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;coro&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;UnprocessableError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mark_unprocessable&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;db&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;mark_failed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;str&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;raise&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The choice between the two writers is the architectural decision that took the longest to get right. A document the extraction service classifies as permanently unprocessable (DRM-locked, zero-byte, malformed) gets a different label in the database than a document that timed out talking to an API. On-call sees the difference at a glance and does not waste a retry on a file that will never succeed.&lt;/p&gt;

&lt;p&gt;Status writers function as a product-level state machine. Failure modes become product states.&lt;/p&gt;

&lt;p&gt;Before any work runs, the child inserts a database row with status &lt;code&gt;processing&lt;/code&gt; and updates the run's display label to match the row's ID. If the run crashes anywhere downstream, the row stays, with the step name and stack trace attached. Nothing disappears between systems.&lt;/p&gt;

&lt;h2&gt;
  
  
  PDFs we did not write
&lt;/h2&gt;

&lt;p&gt;The PDF specification is permissive and two decades old. Most generators violate parts of it. A few violate most of it. Each one looks valid until you try to read it.&lt;/p&gt;

&lt;p&gt;Construction PDFs are particularly varied. A typical project contains:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Tender documents (long, text-heavy, sometimes with embedded scans)&lt;/li&gt;
&lt;li&gt;Permit applications (form-heavy, populated through Acrobat or web tools)&lt;/li&gt;
&lt;li&gt;Site surveys (scanned or photographed, sometimes rotated)&lt;/li&gt;
&lt;li&gt;Construction drawings (A1, A0, vector-heavy with thousands of dimension annotations)&lt;/li&gt;
&lt;li&gt;BIM-exported sheet sets (huge, layered, often with missing fonts)&lt;/li&gt;
&lt;li&gt;Material specifications (mixed text and tables)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;After a year, the failure modes are stable. Five categories cover most of it: form-heavy, scanned, structurally corrupt, oversized, and mislabeled.&lt;/p&gt;

&lt;h3&gt;
  
  
  Form-heavy
&lt;/h3&gt;

&lt;p&gt;Open a permit application in a browser and you see numbers. Extract the text programmatically and you see &lt;code&gt;Field1: [empty]&lt;/code&gt;. The visible content lives in form widgets, not page content; the two are stored separately inside the file, and a parser that does not know to look for widget annotations returns the empty shell.&lt;/p&gt;

&lt;p&gt;The flattener inspects every page for widget and free-text annotations. If any exist, PyMuPDF bakes them into static page content and re-serialises the file with its highest garbage-collection pass: unreferenced objects removed, object numbers compacted, cross-reference table rebuilt.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;needs_flattening&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fitz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;widgets&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;first_annot&lt;/span&gt; &lt;span class="ow"&gt;is&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;flatten&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fitz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;apply_redactions&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;  &lt;span class="c1"&gt;# bakes widget values into content
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tobytes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;garbage&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;deflate&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;clean&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A particular subtlety surfaced in production: non-UTF-8 annotation names cause PyMuPDF to throw &lt;code&gt;SystemError&lt;/code&gt;, &lt;code&gt;UnicodeDecodeError&lt;/code&gt;, or &lt;code&gt;RuntimeError&lt;/code&gt; from inside the widget enumeration call. The detection catches all three and uses them as signal.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;needs_flattening_safe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fitz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;needs_flattening&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nf"&gt;except &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nb"&gt;SystemError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;UnicodeDecodeError&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;True&lt;/span&gt;  &lt;span class="c1"&gt;# crash on enumeration = file definitely needs flattening
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A parser crash on widget enumeration is, in practice, a near-certain indicator that the file needs flattening. The crash became the heuristic.&lt;/p&gt;

&lt;h3&gt;
  
  
  Scanned
&lt;/h3&gt;

&lt;p&gt;Site surveys arrive looking like real PDFs. They are not. There is no text layer. A naive parser returns an empty string and reports success. The pipeline detects scans before any extraction attempt with a text-density heuristic.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;SCAN_THRESHOLD&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mf"&gt;0.5&lt;/span&gt;  &lt;span class="c1"&gt;# at most half the pages have text -&amp;gt; treat as scan
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;is_scanned&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fitz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="bp"&gt;False&lt;/span&gt;
    &lt;span class="n"&gt;with_text&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;doc&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_text&lt;/span&gt;&lt;span class="p"&gt;().&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;with_text&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&lt;/span&gt; &lt;span class="n"&gt;SCAN_THRESHOLD&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The threshold is set above zero so mixed documents (a scanned cover sheet stapled to typed pages, common in tender packs) route correctly rather than failing the binary check on page one. OCR output is written back to the same storage key so reprocessing skips the OCR step on a second attempt.&lt;/p&gt;

&lt;h3&gt;
  
  
  Structurally corrupt
&lt;/h3&gt;

&lt;p&gt;Print-to-PDF drivers occasionally emit documents with one or more pages containing empty content streams. The extraction service signals permanent failure and refuses the whole file. The pipeline does not give up. It scans every page for zero-length content streams, removes the offending pages, rebuilds the file, and retries.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;repair_empty_pages&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fitz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Document&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="sh"&gt;"""&lt;/span&gt;&lt;span class="s"&gt;Returns True if the document was modified.&lt;/span&gt;&lt;span class="sh"&gt;"""&lt;/span&gt;
    &lt;span class="n"&gt;bad&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;read_contents&lt;/span&gt;&lt;span class="p"&gt;()]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;reversed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bad&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;  &lt;span class="c1"&gt;# delete in reverse so indices stay valid
&lt;/span&gt;        &lt;span class="n"&gt;doc&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;delete_page&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;bad&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A forty-page drawing set with two corrupt pages becomes a thirty-eight-page document that succeeds.&lt;/p&gt;

&lt;p&gt;The general principle here is worth naming: do not try to predict which inputs are corrupt. Let the service tell you, then repair surgically. The same principle returns later for rate limits.&lt;/p&gt;

&lt;h3&gt;
  
  
  Oversized
&lt;/h3&gt;

&lt;p&gt;When the extraction service signals page-count exceeded or payload too large, the pipeline splits the document into hundred-page chunks and submits each independently with up to five attempts per chunk (exponential backoff, 4 to 90 seconds). The full document is always tried first. Chunking is reactive, not proactive. Most files are not large; the ones that are land in their own code path without slowing down the median.&lt;/p&gt;

&lt;h3&gt;
  
  
  The waterfall
&lt;/h3&gt;

&lt;p&gt;These behaviours feed a priority waterfall over three extraction tiers:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Primary structured extraction service.&lt;/strong&gt; Handles the common case. Returns text blocks with bounding boxes, tables, and figure renditions.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Secondary extraction service.&lt;/strong&gt; Takes over on permanent failure from tier 1. Its own retry budget.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Vision-model fallback.&lt;/strong&gt; Activates if tier 2 also exhausts.&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Three tiers down and still failing, the document is marked permanently unprocessable and surfaced to the user. A retry at one tier never consumes the budget of another. This sounds obvious until you find a system where it does.&lt;/p&gt;

&lt;p&gt;Office files (&lt;code&gt;.docx&lt;/code&gt;, &lt;code&gt;.doc&lt;/code&gt;, &lt;code&gt;.xlsx&lt;/code&gt;, &lt;code&gt;.xls&lt;/code&gt;, &lt;code&gt;.pptx&lt;/code&gt;, &lt;code&gt;.ppt&lt;/code&gt;) are converted to PDF through a cloud conversion API before they enter the extraction path. Three task-level retries with 10, 30, 60 second delays. Conversion errors are classified into permanent (genuinely corrupt source files) and transient (API issues) using the API's response metadata. Email files are converted in-process using a local parser and a PDF rendering library: header and body onto A4 pages, no external API call. The in-process path keeps email content off third-party infrastructure. Customers do not always realise their PDF pipelines contain their inbox.&lt;/p&gt;

&lt;h2&gt;
  
  
  Tiling large-format pages
&lt;/h2&gt;

&lt;p&gt;A single A0 sheet rasterised at 300 DPI produces an image roughly 9900 x 14000 pixels: around 140 megapixels. Vision APIs enforce hard limits on both total pixel count and the length of any single side. Sending the raw image does not produce an error in every case. Some endpoints silently crop or downsample before inference, which means the model sees a different image than the one you submitted, with no signal that anything went wrong. The only safe path is to detect large-format pages before rasterisation and subdivide them.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Detection at the geometry level&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;PDF coordinates are measured in points (1 pt = 1/72 inch). PyMuPDF exposes page dimensions directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;_LARGE_FORMAT_MIN_POINTS&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;1190&lt;/span&gt;  &lt;span class="c1"&gt;# long edge of A3 landscape in PDF points
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;is_large_format_page&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;width&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;rect&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;height&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;_LARGE_FORMAT_MIN_POINTS&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The threshold is 1190 points because A3 landscape measures 1190 x 842 pt. Any page whose longest side meets or exceeds that value is treated as a large-format page and routed to tiling. PDF metadata is not used for this decision. Pages exported from BIM tools routinely carry an A4 &lt;code&gt;/MediaBox&lt;/code&gt; while the actual geometry describes an A1 drawing.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Tile grid construction&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The grid divides the page into equal-sized cells, each no larger than the threshold constant:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;compute_tile_grid&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width_pt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;height_pt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;cols&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;width_pt&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;_LARGE_FORMAT_MIN_POINTS&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;rows&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;max&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;math&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;ceil&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;height_pt&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;_LARGE_FORMAT_MIN_POINTS&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="n"&gt;tile_w&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;width_pt&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;
    &lt;span class="n"&gt;tile_h&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;height_pt&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;rows&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cols&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tile_w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;tile_h&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;An A1 page (roughly 1684 x 2384 pt) produces a 2 x 2 grid (4 tiles). An A0 page (roughly 2384 x 3370 pt) produces a 2 x 3 grid (6 tiles). Tiles are edge-to-edge with zero overlap. This means any entity whose geometry straddles a tile boundary (a dimension string, a room label, a wall line) is split across two tiles. There is no stitching pass after inference. The grounding mechanism described in the next section is what makes split entities recoverable: if the text layer was extracted, the dimension is already present verbatim in both tiles' grounding context regardless of where the tile cut falls.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Rasterisation&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Tiles are rendered at a fixed 200 DPI, not the 300 DPI used for normal pages:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;fitz&lt;/span&gt;  &lt;span class="c1"&gt;# PyMuPDF
&lt;/span&gt;
&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;prepare_tile&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;clip_rect&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;fitz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Rect&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;mat&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;fitz&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Matrix&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;72&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;200&lt;/span&gt; &lt;span class="o"&gt;/&lt;/span&gt; &lt;span class="mi"&gt;72&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;pix&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;get_pixmap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;matrix&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;mat&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;clip&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="n"&gt;clip_rect&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;pix&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;tobytes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;png&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;_downscale_if_needed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;  &lt;span class="c1"&gt;# enforces 3072 px longest-edge cap
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;200 DPI is chosen because each tile is already a sub-region of the full page. At that resolution, a 1190 pt tile rasterises to roughly 3306 pixels on that edge, which exceeds the 3072 px per-dimension cap. The &lt;code&gt;_downscale_if_needed&lt;/code&gt; guard re-encodes the tile to fit before submission. Using 300 DPI on the same tile would produce an even larger intermediate image, burning CPU for no benefit. Normal (non-tiled) pages start at 300 DPI and scale down adaptively to stay within a pixel-area budget just under 180 million total pixels and a longest-edge cap of 3072 pixels.&lt;/p&gt;

&lt;p&gt;Tiles are encoded as RGB PNG. PNG is lossless and avoids format-rejection errors from CMYK or palette-mode inputs that some vision APIs refuse with a 400.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Batch submission and polling&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Tiled pages are not submitted through the synchronous inference path. They go to a batch inference endpoint backed by a separate quota pool. Batch requests draw on a quota that is independent of the real-time API, so a burst of A0 drawings does not saturate the quota used by the rest of the pipeline.&lt;/p&gt;

&lt;p&gt;Each batch job gets a unique scoped prefix in object storage. Tile images are uploaded in sequential insertion order. A manifest is written containing one request object per tile, each referencing its storage reference and inference parameters. The batch platform writes results to the same prefix as one or more result files.&lt;/p&gt;

&lt;p&gt;The poller checks job status every 30 seconds. The timeout ceiling is 6 hours. On timeout, the job is cancelled and a &lt;code&gt;TimeoutError&lt;/code&gt; is raised. The staging prefix is left intact for inspection.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Reassembly&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Output lines from the batch are not guaranteed to arrive in submission order. Each input request embeds an identifier that maps back to its insertion index. The collector builds a reverse-lookup map from identifier to index and places each response into a pre-allocated list at the correct position:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;id_to_index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;record&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nb"&gt;id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;record&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;enumerate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;
&lt;span class="n"&gt;descriptions&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;""&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;output_lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;request_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;request_id&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;idx&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;id_to_index&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;request_id&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="n"&gt;descriptions&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;idx&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;response&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The final list is in original tile order, which maps directly to reading order across the page grid.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Partial success is a hard failure&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;If any tile slot is still empty after all output files are consumed, the batch runner raises &lt;code&gt;RuntimeError&lt;/code&gt; and returns nothing:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;produced&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;descriptions&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;d&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;produced&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;RuntimeError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Batch incomplete: &lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;produced&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt;/&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;records&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="s"&gt; tiles returned&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;A document missing one tile out of six would look complete to a downstream consumer. In a construction context that means a floor plan with a silent gap in the extracted data. The pipeline raises rather than persist that state. The task-level retry will requeue the entire document.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Cleanup&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The staging prefix is not deleted in the error path. A storage lifecycle policy reclaims any prefix older than a fixed TTL. The TTL covers the longest plausible batch job window plus inspection time. This keeps error-path code simple and ensures orphaned prefixes from crashed workers are cleaned up without any explicit handler.&lt;/p&gt;




&lt;h2&gt;
  
  
  Blueprint understanding
&lt;/h2&gt;

&lt;p&gt;A single A1 sheet combines dimension chains at 0°, 90°, and 45° rotation; room labels; hatch patterns; section cut symbols; and revision clouds, all at different scales within the same sheet. The text density is high but non-uniform. Symbols carry meaning that differs by convention and trade. A "300" in one region is a room number; in another it is a dimension in millimetres.&lt;/p&gt;

&lt;p&gt;Pure vision inference on these inputs produces two failure modes. First, the model misreads values that are visually ambiguous at the rendered resolution (rotated text, condensed fonts, overlapping annotations). Second, the model invents plausible-looking numbers when it is uncertain. In construction, a wrong dimension is a specification error. It enters procurement quantities and fabrication orders. The liability exposure from a single incorrect quantity in a bill of materials is orders of magnitude larger than a missing word in a document summary.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The hybrid path&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;When the structured extraction service processes a PDF successfully, it returns both the rendered page images and a list of text blocks, each with a bounding box in PDF point coordinates. The pipeline does not discard this structured output when it calls the vision model. It uses it as ground truth.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Spatial join&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;For each figure (a rendered tile or a figure rendition), the pipeline finds all text blocks whose bounding boxes overlap the figure's region:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;bounds_overlap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="c1"&gt;# a and b are [x0, y0, x1, y1] in PDF points
&lt;/span&gt;    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="ow"&gt;or&lt;/span&gt; &lt;span class="n"&gt;b&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;a&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;collect_grounding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;figure_bounds&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;float&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;text_blocks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;dict&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;
    &lt;span class="n"&gt;page&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;int&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]:&lt;/span&gt;
    &lt;span class="n"&gt;parts&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;text_blocks&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;page&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="n"&gt;page&lt;/span&gt; &lt;span class="ow"&gt;and&lt;/span&gt; &lt;span class="nf"&gt;bounds_overlap&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;figure_bounds&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;bounding_box&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;]):&lt;/span&gt;
            &lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;block&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;text&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;parts&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No spatial index is needed. With tens to low hundreds of text blocks per page, a linear scan adds under a millisecond per figure. The result is the set of text blocks whose bounding boxes overlap the figure's region.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Grounded prompt assembly&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;The collected strings are deduplicated before injection. Any word appearing more than 5 times across the concatenated text is collapsed to a single occurrence, and identical lines are removed. This handles the common case of warehouse floor plans and grid-heavy drawings where a single label repeats across every bay.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;collections&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;Counter&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;deduplicate_grounding&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;list&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;])&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;combined&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;parts&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;combined&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;split&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;freq&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nc"&gt;Counter&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;seen_words&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;filtered_words&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;words&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;freq&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;seen_words&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;continue&lt;/span&gt;
            &lt;span class="n"&gt;seen_words&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;filtered_words&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt; &lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;filtered_words&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;seen_lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;set&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;lines&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;splitlines&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;line&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;seen_lines&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;seen_lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;add&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;line&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;join&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;lines&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;assemble_grounded_prompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;base_prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;grounding_text&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="ow"&gt;not&lt;/span&gt; &lt;span class="n"&gt;grounding_text&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;strip&lt;/span&gt;&lt;span class="p"&gt;():&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;base_prompt&lt;/span&gt;
    &lt;span class="nf"&gt;return &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;base_prompt&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
        &lt;span class="sa"&gt;f&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;Extracted text in this region:&lt;/span&gt;&lt;span class="se"&gt;\n&lt;/span&gt;&lt;span class="si"&gt;{&lt;/span&gt;&lt;span class="n"&gt;grounding_text&lt;/span&gt;&lt;span class="si"&gt;}&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;
    &lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;With grounding in place, the vision model maps regions and describes visual structure. The numbers come from the text layer, not from model inference. The model cannot invent a number that was not in the grounding text.&lt;/p&gt;

&lt;p&gt;The grounding text is also appended to the stored segment output, not just the prompt. This means the downstream vector index writer receives both the vision narrative and the verified dimension values in the same chunk. Retrieval does not depend on the model having faithfully reproduced the number; the number is present in the chunk regardless.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The 429 chain under blueprint-heavy load&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Blueprint-heavy workloads arrive in bursts. A tender pack or a permit application can contain dozens of A0 and A1 sheets. When several such archives land at once, the tiling path generates hundreds of synchronous vision calls in a short window.&lt;/p&gt;

&lt;p&gt;The retry logic operates at two levels. The inner level handles transient quota exhaustion on a single model:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tenacity&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;retry&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;stop_after_attempt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;wait_random_exponential&lt;/span&gt;
&lt;span class="kn"&gt;from&lt;/span&gt; &lt;span class="n"&gt;tenacity&lt;/span&gt; &lt;span class="kn"&gt;import&lt;/span&gt; &lt;span class="n"&gt;retry_if_exception&lt;/span&gt;

&lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;is_rate_limit&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;getattr&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;status_code&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="mi"&gt;429&lt;/span&gt;

&lt;span class="nd"&gt;@retry&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;retry&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;retry_if_exception&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;is_rate_limit&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;wait&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;wait_random_exponential&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;multiplier&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;max&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="mi"&gt;20&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;stop&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="nf"&gt;stop_after_attempt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="n"&gt;reraise&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;call_vision_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image_bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="p"&gt;...&lt;/span&gt;  &lt;span class="c1"&gt;# vendor call
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three attempts, randomised exponential backoff, maximum 20 seconds between retries. If all three fail, the exception propagates. Randomisation prevents the thundering herd pattern where multiple workers retry in lockstep.&lt;/p&gt;

&lt;p&gt;The outer level rotates through a list of 4 model candidates on both HTTP 429 (after the inner retries are exhausted) and HTTP 503 (immediately, without consuming the retry budget):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;describe_with_rotation&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;model_candidates&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;tuple&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;...],&lt;/span&gt;
    &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="n"&gt;image_bytes&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;bytes&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
    &lt;span class="n"&gt;last_exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;Exception&lt;/span&gt; &lt;span class="o"&gt;|&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="bp"&gt;None&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;model_candidates&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;try&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;call_vision_model&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;image_bytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;RateLimitError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# inner retries exhausted; move to next model
&lt;/span&gt;            &lt;span class="n"&gt;last_exc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
        &lt;span class="k"&gt;except&lt;/span&gt; &lt;span class="n"&gt;ServiceUnavailableError&lt;/span&gt; &lt;span class="k"&gt;as&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
            &lt;span class="c1"&gt;# 503: skip immediately, no retry budget consumed
&lt;/span&gt;            &lt;span class="n"&gt;last_exc&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;exc&lt;/span&gt;
            &lt;span class="k"&gt;continue&lt;/span&gt;
    &lt;span class="k"&gt;raise&lt;/span&gt; &lt;span class="nc"&gt;VisionServiceError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="s"&gt;all model candidates exhausted&lt;/span&gt;&lt;span class="sh"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;from&lt;/span&gt; &lt;span class="n"&gt;last_exc&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each candidate in the rotation list draws on an independent quota pool. A 429 on one model does not predict a 429 on another. A 503 indicates a model-level availability issue, not a quota issue, so retrying the same model would waste time.&lt;/p&gt;

&lt;p&gt;The outer task-level retry (5 attempts at 10, 30, 60, 120, and 300 seconds) handles failures that survive both the inner backoff and the model rotation, such as a batch job timeout or a network partition.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Why wrong numbers are the failure mode that matters&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A missing tile means a blank region in the extraction output. That is detectable: the document has fewer segments than expected, downstream validation flags it, the task retries. A hallucinated dimension is undetectable without the source document. It looks like a valid result. It passes schema validation. It enters retrieval. If a quantity surveyor or a structural engineer queries it and acts on the number, the error has left the system.&lt;/p&gt;

&lt;p&gt;Construction drawings with one wrong measurement are litigation problems. Grounding makes the pipeline's numeric accuracy independent of model behavior on any given input.&lt;/p&gt;

&lt;h2&gt;
  
  
  Orchestration under load
&lt;/h2&gt;

&lt;p&gt;At a hundred thousand documents a month, coordination failures between pipeline runs cause more incidents than malformed documents do. The PDFs are the easy part. The coordination between PDFs is the system.&lt;/p&gt;

&lt;p&gt;The first sign something was wrong was &lt;code&gt;PoolTimeout&lt;/code&gt; errors appearing before a single child pipeline run had started. The first time a 5,000-file archive landed, every launch request went into the same HTTP connection pool and the pool ran out before the first child flow began. The arithmetic was fast and ugly: five thousand task-launch requests, a connection pool sized for ten, an orchestrator API that sustains roughly ten concurrent launches.&lt;/p&gt;

&lt;p&gt;The coordinator now caps in-flight launches and rate-limits them with a semaphore and a sleep inside the same block.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;launch_semaphore&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nc"&gt;Semaphore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;10&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;launch_child&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_meta&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;with&lt;/span&gt; &lt;span class="n"&gt;launch_semaphore&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mf"&gt;0.1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;orchestrator&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;dispatch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;file_meta&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The semaphore controls connection pool depth. The sleep controls burst rate. Concurrency control and rate limiting look like the same problem until they fail.&lt;/p&gt;

&lt;p&gt;After fan-out, the coordinator waits for the children. Same problem in reverse: five hundred simultaneous long-poll coroutines would hold five hundred connections open. The coordinator groups child run references into batches of fifty and awaits each batch with &lt;code&gt;asyncio.gather(return_exceptions=True)&lt;/code&gt;.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight python"&gt;&lt;code&gt;&lt;span class="n"&gt;BATCH_SIZE&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;50&lt;/span&gt;

&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;def&lt;/span&gt; &lt;span class="nf"&gt;wait_for_all&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;runs&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
    &lt;span class="n"&gt;results&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="nf"&gt;range&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nf"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;runs&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;BATCH_SIZE&lt;/span&gt;&lt;span class="p"&gt;):&lt;/span&gt;
        &lt;span class="n"&gt;batch&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;runs&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="n"&gt;i&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="n"&gt;BATCH_SIZE&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
        &lt;span class="n"&gt;results&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;extend&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="n"&gt;asyncio&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;gather&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;wait_for_run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="ow"&gt;in&lt;/span&gt; &lt;span class="n"&gt;batch&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
            &lt;span class="n"&gt;return_exceptions&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="bp"&gt;True&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;results&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Exceptions are collected per child rather than propagated, so a single failure does not abort the wait for the rest of the batch.&lt;/p&gt;

&lt;p&gt;Backpressure between the coordinator and the cluster is handled by the orchestrator's own concurrency settings, not by the coordinator. Container task capacity forms the lower bound. Adding a coordinator-side cap would create another constant to tune as load changes.&lt;/p&gt;

&lt;p&gt;The vector index writer is the one place backpressure is deliberately turned off. During indexing bursts, the writer bypasses the index service's built-in flow control and absorbs transient overload with task-level retries instead. Service-side throttling would smooth the load curve and protect the storage tier from peaks. Higher peak load on the storage tier is an acceptable tradeoff for predictable end-to-end latency. Users observe indexing latency directly.&lt;/p&gt;

&lt;p&gt;Completion is harder than fan-out. Multiple ZIP archives for the same project can arrive concurrently, and none of them can signal "project ready" until every archive has finished. The fence is two database queries:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="c1"&gt;-- 1. mark this upload processed (idempotent on second call)&lt;/span&gt;
&lt;span class="k"&gt;UPDATE&lt;/span&gt; &lt;span class="n"&gt;uploads&lt;/span&gt;
&lt;span class="k"&gt;SET&lt;/span&gt;    &lt;span class="n"&gt;processed_at&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;NOW&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;  &lt;span class="n"&gt;project_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt;  &lt;span class="n"&gt;upload_id&lt;/span&gt;  &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt;  &lt;span class="n"&gt;processed_at&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;

&lt;span class="c1"&gt;-- 2. count remaining unprocessed uploads for this project&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;COUNT&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt;   &lt;span class="n"&gt;uploads&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt;  &lt;span class="n"&gt;project_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;
  &lt;span class="k"&gt;AND&lt;/span&gt;  &lt;span class="n"&gt;processed_at&lt;/span&gt; &lt;span class="k"&gt;IS&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Before the fence closes, the pipeline checks whether this is the last outstanding upload; if so, a cross-document analysis step runs synchronously before the completion timestamp is stamped, so results are visible the moment the project is marked ready. Only when the count reaches zero does a final call fire to mark the project indexed and send a notification. The queries are idempotent. The downstream analysis step is gated by the same fence, so it sees a complete snapshot rather than a moving target.&lt;/p&gt;

&lt;p&gt;Boring SQL. Surprisingly load-bearing. The fence is what saves you.&lt;/p&gt;

&lt;p&gt;Observability is wired into the database, not into a separate log pipeline. Each document pipeline run writes its run identifier into the database before processing starts and updates its display label to match the database-assigned document ID. The link is bidirectional and queryable without polling. Treating the database as the source of truth and tracing as decoration is contrarian; most teams do the opposite. It is also the right call for any batch workload where the database already tracks state for product reasons. Two sources of truth is one too many.&lt;/p&gt;

&lt;p&gt;Distributed tracing spans the full pipeline. Span data is buffered in process and exported at run completion; a crash before export loses that run's spans. Export failures are silently discarded so an observability outage cannot kill a document run.&lt;/p&gt;

&lt;p&gt;The completion notification carries total file count, success count, failure count, and up to ten failed document records with step name and full stack trace. Failed records are fetched from the database at notification time rather than accumulated in memory, which keeps the coordinator's working set bounded regardless of failure rate.&lt;/p&gt;

&lt;h2&gt;
  
  
  What scales
&lt;/h2&gt;

&lt;p&gt;Two things in this system cost the most to get right, and they cost the most for the same reason: the feedback loop runs at the speed of production traffic.&lt;/p&gt;

&lt;p&gt;The error taxonomy, because every decision about whether to retry depends on it. Mark a failure permanent and a week later you find out the fix was a one-line threshold change. Mark it transient and you queue retry work that will never succeed. Neither wrong answer announces itself.&lt;/p&gt;

&lt;p&gt;The completion fence, because partial signals are invisible. A downstream consumer reading a half-indexed project does not know it is reading a partial result.&lt;/p&gt;

&lt;p&gt;Both belong to a category of bug that no amount of load testing surfaces until production traffic starts. At this scale, the cleaner your test set, the more dangerous the gap.&lt;/p&gt;

&lt;p&gt;Everything else is the cost of accepting input from the world.&lt;/p&gt;

</description>
      <category>python</category>
      <category>architecture</category>
      <category>pdf</category>
      <category>llm</category>
    </item>
    <item>
      <title>Control plane, data plane, durable storage: structuring a managed search engine in Rust</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Fri, 05 Jun 2026 09:56:44 +0000</pubDate>
      <link>https://dev.to/arif/control-plane-data-plane-durable-storage-structuring-a-managed-search-engine-in-rust-4b3d</link>
      <guid>https://dev.to/arif/control-plane-data-plane-durable-storage-structuring-a-managed-search-engine-in-rust-4b3d</guid>
      <description>&lt;p&gt;When you ask a search service to "create an index," it reads like a single write to a single place. While building a managed search engine in Rust this year, I started treating that one word as three separate facts, each owned by a different system: the record that an index exists, the index itself, and a durable copy for when the machine holding it dies.&lt;/p&gt;

&lt;p&gt;Pulling those apart changed how I reason about failure and scaling, and it sharpened my sense of what a relational database is actually good for. Here is the structure I settled on, and the path a single create request takes through it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  create request
       |
       v
  Control plane   .  Postgres          does it exist? who owns it? quota
       |
       |  commit first (cheap, authoritative)
       v
  Data plane      .  tantivy           the inverted index, BM25, local disk
       |
       |  materialize second (expensive, can fail)
       v
  Durability      .  object storage    authoritative copy, survives the node
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Three planes, one job each
&lt;/h2&gt;

&lt;p&gt;The control plane is Postgres. It holds metadata and nothing else: which indexes exist, which tenant owns each one, the field configuration, and the quota. The table is tiny.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;TABLE&lt;/span&gt; &lt;span class="n"&gt;indexes&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;         &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;PRIMARY&lt;/span&gt; &lt;span class="k"&gt;KEY&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;   &lt;span class="c1"&gt;-- a uuid, the real identifier&lt;/span&gt;
    &lt;span class="n"&gt;org_id&lt;/span&gt;     &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;-- the owning tenant&lt;/span&gt;
    &lt;span class="n"&gt;name&lt;/span&gt;       &lt;span class="nb"&gt;TEXT&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;      &lt;span class="c1"&gt;-- a human label, not unique&lt;/span&gt;
    &lt;span class="n"&gt;settings&lt;/span&gt;   &lt;span class="n"&gt;JSONB&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;     &lt;span class="c1"&gt;-- field config, stored as an opaque blob&lt;/span&gt;
    &lt;span class="n"&gt;created_at&lt;/span&gt; &lt;span class="n"&gt;TIMESTAMPTZ&lt;/span&gt; &lt;span class="k"&gt;NOT&lt;/span&gt; &lt;span class="k"&gt;NULL&lt;/span&gt; &lt;span class="k"&gt;DEFAULT&lt;/span&gt; &lt;span class="n"&gt;now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="k"&gt;CREATE&lt;/span&gt; &lt;span class="k"&gt;INDEX&lt;/span&gt; &lt;span class="n"&gt;indexes_org_idx&lt;/span&gt; &lt;span class="k"&gt;ON&lt;/span&gt; &lt;span class="n"&gt;indexes&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;org_id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Look at what is missing. There is no document text and no tsvector. The settings column is JSONB, but I never query into it. It gets read back whole and parsed in the application. Postgres knows that an index exists and who owns it. It has no idea how to search one.&lt;/p&gt;

&lt;p&gt;That indexes_org_idx is worth an aside, because the word "index" trips people up here. It is a plain B-tree that speeds up "list the indexes for this tenant." It has nothing to do with full-text search. Two unrelated meanings of the same word, sitting in the same paragraph.&lt;/p&gt;

&lt;p&gt;The data plane is tantivy, a Lucene-class search library for Rust. The real inverted index lives here, on local disk, memory-mapped. Document text, the BM25 postings, the term dictionary, the stored originals: all on disk in the data plane, none of it in Postgres.&lt;/p&gt;

&lt;p&gt;Durability is object storage, an S3-compatible bucket. I treat the local tantivy files as a working copy. The authoritative durable copy lives in the bucket. If the box holding the working copy vanishes, the next boot rebuilds it from object storage.&lt;/p&gt;

&lt;p&gt;One more piece sits in memory: a map from index id to its open tantivy handle, so a query does not reopen files on every request. That map is not a source of truth. It is reconstructed from the control plane at startup.&lt;/p&gt;

&lt;h2&gt;
  
  
  Walking a create through all three
&lt;/h2&gt;

&lt;p&gt;The HTTP handler does almost nothing interesting. It checks that the API key has write permission and belongs to a tenant, then hands off to the index manager. The ordering inside the manager is where the design lives.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight rust"&gt;&lt;code&gt;&lt;span class="k"&gt;pub&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="k"&gt;fn&lt;/span&gt; &lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;org_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="nb"&gt;str&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;IndexSettings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;-&amp;gt;&lt;/span&gt; &lt;span class="nb"&gt;Result&lt;/span&gt;&lt;span class="o"&gt;&amp;lt;&lt;/span&gt;&lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nf"&gt;validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;          &lt;span class="c1"&gt;// name length, language, vector dims&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;new_uuid&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="c1"&gt;// 1. Control plane commits first. This insert is also the quota gate.&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.catalog&lt;/span&gt;&lt;span class="nf"&gt;.insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;org_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nd"&gt;bail!&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"index quota reached"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// 2. Data plane materializes second.&lt;/span&gt;
    &lt;span class="k"&gt;let&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;match&lt;/span&gt; &lt;span class="nn"&gt;FtsIndex&lt;/span&gt;&lt;span class="p"&gt;::&lt;/span&gt;&lt;span class="nf"&gt;open&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="nf"&gt;.dir_for&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="nf"&gt;.clone&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c1"&gt;// Opening the on-disk index failed. Roll back the catalog row&lt;/span&gt;
            &lt;span class="c1"&gt;// so we never leave a record pointing at nothing.&lt;/span&gt;
            &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.catalog&lt;/span&gt;&lt;span class="nf"&gt;.delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="k"&gt;.await&lt;/span&gt;&lt;span class="o"&gt;?&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;};&lt;/span&gt;

    &lt;span class="c1"&gt;// 3. Publish the open handle so queries can find it.&lt;/span&gt;
    &lt;span class="k"&gt;self&lt;/span&gt;&lt;span class="py"&gt;.live&lt;/span&gt;&lt;span class="nf"&gt;.write&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="nf"&gt;.insert&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="nf"&gt;.clone&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;Entry&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="n"&gt;org_id&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;org_id&lt;/span&gt;&lt;span class="nf"&gt;.into&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;index&lt;/span&gt; &lt;span class="p"&gt;});&lt;/span&gt;
    &lt;span class="nf"&gt;Ok&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The shape here is simple to say: commit the cheap authoritative thing first, materialize the expensive thing second, and undo the first if the second fails. The catalog row is a few bytes in a transactional database. Opening a tantivy index touches the filesystem and can fail for boring reasons like no disk, bad permissions, or a corrupt directory. Writing the row first means the quota is enforced atomically. Rolling it back on failure avoids the worst state in this whole system: a row in Postgres that claims an index exists with no data behind it.&lt;/p&gt;

&lt;p&gt;After the manager returns, the handler does one more thing. It syncs the new index directory to object storage, and if that sync fails it returns 503 instead of 200. I would rather tell the caller "this did not durably succeed" than report success for something that only exists on one disk.&lt;/p&gt;

&lt;h2&gt;
  
  
  The quota check is a single statement
&lt;/h2&gt;

&lt;p&gt;The quota gate looks small, and getting it wrong is a classic race. The naive version reads the count, compares it to the limit, then inserts:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;count = SELECT count(*) ...        // says 4 of 5 used
if count &amp;lt; limit: INSERT ...       // ok, insert
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Run two creates at the same moment and both read "4 of 5," both pass the check, and both insert. Now the tenant has 6 indexes on a limit of 5. The check and the write were two steps with a gap between them, and concurrency lives in that gap.&lt;/p&gt;

&lt;p&gt;The fix is to make the check and the insert the same statement, so the database serializes them:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;INSERT&lt;/span&gt; &lt;span class="k"&gt;INTO&lt;/span&gt; &lt;span class="n"&gt;indexes&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;org_id&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;settings&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;4&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="k"&gt;count&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;indexes&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;org_id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;max_indexes&lt;/span&gt; &lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;organizations&lt;/span&gt; &lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;2&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;If the tenant is at the limit, the WHERE is false, no row goes in, and rows_affected() comes back as 0. The application reads that zero as "quota reached." There is no window for two creates to both win, because there is no separate read to race against.&lt;/p&gt;

&lt;h2&gt;
  
  
  Why split metadata from the index at all
&lt;/h2&gt;

&lt;p&gt;The split costs something. Every create now writes to two systems, and I have to keep them consistent. The payoff shows up in three places.&lt;/p&gt;

&lt;p&gt;Search traffic never touches the control-plane database. A burst of queries hits the in-process tantivy reader and the local disk. It does not compete for connections with the database that authenticates requests and enforces quotas. The thing under the most load and the thing that must stay correct are physically different systems, and they scale on different axes.&lt;/p&gt;

&lt;p&gt;Failure has a clear blast radius. Losing the data plane is annoying but recoverable, since the durable copy is in object storage and the catalog still knows what should exist. Losing the control plane is the real outage, which is exactly why I keep it small, transactional, and boring. A few kilobytes per index in Postgres is cheap insurance for the gigabytes of index data it describes.&lt;/p&gt;

&lt;p&gt;Boot becomes a reconciliation. On startup the engine reads every row from the catalog and, for each one, opens the local copy or restores it from object storage if the local copy is missing. Then it sweeps the bucket for leftover data that has no matching catalog row, which is how a delete that got interrupted halfway gets cleaned up. The control plane is the list of what should exist, and startup makes reality match the list.&lt;/p&gt;

&lt;h2&gt;
  
  
  What a field actually turns into
&lt;/h2&gt;

&lt;p&gt;One detail from the data plane is worth showing, because it explains why search and a database want different things from the same string. When I create a text field that is both searchable and filterable, it becomes two physical fields in tantivy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;title       -&amp;gt;  analyzed field   (tokenized, lowercased, stemmed)   for BM25 matching
title__kw   -&amp;gt;  keyword field    (the raw string, untouched)        for exact filters
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Full-text matching wants "Running" to find "runs," so it tokenizes and stems. An exact filter on a brand name wants "ACME" to mean exactly "ACME," with no stemming and no folding. The same input needs opposite treatment depending on the question, so it gets stored twice, once for each job. Vectors are a third case. They never enter the tantivy schema at all; each vector field gets its own approximate-nearest-neighbor store on disk, beside the text index but separate from it.&lt;/p&gt;

&lt;h2&gt;
  
  
  What it costs to run this way
&lt;/h2&gt;

&lt;p&gt;The architecture has real rough edges, and I would rather name them than pretend otherwise.&lt;/p&gt;

&lt;p&gt;Writes commit synchronously. Each batch fsyncs and reloads the reader so documents are searchable the instant the call returns. That is good for correctness and bad for tiny frequent writes, so bulk loads have to batch thousands of documents per request or throughput falls off a cliff.&lt;/p&gt;

&lt;p&gt;Cold starts pay for the durability. After a deploy, an index restores from object storage and pages in from disk, so the first handful of queries run slower until the cache warms. Warm and cold are genuinely different latency regimes.&lt;/p&gt;

&lt;p&gt;A single index lives on a single node. There is no sharding of one index across machines yet. That is fine for tens of millions of documents on one large server and wrong for trillions. It is a deliberate ceiling, and worth stating plainly rather than hiding.&lt;/p&gt;

&lt;h2&gt;
  
  
  The takeaway
&lt;/h2&gt;

&lt;p&gt;If there is one rule worth keeping from this, it is the ordering. Commit the cheap, authoritative fact first. Materialize the expensive thing second. Roll back the first if the second fails.&lt;/p&gt;

&lt;p&gt;The record that an index exists belongs in a small transactional database. The index itself belongs in a purpose-built search library on fast local disk. The durable copy belongs in object storage. Wire a create through the three in that order and you get independent scaling and clean recovery for very little extra code, while the database goes back to doing the one thing it is best at: being the boring, correct source of truth for what exists.&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I build backend systems with this kind of separation: control planes, search, and correctness under concurrency. If that is the problem in front of you, &lt;a href="mailto:me@arif.sh"&gt;get in touch&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>rust</category>
      <category>search</category>
      <category>architecture</category>
      <category>backend</category>
    </item>
    <item>
      <title>I Benchmarked 7 LLMs on Cross-Domain MCP Orchestration. All 7 Found the Same Gap.</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Mon, 09 Mar 2026 03:38:02 +0000</pubDate>
      <link>https://dev.to/arif/i-benchmarked-7-llms-on-cross-domain-mcp-orchestration-all-7-found-the-same-gap-50l</link>
      <guid>https://dev.to/arif/i-benchmarked-7-llms-on-cross-domain-mcp-orchestration-all-7-found-the-same-gap-50l</guid>
      <description>&lt;h2&gt;
  
  
  The Problem
&lt;/h2&gt;

&lt;p&gt;The Model Context Protocol (MCP) has 106 official servers and 50+ academic papers. But every single study evaluates one server at a time. Nobody has asked: &lt;strong&gt;what happens when you compose multiple MCP servers for cross-domain tasks?&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;I decided to find out.&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Did
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Experiment A: Real MCP Tool Calls
&lt;/h3&gt;

&lt;p&gt;I connected 6 MCP servers simultaneously:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;arXiv&lt;/strong&gt; (academic preprints)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;PubMed&lt;/strong&gt; (biomedical literature)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Firecrawl&lt;/strong&gt; (web search and scraping)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Context7&lt;/strong&gt; (library documentation)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory&lt;/strong&gt; (persistent knowledge graph)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Filesystem&lt;/strong&gt; (file operations)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Then I ran 3 cross-domain case studies with 17 real tool calls and 0% failure rate.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The key discovery&lt;/strong&gt;: insights that are impossible within any single server become visible when you combine them. For example, arXiv papers on AI mental health chatbots and PubMed clinical trials on the same topic turned out to be two disconnected research communities working on the same problem with zero cross-citations. This gap was only visible by querying both databases and linking the results through a knowledge graph.&lt;/p&gt;

&lt;h3&gt;
  
  
  Experiment B: 7-Model Benchmark
&lt;/h3&gt;

&lt;p&gt;I took the data collected from those MCP servers and sent identical prompts to 7 different LLMs:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Model&lt;/th&gt;
&lt;th&gt;Latency&lt;/th&gt;
&lt;th&gt;Tokens&lt;/th&gt;
&lt;th&gt;KG Entities&lt;/th&gt;
&lt;th&gt;KG Relations&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;GPT-5.4&lt;/td&gt;
&lt;td&gt;54.7s&lt;/td&gt;
&lt;td&gt;2,352&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;14&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;15&lt;/strong&gt;&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;DeepSeek R1&lt;/td&gt;
&lt;td&gt;33.9s&lt;/td&gt;
&lt;td&gt;4,296&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;td&gt;4&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Mistral Large 3&lt;/td&gt;
&lt;td&gt;6.5s&lt;/td&gt;
&lt;td&gt;1,857&lt;/td&gt;
&lt;td&gt;9&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Llama 4 Maverick&lt;/td&gt;
&lt;td&gt;&lt;strong&gt;3.0s&lt;/strong&gt;&lt;/td&gt;
&lt;td&gt;1,374&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;td&gt;3&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Gemini 2.5 Flash&lt;/td&gt;
&lt;td&gt;15.6s&lt;/td&gt;
&lt;td&gt;4,592&lt;/td&gt;
&lt;td&gt;12&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Sonnet 4.5&lt;/td&gt;
&lt;td&gt;21.3s&lt;/td&gt;
&lt;td&gt;2,411&lt;/td&gt;
&lt;td&gt;13&lt;/td&gt;
&lt;td&gt;11&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Claude Haiku 4.5&lt;/td&gt;
&lt;td&gt;9.3s&lt;/td&gt;
&lt;td&gt;2,136&lt;/td&gt;
&lt;td&gt;8&lt;/td&gt;
&lt;td&gt;6&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every model successfully produced 5 cross-domain insights. But the depth varied wildly: GPT-5.4 built a knowledge graph with 14 entities and 15 relations, while Llama 4 Maverick produced only 3 entities in 3 seconds.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Finding All 7 Models Agreed On
&lt;/h2&gt;

&lt;p&gt;Here is the most interesting part. Without being told what to look for, &lt;strong&gt;all 7 models independently identified the same research gap&lt;/strong&gt;:&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;LangChain provides &lt;code&gt;MultiServerMCPClient&lt;/code&gt; (the mechanism). Benchmarks evaluate individual tool calls. But nobody documents &lt;strong&gt;composition patterns&lt;/strong&gt; for how to effectively orchestrate multi-server workflows.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;p&gt;I call this the &lt;strong&gt;mechanism-pattern gap&lt;/strong&gt;. It is like how HTTP existed for years before REST patterns told people how to use it effectively.&lt;/p&gt;

&lt;h2&gt;
  
  
  5 Composition Patterns
&lt;/h2&gt;

&lt;p&gt;From my experiments, I identified 5 recurring patterns:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Sequential Pipeline&lt;/strong&gt; - Server A output feeds Server B query&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Parallel Fan-Out&lt;/strong&gt; - Same query to multiple servers at once&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cross-Reference Verification&lt;/strong&gt; - Validate findings across servers&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Iterative Refinement&lt;/strong&gt; - Cross-server context narrows queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Domain Bridging&lt;/strong&gt; - Synthesize insights from unrelated domains&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;Domain Bridging is the most valuable. It produces insights that exist in neither source alone.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;All servers are open source and require no API keys:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;uvx arxiv-mcp-server
npx @cyanheads/pubmed-mcp-server
npx @modelcontextprotocol/server-memory
npx firecrawl-mcp
npx @upstash/context7-mcp
npx @modelcontextprotocol/server-filesystem
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The benchmark script and full results are on GitHub:&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;&lt;a href="https://github.com/doganarif/mcp-bench" rel="noopener noreferrer"&gt;github.com/doganarif/mcp-bench&lt;/a&gt;&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Paper: &lt;a href="https://doi.org/10.5281/zenodo.18917784" rel="noopener noreferrer"&gt;Zenodo DOI&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;If you are an arXiv endorser for cs.SE and this seems reasonable, I would appreciate the help: &lt;a href="https://arxiv.org/auth/endorse?x=RLMZ66" rel="noopener noreferrer"&gt;endorsement link&lt;/a&gt;&lt;/p&gt;




&lt;p&gt;&lt;em&gt;I am Arif Dogan, a software engineer and independent researcher in Berlin. I build things with Go, Python, and LLMs. &lt;a href="https://arif.sh" rel="noopener noreferrer"&gt;arif.sh&lt;/a&gt; | &lt;a href="https://github.com/doganarif" rel="noopener noreferrer"&gt;GitHub&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>llm</category>
      <category>mcp</category>
      <category>benchmark</category>
    </item>
    <item>
      <title>AI Agent Failures Are Distributed Systems Failures. Here's the Complete Mapping.</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Fri, 20 Feb 2026 01:57:01 +0000</pubDate>
      <link>https://dev.to/arif/ai-agent-failures-are-distributed-systems-failures-heres-the-complete-mapping-216k</link>
      <guid>https://dev.to/arif/ai-agent-failures-are-distributed-systems-failures-heres-the-complete-mapping-216k</guid>
      <description>&lt;p&gt;A few months into building an AI agent pipeline for a fintech client, we had a silent failure that cost us three days.&lt;/p&gt;

&lt;p&gt;The agent processed a document. Returned a confident-looking response. No error, no exception, no log entry that suggested anything was wrong. That output went into the next step, which used it to write a decision record. The decision record went downstream. Three steps later, a human reviewer flagged something that did not add up.&lt;/p&gt;

&lt;p&gt;The root cause was a hallucinated intermediate field. One field. The model had made up a plausible-sounding value for something it should have extracted from the document. Everything downstream had treated that invented value as real.&lt;/p&gt;

&lt;p&gt;I had seen this failure mode before. Not in AI. In distributed systems.&lt;/p&gt;

&lt;p&gt;The microservice that returns 200 OK while writing corrupted data. The queue consumer that marks a message processed before finishing the work. The retry that fires twice because the ACK never arrived.&lt;/p&gt;

&lt;p&gt;Same pattern. Different worker.&lt;/p&gt;




&lt;h2&gt;
  
  
  The mental model that changes everything
&lt;/h2&gt;

&lt;p&gt;An AI agent is not a program that figures things out. It is a graph of tasks, each processed by a nondeterministic worker. The worker is expensive, slow, and occasionally wrong in ways that look correct. Everything else about the system, including how it should be built, is a distributed systems problem.&lt;/p&gt;

&lt;p&gt;Here is the complete mapping:&lt;/p&gt;

&lt;div class="table-wrapper-paragraph"&gt;&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;
&lt;th&gt;Distributed systems concept&lt;/th&gt;
&lt;th&gt;AI agent equivalent&lt;/th&gt;
&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;
&lt;td&gt;Message queue&lt;/td&gt;
&lt;td&gt;Task buffer between agent steps&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Dead letter queue&lt;/td&gt;
&lt;td&gt;Failed tasks awaiting human review&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Idempotency key&lt;/td&gt;
&lt;td&gt;Deterministic task ID preventing duplicate LLM calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Circuit breaker&lt;/td&gt;
&lt;td&gt;Model fallback when primary LLM degrades or rate-limits&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Saga pattern + compensation&lt;/td&gt;
&lt;td&gt;Multi-step workflow rollback when a step fails midway&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Two-phase commit&lt;/td&gt;
&lt;td&gt;Human approval gate before any irreversible action&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Distributed tracing&lt;/td&gt;
&lt;td&gt;Per-step observability across agent execution&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Backpressure&lt;/td&gt;
&lt;td&gt;Concurrency limits on parallel LLM calls&lt;/td&gt;
&lt;/tr&gt;
&lt;tr&gt;
&lt;td&gt;Bulkhead pattern&lt;/td&gt;
&lt;td&gt;Cost and rate isolation between tenants or features&lt;/td&gt;
&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;&lt;/div&gt;

&lt;p&gt;Every one of these concepts exists because engineers got paged at 3am when they did not have it. AI agent engineers are getting paged for exactly the same reasons. They are building the same workarounds and calling them new things.&lt;/p&gt;

&lt;p&gt;The people writing most of the content about AI agents come from ML backgrounds. They have never been responsible for a distributed system that split-brained in production. They are reinventing concepts that have names.&lt;/p&gt;




&lt;h2&gt;
  
  
  Problem 1: Silent failures that compound across steps
&lt;/h2&gt;

&lt;p&gt;This is the issue nobody who writes tutorials has actually experienced in a high-stakes system. LLM workers do not fail loudly. They return well-formed, confident-looking output that is semantically wrong. In a single-step system a human catches it. In a multi-step agent, the wrong output becomes the input to the next step.&lt;/p&gt;

&lt;p&gt;By the time you see the problem, the causal chain is four steps long and the original error is buried.&lt;/p&gt;

&lt;p&gt;The distributed systems answer: validate at every step boundary. Do not pass raw LLM output downstream. Treat every LLM response as untrusted input until it passes a validation check.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;StepResult&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Raw&lt;/span&gt;      &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Valid&lt;/span&gt;    &lt;span class="kt"&gt;bool&lt;/span&gt;
    &lt;span class="n"&gt;Schema&lt;/span&gt;   &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Validator&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;output&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Worker&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;processStep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;StepResult&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;validator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// self-correction: one retry with the validation error in context&lt;/span&gt;
        &lt;span class="n"&gt;corrected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err2&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;llm&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sprintf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;
            &lt;span class="s"&gt;"Your previous response failed validation: %s&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Original task: %s&lt;/span&gt;&lt;span class="se"&gt;\n\n&lt;/span&gt;&lt;span class="s"&gt;Try again."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
            &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err2&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err3&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;validator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;corrected&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err3&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"validation failed after correction: %w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;raw&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;corrected&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;StepResult&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;Raw&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Valid&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;Schema&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;schema&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Self-correction on first validation failure recovers about 90% of cases in practice. The remaining 10% go to a dead letter queue, not to the next step in the pipeline.&lt;/p&gt;




&lt;h2&gt;
  
  
  Problem 2: Idempotency, the hardest problem in agent systems
&lt;/h2&gt;

&lt;p&gt;This is the one I underestimated the most.&lt;/p&gt;

&lt;p&gt;In standard distributed systems, idempotency is straightforward: assign a key, deduplicate on that key, retry safely. In agent systems, there is a complication. The LLM worker is nondeterministic. Retry with the same input and you may get a different output.&lt;/p&gt;

&lt;p&gt;This creates a genuine design question: what does it mean to retry an agent step idempotently?&lt;/p&gt;

&lt;p&gt;The answer that works in practice: record the output the first time the step runs. On retry, return the recorded output rather than calling the LLM again. The task ID becomes the idempotency key.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;TaskStore&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;taskID&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;taskID&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Worker&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// check if already completed&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ID&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"store get: %w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;processStep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Input&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Schema&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;store&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;task&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ID&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Raw&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"store set: %w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This also makes your workflows resumable. A pipeline that crashes midway can restart from the last completed step rather than from scratch. In the fintech system this was essential: reprocessing a document from scratch on retry risked producing a different decision record.&lt;/p&gt;

&lt;p&gt;The one place this pattern breaks down: when you &lt;em&gt;want&lt;/em&gt; a fresh LLM response on retry because the recorded one failed validation. The solution is to record the output only after validation passes. Failed attempts do not get persisted.&lt;/p&gt;




&lt;h2&gt;
  
  
  Problem 3: Multi-step workflows need a rollback plan
&lt;/h2&gt;

&lt;p&gt;Here is a scenario that forced me to think about this properly.&lt;/p&gt;

&lt;p&gt;A four-step pipeline: extract data from a document, validate it against business rules, write a decision record, trigger a downstream system. The first three steps succeed. Step four fails. What do you do?&lt;/p&gt;

&lt;p&gt;Without explicit design, you are stuck. The decision record exists. Retrying the whole pipeline from the start would write a duplicate. Doing nothing leaves the system in a broken intermediate state.&lt;/p&gt;

&lt;p&gt;The Saga pattern from distributed systems has a direct answer: every step that has side effects needs a compensating action that undoes it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Step&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Name&lt;/span&gt;       &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Execute&lt;/span&gt;    &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;State&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
    &lt;span class="n"&gt;Compensate&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;State&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Saga&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;steps&lt;/span&gt;    &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;Step&lt;/span&gt;
    &lt;span class="n"&gt;executed&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;int&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Saga&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;State&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Execute&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"step %s failed, rolling back %d completed steps"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;executed&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;len&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;executed&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="o"&gt;--&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cerr&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;executed&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Compensate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;state&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;cerr&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"compensation for step %s failed: %v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;steps&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;executed&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;j&lt;/span&gt;&lt;span class="p"&gt;]]&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;cerr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
                &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"step %d (%s) failed: %w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;step&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Name&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;executed&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;s&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;executed&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;i&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Applied to the document pipeline: if the downstream notification fails, the compensation for the decision-record step deletes it. The system returns to a clean state and the whole workflow can safely retry.&lt;/p&gt;

&lt;p&gt;Most agent frameworks do not give you this. They give you a chain of steps and leave failure handling as an exercise for the reader.&lt;/p&gt;




&lt;h2&gt;
  
  
  Problem 4: Human-in-the-loop as blast radius management
&lt;/h2&gt;

&lt;p&gt;One of the cleanest mental models I have encountered for human review in agent systems: scope the human-in-the-loop requirement to the blast radius of the action.&lt;/p&gt;

&lt;p&gt;A read-only action that retrieves data: no approval needed, just log it.&lt;br&gt;
A write action that can be undone: soft approval, allow with logging and rollback capability.&lt;br&gt;
An irreversible action (sending an email, triggering a payment, deleting a record): hard approval gate, requires explicit human confirmation before execution.&lt;/p&gt;

&lt;p&gt;This is not a safety net. It is an architecture decision about which operations an automated system is permitted to take unilaterally.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;BlastRadius&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;

&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;BlastRadiusRead&lt;/span&gt;        &lt;span class="n"&gt;BlastRadius&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;iota&lt;/span&gt; &lt;span class="c"&gt;// read-only, no approval&lt;/span&gt;
    &lt;span class="n"&gt;BlastRadiusReversible&lt;/span&gt;                     &lt;span class="c"&gt;// can be undone, soft approval&lt;/span&gt;
    &lt;span class="n"&gt;BlastRadiusIrreversible&lt;/span&gt;                   &lt;span class="c"&gt;// cannot be undone, hard approval&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;ActionGate&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;action&lt;/span&gt; &lt;span class="n"&gt;Action&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;radius&lt;/span&gt; &lt;span class="n"&gt;BlastRadius&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Approval&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Action&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;ID&lt;/span&gt;          &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Description&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Params&lt;/span&gt;      &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="n"&gt;any&lt;/span&gt;
    &lt;span class="n"&gt;Radius&lt;/span&gt;      &lt;span class="n"&gt;BlastRadius&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In the German government system, any output below a confidence threshold and any action classified as irreversible went to a human review queue. Not because we lacked confidence in the model, but because the downstream consequences of an error had legal weight. The auditors asked specifically about this control. Having it designed explicitly made the audit straightforward.&lt;/p&gt;

&lt;p&gt;The engineers I see struggling most with this are the ones trying to automate 100% of cases. A more honest design goal: automate the 90-95% the system can handle confidently, and escalate the rest with full context attached.&lt;/p&gt;




&lt;h2&gt;
  
  
  Problem 5: Circuit breakers for model failures
&lt;/h2&gt;

&lt;p&gt;LLM APIs have bad days. Rate limits, degraded performance, elevated error rates. You need a circuit breaker.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;State&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;

&lt;span class="k"&gt;const&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;StateClosed&lt;/span&gt;   &lt;span class="n"&gt;State&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;iota&lt;/span&gt; &lt;span class="c"&gt;// normal operation&lt;/span&gt;
    &lt;span class="n"&gt;StateOpen&lt;/span&gt;                  &lt;span class="c"&gt;// failing fast&lt;/span&gt;
    &lt;span class="n"&gt;StateHalfOpen&lt;/span&gt;              &lt;span class="c"&gt;// testing recovery&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;CircuitBreaker&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;mu&lt;/span&gt;          &lt;span class="n"&gt;sync&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Mutex&lt;/span&gt;
    &lt;span class="n"&gt;state&lt;/span&gt;       &lt;span class="n"&gt;State&lt;/span&gt;
    &lt;span class="n"&gt;failures&lt;/span&gt;    &lt;span class="kt"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;threshold&lt;/span&gt;   &lt;span class="kt"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;lastFailure&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;
    &lt;span class="n"&gt;timeout&lt;/span&gt;     &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;CircuitBreaker&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Allow&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;StateOpen&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Since&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lastFailure&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;timeout&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;StateHalfOpen&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;false&lt;/span&gt;
    &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cb&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;CircuitBreaker&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Record&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;mu&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;failures&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;
        &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;lastFailure&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Now&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;failures&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;=&lt;/span&gt; &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;threshold&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;StateOpen&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="k"&gt;else&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;failures&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
        &lt;span class="n"&gt;cb&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;state&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;StateClosed&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In a multi-model setup, combine the circuit breaker with a fallback: primary model trips the circuit, requests route to a cheaper backup. This keeps the system available during provider incidents. It also gives you a cheap escape valve when costs spike unexpectedly.&lt;/p&gt;

&lt;p&gt;Speaking of costs.&lt;/p&gt;




&lt;h2&gt;
  
  
  Problem 6: Observability and cost attribution
&lt;/h2&gt;

&lt;p&gt;In the summer of last year, a team publicly shared that they had spent $47k running AI agents in production before understanding where the money was going. The number got attention because people recognized the scenario.&lt;/p&gt;

&lt;p&gt;LLM costs are invisible until they are not. A feature that costs nothing at 100 users is a real number at 100,000. An edge case that you never hit in testing might consume 40x the tokens of a typical case.&lt;/p&gt;

&lt;p&gt;You cannot find this without logging every call with the context you need to understand it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;CallTrace&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;TraceID&lt;/span&gt;          &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;FeatureName&lt;/span&gt;      &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;StepName&lt;/span&gt;         &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;Model&lt;/span&gt;            &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;PromptTokens&lt;/span&gt;     &lt;span class="kt"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;CompletionTokens&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;CacheHit&lt;/span&gt;         &lt;span class="kt"&gt;bool&lt;/span&gt;
    &lt;span class="n"&gt;LatencyMs&lt;/span&gt;        &lt;span class="kt"&gt;int64&lt;/span&gt;
    &lt;span class="n"&gt;ValidationPassed&lt;/span&gt; &lt;span class="kt"&gt;bool&lt;/span&gt;
    &lt;span class="n"&gt;Attempt&lt;/span&gt;          &lt;span class="kt"&gt;int&lt;/span&gt;
    &lt;span class="n"&gt;Timestamp&lt;/span&gt;        &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Time&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;tokenCost&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;completion&lt;/span&gt; &lt;span class="kt"&gt;int&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;rates&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;][&lt;/span&gt;&lt;span class="m"&gt;2&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="s"&gt;"gpt-4o"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;              &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="m"&gt;0.0000025&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.000010&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="s"&gt;"gpt-4o-mini"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;         &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="m"&gt;0.00000015&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.0000006&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="s"&gt;"claude-3-5-sonnet"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;   &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="m"&gt;0.000003&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.000015&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
        &lt;span class="s"&gt;"claude-3-5-haiku"&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="m"&gt;0.0000008&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0.000004&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;rates&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;prompt&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="kt"&gt;float64&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;completion&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="m"&gt;1&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Store the traces. Aggregate costs by feature, by step, by user. Build a dashboard before the bill surprises you, not after.&lt;/p&gt;

&lt;p&gt;The secondary value: when something fails in production, you have the full execution trace. Input, intermediate steps, output, validation result. In a regulated environment this is not optional. Auditors ask for it. In any environment it makes debugging tractable instead of theoretical.&lt;/p&gt;




&lt;h2&gt;
  
  
  What this means for frameworks
&lt;/h2&gt;

&lt;p&gt;The popular agent frameworks optimize for developer experience. They make it fast to connect LLM calls, experiment with different prompting strategies, and ship a demo.&lt;/p&gt;

&lt;p&gt;That is genuinely useful. I have used some of them for prototyping.&lt;/p&gt;

&lt;p&gt;The problem is they tend to abstract away exactly what matters in production: queue semantics, idempotency, compensation flows, circuit breakers, cost attribution. You end up responsible for a system you do not fully understand, where the hard parts live inside library code you cannot easily inspect.&lt;/p&gt;

&lt;p&gt;The pattern that works: use a framework to figure out what your workflow should do. When you are ready to ship, write the orchestration layer with infrastructure primitives. Keep the LLM integration logic. Replace the framework's queue and retry machinery with something you control.&lt;/p&gt;

&lt;p&gt;This sounds like more work. It is less work in practice, because when something goes wrong at 2am you can actually find the problem.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Germany constraint that turned out to be good engineering
&lt;/h2&gt;

&lt;p&gt;The German government clients had a requirement I initially saw as a handicap: every decision made by the system had to be explainable to a civil servant who had no background in AI.&lt;/p&gt;

&lt;p&gt;That requirement killed several architectural ideas that would have worked fine in a consumer product. It forced the system toward a design where every step is logged with its inputs and outputs, the LLM is one component in a documented process rather than a black box, and there is always a human review path for anything below a defined confidence threshold.&lt;/p&gt;

&lt;p&gt;The resulting system is more verbose. It is also significantly easier to debug, trivial to audit, and more reliable under real load.&lt;/p&gt;

&lt;p&gt;The explainability requirement is coming for regulated industries whether teams are ready for it or not. Financial services, healthcare, legal, government. Design for it now and you get better engineering as a side effect.&lt;/p&gt;




&lt;h2&gt;
  
  
  Where to start
&lt;/h2&gt;

&lt;p&gt;If you are building an agent system and currently thinking mainly about prompts, here is a concrete sequence:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Define your task schema first.&lt;/strong&gt; What does a valid output look like at each step? What validation rules apply? This determines your circuit-break conditions and your dead letter queue trigger.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Add idempotency before you add features.&lt;/strong&gt; Assign a deterministic task ID to every step. Store results. Make it safe to retry.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Design your Saga before you wire up the steps.&lt;/strong&gt; For every step that writes something, know what the compensation action is.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Set your blast radius policy.&lt;/strong&gt; Which actions require human approval before execution? Write it down as code, not documentation.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;&lt;strong&gt;Add the cost trace from day one.&lt;/strong&gt; Not later. The cost dashboard is significantly harder to retrofit than to build initially.&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The prompt engineering is the interesting part. The infrastructure is what makes it shippable.&lt;/p&gt;




&lt;p&gt;If you are working on production AI systems and want to compare notes, I am at &lt;a href="https://arif.sh/work" rel="noopener noreferrer"&gt;arif.sh/work&lt;/a&gt;.&lt;/p&gt;

</description>
      <category>go</category>
      <category>ai</category>
      <category>architecture</category>
      <category>backend</category>
    </item>
    <item>
      <title>Building Production LLM Pipelines in Go: Lessons From the Field</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Tue, 17 Feb 2026 23:56:33 +0000</pubDate>
      <link>https://dev.to/arif/building-production-llm-pipelines-in-go-lessons-from-the-field-1f25</link>
      <guid>https://dev.to/arif/building-production-llm-pipelines-in-go-lessons-from-the-field-1f25</guid>
      <description>&lt;p&gt;I've spent the last two years integrating LLMs into production systems - a fintech platform processing millions of transactions and a document processing pipeline for government-scale workloads. Go is not the obvious choice for AI work. Python gets all the attention. But for production backends where reliability, cost, and latency actually matter, Go has been the better call every time.&lt;/p&gt;

&lt;p&gt;Here's what I've learned.&lt;/p&gt;




&lt;h2&gt;
  
  
  Why Go for LLM pipelines?
&lt;/h2&gt;

&lt;p&gt;The Python AI ecosystem is rich but chaotic. LangChain adds abstractions on top of abstractions. Dependency conflicts are constant. Async code that works in a notebook falls apart under real concurrency.&lt;/p&gt;

&lt;p&gt;Go gives you:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Predictable memory usage under load&lt;/li&gt;
&lt;li&gt;Goroutines for cheap, real concurrency (not async/await workarounds)&lt;/li&gt;
&lt;li&gt;Compile-time errors that catch the stupid mistakes before they hit production&lt;/li&gt;
&lt;li&gt;A single binary that deploys anywhere&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The tradeoff is you write more code. There is no LangChain equivalent in Go. But in my experience, that is a feature. You understand exactly what your pipeline is doing.&lt;/p&gt;




&lt;h2&gt;
  
  
  The architecture that works
&lt;/h2&gt;

&lt;p&gt;After iterating on several production systems, I keep coming back to this structure:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;Request → Validator → ContextBuilder → LLMClient → ResponseParser → Cache → Response
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Each stage is a function that takes input and returns output. No magic, no framework. Just functions.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Pipeline&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;validator&lt;/span&gt;      &lt;span class="n"&gt;Validator&lt;/span&gt;
    &lt;span class="n"&gt;contextBuilder&lt;/span&gt; &lt;span class="n"&gt;ContextBuilder&lt;/span&gt;
    &lt;span class="n"&gt;llmClient&lt;/span&gt;      &lt;span class="n"&gt;LLMClient&lt;/span&gt;
    &lt;span class="n"&gt;parser&lt;/span&gt;         &lt;span class="n"&gt;ResponseParser&lt;/span&gt;
    &lt;span class="n"&gt;cache&lt;/span&gt;          &lt;span class="n"&gt;Cache&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;p&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Pipeline&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;validator&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Validate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"validation: %w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;contextBuilder&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Build&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"context: %w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;cacheKey&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacheKey&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;cached&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;llmClient&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"llm: %w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;parser&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;raw&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;Response&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"parse: %w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;p&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;cache&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;cacheKey&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;result&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Simple. Each component is testable in isolation. Swap the LLM client without touching anything else.&lt;/p&gt;




&lt;h2&gt;
  
  
  The mistakes that cost real money
&lt;/h2&gt;

&lt;h3&gt;
  
  
  1. Not caching aggressively enough
&lt;/h3&gt;

&lt;p&gt;LLM calls are expensive. In a document processing pipeline, the same context gets rebuilt and sent to the model repeatedly for similar inputs. We added a two-layer cache: in-memory for hot keys, Redis for warm keys. Cost dropped 40% overnight.&lt;/p&gt;

&lt;p&gt;The key insight: you do not need semantic similarity matching for most caching. A hash of the normalized input is enough for 80% of cache hits.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;PipelineCache&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Key&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;PipelineContext&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;h&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;sha256&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NormalizedInput&lt;/span&gt;&lt;span class="p"&gt;()))&lt;/span&gt;
    &lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;SystemPrompt&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;hex&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;EncodeToString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;h&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sum&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  2. Treating LLM errors like HTTP errors
&lt;/h3&gt;

&lt;p&gt;LLMs fail in ways HTTP servers do not. You get:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Rate limit errors (back off and retry)&lt;/li&gt;
&lt;li&gt;Malformed JSON responses (retry with a stricter prompt)&lt;/li&gt;
&lt;li&gt;Responses that are technically valid but semantically wrong (detect and escalate)&lt;/li&gt;
&lt;li&gt;Context window overflows (truncate and retry)&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Each failure mode needs a different response. A generic retry loop handles none of them well.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;LLMClient&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Complete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="n"&gt;CompletionRequest&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;lastErr&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;maxRetries&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;==&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;resp&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;rateLimitErr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;RateLimitError&lt;/span&gt;
        &lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="n"&gt;contextErr&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;ContextWindowError&lt;/span&gt;

        &lt;span class="k"&gt;switch&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;As&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;rateLimitErr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;wait&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Duration&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;rateLimitErr&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RetryAfterSeconds&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt; &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Second&lt;/span&gt;
            &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;After&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;wait&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Done&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
                &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ctx&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Err&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;errors&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;As&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;contextErr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;req&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WithTruncatedContext&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;contextErr&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MaxTokens&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;default&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;lastErr&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
            &lt;span class="n"&gt;time&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Sleep&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;backoff&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;attempt&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="s"&gt;""&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Errorf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"max retries exceeded: %w"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;lastErr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  3. Building prompts with string concatenation
&lt;/h3&gt;

&lt;p&gt;It starts fine. Then someone adds a special case. Then another. Then you have a 200-line function full of &lt;code&gt;if/else&lt;/code&gt; building a string and nobody knows what the model is actually seeing.&lt;/p&gt;

&lt;p&gt;Treat prompts like templates. Keep them in separate files. Version them.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;PromptTemplate&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;System&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;User&lt;/span&gt;   &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;PromptTemplate&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Render&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;vars&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="n"&gt;any&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;CompletionRequest&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;systemTmpl&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;template&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;New&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"system"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;t&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;System&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;CompletionRequest&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="c"&gt;// ... render both templates&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;When a prompt changes, it is a code change with a diff. Not a hidden string mutation buried in a function.&lt;/p&gt;

&lt;h3&gt;
  
  
  4. No observability until something breaks in production
&lt;/h3&gt;

&lt;p&gt;Add structured logging from the start. Log the input hash, model, token counts, latency, and cache hit/miss on every call. When costs spike or latency degrades, you need this to debug.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;logger&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Info&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"llm_call"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"input_hash"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;req&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Hash&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="s"&gt;"model"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;model&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"prompt_tokens"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;PromptTokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"completion_tokens"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;usage&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CompletionTokens&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="s"&gt;"latency_ms"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;latency&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Milliseconds&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
    &lt;span class="s"&gt;"cache_hit"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;false&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I use this to build a cost dashboard per feature, per user, per day. Without it you are flying blind.&lt;/p&gt;




&lt;h2&gt;
  
  
  RAG in Go: keep it boring
&lt;/h2&gt;

&lt;p&gt;Retrieval-augmented generation does not need a vector database for most use cases. I have seen teams spin up Pinecone or Weaviate for a use case that would work fine with Postgres and &lt;code&gt;pgvector&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;Start with &lt;code&gt;pgvector&lt;/code&gt;. It is boring. It works. You already have Postgres. The query looks like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight sql"&gt;&lt;code&gt;&lt;span class="k"&gt;SELECT&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;AS&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt;
&lt;span class="k"&gt;FROM&lt;/span&gt; &lt;span class="n"&gt;documents&lt;/span&gt;
&lt;span class="k"&gt;WHERE&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;=&amp;gt;&lt;/span&gt; &lt;span class="err"&gt;$&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;&amp;gt;&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="mi"&gt;7&lt;/span&gt;
&lt;span class="k"&gt;ORDER&lt;/span&gt; &lt;span class="k"&gt;BY&lt;/span&gt; &lt;span class="n"&gt;similarity&lt;/span&gt; &lt;span class="k"&gt;DESC&lt;/span&gt;
&lt;span class="k"&gt;LIMIT&lt;/span&gt; &lt;span class="mi"&gt;5&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Migrate to a dedicated vector store when you have a real performance problem. Not before.&lt;/p&gt;

&lt;p&gt;For generating embeddings in Go:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;EmbeddingClient&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;httpClient&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;
    &lt;span class="n"&gt;apiKey&lt;/span&gt;     &lt;span class="kt"&gt;string&lt;/span&gt;
    &lt;span class="n"&gt;model&lt;/span&gt;      &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;EmbeddingClient&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;Embed&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ctx&lt;/span&gt; &lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Context&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="kt"&gt;float32&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;error&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// standard HTTP call to your embedding endpoint&lt;/span&gt;
    &lt;span class="c"&gt;// returns float32 slice ready for pgvector&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;No SDK needed. The API is a POST request. Write the 30 lines and move on.&lt;/p&gt;




&lt;h2&gt;
  
  
  Structured output without the drama
&lt;/h2&gt;

&lt;p&gt;Getting JSON out of an LLM reliably is harder than it looks. Models hallucinate keys, nest objects incorrectly, or return valid JSON wrapped in markdown code blocks.&lt;/p&gt;

&lt;p&gt;What works in production:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Use a model that supports native JSON mode or function calling&lt;/li&gt;
&lt;li&gt;Define your schema explicitly in the prompt with an example&lt;/li&gt;
&lt;li&gt;Validate with a strict parser, not &lt;code&gt;json.Unmarshal&lt;/code&gt; directly&lt;/li&gt;
&lt;li&gt;Retry on parse failure with a self-correction prompt: "Your previous response was not valid JSON. Here is the error: {error}. Please try again."&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The self-correction step recovers 90% of parse failures without a human in the loop.&lt;/p&gt;




&lt;h2&gt;
  
  
  Cost math before you build
&lt;/h2&gt;

&lt;p&gt;Before writing a line of code, do the math:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;daily_requests * avg_input_tokens * input_price
+ daily_requests * avg_output_tokens * output_price
= daily_cost
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then ask: what is the cache hit rate I can realistically achieve? What is the monthly cost at 10x scale?&lt;/p&gt;

&lt;p&gt;I have seen products that were profitable at 100 users become unprofitable at 1000 because nobody did this calculation. LLM costs scale linearly. Your pricing needs to account for it.&lt;/p&gt;




&lt;h2&gt;
  
  
  What I would do differently
&lt;/h2&gt;

&lt;p&gt;If I were starting a new LLM-heavy Go service today:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Build the cache layer before the LLM integration, not after&lt;/li&gt;
&lt;li&gt;Log token usage from day one, not when the bill arrives&lt;/li&gt;
&lt;li&gt;Keep prompts in files, not code&lt;/li&gt;
&lt;li&gt;Write an LLM interface with a mock implementation for tests&lt;/li&gt;
&lt;li&gt;Do the cost math before committing to a feature&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The patterns that work in Go for LLM pipelines are not exotic. They are the same patterns that work for any external API integration: encapsulate, cache, observe, handle errors explicitly. The LLM just has a few extra failure modes.&lt;/p&gt;




&lt;p&gt;If you are building something that needs a production AI backend and want to talk through the architecture, I am available for contracts. &lt;a href="https://arif.sh/work" rel="noopener noreferrer"&gt;arif.sh/work&lt;/a&gt;&lt;/p&gt;

</description>
      <category>go</category>
      <category>ai</category>
      <category>backend</category>
      <category>machinelearning</category>
    </item>
    <item>
      <title>The Real Deal Behind EngageCraft.io’s Creation</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Wed, 28 Feb 2024 09:03:00 +0000</pubDate>
      <link>https://dev.to/arif/the-real-deal-behind-engagecraftios-creation-jek</link>
      <guid>https://dev.to/arif/the-real-deal-behind-engagecraftios-creation-jek</guid>
      <description>&lt;p&gt;Hello World! I’m here to pull back the curtain on how EngageCraft.io went from a sketch on a napkin to a live SaaS platform in the span of a single weekend. No fluff, just the straight-up journey of building my first indie project.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Spark and the Sprint
&lt;/h2&gt;

&lt;p&gt;The idea? Simple yet ambitious: create a tool where users feed in their product URL, and out pops ready-to-use social media content for platforms like Twitter and Reddit. The catch? I wanted to see it live by Monday morning. Why wait, right?&lt;/p&gt;

&lt;h2&gt;
  
  
  Choosing Laravel: No Regrets
&lt;/h2&gt;

&lt;p&gt;Laravel was my weapon of choice for this sprint. I needed something that would let me move fast without breaking things (too much). Laravel Forge and Breeze weren’t just tools; they were my lifelines to a smooth deployment and an MVP that didn’t look like it was held together by duct tape. I chose Laravel because it promised—and delivered—a way to develop quickly and deploy smoothly, a non-negotiable for my crazy deadline.&lt;/p&gt;

&lt;h2&gt;
  
  
  Marketing: The Unexpected Journey
&lt;/h2&gt;

&lt;p&gt;With EngageCraft.io built, I stared down my next challenge: marketing. I was a developer stepping into a marketer’s shoes, and it felt like learning to walk all over again. But here’s the thing—I was building EngageCraft.io for people like me, and who better to sell it to them than one of their own? So, I dove headfirst into marketing with the same gusto I applied to coding.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Takeaway
&lt;/h2&gt;

&lt;p&gt;Building EngageCraft.io on a weekend was a wild ride. There were moments of doubt, sure, but also moments of pure adrenaline. It was more than just coding; it was about bringing a vision to life, learning new skills on the fly, and pushing beyond what I thought was possible.&lt;/p&gt;

&lt;h2&gt;
  
  
  Looking Forward
&lt;/h2&gt;

&lt;p&gt;This isn’t just the end of a weekend project; it’s the beginning of something bigger. EngageCraft.io is live, but it’s also evolving. I’m on a mission to refine it, scale it, and see it become the go-to tool for social media content generation.&lt;/p&gt;

&lt;p&gt;So, what’s the moral of the story? Don’t wait. If you’ve got an idea, chase it. Build it. Launch it. The world needs more creators who are willing to leap, especially over a weekend.&lt;/p&gt;

</description>
      <category>saas</category>
      <category>startup</category>
      <category>coding</category>
    </item>
    <item>
      <title>Embarking on 2024: Dreams, Challenges, and Adventures</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Sun, 31 Dec 2023 23:34:09 +0000</pubDate>
      <link>https://dev.to/arif/embarking-on-2024-dreams-challenges-and-adventures-4m80</link>
      <guid>https://dev.to/arif/embarking-on-2024-dreams-challenges-and-adventures-4m80</guid>
      <description>&lt;p&gt;Happy New Year, everyone! As we step into 2024, I am filled with a mix of excitement and anticipation. This year, I am determined to create new habits and reinvent myself. There is something about the start of a new year that gives me hope and ignites the urge to chase my dreams.&lt;/p&gt;

&lt;p&gt;I am incredibly grateful to have my wife by my side on this journey. Her constant encouragement and unwavering belief in the power of pursuing our dreams have been contagious. She reminded me that it is never too late to set new goals and turn our aspirations into reality.&lt;/p&gt;

&lt;p&gt;Here is my ambitious game plan for 2024:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;European Escapades&lt;/strong&gt; : This year, I have promised myself to finally explore the vibrant streets of Europe. From the historical alleys of Rome to the scenic beauty of the Swiss Alps, I am eagerly looking forward to immersing myself in the diverse cultures and landscapes.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;A Year of Books&lt;/strong&gt; : My bookshelf is overflowing with titles that have been waiting to be read. This year, I am committed to finally giving those pages the attention they deserve. I aim to delve into a wide range of genres, from classic literature to modern tech manuals.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fitness Journey&lt;/strong&gt; : It is time for me to prioritize my health. I will start with a few mornings of exercise each week. I understand that this won't be an easy journey, but I am fully committed to making it a part of my lifestyle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Coding Mastery&lt;/strong&gt; : As a programmer, I recognize the importance of continuously improving my skills. To achieve this, I am setting a goal to tackle three LeetCode problems daily. This will undoubtedly be challenging, but I am ready to embrace the task and grow as a developer.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;To make all of this happen, I will focus on taking small, consistent steps. I understand the value of building habits gradually and not overwhelming myself with too much at once.&lt;/p&gt;

&lt;p&gt;Additionally, reflection will be a vital part of my journey. I will use this blog to not only share my triumphs but also to document the inevitable setbacks that come along the way. After all, personal growth is not a linear path.&lt;/p&gt;

&lt;p&gt;I am incredibly excited to share these experiences with all of you. I hope that my journey will inspire you to chase your own dreams, no matter their size. I am also eager to hear about your plans and aspirations – let's support and motivate each other!&lt;/p&gt;

&lt;p&gt;So, here's to an unforgettable year in 2024! We have dreams to chase, habits to form, and adventures to embark on. This year will be a time of discovery, learning, and undoubtedly, a lot of fun. Let us make the most of it together!&lt;/p&gt;

&lt;p&gt;Cheers to a year of turning dreams into reality. I cannot wait to embark on this adventure with all of you. Let the journey begin!&lt;/p&gt;

</description>
      <category>lifestyle</category>
      <category>life</category>
      <category>motivation</category>
    </item>
    <item>
      <title>Creating Real-Time WebSockets with Go and WebAssembly</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Thu, 09 Nov 2023 12:21:07 +0000</pubDate>
      <link>https://dev.to/arif/creating-real-time-websockets-with-go-and-webassembly-2cfl</link>
      <guid>https://dev.to/arif/creating-real-time-websockets-with-go-and-webassembly-2cfl</guid>
      <description>&lt;p&gt;Writing WebAssembly with Go was always something I wanted to do, but finding examples or tutorials was hard, so I decided to shuffle through pages. &lt;/p&gt;

&lt;p&gt;In this post, I'll share with you my learnings and some code examples.&lt;/p&gt;

&lt;p&gt;This project is an exciting exploration into the world of real-time web communication. We're delving into connecting a WebSocket directly from WebAssembly (Wasm) code, enabling dynamic interactions within the web page. By leveraging WebSocket events, we can manipulate the Document Object Model (DOM) in response to live data, creating an interactive and responsive user experience.&lt;/p&gt;

&lt;h2&gt;
  
  
  Project Outline
&lt;/h2&gt;

&lt;p&gt;In this project, we will create a WebSocket service (using Gorilla WebSocket), a WASM service, and an index.html (this is solely to see what we've done).&lt;/p&gt;

&lt;h3&gt;
  
  
  Folder Structure
&lt;/h3&gt;

&lt;p&gt;In this basic project, we're keeping the folder structure straightforward to avoid complexity. All that's required is a &lt;code&gt;wasm/&lt;/code&gt; directory for the WebAssembly files, a &lt;code&gt;pkg/socket/&lt;/code&gt; directory for the socket service logic, and a &lt;code&gt;public/&lt;/code&gt; directory to house our frontend assets.&lt;/p&gt;

&lt;p&gt;So project will look like this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;main.go
wasm/
pkg/socket/
public/
Makefile
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Let's Code!
&lt;/h2&gt;

&lt;h3&gt;
  
  
  public/index.html
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight html"&gt;&lt;code&gt;&lt;span class="cp"&gt;&amp;lt;!DOCTYPE html&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;html&lt;/span&gt; &lt;span class="na"&gt;lang=&lt;/span&gt;&lt;span class="s"&gt;"en"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;head&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;meta&lt;/span&gt; &lt;span class="na"&gt;charset=&lt;/span&gt;&lt;span class="s"&gt;"UTF-8"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;title&amp;gt;&lt;/span&gt;Go WASM Button Example&lt;span class="nt"&gt;&amp;lt;/title&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;script &lt;/span&gt;&lt;span class="na"&gt;src=&lt;/span&gt;&lt;span class="s"&gt;"wasm_exec.js"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;style&amp;gt;&lt;/span&gt;
    &lt;span class="nf"&gt;#btn&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nl"&gt;position&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;absolute&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;top&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50%&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;left&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="m"&gt;50%&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
      &lt;span class="nl"&gt;transform&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="n"&gt;translate&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="m"&gt;-50%&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;-50%&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/style&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/head&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;body&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;p&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&amp;lt;/p&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;button&lt;/span&gt; &lt;span class="na"&gt;id=&lt;/span&gt;&lt;span class="s"&gt;"btn"&lt;/span&gt;&lt;span class="nt"&gt;&amp;gt;&lt;/span&gt;Click me&lt;span class="nt"&gt;&amp;lt;/button&amp;gt;&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;script&amp;gt;&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;go&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Go&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;

    &lt;span class="nx"&gt;WebAssembly&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;instantiateStreaming&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;fetch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;main.wasm&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;importObject&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;then&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;go&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;run&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;result&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;instance&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;

    &lt;span class="nb"&gt;document&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getElementById&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;btn&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;addEventListener&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;click&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nf"&gt;onButtonClick&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="p"&gt;});&lt;/span&gt;
  &lt;span class="nt"&gt;&amp;lt;/script&amp;gt;&lt;/span&gt;
&lt;span class="nt"&gt;&amp;lt;/body&amp;gt;&lt;/span&gt;

&lt;span class="nt"&gt;&amp;lt;/html&amp;gt;&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our HTML layout is intentionally minimalistic, featuring a single button centrally placed on the screen. When clicked, this button activates a 'click' event, which in turn calls the &lt;code&gt;buttonClick()&lt;/code&gt; function.&lt;/p&gt;

&lt;p&gt;Additionally, positioned at the top of the screen is a &lt;code&gt;&amp;lt;p&amp;gt;&lt;/code&gt; tag with id &lt;code&gt;text&lt;/code&gt;, which we will dynamically manipulate from our WebAssembly code.&lt;/p&gt;

&lt;h3&gt;
  
  
  pkg/socket/manage.go
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"log"&lt;/span&gt;
    &lt;span class="s"&gt;"math/rand"&lt;/span&gt;
    &lt;span class="s"&gt;"net/http"&lt;/span&gt;
    &lt;span class="s"&gt;"sync"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/gorilla/websocket"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Pre-configure the upgrader, which is responsible for upgrading&lt;/span&gt;
&lt;span class="c"&gt;// an HTTP connection to a WebSocket connection.&lt;/span&gt;
&lt;span class="k"&gt;var&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="n"&gt;websocketUpgrader&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="n"&gt;websocket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Upgrader&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;ReadBufferSize&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="m"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;WriteBufferSize&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="m"&gt;1024&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// NotifyEvent represents an event that contains a reference&lt;/span&gt;
&lt;span class="c"&gt;// to the client who initiated the event and the message to be notified.&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;NotifyEvent&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;client&lt;/span&gt;  &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;
    &lt;span class="n"&gt;message&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Client represents a single WebSocket connection.&lt;/span&gt;
&lt;span class="c"&gt;// It holds the client's ID, the WebSocket connection itself, and&lt;/span&gt;
&lt;span class="c"&gt;// the manager that controls all clients.&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Client&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;id&lt;/span&gt;         &lt;span class="kt"&gt;uint32&lt;/span&gt;
    &lt;span class="n"&gt;connection&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;websocket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conn&lt;/span&gt;
    &lt;span class="n"&gt;manager&lt;/span&gt;    &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Manager&lt;/span&gt;

    &lt;span class="n"&gt;writeChan&lt;/span&gt; &lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// Manager keeps track of all active clients and broadcasts messages.&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Manager&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;clients&lt;/span&gt; &lt;span class="n"&gt;ClientList&lt;/span&gt;

    &lt;span class="n"&gt;sync&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;RWMutex&lt;/span&gt;

    &lt;span class="n"&gt;notifyChan&lt;/span&gt; &lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="n"&gt;NotifyEvent&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// ClientList is a map of clients to keep track of their presence.&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;ClientList&lt;/span&gt; &lt;span class="k"&gt;map&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="kt"&gt;bool&lt;/span&gt;

&lt;span class="c"&gt;// NewClient creates a new Client instance with a unique ID, its connection,&lt;/span&gt;
&lt;span class="c"&gt;// and a reference to the Manager.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;NewClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;websocket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;manager&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Manager&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;id&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;         &lt;span class="n"&gt;rand&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Uint32&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt;
        &lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;manager&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="n"&gt;manager&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
        &lt;span class="n"&gt;writeChan&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;  &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// readMessages continuously reads messages from the WebSocket connection.&lt;/span&gt;
&lt;span class="c"&gt;// It will send any received messages to the manager's notification channel.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;readMessages&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;manager&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;removeClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;messageType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ReadMessage&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

        &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;manager&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;notifyChan&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;NotifyEvent&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;message&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;)}&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;websocket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;IsUnexpectedCloseError&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;websocket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CloseGoingAway&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;websocket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;CloseAbnormalClosure&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Printf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"error reading message: %v"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
            &lt;span class="k"&gt;break&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"MessageType: "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messageType&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Payload: "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// writeMessages listens on the client's write channel for messages&lt;/span&gt;
&lt;span class="c"&gt;// and writes any received messages to the WebSocket connection.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;writeMessages&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;manager&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;removeClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;data&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;writeChan&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;WriteMessage&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;websocket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;TextMessage&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;data&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// NewManager creates a new Manager instance, initializes the client list,&lt;/span&gt;
&lt;span class="c"&gt;// and starts the goroutine responsible for notifying other clients.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;NewManager&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Manager&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Manager&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;clients&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;    &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;ClientList&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
        &lt;span class="n"&gt;notifyChan&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="n"&gt;NotifyEvent&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;notifyOtherClients&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// otherClients returns a slice of clients excluding the provided client.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Manager&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;otherClients&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;clientList&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;([]&lt;/span&gt;&lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clients&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;id&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;clientList&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nb"&gt;append&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;clientList&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;clientList&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// notifyOtherClients waits for notify events and broadcasts the message&lt;/span&gt;
&lt;span class="c"&gt;// to all clients except the one who sent the message.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Manager&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;notifyOtherClients&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;select&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="k"&gt;case&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;notifyChan&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
            &lt;span class="n"&gt;otherClients&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;otherClients&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

            &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="k"&gt;range&lt;/span&gt; &lt;span class="n"&gt;otherClients&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
                &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;writeChan&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt; &lt;span class="n"&gt;e&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;message&lt;/span&gt;
            &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// addClient adds a new client to the manager's client list.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Manager&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;addClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clients&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="no"&gt;true&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// removeClient removes a client from the manager's client list and&lt;/span&gt;
&lt;span class="c"&gt;// closes the WebSocket connection.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Manager&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;removeClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Lock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Unlock&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clients&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt; &lt;span class="n"&gt;ok&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;connection&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
        &lt;span class="nb"&gt;delete&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;clients&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// ServeWS is an HTTP handler that upgrades the HTTP connection to a&lt;/span&gt;
&lt;span class="c"&gt;// WebSocket connection and registers the new client with the manager.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;m&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Manager&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;ServeWS&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ResponseWriter&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Request&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"New Connection"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;websocketUpgrader&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Upgrade&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;w&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;r&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="n"&gt;client&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;NewClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="n"&gt;m&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;addClient&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readMessages&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;client&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;writeMessages&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our WebSocket service is powered by a Go file that relies on the &lt;code&gt;gorilla/websocket&lt;/code&gt; library. It begins with setting up an &lt;code&gt;Upgrader&lt;/code&gt; from the library, which transitions an HTTP connection to the WebSocket protocol, allowing for real-time communication.&lt;/p&gt;

&lt;p&gt;The code defines a &lt;code&gt;NotifyEvent&lt;/code&gt; type that encapsulates messages from a client (an active WebSocket connection) and a Client type that keeps track of individual connections and their communication with the server through channels.&lt;/p&gt;

&lt;p&gt;The Manager type oversees all Client instances, managing incoming and outgoing messages with the help of &lt;code&gt;ClientList&lt;/code&gt;, a map that keeps track of all active connections.&lt;/p&gt;

&lt;p&gt;Clients are created with a unique ID and are tied to their WebSocket connections and the managing system. They have two main routines: &lt;code&gt;readMessages&lt;/code&gt; listens for incoming messages and forwards them to the manager, and &lt;code&gt;writeMessages&lt;/code&gt; sends outgoing messages from the server to the client's WebSocket.&lt;/p&gt;

&lt;p&gt;The Manager is also responsible for broadcasting messages to all clients except the sender, ensuring that messages are propagated in real-time to all connected users.&lt;/p&gt;

&lt;p&gt;Lastly, &lt;code&gt;ServeWS&lt;/code&gt; is a function that handles new WebSocket connections by upgrading HTTP requests to WebSocket and initiating the message reading and writing routines.&lt;/p&gt;

&lt;p&gt;This structure allows us to manage multiple clients and their interactions efficiently, forming the backbone of our WebSocket-based real-time communication service.&lt;/p&gt;

&lt;h3&gt;
  
  
  main.go
&lt;/h3&gt;

&lt;p&gt;main.go, in this file we need to serve our index.html and some websocket stuff. ill use &lt;code&gt;net/http&lt;/code&gt; package for this.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"log"&lt;/span&gt;
    &lt;span class="s"&gt;"net/http"&lt;/span&gt;

    &lt;span class="s"&gt;"github.com/doganarif/wasm/2/pkg/socket"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;setupAPI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c"&gt;// Serve on port :8080, fudge yeah hardcoded port&lt;/span&gt;
    &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Fatal&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ListenAndServe&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;":8080"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// setupAPI will start all Routes and their Handlers&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;setupAPI&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;manager&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;socket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;NewManager&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c"&gt;// Serve the ./public directory at Route /&lt;/span&gt;
    &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FileServer&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dir&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"./public"&lt;/span&gt;&lt;span class="p"&gt;)))&lt;/span&gt;
    &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Handle&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"/ws"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;http&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;HandlerFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;manager&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;ServeWS&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our web service has a simple starting point, a file where everything kicks off. It sets up our web server and tells it to listen for visitors on port 8080 (the classic spot for web development testing).&lt;/p&gt;

&lt;p&gt;We call a &lt;code&gt;setupAPI()&lt;/code&gt; function that does a bit of behind-the-scenes work. It taps into our socket package to create a Manager, which is like the director of a play, coordinating all our WebSocket connections.&lt;/p&gt;

&lt;p&gt;We also set up a special &lt;code&gt;/ws&lt;/code&gt; route for WebSocket connections, where our Manager starts handling the real-time chat.&lt;/p&gt;

&lt;h3&gt;
  
  
  wasm/wasm.go
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight go"&gt;&lt;code&gt;&lt;span class="k"&gt;package&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;

&lt;span class="k"&gt;import&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;
    &lt;span class="s"&gt;"context"&lt;/span&gt;
    &lt;span class="s"&gt;"fmt"&lt;/span&gt;
    &lt;span class="s"&gt;"log"&lt;/span&gt;
    &lt;span class="s"&gt;"syscall/js"&lt;/span&gt;

    &lt;span class="s"&gt;"nhooyr.io/websocket"&lt;/span&gt;
&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;// Conn wraps a WebSocket connection.&lt;/span&gt;
&lt;span class="k"&gt;type&lt;/span&gt; &lt;span class="n"&gt;Conn&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;wsConn&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;websocket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Conn&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// NewConn establishes a new WebSocket connection to a specified URL.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;NewConn&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Conn&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;_&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;websocket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Dial&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="s"&gt;"ws://localhost:8080/ws"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;fmt&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"ERROR"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&lt;/span&gt;&lt;span class="n"&gt;Conn&lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="n"&gt;wsConn&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;main&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Channel to keep the main function running until it's closed.&lt;/span&gt;
    &lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="nb"&gt;make&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;chan&lt;/span&gt; &lt;span class="k"&gt;struct&lt;/span&gt;&lt;span class="p"&gt;{},&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;

    &lt;span class="nb"&gt;println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"WASM Go Initialized"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;// Establish a new WebSocket connection.&lt;/span&gt;
    &lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;NewConn&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c"&gt;// Register the onButtonClick function in the global JavaScript context.&lt;/span&gt;
    &lt;span class="n"&gt;js&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Global&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"onButtonClick"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;onButtonClickFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

    &lt;span class="c"&gt;// Start reading messages in a new goroutine.&lt;/span&gt;
    &lt;span class="k"&gt;go&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;readMessage&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;

    &lt;span class="c"&gt;// Wait indefinitely.&lt;/span&gt;
    &lt;span class="o"&gt;&amp;lt;-&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// onButtonClickFunc returns a js.Func that sends a "HELLO" message over WebSocket when invoked.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;onButtonClickFunc&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;conn&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;js&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Func&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="n"&gt;js&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;FuncOf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;this&lt;/span&gt; &lt;span class="n"&gt;js&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;args&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="n"&gt;js&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Value&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt;&lt;span class="p"&gt;{}&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="nb"&gt;println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Button Clicked!"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="c"&gt;// Send a message through the WebSocket connection.&lt;/span&gt;
        &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;conn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wsConn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Write&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;(),&lt;/span&gt; &lt;span class="n"&gt;websocket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;MessageText&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="p"&gt;[]&lt;/span&gt;&lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"HELLO"&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Error writing to WebSocket:"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;
        &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt;
    &lt;span class="p"&gt;})&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// readMessage handles incoming WebSocket messages and updates the DOM accordingly.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;c&lt;/span&gt; &lt;span class="o"&gt;*&lt;/span&gt;&lt;span class="n"&gt;Conn&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="n"&gt;readMessage&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;defer&lt;/span&gt; &lt;span class="k"&gt;func&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// Close the WebSocket connection when the function returns.&lt;/span&gt;
        &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wsConn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Close&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;websocket&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;StatusGoingAway&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"BYE"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="p"&gt;}()&lt;/span&gt;

    &lt;span class="k"&gt;for&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
        &lt;span class="c"&gt;// Read a message from the WebSocket connection.&lt;/span&gt;
        &lt;span class="n"&gt;messageType&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;c&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;wsConn&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Read&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;context&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Background&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;

        &lt;span class="k"&gt;if&lt;/span&gt; &lt;span class="n"&gt;err&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="no"&gt;nil&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
            &lt;span class="c"&gt;// Log and panic if there is an error reading the message.&lt;/span&gt;
            &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Panicf&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;err&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Error&lt;/span&gt;&lt;span class="p"&gt;())&lt;/span&gt;
        &lt;span class="p"&gt;}&lt;/span&gt;

        &lt;span class="c"&gt;// Update the DOM with the received message.&lt;/span&gt;
        &lt;span class="n"&gt;updateDOMContent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;

        &lt;span class="c"&gt;// Log the message type and payload for debugging.&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"MessageType: "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;messageType&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
        &lt;span class="n"&gt;log&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Println&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"Payload: "&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;payload&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="c"&gt;// updateDOMContent updates the text content of the DOM element with the given text.&lt;/span&gt;
&lt;span class="k"&gt;func&lt;/span&gt; &lt;span class="n"&gt;updateDOMContent&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;text&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c"&gt;// Get the document object from the global JavaScript context.&lt;/span&gt;
    &lt;span class="n"&gt;document&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;js&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Global&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Get&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"document"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;// Get the DOM element by its ID.&lt;/span&gt;
    &lt;span class="n"&gt;element&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="n"&gt;document&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Call&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"getElementById"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="s"&gt;"text"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="c"&gt;// Set the innerText of the element to the provided text.&lt;/span&gt;
    &lt;span class="n"&gt;element&lt;/span&gt;&lt;span class="o"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="s"&gt;"innerText"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Our wasm.go file, once compiled into a .wasm executable, is the client-side counterpart to the WebSocket service we discussed earlier. It's set to run within the web browser, interfacing directly with the DOM.&lt;/p&gt;

&lt;p&gt;The file’s responsibility starts with establishing a WebSocket connection to the server endpoint we've set up—ws://localhost:8080/ws. This is the communication channel our web service listens to, we already detailed.&lt;/p&gt;

&lt;p&gt;In parallel, we're attentive to the server’s response. Incoming messages from our WebSocket server are caught and used to update the webpage dynamically. &lt;br&gt;
This is where the full circle of client-server interaction completes—our Go code receives server messages, then reflects changes directly in the user's view (&amp;lt;p id="text").&lt;/p&gt;

&lt;h3&gt;
  
  
  Makefile
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;&lt;span class="nl"&gt;.PHONY&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;build buildwasm run clean serve&lt;/span&gt;

&lt;span class="c"&gt;# Variables
&lt;/span&gt;&lt;span class="nv"&gt;WASM_DIR&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; ./wasm
&lt;span class="nv"&gt;PUBLIC_DIR&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; ./public
&lt;span class="nv"&gt;SERVER_FILE&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; main.go
&lt;span class="nv"&gt;WASM_SOURCE&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;WASM_DIR&lt;span class="p"&gt;)&lt;/span&gt;/wasm.go
&lt;span class="nv"&gt;WASM_TARGET&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;PUBLIC_DIR&lt;span class="p"&gt;)&lt;/span&gt;/main.wasm
&lt;span class="nv"&gt;GOOS&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; js
&lt;span class="nv"&gt;GOARCH&lt;/span&gt; &lt;span class="o"&gt;:=&lt;/span&gt; wasm

&lt;span class="c"&gt;# Default rule
&lt;/span&gt;&lt;span class="nl"&gt;all&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;run&lt;/span&gt;

&lt;span class="c"&gt;# Run the server
&lt;/span&gt;&lt;span class="nl"&gt;run&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;serve&lt;/span&gt;

&lt;span class="c"&gt;# Serve the project
&lt;/span&gt;&lt;span class="nl"&gt;serve&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildwasm&lt;/span&gt;
    go run &lt;span class="p"&gt;$(&lt;/span&gt;SERVER_FILE&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Build the WebAssembly module
&lt;/span&gt;&lt;span class="nl"&gt;buildwasm&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="nv"&gt;GOOS&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;$(&lt;/span&gt;GOOS&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="nv"&gt;GOARCH&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;$(&lt;/span&gt;GOARCH&lt;span class="p"&gt;)&lt;/span&gt; go build &lt;span class="nt"&gt;-o&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;WASM_TARGET&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;WASM_SOURCE&lt;span class="p"&gt;)&lt;/span&gt;

&lt;span class="c"&gt;# Clean the built artifacts
&lt;/span&gt;&lt;span class="nl"&gt;clean&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; &lt;span class="p"&gt;$(&lt;/span&gt;WASM_TARGET&lt;span class="p"&gt;)&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  Conclusion
&lt;/h2&gt;

&lt;p&gt;We've gone through the ropes of creating a real-time application using Go and WebAssembly. The project's setup is simple: a WebSocket server in Go, a basic HTML page, and some Go code compiled to WebAssembly that runs in the browser. We've covered how to send and receive messages through WebSockets and how to reflect those messages in the web page by updating the DOM.&lt;/p&gt;

&lt;p&gt;From setting up the project structure to handling real-time communication between the client and server, the provided code examples aim to give you a clear starting point for building your own applications with Go and WebAssembly. It's a straightforward setup, but it lays the foundation for more complex projects.&lt;/p&gt;

&lt;p&gt;Hope this helps you kickstart your own adventures in WebAssembly with Go. Keep building, keep learning, and most importantly, keep it simple.&lt;/p&gt;

&lt;p&gt;Github Repo : &lt;a href="https://github.com/doganarif/go-wasm-socket"&gt;https://github.com/doganarif/go-wasm-socket&lt;/a&gt;&lt;/p&gt;

</description>
      <category>webassembly</category>
      <category>go</category>
      <category>websockets</category>
      <category>programming</category>
    </item>
    <item>
      <title>Dockerizing Django with Numpy and Gunicorn</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Thu, 14 Jan 2021 10:19:47 +0000</pubDate>
      <link>https://dev.to/arif/dockerizing-django-with-numpy-and-gunicorn-4ci</link>
      <guid>https://dev.to/arif/dockerizing-django-with-numpy-and-gunicorn-4ci</guid>
      <description>&lt;p&gt;Hi all.&lt;/p&gt;

&lt;p&gt;Actually theres lot of tutorial about dockerizing django on web. But at the project i working on we using libraries like Numpy and Pillow and alpine linux cannot compile this because missing dependencies.&lt;/p&gt;

&lt;p&gt;Our Stack;&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Django&lt;/li&gt;
&lt;li&gt;Numpy&lt;/li&gt;
&lt;li&gt;Pillow&lt;/li&gt;
&lt;li&gt;Postgresql&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;First of all let create basic &lt;code&gt;Dockerfile&lt;/code&gt;&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; python:3.6.4-alpine&lt;/span&gt;

&lt;span class="k"&gt;ADD&lt;/span&gt;&lt;span class="s"&gt; . /app&lt;/span&gt;

&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;

&lt;span class="k"&gt;RUN &lt;/span&gt;pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt

&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 8000&lt;/span&gt;

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["manage.py", "runserver"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Its basically works if you not using c compiled libraries like &lt;code&gt;Numpy&lt;/code&gt;.&lt;/p&gt;

&lt;p&gt;The thing is we need to add compilers to alpine i have big layer on there because running this proccesses on one layer is better i think.&lt;/p&gt;

&lt;p&gt;Full Dependency install Layer :&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;RUN &lt;/span&gt;apk add &lt;span class="nt"&gt;--no-cache&lt;/span&gt; jpeg tiff-dev openjpeg-dev postgresql-dev &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    apk add &lt;span class="nt"&gt;--no-cache&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nt"&gt;--virtual&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;.build-deps &lt;span class="se"&gt;\
&lt;/span&gt;    g++ zlib-dev freetype-dev lcms2-dev &lt;span class="se"&gt;\
&lt;/span&gt;    gcc libc-dev linux-headers tk-dev tcl-dev harfbuzz-dev &lt;span class="se"&gt;\
&lt;/span&gt;    fribidi-dev build-base py-pip rsyslog cython file binutils &lt;span class="se"&gt;\
&lt;/span&gt;    musl-dev python3-dev cython &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; locale.h /usr/include/xlocale.h &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--upgrade&lt;/span&gt; pip &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt &lt;span class="nt"&gt;--no-cache-dir&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; /root/.cache &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    find /usr/lib/python3.&lt;span class="k"&gt;*&lt;/span&gt;/ &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s1"&gt;'tests'&lt;/span&gt; &lt;span class="nt"&gt;-exec&lt;/span&gt; &lt;span class="nb"&gt;rm&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; &lt;span class="s1"&gt;'{}'&lt;/span&gt; + &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    find /usr/lib/python3.&lt;span class="k"&gt;*&lt;/span&gt;/site-packages/ &lt;span class="nt"&gt;-name&lt;/span&gt; &lt;span class="s1"&gt;'*.so'&lt;/span&gt; &lt;span class="nt"&gt;-print&lt;/span&gt; &lt;span class="nt"&gt;-exec&lt;/span&gt; sh &lt;span class="nt"&gt;-c&lt;/span&gt; &lt;span class="s1"&gt;'file "{}" | grep -q "not stripped" &amp;amp;&amp;amp; strip -s "{}"'&lt;/span&gt; &lt;span class="se"&gt;\;&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;rm&lt;/span&gt; /usr/include/xlocale.h &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    apk &lt;span class="nt"&gt;--purge&lt;/span&gt; del .build-deps
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Compilers for Alpine
&lt;/h3&gt;

&lt;p&gt;I installed some packages like &lt;code&gt;g++&lt;/code&gt; and &lt;code&gt;linux-headers&lt;/code&gt; virtual and cleaned them after pip install because i just need this packages for post scripts of some dependencies and size is matter on container.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;  -t, --virtual NAME    Instead of adding all the packages to 'world', create a new 
                        virtual package with the listed dependencies and add that 
                        to 'world'; the actions of the command are easily reverted 
                        by deleting the virtual package
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;What that means is when you install packages, those packages are not added to global packages. And this change can be easily reverted. So if I need gcc to compile a program, but once the program is compiled I no more need gcc. &lt;/p&gt;

&lt;p&gt;
I can install gcc, and other required packages in a virtual package and all of its dependencies and everything can be removed this virtual package name. Below is an example usage
&lt;/p&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;apk add --virtual mypacks gcc vim
apk del mypacks
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The next command will delete all 18 packages installed with the first command.&lt;/p&gt;




&lt;h3&gt;
  
  
  Postgres
&lt;/h3&gt;

&lt;p&gt;Im running Postgresql on my local machine not docker. And i need to connect it. You can use &lt;code&gt;host.docker.internal&lt;/code&gt; this mapping is directly routes your machines &lt;code&gt;localhost&lt;/code&gt;&lt;/p&gt;

&lt;p&gt;So i added this env variable&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; DB_HOST=host.docker.internal&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;and used this on `settings.py`&lt;/p&gt;




&lt;h3&gt;
  
  
  Gunicorn
&lt;/h3&gt;

&lt;p&gt;I decided to use Gunicorn for running processed with &lt;code&gt;n&lt;/code&gt; workers. So i added &lt;code&gt;gunicorn&lt;/code&gt; to my requirements.txt and added this two lines for run processes&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; WORKER_COUNT=2&lt;/span&gt;

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["sh", "-c", "gunicorn --bind :8000 --workers ${WORKER_COUNT} project.wsgi:application"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Using &lt;code&gt;WORKER_COUNT&lt;/code&gt; env variable for default value. Im overriding this value everytime i need.&lt;/p&gt;




&lt;h3&gt;
  
  
  Running
&lt;/h3&gt;

&lt;p&gt;I decided to create makefile for run this container with &lt;code&gt;n&lt;/code&gt; workers.&lt;br&gt;
So i created Makefile with this command.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight make"&gt;&lt;code&gt;
&lt;span class="nl"&gt;docker-run&lt;/span&gt;&lt;span class="o"&gt;:&lt;/span&gt;
    docker run &lt;span class="nt"&gt;-e&lt;/span&gt; &lt;span class="nv"&gt;worker_count&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;&lt;span class="p"&gt;${&lt;/span&gt;worker&lt;span class="p"&gt;}&lt;/span&gt; &lt;span class="nt"&gt;-p&lt;/span&gt; 127.0.0.1:8000:8000 project&lt;span class="p"&gt;${&lt;/span&gt;version&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;






&lt;h3&gt;
  
  
  Full Dockerfile
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight docker"&gt;&lt;code&gt;&lt;span class="k"&gt;FROM&lt;/span&gt;&lt;span class="s"&gt; python:3.6.4-alpine&lt;/span&gt;

&lt;span class="k"&gt;ADD&lt;/span&gt;&lt;span class="s"&gt; . /app&lt;/span&gt;
&lt;span class="k"&gt;WORKDIR&lt;/span&gt;&lt;span class="s"&gt; /app&lt;/span&gt;

&lt;span class="c"&gt;# Need to Update at Settings.py&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; DB_HOST=host.docker.internal&lt;/span&gt;

&lt;span class="c"&gt;# Default Worker Count is 2&lt;/span&gt;
&lt;span class="k"&gt;ENV&lt;/span&gt;&lt;span class="s"&gt; WORKER_COUNT=2&lt;/span&gt;

&lt;span class="c"&gt;# Install dependencies layer.&lt;/span&gt;
&lt;span class="k"&gt;RUN &lt;/span&gt;apk add &lt;span class="nt"&gt;--no-cache&lt;/span&gt; postgresql &lt;span class="se"&gt;\
&lt;/span&gt;                        jpeg tiff-dev openjpeg-dev &lt;span class="se"&gt;\
&lt;/span&gt;                        libpq postgresql-libs postgresql-dev &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    apk add &lt;span class="nt"&gt;--no-cache&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nt"&gt;--virtual&lt;/span&gt;&lt;span class="o"&gt;=&lt;/span&gt;.build-deps &lt;span class="se"&gt;\
&lt;/span&gt;    g++ jpeg-dev zlib-dev freetype-dev lcms2-dev &lt;span class="se"&gt;\
&lt;/span&gt;    gcc libc-dev linux-headers tk-dev tcl-dev harfbuzz-dev &lt;span class="se"&gt;\
&lt;/span&gt;    fribidi-dev zlib-dev build-base py-pip rsyslog cython file binutils &lt;span class="se"&gt;\
&lt;/span&gt;    musl-dev python3-dev cython &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    &lt;span class="nb"&gt;ln&lt;/span&gt; &lt;span class="nt"&gt;-s&lt;/span&gt; locale.h /usr/include/xlocale.h &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;--upgrade&lt;/span&gt; pip &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\
&lt;/span&gt;    pip &lt;span class="nb"&gt;install&lt;/span&gt; &lt;span class="nt"&gt;-r&lt;/span&gt; requirements.txt &lt;span class="nt"&gt;--no-cache-dir&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="se"&gt;\ &lt;/span&gt;
    rm -r /root/.cache &amp;amp;&amp;amp; \
    find /usr/lib/python3.*/ -name 'tests' -exec rm -r '{}' + &amp;amp;&amp;amp; \
    find /usr/lib/python3.*/site-packages/ -name '*.so' -print -exec sh -c 'file "{}" | grep -q "not stripped" &amp;amp;&amp;amp; strip -s "{}"' \; &amp;amp;&amp;amp; \
    rm /usr/include/xlocale.h &amp;amp;&amp;amp; \
    apk --purge del .build-deps  


&lt;span class="k"&gt;EXPOSE&lt;/span&gt;&lt;span class="s"&gt; 8000&lt;/span&gt;

&lt;span class="k"&gt;CMD&lt;/span&gt;&lt;span class="s"&gt; ["sh", "-c", "gunicorn --bind :8000 --workers ${WORKER_COUNT} project.wsgi:application"]&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;I hope you found this article useful.&lt;/p&gt;

&lt;p&gt;Thanks.&lt;/p&gt;

</description>
      <category>docker</category>
      <category>django</category>
      <category>python</category>
      <category>numpy</category>
    </item>
    <item>
      <title>Im Looking For Product Manager For My Own Project</title>
      <dc:creator>arif</dc:creator>
      <pubDate>Wed, 30 Dec 2020 22:07:38 +0000</pubDate>
      <link>https://dev.to/arif/im-looking-for-product-manager-for-my-own-project-3pin</link>
      <guid>https://dev.to/arif/im-looking-for-product-manager-for-my-own-project-3pin</guid>
      <description>&lt;h3&gt;
  
  
  Project Description
&lt;/h3&gt;

&lt;p&gt;Basically Take-Away eCommerce Mobile Application and BackEnd Services.&lt;/p&gt;

&lt;h3&gt;
  
  
  Status
&lt;/h3&gt;

&lt;p&gt;I only developed %50 of Back End Project with Django. For example developed pricing, basically customer logic etc.&lt;/p&gt;

&lt;h3&gt;
  
  
  Stack
&lt;/h3&gt;

&lt;p&gt;Im developing Back-End Web Services with Django. Also for mobile apps im thinking develop with flutter or react-native.&lt;/p&gt;

&lt;h3&gt;
  
  
  Requests
&lt;/h3&gt;

&lt;p&gt;I need one product manager for develop product and business. For now im funding this project myself. So i can share stock. If you can accept please message me :)  &lt;/p&gt;

</description>
      <category>product</category>
      <category>project</category>
    </item>
  </channel>
</rss>
