<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jovann Thompson</title>
    <description>The latest articles on DEV Community by Jovann Thompson (@thompson_jovann_4fae7e88d).</description>
    <link>https://dev.to/thompson_jovann_4fae7e88d</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3877545%2F0546de1f-bbc4-4a1c-8ea9-3790222d69f8.jpeg</url>
      <title>DEV Community: Jovann Thompson</title>
      <link>https://dev.to/thompson_jovann_4fae7e88d</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/thompson_jovann_4fae7e88d"/>
    <language>en</language>
    <item>
      <title>Systems Primitives: A Practical Software Systems Reading Framework</title>
      <dc:creator>Jovann Thompson</dc:creator>
      <pubDate>Wed, 27 May 2026 01:09:45 +0000</pubDate>
      <link>https://dev.to/thompson_jovann_4fae7e88d/systems-primitives-a-practical-reading-framework-2189</link>
      <guid>https://dev.to/thompson_jovann_4fae7e88d/systems-primitives-a-practical-reading-framework-2189</guid>
      <description>&lt;p&gt;While debugging my first software system, I kept running into the same problem: I could see failures happening, but I couldn’t consistently explain them. Sometimes the pipeline stalled. Sometimes artifacts looked correct while downstream stages failed anyway. Sometimes local fixes worked briefly, then broke again under slightly different conditions.&lt;/p&gt;

&lt;p&gt;At first, every issue felt like a separate bug. Over time I realized the deeper problem was that I didn’t yet have a stable way to read the system itself. I needed a way to answer questions like: what is this stage actually responsible for? Where does this behavior originate? What assumptions exist between components? What kind of failure is this?&lt;/p&gt;

&lt;p&gt;So while learning to navigate my own codebase more seriously, I started organizing a small set of recurring concepts that reduced ambiguity while debugging. It’s a practical reading framework: a set of primitives that helped me reason through real software behavior more coherently.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Core Problem
&lt;/h2&gt;

&lt;p&gt;One of the hardest parts of early debugging was that everything collapsed together. A timeout looked like a crash, a schema mismatch looked like a database failure, a slow stage looked like a dead process. Without structure, every symptom felt disconnected, which led to reactive debugging and making patches without fully understanding if the fix was right.&lt;/p&gt;

&lt;p&gt;The problem with that approach is that it treats symptoms independently instead of locating the actual responsibility layer. What finally started helping was separating the system into smaller reasoning categories to reduce confusion.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Primitives
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Promise&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;A system has to be understood in terms of what it’s trying to accomplish. The promise defines the intended result. Without a clear promise, it’s difficult to classify failures because there’s no stable definition of correct behavior.&lt;/p&gt;

&lt;p&gt;In the ETL pipeline, the promise was simple: transform raw PDF conversations into structured, traceable data. That immediately separates extraction failures from transformation failures from storage failures from reporting failures.&lt;/p&gt;

&lt;p&gt;It also clarified why the off-by-one labeling bug mattered so much later. The system was still producing output, but once INPUT/OUTPUT numbering drifted, the conversation became harder to trace reliably across stages. The pipeline was operationally running while violating part of its core promise: preserving coherent conversational structure.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Boundaries&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Boundaries define ownership: what the system controls versus what it depends on. This became important once the pipeline started interacting with external libraries, PDFs, SQLite, filesystem paths, and downstream visualization scripts.&lt;/p&gt;

&lt;p&gt;Without boundaries, debugging turns into blame diffusion. Every failure feels like it could belong to any layer. A concrete example of this came when I added graceful degradation and tried to rerun the pipeline against a different PDF. The run failed, but the failure wasn’t in my system. The PDF hadn’t been uploaded correctly and pdfplumber couldn’t parse the structure. Without a clear boundary in my head, I could have spent hours assuming my pipeline was broken. Once I understood where my system’s responsibility ended and the external dependency began, the real issue became obvious and I could think clearly about fallback logic instead of chasing a problem that wasn’t mine.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Flow&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Flow describes how work moves through the system: ordering, branching, transformations, retries, and stage progression. This became critical once the pipeline started looking dead. The runtime would reach diagnostics, go quiet, and appear frozen.&lt;/p&gt;

&lt;p&gt;What made flow traceable was following execution through the orchestrator. The orchestrator was the spine of the pipeline, the place where every stage connected. By tracing the execution path through it, I could see which stages had actually run, which ones were still in progress, and where the handoff between them was breaking down. That turned a frozen-looking runtime into something I could follow step by step.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Contracts&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Contracts are the assumptions shared between stages. One component produces something. Another component expects something. Those assumptions can involve schema, naming, ordering, file paths, formatting, or runtime behavior.&lt;/p&gt;

&lt;p&gt;A major shift in my debugging happened once I stopped treating failures as isolated bugs and started treating them as broken contracts between stages. A downstream script expecting a column that upstream processing never created isn’t random failure. It’s a contract mismatch. That framing made debugging much more precise.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;State&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;State answers: what is true right now? This became important because the pipeline often lacked durable runtime visibility. A stage might partially finish, silently fail, repeat work, or leave artifacts behind that looked valid even when execution was incomplete.&lt;/p&gt;

&lt;p&gt;What helped was learning to check what each stage actually produced before moving on. Once I could see where a stage stopped and what it left behind, the picture clarified immediately. I could see everything the pipeline had generated up to a certain point, and then one specific artifact was missing or incomplete. That narrowed the problem from “something is wrong somewhere” to “this particular stage didn’t finish what it promised.” Without that visibility, I kept confusing “currently running” with “successfully completed.”&lt;/p&gt;
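&lt;p&gt;A minimal sketch of that kind of check, assuming each stage writes its artifacts as files into one output directory. The helper name &lt;code&gt;stage_status&lt;/code&gt; and the file names are hypothetical, not taken from the pipeline.&lt;/p&gt;

```python
from pathlib import Path
import tempfile


def stage_status(output_dir, expected_artifacts):
    """Map each expected artifact name to 'ok', 'empty', or 'missing'."""
    status = {}
    for name in expected_artifacts:
        path = Path(output_dir) / name
        if not path.is_file():
            status[name] = "missing"
        elif path.stat().st_size == 0:
            # The file exists but nothing was written: a partial finish.
            status[name] = "empty"
        else:
            status[name] = "ok"
    return status


# Simulate a run where one artifact is complete, one is empty, one is absent.
with tempfile.TemporaryDirectory() as tmp:
    (Path(tmp) / "cleaned.csv").write_text("turn,input,output\n")
    (Path(tmp) / "qa_report.csv").write_text("")
    report = stage_status(tmp, ["cleaned.csv", "qa_report.csv", "missing_outputs.csv"])

print(report)
```

&lt;p&gt;A report like this separates “the file is there” from “the stage finished,” which is the distinction that kept getting blurred.&lt;/p&gt;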

&lt;p&gt;&lt;strong&gt;Invariants&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Invariants are conditions that must remain true for the system to stay correct. One invariant in the pipeline was conversational turn alignment: INPUT 1 / OUTPUT 1, INPUT 2 / OUTPUT 2. When the cleaned output started producing INPUT 2 / OUTPUT 1, the pipeline still ran. Nothing crashed. But the invariant was broken.&lt;/p&gt;

&lt;p&gt;That distinction exposed a different category of failure: quiet correctness drift. The system was operationally functional while structurally incorrect.&lt;/p&gt;
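&lt;p&gt;An invariant like this can be checked mechanically instead of by eye. The sketch below assumes the labels have already been pulled out of the cleaned output as strings such as &lt;code&gt;INPUT 1&lt;/code&gt;, and that every turn has both an INPUT and an OUTPUT, which real transcripts may not satisfy. The helper &lt;code&gt;first_misaligned_turn&lt;/code&gt; is hypothetical.&lt;/p&gt;

```python
import re

LABEL = re.compile(r"^(INPUT|OUTPUT) (\d+)$")


def first_misaligned_turn(labels):
    """Return the index of the first label that breaks pairing, or None."""
    expected_turn = 1
    expecting = "INPUT"
    for index, label in enumerate(labels):
        match = LABEL.match(label)
        if match is None:
            return index
        kind, number = match.group(1), int(match.group(2))
        if kind != expecting or number != expected_turn:
            return index
        if kind == "OUTPUT":
            # A completed pair moves the expectation to the next turn.
            expected_turn += 1
            expecting = "INPUT"
        else:
            expecting = "OUTPUT"
    return None


aligned = first_misaligned_turn(["INPUT 1", "OUTPUT 1", "INPUT 2", "OUTPUT 2"])
drifted = first_misaligned_turn(["INPUT 2", "OUTPUT 1", "INPUT 3", "OUTPUT 2"])
print(aligned, drifted)
```

&lt;p&gt;The drifted sequence fails at the very first label even though it still alternates cleanly, which is why a crash-only notion of failure misses it.&lt;/p&gt;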

&lt;p&gt;&lt;strong&gt;Constraints&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Constraints are the limits the system must operate inside: runtime, memory, file variability, dependency behavior, and data quality. One major debugging moment came after realizing diagnostics was taking about fifty minutes because PDFs were being reparsed repeatedly inside row-level loops. The issue wasn’t mysterious instability. The workload itself violated practical runtime constraints. Once the constraint became visible, the fix became much easier to reason about.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Failure Modes&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Failure modes classify recurring break patterns. Instead of “something weird happened again,” the question became “what category of failure is this?” Contract mismatch, silent runtime drift, invalid state, partial extraction, repeated expensive work, schema divergence, hidden branching behavior. Naming the category made debugging cumulative instead of repetitive. The same patterns started reappearing in recognizable forms.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Guarantees&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;Guarantees define what the system can reliably provide under stated conditions: not ideal behavior, but actual dependable behavior. In my pipeline, that distinction became real fast.&lt;/p&gt;

&lt;p&gt;The clearest example was labeling. The system was supposed to guarantee properly paired INPUT/OUTPUT labels from start to finish. But when I checked the cleaned output manually, the numbering was off from the very first turn. The pipeline implied it was producing correct structure. It wasn’t. Being explicit about what the system actually guarantees versus what it appears to guarantee forces realism and clarifies what downstream stages are actually allowed to trust.&lt;/p&gt;




&lt;h2&gt;
  
  
  One Real Failure Walkthrough
&lt;/h2&gt;

&lt;p&gt;One of the clearest examples of these primitives working together happened during diagnostics debugging. The runtime appeared to freeze during QA and diagnostics processing. At first the symptom looked like a crash. Using the primitives changed the investigation entirely.&lt;/p&gt;

&lt;p&gt;The promise said diagnostics should complete and produce visibility artifacts. Tracing flow through the orchestrator showed execution continued farther than expected. Examining state revealed that weak runtime visibility was making slow execution appear dead. Checking contracts showed downstream stages expected artifacts that hadn’t been fully validated yet. The constraint was the one that finally broke it open: repeated PDF parsing was creating severe runtime overhead, reopening and reparsing full PDFs inside row-level loops across 82 calls at roughly 37 seconds each.&lt;/p&gt;

&lt;p&gt;The fix was structural. Parse once, cache the text, reuse lightweight searches. But the important part wasn’t the optimization itself. It was that the primitives reduced ambiguity enough to locate the real responsibility layer. Without that structure, the investigation would have kept bouncing between symptoms.&lt;/p&gt;




&lt;h2&gt;
  
  
  Limits of the Framework
&lt;/h2&gt;

&lt;p&gt;This framework has real limits worth naming. The concepts overlap: contracts often exist at boundaries, state transitions occur through flow, and guarantees depend on constraints and invariants. They’re more like perspectives than isolated primitives.&lt;/p&gt;

&lt;p&gt;It’s also strongest for engineered systems. It becomes weaker in environments dominated by incentives, politics, social dynamics, or human behavior that doesn’t follow a spec. And it isn’t predictive in any rigorous scientific sense.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Changed
&lt;/h2&gt;

&lt;p&gt;The biggest shift this framework created was moving debugging from reactive behavior toward structured reasoning. Before this, failures felt random. Afterward, systems became easier to decompose: define the promise, identify the boundaries, trace the flow, verify the state, locate the broken contract, identify the constraint, classify the failure mode, then fix the smallest responsible layer.&lt;/p&gt;

&lt;p&gt;That sequence didn’t eliminate complexity. It made the complexity legible.&lt;/p&gt;

&lt;p&gt;And honestly, that was the real transition. Not learning how to write software, but learning how to read systems well enough that failures stopped feeling like chaos.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;The primitives in this framework came directly from building and documenting a real local ETL pipeline. system-envelope.md is the architecture doc where this thinking first took shape: github.com/Jt-Thompson&lt;/strong&gt;&lt;/p&gt;

</description>
      <category>architecture</category>
      <category>softwareengineering</category>
      <category>debugging</category>
      <category>systems</category>
    </item>
    <item>
      <title>Debug Log #2 — The Off-By-One That Didn’t Crash (It Just Lied)</title>
      <dc:creator>Jovann Thompson</dc:creator>
      <pubDate>Tue, 26 May 2026 03:57:20 +0000</pubDate>
      <link>https://dev.to/thompson_jovann_4fae7e88d/debug-log-2-the-off-by-one-that-didnt-crash-it-just-lied-3o5m</link>
      <guid>https://dev.to/thompson_jovann_4fae7e88d/debug-log-2-the-off-by-one-that-didnt-crash-it-just-lied-3o5m</guid>
      <description>&lt;p&gt;I built a local pipeline to take long chat transcripts saved as PDFs and turn them into structured, cleaned output in which every conversational turn is rewritten into paired labels:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INPUT 1 / OUTPUT 1
INPUT 2 / OUTPUT 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That pairing is the contract. It’s what makes the transcript auditable instead of just scrollable.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Symptom
&lt;/h2&gt;

&lt;p&gt;While doing a last integrity pass, I opened the cleaned PDF to confirm the labeling held from start to finish. But right at the beginning the artifact was telling me a different story:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INPUT 2 / OUTPUT 1
INPUT 3 / OUTPUT 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The system was still alternating input/output, the output existed, the pipeline completed. But the numbering was shifted from the first turn. The system runs, the output exists, and the output is quietly lying by one. That lie ripples into every downstream count, integrity check, and assumption built on top of it.&lt;/p&gt;

&lt;p&gt;The real question became: where is the first place the system starts lying?&lt;/p&gt;




&lt;h2&gt;
  
  
  Initial Confusion
&lt;/h2&gt;

&lt;p&gt;At first I kept framing it as a counting issue: maybe something in the missing-input/missing-output analysis, maybe a reporting mismatch, maybe an integrity summary that was slightly off. I didn’t want to rerun the entire dataset just to test a small correctness problem, so I tried to do it the right way: make a small sample input, isolate the stage, validate expected versus actual.&lt;/p&gt;

&lt;p&gt;That immediately raised practical questions I couldn’t dodge. Where do I even inject a sample? If my entrypoint starts at PDFs, how do I test a mid-stage without breaking the whole flow? If I create a CSV, which CSV does the stage actually expect?&lt;/p&gt;

&lt;p&gt;The framing itself was the problem. I was treating it like a reporting bug when it was actually a contract bug.&lt;/p&gt;




&lt;h2&gt;
  
  
  What the Bug Really Was
&lt;/h2&gt;

&lt;p&gt;The system was never meant to count like:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INPUT 1, OUTPUT 2, INPUT 3, OUTPUT 4...
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;It was meant to preserve paired conversational turns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;INPUT 1 / OUTPUT 1
INPUT 2 / OUTPUT 2
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;So if the cleaned PDF starts at &lt;code&gt;INPUT 2 / OUTPUT 1&lt;/code&gt;, the core failure isn’t in downstream analysis. The numbering contract is being violated somewhere upstream, and everything else is just inheriting the damage. Reframing it that way collapsed the search space immediately. Stop looking at reporting and trace back to wherever the labels get written in the first place.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Trap I Almost Fell Into
&lt;/h2&gt;

&lt;p&gt;Before that reframe landed, I tried to build a debug input using raw “you said / chatgpt said” style text, because that’s what I visually associate with the PDF source. But a test fixture only helps if it matches the contract of the stage you’re actually testing. Some stages in the pipeline don’t consume raw conversational text; they consume already-columnized CSV data. Feed the wrong-shaped input into the wrong layer and you’re not debugging the system anymore. You’re debugging a mismatch you created.&lt;/p&gt;

&lt;p&gt;That was one of the real lessons of this log: if your mental model of the pipeline layers is even slightly off, you can do a lot of work that produces zero signal.&lt;/p&gt;




&lt;h2&gt;
  
  
  Tracing Back to the First Lie
&lt;/h2&gt;

&lt;p&gt;The way it became solvable was tracing backward from the artifact I trusted until I found the first divergence.&lt;/p&gt;

&lt;p&gt;Start with the cleaned PDF, where the numbering is wrong at the first turn. Work backward through the pipeline outputs and stage boundaries. At each boundary ask: is the numbering still correct here, or did it break here? The moment a layer is confirmed correct, stop blaming it and move earlier.&lt;/p&gt;

&lt;p&gt;That tracing forced a clear outcome. The numbering wasn’t being broken by the analysis layer. It wasn’t something happening at the end. It was being introduced in the ingestion and cleaning step, the part of the system that writes the labels in the first place.&lt;/p&gt;




&lt;h2&gt;
  
  
  Root Cause
&lt;/h2&gt;

&lt;p&gt;The offset wasn’t random drift. It was a systematic base shift baked in from the start.&lt;/p&gt;

&lt;p&gt;I remembered why: sometimes when copying a thread, the first “You said:” label doesn’t exist the way the parser expects, so I had added logic to bootstrap the first input anyway. The intention was correct: recover from messy real-world formatting. But the implementation created a permanent misalignment. Input and output were being advanced out of sync at the very beginning, so everything after stayed consistently off by one.&lt;/p&gt;

&lt;p&gt;The bug didn’t need to crash to be real. It just needed to violate the contract once.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Fix
&lt;/h2&gt;

&lt;p&gt;The fix was structural, not a patch. Instead of two separate counters drifting against each other, the labeling logic was rebuilt around a single turn counter that increments only when an INPUT is encountered or injected, labels OUTPUT using that same turn number, and ensures the edge-case injection doesn’t double-increment the first real turn. The goal was to make it structurally impossible for OUTPUT numbering to drift away from INPUT numbering, regardless of what the source formatting looks like.&lt;/p&gt;




&lt;h2&gt;
  
  
  Proof
&lt;/h2&gt;

&lt;p&gt;I didn’t jump straight into a full run. I validated the fix in isolation first, with a small harness that calls the labeling function directly against three cases: normal format, missing first label, and continuation from a higher turn number. Only after the harness proved the contract held did I rerun the full pipeline and spot-check the cleaned PDF from beginning to end.&lt;/p&gt;

&lt;p&gt;The output stayed aligned. The labeling read sharper because it was finally consistent.&lt;/p&gt;

&lt;p&gt;That’s what closed the loop: not “it seems fixed,” but the invariant proven in isolation, then proven again end-to-end.&lt;/p&gt;
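&lt;p&gt;The single-counter idea and the three isolation cases can be sketched together as follows. This is an illustration under assumptions, not the pipeline’s actual code: the &lt;code&gt;label_turns&lt;/code&gt; name, the (speaker, text) segment shape, and the empty placeholder used for an injected INPUT are all hypothetical.&lt;/p&gt;

```python
def label_turns(segments, start_turn=0):
    """Label (speaker, text) segments as paired INPUT n / OUTPUT n.

    One counter owns the numbering. It advances only when an INPUT is
    encountered or injected; OUTPUT always reuses the current value.
    """
    turn = start_turn
    input_open = False  # True while an INPUT is waiting for its OUTPUT
    labeled = []
    for speaker, text in segments:
        if speaker == "user":
            turn += 1
            input_open = True
            labeled.append((f"INPUT {turn}", text))
        else:
            if not input_open:
                # Assumed edge case: the source lost its "You said:" label,
                # so inject a placeholder INPUT before labeling the OUTPUT.
                turn += 1
                labeled.append((f"INPUT {turn}", ""))
            labeled.append((f"OUTPUT {turn}", text))
            input_open = False
    return labeled


def labels(segments, start_turn=0):
    """Return only the label strings, which is what the harness compares."""
    return [label for label, _ in label_turns(segments, start_turn)]


# The three isolation cases: normal format, missing first label, continuation.
normal = labels([("user", "hi"), ("assistant", "hello"), ("user", "q"), ("assistant", "a")])
missing_first = labels([("assistant", "hello"), ("user", "q"), ("assistant", "a")])
continued = labels([("user", "q"), ("assistant", "a")], start_turn=5)

assert normal == ["INPUT 1", "OUTPUT 1", "INPUT 2", "OUTPUT 2"]
assert missing_first == ["INPUT 1", "OUTPUT 1", "INPUT 2", "OUTPUT 2"]
assert continued == ["INPUT 6", "OUTPUT 6"]
```

&lt;p&gt;Because OUTPUT never has a counter of its own in this sketch, there is nothing for it to drift against.&lt;/p&gt;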




&lt;h2&gt;
  
  
  What This Log Is Really About
&lt;/h2&gt;

&lt;p&gt;This was a quiet failure mode: a system that runs fine, produces output, and misleads you the whole time.&lt;/p&gt;

&lt;p&gt;The takeaway is simple: if an artifact looks slightly wrong, don’t argue with it and don’t patch randomly. Trace backward until you find the first layer where the contract breaks. Fix the smallest layer that owns the contract. Prove it in isolation. Then reintegrate.&lt;/p&gt;

&lt;p&gt;That’s how you stop a system from merely running and start making it trustworthy.&lt;/p&gt;




&lt;h2&gt;
  
  
  Project
&lt;/h2&gt;

&lt;p&gt;GitHub Repository:&lt;br&gt;
&lt;a href="https://github.com/Jt-Thompson" rel="noopener noreferrer"&gt;https://github.com/Jt-Thompson&lt;/a&gt;&lt;/p&gt;

</description>
      <category>invariants</category>
      <category>dataengineering</category>
      <category>testing</category>
      <category>debugging</category>
    </item>
    <item>
      <title>Debug Log #1 — The Pipeline That Looked Broken</title>
      <dc:creator>Jovann Thompson</dc:creator>
      <pubDate>Tue, 26 May 2026 02:40:24 +0000</pubDate>
      <link>https://dev.to/thompson_jovann_4fae7e88d/debug-log-1-the-pipeline-that-looked-broken-23f8</link>
      <guid>https://dev.to/thompson_jovann_4fae7e88d/debug-log-1-the-pipeline-that-looked-broken-23f8</guid>
      <description>&lt;p&gt;I had been building a local ETL pipeline designed to process long conversational PDFs into structured datasets. The system extracted dialogue, cleaned it, generated QA artifacts, and loaded the results into SQLite for downstream analysis.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b0ub82fpq1568ut6ogk.jpeg" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F4b0ub82fpq1568ut6ogk.jpeg" alt=" " width="800" height="1192"&gt;&lt;/a&gt;&lt;br&gt;
By the time this debugging process started, the core extract-transform-load flow already worked. Data could move end-to-end through the system successfully.&lt;/p&gt;

&lt;p&gt;The problems started showing up once I added the QA and diagnostics stages around it. During development, parts of those systems seemed to work. But when I came back and reran the full pipeline, execution would appear to stop somewhere around diagnostics. Long stretches of silence. No artifacts I could confirm. No database state I fully trusted.&lt;/p&gt;

&lt;p&gt;At that point I knew something was wrong, but I didn’t yet have enough experience to understand what kind of wrong it was. I tried a few patches based on the first advice I got: potential path issues or output mismatches. None of them solved it. So the project sat for a while. I had to spend time away from the build learning how to read my own codebase, trace execution, and navigate the system well enough to come back and debug it properly.&lt;/p&gt;


&lt;h2&gt;
  
  
  Initial Understanding
&lt;/h2&gt;

&lt;p&gt;When I came back, I started with a simpler question: was the pipeline actually broken, or did it only look broken because I couldn’t see what was happening?&lt;/p&gt;

&lt;p&gt;Part of what triggered that question was staring at the terminal during long runs and realizing the process was still alive even though nothing visible was happening. At the time I was still learning basic operational ideas like what a “hang” even meant in practice. I had been treating long silence like proof that the system was dead, when in reality some pipeline states are just slow, blocked, waiting, or stuck behind expensive work.&lt;/p&gt;

&lt;p&gt;That reframe changed the investigation immediately. Instead of treating it like one giant broken object, I started seeing it as a chain of expectations between stages. One script writes outputs. Another script expects those outputs somewhere specific. One stage assumes a schema already exists. Another assumes a naming format already matches. If those assumptions drift even slightly, the whole pipeline can look broken from the outside even when parts of it are still functioning correctly.&lt;/p&gt;

&lt;p&gt;So the debugging process became: isolate the stopping point, check what was actually produced, compare it against what the next stage expected, then narrow the mismatch before changing anything.&lt;/p&gt;


&lt;h2&gt;
  
  
  Runtime Visibility
&lt;/h2&gt;

&lt;p&gt;Earlier I had added a short timeout during debugging attempts, but I eventually realized the timeout logic was only surfacing a warning state, not actually terminating the process itself. The run would hit QA and diagnostics, I’d see the timeout, and everything after that seemed to disappear.&lt;/p&gt;

&lt;p&gt;Instead of patching again, I started watching the runtime more carefully. I checked CPU and RAM usage to see whether the process was actually dead or just slow under load. I watched where execution appeared to stall. Then I did the thing I usually avoid during long runs: I waited long enough for the system to reveal more information on its own.&lt;/p&gt;

&lt;p&gt;That changed the picture completely. The timeout message was not the same thing as “the pipeline stops.” It was just one event earlier in the run. Once I let the process continue, the pipeline moved into later stages successfully. The question stopped being “why does the pipeline die at QA?” and became “what is actually happening after QA finishes?”&lt;/p&gt;

&lt;p&gt;Eventually the runtime progressed far enough for the real failure to surface:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;sqlite3.OperationalError: no such column: missing_input
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That changed the debugging process again. Now the issue was no longer vague runtime ambiguity. There was a concrete failure tied to a specific schema mismatch much later in execution. The pipeline was running farther than it looked. The runtime visibility had just been too weak to make that obvious earlier.&lt;/p&gt;
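&lt;p&gt;A mismatch like this can be caught before the query runs by comparing the columns a stage expects against what the table actually has. The sketch below uses SQLite’s &lt;code&gt;PRAGMA table_info&lt;/code&gt;. The table name &lt;code&gt;turns&lt;/code&gt;, its columns, the &lt;code&gt;missing_output&lt;/code&gt; column, and the helper &lt;code&gt;missing_columns&lt;/code&gt; are assumptions for illustration; only the column name &lt;code&gt;missing_input&lt;/code&gt; comes from the error above.&lt;/p&gt;

```python
import sqlite3


def missing_columns(conn, table, expected):
    """Return the expected column names that the table does not have."""
    # PRAGMA table_info yields one row per column; index 1 is the name.
    # The table name is interpolated, so it must come from trusted code.
    rows = conn.execute(f"PRAGMA table_info({table})").fetchall()
    present = {row[1] for row in rows}
    return sorted(set(expected) - present)


# Hypothetical schema that upstream created without the diagnostic columns.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE turns (turn_id INTEGER, input_text TEXT, output_text TEXT)")

gaps = missing_columns(conn, "turns", ["turn_id", "missing_input", "missing_output"])
print(gaps)
conn.close()
```

&lt;p&gt;Running a check like this at the start of a downstream stage turns a late &lt;code&gt;OperationalError&lt;/code&gt; into an early, named contract failure.&lt;/p&gt;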




&lt;h2&gt;
  
  
  Instrumentation and Tracing
&lt;/h2&gt;

&lt;p&gt;Once I understood the pipeline was continuing much farther than I originally thought, I stopped treating it like a mysterious crash and started measuring directly.&lt;/p&gt;

&lt;p&gt;I added timing instrumentation at the stage boundaries. The ambiguity disappeared immediately. Diagnostics wasn’t “a little slow.” It was taking around fifty minutes:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;[TIMING] diagnostics stage completed in 3036.21s
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
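&lt;p&gt;Stage-boundary timing of this kind takes only a few lines. The sketch below is one way to do it, not the pipeline’s actual instrumentation: &lt;code&gt;timed_stage&lt;/code&gt; is a hypothetical helper, the start and failure messages are additions, and only the shape of the completion message follows the log line above.&lt;/p&gt;

```python
import time
from contextlib import contextmanager


@contextmanager
def timed_stage(name, log=print):
    """Log when a stage starts and how long it ran, on success or failure."""
    log(f"[TIMING] {name} stage started")
    start = time.perf_counter()
    try:
        yield
    except Exception:
        elapsed = time.perf_counter() - start
        log(f"[TIMING] {name} stage failed after {elapsed:.2f}s")
        raise
    else:
        elapsed = time.perf_counter() - start
        log(f"[TIMING] {name} stage completed in {elapsed:.2f}s")


# Collect messages in a list here; a real run would print or write to a log.
messages = []
with timed_stage("diagnostics", log=messages.append):
    time.sleep(0.01)  # stand-in for the real stage body

for line in messages:
    print(line)
```

&lt;p&gt;Logging the start as well as the end means a long silence can be attributed to a specific stage while it is still running.&lt;/p&gt;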



&lt;p&gt;At that point the problem stopped feeling random. One part of the system was repeatedly doing expensive work. So I followed the runtime cost from stage to module to function to loop, until the slowdown had a specific address.&lt;/p&gt;

&lt;p&gt;That narrowing led to &lt;code&gt;write_missing_outputs_csv&lt;/code&gt;, which accounted for nearly the entire diagnostics runtime.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Bottleneck
&lt;/h2&gt;

&lt;p&gt;Tracing deeper into that function showed repeated calls to &lt;code&gt;extract_pdf_context(...)&lt;/code&gt; inside the row-level loop. Following the call chain into &lt;code&gt;quality_utils.py&lt;/code&gt; confirmed what was happening: the function was calling &lt;code&gt;pdfplumber.open(pdf_path)&lt;/code&gt;, iterating through every page, and rebuilding the extracted text from scratch, then doing the exact same thing on the next row.&lt;/p&gt;

&lt;p&gt;Before changing anything, I added a counter to verify the scale of it directly:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;extract_pdf_context called: 82
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Then the math:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;~3081 seconds total / 82 calls ≈ 37.5 seconds per call
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;At that point the issue stopped being a hypothesis. I knew exactly how to reason about a fix.&lt;/p&gt;




&lt;h2&gt;
  
  
  The Structural Refactor
&lt;/h2&gt;

&lt;p&gt;The fix was about changing the shape of the system so the work couldn’t repeat itself.&lt;/p&gt;

&lt;p&gt;The workload changed from:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;rows × (open PDF + parse all pages)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;to:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;(open PDF + parse all pages once) + rows × (string search)
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;In practice: a new function &lt;code&gt;extract_full_pdf_text()&lt;/code&gt; opens and parses the PDF one time before the loop and returns the full text as a string. A second function &lt;code&gt;extract_pdf_context_from_text()&lt;/code&gt; takes that cached string and does a lightweight search against it, with no file I/O and no page iteration. Inside the loop, only the search runs. I applied the same refactor pattern across both QA and diagnostics stages, then removed the temporary profiling scaffolding and kept the durable timing instrumentation that was still useful operationally.&lt;/p&gt;
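&lt;p&gt;A sketch of that shape follows. The two function names come from the description above, but their signatures and bodies here are assumptions, including the &lt;code&gt;needle&lt;/code&gt; and &lt;code&gt;window&lt;/code&gt; parameters. &lt;code&gt;pdfplumber&lt;/code&gt; is imported lazily so the search half runs without it, and a hard-coded string stands in for the parsed PDF.&lt;/p&gt;

```python
def extract_full_pdf_text(pdf_path):
    """Open and parse the PDF once; return all page text as one string."""
    import pdfplumber  # third-party; lazy import keeps the search helper standalone

    with pdfplumber.open(pdf_path) as pdf:
        # extract_text() can return None for a page with no text layer.
        return "\n".join(page.extract_text() or "" for page in pdf.pages)


def extract_pdf_context_from_text(full_text, needle, window=40):
    """Return text around the first occurrence of needle, or None if absent."""
    position = full_text.find(needle)
    if position == -1:
        return None
    start = max(0, position - window)
    end = position + len(needle) + window
    return full_text[start:end]


# Stand-in for: cached_text = extract_full_pdf_text(pdf_path)
# The expensive parse happens once, before the loop.
cached_text = "You said: how do I cache this?\nChatGPT said: parse once, reuse."

# Inside the loop only the in-memory string search runs.
rows = ["cache this", "parse once", "not in the document"]
contexts = [extract_pdf_context_from_text(cached_text, row, window=5) for row in rows]
print(contexts)
```

&lt;p&gt;The per-row cost drops from a full PDF parse to a substring search over text already in memory.&lt;/p&gt;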




&lt;h2&gt;
  
  
  First Clean Full Run
&lt;/h2&gt;

&lt;p&gt;Up until this point, one of the hardest parts of debugging the pipeline was not being able to tell whether it was actually finishing at all. Long stretches of silence made the runtime feel ambiguous. It looked dead, stalled, or half-working. I couldn’t fully trust what I was seeing.&lt;/p&gt;

&lt;p&gt;Then I finally got a clean full run. The important part wasn’t just that the pipeline completed. It was that the system explained itself clearly at the end:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;811 total inputs
770 outputs
0 missing inputs
41 missing outputs
94.94% coverage
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The SQLite load completed with 811 rows. The database-side missing outputs count matched what the diagnostics and CSV-side reporting were showing. Earlier in the investigation, different artifacts often contradicted each other and created more ambiguity. Now the outputs were reinforcing each other.&lt;/p&gt;

&lt;p&gt;The full run took around 470 seconds, just under eight minutes. That reframed a lot of the earlier fear around the pipeline hanging. Now I had a runtime I could measure and reason about.&lt;/p&gt;

&lt;p&gt;The pipeline was no longer a broken system somewhere in the middle of execution. It had become a system that could complete end-to-end, validate itself, load the database successfully, and expose the remaining problems as specific issues instead of vague uncertainty.&lt;/p&gt;




&lt;h2&gt;
  
  
  What Changed
&lt;/h2&gt;

&lt;p&gt;Before this investigation, debugging felt mostly reactive. Change something, rerun it, hope the behavior improves. This process introduced a different sequence: wait long enough for the system to reveal itself, instrument at the boundaries, follow the runtime cost by layers, verify assumptions with concrete measurements, then refactor the structure rather than the logic.&lt;/p&gt;

&lt;p&gt;The lack of observability wasn’t a side issue. It was half the problem. Once timing existed at stage boundaries, the pipeline stopped feeling like a black box and started feeling like something I could reason about directly.&lt;/p&gt;




&lt;h2&gt;
  
  
  Project
&lt;/h2&gt;

&lt;p&gt;GitHub Repository:&lt;br&gt;
&lt;a href="https://github.com/Jt-Thompson" rel="noopener noreferrer"&gt;https://github.com/Jt-Thompson&lt;/a&gt;&lt;/p&gt;

</description>
      <category>python</category>
      <category>debugging</category>
      <category>etl</category>
      <category>instrumentation</category>
    </item>
  </channel>
</rss>
