<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: John Mahoney</title>
    <description>The latest articles on DEV Community by John Mahoney (@john_mahoney_41e9c2589ceb).</description>
    <link>https://dev.to/john_mahoney_41e9c2589ceb</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3864776%2F26a7deb8-e1d4-4d5b-bb69-5bce9d9993bb.png</url>
      <title>DEV Community: John Mahoney</title>
      <link>https://dev.to/john_mahoney_41e9c2589ceb</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/john_mahoney_41e9c2589ceb"/>
    <language>en</language>
    <item>
      <title>How we run real-time AI deposition analysis with Deepgram + Claude</title>
      <dc:creator>John Mahoney</dc:creator>
      <pubDate>Tue, 21 Apr 2026 16:49:09 +0000</pubDate>
      <link>https://dev.to/john_mahoney_41e9c2589ceb/how-we-run-real-time-ai-deposition-analysis-with-deepgram-claude-492p</link>
      <guid>https://dev.to/john_mahoney_41e9c2589ceb/how-we-run-real-time-ai-deposition-analysis-with-deepgram-claude-492p</guid>
      <description>&lt;blockquote&gt;
&lt;p&gt;&lt;strong&gt;TL;DR&lt;/strong&gt; — Two-hour witness depos produce 12K–25K words of live transcript. We run that through Deepgram Nova-3 → a single Node.js WebSocket server → Claude Haiku-4-5 with a 12-key JSON schema → the attorney's screen in ~4 seconds per segment. The hard parts weren't the models; they were WebSocket idle timeouts, audio pipelines on macOS, and prompt engineering to stop hallucinated PubMed citations. Here's what shipped and what broke on the way.&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h2&gt;
  
  
  The setup
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Product context:&lt;/strong&gt; plaintiff med-mal attorneys spend a lot of their life in depositions. "Did the expert's current testimony contradict their report from 2019?" is the kind of question that wins cases. It's also the kind of question a human brain doesn't hold well during two hours of straight listening.&lt;/p&gt;

&lt;p&gt;We're building a tool that does hold it: Courtroom AI lives in a browser tab the attorney watches during the depo. It listens through the reporter's realtime stream (or a microphone, or a Deepgram hook-up), produces structured JSON analysis per testimony segment, and pushes real-time flags to the side panel: admissions, evasion patterns, prior-testimony contradictions, peer-reviewed literature that contradicts "in my experience" claims, and FRE foundation triggers.&lt;/p&gt;

&lt;p&gt;Stack is almost embarrassingly simple:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Frontend&lt;/strong&gt;: React + Vite, no router, no state library beyond component state and a &lt;code&gt;useWebSocket&lt;/code&gt; hook. Single 232 KB JS bundle.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Backend&lt;/strong&gt;: one Node.js HTTP server, &lt;code&gt;ws&lt;/code&gt; for WebSocket, &lt;code&gt;@anthropic-ai/sdk&lt;/code&gt; for Claude calls.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Transcription&lt;/strong&gt;: Deepgram Nova-3 for streaming ASR; microphone fallback via the Web Speech API; pasted-transcript "simulator" mode for replaying historical depos.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analysis&lt;/strong&gt;: Claude Haiku-4-5 with &lt;code&gt;max_tokens: 8192&lt;/code&gt; (we'll get to why).&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Hosting&lt;/strong&gt;: Railway. Deploys on every push to &lt;code&gt;main&lt;/code&gt;. Uptime monitoring via UptimeRobot; error tracking via Sentry.&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  The 12-key analysis schema
&lt;/h2&gt;

&lt;p&gt;Every testimony segment (roughly one Q-A pair or a 60-second chunk of narrative) goes through one Claude call that returns a strict JSON object with 12 top-level keys:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json-doc"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"medical"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"accuracyScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0-10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"inaccuracies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"daubert"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vulnerabilityScore"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="mi"&gt;0-10&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"vulnerabilities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"priorTestimony"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"inconsistencies"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"impeachmentOpportunities"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"crossExam"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"questions"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;],&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"keyWeaknesses"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"elements"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"duty"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"breach"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"causation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"damages"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"admission"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"isAdmission"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"quote"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"significance"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"evasion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"isEvasive"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pattern"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"escalationScript"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"coverage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"topicsCovered"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"foundation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"triggers"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="s2"&gt;"FRE 613"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"FRE 803(18)"&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"chartContradiction"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"contradicted"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;bool&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"witnessClaim"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"chartEvidence"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"literatureHits"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"witnessClaim"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"pubmedQuery"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="s2"&gt;"..."&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="nl"&gt;"results"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="c1"&gt;...&lt;/span&gt;&lt;span class="p"&gt;]&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
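
&lt;p&gt;One way to hold the response to that shape on the receiving side is a cheap key check right after parsing. The following is only a sketch: &lt;code&gt;EXPECTED_KEYS&lt;/code&gt; and &lt;code&gt;missingKeys&lt;/code&gt; are illustrative names, and the list contains just the top-level keys visible in the listing above.&lt;/p&gt;

```javascript
// Top-level keys copied from the schema listing above.
const EXPECTED_KEYS = [
  'medical', 'daubert', 'priorTestimony', 'crossExam', 'elements',
  'admission', 'evasion', 'coverage', 'foundation',
  'chartContradiction', 'literatureHits',
];

// Returns the expected keys that are absent from a parsed analysis object.
function missingKeys(analysis) {
  return EXPECTED_KEYS.filter((key) => !(key in analysis));
}
```

&lt;p&gt;An empty result means every listed key is present; anything else can be logged or shown as a partial analysis instead of failing silently.&lt;/p&gt;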



&lt;p&gt;The reason it's one big schema instead of 12 separate calls: latency. At 12 sequential calls per segment, with Haiku averaging 1.5s per response, the attorney would see analysis ~18 seconds after the witness spoke. That's unusable. One combined call with a single output budget lands in ~4 seconds on dense segments.&lt;/p&gt;

&lt;h2&gt;
  
  
  Gotcha #1: max_tokens 4096 was too low
&lt;/h2&gt;

&lt;p&gt;When we first shipped, &lt;code&gt;crossExam&lt;/code&gt; at the end of the output would truncate mid-sentence. Users saw "Cross-examination analysis failed" toast errors. The model wasn't failing; it was hitting the token budget.&lt;/p&gt;

&lt;p&gt;Our initial setting was &lt;code&gt;max_tokens: 4096&lt;/code&gt; (the Messages API requires the parameter, so there is no model default to fall back on). Twelve fields' worth of nested arrays on dense Q-A segments routinely need 6–8K output tokens. We bumped to &lt;code&gt;max_tokens: 8192&lt;/code&gt;. Output is billed by the tokens actually generated, not by the ceiling, so raising it costs nothing extra on segments that stay short. The fix was a one-line change; the diagnosis took hours because the error surfaced as "JSON parse error at position X" rather than "max_tokens exceeded."&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Takeaway:&lt;/strong&gt; if Claude is returning malformed JSON, check the response's &lt;code&gt;stop_reason&lt;/code&gt; before debugging the prompt. &lt;code&gt;stop_reason: "max_tokens"&lt;/code&gt; means your budget is the problem.&lt;/p&gt;
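
&lt;p&gt;In the Anthropic Messages API the truncation signal is the response's &lt;code&gt;stop_reason&lt;/code&gt; field. A minimal sketch of checking it before parsing, assuming the response object has the documented shape (&lt;code&gt;stop_reason&lt;/code&gt; plus a &lt;code&gt;content&lt;/code&gt; array of blocks); &lt;code&gt;parseAnalysis&lt;/code&gt; is a hypothetical helper name, not our production code:&lt;/p&gt;

```javascript
// Sketch: surface truncation explicitly before attempting JSON.parse.
// `message` is assumed to look like a Messages API response:
// { stop_reason: string, content: [{ type: 'text', text: string }, ...] }
function parseAnalysis(message) {
  if (message.stop_reason === 'max_tokens') {
    // Output was cut off at the token ceiling, so the JSON is incomplete.
    throw new Error('Claude output truncated: raise max_tokens or shrink the schema');
  }
  const text = message.content
    .filter((block) => block.type === 'text')
    .map((block) => block.text)
    .join('');
  return JSON.parse(text);
}
```

&lt;p&gt;With this in place, a truncated response fails with a message that names the real cause instead of a parse position.&lt;/p&gt;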

&lt;h2&gt;
  
  
  Gotcha #2: WebSockets and Cloudflare's 100s idle timeout
&lt;/h2&gt;

&lt;p&gt;Railway fronts everything with Cloudflare, which closes idle WebSocket connections after 100 seconds. A witness pausing to read a document could easily take 2 minutes. We were losing sessions silently, and the user would see "analysis stopped" with no obvious cause.&lt;/p&gt;

&lt;p&gt;Fix: a client-side ping every 25 seconds. Four pings per Cloudflare window, well under the budget:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="nf"&gt;useEffect&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="o"&gt;!&lt;/span&gt;&lt;span class="nx"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;id&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;setInterval&lt;/span&gt;&lt;span class="p"&gt;(()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;readyState&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="nx"&gt;WebSocket&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;OPEN&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;ping&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="mi"&gt;25&lt;/span&gt;&lt;span class="nx"&gt;_000&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;return &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;clearInterval&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;id&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;},&lt;/span&gt; &lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Server side just echoes &lt;code&gt;pong&lt;/code&gt; and moves on. Zero logic, pure connection keepalive.&lt;/p&gt;
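
&lt;p&gt;The server half of the keepalive could look roughly like this. It is a sketch under assumptions: the &lt;code&gt;pong&lt;/code&gt; reply shape, the &lt;code&gt;handleClientMessage&lt;/code&gt; name, and the &lt;code&gt;onOther&lt;/code&gt; callback are illustrative, and the handler is kept free of any &lt;code&gt;ws&lt;/code&gt; dependency so it can be exercised on its own.&lt;/p&gt;

```javascript
// Hypothetical handler: answer keepalive pings, pass everything else on.
// `send` is whatever writes a string back to the client.
function handleClientMessage(raw, send, onOther = () => {}) {
  let msg;
  try {
    msg = JSON.parse(raw.toString());
  } catch {
    return; // ignore frames that are not JSON
  }
  if (msg.type === 'ping') {
    send(JSON.stringify({ type: 'pong' }));
    return;
  }
  onOther(msg);
}
```

&lt;p&gt;With &lt;code&gt;ws&lt;/code&gt;, this would be called from each socket's &lt;code&gt;message&lt;/code&gt; event, passing a function that forwards to the socket's &lt;code&gt;send&lt;/code&gt; method.&lt;/p&gt;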

&lt;h2&gt;
  
  
  Gotcha #3: PubMed citations that didn't exist
&lt;/h2&gt;

&lt;p&gt;Early versions asked Claude to produce literature that contradicts the witness's "in my experience" claims — complete with PMIDs, authors, journal names. Which it did. Convincingly. And almost none of them were real.&lt;/p&gt;

&lt;p&gt;This is the &lt;em&gt;Mata v. Avianca&lt;/em&gt; problem in miniature, and it's disqualifying for legal tech. An attorney who reads one fake citation loses trust in everything the tool outputs.&lt;/p&gt;

&lt;p&gt;Fix: Claude only generates a &lt;strong&gt;PubMed search query&lt;/strong&gt;. We then call NCBI E-utilities (&lt;code&gt;esearch.fcgi&lt;/code&gt; → &lt;code&gt;esummary.fcgi&lt;/code&gt;) to resolve that query into actual PMIDs with real titles, authors, and journal years. If the query returns zero hits, we retry with a progressively simpler query (ladder from 3-AND-clause → 2-AND → 1-AND → first-noun-phrase) before giving up. The attorney never sees a fabricated citation; at worst they see "no literature found for this query," which is accurate.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Progressive fallback: Claude's over-specific queries often return zero hits.&lt;/span&gt;
&lt;span class="c1"&gt;// Simplify until we get something, or return empty array honestly.&lt;/span&gt;
&lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;pubmedLookup&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;topN&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;3&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;ladder&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;pubmedQueryLadder&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;query&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;term&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;ladder&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pmids&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nf"&gt;esearch&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;term&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pmids&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nf"&gt;esummary&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pmids&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;[];&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
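
&lt;p&gt;The &lt;code&gt;pubmedQueryLadder&lt;/code&gt; helper called above is not shown here. One possible shape, offered as a sketch rather than the production code: it assumes Claude's query is a set of clauses joined by &lt;code&gt;AND&lt;/code&gt;, reads the rungs as three clauses, then two, then one, and stands in for "first noun phrase" with a crude placeholder (the first clause stripped of field tags and quotes).&lt;/p&gt;

```javascript
// Hypothetical sketch of the query ladder, most specific rung first.
// Assumes clauses are joined by " AND " (upper case).
function pubmedQueryLadder(query) {
  const clauses = query
    .split(/\s+AND\s+/)
    .map((c) => c.trim())
    .filter(Boolean)
    .slice(0, 3); // start from at most three clauses
  const ladder = [];
  for (let n = clauses.length; n >= 1; n--) {
    ladder.push(clauses.slice(0, n).join(' AND '));
  }
  // Last rung: placeholder for "first noun phrase". Takes the first clause
  // and removes field tags like [tiab], parentheses, and quotes.
  const bare = (clauses[0] || '')
    .replace(/\[[^\]]*\]/g, '')
    .replace(/[()"]/g, '')
    .trim();
  if (bare !== '') {
    if (!ladder.includes(bare)) ladder.push(bare);
  }
  return ladder;
}
```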



&lt;p&gt;A 24-hour in-memory cache keyed on the query string means repeated queries (for example, a witness returning to the same claim) don't re-hit NCBI. Requests are rate-limited to 3 per second, NCBI's published limit for clients that don't send an API key.&lt;/p&gt;
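
&lt;p&gt;The caching half of that can be sketched as a small wrapper around any async lookup. The names &lt;code&gt;withCache&lt;/code&gt; and &lt;code&gt;TTL_MS&lt;/code&gt; and the injectable clock are assumptions made for illustration; rate limiting is left out of this sketch.&lt;/p&gt;

```javascript
// Sketch of a TTL cache keyed on the query string. `lookup` is any async
// function (for example pubmedLookup); `now` is injectable for testing.
const TTL_MS = 24 * 60 * 60 * 1000;

function withCache(lookup, now = () => Date.now()) {
  const cache = new Map(); // query -> { at, value }
  return async function cachedLookup(query) {
    const hit = cache.get(query);
    if (hit !== undefined) {
      if (now() - hit.at > TTL_MS) cache.delete(query); // expired entry
      else return hit.value;
    }
    const value = await lookup(query);
    cache.set(query, { at: now(), value });
    return value;
  };
}
```

&lt;p&gt;Being in-memory, a cache like this is emptied by every deploy or restart, which is acceptable when the only goal is to avoid duplicate calls within a session.&lt;/p&gt;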

&lt;h2&gt;
  
  
  Gotcha #4: macOS audio pipelines and microphone permissions
&lt;/h2&gt;

&lt;p&gt;Microphone capture in Chrome on macOS requires two things plaintiff attorneys don't think about: (a) the site's &lt;code&gt;Permissions-Policy&lt;/code&gt; header must allow &lt;code&gt;microphone=(self)&lt;/code&gt;, and (b) the macOS system preferences must allow browser mic access at the OS level.&lt;/p&gt;

&lt;p&gt;We tightened &lt;code&gt;Permissions-Policy&lt;/code&gt; too aggressively one week and accidentally disabled mic access on &lt;code&gt;/courtroom-ai/&lt;/code&gt; for three days. The browser console message is unhelpful (&lt;code&gt;[Deprecation] Feature policy 'microphone' is disabled&lt;/code&gt;) and the tool just... silently doesn't transcribe. Users called it "broken."&lt;/p&gt;

&lt;p&gt;Fix: explicit per-route policy:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pathname&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;startsWith&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;/courtroom-ai&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="nx"&gt;res&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;setHeader&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;Permissions-Policy&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;microphone=(self), camera=()&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;For local dev on macOS we also ship a &lt;code&gt;bin/courtroom-setup.sh&lt;/code&gt; that re-creates a BlackHole aggregate audio device after reboots (macOS doesn't persist aggregate devices across reboots, which is a whole separate surprise).&lt;/p&gt;

&lt;h2&gt;
  
  
  Gotcha #5: committed-dist because of nixpacks
&lt;/h2&gt;

&lt;p&gt;Railway's nixpacks builder does run &lt;code&gt;npm run build&lt;/code&gt; when it detects a Vite project, but its output sometimes doesn't overwrite the committed &lt;code&gt;dist/&lt;/code&gt; bundle, especially with cache hits. We lost a whole afternoon debugging "why isn't my frontend change showing up" when the answer was: the old bundle was still the one being served.&lt;/p&gt;

&lt;p&gt;Fix: we commit &lt;code&gt;dist/&lt;/code&gt; to the repo and &lt;code&gt;.gitignore&lt;/code&gt; explicitly re-allows it. On any frontend change:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;&lt;span class="nb"&gt;cd &lt;/span&gt;courtroom-ai-tool/frontend &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; npx vite build
git add dist/
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This feels wrong — "don't commit build artifacts" is received wisdom. But it gives us a deterministic bundle-hash match between source and prod, which is more valuable than the &lt;code&gt;.gitignore&lt;/code&gt; hygiene in a small team.&lt;/p&gt;

&lt;h2&gt;
  
  
  What we learned
&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;One call &amp;gt; many calls&lt;/strong&gt; for real-time UX. Take the max_tokens hit; pay once, return everything.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Never generate citations the user might act on.&lt;/strong&gt; Generate queries that resolve against an authoritative source, and fail honestly when resolution returns nothing.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Browser realtime = keepalive pings.&lt;/strong&gt; Any path that goes through a CDN has an idle timeout. Find it before your users do.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;On-device audio is more brittle than the models.&lt;/strong&gt; The transcript quality failures we've seen are overwhelmingly pipeline-level, not ASR-level. Test the mic path on fresh macOS installs.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Commit the damn build.&lt;/strong&gt; Until your deploy platform's cache semantics are bulletproof, deterministic artifacts beat clean gitignore every time.&lt;/li&gt;
&lt;/ol&gt;




&lt;p&gt;We're MedLegal AI — we're not hiring, we're building this. If you're a plaintiff firm and want to kick the tires, &lt;a href="https://medicalai.law/register?ref=devto-cai-apr21" rel="noopener noreferrer"&gt;14-day free trial at medicalai.law&lt;/a&gt;. The Courtroom AI add-on is $99/mo for 10 hours or $299/mo for 50; &lt;a href="https://medicalai.law/deposition-ai" rel="noopener noreferrer"&gt;full details here&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Canonical URL for this post:&lt;/strong&gt; &lt;code&gt;https://medicalai.law/blog/how-ai-analyzes-deposition-real-time&lt;/code&gt;&lt;/p&gt;





</description>
      <category>ai</category>
      <category>claude</category>
      <category>node</category>
      <category>showdev</category>
    </item>
    <item>
      <title>How we built real-time deposition analysis with Claude's streaming API</title>
      <dc:creator>John Mahoney</dc:creator>
      <pubDate>Tue, 21 Apr 2026 01:11:59 +0000</pubDate>
      <link>https://dev.to/john_mahoney_41e9c2589ceb/how-we-built-real-time-deposition-analysis-with-claudes-streaming-api-4f4h</link>
      <guid>https://dev.to/john_mahoney_41e9c2589ceb/how-we-built-real-time-deposition-analysis-with-claudes-streaming-api-4f4h</guid>
      <description>&lt;p&gt;Medical-malpractice plaintiff attorneys spend 3+ hours in expert depositions hunting for two things: admissions they can use at trial, and inconsistencies they can impeach. Both windows close in seconds. If you don't catch them live, you're reading the transcript a week later wishing you had.&lt;/p&gt;

&lt;p&gt;We built a live-feed analyzer that watches the deposition stream, runs Claude against every 30-second window, and surfaces real-time signals to the attorney's laptop while they question the witness.&lt;/p&gt;

&lt;h2&gt;
  
  
  Architecture
&lt;/h2&gt;

&lt;p&gt;Three hops:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Deepgram&lt;/strong&gt; transcribes the live audio over WebSocket&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Our Node WS server&lt;/strong&gt; buffers transcript into 30-second segments&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Claude&lt;/strong&gt; (Haiku 4.5, streaming) analyzes each segment and returns a 12-key JSON&lt;/li&gt;
&lt;/ol&gt;
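
&lt;p&gt;The buffering step in the second hop can be pictured with a small helper like the one below. It is a hypothetical sketch, not the production server: &lt;code&gt;SegmentBuffer&lt;/code&gt;, its method names, and the millisecond timestamps passed in by the caller are all assumptions.&lt;/p&gt;

```javascript
// Hypothetical buffer that groups transcript fragments into fixed windows.
// Timestamps are milliseconds supplied by the caller.
class SegmentBuffer {
  constructor(windowMs = 30_000) {
    this.windowMs = windowMs;
    this.parts = [];
    this.startedAt = null;
  }

  // Add a fragment. Returns the finished segment once the window has
  // elapsed since the first buffered fragment, otherwise null.
  push(text, atMs) {
    if (this.startedAt === null) this.startedAt = atMs;
    this.parts.push(text);
    if (atMs - this.startedAt >= this.windowMs) return this.flush();
    return null;
  }

  // Emit whatever is buffered (also useful when the session ends).
  flush() {
    const segment = this.parts.join(' ').trim();
    this.parts = [];
    this.startedAt = null;
    return segment;
  }
}
```

&lt;p&gt;Each non-null return value from &lt;code&gt;push&lt;/code&gt; would become one analysis request in the third hop.&lt;/p&gt;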

&lt;p&gt;The JSON is the heart of the system. Every segment returns:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight json-doc"&gt;&lt;code&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"medical"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;accuracyScore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;inaccuracies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;accurateStatements&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;confidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;summary&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"daubert"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;vulnerabilityScore&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;vulnerabilities&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;strengths&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;overallRisk&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"priorTestimony"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;inconsistencies&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;impeachmentOpportunities&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;summary&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"crossExam"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;questions&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;keyWeaknesses&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;recommendedApproach&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"elements"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;duty&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;breach&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;causation&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;damages&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;   &lt;/span&gt;&lt;span class="c1"&gt;// each { advanced, quote }&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"admission"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;       &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;isAdmission&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;quote&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;significance&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;whyMatters&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"evasion"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;         &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;isEvasive&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;escalationScript&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"coverage"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;        &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;topicsCovered&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;notes&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"foundation"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;      &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;triggers&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;                              &lt;/span&gt;&lt;span class="c1"&gt;// FRE 613/803(18)/803(6)/702/30(b)(6)&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"chartContradiction"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;contradicted&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;witnessClaim&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;chartEvidence&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;severity&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;},&lt;/span&gt;&lt;span class="w"&gt;
  &lt;/span&gt;&lt;span class="nl"&gt;"literatureHits"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="w"&gt;  &lt;/span&gt;&lt;span class="p"&gt;[{&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;witnessClaim&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;pubmedQuery&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;foundationScript&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="err"&gt;pubmedUrl&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="p"&gt;}]&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  The per-segment loop
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// Per segment: single Claude call with the whole expert-witness-specific prompt&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;msg&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;anthropic&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;create&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt;
  &lt;span class="na"&gt;model&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;claude-haiku-4-5-20251001&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;
  &lt;span class="na"&gt;max_tokens&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="mi"&gt;8192&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt;                &lt;span class="c1"&gt;// critical — 4096 truncates crossExam mid-stream&lt;/span&gt;
  &lt;span class="na"&gt;messages&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="p"&gt;[{&lt;/span&gt; &lt;span class="na"&gt;role&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;user&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;content&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nf"&gt;buildPrompt&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;segment&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;caseContext&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;chart&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;}],&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;analysis&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;sanitizeResult&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;parse&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;extractJson&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;msg&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;content&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;text&lt;/span&gt;&lt;span class="p"&gt;)));&lt;/span&gt;
&lt;span class="nx"&gt;ws&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;send&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;JSON&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;stringify&lt;/span&gt;&lt;span class="p"&gt;({&lt;/span&gt; &lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;analysis&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;analysis&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;segment&lt;/span&gt; &lt;span class="p"&gt;}));&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Three things we got wrong the first time:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;max_tokens=4096 was too small.&lt;/strong&gt; The 12-key output needs ~6-8K tokens on dense segments. When crossExam lands near the end of the output, it gets cut off and the UI shows "Cross-examination analysis failed." Bumped to 8192.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Chart context wasn't propagating.&lt;/strong&gt; &lt;code&gt;chartContradiction&lt;/code&gt; can't fire without the chart data in the prompt. We now store the chart data on &lt;code&gt;ws._sessionChartContext&lt;/code&gt; when the client sends a &lt;code&gt;setChartContext&lt;/code&gt; WS message, before analysis begins.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Cloudflare killed idle WebSockets after 100s.&lt;/strong&gt; Claude's longer analyses took 45-90s, and during dense segments the WS went silent. Added a 25s keepalive ping from the client.&lt;/li&gt;
&lt;/ol&gt;
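
&lt;p&gt;Truncation like the first item can be caught before parsing. The Messages API response includes a &lt;code&gt;stop_reason&lt;/code&gt; field, which is &lt;code&gt;'max_tokens'&lt;/code&gt; when the output hit the cap. The sketch below is a minimal illustration under that assumption: &lt;code&gt;parseAnalysis&lt;/code&gt; and its return shape are hypothetical names, not the &lt;code&gt;extractJson&lt;/code&gt; or &lt;code&gt;sanitizeResult&lt;/code&gt; helpers used above, whose implementations are not shown.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;
```javascript
// Hypothetical helper: reports truncation explicitly instead of failing in JSON.parse.
function parseAnalysis(msg) {
  // stop_reason is 'max_tokens' when the model ran out of output budget.
  if (msg.stop_reason === 'max_tokens') {
    return { ok: false, reason: 'truncated' };
  }
  const text = msg.content[0].text;
  // Take the outermost braces so surrounding prose does not break parsing.
  const start = text.indexOf('{');
  const end = text.lastIndexOf('}');
  if (start === -1 || end === -1 || start > end) {
    return { ok: false, reason: 'no-json' };
  }
  try {
    return { ok: true, analysis: JSON.parse(text.slice(start, end + 1)) };
  } catch (err) {
    return { ok: false, reason: 'invalid-json' };
  }
}
```
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;With a distinct &lt;code&gt;'truncated'&lt;/code&gt; reason, the UI could say that the segment was too dense rather than showing a generic failure for whichever key came last.&lt;/p&gt;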

&lt;h2&gt;
  
  
  What we skipped (for now)
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;PDF.js text-layer positioning&lt;/strong&gt; for the chart contradiction pin (today it's at the file-list row, not the page)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Firm-scoped vector index&lt;/strong&gt; over historical transcripts (cross-case expert inconsistency)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Live PubMed API calls&lt;/strong&gt; — today we generate the search query, the attorney clicks through&lt;/li&gt;
&lt;/ul&gt;
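
&lt;p&gt;For the click-through in the last item, one option is to build the link in code from the generated query instead of trusting a model-written URL. This is a sketch under that assumption; the article does not say how &lt;code&gt;pubmedUrl&lt;/code&gt; is actually produced, and &lt;code&gt;buildPubmedUrl&lt;/code&gt; is a hypothetical name. It relies only on PubMed's public search page accepting a &lt;code&gt;term&lt;/code&gt; query parameter.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;
```javascript
// Hypothetical helper: turns a generated search query into a PubMed search link.
function buildPubmedUrl(pubmedQuery) {
  // encodeURIComponent escapes spaces, quotes, and brackets used in PubMed syntax.
  const term = encodeURIComponent(pubmedQuery.trim());
  return 'https://pubmed.ncbi.nlm.nih.gov/?term=' + term;
}
```
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The link always points at a search results page, so the attorney sees whatever PubMed actually returns for the query.&lt;/p&gt;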

&lt;p&gt;Full writeup (including the chart-cross-reference and co-counsel channel) is on our blog at &lt;a href="https://medicalai.law/blog/how-ai-analyzes-deposition-real-time" rel="noopener noreferrer"&gt;medicalai.law/blog/how-ai-analyzes-deposition-real-time&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;Questions welcome.&lt;/p&gt;

</description>
      <category>ai</category>
      <category>claude</category>
      <category>node</category>
      <category>legaltech</category>
    </item>
    <item>
      <title>Processing 1,500 Pages of Medical Records in 3 Minutes with AI</title>
      <dc:creator>John Mahoney</dc:creator>
      <pubDate>Wed, 15 Apr 2026 13:24:29 +0000</pubDate>
      <link>https://dev.to/john_mahoney_41e9c2589ceb/how-we-built-an-ai-powered-medical-records-extraction-pipeline-2k0p</link>
      <guid>https://dev.to/john_mahoney_41e9c2589ceb/how-we-built-an-ai-powered-medical-records-extraction-pipeline-2k0p</guid>
      <description>&lt;p&gt;Medical malpractice attorneys deal with thousands of pages of medical records per case. Organizing those records into a chronological timeline is the foundation of every case — and it's historically been done by hand, taking 20-40 hours per case.&lt;/p&gt;

&lt;p&gt;We built a pipeline that extracts structured data from uploaded medical record PDFs, streams AI-generated analysis back to the browser in real time, and handles files up to 500MB. Here's how it works.&lt;/p&gt;

&lt;p&gt;&lt;em&gt;John Mahoney, Founder @ &lt;a href="https://medicalai.law" rel="noopener noreferrer"&gt;MedLegal AI&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;




&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;

&lt;p&gt;The system has four stages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Upload&lt;/strong&gt; — Browser uploads PDFs directly to S3 via presigned URLs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Extract&lt;/strong&gt; — Server pulls the file from S3, runs OCR if needed, extracts raw text&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analyze&lt;/strong&gt; — Text is sent to Claude API for structured extraction&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Stream&lt;/strong&gt; — Results stream back to the browser via SSE as they're generated&lt;/li&gt;
&lt;/ol&gt;

&lt;h2&gt;
  
  
  Stage 1: Presigned S3 Uploads
&lt;/h2&gt;

&lt;p&gt;Medical record PDFs are large: 200-500MB is common. We're deployed behind Cloudflare and Railway, both of which limit upload size, so files this big can't be proxied through our server.&lt;/p&gt;

&lt;p&gt;The solution: the browser uploads directly to S3 via presigned PUT URLs.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;const crypto = require('node:crypto');
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
const { getSignedUrl } = require('@aws-sdk/s3-request-presigner');

async function generatePresignedUpload(userId, fileName) {
  // fileName is not used in the key; each upload gets a random object name
  const fileId = crypto.randomUUID() + '.pdf';
  const s3Key = `case-analysis/uploads/${userId}/${fileId}`;

  // Only compute checksums when S3 requires them (see the gotcha below)
  const presignClient = new S3Client({
    region: process.env.AWS_REGION,
    requestChecksumCalculation: 'WHEN_REQUIRED',
    responseChecksumValidation: 'WHEN_REQUIRED',
  });

  const putCmd = new PutObjectCommand({
    Bucket: process.env.S3_BUCKET,
    Key: s3Key,
    ServerSideEncryption: 'AES256',
  });

  // The signed URL is valid for 10 minutes
  return await getSignedUrl(presignClient, putCmd, { expiresIn: 600 });
}
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;p&gt;&lt;strong&gt;Key gotcha:&lt;/strong&gt; AWS SDK v3 adds checksum query params that break browser PUT requests. Set &lt;code&gt;requestChecksumCalculation: 'WHEN_REQUIRED'&lt;/code&gt; on the presigning client to fix it.&lt;/p&gt;
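
&lt;p&gt;On the browser side, the upload is then a plain PUT to the signed URL. The sketch below is illustrative only: &lt;code&gt;uploadToPresignedUrl&lt;/code&gt; and the injectable &lt;code&gt;fetchImpl&lt;/code&gt; parameter are hypothetical, and depending on how the URL was signed, S3 may also expect matching request headers, which is worth verifying against your own setup.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;
```javascript
// Hypothetical browser-side helper; fetchImpl is injectable so it can be faked in tests.
async function uploadToPresignedUrl(presignedUrl, file, fetchImpl = fetch) {
  // The body goes straight to S3, so the app server never handles the bytes.
  const res = await fetchImpl(presignedUrl, { method: 'PUT', body: file });
  if (!res.ok) {
    throw new Error('S3 upload failed with status ' + res.status);
  }
  return true;
}
```
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Here &lt;code&gt;file&lt;/code&gt; would typically be the &lt;code&gt;File&lt;/code&gt; object from an upload input, passed as the request body without reading it into memory first.&lt;/p&gt;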

&lt;h2&gt;
  
  
  Stage 2: Text Extraction with OCR Fallback
&lt;/h2&gt;

&lt;p&gt;We try pdf-parse first (fast, for PDFs with a digital text layer), then fall back to Poppler + Tesseract for scanned documents.&lt;/p&gt;
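
&lt;p&gt;The article does not say how the fallback is triggered. One common heuristic, shown here purely as an assumption, is to treat a PDF as scanned when the extracted text averages very few characters per page. &lt;code&gt;needsOcr&lt;/code&gt; and the default threshold of 50 are hypothetical and would need tuning on real records.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;
```javascript
// Hypothetical heuristic: a near-empty text layer suggests a scanned document.
function needsOcr(extractedText, pageCount, minCharsPerPage = 50) {
  // Ignore whitespace so blank pages full of newlines do not count as text.
  const chars = extractedText.replace(/\s+/g, '').length;
  const pages = Math.max(pageCount, 1); // guard against a zero page count
  return minCharsPerPage > chars / pages;
}
```
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The inputs could come from the text and page count that the fast extraction pass reports, so the check costs nothing extra before deciding to run OCR.&lt;/p&gt;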

&lt;h2&gt;
  
  
  Stage 3: AI Analysis with Claude
&lt;/h2&gt;

&lt;p&gt;We use Claude's streaming Messages API. Rate limiting is handled with exponential backoff and user-visible status messages.&lt;/p&gt;
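
&lt;p&gt;Exponential backoff with a status callback can be sketched as follows. This is not the pipeline's actual code: &lt;code&gt;withBackoff&lt;/code&gt;, its options, and the delay values are hypothetical, and it assumes the thrown error exposes the HTTP status as &lt;code&gt;err.status&lt;/code&gt;.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;
```javascript
// Hypothetical retry wrapper; retries only on HTTP 429 and doubles the wait each time.
async function withBackoff(call, options = {}) {
  const maxRetries = options.maxRetries ?? 4;
  const baseMs = options.baseMs ?? 1000;
  const onStatus = options.onStatus ?? (() => {});
  let attempt = 0;
  while (true) {
    try {
      return await call();
    } catch (err) {
      const rateLimited = err.status === 429;
      if (!rateLimited || attempt >= maxRetries) throw err;
      const delayMs = baseMs * 2 ** attempt; // baseMs, 2x, 4x, ...
      onStatus('Rate limited, retrying in ' + delayMs + ' ms');
      await new Promise((resolve) => setTimeout(resolve, delayMs));
      attempt += 1;
    }
  }
}
```
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;The &lt;code&gt;onStatus&lt;/code&gt; callback is where a user-visible message could be pushed to the browser, so a pause reads as "waiting" rather than "stuck".&lt;/p&gt;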

&lt;h2&gt;
  
  
  Stage 4: SSE Streaming
&lt;/h2&gt;

&lt;p&gt;Server-Sent Events give us real-time streaming from server to browser. We use fetch + ReadableStream instead of EventSource because we need POST requests.&lt;/p&gt;
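
&lt;p&gt;Without &lt;code&gt;EventSource&lt;/code&gt;, the client has to split the byte stream into events itself. A simplified sketch of that step is below. &lt;code&gt;parseSseBuffer&lt;/code&gt; is a hypothetical name, and the parser handles only &lt;code&gt;data:&lt;/code&gt; lines separated by blank lines with &lt;code&gt;\n&lt;/code&gt; line endings, not the full SSE grammar.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;
```javascript
// Hypothetical parser: splits buffered SSE text into complete events.
function parseSseBuffer(buffer) {
  const blocks = buffer.split('\n\n');
  const rest = blocks.pop(); // trailing partial event, kept for the next chunk
  const events = [];
  for (const block of blocks) {
    // Comment lines (starting with ':') and other fields are dropped here.
    const data = block
      .split('\n')
      .filter((line) => line.startsWith('data:'))
      .map((line) => line.slice(5).trimStart())
      .join('\n');
    if (data !== '') events.push(data);
  }
  return { events, rest };
}
```
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;In a read loop, each decoded chunk from &lt;code&gt;response.body.getReader()&lt;/code&gt; would be appended to &lt;code&gt;rest&lt;/code&gt; and passed back in, so an event split across two network chunks is still delivered whole.&lt;/p&gt;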

&lt;p&gt;&lt;strong&gt;Critical for Railway:&lt;/strong&gt; Send headers immediately and keepalive comments every 30s to prevent proxy timeouts.&lt;/p&gt;
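
&lt;p&gt;A minimal sketch of that setup, assuming a standard Node.js &lt;code&gt;http&lt;/code&gt; response object (an Express &lt;code&gt;res&lt;/code&gt; behaves the same way for these calls). &lt;code&gt;startSse&lt;/code&gt; is a hypothetical helper name and the header set is a typical one, not necessarily what runs in production.&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;
```javascript
// Hypothetical helper: opens an SSE response and keeps the connection warm.
function startSse(res, keepaliveMs = 30000) {
  res.writeHead(200, {
    'Content-Type': 'text/event-stream',
    'Cache-Control': 'no-cache',
    Connection: 'keep-alive',
  });
  res.flushHeaders(); // send headers now instead of waiting for the first write
  // Lines starting with ':' are SSE comments, which clients ignore.
  const timer = setInterval(() => res.write(': keepalive\n\n'), keepaliveMs);
  // Stop pinging once the client disconnects.
  res.on('close', () => clearInterval(timer));
  return timer;
}
```
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Because the keepalive is a comment line rather than a &lt;code&gt;data:&lt;/code&gt; event, it keeps proxies from seeing an idle connection without adding anything the client has to handle.&lt;/p&gt;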

&lt;h2&gt;
  
  
  Results
&lt;/h2&gt;

&lt;p&gt;1,500 pages processed in 3-5 minutes vs. 20-40 hours manually. SSE streaming means users see the timeline being built in real time.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Stack:&lt;/strong&gt; Node.js 20+, Claude API, AWS S3, Poppler + Tesseract, React + Vite, Railway&lt;/p&gt;




&lt;p&gt;&lt;em&gt;John Mahoney builds AI tools for medical malpractice litigation at &lt;a href="https://medicalai.law" rel="noopener noreferrer"&gt;medicalai.law&lt;/a&gt;.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>legaltech</category>
      <category>healthtech</category>
      <category>aws</category>
    </item>
    <item>
      <title>Building an AI Pipeline to Process 10,000+ Pages of Medical Records</title>
      <dc:creator>John Mahoney</dc:creator>
      <pubDate>Tue, 07 Apr 2026 02:36:50 +0000</pubDate>
      <link>https://dev.to/john_mahoney_41e9c2589ceb/building-an-ai-pipeline-to-proai-machinelearning-webdev-saascess-10000-pages-of-medical-records-3218</link>
      <guid>https://dev.to/john_mahoney_41e9c2589ceb/building-an-ai-pipeline-to-proai-machinelearning-webdev-saascess-10000-pages-of-medical-records-3218</guid>
      <description></description>
    </item>
  </channel>
</rss>
