Medical malpractice attorneys deal with thousands of pages of medical records per case. Organizing those records into a chronological timeline is the foundation of every case — and it's historically been done by hand, taking 20-40 hours per case.
We built a pipeline that extracts structured data from uploaded medical record PDFs, streams AI-generated analysis back to the browser in real time, and handles files up to 500MB. Here's how it works.
John Mahoney, Founder @ MedLegal AI
The Architecture
The system has four stages:
- Upload — Browser uploads PDFs directly to S3 via presigned URLs
- Extract — Server pulls the file from S3, runs OCR if needed, extracts raw text
- Analyze — Text is sent to Claude API for structured extraction
- Stream — Results stream back to the browser via SSE as they're generated
Stage 1: Presigned S3 Uploads
Medical record PDFs are large. 200-500MB is common. We're deployed behind Cloudflare and Railway, both with upload size limits.
The solution: the browser uploads directly to S3 via presigned PUT URLs.
```javascript
const crypto = require('node:crypto');
const { S3Client, PutObjectCommand } = require('@aws-sdk/client-s3');
const { getSignedUrl } = require('@aws-sdk/s3-request-presigner');

async function generatePresignedUpload(userId, fileName) {
  // The stored key uses a random UUID, never the user-supplied
  // fileName -- it's kept only for metadata/logging.
  const fileId = crypto.randomUUID() + '.pdf';
  const s3Key = `case-analysis/uploads/${userId}/${fileId}`;

  // WHEN_REQUIRED stops SDK v3 from adding checksum parameters
  // that the browser's plain PUT request won't know how to send.
  const presignClient = new S3Client({
    region: process.env.AWS_REGION,
    requestChecksumCalculation: 'WHEN_REQUIRED',
    responseChecksumValidation: 'WHEN_REQUIRED',
  });

  const putCmd = new PutObjectCommand({
    Bucket: process.env.S3_BUCKET,
    Key: s3Key,
    ServerSideEncryption: 'AES256',
  });

  // URL is valid for 10 minutes -- enough time to start a 500MB upload.
  return await getSignedUrl(presignClient, putCmd, { expiresIn: 600 });
}
```
Key gotcha: AWS SDK v3 adds checksum query params that break browser PUT requests. Set `requestChecksumCalculation: 'WHEN_REQUIRED'` to fix.
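On the browser side, the upload is then a plain PUT of the file body to the presigned URL. A minimal sketch, assuming the server endpoint above returns the URL (the `uploadToS3` helper name and error handling are illustrative, not our exact client code):

```javascript
// Upload a File/Blob directly to S3 using a presigned PUT URL.
// Illustrative helper -- no AWS SDK needed in the browser at all.
async function uploadToS3(presignedUrl, file) {
  const res = await fetch(presignedUrl, {
    method: 'PUT',
    headers: { 'Content-Type': 'application/pdf' },
    body: file,
  });
  if (!res.ok) {
    throw new Error(`S3 upload failed: ${res.status}`);
  }
  return res;
}
```

Because the bytes go straight to S3, neither Cloudflare nor Railway ever sees the 500MB body; the app server only handles the small presign request.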
Stage 2: Text Extraction with OCR Fallback
We try pdf-parse first (fast, and sufficient for born-digital PDFs with a text layer), then fall back to Poppler + Tesseract OCR for scanned documents.
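The fallback decision is a heuristic: if the digital text layer comes back essentially empty, the PDF is almost certainly a scan. A simplified sketch of that check (the 50-chars-per-page threshold and function name are illustrative, not our exact production values):

```javascript
// Decide whether a PDF needs OCR based on how much text the digital
// extraction pass (pdf-parse) recovered. Scanned documents typically
// yield little or no text layer. Threshold is an illustrative assumption.
function needsOcr(extractedText, pageCount) {
  const meaningful = extractedText.replace(/\s+/g, '');
  const charsPerPage = meaningful.length / Math.max(pageCount, 1);
  return charsPerPage < 50;
}
```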
Stage 3: AI Analysis with Claude
We use Claude's streaming Messages API. Rate limiting is handled with exponential backoff and user-visible status messages.
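The backoff loop itself is generic. A minimal sketch of the retry wrapper described above (the function name, retry counts, and `onStatus` callback shape are our illustration):

```javascript
// Retry an async call with exponential backoff. On each retryable
// failure, wait base * 2^attempt ms and report a status string so
// the UI can show a user-visible "retrying" message.
async function withBackoff(fn, { retries = 5, baseMs = 1000, onStatus = () => {} } = {}) {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      // Only retry rate-limit / overload errors; rethrow everything else.
      const retryable = err.status === 429 || err.status === 529;
      if (!retryable || attempt >= retries) throw err;
      const delay = baseMs * 2 ** attempt;
      onStatus(`Rate limited, retrying in ${delay / 1000}s (attempt ${attempt + 1}/${retries})`);
      await new Promise((resolve) => setTimeout(resolve, delay));
    }
  }
}
```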
Stage 4: SSE Streaming
Server-Sent Events give us real-time streaming from server to browser. We use fetch + ReadableStream instead of EventSource because we need POST requests.
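Parsing the SSE wire format by hand is the price of skipping EventSource. A minimal parser sketch under the usual SSE framing rules (events separated by a blank line, `data:` payload lines, `:` comment lines ignored); the function name and callback shape are illustrative:

```javascript
// Read an SSE stream from a fetch Response body (a ReadableStream)
// and invoke onEvent for each complete `data:` payload. Comment
// lines (starting with ':') are keepalives and are skipped.
async function consumeSse(body, onEvent) {
  const reader = body.getReader();
  const decoder = new TextDecoder();
  let buffer = '';
  for (;;) {
    const { done, value } = await reader.read();
    if (done) break;
    buffer += decoder.decode(value, { stream: true });
    // Events are separated by a blank line.
    const events = buffer.split('\n\n');
    buffer = events.pop(); // keep the incomplete tail for the next chunk
    for (const event of events) {
      for (const line of event.split('\n')) {
        if (line.startsWith('data:')) onEvent(line.slice(5).trim());
      }
    }
  }
}
```

It's fed directly from a POST: something like `consumeSse((await fetch('/api/analyze', { method: 'POST', body })).body, handleChunk)`, which EventSource simply can't do.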
Critical for Railway: send the response headers immediately and write a keepalive comment (a line starting with `:`) every 30s, otherwise the proxy times out the idle connection.
Results
1,500 pages processed in 3-5 minutes vs. 20-40 hours manually. SSE streaming means users see the timeline being built in real time.
Stack: Node.js 20+, Claude API, AWS S3, Poppler + Tesseract, React + Vite, Railway
John Mahoney builds AI tools for medical malpractice litigation at medicalai.law.