I'm not a patent attorney. I'm a developer who ended up deep in one of the most jargon-heavy, process-heavy, deadline-driven workflows in the legal industry — and built an AI to automate it.
Here's what actually happened.
The problem I stumbled into
A patent attorney friend walked me through what happens after you file a patent and the USPTO rejects it.
The examiner sends back what's called an office action — a multi-page document that goes claim by claim, maps each limitation to prior art references, and explains why your claims aren't patentable. §101, §102, §103, §112 — each rejection type has its own legal framework, its own case law, its own formatting requirements.
The attorney now has 3 months to respond (extendable to 6 with escalating fees) or the application goes abandoned.
And responding isn't just "write a letter saying you disagree."
It involves:
- Downloading every cited prior art reference (sometimes 10+)
- Reading each reference to find where the examiner mischaracterized it
- Looking up relevant MPEP sections and case law
- Drafting claim amendments in a very specific markup format (37 CFR 1.121 — strikethroughs and underlines in a specific order)
- Writing legal arguments per rejection, citing cases by name with pinpoint citations
- Verifying every citation is real before filing
That's 3–8 hours per office action. Every single time.
A solo practitioner doing 10 office actions a month spends up to 80 hours on this one task. Half their working month.
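The claim-amendment markup mentioned above is mechanical enough to sketch in code. This is a hypothetical renderer, not anything from the actual product: additions are underlined, deletions are struck through, and (per 37 CFR 1.121(c)) deletions of five or fewer characters may use double brackets instead. The `<u>`/`<del>` tags stand in for real Word formatting.

```typescript
// Hypothetical representation of one amended claim as a sequence of ops.
type AmendmentOp =
  | { kind: "keep"; text: string }
  | { kind: "add"; text: string }     // added matter: underlined
  | { kind: "delete"; text: string }; // deleted matter: struck through

function renderAmendment(ops: AmendmentOp[]): string {
  return ops
    .map((op) => {
      switch (op.kind) {
        case "keep":
          return op.text;
        case "add":
          return `<u>${op.text}</u>`;
        case "delete":
          // Five or fewer deleted characters may use double brackets;
          // longer deletions use strikethrough.
          return op.text.length <= 5
            ? `[[${op.text}]]`
            : `<del>${op.text}</del>`;
      }
    })
    .join(" ");
}
```

A call like `renderAmendment([{ kind: "keep", text: "A widget comprising" }, { kind: "delete", text: "a metal housing" }, { kind: "add", text: "a polymer housing" }])` produces the underline/strikethrough sequence an examiner expects to see.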
I couldn't stop thinking about how much of this was structured enough to automate.
The first naive attempt
My first instinct was: just throw the PDF at GPT-4 and ask it to write a response.
This works until it catastrophically doesn't.
The problem is hallucinated citations. LLMs will confidently cite MPEP §2147, a case called In re Brink, or a specific paragraph from a prior art reference, none of which exist. If a patent attorney files this without catching it, they've submitted fabricated legal authorities to a federal agency. That's a duty of candor violation — potentially career-ending.
So the first real engineering problem wasn't "can AI draft good arguments" — it was "how do we make sure everything it cites actually exists."
What we ended up building
PatentSolve (patentsolve.com) is a 4-step pipeline:
Step 1 — Parse. The attorney uploads the office action PDF (or we pull it directly from the USPTO by application number). We extract every rejection, every cited reference, every element mapping. Scanned documents go through Claude Vision. The parsing step alone took months to get right — office actions are not consistently formatted, examiners have wildly different writing styles, and some of the PDFs are genuinely terrible quality.
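To make "extract every rejection, every cited reference, every element mapping" concrete, here's a hypothetical shape for the parser's output. The field names are illustrative, not the production schema:

```typescript
// Hypothetical parsed office action — illustrative types only.
interface ParsedOfficeAction {
  applicationNumber: string;
  mailingDate: string;
  rejections: Rejection[];
  citedReferences: CitedReference[];
}

interface Rejection {
  statute: "101" | "102" | "103" | "112"; // rejection type
  claims: number[];                        // claims rejected under it
  referenceIds: string[];                  // prior art the examiner relies on
  elementMappings: {
    claimLimitation: string;               // e.g. "a housing"
    referenceLocation: string;             // where the examiner says it's taught
  }[];
}

interface CitedReference {
  id: string;    // e.g. a patent or publication number
  title: string;
}
```

Everything downstream (analysis, strategy, assembly) keys off this structure, which is why the parsing step deserved months of work.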
Step 2 — Analyze. For each rejection, we do a structured analysis: identify logic gaps in the examiner's reasoning, find where references were mischaracterized, map which claim limitations are actually distinguishable. This runs per-rejection in parallel.
Step 3 — Strategize. Before writing a single word, we pull the examiner's historical prosecution data from the USPTO. Allowance rate, interview success rate, how often they maintain rejections after amendment, what types of arguments have worked on them before. We built a 10-dimensional behavioral fingerprint per examiner. If the data says this examiner caves 70% of the time after an interview request, that changes the strategy completely.
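The fingerprint idea can be sketched as a type plus a toy strategy rule. These field names are assumptions for illustration (the post only names a few of the ten dimensions), and the threshold is made up to mirror the 70% example above:

```typescript
// Hypothetical examiner fingerprint — three of the ten dimensions shown.
interface ExaminerFingerprint {
  allowanceRate: number;               // 0–1, share of applications allowed
  interviewGrantRate: number;          // share of interview requests granted
  postInterviewConcessionRate: number; // how often a rejection is withdrawn after an interview
  // ...seven more behavioral dimensions in a real system
}

// Toy rule mirroring the example in the text: an examiner who caves
// ~70% of the time after an interview is worth interviewing first.
function recommendInterview(fp: ExaminerFingerprint): boolean {
  return fp.interviewGrantRate > 0.5 && fp.postInterviewConcessionRate >= 0.6;
}
```

In practice the strategy layer would weigh all ten dimensions, but the point is the same: the fingerprint turns into a concrete branching decision before any drafting starts.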
Step 4 — Assemble with verification. Generate the full response — preamble, claim amendments in proper 37 CFR 1.121 markup, per-rejection legal arguments with MPEP and case law citations. Then run every single citation through a verification layer before delivery. Every MPEP section number checked against a valid database. Every case name matched against known patent case law. Every prior art reference location checked against the fetched document text.
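The final gate is simple to state: a response ships only if every citation verifies. A minimal sketch of that gate, with hypothetical types:

```typescript
// Hypothetical citation/result types — illustrative only.
type Citation = { kind: "mpep" | "case" | "reference"; value: string };
type VerifyResult = { citation: Citation; ok: boolean };

// A response is deliverable only when zero citations failed verification;
// anything that failed is surfaced for attorney review.
function gateResponse(results: VerifyResult[]): {
  deliverable: boolean;
  flagged: Citation[];
} {
  const flagged = results.filter((r) => !r.ok).map((r) => r.citation);
  return { deliverable: flagged.length === 0, flagged };
}
```

The individual checks behind each `VerifyResult` (MPEP database lookup, case-law matching, reference-text search) are where the real work lives; the gate just refuses to let anything unverified through.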
The attorney gets a Word document that's ready for review and filing. Not a rough draft — a structured, formatted, citation-verified prosecution response.
The technical decisions worth talking about
Claude over GPT for legal reasoning. We tested both extensively on patent prosecution tasks. Claude handles long, structured legal documents better — the context window usage is more efficient on dense technical content, and the instruction-following on complex markup formatting (claim amendments specifically) is meaningfully better.
Parallel rejection processing with rate limit management. Office actions with 5+ rejections need to be analyzed concurrently or generation takes forever. But you can't just fire off unlimited parallel Claude calls. We settled on 3 concurrent calls with a queue, which keeps total generation time under 5 minutes for complex cases while staying within API rate limits.
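The "3 concurrent calls with a queue" pattern is a standard concurrency limiter. A minimal self-contained sketch (not the production code):

```typescript
// Limit in-flight async tasks to `limit`; extra tasks wait in a FIFO queue.
function createLimiter(limit: number) {
  let active = 0;
  const queue: (() => void)[] = [];

  return async function run<T>(task: () => Promise<T>): Promise<T> {
    // If we're at capacity, park this task until a slot frees up.
    if (active >= limit) {
      await new Promise<void>((resolve) => queue.push(resolve));
    }
    active++;
    try {
      return await task();
    } finally {
      active--;
      queue.shift()?.(); // wake exactly one waiting task
    }
  };
}
```

Usage looks like `const run = createLimiter(3); await Promise.all(rejections.map((r) => run(() => analyzeRejection(r))))`, where `analyzeRejection` is the hypothetical per-rejection Claude call. Total time stays bounded while the API rate limit is respected.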
The citation verification problem in detail. MPEP hallucinations follow patterns — certain fake section numbers come up repeatedly across different responses, which tells you something about how the model learned the domain. We built a whitelist/blacklist approach: known valid MPEP sections pass automatically, known hallucination targets get flagged, anything in between gets checked against a live database. For case law, we maintain a list of ~30 landmark patent cases that the model should cite — if it cites something outside that list, it gets flagged for attorney review.
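The three-tier check described above (known-valid passes, known-hallucination flags, everything else hits the database) is a small amount of code. The section numbers here are placeholders, not a real MPEP validity table:

```typescript
// Placeholder sets — a real system would load these from curated data.
const KNOWN_VALID = new Set(["2111", "2141", "2143"]);
const KNOWN_HALLUCINATED = new Set(["2147.5"]);

type CheckOutcome = "pass" | "flag" | "lookup";

// Tiered check: fast paths for known-good and known-bad sections,
// with a live database lookup as the fallback for everything else.
function checkMpepSection(section: string): CheckOutcome {
  if (KNOWN_VALID.has(section)) return "pass";
  if (KNOWN_HALLUCINATED.has(section)) return "flag";
  return "lookup";
}
```

The case-law check works the same way with the ~30-case landmark list standing in for the whitelist, and "outside the list" mapping to a flag rather than a lookup.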
Scanned PDF handling. A surprising percentage of office actions arrive as scanned documents, especially older ones or faxed documents that got digitized poorly. We detect scanned pages (< 50 characters extracted by the PDF parser triggers Vision mode), then process them through Claude Vision page by page. Quality is 90%+ even on bad scans.
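The detection heuristic is the simplest piece of the whole pipeline, sketched here with the 50-character threshold from the text (the routing function and its names are hypothetical):

```typescript
// Pages whose text layer yields fewer than 50 characters are treated
// as scans and routed to the Vision path instead of plain text.
const SCAN_THRESHOLD = 50;

function needsVision(extractedText: string): boolean {
  return extractedText.trim().length < SCAN_THRESHOLD;
}

// Route each page to the cheap text path or the per-page Vision path.
function routePages(pages: string[]): ("text" | "vision")[] {
  return pages.map((p) => (needsVision(p) ? "vision" : "text"));
}
```

Because the check runs per page, a mostly-clean document with one bad faxed exhibit only pays the Vision cost for that one page.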
Examiner data pipeline. The USPTO publishes prosecution data through their Open Data Portal. We pull it, normalize it, and build the behavioral fingerprints on a rolling basis. The tricky part is cache invalidation — examiner behavior changes over time, so we expire cached fingerprints at 30 days and rebuild. The data pipeline ended up being one of the more interesting infrastructure pieces in the whole project.
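The 30-day expiry is easy to express as a staleness check on cached entries. A hypothetical sketch of the idea:

```typescript
// Cached fingerprints expire 30 days after they were built.
const TTL_MS = 30 * 24 * 60 * 60 * 1000;

interface CachedFingerprint<T> {
  builtAt: number; // epoch ms when the fingerprint was built
  data: T;
}

// A stale entry triggers a rebuild from fresh Open Data Portal pulls.
function isStale<T>(
  entry: CachedFingerprint<T>,
  now: number = Date.now()
): boolean {
  return now - entry.builtAt > TTL_MS;
}
```

Taking `now` as a parameter keeps the check testable; the rolling rebuild job just scans for stale entries and refreshes them.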
What surprised us
Attorneys don't want magic — they want trust. Early versions showed attorneys a finished response and they were skeptical of everything in it because they couldn't see how it was generated. We added a real-time activity feed — a terminal-style log that shows every step of the pipeline as it runs. Suddenly the same output felt trustworthy because they could see the reasoning chain. The UI change was more important than any model improvement we made.
The examiner intelligence is the real moat. We thought the AI drafting was the core product. Attorneys tell us the examiner behavioral data is what they can't get anywhere else. Knowing that a specific examiner has a 23% allowance rate, maintains rejections 80% of the time after amendment, but grants 68% of interview requests — that changes how you approach the case entirely. No competitor does this well. The drafting is table stakes; the data is the differentiator.
Citation hallucinations are a regulatory issue, not just a UX issue. The USPTO issued formal guidance in April 2024 warning about AI systems that "hallucinate or confabulate information" and explicitly stating that filing a paper with erroneous citations violates the duty of candor. Our citation verification went from "nice to have" to "this is why attorneys can trust this tool" overnight. Frame your safety features as compliance features — legal buyers respond to that framing.
Where we are now
PatentSolve launched at $75 per office action (pay per use) and $299/month for unlimited. We built the whole thing on Next.js, Supabase, Prisma, and the Claude API — 41 pages, 71 API endpoints, 132 test files.
The legal AI space is moving fast. Well-funded competitors are building full patent lifecycle platforms — drafting, prosecution, portfolio analytics, the works. Our bet is that doing one thing better than anyone else (office action response, with verified citations and real examiner intelligence) beats being the 6th-best at everything.
If you're building in legal AI or working with structured document processing at scale, happy to trade notes. The PDF parsing and citation verification problems are genuinely interesting engineering territory.
PatentSolve is at patentsolve.com — AI-powered patent office action response for patent attorneys and agents.