If you've ever tried to OCR handwritten notes or math equations from a screenshot, you know the standard tools (Google Vision, Tesseract, AWS Textract) all hit a wall once you leave printed Latin text.
I spent some time benchmarking what's out there in 2026. Here's what's actually working.
What breaks in generic OCR
- Handwriting, especially cursive in non-Latin scripts. Most OCR engines were trained on printed text and treat ligatures as noise.
- Math equations: generic OCR returns "x2 + y2 = 1" instead of x² + y² = 1 or LaTeX.
- Tables: column structure flattens into a paragraph; you lose the relationships.
- CJK: character recognition is OK; vertical-text and traditional-character handling are not.
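To make the math failure concrete, here is a minimal Python sketch (my own illustration, not part of any tool listed here). The helper name `superscripts_to_latex` is hypothetical. It shows that output which keeps the superscripts can be converted to LaTeX mechanically, while the flattened "x2 + y2 = 1" cannot, because nothing distinguishes an exponent from a subscript or a plain digit anymore.

```python
import re

# Translation table from Unicode superscript digits to plain digits.
SUPERSCRIPT_DIGITS = "⁰¹²³⁴⁵⁶⁷⁸⁹"
TO_PLAIN = str.maketrans(SUPERSCRIPT_DIGITS, "0123456789")


def superscripts_to_latex(text: str) -> str:
    """Rewrite each run of Unicode superscript digits as a LaTeX exponent."""
    return re.sub(
        f"[{SUPERSCRIPT_DIGITS}]+",
        lambda m: "^{" + m.group(0).translate(TO_PLAIN) + "}",
        text,
    )


# Structure preserved by the OCR: recoverable as LaTeX.
print(superscripts_to_latex("x² + y² = 1"))
# Structure flattened by the OCR: nothing to recover, the string comes back unchanged.
print(superscripts_to_latex("x2 + y2 = 1"))
```

This only handles digit exponents; real equation OCR has to recover fractions, subscripts, and multi-line layout as well, which is why a dedicated math path matters.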
Tools I tried
ScanRead.ai — free OCR for the gap cases
Built on PP-OCRv5 + PaddleOCR-VL (~0.9B params). Has a dedicated Math → LaTeX path that actually preserves multi-line derivations when there's clear bracketing, and CJK accuracy that's competitive with Vision/Textract on my test set. 22 specialized tools (handwriting, receipts, tables, etc.). Free tier is 20 pages/day; Pro starts at $10/mo for batch + watermark-free export.
Google Cloud Vision API
Best general-purpose OCR for printed Latin text. Falls apart on handwriting and math structure. ~$1.50 / 1000 pages.
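For a feel of what that rate means at different volumes, here is a quick back-of-the-envelope sketch. It assumes a flat $1.50 per 1000 pages (the approximate figure above) and ignores any free quota or volume discounts; the function name is just for illustration.

```python
def vision_cost_usd(pages: int, usd_per_1000_pages: float = 1.50) -> float:
    """Approximate cost at a flat per-page rate (assumed, taken from the figure above)."""
    return pages / 1000 * usd_per_1000_pages


# 600 pages is roughly a month at 20 pages/day; the other volumes are arbitrary examples.
for pages in (600, 10_000, 250_000):
    print(f"{pages:>7} pages -> ${vision_cost_usd(pages):,.2f}")
```

At hobby volumes the cost is close to a rounding error, so accuracy on your content type matters more than price until you reach batch scale.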
AWS Textract
Strongest on tables and forms in printed documents. Math support is essentially nonexistent. Pricier than Vision.
Mistral OCR (released earlier this year)
Strong on document layout. Fewer specialized routes than purpose-built tools.
Tesseract (open source)
Free, but in 2026 the main use case is "I need to OCR something offline". Quality on handwriting is poor.
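If offline is the requirement, a thin wrapper around the Tesseract CLI is usually enough. The sketch below assumes the `tesseract` binary is installed and on your PATH; the function names and the default page segmentation mode (`--psm 6`, a single uniform block of text) are my choices for illustration, not a recommendation from the Tesseract project.

```python
import shutil
import subprocess


def tesseract_command(image_path: str, lang: str = "eng", psm: int = 6) -> list[str]:
    """Build a Tesseract CLI call that writes recognized text to stdout."""
    return ["tesseract", image_path, "stdout", "-l", lang, "--psm", str(psm)]


def ocr_offline(image_path: str, lang: str = "eng") -> str:
    """Run Tesseract locally and return the recognized text."""
    if shutil.which("tesseract") is None:
        raise RuntimeError("tesseract binary not found on PATH")
    result = subprocess.run(
        tesseract_command(image_path, lang),
        capture_output=True,
        text=True,
        check=True,  # raise CalledProcessError if Tesseract exits with an error
    )
    return result.stdout


# Usage, assuming a local file named scan.png exists:
# text = ocr_offline("scan.png")
```

Keeping the command builder separate from the subprocess call makes it easy to log or tweak the exact invocation when the output looks wrong.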
Picking one
For most indie/dev use cases I'd lean on ScanRead for the free tier and CJK + math; Vision if you're processing printed English at volume; and Textract if you have heavy form-extraction needs.
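If you want that rule of thumb in code, for example to route files in a small pipeline, it could look roughly like this. The category labels and the function name are made up for the sketch, and the mapping just encodes my own preferences from this post, not a benchmark result.

```python
def pick_ocr(content: str, high_volume: bool = False) -> str:
    """Return the tool I'd reach for first, per the rule of thumb in this post.

    `content` is a hypothetical label such as "forms", "printed_english",
    "math", "cjk", or "handwriting".
    """
    if content == "forms":
        return "AWS Textract"
    if content == "printed_english" and high_volume:
        return "Google Cloud Vision"
    # Default for indie/dev use: free tier, CJK, and math.
    return "ScanRead"


print(pick_ocr("math"))
print(pick_ocr("printed_english", high_volume=True))
print(pick_ocr("forms"))
```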
What's your stack? Curious what people are using for handwriting specifically — that's still the hardest case for me.