DEV Community

Adam LB

🚀 Exploring Document Parsing with Different AI Models

I’m excited to share a new project I’ve been working on: Documents-Parsing-Lab

This project is not a benchmark but an exploration lab: a collection of Jupyter notebooks that run different AI models on a range of document structures. The idea is simple: let users see the parsing results directly, compare approaches, and understand where each model shines (and where it struggles).


🔍 What’s Inside?

Each notebook applies one or more models to different real-world document types, including:

  • PDFs
  • Tables
  • Charts & Figures
  • Complex layouts

⚡ Models Explored So Far

  • OCR & Vision Models

    • Dolphin (ByteDance)
    • Typhoon_OCR_7B
    • MonkeyOCR
    • PaddleOCR
  • Table Structure Recognition

    • UniTable
    • TableFormer
    • Microsoft Table Transformer

(…and more to come!)


✨ Why This Project?

There are lots of benchmarks out there, but in practice people want to see how models handle real documents. This lab is meant to be a hands-on exploration space where you can:

  • Run the notebooks yourself
  • Compare outputs across models
  • Understand limitations before using them in production
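
To make "compare outputs across models" a bit more concrete, here is a minimal sketch of one possible way to score how closely the text outputs of several models agree. It is not taken from the notebooks: the function names, the model labels, and the sample strings are all hypothetical, and it only uses Python's standard `difflib`.

```python
from difflib import SequenceMatcher
from itertools import combinations


def normalize(text: str) -> str:
    """Collapse whitespace and lowercase so layout differences don't dominate."""
    return " ".join(text.split()).lower()


def pairwise_similarity(outputs: dict[str, str]) -> dict[tuple[str, str], float]:
    """Return a similarity ratio in [0, 1] for every pair of model outputs."""
    scores = {}
    for a, b in combinations(sorted(outputs), 2):
        ratio = SequenceMatcher(None, normalize(outputs[a]), normalize(outputs[b])).ratio()
        scores[(a, b)] = round(ratio, 3)
    return scores


# Hypothetical outputs, made up for illustration (not real model results).
sample_outputs = {
    "model_a": "Invoice No. 1042\nTotal: 310.00 EUR",
    "model_b": "Invoice No. 1042 Total: 310.00 EUR",
    "model_c": "Invoice No 1O42\nTotal 310,00 EUR",
}

for pair, score in pairwise_similarity(sample_outputs).items():
    print(pair, score)
```

A low score between two models does not say which one is wrong, only that they disagree, so it works best as a pointer to the pages worth inspecting by eye.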

📈 What’s Next?

More models and examples will be added over time. The goal is to grow this into a resource for anyone interested in:

  • OCR
  • Document AI
  • Structured data extraction
