DEV Community

Diego Viteri


I am building a document API suite that gives you coordinates for every answer

Hi everyone!

I've been working on a document processing API suite that solves a few problems I've run into over my career:

  • LLMs are great at answering questions about documents, but not so great when you need bounding boxes for those answers.

  • OCR engines give you bounding boxes, but they don't understand context.

  • Combining the two is a pain.
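
To make that last point concrete, here is a minimal sketch of the kind of glue code you end up writing when you pair an LLM with an OCR engine yourself: take the answer text from the LLM, search the OCR words for a matching run, and merge their boxes. The names `Word`, `normalize`, and `locate_answer` are hypothetical and only for illustration; they are not part of Ninjadoc or of any OCR library, and the word boxes are made-up sample data.

```python
from dataclasses import dataclass
from typing import Optional, Tuple

Box = Tuple[float, float, float, float]  # (x0, y0, x1, y1) in page pixels


@dataclass(frozen=True)
class Word:
    """One OCR token with its bounding box (hypothetical shape)."""
    text: str
    x0: float
    y0: float
    x1: float
    y1: float


def normalize(token: str) -> str:
    # Lowercase and drop punctuation so "Total:" matches "total".
    return "".join(ch for ch in token.lower() if ch.isalnum())


def locate_answer(answer: str, words: list[Word]) -> Optional[Box]:
    """Return the union box of the first run of OCR words that matches
    the answer word for word, or None if no such run exists."""
    target = [t for t in (normalize(a) for a in answer.split()) if t]
    if not target:
        return None
    tokens = [normalize(w.text) for w in words]
    n = len(target)
    for start in range(len(tokens) - n + 1):
        if tokens[start:start + n] == target:
            span = words[start:start + n]
            return (
                min(w.x0 for w in span),
                min(w.y0 for w in span),
                max(w.x1 for w in span),
                max(w.y1 for w in span),
            )
    return None


# Made-up OCR output for a tiny invoice.
words = [
    Word("Invoice", 10, 10, 60, 20),
    Word("Total:", 10, 40, 50, 50),
    Word("$1,250.00", 55, 40, 110, 50),
]

print(locate_answer("Total: $1,250.00", words))  # (10, 40, 110, 50)
print(locate_answer("1250 dollars", words))      # None
```

The second call is where the pain shows up: as soon as the LLM paraphrases or reformats the value, exact matching against OCR tokens finds nothing, and you are left writing fuzzy matching and handling answers that span several lines.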

So we are using vision models to solve this. The result is Ninjadoc, an easy-to-use API that answers your questions and returns geometry data for the evidence it found.

We have a few more products lined up, but for now this is what you get:

  • Endpoints to ask a single question in natural language (with or without coordinates)

  • A dashboard to define a collection of questions

  • AI-enhanced Markdown transforms (our techniques work well for these)

More improvements are on the way, but it's looking good, and I wish I'd had something like this earlier.

Let me know what you think. Any feedback is appreciated.

Ninjadoc

Thanks!
