Abdul Raheem

Extractly - Turn PDFs into Data

This is a submission for the Google AI Studio Multimodal Challenge

What I Built

Extractly is an AI-powered PDF extraction platform that accurately extracts text, tables, and charts from PDFs, preserving their exact format and wording.

Most open-source libraries break down when faced with complex PDFs, especially SEC filings, financial reports, and compliance documents filled with dense tables and tricky formatting. Cells merge, numbers misalign, and the meaning of entire sections can be lost. Extractly fixes this problem.

Extractly excels at maintaining table structures and formatting integrity. This matters because even a small misalignment in a financial table can completely change its meaning, which in turn compromises any downstream application (especially RAG systems).
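To make the misalignment problem concrete, here is a minimal sketch of a sanity check you could run on any extracted table. It assumes the table has been exported as a Markdown pipe table; the helper names `split_row` and `misaligned_rows` are hypothetical and are not part of Extractly. It only flags rows whose cell count differs from the header's, which is the symptom you get when two cells are merged during extraction.

```python
def split_row(row: str) -> list[str]:
    """Split one Markdown table row into stripped cell strings."""
    row = row.strip()
    # Drop only the single outer pipe on each side, so empty edge cells survive.
    if row.startswith("|"):
        row = row[1:]
    if row.endswith("|"):
        row = row[:-1]
    return [cell.strip() for cell in row.split("|")]


def misaligned_rows(markdown_table: str) -> list[int]:
    """Return 0-based indices of rows whose cell count differs from the header row."""
    lines = [ln for ln in markdown_table.splitlines() if ln.strip()]
    if not lines:
        return []
    expected = len(split_row(lines[0]))
    return [i for i, ln in enumerate(lines) if len(split_row(ln)) != expected]


# Illustrative data only: a clean table and one where two cells were merged.
good = """
| Item | 2023 | 2024 |
| --- | --- | --- |
| Revenue | 120 | 135 |
| Net income | 18 | 21 |
"""

bad = """
| Item | 2023 | 2024 |
| --- | --- | --- |
| Revenue | 120 | 135 |
| Net income | 18 21 |
"""

print(misaligned_rows(good))  # []
print(misaligned_rows(bad))   # [3]
```

In the second table, "18 21" has lost its column boundary, so a downstream system can no longer tell which year each figure belongs to. A check like this does not handle escaped pipes inside cells and cannot catch values that land in the wrong column while the count stays the same, so treat it as a first filter and not as proof of fidelity.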

With Extractly, organizations can:

  • Reliably extract complex tables and structured data without losing fidelity.
  • Ensure clean, LLM-ready data for training or RAG pipelines.
  • Build production-grade AI systems that understand documents as they were intended, not as garbled text.

By bridging the gap between raw PDFs and accurate, structured data, Extractly enables a new level of trust, precision, and usability in working with critical documents.

Original PDF Content Extractly Content

Real World Impact

By transforming messy, unstructured PDFs into clean, structured, and reliable data, Extractly unlocks new levels of automation and insight across industries:

  • Finance & Compliance → Accurate SEC filing extractions reduce hours of manual review.
  • Legal & Contracts → Precise table preservation ensures no meaning is lost in negotiations.
  • Healthcare & Research → Extracts lab results and trial data from complex forms with high accuracy.
  • AI & RAG Pipelines → Produces clean, reliable data that boosts retrieval accuracy and downstream analytics.

Demo

Live App: https://extractly-505581424280.us-west1.run.app/
Video Recording: https://youtu.be/rFwgBlbzGXg

How I Used Google AI Studio

I used Google AI Studio to quickly turn my backend into a functional app. With the app builder and the Gemini 2.5 Pro code assistant, I connected my backend, generated the UI, and set up the necessary connectors in minutes.

I applied prompt engineering techniques to guide the code assistant in optimizing the user experience. This included refining the UI into a more interactive design, adding components such as file download options, and rendering extracted results directly within the application.

Google AI Studio saved me a lot of time I would have otherwise spent building the frontend from scratch, while still giving me the freedom to shape the app’s flow and design the way I wanted.

Multimodal Features

Extractly uses Gemini 2.5 Pro’s multimodal capabilities to process PDFs that contain a mix of text, tables, and images. Instead of treating a PDF as flat text, Gemini analyzes both the content and the layout, which allows Extractly to:

  • Accurately capture tables and complex structures without losing formatting or merging cells.
  • Preserve original document fidelity, so financial and legal documents retain their meaning.
  • Extract multiple modalities together (text, structured data, and visuals) for a richer, more usable output.

By treating PDFs as multimodal objects (text + layout + structure), Extractly ensures that users don’t lose meaning or context when working with complex documents. For users, this means they can trust the extracted data to be LLM-ready, consistent, and production-grade without spending time on manual cleanup.
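For readers who want to try the multimodal approach themselves, the sketch below shows one way to package a PDF for Gemini 2.5 Pro through the public Gemini REST API. It is not Extractly's backend, which this post does not show: the prompt text and the `build_request` helper are assumptions made for illustration. The code only builds the JSON body and does not send it, so it runs without an API key.

```python
import base64
import json

MODEL = "gemini-2.5-pro"
ENDPOINT = (
    "https://generativelanguage.googleapis.com/v1beta/"
    f"models/{MODEL}:generateContent"
)

# Hypothetical prompt; the actual prompt used by Extractly is not published.
PROMPT = (
    "Extract all text, tables, and charts from this PDF. "
    "Return tables as Markdown and keep the original wording and cell layout."
)


def build_request(pdf_bytes: bytes, prompt: str = PROMPT) -> dict:
    """Build a generateContent body with the PDF attached as inline data."""
    encoded = base64.b64encode(pdf_bytes).decode("ascii")
    return {
        "contents": [
            {
                "parts": [
                    # The PDF itself, so the model sees layout as well as text.
                    {"inline_data": {"mime_type": "application/pdf", "data": encoded}},
                    # The instruction that goes with the document.
                    {"text": prompt},
                ]
            }
        ]
    }


# Placeholder bytes stand in for a real file read with open(path, "rb").
body = build_request(b"%PDF-1.4 sample")
print(ENDPOINT)
print(json.dumps(body)[:80])
```

To actually run the extraction, you would POST this body to `ENDPOINT` with your key in the `x-goog-api-key` header. Sending the whole PDF as one part, instead of pre-extracted text, is what lets the model use layout and structure alongside the words.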
