Beyond the Page: An Update on Docling Graph and Multimodal Extraction
Introduction
The Docling ecosystem is growing quickly. Version 2.93 brings substantial enhancements across Docling's core strengths, and a key highlight of this update is the progress made in the Docling-Graph library.
What is Docling Graph?
Excerpt from the official GitHub repository
Docling-Graph turns documents into validated Pydantic objects, then builds a directed knowledge graph with explicit semantic relationships.
This transformation enables high-precision use cases in chemistry, finance, and legal domains, where AI must capture exact entity connections (compounds and reactions, instruments and dependencies, properties and measurements) rather than rely on approximate text embeddings.
This toolkit supports two extraction paths: local VLM extraction via Docling, and LLM-based extraction routed through LiteLLM for local runtimes (vLLM, Ollama) and API providers (Mistral, OpenAI, Gemini, IBM watsonx), all orchestrated through a flexible, config-driven pipeline.
Key Capabilities
- Input formats: Docling's supported inputs: PDF, images, Markdown, Office, HTML, and more.
- Extraction: LLM or VLM backends, with chunking and processing modes.
- Graphs: Pydantic → NetworkX directed graphs with stable IDs and edge metadata.
- Export: CSV, Cypher, and other KG-friendly formats.
- Visualization: Interactive HTML and Markdown reports.
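To make the Cypher export in the list above more concrete, here is a minimal sketch of how a small node and edge list could be rendered as Cypher statements. It is not Docling-Graph's exporter: the function names `cypher_literal` and `to_cypher`, the `id` property, and the sample IDs are all hypothetical, and the quoting is only intended for simple string, number, and boolean values.

```python
import json


def cypher_literal(value):
    """Render a str, number, or bool as a Cypher literal."""
    # Assumption: JSON-style quoting and escaping is acceptable for simple values.
    return json.dumps(value)


def to_cypher(nodes, edges):
    """nodes: {node_id: (label, properties)}; edges: [(source_id, relation, target_id)]."""
    statements = []
    # One MERGE per node, keyed on a hypothetical `id` property.
    for node_id, (label, props) in nodes.items():
        merged = {"id": node_id, **props}
        body = ", ".join(f"{key}: {cypher_literal(val)}" for key, val in merged.items())
        statements.append(f"MERGE (:{label} {{{body}}});")
    # One MATCH + MERGE per directed edge.
    for source_id, relation, target_id in edges:
        statements.append(
            f"MATCH (a {{id: {cypher_literal(source_id)}}}), "
            f"(b {{id: {cypher_literal(target_id)}}}) "
            f"MERGE (a)-[:{relation}]->(b);"
        )
    return statements


# Hypothetical sample data, mirroring an Organization that EMPLOYS a Person.
nodes = {
    "org-1": ("Organization", {"name": "Acme"}),
    "per-1": ("Person", {"last_name": "Doe"}),
}
edges = [("org-1", "EMPLOYS", "per-1")]
statements = to_cypher(nodes, edges)
print("\n".join(statements))
```

Using `MERGE` rather than `CREATE` keeps the script idempotent, which pairs naturally with stable node IDs: running the export twice should not duplicate nodes.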
Latest Changes
- Multi-pass extraction: Delta and staged contracts (experimental).
- Structured extraction: LLM output is schema-enforced by default; see CLI and API to disable.
- LiteLLM: Single interface for vLLM, OpenAI, Mistral, watsonx, and more.
- Trace capture: Debug exports for extraction and fallback diagnostics.
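As a rough illustration of what schema-enforced output means in practice, the sketch below rejects any model reply that is not JSON or does not match the declared fields. Docling-Graph validates against Pydantic templates; this example uses only the standard library so it runs without extra dependencies, and the `Measurement` schema and `parse_structured` helper are hypothetical names introduced for illustration.

```python
import json
from dataclasses import dataclass, fields


@dataclass
class Measurement:
    """Hypothetical target schema for one extracted measurement."""
    property_name: str
    value: float
    unit: str


def _matches(value, expected_type):
    # JSON has a single number type, so accept ints where a float is declared.
    if expected_type is float:
        return isinstance(value, (int, float)) and not isinstance(value, bool)
    return isinstance(value, expected_type)


def parse_structured(raw, schema=Measurement):
    """Parse a raw LLM reply and fail loudly if it does not match the schema."""
    data = json.loads(raw)  # raises json.JSONDecodeError (a ValueError) on non-JSON text
    if not isinstance(data, dict):
        raise ValueError("expected a JSON object")
    expected = {f.name: f.type for f in fields(schema)}
    missing = expected.keys() - data.keys()
    extra = data.keys() - expected.keys()
    if missing or extra:
        raise ValueError(f"missing={sorted(missing)} extra={sorted(extra)}")
    for name, expected_type in expected.items():
        if not _matches(data[name], expected_type):
            raise ValueError(f"field {name!r} is not {expected_type.__name__}")
    return schema(**data)


good = parse_structured('{"property_name": "viscosity", "value": 12, "unit": "Pa.s"}')

try:
    parse_structured('{"property_name": "viscosity"}')
    rejected = False
except ValueError:
    rejected = True
```

Failing early on a malformed reply is what makes retry or fallback strategies possible, since the caller knows exactly which reply was unusable.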
Coming Soon
- Interactive Template Builder: Guided workflows for building Pydantic templates.
- Ontology-Based Templates: Match content to the best Pydantic template using semantic similarity.
- Graph Database Integration: Export data straight into Neo4j, ArangoDB, and similar databases.
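The template-matching idea in the list above can be sketched in a few lines. This is not the planned implementation, which has not been released: it substitutes a simple bag-of-words cosine similarity for real semantic embeddings, and the `best_template` helper, the template descriptions, and the `Invoice` template name are hypothetical.

```python
import math
import re
from collections import Counter


def bag_of_words(text):
    """Lowercase the text and count alphabetic tokens."""
    return Counter(re.findall(r"[a-z]+", text.lower()))


def cosine(a, b):
    """Cosine similarity between two token-count vectors."""
    dot = sum(a[token] * b[token] for token in a.keys() & b.keys())
    norm = math.sqrt(sum(v * v for v in a.values())) * math.sqrt(sum(v * v for v in b.values()))
    return dot / norm if norm else 0.0


def best_template(document_text, template_descriptions):
    """Return the template name whose description is most similar to the document."""
    doc_vector = bag_of_words(document_text)
    return max(
        template_descriptions,
        key=lambda name: cosine(doc_vector, bag_of_words(template_descriptions[name])),
    )


# Hypothetical descriptions; only ScholarlyRheologyPaper appears in the samples.
templates = {
    "ScholarlyRheologyPaper": "rheology viscosity shear stress measurement research paper",
    "Invoice": "invoice total amount due vendor payment",
}
choice = best_template("We measure the shear viscosity of polymer melts", templates)
print(choice)
```

Swapping `bag_of_words` for an embedding model would turn this lexical match into a semantic one without changing the selection logic.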
Fast Implementation
You can start integrating graph capabilities right away by exploring the samples available in the official repository.
Whether you are building GraphRAG systems or mapping complex document relationships, these samples are a practical starting point for your implementation.
- Install the package
pip install docling-graph
- Create your .env file or export the API keys of your target platform;
export OPENAI_API_KEY="..." # OpenAI
export MISTRAL_API_KEY="..." # Mistral
export GEMINI_API_KEY="..." # Google Gemini
# IBM WatsonX
export WATSONX_API_KEY="..." # IBM WatsonX API Key
export WATSONX_PROJECT_ID="..." # IBM WatsonX Project ID
export WATSONX_URL="..." # IBM WatsonX URL (optional)
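Before running a pipeline, it can help to check that the keys for your chosen provider are actually set. The following is a small sketch, not part of Docling-Graph: the `REQUIRED_KEYS` mapping and `missing_keys` helper are hypothetical, the provider labels are chosen for illustration, and the variable names are the ones exported above (`WATSONX_URL` is left out because it is optional).

```python
import os

# Hypothetical mapping from a provider label to the variables exported above.
REQUIRED_KEYS = {
    "openai": ["OPENAI_API_KEY"],
    "mistral": ["MISTRAL_API_KEY"],
    "gemini": ["GEMINI_API_KEY"],
    "watsonx": ["WATSONX_API_KEY", "WATSONX_PROJECT_ID"],
}


def missing_keys(provider, env=None):
    """Return the required variables that are unset or empty for a provider."""
    env = os.environ if env is None else env
    return [key for key in REQUIRED_KEYS[provider] if not env.get(key)]


if __name__ == "__main__":
    missing = missing_keys("mistral")
    if missing:
        print(f"Missing environment variables: {', '.join(missing)}")
    else:
        print("All required keys are set.")
```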
- Use the provided samples to run your application in default mode;
from docling_graph import run_pipeline, PipelineContext
from docs.examples.templates.rheology_research import ScholarlyRheologyPaper
# Create configuration
config = {
    "source": "https://arxiv.org/pdf/2207.02720",
    "template": ScholarlyRheologyPaper,
    "backend": "llm",
    "inference": "remote",
    "processing_mode": "many-to-one",
    "extraction_contract": "staged",  # robust for smaller models
    "provider_override": "mistral",
    "model_override": "mistral-medium-latest",
    "structured_output": True,  # default
    "use_chunking": True,
}
# Run pipeline - returns data directly, no files written to disk
context: PipelineContext = run_pipeline(config)
# Access results
graph = context.knowledge_graph
models = context.extracted_models
metadata = context.graph_metadata
print(f"Extracted {len(models)} model(s)")
print(f"Graph: {graph.number_of_nodes()} nodes, {graph.number_of_edges()} edges")
- Or define your own template with Pydantic;
from pydantic import BaseModel, Field
from docling_graph.utils import edge


class Person(BaseModel):
    """Person entity with stable ID."""
    model_config = {
        'is_entity': True,
        'graph_id_fields': ['last_name', 'date_of_birth']
    }
    first_name: str = Field(description="Person's first name")
    last_name: str = Field(description="Person's last name")
    date_of_birth: str = Field(description="Date of birth (YYYY-MM-DD)")


class Organization(BaseModel):
    """Organization entity."""
    model_config = {'is_entity': True}
    name: str = Field(description="Organization name")
    employees: list[Person] = edge("EMPLOYS", description="List of employees")
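To see why `graph_id_fields` matters, consider how a stable ID could be derived from those fields so that the same person extracted from two chunks collapses into one node. The sketch below is an assumption about the general technique, not Docling-Graph's actual ID scheme: `stable_id` is a hypothetical helper, and a plain dataclass stands in for the Pydantic model to keep the example dependency-free.

```python
import hashlib
from dataclasses import dataclass


@dataclass
class Person:
    """Stand-in for the Pydantic Person template above."""
    first_name: str
    last_name: str
    date_of_birth: str


def stable_id(obj, id_fields):
    """Hash the class name plus the chosen identity fields into a short, repeatable ID."""
    parts = [type(obj).__name__] + [str(getattr(obj, name)) for name in id_fields]
    digest = hashlib.sha256("|".join(parts).encode("utf-8")).hexdigest()
    return f"{type(obj).__name__}_{digest[:12]}"


ID_FIELDS = ["last_name", "date_of_birth"]

# The first two records differ only in first_name, which is not an identity field.
people = [
    Person("Jane", "Doe", "1990-04-01"),
    Person("J.", "Doe", "1990-04-01"),
    Person("John", "Doe", "1985-11-23"),
]

# Deduplicate: the first record seen for each ID wins.
nodes = {}
for person in people:
    nodes.setdefault(stable_id(person, ID_FIELDS), person)

print(f"{len(people)} extracted records -> {len(nodes)} graph nodes")
```

Choosing identity fields is a modeling decision: fields that vary between mentions (such as a first name written as an initial) are better left out of the ID.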
- Or even using the CLI;
# Initialize configuration
docling-graph init
# Convert document from URL (each line except the last must end with \)
docling-graph convert "https://arxiv.org/pdf/2207.02720" \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --processing-mode "many-to-one" \
    --extraction-contract "staged" \
    --debug
# Visualize results
docling-graph inspect outputs
Conclusion
As the ecosystem continues to evolve, the structural intelligence that Docling-Graph adds marks a significant milestone in how we process enterprise data. By moving beyond simple text extraction to relationship-aware mapping, you can get more out of your document-driven workflows and RAG implementations. Whether you are automating complex SDLC tasks or building advanced knowledge systems, these tools provide a solid foundation for the next generation of intelligent applications.
Stay tuned for more Docling updates and thanks for reading!
Links
- Docling Graph repository: https://github.com/docling-project/docling-graph
- Docling Project Documentation and Samples: https://docling-project.github.io/docling/
