Alain Airom (Ayrom)

Posted on Feb 22

The Architecture of Understanding: My Experience Building with Docling-Graph

#docling #bob #documentprocessing #aidocumentprocessing

Beyond the Text: Wiring Up My First Docling-Graph Application with Bob

Introducing Docling-Graph

Docling-Graph turns documents into validated Pydantic objects, then builds a directed knowledge graph with explicit semantic relationships.

This transformation enables high-precision use cases in chemistry, finance, and legal domains, where AI must capture exact entity connections (compounds and reactions, instruments and dependencies, properties and measurements) rather than rely on approximate text embeddings.

This toolkit supports two extraction paths: local VLM extraction via Docling, and LLM-based extraction routed through LiteLLM for local runtimes (vLLM, Ollama) and API providers (Mistral, OpenAI, Gemini, IBM watsonx), all orchestrated through a flexible, config-driven pipeline.

Key Capabilities

✍🏻 Input formats: Docling’s supported inputs: PDF, images, markdown, Office, HTML, and more.
🧠 Extraction: LLM or VLM backends, with chunking and processing modes.
💎 Graphs: Pydantic → NetworkX directed graphs with stable IDs and edge metadata.
📦 Export: CSV, Cypher, and other KG-friendly formats.
🔍 Visualization: Interactive HTML and Markdown reports.
🪜 Multi-pass extraction: Delta and staged contracts (experimental).
📐 Structured extraction: LLM output is schema-enforced by default; see CLI and API to disable.
✨ LiteLLM: Single interface for vLLM, OpenAI, Mistral, WatsonX, and more.
🐛 Trace capture: Debug exports for extraction and fallback diagnostics.
And Coming Soon…
🧩 Interactive Template Builder: Guided workflows for building Pydantic templates.
🧲 Ontology-Based Templates: Match content to the best Pydantic template using semantic similarity.
💾 Graph Database Integration: Export data straight into Neo4j, ArangoDB, and similar databases.

My Implementation of docling-graph — First Step

To build a comprehensive test-of-concept, I used Bob to synthesize the Docling-Graph documentation and sample code into a working implementation. This application isn’t a production-ready solution yet; rather, it’s a foundational prototype designed to explore the library’s technical capacities. My goal was to see how the graph-based document parsing holds up in a real-world environment, laying the groundwork for future business applications.

Out of the various examples provided in the official repository, I selected the following implementation as the foundation for my application. It serves as the perfect blueprint for demonstrating how Docling-Graph maps document structures into a navigable, programmatic format.

"""
Example 02: Quickstart - LLM Extraction from PDF

Description:
    Basic LLM extraction from a multi-page rheology research PDF using a remote API.
    Demonstrates the standard workflow for text-heavy documents with automatic chunking.

Use Cases:
    - Rheology researchs and academic documents
    - Technical reports and whitepapers
    - Multi-page business documents
    - Any text-heavy PDF content

Prerequisites:
    - Installation: uv sync
    - Environment: export MISTRAL_API_KEY="your-api-key"
    - Data: Sample rheology research included in repository

Key Concepts:
    - LLM Backend: Processes text extracted from PDFs
    - Many-to-One Mode: All pages merged into single output
    - Chunking: Automatically splits large documents for LLM context limits
    - Remote Inference: Uses Mistral API for extraction
    - Programmatic Merge: Combines chunk results without additional LLM call

Expected Output:
    - nodes.csv: Extracted research data (authors, experiments, results)
    - edges.csv: Relationships between research entities
    - graph.html: Interactive knowledge graph visualization
    - document.md: Markdown version of the PDF
    - report.md: Extraction statistics and summary

Related Examples:
    - Example 01: VLM extraction from images
    - Example 07: Local LLM inference
    - Example 08: Advanced chunking strategies
    - Documentation: https://ibm.github.io/docling-graph/usage/examples/research-paper/
"""

import sys
from pathlib import Path

from rich import print as rich_print
from rich.console import Console
from rich.panel import Panel

# Setup project path
project_root = Path(__file__).parent.parent.parent
sys.path.append(str(project_root))

try:
    from examples.templates.rheology_research import ScholarlyRheologyPaper

    from docling_graph import PipelineConfig, run_pipeline
except ImportError:
    rich_print("[red]Error:[/red] Could not import required modules.")
    rich_print("Please run this script from the project root directory.")
    sys.exit(1)

# Configuration
SOURCE_FILE = "docs/examples/data/research_paper/rheology.pdf"
TEMPLATE_CLASS = ScholarlyRheologyPaper
console = Console()


def main() -> None:
    """Execute LLM extraction from rheology research PDF."""
    console.print(
        Panel.fit(
            "[bold blue]Example 02: Quickstart - LLM from PDF[/bold blue]\n"
            "[dim]Extract structured data from a rheology research using Large Language Model[/dim]",
            border_style="blue",
        )
    )

    console.print("\n[yellow]📋 Configuration:[/yellow]")
    console.print(f"  • Source: [cyan]{SOURCE_FILE}[/cyan]")
    console.print(f"  • Template: [cyan]{TEMPLATE_CLASS.__name__}[/cyan]")
    console.print("  • Backend: [cyan]LLM (Large Language Model)[/cyan]")
    console.print("  • Provider: [cyan]Mistral AI[/cyan]")
    console.print("  • Mode: [cyan]many-to-one[/cyan]")

    console.print("\n[yellow]⚠️  Prerequisites:[/yellow]")
    console.print("  • Mistral API key must be set: [cyan]export MISTRAL_API_KEY='...'[/cyan]")
    console.print("  • Install dependencies: [cyan]uv sync[/cyan]")

    try:
        # Configure the pipeline
        config = PipelineConfig(
            source=SOURCE_FILE,
            template=TEMPLATE_CLASS,
            # LLM backend for text-based extraction
            backend="llm",
            # Remote inference using API
            inference="remote",
            # Use Mistral AI provider
            provider_override="mistral",
            # Use a capable model for complex extraction
            model_override="mistral-large-latest",
            # Many-to-one: merge all pages into single result
            processing_mode="many-to-one",
            # extraction_contract="direct" (default); use "staged" for complex nested templates (see Example 11)
            use_chunking=True,
        )

        # Execute the pipeline
        console.print("\n[yellow]⚙️  Processing (this may take 1-2 minutes)...[/yellow]")
        console.print("  • Converting PDF to markdown")
        console.print("  • Chunking document for LLM context")
        console.print("  • Extracting data from each chunk")
        console.print("  • Merging results programmatically")
        console.print("  • Building knowledge graph")

        context = run_pipeline(config)

        # Success message
        console.print("\n[green]✓ Success![/green]")
        graph = context.knowledge_graph
        console.print(
            f"\n[bold]Extracted:[/bold] [cyan]{graph.number_of_nodes()} nodes[/cyan] "
            f"and [cyan]{graph.number_of_edges()} edges[/cyan]"
        )

        console.print("\n[bold]💡 What Happened:[/bold]")
        console.print("  • PDF converted to markdown using Docling")
        console.print("  • Document split into chunks respecting context limits")
        console.print("  • Each chunk processed by Mistral LLM")
        console.print("  • Results merged programmatically (no LLM consolidation)")
        console.print("  • Knowledge graph built from extracted entities")

        console.print("\n[bold]🎯 Key Differences from Example 01:[/bold]")
        console.print("  • LLM vs VLM: Text-based vs vision-based extraction")
        console.print("  • Remote vs Local: API call vs local model")
        console.print("  • Many-to-one vs One-to-one: Merged vs separate outputs")
        console.print("  • Chunking: Enabled for large documents")

    except FileNotFoundError:
        console.print(f"\n[red]Error:[/red] Source file not found: {SOURCE_FILE}")
        console.print("\n[yellow]Troubleshooting:[/yellow]")
        console.print("  • Ensure you're running from the project root directory")
        console.print("  • Check that the sample data exists in docs/examples/data/")
        sys.exit(1)

    except Exception as e:
        error_msg = str(e).lower()
        console.print(f"\n[red]Error:[/red] {e}")
        console.print("\n[yellow]Troubleshooting:[/yellow]")

        if "api" in error_msg or "key" in error_msg or "auth" in error_msg:
            console.print(
                "  • Set your Mistral API key: [cyan]export MISTRAL_API_KEY='your-key'[/cyan]"
            )
            console.print("  • Get a key at: https://console.mistral.ai/")
            console.print("  • Or use local inference: see Example 07")
        else:
            console.print("  • Ensure dependencies installed: [cyan]uv sync[/cyan]")
            console.print("  • Check your internet connection")
            console.print("  • Verify the template class is correctly defined")
            console.print("  • Try with a smaller document first")

        sys.exit(1)


if __name__ == "__main__":
    main()

Rheology Research Extraction


Overview

Extract complex research data from scientific papers including experiments, measurements, materials, and results.

**Document Type:** Rheology Research (PDF)  
**Time:** 30 minutes  
**Backend:** LLM with chunking

---

Prerequisites

bash
Install with remote API support
pip install docling-graph

Set API key
export MISTRAL_API_KEY="your-key"


---

Template Overview

The rheology research template (`rheology_research.py`) includes:

- **Measurements** - Flexible value/unit pairs
- **Materials** - Granular material properties
- **Geometry** - Experimental setup
- **Vibration** - Vibration parameters
- **Simulation** - DEM simulation details
- **Results** - Rheological measurements
- **Experiments** - Complete experiment instances
- **Research** - Root document model

Key Components

python
1. Measurement Model
class Measurement(BaseModel):
    Flexible measurement with value and unit."""
    name: str
    numeric_value: float | None = None
    text_value: str | None = None
    unit: str | None = None

2. Enum Types
class GeometryType(str, Enum):
    VANE_RHEOMETER = "Vane Rheometer"
    DOUBLE_PLATE = "Double Plate"
    CYLINDRICAL_CONTAINER = "Cylindrical Container"

3. Experiment Entity
class Experiment(BaseModel):
    experiment_id: str
    objective: str
    granular_material: GranularMaterial = edge("USES_MATERIAL")
    vibration_conditions: VibrationConditions = edge("HAS_VIBRATION")
    rheological_results: List[RheologicalResult] = edge("HAS_RESULT")

4. Root Model
class Research(BaseModel):
    title: str
    authors: List[str]
    experiments: List[Experiment] = edge("HAS_EXPERIMENT")


Processing

Using CLI

bash
Process rheology research with chunking
uv run docling-graph convert research.pdf \
    --template "docs.examples.templates.rheology_research.ScholarlyRheologyPaper" \
    --backend llm \
    --inference remote \
    --provider mistral \
    --model mistral-large-latest \
    --processing-mode many-to-one \
    --use-chunking \
    --docling-pipeline vision \
    --output-dir "outputs/research"


Using Python API

python
Process rheology research.

import os
from docling_graph import run_pipeline, PipelineConfig

os.environ["MISTRAL_API_KEY"] = "your-key"

config = PipelineConfig(
    source="research.pdf",
    template="docs.examples.templates.rheology_research.ScholarlyRheologyPaper",
    backend="llm",
    inference="remote",
    provider_override="mistral",
    model_override="mistral-large-latest",
    processing_mode="many-to-one",
    use_chunking=True,
    docling_config="vision"  # Better for complex layouts
)

print("Processing rheology research (may take several minutes)...")
run_pipeline(config)
print("✅ Complete!")


Expected Results

Graph Structure

Research (Title)
├── HAS_EXPERIMENT → Experiment 1
│   ├── USES_MATERIAL → GranularMaterial
│   │   └── properties: [Measurement, Measurement]
│   ├── HAS_GEOMETRY → SystemGeometry
│   │   └── dimensions: [Measurement, Measurement]
│   ├── HAS_VIBRATION → VibrationConditions
│   │   ├── amplitude: Measurement
│   │   ├── frequency: Measurement
│   │   └── confining_pressure: Measurement
│   ├── HAS_SIMULATION → SimulationSetup
│   │   └── parameters: [Measurement, Measurement]
│   └── HAS_RESULT → RheologicalResult
│       └── measurement: Measurement
└── HAS_EXPERIMENT → Experiment 2
    └── ...


Statistics

json
{
  "node_count": 45,
  "edge_count": 38,
  "density": 0.019,
  "node_types": {
    "Research": 1,
    "Experiment": 3,
    "GranularMaterial": 3,
    "SystemGeometry": 3,
    "VibrationConditions": 3,
    "RheologicalResult": 12,
    "Measurement": 20
  }
}



Key Features

1. Enum Normalization

python
class GeometryType(str, Enum):
    VANE_RHEOMETER = "Vane Rheometer"
    CYLINDRICAL_CONTAINER = "Cylindrical Container"

 Validator accepts multiple formats
@field_validator("geometry_type", mode="before")
@classmethod
def normalize_enum(cls, v):
    # Accepts: "Vane Rheometer", "vane_rheometer", "VANE_RHEOMETER"
    return _normalize_enum(GeometryType, v)


2. Measurement Parsing
python
# Parses strings like "1.6 mPa.s", "2 mm", "80-90 °C"
def _parse_measurement_string(s: str):
    # Single value: "1.6 mPa.s" → {numeric_value: 1.6, unit: "mPa.s"}
    # Range: "80-90 °C" → {numeric_value_min: 80, numeric_value_max: 90, unit: "°C"}
    ...


3. Flexible Measurements

python
class Measurement(BaseModel):
    name: str
    numeric_value: float | None = None  # Single value
    numeric_value_min: float | None = None  # Range min
    numeric_value_max: float | None = None  # Range max
    text_value: str | None = None  # Qualitative
    unit: str | None = None


4. Nested Relationships

python
class Experiment(BaseModel):
     Direct edges
    granular_material: GranularMaterial = edge("USES_MATERIAL")

    Nested properties (not separate nodes)
    key_findings: List[str] = Field(default_factory=list)



Configuration Tips

For Long Documents

bash
# Enable chunking and consolidation
uv run docling-graph convert research.pdf \
    --template "templates.ScholarlyRheologyPaper" \
    --use-chunking \
    --processing-mode many-to-one


For Complex Layouts

bash
Use vision pipeline for better table/figure handling
uv run docling-graph convert research.pdf \
    --template "templates.ScholarlyRheologyPaper" \
    --docling-pipeline vision


For Cost Optimization

bash
Use smaller model without consolidation
uv run docling-graph convert research.pdf \
    --template "templates.ScholarlyRheologyPaper" \
    --model mistral-small-latest \


Customization

Simplify for Your Domain

python
"""Simplified research template."""

from pydantic import BaseModel, Field
from typing import List

def edge(label: str, **kwargs):
    return Field(..., json_schema_extra={"edge_label": label}, **kwargs)

class Measurement(BaseModel):
    """Simple measurement."""
    name: str
    value: str  # Keep as string for simplicity
    unit: str | None = None

class Experiment(BaseModel):
    """Simplified experiment."""
    title: str
    objective: str
    methods: str
    results: str
    measurements: List[Measurement] = Field(default_factory=list)

class Research(BaseModel):
    """Simplified rheology research (for demonstration).

    Note: For production use, see the full ScholarlyRheologyPaper template at:
    docs/examples/templates/rheology_research.py

    The full template includes:
    - Comprehensive scholarly metadata (authors, affiliations, identifiers)
    - Detailed formulation specifications (materials, components, amounts)
    - Batch preparation history (mixing steps, equipment, conditions)
    - Complete rheometry setup (instruments, geometries, protocols)
    - Test runs and datasets (curves, measurements, model fits)
    """
    title: str
    authors: List[str]
    abstract: str
    experiments: List[Experiment] = edge("HAS_EXPERIMENT")

Troubleshooting

🐛 Extraction Takes Too Long

**Solution:**
`bash
Disable consolidation for faster processing
uv run docling-graph convert research.pdf \
    --template "templates.ScholarlyRheologyPaper" \

Or use smaller model
--model mistral-small-latest


🐛 Missing Measurements

**Solution:**
python
# Make measurements optional
measurements: List[Measurement] = Field(
    default_factory=list,
    description="List of measurements (optional)"
)


🐛 Enum Validation Errors

**Solution:**
python
# Add OTHER option to enums
class GeometryType(str, Enum):
    VANE_RHEOMETER = "Vane Rheometer"
    OTHER = "Other"  # Fallback

Or make enum optional
geometry_type: GeometryType | None = Field(default=None)


Best Practices

👍 Start Simple, Add Complexity

python
Phase 1: Basic structure
class Research(BaseModel):
    title: str
    authors: List[str]
    abstract: str

Phase 2: Add experiments
class Research(BaseModel):
    title: str
    authors: List[str]
    abstract: str
    experiments: List[Experiment]

Phase 3: Add measurements, validations, etc.

👍 Use Appropriate Chunking

python
For papers > 10 pages
config = PipelineConfig(
    source="long_paper.pdf",
    template="templates.ScholarlyRheologyPaper",
    use_chunking=True,  # Essential
)


👍 Provide Clear Examples

python
✅ Good - Domain-specific examples
viscosity: Measurement = Field(
    description="Effective viscosity measurement",
    examples=[
        {"name": "Effective Viscosity", "numeric_value": 1.6, "unit": "mPa.s"}
    ]
)



Next Steps

1. **[ID Card →](id-card.md)** - Vision-based extraction
2. **[Advanced Patterns →](../../fundamentals/schema-definition/advanced-patterns.md)** - Complex templates
3. **[Performance Tuning →](../advanced/performance-tuning.md)** - Optimization

Core Project Implementation Synthesis

The D*ocling-Graph Showcase Application* is a production-ready, local-first solution designed to transform unstructured documents into validated, structured knowledge graphs. At its core, the implementation bridges the gap between raw document parsing (via the Docling-Graph library) and accessible user interaction (via a Gradio web interface).

The system architecture follows a modular pipeline: it ingests various file formats (PDFs, Office docs, images), processes them through a Document Converter, and utilizes local LLM inference — specifically Ollama with the Granite 3.1 model — to perform entity and relationship extraction. This extraction is governed by a Template Engine that ensures the output conforms to strict Pydantic schemas. The final result is a timestamped knowledge graph stored in an organized output directory. The project is fully “container-ready” with Docker and Kubernetes manifests, supported by automation scripts for launching and lifecycle management, ensuring it can scale from a simple local test to a deployed environment.

The Template Guide: Defining the “Brain” of Extraction

Templates serve as the foundational blueprints for the application’s intelligence, defining exactly what should be extracted and how it should be structured. Built using Pydantic models, these templates act as a bridge between unstructured text and formal data; they specify “nodes” (entities like parties, dates, or products) and “edges” (relationships like “buyer of” or “tax applied to”).

What makes these templates unique is their use of v*alidation rules and natural language descriptions* to guide the LLM. For instance, a template might include a validator to normalize currency formats or specific “hints” to tell the LLM where to look for data. By switching between different templates — such as those for billing documents, scientific research, or identity cards — the application can pivot its entire extraction logic to suit different industries without changing the underlying code. Essentially, the Template Guide provides the schema that transforms a generic LLM into a specialized document expert.

LLM Implementation

While the official repository showcases various hosted models, I’ve tailored my implementation to run on a local Ollama setup for maximum privacy and control. That said, the application’s architecture is intentionally provider-agnostic; users can easily pivot to watsonx, OpenAI, Mistral, or Gemini by simply adjusting the environment configuration. A dedicated LLM configuration layer handles the specific nuances of each provider, ensuring the extraction logic remains consistent regardless of the backend.

If you have several local models using Ollama, the application let’s you chose the one you prefer (or to benchmark them!).

The output of precossed documents

As is my habit, I’ve configured the system to store everything in timestamped files within the output folder for easy version control. Just a heads-up: because I was putting the CPU through its paces with these complex document-to-graph transformations, the 'heavy lifting' can take a little while. Grab a coffee while Bob’s docling-graph works through the more data-dense files! 😉

The code…

Now that you know the ‘why’ and the ‘how,’ here is the code Bob and I put together to bring the Gradio interface to life. This project is a starting point, and I’ve made it fully available on GitHub for anyone to fork, test, and improve. If you have ideas for better configurations or want to use the UI as a springboard for your own business case, I’d love to see what you build!

"""
Docling-Graph Showcase Application
A Gradio-based UI for document processing using docling-graph with Ollama/Granite4
"""

import os
import sys
from pathlib import Path
from datetime import datetime
from typing import List, Tuple, Optional, Dict, Any, Type
import json
import traceback
import requests
import importlib.util
from dotenv import load_dotenv

import gradio as gr
from rich.console import Console
from rich.panel import Panel
from pydantic import BaseModel

# Load environment variables from .env file
load_dotenv()

# Add project root to path
project_root = Path(__file__).parent
sys.path.append(str(project_root))

try:
    from docling_graph import PipelineConfig, run_pipeline
except ImportError:
    print("Error: docling-graph not installed. Run: pip install docling-graph")
    sys.exit(1)

console = Console()

# Configuration
INPUT_DIR = project_root / "input"
OUTPUT_DIR = project_root / "output"
SAMPLES_DIR = project_root / "_samples"
TEMPLATES_DIR = project_root / "templates"

# Ensure directories exist
INPUT_DIR.mkdir(exist_ok=True)
OUTPUT_DIR.mkdir(exist_ok=True)
TEMPLATES_DIR.mkdir(exist_ok=True)

# Load configuration from environment variables
OLLAMA_BASE_URL = os.getenv("OLLAMA_BASE_URL", "http://localhost:11434")
OLLAMA_MODEL = os.getenv("OLLAMA_MODEL", "granite4")

# watsonx Orchestrate configuration
WO_DEVELOPER_EDITION_SOURCE = os.getenv("WO_DEVELOPER_EDITION_SOURCE", "orchestrate")
WO_INSTANCE = os.getenv("WO_INSTANCE", "")
WO_API_KEY = os.getenv("WO_API_KEY", "")

# Remote API keys
MISTRAL_API_KEY = os.getenv("MISTRAL_API_KEY", "")
OPENAI_API_KEY = os.getenv("OPENAI_API_KEY", "")
GEMINI_API_KEY = os.getenv("GEMINI_API_KEY", "")

# Application settings
GRADIO_SERVER_PORT = int(os.getenv("GRADIO_SERVER_PORT", "7860"))
GRADIO_SERVER_NAME = os.getenv("GRADIO_SERVER_NAME", "0.0.0.0")


def get_ollama_models() -> List[str]:
    """Fetch available Ollama models from the local Ollama instance."""
    try:
        response = requests.get(f"{OLLAMA_BASE_URL}/api/tags", timeout=5)
        if response.status_code == 200:
            data = response.json()
            models = [model["name"] for model in data.get("models", [])]
            return sorted(models) if models else [OLLAMA_MODEL]
        else:
            console.print(f"[yellow]Warning: Could not fetch Ollama models (status {response.status_code})[/yellow]")
            return [OLLAMA_MODEL]
    except requests.exceptions.RequestException as e:
        console.print(f"[yellow]Warning: Ollama not available - {str(e)}[/yellow]")
        return [OLLAMA_MODEL]


def check_ollama_status() -> Tuple[bool, str]:
    """Check if Ollama is running and return status."""
    try:
        response = requests.get(f"{OLLAMA_BASE_URL}/api/tags", timeout=5)
        if response.status_code == 200:
            models = get_ollama_models()
            return True, f"🟢 Ollama Running ({len(models)} models available)"
        return False, "🟡 Ollama responding but no models found"
    except requests.exceptions.RequestException:
        return False, "🔴 Ollama Not Running"


def get_timestamp() -> str:
    """Generate timestamp for output files."""
    return datetime.now().strftime("%Y%m%d_%H%M%S")


def load_template_from_file(template_path: Path) -> Optional[Type[BaseModel]]:
    """
    Dynamically load a Pydantic template class from a Python file.

    Args:
        template_path: Path to the template Python file

    Returns:
        The root template class (BaseModel subclass) or None if not found
    """
    try:
        # Load the module
        spec = importlib.util.spec_from_file_location(template_path.stem, template_path)
        if spec is None or spec.loader is None:
            console.print(f"[yellow]Warning: Could not load spec for {template_path.name}[/yellow]")
            return None

        module = importlib.util.module_from_spec(spec)
        spec.loader.exec_module(module)

        # Find the root template class (usually the last BaseModel class defined)
        # Look for classes with graph_id_fields in model_config
        template_classes = []
        for name in dir(module):
            obj = getattr(module, name)
            if (isinstance(obj, type) and
                issubclass(obj, BaseModel) and
                obj is not BaseModel):
                # Check if it has graph_id_fields (indicates it's a root entity)
                if hasattr(obj, 'model_config'):
                    config = obj.model_config
                    if isinstance(config, dict) and 'graph_id_fields' in config:
                        template_classes.append(obj)

        # Return the last one found (usually the root document class)
        if template_classes:
            return template_classes[-1]

        console.print(f"[yellow]Warning: No root template class found in {template_path.name}[/yellow]")
        return None

    except Exception as e:
        console.print(f"[red]Error loading template {template_path.name}: {str(e)}[/red]")
        return None


def get_available_templates() -> Dict[str, Dict[str, Any]]:
    """
    Get all available templates from templates directory and _samples.

    Returns:
        Dictionary mapping template names to their info (path, class, description)
    """
    templates = {}

    # Load from templates directory
    if TEMPLATES_DIR.exists():
        for template_file in TEMPLATES_DIR.glob("*.py"):
            if template_file.name.startswith("_"):
                continue

            template_class = load_template_from_file(template_file)
            if template_class:
                # Extract description from docstring
                description = template_class.__doc__ or "No description available"
                description = description.strip().split('\n')[0]  # First line only

                templates[template_file.stem] = {
                    "name": template_file.stem.replace("_", " ").title(),
                    "path": template_file,
                    "class": template_class,
                    "description": description,
                    "source": "templates"
                }

    # Load from _samples directory
    if SAMPLES_DIR.exists():
        for template_file in SAMPLES_DIR.glob("*_template.py"):
            template_class = load_template_from_file(template_file)
            if template_class:
                description = template_class.__doc__ or "No description available"
                description = description.strip().split('\n')[0]

                templates[f"sample_{template_file.stem}"] = {
                    "name": f"Sample: {template_file.stem.replace('_', ' ').title()}",
                    "path": template_file,
                    "class": template_class,
                    "description": description,
                    "source": "_samples"
                }

    return templates


def list_input_files() -> List[str]:
    """List all files in the input directory."""
    if not INPUT_DIR.exists():
        return []

    files = []
    for file_path in INPUT_DIR.iterdir():
        if file_path.is_file():
            files.append(file_path.name)
    return sorted(files)


def process_document(
    file_path: str,
    backend: str,
    processing_mode: str,
    use_chunking: bool,
    provider: str,
    model: str,
    template_key: str,
    progress=gr.Progress()
) -> Tuple[str, Optional[str], Optional[str], Optional[str]]:
    """
    Process a single document using docling-graph.

    Args:
        file_path: Path to the document
        backend: Extraction backend (llm or vlm)
        processing_mode: Processing mode (one-to-one or many-to-one)
        use_chunking: Whether to use chunking
        provider: LLM provider (ollama, mistral, openai, etc.)
        model: Model name
        template_key: Key of the template to use
        progress: Gradio progress tracker

    Returns:
        Tuple of (status_message, graph_html_path, nodes_csv_path, edges_csv_path)
        File paths may be None if files weren't generated
    """
    try:
        progress(0.0, desc="🔧 Initializing...")

        # Create timestamped output directory
        timestamp = get_timestamp()
        output_subdir = OUTPUT_DIR / f"run_{timestamp}"
        output_subdir.mkdir(exist_ok=True)

        # Prepare source path
        source_path = INPUT_DIR / file_path
        if not source_path.exists():
            return f"Error: File not found: {file_path}", None, None, None

        progress(0.1, desc="📋 Loading template...")

        # Load the selected template
        available_templates = get_available_templates()
        if template_key not in available_templates:
            return f"Error: Template '{template_key}' not found", None, None, None

        template_info = available_templates[template_key]
        template_class = template_info["class"]
        template_name = template_info["name"]

        progress(0.15, desc="📋 Configuring pipeline...")

        # Configure pipeline
        config_dict = {
            "source": str(source_path),
            "template": template_class,
            "backend": backend,
            "inference": "remote" if provider != "ollama" else "remote",
            "processing_mode": processing_mode,
            "use_chunking": use_chunking,
            "output_dir": str(output_subdir),
        }

        # Add provider-specific configuration
        if provider == "ollama":
            config_dict["provider_override"] = "ollama"
            config_dict["model_override"] = f"ollama/{model}"
            config_dict["api_base"] = OLLAMA_BASE_URL
        else:
            config_dict["provider_override"] = provider
            config_dict["model_override"] = model

        config = PipelineConfig(**config_dict)

        progress(0.2, desc="⚙️ Processing document...")
        console.print("  • Converting document to markdown")

        progress(0.3, desc="📄 Converting to markdown...")

        progress(0.5, desc="🧠 Extracting data with LLM...")
        console.print("  • Extracting structured data")

        progress(0.7, desc="🔗 Building knowledge graph...")

        # Run pipeline
        context = run_pipeline(config)

        progress(0.85, desc="💾 Exporting results...")
        console.print("  • Exporting to CSV and HTML")

        # Get results
        graph = context.knowledge_graph
        models = context.extracted_models

        # Export results manually
        from docling_graph.core import CSVExporter, JSONExporter, InteractiveVisualizer
        from pathlib import Path

        # Export nodes and edges as CSV
        csv_exporter = CSVExporter()
        csv_output_path = output_subdir / f"graph_{timestamp}"
        csv_exporter.export(graph=graph, output_path=csv_output_path)

        # Export as JSON
        json_exporter = JSONExporter()
        json_output_path = output_subdir / f"graph_{timestamp}.json"
        json_exporter.export(graph=graph, output_path=json_output_path)

        # Generate HTML visualization
        visualizer = InteractiveVisualizer()
        html_output_path = output_subdir / f"graph_{timestamp}.html"
        visualizer.save_cytoscape_graph(graph=graph, output_path=html_output_path)

        progress(0.95, desc="💾 Generating outputs...")
        console.print("  • Saving results")

        # Save results with timestamp
        timestamp_str = timestamp

        # Save report in the style of 02_quickstart_llm_pdf.py
        report_path = output_subdir / f"report_{timestamp_str}.md"
        with open(report_path, "w") as f:
            f.write(f"# Document Processing Report\n\n")
            f.write(f"## Configuration\n\n")
            f.write(f"- **Source:** {file_path}\n")
            f.write(f"- **Template:** {template_name}\n")
            f.write(f"- **Backend:** {backend.upper()} ({'Large Language Model' if backend == 'llm' else 'Vision Language Model'})\n")
            f.write(f"- **Provider:** {provider}\n")
            f.write(f"- **Model:** {model}\n")
            f.write(f"- **Mode:** {processing_mode}\n")
            f.write(f"- **Chunking:** {'Enabled' if use_chunking else 'Disabled'}\n\n")

            f.write(f"## Results\n\n")
            f.write(f"**Extracted:** {graph.number_of_nodes()} nodes and {graph.number_of_edges()} edges\n\n")

            f.write(f"## What Happened\n\n")
            f.write(f"- Document converted to markdown using Docling\n")
            if use_chunking:
                f.write(f"- Document split into chunks respecting context limits\n")
                f.write(f"- Each chunk processed by {provider} {backend.upper()}\n")
                f.write(f"- Results merged programmatically\n")
            else:
                f.write(f"- Document processed by {provider} {backend.upper()}\n")
            f.write(f"- Knowledge graph built from extracted entities\n\n")

            f.write(f"## Output Files\n\n")
            f.write(f"- **nodes.csv:** Extracted entities\n")
            f.write(f"- **edges.csv:** Relationships between entities\n")
            f.write(f"- **graph.html:** Interactive knowledge graph visualization\n")
            f.write(f"- **document.md:** Markdown version of the document\n")
            f.write(f"- **report.md:** This extraction report\n")

        # Find generated files (CSV files are in subdirectory)
        graph_html = list(output_subdir.glob("*.html"))
        nodes_csv = list(output_subdir.glob("**/nodes.csv"))
        edges_csv = list(output_subdir.glob("**/edges.csv"))

        # Return None instead of empty string if files don't exist (Gradio handles None properly)
        graph_html_path = str(graph_html[0]) if graph_html else None
        nodes_csv_path = str(nodes_csv[0]) if nodes_csv else None
        edges_csv_path = str(edges_csv[0]) if edges_csv else None

        progress(1.0, desc="✅ Complete!")

        status = f"""## ✅ Success!

**Extracted:** {graph.number_of_nodes()} nodes and {graph.number_of_edges()} edges

### 📋 Configuration
- **Source:** {file_path}
- **Template:** {template_name}
- **Backend:** {backend.upper()} ({'Large Language Model' if backend == 'llm' else 'Vision Language Model'})
- **Provider:** {provider}
- **Model:** {model}
- **Mode:** {processing_mode}

### 💡 What Happened
- Document converted to markdown using Docling
{f'- Document split into chunks respecting context limits' if use_chunking else ''}
{f'- Each chunk processed by {provider} {backend.upper()}' if use_chunking else f'- Document processed by {provider} {backend.upper()}'}
{f'- Results merged programmatically' if use_chunking else ''}
- Knowledge graph built from extracted entities

### 📁 Output Files
**Directory:** `{output_subdir.name}`

- **report_{timestamp_str}.md** - Extraction report and statistics
- **{Path(graph_html_path).name if graph_html_path else 'graph.html'}** - Interactive visualization
- **{Path(nodes_csv_path).name if nodes_csv_path else 'nodes.csv'}** - Extracted entities
- **{Path(edges_csv_path).name if edges_csv_path else 'edges.csv'}** - Relationships
"""

        return status, graph_html_path, nodes_csv_path, edges_csv_path

    except Exception as e:
        error_msg = f"""❌ Error Processing Document

**Error:** {str(e)}

**Traceback:**

{traceback.format_exc()}


**Troubleshooting:**
- Ensure Ollama is running: `ollama serve`
- Check if model is available: `ollama list`
- Verify input file exists in ./input directory
- Check API keys if using remote providers
- For large documents, processing may take 30-60 minutes
- Check logs: `tail -f logs/docling-graph-app.log`
"""
        return error_msg, None, None, None


def batch_process_documents(
    backend: str,
    processing_mode: str,
    use_chunking: bool,
    provider: str,
    model: str,
    template_key: str,
    progress=gr.Progress()
) -> str:
    """
    Process all documents in the input directory.

    Returns:
        Status message with results
    """
    try:
        files = list_input_files()
        if not files:
            return "❌ No files found in input directory"

        results = []
        total_files = len(files)

        for idx, file_name in enumerate(files):
            progress((idx + 1) / total_files, desc=f"Processing {file_name}...")

            status, _, _, _ = process_document(
                file_name,
                backend,
                processing_mode,
                use_chunking,
                provider,
                model,
                template_key,
                progress=gr.Progress()
            )

            results.append(f"### {file_name}\n{status}\n")

        summary = f"""# Batch Processing Complete

**Total Files:** {total_files}
**Output Directory:** {OUTPUT_DIR}

---

{"".join(results)}
"""
        return summary

    except Exception as e:
        return f"❌ Batch Processing Error: {str(e)}\n\n{traceback.format_exc()}"


# Create Gradio Interface
with gr.Blocks(title="Docling-Graph Showcase") as app:
    # Check Ollama status at startup
    ollama_running, ollama_status = check_ollama_status()
    available_models = get_ollama_models() if ollama_running else [OLLAMA_MODEL]

    # Load available templates
    available_templates = get_available_templates()
    template_choices = {info["name"]: key for key, info in available_templates.items()}
    template_descriptions = {info["name"]: info["description"] for key, info in available_templates.items()}

    gr.Markdown(f"""
    # 🔍 Docling-Graph Showcase

    Transform documents into validated knowledge graphs using docling-graph with local or remote LLMs.

    **Status:** {ollama_status}

    **Features:**
    - 📄 Individual or batch document processing
    - 🧠 Local LLM inference with Ollama or remote providers
    - 📊 Interactive graph visualization
    - 💾 CSV export for nodes and edges
    - 📋 Multiple domain-specific templates

    ---
    """)

    with gr.Tabs():
        # Individual Processing Tab
        with gr.Tab("📄 Individual Processing"):
            gr.Markdown("### Process a single document")

            with gr.Row():
                with gr.Column(scale=1):
                    file_dropdown = gr.Dropdown(
                        choices=list_input_files(),
                        label="Select Document",
                        info="Files from ./input directory"
                    )
                    refresh_btn = gr.Button("🔄 Refresh File List", size="sm")

                    # Template selection
                    template_dropdown = gr.Dropdown(
                        choices=list(template_choices.keys()),
                        value=list(template_choices.keys())[0] if template_choices else None,
                        label="📋 Extraction Template",
                        info="Choose a domain-specific template for structured extraction"
                    )

                    template_info = gr.Markdown(
                        value=f"**Description:** {list(template_descriptions.values())[0] if template_descriptions else 'No templates available'}",
                        visible=True
                    )

                    backend_radio = gr.Radio(
                        choices=["llm", "vlm"],
                        value="llm",
                        label="Extraction Backend",
                        info="LLM for text, VLM for images"
                    )

                    mode_radio = gr.Radio(
                        choices=["one-to-one", "many-to-one"],
                        value="many-to-one",
                        label="Processing Mode",
                        info="one-to-one: separate outputs per page, many-to-one: merged output"
                    )

                    chunking_check = gr.Checkbox(
                        value=True,
                        label="Use Chunking",
                        info="Split large documents for LLM context limits"
                    )

                    provider_dropdown = gr.Dropdown(
                        choices=["ollama", "watsonx", "mistral", "openai", "gemini"],
                        value="ollama",
                        label="Provider",
                        info="LLM provider (ollama for local, watsonx for IBM watsonx)"
                    )

                    # Dynamic model selection based on provider
                    model_dropdown = gr.Dropdown(
                        choices=available_models,
                        value=available_models[0] if available_models else OLLAMA_MODEL,
                        label="Ollama Model",
                        info="Select from available Ollama models",
                        visible=True,
                        allow_custom_value=True
                    )

                    model_text = gr.Textbox(
                        value="",
                        label="Model Name (for non-Ollama providers)",
                        info="e.g., gpt-4, mistral-large, gemini-pro",
                        visible=False
                    )

                    refresh_models_btn = gr.Button("🔄 Refresh Ollama Models", size="sm")

                    # API Key fields for remote providers
                    with gr.Accordion("🔑 API Configuration (for remote providers)", open=False):
                        api_key_text = gr.Textbox(
                            value="",
                            label="API Key",
                            type="password",
                            info="Required for watsonx, OpenAI, Mistral, or Gemini. Leave empty to use .env values.",
                            placeholder="Enter API key or leave empty to use .env"
                        )
                        api_base_text = gr.Textbox(
                            value="",
                            label="API Base URL (optional)",
                            info="Custom API endpoint if needed. Leave empty to use defaults.",
                            placeholder="Optional: Custom API endpoint"
                        )

                    process_btn = gr.Button("🚀 Process Document", variant="primary")

                with gr.Column(scale=2):
                    status_output = gr.Markdown(label="Status")

                    with gr.Accordion("📊 Outputs", open=False):
                        graph_file = gr.File(label="Graph HTML")
                        nodes_file = gr.File(label="Nodes CSV")
                        edges_file = gr.File(label="Edges CSV")

            # Function to handle provider change
            def update_model_inputs(provider):
                """Update model input fields based on selected provider."""
                if provider == "ollama":
                    models = get_ollama_models()
                    return (
                        gr.Dropdown(visible=True, choices=models, value=models[0] if models else OLLAMA_MODEL),
                        gr.Textbox(visible=False)
                    )
                else:
                    # For remote providers, show text input for model name
                    default_models = {
                        "watsonx": "ibm/granite-13b-chat-v2",
                        "openai": "gpt-4",
                        "mistral": "mistral-large-latest",
                        "gemini": "gemini-pro"
                    }
                    return (
                        gr.Dropdown(visible=False),
                        gr.Textbox(visible=True, value=default_models.get(provider, ""))
                    )

            def refresh_ollama_models():
                """Refresh the list of available Ollama models."""
                models = get_ollama_models()
                return gr.Dropdown(choices=models, value=models[0] if models else OLLAMA_MODEL)

            def get_model_value(provider, model_dropdown_value, model_text_value):
                """Get the appropriate model value based on provider."""
                return model_dropdown_value if provider == "ollama" else model_text_value

            # Wire up individual processing
            refresh_btn.click(
                fn=lambda: gr.Dropdown(choices=list_input_files()),
                outputs=file_dropdown
            )

            provider_dropdown.change(
                fn=update_model_inputs,
                inputs=[provider_dropdown],
                outputs=[model_dropdown, model_text]
            )

            refresh_models_btn.click(
                fn=refresh_ollama_models,
                outputs=model_dropdown
            )

            # Function to update template description
            def update_template_info(template_name):
                """Update template description when selection changes."""
                if template_name and template_name in template_descriptions:
                    return f"**Description:** {template_descriptions[template_name]}"
                return "**Description:** No description available"

            template_dropdown.change(
                fn=update_template_info,
                inputs=[template_dropdown],
                outputs=[template_info]
            )

            # Modified process function to handle both model inputs and template
            def process_with_model_selection(file_path, template_name, backend, mode, chunking, provider,
                                            model_dropdown_val, model_text_val, progress=gr.Progress()):
                model = model_dropdown_val if provider == "ollama" else model_text_val
                template_key = template_choices.get(template_name, list(template_choices.values())[0])
                return process_document(file_path, backend, mode, chunking, provider, model, template_key, progress)

            process_btn.click(
                fn=process_with_model_selection,
                inputs=[
                    file_dropdown,
                    template_dropdown,
                    backend_radio,
                    mode_radio,
                    chunking_check,
                    provider_dropdown,
                    model_dropdown,
                    model_text
                ],
                outputs=[status_output, graph_file, nodes_file, edges_file]
            )

        # Batch Processing Tab
        with gr.Tab("📚 Batch Processing"):
            gr.Markdown("### Process all documents in the input directory")

            with gr.Row():
                with gr.Column(scale=1):
                    # Template selection for batch
                    batch_template_dropdown = gr.Dropdown(
                        choices=list(template_choices.keys()),
                        value=list(template_choices.keys())[0] if template_choices else None,
                        label="📋 Extraction Template",
                        info="Choose a domain-specific template for structured extraction"
                    )

                    batch_template_info = gr.Markdown(
                        value=f"**Description:** {list(template_descriptions.values())[0] if template_descriptions else 'No templates available'}",
                        visible=True
                    )

                    batch_backend = gr.Radio(
                        choices=["llm", "vlm"],
                        value="llm",
                        label="Extraction Backend"
                    )

                    batch_mode = gr.Radio(
                        choices=["one-to-one", "many-to-one"],
                        value="many-to-one",
                        label="Processing Mode"
                    )

                    batch_chunking = gr.Checkbox(
                        value=True,
                        label="Use Chunking"
                    )

                    batch_provider = gr.Dropdown(
                        choices=["ollama", "watsonx", "mistral", "openai", "gemini"],
                        value="ollama",
                        label="Provider"
                    )

                    # Dynamic model selection for batch processing
                    batch_model_dropdown = gr.Dropdown(
                        choices=available_models,
                        value=available_models[0] if available_models else OLLAMA_MODEL,
                        label="Ollama Model",
                        info="Select from available Ollama models",
                        visible=True,
                        allow_custom_value=True
                    )

                    batch_model_text = gr.Textbox(
                        value="",
                        label="Model Name (for non-Ollama providers)",
                        info="e.g., gpt-4, mistral-large, gemini-pro",
                        visible=False
                    )

                    batch_refresh_models_btn = gr.Button("🔄 Refresh Ollama Models", size="sm")

                    with gr.Accordion("🔑 API Configuration (for remote providers)", open=False):
                        batch_api_key_text = gr.Textbox(
                            value="",
                            label="API Key",
                            type="password",
                            info="Required for watsonx, OpenAI, Mistral, or Gemini. Leave empty to use .env values.",
                            placeholder="Enter API key or leave empty to use .env"
                        )
                        batch_api_base_text = gr.Textbox(
                            value="",
                            label="API Base URL (optional)",
                            info="Custom API endpoint if needed. Leave empty to use defaults.",
                            placeholder="Optional: Custom API endpoint"
                        )

                    batch_btn = gr.Button("🚀 Process All Documents", variant="primary")

                with gr.Column(scale=2):
                    batch_status = gr.Markdown(label="Batch Status")

            # Wire up batch processing provider change
            batch_provider.change(
                fn=update_model_inputs,
                inputs=[batch_provider],
                outputs=[batch_model_dropdown, batch_model_text]
            )

            batch_refresh_models_btn.click(
                fn=refresh_ollama_models,
                outputs=batch_model_dropdown
            )

            # Function to update batch template description
            batch_template_dropdown.change(
                fn=update_template_info,
                inputs=[batch_template_dropdown],
                outputs=[batch_template_info]
            )

            # Modified batch process function
            def batch_process_with_model_selection(template_name, backend, mode, chunking, provider,
                                                   model_dropdown_val, model_text_val, progress=gr.Progress()):
                model = model_dropdown_val if provider == "ollama" else model_text_val
                template_key = template_choices.get(template_name, list(template_choices.values())[0])
                return batch_process_documents(backend, mode, chunking, provider, model, template_key, progress)

            # Wire up batch processing
            batch_btn.click(
                fn=batch_process_with_model_selection,
                inputs=[
                    batch_template_dropdown,
                    batch_backend,
                    batch_mode,
                    batch_chunking,
                    batch_provider,
                    batch_model_dropdown,
                    batch_model_text
                ],
                outputs=batch_status
            )

        # Help Tab
        with gr.Tab("ℹ️ Help"):
            gr.Markdown("""
            ## Getting Started

            ### 1. Setup Ollama (for local inference)
            ```

bash
            # Install Ollama
            curl -fsSL https://ollama.com/install.sh | sh

            # Start Ollama service
            ollama serve

            # Pull a model (examples)
            ollama pull granite4
            ollama pull llama3
            ollama pull mistral


            ```

            ### 2. Add Documents
            Place your documents (PDF, images, markdown, etc.) in the `./input` directory.

            ### 3. Select Provider & Model
            - **Ollama (Local):** Select from your installed models using the dropdown
            - **Remote Providers:** Choose OpenAI, Mistral, or Gemini and enter your API key

            ### 4. Process Documents
            - **Individual:** Select a file and click "Process Document"
            - **Batch:** Click "Process All Documents" to process everything

            ### 5. View Results
            Results are saved in `./output` with timestamps:
            - `report_TIMESTAMP.md` - Processing summary
            - `graph_TIMESTAMP.html` - Interactive visualization
            - `nodes.csv` - Extracted entities
            - `edges.csv` - Relationships

            ## Configuration

            ### Templates
            Templates define the structure of data to extract from documents. Each template is a Pydantic model that:
            - Defines entities (nodes) and relationships (edges)
            - Provides field descriptions to guide the LLM
            - Validates extracted data
            - Generates a knowledge graph

            **Available Templates:**
            - Located in `./templates/` directory
            - Can be customized or extended
            - Support complex nested structures
            - Include validation and normalization

            **Creating Custom Templates:**
            1. Create a new `.py` file in `./templates/`
            2. Define Pydantic models with `graph_id_fields`
            3. Use the `edge()` helper for relationships
            4. Add field descriptions to guide extraction
            5. Restart the app to load new templates

            ### Backends
            - **LLM:** Text-based extraction (best for PDFs, documents)
            - **VLM:** Vision-based extraction (best for images, forms)

            ### Processing Modes
            - **one-to-one:** Each page becomes a separate output
            - **many-to-one:** All pages merged into single output

            ### Providers

            #### Ollama (Local - Recommended)
            - **Advantages:** Privacy, no API costs, works offline
            - **Models:** Any model you've pulled (granite4, llama3, mistral, etc.)
            - **Setup:** Just install Ollama and pull models
            - **Refresh:** Click "🔄 Refresh Ollama Models" to update the list

            #### Remote Providers
            - **watsonx:** IBM watsonx models (requires WO_INSTANCE and WO_API_KEY in .env)
            - **OpenAI:** GPT-4, GPT-3.5-turbo (requires API key)
            - **Mistral:** mistral-large-latest, mistral-medium (requires API key)
            - **Gemini:** gemini-pro (requires API key)

            ## Troubleshooting

            ### Ollama Connection Error
            ```

bash
            # Check if Ollama is running
            curl http://localhost:11434/api/tags

            # Restart Ollama
            ollama serve


            ```

            ### No Models Available
            ```

bash
            # List available models
            ollama list

            # Pull a model
            ollama pull granite4
            ollama pull llama3

            # Refresh the model list in the UI
            # Click "🔄 Refresh Ollama Models" button


            ```

            ### Remote Provider Errors
            - **watsonx:** Verify WO_INSTANCE and WO_API_KEY in .env file
            - **Other providers:** Verify your API key is correct
            - Check your API quota/credits
            - Ensure you have network connectivity

            ### Environment Configuration
            ```

bash
            # Copy the template and configure
            cp .env.template .env

            # Edit .env with your settings
            # For watsonx: Set WO_INSTANCE and WO_API_KEY
            # For other providers: Set respective API keys


            ```

            ### Out of Memory
            - Enable chunking (recommended for large documents)
            - Use a smaller model
            - Process fewer documents at once

            ## Documentation

            For detailed documentation, see the `./Docs` directory or visit:
            https://docling-project.github.io/docling-graph/
            """)

if __name__ == "__main__":
    console.print(
        Panel.fit(
            "[bold blue]Docling-Graph Showcase[/bold blue]\n"
            "[dim]Starting Gradio application...[/dim]",
            border_style="blue",
        )
    )

    # Try to find an available port starting from 7861
    import socket

    def find_free_port(start_port=7861, max_attempts=10):
        """Find a free port starting from start_port."""
        for port in range(start_port, start_port + max_attempts):
            try:
                with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as s:
                    s.bind(("", port))
                    return port
            except OSError:
                continue
        return start_port  # Fallback to original port

    port = find_free_port()
    console.print(f"[green]Starting on port {port}[/green]")

    app.launch(
        server_name="0.0.0.0",
        server_port=port,
        share=False,
        show_error=True
    )

# Made with Bob

True to my standard ‘production-first’ approach, the application comes fully containerized with a dedicated Dockerfile. To simplify the transition from local testing to cloud-scale environments, I’ve also included a comprehensive set of Kubernetes YAML manifests. This ensures that whether you are deploying to a private cluster or a public cloud provider, the infrastructure is defined as code and ready to scale.

Conclusion

In conclusion, the true power of Docling-Graph lies in its ability to move beyond the limitations of “fuzzy” text searching and approximate embeddings. By transforming unstructured documents into validated Pydantic objects, it enforces a strict data contract that ensures every extracted entity — be it a chemical compound in a lab report, a tax clause in a financial statement, or a dependency in a legal contract — is captured with clinical precision. This isn’t just data extraction; it is the automated creation of a semantic knowledge graph where the relationships between entities are as explicit and reliable as the data itself.

Through the collaborative efforts with Bob, this implementation provides more than just a code snippet; it delivers a complete, production-ready ecosystem. What has been built is a robust bridge between high-level document intelligence and practical deployment:

Universal Flexibility: A modular configuration that effortlessly toggles between local-first privacy (via Ollama) and high-performance cloud intelligence (IBM watsonx, OpenAI, Mistral, Gemini).
Architectural Integrity: A dual-path extraction pipeline that leverages both local VLM capabilities and LiteLLM routing, ensuring the system adapts to the complexity of the document at hand.
Operational Readiness: Beyond the logic, Bob has provided the “last mile” of software engineering — Gradio UIs for user interaction, Dockerfile and Kubernetes manifests for scaling, and timestamped automation for data lifecycle management.

Ultimately, what Bob has delivered is a starter kit that doesn’t just “test” Docling-Graph — it builds a foundation for mission-critical AI applications where accuracy is non-negotiable and the relationship is the message.

>>> Thanks for reading <<<

DEV Community