Docling Data Pipeline for AstraDB Serverless RAG

#astradb #vectordatabase #docling #rag

Adapting an existing AstraDB “Unstructured Serverless” sample to Docling (with IBM Bob)!

Introduction

While researching Astra DB RAG implementations recently, I came across a guide for integrating Unstructured Serverless with Astra DB. Although I was initially unfamiliar with the Unstructured.io ecosystem, diving into their documentation sparked a different idea. Given our access to Docling — an open-source alternative that requires no API keys and offers unlimited usage — I saw an opportunity to optimize. By leveraging Docling’s ability to integrate seamlessly into IBM Code Engine Serverless fleets, I developed a custom sample application modeled after the original Astra DB workflow. This article breaks down that implementation, showcasing a cost-effective, scalable approach to document processing for RAG.

TL;DR — Astra DB Introduction

Astra DB is a cloud-native, fully managed Database-as-a-Service (DBaaS) built on the powerful Apache Cassandra engine. It is designed to handle massive, globally distributed workloads with high availability and low latency, all while removing the operational complexity of managing a traditional database cluster. For modern developers, its standout feature is its serverless vector search capability, which allows for the storage and querying of high-dimensional embeddings. This makes it a premier choice for Generative AI and Retrieval-Augmented Generation (RAG) applications, as it seamlessly integrates real-time data with AI orchestration tools like LangChain and LlamaIndex.

Core Capabilities at a Glance:

Serverless Efficiency: Scales automatically to zero when not in use and expands elastically to meet spikes in traffic, ensuring you only pay for the resources you consume.
Vector-First Design: Optimized for AI-driven semantic search, enabling applications to “understand” and retrieve data based on context rather than just keywords.
Multi-Cloud Flexibility: Deploys across AWS, Google Cloud, Azure and IBM, providing a consistent API and infrastructure regardless of your cloud provider.
Developer-Friendly APIs: Beyond traditional CQL, it offers the Data API (a JSON-based interface), along with REST and GraphQL, making it accessible for frontend and backend developers alike.

Astra DB and HCD enhance the NoSQL database of IBM® watsonx.data® with vector capabilities, strengthening our retrieval-augmented generation and knowledge embedding capabilities. Built for elastic scalability and predictable performance, these solutions support mission-critical workloads with near-zero latency.
Astra DB is a Forrester Leader delivering NoSQL vector search capabilities on cloud and is built on Apache Cassandra®, providing the speed, reliability and multi-model support needed for modern AI workloads — including tabular, search and graph data. This enables complex, context-sensitive searches across diverse data formats for generative AI applications.

Steps of Implementation

I began with the reference implementation provided in the Astra DB documentation — a Python-based workflow centered on Unstructured.io — to serve as a point of comparison for the Docling approach.

The environmnet variables for Astra DB and Unstructured.io (sample using OPENAI)

UNSTRUCTURED_API_KEY=UNSTRUCTURED_API_KEY
UNSTRUCTURED_API_URL=https://api.unstructuredapp.io/general/v0/general
API_ENDPOINT=API_ENDPOINT
APPLICATION_TOKEN=APPLICATION_TOKEN
OPENAI_API_KEY=OPENAI_API_KEY

The sample code on the documentation site of Astra DB;

import os
import requests

from dotenv import load_dotenv
from langchain_astradb import AstraDBVectorStore
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough

from langchain_community.document_loaders import (
    unstructured,
    UnstructuredAPIFileLoader,
)

from langchain_openai import (
    ChatOpenAI,
    OpenAIEmbeddings,
)

load_dotenv()

# download pdf
url = "https://raw.githubusercontent.com/datastax/ragstack-ai/48bc55e7dc4de6a8b79fcebcedd242dc1254dd63/examples/notebooks/resources/attention_pages_9_10.pdf"
file_path = "./attention_pages_9_10.pdf"

response = requests.get(url)
if response.status_code == 200:
    with open(file_path, "wb") as file:
        file.write(response.content)
    print("Download complete.")
else:
    print("Error downloading the file.")

# simple parse
loader = UnstructuredAPIFileLoader(
    file_path="./attention_pages_9_10.pdf",
    api_key=os.getenv("UNSTRUCTURED_API_KEY"),
    url = os.getenv("UNSTRUCTURED_API_URL"),
)
simple_docs = loader.load()

print(len(simple_docs))
print(simple_docs[0].page_content[0:400])

# complex parse
elements = unstructured.get_elements_from_api(
    file_path="./attention_pages_9_10.pdf",
    api_key=os.getenv("UNSTRUCTURED_API_KEY"),
    api_url=os.getenv("UNSTRUCTURED_API_URL"),
    strategy="hi_res", # default "auto"
    pdf_infer_table_structure=True,
)

print(len(elements))
tables = [el for el in elements if el.category == "Table"]
print(tables[1].metadata.text_as_html)

# create vector store
astra_db_store = AstraDBVectorStore(
    collection_name="langchain_unstructured",
    embedding=OpenAIEmbeddings(),
    token=os.getenv("APPLICATION_TOKEN"),
    api_endpoint=os.getenv("API_ENDPOINT")
)

# load documents
documents = []
current_doc = None

for el in elements:
    if el.category in ["Header", "Footer"]:
        continue # skip these
    if el.category == "Title":
        if current_doc is not None:
            documents.append(current_doc)
        current_doc = None
    if not current_doc:
        current_doc = Document(page_content="", metadata=el.metadata.to_dict())
    current_doc.page_content += el.metadata.text_as_html if el.category == "Table" else el.text
    if el.category == "Table":
        if current_doc is not None:
            documents.append(current_doc)
        current_doc = None

astra_db_store.add_documents(documents)

# prompt and query
prompt = """
Answer the question based only on the supplied context. If you don't know the answer, say "I don't know".
Context: {context}
Question: {question}
Your answer:
"""

llm = ChatOpenAI(model="gpt-3.5-turbo-16k", streaming=False, temperature=0)

chain = (
    {"context": astra_db_store.as_retriever(), "question": RunnablePassthrough()}
    | PromptTemplate.from_template(prompt)
    | llm
    | StrOutputParser()
)

response_1 = chain.invoke("What does reducing the attention key size do?")
print("\n***********New Unstructured Basic Query Engine***********")
print(response_1)

response_2 = chain.invoke("For the transformer to English constituency results, what was the 'WSJ 23 F1' value for 'Dyer et al. (2016) (5]'?")
print("\n***********New Unstructured Basic Query Engine***********")
print(response_2)

response_3 = chain.invoke("When was George Washington born?")
print("\n***********New Unstructured Basic Query Engine***********")
print(response_3)

With the baseline established, the next step was migration. I worked with Bob to draft a Python implementation that runs directly in the terminal, utilizing Docling for document parsing and chunking.

"""
Console Application: PDF to AstraDB Vector Store using Docling
This application replaces Unstructured-io with Docling for PDF processing
"""

import os
from dotenv import load_dotenv
from pathlib import Path

from docling.document_converter import DocumentConverter
from langchain_astradb import AstraDBVectorStore
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Load environment variables
load_dotenv()

def process_pdf_with_docling(file_path: str) -> list:
    """
    Process PDF file using Docling and return structured documents

    Args:
        file_path: Path to the PDF file

    Returns:
        List of Document objects
    """
    print(f"\n📄 Processing PDF with Docling: {file_path}")

    # Initialize Docling converter
    converter = DocumentConverter()

    # Convert the PDF
    result = converter.convert(file_path)

    # Extract documents from Docling result
    documents = []
    current_doc = None

    # Process the document structure
    doc = result.document

    # Iterate through document elements
    for element, level in doc.iterate_items():
        element_type = element.label if hasattr(element, 'label') else 'text'

        # Skip headers and footers
        if element_type.lower() in ['header', 'footer']:
            continue

        # Start new document on titles
        if element_type.lower() == 'title':
            if current_doc is not None:
                documents.append(current_doc)
            current_doc = None

        # Initialize document if needed
        if not current_doc:
            metadata = {
                'source': file_path,
                'element_type': element_type,
            }
            current_doc = Document(page_content="", metadata=metadata)

        # Add content to current document
        if hasattr(element, 'text'):
            current_doc.page_content += element.text + "\n"

        # For tables, append and start new document
        if element_type.lower() == 'table':
            if current_doc is not None:
                documents.append(current_doc)
            current_doc = None

    # Add last document if exists
    if current_doc is not None:
        documents.append(current_doc)

    print(f"✅ Extracted {len(documents)} document chunks from PDF")
    return documents


def create_vector_store():
    """Create and return AstraDB vector store"""
    print("\n🗄️  Connecting to AstraDB Vector Store...")

    astra_db_store = AstraDBVectorStore(
        collection_name="langchain_docling",
        embedding=OpenAIEmbeddings(),
        token=os.getenv("APPLICATION_TOKEN"),
        api_endpoint=os.getenv("API_ENDPOINT")
    )

    print("✅ Connected to AstraDB")
    return astra_db_store


def query_vector_store(astra_db_store, question: str) -> str:
    """
    Query the vector store with a question

    Args:
        astra_db_store: AstraDB vector store instance
        question: Question to ask

    Returns:
        Answer string
    """
    prompt = """
Answer the question based only on the supplied context. If you don't know the answer, say "I don't know".
Context: {context}
Question: {question}
Your answer:
"""

    llm = ChatOpenAI(model="gpt-3.5-turbo-16k", streaming=False, temperature=0)

    chain = (
        {"context": astra_db_store.as_retriever(), "question": RunnablePassthrough()}
        | PromptTemplate.from_template(prompt)
        | llm
        | StrOutputParser()
    )

    return chain.invoke(question)


def main():
    """Main console application"""
    print("=" * 80)
    print("PDF to AstraDB Vector Store - Console Application (Docling)")
    print("=" * 80)

    # Check for input PDF
    input_dir = Path("input")
    pdf_files = list(input_dir.glob("*.pdf"))

    if not pdf_files:
        print("❌ No PDF files found in 'input' folder")
        return

    # Use first PDF file found
    pdf_file = pdf_files[0]
    print(f"\n📁 Using PDF file: {pdf_file}")

    # Process PDF with Docling
    documents = process_pdf_with_docling(str(pdf_file))

    if not documents:
        print("❌ No documents extracted from PDF")
        return

    # Display sample content
    print(f"\n📝 Sample content from first document:")
    print("-" * 80)
    print(documents[0].page_content[:400])
    print("-" * 80)

    # Create vector store
    astra_db_store = create_vector_store()

    # Add documents to vector store
    print(f"\n💾 Adding {len(documents)} documents to vector store...")
    astra_db_store.add_documents(documents)
    print("✅ Documents added successfully")

    # Query examples
    questions = [
        "What does reducing the attention key size do?",
        "For the transformer to English constituency results, what was the 'WSJ 23 F1' value for 'Dyer et al. (2016) (5]'?",
        "When was George Washington born?"
    ]

    print("\n" + "=" * 80)
    print("QUERY RESULTS")
    print("=" * 80)

    for i, question in enumerate(questions, 1):
        print(f"\n🔍 Question {i}: {question}")
        print("-" * 80)
        response = query_vector_store(astra_db_store, question)
        print(f"💡 Answer: {response}")
        print("-" * 80)

    print("\n✅ Console application completed successfully!")


if __name__ == "__main__":
    main()

# Made with Bob

Building on the console prototype, the next step was to transition to a cloud-native, serverless environment. I chose IBM Code Engine for its ability to manage serverless fleets, which required evolving the script into a full-scale GUI application. To ensure the app was truly ‘cloud-ready,’ I eliminated all hardcoded paths and developed a Streamlit interface for a dynamic user experience. To complete the deployment package, I’ve included the necessary Dockerfile and Kubernetes manifests, providing everything needed to containerize and launch the solution on any cloud platform.

IBM Code Engine Serverless Fleets is a strategic extension of the Code Engine platform designed to handle massive, compute-intensive workloads that require large-scale parallel execution. Unlike standard serverless functions meant for quick tasks, Fleets can orchestrate thousands of concurrent “run-to-completion” instances across virtual machines or specialized GPUs (like NVIDIA L40s). It automatically manages task queuing, provisions the necessary worker nodes to handle batch processing — such as parsing millions of documents with Docling or running complex AI simulations — and scales everything back to zero once the job is finished, ensuring you only pay for the active compute time.

Key Advantages for your Pipeline:

Massive Parallelism: Perfect for the “fleets” mode you mentioned, where you need to process large document batches simultaneously.
Hardware Flexibility: Allows you to toggle between high-performance CPUs and on-demand GPUs for heavy-duty AI tasks.
Single-Tenant Security: Tasks execute in isolated infrastructure within your VPC, providing enterprise-grade security for sensitive data.
Zero Infrastructure Management: It removes the need for SRE or DevOps teams to manually size or scale clusters, even for workloads with 100,000+ processors.

The requirements for the code;

# Core dependencies
python-dotenv>=1.0.0

# Docling for PDF processing
docling>=1.0.0

# LangChain and related packages
langchain>=0.1.0
langchain-core>=0.1.0
langchain-community>=0.0.20
langchain-astradb>=0.3.0
langchain-openai>=0.0.5

# AstraDB
astrapy>=2.0,<3.0

# OpenAI
openai>=1.0.0

# Streamlit for GUI
streamlit>=1.30.0

# Additional utilities
requests>=2.31.0
pathlib>=1.0.1

The Streamlit code;

"""
GUI Application: PDF to AstraDB Vector Store using Docling
Streamlit-based interface for PDF processing and querying
"""

import os
import streamlit as st
from dotenv import load_dotenv
from pathlib import Path
from datetime import datetime

from docling.document_converter import DocumentConverter
from langchain_astradb import AstraDBVectorStore
from langchain_core.documents import Document
from langchain_core.output_parsers import StrOutputParser
from langchain_core.prompts import PromptTemplate
from langchain_core.runnables import RunnablePassthrough
from langchain_openai import ChatOpenAI, OpenAIEmbeddings

# Load environment variables
load_dotenv()

# Page configuration
st.set_page_config(
    page_title="PDF to AstraDB - Docling",
    page_icon="📄",
    layout="wide"
)

# Initialize session state
if 'vector_store' not in st.session_state:
    st.session_state.vector_store = None
if 'documents_loaded' not in st.session_state:
    st.session_state.documents_loaded = False
if 'processing_log' not in st.session_state:
    st.session_state.processing_log = []


def log_message(message: str):
    """Add message to processing log"""
    timestamp = datetime.now().strftime("%H:%M:%S")
    st.session_state.processing_log.append(f"[{timestamp}] {message}")


def process_pdf_with_docling(file_path: str) -> list:
    """
    Process PDF file using Docling and return structured documents

    Args:
        file_path: Path to the PDF file

    Returns:
        List of Document objects
    """
    log_message(f"📄 Processing PDF with Docling: {file_path}")

    # Initialize Docling converter
    converter = DocumentConverter()

    # Convert the PDF
    result = converter.convert(file_path)

    # Extract documents from Docling result
    documents = []
    current_doc = None

    # Process the document structure
    doc = result.document

    # Iterate through document elements
    for element, level in doc.iterate_items():
        element_type = element.label if hasattr(element, 'label') else 'text'

        # Skip headers and footers
        if element_type.lower() in ['header', 'footer']:
            continue

        # Start new document on titles
        if element_type.lower() == 'title':
            if current_doc is not None:
                documents.append(current_doc)
            current_doc = None

        # Initialize document if needed
        if not current_doc:
            metadata = {
                'source': file_path,
                'element_type': element_type,
            }
            current_doc = Document(page_content="", metadata=metadata)

        # Add content to current document
        if hasattr(element, 'text'):
            current_doc.page_content += element.text + "\n"

        # For tables, append and start new document
        if element_type.lower() == 'table':
            if current_doc is not None:
                documents.append(current_doc)
            current_doc = None

    # Add last document if exists
    if current_doc is not None:
        documents.append(current_doc)

    log_message(f"✅ Extracted {len(documents)} document chunks from PDF")
    return documents


def create_vector_store():
    """Create and return AstraDB vector store"""
    log_message("🗄️  Connecting to AstraDB Vector Store...")

    astra_db_store = AstraDBVectorStore(
        collection_name="langchain_docling_gui",
        embedding=OpenAIEmbeddings(),
        token=os.getenv("APPLICATION_TOKEN"),
        api_endpoint=os.getenv("API_ENDPOINT")
    )

    log_message("✅ Connected to AstraDB")
    return astra_db_store


def query_vector_store(astra_db_store, question: str) -> str:
    """
    Query the vector store with a question

    Args:
        astra_db_store: AstraDB vector store instance
        question: Question to ask

    Returns:
        Answer string
    """
    prompt = """
Answer the question based only on the supplied context. If you don't know the answer, say "I don't know".
Context: {context}
Question: {question}
Your answer:
"""

    llm = ChatOpenAI(model="gpt-3.5-turbo-16k", streaming=False, temperature=0)

    chain = (
        {"context": astra_db_store.as_retriever(), "question": RunnablePassthrough()}
        | PromptTemplate.from_template(prompt)
        | llm
        | StrOutputParser()
    )

    return chain.invoke(question)


def main():
    """Main Streamlit application"""

    # Header
    st.title("📄 PDF to AstraDB Vector Store")
    st.subheader("Powered by Docling & Streamlit")

    # Sidebar
    with st.sidebar:
        st.header("⚙️ Configuration")

        # Check environment variables
        api_endpoint = os.getenv("API_ENDPOINT")
        app_token = os.getenv("APPLICATION_TOKEN")
        openai_key = os.getenv("OPENAI_API_KEY")

        if api_endpoint and app_token and openai_key:
            st.success("✅ Environment variables loaded")
        else:
            st.error("❌ Missing environment variables")
            st.stop()

        st.divider()

        # File selection
        st.header("📁 PDF Selection")
        input_dir = Path("input")
        pdf_files = list(input_dir.glob("*.pdf"))

        if not pdf_files:
            st.error("❌ No PDF files found in 'input' folder")
            st.stop()

        selected_pdf = st.selectbox(
            "Select PDF file:",
            options=[f.name for f in pdf_files],
            index=0
        )

        pdf_path = input_dir / selected_pdf

        st.divider()

        # Process button
        if st.button("🚀 Process PDF", type="primary", use_container_width=True):
            st.session_state.processing_log = []

            with st.spinner("Processing PDF..."):
                try:
                    # Process PDF
                    documents = process_pdf_with_docling(str(pdf_path))

                    if not documents:
                        st.error("❌ No documents extracted from PDF")
                        st.stop()

                    # Create vector store
                    st.session_state.vector_store = create_vector_store()

                    # Add documents
                    log_message(f"💾 Adding {len(documents)} documents to vector store...")
                    st.session_state.vector_store.add_documents(documents)
                    log_message("✅ Documents added successfully")

                    st.session_state.documents_loaded = True
                    st.success("✅ PDF processed successfully!")

                except Exception as e:
                    st.error(f"❌ Error: {str(e)}")
                    log_message(f"❌ Error: {str(e)}")

    # Main content area
    col1, col2 = st.columns([2, 1])

    with col1:
        st.header("💬 Query Interface")

        if st.session_state.documents_loaded:
            # Query input
            question = st.text_input(
                "Enter your question:",
                placeholder="What does reducing the attention key size do?"
            )

            if st.button("🔍 Search", type="primary"):
                if question:
                    with st.spinner("Searching..."):
                        try:
                            response = query_vector_store(
                                st.session_state.vector_store,
                                question
                            )

                            st.subheader("💡 Answer:")
                            st.info(response)

                            # Save to output
                            output_dir = Path("output")
                            output_dir.mkdir(exist_ok=True)
                            timestamp = datetime.now().strftime("%Y%m%d_%H%M%S")
                            output_file = output_dir / f"query_result_{timestamp}.txt"

                            with open(output_file, "w") as f:
                                f.write(f"Question: {question}\n\n")
                                f.write(f"Answer: {response}\n")

                            st.success(f"✅ Result saved to: {output_file}")

                        except Exception as e:
                            st.error(f"❌ Error: {str(e)}")
                else:
                    st.warning("⚠️ Please enter a question")

            # Example questions
            st.divider()
            st.subheader("📝 Example Questions")

            example_questions = [
                "What does reducing the attention key size do?",
                "For the transformer to English constituency results, what was the 'WSJ 23 F1' value for 'Dyer et al. (2016) (5]'?",
                "When was George Washington born?"
            ]

            for i, eq in enumerate(example_questions, 1):
                if st.button(f"Example {i}: {eq}", key=f"example_{i}"):
                    st.session_state.example_question = eq
                    st.rerun()

            # Use example question if set
            if hasattr(st.session_state, 'example_question'):
                question = st.session_state.example_question
                delattr(st.session_state, 'example_question')
        else:
            st.info("👈 Please process a PDF file first using the sidebar")

    with col2:
        st.header("📋 Processing Log")

        if st.session_state.processing_log:
            log_container = st.container(height=400)
            with log_container:
                for log_entry in st.session_state.processing_log:
                    st.text(log_entry)
        else:
            st.info("No processing activity yet")

    # Footer
    st.divider()
    st.caption("Built with Docling, Streamlit, and AstraDB")


if __name__ == "__main__":
    main()

# Made with Bob

The required Dockerfile;

FROM python:3.11-slim

# Set working directory
WORKDIR /app

# Set environment variables
ENV PYTHONUNBUFFERED=1 \
    PYTHONDONTWRITEBYTECODE=1 \
    PORT=8080

# Install system dependencies
RUN apt-get update && apt-get install -y \
    build-essential \
    curl \
    git \
    && rm -rf /var/lib/apt/lists/*

# Copy requirements first for better caching
COPY requirements.txt .

# Install Python dependencies
RUN pip install --no-cache-dir --upgrade pip && \
    pip install --no-cache-dir -r requirements.txt

# Copy application files
COPY app_gui_docling.py .
COPY .env.example .env

# Create necessary directories
RUN mkdir -p input output logs

# Expose port (Code Engine uses PORT env variable)
EXPOSE 8080

# Health check
HEALTHCHECK --interval=30s --timeout=10s --start-period=40s --retries=3 \
    CMD curl -f http://localhost:8080/_stcore/health || exit 1

# Run Streamlit application
CMD streamlit run app_gui_docling.py \
    --server.port=${PORT} \
    --server.address=0.0.0.0 \
    --server.headless=true \
    --browser.gatherUsageStats=false \
    --server.enableCORS=false \
    --server.enableXsrfProtection=true

All the project files are available on the GitHub repository.

Bonus Part-My 1st Implementation of Skills and Rules using Bob!

As is my standard practice, I initially developed a comprehensive prompt to guide Bob through the implementation. However, to shift toward a more industrialized development workflow, I’ve transitioned from single-use instructions to a library of universal rules and ‘skills.’ This modular approach eliminates repetitive setup and ensures consistent quality across projects. Below, I’ve shared the specific skill set I provided to Bob for this build — a framework I intend to expand upon to further accelerate my development velocity and standardize future implementations.

## Project Development Rule

When working with this projects:

1. **Reference Guide**: Always check the links provided here to access the source code repositories for implementation:
   - Docling: https://github.com/docling-project/docling
   - [unstructured-python-client](https://github.com/Unstructured-IO/unstructured-python-client): [unstructured-python-client](https://github.com/Unstructured-IO/unstructured-python-client)
   - Astra DB Python implementation: https://docs.datastax.com/en/astra-db-serverless/api-reference/dataapiclient.html
2. **Project Structure**: Follow these conventions:
   - The documents of the project should be created in "Docs" folder except readme.md
   - Always provide a Mermaid flow architecture for the project
   - All the BASH scripts if needed, should be written in "scripts" folder
   - All the input documents are to be found in "input" folder
   - All the output documents which are asked to be provided should be writen in timestamped format in "output" folder
   - The result documents should be written in "output" folder, if the "output" folder does not exist, it should be created
   - Always provide README.md with architecture + workflow diagrams as described
   - Always provide a ".gitignore" file which filters/ignores any ".env" files or any folders whichs' names start with "_" (underscore) to be pushed to GitHub (e.g.: _sources/, _images/, _docs/... )
3. **Key Patterns**:
   - Always test the functionnality of the code you provide 
   - When you make updates/enhancements and/or correct the bugs, update the existing documents and scripts, don't create new ones

Conclusion

The integration of Docling with Astra DB Serverless represents a significant shift toward more efficient and cost-effective RAG architectures. By replacing third-party proprietary services with an open-source parsing engine, this implementation provides a high-performance pipeline that extracts complex structures — such as nested tables and diverse document formats — and seamlessly transforms them into OpenAI embeddings for vector storage. The resulting system is not only robust but also highly versatile, offering both a streamlined console interface for batch processing and a dynamic Streamlit GUI for interactive, real-time document querying.

Ultimately, moving this solution to IBM Code Engine transitions the project from a local prototype to a production-ready, industrialized fleet. This serverless deployment model ensures that the application scales elastically with demand, offering “scale-to-zero” efficiency that minimizes costs while maintaining high availability. By utilizing the universal “skills” and modular rules established during development, you now have a repeatable framework for deploying sophisticated AI-driven document intelligence tools that are powerful, scalable, and entirely under your operational control.

Disclaimer: This is a v0.1 release focused on the core concept. More robust enhancements and industrialization are in the pipeline. Expect frequent updates as the project evolves.

>>> Thanks for reading <<<