Building an Intelligent Document Analysis Bot with RAG: A Deep Dive into LLMWare and Streamlit

Aman Singh — Thu, 02 Oct 2025 20:24:55 +0000

Project Overview

This project implements a Retrieval-Augmented Generation (RAG) Chat Application that enables users to upload documents and query their content using natural language. The system processes various document types and provides intelligent, context-aware responses based on the uploaded content.

The application pairs an embedding model for semantic retrieval with a language model that answers from the retrieved passages and attributes each answer to its source documents.

Applications

Financial Document Analysis

Invoice processing and analysis
Contract review and term extraction
Financial statement summarization

Research and Academic Use

Literature review and analysis
Document summarization
Knowledge extraction from research papers

Business Intelligence

Customer support documentation analysis
Compliance and regulatory document review
Enterprise content management

Technology Stack

Core Framework

Streamlit: Web application framework for Python
LLMWare: Enterprise-grade RAG framework for document processing
Milvus: High-performance vector database for similarity search

AI Models

industry-bert-contracts: Embedding model from LLMWare's industry-bert family, tuned for contract-style business text
bling-phi-3-gguf: Compact, GGUF-packaged language model from LLMWare's BLING series, used to generate responses

Dependencies

streamlit>=1.37  # Web UI framework
llmware          # RAG and document processing

Architecture

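The application follows a modular architecture that separates three concerns: the Streamlit interface (document loader and query view), a RAGEngine class that wraps the LLMWare library, embedding, and query calls, and Milvus as the vector store.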

Technical Implementation

Document Processing Pipeline

The application processes documents through several stages:

from llmware.library import Library

# Step 1: Create a library, then parse and index the documents in a folder
library = Library().create_new_library("MyDocs")
library.add_files("/path/to/documents")

# Step 2: Generate vector embeddings for semantic search
library.install_new_embedding(
    embedding_model_name="industry-bert-contracts",
    vector_db="milvus"
)

The processing involves:

Document parsing and chunking into manageable segments
Conversion of text chunks into high-dimensional vector embeddings
Storage of vectors in Milvus for efficient similarity search
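
To make the chunking step concrete, here is a minimal, library-independent sketch that splits text into overlapping word windows. LLMWare performs its own parsing and chunking when files are added to a library, so this is not its implementation: the function name chunk_words and the size and overlap parameters are hypothetical and chosen only for illustration.

```python
from typing import List


def chunk_words(text: str, size: int = 50, overlap: int = 10) -> List[str]:
    """Split text into windows of `size` words that share `overlap` words."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    words = text.split()
    step = size - overlap
    chunks = []
    for start in range(0, len(words), step):
        chunks.append(" ".join(words[start:start + size]))
        if start + size >= len(words):
            break  # this window already reached the end of the text
    return chunks


# Twelve placeholder words, split into windows of 5 with 2 words of overlap.
sample = " ".join(f"w{i}" for i in range(12))
chunks = chunk_words(sample, size=5, overlap=2)
for chunk in chunks:
    print(chunk)
```

The overlap means a sentence that straddles a chunk boundary still appears whole in at least one chunk, at the cost of storing some words twice.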

Query Processing Engine

When processing user queries, the system:

from llmware.prompts import Prompt
from llmware.retrieval import Query

# Step 1: Find the five most relevant document chunks
query_results = Query(library).semantic_query(question, result_count=5)

# Step 2: Attach the retrieved chunks as context and generate a response
prompter = Prompt().load_model("bling-phi-3-gguf")
sources = prompter.add_source_query_results(query_results)
response = prompter.prompt_with_source(question)

The query processing involves:

Semantic Search: Converting questions to vectors and comparing against document chunks
Context Retrieval: Retrieving the most relevant chunks based on similarity
Response Generation: Using the LLM to generate answers with retrieved context
Source Attribution: Providing references to source documents
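
The semantic search and context retrieval steps come down to ranking chunk vectors by their similarity to the question vector. The sketch below shows that idea with cosine similarity and a brute-force scan over toy three-dimensional vectors. It is an illustration under simplifying assumptions, not how LLMWare or Milvus compute results: the helper names cosine and top_k are hypothetical, and a vector database would normally use an index instead of scanning every vector.

```python
import math
from typing import List, Sequence, Tuple


def cosine(a: Sequence[float], b: Sequence[float]) -> float:
    """Cosine similarity; returns 0.0 if either vector has zero length."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_k(
    query_vec: Sequence[float],
    chunk_vecs: Sequence[Sequence[float]],
    k: int = 2,
) -> List[Tuple[int, float]]:
    """Return (chunk_index, score) pairs for the k most similar chunks."""
    scored = [(i, cosine(query_vec, v)) for i, v in enumerate(chunk_vecs)]
    return sorted(scored, key=lambda pair: pair[1], reverse=True)[:k]


# Toy "embeddings"; real embedding models produce hundreds of dimensions.
chunk_vecs = [
    [1.0, 0.0, 0.0],  # chunk 0: same direction as the query
    [0.8, 0.6, 0.0],  # chunk 1: partly aligned with the query
    [0.0, 0.0, 1.0],  # chunk 2: orthogonal to the query
]
hits = top_k([1.0, 0.0, 0.0], chunk_vecs, k=2)
print(hits)
```

Chunk 0 ranks first and chunk 1 second, while the orthogonal chunk 2 is dropped. Those top-ranked chunks are what would be passed to the model as context.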

User Interface Components

The application provides multiple interfaces for document management and querying:

def render_document_loader(rag_engine: RAGEngine) -> None:
    source_choice = st.radio(
        "Document source",
        ("Use LLMWare sample invoices", "Use local folder", "Upload files"),
        index=0,
    )

Key features include:

Multiple document input methods (file upload, local folders, sample data)
Support for various file types (PDF, TXT, DOCX)
Real-time processing feedback
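
For the file-upload path, one plausible approach is to write the uploaded files into a staging folder and then ingest that folder. The sketch below assumes each upload exposes a name attribute and a getvalue() method returning bytes, as Streamlit's uploaded-file objects do. The helper stage_uploads, the ALLOWED_SUFFIXES set, and the FakeUpload stand-in are hypothetical and are not taken from the project's code.

```python
import tempfile
from pathlib import Path
from typing import Iterable, List

ALLOWED_SUFFIXES = {".pdf", ".txt", ".docx"}  # the file types listed above


def stage_uploads(uploads: Iterable, dest_dir: str) -> List[Path]:
    """Write uploads with an allowed extension into dest_dir; return their paths."""
    dest = Path(dest_dir)
    dest.mkdir(parents=True, exist_ok=True)
    saved = []
    for item in uploads:
        name = Path(item.name).name  # drop any directory components
        if Path(name).suffix.lower() not in ALLOWED_SUFFIXES:
            continue  # skip unsupported file types
        target = dest / name
        target.write_bytes(item.getvalue())
        saved.append(target)
    return saved


class FakeUpload:
    """Stand-in for an uploaded file so the sketch runs without Streamlit."""

    def __init__(self, name: str, data: bytes) -> None:
        self.name = name
        self._data = data

    def getvalue(self) -> bytes:
        return self._data


staging = tempfile.mkdtemp()
saved = stage_uploads(
    [FakeUpload("invoice.PDF", b"%PDF"), FakeUpload("notes.exe", b"x")],
    staging,
)
print([p.name for p in saved])
```

The staging folder can then be handed to library.add_files(...) in the same way as a local document folder.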

def render_query_interface(rag_engine: RAGEngine) -> None:
    prompt_text = st.text_area(
        "Ask a question about your documents",
        placeholder="Example: What is the total amount of the invoice?",
    )

Query interface features:

Natural language query processing
Source attribution for responses
Contextual answers based on document content
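
Source attribution and contextual answers both depend on keeping each retrieved passage tied to the file it came from. The following sketch shows one way to turn retrieved chunks into a grounded prompt plus a de-duplicated source list. The function build_prompt, the prompt wording, the dictionary keys (text, file_source, page_num), and the sample records are all assumptions made for illustration. In the application itself, LLMWare's Prompt class handles this step.

```python
from typing import Dict, List, Tuple


def build_prompt(question: str, chunks: List[Dict]) -> Tuple[str, List[str]]:
    """Return a context-grounded prompt and a de-duplicated list of sources."""
    context_lines = []
    sources: List[str] = []
    for n, chunk in enumerate(chunks, start=1):
        context_lines.append(f"[{n}] {chunk['text']}")
        label = f"{chunk['file_source']} (page {chunk['page_num']})"
        if label not in sources:
            sources.append(label)
    prompt = (
        "Answer using only the numbered passages below.\n"
        + "\n".join(context_lines)
        + f"\nQuestion: {question}\nAnswer:"
    )
    return prompt, sources


# Made-up sample records standing in for retrieved chunks.
retrieved = [
    {"text": "Total amount due: $1,250.00", "file_source": "invoice_01.pdf", "page_num": 1},
    {"text": "Payment terms: net 30 days", "file_source": "invoice_01.pdf", "page_num": 1},
]
prompt, sources = build_prompt("What is the total amount of the invoice?", retrieved)
print(prompt)
print(sources)
```

Because both passages come from the same page of the same file, the source list contains a single entry, which is what a user would see next to the answer.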

Key Technical Innovations

Hybrid Document Processing

The system supports multiple document input methods:

File upload through web interface
Local folder integration
Pre-loaded sample datasets

Intelligent Embedding Strategy

def generate_embeddings(self) -> bool:
    """Embed the library's chunks and store the vectors in Milvus."""
    self.library.install_new_embedding(
        embedding_model_name="industry-bert-contracts",
        vector_db="milvus"
    )
    return True

This approach provides:

Domain-specific processing using business document-trained models
Scalable vector operations through Milvus
Enhanced semantic understanding of business documents

Context-Aware Response Generation

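def query_documents(self, question: str, model_name: str = "bling-phi-3-gguf") -> Dict:
    # Retrieve the five chunks most similar to the question
    query_results = Query(self.library).semantic_query(question, result_count=5)
    # Temperature 0.0 with sampling off keeps generation deterministic
    prompter = Prompt().load_model(model_name, temperature=0.0, sample=False)
    sources = prompter.add_source_query_results(query_results)
    response = prompter.prompt_with_source(question)
    # The excerpt omits the return statement; this dictionary shape is an assumption
    return {"response": response, "sources": sources}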

Getting Started

Installation

# Clone the repository
git clone <your-repo-url>
cd code-review-bot

# Install dependencies
pip install -r requirements.txt

Running the Application

streamlit run app.py

The application will be available at http://localhost:8501

Performance and Scalability

Vector Database Benefits

Fast similarity search across millions of vectors
Scalable architecture for large document collections
Memory-efficient production deployment

Model Selection Rationale

industry-bert-contracts: Specialized for business documents
bling-phi-3-gguf: Balanced performance and accuracy
Temperature 0.0 with sampling disabled: Makes responses deterministic and repeatable, which helps keep answers close to the retrieved context
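
To show why a temperature of 0.0 yields consistent output, here is a generic sketch of temperature-scaled softmax over made-up token scores. It treats temperature 0 as greedy selection of the highest-scoring token. How a particular runtime handles that case may differ, and the function name softmax_with_temperature is hypothetical and not part of LLMWare.

```python
import math
from typing import List


def softmax_with_temperature(logits: List[float], temperature: float) -> List[float]:
    """Turn logits into probabilities; temperature 0 is treated as greedy (argmax)."""
    if temperature < 0:
        raise ValueError("temperature must be non-negative")
    if temperature == 0:
        best = max(range(len(logits)), key=lambda i: logits[i])
        return [1.0 if i == best else 0.0 for i in range(len(logits))]
    scaled = [x / temperature for x in logits]
    peak = max(scaled)  # subtract the max for numerical stability
    exps = [math.exp(s - peak) for s in scaled]
    total = sum(exps)
    return [e / total for e in exps]


logits = [2.0, 1.0, 0.5]  # made-up scores for three candidate tokens
greedy = softmax_with_temperature(logits, 0.0)
warm = softmax_with_temperature(logits, 1.0)
print(greedy)
print(warm)
```

At temperature 0 all probability sits on the top-scoring token, so the same prompt produces the same continuation. At temperature 1.0 the lower-scoring tokens keep non-zero probability and sampled output can vary between runs. Determinism makes answers repeatable, but it does not by itself guarantee they are correct.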

Future Enhancements

Potential Improvements

Multi-modal support for image and table processing
Advanced analytics and document insights
RESTful API for external applications
Custom model fine-tuning for specific domains
Real-time document processing and indexing

Enterprise Features

User authentication and access controls
Document versioning and change tracking
Comprehensive audit logging
Domain-specific embedding customization

Conclusion

This RAG application demonstrates the integration of modern AI technologies with practical business requirements. The solution leverages LLMWare's enterprise-grade framework and Streamlit's interface to provide:

Processing of diverse document types (PDF, TXT, DOCX)
Intelligent, source-backed responses to complex queries
Efficient scaling for large document collections
Intuitive user experience for non-technical users

The project illustrates how RAG technology transforms document analysis from manual processes into intelligent, automated workflows that deliver immediate insights and answers.

This application provides a foundation for building intelligent document analysis systems adaptable to specific business needs.

DEV Community: Aman Singh