<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Harideevagan M</title>
    <description>The latest articles on DEV Community by Harideevagan M (@harideevagan).</description>
    <link>https://dev.to/harideevagan</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3925559%2F8573da8d-cce3-404e-8253-f4a6ad930877.jpg</url>
      <title>DEV Community: Harideevagan M</title>
      <link>https://dev.to/harideevagan</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/harideevagan"/>
    <language>en</language>
    <item>
      <title>How I Built a RAG-Powered Conversational Assistant for Odoo ERP</title>
      <dc:creator>Harideevagan M</dc:creator>
      <pubDate>Mon, 11 May 2026 18:05:18 +0000</pubDate>
      <link>https://dev.to/harideevagan/how-i-built-a-rag-powered-conversational-assistant-for-odoo-erp-3pjn</link>
      <guid>https://dev.to/harideevagan/how-i-built-a-rag-powered-conversational-assistant-for-odoo-erp-3pjn</guid>
      <description>&lt;p&gt;Every enterprise runs on data — sales orders, invoices, inventory counts, customer records — but getting answers from that data inside an ERP system usually means clicking through 10 screens, running reports, or asking someone who knows which menu hides which number.&lt;br&gt;
I wanted to change that. As a Full-Stack Developer at Futurenet Technologies in Chennai, I built a conversational AI assistant that sits inside Odoo ERP and lets users ask questions in plain English (or give voice commands) and get instant answers from live business data.&lt;br&gt;
This is the story of how I built it, the architecture behind it, and the lessons I learned along the way.&lt;/p&gt;

&lt;h2&gt;The Problem&lt;/h2&gt;

&lt;p&gt;Our enterprise clients were running Odoo ERP with 15+ modules — Sales, Inventory, MRP, Accounting, CRM, HR, Payroll, POS, E-commerce, Helpdesk, and more. The data was all there, but:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Sales reps had to navigate 4-5 screens just to check a customer's credit status&lt;/li&gt;
&lt;li&gt;Warehouse managers needed to run reports to see stock availability&lt;/li&gt;
&lt;li&gt;Executives wanted quick KPIs without waiting for BI dashboards to load&lt;/li&gt;
&lt;li&gt;Field service agents needed hands-free access while on-site&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;The question was simple: Can users just ask the ERP what they need and get an answer?&lt;/p&gt;

&lt;h2&gt;The Architecture&lt;/h2&gt;

&lt;p&gt;Here's the high-level architecture I designed:&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User Query (text/voice/image)
         │
         ▼
┌──────────────────────┐
│  Multimodal Input    │  ← Speech-to-text, OCR, text input
│  Processing Layer    │
└────────┬─────────────┘
         │
         ▼
┌──────────────────────┐
│  Query Router        │  ← Classifies intent and complexity
└────────┬─────────────┘
         │
    ┌────┴────┐
    │         │
    ▼         ▼
┌────────┐ ┌──────────────┐
│ Direct │ │ RAG Pipeline │
│ ORM    │ │ (LangChain + │
│ Query  │ │  pgvector)   │
└───┬────┘ └──────┬───────┘
    │             │
    ▼             ▼
┌──────────────────────┐
│  Response Generator  │  ← Fine-tuned LLM
│  (QLoRA Model)       │
└────────┬─────────────┘
         │
         ▼
┌──────────────────────┐
│  Odoo ERP Frontend   │  ← OWL widget in Odoo UI
└──────────────────────┘
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;The system has four main components:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;Multimodal Input Processing
Users can interact through text, voice, or image:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Text: Direct chat input in the Odoo interface&lt;/li&gt;
&lt;li&gt;Voice: Speech-to-text using Whisper, enabling hands-free operation for warehouse and field workers&lt;/li&gt;
&lt;li&gt;Image: OCR processing for documents — snap a photo of a purchase order and the system extracts the data&lt;/li&gt;
&lt;/ul&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import whisper
from paddleocr import PaddleOCR

class MultimodalProcessor:
    def __init__(self):
        self.whisper_model = whisper.load_model("base")
        self.ocr_engine = PaddleOCR(use_angle_cls=True, lang='en')

    def process_input(self, input_data, input_type="text"):
        if input_type == "voice":
            result = self.whisper_model.transcribe(input_data)
            return result["text"]
        elif input_type == "image":
            result = self.ocr_engine.ocr(input_data, cls=True)
            extracted = " ".join([line[1][0] for line in result[0]])
            return extracted
        return input_data
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;
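&lt;p&gt;The dispatch itself is independent of the heavy models, which makes it easy to sanity-check. Here's a minimal sketch of the same routing with the Whisper and PaddleOCR calls replaced by injectable stand-ins (the stand-in parameters are my own simplification, not part of the production class):&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def process_input(input_data, input_type="text", transcribe=None, run_ocr=None):
    """Route raw input to the right handler based on its declared type.

    transcribe and run_ocr stand in for the Whisper and PaddleOCR
    calls so the routing logic can be exercised without the models.
    """
    if input_type == "voice":
        return transcribe(input_data)
    if input_type == "image":
        return run_ocr(input_data)
    # Plain text passes straight through
    return input_data

# Text is returned unchanged; voice goes through the transcriber
process_input("check stock")
process_input(b"audio-bytes", "voice", transcribe=lambda audio: "check stock")
&lt;/code&gt;&lt;/pre&gt;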

&lt;ol start="2"&gt;
&lt;li&gt;RAG Pipeline with LangChain and pgvector
This is the core of the system. Instead of feeding the entire database to the LLM (impossible and expensive), I built a Retrieval-Augmented Generation pipeline.
Step 1: Document Indexing
I created embeddings for Odoo's business data — product descriptions, customer notes, helpdesk tickets, sales policies, HR policies, and module documentation:&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from langchain.embeddings import HuggingFaceEmbeddings
from langchain.vectorstores import PGVector

# Use a lightweight embedding model for speed
embeddings = HuggingFaceEmbeddings(
    model_name="sentence-transformers/all-MiniLM-L6-v2"
)

# Store embeddings in PostgreSQL using pgvector
vector_store = PGVector(
    connection_string=DATABASE_URL,
    embedding_function=embeddings,
    collection_name="odoo_knowledge_base"
)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Why pgvector instead of ChromaDB or Pinecone?&lt;br&gt;
Since Odoo already runs on PostgreSQL, using pgvector meant:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;No additional infrastructure to manage&lt;/li&gt;
&lt;li&gt;Embeddings live alongside the business data&lt;/li&gt;
&lt;li&gt;Transactions are ACID-compliant&lt;/li&gt;
&lt;li&gt;One backup strategy covers everything&lt;/li&gt;
&lt;/ul&gt;
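&lt;p&gt;Under the hood, a similarity search against pgvector just orders rows by a distance operator; for text embeddings that is typically cosine distance. A pure-Python sketch of the quantity being computed (illustrative only; in production pgvector does this natively, with index support):&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import math

def cosine_distance(a, b):
    """Cosine distance between two embedding vectors:
    1 - dot(a, b) / (norm(a) * norm(b))."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Parallel vectors score 0 (most similar); orthogonal vectors score 1
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 3.0]))  # 1.0
&lt;/code&gt;&lt;/pre&gt;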

&lt;p&gt;Step 2: Smart Retrieval&lt;br&gt;
Not every query needs RAG. "What's the stock of Product X?" can be answered directly from the ORM. But "What's our return policy for international orders?" needs the RAG pipeline.&lt;br&gt;
I built a query router that classifies intent:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class QueryRouter:
    """Routes queries to the appropriate handler."""

    DIRECT_ORM_PATTERNS = [
        "stock", "quantity", "price", "total",
        "count", "balance", "status",
    ]

    def route(self, query: str) -&amp;gt; str:
        query_lower = query.lower()

        # Check if this can be answered via a direct ORM query
        if any(pattern in query_lower for pattern in self.DIRECT_ORM_PATTERNS):
            return "orm_direct"

        # Otherwise this needs document retrieval
        return "rag_pipeline"
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
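&lt;p&gt;To make the split concrete, here's the router applied to the two example queries above (a trimmed, self-contained copy of the class, for illustration):&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class QueryRouter:
    """Trimmed copy of the router above, kept self-contained."""

    DIRECT_ORM_PATTERNS = [
        "stock", "quantity", "price", "total",
        "count", "balance", "status",
    ]

    def route(self, query):
        query_lower = query.lower()
        if any(p in query_lower for p in self.DIRECT_ORM_PATTERNS):
            return "orm_direct"
        return "rag_pipeline"

router = QueryRouter()
# The keyword "stock" matches, so the fast ORM path handles it
print(router.route("What's the stock of Product X?"))  # orm_direct
# No pattern matches, so the query falls through to RAG
print(router.route("What's our return policy for international orders?"))  # rag_pipeline
&lt;/code&gt;&lt;/pre&gt;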

&lt;p&gt;Step 3: Context-Aware Retrieval&lt;br&gt;
The retrieval step doesn't just do a simple similarity search. It considers the user's context — their role, department, and the Odoo module they're currently in:&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def retrieve_context(self, query, user_context):
    """Retrieve relevant documents with user-context awareness."""

    # Build a metadata filter based on the user's access rights
    metadata_filter = {
        "department": user_context.get("department"),
        "module": user_context.get("active_module"),
    }

    # Retrieve the top-k relevant documents
    docs = self.vector_store.similarity_search(
        query,
        k=5,
        filter=metadata_filter
    )

    # Also fetch live data from the Odoo ORM if needed
    orm_context = self._fetch_orm_data(query, user_context)

    return docs, orm_context
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;

&lt;ol start="3"&gt;
&lt;li&gt;Fine-Tuned LLM with QLoRA
The base model didn't understand Odoo-specific terminology or our clients' business logic, so I fine-tuned it using QLoRA (Quantized Low-Rank Adaptation).
Why QLoRA?&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Full fine-tuning of a 7B-parameter model needs 28+ GB of VRAM&lt;/li&gt;
&lt;li&gt;QLoRA reduces this to under 8 GB by quantizing to 4-bit and training only low-rank adapters&lt;/li&gt;
&lt;li&gt;We could run this on a single GPU server&lt;/li&gt;
&lt;/ul&gt;
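&lt;p&gt;The adapter overhead is tiny, which is the whole point. A back-of-the-envelope count of the trainable parameters, assuming Mistral-7B's published shapes (hidden size 4096, 32 layers, grouped-query attention so k_proj and v_proj map 4096 to 1024) and rank r=16 on the four attention projections:&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;r = 16
layers = 32
# (d_in, d_out) per adapted projection; shapes assume Mistral-7B's
# grouped-query attention (8 KV heads, head dim 128)
shapes = {
    "q_proj": (4096, 4096),
    "k_proj": (4096, 1024),
    "v_proj": (4096, 1024),
    "o_proj": (4096, 4096),
}

# Each adapted matrix adds two low-rank factors: A is r x d_in, B is d_out x r
per_layer = sum(r * (d_in + d_out) for d_in, d_out in shapes.values())
total = per_layer * layers

print(f"{total:,} trainable parameters")             # 13,631,488
print(f"{100 * total / 7e9:.2f}% of 7B parameters")  # 0.19%
&lt;/code&gt;&lt;/pre&gt;

&lt;p&gt;Roughly 13.6M trainable parameters, about 0.2% of the full model, which is why the whole run fits on a single GPU.&lt;/p&gt;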

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import torch
from peft import LoraConfig, get_peft_model, prepare_model_for_kbit_training
from transformers import AutoModelForCausalLM, BitsAndBytesConfig

# 4-bit quantization config
bnb_config = BitsAndBytesConfig(
    load_in_4bit=True,
    bnb_4bit_quant_type="nf4",
    bnb_4bit_compute_dtype=torch.float16,
    bnb_4bit_use_double_quant=True
)

# Load the base model with quantization
model = AutoModelForCausalLM.from_pretrained(
    "mistralai/Mistral-7B-Instruct-v0.2",
    quantization_config=bnb_config,
    device_map="auto"
)

# LoRA configuration
lora_config = LoraConfig(
    r=16,
    lora_alpha=32,
    target_modules=["q_proj", "k_proj", "v_proj", "o_proj"],
    lora_dropout=0.05,
    bias="none",
    task_type="CAUSAL_LM"
)

model = prepare_model_for_kbit_training(model)
model = get_peft_model(model, lora_config)
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;p&gt;Training Data&lt;br&gt;
I curated a dataset of 5,000+ examples from:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Real Odoo support tickets and their resolutions&lt;/li&gt;
&lt;li&gt;ERP operation workflows (how to create a PO, how to check stock, etc.)&lt;/li&gt;
&lt;li&gt;Business-specific Q&amp;amp;A pairs from our clients&lt;/li&gt;
&lt;li&gt;Odoo documentation rewritten as conversational Q&amp;amp;A&lt;/li&gt;
&lt;/ul&gt;
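&lt;p&gt;Each example became an instruction/response pair, one JSON object per line. The exact field names below are illustrative (schemas vary between fine-tuning frameworks), but the shape is typical:&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import json

# One training example in instruction-tuning JSONL form.
# Field names are illustrative, not a fixed standard.
example = {
    "instruction": "How do I check the stock of a product?",
    "input": "Module: Inventory",
    "output": (
        "Open Inventory, go to Products, and select the product. "
        "The On Hand field shows the current stock across warehouses."
    ),
}

line = json.dumps(example)  # one example per line in the .jsonl file
&lt;/code&gt;&lt;/pre&gt;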

&lt;ol start="4"&gt;
&lt;li&gt;Integration with Odoo
The assistant lives inside Odoo as an OWL (Odoo Web Library) component — a chat widget accessible from any screen:&lt;/li&gt;
&lt;/ol&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;/** @odoo-module */
import { Component, useState } from "@odoo/owl";

export class ERPAssistant extends Component {
    static template = "erp_assistant.ChatWidget";

    setup() {
        this.state = useState({
            messages: [],
            isListening: false,
            isProcessing: false,
        });
    }

    async sendMessage(query) {
        this.state.isProcessing = true;

        const response = await this.env.services.rpc("/api/assistant/query", {
            query: query,
            context: {
                active_model: this.env.config.activeModel,
                active_id: this.env.config.activeId,
                user_id: this.env.services.user.userId,
            },
        });

        this.state.messages.push({
            role: "assistant",
            content: response.answer,
            sources: response.sources,
        });

        this.state.isProcessing = false;
    }
}
&lt;/code&gt;&lt;/pre&gt;
&lt;/div&gt;

&lt;h2&gt;Challenges and Lessons Learned&lt;/h2&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;p&gt;Keeping embeddings in sync with live data&lt;br&gt;
Odoo data changes constantly — new products, updated prices, modified policies. I built a scheduled action that re-indexes changed records every hour:&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;class EmbeddingSync(models.Model):
    _name = 'ai.embedding.sync'

    @api.model
    def _cron_sync_embeddings(self):
        """Sync modified records to the vector store."""
        last_sync = self._get_last_sync_time()

        # Find all records modified since the last sync
        modified_records = self.env['ir.model'].search([
            ('write_date', '&amp;gt;', last_sync)
        ])

        for record in modified_records:
            self._update_embedding(record)
&lt;/code&gt;&lt;/pre&gt;
&lt;/li&gt;
&lt;li&gt;&lt;p&gt;Hallucination control&lt;br&gt;
The LLM sometimes generated plausible-sounding but incorrect numbers. My solution:&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Always verify numerical answers against the ORM before returning them&lt;/li&gt;
&lt;li&gt;Include source citations in every response so users can verify&lt;/li&gt;
&lt;li&gt;Add a confidence score — if the model isn't confident, it says "I'm not sure, let me show you the relevant screen instead" and navigates the user to the right Odoo view&lt;/li&gt;
&lt;/ul&gt;
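&lt;p&gt;A minimal sketch of the first safeguard: pull every number out of the generated answer and accept the answer only if one of them matches the authoritative ORM figure. The regex and tolerance here are my own simplifications, not the production check:&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;import math
import re

def answer_matches_orm(answer, orm_value, rel_tol=0.01):
    """Extract numbers from a generated answer and check that at least
    one agrees with the ORM value within a small relative tolerance.
    Simplified sketch; real answers may mix several figures."""
    found = [float(n.replace(",", ""))
             for n in re.findall(r"\d[\d,]*\.?\d*", answer)]
    return any(math.isclose(n, orm_value, rel_tol=rel_tol) for n in found)

# 1,240 in the answer matches the ORM figure, so the answer is returned
answer_matches_orm("We have 1,240 units on hand.", 1240)   # True
# 1,100 does not, so the assistant falls back to showing the screen
answer_matches_orm("We have 1,100 units on hand.", 1240)   # False
&lt;/code&gt;&lt;/pre&gt;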

&lt;ol start="3"&gt;
&lt;li&gt;Response latency
Enterprise users expect instant answers. My optimizations:&lt;/li&gt;
&lt;/ol&gt;

&lt;ul&gt;
&lt;li&gt;Cached embeddings for frequently accessed data (top products, common policies)&lt;/li&gt;
&lt;li&gt;Streaming responses so users see the answer forming in real time&lt;/li&gt;
&lt;li&gt;Query classification to route simple queries directly to the ORM (under 200 ms) instead of the full RAG pipeline (2-3 seconds)&lt;/li&gt;
&lt;/ul&gt;
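&lt;p&gt;The embedding cache in the first point can be as simple as memoizing the encode call. A sketch with functools.lru_cache, where encode() is a counting stand-in for the sentence-transformers model:&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;from functools import lru_cache

model_calls = {"count": 0}

def encode(text):
    """Stand-in for the sentence-transformers encode call."""
    model_calls["count"] += 1
    return (float(len(text)),)  # fake embedding; a tuple so it's hashable

@lru_cache(maxsize=1024)
def cached_embedding(text):
    # Repeat queries for the same text hit the cache, not the model
    return encode(text)

cached_embedding("top selling products this month")
cached_embedding("top selling products this month")  # served from cache
print(model_calls["count"])  # 1: the model only ran once
&lt;/code&gt;&lt;/pre&gt;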

&lt;h2&gt;Results&lt;/h2&gt;

&lt;p&gt;After deploying across multiple enterprise clients:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;Data processing time dropped significantly — users get answers in seconds instead of navigating multiple screens&lt;/li&gt;
&lt;li&gt;Voice command adoption grew rapidly — warehouse workers loved the hands-free operation&lt;/li&gt;
&lt;li&gt;Strong user adoption within weeks of launch — users preferred chatting over navigating menus&lt;/li&gt;
&lt;li&gt;Helpdesk ticket volume dropped — employees could self-serve common questions&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;Tech Stack Summary&lt;/h2&gt;

&lt;table&gt;
&lt;thead&gt;
&lt;tr&gt;&lt;th&gt;Component&lt;/th&gt;&lt;th&gt;Technology&lt;/th&gt;&lt;/tr&gt;
&lt;/thead&gt;
&lt;tbody&gt;
&lt;tr&gt;&lt;td&gt;LLM&lt;/td&gt;&lt;td&gt;Mistral 7B + QLoRA fine-tuning&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;RAG Framework&lt;/td&gt;&lt;td&gt;LangChain&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Vector Store&lt;/td&gt;&lt;td&gt;pgvector (PostgreSQL)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Embeddings&lt;/td&gt;&lt;td&gt;sentence-transformers/all-MiniLM-L6-v2&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Speech-to-Text&lt;/td&gt;&lt;td&gt;Whisper&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;OCR&lt;/td&gt;&lt;td&gt;PaddleOCR&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;ERP&lt;/td&gt;&lt;td&gt;Odoo (v17)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Backend&lt;/td&gt;&lt;td&gt;Python, FastAPI&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Frontend&lt;/td&gt;&lt;td&gt;OWL (Odoo Web Library)&lt;/td&gt;&lt;/tr&gt;
&lt;tr&gt;&lt;td&gt;Infrastructure&lt;/td&gt;&lt;td&gt;GPU server, Docker, Nginx&lt;/td&gt;&lt;/tr&gt;
&lt;/tbody&gt;
&lt;/table&gt;

&lt;h2&gt;What I Would Do Differently&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;Start with a smaller model — I initially tried a 13B model, but 7B with good fine-tuning performed just as well for this domain-specific use case, at half the inference cost.&lt;/li&gt;
&lt;li&gt;Invest more in evaluation — I should have built an automated eval pipeline earlier; manual testing doesn't scale when there are thousands of possible queries.&lt;/li&gt;
&lt;li&gt;Hybrid search from day one — combining vector similarity search with traditional keyword search (BM25) would have improved retrieval accuracy for queries containing specific product codes or order numbers.&lt;/li&gt;
&lt;/ul&gt;
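&lt;p&gt;On the last point: a standard way to merge BM25 and vector results is reciprocal rank fusion, which needs only the two ranked lists, not comparable scores. A self-contained sketch (k=60 is the conventional constant; the document IDs are made up):&lt;/p&gt;

&lt;pre class="highlight plaintext"&gt;&lt;code&gt;def reciprocal_rank_fusion(rankings, k=60):
    """Merge several ranked result lists into one.
    Each document scores sum(1 / (k + rank)) across the lists, so
    items ranked well by both retrievers rise to the top."""
    scores = {}
    for ranking in rankings:
        for rank, doc_id in enumerate(ranking, start=1):
            scores[doc_id] = scores.get(doc_id, 0.0) + 1.0 / (k + rank)
    return sorted(scores, key=scores.get, reverse=True)

vector_hits = ["SO-1042", "returns-policy", "PROD-88"]  # semantic matches
bm25_hits = ["PROD-88", "SO-1042", "INV-331"]           # exact-keyword matches
print(reciprocal_rank_fusion([vector_hits, bm25_hits]))
# ['SO-1042', 'PROD-88', 'returns-policy', 'INV-331']
&lt;/code&gt;&lt;/pre&gt;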

&lt;h2&gt;Wrapping Up&lt;/h2&gt;

&lt;p&gt;Building an AI assistant for an ERP system is fundamentally different from building a general-purpose chatbot. The data is structured, the answers need to be precise, and users have zero tolerance for wrong numbers.&lt;br&gt;
The combination of RAG (for knowledge retrieval), QLoRA fine-tuning (for domain understanding), and deep Odoo integration (for real-time data access) made this possible without requiring massive infrastructure.&lt;br&gt;
If you're building something similar or have questions about any part of this architecture, feel free to reach out — I'd love to chat about it.&lt;/p&gt;

&lt;p&gt;I'm Harideevagan M, a Full-Stack Developer and AI Engineer at Futurenet Technologies in Chennai, India. I specialize in LLM engineering, Odoo ERP, and enterprise automation. Check out my portfolio at harideevagan.netlify.app or connect with me on LinkedIn and GitHub.&lt;/p&gt;

</description>
      <category>python</category>
      <category>ai</category>
      <category>odoo</category>
      <category>langchain</category>
    </item>
  </channel>
</rss>
