How RUBICON's Two-Layer Fixed Entity Architecture eliminated data bottlenecks for a multi-team enterprise, delivering a conversational AI system that gives leadership instant project clarity, without hallucinations.
Overview
Leadership at a growing enterprise was losing hours every week to a single problem: before any meaningful team check-in could happen, a manager had to manually hunt through meeting notes, chat logs, and status documents just to build enough context to ask the right questions. Critical information lived in dozens of unstructured sources like Slack threads, meeting transcripts, and internal wikis, and nobody had a unified view of what was actually happening across projects.
We were brought in to eliminate that bottleneck entirely.
The standard playbook for this kind of problem is vector search: embed your documents, run similarity queries, feed results to an LLM. But when we mapped out the actual questions leadership needed to answer, things like "Who owns the at-risk deliverables this sprint?" or "What did the infrastructure team commit to last Thursday?", we realized flat vector search couldn't reliably traverse the relationships between people, projects, and timelines. The answers required multi-hop reasoning across structured organizational data.
So we built Chief Bot: an enterprise AI chatbot powered by a Neo4j knowledge graph and a Graph-based RAG (Retrieval Augmented Generation) architecture. The key design decision, and the one this article focuses on, was adopting a Two-Layer Fixed Entity Architecture instead of standard LLM-driven entity extraction. This choice eliminated the hallucination and entity duplication problems that plague most GraphRAG implementations, while keeping costs an order of magnitude lower.
The result is a conversational interface where leadership asks natural language questions and gets precise, cited answers grounded in verified organizational data. In seconds, not hours.
Why the "Two-Layer Fixed Entity Architecture" Is Our Hero
Before settling on our final architecture, we piloted the project using standard automated LLM graph builders, tools like Neo4j's LLM Knowledge Graph Builder that scan unstructured sources and auto-generate entities. These tools offer rapid graph generation, but the results were unsuitable for executive-level decision making. The LLM frequently hallucinated nonexistent organizational roles, created duplicate nodes for the same person, and produced cluttered databases that made reliable information retrieval impossible.
We needed a different approach. Here's where we landed and why.
How Standard GraphRAG Falls Short
In a typical GraphRAG pipeline, you use an LLM to extract entities and relationships from every document. This sounds elegant, but in practice it creates three compounding problems:
- Massive token costs. Every document gets processed through an LLM for entity extraction, relationship classification, and deduplication. For a midsize organization, this can cost thousands of dollars per ingestion cycle.
- Entity duplication and drift. The LLM might extract "Sarah Chen," "S. Chen," and "Sarah (Infrastructure Lead)" as three separate nodes. Multiply this across hundreds of documents and your graph becomes unreliable.
- No ground truth. If the LLM fabricates a relationship, say linking the wrong person to a project, there's no structural safeguard to catch it. The error propagates into every future query.
Our Two Layer Solution
By switching to a two-layer approach, we established a "Ground Truth" that ensures 100% accuracy in core organizational relationships:
- Layer 1, The Fixed Entity Layer (FEL1): Instead of letting an AI guess the company structure, we manually created a verified "skeleton" of known entities (Persons, Roles, Projects, Departments) from structured, authoritative sources. This is the deterministic backbone of the graph. No LLM touches this layer.
- Layer 2, The Document Layer (DL2): Unstructured text chunks like meeting transcripts, status reports, and chat logs are attached to this fixed skeleton via cosine similarity mapping. Every piece of raw data gets a precise "home" in the graph, linked to the correct verified entity.
The critical constraint here is that every document chunk must attach to an existing Fixed Entity node. There are no orphan chunks floating in the graph. If a meeting transcript mentions "Sarah's infrastructure update," it gets linked to the verified Sarah > Infrastructure Department > Project Atlas path, not to an LLM-hallucinated "Sarah Infrastructure" entity.
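The attachment step above can be sketched in a few lines. This is a minimal illustration, not production code: the entity names, the toy 3-dimensional embeddings, and the 0.75 threshold are all invented for the example, and a real pipeline would use embeddings from an actual model.

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

def attach_chunk(chunk_embedding, fixed_entities, threshold=0.75):
    """Map a document chunk to the best-matching Fixed Entity node.

    Chunks that fail the similarity threshold are NOT attached anywhere;
    they are queued for manual review instead. This enforces the core
    constraint: no orphan chunks, and no invented entities.
    """
    best_name, best_score = None, -1.0
    for name, entity_embedding in fixed_entities.items():
        score = cosine(chunk_embedding, entity_embedding)
        if score > best_score:
            best_name, best_score = name, score
    if best_score >= threshold:
        return best_name
    return None  # below threshold -> manual verification queue

# Toy embeddings for illustration only.
entities = {
    "Sarah Chen": [0.9, 0.1, 0.0],
    "Project Atlas": [0.1, 0.9, 0.1],
}
print(attach_chunk([0.85, 0.15, 0.05], entities))  # prints: Sarah Chen
```

The key design point is the `None` branch: where standard GraphRAG would let the LLM mint a new node for an unrecognized mention, this pipeline refuses to attach anything it cannot confidently map.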
Eliminating Hallucinations and Graph Bloat
This two-layer framework effectively eliminates graph bloat, which is the accumulation of redundant entities and noisy data that typically degrades LLM performance over time. Because the LLM is restricted to querying a pre-validated, fixed schema, it cannot hallucinate nonexistent connections or get lost in duplicate nodes. When a leader asks a complex question, the answer is grounded in a verified organizational relationship, not a probabilistic guess.
Comparison: Standard GraphRAG vs. Our Two-Layer Fixed Entity Architecture

| Dimension | Standard GraphRAG | Two-Layer Fixed Entity Architecture |
| --- | --- | --- |
| Entity creation | LLM extracts entities from every document | Verified entities defined once from authoritative sources |
| Token cost per ingestion cycle | High: extraction, relationship classification, deduplication | Low: embedding generation and similarity mapping only |
| Entity duplication | Common ("Sarah Chen" vs. "S. Chen" vs. "Sarah (Infrastructure Lead)") | None: chunks attach to existing verified nodes |
| Ground truth | None; fabricated relationships propagate into future queries | Deterministic backbone the LLM cannot modify |
Challenges
Business Challenges
- Balancing High Precision with Low Operational Costs. Standard GraphRAG methods are prohibitively expensive due to heavy LLM token usage during graph construction, and they compound the cost problem with unreliable entity duplication. We needed to deliver a high-precision tool without spending a fortune on LLM-driven entity extraction, and that's the direct motivation behind adopting the Two-Layer Fixed Entity Architecture.
- Transitioning from Search to Reasoning. Moving beyond standard vector search meant demonstrating to stakeholders, with concrete examples, that a graph-based approach would provide structurally accurate answers to relational questions that flat document search simply cannot handle.
- Securing Executive Alignment through Transparency. To gain C-level trust, we couldn't treat the AI as a "black box." Leadership needed to see how the system arrived at its answers, which led us to build a Developer Transparency Mode that exposes the Cypher logic behind every response.
Technical Challenges
- Structured Extraction from Unstructured Data. The primary hurdle was extracting structured relationships from a single, large unstructured document into a dual-layer graph consisting of Fixed Entities and Document Chunks, without introducing errors at the extraction stage.
- Entity Duplication and Resolution. To avoid the cluttered and uncontrollable databases common in LLM-generated graphs, we enforced a strict fixed ontology, ensuring every document chunk mapped precisely to the correct verified node.
- Natural Language to Cypher Translation. The LLM needed to accurately translate conversational questions into valid Neo4j Cypher queries without hallucinating nonexistent node types or relationships. We constrained this by providing the exact graph schema as context at query time.
- Multi-Hop Query Performance. As relationship depth increased, we had to optimize queries to ensure that traversing multiple graph hops (person to project to department to related documents) didn't introduce unacceptable latency.
- Scalability Without Schema Redesign. The PoC needed to be architected so that scaling from a single document to hundreds of sources wouldn't require ripping out the data model.
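The schema-as-context constraint from the list above can be sketched as a prompt builder. The node labels, relationship types, and wording below are hypothetical stand-ins, not the client's actual ontology or prompt.

```python
# Hypothetical schema for illustration. In the real system this would be
# generated from the Fixed Entity Layer's actual labels and relationships.
GRAPH_SCHEMA = """\
Nodes: (:Person {name, role}), (:Project {name, status}),
       (:Department {name}), (:Chunk {text, source})
Relationships: (:Person)-[:LEADS]->(:Project),
               (:Person)-[:MEMBER_OF]->(:Department),
               (:Department)-[:OWNS]->(:Project)"""

def build_cypher_prompt(question: str) -> str:
    """Inject the exact graph schema at query time so the model is
    constrained to labels and relationship types that really exist."""
    return (
        "You translate questions into Neo4j Cypher.\n"
        "Use ONLY the labels and relationship types in the schema below; "
        "if the question cannot be answered with this schema, say so.\n\n"
        f"Schema:\n{GRAPH_SCHEMA}\n\n"
        f"Question: {question}\nCypher:"
    )

prompt = build_cypher_prompt("Who leads Project Atlas?")
# A multi-hop query this schema supports would look like:
# MATCH (p:Person)-[:MEMBER_OF]->(d:Department)-[:OWNS]->(pr:Project)
# RETURN p.name, d.name, pr.name
```

Because the schema is regenerated from the Fixed Entity Layer rather than hand-maintained in the prompt, it cannot drift out of sync with the graph it describes.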
Solution
We engineered Chief Bot to serve as a single source of truth for the company's complex project portfolio. By shifting from a static reporting model to an interactive, Graph-based RAG architecture, we enabled stakeholders to extract real-time insights from both structured and unstructured data through a conversational interface.
Data Engineering & Integrity
The core of the solution is a multilayered data ingestion framework built around the Two-Layer Fixed Entity Architecture described above:
The Deterministic Core (FEL1). We performed a manual, high-precision extraction of foundational project data from the client's primary documents. This "Ground Truth" layer populates the Neo4j knowledge graph, ensuring that primary relationships between projects, leads, and timelines are 100% accurate and verifiable.
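Seeding a layer like FEL1 might look something like the sketch below. The labels, names, and properties are invented for illustration; the point is the use of idempotent `MERGE` statements so that re-running the seed script never produces duplicate nodes. (A production version would pass values as query parameters through the Neo4j driver rather than interpolating strings.)

```python
# Illustrative only: generating idempotent Cypher MERGE statements to
# seed a verified Fixed Entity Layer. MERGE, unlike CREATE, matches an
# existing node first, so repeated runs cannot duplicate entities.
def fixed_entity_merge(label: str, name: str, **props) -> str:
    """Build a MERGE statement for one verified entity node."""
    extra = "".join(f", {k}: '{v}'" for k, v in props.items())
    return f"MERGE (:{label} {{name: '{name}'{extra}}})"

# Hypothetical seed data standing in for the client's org chart.
seed = [
    fixed_entity_merge("Person", "Sarah Chen", role="Infrastructure Lead"),
    fixed_entity_merge("Project", "Atlas", status="active"),
    fixed_entity_merge("Department", "Infrastructure"),
]
for stmt in seed:
    print(stmt)
```

No LLM appears anywhere in this step, which is the whole point: the backbone is deterministic and auditable line by line.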
The Contextual Layer (DL2, Modular Processing). We developed a suite of specialized functions to process fluid data sources like meeting transcripts, progress reports, and status updates. These modules perform text chunking, vector embedding generation, and cosine similarity mapping to link new information to existing Fixed Entity nodes. The modular architecture is designed for full automation in future phases, while currently supporting manual verification before data is committed to the knowledge base.
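The first of the DL2 steps, text chunking, can be sketched as an overlapping word-window splitter. Window and overlap sizes here are arbitrary illustration values, and the embedding and mapping steps that follow it are omitted.

```python
# Minimal sketch of the DL2 chunking step: split a transcript into
# overlapping word windows so that context spanning a chunk boundary
# is not lost. Sizes are illustrative, not tuned values.
def chunk_text(text: str, max_words: int = 50, overlap: int = 10):
    """Split text into overlapping chunks of at most max_words words."""
    words = text.split()
    step = max_words - overlap
    chunks = []
    for start in range(0, len(words), step):
        piece = words[start:start + max_words]
        if piece:
            chunks.append(" ".join(piece))
        if start + max_words >= len(words):
            break
    return chunks

# Each resulting chunk would then be embedded and passed to the
# similarity-mapping step to find its home in the Fixed Entity Layer.
```

Overlap matters here because a commitment like "Sarah will own the migration" loses its meaning if "Sarah will" and "own the migration" land in different chunks with no shared context.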
AI Reasoning & Transparent Interface
The system functions as a Natural Language-to-Graph Query engine. We integrated a leading Large Language Model (LLM) via a secure API to interpret free-form user questions, transform them into precise database queries, and return human-like responses. The solution preserves conversation history, allowing users to ask follow-up questions and drill down into specific project details without losing the thread of the discussion.
To bridge the gap between complex AI logic and stakeholder trust, we implemented Explainable AI (XAI) through a dedicated Developer Transparency Mode. This module exposes the system's underlying "thought process", providing real-time visibility into the active context state, the transformed database query, and the raw data results. By demystifying the model's reasoning, we transformed the chatbot from a "black box" into a verifiable, high-integrity decision-support tool.
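The conversation-history handling described above can be sketched as a small state object. This is a hypothetical simplification; the class name, turn limit, and context format are assumptions, not the production design.

```python
# Hypothetical sketch of follow-up handling: each turn carries the prior
# question/answer pairs so the model can resolve references like
# "what did she commit to?" without the user restating context.
class ConversationState:
    def __init__(self, max_turns: int = 10):
        self.max_turns = max_turns
        self.history = []  # list of (question, answer) tuples

    def add_turn(self, question: str, answer: str) -> None:
        self.history.append((question, answer))
        # Bound the context window so long sessions stay within limits.
        self.history = self.history[-self.max_turns:]

    def as_context(self) -> str:
        """Render prior turns for inclusion in the next prompt."""
        return "\n".join(f"Q: {q}\nA: {a}" for q, a in self.history)

state = ConversationState()
state.add_turn("Who leads Project Atlas?", "Sarah Chen")
state.add_turn("What did she commit to last Thursday?", "Migration cutover")
```

The same state object is what a Developer Transparency Mode can expose: the active context, plus the generated Cypher and raw results for each turn.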
Results
The Chief Bot proof of concept successfully transitioned the company from static data storage to active leadership intelligence. Here's what it delivered:
- Leadership prep time dropped from hours to seconds. A manager can now ask "Who is handling the most at-risk deliverables this sprint?" and get a precise, cited answer without opening a single document. The graph traverses from project nodes to lead nodes to document chunks, assembling the context that previously required manual synthesis across multiple data sources.
- Zero entity duplication across the entire knowledge graph. Because the organizational backbone was predefined and verified in the Fixed Entity Layer, the system maintained a clean, consistent graph. This eliminates the accuracy degradation that typically plagues GraphRAG systems as more documents are ingested over time.
- Deep drill-down through multi-turn conversations without context loss. The system preserves full conversation history, so users can move from broad queries to specific follow-ups to lateral pivots, all without reestablishing context. This replaced the old workflow of switching between apps, files, and chat threads just to piece together a complete picture.
- A validated, scalable path from PoC to production. The Fixed Entity Layer provides a stable structural backbone that doesn't change with every new document ingestion. New unstructured data attaches to existing verified nodes, so the system can scale from dozens to thousands of documents without schema redesign or accuracy degradation.
This sets the stage for a larger vision: a future where every internal document, Slack thread, and project brief isn't just noise, but a searchable, strategic asset that helps leadership lead with total clarity.
Lessons Learned and Tradeoffs
No architecture is without tradeoffs. Here's what we'd share with anyone evaluating a similar approach:
- Fixed entities trade flexibility for reliability. If a new department or project lead appears, someone has to manually add them to the Fixed Entity Layer before the system can reference them. For a fast-moving startup, this could be a bottleneck. For an enterprise with established org charts, it's actually a feature, since the system stays aligned with official structure by design.
- Natural language to Cypher is fragile at the edges. Straightforward questions ("Who leads Project Atlas?") translate reliably. Ambiguous or compound questions ("Compare the velocity of all teams that had leadership changes last quarter") sometimes produce incorrect Cypher. We mitigated this with schema constraints and few-shot prompting, but it remains the area with the most room for improvement.
- We deliberately limited the PoC data scope. We ingested one primary document to validate the architecture. The modular ingestion pipeline is designed for automated multi source ingestion (Slack, Jira, Google Docs), but that integration is Phase 2 work. We chose to prove the reasoning layer first before scaling the data layer.
- Any AI system is only as good as the data it's built on. Chief Bot's accuracy depends entirely on the quality, completeness, and freshness of the documents ingested into the knowledge graph. Incomplete meeting notes or outdated project records produce incomplete answers. No amount of architectural sophistication changes that. This is why we designed the ingestion pipeline to be modular: so the client can continuously expand and refine their data sources over time.
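The few-shot mitigation mentioned in the lessons above amounts to pairing questions with known-good Cypher in the prompt. The pairs below are invented for illustration and assume the same hypothetical schema used earlier in this article.

```python
# Hedged sketch: curated (question, Cypher) pairs used as few-shot
# examples. Both questions and queries are illustrative inventions.
FEW_SHOT = [
    ("Who leads Project Atlas?",
     "MATCH (p:Person)-[:LEADS]->(:Project {name: 'Atlas'}) RETURN p.name"),
    ("Which projects does the Infrastructure department own?",
     "MATCH (:Department {name: 'Infrastructure'})-[:OWNS]->(pr:Project) "
     "RETURN pr.name"),
]

def few_shot_block() -> str:
    """Render the curated pairs for inclusion ahead of the user question."""
    return "\n\n".join(f"Q: {q}\nCypher: {c}" for q, c in FEW_SHOT)
```

Because every example query was validated against the real graph before being added, the examples pull generation toward patterns that are known to execute, which is exactly where the compound-question failures occur.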
Technology Stack
- Neo4j - Graph database for native relationship traversal and multi hop queries that flat vector DBs can't support
- OpenAI API - Natural language to Cypher translation and response synthesis
- FastAPI + Uvicorn - Lightweight Python backend for low latency API serving
- React + Vite - Fast component based frontend for the chat interface
- Python - Data ingestion pipeline, text chunking, vector embedding generation
Quick Facts
- Region: USA
- Industry: AI
- Project Duration: 1 month
- Team: RUBICON Engineers & Solution Architects

