Knowledge Banks for Conversational AI: When Less is More

It all started with a problem.
We began working with a client to build an AI chatbot. They had a massive amount of data—products, services, team members, contact info, you name it. The goal was seemingly simple: allow customers to ask the chatbot questions and get accurate answers.
But the reality? The chatbot was hallucinating. It couldn't provide the right answers.
(Note: If you aren't interested in the failed experiments, you can skip directly to the "What Worked" section below for the technical solution.)
What We Tried (And Why It Failed)
As engineers, our first instinct was "more context is better." Here is the roadmap of our initial failures:

  1. The "Detail Overload" Approach We thought providing maximum detail would help. We fed the system 15-20 lines of data for every single entry—English descriptions, native language variations, keywords, and metadata.
    • Result: The AI got lost in the noise.
    • Accuracy: Stagnated at 40-50%.
  2. The "Q&A" Overload Next, we created a training manual with 50+ sample Q&A pairs, repeating the same questions in three different languages. The Knowledge Bank swelled to over 10,000+ lines.
    • Result: The AI became even more confused due to contradictory context weights.
    • Accuracy: Dropped to 35-40%.
  3. The "Redundancy" Approach This was our last-ditch effort. We provided the same data in three formats: CSV, Paragraph, and JSON, plus diagrams.
    • Result: This was the worst performance. The data redundancy overwhelmed the context window.
    • Accuracy: Hit rock bottom at 30-35%. What Worked: The "Less is More" Architecture We decided to flip the script entirely. We stopped trying to feed the AI everything and focused on feeding it only what mattered.
  1. Aggressive Compression. We took those 15-20 line entries and compressed each one into a single line containing only the essential information (see the first sketch after this list).
  2. Relocation. We moved the core data out of the messy Knowledge Bank and placed it in the Master Prompt in a compact, token-efficient format.
  3. Separation of Concerns. This was the game-changer. We strictly separated Data from Instructions: we told the AI how to behave (logic/instructions) separately from what it knows (data). We stopped treating the Knowledge Bank as a rulebook and started treating it as a database (the second sketch below shows one way to lay this out).
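Here's a minimal Python sketch of what step 1 looked like in practice. The field names and the pipe-delimited format are illustrative assumptions; the real schema was client-specific.

```python
# A hypothetical verbose entry, reduced to the fields the model actually
# needs. Everything else (translations, keywords, metadata) gets dropped.

def compress_entry(entry: dict) -> str:
    """Collapse a verbose knowledge-bank entry into one dense line."""
    return " | ".join([
        entry["name"],
        entry["category"],
        entry["price"],
        entry["contact"],
    ])

verbose_entry = {
    "name": "Premium Support Plan",
    "category": "Service",
    "price": "$49/mo",
    "contact": "support@example.com",
    # ...plus a dozen more lines of descriptions, native-language
    # variations, and keywords that we ended up cutting entirely.
}

print(compress_entry(verbose_entry))
# Premium Support Plan | Service | $49/mo | support@example.com
```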
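And a sketch of steps 2 and 3 together: the compressed data lives inside the Master Prompt, clearly delimited from the behavioral instructions. The OpenAI-style message format and the exact prompt wording are assumptions, not our production prompt.

```python
# Behavior (logic/instructions) and knowledge (data) are kept strictly
# separate, then composed into one system message with clear delimiters.

INSTRUCTIONS = """You are a customer-support assistant.
Answer only from the DATA section below.
If the answer is not in the data, say you don't know."""

# Compact, token-efficient data block relocated into the Master Prompt.
DATA = """DATA (name | category | price | contact):
Premium Support Plan | Service | $49/mo | support@example.com
Starter Plan | Service | $9/mo | support@example.com"""

def build_messages(user_question: str) -> list[dict]:
    # The model sees rules first, facts second, never mixed together.
    return [
        {"role": "system", "content": f"{INSTRUCTIONS}\n\n{DATA}"},
        {"role": "user", "content": user_question},
    ]

print(build_messages("How much is the Premium Support Plan?"))
```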
The Results
The transformation was dramatic.
    • Codebase Size: Reduced from 10,000+ lines to just 300-400 lines.
    • Accuracy: Jumped from a struggling 30-50% to a consistent 80-85%. The chatbot now answers almost perfectly.

Key Technical Takeaways
    • Relevance > Volume: In Conversational AI, huge datasets often create noise. Relevant, concise data is far more effective than "big" data.
    • Decouple Knowledge & Logic: Keep your Knowledge Bank for facts and your Master Prompt for behavior. Mixing them confuses the model's reasoning capabilities.
    • Structure > Format: It doesn't matter if you use CSV or JSON; what matters is how you organize the information inside it.
    • Test with Real Queries: Synthetic Q&A samples can give you false confidence. Real user queries are the only metric that matters.

Actionable Tips for Developers
If you are struggling with a hallucinating chatbot, try this workflow:
    • Analyze Redundancy: Check your knowledge bank. How many times is the same info repeated? (A quick script for this is sketched after the list.)
    • Compress: Remove Chunk IDs, headers, and verbose descriptions. Keep only the raw tokens the AI needs.
    • Optimize the Master Prompt: Distinct instructions are better than vague examples.
    • Real-World Testing: Stop testing with "Hello". Test with the messy, complex questions users actually ask (a bare-bones harness is sketched below).
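For the redundancy check, something this simple goes a long way. It assumes one entry per line in a plain-text knowledge bank (knowledge_bank.txt is a placeholder); adapt it to your format.

```python
from collections import Counter

def redundancy_report(path: str, top: int = 10) -> None:
    """Count how often identical lines repeat in a knowledge bank."""
    with open(path, encoding="utf-8") as f:
        lines = [line.strip() for line in f if line.strip()]
    counts = Counter(lines)
    duplicates = sum(c - 1 for c in counts.values())
    print(f"{len(lines)} lines total, {duplicates} redundant")
    for line, count in counts.most_common(top):
        if count > 1:
            print(f"{count}x  {line[:60]}")

redundancy_report("knowledge_bank.txt")  # placeholder filename
```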
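And for real-world testing, even a bare-bones harness beats spot-checking. ask_chatbot is a hypothetical stand-in for your actual model call, and keyword matching is a crude proxy for human review, but it surfaces regressions quickly.

```python
def ask_chatbot(question: str) -> str:
    # Hypothetical stub: replace with your real chatbot call.
    return "The Premium Support Plan is $49/mo; email support@example.com."

def evaluate(test_cases: list[tuple[str, str]]) -> float:
    """Score answers against expected keywords; print the misses."""
    hits = 0
    for question, expected in test_cases:
        answer = ask_chatbot(question)
        if expected.lower() in answer.lower():
            hits += 1
        else:
            print(f"MISS: {question!r} -> {answer[:80]!r}")
    return hits / len(test_cases)

# Pull these from real chat logs, not from synthetic Q&A pairs.
real_queries = [
    ("whats the price for premium support??", "$49"),
    ("can i email someone about my plan", "support@example.com"),
]

print(f"Accuracy: {evaluate(real_queries):.0%}")
```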

Final Thoughts
In Prompt Engineering and Conversational AI, "less is more" is not just a slogan. By reducing our line count from 10,000+ to roughly 300, we didn't just save tokens; we saved the project.
Have you faced similar issues with context overload? Let me know in the comments!

MD FARHAN HABIB FARAZ
Prompt Engineer & Prompt Team Lead
PowerInAI

#ai #promptengineering #chatbot #machinelearning
