Ben Sabic

Replace Your Vector Pipeline with bash

Most knowledge agents follow the same playbook: pick a vector database, build a chunking pipeline, choose an embedding model, tune retrieval parameters. Weeks later, your agent confidently returns the wrong answer and you can't figure out why.

We took a different path. Instead of embeddings, we gave the agent a filesystem and bash. It searches your content using grep, find, and cat inside isolated sandboxes. No vector DB, no chunking, no embedding model.

The results were striking. Applying this pattern to a sales call summarization agent cut costs from ~$1.00 to ~$0.25 per call, and output quality actually improved. The core insight is simple: LLMs have been trained on enormous amounts of code. They already know how to navigate directories and grep through files. You're not teaching the model a new skill. You're leveraging the one it's best at.

Debugging gets a lot easier too. With vectors, a bad answer means trying to understand why one chunk scored 0.82 while the correct one scored 0.79. Was it the chunking boundary? The embedding model? The similarity threshold? With filesystem search, you open the trace and see exactly which commands ran, which files were read, and where things went wrong. The whole fix loop takes minutes.

We open-sourced a production-ready version of this architecture called the Knowledge Agent Template, built on Vercel Sandbox, AI SDK, and Chat SDK. It includes:

  • File-system-based search with no vector DB required
  • Multi-platform deployment via Chat SDK (GitHub, Discord, Slack, Teams, and more from a single codebase)
  • A complexity router that sends simple questions to fast, cheap models and hard questions to powerful ones
  • Built-in admin tools with usage stats, error logs, and an AI-powered admin agent for debugging

Read the full writeup and deploy the template:

👉 vercel.com/blog/build-knowledge-agents-without-embeddings
