Documentation is supposed to help developers. In reality, it often slows them down. Large projects ship with thousands of pages spread across API references, tutorials, and READMEs. The problem is not missing information. The problem is finding the right information at the right time.
When someone asks “How do I get started?”, they want a tutorial. When they ask “What parameters does this API accept?”, they want a precise API reference. Traditional search treats both queries the same and returns a mixed list of links. That wastes time.
A better approach is to build a Router Agent. Instead of searching everything blindly, the agent first understands intent and then routes the query to the most suitable source.
How the Router Works
The system has three clear responsibilities.
First, a reasoning layer decides what the user is actually asking. Second, a vector database stores documentation in a structured way and retrieves it with high precision. Third, a small set of tools handles specific tasks such as tutorials, API references, or general search.
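Before bringing in any libraries, the three responsibilities can be sketched in plain Python. This is only a rough illustration, not the implementation used later in the post: `classify_intent` is a hypothetical keyword heuristic standing in for the reasoning layer, and the entries in `TOOLS` are stub functions standing in for real retrieval.

```python
from typing import Callable, Dict


def classify_intent(query: str) -> str:
    """Hypothetical stand-in for the reasoning layer: a keyword heuristic."""
    q = query.lower()
    if any(hint in q for hint in ("get started", "how do i", "example", "tutorial")):
        return "tutorial"
    if any(hint in q for hint in ("parameter", "argument", "signature", "api")):
        return "api_reference"
    return "general"


def route(query: str, tools: Dict[str, Callable[[str], str]]) -> str:
    """Send the query to the tool that matches its intent."""
    intent = classify_intent(query)
    handler = tools.get(intent, tools["general"])
    return handler(query)


# Stub tools; in the real system each one would query the vector database.
TOOLS = {
    "tutorial": lambda q: f"[tutorial search] {q}",
    "api_reference": lambda q: f"[api reference lookup] {q}",
    "general": lambda q: f"[general search] {q}",
}
```

A keyword heuristic like this breaks down quickly on real queries, which is why the rest of the post replaces it with a model-driven decision.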
For storage and retrieval, we use Weaviate. The same design works with other vector databases, but the examples below follow the current Weaviate v4 Python SDK.
Before going into the implementation, two resources are useful if you want to follow along:
A free sandbox cluster is enough to get started.
If you need help validating or adjusting the code, you can check it against Weaviate Docs AI here:
https://docs.weaviate.io/weaviate
Setting Up the Weaviate Client
We start by connecting to a Weaviate Cloud cluster. The connection style below matches the v4 SDK. One small but important detail is closing the client when the program finishes, so the connection is released.
import os

import weaviate
from weaviate.classes.init import Auth

# Read the cluster URL and API key from the environment instead of hardcoding them.
client = weaviate.connect_to_weaviate_cloud(
    cluster_url=os.getenv("WEAVIATE_URL"),
    auth_credentials=Auth.api_key(os.getenv("WEAVIATE_API_KEY")),
)
Always remember to close the client when you are done:
client.close()
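One way to make sure `close()` runs even when a query raises is to wrap the client in a context manager. The sketch below uses `contextlib.closing` from the standard library, which works with any object that has a `close()` method. `run_with_client` is a hypothetical helper, and `FakeClient` is a stand-in so the example runs without a cluster.

```python
from contextlib import closing


class FakeClient:
    """Hypothetical stand-in for a Weaviate client; only close() matters here."""

    def __init__(self):
        self.closed = False

    def is_ready(self):
        return not self.closed

    def close(self):
        self.closed = True


def run_with_client(connect, work):
    """Open a client, run work(client), and always close the client afterwards."""
    with closing(connect()) as client:
        return work(client)


# Runs against the stand-in; the client is closed once the call returns.
print(run_with_client(FakeClient, lambda c: c.is_ready()))
```

With a real cluster, `connect` would be a function that calls `weaviate.connect_to_weaviate_cloud(...)` with the arguments shown above.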
Designing the Collection Schema
Documentation should not be dumped as raw text. We add structure so the agent can filter with intent.
Each document has a title, content, language, and a doc_type. The doc_type is especially important because it allows clean routing between tutorials and API references.
from weaviate.classes.config import Configure, Property, DataType

def setup_collection(client):
    if not client.collections.exists("Documentation"):
        client.collections.create(
            name="Documentation",
            vectorizer_config=Configure.Vectorizer.text2vec_weaviate(),
            properties=[
                Property(name="title", data_type=DataType.TEXT),
                Property(name="content", data_type=DataType.TEXT),
                Property(name="language", data_type=DataType.TEXT),
                Property(name="doc_type", data_type=DataType.TEXT),
            ],
        )
Four text properties, with doc_type as the routing key, are enough to get useful results without extra tuning.
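To populate the collection, each page has to become a record with these four properties. The sketch below is one possible way to do that, not a prescribed loader: `ALLOWED_DOC_TYPES`, `make_record`, and `load_docs` are hypothetical names, and "README" is an assumed third label next to the two that the filters in this post rely on. It also assumes the collection object exposes `data.insert_many`, as v4 collections do.

```python
# "Tutorial" and "API Reference" are the labels the filters in this post use;
# "README" is an assumed extra label for illustration.
ALLOWED_DOC_TYPES = {"Tutorial", "API Reference", "README"}


def make_record(title, content, language, doc_type):
    """Build one property dict, rejecting labels the router cannot filter on."""
    if doc_type not in ALLOWED_DOC_TYPES:
        raise ValueError(f"unknown doc_type: {doc_type!r}")
    return {
        "title": title.strip(),
        "content": content.strip(),
        "language": language,
        "doc_type": doc_type,
    }


def load_docs(collection, records):
    """Insert all records in one call and return how many were sent."""
    records = list(records)
    if records:
        collection.data.insert_many(records)
    return len(records)
```

With a real cluster, `collection` would be `client.collections.get("Documentation")`. Validating doc_type at load time matters because a misspelled label silently drops a page out of every filtered search.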
Hybrid Search with Filters
Routing only works if retrieval is accurate. Weaviate’s hybrid search combines keyword matching (BM25) with vector search. This is important for developer docs, where exact function names matter as much as semantic meaning.
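To build intuition for how the two signals are blended, here is a simplified pure-Python sketch of score fusion. It is an illustration only and not Weaviate's actual implementation: each score set is min-max normalised to the range 0 to 1, then combined with a weight `alpha`, where 1 means vector search only and 0 means keyword search only. The function names and sample scores are made up for the example.

```python
def min_max(scores):
    """Scale a {doc_id: score} mapping into the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        # All scores are equal, so treat every document as a full match.
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse(keyword_scores, vector_scores, alpha=0.5):
    """Weighted sum of normalised scores; alpha is the weight of the vector side."""
    kw = min_max(keyword_scores)
    vec = min_max(vector_scores)
    fused = {
        doc_id: alpha * vec.get(doc_id, 0.0) + (1 - alpha) * kw.get(doc_id, 0.0)
        for doc_id in set(kw) | set(vec)
    }
    # Highest fused score first; ties are broken by document id.
    return sorted(fused.items(), key=lambda item: (-item[1], item[0]))


ranking = fuse({"a": 2.0, "b": 1.0}, {"b": 0.9, "c": 0.5}, alpha=0.75)
print(ranking)
```

With alpha at 0.75 the vector side dominates, so document "b" ranks first even though "a" has the best keyword score. For developer docs, a lower alpha gives exact identifiers such as function names more influence.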
We explicitly import the query helpers and make the hybrid behaviour clear using alpha and fusion strategy.
from weaviate.classes.query import Filter, HybridFusion
A tool for finding tutorials looks like this:
def find_examples(topic: str) -> str:
    docs = client.collections.get("Documentation")
    results = docs.query.hybrid(
        query=topic,
        alpha=0.5,
        fusion_type=HybridFusion.RELATIVE_SCORE,
        filters=Filter.by_property("doc_type").equal("Tutorial"),
        limit=3,
    )
    return format_results(results)
Here, format_results is a simple helper that converts Weaviate result objects into readable text for the agent.
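The post does not show `format_results`, so the version below is a guess at what such a helper could look like rather than the author's code. It assumes the result exposes `objects`, each with a `properties` dictionary holding the schema fields; `max_chars` is a hypothetical parameter that keeps long pages from flooding the agent's context. The `SimpleNamespace` objects are stand-ins so the example runs without a cluster.

```python
from types import SimpleNamespace


def format_results(results, max_chars=500):
    """Turn query results into a numbered, readable text block."""
    if not results.objects:
        return "No matching documentation found."
    blocks = []
    for index, obj in enumerate(results.objects, start=1):
        props = obj.properties
        title = props.get("title", "Untitled")
        doc_type = props.get("doc_type", "unknown")
        # Truncate the body so a single long page does not dominate the output.
        content = str(props.get("content", ""))[:max_chars]
        blocks.append(f"{index}. {title} ({doc_type})\n{content}")
    return "\n\n".join(blocks)


# Stand-in for a query response, for illustration only.
fake_results = SimpleNamespace(
    objects=[
        SimpleNamespace(
            properties={
                "title": "Quickstart",
                "doc_type": "Tutorial",
                "content": "Install the client.",
            }
        )
    ]
)
print(format_results(fake_results))
```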
Fetching a precise API reference uses the same pattern with a different filter:
def get_api_reference(api_name: str) -> str:
    docs = client.collections.get("Documentation")
    results = docs.query.hybrid(
        query=api_name,
        alpha=0.5,
        fusion_type=HybridFusion.RELATIVE_SCORE,
        filters=Filter.by_property("doc_type").equal("API Reference"),
        limit=1,
    )
    return (
        results.objects[0].properties["content"]
        if results.objects
        else "Not found"
    )
For general queries that do not clearly fall into one category, a simple hybrid search is enough:
def search_docs(query: str) -> str:
    docs = client.collections.get("Documentation")
    results = docs.query.hybrid(
        query=query,
        limit=5,
    )
    return format_results(results)
The result access pattern follows the v4 SDK: results.objects holds the matches, and each object's properties dictionary holds the stored fields.
Once retrieval is reliable and scoped, the remaining problem is deciding which retrieval path to use for a given query.
Adding the Reasoning Layer with DSPy
The router itself lives in the reasoning layer. DSPy allows you to describe the decision logic as a structured signature instead of fragile prompt text.
The agent examines the query, decides which tool to call, and passes the right input to that tool.
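import dspy

class DocsResponse(dspy.Signature):
    """Choose the best documentation tool for the query, or answer directly."""
    query: str = dspy.InputField()
    tools: str = dspy.InputField()
    response: str = dspy.OutputField()
    tool: str = dspy.OutputField()
    tool_input: str = dspy.OutputField()

class DocsSearchAgent:
    def __init__(self):
        # Adjust the model id to one your provider account can access.
        self.model = dspy.LM("anthropic/claude-3-5-sonnet")
        # Register the model with DSPy; without this the module has no LM to call.
        dspy.configure(lm=self.model)
        self.tools = {
            "find_examples": find_examples,
            "get_api_reference": get_api_reference,
            "search_docs": search_docs,
        }
        # Build the module once instead of on every call.
        self.agent = dspy.ChainOfThought(DocsResponse)

    def ask(self, query: str):
        # Pass the tool names as a plain comma-separated string.
        prediction = self.agent(query=query, tools=", ".join(self.tools))
        if prediction.tool in self.tools:
            return self.tools[prediction.tool](prediction.tool_input)
        return prediction.response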
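Because the model makes the routing decision, there are no hardcoded rules to maintain. Supporting a new kind of documentation means registering one more tool rather than rewriting branching logic.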
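Model output is not always tidy: a tool name can come back with stray whitespace, backticks, or different casing, and an exact dictionary lookup would then miss it. The sketch below shows one hedged way to make the dispatch step more forgiving. `normalize_tool_name` and `dispatch` are hypothetical helpers, and the prediction is assumed to expose `tool`, `tool_input`, and `response` as in the signature above.

```python
from types import SimpleNamespace


def normalize_tool_name(raw):
    """Strip whitespace, quotes, and backticks, then lowercase the name."""
    return str(raw).strip().strip("`'\"").strip().lower()


def dispatch(prediction, tools):
    """Call the predicted tool if it exists, else fall back to the direct answer."""
    name = normalize_tool_name(prediction.tool)
    if name in tools:
        return tools[name](prediction.tool_input)
    return prediction.response


# Stub tools and a stub prediction so the example runs without a model.
stub_tools = {
    "find_examples": lambda topic: f"examples for {topic}",
    "search_docs": lambda query: f"search for {query}",
}
messy = SimpleNamespace(tool=" `Find_Examples` ", tool_input="auth", response="direct answer")
print(dispatch(messy, stub_tools))
```

This only works when the registered tool names are lowercase, as they are in the agent above.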
Tech Stack Used
If you plan to take this beyond a prototype, this is the stack that fits the design.
In production, this setup works best with a clear separation of roles. DSPy handles reasoning and routing. Weaviate handles retrieval with hybrid search and filtering. Claude 3.5 Sonnet is used for reliable tool selection and code-aware reasoning. Embeddings can be swapped without rewriting the pipeline, and the frontend can be as simple as Streamlit or a full developer portal.
The defaults are strong enough that you do not need heavy tuning before this works in a real system.
Why Weaviate Instead of Pinecone or Milvus
I had to make a choice between Weaviate, Pinecone and Milvus, and all of these are capable systems. However, for documentation-heavy use cases, Weaviate felt easier to use end to end. The Python SDK is clear, hybrid search works without extra setup, and the schema model maps well to how documentation is actually structured.
Strong defaults matter in production. With Weaviate, you can reach stable behaviour quickly without spending a lot of time configuring internals. The quality of documentation and support also reduces friction when something goes wrong. That combination makes it practical for teams that want to move fast and still stay reliable.
Final Thoughts
A documentation agent should respect a developer’s time. The goal is not to “chat with docs” but to deliver the right answer with minimal effort. By combining Weaviate’s hybrid search with DSPy’s structured reasoning, you move away from fragile prompt tricks and toward a system that behaves predictably in real usage.
That difference is what separates an experiment from something teams can depend on.