How I built an AI agent that searches the web, reads pages, and writes research reports — all running on your machine with no cloud API keys required.
If you've used ChatGPT or Claude for research, you know the drill: copy-paste URLs, summarize this, compare that. What if your AI could just... do the research itself?
That's what I built. Axiom is a local AI research agent written in C# that:
- 🔍 Generates search queries from your topic
- 🌐 Searches the web (via Brave Search API)
- 📄 Fetches and reads web pages
- 🧠 Analyzes content for relevant findings
- 📝 Synthesizes everything into a structured report
All running locally with Ollama. No cloud AI APIs. No data leaving your machine.
## The Architecture

```
You: "Research persistent memory systems for AI agents"
        ↓
Axiom generates 5-8 search queries
        ↓
Searches Brave API → finds 10-15 sources
        ↓
Fetches top sources, deduplicates by domain
        ↓
Analyzes each page for relevant findings
        ↓
Synthesizes findings into a structured report
        ↓
Saves report as markdown + stores in memory
```
## Tech Stack
- C# / .NET 8 — Fast, typed, great tooling
- Ollama — Local LLM inference (llama3.1 8B)
- SQLite — Memory storage with semantic search
- Brave Search API — Web search (free tier: 2000 queries/month)
## Why C# Instead of Python?
Everyone builds AI agents in Python. That's fine. But C#:
- Better tooling — Visual Studio / Rider, strong typing, refactoring
- Easier deployment — Single binary, no virtualenv hell
- Performance — Faster startup, lower memory
- Underserved market — .NET devs want AI tools too
The AI agent space is dominated by LangChain and LlamaIndex, both Python-first. There's a real gap for .NET developers.
## Key Design Decisions

### Tool System

Every capability implements an `ITool`:
```csharp
public interface ITool
{
    string Id { get; }
    string Name { get; }
    string Description { get; }
    string ParametersSchema { get; }
    Task<string> ExecuteAsync(string parameters, CancellationToken ct);
}
```
The LLM decides which tools to call. The agent orchestrator handles the loop:

```
User message → LLM → Tool call? → Execute tool → Feed result back → Repeat
```
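That loop is compact enough to sketch end to end. Below is an illustrative, self-contained version: the tool contract is trimmed to two members, and the "LLM" is a stand-in delegate rather than Axiom's actual Ollama client, so the control flow can be exercised without a model.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

// Trimmed tool contract for the sketch; Axiom's ITool carries more metadata.
public interface ISketchTool
{
    string Id { get; }
    Task<string> ExecuteAsync(string parameters, CancellationToken ct);
}

public sealed class EchoTool : ISketchTool
{
    public string Id => "echo";
    public Task<string> ExecuteAsync(string p, CancellationToken ct) =>
        Task.FromResult($"echo: {p}");
}

public sealed class AgentLoop
{
    private readonly Dictionary<string, ISketchTool> _tools = new();
    // Stand-in for the LLM: given the transcript so far, return either
    // (toolId, parameters) to request a tool call, or (null, finalAnswer).
    private readonly Func<IReadOnlyList<string>, (string? ToolId, string Text)> _llm;

    public AgentLoop(IEnumerable<ISketchTool> tools,
                     Func<IReadOnlyList<string>, (string?, string)> llm)
    {
        foreach (var t in tools) _tools[t.Id] = t;
        _llm = llm;
    }

    public async Task<string> RunAsync(string userMessage, CancellationToken ct = default)
    {
        var history = new List<string> { $"user: {userMessage}" };
        while (true)
        {
            var (toolId, text) = _llm(history);
            if (toolId is null) return text;                  // no tool call: final answer
            var result = await _tools[toolId].ExecuteAsync(text, ct);
            history.Add($"tool[{toolId}]: {result}");         // feed the result back
        }
    }
}
```

Swapping the delegate for a real chat call against the local Ollama server is the only structural change the real agent needs.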
### Memory with Semantic Search
Instead of a vector database (ChromaDB, FAISS), I used SQLite with embeddings stored as BLOBs:
```csharp
public async Task StoreAsync(string content, string type, float[] embedding)
{
    // Store the embedding as a raw byte array (BLOB) in SQLite
    var blob = new byte[embedding.Length * sizeof(float)];
    Buffer.BlockCopy(embedding, 0, blob, 0, blob.Length);
    // INSERT INTO memories (content, type, embedding, timestamp) VALUES (...)
}
```
Cosine similarity search loads the embeddings into memory and scans them linearly. That works great for thousands of memories — you don't need a vector DB for personal-scale data.
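Retrieval is then just decoding each BLOB back into floats and ranking by cosine similarity. A minimal sketch (the method names here are mine, not necessarily Axiom's):

```csharp
using System;

public static class VectorSearch
{
    // Decode a SQLite BLOB column back into the float[] it was stored from.
    public static float[] FromBlob(byte[] blob)
    {
        var v = new float[blob.Length / sizeof(float)];
        Buffer.BlockCopy(blob, 0, v, 0, blob.Length);
        return v;
    }

    // Plain cosine similarity; the epsilon guards against zero-length vectors.
    public static double Cosine(float[] a, float[] b)
    {
        double dot = 0, na = 0, nb = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            na  += a[i] * a[i];
            nb  += b[i] * b[i];
        }
        return dot / (Math.Sqrt(na) * Math.Sqrt(nb) + 1e-12);
    }
}
```

Score every row's decoded embedding against the query embedding and take the top N; at a few thousand rows the scan is effectively instant.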
### Research Runner
The autonomous research mode (ResearchRunner) orchestrates the full pipeline:
1. Query generation — Ask the LLM to generate diverse search queries
2. Search — Hit Brave API with each query, collect URLs
3. Dedup — Remove duplicate domains (max 2 per domain)
4. Fetch — Download and extract text from top sources
5. Analyze — Ask the LLM to extract relevant findings from each page
6. Synthesize — Combine all findings into a structured report
The whole thing runs in ~15 minutes on a Ryzen 5 5500 with CPU-only inference (8B model).
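The dedup step is small enough to show whole. A sketch of the idea, capping each host at two URLs while preserving search-rank order (`SourceDedup` is an illustrative name, not Axiom's):

```csharp
using System;
using System.Collections.Generic;

public static class SourceDedup
{
    // Keep at most maxPerDomain URLs per host, preserving search-rank order.
    public static List<Uri> Dedup(IEnumerable<Uri> urls, int maxPerDomain = 2)
    {
        var seen = new Dictionary<string, int>(StringComparer.OrdinalIgnoreCase);
        var kept = new List<Uri>();
        foreach (var url in urls)
        {
            seen.TryGetValue(url.Host, out int count);
            if (count >= maxPerDomain) continue;   // this domain already has its quota
            seen[url.Host] = count + 1;
            kept.Add(url);
        }
        return kept;
    }
}
```

Capping rather than hard-deduplicating lets two genuinely different pages from the same site through while stopping any one domain from dominating the source list.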
## What I Learned
Small models need guardrails. The 3B model was unreliable for tool calling — it would generate malformed JSON or call non-existent tools. The 8B model is dramatically better. Still not perfect, but usable.
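One cheap guardrail that covers both failure modes: parse the model's tool-call output defensively and reject unknown tool IDs before executing anything. A sketch using `System.Text.Json`; the `{ "tool": ..., "parameters": ... }` shape is an assumption for illustration, not necessarily Axiom's wire format.

```csharp
using System;
using System.Collections.Generic;
using System.Text.Json;

public static class ToolCallGuard
{
    // Returns (toolId, args) if the model produced well-formed JSON naming a
    // known tool; null otherwise (malformed JSON, missing fields, hallucinated tool).
    public static (string ToolId, JsonElement Args)? TryParse(string raw, ISet<string> knownToolIds)
    {
        try
        {
            using var doc = JsonDocument.Parse(raw);
            var root = doc.RootElement;
            if (root.ValueKind != JsonValueKind.Object) return null;
            if (!root.TryGetProperty("tool", out var tool)) return null;
            if (tool.ValueKind != JsonValueKind.String) return null;
            var id = tool.GetString();
            if (id is null || !knownToolIds.Contains(id)) return null;  // hallucinated tool
            if (!root.TryGetProperty("parameters", out var args)) return null;
            return (id, args.Clone());   // Clone() so the args outlive the disposed document
        }
        catch (JsonException)
        {
            return null;                 // malformed JSON: caller can re-prompt the model
        }
    }
}
```

On a `null` result the agent can re-prompt the model with the validation failure instead of crashing mid-run.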
Truncation matters. When synthesizing 8+ findings, the total text can exceed the model's context window. I added per-finding truncation (1500 chars) and a total cap (12K chars). Without this, the model either hallucinates or returns empty responses.
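The budgeting itself is only a few lines. A sketch using the numbers above (1,500 chars per finding, 12K total); dropping whole findings once the budget is spent is one policy, and trimming the last finding to fit is another:

```csharp
using System.Collections.Generic;

public static class FindingBudget
{
    // Clip each finding to perFinding chars, then stop adding findings once the
    // running total would exceed totalCap.
    public static List<string> Truncate(IEnumerable<string> findings,
                                        int perFinding = 1500, int totalCap = 12_000)
    {
        var kept = new List<string>();
        int used = 0;
        foreach (var f in findings)
        {
            var clipped = f.Length <= perFinding ? f : f[..perFinding];
            if (used + clipped.Length > totalCap) break;   // budget spent
            kept.Add(clipped);
            used += clipped.Length;
        }
        return kept;
    }
}
```

A character budget is a crude proxy for tokens, but it is model-agnostic and keeps the synthesis prompt comfortably inside an 8B model's context window.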
Research quality scales with sources. More search queries → more diverse sources → better findings → better synthesis. I settled on 5-8 queries per topic as a sweet spot.
## Try It Yourself
The full source code is on GitHub:
🔗 DynamicCSharp/hex-dynamics — Axiom Research Agent
Or start simpler with our starter kit:
🔗 DynamicCSharp/agentkit — Build your own AI agent in C#
### Quick Start
```bash
git clone https://github.com/DynamicCSharp/hex-dynamics.git
cd hex-dynamics

# Make sure Ollama is running with llama3.1:8b
# ollama pull llama3.1:8b
dotnet run --project src/Axiom.CLI
```
## What's Next
- Web UI for dispatching research from a browser (already built, included in repo)
- Sub-agent spawning — let the research agent delegate sub-tasks
- Better models — Testing with Mistral, Phi-3, and Qwen2.5 as they improve
- Memory across sessions — Persistent knowledge that builds over time
- Multi-model pipelines — Use fast models for extraction, smart models for synthesis
Built by Hex Dynamics — we're building AI tools for developers who want to run everything locally.
If this is useful, give us a ⭐ on GitHub. It helps more than you'd think.