<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Richard Petty</title>
    <description>The latest articles on DEV Community by Richard Petty (@richard_petty_b0d100bd27b).</description>
    <link>https://dev.to/richard_petty_b0d100bd27b</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F3763855%2Ff2f127f1-bf66-48a8-87c0-0e2c49cd1738.jpg</url>
      <title>DEV Community: Richard Petty</title>
      <link>https://dev.to/richard_petty_b0d100bd27b</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/richard_petty_b0d100bd27b"/>
    <language>en</language>
    <item>
      <title>Building a Local AI Research Agent in C# — From Zero to Autonomous Research</title>
      <dc:creator>Richard Petty</dc:creator>
      <pubDate>Tue, 10 Feb 2026 09:32:27 +0000</pubDate>
      <link>https://dev.to/richard_petty_b0d100bd27b/building-a-local-ai-research-agent-in-c-from-zero-to-autonomous-research-3mg4</link>
      <guid>https://dev.to/richard_petty_b0d100bd27b/building-a-local-ai-research-agent-in-c-from-zero-to-autonomous-research-3mg4</guid>
      <description>&lt;p&gt;&lt;em&gt;How I built an AI agent that searches the web, reads pages, and writes research reports — all running on your machine with no cloud API keys required.&lt;/em&gt;&lt;/p&gt;




&lt;p&gt;If you've used ChatGPT or Claude for research, you know the drill: copy-paste URLs, summarize this, compare that. What if your AI could just... do the research itself?&lt;/p&gt;

&lt;p&gt;That's what I built. &lt;strong&gt;Axiom&lt;/strong&gt; is a local AI research agent written in C# that:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;🔍 Generates search queries from your topic&lt;/li&gt;
&lt;li&gt;🌐 Searches the web (via Brave Search API)&lt;/li&gt;
&lt;li&gt;📄 Fetches and reads web pages&lt;/li&gt;
&lt;li&gt;🧠 Analyzes content for relevant findings&lt;/li&gt;
&lt;li&gt;📝 Synthesizes everything into a structured report&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;All running locally with Ollama. No cloud AI APIs. No data leaving your machine.&lt;/p&gt;

&lt;h2&gt;
  
  
  The Architecture
&lt;/h2&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;You: "Research persistent memory systems for AI agents"
  ↓
Axiom generates 5-8 search queries
  ↓
Searches Brave API → finds 10-15 sources
  ↓
Fetches top sources, deduplicates by domain
  ↓
Analyzes each page for relevant findings
  ↓
Synthesizes findings into a structured report
  ↓
Saves report as markdown + stores in memory
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h3&gt;
  
  
  Tech Stack
&lt;/h3&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;C# / .NET 8&lt;/strong&gt; — Fast, typed, great tooling&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Ollama&lt;/strong&gt; — Local LLM inference (llama3.1 8B)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;SQLite&lt;/strong&gt; — Memory storage with semantic search&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Brave Search API&lt;/strong&gt; — Web search (free tier: 2000 queries/month)&lt;/li&gt;
&lt;/ul&gt;

&lt;h2&gt;
  
  
  Why C# Instead of Python?
&lt;/h2&gt;

&lt;p&gt;Everyone builds AI agents in Python, and that's fine. But C# has real advantages:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Better tooling&lt;/strong&gt; — Visual Studio / Rider, strong typing, refactoring&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Easier deployment&lt;/strong&gt; — Single binary, no virtualenv hell&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Performance&lt;/strong&gt; — Faster startup, lower memory&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Underserved market&lt;/strong&gt; — .NET devs want AI tools too&lt;/li&gt;
&lt;/ol&gt;

&lt;p&gt;The AI agent space is dominated by Python frameworks like LangChain and LlamaIndex. There's a real gap for .NET developers.&lt;/p&gt;

&lt;h2&gt;
  
  
  Key Design Decisions
&lt;/h2&gt;

&lt;h3&gt;
  
  
  Tool System
&lt;/h3&gt;

&lt;p&gt;Every capability is an &lt;code&gt;ITool&lt;/code&gt;:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;interface&lt;/span&gt; &lt;span class="nc"&gt;ITool&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Id&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Name&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;Description&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;ParametersSchema&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt; &lt;span class="k"&gt;get&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="p"&gt;}&lt;/span&gt;
    &lt;span class="n"&gt;Task&lt;/span&gt;&lt;span class="p"&gt;&amp;lt;&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt;&lt;span class="p"&gt;&amp;gt;&lt;/span&gt; &lt;span class="nf"&gt;ExecuteAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;parameters&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;CancellationToken&lt;/span&gt; &lt;span class="n"&gt;ct&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
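&lt;p&gt;To make the shape concrete, here's a minimal tool implementing that interface. &lt;code&gt;ClockTool&lt;/code&gt; is a hypothetical example for illustration, not one of Axiom's real tools:&lt;/p&gt;

```csharp
using System;
using System.Threading;
using System.Threading.Tasks;

// The ITool interface from the post, plus a hypothetical ClockTool implementation.
public interface ITool
{
    string Id { get; }
    string Name { get; }
    string Description { get; }
    string ParametersSchema { get; }
    Task<string> ExecuteAsync(string parameters, CancellationToken ct);
}

public sealed class ClockTool : ITool
{
    public string Id => "clock";
    public string Name => "Clock";
    public string Description => "Returns the current UTC time in ISO 8601 format.";
    public string ParametersSchema => "{}";   // takes no parameters

    public Task<string> ExecuteAsync(string parameters, CancellationToken ct)
        => Task.FromResult(DateTime.UtcNow.ToString("O"));   // round-trip ISO 8601
}
```

The `Description` and `ParametersSchema` strings are what the LLM sees when deciding which tool to call, so they matter more than the implementation.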



&lt;p&gt;The LLM decides which tools to call. The agent orchestrator handles the loop:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight plaintext"&gt;&lt;code&gt;User message → LLM → Tool call? → Execute tool → Feed result back → Repeat
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
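&lt;p&gt;In C#, that loop looks roughly like this. The "LLM" here is a stub that requests one tool call and then answers, and tools are plain delegates rather than full &lt;code&gt;ITool&lt;/code&gt; implementations, purely to keep the sketch self-contained and runnable; the real orchestrator calls Ollama instead:&lt;/p&gt;

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Runnable sketch of the orchestrator loop: ask the LLM, execute any tool it
// requests, feed the result back, and repeat until it produces a final answer.
public static class AgentLoop
{
    // Stub LLM: if no tool result is in the transcript yet, request the "echo"
    // tool; otherwise produce a final answer from the last tool result.
    static (string? toolId, string text) Llm(List<string> transcript)
        => transcript.Any(m => m.StartsWith("tool:"))
            ? (null, "final: " + transcript.Last())
            : ("echo", "hello");

    public static string Run(string userMessage, Dictionary<string, Func<string, string>> tools)
    {
        var transcript = new List<string> { "user:" + userMessage };
        while (true)
        {
            var (toolId, text) = Llm(transcript);
            if (toolId is null) return text;     // no tool requested → final answer
            var result = tools[toolId](text);    // execute the requested tool
            transcript.Add("tool:" + result);    // feed the result back and repeat
        }
    }
}
```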



&lt;h3&gt;
  
  
  Memory with Semantic Search
&lt;/h3&gt;

&lt;p&gt;Instead of a vector database (ChromaDB, FAISS), I used SQLite with embeddings stored as BLOBs:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight csharp"&gt;&lt;code&gt;&lt;span class="k"&gt;public&lt;/span&gt; &lt;span class="k"&gt;async&lt;/span&gt; &lt;span class="n"&gt;Task&lt;/span&gt; &lt;span class="nf"&gt;StoreAsync&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;content&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;string&lt;/span&gt; &lt;span class="n"&gt;type&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="kt"&gt;float&lt;/span&gt;&lt;span class="p"&gt;[]&lt;/span&gt; &lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
&lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="c1"&gt;// Store embedding as byte array in SQLite&lt;/span&gt;
    &lt;span class="kt"&gt;var&lt;/span&gt; &lt;span class="n"&gt;blob&lt;/span&gt; &lt;span class="p"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="kt"&gt;byte&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt; &lt;span class="p"&gt;*&lt;/span&gt; &lt;span class="m"&gt;4&lt;/span&gt;&lt;span class="p"&gt;];&lt;/span&gt;
    &lt;span class="n"&gt;Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;BlockCopy&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="n"&gt;embedding&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;blob&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="m"&gt;0&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="n"&gt;blob&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="n"&gt;Length&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="c1"&gt;// INSERT INTO memories (content, type, embedding, timestamp) VALUES (...)&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Cosine similarity search loads the stored embeddings into memory and compares them directly. That works fine for thousands of memories — you don't need a vector DB for personal-scale data.&lt;/p&gt;
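&lt;p&gt;The retrieval side is just the inverse of &lt;code&gt;StoreAsync&lt;/code&gt;: decode each BLOB back into a &lt;code&gt;float[]&lt;/code&gt; and rank by cosine similarity against the query embedding. A minimal sketch (method names are illustrative, not Axiom's actual API):&lt;/p&gt;

```csharp
using System;

// Decode a stored BLOB back into a float[] and score it against a query embedding.
public static class MemorySearch
{
    public static float[] FromBlob(byte[] blob)
    {
        var embedding = new float[blob.Length / 4];          // 4 bytes per float32
        Buffer.BlockCopy(blob, 0, embedding, 0, blob.Length);
        return embedding;
    }

    public static double CosineSimilarity(float[] a, float[] b)
    {
        double dot = 0, normA = 0, normB = 0;
        for (int i = 0; i < a.Length; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));  // 1.0 = identical direction
    }
}
```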

&lt;h3&gt;
  
  
  Research Runner
&lt;/h3&gt;

&lt;p&gt;The autonomous research mode (&lt;code&gt;ResearchRunner&lt;/code&gt;) orchestrates the full pipeline:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Query generation&lt;/strong&gt; — Ask the LLM to generate diverse search queries&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Search&lt;/strong&gt; — Hit Brave API with each query, collect URLs&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Dedup&lt;/strong&gt; — Remove duplicate domains (max 2 per domain)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Fetch&lt;/strong&gt; — Download and extract text from top sources&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Analyze&lt;/strong&gt; — Ask the LLM to extract relevant findings from each page&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Synthesize&lt;/strong&gt; — Combine all findings into a structured report&lt;/li&gt;
&lt;/ol&gt;
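&lt;p&gt;The dedup step (3) is essentially one LINQ expression. This is an illustrative sketch, not the exact Axiom code:&lt;/p&gt;

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Sketch of the dedup step: keep at most two URLs per domain
// (order within each domain is preserved).
public static class SourceDedup
{
    public static List<string> CapPerDomain(IEnumerable<string> urls, int maxPerDomain = 2)
        => urls.GroupBy(u => new Uri(u).Host)           // group results by domain
               .SelectMany(g => g.Take(maxPerDomain))   // keep the first N from each
               .ToList();
}
```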

&lt;p&gt;The whole thing runs in ~15 minutes on a Ryzen 5 5500 with CPU-only inference (8B model).&lt;/p&gt;

&lt;h2&gt;
  
  
  What I Learned
&lt;/h2&gt;

&lt;p&gt;&lt;strong&gt;Small models need guardrails.&lt;/strong&gt; The 3B model was unreliable for tool calling — it would generate malformed JSON or call non-existent tools. The 8B model is dramatically better. Still not perfect, but usable.&lt;/p&gt;

&lt;p&gt;&lt;strong&gt;Truncation matters.&lt;/strong&gt; When synthesizing 8+ findings, the total text can exceed the model's context window. I added per-finding truncation (1500 chars) and a total cap (12K chars). Without this, the model either hallucinates or returns empty responses.&lt;/p&gt;
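&lt;p&gt;The guard is simple to sketch. The 1,500 and 12,000 character limits come from the numbers above; the method itself is illustrative:&lt;/p&gt;

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Sketch of the truncation guardrail: clip each finding, then stop appending
// before the combined synthesis input would exceed the total cap.
public static class Truncation
{
    public static string BuildSynthesisInput(IEnumerable<string> findings,
                                             int perFinding = 1500, int totalCap = 12_000)
    {
        var sb = new StringBuilder();
        foreach (var finding in findings)
        {
            var clipped = finding.Length > perFinding ? finding[..perFinding] : finding;
            // Stop before blowing the context window (newline included in the budget).
            if (sb.Length + clipped.Length + Environment.NewLine.Length > totalCap) break;
            sb.AppendLine(clipped);
        }
        return sb.ToString();
    }
}
```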

&lt;p&gt;&lt;strong&gt;Research quality scales with sources.&lt;/strong&gt; More search queries → more diverse sources → better findings → better synthesis. I settled on 5-8 queries per topic as a sweet spot.&lt;/p&gt;

&lt;h2&gt;
  
  
  Try It Yourself
&lt;/h2&gt;

&lt;p&gt;The full source code is on GitHub:&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://github.com/DynamicCSharp/hex-dynamics" rel="noopener noreferrer"&gt;DynamicCSharp/hex-dynamics&lt;/a&gt;&lt;/strong&gt; — Axiom Research Agent&lt;/p&gt;

&lt;p&gt;Or start simpler with our starter kit:&lt;/p&gt;

&lt;p&gt;🔗 &lt;strong&gt;&lt;a href="https://github.com/DynamicCSharp/agentkit" rel="noopener noreferrer"&gt;DynamicCSharp/agentkit&lt;/a&gt;&lt;/strong&gt; — Build your own AI agent in C#&lt;/p&gt;

&lt;h3&gt;
  
  
  Quick Start
&lt;/h3&gt;



&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight shell"&gt;&lt;code&gt;git clone https://github.com/DynamicCSharp/hex-dynamics.git
&lt;span class="nb"&gt;cd &lt;/span&gt;hex-dynamics
&lt;span class="c"&gt;# Make sure Ollama is running with llama3.1:8b&lt;/span&gt;
dotnet run &lt;span class="nt"&gt;--project&lt;/span&gt; src/Axiom.CLI
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;h2&gt;
  
  
  What's Next
&lt;/h2&gt;

&lt;ul&gt;
&lt;li&gt;
&lt;strong&gt;Web UI&lt;/strong&gt; for dispatching research from a browser (already built, included in repo)&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Sub-agent spawning&lt;/strong&gt; — let the research agent delegate sub-tasks&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Better models&lt;/strong&gt; — Testing with Mistral, Phi-3, and Qwen2.5 as they improve&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Memory across sessions&lt;/strong&gt; — Persistent knowledge that builds over time&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;Multi-model pipelines&lt;/strong&gt; — Use fast models for extraction, smart models for synthesis&lt;/li&gt;
&lt;/ul&gt;




&lt;p&gt;&lt;em&gt;Built by &lt;a href="https://github.com/DynamicCSharp" rel="noopener noreferrer"&gt;Hex Dynamics&lt;/a&gt; — we're building AI tools for developers who want to run everything locally.&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;&lt;em&gt;If this is useful, give us a ⭐ on GitHub. It helps more than you'd think.&lt;/em&gt;&lt;/p&gt;

</description>
      <category>ai</category>
      <category>csharp</category>
      <category>dotnet</category>
      <category>opensource</category>
    </item>
  </channel>
</rss>
