Introduction
Happy coding! Today I want to share a practical AI tutorial in .NET style, with real code, simple architecture, and a result you can run on your own machine.
When many developers start building AI knowledge assistants, the first idea is usually RAG. That is a good choice in many cases, but not always the simplest one. Sometimes, your knowledge base is actually small, focused, and stable. In that lovely situation, building a small LLM Wiki can be a more elegant choice.
Let me tell you a short, lovely story.
Imagine a small tea house run by a kind grandmother. Every morning, her grandchildren ask the same questions:
- Which tea is good for a cold day?
- Which dessert has nuts?
- What is the house special today?
Now imagine two ways to help them.
- The first way is to hire a fast librarian who runs to a giant archive room every time a child asks a question. That is a bit like RAG.
- The second way is to write one beautiful, well-organized family handbook and keep it on the table all the time. That is a bit like an LLM Wiki.
If the tea house menu is small and stable, the handbook is often faster, simpler, and more reliable. But if the tea house becomes a huge restaurant chain with thousands of daily updates, you will eventually want the librarian too.
That is the heart of LLM Wiki vs RAG.
In this tutorial, we will build an open-source local LLM Wiki demo in C# using:
- .NET 8
- Ollama
- OllamaSharp
- a local markdown-based wiki structure
- the `kimi-k2.6:cloud` Ollama model
By the end, you will have:
- a console-based demo app
- a `LocalWiki` folder with markdown files
- document ingestion logic
- index maintenance logic
- wiki question-answering logic
- a clear understanding of when to use a wiki instead of RAG
What We Are Building
We are building a small local AI workflow like this:
- Prepare a source document.
- Send that document to an LLM through Ollama.
- Ask the LLM to return structured wiki content.
- Save the result as markdown in a local wiki folder.
- Update `Index.md` automatically.
- Ask questions against the wiki content.
- Let the model answer using the wiki instead of the raw source file.
This is not a full vector database pipeline. It is intentionally lighter.
LLM Wiki vs RAG
Before we code, let us understand the design choice.
What is RAG?
Retrieval-Augmented Generation (RAG) improves LLM output by retrieving relevant information from external sources before generating a response. Instead of depending only on training data, the model gets fresh context from documents, databases, or web content.
Common characteristics of RAG:
- dynamic retrieval at question time
- usually based on chunking + embeddings + vector search
- good for large and changing knowledge bases
- better for source attribution and enterprise-scale search
- more infrastructure and tuning effort
What is an LLM Wiki?
An LLM Wiki is a curated, structured markdown knowledge base written so the model can reason over it directly. Rather than retrieving many chunks at runtime, you keep important domain knowledge in concise markdown files and pass those files into the model context.
Common characteristics of an LLM Wiki:
- markdown-first knowledge organization
- compact, human-readable, and model-friendly
- great for small and stable domains
- simple to maintain for focused use cases
- low setup cost compared with a full RAG pipeline
When should you use a Wiki instead of RAG?
A wiki is a strong choice when:
- your knowledge base is relatively small
- your content changes slowly
- you want to ship fast
- you prefer markdown and file-based maintenance
- you want fewer moving parts
- you are building a demo, prototype, internal assistant, or focused domain tool
RAG is usually better when:
- you have a very large knowledge base
- documents change frequently
- you need deep retrieval across many domains
- you need richer source attribution and search behavior
- your content does not fit comfortably into context
A practical rule of thumb
Start with an LLM Wiki if your knowledge is:
- focused
- stable
- not too large
- curated by humans or AI into clean markdown
Move to RAG when your knowledge becomes:
- too large
- too dynamic
- too fragmented
- too cross-domain
And yes, a hybrid approach is often best:
- use a wiki for stable core knowledge
- use RAG for large or fast-changing content
References and background reading
The ideas in this tutorial align with public explanations of RAG and LLM Wiki tradeoffs, including:
- Wikipedia on Retrieval-Augmented Generation
- AWS explanation of RAG and external knowledge grounding
- MindStudio and 99helpers comparisons of LLM Wiki vs RAG for smaller knowledge bases
Prerequisites
Before we begin, make sure you have the following:
- Visual Studio or another C# IDE
- the .NET 8 SDK
- Ollama installed locally
- the Kimi model available through Ollama
- a Windows terminal or PowerShell
To use the model from this demo, run:
ollama run kimi-k2.6:cloud
If Ollama is already running, that is enough for the demo.
Step 1: Create the Console App
If you want to build this from scratch in a new workspace, start with:
dotnet new console -n LLMWikiDemo
cd LLMWikiDemo
In this project, the feature was added into an existing .NET 8 console app, but the same idea applies to a fresh app.
Step 2: Add the NuGet Package
Add OllamaSharp so your C# application can talk to Ollama:
dotnet add package OllamaSharp
In this workspace, the project file already contains:
<PackageReference Include="OllamaSharp" Version="5.4.25" />
Step 3: Define the Architecture
To keep the system architecture simple, we use these parts:
- `Program.cs` as the entry point
- `LlmWikiManager.cs` as the core service
- `LocalWiki/` as the generated markdown wiki
- `SCHEMA.md` to define the wiki rules
- `Index.md` as the article catalog
- `Articles/` for wiki pages
The flow is:
Source Document
-> LlmWikiManager.IngestDocumentAsync(...)
-> Ollama / Kimi model
-> JSON-like structured response
-> Markdown article
-> Index.md update
-> QueryWikiAsync(...)
-> Answer from wiki context
Step 4: Create the Wiki Manager
The main logic lives in `Utils/LlmWikiManager.cs`.
This class is responsible for:
- creating the wiki folders
- creating `SCHEMA.md` and `Index.md`
- sending prompts to Ollama
- parsing the model response
- writing article markdown
- reading article markdown back for Q&A
The manager constructor
We configure the Ollama endpoint and default model here:
public LlmWikiManager(
string ollamaApiUrl = "http://localhost:11434",
string modelName = "kimi-k2.6:cloud",
string? wikiRootDirectory = null)
{
_ollamaClient = new OllamaApiClient(new Uri(ollamaApiUrl))
{
SelectedModel = modelName
};
_wikiDirectory = wikiRootDirectory ?? Path.Combine(AppContext.BaseDirectory, "LocalWiki");
_schemaFilePath = Path.Combine(_wikiDirectory, "SCHEMA.md");
_indexFilePath = Path.Combine(_wikiDirectory, "Index.md");
_articlesDirectory = Path.Combine(_wikiDirectory, "Articles");
}
This means the wiki is generated near the app runtime folder, and the model defaults to kimi-k2.6:cloud.
Step 5: Bootstrap the Wiki Structure
We want the app to create the wiki structure automatically if it does not exist.
public async Task EnsureWikiStructureAsync()
{
Directory.CreateDirectory(_wikiDirectory);
Directory.CreateDirectory(_articlesDirectory);
if (!File.Exists(_schemaFilePath))
{
await File.WriteAllTextAsync(_schemaFilePath, DefaultSchema);
}
if (!File.Exists(_indexFilePath))
{
var initialIndex = "# Local Wiki Index" + Environment.NewLine + Environment.NewLine;
await File.WriteAllTextAsync(_indexFilePath, initialIndex);
}
}
This is one of the reasons the wiki approach feels nice. It is file-based, easy to inspect, and easy to explain.
Step 6: Stream Text from Ollama
OllamaSharp returns generation results as an async stream, so we collect the response text like this:
private async Task<string> GenerateTextAsync(string prompt)
{
var builder = new StringBuilder();
await foreach (var chunk in _ollamaClient.GenerateAsync(prompt))
{
if (!string.IsNullOrWhiteSpace(chunk?.Response))
{
builder.Append(chunk.Response);
}
}
return builder.ToString();
}
This is a small but important detail. If you try to await the sequence directly, the build will fail. You must use await foreach.
Step 7: Ingest a Source Document
Now comes the most interesting part.
When a source document arrives, we:
- read the file
- read the schema
- ask the model to return structured wiki data
- parse the result
- create a markdown article
- update `Index.md`
The ingestion prompt
The ingestion prompt in LlmWikiManager.cs looks like this:
var ingestionPrompt = $"""
You are an AI assistant maintaining a markdown wiki.
Schema:
---
{schemaContent}
---
Process this source document and respond with ONLY one JSON object using these keys:
- summaryTitle: concise title for the article
- summaryFilename: kebab-case markdown filename
- summaryContent: full markdown article body
- indexUpdate: one markdown bullet linking article from Index.md
Source filename: {sourceFileName}
Source document:
---
{sourceContent}
---
""";
This is a practical pattern:
- tell the model who it is
- give it the wiki rules
- give it the source content
- enforce a structured output format
The ingestion method
A simplified view of the ingestion code:
public async Task<LlmWikiIngestionResult> IngestDocumentAsync(string sourceFilePath)
{
await EnsureWikiStructureAsync();
var sourceContent = await File.ReadAllTextAsync(sourceFilePath);
var schemaContent = await File.ReadAllTextAsync(_schemaFilePath);
var sourceFileName = Path.GetFileName(sourceFilePath);
// ingestionPrompt is built from schemaContent and sourceContent, as shown above
var responseText = await GenerateTextAsync(ingestionPrompt);
var parsed = TryParseIngestionPayload(responseText);
var summaryTitle = !string.IsNullOrWhiteSpace(parsed?.SummaryTitle)
? parsed!.SummaryTitle!
: $"Summary of {Path.GetFileNameWithoutExtension(sourceFileName)}";
var summaryFilename = NormalizeSummaryFilename(parsed?.SummaryFilename ?? string.Empty, summaryTitle);
var summaryContent = !string.IsNullOrWhiteSpace(parsed?.SummaryContent)
? parsed!.SummaryContent!
: BuildFallbackSummaryContent(sourceFileName, sourceContent);
var summaryFilePath = Path.Combine(_articlesDirectory, summaryFilename);
await File.WriteAllTextAsync(summaryFilePath, summaryContent);
var normalizedRelativeArticlePath = $"Articles/{summaryFilename}";
var indexUpdate = !string.IsNullOrWhiteSpace(parsed?.IndexUpdate)
? parsed!.IndexUpdate!.Trim()
: $"- [{summaryTitle}]({normalizedRelativeArticlePath})";
// append indexUpdate to Index.md if it is not already present,
// then return an LlmWikiIngestionResult describing the new article
}
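The final step, appending to `Index.md` only when the bullet is not already there, can be sketched like this. The `IndexHelper` class and method name are illustrative, not the exact helper from the repo:

```csharp
using System;
using System.IO;
using System.Threading.Tasks;

public static class IndexHelper
{
    // Sketch: append an index bullet only when Index.md does not already
    // contain it, so re-ingesting the same document never duplicates entries.
    public static async Task AppendIndexEntryAsync(string indexFilePath, string indexUpdate)
    {
        var indexContent = File.Exists(indexFilePath)
            ? await File.ReadAllTextAsync(indexFilePath)
            : string.Empty;

        if (!indexContent.Contains(indexUpdate, StringComparison.Ordinal))
        {
            await File.AppendAllTextAsync(indexFilePath, indexUpdate + Environment.NewLine);
        }
    }
}
```

A simple exact-match check is enough here because the bullet text is deterministic once the filename is normalized.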
Why fallback logic matters
Real LLMs are helpful, but sometimes they do not follow the JSON contract perfectly.
That is why the demo includes:
- JSON extraction from mixed text
- deserialization attempts
- fallback summary generation
- filename normalization
- index deduplication
This makes the POC stronger and more production-minded than a fragile one-shot prompt demo.
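`NormalizeSummaryFilename` is referenced above but not shown. A minimal version, assuming the schema's kebab-case-plus-`.md` rule, might look like this (the class name and exact regex are my own sketch, not the repo's code):

```csharp
using System;
using System.Text.RegularExpressions;

public static class WikiFilenames
{
    // Sketch: normalize a model-suggested filename into kebab-case with a
    // ".md" suffix, falling back to the article title when the suggestion
    // is empty or whitespace.
    public static string Normalize(string? suggested, string fallbackTitle)
    {
        var candidate = string.IsNullOrWhiteSpace(suggested) ? fallbackTitle : suggested;

        // Drop an existing ".md" so it is not mangled by the kebab-casing below.
        if (candidate.EndsWith(".md", StringComparison.OrdinalIgnoreCase))
        {
            candidate = candidate[..^3];
        }

        // Lowercase, then collapse runs of non-alphanumerics into single dashes.
        var kebab = Regex.Replace(candidate.ToLowerInvariant(), "[^a-z0-9]+", "-").Trim('-');

        return (kebab.Length == 0 ? "untitled-article" : kebab) + ".md";
    }
}
```

For example, `Normalize("C# For AI!", ...)` yields `c-for-ai.md`, and an already-clean suggestion passes through unchanged.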
Step 8: Parse the LLM Response Safely
Structured AI output is never perfect, so the code extracts the JSON object defensively.
private static string? ExtractJsonObject(string text)
{
if (string.IsNullOrWhiteSpace(text))
{
return null;
}
var start = text.IndexOf('{');
var end = text.LastIndexOf('}');
if (start < 0 || end <= start)
{
return null;
}
return text[start..(end + 1)];
}
Then we try to deserialize it:
private static LlmWikiIngestionPayload? TryParseIngestionPayload(string responseText)
{
var rawJson = ExtractJsonObject(responseText);
if (rawJson is null)
{
return null;
}
try
{
return JsonSerializer.Deserialize<LlmWikiIngestionPayload>(rawJson, new JsonSerializerOptions
{
PropertyNameCaseInsensitive = true
});
}
catch
{
return null;
}
}
This is a great lesson for readers: LLM output should be handled like user input. Trust it carefully, never blindly.
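For completeness, the payload type being deserialized can be a plain DTO with nullable properties, so a missing key simply comes back as `null` and the fallback logic above takes over. This is my sketch of the shape implied by the JSON keys in the ingestion prompt:

```csharp
using System.Text.Json;

// Sketch of the ingestion payload DTO. All properties are nullable so an
// absent key deserializes to null instead of throwing.
public sealed class LlmWikiIngestionPayload
{
    public string? SummaryTitle { get; set; }
    public string? SummaryFilename { get; set; }
    public string? SummaryContent { get; set; }
    public string? IndexUpdate { get; set; }
}
```

Combined with `PropertyNameCaseInsensitive = true`, this tolerates the model emitting `summaryTitle`, `SummaryTitle`, or any other casing.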
Step 9: Query the Wiki
Once the wiki exists, we can query it.
The app reads:
- `SCHEMA.md`
- `Index.md`
- the latest markdown articles
Then it builds a prompt for the model.
public async Task<string> QueryWikiAsync(string question)
{
await EnsureWikiStructureAsync();
var schemaContent = await File.ReadAllTextAsync(_schemaFilePath);
var indexContent = await File.ReadAllTextAsync(_indexFilePath);
var articleContext = await BuildArticleContextAsync(maxArticles: 6, maxCharsPerArticle: 2400);
var queryPrompt = $"""
You are an AI assistant answering questions from a local markdown wiki.
Rules:
- Use only the provided wiki context.
- If information is insufficient, explicitly say so.
- Keep response concise.
- Cite article filenames used as sources.
Schema:
---
{schemaContent}
---
Index:
---
{indexContent}
---
Article context:
---
{articleContext}
---
Question: {question}
""";
var answer = (await GenerateTextAsync(queryPrompt)).Trim();
if (string.IsNullOrWhiteSpace(answer))
{
return "Insufficient information in the wiki context to answer this question.";
}
return answer;
}
This is a wiki-style answer flow, not a vector retrieval flow.
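`BuildArticleContextAsync` is not shown above. One simple interpretation, assuming "newest N articles, truncated per article" (the static helper shape here is my sketch, not the repo's private method), is:

```csharp
using System;
using System.IO;
using System.Linq;
using System.Text;
using System.Threading.Tasks;

public static class WikiContext
{
    // Sketch: take the most recently written markdown articles and truncate
    // each one, so the combined context stays within a predictable size.
    public static async Task<string> BuildAsync(
        string articlesDirectory, int maxArticles, int maxCharsPerArticle)
    {
        var builder = new StringBuilder();
        var articleFiles = Directory.GetFiles(articlesDirectory, "*.md")
            .OrderByDescending(File.GetLastWriteTimeUtc)
            .Take(maxArticles);

        foreach (var file in articleFiles)
        {
            var content = await File.ReadAllTextAsync(file);
            if (content.Length > maxCharsPerArticle)
            {
                content = content[..maxCharsPerArticle];
            }

            // Label each article with its filename so the model can cite sources.
            builder.AppendLine($"## {Path.GetFileName(file)}");
            builder.AppendLine(content);
            builder.AppendLine();
        }

        return builder.ToString();
    }
}
```

Recency is a crude ranking signal, but for a small wiki it keeps the context bounded without any embedding infrastructure.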
Step 10: Wire Everything in Program.cs
The entry point demonstrates the entire POC.
using MyPlaygroundApp.Utils;
class Program
{
static async Task Main(string[] args)
{
Console.WriteLine("Open-Source LLM Wiki Demo (.NET + Ollama)");
Console.WriteLine("------------------------------------------");
Console.WriteLine();
var wikiManager = new LlmWikiManager(modelName: "kimi-k2.6:cloud");
await wikiManager.EnsureWikiStructureAsync();
var demoInputDirectory = Path.Combine(AppContext.BaseDirectory, "DemoInput");
Directory.CreateDirectory(demoInputDirectory);
var sourceDocumentPath = Path.Combine(demoInputDirectory, "csharp-ai-benefits.txt");
var sourceDocumentContent = """
C# is effective for AI solution development due to strong typing, mature tooling, and excellent package management.
The .NET ecosystem enables production-ready services, background workers, APIs, and cloud integration.
Developers can integrate local LLMs using Ollama and open-source libraries such as OllamaSharp and Semantic Kernel.
This approach is useful for private knowledge workflows like a local markdown wiki with ingestion and Q&A.
""";
await File.WriteAllTextAsync(sourceDocumentPath, sourceDocumentContent);
Console.WriteLine("Ingesting source document into local wiki...");
var ingestionResult = await wikiManager.IngestDocumentAsync(sourceDocumentPath);
Console.WriteLine($"Article created: {ingestionResult.SummaryFilename}");
Console.WriteLine($"Wiki location: {wikiManager.WikiDirectory}");
Console.WriteLine();
const string question = "What are the benefits of using C# for local LLM wiki development?";
Console.WriteLine($"Question: {question}");
var answer = await wikiManager.QueryWikiAsync(question);
Console.WriteLine();
Console.WriteLine("Answer:");
Console.WriteLine(answer);
Console.WriteLine();
Console.WriteLine("Demo complete.");
}
}
This is excellent for a tutorial because readers can run it and immediately see the full loop.
Step 11: Run the Demo
From the solution root, run:
dotnet run --project .\MyPlaygroundApp\MyPlaygroundApp.csproj
If Ollama is available and kimi-k2.6:cloud is usable, the demo should:
- create the wiki structure
- ingest the sample source document
- write a markdown article
- update the index
- answer a question from the wiki context
Step 12: Show the Generated Wiki Files
These example results are perfect to show your readers.
Index.md
# Local Wiki Index
- [C# for AI Development](Articles/csharp-ai-development.md)
SCHEMA.md
# LLM Wiki Schema
## Goal
Maintain a local markdown wiki from source documents.
## Ingestion Workflow
1. Read new source document.
2. Summarize key points with technical accuracy.
3. Create or update one markdown article in `Articles/`.
4. Ensure `Index.md` contains one bullet link per article.
5. Keep filenames kebab-case and `.md`.
## Query Workflow
1. Use only article/index content provided as context.
2. Answer concisely and cite article filenames used.
3. If context is insufficient, say information is insufficient.
csharp-ai-development.md
# C# for AI Development
C# is a strong choice for building AI solutions, offering robust language features and a mature ecosystem.
## Key Advantages
- **Strong Typing & Tooling**: Static typing reduces runtime errors, while mature IDEs and debugging tools improve developer productivity.
- **Package Management**: NuGet provides reliable dependency management for complex projects.
- **Production-Ready Infrastructure**: The .NET ecosystem supports building scalable services, background workers, APIs, and seamless cloud integration.
## Local LLM Integration
Developers can run local large language models via **Ollama**, using open-source libraries such as:
- **OllamaSharp**: A C# client for interacting with Ollama.
- **Semantic Kernel**: Microsoft's SDK for integrating AI services into applications.
## Use Cases
This stack is particularly effective for **private knowledge workflows**, such as local markdown wikis with document ingestion and question-answering capabilities, ensuring data remains on-premise.
These examples make the demo much more concrete and easier for readers to trust.
Why This POC Is Good
This POC is simple, but it teaches many important engineering ideas:
- local model integration in C#
- markdown-based AI knowledge design
- AI-assisted ingestion workflow
- safe structured output parsing
- file-based knowledge management
- query-time synthesis from curated context
It is a very good educational bridge between:
- plain prompt engineering
- file-based knowledge systems
- and full RAG architectures
Limitations of This POC
Be honest with your readers. This demo is useful, but it is still a demo.
Current limitations:
- it is console-based only
- article selection is simple and file-based
- there is no embedding or semantic ranking layer
- article context size is manually bounded
- source attribution is prompt-based, not enforced by retrieval metadata
- large wiki collections may require a hybrid or full RAG architecture later
That is okay. Simplicity is a feature here.
How to Extend It Later
If your readers want to continue, they can add:
- command-line arguments for custom ingestion and queries
- ASP.NET Core Minimal API endpoints
- Blazor UI for article management
- stronger JSON schema validation
- better article ranking before query
- Semantic Kernel orchestration
- hybrid wiki + RAG support
- file watchers for automatic wiki refresh
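As an example of the first extension, a tiny argument convention is enough for the console demo. The `DemoArgs` class and the `ingest`/`ask` command names below are hypothetical, just to illustrate the idea:

```csharp
using System;

public static class DemoArgs
{
    // Sketch: a minimal argument convention so readers can ingest or query
    // without editing Program.cs. Hypothetical usage:
    //   dotnet run -- ingest <path-to-document>
    //   dotnet run -- ask "<question>"
    public static (string Command, string Value) Parse(string[] args)
    {
        if (args.Length >= 2 && (args[0] == "ingest" || args[0] == "ask"))
        {
            return (args[0], args[1]);
        }

        // No recognized arguments: fall back to the built-in demo flow.
        return ("demo", string.Empty);
    }
}
```

`Main` can then switch on `Command` and call `IngestDocumentAsync` or `QueryWikiAsync` accordingly.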
A nice next-step architecture could be:
- an LLM Wiki for curated, stable knowledge
- RAG for large and dynamic long-tail knowledge
That gives the best of both worlds.
Conclusion
We built a practical LLM Wiki POC in C# using .NET 8, Ollama, OllamaSharp, markdown files, and the kimi-k2.6:cloud model.
More importantly, we learned a design lesson that is easy to forget in modern AI work:
Not every problem needs RAG first.
Sometimes, the best answer is a clean, curated, lovingly maintained wiki.
If your knowledge is small, stable, and important, a markdown-first LLM Wiki can be wonderfully effective. It is easy to understand, easy to debug, easy to version-control, and easy to teach.
Then, when your world grows larger, you can invite RAG to the party.
Happy coding, and may your markdown always stay tidy.
Love C# & AI!
