Build a Document Search Engine in C#
Most search implementations fall into one of two camps: send everything to Elasticsearch, or call a search API. Both work. Both add infrastructure.
Here's a third option. Index local files, search them by keyword, by meaning, or both, in about 10 lines of C#. No external services.
dotnet add package Kjarni
using Kjarni;
using var indexer = new Indexer(model: "minilm-l6-v2", quiet: true);
indexer.Create("my_index", new[] { "docs/" });
using var searcher = new Searcher(
    model: "minilm-l6-v2",
    rerankerModel: "minilm-l6-v2-cross-encoder");

var results = searcher.Search("my_index", "how do returns work?",
    mode: SearchMode.Hybrid);

foreach (var r in results)
    Console.WriteLine($" {r.Score:F4}: {r.Text}");
The indexer reads your files, splits them into chunks, encodes each chunk as a vector, and builds a BM25 keyword index. The searcher queries both indexes and combines the results.
Setup
Create a few text files to search over:
mkdir -p docs
docs/returns.txt:
Our return policy allows customers to return any unused item within 30 days
of purchase for a full refund. Items must be in their original packaging.
Shipping costs are non-refundable.
docs/shipping.txt:
We ship to all 50 US states and internationally to over 40 countries.
Standard shipping takes 5-7 business days. Express shipping is available
for an additional fee.
docs/account.txt:
To reset your password, click "Forgot Password" on the login page.
You will receive an email with a reset link. The link expires after 24 hours.
Three short documents. In practice these could be product manuals, support articles, internal wikis, or any text files.
Indexing
using var indexer = new Indexer(model: "minilm-l6-v2", quiet: true);
indexer.Create("my_index", new[] { "docs/" });
The indexer does three things:
- Reads all files in the given directories
- Chunks each file into passages (for long documents)
- Encodes each chunk into a 384-dimensional vector using the embedding model
It also builds a BM25 keyword index over the same chunks. The result is a local index on disk that you can query repeatedly without re-indexing.
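Chunking is handled for you, but conceptually it is a sliding window over the text. A minimal sketch, with illustrative sizes rather than Kjarni's actual parameters:

// Fixed-size chunks with overlap, so a sentence that straddles a boundary
// still appears intact in at least one chunk. Sizes are illustrative.
static IEnumerable<string> Chunk(string text, int size = 500, int overlap = 50)
{
    for (int start = 0; start < text.Length; start += size - overlap)
        yield return text.Substring(start, Math.Min(size, text.Length - start));
}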
Three Search Modes
Keyword Search (BM25)
Matches documents that contain the query words. The same algorithm that powers Elasticsearch and Solr.
var results = searcher.Search("my_index", "return policy refund",
mode: SearchMode.Keyword);
7.8795: Our return policy allows customers to return any unused item
within 30 days of purchase for a full refund...
This works because the query words — "return", "policy", "refund" — appear in the document. If you searched for "send items back and get money" instead, keyword search would find nothing.
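You can confirm the failure mode against the same index; the paraphrased query shares no terms with the documents:

// Keyword mode has no overlapping terms to match here, so it comes back empty.
var noHits = searcher.Search("my_index", "send items back and get money",
    mode: SearchMode.Keyword);
Console.WriteLine($"matches: {noHits.Count()}"); // expected: 0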
For the theory behind BM25, see BM25 vs TF-IDF: Keyword Search Explained.
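As a rough sketch of what BM25 computes (the standard k1/b formulation, not Kjarni's internal code), the score contributed by a single query term looks like this:

// IDF rewards rare terms; the tf factor saturates, so repeating a word has
// diminishing returns; b normalizes for document length.
static double Bm25Term(double tf, double docLen, double avgDocLen,
                       int docCount, int docsWithTerm,
                       double k1 = 1.2, double b = 0.75)
{
    double idf = Math.Log(1 + (docCount - docsWithTerm + 0.5) / (docsWithTerm + 0.5));
    double norm = (tf * (k1 + 1)) / (tf + k1 * (1 - b + b * docLen / avgDocLen));
    return idf * norm;
}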
Semantic Search
Matches documents by meaning, regardless of the exact words used.
var results = searcher.Search("my_index", "can I send items back and get money?",
mode: SearchMode.Semantic);
This finds the returns document even though none of those exact words appear in it. The embedding model understands that "send items back" means "return" and "get money" means "refund."
For how embeddings and similarity work, see Semantic Search in C#.
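The comparison step itself is ordinary vector math: cosine similarity between the query embedding and each chunk embedding (the same cosine-similarity lookup described under the hood below). A minimal sketch:

// Cosine similarity: 1.0 means same direction, near 0 means unrelated.
static float Cosine(float[] a, float[] b)
{
    float dot = 0, na = 0, nb = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        na  += a[i] * a[i];
        nb  += b[i] * b[i];
    }
    return dot / (MathF.Sqrt(na) * MathF.Sqrt(nb));
}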
Hybrid Search
Combines keyword and semantic results. This is usually the best default.
var results = searcher.Search("my_index", "how do returns work?",
mode: SearchMode.Hybrid);
1.3282: Our return policy allows customers to return any unused item
within 30 days of purchase for a full refund. Items must be in
their original packaging. Shipping costs are non-refundable.
-10.5874: To reset your password, click "Forgot Password" on the login
page. You will receive an email with a reset link. The link
expires after 24 hours.
-11.0939: We ship to all 50 US states and internationally to over 40
countries. Standard shipping takes 5-7 business days. Express
shipping is available for an additional fee.
Hybrid search catches both exact keyword matches and semantically related content. The scores are from the reranker (more on that below), which is why the gap between relevant and irrelevant results is so large. The returns document scores 1.3, while the other two are deep in the negatives.
Reranking
The results above use a cross-encoder reranker. This is the difference between good search and great search.
The Problem with Embeddings Alone
Embedding models are fast because they encode the query and each document independently. But this means they can't model the interaction between query and document directly. They're comparing summaries, not reading both texts together.
How Reranking Fixes This
A cross-encoder takes the query and a document as a single input and outputs a relevance score. It reads both texts at the same time, so it can attend to specific words in the document that answer the specific question.
Bi-encoder (embedding):
  Query    -> Vector
  Document -> Vector
  Compare the two vectors

Cross-encoder (reranker):
  [Query + Document] -> Relevance Score
The cross-encoder is slower because it processes each query-document pair individually. That's why it's used as a second stage: the embedding model retrieves candidates quickly, then the cross-encoder reranks the top results precisely.
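To make the two stages concrete, here is the same pattern written out by hand, using the standalone Reranker shown in the next section. When rerankerModel is set, Searcher does this for you internally; the candidate count of 20 is an arbitrary illustrative choice:

using var searcher = new Searcher(model: "minilm-l6-v2");
using var reranker = new Reranker();

// Stage 1: fast candidate retrieval via embeddings.
var candidates = searcher.Search("my_index", "how do returns work?",
    mode: SearchMode.Semantic);

// Stage 2: precise scoring of the top candidates with the cross-encoder.
var reranked = reranker.Rerank("how do returns work?",
    candidates.Take(20).Select(r => r.Text).ToArray());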
Using the Reranker Directly
You can also use the reranker on its own:
using var reranker = new Reranker();

var results = reranker.Rerank(
    "What is machine learning?",
    new[] {
        "Machine learning is a subset of artificial intelligence.",
        "Deep learning uses neural networks with many layers.",
        "The weather today is sunny.",
    });

foreach (var r in results)
    Console.WriteLine($" {r.Score:F4}: {r.Document}");
10.5139: Machine learning is a subset of artificial intelligence.
-5.5301: Deep learning uses neural networks with many layers.
-11.1001: The weather today is sunny.
The scores are logits, not probabilities. What matters is the relative ordering and the gap between scores. A positive score means the cross-encoder thinks the document is relevant. A negative score means it's not.
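If you want a probability-like number for display or thresholding, you can squash the logit with a sigmoid. This is plain math, not part of the Kjarni API:

// Sigmoid maps a logit into (0, 1); positive logits land above 0.5.
static double ToProbability(double logit) => 1.0 / (1.0 + Math.Exp(-logit));

// ToProbability(10.5139)  ≈ 0.99997   (relevant)
// ToProbability(-11.1001) ≈ 0.000015  (irrelevant)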
The Full Pipeline
Here's how the pieces fit together:
Query
  |
  +-- BM25 Keyword Index ----> Top N candidates by word match
  |
  +-- Vector Index ----------> Top N candidates by meaning
  |
  v
Merge candidates (union or intersection)
  |
  v
Cross-Encoder Reranker ----> Final ranked results
  |
  v
Return to user
Each stage filters and refines. BM25 is cheap and catches exact matches. The vector index catches semantic matches that keywords miss. The reranker reads both query and document together to produce a precise ranking.
using var indexer = new Indexer(model: "minilm-l6-v2", quiet: true);
indexer.Create("my_index", new[] { "docs/" });
using var searcher = new Searcher(
    model: "minilm-l6-v2",
    rerankerModel: "minilm-l6-v2-cross-encoder");

// Hybrid = BM25 + Semantic + Reranker
var results = searcher.Search("my_index", "how do returns work?",
    mode: SearchMode.Hybrid);
When to Use Each Mode
| Mode | Best for | Misses |
|---|---|---|
| Keyword | Exact terms, error codes, IDs | Synonyms, rephrased queries |
| Semantic | Intent matching, fuzzy queries | Exact phrases, rare terms |
| Hybrid | General purpose (recommended) | Slightly slower |
Start with Hybrid. Switch to Keyword if your users search for exact identifiers. Switch to Semantic if your users describe what they want in natural language.
Practical Patterns
Filtering Results
Apply a score threshold to filter out irrelevant results:
var results = searcher.Search("my_index", query, mode: SearchMode.Hybrid);
var relevant = results.Where(r => r.Score > 0.0);
With reranking, a score above 0 is a reasonable default threshold for "probably relevant."
Search + Classification
Find relevant documents, then classify their sentiment. This chains search and classification in one pipeline:
using var searcher = new Searcher(model: "minilm-l6-v2");
using var classifier = new Classifier("roberta-sentiment");
var results = searcher.Search("reviews_index", "battery life",
mode: SearchMode.Hybrid);
foreach (var r in results.Take(10))
{
var sentiment = classifier.Classify(r.Text);
Console.WriteLine($" {sentiment} \"{r.Text}\"");
}
See Sentiment Analysis in C# Without Python for more on classification.
Re-indexing
When documents change, re-create the index:
indexer.Create("my_index", new[] { "docs/" });
This rebuilds the full index. For large corpora where incremental updates matter, you'd manage the vector storage separately.
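For small corpora, a simple pattern is to watch the directory and rebuild when files change. A sketch using the standard FileSystemWatcher (debouncing and error handling omitted):

using var indexer = new Indexer(model: "minilm-l6-v2", quiet: true);
using var watcher = new FileSystemWatcher("docs/") { EnableRaisingEvents = true };

// Full rebuild on any change, per the re-indexing note above.
watcher.Changed += (_, e) =>
{
    Console.WriteLine($"{e.Name} changed, rebuilding index...");
    indexer.Create("my_index", new[] { "docs/" });
};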
How It Compares
| Approach | Setup | Cost | Offline |
|---|---|---|---|
| Elasticsearch | Cluster + config | Server costs | No |
| Azure AI Search | Portal + API key | Per-query pricing | No |
| Algolia | Dashboard + API key | Per-search pricing | No |
| Kjarni | dotnet add package | Free | Yes |
The tradeoff: Kjarni runs in-process on a single machine. If you need distributed search across billions of documents, use Elasticsearch. If you need search over thousands to millions of documents on a single server, a local engine works well and eliminates a dependency.
How It Works Under the Hood
Kjarni builds two indexes per collection:
- BM25 index — inverted index over tokenized text, with term frequency saturation and document length normalization
- Vector index — encoded embeddings for each chunk, queried by cosine similarity
At search time, both indexes return candidates. The results are merged and optionally reranked by a cross-encoder model that reads the query and each candidate together.
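The exact merge strategy isn't documented here; one common, simple choice for combining two ranked candidate lists is reciprocal rank fusion, sketched below with hypothetical document IDs:

// RRF: each list contributes 1 / (k + rank) per document, so documents that
// rank well in either list float to the top. k = 60 is the usual constant.
static Dictionary<string, double> Rrf(IList<string> bm25Ids,
                                      IList<string> vectorIds,
                                      double k = 60)
{
    var scores = new Dictionary<string, double>();
    void Accumulate(IList<string> ids)
    {
        for (int rank = 0; rank < ids.Count; rank++)
        {
            scores.TryGetValue(ids[rank], out var s);
            scores[ids[rank]] = s + 1.0 / (k + rank + 1);
        }
    }
    Accumulate(bm25Ids);
    Accumulate(vectorIds);
    return scores; // highest-scoring candidates go on to the cross-encoder
}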
The engine is written in Rust. The C# package wraps a single native library. There is no Python runtime, no JVM, and no external service.
NuGet: https://www.nuget.org/packages/Kjarni
GitHub: https://github.com/olafurjohannsson/kjarni
Other Resources
- Semantic Search in C# - Embeddings and similarity from scratch
- Build a Document Search Engine in C# - Full hybrid search with indexing and reranking
- BM25 vs TF-IDF: Keyword Search Explained - How keyword search works under the hood
- What are Vector Embeddings? - How machines understand meaning through numbers