DEV Community

Ólafur Aron Jóhannsson

Posted on • Originally published at olafuraron.is

Build a Document Search Engine in C# Without Python

Most search implementations fall into one of two camps: send everything to Elasticsearch, or call a search API. Both work. Both add infrastructure.

Here's a third option. Index local files, search them by keyword, by meaning, or both, in about 10 lines of C#. No external services.

dotnet add package Kjarni

using Kjarni;

using var indexer = new Indexer(model: "minilm-l6-v2", quiet: true);
indexer.Create("my_index", new[] { "docs/" });

using var searcher = new Searcher(
    model: "minilm-l6-v2",
    rerankerModel: "minilm-l6-v2-cross-encoder");

var results = searcher.Search("my_index", "how do returns work?",
    mode: SearchMode.Hybrid);

foreach (var r in results)
    Console.WriteLine($"  {r.Score:F4}: {r.Text}");

The indexer reads your files, splits them into chunks, encodes each chunk as a vector, and builds a BM25 keyword index. The searcher queries both indexes and combines the results.

Setup

Create a few text files to search over:

mkdir -p docs

docs/returns.txt:

Our return policy allows customers to return any unused item within 30 days
of purchase for a full refund. Items must be in their original packaging.
Shipping costs are non-refundable.

docs/shipping.txt:

We ship to all 50 US states and internationally to over 40 countries.
Standard shipping takes 5-7 business days. Express shipping is available
for an additional fee.

docs/account.txt:

To reset your password, click "Forgot Password" on the login page.
You will receive an email with a reset link. The link expires after 24 hours.

Three short documents. In practice these could be product manuals, support articles, internal wikis, or any text files.

Indexing

using var indexer = new Indexer(model: "minilm-l6-v2", quiet: true);
indexer.Create("my_index", new[] { "docs/" });

The indexer does three things:

  1. Reads all files in the given directories
  2. Chunks each file into passages (for long documents)
  3. Encodes each chunk into a 384-dimensional vector using the embedding model

It also builds a BM25 keyword index over the same chunks. The result is a local index on disk that you can query repeatedly without re-indexing.
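The article doesn't spell out Kjarni's chunking strategy, but a paragraph-based chunker is a reasonable mental model for step 2. Here's a minimal sketch; the character limit and merge-adjacent-paragraphs rule are illustrative assumptions, not Kjarni internals:

```csharp
using System;
using System.Collections.Generic;
using System.Text;

// Split text on blank lines, then merge adjacent paragraphs until a size
// limit is reached. (A sketch of passage chunking, not Kjarni's actual code.)
static List<string> ChunkText(string text, int maxChars = 500)
{
    var chunks = new List<string>();
    var current = new StringBuilder();

    foreach (var para in text.Split("\n\n", StringSplitOptions.RemoveEmptyEntries))
    {
        // Flush the current chunk if adding this paragraph would overflow it.
        if (current.Length > 0 && current.Length + para.Length > maxChars)
        {
            chunks.Add(current.ToString().Trim());
            current.Clear();
        }
        current.AppendLine(para.Trim());
    }
    if (current.Length > 0)
        chunks.Add(current.ToString().Trim());
    return chunks;
}
```

Short files like the three above fit in a single chunk each; a long manual would be split so that each vector describes one coherent passage.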

Three Search Modes

Keyword Search (BM25)

Matches documents that contain the query words. The same algorithm that powers Elasticsearch and Solr.

var results = searcher.Search("my_index", "return policy refund",
    mode: SearchMode.Keyword);
  7.8795: Our return policy allows customers to return any unused item
          within 30 days of purchase for a full refund...

This works because the query words — "return", "policy", "refund" — appear in the document. If you searched for "send items back and get money" instead, keyword search would find nothing.
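The score BM25 assigns to a match comes from two ingredients: how often a term appears in the chunk (with diminishing returns) and how rare the term is across the corpus. A sketch of the standard Okapi BM25 per-term score follows; the exact variant and parameter values Kjarni uses are assumptions here:

```csharp
using System;

// Standard Okapi BM25 score for one query term against one document.
// k1 controls term-frequency saturation; b controls length normalization.
static double Bm25TermScore(
    double tf,        // term frequency in this document
    double docLen,    // this document's length in tokens
    double avgDocLen, // average document length in the corpus
    double df,        // number of documents containing the term
    double numDocs,   // total documents in the corpus
    double k1 = 1.2, double b = 0.75)
{
    // Rare terms get a large IDF; terms in most documents get a small one.
    double idf = Math.Log(1 + (numDocs - df + 0.5) / (df + 0.5));
    // Term frequency saturates: the 10th occurrence adds far less than the 1st.
    double norm = tf * (k1 + 1) / (tf + k1 * (1 - b + b * docLen / avgDocLen));
    return idf * norm;
}
```

The document's score for a query is the sum of this value over the query terms, which is why "return policy refund" scores the returns document so highly: all three terms hit.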

For the theory behind BM25, see BM25 vs TF-IDF: Keyword Search Explained.

Semantic Search

Matches documents by meaning, regardless of the exact words used.

var results = searcher.Search("my_index", "can I send items back and get money?",
    mode: SearchMode.Semantic);

This finds the returns document even though none of those exact words appear in it. The embedding model understands that "send items back" means "return" and "get money" means "refund."
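Under the hood, "matching by meaning" is a vector comparison: the query and every chunk are embedded into 384-dimensional vectors, and chunks are ranked by cosine similarity. A minimal sketch of that comparison:

```csharp
using System;

// Cosine similarity between two embedding vectors: 1.0 means same direction,
// 0.0 means unrelated. Semantic search ranks chunks by this value.
static double CosineSimilarity(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];
        normA += a[i] * a[i];
        normB += b[i] * b[i];
    }
    return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}
```

Because the model places "send items back" and "return" near each other in vector space, the query vector ends up close to the returns chunk even with zero word overlap.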

For how embeddings and similarity work, see Semantic Search in C#.

Hybrid Search

Combines keyword and semantic results. This is usually the best default.

var results = searcher.Search("my_index", "how do returns work?",
    mode: SearchMode.Hybrid);
   1.3282: Our return policy allows customers to return any unused item
           within 30 days of purchase for a full refund. Items must be in
           their original packaging. Shipping costs are non-refundable.

 -10.5874: To reset your password, click "Forgot Password" on the login
           page. You will receive an email with a reset link. The link
           expires after 24 hours.

 -11.0939: We ship to all 50 US states and internationally to over 40
           countries. Standard shipping takes 5-7 business days. Express
           shipping is available for an additional fee.

Hybrid search catches both exact keyword matches and semantically related content. The scores are from the reranker (more on that below), which is why the gap between relevant and irrelevant results is so large. The returns document scores 1.3, while the other two are deep in the negatives.
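The article doesn't specify how Kjarni merges the keyword and semantic candidate lists before reranking. Reciprocal Rank Fusion (RRF) is one common technique for combining two rankings, sketched here purely for illustration:

```csharp
using System;
using System.Collections.Generic;

// Reciprocal Rank Fusion: each list contributes 1/(k + rank) per document,
// so documents ranked highly by BOTH lists float to the top.
// (A common merge strategy, not necessarily the one Kjarni uses.)
static Dictionary<string, double> FuseRrf(
    IList<string> keywordRanking,
    IList<string> semanticRanking,
    double k = 60)
{
    var scores = new Dictionary<string, double>();
    foreach (var ranking in new[] { keywordRanking, semanticRanking })
        for (int rank = 0; rank < ranking.Count; rank++)
        {
            scores.TryGetValue(ranking[rank], out double s);
            scores[ranking[rank]] = s + 1.0 / (k + rank + 1);
        }
    return scores;
}
```

A document that only one index finds still gets a score, so neither retrieval path can silently drop a good candidate.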

Reranking

The results above use a cross-encoder reranker. This is the difference between good search and great search.

The Problem with Embeddings Alone

Embedding models are fast because they encode the query and each document independently. But this means they can't model the interaction between query and document directly. They're comparing summaries, not reading both texts together.

How Reranking Fixes This

A cross-encoder takes the query and a document as a single input and outputs a relevance score. It reads both texts at the same time, so it can attend to specific words in the document that answer the specific question.

Bi-encoder (embedding):     Query -> Vector    Document -> Vector    Compare
Cross-encoder (reranker):   [Query + Document] -> Relevance Score

The cross-encoder is slower because it processes each query-document pair individually. That's why it's used as a second stage: the embedding model retrieves candidates quickly, then the cross-encoder reranks the top results precisely.
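The two-stage pattern is easy to express directly. This is a generic sketch, not Kjarni's implementation; the two scoring delegates are placeholders, with fastScore standing in for embedding similarity and slowScore for the cross-encoder:

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Retrieve-then-rerank: score everything with the cheap function,
// keep a broad candidate set, then re-score only those candidates
// with the expensive function.
static List<(string Doc, double Score)> RetrieveThenRerank(
    IEnumerable<string> corpus,
    Func<string, double> fastScore,
    Func<string, double> slowScore,
    int candidateCount = 50,
    int topK = 10)
{
    return corpus
        .OrderByDescending(fastScore)                      // stage 1: cheap, over everything
        .Take(candidateCount)                              // broad candidate set
        .Select(doc => (Doc: doc, Score: slowScore(doc)))  // stage 2: expensive, few docs
        .OrderByDescending(r => r.Score)
        .Take(topK)
        .ToList();
}
```

The expensive model only ever sees candidateCount documents per query, so total latency stays flat as the corpus grows.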

Using the Reranker Directly

You can also use the reranker on its own:

using var reranker = new Reranker();

var results = reranker.Rerank(
    "What is machine learning?",
    new[] {
        "Machine learning is a subset of artificial intelligence.",
        "Deep learning uses neural networks with many layers.",
        "The weather today is sunny.",
    });

foreach (var r in results)
    Console.WriteLine($"  {r.Score:F4}: {r.Document}");
  10.5139: Machine learning is a subset of artificial intelligence.
  -5.5301: Deep learning uses neural networks with many layers.
 -11.1001: The weather today is sunny.

The scores are logits, not probabilities. What matters is the relative ordering and the gap between scores. A positive score means the cross-encoder thinks the document is relevant. A negative score means it's not.
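If you do want a 0-to-1 value, for display or a more interpretable threshold, the logistic sigmoid maps a logit to a probability-like score:

```csharp
using System;

// Map a raw cross-encoder logit to the (0, 1) range.
// A logit of 0 maps to exactly 0.5; large positive logits approach 1,
// large negative logits approach 0.
static double Sigmoid(double logit) => 1.0 / (1.0 + Math.Exp(-logit));
```

Applied to the scores above, 10.5139 maps to roughly 0.99997 and -11.1001 to roughly 0.000015, which makes the "relevant vs. not" gap obvious at a glance.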

The Full Pipeline

Here's how the pieces fit together:

Query
  |
  +-- BM25 Keyword Index ----> Top N candidates by word match
  |
  +-- Vector Index ----------> Top N candidates by meaning
  |
  v
  Merge candidates (union or intersection)
  |
  v
  Cross-Encoder Reranker ----> Final ranked results
  |
  v
  Return to user

Each stage filters and refines. BM25 is cheap and catches exact matches. The vector index catches semantic matches that keywords miss. The reranker reads both query and document together to produce a precise ranking.

using var indexer = new Indexer(model: "minilm-l6-v2", quiet: true);
indexer.Create("my_index", new[] { "docs/" });

using var searcher = new Searcher(
    model: "minilm-l6-v2",
    rerankerModel: "minilm-l6-v2-cross-encoder");

// Hybrid = BM25 + Semantic + Reranker
var results = searcher.Search("my_index", "how do returns work?",
    mode: SearchMode.Hybrid);

When to Use Each Mode

| Mode | Best for | Misses |
| --- | --- | --- |
| Keyword | Exact terms, error codes, IDs | Synonyms, rephrased queries |
| Semantic | Intent matching, fuzzy queries | Exact phrases, rare terms |
| Hybrid | General purpose (recommended) | Slightly slower |

Start with Hybrid. Switch to Keyword if your users search for exact identifiers. Switch to Semantic if your users describe what they want in natural language.

Practical Patterns

Filtering Results

Apply a score threshold to filter out irrelevant results:

var results = searcher.Search("my_index", query, mode: SearchMode.Hybrid);
var relevant = results.Where(r => r.Score > 0.0);

With reranking, a score above 0 is a reasonable default threshold for "probably relevant."

Search + Classification

Find relevant documents, then classify the sentiment of each one. This chains search into classification:

using var searcher = new Searcher(model: "minilm-l6-v2");
using var classifier = new Classifier("roberta-sentiment");

var results = searcher.Search("reviews_index", "battery life",
    mode: SearchMode.Hybrid);

foreach (var r in results.Take(10))
{
    var sentiment = classifier.Classify(r.Text);
    Console.WriteLine($"  {sentiment}  \"{r.Text}\"");
}

See Sentiment Analysis in C# Without Python for more on classification.

Re-indexing

When documents change, re-create the index:

indexer.Create("my_index", new[] { "docs/" });

This rebuilds the full index. For large corpora where incremental updates matter, you'd manage the vector storage separately.

How It Compares

| Approach | Setup | Cost | Offline |
| --- | --- | --- | --- |
| Elasticsearch | Cluster + config | Server costs | No |
| Azure AI Search | Portal + API key | Per-query pricing | No |
| Algolia | Dashboard + API key | Per-search pricing | No |
| Kjarni | `dotnet add package` | Free | Yes |

The tradeoff: Kjarni runs in-process on a single machine. If you need distributed search across billions of documents, use Elasticsearch. If you need search over thousands to millions of documents on a single server, a local engine works well and eliminates a dependency.

How It Works Under the Hood

Kjarni builds two indexes per collection:

  • BM25 index — inverted index over tokenized text, with term frequency saturation and document length normalization
  • Vector index — encoded embeddings for each chunk, queried by cosine similarity

At search time, both indexes return candidates. The results are merged and optionally reranked by a cross-encoder model that reads the query and each candidate together.

The engine is written in Rust. The C# package wraps a single native library. There is no Python runtime, no JVM, and no external service.

NuGet:  https://www.nuget.org/packages/Kjarni
GitHub: https://github.com/olafurjohannsson/kjarni
