Semantic Search in C# — Without a Vector Database
Keyword search finds documents that contain the words you typed. Semantic search finds documents that mean what you meant.
Search for "how to change my login credentials". Keyword search returns nothing because none of your documents contain those exact words. Semantic search returns "How do I reset my password?" because the meaning is the same.
dotnet add package Kjarni
using Kjarni;
using var embedder = new Embedder("minilm-l6-v2");
Console.WriteLine(embedder.Similarity("doctor", "physician")); // 0.8598
Console.WriteLine(embedder.Similarity("doctor", "banana")); // 0.3379
No API key. No Python. No vector database. The model runs locally on CPU.
How Semantic Search Works
The core idea: convert text into numbers that capture meaning.
A sentence embedding model reads text and outputs a vector — an array of floating-point numbers, typically 384 or 768 dimensions. Texts with similar meaning produce vectors that are close together. Texts with different meaning produce vectors that are far apart.
"doctor" -> [0.12, -0.34, 0.56, 0.78, ...] (384 numbers)
"physician" -> [0.11, -0.33, 0.55, 0.79, ...] (384 numbers) <- close
"banana" -> [-0.45, 0.23, -0.12, 0.01, ...] (384 numbers) <- far
You measure how close two vectors are with cosine similarity: the cosine of the angle between them. The score ranges from -1 (opposite) to 1 (identical).
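Under the hood, cosine similarity is just the dot product of the two vectors divided by the product of their lengths. A minimal sketch of the math (Kjarni exposes this as Embedder.CosineSimilarity, used later in this post):
static float Cosine(float[] a, float[] b)
{
    float dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];   // dot product
        normA += a[i] * a[i]; // squared length of a
        normB += b[i] * b[i]; // squared length of b
    }
    return dot / (MathF.Sqrt(normA) * MathF.Sqrt(normB));
}
For normalized (unit-length) vectors the denominator is 1, so the score reduces to a plain dot product.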
For a deeper explanation of how embeddings work, see What are Vector Embeddings?
Encoding Text
using var embedder = new Embedder("minilm-l6-v2");
float[] vector = embedder.Encode("Hello world");
Console.WriteLine(vector.Length); // 384
Console.WriteLine(string.Join(", ", vector[..5]));
// -0.034477282, 0.03102318, 0.006734989, 0.02610899, -0.03936202
The model downloads on first use (~90MB) and caches locally. Every call to Encode() runs the full transformer: tokenization, attention layers, mean pooling, normalization. The output is a unit-length vector ready for cosine similarity.
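Because the vector is normalized, its length is 1. A quick sanity check you can run yourself:
float[] v = embedder.Encode("Hello world");
float sumOfSquares = v.Sum(x => x * x); // squared length of a unit vector
Console.WriteLine(sumOfSquares); // ~1.0, within float tolerance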
These are the same vectors you'd get from Python's sentence-transformers:
from sentence_transformers import SentenceTransformer
model = SentenceTransformer('all-MiniLM-L6-v2')
vector = model.encode("Hello world", normalize_embeddings=True)
# [-0.03447726 0.03102319 0.00673499 0.02610895 -0.03936201]
Same model, same weights, same output.
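The tiny differences in the last decimal places are floating-point rounding, not model divergence. The values printed above agree to well within 1e-5:
// First three components printed by each library above
float[] fromCsharp = { -0.034477282f, 0.03102318f, 0.006734989f };
float[] fromPython = { -0.03447726f, 0.03102319f, 0.00673499f };
bool match = fromCsharp.Zip(fromPython, (a, b) => MathF.Abs(a - b) < 1e-5f).All(x => x);
Console.WriteLine(match); // True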
Measuring Similarity
Cosine similarity tells you how close two vectors are:
using var embedder = new Embedder("minilm-l6-v2");
var pairs = new[] {
("doctor", "physician"),
("doctor", "hospital"),
("doctor", "banana"),
("cat", "dog"),
("cat", "car"),
("machine learning", "artificial intelligence"),
("machine learning", "potato soup"),
};
foreach (var (a, b) in pairs)
Console.WriteLine($" {embedder.Similarity(a, b):F4} \"{a}\" / \"{b}\"");
0.8598 "doctor" / "physician"
0.5971 "doctor" / "hospital"
0.3379 "doctor" / "banana"
0.6606 "cat" / "dog"
0.4633 "cat" / "car"
0.7035 "machine learning" / "artificial intelligence"
0.1848 "machine learning" / "potato soup"
The scores match intuition. "Doctor" and "physician" are near-synonyms (0.86). "Cat" and "dog" are related but different (0.66). "Machine learning" and "potato soup" have almost nothing in common (0.18).
Building a Search
Here's the pattern. Encode your documents once. Encode the query at search time. Rank by cosine similarity.
using var embedder = new Embedder("minilm-l6-v2");
var docs = new[] {
"How do I reset my password?",
"What is your refund policy?",
"Do you ship internationally?",
"How do I update my billing address?",
"Where can I track my order?",
};
// Encode all documents (do this once, store the vectors)
var vectors = embedder.EncodeBatch(docs);
// Search
var query = embedder.Encode("I need to change my login credentials");
var results = docs
.Select((doc, i) => (doc, score: Embedder.CosineSimilarity(query, vectors[i])))
.OrderByDescending(x => x.score);
foreach (var (doc, score) in results)
Console.WriteLine($" {score:F4}: {doc}");
0.5981: How do I reset my password?
0.4067: How do I update my billing address?
0.0767: Where can I track my order?
-0.0027: What is your refund policy?
-0.0451: Do you ship internationally?
"Change my login credentials" matches "reset my password" at 0.60 despite sharing zero keywords. That's the entire value of semantic search.
"Update my billing address" comes second. The model understands that changing account information is related even though the specific fields differ.
FAQ Matching
Route support tickets to the most relevant FAQ:
using var embedder = new Embedder("minilm-l6-v2");
var faqs = new[] {
"How do I cancel my subscription?",
"How do I get a refund?",
"How do I change my email address?",
"What payment methods do you accept?",
"How do I contact support?",
};
var faqVectors = embedder.EncodeBatch(faqs);
string MatchFaq(string userQuestion)
{
var queryVec = embedder.Encode(userQuestion);
var best = faqs
.Select((faq, i) => (faq, score: Embedder.CosineSimilarity(queryVec, faqVectors[i])))
.OrderByDescending(x => x.score)
.First();
return best.score > 0.4 ? best.faq : "No matching FAQ found.";
}
Console.WriteLine(MatchFaq("I want to stop paying"));
// How do I cancel my subscription?
Console.WriteLine(MatchFaq("Can I pay with crypto?"));
// What payment methods do you accept?
Encode your FAQs once at startup, store the vectors, and only encode the user's query at request time.
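Storing the vectors can be as simple as serializing the float arrays. A minimal sketch with System.Text.Json; the file name is illustrative, and it assumes EncodeBatch returns float[][]:
using System.Text.Json;
// At startup or build time: encode once and persist
File.WriteAllText("faq-vectors.json", JsonSerializer.Serialize(faqVectors));
// On later runs: load instead of re-encoding
float[][] cached = JsonSerializer.Deserialize<float[][]>(File.ReadAllText("faq-vectors.json"))!;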
Deduplication
Find near-duplicate content in a dataset:
var texts = GetAllDocuments(); // your own corpus, e.g. a string[]
var vectors = embedder.EncodeBatch(texts);
var duplicates = new List<(string, string, float)>();
for (int i = 0; i < texts.Length; i++)
for (int j = i + 1; j < texts.Length; j++)
{
var sim = Embedder.CosineSimilarity(vectors[i], vectors[j]);
if (sim > 0.85)
duplicates.Add((texts[i], texts[j], sim));
}
A threshold of 0.85 catches rephrased content while ignoring merely related documents. The pairwise loop is O(n²), which is fine for a few thousand documents.
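If you want duplicate groups rather than pairs (say A, B, and C are all rephrasings of each other), a greedy single pass over the same vectors works. A sketch, reusing texts and vectors from above:
var groups = new List<List<int>>();
var assigned = new bool[texts.Length];
for (int i = 0; i < texts.Length; i++)
{
    if (assigned[i]) continue;
    var group = new List<int> { i }; // indices into texts
    assigned[i] = true;
    for (int j = i + 1; j < texts.Length; j++)
        if (!assigned[j] && Embedder.CosineSimilarity(vectors[i], vectors[j]) > 0.85)
        {
            group.Add(j);
            assigned[j] = true;
        }
    if (group.Count > 1) groups.Add(group); // keep only actual duplicate clusters
}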
Combining with Sentiment
Find relevant reviews about a topic, then check their sentiment. See Sentiment Analysis in C# for more on the classification side.
using var embedder = new Embedder("minilm-l6-v2");
using var classifier = new Classifier("roberta-sentiment");
var query = embedder.Encode("battery life");
// reviews: your own collection of review strings.
// Encoding every review per query is fine for small sets;
// pre-encode and cache the vectors for production use.
var relevant = reviews
.Select(r => (review: r, score: Embedder.CosineSimilarity(query, embedder.Encode(r))))
.Where(x => x.score > 0.3)
.OrderByDescending(x => x.score);
foreach (var (review, score) in relevant)
Console.WriteLine($"{classifier.Classify(review)} \"{review}\"");
Choosing a Model
| Model | Dimensions | Speed | Quality |
|---|---|---|---|
| minilm-l6-v2 | 384 | Fast | Good |
| mpnet-base-v2 | 768 | Slower | Better |
Start with minilm-l6-v2. It's one of the most widely used embedding models in production and handles most use cases well. Switch to mpnet-base-v2 if you need higher quality and can afford the extra latency and memory.
Both models have a 512-token input limit (~300-400 words). Longer text gets truncated. If your documents are long, split them into chunks first — which is exactly what the document search engine does for you automatically.
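If you're rolling your own, a naive word-based chunker is enough to get started. A sketch; the 200-word chunk size is an assumption, so tune it for your content:
static IEnumerable<string> Chunk(string text, int wordsPerChunk = 200)
{
    // Split on any whitespace; ~200 words stays comfortably under 512 tokens
    var words = text.Split((char[]?)null, StringSplitOptions.RemoveEmptyEntries);
    for (int i = 0; i < words.Length; i += wordsPerChunk)
        yield return string.Join(' ', words.Skip(i).Take(wordsPerChunk));
}
Encode each chunk separately and keep a mapping from chunk back to source document.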
When to Use Semantic Search vs Keyword Search
Semantic search is not always better than keyword search.
Semantic search works best when:
- Users don't know the exact terminology
- You're matching intent, not words ("change login" → "reset password")
- Documents are short (FAQs, product descriptions, support tickets)
Keyword search works best when:
- Users search for exact terms (error codes, product IDs, proper nouns)
- You need exact phrase matching
- Documents are long and keyword frequency matters
The best approach is usually both. See Build a Document Search Engine in C# for a hybrid search implementation that combines BM25 keyword search with semantic vectors and reranking.
For the theory behind keyword search, see BM25 vs TF-IDF: Keyword Search Explained.
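If you want a feel for how the two result lists get merged, reciprocal rank fusion (RRF) is one common technique. A sketch: the rankings are assumed to come from your keyword engine and the semantic search above, each as document indices ordered best-first, and k = 60 is the conventional constant:
static Dictionary<int, double> FuseRankings(params int[][] rankings)
{
    const int k = 60; // damping constant: softens the weight of top ranks
    var fused = new Dictionary<int, double>();
    foreach (var ranking in rankings)
        for (int rank = 0; rank < ranking.Length; rank++)
            fused[ranking[rank]] = fused.GetValueOrDefault(ranking[rank]) + 1.0 / (k + rank + 1);
    return fused; // sort by score descending for the merged order
}
Documents that rank highly in either list float to the top, and documents that rank highly in both win outright.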
How This Works
Kjarni loads HuggingFace sentence-transformer models directly from safetensors. The inference engine is written in Rust. The C# package wraps a single native library.
The outputs match Python's sentence-transformers, as shown earlier.
NuGet: https://www.nuget.org/packages/Kjarni
GitHub: https://github.com/olafurjohannsson/kjarni
Other Resources
- Semantic Search in C# - Embeddings and similarity from scratch
- Build a Document Search Engine in C# - Full hybrid search with indexing and reranking
- BM25 vs TF-IDF: Keyword Search Explained - How keyword search works under the hood
- What are Vector Embeddings? - How machines understand meaning through numbers