Semantic search goes beyond keyword matching by understanding the meaning behind a query. It’s a core building block for modern AI use cases like Retrieval-Augmented Generation (RAG), intelligent search, and recommendation systems. Vector databases are essential for these workloads: unlike traditional databases, they are optimized for storing and querying high-dimensional embedding vectors, enabling efficient similarity search and more context-aware AI applications.
In this blog, we’ll build a .NET console application that performs semantic search using:
- PostgreSQL + pgvector as the vector database
- Ollama for local embedding and chat models
- EF Core for data access

We’ll generate embeddings for our data, store them in PostgreSQL, and embed user queries at search time to retrieve contextually relevant results from the vector store.
Prerequisites
1. PostgreSQL with pgvector
You can install PostgreSQL locally or use Docker.
If using Docker, run a pgvector-enabled image and expose port 5432.
docker run -d \
  --name postgres-pgvector \
  -e POSTGRES_PASSWORD=postgres \
  -p 5432:5432 \
  ankane/pgvector

Note that the underlying postgres image won’t start without `POSTGRES_PASSWORD` (or an explicit host auth method); adjust the value, and your connection string, to suit your setup.
After setup, enable the vector extension in your database:
CREATE EXTENSION IF NOT EXISTS vector;
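Before wiring up .NET, you can sanity-check the extension directly in psql. This is just an illustrative query, not part of the app — cosine distance is 0 for vectors pointing the same direction and 2 for opposite directions:

```sql
-- Cosine distance (<=>): 0 = same direction, 1 = orthogonal, 2 = opposite
SELECT '[1,2,3]'::vector <=> '[1,2,3]'::vector    AS same_direction,     -- 0
       '[1,2,3]'::vector <=> '[-1,-2,-3]'::vector AS opposite_direction; -- 2
```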
2. Ollama (Local LLM)
Install Ollama from https://ollama.com
Pull required models:
ollama pull nomic-embed-text
ollama pull phi4-mini
Start the Ollama server:
ollama serve
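With the server running, a quick way to confirm the embedding model is available is to call Ollama’s REST API directly (11434 is Ollama’s default port; the `/api/embed` endpoint is available on recent Ollama versions):

```shell
curl http://localhost:11434/api/embed \
  -d '{"model": "nomic-embed-text", "input": "hello world"}'
```

The response should contain an `embeddings` array of 768 floating-point values — the dimensionality we’ll configure in the database later.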
Creating the .NET Console Application
Create a new console app:
dotnet new console -o VectorSearchApp
Next, add the required packages. Here I’m using EF Core with dependency injection for configuring database access.
dotnet add package Microsoft.Extensions.Hosting
dotnet add package Microsoft.EntityFrameworkCore
dotnet add package Microsoft.EntityFrameworkCore.Design
dotnet add package Npgsql.EntityFrameworkCore.PostgreSQL
dotnet add package Pgvector.EntityFrameworkCore
dotnet add package OllamaSharp
Designing the Data Model
We define a simple entity to store the data:
using Pgvector;

public class DesignPatternEntity
{
    public Guid Id { get; set; } = Guid.NewGuid();
    public string Name { get; set; } = default!;
    public string Description { get; set; } = default!;
    public Vector Embedding { get; set; } = default!;
}
The embedding is stored using the Vector type from the Pgvector library.
Configuring the Database Context
Configure the DbContext to:
- Use PostgreSQL via Npgsql
- Enable the vector extension
- Map the embedding column to vector(768), matching the dimensionality of the embedding model
using Microsoft.EntityFrameworkCore;

public class VectorSearchAppDbContext : DbContext
{
    public VectorSearchAppDbContext(DbContextOptions<VectorSearchAppDbContext> options) : base(options)
    {
    }

    public DbSet<DesignPatternEntity> DesignPatterns => Set<DesignPatternEntity>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.HasPostgresExtension("vector");

        modelBuilder.Entity<DesignPatternEntity>()
            .Property(x => x.Embedding)
            .HasColumnType("vector(768)"); // match the embedding model's dimension (nomic-embed-text = 768)
    }
}
Add a connection string to appsettings.json (make sure the file is copied to the output directory):
{
  "ConnectionStrings": {
    "Default": "Host=localhost;Database=vector_db;Username=rinesh"
  }
}
Next, we’ll set up dependency injection and configure the required services as follows.
var services = new ServiceCollection();

var configuration = new ConfigurationBuilder()
    .SetBasePath(Directory.GetCurrentDirectory())
    .AddJsonFile("appsettings.json", optional: false, reloadOnChange: true)
    .Build();

services.AddDbContext<VectorSearchAppDbContext>(options =>
{
    options.UseNpgsql(
        configuration.GetConnectionString("Default"),
        o => o.UseVector());
});
Now use EF Core migrations to create your database with the DesignPatterns table:
dotnet ef migrations add InitialCreate --output-dir Data/Migrations
dotnet ef database update
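With only a handful of rows, a sequential scan is fine, but for larger datasets you would typically add an approximate-nearest-neighbor index. A sketch, assuming pgvector 0.5+ (for HNSW) and the default EF Core table/column names used in this app:

```sql
-- HNSW index using cosine distance, matching the <=> queries used later
CREATE INDEX IF NOT EXISTS ix_design_patterns_embedding
ON "DesignPatterns" USING hnsw ("Embedding" vector_cosine_ops);
```

On older pgvector versions, `ivfflat` is the available index type instead.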
Generating Embeddings
To enable semantic search, we convert text into embeddings. Let’s seed some data into the table, using an embedding model from Ollama to generate the vector embeddings during seeding.
Process:
- Take the description of each record
- Generate embeddings using the Ollama embedding model
- Store vectors in PostgreSQL
var ollamaApiClient = new OllamaApiClient("http://localhost:11434", "nomic-embed-text");
services.AddSingleton(ollamaApiClient);

var provider = services.BuildServiceProvider();
var dbContext = provider.GetRequiredService<VectorSearchAppDbContext>();
var embeddingGenerator = provider.GetRequiredService<OllamaApiClient>();

if (!dbContext.DesignPatterns.Any())
{
    List<DesignPatternEntity> designPatterns =
    [
        new()
        {
            Name = "API Gateway Pattern",
            Description = "The API Gateway pattern acts as a single entry point for all client requests in a microservices architecture. It handles request routing, composition, authentication, and rate limiting. This pattern helps simplify client interactions and improves security and observability."
        },
        new()
        {
            Name = "CQRS Pattern",
            Description = "Command Query Responsibility Segregation (CQRS) separates read and write operations into different models. This improves scalability and performance, especially in systems with high read and write loads. It is often combined with Event Sourcing."
        },
        new()
        {
            Name = "Event-Driven Architecture",
            Description = "Event-driven architecture enables services to communicate asynchronously using events. Producers emit events without knowing consumers, allowing loose coupling and scalability. This pattern is ideal for distributed systems and real-time processing."
        },
        // Other design patterns with names and descriptions — omitted for brevity.
    ];

    foreach (var designPattern in designPatterns)
    {
        // Generate an embedding for each description and store it alongside the record
        var embedding = await embeddingGenerator.EmbedAsync(designPattern.Description!);
        designPattern.Embedding = new Vector(embedding.Embeddings[0].ToArray());
        dbContext.DesignPatterns.Add(designPattern);
    }

    await dbContext.SaveChangesAsync();
}
Performing Semantic Search
When a user enters a query:
- Convert the query into an embedding
- Compare it with stored vectors
- Retrieve the most similar results
pgvector provides operators for similarity search: cosine distance (<=>), Euclidean/L2 distance (<->), and negative inner product (<#>).
The most commonly used is cosine distance; lower values indicate higher similarity.
ORDER BY "Embedding" <=> query_vector
The Pgvector.EntityFrameworkCore package provides EF Core equivalents for these operators:
| SQL | EF |
|---|---|
| <=> | .CosineDistance() |
| <-> | .L2Distance() |
| <#> | .MaxInnerProduct() |
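Conceptually, the SQL generated for a top-k cosine search looks like the following sketch, where `@query_vector` stands for the embedded user question passed as a parameter:

```sql
-- Return the two rows whose embeddings are closest to the query vector
SELECT "Name", "Description"
FROM "DesignPatterns"
ORDER BY "Embedding" <=> @query_vector
LIMIT 2;
```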
We will be using CosineDistance for our similarity search.
Console.WriteLine("Enter your question:");
var question = Console.ReadLine();

var questionEmbedding = await ollamaApiClient.EmbedAsync(question!);
var questionVector = new Vector(questionEmbedding.Embeddings[0].ToArray());

var topSimilarResults = await dbContext.DesignPatterns
    .OrderBy(x => x.Embedding.CosineDistance(questionVector))
    .Take(2)
    .ToListAsync();

if (topSimilarResults.Count == 0)
{
    Console.WriteLine("No similar results found");
}
else
{
    Console.WriteLine("Top similar design patterns:");
    foreach (var designPattern in topSimilarResults)
    {
        Console.WriteLine($"{designPattern.Name}: {designPattern.Description}");
    }
}
Enhancing Results with RAG
Instead of returning raw results, we can improve the experience using an LLM.
Steps:
- Retrieve relevant data using vector search
- Inject that data into a prompt
- Ask the LLM to generate a response
This is called Retrieval-Augmented Generation (RAG). It ensures responses are grounded in your actual data.
I’m introducing an additional client to interact with the LLM, using the Phi-4 model.
var ollamaApiChatClient = new OllamaApiClient("http://localhost:11434", "phi4-mini:latest");
var chatPrompt = "You are an expert solution architect. Answer the user's question using the design patterns below.\n\n";
foreach (var designPattern in topSimilarResults)
{
    chatPrompt += $"Pattern: {designPattern.Name}\nDescription: {designPattern.Description}\n\n";
}
chatPrompt += $"User Question: {question}";
var chatResponse = await ollamaApiChatClient.GetResponseAsync(chatPrompt);
Console.WriteLine(chatResponse.Text);
In the screenshot below, you can see how the application responded to a user query by generating a context-aware response based on the retrieved information.
References
Build a .NET AI vector search app
pgvector: Embeddings and vector similarity
OllamaSharp
Source code of this demo app is available here

