DEV Community

rinesh
Building a .NET RAG Application with PostgreSQL pgvector for AI Vector Search

Semantic search goes beyond keyword matching by understanding the meaning behind a query. It’s a core building block for modern AI use cases like Retrieval-Augmented Generation (RAG), intelligent search, and recommendation systems. Vector databases are essential for these use cases. Unlike traditional databases, they are optimized for storing and querying high-dimensional embedding vectors, enabling efficient similarity search and more context-aware AI applications.

In this blog, we’ll build a .NET console application that performs semantic search using:

  • PostgreSQL + pgvector as the vector database
  • Ollama for local embedding and chat models
  • EF Core for data access

We’ll generate embeddings for our data, store them in PostgreSQL, and compare them against query embeddings to retrieve contextually relevant results.

Prerequisites

1. PostgreSQL with pgvector

You can install PostgreSQL locally or use Docker.

If using Docker, run a pgvector-enabled image and expose port 5432. Note that the underlying postgres image requires a superuser password via the POSTGRES_PASSWORD environment variable.

docker run -d \
  --name postgres-pgvector \
  -e POSTGRES_PASSWORD=postgres \
  -p 5432:5432 \
  ankane/pgvector

After setup, enable the vector extension in your database:

CREATE EXTENSION IF NOT EXISTS vector;
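To confirm the extension is installed, you can query PostgreSQL’s pg_extension catalog (a quick sanity check):

```sql
-- Shows the installed vector extension and its version
SELECT extname, extversion
FROM pg_extension
WHERE extname = 'vector';
```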

2. Ollama (Local LLM)

Install Ollama from https://ollama.com

Pull required models:

ollama pull nomic-embed-text
ollama pull phi4-mini  

Start the Ollama server:

ollama serve  

Creating the .NET Console Application

Create a new console app:

dotnet new console -o VectorSearchApp

Next, add the required packages. Here I'm using EF Core and dependency injection for configuring database access.

dotnet add package Microsoft.Extensions.Hosting
dotnet add package Microsoft.EntityFrameworkCore
dotnet add package Microsoft.EntityFrameworkCore.Design
dotnet add package Npgsql.EntityFrameworkCore.PostgreSQL
dotnet add package Pgvector.EntityFrameworkCore
dotnet add package OllamaSharp

Designing the Data Model

We define a simple entity to store the data:

using Pgvector;

public class DesignPatternEntity
{
    public Guid Id { get; set; } = Guid.NewGuid();

    public string Name { get; set; } = default!;

    public string Description { get; set; } = default!;

    public Vector Embedding { get; set; } = default!;
}

The vector embedding is stored using the Vector type from the Pgvector library.

Configuring the Database Context

Configure the DbContext to use PostgreSQL via Npgsql, enable the vector extension, and map the Embedding property to a vector(768) column (matching the 768-dimensional output of nomic-embed-text):
using Microsoft.EntityFrameworkCore;

public class VectorSearchAppDbContext : DbContext
{
    public VectorSearchAppDbContext(DbContextOptions<VectorSearchAppDbContext> options) : base(options)
    {
    }

    public DbSet<DesignPatternEntity> DesignPatterns => Set<DesignPatternEntity>();

    protected override void OnModelCreating(ModelBuilder modelBuilder)
    {
        modelBuilder.HasPostgresExtension("vector");

        modelBuilder.Entity<DesignPatternEntity>()
            .Property(x => x.Embedding)
            .HasColumnType("vector(768)"); // match the model dimension
    }
}
Add the connection string to appsettings.json:

{
  "ConnectionStrings": {
    "Default": "Host=localhost;Database=vector_db;Username=rinesh"
  }
}

Next, we’ll set up dependency injection and configure the required services as follows.

using Microsoft.EntityFrameworkCore;
using Microsoft.Extensions.Configuration;
using Microsoft.Extensions.DependencyInjection;
using Pgvector.EntityFrameworkCore;

var services = new ServiceCollection();

var configuration = new ConfigurationBuilder()
    .SetBasePath(Directory.GetCurrentDirectory())
    .AddJsonFile("appsettings.json", optional: false, reloadOnChange: true)
    .Build();

services.AddDbContext<VectorSearchAppDbContext>(options =>
{
    options.UseNpgsql(
        configuration.GetConnectionString("Default"),
        o => o.UseVector());
});

Now use EF Core migrations to create your database with the DesignPatterns table (install the dotnet-ef tool first if needed: dotnet tool install --global dotnet-ef):

dotnet ef migrations add InitialCreate --output-dir Data/Migrations
dotnet ef database update 
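Applying the migration produces a table roughly like the following (a sketch of the generated DDL; the exact output from EF Core may differ slightly):

```sql
CREATE TABLE "DesignPatterns" (
    "Id" uuid NOT NULL,
    "Name" text NOT NULL,
    "Description" text NOT NULL,
    "Embedding" vector(768) NOT NULL,
    CONSTRAINT "PK_DesignPatterns" PRIMARY KEY ("Id")
);
```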

Database Table

Generating Embeddings

To enable semantic search, we convert text into embeddings. Let's seed some data into the table, using an embedding model from Ollama to generate vector embeddings during seeding.

Process:

  1. Take the description of each record
  2. Generate embeddings using the Ollama embedding model
  3. Store vectors in PostgreSQL

var ollamaApiClient = new OllamaApiClient("http://localhost:11434", "nomic-embed-text");
services.AddSingleton(ollamaApiClient);

var provider = services.BuildServiceProvider();
var dbContext = provider.GetRequiredService<VectorSearchAppDbContext>();
var embeddingGenerator = provider.GetRequiredService<OllamaApiClient>();

if (!dbContext.DesignPatterns.Any())
{
    List<DesignPatternEntity> designPatterns =
    [
        new()
        {
            Name = "API Gateway Pattern",
            Description = "The API Gateway pattern acts as a single entry point for all client requests in a microservices architecture. It handles request routing, composition, authentication, and rate limiting. This pattern helps simplify client interactions and improves security and observability."
        },
        new()
        {
            Name = "CQRS Pattern",
            Description = "Command Query Responsibility Segregation (CQRS) separates read and write operations into different models. This improves scalability and performance, especially in systems with high read and write loads. It is often combined with Event Sourcing."
        },
        new()
        {
            Name = "Event-Driven Architecture",
            Description = "Event-driven architecture enables services to communicate asynchronously using events. Producers emit events without knowing consumers, allowing loose coupling and scalability. This pattern is ideal for distributed systems and real-time processing."
        },
        // Other design patterns with names and descriptions — omitted for brevity.
    ];

    foreach (var designPattern in designPatterns)
    {
        var embedding = await embeddingGenerator.EmbedAsync(designPattern.Description!);

        designPattern.Embedding = new Vector(embedding.Embeddings[0].ToArray());

        dbContext.DesignPatterns.Add(designPattern);
    }

    await dbContext.SaveChangesAsync();
}
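A mismatch between the embedding model's output size and the vector(768) column only surfaces as a database error at insert time. A small guard makes the failure mode explicit; EnsureDimension here is a hypothetical helper, not part of any library:

```csharp
using System;

// Hypothetical guard: validate the embedding's dimension before
// assigning it to the entity, so a model mismatch fails fast.
static float[] EnsureDimension(float[] values, int expected = 768)
{
    if (values.Length != expected)
        throw new InvalidOperationException(
            $"Expected {expected}-dimensional embedding, got {values.Length}.");
    return values;
}
```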

Performing Semantic Search

When a user enters a query:

  1. Convert the query into an embedding
  2. Compare it with stored vectors
  3. Retrieve the most similar results

pgvector provides operators for similarity search: cosine distance (<=>), Euclidean (L2) distance (<->), and negative inner product (<#>).

The most commonly used is cosine distance, where lower values indicate higher similarity:

ORDER BY "Embedding" <=> query_vector

The Pgvector.EntityFrameworkCore package provides EF Core equivalents for these operators:

SQL   EF Core
<=>   .CosineDistance()
<->   .L2Distance()
<#>   .MaxInnerProduct()

We will be using CosineDistance for our similarity search.
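For intuition, the cosine distance pgvector computes can be written out in plain C#; a minimal sketch (illustrative only, since the real computation happens inside PostgreSQL):

```csharp
using System;

// Cosine distance = 1 - cosine similarity.
// 0 means identical direction, 1 means orthogonal, 2 means opposite.
static double CosineDistance(float[] a, float[] b)
{
    if (a.Length != b.Length)
        throw new ArgumentException("Vectors must have the same dimension.");

    double dot = 0, normA = 0, normB = 0;
    for (int i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];    // dot product
        normA += a[i] * a[i];  // squared magnitude of a
        normB += b[i] * b[i];  // squared magnitude of b
    }

    return 1 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}
```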

Console.WriteLine("Enter your question:");
var question = Console.ReadLine();

var questionEmbedding = await ollamaApiClient.EmbedAsync(question!);
var questionVector = new Vector(questionEmbedding.Embeddings[0].ToArray());

var topSimilarResults = await dbContext.DesignPatterns
    .OrderBy(x => x.Embedding.CosineDistance(questionVector))
    .Take(2)
    .ToListAsync();

if (topSimilarResults.Count == 0)
{
    Console.WriteLine("No similar results found");
}
else
{
    Console.WriteLine("Top similar design patterns:");
    foreach (var designPattern in topSimilarResults)
    {
        Console.WriteLine($"{designPattern.Name}: {designPattern.Description}");
    }
}

Enhancing Results with RAG

Instead of returning raw results, we can improve the experience using an LLM.

Steps:

  1. Retrieve relevant data using vector search
  2. Inject that data into a prompt
  3. Ask the LLM to generate a response

This is called Retrieval-Augmented Generation (RAG). It ensures responses are grounded in your actual data.

I’m introducing an additional client to interact with the LLM, using the Phi-4 model.

var ollamaApiChatClient = new OllamaApiClient("http://localhost:11434", "phi4-mini:latest");

var chatPrompt = "You are an expert solution architect.\n";

foreach (var designPattern in topSimilarResults)
{
    chatPrompt += $"Pattern: {designPattern.Name}\nDescription: {designPattern.Description}\n\n";
}

chatPrompt += $"User Question: {question}";

var chatResponse = await ollamaApiChatClient.GetResponseAsync(chatPrompt);

Console.WriteLine(chatResponse.Text);

In the screenshot below, you can see how the application responded to a user query by generating a context-aware response based on the retrieved information.

LLM Response


Source code of this demo app is available here
