I usually build simple applications that expose public APIs backed by PostgreSQL databases. Sometimes, though, that isn't enough: businesses often need more flexible and intelligent search than PostgreSQL's built-in capabilities provide. One might argue that PostgreSQL already supports full-text search. It does, but full-text search is about finding specific words or phrases. Semantic search focuses on understanding contextual meaning rather than exact keyword matches, and that's what we'll explore.
Database side
The first solution that comes to mind is Elasticsearch. It’s an excellent tool for our problem, but it introduces extra cost, additional infrastructure, and the need to learn and integrate a new technology. Since we already use PostgreSQL, we can simply enable the pgvector extension for vector search.
- Install the NuGet package that adds vector support (e.g. Pgvector.EntityFrameworkCore for EF Core).
- Configure a model to store and search vectors. I'm using the ivfflat index type, which is optimised for approximate nearest neighbor search, and enabling vector_cosine_ops for cosine distance similarity search.
modelBuilder.HasPostgresExtension("vector");

modelBuilder.Entity<TextEmbedding>(entity =>
{
    entity.HasKey(e => e.Id);
    entity.Property(e => e.Content).IsRequired();
    // all-minilm produces 384-dimensional vectors
    entity.Property(e => e.Embedding).HasColumnType("vector(384)");
    entity.Property(e => e.CreatedAt).IsRequired();
    // Add an index for vector similarity search
    entity.HasIndex(e => e.Embedding)
        .HasMethod("ivfflat")
        .HasOperators("vector_cosine_ops");
});
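For reference, the EF Core migration generated from this model produces DDL along these lines (a sketch: the table, column, and index names, and the uuid key type, are assumptions that depend on your naming conventions and entity definition):

```sql
CREATE EXTENSION IF NOT EXISTS vector;

CREATE TABLE text_embeddings (
    id uuid PRIMARY KEY,
    content text NOT NULL,
    embedding vector(384),
    created_at timestamptz NOT NULL
);

-- ivfflat is approximate: it clusters vectors and searches only the
-- nearest clusters, trading a little recall for much faster queries
CREATE INDEX ix_text_embeddings_embedding
    ON text_embeddings USING ivfflat (embedding vector_cosine_ops);
```

Seeing the raw DDL makes it clear that nothing EF-specific is stored: the same table can be queried directly with pgvector's operators.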
Application part
Now that the database is ready to work with vectors, let’s move to the application side. We'll register an embedding service, create an endpoint to store text embeddings, and finally build a semantic search endpoint.
1. Register the embedding service.
Microsoft provides an abstraction layer for working with large language models via the Microsoft.Extensions.AI NuGet package, so we don't have to depend on a specific provider. I will use Ollama, a free, open-source tool that simplifies running LLMs locally, together with the OllamaSharp NuGet package to work with it from .NET.
services.AddSingleton<IEmbeddingGenerator<string, Embedding<float>>>(sp =>
{
    var modelId = builder.Configuration["Ollama:EmbeddingModel"];
    var baseUrl = builder.Configuration["Ollama:BaseUrl"];
    return new OllamaApiClient(baseUrl, modelId);
});
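The two configuration keys read above might look like this in appsettings.json (the values shown are assumptions for a local Ollama instance serving the all-minilm model on its default port):

```json
{
  "Ollama": {
    "BaseUrl": "http://localhost:11434",
    "EmbeddingModel": "all-minilm"
  }
}
```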
2. Create an endpoint to convert text to vectors and store them
app.MapPost("/text", async (
    Text[] request,
    EmbeddingDbContext context,
    IEmbeddingGenerator<string, Embedding<float>> embeddingService) =>
{
    // Convert the request texts to vectors
    var embeddings = await embeddingService.GenerateAsync(
        request.Select(s => s.Content));

    var dbModels = embeddings.Select((embedding, index) => new TextEmbedding
    {
        Content = request[index].Content,
        Embedding = new Vector(embedding.Vector),
        CreatedAt = DateTime.UtcNow
    }).ToList();

    context.AddRange(dbModels);
    await context.SaveChangesAsync();

    return TypedResults.Created();
});
3. The magic happens here - semantic search using cosine similarity
app.MapGet("/text/search", async (
    string query,
    EmbeddingDbContext context,
    IEmbeddingGenerator<string, Embedding<float>> embeddingService) =>
{
    const int limit = 5;

    // Convert the query to a vector
    var embedding = await embeddingService.GenerateAsync(query);
    var queryEmbedding = new Vector(embedding.Vector);

    // Find the most similar texts using cosine distance
    var similarTexts = await context.TextEmbeddings
        .OrderBy(x => x.Embedding.CosineDistance(queryEmbedding))
        .Take(limit)
        .Select(x => new Text(x.Content))
        .ToListAsync();

    return TypedResults.Ok(similarTexts);
});
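The cosine distance that vector_cosine_ops computes is language-independent, so it can be illustrated in a few lines of plain Python (a sketch for intuition only, not part of the .NET stack above):

```python
import math

def cosine_distance(a, b):
    # Cosine distance = 1 - cosine similarity, which is what
    # pgvector's <=> operator (vector_cosine_ops) orders by
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (norm_a * norm_b)

# Vectors pointing the same way have distance 0;
# orthogonal (unrelated) vectors have distance 1
print(cosine_distance([1.0, 0.0], [2.0, 0.0]))  # 0.0
print(cosine_distance([1.0, 0.0], [0.0, 1.0]))  # 1.0
```

Only the direction of the vectors matters, not their length, which is why cosine distance is a common default for text embeddings.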
Testing locally
To run everything locally, we can use Ollama for generating embeddings. This approach avoids cloud APIs (like OpenAI or Azure) and keeps everything self-contained.
ollama:
  image: ollama/ollama:latest
  container_name: ollama
  ports:
    - "11434:11434"
  entrypoint:
    - "/bin/bash"
    - "-c"
    - "ollama serve & sleep 5 && ollama pull all-minilm && wait"
  environment:
    - OLLAMA_KEEP_ALIVE="24h"
  volumes:
    - ollama_data:/root/.ollama
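Note that the named volume referenced by the service also has to be declared at the top level of the compose file, otherwise docker-compose will reject the configuration:

```yaml
volumes:
  ollama_data:
```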
The rest of the stack - the API, PostgreSQL, and database migrations - can be managed within the same Docker Compose setup.
Conclusion
By combining EF Core, PostgreSQL, and the pgvector extension, we can bring powerful semantic search capabilities directly into ASP.NET applications — without introducing new infrastructure.
The approach is efficient, cost-effective, and fully compatible with familiar .NET tools. Thanks to the Microsoft.Extensions.AI abstraction, we can easily swap between local (Ollama) and cloud (OpenAI, Azure) embeddings. This setup is ideal for building document search, recommendation engines, similarity detection, and RAG-based applications.
You can find the full implementation details in my EmbeddingPoC repository.
Architecture
flowchart LR
  subgraph Architecture
    API[API .NET]
    DB[(PostgreSQL using pgvector)]
    OLLAMA[Ollama LLM]
    MIGRATIONS[EF Migrations]
  end
  API <--> DB
  API <--> OLLAMA
  MIGRATIONS --> DB
API Endpoints
| Method | Endpoint     | Description                      |
| ------ | ------------ | -------------------------------- |
| GET    | /text        | List first 20 texts              |
| POST   | /text        | Add texts (array of { content }) |
| GET    | /text/search | Vector search (?query=...)       |
Testing the API
You can use the provided .http file to easily test the API endpoints directly from Visual Studio Code or other compatible tools. This file contains example requests for adding, searching, and listing texts.
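The requests in that file look roughly like the following (a sketch based on the endpoint table above; the host and port are assumptions and depend on your launch settings):

```http
POST http://localhost:5000/text
Content-Type: application/json

[
  { "content": "PostgreSQL is an open-source relational database." },
  { "content": "Ollama runs large language models locally." }
]

###

GET http://localhost:5000/text/search?query=open source databases
```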
EmbeddingPoC
A proof-of-concept .NET API for text embedding and semantic search using PostgreSQL with pgvector and Ollama for embedding generation.
Features
- Store and retrieve text embeddings in PostgreSQL using Entity Framework Core and pgvector.
- Generate embeddings via Ollama API.
- REST endpoints for:
- Adding new texts and their embeddings.
- Searching for similar texts using vector similarity.
- Listing stored texts.
Usage
Simply run:
docker-compose up --build
…