DEV Community

Thang Chung
CoffeeShop App Infused with AI - Intelligent Apps Development

Introduction

As we all know, AI (GenAI) has been gaining traction in recent years; it is an undeniable trend and a paradigm shift for everything we will develop in the future. For many years, my job did not seem closely related to GenAI, but recently, while watching .NET Conf 2024, I realized it might affect what I'm doing in the next couple of months.

What I want to build is semantic search (finding similar words by synonym meaning, even across languages) and generating some random text for seeding data, using GenAI with some popular LLM models.

This article is the result of my research and work on GenAI with .NET apps, and it is the bedrock for everything I will do next. Let's get started with the application I would like to build, as follows.


CoffeeShop with GenAI - technical stuff

Source code for these scenarios can be found at https://github.com/thangchung/practical-dotnet-aspire

I intend to use Ollama for local development (to save costs), and in higher environments, I use the Azure OpenAI service.

The business use cases for these scenarios are semantic search and chat completion (actually, text summarization for data seeding).

Semantic search with GenAI

Image from: https://blog.dataiku.com/semantic-search-an-overlooked-nlp-superpower

Let's say we have the word chicken in the database; we can now search for it with rooster, or even poulet (French).

The technologies used to implement semantic search in this scenario are pgvector and its .NET packages. We use cosine distance search, which is supported by the pgvector extension.

Supported distance functions are:

  • <-> - L2 distance
  • <#> - (negative) inner product
  • <=> - cosine distance
  • <+> - L1 distance (added in 0.7.0)
  • <~> - Hamming distance (binary vectors, added in 0.7.0)
  • <%> - Jaccard distance (binary vectors, added in 0.7.0)
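As a quick refresher, cosine distance is 1 minus the cosine similarity of two vectors. A minimal C# sketch, purely for illustration (pgvector computes this inside Postgres, so you would not write this yourself):

```csharp
// Cosine distance between two vectors: 1 - (a·b) / (|a| * |b|).
static double CosineDistance(float[] a, float[] b)
{
    double dot = 0, normA = 0, normB = 0;
    for (var i = 0; i < a.Length; i++)
    {
        dot += a[i] * b[i];     // accumulate the dot product
        normA += a[i] * a[i];   // accumulate squared magnitudes
        normB += b[i] * b[i];
    }
    return 1 - dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
}
```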

And we use <=> (cosine distance) for this scenario:

SELECT p.id, p.description, p.embedding, p.price, p.type, p.updated, p.embedding <=> @__vector_0 AS "Distance"
FROM item.products AS p
ORDER BY p.embedding <=> @__vector_0
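The SQL above is roughly what EF Core emits for a LINQ query like the following sketch (assuming a Product entity with a Pgvector Vector Embedding property mapped via the pgvector.EntityFrameworkCore package; the entity and context names are illustrative):

```csharp
// Illustrative: order products by cosine distance to the query embedding.
// CosineDistance comes from pgvector.EntityFrameworkCore and translates to <=>.
var vector = await productItemAI.GetEmbeddingAsync("cafe");
var results = await dbContext.Products
    .OrderBy(p => p.Embedding!.CosineDistance(vector))
    .Take(10)
    .ToListAsync();
```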

Recently, I read the blog post at https://nikiforovall.github.io/dotnet/2024/10/19/semantic-search-via-elastic-dotnet.html, and it made me want to dive into semantic search via GenAI very soon.

Chat completion (text summary)

If you use ChatGPT, then you know exactly what I'm talking about.


In my scenario, I use this feature to summarize a keyword that I provide, letting the LLM model infer and generate a summary from it. For example, given COFFEE_BLACK and a simple prompt like Generate the description of COFFEE_BLACK in max 20 words, it will generate a description like Coffee black is a rich, bold brew, showcasing the pure essence of coffee without milk or sugar for an intense flavour. How cool is that?

LLM model usages

The LLM models used are:

  • Ollama
    • Embedding model: all-minilm
    • Chat model: llama3.2:1b
  • Azure OpenAI service
    • Embedding model: text-embedding-3-small
    • Chat model: gpt-4o-mini

I leveraged .NET Aspire 9 to orchestrate all the components used in these scenarios.


Implementation using Microsoft.Extensions.AI AI building blocks


Some NuGet packages that we used for these scenarios:

<PackageVersion Include="Microsoft.Extensions.AI" Version="$(AIExtensions)" />
<PackageVersion Include="Microsoft.Extensions.AI.Abstractions" Version="$(AIExtensions)" />
<PackageVersion Include="Microsoft.Extensions.AI.Ollama" Version="$(AIExtensions)" />
<PackageVersion Include="Microsoft.Extensions.AI.OpenAI" Version="$(AIExtensions)" />
<PackageVersion Include="Azure.AI.OpenAI" Version="2.1.0-beta.2" />

And some .NET Aspire components:

<PackageVersion Include="Aspire.Hosting" Version="$(AspireVersion)" />
<PackageVersion Include="Aspire.Hosting.AppHost" Version="$(AspireVersion)" />
<PackageVersion Include="Aspire.Hosting.PostgreSQL" Version="$(AspireVersion)" />
<PackageVersion Include="Aspire.Hosting.RabbitMQ" Version="$(AspireVersion)" />
<PackageVersion Include="Aspire.Hosting.Redis" Version="$(AspireVersion)" />
<PackageVersion Include="Aspire.Hosting.Testing" Version="$(AspireVersion)" />
<PackageVersion Include="Aspire.Npgsql.EntityFrameworkCore.PostgreSQL" Version="9.0.0-rc.2.24551.3" />
<PackageVersion Include="Aspire.Azure.AI.OpenAI" Version="9.0.0-preview.5.24551.3" />
<PackageVersion Include="CommunityToolkit.Aspire.Hosting.Ollama" Version="9.0.0-beta.66" />

The .NET Aspire Community Toolkit announced this integration just a couple of days ago at https://github.com/CommunityToolkit/Aspire/tree/main/src/CommunityToolkit.Aspire.OllamaSharp. We use it to simplify the setup of Ollama and its models.

The .NET Aspire AppHost:

using CoffeeShop.AppHost;

var builder = DistributedApplication.CreateBuilder(args);

var postgresQL = builder.AddPostgres("postgresQL")
                        .WithImage("ankane/pgvector")
                        .WithImageTag("latest")
                        .WithLifetime(ContainerLifetime.Persistent)
                        .WithHealthCheck()
                        .WithPgWeb()
                        //.WithPgAdmin()
                        ;
var postgres = postgresQL.AddDatabase("postgres");

var redis = builder.AddRedis("redis")
                    // .WithContainerName("redis") // use an existing container
                    .WithLifetime(ContainerLifetime.Persistent)
                    .WithHealthCheck()
                    .WithRedisCommander();

var rabbitmq = builder.AddRabbitMQ("rabbitmq")
                        .WithLifetime(ContainerLifetime.Persistent) 
                        .WithHealthCheck()
                        .WithManagementPlugin();

var ollama = builder.AddOllama("ollama")
                .WithImageTag("0.3.14")
                .WithLifetime(ContainerLifetime.Persistent)
                .WithDataVolume()
                //.WithOpenWebUI()
                ;

var allMinilmModel = ollama.AddModel("all-minilm", "all-minilm");
var llama32Model = ollama.AddModel("llama32", "llama3.2:1b");

var productApi = builder.AddProject<Projects.CoffeeShop_ProductApi>("product-api")
                        .WithReference(postgres).WaitFor(postgres)
                        .WithEnvironment($"ai:Type", "ollama")
                        .WithEnvironment($"ai:EMBEDDINGMODEL", "all-minilm")
                        .WithEnvironment($"ai:CHATMODEL", "llama3.2:1b")
                        .WithReference(ollama).WaitFor(allMinilmModel).WaitFor(llama32Model)
                        .WithSwaggerUI();

// set to true if you want to use OpenAI
bool useOpenAI = true;
if (useOpenAI)
{
    // builder.AddOpenAI(productApi);
    var openAI = builder.AddConnectionString("openai");
    productApi
            .WithReference(openAI)
            .WithEnvironment("ai:Type", "openai")
            .WithEnvironment("ai:EMBEDDINGMODEL", "text-embedding-3-small")
            .WithEnvironment("ai:CHATMODEL", "gpt-4o-mini");
}

builder.AddProject<Projects.CoffeeShop_Yarp>("yarp")
    .WithReference(productApi).WaitFor(productApi);

builder.Build().Run();

IEmbeddingGenerator implementation

We need to register IEmbeddingGenerator in Program.cs:

if (builder.Configuration.GetValue<string>("ai:Type") is string type && type is "ollama")
{
    builder.Services.AddEmbeddingGenerator<string, Embedding<float>>(b => b
        .UseOpenTelemetry()
        .UseLogging()
        .Use(new OllamaEmbeddingGenerator(
            new Uri(builder.Configuration["AI:OLLAMA:Endpoint"]!),
            "all-minilm")));
}
else
{
    builder.AddAzureOpenAIClient("openai");
    builder.Services.AddEmbeddingGenerator<string, Embedding<float>>(b => b
        .UseOpenTelemetry()
        .UseLogging()
        .Use(b.Services.GetRequiredService<OpenAIClient>().AsEmbeddingGenerator(builder.Configuration.GetValue<string>("ai:EMBEDDINGMODEL")!)));
}
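With that registration in place, the generator can be resolved anywhere via DI. A small usage sketch (the input string is illustrative):

```csharp
// Resolve the registered generator and produce an embedding for a single string.
var generator = app.Services.GetRequiredService<IEmbeddingGenerator<string, Embedding<float>>>();
var embeddings = await generator.GenerateAsync(["cafe"]);
ReadOnlyMemory<float> vector = embeddings[0].Vector;
```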

And we will create an embedding generator engine for a product item:

public interface IProductItemAI
{
    bool IsEnabled { get; }

    ValueTask<Vector> GetEmbeddingAsync(string text);

    ValueTask<Vector> GetEmbeddingAsync(ItemV2 item);

    ValueTask<IReadOnlyList<Vector>> GetEmbeddingsAsync(IEnumerable<ItemV2> item);
}

And its implementation:

public class ProductItemAI(
    IWebHostEnvironment environment, 
    ILogger<ProductItemAI> logger, 
    IEmbeddingGenerator<string, Embedding<float>> embeddingGenerator = null) 
    : IProductItemAI
{
    private const int EmbeddingDimensions = 384;

    private readonly ILogger _logger = logger;

    public bool IsEnabled => embeddingGenerator is not null;

    public ValueTask<Vector> GetEmbeddingAsync(ItemV2 item) =>
        IsEnabled ?
            GetEmbeddingAsync(CatalogItemToString(item)) :
            ValueTask.FromResult<Vector>(null);

    public async ValueTask<IReadOnlyList<Vector>> GetEmbeddingsAsync(IEnumerable<ItemV2> items)
    {
        // remove for brevity
    }

    public async ValueTask<Vector> GetEmbeddingAsync(string text)
    {
        // remove for brevity
    }

    private string CatalogItemToString(ItemV2 item)
    {
        _logger.LogDebug("{item.Type} {item.Description}", item.Type, item.Description);
        return $"{item.Type} {item.Description}";
    }
}
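The elided GetEmbeddingAsync(string) could look roughly like this sketch (assuming the Pgvector Vector type; this is not the author's exact implementation):

```csharp
// Hypothetical sketch: generate the embedding and wrap it in a pgvector Vector.
public async ValueTask<Vector> GetEmbeddingAsync(string text)
{
    if (!IsEnabled) return null;
    var result = await embeddingGenerator.GenerateAsync([text]);
    return new Vector(result[0].Vector);
}
```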

Then, whenever we want to generate an embedding vector, we can simply use it like:

IReadOnlyList<Vector> embeddings = await catalogAI.GetEmbeddingsAsync(catalogItems);

Chat Completion implementation

// Program.cs
builder.AddChatCompletionService("openai");
// ChatCompletionServiceExtensions.cs
public static class ChatCompletionServiceExtensions
{
    public static void AddChatCompletionService(this IHostApplicationBuilder builder, string serviceName)
    {
        var pipeline = (ChatClientBuilder pipeline) => pipeline
            .UseFunctionInvocation()
            .UseOpenTelemetry(configure: c => c.EnableSensitiveData = true);

        if (builder.Configuration["ai:Type"] == "openai")
        {
            builder.AddOpenAIChatClient(serviceName, pipeline);
        }
        else
        {
            builder.AddOllamaChatClient(serviceName, pipeline);
        }
    }
    // remove for brevity
    // ...
}
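One of the elided helpers, AddOllamaChatClient, might be sketched like this (the endpoint and model configuration keys are assumptions based on the AppHost environment variables shown earlier, and the preview APIs may differ):

```csharp
// Hypothetical sketch of the elided helper: build an OllamaChatClient
// and run it through the supplied middleware pipeline.
private static void AddOllamaChatClient(
    this IHostApplicationBuilder builder,
    string serviceName,
    Func<ChatClientBuilder, ChatClientBuilder> pipeline)
{
    var endpoint = new Uri(builder.Configuration["AI:OLLAMA:Endpoint"]!);
    var model = builder.Configuration["ai:CHATMODEL"]!;
    builder.Services.AddChatClient(b => pipeline(b)
        .Use(new OllamaChatClient(endpoint, model)));
}
```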

Then, use it like:

var prompt = $"Generate the description of {catalogItems[i].Type} in max 20 words";
var response = await chatClient.CompleteAsync(prompt);
catalogItems[i].SetDescription(response.Message?.Text);

Ollama screenshots


Chat completion to summarize text


We run it during data seeding (ProductDbContextSeeder.cs).
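That seeding step might look roughly like this sketch, combining the chat client and embedding generator shown above (SetEmbedding is a hypothetical setter on the item entity):

```csharp
// Hypothetical seeding flow: summarize each item with the chat model,
// then compute and persist its embedding vector.
foreach (var item in catalogItems)
{
    var prompt = $"Generate the description of {item.Type} in max 20 words";
    var response = await chatClient.CompleteAsync(prompt);
    item.SetDescription(response.Message?.Text);
}

var vectors = await productItemAI.GetEmbeddingsAsync(catalogItems);
for (var i = 0; i < catalogItems.Count; i++)
{
    catalogItems[i].SetEmbedding(vectors[i]); // SetEmbedding is illustrative
}
```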

Semantic search

GET https://{{hostname}}/p/api/v2/item-types?q=cafe
content-type: application/json


Azure OpenAI service screenshots

Chat completion to summarize text


We run it during data seeding (ProductDbContextSeeder.cs).

Semantic search

GET https://{{hostname}}/p/api/v2/item-types?q=cafe
content-type: application/json


Azure AI Studio


It took around 345 tokens to embed 11 product items in this scenario:


And a total of 496 tokens across around 8 requests:



That's enough for today. Happy hacking!
