Uroš Miletić

Posted on • Originally published at uveta.io

Deploy and use DeepSeek R1 with Azure and .NET

Introduction

DeepSeek models have taken the technology world by surprise, demonstrating that cutting-edge AI development is no longer confined to a certain valley made of silicon but has become a global phenomenon. Although Microsoft has traditionally partnered with OpenAI, users of its technologies still have reason to be optimistic: the Azure cloud platform recently announced support for the DeepSeek R1 model through its Azure AI Foundry service. Currently in public preview, the model can be run in serverless mode free of charge. This article will guide you through deploying the R1 model and integrating it with .NET applications.

Deploy DeepSeek R1 on Azure AI Foundry

Deploying the DeepSeek model on Azure is straightforward, even for those new to Azure AI Foundry (formerly Azure AI Studio).

Start by creating a new hub, which serves as a container for your AI applications and models. This can be done via the AI Foundry Management Center. Note that the region you select for your hub will impact model availability. As of February 2025, the DeepSeek R1 model is available only in the East US, East US 2, West US, West US 3, South Central US, and North Central US regions.

Creating Azure AI Foundry hub

Next, you need to create a new project. In the Management Center, select the hub you created and click the "New project" button. Provide a name for your project and click "Create". Your project will be ready in a few seconds.

Creating Azure AI Foundry project

Once your hub and project are ready, you can deploy the DeepSeek R1 model. Navigate to the Model catalog tab within your project, search for the "DeepSeek R1" model, and click "Deploy". Provide a region-unique name for your deployment and optionally apply content filters. Click "Deploy" again to start provisioning the model, which may take a few minutes.

Deploying DeepSeek R1 model

After deployment finishes, you will find the model in the Models + endpoints tab of your project. Select the deployment name to access detailed information, including the endpoint URL and API key, which are necessary for programmatic consumption.

DeepSeek R1 deployment details
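
Rather than hard-coding these values, consider reading them from environment variables so they stay out of source control. A minimal sketch (the variable names below are hypothetical):

// Configuration sketch: read the deployment endpoint URL and API key from
// environment variables (hypothetical names) instead of hard-coding them.
string endpoint = Environment.GetEnvironmentVariable("DEEPSEEK_ENDPOINT")
    ?? throw new InvalidOperationException("DEEPSEEK_ENDPOINT is not set.");
string apiKey = Environment.GetEnvironmentVariable("DEEPSEEK_API_KEY")
    ?? throw new InvalidOperationException("DEEPSEEK_API_KEY is not set.");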

Use the chat playground, available in the Playgrounds tab of your project, to ensure the deployment is functioning correctly. Make sure to select the DeepSeek R1 deployment before starting the conversation. This step helps verify that the model will work seamlessly when integrated programmatically.

Chat playground

Consume from .NET

Models deployed via Azure AI Foundry can be accessed from any programming language that supports HTTP requests. For .NET, Azure provides an SDK through the Azure AI Inference library. To consume the model, create a chat client using the deployment endpoint URL and API key, and then run a chat completion.

using Azure;
using Azure.AI.Inference;

const string Endpoint = "<ENDPOINT>";
const string ApiKey = "<API-KEY>";
const string SystemMessage =
    """
    Assistant is a conversational agent named DeepSeek. Its purpose is to discuss any topic with the user.
    Initially greet the user, introduce yourself, and mention that conversation can be stopped by typing "exit" in the chat. Then continue with the conversation.
    Limit answers to 50-100 words. If the user asks for more information, provide a brief answer and ask if they would like more details.
    """;

// Create the chat client from the deployment endpoint URL and API key.
var chatCompletion = new ChatCompletionsClient(new Uri(Endpoint), new AzureKeyCredential(ApiKey));

// Seed the conversation history with the system prompt and run a completion.
List<ChatRequestMessage> history = [new ChatRequestSystemMessage(SystemMessage)];
var options = new ChatCompletionsOptions { Messages = history, MaxTokens = 2048 };
var response = await chatCompletion.CompleteAsync(options);
string? message = response.Value.Content;
Console.WriteLine($"DeepSeek > {message}");
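
The snippet above only produces the model's initial greeting. To get the interactive chat the system prompt describes, each turn can append the user prompt and the model's reply to the shared history. A minimal sketch of such a loop, reusing the chatCompletion, history, and options variables from above (the complete sample linked in the conclusion handles this more fully):

// Interactive loop sketch: appending both sides of each turn to the shared
// history keeps the model aware of the conversation context.
while (true)
{
    Console.Write("User > ");
    string? input = Console.ReadLine();
    if (string.IsNullOrWhiteSpace(input) || input.Trim().Equals("exit", StringComparison.OrdinalIgnoreCase))
    {
        break; // the system prompt tells the user that typing "exit" ends the chat
    }

    history.Add(new ChatRequestUserMessage(input));
    var turn = await chatCompletion.CompleteAsync(options);
    string? reply = turn.Value.Content;
    history.Add(new ChatRequestAssistantMessage(reply ?? string.Empty));
    Console.WriteLine($"DeepSeek > {reply}");
}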

Consume from SemanticKernel

For more complex applications built on Semantic Kernel, consuming models deployed in Azure AI Foundry is just as straightforward. Use the Microsoft.SemanticKernel.Connectors.AzureAIInference connector library: register the AI Inference connector with the deployment name, endpoint URL, and API key while building the kernel. Once configured, build the kernel and use the IChatCompletionService service to run chat completion.

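Below is a minimal sketch of that flow. It assumes the same <ENDPOINT> and <API-KEY> placeholders as before, a <DEPLOYMENT-NAME> placeholder for the deployment name you chose earlier, and a current preview build of the connector (which is experimental at the time of writing, hence the SKEXP0070 suppression):

// Semantic Kernel sketch: register the Azure AI Inference connector while
// building the kernel, then resolve IChatCompletionService to run a completion.
// The connector is experimental at the time of writing, hence the suppression.
#pragma warning disable SKEXP0070
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;

const string Endpoint = "<ENDPOINT>";
const string ApiKey = "<API-KEY>";
const string DeploymentName = "<DEPLOYMENT-NAME>";
const string SystemMessage =
    """
    Assistant is a conversational agent named DeepSeek. Its purpose is to discuss any topic with the user.
    Initially greet the user, introduce yourself, and mention that conversation can be stopped by typing "exit" in the chat. Then continue with the conversation.
    Limit answers to 50-100 words. If the user asks for more information, provide a brief answer and ask if they would like more details.
    """;

// Register the AI Inference connector while building the kernel.
var builder = Kernel.CreateBuilder();
builder.AddAzureAIInferenceChatCompletion(DeploymentName, ApiKey, new Uri(Endpoint));
var kernel = builder.Build();

// Resolve the chat completion service and run a completion against the history.
var chatCompletion = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory(SystemMessage);
var reply = await chatCompletion.GetChatMessageContentAsync(history, kernel: kernel);
Console.WriteLine($"DeepSeek > {reply.Content}");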

Conclusion

Complete .NET and Semantic Kernel chat samples are available on GitHub. Make sure you add the deployment name, endpoint URL, and API key where indicated in the code to run the applications without issues.

Keep in mind that the DeepSeek R1 model on Azure is still in preview and is subject to throttling and rate limiting. While it may take from a couple of seconds up to a few minutes to receive a meaningful response, the service is currently free, allowing for extensive experimentation.
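
If you run into those rate limits, a simple client-side retry with backoff around the completion call can help. A hypothetical sketch; note that the Azure SDK pipeline already retries throttled requests by default, so treat this outer loop as illustrative:

// Resilience sketch: retry the completion call on HTTP 429 (throttling) with
// exponential backoff. Assumes the same placeholder endpoint and key as above.
using Azure;
using Azure.AI.Inference;

const string Endpoint = "<ENDPOINT>";
const string ApiKey = "<API-KEY>";

var client = new ChatCompletionsClient(new Uri(Endpoint), new AzureKeyCredential(ApiKey));
var options = new ChatCompletionsOptions
{
    Messages = { new ChatRequestUserMessage("Summarize why the sky is blue in two sentences.") },
    MaxTokens = 2048,
};

for (int attempt = 1; ; attempt++)
{
    try
    {
        var response = await client.CompleteAsync(options);
        Console.WriteLine(response.Value.Content);
        break;
    }
    catch (RequestFailedException ex) when (ex.Status == 429 && attempt < 5)
    {
        // Back off 2, 4, 8, then 16 seconds before giving up.
        await Task.Delay(TimeSpan.FromSeconds(Math.Pow(2, attempt)));
    }
}

For the long waits themselves, the library also exposes a streaming variant, CompleteStreamingAsync, which yields partial output as it is generated and can make the R1 model's long reasoning pauses feel less jarring.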
