This blog post demonstrates how to build an AI agent in .NET that accesses external data by calling web APIs through a Semantic Kernel plugin generated from an OpenAPI document, instead of relying on an MCP server.
Introduction
Ok, you got me, it's a clickbait post title. But hear me out. If you are into language models you cannot ignore the widespread enthusiasm that MCP brought. Lots of MCP servers popped up and support is widely available: from your favorite AI-infused code editor to one of the many Copilots Microsoft is trying to sell you, you can plug them in everywhere. And that is not a bad thing, they can add great value to your AI agents and workflows. But there is also a dark side to the success story. First of all, a lot of MCP servers are intended to run locally and because of that, they can pose a real security risk. Second, not all of them are maintained and updated well. And what if you have already invested in opening up your software by providing web APIs? That is what this post is all about. We will build an agent that transforms an OpenAPI document into a set of tools the model can call, using Semantic Kernel's OpenAPI capabilities.
Getting the ingredients right
For this scenario we will build an agent that can help us answer questions about flight status and real-time flight data using the Aviationstack API. You can sign up for a free access key in under a minute. After that, you can grab their OpenAPI document from here:
We also need a model to talk to. You can run one in the cloud, use Hugging Face, Microsoft Foundry Local or something else, but I chose* to use the qwen3 model through Ollama:
Ollama is the easiest way to get up and running with large language models such as gpt-oss, Gemma 3, DeepSeek-R1, Qwen3 and more.
Once installed, pull the model and start the server:
ollama pull qwen3
ollama serve
The following should show up, verifying it's running. Note the port number; we will need it later.
time=2025-09-29T18:51:05.926+02:00 level=INFO source=routes.go:1332 msg="server config" env="map[CUDA_VISIBLE_DEVICES: GPU_DEVICE_ORDINAL: HIP_VISIBLE_DEVICES: HSA_OVERRIDE_GFX_VERSION: HTTPS_PROXY: HTTP_PROXY: NO_PROXY: OLLAMA_CONTEXT_LENGTH:4096 OLLAMA_DEBUG:INFO OLLAMA_FLASH_ATTENTION:false OLLAMA_GPU_OVERHEAD:0 OLLAMA_HOST:http://127.0.0.1:11434
Now that we have the prerequisites in place, it is time to move on to some code. By the way, I am using a .NET 10 single-file application for this blog post: the #:package directives you will see below pull in NuGet packages without a project file, and the whole thing runs with dotnet run <file>.cs.
*Actually I tried different models like phi-4-mini, Llama3.1 and several others, but Qwen3 provided the most consistent results and always succeeded in making the proper tool calls. Thing is, you get the best result by trying out various models, as they all behave differently depending on the use case.
Building the agent
We start by setting up Semantic Kernel.
Semantic Kernel is a lightweight, open-source development kit that lets you easily build AI agents and integrate the latest AI models into your C#, Python, or Java codebase.
#:package Microsoft.SemanticKernel@*
#:package Microsoft.SemanticKernel.Connectors.Ollama@1.65.0-alpha
#:package Microsoft.SemanticKernel.Plugins.OpenApi@*
#:package Microsoft.Extensions.DependencyInjection@10.0.0-rc.1.25451.107
#:package Microsoft.Extensions.Logging.Console@10.0.0-preview.7.25380.108
#:property PublishAot=false
using System.Web;
using Microsoft.Extensions.DependencyInjection;
using Microsoft.Extensions.Logging;
using Microsoft.SemanticKernel;
using Microsoft.SemanticKernel.ChatCompletion;
using Microsoft.SemanticKernel.Connectors.Ollama;
using Microsoft.SemanticKernel.Plugins.OpenApi;
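// Wire Semantic Kernel up to the local Ollama endpoint we noted earlier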
var builder = Kernel.CreateBuilder();
var ollamaEndpoint = new Uri("http://127.0.0.1:11434");
var modelId = "qwen3";
using var loggerFactory = LoggerFactory.Create(builder => builder.SetMinimumLevel(LogLevel.Information).AddConsole());
builder.Services.AddSingleton(loggerFactory);
builder.AddOllamaChatCompletion(modelId, ollamaEndpoint);
var kernel = builder.Build();
With the foundation in place it is time to import the OpenAPI document. Semantic Kernel can turn the document into a plugin that can be called by the model. The plugin will then call the web API and pass the result back to the model. You can load the OpenAPI document from a file or from a URL. It also provides a callback that can be used to authenticate against the API, by setting the authorization header for example. In our case, the Aviationstack API expects an access_key in the request:
await kernel.ImportPluginFromOpenApiAsync("flight_api",
    @"Z:\Sources\dotnet10\ai\openapi-trimmed.yaml",
    executionParameters: new OpenApiFunctionExecutionParameters()
    {
        AuthCallback = AuthenticateRequestAsyncCallback
    });
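// Called for every outgoing request: overwrites the access_key query parameter with the real key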
static Task AuthenticateRequestAsyncCallback(HttpRequestMessage request, CancellationToken cancellationToken = default)
{
    var queryParts = HttpUtility.ParseQueryString(request.RequestUri!.Query);
    queryParts.Set("access_key", "1052d67e5de98f2c58f664e1e5a30a12");

    var uriBuilder = new UriBuilder(request.RequestUri!)
    {
        Query = queryParts.ToString()
    };
    request.RequestUri = uriBuilder.Uri;

    Console.WriteLine($"Requesting: {request.RequestUri}");
    return Task.CompletedTask;
}
(yes, I included a real and valid access key, but it is rate limited on the free tier, so you're welcome to use it and see if it succeeds)
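Loading the document from a URL works the same way; there is an overload that takes a Uri. A minimal sketch (the URL below is a placeholder, not Aviationstack's actual document location):

// Same import, but from a (hypothetical) URL instead of a local file
await kernel.ImportPluginFromOpenApiAsync("flight_api",
    new Uri("https://example.com/aviationstack/openapi.yaml"),
    executionParameters: new OpenApiFunctionExecutionParameters()
    {
        AuthCallback = AuthenticateRequestAsyncCallback
    });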
So let's add a way to prompt the model and ask some questions!
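// Let the model decide for itself when to call the imported OpenAPI functions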
OllamaPromptExecutionSettings executionSettings = new()
{
    FunctionChoiceBehavior = FunctionChoiceBehavior.Auto(),
};
var chatService = kernel.GetRequiredService<IChatCompletionService>();
var history = new ChatHistory();
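// access_key is a required parameter in the OpenAPI document, so we tell the model
// to just invent one; the auth callback overwrites it with the real key anyway.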
history.AddSystemMessage("Use a made up value for the access_key parameter.");
Console.WriteLine("You can now enter messages to send to the model. Type 'exit' to quit.");
while (Console.ReadLine() is { } line && !line.Equals("exit"))
{
    history.AddUserMessage(line);
    string assistantResponse = string.Empty;
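    // Stream the tokens to the console as they arrive, collecting the full reply for the history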
    await foreach (var result in chatService.GetStreamingChatMessageContentsAsync(history, executionSettings, kernel))
    {
        assistantResponse += result.Content;
        Console.Write(result.Content);
    }
    history.AddAssistantMessage(assistantResponse);
}
Play Time!
So, let's ask a question: when is the next flight departure from AMS with destination CDG?
The next scheduled flight from Amsterdam (AMS) to Charles de Gaulle (CDG) is:
Flight: Aeromexico AM6192
Departure: September 29, 2025 at 11:35 AM (AMS)
Arrival: September 29, 2025 at 12:55 PM (CDG)
Flight Status: Scheduled
Estimated Arrival Delay: 46 minutes
Departure Delay: 77 minutes
This is the earliest flight found for your route. Let me know if you need further details!
If you take a look at the response from the LLM you will see it reasons about what parameter values to use when calling the tool:
[..]
So the function call would be to flight_api_Flights with access_key, dep_iata=AMS, arr_iata=CDG, flight_status=scheduled, limit=1. That should get the next scheduled flight from AMS to CDG. If there's no scheduled flight, maybe check for active ones. But the user is asking for the next flight, so scheduled is the right status. Let's proceed with that.
info: Microsoft.SemanticKernel.KernelFunction[0]
Function flight_api-Flights invoking.
Requesting: https://api.aviationstack.com/v1/flights?access_key=1052d67e5de98f2c58f664e1e5a30a12&limit=1&flight_status=scheduled&dep_iata=AMS&arr_iata=CDG
info: Microsoft.SemanticKernel.KernelFunction[0]
Function flight_api-Flights succeeded.
info: Microsoft.SemanticKernel.KernelFunction[0]
Function flight_api-Flights completed. Duration: 0.6109388s
Okay, let me process the user's query and the tool response. The user asked for the first next flight from AMS to CDG. The tool call used the Flights endpoint with the parameters access_key, dep_iata=AMS, arr_iata=CDG, flight_status=scheduled, and limit=1. The response from the tool shows a flight scheduled for 2025-09-29, departing AMS at 11:35 and arriving at CDG at 12:55. The airline is Aeromexico, flight number AM6192.
[..]
As instructed, the model used a dummy API key, which our auth callback replaces with the real value just before the request is made. This demonstrates how you could handle more complex authentication scenarios.
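For APIs that expect header-based authentication instead, the same callback hook applies; a minimal sketch assuming a bearer token scheme (the token value is a placeholder):

// Hypothetical variant: authenticate via an Authorization header instead of a query parameter
static Task BearerAuthCallback(HttpRequestMessage request, CancellationToken cancellationToken = default)
{
    request.Headers.Authorization =
        new System.Net.Http.Headers.AuthenticationHeaderValue("Bearer", "<your-token>");
    return Task.CompletedTask;
}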
The full model output can be found here.
Trimming large OpenAPI documents
Most of the time you only call a subset of the available endpoints. Take the Microsoft Graph API for example: its OpenAPI document is very large, resulting in too many functions for the model to work with. Luckily for us there is tooling for that. I used Microsoft Hidi to transform the document, specifying only the operations I am interested in:
hidi transform -d .\openapi.yaml -f yaml -o .\openapi-trimmed.yaml -v 3.0 --op Flights,Routes
The command above creates a new YAML file based on the original, keeping only the endpoints for the Flights and Routes operations.
Wrapping up
We have seen how easy it is to convert an OpenAPI document into a set of tools that a model can call, and how to handle authentication. There are still plenty of points we haven't touched, such as error handling; for example, I hit some rate limits that the model couldn't handle.
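A function invocation filter is one way to soften that: it can intercept a failing tool call and hand the model a readable message instead of an exception. A minimal sketch, assuming the rate limit surfaces as an HTTP 429 (the message text is made up):

// Sketch: turn HTTP 429 responses from the plugin into a tool result the model can act on
public sealed class RateLimitFilter : IFunctionInvocationFilter
{
    public async Task OnFunctionInvocationAsync(FunctionInvocationContext context, Func<FunctionInvocationContext, Task> next)
    {
        try
        {
            await next(context);
        }
        catch (HttpOperationException ex) when (ex.StatusCode == System.Net.HttpStatusCode.TooManyRequests)
        {
            // Hand the failure back as a normal result instead of crashing the turn
            context.Result = new FunctionResult(context.Result,
                "The flight API rate limit was reached. Ask the user to try again later.");
        }
    }
}

// Register it on the kernel we built earlier:
// kernel.FunctionInvocationFilters.Add(new RateLimitFilter());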
When you already have a set of well-defined APIs, this is a very fast and easy way to integrate them into your agents.
Full source can be found here.
Any questions or comments? Drop them below!