Updated for 2026 — covering NLWeb's evolution since Build 2025, the shift from Semantic Kernel to Microsoft Agent Framework, GPT-4o, and Cloudflare's managed NLWeb deployment path.
Natural language interfaces are no longer a novelty — they're the new baseline expectation. Users increasingly expect to ask your app questions the same way they'd talk to a colleague. With Microsoft's NLWeb, you can retrofit your existing .NET + SQL application with a conversational layer without rebuilding it from scratch. In this updated guide, we'll walk through the full process using the latest tooling available in mid-2026.
What is NLWeb? (2026 Update)
NLWeb is an open-source protocol from Microsoft, conceived by R.V. Guha (the creator of RSS, RDF, and Schema.org), that lets any web property answer natural language queries directly, without a search engine acting as intermediary. The analogy is HTML: just as HTML gave publishers one open format for documents, NLWeb aims to give sites one open interface that makes them first-class citizens of the emerging agentic web.
Every NLWeb instance is also a Model Context Protocol (MCP) server, exposing your content to AI agents via two key endpoints:
- /ask: for human-facing conversational queries
- /mcp: for agent-to-agent discovery and tool calls
Since its announcement at Build 2025, the ecosystem has matured significantly:
- Cloudflare added native NLWeb support via its AutoRAG infrastructure (since renamed AI Search) in August 2025, offering a fully managed deployment path
- Microsoft joined the MCP Steering Committee and contributed an updated authorization spec and MCP server registry design
- Build 2026 featured NLWeb prominently in the Agents and Apps track, alongside .NET 11's new agentic web building blocks
NLWeb is technology-agnostic — it supports all major LLMs, vector databases, and operating systems.
What's Changed Since the Original Guide
| Area | Original (May 2025) | Updated (June 2026) |
|---|---|---|
| Model | GPT-4 | GPT-4o (default) |
| Orchestration | Semantic Kernel (experimental agents) | Microsoft Agent Framework (MAF) — GA in Q2 2026 |
| .NET Version | ASP.NET Core (unspecified) | .NET 11 |
| NLWeb Deployment | Manual self-hosted only | Manual or Cloudflare AutoRAG (managed) |
| Embeddings | text-embedding-3-small | text-embedding-3-small / text-embedding-3-large |
Architecture Overview
User
  └─► NLWeb UI (/ask endpoint)
        └─► ASP.NET Core (.NET 11) Controller
              └─► Microsoft Agent Framework
                    ├─► GPT-4o (Azure OpenAI)
                    ├─► Embedding + Vector DB (Azure AI Search / Qdrant)
                    └─► SQL Server
                          └─► Schema.org JSON response back to NLWeb UI
Prerequisites
- .NET 11 Web API project (ASP.NET Core)
- Microsoft SQL Server
- Node.js (for the NLWeb UI demo)
- Azure OpenAI access (GPT-4o + text-embedding-3-small)
- A vector database: Azure AI Search (recommended), Qdrant, Pinecone, or Weaviate
- Docker (optional, for deployment)
Step-by-Step Setup
Step 1: Clone NLWeb
git clone https://github.com/microsoft/NLWeb.git
cd NLWeb
Key folders:
- code/: Python reference implementation and /ask endpoint logic
- demo/: Embeddable frontend UI
- docs/: MCP protocol and Schema.org response specs
Step 2: Run the NLWeb UI
cd demo
npm install
npm start
This launches a local chat interface. Queries are POSTed to your backend's /ask endpoint. You can also embed the UI snippet directly into any webpage.
Step 3: Understand the NLWeb Request/Response Contract
Incoming request to your API:
{
  "query": "What are the top-selling products this month?",
  "context": []
}
Expected response (Schema.org JSON-LD):
{
  "@context": "https://schema.org",
  "@type": "ItemList",
  "itemListElement": [
    {
      "@type": "Product",
      "name": "Wireless Mouse",
      "description": "Ergonomic wireless mouse",
      "offers": {
        "@type": "Offer",
        "price": "29.99",
        "priceCurrency": "USD"
      }
    }
  ]
}
NLWeb renders this structured response as a conversational answer in the UI.
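To make the contract concrete on the .NET side, here is a minimal sketch of typed models for the sample response, using System.Text.Json. The record names (ItemList, Product, Offer) and the SchemaOrgJson helper are illustrative assumptions, not types shipped by NLWeb. The [JsonPropertyName] attributes are what keep the JSON-LD keys that start with "@" intact.

```csharp
using System.Collections.Generic;
using System.Text.Json;
using System.Text.Json.Serialization;

// Hypothetical DTOs mirroring the sample response above; not part of NLWeb itself.
// Price is a string here only because the sample JSON quotes it.
public sealed record Offer(
    [property: JsonPropertyName("@type")] string Type,
    [property: JsonPropertyName("price")] string Price,
    [property: JsonPropertyName("priceCurrency")] string PriceCurrency);

public sealed record Product(
    [property: JsonPropertyName("@type")] string Type,
    [property: JsonPropertyName("name")] string Name,
    [property: JsonPropertyName("description")] string Description,
    [property: JsonPropertyName("offers")] Offer Offers);

public sealed record ItemList(
    [property: JsonPropertyName("@context")] string Context,
    [property: JsonPropertyName("@type")] string Type,
    [property: JsonPropertyName("itemListElement")] IReadOnlyList<Product> ItemListElement);

public static class SchemaOrgJson
{
    // Serializes with the JSON-LD key names declared on the records.
    public static string Serialize(ItemList list) => JsonSerializer.Serialize(list);

    // Returns null when the payload is the JSON literal "null".
    public static ItemList? Deserialize(string json) => JsonSerializer.Deserialize<ItemList>(json);
}
```

Typed models like these are mostly useful in tests and in any client code that consumes the endpoint, since a missing or misspelled key then shows up as a failed assertion rather than as a blank answer in the UI.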
Step 4: Install the Microsoft Agent Framework (MAF)
MAF is the production-ready successor to Semantic Kernel's agent layer, reaching GA in Q2 2026. It merges the best of AutoGen and Semantic Kernel into a single, cleaner SDK.
dotnet add package Microsoft.Extensions.AI
dotnet add package Microsoft.SemanticKernel
dotnet add package Microsoft.Agents.AI.OpenAI
dotnet add package Azure.AI.OpenAI
dotnet add package Azure.Identity
Note for existing SK users: Semantic Kernel v1.x is supported until at least one year after MAF GA. Your plugins and vector store integrations migrate cleanly — the mental model is identical, just with a cleaner API surface.
Step 5: Update Your .NET 11 API
Define the Query Model
public class NLWebQuery
{
    public string Query { get; set; } = string.Empty;
    public List<string> Context { get; set; } = new();
}
Create the Controller
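using Microsoft.AspNetCore.Mvc;

[ApiController]
[Route("api")]
public class NLWebController : ControllerBase
{
    private readonly INLWebAgentService _agentService;

    public NLWebController(INLWebAgentService agentService)
    {
        _agentService = agentService;
    }

    // Served at POST /api/ask, so point the NLWeb UI at this path.
    [HttpPost("ask")]
    public async Task<IActionResult> Ask([FromBody] NLWebQuery query)
    {
        // Reject empty questions before spending a model call on them.
        if (string.IsNullOrWhiteSpace(query.Query))
            return BadRequest("The 'query' field is required.");

        var schemaOrgResponse = await _agentService.ProcessAsync(query.Query, query.Context);
        return Ok(schemaOrgResponse);
    }
}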
Step 6: Build the Agent Service with MAF
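This is where the 2026 approach differs most. The service below is written against IChatClient, the provider-agnostic Microsoft.Extensions.AI abstraction that the Microsoft Agent Framework builds on. It still spells out the retrieve → prompt → SQL steps explicitly so each stage is visible; MAF's agent abstraction can take over that orchestration once the pipeline works.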
using Microsoft.Extensions.AI;
using Microsoft.Agents.AI;
using Azure.AI.OpenAI;
using Azure.Identity;

public class NLWebAgentService : INLWebAgentService
{
    private readonly IChatClient _chatClient;
    private readonly IVectorSearchService _vectorSearch;
    private readonly IProductRepository _productRepo;

    public NLWebAgentService(
        IChatClient chatClient,
        IVectorSearchService vectorSearch,
        IProductRepository productRepo)
    {
        _chatClient = chatClient;
        _vectorSearch = vectorSearch;
        _productRepo = productRepo;
    }

    public async Task<object> ProcessAsync(string query, List<string> context)
    {
        // 1. Retrieve the most relevant context (the search service embeds the query)
        var relevantContext = await _vectorSearch.SearchAsync(query, topK: 5);

        // 2. Build the prompt with the retrieved context
        var systemPrompt = $"""
            You are a data analyst for an e-commerce platform.
            Based on the context below, generate a SQL query to answer the user's question.
            Return ONLY the SQL, with no explanation.
            Context:
            {string.Join("\n", relevantContext)}
            """;

        var messages = new List<ChatMessage> { new(ChatRole.System, systemPrompt) };

        // Add prior turns so follow-up questions keep their context
        foreach (var prior in context)
            messages.Add(new(ChatRole.Assistant, prior));

        messages.Add(new(ChatRole.User, query));

        // 3. Call GPT-4o through IChatClient
        var response = await _chatClient.GetResponseAsync(messages);
        var sql = response.Text;

        // 4. Execute the SQL. Model output is untrusted input: validate it
        //    and run it under a read-only, least-privilege database login.
        var data = await _productRepo.ExecuteQueryAsync(sql);

        // 5. Return Schema.org JSON
        return FormatAsSchemaOrg(data);
    }
}
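Because the SQL in step 4 comes straight from the model, it is worth checking before it reaches the database. The sketch below is one possible pre-execution filter, not part of NLWeb or MAF: SqlGuard and its method names are hypothetical, and the keyword list is an assumption you would tune for your schema. It is a coarse heuristic that rejects anything other than a single plain SELECT (including CTEs that start with WITH), and it complements rather than replaces a read-only database login.

```csharp
using System;
using System.Text.RegularExpressions;

// Hypothetical helper: a coarse safety check for model-generated SQL.
public static class SqlGuard
{
    // Three backticks, written with escapes so the marker never appears literally.
    private const string Fence = "\u0060\u0060\u0060";

    // Whole-word keywords that indicate a write, DDL, or procedure call.
    private static readonly Regex Forbidden = new(
        @"\b(insert|update|delete|merge|drop|alter|create|truncate|exec|execute|grant|revoke|into)\b",
        RegexOptions.IgnoreCase | RegexOptions.CultureInvariant);

    // Removes a surrounding Markdown code fence if the model added one.
    public static string StripCodeFence(string modelOutput)
    {
        var text = modelOutput.Trim();
        if (!text.StartsWith(Fence, StringComparison.Ordinal)) return text;

        var firstNewline = text.IndexOf('\n');
        var lastFence = text.LastIndexOf(Fence, StringComparison.Ordinal);
        if (firstNewline < 0 || lastFence <= firstNewline) return text;

        return text.Substring(firstNewline + 1, lastFence - firstNewline - 1).Trim();
    }

    // True only for one statement that starts with SELECT and has no write keywords.
    public static bool IsSingleReadOnlySelect(string sql)
    {
        var text = sql.Trim().TrimEnd(';').Trim();
        if (text.Length == 0) return false;
        if (text.Contains(';')) return false;                          // more than one statement
        if (text.Contains("--") || text.Contains("/*")) return false;  // comments can hide payloads
        if (!text.StartsWith("select", StringComparison.OrdinalIgnoreCase)) return false;
        return !Forbidden.IsMatch(text);
    }
}
```

In ProcessAsync this would sit between steps 3 and 4: strip the fence from the model's text, then return an error response instead of calling ExecuteQueryAsync when IsSingleReadOnlySelect is false. The check is deliberately conservative, so a legitimate query containing a semicolon inside a string literal is also rejected.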
Register Services in Program.cs (.NET 11)
using Azure.AI.OpenAI;
using Azure.Identity;
using Microsoft.Extensions.AI;

var builder = WebApplication.CreateBuilder(args);

// MAF: Layer 1 - IChatClient (provider-agnostic)
builder.Services.AddSingleton<IChatClient>(_ =>
    new AzureOpenAIClient(
            new Uri(builder.Configuration["AzureOpenAI:Endpoint"]!),
            new DefaultAzureCredential())
        .GetChatClient("gpt-4o") // the name of your Azure OpenAI deployment
        .AsIChatClient());

// Vector search (Azure AI Search recommended for .NET + enterprise)
builder.Services.AddSingleton<IVectorSearchService, AzureAISearchService>();
builder.Services.AddScoped<IProductRepository, SqlProductRepository>();
builder.Services.AddScoped<INLWebAgentService, NLWebAgentService>();
builder.Services.AddControllers();

var app = builder.Build();
app.MapControllers();
app.Run();
Step 7: Prepare SQL Data for Vector Search
Before your app can do semantic retrieval, you need to embed your SQL data and store it in your vector database.
// One-time ingestion pipeline
public async Task IngestProductsAsync()
{
    var products = await _db.Products
        .Select(p => new { p.Id, p.Name, p.Description, p.Category })
        .ToListAsync();

    foreach (var product in products)
    {
        // One embedding per row: name, description, and category in a single string
        var text = $"{product.Name}: {product.Description} (Category: {product.Category})";
        var embedding = await _embeddingClient.GetEmbeddingAsync(text);

        await _vectorDb.UpsertAsync(new VectorRecord
        {
            Id = product.Id.ToString(),
            Vector = embedding,
            // SourceTable and RowId let you trace a search hit back to its SQL row
            Metadata = new { product.Name, product.Category, SourceTable = "Products", RowId = product.Id }
        });
    }
}
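To show what the vector database does with those embeddings at query time, here is a small in-memory sketch of top-K retrieval by cosine similarity. It is an illustration only: VectorMath and its methods are hypothetical names, and a real vector store uses approximate nearest-neighbour indexes rather than scanning every record.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of semantic retrieval; a real vector DB does this at scale.
public static class VectorMath
{
    // Cosine similarity: dot(a, b) / (|a| * |b|). Returns 0 for a zero vector.
    public static double CosineSimilarity(IReadOnlyList<float> a, IReadOnlyList<float> b)
    {
        if (a.Count != b.Count)
            throw new ArgumentException("Vectors must have the same dimension.");

        double dot = 0, normA = 0, normB = 0;
        for (var i = 0; i < a.Count; i++)
        {
            dot += a[i] * b[i];
            normA += a[i] * a[i];
            normB += b[i] * b[i];
        }

        if (normA == 0 || normB == 0) return 0;
        return dot / (Math.Sqrt(normA) * Math.Sqrt(normB));
    }

    // Returns the ids of the topK records most similar to the query vector.
    public static IReadOnlyList<string> TopK(
        IReadOnlyList<float> query,
        IReadOnlyDictionary<string, float[]> index,
        int topK)
    {
        return index
            .OrderByDescending(entry => CosineSimilarity(query, entry.Value))
            .Take(topK)
            .Select(entry => entry.Key)
            .ToList();
    }
}
```

The ids returned by TopK correspond to the Id values written during ingestion, which is why keeping that id stable and traceable to a SQL row matters.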
Best practices for 2026:
- Use hybrid search (keyword + semantic) in Azure AI Search for best recall
- Embed at the right granularity — row-level for products, paragraph-level for FAQs
- Store SourceTable and RowId in metadata for traceability and debugging
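On the hybrid search point: Azure AI Search merges the keyword and vector result lists for you using Reciprocal Rank Fusion (RRF), so you do not implement this yourself. The sketch below only illustrates the idea. HybridRanking is a hypothetical name, and k = 60 is a commonly used constant assumed here rather than a value taken from this guide.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;

// Hypothetical illustration of Reciprocal Rank Fusion over ranked id lists.
public static class HybridRanking
{
    // score(id) = sum over lists of 1 / (k + rank), with rank starting at 1.
    public static IReadOnlyList<string> Fuse(IEnumerable<IReadOnlyList<string>> rankedLists, int k = 60)
    {
        var scores = new Dictionary<string, double>();

        foreach (var list in rankedLists)
        {
            for (var i = 0; i < list.Count; i++)
            {
                scores.TryGetValue(list[i], out var current);
                scores[list[i]] = current + 1.0 / (k + i + 1);
            }
        }

        // Highest fused score first; ties broken by id so the order is deterministic.
        return scores
            .OrderByDescending(pair => pair.Value)
            .ThenBy(pair => pair.Key, StringComparer.Ordinal)
            .Select(pair => pair.Key)
            .ToList();
    }
}
```

An item that ranks well in both lists beats one that tops only a single list, which is why hybrid search tends to recover results that pure keyword or pure vector search would each miss.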
Step 8: Format the Response as Schema.org
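// Requires: using System.Data; using System.Linq;
// C# verbatim identifiers such as @context compile to a property named "context",
// so an anonymous type would lose the "@". Dictionaries keep the JSON-LD keys intact.
private object FormatAsSchemaOrg(DataTable table)
{
    return new Dictionary<string, object?>
    {
        ["@context"] = "https://schema.org",
        ["@type"] = "ItemList",
        ["itemListElement"] = table.Rows.Cast<DataRow>().Select(row => new Dictionary<string, object?>
        {
            ["@type"] = "Product",
            ["name"] = row["Name"],
            ["description"] = row["Description"],
            ["offers"] = new Dictionary<string, object?>
            {
                ["@type"] = "Offer",
                ["price"] = row["Price"],
                ["priceCurrency"] = "USD"
            }
        }).ToList()
    };
}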
Step 9: Test the Full Flow
- Start your .NET API: dotnet run
- Start the NLWeb UI: npm start in the demo/ folder
- Type: "What are the top-selling products this month?"
- Observe the full pipeline: query → embed → vector search → GPT-4o → SQL → Schema.org JSON → conversational answer in UI
Bonus: Add Memory and Context Persistence
For multi-turn conversations, persist chat history using Redis or SQL and pass it back with each request:
// Store turn history in Redis
await _cache.SetStringAsync($"chat:{sessionId}", JsonSerializer.Serialize(history));
MAF handles the concept of a session/conversation thread natively — no manual history wiring needed if you use its agent abstraction.
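If you do wire history by hand, it helps to cap how much of it you store and resend. The following is a minimal sketch under assumed names (ChatTurn and ChatHistoryJson are hypothetical): it appends a turn to the serialized history and keeps only the most recent entries, producing the string you would hand to the cache call shown above.

```csharp
using System;
using System.Collections.Generic;
using System.Linq;
using System.Text.Json;

// Hypothetical shape of one stored turn; Role would be "user" or "assistant".
public sealed record ChatTurn(string Role, string Text);

public static class ChatHistoryJson
{
    // Parses stored history; a missing or empty value means a new conversation.
    public static IReadOnlyList<ChatTurn> Read(string? storedJson)
    {
        return string.IsNullOrEmpty(storedJson)
            ? new List<ChatTurn>()
            : JsonSerializer.Deserialize<List<ChatTurn>>(storedJson) ?? new List<ChatTurn>();
    }

    // Appends a turn and keeps only the newest maxTurns entries.
    public static string Append(string? storedJson, ChatTurn turn, int maxTurns = 20)
    {
        var history = Read(storedJson).ToList();
        history.Add(turn);

        var trimmed = history.Skip(Math.Max(0, history.Count - maxTurns)).ToList();
        return JsonSerializer.Serialize(trimmed);
    }
}
```

Capping the history bounds both the cache entry size and the number of tokens sent to the model on each follow-up; the default of 20 turns is an arbitrary starting point, not a recommendation from the NLWeb or MAF documentation.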
Alternative: Managed Deployment with Cloudflare AutoRAG
If you don't want to manage the NLWeb infrastructure yourself, Cloudflare now offers a one-click managed path:
- Go to Cloudflare Dashboard → AI → AI Search
- Select Create AI Search → Website
- Choose NLWeb Worker; Cloudflare automatically deploys the /ask and /mcp endpoints
- Point the NLWeb UI to your Cloudflare Worker URL
This is ideal for teams that want NLWeb running in production without managing vector DB infrastructure or NLWeb server updates manually.
Resources
- NLWeb GitHub
- Microsoft Agent Framework Docs
- Semantic Kernel → MAF Migration Guide
- Cloudflare NLWeb Integration
- Azure AI Search (Vector + Hybrid)
- Schema.org
- .NET 11 at Build 2026
Final Thoughts
A year on from NLWeb's announcement, the ecosystem has grown from a promising experiment into a solidifying standard. With Cloudflare's managed path, .NET 11's native agentic web support, and the Microsoft Agent Framework replacing the more fragmented Semantic Kernel agent APIs, the integration story is significantly cleaner in 2026.
The core idea remains the same: your existing .NET + SQL data is more valuable than you're exposing through a traditional REST interface. NLWeb, GPT-4o, and vector search give your users a natural language front door to that data — whether they're humans typing questions or AI agents making autonomous requests.
The agentic web is here. Your app should be part of it.
By Jino R Krishnan | Updated June 2026