Build an AI-Powered Developer Portal with Backstage and .NET
Want to apply AI, not just read about it? Most tutorials stop at a "Hello World" chatbot. We are going to build something that actually solves a common engineering headache: stale documentation.
Who this is for
This guide is for platform engineers and .NET developers who need to organize a growing software landscape without forcing teams to manually write YAML files.
What you will build
You will build a dynamic developer portal using Backstage that automatically populates its service catalog. We will use a .NET CLI tool to scan source code and use local AI (Ollama) to generate summaries.
- Source repo: demo-backstage-catalog-generator
- Constraint: We use local inference only. No source code ever leaves your machine.
Have you ever needed to update a service but couldn't remember what it does? Or spent time trying to understand code you have not touched in months? We usually solve this with a README.md that nobody updates, or a wiki nobody keeps current.
An Internal Developer Portal (IDP) solves this by making the software landscape visible, but only if the data is fresh. Automation is the only way to avoid the "stale metadata" trap.
Want to skip ahead? Check out the complete working demo on GitHub with all the code ready to run.
Prerequisites
Before we start, make sure you have the following installed:
- .NET SDK 8+
- Node.js (includes npx and yarn)
- Ollama with the llama3:8b model pulled
- A GitHub account (for hosting and deployment)
Why Does an Internal Developer Portal Matter?
The term "Internal Developer Portal" (IDP) can be a little misleading, since it sounds like a tool exclusively for developers. In reality, it functions as an internal organizational portal focused entirely on your software portfolio. Unlike a general-purpose SharePoint site where everything is dumped in one place and nothing is easy to find, an IDP is deliberately narrow in scope. It covers your software landscape and nothing else, which is exactly what makes it powerful.
An IDP becomes the single source of truth for your engineering organization. It answers critical questions across every role:
- Engineers: Which services exist, who owns them, what they do, and where the APIs are.
- Team leads and architects: Team composition, squad ownership, and architectural decisions with their rationale.
- New joiners: How to get up to speed on a codebase they have never seen.
- Platform and operations teams: What is running in production, who is responsible, and the lifecycle status.
Beyond just listing services, a mature IDP centralizes Architecture Decision Records (ADRs). I'd argue those are more valuable than the service catalog itself, because they capture why a decision was made. Without a central place to surface them, they end up in forgotten wiki pages or git repositories nobody checks.
The challenge is keeping all of this populated and accurate. If you rely on engineers to manually maintain metadata YAML files, the data grows stale within weeks. Automation is the only sustainable path.
Formulating the Architecture
Here's the plan to extract metadata from source code and present it visually:
- Backstage: the UI layer where engineers browse and discover services, APIs, documentation, and team ownership
- .NET Core: a CLI tool that scans project folders, extracts metadata, and generates Backstage-compatible YAML
- Ollama: runs AI inference locally. No source code leaves the machine, no API costs.
- Static hosting: deploy to Netlify, Azure Static Web Apps, or any provider of your choice
Why Ollama? Because inference runs entirely on your machine. Sending proprietary source code to a hosted AI provider means trusting a third party with how that code is stored and used, and in many organizations that is a compliance violation, not just a preference.
Setting Up the Project Infrastructure
You can either follow along and build everything from scratch, or clone the demo repository to get started immediately.
To clone the demo:
git clone https://github.com/bgener/demo-backstage-catalog-generator.git
cd demo-backstage-catalog-generator
To build from scratch, start by downloading and running the LLM model we will use in this guide:
ollama pull llama3:8b
ollama serve
We use llama3:8b specifically. It is significantly faster for local inference than the 70B variant and produces more consistent, concise output for our use case. If you have a powerful GPU, feel free to use llama3:70b instead.
Next, scaffold the baseline .NET services. We’ll create one Web API and one MVC project:
mkdir Backstage-Dev-Portal
cd Backstage-Dev-Portal
dotnet new webapi -n ServiceA
dotnet new mvc -n ServiceB
dotnet new sln -n Backstage-Dev-Portal
dotnet sln add ServiceA/ServiceA.csproj
dotnet sln add ServiceB/ServiceB.csproj
You can replace the default controllers with real logic later. These raw services represent the uncataloged microservices in your organization.
Building a Smart Catalog Generator in .NET
We will build a .NET CLI tool using OllamaSharp. It scans each project, sends relevant files to the local AI model, and generates a single catalog-info.yaml file containing all services, ready for Backstage to consume.
Create the tool and add the required package:
dotnet new console -n ProjectSummarizer
cd ProjectSummarizer
dotnet add package OllamaSharp
Instead of sending every file to the AI, we take a smarter approach to avoid token limits and save compute time. We will send only *.csproj, Program.cs, and the folder structure. This is all the context the AI needs to understand the project structure and purpose.
Replace Program.cs with the implementation below. The full version is in the demo repository. Here we focus on the key parts.
First, set up the Ollama client and configure the system prompt. This is the most fragile part of the chain: the system prompt has to force the model into a YAML-safe format without it hallucinating markdown backticks:
var ollamaApiClient = new OllamaApiClient(
new Uri("http://localhost:11434")) { SelectedModel = "llama3:8b" };
var chat = new Chat(ollamaApiClient, systemPrompt:
"You are a technical documentation assistant. " +
"You produce concise, YAML-safe summaries of .NET projects. " +
"Output only plain text, no markdown, no bullet points, no quotes, no colons, no newlines.");
Prompt sanitization is critical. If your Program.cs contains complex
string literals or nested colons, the AI might pass them through to your YAML,
breaking the Backstage parser. Always sanitize the output before writing the
file.
var sb = new StringBuilder();
sb.AppendLine($"Project: {projectName}");
sb.AppendLine("Folder structure:");
AppendFolderStructure(projectDir, sb, "");
sb.AppendLine(File.ReadAllText(csprojPath));
var programPath = Directory.GetFiles(projectDir, "Program.cs", SearchOption.AllDirectories)
.FirstOrDefault();
if (programPath != null)
sb.AppendLine(File.ReadAllText(programPath));
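The AppendFolderStructure helper referenced above is not shown here; the full version is in the demo repository. A minimal sketch might look like this (the skip-list of folders is an assumption, not taken from the demo):

```csharp
using System.Text;

// Recursively appends a directory tree to the prompt context,
// skipping build output and VCS folders the model does not need.
static void AppendFolderStructure(string dir, StringBuilder sb, string indent)
{
    foreach (var sub in Directory.GetDirectories(dir))
    {
        var name = Path.GetFileName(sub);
        if (name is "bin" or "obj" or ".git") continue; // noise for the model
        sb.AppendLine($"{indent}{name}/");
        AppendFolderStructure(sub, sb, indent + "  ");
    }
    foreach (var file in Directory.GetFiles(dir))
        sb.AppendLine($"{indent}{Path.GetFileName(file)}");
}
```

Indentation is cosmetic here, but it gives the model a cheap visual cue about nesting depth.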
The prompt itself uses few-shot examples to guide the model toward the output format we want:
var prompt = "Summarize the project in 1-2 sentences based on the files provided. " +
"Do not output anything else. " +
"Examples of good output: " +
"REST API service providing weather forecasts with temperature data\n" +
"ASP.NET MVC application with React frontend for managing todo items\n\n"
+ sb.ToString();
var summaryBuilder = new StringBuilder();
using var cts = new CancellationTokenSource(TimeSpan.FromMinutes(5)); // abort if the model hangs

await foreach (var token in chat.SendAsync(prompt, cts.Token))
    summaryBuilder.Append(token);
Finally, each summary is sanitized and assembled into a Backstage-compatible YAML entry:
var summary = summaryBuilder.ToString().Trim()
.Replace("\n", " ").Replace(":", " -").Replace("\"", "'");
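To see why this matters, here is a small self-contained illustration (the raw string is invented for the example) of what those Replace calls do to a reply that would otherwise break the YAML parser:

```csharp
// An unsanitized model reply: the colon would make YAML treat the
// description as a nested mapping, and the newline would split the scalar.
var raw = "REST API: returns weather\nforecasts";

var summary = raw.Trim()
    .Replace("\n", " ")   // the quoted scalar must stay on one line
    .Replace(":", " -")   // colons start key/value mappings in YAML
    .Replace("\"", "'");  // avoid terminating the double-quoted scalar early

Console.WriteLine(summary);
// REST API - returns weather forecasts
```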
var yamlEntry = $@"
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: {projectName.ToLowerInvariant()}
  description: ""{summary}""
spec:
  type: service
  lifecycle: production
  owner: group:default/engineering";
Run the generator against the target directory (use . if you are already in the project root):
dotnet run --project ProjectSummarizer -- .
You should see the AI streaming its summaries to the console in real time.
Most of the real work here is figuring out the prompt. Even a tiny change can produce a completely different output. I encourage you to experiment with the system prompt and the user prompt to see how it affects quality. That is the real learning here.
Integrating the AI Catalog with Backstage
With the catalog-info.yaml ready, we can integrate it into a Backstage instance. Install Backstage:
npx @backstage/create-app
Follow the prompts to name it dev-portal.
Now point Backstage to your generated catalog file. Open app-config.yaml in the dev-portal directory and add the following under the catalog section:
catalog:
  locations:
    - type: file
      target: ../Backstage-Dev-Portal/catalog-info.yaml
      rules:
        - allow: [Component]
This tells Backstage where to find the AI-generated service metadata. The target path is relative to the Backstage root directory. Adjust it to point to wherever your generator wrote the catalog-info.yaml.
To run it locally:
cd dev-portal
yarn dev
Open http://localhost:3000 in your browser. You should see all your services listed in the Software Catalog with AI-generated summaries visible in the description column.
Deploying the Portal
To host this, build your portal as a static site:
yarn build:static
Push the output to GitHub and deploy to any static hosting provider: Netlify, Azure Static Web Apps, Vercel, or even self-hosted on Kubernetes. Set the build command to yarn build:static and the publish directory to dist.
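If you pick Netlify, those two settings can also live in a netlify.toml at the repository root instead of being set in the dashboard (a standard Netlify convention; adjust the values to your build setup):

```toml
[build]
  command = "yarn build:static"
  publish = "dist"
```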
A static Backstage build is great for read-only catalogs. If you need dynamic
features like authentication, real-time plugin backends, or write
operations, you will need to deploy the full Backstage backend as a Node.js
service instead.
The part that is easy to miss is that a static Backstage portal has no live catalog refresh. The catalog is baked at build time, so there is always a lag between a code change and the portal showing it.
Automating with CI/CD
The catalog only stays current if the generator runs automatically. Here is a GitHub Actions workflow that regenerates summaries on every push to main:
name: Update Backstage Catalog

on:
  push:
    branches: [main]

jobs:
  generate-catalog:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4

      - name: Setup .NET
        uses: actions/setup-dotnet@v4
        with:
          dotnet-version: '8.0.x'

      - name: Install and start Ollama
        run: |
          curl -fsSL https://ollama.com/install.sh | sh
          ollama serve &
          sleep 5
          ollama pull llama3:8b

      - name: Generate catalog
        run: dotnet run --project ProjectSummarizer -- "$GITHUB_WORKSPACE"

      - name: Commit updated catalog
        run: |
          git config user.name "github-actions"
          git config user.email "github-actions@github.com"
          git add catalog-info.yaml
          git diff --cached --quiet || git commit -m "chore: regenerate AI catalog summaries"
          git push
CI Performance: Running Ollama in CI uses CPU-only inference by default. A
llama3:8b summary takes about 20-30 seconds per project on a standard GitHub
runner. For a large monorepo, your CI bill will spike. Consider using a
persistent self-hosted runner with a GPU if you scale this.
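One way to narrow the build-time lag mentioned earlier without waiting for a push: add a scheduled trigger alongside the push trigger, so the catalog is regenerated periodically as well. The cron expression below is just an example:

```yaml
on:
  push:
    branches: [main]
  schedule:
    - cron: '0 6 * * *'   # also regenerate daily at 06:00 UTC
```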
Final thoughts
Automate the metadata generation where the code lives, and keep the UI (Backstage) as a thin, static client. That is the whole trick. Once those two are separate, the "stale documentation" problem mostly goes away.
- Use narrow context: Don't send the whole repo. Files like Program.cs and *.csproj are usually enough context for the model.
- Sanitize strictly: AI output is non-deterministic. Always strip colons and newlines before writing to YAML.
- Start static: Read-only is much easier to maintain. Add a backend only when you actually need write operations or auth.
I would not use this in production unless I had tested the sanitization logic against a few real codebases first. The Replace calls that strip colons and newlines are intentionally minimal. They break on YAML with complex string values, like connection strings or environment variable blocks.
FAQ
Can I use OpenAI instead of Ollama?
Yes, but you will be sending your source code (or at least your Program.cs) to a third party. Use a local model if security is a concern.
Does this replace README files?
No. It replaces the "Service Directory" that usually lives in a spreadsheet. It points engineers to the README they actually need.
How do I handle project renames?
The generator uses the folder or .csproj name as the component's metadata.name. Backstage treats that name as the component's identity, so a rename shows up as a brand-new component unless you keep the name stable.
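For context, Backstage separates the identity field from the display name: metadata.name is the identity, while metadata.title is free-form. Keeping the name fixed while letting the title change preserves history across renames (both fields are standard Backstage metadata; the values below are invented):

```yaml
apiVersion: backstage.io/v1alpha1
kind: Component
metadata:
  name: servicea          # identity: keep stable across renames
  title: Orders Service   # display name: safe to change freely
spec:
  type: service
  lifecycle: production
  owner: group:default/engineering
```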
I help teams build exactly this kind of internal tooling, from developer portals to platform engineering. See my work or get in touch.