sai pramod upadhyayula

docfx-remote-include: inline remote markdown into DocFX, and govern the assembled page

docfx-remote-include is a Markdig extension and a dotnet CLI tool for DocFX. It does two things: it inlines markdown fetched from an HTTP service at build time, and it can send each fully-assembled page to a transform service for centralized governance.

It is not a fork of DocFX. It plugs into DocFX's public
BuildOptions.ConfigureMarkdig seam, so it tracks upstream DocFX releases as a regular
NuGet dependency. It targets .NET 8, 9, and 10.

If you read my earlier post on this project, it had a per-include AI rewrite feature —
you tagged an individual include and a model rewrote that fragment in place. That's
gone now. Governance moved up to a single page-level transform that runs once on the
assembled page, which is both simpler and a better fit for the problem. More on that
below.

The include directive

In any markdown file processed by DocFX:

Some local content.

[!remoteinclude[Welcome](snippets/welcome.md)]

Inline usage also works: today's status is [!remoteinclude[s](status/prod.md)].

At build time the extension performs GET {baseUrl}/{source}, parses the response as
markdown, and inlines it. The directive has two shapes that share one syntax:

  • Block — the directive is the only thing on its line. The fetched markdown is inlined as block content (headings, lists, paragraphs).
  • Inline — the directive appears mid-paragraph. The fetched markdown must reduce to a single paragraph; only its inline content is spliced in, with no <p> wrapper.

Nested directives, cycle detection (via an AsyncLocal source stack, max depth 8), an
in-process per-build cache, and concurrency capped at 8 in-flight requests are all
built in. By default a missing source fails the build; --allow-missing renders a
visible error placeholder instead.
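The guards described above can be sketched roughly as follows. This is not the extension's actual code: `ResolveAsync`, the `graph` dictionary that stands in for the content service, and the exception messages are all hypothetical. Only the `AsyncLocal` stack, the depth limit of 8, and the cap of 8 in-flight requests come from the description above.

```csharp
using System;
using System.Collections.Generic;
using System.Collections.Immutable;
using System.Linq;
using System.Threading;
using System.Threading.Tasks;

const int MaxDepth = 8;      // nesting limit quoted in the post
const int MaxInFlight = 8;   // in-flight request cap quoted in the post

// Chain of sources being resolved on the current async flow.
var sourceStack = new AsyncLocal<ImmutableStack<string>>();
var gate = new SemaphoreSlim(MaxInFlight);

// HYPOTHETICAL stand-in for the content service: source -> sources it includes.
var graph = new Dictionary<string, string[]>
{
    ["a.md"] = new[] { "b.md", "c.md" },
    ["b.md"] = new[] { "c.md" },
    ["c.md"] = Array.Empty<string>(),
    ["loop.md"] = new[] { "loop.md" },
};

async Task<string> ResolveAsync(string source)
{
    var stack = sourceStack.Value ?? ImmutableStack<string>.Empty;
    if (stack.Contains(source))
        throw new InvalidOperationException($"Include cycle at '{source}'.");
    if (stack.Count() >= MaxDepth)
        throw new InvalidOperationException($"Include depth exceeds {MaxDepth}.");

    // Visible to nested calls only; sibling includes never see this push.
    sourceStack.Value = stack.Push(source);

    string[] children;
    await gate.WaitAsync();          // hold a slot only while "fetching"
    try
    {
        await Task.Delay(1);         // placeholder for the HTTP GET
        children = graph[source];
    }
    finally
    {
        gate.Release();
    }

    var parts = await Task.WhenAll(children.Select(ResolveAsync));
    return parts.Length == 0 ? source : $"{source}({string.Join(",", parts)})";
}

Console.WriteLine(await ResolveAsync("a.md"));   // prints a.md(b.md(c.md),c.md)
```

The semaphore is released before nested includes are resolved, so a deep include chain cannot exhaust the eight slots and deadlock itself.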

If your content service uses a non-trivial URL scheme, set urlTemplate with a
{source} placeholder:

{
  "baseUrl": "https://api.example.com/",
  "urlTemplate": "content/GetFile?path={source}"
}
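To make the two URL shapes concrete, here is a minimal sketch of how a request URI could be derived from baseUrl, an optional urlTemplate, and the directive's source. `BuildContentUri` is a hypothetical helper, not part of the package, and escaping the source before substituting it into the template is my assumption; the tool may handle that differently.

```csharp
using System;

// HYPOTHETICAL helper. Without a template the source is appended to baseUrl;
// with one, {source} is replaced by the (escaped) source path.
// baseUrl should end with "/" so relative resolution keeps its last segment.
static Uri BuildContentUri(string baseUrl, string source, string? urlTemplate = null)
{
    var relative = urlTemplate is null
        ? source
        : urlTemplate.Replace("{source}", Uri.EscapeDataString(source));
    return new Uri(new Uri(baseUrl), relative);
}

// Default shape: GET {baseUrl}/{source}
Console.WriteLine(BuildContentUri(
    "https://api.example.com/", "snippets/welcome.md").AbsoluteUri);

// Template shape, using the config shown above
Console.WriteLine(BuildContentUri(
    "https://api.example.com/", "snippets/welcome.md",
    "content/GetFile?path={source}").AbsoluteUri);
```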

Optional page transform

After all includes are resolved and a page is fully assembled, the build can send that
page to a page transform service. The service owns the rules; pages only declare
intent via YAML frontmatter:

---
transform:
  audience: engineer
  intent: onboarding
  overrides:
    prerequisites: "target macOS users"
---

The extension extracts this transform: block, sends the assembled page plus the
metadata to your configured endpoint, and uses the response. Governance happens once,
at the page level. The library only defines the contract:

public interface IPageTransformService
{
    Task<PageTransformResponse> TransformAsync(
        PageTransformRequest request, CancellationToken ct = default);
}

Omit the transform config to disable it. The library itself has no Azure dependency —
implement IPageTransformService with an LLM or with deterministic rules.
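As an illustration of the deterministic-rules option, here is a small sketch of an implementation with no LLM involved. The interface matches the contract quoted above, but the `PageTransformRequest` and `PageTransformResponse` records are stand-ins invented for this example (the post does not show their members), and `AudienceNoteTransform` is a hypothetical rule. Check the library's real types before adapting it.

```csharp
using System;
using System.Collections.Generic;
using System.Threading;
using System.Threading.Tasks;

IPageTransformService service = new AudienceNoteTransform();
var request = new PageTransformRequest(
    "# Setup\n\nInstall the CLI.",
    new Dictionary<string, string> { ["audience"] = "engineer" });
var response = await service.TransformAsync(request);
Console.WriteLine(response.Markdown);

// ASSUMED shapes: the library's real request/response types may differ.
public record PageTransformRequest(
    string Markdown, IReadOnlyDictionary<string, string> Metadata);
public record PageTransformResponse(string Markdown);

// Contract as quoted in the post.
public interface IPageTransformService
{
    Task<PageTransformResponse> TransformAsync(
        PageTransformRequest request, CancellationToken ct = default);
}

// HYPOTHETICAL rule: prepend a note naming the declared audience, if any.
public sealed class AudienceNoteTransform : IPageTransformService
{
    public Task<PageTransformResponse> TransformAsync(
        PageTransformRequest request, CancellationToken ct = default)
    {
        ct.ThrowIfCancellationRequested();
        var markdown = request.Metadata.TryGetValue("audience", out var audience)
            ? $"> [!NOTE]\n> Written for: {audience}\n\n{request.Markdown}"
            : request.Markdown;
        return Task.FromResult(new PageTransformResponse(markdown));
    }
}
```

A rule like this is cheap to run on every build and gives identical output for identical input, which an LLM-backed implementation does not guarantee.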

The reference content-and-transform service

The repo ships a reference service (samples/knowledge-service) that plays both roles,
content source and page transform, from one set of endpoints:

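| Endpoint        | Method | Purpose                                         |
|-----------------|--------|-------------------------------------------------|
| /content/{path} | GET    | Serves markdown for [!remoteinclude] directives |
| /transform      | POST   | Transforms the assembled page                   |
| /health         | GET    | Health check                                    |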

Point both at the same service in remoteinclude.json:

{
  "baseUrl": "http://localhost:8080/content/",
  "transform": {
    "endpoint": "http://localhost:8080/transform",
    "auth": { "mode": "none" }
  }
}

The /transform endpoint has three behaviors depending on how you configure it:

  • With central guidance (markdown under content/guidance/): it treats that guidance as the source of truth, compares each assembled page against it, and inserts > [!NOTE] **Team Override** callouts above content that deviates. It also harmonizes tone and structure for the page's audience/intent.
  • Without guidance content: it is a passthrough to the LLM — harmonizing tone and structure for the declared audience/intent, with no comparison or override callouts.
  • Without an AiEndpoint configured: it returns the page unchanged. /content still works.
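The precedence among those three behaviors can be summarized in a few lines. This is only my reading of the list above, not the sample service's code: `SelectTransformMode` and the mode names are hypothetical.

```csharp
using System;

// HYPOTHETICAL summary of the three behaviors. A missing AiEndpoint wins:
// the page is returned unchanged whether or not guidance content exists.
static string SelectTransformMode(string? aiEndpoint, bool hasGuidance)
{
    if (string.IsNullOrWhiteSpace(aiEndpoint)) return "unchanged";
    return hasGuidance ? "compare-and-harmonize" : "harmonize-only";
}

Console.WriteLine(SelectTransformMode("https://ai.example.com", true));   // compare-and-harmonize
Console.WriteLine(SelectTransformMode("https://ai.example.com", false));  // harmonize-only
Console.WriteLine(SelectTransformMode("", true));                         // unchanged
```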

Running it as a sidecar

The container binds 0.0.0.0:8080 (set via ASPNETCORE_URLS in the Dockerfile), reads
all configuration from environment variables, and has no required dependencies, so it
runs cleanly as a sidecar next to a docs build. With no content/guidance and no
AiEndpoint it still serves /content and returns pages unchanged from /transform.

Kubernetes — run it in the same Pod and probe /health:

containers:
  - name: docs-build
    image: your-docs-builder:latest
    # baseUrl/transform.endpoint point at http://localhost:8080
  - name: knowledge-service
    image: ghcr.io/saipramod/knowledge-service:latest
    ports:
      - containerPort: 8080
    env:
      - name: Transform__AiEndpoint
        value: ""            # empty = no-op transform (page returned unchanged)
    readinessProbe:
      httpGet: { path: /health, port: 8080 }

Because the sidecar shares localhost with the build container, point baseUrl at
http://localhost:8080/content/ and transform.endpoint at
http://localhost:8080/transform. A Docker Compose example is in the sample's README.

Using it

As a CLI (dotnet tool), with a remoteinclude.json next to your docfx.json:

dotnet tool install -g Documentation.DocfxRemoteInclude.Cli
docfx-ri build docs/docfx.json

As a library, for hosts that call Docset.Build(...):

using Docfx;
using Docfx.RemoteInclude;

using var client = new HttpRemoteContentClient(
    baseUri: new Uri("https://internal.example.com/"),
    authHandler: async (request, ct) =>
        request.Headers.Authorization = new("Bearer", await GetJwtAsync(ct)));

await Docset.Build("docs/docfx.json", new BuildOptions
{
    ConfigureMarkdig = pipeline => pipeline.UseRemoteInclude(client, new RemoteIncludeOptions
    {
        PageTransformService = myTransformService, // optional IPageTransformService
    }),
});

Provide your own IRemoteContentClient for non-HTTP sources, custom auth (mTLS, signed
URLs), or on-disk caching.

Auth and credentials

Both the content and transform auth blocks accept
{ "mode": "none" | "default" | "managedIdentity" | "jwt" | "key", "value": "...", "scope": "..." }.
value can indirect through environment variables with $VAR / ${VAR}, and scope
overrides the OAuth audience for default/managedIdentity modes. Credentials are read
from environment variables or a host-supplied callback — never from docfx.json, and
never written to disk.
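For readers unfamiliar with that kind of indirection, here is a minimal sketch of expanding $VAR and ${VAR} references from the environment. `ExpandEnv` is a hypothetical helper, not the tool's implementation, and throwing on an unset variable is an assumption; the post does not say how the tool treats missing variables.

```csharp
using System;
using System.Text.RegularExpressions;

// HYPOTHETICAL helper: replaces ${NAME} and $NAME with the variable's value.
// Group 1 captures the braced form, group 2 the bare form.
static string ExpandEnv(string value) =>
    Regex.Replace(value, @"\$\{(\w+)\}|\$(\w+)", m =>
    {
        var name = m.Groups[1].Success ? m.Groups[1].Value : m.Groups[2].Value;
        return Environment.GetEnvironmentVariable(name)
            ?? throw new InvalidOperationException(
                $"Environment variable '{name}' is not set.");
    });

Environment.SetEnvironmentVariable("DOCS_API_KEY", "s3cret");
Console.WriteLine(ExpandEnv("${DOCS_API_KEY}"));   // s3cret
Console.WriteLine(ExpandEnv("$DOCS_API_KEY"));     // s3cret
```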

Try it end to end

The repo includes a runnable samples/basic DocFX site wired to the
samples/knowledge-service reference service. Run the service, build the sample, and
you get shared snippets and page-level governance working together:

# terminal 1
cd samples/knowledge-service
dotnet run

# terminal 2
docfx-ri build samples/basic/docfx.json

MIT-licensed. Issues and PRs welcome.

GitHub: github.com/saipramod/docfx-remote-include
NuGet: Documentation.DocfxRemoteInclude · Documentation.DocfxRemoteInclude.Cli


Sai Pramod Upadhyayula is a Senior Software Engineer at Microsoft working on
AI-powered enterprise knowledge platforms, and a contributor to the DocFX
open-source ecosystem.
