What this project is
DocuMCP is a documentation server that exposes knowledge bases through the Model Context Protocol (MCP). AI agents connect to it and can search, read, and manage documentation across multiple sources: uploaded documents (PDF, DOCX, XLSX, EPUB, HTML, Markdown), ZIM archives served by Kiwix, and Git template repositories. It also has a REST API and a Vue 3 admin panel.
The original version was a Laravel application. It had 18 MCP tools, 7 prompts, an OAuth 2.1 authorization server, OIDC authentication, Meilisearch for full-text search, and 97%+ test coverage. It worked. This article is about why I rewrote it in Go and what the process taught me.
Why rewrite at all?
The PHP version was not broken. Laravel is a productive framework, and the test coverage gave me confidence in the code. I rewrote it because of operational friction, not because I had a problem with PHP.
The Laravel deployment required PHP, a process manager (RoadRunner or Octane), Meilisearch as a separate service, and the usual Composer dependency tree. The container image was around 200 MB. Each new deployment meant coordinating multiple processes, and each process was another thing that could fail at 3 AM.
Go offered a different deployment model: a single statically-linked binary, a distroless container image (~45 MB), millisecond startup, and native concurrency without a process manager. All dependencies compile into the binary — no runtime interpreter, no extension loading, no shell needed in the container.
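A deployment like that typically boils down to a two-stage container build. This is a minimal sketch, not the project's actual Dockerfile — the Go version and flags are illustrative:

```dockerfile
# Build stage: compile a static binary (CGO off, symbols stripped)
FROM golang:1.23 AS build
WORKDIR /src
COPY . .
RUN CGO_ENABLED=0 go build -ldflags="-s -w" -o /documcp ./cmd/documcp

# Runtime stage: distroless static image, no shell, no package manager
FROM gcr.io/distroless/static-debian12
COPY --from=build /documcp /documcp
ENTRYPOINT ["/documcp", "serve"]
```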
The other factor was the MCP SDK. The official Go MCP SDK had just reached a usable state, and I wanted the type safety and tooling that Go provides for protocol work. Registering a tool in the Go SDK looks like this:
mcp.AddTool(h.server, &mcp.Tool{
	Name:        "unified_search",
	Description: "Search across ALL content types in a single request...",
	InputSchema: mcp.ToolInputSchema{
		Type:       "object",
		Properties: map[string]map[string]any{ /* ... */ },
	},
}, h.handleUnifiedSearch)
The handler is a typed Go function, so its signature is checked at compile time; the SDK validates incoming arguments against the declared schema and manages the SSE transport. In PHP, I had more ceremony around JSON schema validation and request dispatching.
Architecture overview
The Go version follows a layered structure:
cmd/documcp/ # Cobra CLI: serve, worker, migrate, version, health
internal/
app/ # Lifecycle: Foundation → ServerApp / WorkerApp
auth/oauth/ # OAuth 2.1 authorization server
auth/oidc/ # OIDC authentication (external IdP)
handler/api/ # REST API handlers
handler/mcp/ # MCP tool + prompt handlers
handler/oauth/ # OAuth endpoint handlers
model/ # Domain types (typed status constants, no ORM tags)
repository/ # PostgreSQL queries (pgx, handwritten SQL)
service/ # Business logic, error translation
search/ # PostgreSQL FTS (tsvector/tsquery + pg_trgm)
extractor/ # Document content extraction (PDF, DOCX, XLSX, EPUB, HTML, MD)
client/kiwix/ # Kiwix ZIM archive client
queue/ # River job queue (Postgres-native, admin UI at /admin/river)
crypto/ # AES-256-GCM encryption at rest
observability/ # OpenTelemetry, Prometheus, slog
web/frontend/ # Vue 3 + TypeScript (embedded in binary via go:embed)
The key lifecycle abstraction is Foundation, a struct that owns all shared dependencies (database pool, Redis clients, repositories, search, extractors, encryption, observability). Both ServerApp (HTTP) and WorkerApp (job processing) receive a Foundation and compose their own dependencies on top of it.
type Foundation struct {
	Config      *config.Config
	Logger      *slog.Logger
	PgxPool     *pgxpool.Pool
	RedisClient *redis.Client
	// ...repositories, search, extractors, encryption...
}
This pattern makes the dependency graph explicit. Every service gets its dependencies through constructor injection, and Foundation provides the wiring point without a DI container.
One nice bonus of using River for the job queue: it ships with an embeddable admin UI that I mount at /admin/river. Queue visibility with zero extra infrastructure.
Dropping Meilisearch for PostgreSQL full-text search
This was the biggest single change in terms of operational overhead. The PHP version used Meilisearch as a separate service for full-text search. Meilisearch is excellent software, but it was another process to run, another data store to keep in sync, and another thing to monitor.
PostgreSQL already had all my data. It also has a mature full-text search engine that I think is underused. The Go version uses tsvector/tsquery with a custom text search configuration, pg_trgm for fuzzy matching, and unaccent for diacritic-insensitive search.
The search layer is a Searcher struct that accepts typed parameters and returns normalized results:
type Searcher struct {
	db     *pgxpool.Pool
	logger *slog.Logger
}

func (s *Searcher) Search(ctx context.Context, params SearchParams) (*SearchResponse, error) {
	expanded := ExpandSynonyms(params.Query)
	limit := params.Limit

	// Route to index-specific query builder
	switch params.IndexUID {
	case IndexDocuments:
		return s.searchDocuments(ctx, expanded, params, limit)
	case IndexZimArchives:
		return s.searchZimArchives(ctx, expanded, params, limit)
	case IndexGitTemplates:
		return s.searchGitTemplates(ctx, expanded, params, limit)
	default:
		return nil, fmt.Errorf("unknown index %q", params.IndexUID)
	}
}
Federated search across all three indexes uses a UNION ALL query, so it is one round trip to PostgreSQL instead of three separate HTTP calls to Meilisearch. Synonym expansion happens in Go before the query reaches PostgreSQL. The SQL stays simple and the synonyms are unit-testable.
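The shape of that expansion is simple: look up each query term in a synonym table and OR the alternatives together before the query reaches PostgreSQL. A minimal sketch — the synonym entries and the OR-group syntax here are illustrative, not DocuMCP's actual table:

```go
package main

import (
	"fmt"
	"strings"
)

// synonyms maps a term to alternatives that should also match.
// These entries are examples, not the real synonym table.
var synonyms = map[string][]string{
	"js":       {"javascript"},
	"postgres": {"postgresql"},
}

// ExpandSynonyms rewrites each query term that has synonyms into an
// OR-group of the term plus its alternatives.
func ExpandSynonyms(query string) string {
	terms := strings.Fields(strings.ToLower(query))
	out := make([]string, 0, len(terms))
	for _, t := range terms {
		if alts, ok := synonyms[t]; ok {
			out = append(out, "("+strings.Join(append([]string{t}, alts...), " OR ")+")")
		} else {
			out = append(out, t)
		}
	}
	return strings.Join(out, " ")
}

func main() {
	fmt.Println(ExpandSynonyms("postgres tutorial"))
	// → (postgres OR postgresql) tutorial
}
```

Because this is a pure function, the synonym behavior is trivially unit-testable without a database.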
The tradeoff: PostgreSQL FTS requires more manual tuning than Meilisearch (custom dictionaries, index maintenance, query parsing). But it eliminated an external dependency, and the search quality is good enough for a documentation use case where queries are typically specific.
Building OAuth 2.1 from scratch
The PHP version also had a custom OAuth implementation. For Go, the common choice is fosite, but I decided to build the OAuth 2.1 authorization server from scratch again.
The reason was scope. DocuMCP needs a specific set of OAuth flows: authorization code with PKCE (required by OAuth 2.1), device authorization (RFC 8628) for CLI/agent clients that lack a browser, and dynamic client registration (RFC 7591) so MCP clients can register themselves. It also publishes RFC 9728 Protected Resource Metadata so clients can discover the authorization server automatically.
Building this on fosite would have meant learning fosite's abstraction layer, mapping my storage to its interfaces, and debugging through its middleware chain when something didn't match. For a well-scoped set of RFCs, the direct implementation was more predictable.
The OAuth service follows the same patterns as the rest of the codebase. It defines a repository interface where it is consumed:
type OAuthRepo interface {
	CreateClient(ctx context.Context, client *model.OAuthClient) error
	FindClientByClientID(ctx context.Context, clientID string) (*model.OAuthClient, error)
	CreateAuthorizationCode(ctx context.Context, code *model.OAuthAuthorizationCode) error
	FindAuthorizationCodeByCode(ctx context.Context, codeHash string) (*model.OAuthAuthorizationCode, error)
	RevokeAuthorizationCode(ctx context.Context, id int64) error
	// ...access tokens, refresh tokens, device codes
}

type Service struct {
	repo   OAuthRepo
	config config.OAuthConfig
	appURL string
	logger *slog.Logger
}
All tokens are stored as HMAC-SHA256 hashes. The HMAC key is derived from the session secret using HKDF with a salt, so the raw token values never touch the database. The OAuth consent pages use html/template. No JavaScript framework, no XSS surface.
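The storage side of that scheme can be sketched with the standard library alone. The HKDF key derivation is elided here — the key below stands in for the HKDF-derived HMAC key:

```go
package main

import (
	"crypto/hmac"
	"crypto/sha256"
	"encoding/hex"
	"fmt"
)

// hashToken computes the HMAC-SHA256 of a raw token under a derived
// key. Only this hash is persisted; the raw token is shown to the
// client once and never touches the database.
func hashToken(key []byte, rawToken string) string {
	mac := hmac.New(sha256.New, key)
	mac.Write([]byte(rawToken))
	return hex.EncodeToString(mac.Sum(nil))
}

func main() {
	key := []byte("derived-via-hkdf-in-the-real-code") // placeholder, not HKDF output
	fmt.Println(hashToken(key, "example-access-token"))
}
```

Lookups then hash the presented token and query by the hash, so a database dump never yields usable credentials.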
I will cover the OAuth implementation in more detail in Part 2 of this series.
The ZIM fan-out problem
The most interesting engineering challenge was federated search across ZIM archives. DocuMCP integrates with Kiwix Serve, which hosts ZIM files — offline snapshots of sites like DevDocs, Wikipedia, and Stack Exchange. One instance can serve hundreds of archives.
The unified search tool needs to search across all relevant archives in a single MCP tool call. The naive approach (fan out to every searchable archive in parallel) collapsed when the archive count reached ~470 (mostly DevDocs). Kiwix Serve could not handle hundreds of concurrent search requests, and everything timed out.
The fix had three parts:
1. FTS pre-filtering. Before fanning out, the server searches its own zim_archives table using PostgreSQL FTS to find archives relevant to the query. If you search for "react hooks," you get React-related archives, not all 470.
resp, err := h.searcher.Search(ctx, search.SearchParams{
	Query:    query,
	IndexUID: search.IndexZimArchives,
	Limit:    int64(h.federatedMaxArchives * 3),
})
2. Budget splitting. DevDocs archives are small (fewer than 1000 articles each) and highly relevant for programming queries. They get their own fan-out budget separate from larger general archives like Wikipedia. This prevents a few large archives from crowding out dozens of small, relevant DevDocs archives.
3. Semaphore-bounded concurrency. The actual fan-out uses a channel-based semaphore capped at 10 concurrent requests, with a configurable deadline on the entire batch (default 3 seconds, KIWIX_FEDERATED_SEARCH_TIMEOUT):
var wg sync.WaitGroup
sem := make(chan struct{}, 10)
for i := range archives {
	wg.Add(1)
	go func(name string) {
		defer wg.Done()
		sem <- struct{}{}        // acquire
		defer func() { <-sem }() // release
		results, err := kiwixClient.Search(fanoutCtx, name, query, searchType, limit)
		// ...
	}(archives[i].Name)
}
wg.Wait()
This brought search times from 30-second timeouts down to about 250ms on a warm cache, across the ~20 archives that actually match a typical query after FTS selection.
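The budget split in step 2 comes down to a simple partition over the candidate list. A sketch — the name-prefix test and the budget sizes are illustrative, not DocuMCP's exact classification rules:

```go
package main

import (
	"fmt"
	"strings"
)

// splitBudget partitions candidate archives into DevDocs and general
// groups, each capped by its own fan-out budget, so a few large
// archives cannot crowd out many small, relevant ones.
func splitBudget(names []string, devdocsBudget, generalBudget int) (devdocs, general []string) {
	for _, n := range names {
		switch {
		case strings.HasPrefix(n, "devdocs_") && len(devdocs) < devdocsBudget:
			devdocs = append(devdocs, n)
		case !strings.HasPrefix(n, "devdocs_") && len(general) < generalBudget:
			general = append(general, n)
		}
	}
	return devdocs, general
}

func main() {
	names := []string{"devdocs_react", "wikipedia_en", "devdocs_go", "stackoverflow"}
	d, g := splitBudget(names, 2, 1)
	fmt.Println(d, g)
	// → [devdocs_react devdocs_go] [wikipedia_en]
}
```

Because the input list is already ranked by FTS relevance, taking the first N per group keeps the most relevant archives in each budget.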
A few surprises
Error translation turned out to be worth the boilerplate. The service layer translates repository errors into domain errors, and handlers translate domain errors into HTTP status codes. It feels like extra work up front, but it means the MCP handler and the REST handler share the same service code without leaking database details:
// Service layer
var ErrNotFound = errors.New("not found")

func (s *DocumentService) FindByUUID(ctx context.Context, uuid string) (*model.Document, error) {
	doc, err := s.repo.FindByUUID(ctx, uuid)
	if errors.Is(err, sql.ErrNoRows) {
		return nil, fmt.Errorf("document %s: %w", uuid, ErrNotFound)
	}
	return doc, err
}
Two Redis clients, not one
I ended up with two Redis clients: one instrumented with OpenTelemetry for the event bus and application-level operations, and one bare client for rate limiting and health checks. The rate limiter uses MULTI/INCR/EXPIRE/EXEC pipelines at high frequency, and tracing every one of those creates noise that drowns out meaningful spans. The bare client has MaxRetries: -1 to prevent retry-induced partial responses in the pipeline.
Pure Go extraction actually worked
I expected to need cgo or external tools for PDF and DOCX extraction. Instead, the extractor pipeline uses pure Go libraries for all six formats: PDF, DOCX, XLSX, EPUB, HTML, and Markdown. The EPUB case was what convinced me — an EPUB file is a ZIP of XHTML files, so the whole thing is archive/zip + encoding/xml + bluemonday for sanitization + htmltomarkdown for chapter processing. No external dependencies, no cgo. Each extractor has configurable limits (max extracted text size, max ZIP files for DOCX, max sheets for XLSX).
OIDC discovery was the other thing that bit me. The OIDC client fetches the provider's .well-known/openid-configuration at startup. Early in development, transient provider unavailability during startup caused permanent failures and the server would crash and restart in a loop. Exponential backoff with retries fixed it. A small thing, but a real problem in any environment where the identity provider starts up alongside the application.
If I did it again
I would hit the MCP SDK's SSE transport with focused integration tests from day one. The Go MCP SDK was new, and its SSE transport had behaviors I didn't expect. I spent time debugging connection lifecycle issues that would have shown up in tests.
I would also use sqlc or a query builder for the simpler queries. I wrote all SQL by hand with pgx. For the complex FTS queries and UNION ALL federated search, that was the right call — no query builder would have generated those. But for basic CRUD, handwritten SQL is repetitive and error-prone when columns change. A hybrid approach (sqlc for CRUD, handwritten for complex queries) would have saved time.
Third, I'd nail down structured logging conventions earlier. slog is good, but I spent time later standardizing which fields to include across handlers. Picking a convention (request ID, user ID, operation name) up front would have avoided the cleanup pass.
The numbers
| Metric | PHP (Laravel) | Go |
|---|---|---|
| Container image | ~200 MB | ~45 MB |
| MCP tools | 18 | 16 |
| MCP prompts | 7 | 6 |
| External search dependency | Meilisearch | None (PostgreSQL FTS) |
| Test coverage | 97%+ | 76.7% |
| Linter rules | PHPStan L9 | golangci-lint, 26 linters |
| Runtime processes | PHP + process manager + Meilisearch | Single binary |
| Deployment artifact | Container + config | Single binary or container |
The tool count went down because I dropped the Confluence integration (the API complexity was not worth maintaining for a feature nobody used) and consolidated some overlapping tools. Test coverage is lower in Go (76.7% vs 97%), partly because the PHP codebase was older and more fully tested, and partly because I prioritized integration-level tests over exhaustive unit coverage for repository code that is mostly SQL.
Up next
Part 2 covers two subsystems in detail: the OAuth 2.1 authorization server (PKCE, device authorization, dynamic client registration) and the MCP protocol integration (tool registration, SSE transport, authentication flow).
The code is open source at github.com/c-premus/documcp. If you have questions about the architecture or want to discuss any of the decisions, I am happy to talk about it in the comments.