Ebi Soroush

Building an API Gateway From Scratch in Go

API gateways are one of those infrastructure components that feel intimidating from the outside. You know they handle auth, routing, and rate limiting, but the internals are a black box. I decided to fix that by building one from scratch in Go, with the HTTP layer built on nothing but the standard library's net/http. No web frameworks. Just code.

This post walks through the design of simple-api-gateway: what it does, how it's structured, and the key implementation decisions along the way.


The Core Idea: A Gateway is Just a Middleware Chain

Before diving in, it's worth naming the central insight: an API gateway is fundamentally a configurable http.Handler. Every feature (auth, rate limiting, tracing) is just middleware wrapped around a reverse proxy.

In Go, that pattern looks like this:

type Middleware func(http.Handler) http.Handler

A middleware takes a handler and returns a new handler that does something before or after calling the next one. Chain enough of these together and you have a gateway.

The Gateway struct implements http.Handler directly:

type Gateway struct {
    config     *types.Config
    router     *Router
    middleware map[string]Middleware
}

func (gw *Gateway) ServeHTTP(w http.ResponseWriter, r *http.Request) {
    route, _ := gw.router.Match(r) // path params are discarded; see routing section
    if route == nil {
        http.NotFound(w, r)
        return
    }

    handler := gw.buildChain(route)
    handler.ServeHTTP(w, r)
}

No magic, just a struct with a method. This also means the gateway is fully testable without a live server. You can call ServeHTTP directly in a test.


Routing

I designed the router to handle three path patterns: exact (/users), parameterised (/users/:id), and prefix wildcards (/users/*). Routes are sorted by specificity at startup so more precise matches always win.

func (ro *Router) Match(r *http.Request) (*types.Route, map[string]string) {
    for _, route := range ro.routes { // sorted: exact > param > prefix
        if matched, params := matchPath(route.Path, r.URL.Path); matched {
            if methodAllowed(route.Methods, r.Method) {
                return &route, params
            }
        }
    }
    return nil, nil
}

One deliberate limitation: :param segments are matched and used for routing, but the extracted values aren't injected into the request context. For the purposes of this project (learning the gateway pattern) that tradeoff was acceptable, but it's a clear gap for production use.
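For reference, segment-by-segment matching along these lines can be sketched as follows; matchPath here is my illustrative version, not the project's actual code:

```go
package main

import "strings"

// matchPath compares a pattern to a concrete path: literal segments
// must be equal, ":name" captures a parameter, and a trailing "*"
// matches any remainder (prefix wildcard).
func matchPath(pattern, path string) (bool, map[string]string) {
	ps := strings.Split(strings.Trim(pattern, "/"), "/")
	ss := strings.Split(strings.Trim(path, "/"), "/")
	params := map[string]string{}
	for i, seg := range ps {
		if seg == "*" {
			return true, params // wildcard: accept the rest of the path
		}
		if i >= len(ss) {
			return false, nil // path is shorter than the pattern
		}
		if strings.HasPrefix(seg, ":") {
			params[seg[1:]] = ss[i]
			continue
		}
		if seg != ss[i] {
			return false, nil
		}
	}
	return len(ps) == len(ss), params
}

func main() {
	ok, params := matchPath("/users/:id", "/users/42")
	if !ok || params["id"] != "42" {
		panic("param match failed")
	}
	if ok, _ := matchPath("/users/*", "/users/42/posts"); !ok {
		panic("wildcard match failed")
	}
	if ok, _ := matchPath("/users", "/orders"); ok {
		panic("exact match should fail")
	}
}
```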


The Middleware Chain

When a route is matched, its configured middleware is assembled into a chain and applied around the upstream proxy:

func (gw *Gateway) buildChain(route *types.Route) http.Handler {
    handler := gw.proxy.Forward(route.Target)

    // Apply in reverse so the first listed middleware is outermost
    for i := len(route.Middleware) - 1; i >= 0; i-- {
        name := route.Middleware[i]
        if mw, ok := gw.middleware[name]; ok {
            handler = mw(handler)
        }
    }
    return handler
}

The order middleware is listed in config.yaml is the order requests traverse the chain, so if you list [tracing, auth, ratelimit, logging, metrics], tracing wraps everything, auth runs next, and so on. Getting this ordering right matters a lot in practice.

Each built-in middleware is registered at startup, and custom middleware can be added before calling Start():

gw := core.NewGateway(cfg)
gw.Register("my-middleware", func(next http.Handler) http.Handler {
    return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
        // do something before
        next.ServeHTTP(w, r)
        // do something after
    })
})
gw.Start()

Authentication

The auth middleware validates JWT Bearer tokens and injects verified claims into the request context:

func NewAuthMiddleware(verifier TokenVerifier) Middleware {
    return func(next http.Handler) http.Handler {
        return http.HandlerFunc(func(w http.ResponseWriter, r *http.Request) {
            token := extractBearer(r)
            if token == "" {
                writeError(w, http.StatusUnauthorized, "missing token")
                return
            }

            claims, err := verifier.Verify(token)
            if err != nil {
                writeError(w, http.StatusUnauthorized, "invalid token")
                return
            }

            ctx := context.WithValue(r.Context(), claimsKey{}, claims)
            next.ServeHTTP(w, r.WithContext(ctx))
        })
    }
}

The middleware itself doesn't care how a token is verified; that's delegated to a TokenVerifier interface:

type TokenVerifier interface {
    Verify(token string) (Claims, error)
}

This separation means swapping in a different authentication backend is a matter of implementing one interface, with no changes to the middleware itself. The active verifier is selected at startup based on the auth.type field in config — currently jwt is the built-in implementation, but plugging in something like a Keycloak OIDC verifier would just mean implementing Verify against Keycloak's token introspection endpoint and setting type: keycloak in your config:

auth:
  enabled: true
  type: keycloak # swap without touching middleware code
  # keycloak-specific config would go here

Downstream handlers retrieve the injected claims with middleware.ClaimsFromContext(ctx), regardless of which verifier produced them.


Rate Limiting

Rate limiting uses a per-IP token bucket from golang.org/x/time/rate:

func (rl *rateLimiter) getLimiter(ip string) *rate.Limiter {
    rl.mu.Lock()
    defer rl.mu.Unlock()

    if lim, ok := rl.clients[ip]; ok {
        return lim
    }

    lim := rate.NewLimiter(rate.Limit(rl.rps), rl.burst)
    rl.clients[ip] = lim
    return lim
}

This works fine for a single instance but has two limitations worth knowing: the map has no eviction, so memory grows unbounded over time; and there's no shared state across instances, so it breaks under horizontal scaling. A production system would push this to Redis. For a learning project, the simplicity is worth it.


Observability

Three layers of observability are built in.

Logging uses Go 1.21's slog with configurable JSON or text output. Each request gets a structured log entry with method, path, status, and duration.

Metrics are exposed via Prometheus at a configurable path (default /metrics). Three instruments are tracked:

gateway_requests_total{method, path, status}
gateway_request_duration_seconds{method, path}
gateway_requests_in_flight

The /metrics endpoint itself is served before route matching, bypassing all middleware; otherwise you'd need to auth-exempt your own scraper.

Distributed tracing uses OpenTelemetry. The tracing middleware creates a server span for each request and propagates a traceparent header to the upstream, so traces stitch together across service boundaries. Spans are exported to Jaeger over gRPC OTLP.
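The traceparent header itself is just a fixed-layout string. The sketch below builds one by hand, standard library only, purely to illustrate the W3C Trace Context format; the project uses the OpenTelemetry SDK's propagator instead of doing this manually:

```go
package main

import (
	"crypto/rand"
	"encoding/hex"
	"fmt"
	"net/http"
	"regexp"
)

// newTraceparent builds a W3C trace-context value: version "00",
// a random 16-byte trace ID, a random 8-byte span ID, and the
// "01" (sampled) flag.
func newTraceparent() string {
	traceID := make([]byte, 16)
	spanID := make([]byte, 8)
	rand.Read(traceID)
	rand.Read(spanID)
	return fmt.Sprintf("00-%s-%s-01", hex.EncodeToString(traceID), hex.EncodeToString(spanID))
}

func main() {
	// upstream.example is a placeholder host, not a real service.
	req, _ := http.NewRequest("GET", "http://upstream.example/users", nil)
	req.Header.Set("traceparent", newTraceparent())

	// The header must match the version-traceid-spanid-flags layout.
	re := regexp.MustCompile(`^00-[0-9a-f]{32}-[0-9a-f]{16}-01$`)
	if !re.MatchString(req.Header.Get("traceparent")) {
		panic("malformed traceparent header")
	}
}
```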

One subtle gotcha: tracing must be explicitly listed in each route's middleware array. It is not in the internal isKnownMiddleware registry, so if you omit it, there's no warning — requests just won't be traced.


Configuration

Everything is controlled by a single config.yaml, parsed strictly with gopkg.in/yaml.v3's KnownFields — unknown keys are rejected at startup rather than silently ignored:

server:
  port: 8080

routes:
  - path: /users
    methods: [GET]
    target: http://localhost:9002
    middleware: [tracing, auth, ratelimit, logging, metrics]

auth:
  enabled: true
  type: jwt
  secret: "your-secret"

rate_limit:
  enabled: true
  requests_per_second: 5
  burst: 4

metrics:
  prometheus:
    enabled: true
    path: /metrics
  otel:
    enabled: true
    endpoint: localhost:4317 # host:port only, no scheme

Running It

# Start Jaeger for tracing
docker compose up -d

# Start the gateway + mock upstream services
make run-example

# Generate a dev JWT and hit an endpoint
make get-token
# Run this a couple of times to generate some traced requests
curl -H "Authorization: Bearer <token>" http://localhost:8080/users

# Inspect traces and metrics
open http://localhost:16686       # Jaeger UI
curl http://localhost:8080/metrics

Observability in Action

Here's what the observability stack looks like when the gateway is running.

Prometheus metrics endpoint

Hitting /metrics exposes all three instruments in standard Prometheus text format — request counts broken down by method, path, and status code, latency histograms, and the current in-flight count.

[Screenshot: Prometheus /metrics endpoint output]

Structured JSON logs

With logging.format: json set in config, every request produces a structured log entry. This makes logs trivially parseable by tools like Loki, Datadog, or any log aggregator.

[Screenshot: Structured JSON request logs]

Jaeger trace view

Each request generates a server span with timing and metadata. Because traceparent is propagated to the upstream, the full trace stitches together across the gateway and the backend service in a single waterfall view.

[Screenshot: Jaeger trace waterfall for a single request]

Jaeger service graph

The service graph shows the call relationships between the gateway and its upstream services, making it easy to see which services are being hit and at what rate.

[Screenshot: Jaeger service dependency graph]

What I Took Away

Building this clarified something I'd always glossed over: the complexity in production gateways (Kong, Traefik, AWS API Gateway) isn't in the core concept. It's in the operational concerns layered on top: distributed rate limiting, hot config reloading, circuit breaking, mTLS, and dynamic service discovery. The fundamental pattern, a middleware chain in front of a reverse proxy, is surprisingly approachable.

If you want to genuinely understand a piece of infrastructure, build a limited version of it. The gaps in your mental model become impossible to ignore once you're responsible for filling them.

The full source is on GitHub at github.com/realEbi/simple-api-gateway.
