Akarshan Gandotra

Part 2 — NGINX auth_request: the small primitive that changed everything

In Chapter 1 I claimed our entire Auth Gateway is built on top of one NGINX directive: auth_request. This chapter is a deep dive into how that directive actually works, and the five sharp edges that bit us before we got the config right.

If you already know auth_request cold, skim to "Sharp edge 1" near the bottom — that's where the real war stories are.

What auth_request actually does

Drop this in a location block:

location /user-management/ {
    auth_request /auth;
    proxy_pass http://user-service;
}

When a request matches /user-management/, NGINX:

  1. Pauses the main request before doing anything to the upstream.
  2. Fires an internal subrequest to /auth.
  3. Looks at the subrequest's HTTP status:
    • 2xx → continue with the main request.
    • 401 or 403 → abort the main request and return that status to the client.
    • Anything else → fall through to your error_page directives, or return 500.

That's the entire surface area. Two things to internalize:

  • The subrequest never reaches the client. The client only sees the result — usually a 200 from your upstream, or a 401 NGINX synthesized.
  • The subrequest target is just a normal location block. Any NGINX feature works there: proxy_pass, timeouts, retries, keepalive pools, even another auth_request (don't do this).

The subrequest lifecycle, as a timeline
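
A rough happy-path sequence, in the same spirit as the flowchart at the end of this post (the example path is illustrative):

sequenceDiagram
    participant Client
    participant NGINX
    participant Auth as Auth Service
    participant Upstream as user-service
    Client->>NGINX: GET /user-management/users
    NGINX->>Auth: subrequest POST /auth (headers only, no body)
    Auth-->>NGINX: 200 + X-Identity-* headers
    NGINX->>Upstream: original request + identity headers
    Upstream-->>NGINX: response
    NGINX-->>Client: response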

Notice the order: the subrequest is fully finished before NGINX touches the upstream. There is no streaming, no overlap. That's why latency adds up — your auth time is purely sequential to your upstream time.

What the subrequest actually looks like

Here is the full auth.conf we ship in our Helm chart, trimmed of comments and noise:

location = /auth {
  internal;                          # not callable from outside

  proxy_pass_request_body off;        # Critical: don't ship the body
  proxy_set_header Content-Length "";
  proxy_method POST;

  proxy_set_header X-Original-URI    $request_uri;
  proxy_set_header X-Original-Method $request_method;
  proxy_set_header X-Original-Host   $host;
  proxy_set_header X-Original-URL    $scheme://$http_host$request_uri;
  proxy_set_header X-Request-ID      $request_id;
  proxy_set_header X-Tenant-ID       $tenant_id;
  proxy_pass_request_headers on;      # forward Authorization, cookies, etc.

  proxy_buffering        off;
  proxy_http_version     1.1;
  proxy_set_header       Connection "";

  proxy_connect_timeout 5s;
  proxy_send_timeout    10s;
  proxy_read_timeout    10s;
  proxy_next_upstream   error timeout invalid_header http_500 http_502 http_503 http_504;
  proxy_next_upstream_tries   2;
  proxy_next_upstream_timeout 15s;

  error_page 502 503 504 = @auth_unavailable;

  proxy_pass http://auth_service/auth;
}

Every line in there is the result of an outage post-mortem. Worth walking through the non-obvious ones.

internal;

Without this, /auth would be a public endpoint anyone could hit. With it, NGINX only allows the location to be called from a subrequest. Try curl https://your-host/auth and you get 404. This is the same pattern NGINX uses for its own named locations.

proxy_method POST + proxy_pass_request_body off

The Auth Service doesn't care about the request body. It cares about the URI, the method, the tenant, and the bearer token. So we strip the body and force POST. Two reasons:

  • Performance. A 50 MB upload would otherwise be forwarded to the auth subrequest before it could be streamed to the upstream. That's a non-starter.
  • Security. The Auth Service shouldn't be a side-channel exfiltration target for upstream payloads.

But we force POST even though we drop the body. Why? Because some load balancers and observability tools treat POST /auth differently from GET /auth, and we wanted the subrequest to read as what it is: a request for an authorization decision, not a read of a resource.

proxy_pass_request_headers on

The Auth Service needs Authorization, Cookie, X-Forwarded-For, etc. We pass them all. The subrequest is in-cluster — there's no trust boundary between NGINX and the Auth Service.

proxy_set_header X-Original-*

NGINX rewrites the URI of a subrequest to the subrequest target (/auth). The Auth Service has no idea what URL the client originally hit. So we explicitly forward:

  • X-Original-URI — the path with query string, used for endpoint matching and audit.
  • X-Original-Method — the original HTTP verb.
  • X-Original-Host — the host header, useful for tenant resolution by hostname.
  • X-Original-URL — full URL, for logs.

These headers are the contract between NGINX and the Auth Service. Change them carelessly and you break every auth decision in the platform.

proxy_buffering off

For a tiny 2-line JSON response, buffering hurts more than it helps. We get a few hundred microseconds back per request with this.

proxy_http_version 1.1 + Connection ""

Combined with the upstream's keepalive 64, this enables connection reuse between NGINX and the Auth Service. Without it, every subrequest opens a fresh TCP connection — disastrous at any real RPS.
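
For completeness, the matching upstream block is roughly this shape (the server address is a placeholder, not our real service name):

upstream auth_service {
    server auth-service.auth.svc.cluster.local:8080;   # hypothetical in-cluster address
    keepalive 64;                                       # idle connections NGINX keeps open to the Auth Service
}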

Timeouts and retries

proxy_connect_timeout 5s;
proxy_send_timeout    10s;
proxy_read_timeout    10s;
proxy_next_upstream_tries   2;
proxy_next_upstream_timeout 15s;

Translation: try to connect in 5s, send in 10s, read in 10s. If we get a connection error or 5xx, retry once. The whole thing is bounded at 15s.

These are huge ceilings — a healthy auth pod responds in single-digit milliseconds. They exist for the worst case: a partition, a failing pod, a slow JWKS fetch on first request. We'd rather wait 15 seconds and serve a clean 503 than time out at 1 second and have flaky behavior under load.

error_page 502 503 504 = @auth_unavailable

This is the fail-closed line. If the Auth Service is unreachable after retries, NGINX runs the @auth_unavailable named location instead of just 502'ing the client.

Pulling identity out: auth_request_set

A subrequest succeeding (200) tells NGINX to continue, but on its own it doesn't tell the upstream service who is calling. That's where auth_request_set comes in. It pulls headers off the subrequest's response and binds them to NGINX variables, which we then forward.

From locations.conf:

auth_request /auth;
auth_request_set $auth_time     $upstream_response_time;
auth_request_set $auth_status   $upstream_status;
auth_request_set $identity_id   $upstream_http_x_identity_id;
auth_request_set $identity_type $upstream_http_x_identity_type;
auth_request_set $identity_name $upstream_http_x_identity_name;
auth_request_set $session_id    $upstream_http_x_session_id;
auth_request_set $auth_error_message $upstream_http_x_auth_error_message;
auth_request_set $auth_error_code    $upstream_http_x_auth_error_code;

Two patterns at play:

  • $upstream_response_time and $upstream_status are the auth subrequest's transport metadata. We capture them so they end up in our log line (see the log_format sketch after this list).
  • $upstream_http_x_identity_id is NGINX's way of saying "the value of the X-Identity-ID response header on the most recent upstream call." We freeze that into $identity_id before we touch the actual upstream service — otherwise the upstream's response headers would clobber it.
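
A minimal sketch of an access log format that puts those variables to work — the format name and field layout here are illustrative, not our shipped config:

# http context
log_format auth_access '$remote_addr "$request_method $request_uri" $status '
                       'auth_status=$auth_status auth_time=$auth_time '
                       'identity=$identity_id tenant=$tenant_id req=$request_id';

# server/location context
access_log /dev/stdout auth_access;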

Then, in the same location, we pass those variables forward:

proxy_set_header X-Identity-ID   $identity_id;
proxy_set_header X-Identity-Type $identity_type;
proxy_set_header X-Identity-Name $identity_name;
proxy_set_header X-Session-ID    $session_id;
proxy_set_header X-Tenant-ID     $tenant_id;
proxy_set_header X-Request-ID    $request_id;

The upstream service trusts these. It doesn't see the JWT. It doesn't validate the token. It loads the user by X-Identity-ID and gets on with its life.

Named error_page locations: clean envelopes for every failure

auth_request returning 401 doesn't automatically send a clean 401 to the client — it just tells NGINX the request was unauthorized. By default the response body is empty, which makes for sad logs and worse client behavior.

We use named error_page locations to attach JSON envelopes:

location /user-management/ {
  auth_request /auth;
  # ...
  error_page 401 = @unauthorized;
  error_page 403 = @forbidden;
  error_page 500 = @internal_server_error;
  error_page 502 503 504 = @upstream_unavailable;
  proxy_pass http://user-service;
}

And the named locations live in custom_error_locations.conf:

location @unauthorized {
  internal;
  default_type application/json;
  add_header X-Request-ID $request_id always;
  return 401 '{"source":"auth","message":"Unauthorized","code":"$auth_error_code","error":"$auth_error_message"}';
}

location @forbidden {
  internal;
  default_type application/json;
  if ($auth_error_message = "No such API found") {
    return 404 '{"source":"auth","message":"NotFound","code":"$auth_error_code","error":"$auth_error_message"}';
  }
  return 403 '{"source":"auth","message":"Forbidden","code":"$auth_error_code","error":"$auth_error_message"}';
}

location @auth_unavailable {
  internal;
  default_type application/json;
  access_log /dev/stdout auth_unavailable;
  return 503 '{"source":"auth","message":"Auth Service Unavailable","error":"Auth Service unreachable"}';
}

location @upstream_unavailable {
  internal;
  default_type application/json;
  access_log /dev/stdout upstream_unavailable;
  return 503 '{"source":"auth","message":"Upstream Service Unavailable","error":"Upstream Service unreachable"}';
}

Couple of things worth highlighting:

  • $auth_error_code and $auth_error_message were captured from the subrequest's X-Auth-Error-Code and X-Auth-Error-Message response headers. The Auth Service writes these on every deny, and NGINX surfaces them verbatim to the client.
  • The if inside @forbidden is how we handle the "endpoint not registered in the trie" case. The Auth Service signals it with a 403 + a specific message, and NGINX rewrites that to 404. The wire-level shape stays consistent, but the status reflects what the client should actually see.
  • Both fail-closed branches use a separate log format (auth_unavailable, upstream_unavailable). When something is on fire, you want it in its own log stream so dashboards aren't drowned by 200s — a sketch of such a format follows.
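
Those formats are declared like any other log_format in the http block. A minimal sketch, with field choices that are ours for illustration, not the shipped format:

log_format auth_unavailable     'AUTH_UNAVAILABLE $time_iso8601 req=$request_id '
                                'uri=$request_uri auth_status=$auth_status auth_time=$auth_time';
log_format upstream_unavailable 'UPSTREAM_UNAVAILABLE $time_iso8601 req=$request_id '
                                'uri=$request_uri upstream=$upstream_addr status=$upstream_status';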

Sharp edge 1: subrequests don't cache

People expect this to work:

proxy_cache auth_cache;
proxy_cache_key "$http_authorization";
auth_request /auth;

It doesn't. auth_request deliberately ignores proxy_cache — the subrequest fires every time. There's no built-in TTL on auth decisions.

Why is that the right default? Because auth decisions are not cacheable in the general case:

  • The same token might be revoked in the next 50 ms.
  • The required permissions for an endpoint might change.
  • The tenant context can change between requests (different X-Tenant-ID).

You can roll your own cache — for example, by keying off (token_hash, endpoint, method) and storing decisions in a shared cache — but you're now responsible for invalidating it when anything about the auth state changes. We chose a different approach: caching inside the Auth Service process itself. That's Chapter 8.
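
For reference, the NGINX-level version of that DIY cache keys the subrequest itself inside the /auth location — a minimal sketch, with an illustrative zone name and a deliberately tiny TTL:

# http context: a hypothetical cache zone for auth decisions
proxy_cache_path /var/cache/nginx/auth keys_zone=auth_decisions:10m max_size=64m;

# inside location = /auth
proxy_cache          auth_decisions;
proxy_cache_key      "$http_authorization$request_method$request_uri";
proxy_cache_valid    200 401 403 5s;                 # short TTL bounds the revocation window
proxy_ignore_headers Cache-Control Expires;          # auth responses typically forbid caching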

Sharp edge 2: auth_request_set is run in main-request context

This bit us on day three. Consider:

location /api/ {
  auth_request /auth;
  auth_request_set $identity_id $upstream_http_x_identity_id;
  proxy_pass http://api-service;
}

The variable $identity_id is not populated by the subrequest itself. It's populated when auth_request_set is evaluated for the main request — after the subrequest completes — and $upstream_http_x_identity_id means "the most recent upstream response in the current request context." Since the subrequest has just finished, that response is the auth response, and this works. But here's the trap:

location /api/ {
  auth_request /auth;
  proxy_pass http://api-service;
  # ❌ auth_request_set lives below proxy_pass
  auth_request_set $identity_id $upstream_http_x_identity_id;
}

auth_request_set directives are order-independent within a location (they apply at request setup), but if you start playing tricks with if or set-based conditionals, you can read $identity_id before auth_request_set evaluates and get an empty string. Lesson: keep auth_request_set together, immediately after auth_request, before any proxy_set_header.

Sharp edge 3: subrequest 5xx vs subrequest 401

A subtle one. If the Auth Service returns 401, the client sees 401. If the Auth Service returns 500, what does the client see?

By default: 500. Because auth_request propagates the subrequest's status if it's not 2xx and not 401/403.

That's almost never what you want. A 500 from the auth pod is "auth is broken," not "the user is broken." The client shouldn't see "internal server error" for what is operationally an auth outage.

Fix: explicit error_page mapping.

error_page 500 = @internal_server_error;
error_page 502 503 504 = @auth_unavailable;

Now any 5xx from the auth subrequest gets a clean envelope. We tell oncall via the alert pipeline (Chapter 9), not via the client.

Sharp edge 4: proxy_intercept_errors

Default is off, which is correct in our location blocks. We explicitly set it because we burned half a day on a related bug:

location /api/ {
  auth_request /auth;
  proxy_intercept_errors off;   # important
  proxy_pass http://api-service;
}

If you set proxy_intercept_errors on, NGINX will run upstream error responses (e.g., a 404 from the actual api-service) through your error_page mappings. Suddenly your "no such API found" 404 from the Auth Service and a "user not found" 404 from the upstream both end up in @forbidden's 404 branch. They look identical to the client. They're completely different problems.

Keep proxy_intercept_errors off on the upstream location. Let upstream errors pass through unmolested. Only auth-side errors should run through your named locations.

Sharp edge 5: NGINX never sees auth's body

The Auth Service can't return a JSON body that NGINX uses. Only the status code and response headers matter. If the Auth Service writes:

HTTP/1.1 401 Unauthorized
X-Auth-Error-Code: TOKEN_EXPIRED
X-Auth-Error-Message: token expired

{"detail":"token expired"}

…NGINX sees the 401 and the two X-Auth-* headers. The body is discarded. So the contract is:

  • Status decides allow/deny/error.
  • Response headers carry identity (on 200) or failure metadata (on 4xx).
  • Response body is for nobody. Don't bother.

Internalize this and the Auth Service handler design becomes much simpler — it's writing headers, not JSON.
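
The allow path follows the same header-only contract. On a 200, the response looks something like this (values are made up; the header names are the ones our auth_request_set lines capture):

HTTP/1.1 200 OK
X-Identity-ID: usr_9f2a41
X-Identity-Type: user
X-Identity-Name: jane.doe
X-Session-ID: sess_c81b3e

NGINX copies those headers into variables via auth_request_set and forwards them to the upstream; any body is discarded either way.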

What this directive bought us

To put it bluntly: auth_request is the difference between "we operate an Auth Gateway" and "we operate an Auth Library That Every Service Includes." It moved the decision point off every service's hot path and onto a single dedicated pod. Everything else in this series — endpoint classification, multi-tenant routing, revocation, observability — sits on top of that one primitive.

flowchart TD
    A[auth subrequest returns] --> B{status}
    B -->|200| C[continue to proxy_pass]
    B -->|401| D["@unauthorized<br/>return 401 JSON"]
    B -->|403| E["@forbidden<br/>404 if 'No such API found'<br/>else 403"]
    B -->|500| F["@internal_server_error<br/>return 500 JSON"]
    B -->|502/503/504| G["@auth_unavailable<br/>return 503 JSON<br/>(fail-closed)"]

What's next

Chapter 3 goes inside the Auth Service: the controller, the handler chain, JWT validation, the per-tenant RSA public-key cache, and the decision-reason model. We'll spend most of it in Go code.

If you implement an auth_request-backed gateway after reading this and something catches you, drop a comment. The five sharp edges above are the ones we hit. There are probably another five waiting for you.
