Proxy OpenAI Through Kong AI Gateway on Kubernetes

#kubernetes #kong #ai #devops

The Problem With Talking Directly to LLMs

Most teams start by wiring their app straight to the OpenAI API. It works until you need to add auth, rate limiting, or observability, or swap out the model provider. At that point you're rewriting application code instead of changing config.

An AI gateway solves this: one entry point, one place to govern traffic, and providers that can be swapped in config. Kong Gateway is a mature choice here. It has handled this kind of traffic governance for APIs for years, and the AI Proxy plugin extends that to LLMs.

This post walks through the key ideas. For the full step-by-step guide, head over to the tutorial on Hashnode.

What We're Building

A Kong Gateway 3.14 data plane running on Kubernetes (kind locally), connected to a Kong Konnect control plane. The AI Proxy plugin sits on a route and handles forwarding to OpenAI, so your app only talks to Kong.

Your app
  → POST /ai/chat (Kong proxy)
    → AI Proxy plugin attaches API key
      → OpenAI API
        → response back to your app

Your app never holds an OpenAI key. Kong does. Rate limiting, logging, and model swapping then happen at the gateway layer, through plugins and config rather than application code.
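To make the "your app only talks to Kong" point concrete, here is a minimal sketch of the application side in Python, using only the standard library. The URL matches the HTTPie call later in this post; the function names `build_chat_request` and `ask` are hypothetical, chosen for illustration. Note that no `Authorization` header is set anywhere, since the plugin is expected to attach the key.

```python
import json
import urllib.request

# Assumption: Kong's proxy is reachable on localhost:8080, as in the HTTPie call in this post.
KONG_CHAT_URL = "http://localhost:8080/ai/chat"


def build_chat_request(prompt: str, url: str = KONG_CHAT_URL) -> urllib.request.Request:
    """Build a chat request for the Kong route. No OpenAI key is attached here."""
    body = json.dumps({"messages": [{"role": "user", "content": prompt}]}).encode("utf-8")
    return urllib.request.Request(
        url,
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def ask(prompt: str) -> dict:
    """Send the request through Kong and return the decoded JSON response."""
    with urllib.request.urlopen(build_chat_request(prompt), timeout=30) as resp:
        return json.load(resp)
```

Splitting request construction from sending keeps the first half testable without a running gateway.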

The Key Bit: decK Config as Code

The most interesting part of this setup is using decK to define the service, route, and plugin as a YAML state file, then syncing it to Konnect, which pushes it down to the data plane automatically.

# kong-ai.yaml
_format_version: "3.0"

services:
  - name: openai-service
    url: https://api.openai.com
    routes:
      - name: openai-chat-route
        paths:
          - /ai/chat
        plugins:
          - name: ai-proxy
            config:
              route_type: llm/v1/chat
              auth:
                header_name: Authorization
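                # decK does not expand shell-style $VARS; it substitutes
                # DECK_-prefixed environment variables at sync time instead.
                header_value: Bearer ${{ env "DECK_OPENAI_API_KEY" }}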
              model:
                provider: openai
                name: gpt-4o
                options:
                  max_tokens: 512

One sync command and Konnect pushes the config to every connected data plane:

deck gateway sync kong-ai.yaml \
  --konnect-token "$KONNECT_TOKEN" \
  --konnect-control-plane-name "kong-ai-tutorial"
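If the sync runs from CI, it can help to fail early when the token is missing and to preview changes before applying them. The sketch below is one way to do that in Python, under a few assumptions: the helper name `deck_command` is hypothetical, and `deck gateway diff` (a decK command not otherwise used in this post) is offered as an optional dry run that takes the same flags as `sync`.

```python
import os
from typing import List, Mapping


def deck_command(action: str, state_file: str, control_plane: str,
                 env: Mapping[str, str] = os.environ) -> List[str]:
    """Build the argv for `deck gateway diff` or `deck gateway sync` against Konnect."""
    if action not in ("diff", "sync"):
        raise ValueError(f"unsupported action: {action!r}")
    token = env.get("KONNECT_TOKEN")
    if not token:
        # Fail before invoking decK so the error message is unambiguous.
        raise RuntimeError("KONNECT_TOKEN is not set")
    return [
        "deck", "gateway", action, state_file,
        "--konnect-token", token,
        "--konnect-control-plane-name", control_plane,
    ]
```

A caller would pass the result to `subprocess.run(..., check=True)`, for example first with `"diff"` and then with `"sync"` for `kong-ai.yaml` and the `kong-ai-tutorial` control plane.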

Once it's live, a single HTTPie call confirms the whole chain is working:

http POST localhost:8080/ai/chat \
  Content-Type:application/json \
  messages:='[{"role": "user", "content": "What is Kong Gateway in one sentence?"}]'

The response comes back in the standard OpenAI chat format. Kong normalises it, so the shape stays the same even if you swap the underlying model later.
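Because the shape is the OpenAI chat completion format, client code can read the reply the same way whichever model sits behind the route. A minimal sketch in Python follows; the helper name `extract_reply` is hypothetical, and the `sample` payload is illustrative only, with made-up values rather than captured output.

```python
def extract_reply(response: dict) -> str:
    """Return the assistant text from an OpenAI-style chat completion body."""
    choices = response.get("choices") or []
    if not choices:
        raise ValueError("response has no choices")
    return choices[0]["message"]["content"]


# Illustrative payload only: the values below are invented, not real gateway output.
sample = {
    "object": "chat.completion",
    "model": "gpt-4o",
    "choices": [
        {
            "index": 0,
            "message": {"role": "assistant", "content": "Example reply text."},
            "finish_reason": "stop",
        }
    ],
}

print(extract_reply(sample))
```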

Try It Yourself

The full tutorial covers:

Creating a kind cluster with correct port mappings
Setting up a Konnect control plane and downloading cluster certs
Creating a System Account + Admin Role + PAT (the right way to handle automated access)
Installing Kong 3.14 via Helm with a complete values file
Full decK state file for the AI Proxy plugin
Troubleshooting guide for the common failure modes

👉 Kong AI Gateway on Kubernetes: Proxy OpenAI via Konnect