Governance-first AI gateway for teams that aren't ready for enterprise tooling

#opensource #ai #devops #governance

If you work in a regulated organisation, you have probably seen this play out: leadership wants AI in production, security wants an audit trail, and the team in the middle has two options. Either ship something fast with no governance — shadow tools, no DLP, no audit log — or wait twelve to eighteen months for an enterprise platform to get procured and approved. Neither is good.

Most of the tools available to bridge that gap fall into one of three camps:

Too complex. LiteLLM, Kong, Azure APIM. Good tools, but built for teams that already have DevOps capacity and a budget for AI infrastructure.
Too expensive. Enterprise AI governance platforms with six-figure contracts.
Too cloud-dependent. Require sending your data to a third party, which is a non-starter under data residency rules in finance, healthcare, and the public sector.

I have been working on a small Apache-2.0 project called Synapse AI Gateway that aims at the space between those options. Running docker compose up brings up the whole stack (postgres, backend, admin console) in under five minutes. Governance controls run on every inference request before it reaches a model.

GitHub: synapse-ai-gateway/synapse-ai-gateway

The core idea: governance bound to the credential

The design hinges on one decision: every API key is bound at creation to a system prompt, a model allowlist, a team identity, and rate limits. The team that gets a key for an approved HR-assistant use case cannot quietly repurpose that key for something else. They need a new key, which means a new approval.

That is the difference between governance-as-policy (a wiki page nobody reads) and governance-as-infrastructure (the gateway refuses the request). Policies do not enforce themselves. Controls in the request path do.
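To make the binding concrete, here is a minimal Python sketch of what "bound at creation" could look like. It is not the project's actual code: the names KeyScope, issue_key, and lookup are hypothetical, and storing only a hash of the key is an assumption on my part.

```python
import hashlib
import secrets
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)  # frozen: the scope cannot be edited after the key is issued
class KeyScope:
    team: str
    system_prompt: str
    allowed_models: frozenset
    requests_per_minute: int


def _digest(raw_key: str) -> str:
    return hashlib.sha256(raw_key.encode("utf-8")).hexdigest()


def issue_key(scope: KeyScope, store: dict) -> str:
    """Create a key bound to `scope`. Assumption: only the key's hash is stored."""
    raw_key = "sk-" + secrets.token_urlsafe(32)
    store[_digest(raw_key)] = scope
    return raw_key  # returned to the caller once, never retrievable again


def lookup(raw_key: str, store: dict) -> Optional[KeyScope]:
    """Return the scope bound to a key, or None if the key is unknown."""
    return store.get(_digest(raw_key))
```

Because the scope is immutable, changing the use case means calling issue_key again with a new scope, which is the point where an approval step would sit.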

The five layers every request passes through

client app
   │
   ▼
┌─────────────────────────────────────────┐
│ 1. auth + use-case scoping              │  →  inject system prompt, check model allowlist
├─────────────────────────────────────────┤
│ 2. prompt DLP                           │  →  block / redact / alert
├─────────────────────────────────────────┤
│ 3. hybrid routing (on-prem vs cloud)    │  →  classification decides backend
├─────────────────────────────────────────┤
│ 4. immutable audit log                  │  →  PostgreSQL append-only, SHA-256 hashes
├─────────────────────────────────────────┤
│ 5. response DLP + anomaly detection     │  →  webhook alerts
└─────────────────────────────────────────┘
   │
   ▼
LLM backend (Ollama, vLLM, OpenAI, Anthropic, Azure, Google)

Layer 1 validates the key, injects the bound system prompt, and checks the model allowlist. An invalid key or unapproved model returns 403 immediately.
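A rough sketch of that check, assuming an in-memory key table. BoundKey, Forbidden, and authorize are hypothetical names, and dropping client-supplied system messages before injecting the bound prompt is my assumption rather than documented behaviour.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class BoundKey:
    team: str
    system_prompt: str
    allowed_models: frozenset


class Forbidden(Exception):
    """Maps to an HTTP 403 response."""
    status = 403


def authorize(api_key: str, body: dict, keys: dict) -> dict:
    """Validate the key and model, then return the body with the bound prompt injected."""
    bound = keys.get(api_key)
    if bound is None:
        raise Forbidden("invalid API key")
    if body.get("model") not in bound.allowed_models:
        raise Forbidden("model not approved for this key")
    # Assumption: client-supplied system messages are discarded so the bound prompt wins.
    user_messages = [m for m in body.get("messages", []) if m.get("role") != "system"]
    system = {"role": "system", "content": bound.system_prompt}
    return {**body, "messages": [system, *user_messages]}
```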

Layer 2 is a built-in regex DLP engine. Three outcomes per category: block (HTTP 400), redact (sanitise and forward), alert (log and forward). Patterns live in a config file you can hot-reload. No external service required — this matters if your data sovereignty rules say PII cannot leave your perimeter even for a scan.
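The three outcomes can be sketched in a few lines of Python. This is an illustration only: the Rule type, the scan function, and all three patterns (including the "project falcon" codename) are made up for the example and are much cruder than anything you would ship.

```python
import re
from dataclasses import dataclass


@dataclass(frozen=True)
class Rule:
    category: str
    pattern: re.Pattern
    action: str  # "block", "redact", or "alert"


# Hypothetical patterns, deliberately simple.
RULES = [
    Rule("credit_card", re.compile(r"\b\d{4}[ -]?\d{4}[ -]?\d{4}[ -]?\d{4}\b"), "block"),
    Rule("email", re.compile(r"\b[\w.+-]+@[\w-]+\.[\w.-]+\b"), "redact"),
    Rule("internal_codename", re.compile(r"\bproject\s+falcon\b", re.I), "alert"),
]


def scan(prompt: str):
    """Return (http_status, text_to_forward, matched_categories)."""
    matched = []
    for rule in RULES:
        if not rule.pattern.search(prompt):
            continue
        matched.append(rule.category)
        if rule.action == "block":
            return 400, None, matched  # reject, nothing is forwarded
        if rule.action == "redact":
            prompt = rule.pattern.sub(f"[REDACTED:{rule.category}]", prompt)
        # "alert": record the match and forward the text unchanged
    return 200, prompt, matched
```

For example, scan("mail alice@example.com now") forwards "mail [REDACTED:email] now", while a prompt containing a 16-digit card number is rejected with 400.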

Layer 3 routes by data classification. A key tagged sensitive may be routed only to on-premises backends (Ollama, vLLM). A key tagged non_sensitive can go to a cloud provider for higher capability. Consuming applications do not change: they always speak the OpenAI API.
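The routing rule reduces to a small lookup. In this sketch the backend identifiers and function names are hypothetical, and two behaviours are assumptions: non_sensitive keys may still use on-prem backends, and an unknown classification fails closed.

```python
ON_PREM = frozenset({"ollama", "vllm"})
CLOUD = frozenset({"openai", "anthropic", "azure", "google"})


def eligible_backends(classification: str) -> frozenset:
    """Backends a key with this classification may reach."""
    if classification == "sensitive":
        return ON_PREM
    if classification == "non_sensitive":
        return ON_PREM | CLOUD  # assumption: on-prem stays available
    # Assumption: fail closed rather than default to cloud.
    raise ValueError(f"unknown classification: {classification!r}")


def pick_backend(classification: str, requested: str) -> str:
    """Return the requested backend if the key's classification permits it."""
    if requested not in eligible_backends(classification):
        raise PermissionError(f"{requested} is not allowed for {classification} keys")
    return requested
```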

Layer 4 writes one row per request to PostgreSQL: timestamp, team, model, token count, latency, DLP outcome, HTTP status. Prompt and response are stored as SHA-256 hashes, never plaintext. That preserves forensic hash-matching while protecting staff privacy.
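Here is roughly what "hashes, never plaintext" means in practice, using Python's hashlib. The field names are hypothetical and the real schema may differ; the point is that an investigator holding a suspected prompt can confirm it was sent without the log ever containing the text.

```python
import hashlib
from datetime import datetime, timezone


def sha256_hex(text: str) -> str:
    return hashlib.sha256(text.encode("utf-8")).hexdigest()


def audit_row(team, model, prompt, response, tokens, latency_ms, dlp_outcome, status):
    """Build one audit record; prompt and response are stored only as digests."""
    return {
        "ts": datetime.now(timezone.utc).isoformat(),
        "team": team,
        "model": model,
        "token_count": tokens,
        "latency_ms": latency_ms,
        "dlp_outcome": dlp_outcome,
        "http_status": status,
        "prompt_sha256": sha256_hex(prompt),
        "response_sha256": sha256_hex(response),
    }


def matches_prompt(row: dict, suspected_prompt: str) -> bool:
    """Forensic check: was this exact text the prompt for this row?"""
    return row["prompt_sha256"] == sha256_hex(suspected_prompt)
```

Matching is exact: a single changed character gives a different digest, so this supports confirming a known text, not searching for similar ones.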

Layer 5 scans responses on the way back out and surfaces anomalies (usage spikes, repeated DLP blocks, off-hours bursts) via webhook.
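One of those signals, repeated DLP blocks, can be sketched as a sliding-window counter. The class name, the threshold of three blocks in sixty seconds, and the payload fields are all assumptions for illustration; the returned dict stands in for the body that would be POSTed to the webhook.

```python
from collections import defaultdict, deque


class BlockBurstDetector:
    """Flags a team that triggers `threshold` DLP blocks within `window_s` seconds."""

    def __init__(self, threshold: int = 3, window_s: float = 60.0):
        self.threshold = threshold
        self.window_s = window_s
        self._events = defaultdict(deque)  # team -> timestamps of recent blocks

    def record_block(self, team: str, now: float):
        """Record a block at time `now`; return an alert payload or None."""
        events = self._events[team]
        events.append(now)
        # Drop events that have fallen out of the window.
        while events and now - events[0] > self.window_s:
            events.popleft()
        if len(events) >= self.threshold:
            return {
                "type": "repeated_dlp_blocks",
                "team": team,
                "count": len(events),
                "window_s": self.window_s,
            }
        return None
```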

Quick start

git clone https://github.com/synapse-ai-gateway/synapse-ai-gateway
cd synapse-ai-gateway
docker compose up -d

Every setting has a working default, so that genuinely is the whole quick start for a local trial. Before exposing the stack beyond localhost, copy .env.example to .env and set real values for JWT_SECRET, ADMIN_PASSWORD, and POSTGRES_PASSWORD.
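If you need strong values for those three variables, Python's secrets module will generate them. This is a small helper sketch, not part of the project: render_env is a hypothetical name, and it only emits the three variables named above, so merge its output into your copy of .env.example rather than replacing the file.

```python
import secrets

SECRET_NAMES = ("JWT_SECRET", "ADMIN_PASSWORD", "POSTGRES_PASSWORD")


def render_env(names=SECRET_NAMES) -> str:
    """Return KEY=value lines with a fresh random URL-safe value per variable."""
    lines = [f"{name}={secrets.token_urlsafe(32)}" for name in names]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    print(render_env(), end="")
```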

The admin console is at http://localhost:5173. Log in, create a team in the UI, copy the API key (shown once), and you're ready:

curl -X POST http://localhost:8080/v1/chat/completions \
  -H "Authorization: Bearer <YOUR_TEAM_API_KEY>" \
  -H "Content-Type: application/json" \
  -d '{
    "model": "llama3.2:latest",
    "messages": [{"role": "user", "content": "hello"}]
  }'

It is OpenAI-API-compatible, so any OpenAI SDK works. Point base_url at http://localhost:8080/v1 and pass the team key. The backend is fully transparent to the client.
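The same call as the curl example, sketched in Python with only the standard library so it runs without extra packages. The function names are hypothetical, and reading choices[0].message.content assumes the gateway returns the standard OpenAI chat-completions response shape. With an OpenAI SDK you would set base_url and the API key on the client instead.

```python
import json
import urllib.request

BASE_URL = "http://localhost:8080/v1"


def build_chat_request(api_key: str, prompt: str, model: str = "llama3.2:latest"):
    """Build the POST request for /chat/completions without sending it."""
    body = {"model": model, "messages": [{"role": "user", "content": prompt}]}
    return urllib.request.Request(
        f"{BASE_URL}/chat/completions",
        data=json.dumps(body).encode("utf-8"),
        headers={
            "Authorization": f"Bearer {api_key}",
            "Content-Type": "application/json",
        },
        method="POST",
    )


def chat(api_key: str, prompt: str) -> str:
    """Send the request and return the assistant's reply text."""
    request = build_chat_request(api_key, prompt)
    with urllib.request.urlopen(request, timeout=30) as resp:
        payload = json.load(resp)
    return payload["choices"][0]["message"]["content"]
```

Calling chat("<YOUR_TEAM_API_KEY>", "hello") against a running stack should return the model's reply; a blocked or unauthorized request surfaces as urllib.error.HTTPError.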

The honest comparison with LiteLLM

LiteLLM is excellent. It is the right tool for a different problem: routing across 100+ providers with maximum flexibility, at scale, with a team that already has DevOps capacity. Its per-worker footprint is correspondingly larger — appropriate for the high-throughput case it is built for.

Synapse AI Gateway's full stack idles at ~113 MB (backend 73 MB + postgres 32 MB + frontend 8 MB) across three containers: postgres, backend, and admin console. There is no Redis, no message broker, and no Kubernetes required to get started. If you are just starting out and need a governance layer you can deploy in an afternoon, that footprint matters. If you are running millions of requests per day, it does not, and LiteLLM is the better choice.

The other meaningful difference is DLP. LiteLLM's DLP integrates with PromptGuard, Pangea, or Azure Content Safety — external services with their own pricing, accounts, and data flows. Synapse's DLP is built in. For an organisation whose data residency rules say PII does not leave the perimeter, "built in" is not a feature preference — it is a hard requirement.

What it is not

Worth saying clearly:

Not for high-scale teams. If you are routing millions of requests per day, use LiteLLM or Kong.
Not a replacement for an enterprise governance platform. No SOC 2 attestation, no SLAs, no commercial support.
Not magic. It will not make a poorly-designed AI rollout safe by itself. It gives you the controls; you still need a sensible policy on how to use them.

Production hardening

Specific facts, all verified in the repo:

Three containers totalling ~113 MB at idle (backend 73 MB + postgres 32 MB + frontend 8 MB)
97 tests, ~88% line coverage
GitHub Actions CI with Bandit (static security analysis) and Trivy (CVE and image scanning, with results reported to GitHub Security)
GDPR, HIPAA, and PCI-DSS policy packs — one-click apply with pre-configured DLP patterns
Per-team, per-model spend attribution with budget alerts
Append-only PostgreSQL audit schema
OpenAI-compatible REST API on /v1/chat/completions

The README.md has a deployment checklist for production: rotate every default secret, terminate TLS at a reverse proxy, use managed PostgreSQL, restrict CORS, review the DLP patterns for your jurisdiction.

Contributing

The repo is Apache-2.0. There is a CONTRIBUTING.md with DCO sign-off and a list of good first issues — DLP patterns for additional jurisdictions, additional backend adapters, a Helm chart. If there is a use case the current design does not cover, open an issue.

GitHub: synapse-ai-gateway/synapse-ai-gateway

If your organisation is staring at the gap between "ship AI now with no controls" and "wait eighteen months for the enterprise platform," this is meant for you.