When we launched Ajah two weeks ago,
261 developers cloned it in the first week.
The product worked. But it wasn't
production-ready for enterprise teams.
Today that changes.
Here's exactly what we shipped and why
each piece matters for teams running
LLMs in production.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
RATE LIMITING PER FEATURE
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
The problem: a single misconfigured
agent or a traffic spike on one feature
can exhaust your entire API budget before
anyone notices.
The fix: per-feature rate limiting using
a Redis sliding window counter.
Configure requests per minute from the
Settings page — no code changes needed.
When a feature exceeds its limit, the
gateway returns 429 before the request
ever reaches your LLM provider:
{
  "error": "rate limit exceeded",
  "feature": "chat",
  "limit": 60,
  "reset_in_seconds": 34
}
Response headers include X-RateLimit-Limit
and X-RateLimit-Reset for client-side
handling. One Redis INCR call per request —
sub-millisecond overhead.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
EMAIL ALERTS VIA SMTP
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
The problem: Slack webhooks reach
developers. They don't reach compliance
teams, finance teams, or anyone who
needs an audit trail.
The fix: SMTP email alerts alongside
existing Slack webhooks.
Configure once via the Settings API:
POST /settings
{
  "smtp_config": {
    "host": "smtp.gmail.com",
    "port": 587,
    "username": "alerts@yourcompany.com",
    "password": "your-app-password",
    "from": "alerts@yourcompany.com"
  }
}
Then set alert_email_to per feature.
Cost spikes and risk flags fire email
automatically — subject lines like:
[Ajah Alert] Cost spike — feature: chat
[Ajah Alert] Risk flag — feature: support-bot
Alerts are sent from fire-and-forget
goroutines, so email delivery never
blocks the hot path.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
PER-DEPENDENCY HEALTH CHECKS
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
The problem: {"status":"ok"} is useless
when your load balancer needs to know
which specific dependency is down at 2am.
The fix: /health now pings Redis,
PostgreSQL, and ClickHouse individually
with a 3-second timeout per dependency:
{
  "status": "ok",
  "version": "0.1.0",
  "dependencies": {
    "redis": {"status": "ok"},
    "postgres": {"status": "ok"},
    "clickhouse": {"status": "ok"}
  }
}
If any dependency is down, the response
returns HTTP 503 with the specific error:
{
  "status": "degraded",
  "dependencies": {
    "redis": {
      "status": "down",
      "error": "dial tcp: connection refused"
    }
  }
}
Your monitoring system, load balancer,
and on-call engineer know exactly what
to fix.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
GRAFANA DASHBOARD
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
The problem: we shipped 10 Prometheus
metrics two weeks ago. Nobody wants
to build 18 Grafana panels from scratch.
The fix: docs/grafana-dashboard.json
— one import, production dashboard.
18 panels across 5 sections, including:
Traffic
→ Requests per second by feature
→ Requests per second by provider
Latency
→ LLM p50 and p95 by provider
→ Scorer p50 and p95
Cost
→ Cost per hour by feature (USD)
→ Cost per hour by model (USD)
Quality and Safety
→ Hallucination risk gauges by feature
→ Claim density risk by feature
→ Narrative drift risk by feature
Warnings and PII
→ Warning rate by risk level
→ PII detection rate by feature
Import the JSON, point at your Prometheus
datasource, and you have a complete
LLM observability dashboard in under
60 seconds.
━━━━━━━━━━━━━━━━━━━━━━━━━━━━━
Ajah is open source, self-hosted,
MIT licensed.
No data leaves your server.
No vendor lock-in.
No acquisition risk.
→ github.com/VigneshReddy-afk/ajah
→ useajah.com