The problem we're solving
You have a stream of events—payment failures, API errors, timeouts. You might already have dashboards and counts. What's often missing is:
- A clear, repeatable rule that turns those numbers into an action: allow, review, or block. Without it, "what should we do?" stays a manual or ad-hoc call.
- A single place that both applies that rule and can optionally answer why in plain language. Often you either get a decision with no narrative, or you bolt on an LLM and risk using it for the decision itself—which makes behavior harder to reason about and audit.
So the problem is: deterministic decisions from event counts, plus optional human-readable explanation, without the LLM driving the outcome.
Existing solutions and our angle
Plenty of systems already do rule-based decisions and policy engines—feature flags, circuit breakers, fraud rules, and so on. We're not claiming to replace those. We wanted a small, self-contained service that:
- Ingests events and keeps rolling counters (e.g. 5m / 30m).
- Exposes a simple allow/review/block decision from those counters, with no external AI in that path.
- Optionally adds a short explanation of that decision. That's where we brought in an LLM—for fun. We wanted to see if we could get a readable "why" and "what to do next" without ever letting the model change the decision. So the core value is the rule engine and the API; the LLM is an extra layer we tried on top.
How we keep the LLM out of the decision
We split the system into two layers:
- Decision layer: Counts (e.g. failures in the last 5m and 30m) go into a fixed rule function. Same counts → same result. No LLM, no non-determinism. This is the part that actually "solves" the problem.
- Explanation layer: After the decision is fixed, we optionally send that decision plus the same evidence to an LLM and ask it only to explain it in plain language and suggest next steps. If the LLM is down, slow, or returns invalid data, we fall back to a simple rule-based explanation. The decision never changes because of the LLM.
So: rules solve the problem; the LLM is there for fun and readability.
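To make the split concrete, here is a minimal sketch of the explanation layer. All names (`explain`, `callLlm`, the `Evidence` shape) are illustrative, not the repo's actual API — the point is that the decision arrives already fixed, and any LLM failure drops to a deterministic fallback:

```typescript
type Decision = "allow" | "review" | "block";

interface Evidence {
  failures5m: number;
  failures30m: number;
  successes5m: number;
}

// Explanation layer: best-effort. The decision was computed by the rule
// engine before this function is called, and nothing here can change it.
async function explain(
  decision: Decision,
  evidence: Evidence,
  callLlm?: (decision: Decision, evidence: Evidence) => Promise<string>
): Promise<string> {
  if (callLlm) {
    try {
      // The model only sees the already-fixed decision plus the evidence.
      return await callLlm(decision, evidence);
    } catch {
      // Timeout, provider down, invalid response — fall through to fallback.
    }
  }
  // Rule-based fallback: always available, fully deterministic.
  return (
    `Decision "${decision}": ${evidence.failures5m} failures in 5m, ` +
    `${evidence.failures30m} in 30m, ${evidence.successes5m} successes in 5m.`
  );
}
```

Because the decision is an input rather than an output of this layer, turning the LLM off (or having it fail) changes only the wording, never the verdict.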
Architecture in a nutshell
- Events → stored in SQLite (subject, event_type, fingerprint, status_code, latency_ms, meta_json, at).
- Counters → for each subject we query the last 5m and 30m and aggregate by event_type (failure/success/latency).
- Decision → pure function of (failures_5m, failures_30m, successes_5m) → allow | review | block.
- Explain → we build an evidence payload (counters + recent events), hash it, and check the cache. On miss we call the LLM with (decision + evidence), validate the response with Zod, then cache. On LLM failure we return a rule-based fallback.
Stack and layout
Node 20+, Fastify (helmet, rate-limit, API-key auth), SQLite (better-sqlite3, WAL). Tables: events, subject_state, idempotency_keys, subject_explanations. Zod for request/response validation. The LLM is any OpenAI-compatible API (e.g. Groq); the app runs fine without one, in which case explain simply uses the rule-based fallback.
Routes: GET /v1/events (list), POST /v1/events (ingest), GET /v1/subjects/:subject/decision (public), GET /v1/subjects/:subject/explain (cached, LLM or fallback), plus profile and ask. Auth via Bearer or x-api-key; health and decision are public by default.
How decisions are computed
Counters come from queries like:
SELECT event_type, COUNT(*) as c
FROM events
WHERE subject = @subject AND at >= @since
GROUP BY event_type
We sum failure/success (and optionally latency) per window. The rule is something like:
failures_5m >= 5 or failures_30m >= 5 → block; failures_30m >= 3 → review; else → allow.
Thresholds live in code so you can tune or replace them.
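Expressed as code, the rule above is a small pure function — the thresholds match the post, and the function name is just illustrative:

```typescript
type Decision = "allow" | "review" | "block";

// Deterministic rule: same counters in, same decision out. No LLM anywhere
// in this path. Thresholds live here so they can be tuned or replaced.
function decide(failures5m: number, failures30m: number): Decision {
  if (failures5m >= 5 || failures30m >= 5) return "block";
  if (failures30m >= 3) return "review";
  return "allow";
}
```

Because it is a pure function of two integers, the whole decision surface can be unit-tested exhaustively around the threshold boundaries.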
How explain works (and why the API never depends on the LLM)
We build evidence (counters + last N events + a small breakdown), hash it with the decision, and look up the cache. On hit we return the stored explanation and set X-Explain-Cache: HIT. On miss we call the LLM with (decision + evidence), validate the JSON with Zod, then cache. If the LLM is unconfigured, times out, or fails validation, we return a rule-based explanation instead. So the API always returns 200 with some explanation; the LLM is best-effort and "for fun."
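A minimal sketch of the cache key, assuming a SHA-256 over the decision plus evidence — the repo's actual evidence payload is richer, and the helper name here is hypothetical:

```typescript
import { createHash } from "node:crypto";

// Hash the decision together with the evidence, so a changed decision or
// changed counters always produces a different key (and thus a cache miss).
function explainCacheKey(decision: string, evidence: unknown): string {
  return createHash("sha256")
    .update(JSON.stringify({ decision, evidence }))
    .digest("hex");
}

const k1 = explainCacheKey("review", { failures_5m: 1, failures_30m: 3 });
const k2 = explainCacheKey("review", { failures_5m: 1, failures_30m: 3 });
// Same decision + same evidence → same key → X-Explain-Cache: HIT.
```

Keying on decision + evidence (rather than just the subject) means a stale explanation can never be served for a subject whose situation has changed.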
How to run and use it
Setup:
git clone https://github.com/Sufian-Abu/event-intel-engine
cd event-intel-engine
npm install
cp .env.example .env
Set at least API_KEYS=your-key. Optionally: HOST, PORT, SQLITE_PATH, RATE_LIMIT_MAX. For the LLM explain path: LLM_PROVIDER, LLM_API_KEY, LLM_BASE_URL, LLM_MODEL. Without LLM config, explain uses the fallback.
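As a sketch, a .env might look like the following — every value is a placeholder, and the Groq base URL is just one example of an OpenAI-compatible endpoint:

```
API_KEYS=your-key
HOST=0.0.0.0
PORT=8080
SQLITE_PATH=./data.sqlite
RATE_LIMIT_MAX=100

# Optional — only needed for the LLM explain path
LLM_PROVIDER=groq
LLM_API_KEY=your-llm-key
LLM_BASE_URL=https://api.groq.com/openai/v1
LLM_MODEL=llama-3.1-8b-instant
```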
npm run dev
Or with Docker: docker compose up --build.
Ingest an event:
curl -X POST http://localhost:8080/v1/events \
-H "Authorization: Bearer YOUR_KEY" \
-H "Content-Type: application/json" \
-d '{"subject":"user-42","event_type":"failure","fingerprint":"charge.failed","status_code":402}'
Get decision (no auth):
curl http://localhost:8080/v1/subjects/user-42/decision
Get explanation (LLM or fallback):
curl "http://localhost:8080/v1/subjects/user-42/explain?window=50" \
-H "Authorization: Bearer YOUR_KEY"
In short
We wanted a small service that does rule-based allow/review/block from event counts and exposes it over a simple API. Existing solutions do that; our twist was adding an optional LLM explainer for fun—so we get a readable "why" without ever letting the model decide.
GitHub: github.com/Sufian-Abu/event-intel-engine
The README has full setup, env vars, and API details.