California just announced it'll start ticketing driverless cars that break traffic laws. That got me thinking — not about self-driving cars specifically, but about a problem I've hit on three different projects: how do you make an automated system respect a set of rules that change over time?
Whether you're building a CI/CD pipeline that enforces deployment policies, an API gateway with rate-limiting rules, or a workflow engine that needs to comply with business regulations, you eventually need a rule engine. And if you reach for a massive enterprise framework on day one, you'll regret it.
Here's how I build lightweight rule engines that actually hold up in production.
The Problem: Hardcoded Rules Rot Fast
Every project starts the same way. Someone says "just add an if-statement." So you do.
# This is fine... for now
def check_deployment(deploy_request):
    if deploy_request.target == "production" and not deploy_request.has_approval:
        return Denied("Production deploys require approval")
    if deploy_request.time.hour < 9 or deploy_request.time.hour > 17:
        return Denied("No deploys outside business hours")
    return Approved()
Then the rules multiply. Then someone wants to change them without a code deploy. Then different environments need different rules. Then someone asks for an audit log of which rules fired and why.
Now your neat little function is 200 lines of nested conditionals, and every change is a production risk.
The Core Pattern: Separate Rules From Execution
The fix isn't a framework — it's a pattern. You need three things:
- A rule definition format (data, not code)
- An evaluation engine (small, testable, deterministic)
- A result collector (for audit trails and debugging)
Here's the minimal version I keep coming back to:
from dataclasses import dataclass
from typing import Any
import operator

# Map string operators to actual functions
OPERATORS = {
    "eq": operator.eq,
    "ne": operator.ne,
    "gt": operator.gt,
    "lt": operator.lt,
    "gte": operator.ge,
    "lte": operator.le,
    "in": lambda val, collection: val in collection,
    "not_in": lambda val, collection: val not in collection,
    "contains": lambda collection, val: val in collection,
}

@dataclass
class Rule:
    name: str
    field: str               # dot-notation path into the context
    op: str                  # operator key from OPERATORS
    value: Any               # what we're comparing against
    message: str = ""        # human-readable explanation
    severity: str = "error"  # error, warning, info

@dataclass
class RuleResult:
    rule: Rule
    passed: bool
    actual_value: Any = None

def resolve_field(obj: dict, path: str) -> Any:
    """Navigate nested dicts with dot notation: 'deploy.target.env'"""
    current = obj
    for key in path.split("."):
        if isinstance(current, dict):
            current = current.get(key)
        else:
            return None
    return current

def evaluate(rules: list[Rule], context: dict) -> list[RuleResult]:
    results = []
    for rule in rules:
        actual = resolve_field(context, rule.field)
        op_func = OPERATORS.get(rule.op)
        if op_func is None:
            raise ValueError(f"Unknown operator: {rule.op}")
        try:
            passed = op_func(actual, rule.value)
        except TypeError:
            passed = False  # type mismatch = rule not satisfied
        results.append(RuleResult(rule=rule, passed=passed, actual_value=actual))
    return results
Nothing fancy. No DSL parser, no YAML templating language, no dependency injection. Just data in, results out.
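To sanity-check the evaluator, here's a quick standalone run. It repeats condensed versions of the definitions above so it executes on its own; the rule names and context are made up for illustration:

```python
from dataclasses import dataclass
from typing import Any
import operator

# Condensed repeat of the definitions above so this snippet runs standalone.
OPERATORS = {"eq": operator.eq, "lte": operator.le}

@dataclass
class Rule:
    name: str
    field: str
    op: str
    value: Any
    message: str = ""
    severity: str = "error"

@dataclass
class RuleResult:
    rule: Rule
    passed: bool
    actual_value: Any = None

def resolve_field(obj, path):
    current = obj
    for key in path.split("."):
        if isinstance(current, dict):
            current = current.get(key)
        else:
            return None
    return current

def evaluate(rules, context):
    results = []
    for rule in rules:
        actual = resolve_field(context, rule.field)
        try:
            passed = OPERATORS[rule.op](actual, rule.value)
        except TypeError:
            passed = False
        results.append(RuleResult(rule, passed, actual))
    return results

rules = [
    Rule("prod_needs_approval", "deploy.approved", "eq", True,
         "Production deploys require approval"),
    Rule("small_batch", "payload.items_count", "lte", 1000,
         "Batch size exceeds safe limit", severity="warning"),
]
context = {"deploy": {"approved": False}, "payload": {"items_count": 250}}

for r in evaluate(rules, context):
    print("PASS" if r.passed else "FAIL", r.rule.name, "actual =", r.actual_value)
# FAIL prod_needs_approval actual = False
# PASS small_batch actual = 250
```

Notice that a missing field simply resolves to None and fails the comparison, which is usually the behavior you want from a gate.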
Loading Rules From Config
The real power comes when rules live outside your code. I typically use JSON or YAML, loaded at startup or fetched from a config service.
import json

def load_rules(path: str) -> list[Rule]:
    with open(path) as f:
        raw = json.load(f)
    return [Rule(**r) for r in raw["rules"]]

# rules.json
# {
#   "rules": [
#     {
#       "name": "business_hours_only",
#       "field": "request.hour",
#       "op": "gte",
#       "value": 9,
#       "message": "Action not permitted outside business hours",
#       "severity": "error"
#     },
#     {
#       "name": "max_batch_size",
#       "field": "payload.items_count",
#       "op": "lte",
#       "value": 1000,
#       "message": "Batch size exceeds safe limit",
#       "severity": "warning"
#     }
#   ]
# }
Now your ops team can tweak compliance rules without touching application code. You can version the rule files in git, diff them in PRs, and roll them back independently.
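The same parsing works when rules arrive as a string from a config service rather than a file on disk. Here's a sketch against an inline JSON string, with the Rule dataclass repeated so the snippet runs standalone:

```python
import json
from dataclasses import dataclass
from typing import Any

# Repeated from above so this snippet runs standalone.
@dataclass
class Rule:
    name: str
    field: str
    op: str
    value: Any
    message: str = ""
    severity: str = "error"

RAW = """
{
  "rules": [
    {"name": "business_hours_only", "field": "request.hour", "op": "gte",
     "value": 9, "message": "Action not permitted outside business hours"},
    {"name": "max_batch_size", "field": "payload.items_count", "op": "lte",
     "value": 1000, "message": "Batch size exceeds safe limit",
     "severity": "warning"}
  ]
}
"""

def parse_rules(raw: str) -> list[Rule]:
    # Rule(**r) means unknown keys fail loudly at load time, not at evaluation time
    return [Rule(**r) for r in json.loads(raw)["rules"]]

rules = parse_rules(RAW)
print([(r.name, r.severity) for r in rules])
# [('business_hours_only', 'error'), ('max_batch_size', 'warning')]
```

A nice side effect of constructing dataclasses directly: a typo'd key in the config raises immediately at startup instead of silently producing a rule that never fires.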
Adding Rule Groups and Short-Circuit Logic
In practice, you'll want to group rules. Some groups should short-circuit (stop on first failure), others should collect all violations.
@dataclass
class RuleGroup:
    name: str
    rules: list[Rule]
    mode: str = "all"  # "all" = collect everything, "first_fail" = stop early

def evaluate_group(group: RuleGroup, context: dict) -> list[RuleResult]:
    results = []
    for rule in group.rules:
        actual = resolve_field(context, rule.field)
        op_func = OPERATORS.get(rule.op)
        if op_func is None:
            raise ValueError(f"Unknown operator: {rule.op}")
        try:
            passed = op_func(actual, rule.value)
        except TypeError:
            passed = False
        results.append(RuleResult(rule=rule, passed=passed, actual_value=actual))
        # bail early if this group uses short-circuit mode
        if not passed and group.mode == "first_fail":
            break
    return results

def evaluate_all_groups(groups: list[RuleGroup], context: dict) -> dict:
    return {
        group.name: evaluate_group(group, context)
        for group in groups
    }
This is the 80/20 point. You've got configurable rules, grouped evaluation, short-circuit logic, and a full audit trail of what passed and what didn't. For most projects, this is enough.
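To make the two modes concrete, here's a standalone run (definitions condensed from above) where the same failing context yields two results in "all" mode but only one in "first_fail" mode:

```python
from dataclasses import dataclass
from typing import Any
import operator

# Condensed from the definitions above so the snippet runs on its own.
OPERATORS = {"gte": operator.ge, "lte": operator.le}

@dataclass
class Rule:
    name: str
    field: str
    op: str
    value: Any
    message: str = ""
    severity: str = "error"

@dataclass
class RuleResult:
    rule: Rule
    passed: bool
    actual_value: Any = None

@dataclass
class RuleGroup:
    name: str
    rules: list
    mode: str = "all"

def resolve_field(obj, path):
    current = obj
    for key in path.split("."):
        if isinstance(current, dict):
            current = current.get(key)
        else:
            return None
    return current

def evaluate_group(group, context):
    results = []
    for rule in group.rules:
        actual = resolve_field(context, rule.field)
        try:
            passed = OPERATORS[rule.op](actual, rule.value)
        except TypeError:
            passed = False
        results.append(RuleResult(rule, passed, actual))
        if not passed and group.mode == "first_fail":
            break
    return results

rules = [
    Rule("business_hours_only", "request.hour", "gte", 9),
    Rule("max_batch_size", "payload.items_count", "lte", 1000),
]
# Both rules fail against this context.
context = {"request": {"hour": 2}, "payload": {"items_count": 5000}}

collected = evaluate_group(RuleGroup("audit", rules, mode="all"), context)
short = evaluate_group(RuleGroup("gate", rules, mode="first_fail"), context)
print(len(collected), len(short))
# 2 1
```

In practice I use "all" for anything feeding an audit trail and "first_fail" for hot-path gates where later rules are expensive to evaluate.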
When You Actually Need More
I've only outgrown this pattern twice in eight years. The signs you need something heavier:
- Rules reference other rules ("if rule A passed, skip rule B") — now you need a dependency graph
- Rules need temporal logic ("this value was X five minutes ago") — now you need state
- Non-technical users need to author rules — now you need a UI and probably a real DSL
If you hit those cases, look at existing open-source rule engines for your ecosystem before building one. Python has projects like business-rules. JavaScript has json-rules-engine. Go has grule-rule-engine. They handle the graph traversal and conflict resolution that you don't want to write yourself.
But don't start there. Start with the 50-line evaluator above and see how far it takes you.
Practical Tips From Production
A few things I learned the hard way:
- Always log the full context alongside results. When someone asks "why was this request denied at 2 AM last Tuesday," you want the exact input that was evaluated, not just the rule name that fired.
- Version your rule sets. Every time rules change, tag the version. Store the version alongside any decision the engine made. You'll need this for audits.
- Test rules like code. Write unit tests for your rule definitions. Feed them known contexts, assert expected outcomes. This catches typos in field names and logic inversions before production does.
- Set up dry-run mode from day one. Before enforcing a new rule, run it in shadow mode — evaluate but don't block. This has saved me from deploying overly aggressive rules more times than I want to admit.
def make_decision(groups: list[RuleGroup], context: dict, dry_run: bool = False):
    all_results = evaluate_all_groups(groups, context)
    failures = [
        r
        for results in all_results.values()
        for r in results
        if not r.passed and r.rule.severity == "error"
    ]
    decision = "deny" if failures and not dry_run else "allow"
    # always log regardless of mode; log_decision is whatever logging hook your app uses
    log_decision(context, all_results, decision, dry_run)
    return decision, all_results
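Here's a self-contained sketch of that shadow-mode behavior, with condensed definitions and an in-memory list standing in for the log_decision hook: the same failing context is allowed in dry-run mode and denied in enforcing mode.

```python
from dataclasses import dataclass
from typing import Any
import operator

# Condensed versions of the pieces defined earlier, so this runs standalone.
OPERATORS = {"gte": operator.ge}

@dataclass
class Rule:
    name: str
    field: str
    op: str
    value: Any
    message: str = ""
    severity: str = "error"

@dataclass
class RuleResult:
    rule: Rule
    passed: bool
    actual_value: Any = None

@dataclass
class RuleGroup:
    name: str
    rules: list
    mode: str = "all"

def resolve_field(obj, path):
    current = obj
    for key in path.split("."):
        if isinstance(current, dict):
            current = current.get(key)
        else:
            return None
    return current

def evaluate_group(group, context):
    results = []
    for rule in group.rules:
        actual = resolve_field(context, rule.field)
        try:
            passed = OPERATORS[rule.op](actual, rule.value)
        except TypeError:
            passed = False
        results.append(RuleResult(rule, passed, actual))
        if not passed and group.mode == "first_fail":
            break
    return results

def evaluate_all_groups(groups, context):
    return {g.name: evaluate_group(g, context) for g in groups}

decision_log = []  # stand-in for a real logging hook

def log_decision(context, results, decision, dry_run):
    decision_log.append({"decision": decision, "dry_run": dry_run})

def make_decision(groups, context, dry_run=False):
    all_results = evaluate_all_groups(groups, context)
    failures = [r for rs in all_results.values() for r in rs
                if not r.passed and r.rule.severity == "error"]
    decision = "deny" if failures and not dry_run else "allow"
    log_decision(context, all_results, decision, dry_run)
    return decision, all_results

groups = [RuleGroup("compliance", [
    Rule("business_hours_only", "request.hour", "gte", 9),
])]
late_night = {"request": {"hour": 2}}

# Shadow mode: the rule fails but nothing is blocked...
print(make_decision(groups, late_night, dry_run=True)[0])   # allow
# ...while enforcing mode denies the same request.
print(make_decision(groups, late_night, dry_run=False)[0])  # deny
```

Because log_decision runs in both modes, you can compare shadow-mode decision logs against production traffic before ever flipping the switch.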
Wrapping Up
The pattern here isn't specific to any domain. I've used it for deployment gates, invoice validation, content moderation filters, and API request policies. The shape is always the same: define rules as data, evaluate them against a context, collect the results.
Start with the simplest evaluator that does the job. Keep rules in version-controlled config files. Log everything. Add complexity only when the current system genuinely can't express what you need.
Fifty lines of code and a JSON file will get you surprisingly far.