If you're building anything on top of MCP (Model Context Protocol), you'll eventually hit this question: once an LLM decides to call a tool, who actually checks whether it's allowed to?
MCP's spec defines how tools are discovered and invoked — it says nothing about who's allowed to call what, or under which conditions. That's left entirely to whoever builds the host. Left unaddressed, the default is wide open: any role can call any tool with any arguments. Bolt on a quick fix and you usually end up with one of two patterns: auth logic scattered across each tool implementation, or a coarse role check that can't actually look at the arguments (so "agents can process refunds" has no way to express "but only up to $500").
Neither scales once you have more than a couple of roles and tools.
perso is a small Rust project I built to give this problem a real answer: a policy enforcement engine for MCP tool calls, compiled to a single portable WebAssembly binary.
The idea
You write your access rules as plain JSON — "agents can process refunds, but only up to $500," "managers can delete records they own," "this tool is blocked unless MFA is verified" — and perso compiles that into a .wasm binary. Drop that binary into any host (a backend server, an MCP server, an edge function, a CLI) and it answers one question, in microseconds, for every tool call: Allow or Deny.
The LLM never sees or touches the role token — the host owns that, extracted from its own session/JWT. perso just evaluates the call against the policy and hands back a decision plus a human-readable reason.
{
  "tool_name": "process_refund",
  "roles": ["agent"],
  "condition": {
    "NumericCheck": { "source": "Arguments", "field": "amount", "op": "Lte", "value": 500.0 }
  }
}
That one rule is enough to stop an agent role from approving an $800 refund, no matter how convincingly the LLM was talked into trying.
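To make the mechanics of that rule concrete, here is a minimal TypeScript sketch of how a NumericCheck could be evaluated against a call's arguments. It is not perso's implementation (the engine is written in Rust). The `checkNumeric` function, the `Decision` type and every operator other than `Lte` are assumptions made for illustration.

```typescript
// Hypothetical types modelled on the JSON rule above; perso's real schema may differ.
// Only "Lte" appears in the post; the other operators are assumed.
type NumericOp = "Lt" | "Lte" | "Gt" | "Gte" | "Eq";

interface NumericCheck {
  source: "Arguments";
  field: string;
  op: NumericOp;
  value: number;
}

interface Decision {
  decision: "Allow" | "Deny";
  reason: string;
}

const comparators: Record<NumericOp, (actual: number, limit: number) => boolean> = {
  Lt: (a, b) => a < b,
  Lte: (a, b) => a <= b,
  Gt: (a, b) => a > b,
  Gte: (a, b) => a >= b,
  Eq: (a, b) => a === b,
};

function checkNumeric(check: NumericCheck, args: Record<string, unknown>): Decision {
  const actual = args[check.field];
  // A missing or non-numeric argument fails closed instead of being coerced.
  if (typeof actual !== "number" || Number.isNaN(actual)) {
    return { decision: "Deny", reason: `${check.field} is missing or not a number` };
  }
  return comparators[check.op](actual, check.value)
    ? { decision: "Allow", reason: `${check.field}=${actual} satisfies ${check.op} ${check.value}` }
    : { decision: "Deny", reason: `${check.field}=${actual} fails ${check.op} ${check.value}` };
}

const refundCap: NumericCheck = { source: "Arguments", field: "amount", op: "Lte", value: 500.0 };

const small = checkNumeric(refundCap, { order_id: "ORD-8821", amount: 200 });
const large = checkNumeric(refundCap, { order_id: "ORD-8821", amount: 800 });
console.log(small.decision, large.decision);
```

The type guard is the important part: an argument such as `"amount": "800"` (a string) is rejected outright, so the LLM cannot slip past the cap by changing the argument's type.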
Conditions can check arguments, agent attributes, or resource attributes, and combine with All/Any/Not. The whole rule set gets pre-expanded at load time into a flat map, so every actual evaluation is an O(1) lookup plus a small condition check — no glob matching, no scanning, at request time.
Default action is Deny: anything not explicitly allowed gets rejected.
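The load-time expansion and default-deny lookup described above can be sketched roughly as follows. This is an illustration under assumptions, not perso's code: it assumes rules may use `*` wildcards in `tool_name`, that the host knows the full tool list at load time, and that the first matching rule wins. The names `buildIndex`, `evaluate` and `indexKey` are hypothetical, and conditions are left out to keep the focus on the lookup structure.

```typescript
// Hypothetical rule shape: tool_name may contain "*" wildcards (an assumption).
interface Rule {
  tool_name: string;
  roles: string[];
}

type Decision = { decision: "Allow" | "Deny"; reason: string };

// Key for the flat map; the NUL separator avoids collisions between role and tool names.
function indexKey(role: string, tool: string): string {
  return `${role}\u0000${tool}`;
}

// Convert a glob such as "crm_*" into an anchored RegExp. Used only at load time.
function globToRegExp(glob: string): RegExp {
  const escaped = glob.replace(/[.+?^${}()|[\]\\]/g, "\\$&").replace(/\*/g, ".*");
  return new RegExp(`^${escaped}$`);
}

// Expand every (role, tool) pair once, so evaluation never touches a glob.
function buildIndex(rules: Rule[], knownTools: string[]): Map<string, Rule> {
  const index = new Map<string, Rule>();
  for (const rule of rules) {
    const pattern = globToRegExp(rule.tool_name);
    for (const tool of knownTools.filter((t) => pattern.test(t))) {
      for (const role of rule.roles) {
        const key = indexKey(role, tool);
        if (!index.has(key)) index.set(key, rule); // first matching rule wins (assumption)
      }
    }
  }
  return index;
}

function evaluate(index: Map<string, Rule>, role: string, tool: string): Decision {
  const rule = index.get(indexKey(role, tool)); // a single hash lookup per call
  if (!rule) {
    // Default action: anything not explicitly allowed is rejected.
    return { decision: "Deny", reason: `no rule allows role "${role}" to call "${tool}"` };
  }
  return { decision: "Allow", reason: `matched rule "${rule.tool_name}"` };
}

const index = buildIndex(
  [
    { tool_name: "process_refund", roles: ["agent"] },
    { tool_name: "crm_*", roles: ["admin"] },
  ],
  ["process_refund", "crm_read", "crm_delete"],
);
console.log(evaluate(index, "agent", "crm_delete").decision);
```

The trade-off is memory for time: the map grows with roles times tools, but nothing at request time depends on how many rules or patterns the policy file contains.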
Seeing it work: perso-demo
Reading rules in a README only goes so far, so I built perso-demo — a small chat app where an LLM (Groq, llama-3.1-8b-instant) calls tools against a mock B2B CRM, and perso intercepts every single tool call intent before it executes.
You pick a role — agent, manager, or admin — and chat naturally:
"Process a $200 refund for order ORD-8821" → allowed, under the agent's $500 cap
"Try to process an $800 refund" → denied, NumericCheck fails
"Delete customer C-9001's record" as a manager who doesn't own it → denied, FieldEquals fails (user_id != owner_id)
"Run a bulk update" as admin without MFA → denied, the All condition needs both env: production and mfa_verified
Every decision shows up inline in the chat — green for allow, red for deny — with the exact reason from the policy engine. There's also a policy sidebar showing the raw rules and a live JSON panel, so you can watch a non-trivial RBAC + attribute-based policy enforced in real time without a single line of auth code inside the tool implementations themselves.
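For readers who want to see how the ownership and MFA denials above fall out of composable conditions, here is a small recursive evaluator. It is a sketch only. The `kind`-tagged condition shapes, the `Agent` and `Resource` source names and the `AttrIs` leaf are hypothetical; the post shows the JSON shape of NumericCheck but not of FieldEquals, All, Any or Not.

```typescript
// Hypothetical condition tree; only "Arguments" as a source name comes from the post.
type Source = "Arguments" | "Agent" | "Resource";

type Condition =
  | { kind: "FieldEquals"; left: [Source, string]; right: [Source, string] }
  | { kind: "AttrIs"; source: Source; field: string; value: unknown }
  | { kind: "All"; of: Condition[] }
  | { kind: "Any"; of: Condition[] }
  | { kind: "Not"; inner: Condition };

type Context = Record<Source, Record<string, unknown>>;

function holds(c: Condition, ctx: Context): boolean {
  switch (c.kind) {
    case "FieldEquals": {
      const left = ctx[c.left[0]][c.left[1]];
      const right = ctx[c.right[0]][c.right[1]];
      return left !== undefined && left === right; // missing fields never match
    }
    case "AttrIs":
      return ctx[c.source][c.field] === c.value;
    case "All":
      return c.of.every((child) => holds(child, ctx));
    case "Any":
      return c.of.some((child) => holds(child, ctx));
    case "Not":
      return !holds(c.inner, ctx);
    default:
      return false; // unknown condition kinds fail closed
  }
}

// "Managers can delete records they own": user_id must equal the record's owner_id.
const ownsRecord: Condition = {
  kind: "FieldEquals",
  left: ["Agent", "user_id"],
  right: ["Resource", "owner_id"],
};

// "Bulk update needs production and verified MFA".
const bulkUpdateGate: Condition = {
  kind: "All",
  of: [
    { kind: "AttrIs", source: "Agent", field: "env", value: "production" },
    { kind: "AttrIs", source: "Agent", field: "mfa_verified", value: true },
  ],
};

const notOwner: Context = {
  Arguments: { customer_id: "C-9001" },
  Agent: { user_id: "mgr-1" },
  Resource: { owner_id: "mgr-2" },
};
const adminWithoutMfa: Context = {
  Arguments: {},
  Agent: { env: "production", mfa_verified: false },
  Resource: {},
};

console.log(holds(ownsRecord, notOwner), holds(bulkUpdateGate, adminWithoutMfa));
```

Both checks evaluate to false here, which under a default-deny policy means the tool call is rejected.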
The SDK that wires it together
The demo's backend doesn't talk to the raw WASM ABI directly — it goes through @teknokeras/perso-sdk, the official Node.js SDK for perso.
The raw WASM exports are just four C-style functions (alloc, dealloc, init, evaluate) that move length-prefixed JSON across the WASM memory boundary. The SDK wraps that into a clean async API:
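As a rough picture of what "length-prefixed JSON" means at that boundary, the sketch below frames a request and reads it back from a byte buffer standing in for WASM linear memory. The 4-byte little-endian prefix is an assumption; perso's actual ABI may use a different prefix width or byte order. `encodeFrame` and `decodeFrame` are hypothetical helper names, not SDK functions.

```typescript
// Assumption: a 4-byte little-endian length prefix followed by UTF-8 JSON.
const PREFIX_BYTES = 4;

function encodeFrame(value: unknown): Uint8Array {
  const body = new TextEncoder().encode(JSON.stringify(value));
  const frame = new Uint8Array(PREFIX_BYTES + body.length);
  new DataView(frame.buffer).setUint32(0, body.length, true); // true = little-endian
  frame.set(body, PREFIX_BYTES);
  return frame;
}

function decodeFrame(memory: Uint8Array, ptr: number): unknown {
  const view = new DataView(memory.buffer, memory.byteOffset, memory.byteLength);
  const length = view.getUint32(ptr, true);
  const start = ptr + PREFIX_BYTES;
  // Never trust a length read from the other side of the boundary.
  if (start + length > memory.length) throw new RangeError("frame exceeds memory bounds");
  return JSON.parse(new TextDecoder().decode(memory.subarray(start, start + length)));
}

// A plain buffer stands in for linear memory. A real host would obtain `ptr`
// from the module's alloc export and release it with dealloc afterwards.
const memory = new Uint8Array(256);
const ptr = 16;
memory.set(encodeFrame({ tool: "process_refund", args: { amount: 800 } }), ptr);
const roundTripped = decodeFrame(memory, ptr);
console.log(roundTripped);
```

A host would follow the same pattern in both directions: write the framed request into memory it allocated, call evaluate, then decode the framed decision at the pointer it gets back.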
import { Perso } from '@teknokeras/perso-sdk'

const perso = await Perso.load('path/to/perso.wasm', {
  policy: 'path/to/policy.json',
})

const decision = await perso.evaluate({
  tool: 'process_refund',
  args: { order_id: 'ORD-8821', amount: 800 },
  role: 'agent',
  agentAttributes: { user_id: 'agt-099', env: 'production' },
})
// { decision: 'Deny', reason: '...' }
It also adds structured audit logging on top, with pluggable transports (consoleTransport, httpTransport, fileTransport, or your own), so every decision can optionally be shipped somewhere durable for later review — useful when you need to show why an agent did or didn't do something, not just trust its own account of it.
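To illustrate the "or your own" option, here is one way a pluggable transport can be modelled. The `AuditEvent` and `Transport` shapes and the `memoryTransport`, `denialsOnly` and `audit` helpers are hypothetical and written for this post. They are not the SDK's actual interface, which should be checked in the SDK's own documentation.

```typescript
// Hypothetical shapes for illustration; the SDK's real transport interface may differ.
interface AuditEvent {
  timestamp: string;
  tool: string;
  role: string;
  decision: "Allow" | "Deny";
  reason: string;
}

type Transport = (event: AuditEvent) => void;

// A custom transport that keeps events in memory, for example in tests.
function memoryTransport(sink: AuditEvent[]): Transport {
  return (event) => {
    sink.push(event);
  };
}

// A wrapper that forwards only denials to another transport.
function denialsOnly(inner: Transport): Transport {
  return (event) => {
    if (event.decision === "Deny") inner(event);
  };
}

// Fan one event out to every transport. A failing transport must not block
// the others or affect the decision that was already made.
function audit(transports: Transport[], event: AuditEvent): void {
  for (const send of transports) {
    try {
      send(event);
    } catch (err) {
      console.error("audit transport failed", err);
    }
  }
}

const everything: AuditEvent[] = [];
const denials: AuditEvent[] = [];
const transports = [memoryTransport(everything), denialsOnly(memoryTransport(denials))];

const base = { timestamp: new Date().toISOString(), tool: "process_refund", role: "agent" };
audit(transports, { ...base, decision: "Allow", reason: "amount within cap" });
audit(transports, { ...base, decision: "Deny", reason: "amount exceeds cap" });
console.log(everything.length, denials.length);
```

Keeping transports as plain functions makes them easy to compose: filtering, batching or redaction can each be a wrapper around whichever sink is durable.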
This is exactly the SDK perso-demo's backend uses: one shared Perso instance, loaded once at startup, sitting in front of every tool call before it reaches the mock CRM.
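The "one instance in front of every tool call" arrangement can be sketched as a dispatcher that consults the policy before any handler runs. This is an assumption-laden illustration, not perso-demo's code: `makeDispatcher`, `Session` and `stubEvaluate` are hypothetical, and the stub is synchronous for brevity, whereas the SDK's `evaluate` shown earlier is awaited.

```typescript
type Decision = { decision: "Allow" | "Deny"; reason: string };

interface ToolCall {
  tool: string;
  args: Record<string, unknown>;
}

// Comes from the host's own session or JWT handling, never from the LLM.
interface Session {
  role: string;
  userId: string;
}

interface EvaluateRequest extends ToolCall {
  role: string;
  agentAttributes: Record<string, unknown>;
}

type Evaluate = (request: EvaluateRequest) => Decision;
type Handler = (args: Record<string, unknown>) => unknown;

function makeDispatcher(evaluate: Evaluate, handlers: Record<string, Handler>) {
  return (session: Session, call: ToolCall) => {
    // The role is taken from the session, so a "role" smuggled into args has no effect.
    const verdict = evaluate({
      tool: call.tool,
      args: call.args,
      role: session.role,
      agentAttributes: { user_id: session.userId },
    });
    if (verdict.decision !== "Allow") return { ok: false as const, reason: verdict.reason };
    const handler = handlers[call.tool];
    if (!handler) return { ok: false as const, reason: `unknown tool "${call.tool}"` };
    return { ok: true as const, result: handler(call.args) };
  };
}

// Stand-in for the policy engine: agents may refund up to 500.
const stubEvaluate: Evaluate = (request) => {
  const amount = request.args["amount"];
  const allowed =
    request.tool === "process_refund" &&
    request.role === "agent" &&
    typeof amount === "number" &&
    amount <= 500;
  return allowed
    ? { decision: "Allow", reason: "within agent refund cap" }
    : { decision: "Deny", reason: "not allowed by policy" };
};

let executed = 0;
const dispatch = makeDispatcher(stubEvaluate, {
  process_refund: (args) => {
    executed += 1;
    return { refunded: args["amount"] };
  },
});

const session: Session = { role: "agent", userId: "agt-099" };
const allowed = dispatch(session, { tool: "process_refund", args: { amount: 200 } });
const denied = dispatch(session, { tool: "process_refund", args: { amount: 800, role: "admin" } });
console.log(allowed.ok, denied.ok, executed);
```

Because the check lives in the dispatcher, the handlers contain no authorization code, and a denied call never reaches them.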
Why I think this matters
A policy layer like this doesn't make an agent safe by itself — it doesn't stop an LLM from being manipulated into wanting to do something harmful. What it does is bound the blast radius once that happens: a hijacked agent still can't call bulk_update without env == production and mfa_verified, no matter what the prompt convinced it to attempt. Default-deny means anything the policy doesn't explicitly cover fails closed. It's one control among several you'd want in a production agentic system — alongside input validation, model-level guardrails, and monitoring — and it's still a control most MCP integrations don't have in place yet, even where general-purpose authorization tools (like OPA) could in principle be adapted to cover it.
Try it
Engine: github.com/teknokeras/perso — Rust workspace, cargo build, compiles to wasm32-unknown-unknown
Demo: github.com/teknokeras/perso-demo — pnpm install && pnpm dev, needs a free Groq API key
Node SDK: github.com/teknokeras/perso-sdk-node — npm install @teknokeras/perso-sdk
Happy to answer questions on the policy model, the WASM ABI, or how to embed perso in a non-Node host (it works the same way in Rust, Python, Go — anything with a WASM runtime).