I needed to run AI agent workflows locally, in TypeScript, with real permission
enforcement. I spent time with the obvious options — n8n, OpenClaw, Hermes —
and none of them quite fit. So I built skelm.
This post is about what specifically didn't fit, what skelm does differently,
and one concrete example of why the difference matters.
## What the existing tools get wrong (for my use case)
### n8n
n8n is a genuinely great tool. But it's built around visual flows and a
node-graph model. I want to write real .ts modules, use the full TypeScript
type system, run vitest, and check my workflows into git like any other
code. I don't want a JSON config I can't diff cleanly or a canvas I need
a browser to edit.
### OpenClaw
OpenClaw is excellent for personal assistant use cases — chat routing,
multi-channel, agent conversations. But it's fundamentally a messaging
framework. I needed pipeline primitives: branch, parallel, forEach,
loop, wait. Composable, testable, deterministic. OpenClaw isn't built
for that and doesn't pretend to be.
### Hermes and most LLM orchestration frameworks
The permission model is advisory. You tell the model what it should do
in the system prompt. The model tries to comply. This works until it doesn't
— a new tool capability ships, the model decides the task requires it anyway,
or a crafted input nudges it past the instruction. Advisory permissions are
not a security boundary. They're a politeness request.
## What skelm does differently
### 1. Real TypeScript, not a DSL
Workflows are .ts files. No YAML, no JSON config, no visual canvas. The
full type system, full test runner, full IDE support:
```ts
import { z } from 'zod'
// Import path assumed; adjust to however skelm exposes its step builders.
import { agent, branch, code, parallel, pipeline } from 'skelm'

export default pipeline({
  id: 'triage-incident',
  input: z.object({
    incidentId: z.string(),
    severity: z.enum(['critical', 'high', 'low']),
  }),
  steps: [
    branch({
      id: 'severity-gate',
      on: (ctx) => (ctx.input as { severity: string }).severity,
      cases: {
        critical: parallel({
          id: 'triage',
          steps: [
            code({ id: 'search-issues', run: searchGitHub }),
            code({ id: 'create-channel', run: createSlackChannel }),
          ],
        }),
        high: code({ id: 'notify-oncall', run: notifyOncall }),
        low: code({ id: 'acknowledge', run: sendAck }),
      },
    }),
    agent({
      id: 'root-cause',
      backend: 'codex',
      prompt: (ctx) => buildRcaPrompt(ctx),
      permissions: {
        fsRead: ['./runbooks'],
        fsWrite: [],
        networkEgress: 'deny',
      },
    }),
  ],
})
```
That's a real, runnable workflow:

```sh
skelm run triage-incident.workflow.ts \
  --input '{"incidentId":"INC-001","severity":"critical"}'
```

No gateway required for local runs. No browser.
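Because steps are plain functions over a context object, they test like ordinary code. A minimal sketch of the pattern (the handler and context types here are hypothetical, not skelm's actual types); in practice the assertions would live in a vitest file:

```typescript
// Hypothetical step handler: a plain async function over a typed context.
// Nothing here depends on a workflow runtime, so it unit-tests directly.
type TriageCtx = { input: { severity: 'critical' | 'high' | 'low' } }

export async function routeSeverity(ctx: TriageCtx): Promise<string> {
  switch (ctx.input.severity) {
    case 'critical':
      return 'page-oncall'
    case 'high':
      return 'notify-oncall'
    default:
      return 'acknowledge'
  }
}
```

In a vitest suite that is just `expect(await routeSeverity({ input: { severity: 'critical' } })).toBe('page-oncall')`: no browser, no canvas export.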
### 2. Structural permissions, not advisory ones
This is the core design decision that everything else follows from.
In skelm, permissions are enforced at the boundary before the backend is
called — not in the context window. Here's what that looks like:
```ts
agent({
  id: 'fix-bug',
  backend: 'codex',
  prompt: (ctx) => `Fix: ${(ctx.input as { test: string }).test}`,
  permissions: {
    fsRead: ['./src', './test'],
    fsWrite: ['./src'], // only src — tests and config stay read-only
    allowedExecutables: ['node', 'npm'],
    networkEgress: 'deny',
    allowedMcpServers: [],
    allowedSkills: [],
  },
})
```
These aren't hints. They're enforced at three layers:

1. **Pre-run mapper** — before any SDK call, skelm translates the policy into backend-specific options and refuses unsafe combinations. `fsWrite: ['*']` with an approval policy set throws `CodexPermissionError` immediately; skelm refuses to silently escalate to the `danger-full-access` sandbox mode.
2. **Backend in-process enforcement** — Codex's sandbox, Pi's permission enforcer, and skelm's own `@skelm/agent` backend all enforce natively in-process. skelm's job is to ensure the translation from declared permissions to backend options is complete and doesn't widen anything.
3. **Gateway egress proxy** — the gateway runs an embedded CONNECT proxy. `HTTP_PROXY` is merged into subprocess environments so outbound TCP is intercepted. `allowHosts` is enforced at the TCP level, not in the model's context window.
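The first layer's refusal logic can be sketched roughly like this. The policy and option shapes below are my guesses, not skelm's real types; only `CodexPermissionError` and the `fsWrite: ['*']` rule come from the description above:

```typescript
// Sketch of a pre-run permission-mapper guard (types are illustrative).
class CodexPermissionError extends Error {}

type Policy = {
  fsWrite: string[]
  approvalPolicy?: 'untrusted' | 'on-request' | 'never'
}

function mapFsWrite(policy: Policy): { writableRoots: string[] } {
  // A blanket write grant combined with an approval policy is an unsafe
  // combination: refuse up front instead of silently escalating the sandbox.
  if (policy.fsWrite.includes('*') && policy.approvalPolicy !== undefined) {
    throw new CodexPermissionError(
      "fsWrite: ['*'] with an approval policy set is refused",
    )
  }
  return { writableRoots: policy.fsWrite }
}
```

The point is that the refusal happens before the backend SDK is ever invoked, so a bad policy fails the run instead of widening it.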
### 3. A concrete example of why this matters
When I wired up the Codex backend I set networkEgress: 'deny' and mapped
it to networkAccessEnabled: false. That blocks sandbox-shell network
calls — curl, wget, etc.
But Codex also has a built-in web_search tool that runs inside the Codex
process, not through the sandbox shell. networkAccessEnabled: false does
not disable it. My deny was covering one surface and missing another.
Here's the fix in the permission mapper:

```ts
// Codex has two distinct network surfaces:
//   1. networkAccessEnabled — sandbox-shell egress (curl, wget, etc.)
//   2. webSearchMode — the model's built-in tool, runs in-process,
//      NOT interceptable by the gateway proxy
const networkAccessEnabled = policy.networkEgress !== 'deny'

// web_search is enabled ONLY on an explicit blanket 'allow'.
// { allowHosts: ['api.example.com'] } also disables it — the proxy can
// enforce allowHosts for shell TCP, but not for in-process search.
const webSearchAllowed = policy.networkEgress === 'allow'
const webSearchEnabled = webSearchAllowed
const webSearchMode: WebSearchMode = webSearchAllowed ? 'live' : 'disabled'
```
This is exactly the kind of gap advisory permissions can't cover. A system
prompt that says "don't access the network" doesn't know web_search exists.
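To make the two surfaces explicit, here is the same mapping factored into a standalone function. The `Egress` union is my shorthand for the policy values mentioned in the post, not skelm's actual type:

```typescript
// The two Codex network surfaces, derived from one declared egress policy.
type Egress = 'allow' | 'deny' | { allowHosts: string[] }

function mapNetwork(networkEgress: Egress) {
  // Sandbox-shell egress (curl, wget) is off only on an explicit 'deny'.
  const networkAccessEnabled = networkEgress !== 'deny'
  // In-process web_search stays off unless egress is a blanket 'allow':
  // the proxy can scope allowHosts for shell TCP, but not for web_search.
  const webSearchMode = networkEgress === 'allow' ? 'live' : 'disabled'
  return { networkAccessEnabled, webSearchMode }
}
```

So `{ allowHosts: [...] }` yields shell egress on (scoped by the proxy) with `web_search` disabled, which closes the second surface.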
### 4. Multi-backend, pluggable, not locked in
I wanted to swap backends without rewriting workflows. skelm supports:
| Backend | What it is |
|---|---|
| `@skelm/agent` | First-party, any OpenAI-compatible endpoint, in-process enforcement |
| `@skelm/codex` | OpenAI Codex via `@openai/codex-sdk` — sandbox-aware, MCP, streaming |
| `@skelm/pi` | Pi coding agent |
| `@skelm/opencode` | Opencode coding agent (native + ACP) |
| `@skelm/vercel-ai` | Vercel AI SDK |
| Built-in ACP | Copilot, Claude Code, Gemini CLI |
The same agent() step works across all of them. Swap backend: 'pi' for
backend: 'codex' and the permission model, skill injection, and streaming
all carry through.
### 5. Human-in-the-loop as a first-class primitive

`wait()` pauses a run until a caller resumes it via HTTP:
```ts
wait({
  id: 'human-review',
  message: 'Review this expense and approve or reject it.',
  output: z.object({
    decision: z.enum(['approve', 'reject']),
    comments: z.string().optional(),
  }),
})
```
```sh
# Resume from anywhere
curl -X POST http://localhost:14738/runs/<runId>/resume \
  -d '{"output":{"decision":"approve","comments":"Looks good"}}'
```
Not a webhook integration, not a plugin. A first-class step kind.
### 6. Tamper-evident audit
Every permission decision, tool call, secret access, and approval is written
to a hash-chained audit journal. You can verify the chain hasn't been altered:
```sh
skelm audit query --verify
```
This matters for compliance use cases where you need to prove what the agent
was actually permitted to do, not just what it was told.
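The verification idea is standard: each journal entry commits to the previous entry's hash, so editing or reordering any record breaks every hash after it. A minimal sketch of the technique (not skelm's actual journal format):

```typescript
import { createHash } from 'node:crypto'

type JournalEntry = { prevHash: string; payload: string; hash: string }

const GENESIS = '0'.repeat(64)

function entryHash(prevHash: string, payload: string): string {
  return createHash('sha256').update(`${prevHash}\n${payload}`).digest('hex')
}

// Append a record that commits to the previous entry's hash.
function append(chain: JournalEntry[], payload: string): JournalEntry[] {
  const prevHash = chain.length ? chain[chain.length - 1].hash : GENESIS
  return [...chain, { prevHash, payload, hash: entryHash(prevHash, payload) }]
}

// Recompute every hash; any edited or reordered record fails verification.
function verifyChain(chain: JournalEntry[]): boolean {
  let prev = GENESIS
  for (const entry of chain) {
    if (entry.prevHash !== prev) return false
    if (entry.hash !== entryHash(entry.prevHash, entry.payload)) return false
    prev = entry.hash
  }
  return true
}
```

Verification only proves the journal is internally consistent; pairing it with an externally stored head hash is what makes silent truncation detectable too.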
## What skelm is not
It's not a finished product. Pre-v1, APIs unstable, some rough edges. The
permission model is solid and the core step primitives work. The ecosystem
around it — integrations, docs, tooling — is still growing.
It's also not trying to replace n8n for visual workflow automation, or
OpenClaw for conversational agents. It occupies a different space: you want
to write code, you care about what your agents are actually permitted to
do, and you want that enforced structurally.
## Getting started
```sh
npm install -g skelm
skelm init my-project && cd my-project && npm install
skelm run workflows/hello.workflow.ts --input '{"name":"world"}'
```
Full docs and source: github.com/scottgl9/skelm
If you've hit the same wall with n8n, OpenClaw, Hermes, or any other
orchestration framework — I'd genuinely like to hear what the gap was for
you. Happy to answer questions about the permission model or any of the
backend integrations in the comments.