DEV Community

Cover image for Put a hard stop in front of your CrewAI crew's tool calls
Brian Hall
Brian Hall

Posted on

Put a hard stop in front of your CrewAI crew's tool calls

CrewAI makes it easy to stand up a crew. You give a few agents roles, hand them tools, let them delegate work to each other, and the thing mostly runs itself. That autonomy is the appeal. It's also the problem. Once a crew is moving, every agent in it can reach for a tool, and there's nothing between the model deciding to call something and the call actually happening.

The usual fix is a careful prompt and crossed fingers. Or a second LLM that "reviews" the action, which is hoping with extra latency. I wanted a check that doesn't depend on a model being in a good mood: something deterministic that runs before the tool call fires and gives a real answer. Allow it, hold it for a human, or block it.

That's what Faramesh does. It's open source, it works with CrewAI through a one-line wrapper, and this is the actual end-to-end setup, every command and every policy snippet pulled straight from how the tool really works.

The idea

A tool call is the moment an agent stops reasoning and starts doing. Reading a doc is one thing. Spending money, sending mail to a customer, or hitting a production API is another. Those are the moments worth putting a rule in front of.

Faramesh runs as a local daemon. Your whole policy lives in one file, governance.fms, and the daemon checks every tool call against it before the call runs. No LLM sits in that decision path, so the same action under the same policy always gets the same verdict. You get one of three:

  • permit the call runs
  • defer the call pauses and waits for a human to approve or reject
  • deny the call is blocked before it happens

The point is that it's deterministic. You can read the policy, reason about it, and know what it'll do. That's the whole difference from asking a second model to babysit the first one.

Install

Install the CLI:

curl -fsSL https://raw.githubusercontent.com/faramesh/faramesh-core/main/install.sh | bash
faramesh --version
Enter fullscreen mode Exit fullscreen mode

Then add the SDK to your CrewAI project:

pip install faramesh-sdk crewai
Enter fullscreen mode Exit fullscreen mode

Generate the policy

From the root of your project:

faramesh init
Enter fullscreen mode Exit fullscreen mode

Faramesh inspects the repo, finds your framework, discovers your tools, and writes a starter governance.fms. The important part of the default: every discovered tool starts at defer. Nothing runs until you've reviewed it. That's the safe direction to fail.

Wire the crew

This is the only step that's CrewAI-specific, and it's small. You wrap each agent's tools in a GovernedToolSet and give that set an identity. Here's a crew before:

from crewai import Agent, Crew, Task
from crewai_tools import SerperDevTool, BraveSearchTool

researcher = Agent(
    role="researcher",
    tools=[SerperDevTool(), BraveSearchTool()],
)
writer = Agent(role="writer", tools=[])

crew = Crew(agents=[researcher, writer], tasks=[...])
Enter fullscreen mode Exit fullscreen mode

And after:

from faramesh import GovernedToolSet
from crewai import Agent, Crew, Task

researcher_tools = GovernedToolSet(
    [SerperDevTool(), BraveSearchTool()],
    agent_id="research-crew/researcher",
)

researcher = Agent(role="researcher", tools=researcher_tools)
writer     = Agent(role="writer",     tools=[])

crew = Crew(agents=[researcher, writer], tasks=[...])
Enter fullscreen mode Exit fullscreen mode

That's the whole integration. Use one GovernedToolSet per agent so each crew member gets its own identity in the policy. That's what lets you give the researcher and the writer different rules, which matters more than it sounds like, since in a crew the agents have genuinely different jobs and should have genuinely different permissions.

Write the rules, per role

Open governance.fms. Because each agent has its own id, you write a policy block per role. Here the researcher can search but nothing else, and the writer can't touch tools at all:

import "github.com/faramesh/faramesh-registry/frameworks/crewai@1.0.0"

agent "research-crew/researcher" {
  default deny

  rules {
    permit serper_search
    permit brave_search
  }

  rate_limit "*_search": 30 per minute

  budget daily {
    max       $20
    on_exceed defer
  }
}

agent "research-crew/writer" {
  default deny
  rules { }
}
Enter fullscreen mode Exit fullscreen mode

A few things worth reading off that:

default deny means anything not explicitly allowed gets blocked. You opt tools in, you don't opt them out. Rules are checked top to bottom and the first match wins.

The rate_limit line caps both search tools at 30 calls a minute, so a confused agent can't hammer an API in a loop. The budget block puts a daily ceiling on spend and, when it's hit, defers instead of denying, the work pauses for a human rather than just dying. The writer's empty rule block plus default deny means it has no tool access at all, which is exactly right for an agent whose job is to write, not act.

Validate before you ship anything:

faramesh check
faramesh plan
Enter fullscreen mode Exit fullscreen mode

check parses and type-checks the file. plan prints the exact decision diff, so you can see what changes before it's live.

Apply and run

Turn on enforcement and run the crew normally:

faramesh apply
python my_crew.py
Enter fullscreen mode Exit fullscreen mode

A permit returns the tool result like nothing's there. A defer returns a structured response telling the agent its action is pending approval, the crew doesn't crash, the call just doesn't go through yet. You watch and clear the queue from another terminal:

faramesh approvals list
faramesh approvals approve apr-9001
Enter fullscreen mode Exit fullscreen mode

Once approved, the agent's next attempt goes through. Or, if you've decided it should always be allowed, promote the rule to permit in the file and run faramesh apply again. One thing to know: apply is the only way to change the running policy. There's no quiet hot-reload where editing a file changes what your crew can do mid-run. You edit, you apply. It's deliberate on purpose.

Crews delegate, so the policy understands delegation

The thing that makes CrewAI CrewAI is agents handing work to each other. Faramesh models that directly. If your researcher delegates to your writer, you can bound what that delegation is allowed to carry:

agent "research-crew/researcher" {
  delegate {
    target_agent = "research-crew/writer"
    scope        = "read-only"
    ttl          = "5m"
  }
}
Enter fullscreen mode Exit fullscreen mode

The daemon validates delegation against the crew's actual structure at runtime, so one agent can't quietly hand another a capability it wasn't granted. That's a failure mode specific to multi-agent setups, and it's nice to have it covered in the same file as everything else.

Why bother

Every decision the daemon makes also lands in a tamper-evident log you can verify offline with faramesh audit verify. That matters the day someone asks what your crew actually did and "I think the prompt told it not to" isn't a good enough answer.

None of this makes your agents smarter. It means the moments that carry real risk go through a deterministic rule you wrote and can read, instead of through luck. For a single agent that's useful. For a crew, where several agents are acting and delegating at once, it's the difference between a demo and something you'd leave running.

Faramesh is open source. The repo is at github.com/faramesh/faramesh-core if you want to poke around or break it. If you wire it into a crew and something's off, tell me. That's all super helpful feedback at this point.

Top comments (0)