If you are building multi-agent systems in Python, CrewAI is one of the biggest frameworks you need to know.
And if your CrewAI workloads are starting to get expensive, the simplest way to control that spend is to put an LLM gateway in front of them instead of wiring every agent directly to one provider.
In this article, I’ll explain what CrewAI is, why it got so popular, and how to use it with Lynkr so your agents can run with better model routing, caching, and lower cost.
I built Lynkr, so that part comes with the obvious founder disclosure. Still, CrewAI is worth understanding on its own because it has become one of the main entry points for people building agent systems in Python.
What is CrewAI?
CrewAI is an open-source Python framework for orchestrating multiple AI agents.
At the time of writing, the GitHub repo has 53k stars.
The project describes itself as a:
Fast and Flexible Multi-Agent Automation Framework
Its core idea is simple:
- define agents with roles and goals
- define tasks
- decide how they collaborate
- run them as a system instead of a single prompt chain
That is the mental model behind the name CrewAI: not one agent, but a crew of specialized agents working together.
Why CrewAI matters
A lot of agent demos are still just one prompt plus one tool call.
CrewAI matters because it pushes people toward more structured systems:
- researcher agent
- writer agent
- reviewer agent
- planner agent
- execution agent
Each one can have a different role, context, and tool setup.
That makes it useful for:
- research pipelines
- content workflows
- internal business automation
- data gathering + summarization flows
- agent handoff patterns
- more production-style orchestration than “just call the model again”
The reason it got traction is that it sits in a nice middle ground:
- higher-level than wiring every agent loop yourself
- more concrete than vague "agent platform" marketing
- easy enough for Python developers to start with quickly
The two big concepts in CrewAI: Crews and Flows
From the current repo README, CrewAI emphasizes two core concepts.
1. Crews
Crews are teams of agents collaborating with autonomy.
This is the “multi-agent” part most people think of first:
- specialized roles
- role-based collaboration
- delegation
- agents working together toward a result
2. Flows
Flows are the more controlled, event-driven side.
This is where CrewAI becomes more production-friendly:
- execution paths
- state management
- conditional logic
- integration with normal Python code
- more deterministic orchestration when you need it
That combination is a big part of the pitch:
- Crews for agent autonomy
- Flows for production control
Why CrewAI gets expensive fast
This part usually becomes obvious after the first real project.
A single-agent script is one thing.
A multi-agent system is different.
Costs grow because you now have:
- multiple agents making separate LLM calls
- handoffs between agents
- intermediate summaries
- retries
- reflection/replanning
- tool use across several steps
- repeated context being passed around the system
So the problem is not just “what model am I using?”
It becomes:
- do all agents need the same expensive model?
- should the planner use the same model as the formatter?
- how much repeated context is being resent?
- can simple routing/classification work go to cheaper models?
- can repeated flows benefit from cache hits?
That is exactly the kind of workload where a gateway layer starts making sense.
Where Lynkr fits
If CrewAI is the orchestration layer, Lynkr can sit underneath it as the LLM gateway.
That means your architecture becomes:
CrewAI agents / flows
↓
Lynkr
↓
Ollama / OpenRouter / Bedrock / OpenAI / Azure / Databricks / others
Instead of wiring each agent stack directly to one provider, you point your model traffic at one gateway endpoint and let that layer decide what happens next.
Why use Lynkr with CrewAI?
This is the important part.
The real benefit is not just “use any provider.”
That is table stakes now.
The better reason is that Lynkr gives you three strong levers for agent workloads:
1. Prompt caching
Multi-agent systems resend a lot of context.
That can include:
- system prompts
- task descriptions
- agent roles and backstories
- previous step context
- the same instructions reused across repeated runs
Lynkr’s caching layer helps reduce the amount of repeated input you pay for.
For agent systems, that matters a lot more than it does in one-off chat prompts.
2. Tier routing
Not every step in a CrewAI workflow deserves your strongest model.
Examples:
Use a cheaper/faster model for:
- classification
- routing
- formatting
- deterministic transformation
- simple extraction
- narrow sub-tasks
Use a stronger model for:
- planning
- reasoning-heavy synthesis
- ambiguous task decomposition
- final high-stakes output
This is exactly what tier routing is for.
3. One stable model endpoint
Once your agents grow from a prototype into a system, you usually want:
- one model boundary
- one place to switch providers
- one place to add failover
- one place to add policy and cost control
That is what a gateway layer gives you.
What Lynkr says it does well today
From the current Lynkr README, the main cost/performance claims are:
- 53% fewer tokens on tool-heavy requests
- 87.6% compression on large JSON tool results
- 171ms semantic cache hits
- automatic tier routing
- zero code changes at the client boundary once the endpoint is swapped
Those numbers come from coding-tool workloads, not specifically a published CrewAI benchmark.
So the honest framing is:
- I am not claiming a public CrewAI benchmark showing exactly 50% lower cost on every workload
- I am saying CrewAI has the exact kind of multi-step agent workload where these levers matter most
That is why “50% lower cost” is a fair headline shape for the category, but the actual result will depend on how your CrewAI system is built.
How to get started with CrewAI
From the current CrewAI README, installation starts like this:
uv pip install crewai
If you also want the tools extras:
uv pip install 'crewai[tools]'
The project also provides a CLI starter for creating a new crew project:
crewai create crew <project_name>
That scaffolds a project with:
main.pycrew.pyagents.yamltasks.yaml.env
So CrewAI is designed to be used as a real project structure, not just a single script.
A simple mental model for CrewAI code
A better way to think about CrewAI is:
- define who each agent is
- define what each task needs done
- define how work moves between agents
- then execute the whole workflow as one coordinated system
That is the real shift from a normal single-agent app.
You are not just prompting one model repeatedly.
You are designing a small working system with roles, handoffs, and outputs.
A minimal conceptual example looks like:
from crewai import Agent, Task, Crew
researcher = Agent(
role="Researcher",
goal="Find the best information on a topic",
backstory="You are great at gathering relevant details"
)
writer = Agent(
role="Writer",
goal="Turn research into a clear output",
backstory="You write concise, structured summaries"
)
research_task = Task(
description="Research the latest browser agent frameworks",
agent=researcher
)
write_task = Task(
description="Write a short technical summary from the research",
agent=writer
)
crew = Crew(
agents=[researcher, writer],
tasks=[research_task, write_task]
)
result = crew.kickoff()
print(result)
That is not copied from their exact starter file, but it reflects the basic CrewAI model:
- roles
- tasks
- orchestration
How to use CrewAI with Lynkr
The practical pattern is straightforward:
- install CrewAI
- install and start Lynkr
- point the model calls used by your CrewAI stack at Lynkr instead of directly at one provider
- let Lynkr handle routing/caching/provider flexibility underneath
1. Install Lynkr
npm install -g lynkr
2. Configure Lynkr
A simple cloud-backed setup from the current Lynkr README looks like this:
# .env
MODEL_PROVIDER=openrouter
OPENROUTER_API_KEY=your-key
FALLBACK_ENABLED=false
PORT=8081
PROMPT_CACHE_ENABLED=true
SEMANTIC_CACHE_ENABLED=true
Then start Lynkr:
lynkr start
If you want local-first testing, Lynkr also supports local backends like:
- Ollama
- llama.cpp
- LM Studio
That is useful for CrewAI because some low-value steps can run cheaply or locally, while harder reasoning tasks can still escalate.
3. Route CrewAI’s model traffic through Lynkr
The exact code depends on which model client you use with CrewAI.
The architecture is the important part:
CrewAI model client → Lynkr base URL → actual provider(s)
Because Lynkr gives you an OpenAI-compatible gateway surface, the integration is most natural when your CrewAI model configuration can target an OpenAI-style endpoint.
That lets you keep CrewAI as the orchestration layer while Lynkr becomes the control plane for model choice and cost behavior.
A better way to think about model assignment in CrewAI
Here is where most teams leave money on the table.
They do this:
- planner agent → expensive model
- researcher agent → same expensive model
- formatter agent → same expensive model
- reviewer agent → same expensive model
That is easy, but wasteful.
A better shape is:
- planner → strong reasoning model
- researcher → medium model
- summarizer → medium or cheap model
- formatter → cheap model
- repeated workflows → cached through gateway
The point is not that every step should be cheap.
The point is that different agent roles have different model requirements.
CrewAI already encourages role specialization.
Lynkr makes it easier to pair that with cost specialization.
A concrete example
Imagine a CrewAI workflow for market research.
You have:
- one agent gathering raw sources
- one agent extracting facts
- one agent writing the report
- one agent reviewing for quality
Without a gateway, teams often default to one premium model for all four.
With Lynkr underneath, the better pattern is:
- gather/extract → cheaper tier
- writing → medium tier
- review/final reasoning → stronger tier
- repeated report skeleton/context → cache where possible
That is a much more rational cost shape.
Why this matters more for CrewAI than normal apps
A normal app may only hit the LLM a few times.
A CrewAI system can explode the number of calls because the framework is designed around multiple agents and structured orchestration.
So the value of a gateway grows with:
- number of agents
- number of task handoffs
- amount of repeated context
- number of production runs
- number of providers you want to evaluate
That is why CrewAI is such a good fit for the “put a gateway underneath it” pattern.
What Lynkr does not replace
Important distinction:
- CrewAI is still the orchestration framework
- Lynkr is still the LLM gateway
Lynkr does not replace CrewAI’s agent/task/flow model.
It complements it by making the model layer cheaper and more flexible.
Honest tradeoffs
It is worth being direct here.
A gateway adds another infrastructure layer.
That is worth it when:
- you have multiple agents
- you care about spend
- you want provider flexibility
- you are moving toward production usage
It may not be worth it when:
- you are just learning CrewAI
- you are running a toy example once
- simplicity matters more than control
So I would not tell every beginner to add a gateway on day one.
But once your CrewAI project becomes real, the gateway question shows up quickly.
Final take
CrewAI is one of the most important open-source frameworks in the multi-agent Python ecosystem right now.
It gives you a useful structure for building agent systems with:
- roles
- tasks
- crews
- flows
- production-style orchestration
And if those systems are getting expensive, Lynkr is a practical way to put a cost-and-routing layer underneath them.
That gives you:
- one stable model endpoint
- provider flexibility
- caching for repeated context
- tier routing for different agent roles
- a better chance of keeping multi-agent systems affordable as they scale
If you want to try the stack:
- CrewAI:
https://github.com/crewAIInc/crewAI - Lynkr:
https://github.com/Fast-Editor/Lynkr
If you are already running CrewAI in production, I think the right question is not:
“What is the best model?”
It is:
“Which parts of my agent system actually deserve the expensive model?”
Top comments (0)