Can AI agents be systematically optimized for cost and latency using evolutionary methods?
As multi-agent systems grow in complexity, managing their operational cost and latency becomes a practical concern. Token usage and execution time scale with the number of agents, the length of prompts, and the depth of orchestration logic. Manual optimization of these systems is time-consuming and difficult to do systematically.
Agent Evolver applies genetic evolution to AI agent code, optimizing for cost and speed. It is built using the KISS framework.
The Limits of Prompt Engineering
Prompt engineering is a common approach to improving agent behavior, but it addresses only one dimension of agent performance. An agent's efficiency also depends on:
- How the orchestrator delegates to sub-agents
- Whether operations are batched or run sequentially
- Which tools are created dynamically vs. hardcoded
- How checkpointing affects recovery time
- Whether task management adds overhead or saves tokens
These are code-level concerns, not prompt-level ones, and they require a different optimization approach.
Evolutionary Optimization
Agent Evolver applies principles from evolutionary computation—specifically, mutation, crossover, and Pareto-based selection—to agent codebases.
Here is how the process works:
1. Seed the Population
You provide a task description specifying what you want the agent system to accomplish. Agent Evolver then uses a coding agent to generate an initial agent implementation. This produces complete, runnable code including:
- Orchestrator patterns for long-running tasks
- Dynamic todo list management
- Tool creation at runtime
- Checkpointing for resilience
- Sub-agent delegation strategies
The coding agent searches the web for current patterns in building efficient agents, incorporating publicly available techniques.
2. Mutate and Crossover
Each generation, Agent Evolver applies two evolutionary operations:
Mutation: A successful agent variant is selected, its code is analyzed, and targeted improvements are applied—shortening prompts, adding caching, batching operations, or optimizing algorithms. The improver agent reads the code, understands the architecture, and makes specific modifications.
Crossover: Two high-performing variants are selected, and their respective strengths are combined. For example, if Variant A has effective caching logic and Variant B has more compact prompt structures, crossover produces offspring that incorporate both.
3. Pareto Frontier Selection
Agent Evolver optimizes for multiple objectives simultaneously using a Pareto frontier of non-dominated solutions.
Consider two agents:
- Agent A: 5,000 tokens, 10 seconds
- Agent B: 3,000 tokens, 15 seconds
Neither dominates the other. Agent A is faster; Agent B is cheaper. Both represent valid trade-offs, so both remain on the frontier.
An agent is removed only when another agent is both cheaper and faster. This preserves diversity in the population and avoids premature convergence to a local optimum.
The system uses crowding distance to maintain diversity, ensuring that when the frontier needs trimming, solutions remain distributed across the trade-off curve.
Comparison with Prompt Optimization
Prompt optimization tools tune prompt text while leaving agent code unchanged. Agent Evolver operates on both prompts and code:
| Traditional Prompt Optimization | Agent Evolver |
|---|---|
| Tunes prompt text | Optimizes prompts AND code |
| Single objective (accuracy) | Multi-objective (cost + speed) |
| Static architecture | Evolves architecture |
| Manual iteration | Automated generations |
| Local improvements | Global search via genetics |
The improver agent analyzes control flow, identifies redundant API calls, finds opportunities for parallelization, and restructures agent delegation hierarchies.
Architecture
The system follows this structure:
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Task Description │
└─────────────────────────────────────────────────────────────┬────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Initial Agent Creation (Relentless Coding Agent) + Web Search for Best Practices │
└─────────────────────────────────────────────────────────────┬────────────────────────────────────────────┘
│
▼
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Evolution Loop │
│ ┌────────────────────────────────────────────────────────────────────────────────────────────────◄──┐ │
│ │ Mutation (80%): Single parent, Targeted changes │ Crossover (20%): Two parents, Combine │ │ │
│ └───────────────────────────────────────────────┬────────────────────────────────────────────────┘ │ │
│ ▼ │ │
│ ┌────────────────────────────────────────────────────────────────────────────────────────────────┐ │ │
│ │ Evaluation: Measure tokens_used, execution_time │ │ │
│ └───────────────────────────────────────────────┬────────────────────────────────────────────────┘ │ │
│ ▼ │ │
│ ┌────────────────────────────────────────────────────────────────────────────────────────────────┐ │ │
│ │ Pareto Frontier Update: Keep non-dominated solutions, Trim by crowding distance │ │ │
│ └───────────────────────────────────────────────┬────────────────────────────────────────────────┘ │ │
│ └───────────────── More generations? ───────────────┘ │
└─────────────────────────────────────────────────────┬────────────────────────────────────────────────────┘
│ Done
▼
┌──────────────────────────────────────────────────────────────────────────────────────────────────────────┐
│ Optimal Agent Output: Best trade-off on Pareto frontier │
└──────────────────────────────────────────────────────────────────────────────────────────────────────────┘
Each generation, the system:
- Samples from the Pareto frontier
- Applies mutation or crossover
- Evaluates the offspring
- Updates the frontier with any non-dominated variants
- Copies the current best to an
optimal_agentdirectory
The best agent is always available, even while evolution continues.
Getting Started
from kiss.agents.create_and_optimize_agent import AgentEvolver
evolver = AgentEvolver()
best_agent = evolver.evolve(
task_description="""
Build a code review agent that can:
1. Analyze pull requests for bugs and style issues
2. Suggest improvements with explanations
3. Auto-fix simple issues when confident
""",
max_generations=10,
initial_frontier_size=4,
max_frontier_size=6,
mutation_probability=0.8,
)
print(f"Optimal agent: {best_agent.folder_path}")
print(f"Tokens used: {best_agent.metrics['tokens_used']}")
print(f"Execution time: {best_agent.metrics['execution_time']:.2f}s")
print(f"Success: {best_agent.metrics['success']}")
The result is a complete agent package including code, config, tests, and documentation.
Summary
Agent Evolver automates the optimization of AI agent systems by treating agent code as an evolvable artifact. Rather than manually iterating on prompts and code, you define a task and let the evolutionary loop search for efficient implementations across both cost and latency dimensions.
Each generation of evolution incorporates current publicly available knowledge about building efficient agents, so improvements from the broader community can be absorbed automatically.
Agent Evolver is part of the KISS (Keep It Simple, Stupid) agent framework. It is open-source and available on GitHub.

Top comments (0)