How to Build a CLI AI Agent Task Runner (Architecture + Code)
AI agents work best when they have a structured system for receiving, routing, and executing tasks. A CLI AI agent task runner is the backbone of any serious autonomous agent setup.
In this post I'll walk through the architecture, the key components, and what I've learned from running one in production.
What Is a CLI AI Agent Task Runner?
A CLI AI agent task runner is a system that manages and executes tasks autonomously through a command-line interface. Tasks can be anything: running scripts, calling APIs, generating content, deploying code, or triggering other agents.
The "agent" part means it acts without direct human intervention — you submit a task, the runner handles it.
Why it matters:
- Automate the repetitive. Anything you do more than twice is a candidate.
- Scale non-linearly. One runner can coordinate 10 agents doing parallel work.
- Standardize interfaces. Every task goes through the same pipeline — easy to debug, monitor, and extend.
Core Architecture: 3 Components
1. Task Queue
Receives tasks, stores them until an agent is ready to process them. Can be as simple as a SQLite table or as robust as RabbitMQ.
```python
# Simple SQLite-based task queue
import sqlite3

conn = sqlite3.connect('tasks.db')
conn.execute('''
    CREATE TABLE IF NOT EXISTS tasks (
        id TEXT PRIMARY KEY,
        type TEXT,
        prompt TEXT,
        status TEXT DEFAULT 'pending',
        result TEXT,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
''')

def enqueue(task_id, task_type, prompt):
    conn.execute(
        "INSERT INTO tasks (id, type, prompt) VALUES (?, ?, ?)",
        (task_id, task_type, prompt)
    )
    conn.commit()
```
2. Status States
Track every task through its lifecycle. Minimum viable set: pending → in_progress → done | failed.
```python
def update_status(task_id, status, result=None):
    conn.execute(
        "UPDATE tasks SET status=?, result=? WHERE id=?",
        (status, result, task_id)
    )
    conn.commit()
```
In production I use backlog, in_progress, done, failed, blocked_on_human — the extra states prevent tasks from getting lost when something needs external input.
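The snippets above don't show how those extra states are enforced. A minimal sketch of transition guarding, assuming the five-state set just mentioned (the transition map itself is my own convention, not from the post):

```python
# Hypothetical sketch: enforce legal state transitions so a task
# can't silently jump from 'done' back to 'in_progress'.
VALID_TRANSITIONS = {
    'backlog': {'in_progress'},
    'in_progress': {'done', 'failed', 'blocked_on_human'},
    'blocked_on_human': {'in_progress', 'failed'},
    'done': set(),
    'failed': {'backlog'},  # allow re-queueing failed tasks
}

def can_transition(current, new):
    """Return True if moving from `current` to `new` is allowed."""
    return new in VALID_TRANSITIONS.get(current, set())
```

Calling this check inside update_status (and refusing illegal moves) is one way to keep tasks from getting lost between states.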
3. Agent Routing
Dispatch tasks to the right agent based on type, capability, or availability.
```python
AGENT_ROUTES = {
    'summarize': 'local_llm',
    'web_research': 'browser_agent',
    'code_review': 'claude_agent',
    'custom': 'claude_agent',  # default
}

def route_task(task):
    agent = AGENT_ROUTES.get(task['type'], 'claude_agent')
    return dispatch_to(agent, task)
```
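dispatch_to is left undefined above. A minimal sketch of what it could look like, assuming each agent name maps to a plain Python handler function — the two handlers here are stand-ins, not real agent integrations:

```python
# Hypothetical handlers: in a real setup these would call an LLM,
# a browser automation, etc.
def local_llm(task):
    return f"summary of: {task['prompt']}"

def claude_agent(task):
    return f"claude handled: {task['prompt']}"

AGENT_HANDLERS = {
    'local_llm': local_llm,
    'claude_agent': claude_agent,
}

def dispatch_to(agent, task):
    """Look up the handler for an agent name and run the task through it."""
    handler = AGENT_HANDLERS.get(agent)
    if handler is None:
        raise ValueError(f"No handler registered for agent '{agent}'")
    return handler(task)
```

Registering handlers in a dict keeps the routing table (task type → agent name) decoupled from the execution table (agent name → callable), so either can change independently.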
The CLI Interface
Using click to expose the task runner:
```python
import click
import uuid

@click.group()
def cli():
    pass

@cli.command()
@click.option('--type', default='custom')
@click.argument('prompt')
def submit(type, prompt):
    """Submit a task to the runner."""
    task_id = str(uuid.uuid4())[:8]
    enqueue(task_id, type, prompt)
    click.echo(f"Task {task_id} queued")

@cli.command()
@click.argument('task_id')
def status(task_id):
    """Check task status."""
    row = conn.execute(
        "SELECT status, result FROM tasks WHERE id=?", (task_id,)
    ).fetchone()
    if row:
        click.echo(f"Status: {row[0]}")
        if row[1]:
            click.echo(f"Result: {row[1][:200]}")
    else:
        click.echo("Task not found")

if __name__ == '__main__':
    cli()
```
Usage:
```shell
python runner.py submit --type custom "Analyze the Q4 sales report and flag anomalies"
# Task a3f2b1c9 queued

python runner.py status a3f2b1c9
# Status: done
# Result: Found 3 anomalies in December billing...
```
What I've Learned Running This in Production
1. Status states save you. When something breaks at 2 AM, you want to know exactly where in the pipeline it stopped. Vague "failed" with no context is useless. Log everything.
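One way to get that context: wrap task execution and store the full traceback in the result column on failure. run_with_context is a hypothetical helper, not from the post; it takes the status updater as an argument so the sketch stays self-contained:

```python
import traceback

def run_with_context(task_id, fn, update_status):
    """Run a task and record *why* it failed, not just that it failed."""
    try:
        result = fn()
        update_status(task_id, 'done', result)
        return result
    except Exception:
        # Store the full traceback so the 2 AM debug session has context.
        update_status(task_id, 'failed', traceback.format_exc())
        raise
```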
2. Idempotency matters. Tasks get retried. If your task is "send email," make sure running it twice doesn't send two emails. Use idempotency keys or check-before-act patterns.
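A check-before-act sketch using a separate idempotency table — the sent_effects table and run_once helper are illustrative, not part of the schema above:

```python
import sqlite3

conn = sqlite3.connect(':memory:')
conn.execute('CREATE TABLE IF NOT EXISTS sent_effects (key TEXT PRIMARY KEY)')

def run_once(idempotency_key, action):
    """Check-before-act: skip the side effect if this key already ran."""
    already = conn.execute(
        'SELECT 1 FROM sent_effects WHERE key=?', (idempotency_key,)
    ).fetchone()
    if already:
        return False  # effect already happened; the retry is a no-op
    action()
    conn.execute('INSERT INTO sent_effects (key) VALUES (?)',
                 (idempotency_key,))
    conn.commit()
    return True
```

So a retried "send email" task calls run_once with the same key and the second invocation does nothing.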
3. Separate the runner from the agent. The runner handles orchestration. The agent handles execution. Mixing them makes both harder to change.
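To make the split concrete, here's a sketch of the runner side: a polling loop that only claims tasks and hands them off, leaving execution entirely to whatever route_task dispatches to. The max_iterations parameter is an assumption of mine, included so the loop can terminate in a test:

```python
import time

def worker_loop(conn, route_task, poll_interval=1.0, max_iterations=None):
    """Runner side only: claim the oldest pending task, hand it to an agent."""
    n = 0
    while max_iterations is None or n < max_iterations:
        row = conn.execute(
            "SELECT id, type, prompt FROM tasks "
            "WHERE status='pending' ORDER BY created_at LIMIT 1"
        ).fetchone()
        if row:
            task = {'id': row[0], 'type': row[1], 'prompt': row[2]}
            # Claim the task before handing it off.
            conn.execute("UPDATE tasks SET status='in_progress' WHERE id=?",
                         (task['id'],))
            conn.commit()
            route_task(task)  # the agent does the actual work
        else:
            time.sleep(poll_interval)
        n += 1
```

Note the loop never inspects the prompt or produces a result itself; swapping in a different agent requires no change here.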
4. The hardest part is priority logic. Once you have 50+ tasks in the queue, figuring out which one to run next becomes its own problem. Start simple (timestamp order), then add priority scores when you need them.
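A sketch of that "start simple, add scores later" progression: the same claim query with a priority tiebreak. The integer priority column is an assumption here, not part of the post's original schema:

```python
import sqlite3

# Sketch only: assumes a 'priority' INTEGER column (default 0) has
# been added to the tasks table.
conn = sqlite3.connect(':memory:')
conn.execute("""
    CREATE TABLE tasks (
        id TEXT PRIMARY KEY,
        status TEXT DEFAULT 'pending',
        priority INTEGER DEFAULT 0,
        created_at TIMESTAMP DEFAULT CURRENT_TIMESTAMP
    )
""")
conn.executemany(
    "INSERT INTO tasks (id, priority) VALUES (?, ?)",
    [('old-low-priority', 0), ('urgent-escalation', 5)],
)

# Highest priority first, FIFO within the same tier.
CLAIM_NEXT = """
    SELECT id FROM tasks
    WHERE status = 'pending'
    ORDER BY priority DESC, created_at ASC
    LIMIT 1
"""
```

With priority defaulting to 0, existing behavior (pure timestamp order) is unchanged until you actually start setting scores.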
5. Human-in-the-loop tasks are normal. Some tasks need human approval or input before they can proceed. Model this explicitly (blocked_on_human) instead of letting them hang forever.
Skip the Setup Grind
If you want a production-ready version with persistent memory, multi-agent coordination, priority scoring, and a working CLI — I packaged our full setup as a starter kit.
CLI Agent Starter Kit v2 → ($24)
It includes the task runner architecture above, plus the agent personality files, memory system, and skill templates we use in production. Everything you need to go from zero to a running autonomous agent in an afternoon.
Have you built something similar? What does your task routing look like? Drop it in the comments.