
Batty

The Case for Markdown as Your Agent's Task Format

When I started coordinating multiple AI coding agents, my first instinct was JSON for task definitions. Structured, parseable, unambiguous. It lasted about a week.

The problem wasn't parsing. The problem was everything else.

What went wrong with JSON

{
  "id": 27,
  "title": "Add JWT authentication to the API",
  "status": "in-progress",
  "assigned_to": "engineer-1",
  "description": "Implement JWT-based auth middleware for all protected routes. Use the jsonwebtoken crate. Add login and refresh endpoints.",
  "acceptance_criteria": [
    "All protected routes return 401 without valid token",
    "Login endpoint returns access + refresh tokens",
    "Tests cover happy path and expired token"
  ]
}

This works for machines. But when you're supervising agents and need to quickly check what's happening:

  • cat tasks.json gives you a wall of brackets and quotes
  • git diff shows structural noise alongside actual changes
  • Editing a task description means navigating JSON syntax
  • The agent needs a JSON parser to read its own assignments

Every interaction with the task file required tooling. I couldn't just look at it.
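To make "required tooling" concrete: even a one-field read from the JSON version means shelling out to a real parser. A minimal sketch, assuming a hypothetical file name task-27.json and using python3 as the parser:

```shell
# Recreate a minimal version of the JSON task above
cat > task-27.json <<'EOF'
{"id": 27, "status": "in-progress", "title": "Add JWT authentication to the API"}
EOF

# No grep-friendly line structure: a one-field read needs a JSON parser
python3 -c 'import json; print(json.load(open("task-27.json"))["status"])'
```

This prints in-progress, but only after a full parse of the file. The Markdown version below makes the same lookup a plain grep.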

What Markdown gets right

The same task in Markdown:

---
id: 27
status: in-progress
assigned_to: engineer-1
---

# Add JWT authentication to the API

Implement JWT-based auth middleware for all protected routes.
Use the jsonwebtoken crate. Add login and refresh endpoints.

## Acceptance criteria

- All protected routes return 401 without valid token
- Login endpoint returns access + refresh tokens
- Tests cover happy path and expired token

Now:

  • cat task-27.md is immediately readable
  • git diff shows exactly what changed in human terms
  • Editing means opening a file and typing prose
  • The agent reads it as natural language — no parser needed

The YAML frontmatter carries the structured data (ID, status, assignment). The body is free-form Markdown that both humans and agents read natively.
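Getting a frontmatter field back out needs nothing heavier than grep. A minimal sketch, assuming the file is named task-27.md to match the example above:

```shell
# Recreate the example task file
cat > task-27.md <<'EOF'
---
id: 27
status: in-progress
assigned_to: engineer-1
---

# Add JWT authentication to the API
EOF

# "status: in-progress" -> "in-progress"
grep '^status:' task-27.md | cut -d' ' -f2
```

This prints in-progress: the structured fields are one text-processing command away, while the body stays prose.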

The kanban board is a directory

Forget Jira. Forget Trello. A kanban board is a directory of Markdown files:

board/tasks/
  027-add-jwt-authentication.md    # status: in-progress
  028-write-api-tests.md           # status: todo
  029-fix-dashboard-css.md         # status: done

Want to see what's in progress?

grep -l "status: in-progress" board/tasks/*.md

Want to see the full board?

for f in board/tasks/*.md; do
  status=$(grep -m1 "^status:" "$f" | cut -d' ' -f2)
  title=$(grep -m1 "^# " "$f" | sed 's/^# //')
  printf "%-15s %s\n" "$status" "$title"
done

Output:

in-progress     Add JWT authentication to the API
todo            Write API endpoint tests
done            Fix dashboard CSS bug

No database. No API. No special tooling. Standard Unix commands on standard files.
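The same grep-and-count style extends to a per-column tally, which is handy if you want a WIP-limit check. A minimal sketch that recreates the three files above in a scratch directory:

```shell
# Recreate the example board in a scratch directory
mkdir -p board/tasks
printf -- '---\nstatus: in-progress\n---\n' > board/tasks/027-add-jwt-authentication.md
printf -- '---\nstatus: todo\n---\n' > board/tasks/028-write-api-tests.md
printf -- '---\nstatus: done\n---\n' > board/tasks/029-fix-dashboard-css.md

# Tally tasks per column
grep -h '^status:' board/tasks/*.md | sort | uniq -c
```

The output is one line per status with a count, straight from uniq -c. Swap in `wc -l` on a filtered `grep -l` if you only care about one column.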

Why agents prefer Markdown

AI coding agents already understand Markdown. It's in their training data — millions of README files, issue descriptions, and documentation pages. When you hand an agent a Markdown task file, it reads the title, understands the acceptance criteria, and starts working.

Compare this to handing an agent a JSON blob. The agent can parse it, but the format adds friction. Escaped strings, nested objects, array indices — all syntax that carries no information about the task itself.

Markdown is the closest thing to natural language that's still structured enough to parse programmatically. The frontmatter gives you machine-readable fields. The body gives the agent context in the format it understands best.

Git as the state machine

When every task is a Markdown file, git becomes your state machine:

# What changed today?
git log --since="today" --name-only -- board/tasks/

# Who moved this task to done?
git log -1 --format="%an %ai" -- board/tasks/029-fix-dashboard-css.md

# What did the task look like before the agent modified it?
git diff HEAD~1 -- board/tasks/027-add-jwt-authentication.md

# Roll back a task to its previous state
git checkout HEAD~1 -- board/tasks/027-add-jwt-authentication.md

Every state transition is a commit. Every change is auditable. Rollback is git checkout. You get version control for your project management — not as a feature you built, but as a side effect of using files.
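A status change then becomes a one-line edit plus a commit. A sketch in a throwaway repo (sed's -i.bak form is used because the in-place flag differs between GNU and BSD sed; the commit messages are just a convention, not anything Batty mandates):

```shell
# Scratch repo with one task file
mkdir -p demo/board/tasks && cd demo
git init -q
git config user.email demo@example.com
git config user.name demo
printf -- '---\nid: 27\nstatus: in-progress\n---\n' > board/tasks/027-add-jwt-authentication.md
git add -A && git commit -qm "task 27: created (in-progress)"

# The state transition: edit the frontmatter, commit the move
sed -i.bak 's/^status: in-progress$/status: done/' board/tasks/027-add-jwt-authentication.md
rm -f board/tasks/*.bak
git add -A && git commit -qm "task 27: in-progress -> done"

# The task's full history is its git log
git log --oneline -- board/tasks/027-add-jwt-authentication.md
```

Two commits, one per state, and the pathspec-limited log gives you the task's audit trail for free.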

The practical setup

Batty uses this pattern for its entire task management layer. Each task is a Markdown file with YAML frontmatter. The kanban board is a directory. State changes are file edits. The supervisor reads task files to dispatch work, and agents read them to understand their assignments.

The format:

---
id: 31
status: todo
priority: high
tags: [api, auth]
---

# Implement rate limiting on public endpoints

Add rate limiting middleware to all public API routes.
Use a token bucket algorithm with configurable limits per endpoint.

## Context

The /api/search endpoint is getting hammered by scrapers.
Current traffic: ~200 req/sec, target limit: 60 req/sec.

## Done when

- Rate limiter middleware applied to all public routes
- Returns 429 with Retry-After header when limit exceeded
- Limits configurable per-route in config.yaml
- Tests cover limit enforcement and header correctness

An agent reads this and knows exactly what to build, what "done" means, and what context matters. No format translation needed.

When Markdown isn't enough

Markdown works for task definitions, status tracking, and agent instructions. It doesn't work for everything:

  • Structured config — YAML or TOML. "Who talks to whom" is a graph, not prose.
  • Event logs — JSONL. Machine-parseable, append-only, jq-friendly.
  • Message routing — Maildir files. Atomic delivery via filesystem rename.

The right format depends on who reads it. Tasks are read by humans and agents — Markdown. Config is read by the daemon — YAML. Logs are read by scripts — JSONL. Pick the format that matches the reader.
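As a concrete contrast for the event-log case, JSONL stays append-only and greppable. A sketch with made-up field names, not Batty's actual schema:

```shell
# Append one event per line
printf '%s\n' '{"ts":"2024-01-01T10:00:00Z","task":27,"event":"status_change","to":"done"}' >> events.jsonl
printf '%s\n' '{"ts":"2024-01-01T10:05:00Z","task":28,"event":"status_change","to":"in-progress"}' >> events.jsonl

# Cheap filter without jq; jq works too when it's installed
grep '"task":27,' events.jsonl
```

Scripts get one parseable object per line, and humans still get something grep can slice, which is the same reader-matching logic applied to logs.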


Try it: cargo install batty-cli (GitHub | Demo)

Top comments (2)

Thomas Landgraf

Reached the same conclusion independently, but for specifications rather than tasks. Full disclosure: I'm the creator of SPECLAN, a VS Code extension that structures requirements as a hierarchical tree of Markdown files with YAML frontmatter: Goal → Feature → Requirement → Scenario → Acceptance Criterion, one file per entity, with ID, parent reference, and status in the frontmatter. The same properties you describe for tasks apply to specs: grep-able, diff-able, agent-readable without tooling, and git as the state machine for free.

The one thing we leaned into hard that tasks might not need: using the status field as an ownership signal, not just a label. "draft" means the PO is holding it, "approved" means it's handed off to the dev team, "under-test" means it's handed back. The status controls who can do what, and AI agents only implement approved specs — they can't pick up a draft. Same Markdown+YAML substrate, just with the state machine encoded as workflow gates.

Curious how you handle task dependencies in Batty — do you reference parent/child tasks in frontmatter, or just order by ID and let the agent figure out sequencing?

Admin Chainmail

We stumbled into this exact pattern independently. Our agent system uses three markdown files as its core state: activity.md (timestamped action log), metrics.md (snapshot numbers each session), and decisions.md (strategic decisions with reasoning). The agent reads these at the start of every session to orient itself before doing anything.

The underrated benefit is debuggability. When the agent does something wrong, you can read the logs and trace exactly what information it had when it made the decision. Try doing that with a JSON blob or a database row.

One edge case we hit: markdown files grow over time. After 40 sessions, the activity log is too large to read in one pass. The agent now reads just the latest session, but that means it can lose context about decisions made 20 sessions ago. We added a separate decisions.md to capture the 'why' of strategic choices so they don't get buried in the activity stream.