We run 25+ repositories at Stackbilt. One founder. Issues pile up. The boring stuff — doc fixes, test gaps, type errors — never gets prioritized because there's always something more urgent.
So we built a system where an AI agent picks up labeled GitHub issues, writes the fix, opens a PR, and posts a summary. No human in the loop until code review.
The pipeline
GitHub Issue (labeled "aegis")
→ Issue Watcher (hourly cron)
→ Task Queue (D1)
→ cc-taskrunner (Claude Code session)
→ Auto-PR on auto/{category}/{task-id} branch
→ Session digest
The cc-taskrunner is open source. It pulls tasks from a queue, spins up Claude Code sessions with structured prompts, and handles the lifecycle.
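The lifecycle above can be sketched as a single pass over one queued task. Everything here is illustrative: `Task`, `start_session`, and `open_pr` are hypothetical stand-ins for the real queue record, Claude Code session launcher, and GitHub integration; only the `auto/{category}/{task-id}` branch convention comes from the pipeline description.

```python
from dataclasses import dataclass

@dataclass
class Task:
    task_id: str
    category: str  # e.g. "docs", "tests", "bugfix"
    prompt: str

def branch_name(task: Task) -> str:
    # Branch convention from the pipeline: auto/{category}/{task-id}
    return f"auto/{task.category}/{task.task_id}"

def run_task(task: Task, start_session, open_pr) -> dict:
    """One pass of the taskrunner lifecycle: session -> PR -> digest.

    `start_session` and `open_pr` are injected stand-ins for the real
    Claude Code / GitHub clients (hypothetical signatures).
    """
    transcript = start_session(task.prompt)
    pr_url = open_pr(branch=branch_name(task),
                     title=f"[{task.category}] {task.task_id}")
    # The digest is just a truncated transcript in this sketch.
    return {"task": task.task_id, "pr": pr_url, "digest": transcript[:200]}
```

Dependency injection of the session and PR clients keeps the loop testable without touching GitHub.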
Governance tiers
Not every task should run unsupervised:
- auto_safe — docs, tests, research, refactors → executes immediately
- proposed — bugfixes, features → requires approval
Classification is deterministic. GitHub labels map to categories. No LLM in the classifier.
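A deterministic label-to-tier classifier can be as small as a set lookup. The label names below are taken from the tier list above, but the exact mapping table is an assumption; the real one lives in the repo config.

```python
# Hypothetical label sets; the tiers themselves come from the governance model.
AUTO_SAFE = {"docs", "tests", "research", "refactor"}
PROPOSED = {"bugfix", "feature"}

def classify(labels: list[str]) -> str:
    """Map GitHub labels to a governance tier. No LLM involved.

    Any label from the proposed set wins (the riskier tier dominates),
    and unknown labels also fall through to 'proposed' as the safe default.
    """
    cats = {label.lower() for label in labels}
    if cats & PROPOSED:
        return "proposed"    # requires approval
    if cats & AUTO_SAFE:
        return "auto_safe"   # executes immediately
    return "proposed"        # default to the supervised tier
```

Making the riskier tier dominate means a task labeled both `docs` and `bugfix` still waits for approval.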
Safety hooks
- No interactive prompts (AskUserQuestion blocked)
- No destructive git ops (force push, reset hard blocked)
- No production deploys
- No secret access
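The git-safety hooks amount to a denylist check run before any shell command executes. This is a minimal sketch: the pattern list and hook name are assumptions, and the real hooks run inside Claude Code's tool layer, not as a standalone function.

```python
import re

# Hypothetical denylist covering the destructive git ops named above.
BLOCKED_PATTERNS = [
    re.compile(r"^git\s+push\b.*\s(--force|-f)\b"),  # force push
    re.compile(r"^git\s+reset\s+--hard\b"),          # hard reset
]

def pre_tool_hook(command: str) -> bool:
    """Return True if the shell command may run, False to block it."""
    cmd = command.strip()
    return not any(p.search(cmd) for p in BLOCKED_PATTERNS)
```

Ordinary pushes and soft resets pass through; only the destructive variants are rejected.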
What works well
The system excels at work humans deprioritize: documentation drift, test coverage gaps, type-error cleanup. Tight scope means a high merge rate.
What breaks
completion_signal_missing — the agent finishes but never outputs TASK_COMPLETE. Seen 11+ times a week. Mitigation: scan for new git commits as a secondary completion signal.
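The two-signal completion check can be sketched as follows. The sentinel string `TASK_COMPLETE` comes from the post; `new_commit_count` and its `origin/main` base ref are assumptions about how the commit scan might be wired up.

```python
import subprocess

def new_commit_count(repo_dir: str, base_ref: str = "origin/main") -> int:
    """Count commits on HEAD that are not on base_ref (hypothetical helper)."""
    out = subprocess.run(
        ["git", "-C", repo_dir, "rev-list", "--count", f"{base_ref}..HEAD"],
        capture_output=True, text=True,
    )
    return int(out.stdout) if out.returncode == 0 else 0

def task_completed(session_output: str, commit_count: int) -> bool:
    # Primary signal: the sentinel string in the agent's final message.
    if "TASK_COMPLETE" in session_output:
        return True
    # Secondary signal: commits exist even though the sentinel is missing.
    return commit_count > 0
```

Separating the commit scan from the decision keeps the fallback logic trivially testable.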
Large-file timeouts — files over 800 LOC hit the turn limit. Mitigation: max_turns is now bumped automatically for large files.
Vague prompts — "Improve the auth system" produces scattered changes. Fix: write prompts like tickets for a junior engineer.
Try it yourself
- cc-taskrunner — open source task runner
- Charter — ADF governance framework (Apache-2.0)
- MCP Gateway — OAuth MCP server
Full ecosystem: github.com/Stackbilt-dev