Shane Wilkey
From GitHub Issue to Merged PR: My Complete AI-Powered Development Workflow

A few months ago, I was burning through an embarrassing amount of Claude tokens monthly while shipping maybe two meaningful features. The wake-up call came during a debugging session where I spent the better part of an hour having Claude explain why the same function worked differently in different contexts—only to discover I'd been feeding it outdated code samples from three iterations ago.

The Cost of Wing-It Development

Here's what broken AI development looks like: you start a conversation with Claude about adding user authentication, thirty messages later you're discussing database migrations, then somehow you're refactoring your entire error handling system. Each context switch burns tokens, and worse, the AI starts hallucinating solutions based on code it thinks exists but doesn't.

I was treating Claude like Stack Overflow with a conversation interface—throwing problems at it reactively instead of systematically. The result? Half-implemented features, context drift that made debugging impossible, and token bills that made me question whether AI development was worth it.

Planning Is Non-Negotiable

The breakthrough came when I stopped coding first and started planning first. Every feature now begins with a structured conversation where Claude and I define three things before touching any code:

Context boundaries — exactly what codebase state we're working from, what's been changed recently, and what assumptions we can make. No "you know that auth system we built" — explicit file references and current state.

Tool constraints — which APIs we're using, what our deployment pipeline expects, and what our testing setup can handle. Claude needs to know it's writing for GitHub Actions, not generic CI.

Success criteria — what "done" looks like, not just functionally but how we'll know the code is maintainable six months from now.

This planning phase costs a small amount of tokens upfront but pays for itself many times over downstream. Claude generates much better code when it knows exactly what problem it's solving within defined constraints.

Document Everything, Start Fresh Sessions

The temptation is to keep one long conversation going because it feels efficient. It's not. After enough back-and-forth, Claude starts making assumptions based on earlier context that may no longer be accurate. I learned to recognize the warning signs:

  • Claude references code I never shared in the current session
  • Solutions become overly complex for simple problems
  • The AI starts suggesting fixes for problems that don't exist

Instead of fighting context drift, I document everything mid-conversation. When planning a feature, I have Claude generate a structured project brief that includes the technical approach, file structure, and key implementation notes. Then I save that brief and start a fresh session when it's time to code.

The brief becomes the handoff document between sessions. New Claude session gets the brief, current codebase state, and the specific task. No context drift, no hallucinated solutions.
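Here's a sketch of what one of those handoff briefs can look like. The headings, file names, and scope notes are illustrative, not a fixed format — the point is that everything a fresh session needs fits on one screen:

```markdown
## Project Brief: Password reset flow
**Approach:** token-based reset links, emailed to the user
**Files in play:**
- `models/user.py` — add `reset_token` and expiry fields
- `services/email.py` — new module for outbound mail
**Implementation notes:**
- Tokens are single-use and expire after 24 hours
- Reuse the existing session middleware; no new auth paths
**Out of scope:** rate limiting, email template design
```

Because the brief is plain markdown, it doubles as repo documentation after the feature ships.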

Adversarial Code Review Catches Complacency

Here's something I discovered by accident: Claude gets comfortable with its own code. If I ask it to review code it just wrote, it tends to approve its own patterns even when they're problematic. The AI develops blind spots around its own work.

The solution is adversarial review using different models. After Claude writes a feature, I paste the code into a conversation with a different model and ask for a critical review without mentioning Claude wrote it. Different models catch different issues:

# Claude wrote this authentication decorator
from flask import redirect, session

def require_auth(f):
    def decorated_function(*args, **kwargs):
        if 'user_id' not in session:
            return redirect('/login')
        return f(*args, **kwargs)
    return decorated_function

Claude reviewed its own code: "Looks good, handles the authentication check properly."

The adversarial review: "This loses the function metadata and doesn't handle POST data properly on redirect. Also no CSRF protection."

Both criticisms were valid. Claude had gotten comfortable with a pattern that worked but wasn't robust. Cross-model review catches these blind spots before they become production issues.
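For what it's worth, the metadata criticism is a one-line fix with `functools.wraps`. Here's a sketch that runs standalone, with a plain dict standing in for Flask's session and a string standing in for the redirect response; CSRF protection would still need separate handling (e.g. Flask-WTF):

```python
import functools

# Stand-in for Flask's session object so this sketch runs without Flask.
session = {}

def require_auth(f):
    @functools.wraps(f)  # preserves __name__, __doc__, etc. on the wrapper
    def decorated_function(*args, **kwargs):
        if 'user_id' not in session:
            return 'redirect:/login'  # stand-in for flask.redirect('/login')
        return f(*args, **kwargs)
    return decorated_function

@require_auth
def dashboard():
    """Render the user dashboard."""
    return 'dashboard'
```

Without `functools.wraps`, `dashboard.__name__` would report `decorated_function`, which breaks debugging, introspection, and Flask's own route registration when you stack decorators.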

The Complete Workflow in Action

Here's how a typical feature development cycle works now:

Brainstorming phase — I describe the feature need to Claude conversationally. We explore approaches, discuss tradeoffs, and identify the core complexity. This generates understanding, not code.

Planning phase — Claude creates a detailed project plan with milestones, breaking the feature into discrete GitHub issues. Each issue gets acceptance criteria, technical notes, and priority ranking.

## Issue: Add password reset functionality
**Priority:** P1
**Acceptance Criteria:**
- User can request password reset via email
- Reset links expire after 24 hours  
- Process works with existing auth system
**Technical Notes:**
- Extend User model with reset_token field
- Add email service integration
- Update auth middleware to handle reset flow
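The 24-hour expiry in those acceptance criteria is exactly the kind of detail worth pinning down in the issue, because it's easy to get wrong in code. A minimal sketch of token generation and validation — the `reset_token` field name follows the technical notes above, everything else here is assumed:

```python
import secrets
from datetime import datetime, timedelta, timezone

RESET_TOKEN_TTL = timedelta(hours=24)

def issue_reset_token():
    """Return a (token, expiry) pair for a new password reset request."""
    token = secrets.token_urlsafe(32)
    expires_at = datetime.now(timezone.utc) + RESET_TOKEN_TTL
    return token, expires_at

def token_is_valid(stored_token, stored_expiry, presented_token, now=None):
    """Check expiry, then compare tokens in constant time."""
    now = now or datetime.now(timezone.utc)
    if now >= stored_expiry:
        return False
    return secrets.compare_digest(stored_token, presented_token)
```

Using `secrets` instead of `random` and `compare_digest` instead of `==` are the two choices an adversarial review would flag if they were missing.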

Project setup — Claude generates GitHub Projects configuration, complete with automation rules. I can literally copy-paste its output into GitHub's project setup interface.

Development — Each issue becomes its own focused Claude session using GitHub Codespaces. Claude gets the project brief, current codebase, and single issue context. Write code, test locally, commit.

Review — Different AI model reviews the code adversarially before I create the pull request.

The workflow scales because each step is self-contained and documented. I can pick up development from my phone using Codespaces, spend fifteen minutes with the Claude VS Code extension getting an issue implemented, then put it down again.

When It All Clicked

The system clicked during a weekend when I was traveling but wanted to knock out a few small issues. Using just my laptop and GitHub's web interface, I:

  • Created three new issues from user feedback
  • Had Claude prioritize them within the existing project
  • Used Codespaces to implement the highest priority fix
  • Got adversarial review from a different model
  • Merged the PR

What used to take half a day and leave me second-guessing every decision took under an hour — and I was confident in the output because the process held. That's the real payoff.

The key insight is that sustainable AI development isn't about finding the perfect prompt — it's about building repeatable systems that prevent the expensive mistakes. Planning upfront, documenting everything, and using fresh context prevents the token-burning death spirals that make AI development feel unsustainable.

Every system has overhead, but this overhead scales. The planning documents become project knowledge. The adversarial reviews catch patterns I need to improve. The structured workflow means I can develop from anywhere without losing momentum.

What's Next

Next up, we're dipping our toes into FolioChat to explore how RAG actually works for developers who build things. We'll see if retrieval-augmented generation can maintain context better than conversation history, and whether it's worth the complexity for real development workflows.


Tags: ai-development, claude, github, workflow, productivity

Series: Building With AI Agents — Article 4 of 12
