DEV Community

Tanishq

How I Shipped 3 Production SaaS Backends in 30 Days Using Claude Code (Without Context Loss Destroying Everything)

I've been using Claude Code for the last 4 months to build SaaS backends. Love it. Until I don't.

You know the pattern. Day 1: Claude writes beautiful auth logic. You're impressed. Day 3: Ask it to add Stripe webhooks. Day 5: Auth is broken. No idea what changed. Day 7: Context window full. Start new session. Day 8: "Wait, what database schema are we using again?"

Every. Single. Time.

I'd spend more time re-explaining my project than actually building it. The "brilliant colleague with amnesia" metaphor is painfully accurate.

The Context Loss Problem Nobody's Solving

Here's what I kept hitting:

Mid-session drift. Claude would start with async/await, then randomly switch to .then() chains 200 lines later. Why? Context degradation. The model "forgets" earlier patterns as the conversation grows.

Schema amnesia. I'd define a users table with specific columns in message 5. By message 40, Claude's suggesting queries for columns that don't exist.

Security regression. RLS policies carefully set up in Phase 1? Completely ignored when adding features in Phase 3.

The Groundhog Day effect. Close laptop Friday. Open Monday. Spend 30 minutes re-explaining the entire project before Claude can write a single line.

I tried everything the internet suggested:

  • ✗ Longer prompts with full context (hit token limits, quality degraded anyway)
  • ✗ Custom instructions (too vague, didn't persist across sessions)
  • ✗ Separate chats for each feature (lost the big picture, broke dependencies)
  • ✗ Manual "memory dumps" (exhausting, error-prone)

Nothing worked. The fundamental issue is that LLMs have working memory, not long-term memory. They're brilliant in the moment, terrible at maintaining state.

What Actually Fixed It: Multi-Agent Orchestration

I realized the problem isn't the AI. It's the workflow.

Human developers don't keep entire codebases in their heads either. They use documentation. Design docs. Database schemas. API specs. External references that persist.

So I built a system that orchestrates Claude through specialized agents, each with fresh context windows and specific jobs.

The Four Files That Maintain State

1. PROJECT.md - The vision document

  • Problem being solved (plain English)
  • Target users and workflows
  • Core value proposition
  • Success criteria

2. REQUIREMENTS.md - Traceable feature definitions

  • Every requirement has a unique ID (AUTH-01, PAY-02, etc.)
  • v1 scope (must have), v2 scope (future), out-of-scope (won't do)
  • Acceptance criteria for each

3. ROADMAP.md - Phased execution plan

  • Phase 0: Infrastructure
  • Phase 1: Core feature
  • Phase 2: Supporting features
  • Phase 3: Polish
  • Each requirement mapped to specific phases

4. STATE.md - The living memory

  • Completed phases (locked from modification)
  • Current phase (only modifiable code)
  • Database schema (exact DDL)
  • API routes built (paths, methods, business logic)
  • Architectural decisions

These files are sized to avoid context degradation (under 10k tokens each) and serve as a single source of truth for both humans and AI.
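To make this concrete, here's a hypothetical STATE.md excerpt. The exact sections are whatever you keep consistent; this is a sketch of the idea, not a generated format:

```markdown
# STATE.md

## Completed Phases (locked)
- Phase 0: Infrastructure — Supabase project, migrations, CI
- Phase 1: Auth — email + OAuth, RLS policies on every table

## Current Phase
- Phase 2: Payments (Stripe webhooks, credits ledger)

## Database Schema (exact DDL)
CREATE TABLE users (
  id uuid PRIMARY KEY DEFAULT gen_random_uuid(),
  email text UNIQUE NOT NULL,
  created_at timestamptz NOT NULL DEFAULT now()
);

## API Routes
- POST /api/auth/login — password login, returns session cookie
- POST /api/webhooks/stripe — verifies signature, credits the account

## Decisions
- async/await everywhere; no .then() chains
```

Because the schema is exact DDL and the decisions are explicit, a fresh session can be pointed at this file instead of re-explaining the project.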

The Multi-Agent System

Instead of one long Claude conversation, the system spawns specialized parallel agents:

Research agents (4 running in parallel before coding):

  • Stack researcher → best technologies for your domain
  • Features researcher → table stakes vs differentiators
  • Architecture researcher → system design patterns
  • Pitfalls researcher → common mistakes to avoid

Execution agents:

  • Planner → creates verified task plans
  • Executor → runs plans with atomic commits
  • Verifier → tests and auto-debugs
  • Mapper → analyzes existing codebase

Each agent gets fresh context. No degradation. No drift.
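The fan-out itself is the same pattern as any parallel job runner. Here's a minimal TypeScript sketch of the idea — all names (`Agent`, `runAgent`, `research`) are illustrative, not PropelKit's actual API; a real implementation would call an LLM API with only each agent's own prompt:

```typescript
// Each agent is a name plus an isolated prompt — no shared conversation history.
type Agent = { name: string; prompt: string };

// Stand-in for a real LLM call. The key property: it receives ONLY
// agent.prompt, so every agent starts from a fresh, empty context.
async function runAgent(agent: Agent): Promise<string> {
  return `${agent.name}: findings for "${agent.prompt}"`;
}

// Spawn the four research agents in parallel and collect their reports.
async function research(domain: string): Promise<string[]> {
  const agents: Agent[] = [
    { name: "stack", prompt: `Best technologies for ${domain}` },
    { name: "features", prompt: `Table stakes vs differentiators for ${domain}` },
    { name: "architecture", prompt: `System design patterns for ${domain}` },
    { name: "pitfalls", prompt: `Common mistakes building ${domain}` },
  ];
  // Promise.all runs them concurrently; results are merged afterwards
  // by a planner, not fed back into one ever-growing chat.
  return Promise.all(agents.map(runAgent));
}
```

The design point is the isolation, not the concurrency: because no agent sees another's transcript, no agent can drift on another's accumulated context.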

The Workflow Cycle

1. INITIALIZE
   Describe vision → AI creates PROJECT.md, REQUIREMENTS.md, ROADMAP.md

2. DISCUSS (each phase)
   Shape implementation preferences before committing

3. PLAN  
   Research domain patterns → create verified execution plan

4. EXECUTE
   Run plans in parallel waves with fresh contexts → atomic git commits

5. VERIFY
   User acceptance testing with automatic debugging

Repeat 2-5 for each phase

Critical rule: Completed phases are locked. The AI can only modify code in the current phase. This prevents the "adding payments breaks auth" problem entirely.
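The lock is cheap to enforce if STATE.md records which phase produced each file. A minimal TypeScript sketch, with hypothetical types and names (not PropelKit's internals):

```typescript
interface ProjectState {
  currentPhase: number;
  // Which phase each file was completed in (absent = not built yet).
  fileOwners: Map<string, number>;
}

// Gate run before the executor touches a file: new files and
// current-phase files are fair game; anything completed in an
// earlier phase is locked.
function canModify(state: ProjectState, file: string): boolean {
  const owner = state.fileOwners.get(file);
  return owner === undefined || owner === state.currentPhase;
}
```

With auth files owned by Phase 1 and `currentPhase` at 3, `canModify` returns `false` for them, so "adding payments" physically cannot rewrite auth.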

Boilerplate-Aware Intelligence

The AI knows what's already built in the boilerplate (auth, Stripe, Razorpay, Supabase, multi-tenancy, emails, admin panel). It only plans what's custom to your domain.

This means:

  • Zero time wiring auth to database
  • Zero time setting up payment webhooks
  • Zero time building admin panels
  • Pure focus on your unique business logic

The Results (Why I'm Sharing This)

In the last 30 days, I built 3 production SaaS backends using this system:

Analytics Dashboard (13 hours total, across 4 sessions)

  • Custom analytics schema (metrics, data_points, aggregations)
  • Ingestion API with validation
  • Time-series calculations (daily, weekly, monthly)
  • CSV export with date filtering
  • Now has 8 paying users making $96/month

Feedback Widget (11 hours, 3 sessions)

  • Feedback schema with metadata
  • Widget embedding API (iframe + script tag)
  • Admin CRUD with filtering
  • Email notifications on submission
  • Webhook system for integrations
  • 5 signups in first week

Content Calendar (9 hours, 2 sessions)

  • Content schema with scheduling
  • CRUD API with role-based access
  • Publishing logic with timezone handling
  • Calendar view backend
  • En route to production

All production-ready. All built with AI orchestration. All using persistent state across weeks.

Commands That Run It

After building this system for myself, I packaged it:

/propelkit:new-project

This master command:

  • Asks deep questions about your project
  • Spawns research agents for your domain
  • Creates PROJECT.md, REQUIREMENTS.md, ROADMAP.md
  • Generates phased execution plan
  • Hands you off to phase-by-phase building

Then for each phase:

/propelkit:discuss-phase 1    # Shape your preferences
/propelkit:plan-phase 1       # Research + create execution plan  
/propelkit:execute-phase 1    # Build with parallel agents
/propelkit:verify-work        # Test with auto-debugging

The system maintains STATE.md automatically. Close laptop. Come back days later. Resume exactly where you left off.

PropelKit - The Packaged System

After the third project, I productized it.

What you get:

Production Next.js boilerplate (saves 100+ hours):

  • Auth (email, OAuth, sessions)
  • Stripe + Razorpay payments
  • Supabase (PostgreSQL with RLS)
  • Multi-tenancy (organizations, teams, roles)
  • Credits system (usage-based billing)
  • Email templates (8 pre-built)
  • Admin panel (user management, analytics)
  • 26 AI PM commands

Stack: Next.js 16, TypeScript, Supabase, Stripe, Razorpay

One-time purchase. You own the code. Build unlimited products.

The AI PM uses the exact multi-agent orchestration system described above. Persistent state. Parallel research. Boilerplate-aware. Atomic commits.

Demo: propelkit.dev (watch the AI questioning, research, roadmap generation, and execution)

Why This Approach Works

Context engineering - Separate files under degradation thresholds, not one massive chat

Multi-agent orchestration - Fresh contexts per agent, no drift accumulation

Boilerplate awareness - AI knows what exists, only builds what's custom

Atomic commits - One feature per commit, precision rollback

Phase locking - Completed code stays completed, no random rewrites

Domain research - AI understands your industry before writing code

This isn't just for PropelKit. The principles work anywhere: persistent state files, plus a fresh context window per task.


What's your experience with AI code context loss? Have you found other systems that work?
