Chudi Nnorukam

Posted on Feb 10 • Edited on Jul 14 • Originally published at chudi.dev

Pain Point to MVP in 7 Days: The StatementSync Case Study

#mvp #saas #ai #startup


"I spend 10 hours a week just typing numbers from PDFs into spreadsheets."

That's what a freelance bookkeeper told me. She processes 50+ bank statements per month. Each one takes 10-15 minutes of manual transcription. The tedium is real.

I built StatementSync to fix this. One week from idea to production.

What Pain Point Does StatementSync Solve?

Freelance bookkeepers process 50 or more bank statement PDFs every month, manually typing each transaction into a spreadsheet. That's 10-plus hours of tedious, error-prone work per week. Existing tools either require proprietary software integrations, charge $0.25–$1.00 per file, or use generic OCR with 80–85% accuracy that creates more errors than it prevents.

The Pain Point

Bookkeepers work with bank statements constantly. The workflow:

Client sends PDF bank statement
Open PDF, open spreadsheet
Manually type each transaction
Check for errors
Repeat 50+ times per month

The tools that exist either:

Require proprietary software (QuickBooks, Xero)
Charge per file ($0.25-1.00 per statement)
Have terrible accuracy (OCR garbage)

For someone processing 50+ statements monthly, per-file pricing adds up fast: $12.50-50/month at 50 statements, scaling with volume.

Why Existing Tools Fail Bookkeepers

I asked the bookkeeper why she didn't use the tools that existed. The answers were consistent across five more interviews:

Per-file pricing punishes heavy users. At $0.25–0.50 per statement and 50+ statements monthly, the monthly bill rivals a software subscription. But you're not getting subscription-level reliability: one month with a bad batch and you've paid for nothing.

Proprietary software lock-in. QuickBooks can import bank data, but only after you configure a connection to the bank—which requires administrative access that clients often don't share with their bookkeeper. The bank statement PDF exists precisely because the connection doesn't.

OCR accuracy is unreliable. Generic OCR tools treat a bank statement like any other scanned page and ignore its layout. But bank statements have fixed patterns: date column, description column, debit column, credit column, balance column. The patterns are predictable enough that OCR is overkill, and OCR's 80–85% accuracy on complex layouts creates more work correcting errors than manual entry would have taken.
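To make the "fixed patterns" point concrete, here is a minimal sketch of parsing a single transaction line with a regular expression. The line layout (MM/DD date, description, signed amount, running balance), the `ParsedLine` type, and the `parseLine` function are assumptions for illustration only; they are not StatementSync's actual parser or any bank's real format.

```typescript
// Hypothetical row shape; real statements differ per bank.
interface ParsedLine {
  date: string;        // as printed on the statement, e.g. "01/15"
  description: string;
  amount: number;      // negative = debit, positive = credit
  balance: number;     // running balance after the transaction
}

// Assumed layout: "MM/DD  DESCRIPTION  [-]1,234.56  1,234.56"
const LINE_PATTERN =
  /^(\d{2}\/\d{2})\s+(.+?)\s+(-?[\d,]+\.\d{2})\s+(-?[\d,]+\.\d{2})$/;

// Strip thousands separators before converting to a number.
function toNumber(raw: string): number {
  return Number(raw.replace(/,/g, ""));
}

// Returns null for lines that are not transaction rows (headers, footers, blanks).
function parseLine(line: string): ParsedLine | null {
  const match = LINE_PATTERN.exec(line.trim());
  if (!match) return null;
  const [, date, description, amount, balance] = match;
  return {
    date,
    description,
    amount: toNumber(amount),
    balance: toNumber(balance),
  };
}
```

Because every row either matches the pattern or is rejected outright, a mismatch shows up as a missing row rather than a silently misread digit, which is the failure mode that makes generic OCR expensive to correct.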

The gap StatementSync fills: structure-aware parsing (not OCR, not AI), delivered via web with no setup, at a flat monthly rate. Three specific failures in the market, one specific fix.

How Was StatementSync Validated Before Any Code Was Written?

MicroSaaSBot's Researcher agent scored StatementSync 78 out of 100 before any code was written. Severity rated 8/10 — bookkeepers lose real billable time daily. Persona clarity rated 9/10. Willingness to pay rated 8/10 — users were already paying per-file competitors. Problems scoring below 60 are killed outright; 78 cleared the threshold with confidence.

Validation Before Code

MicroSaaSBot's validation phase scored StatementSync before I wrote a single line of code (the full system is described in the post "Introducing MicroSaaSBot"):

Criteria	Score	Notes
Problem Severity	8/10	Daily pain point, high time cost
Persona Clarity	9/10	"Freelance bookkeeper processing 50+ statements/month"
Market Size	7/10	Niche but clear demand
Willingness to Pay	8/10	Currently paying for inferior solutions
Overall	78/100	Proceed

Problems scoring below 60/100 get killed. No code written. This prevents building products nobody wants.

The validation phase confirmed:

Real people have this problem
They're already paying for solutions
Current solutions have clear weaknesses to exploit

The Week

Day 1-2: Deep Validation

MicroSaaSBot's Researcher agent dug deeper:

Competitive analysis (TextSoap, HappyFox, manual OCR tools)
Pricing research ($0.25-1.00 per file is standard)
Feature gap analysis (batch upload, bank-specific parsing)

Key insight: Flat-rate pricing would be a massive differentiator. Heavy users hate per-file fees.

Day 3: Architecture

MicroSaaSBot's Architect agent designed the system:

Frontend: Next.js 15 (App Router)
Auth: Clerk (handles signup, OAuth)
Database: Supabase PostgreSQL
Storage: Supabase Storage (PDFs, exports)
Payments: Stripe (subscriptions)
PDF Parsing: unpdf (serverless-compatible)
Hosting: Vercel

Critical decision: Pattern-based extraction instead of LLM inference.

LLM extraction would cost $0.01-0.05 per statement in API calls. Pattern-based extraction has no per-statement API cost at runtime. For a flat-rate product, this is the difference between profit and loss. Stripe's subscription billing docs make clear that sustainable flat-rate pricing only works when marginal cost per unit is negligible; otherwise usage spikes destroy margins.
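The margin argument can be sketched as simple arithmetic. The figures below reuse the article's numbers ($0.01-0.05 per statement for LLM calls, and the $19/month Pro price quoted later in the post); the `monthlyMarginCents` function and the usage volumes are hypothetical, and amounts are kept in integer cents to avoid floating-point noise.

```typescript
// Gross margin per subscriber per month, in cents, for a flat-rate plan.
function monthlyMarginCents(
  flatPriceCents: number,
  statementsPerMonth: number,
  costPerStatementCents: number,
): number {
  return flatPriceCents - statementsPerMonth * costPerStatementCents;
}

const FLAT_PRICE_CENTS = 1900; // $19/month Pro tier

// Pattern-based extraction: no per-statement API cost, margin is flat.
const patternHeavyUser = monthlyMarginCents(FLAT_PRICE_CENTS, 200, 0);

// LLM extraction at the top of the quoted range ($0.05 per statement).
const llmHeavyUser = monthlyMarginCents(FLAT_PRICE_CENTS, 200, 5);

// Volume at which the LLM-backed plan stops covering its API cost.
const llmBreakEvenStatements = FLAT_PRICE_CENTS / 5;

console.log({ patternHeavyUser, llmHeavyUser, llmBreakEvenStatements });
```

Under these assumed numbers, the LLM-backed margin shrinks with every statement a subscriber uploads and turns negative past 380 statements a month, while the pattern-based margin stays constant regardless of volume.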

Day 4-6: Implementation

MicroSaaSBot's Developer agent built:

Day 4: Auth flow, database schema, file upload

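// Prisma schema
// The enum definitions are not shown in the original excerpt; values other
// than FREE and PENDING are assumed here so that the schema validates.
enum Tier {
  FREE
  PRO
}

enum Status {
  PENDING
  COMPLETED
  FAILED
}

model User {
  id                   String   @id @default(cuid())
  clerkId              String   @unique
  email                String
  subscriptionTier     Tier     @default(FREE)
  conversionsThisMonth Int      @default(0)
  lastResetAt          DateTime @default(now())
}

model Conversion {
  id               String  @id @default(cuid())
  userId           String
  originalFileName String
  status           Status  @default(PENDING)
  extractedData    Json?
  excelPath        String?
  csvPath          String?
}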
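The `conversionsThisMonth` and `lastResetAt` fields suggest a monthly usage counter. Here is a minimal sketch of how a free-tier quota check might use them. The `UsageState` type, the `recordConversion` function, and the rule that the counter resets when the UTC calendar month changes are all assumptions; only the field names and the 3-conversion free limit come from the article.

```typescript
type Tier = "FREE" | "PRO";

// Mirrors the usage-related fields of the User model.
interface UsageState {
  subscriptionTier: Tier;
  conversionsThisMonth: number;
  lastResetAt: Date;
}

const FREE_MONTHLY_LIMIT = 3; // free-tier limit quoted in the article

// Assumption: the counter resets when the calendar month (UTC) changes.
function needsReset(lastResetAt: Date, now: Date): boolean {
  return (
    lastResetAt.getUTCFullYear() !== now.getUTCFullYear() ||
    lastResetAt.getUTCMonth() !== now.getUTCMonth()
  );
}

// Returns the updated usage state, or null when the free quota is exhausted.
function recordConversion(user: UsageState, now: Date): UsageState | null {
  const current: UsageState = needsReset(user.lastResetAt, now)
    ? { ...user, conversionsThisMonth: 0, lastResetAt: now }
    : user;

  const overFreeLimit =
    current.subscriptionTier === "FREE" &&
    current.conversionsThisMonth >= FREE_MONTHLY_LIMIT;
  if (overFreeLimit) return null;

  return { ...current, conversionsThisMonth: current.conversionsThisMonth + 1 };
}
```

Keeping the counter on the user row means the check needs no extra query over the conversions table; in a real deployment the read and increment would need to happen in one transaction so that concurrent uploads cannot slip past the limit.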

Day 5: PDF parsing engine

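import { extractText, getDocumentProxy } from "unpdf";

// Transaction, detectBank, and getParser are application code not shown in this excerpt.
async function extractTransactions(pdfBuffer: Buffer): Promise<Transaction[]> {
  // unpdf expects a Uint8Array rather than a Node Buffer
  const pdf = await getDocumentProxy(new Uint8Array(pdfBuffer));
  // mergePages returns the text of all pages as a single string
  const { text } = await extractText(pdf, { mergePages: true });

  // Pattern matching for supported banks
  const bank = detectBank(text);
  const parser = getParser(bank); // Chase, BofA, Wells, Citi, Capital One

  return parser.extract(text);
}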
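The `detectBank` and `getParser` helpers are not shown in the post. One way they could work is a marker lookup plus a parser registry, sketched below. The marker patterns, the `Bank` identifiers, and the registry argument to `getParser` are hypothetical, not StatementSync's actual detection logic; a production version would probably inspect only the statement header, since a transaction description can mention another bank by name.

```typescript
type Bank = "chase" | "bofa" | "wells" | "citi" | "capitalone";

interface Transaction {
  date: string;
  description: string;
  amount: number;
}

interface StatementParser {
  extract(text: string): Transaction[];
}

// Hypothetical markers. Word boundaries keep "chase" from matching "purchase".
const BANK_MARKERS: Record<Bank, RegExp> = {
  chase: /\bchase\b/i,
  bofa: /\bbank of america\b/i,
  wells: /\bwells fargo\b/i,
  citi: /\bciti(bank)?\b/i,
  capitalone: /\bcapital one\b/i,
};

// Returns the first bank whose marker appears in the text, or null.
function detectBank(text: string): Bank | null {
  for (const bank of Object.keys(BANK_MARKERS) as Bank[]) {
    if (BANK_MARKERS[bank].test(text)) return bank;
  }
  return null;
}

// Looks up the parser for a detected bank and fails loudly when none exists.
function getParser(
  bank: Bank | null,
  registry: Partial<Record<Bank, StatementParser>>,
): StatementParser {
  const parser = bank ? registry[bank] : undefined;
  if (!parser) {
    throw new Error(`Unsupported or unrecognized bank: ${bank ?? "unknown"}`);
  }
  return parser;
}
```

Failing loudly on an unrecognized statement is a deliberate choice in this sketch: an explicit "unsupported bank" error is easier to act on than a spreadsheet of partially parsed rows.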

Day 6: Export generation, Stripe integration, dashboard

Day 7: Deployment

MicroSaaSBot's Deployer agent:

Configured Vercel deployment
Set up Supabase production
Connected Stripe webhooks
Ran smoke tests

Live by end of day.

What Was the Hardest Technical Problem During the Build?

The hardest technical problem was PDF processing on Vercel serverless. pdf-parse depends on native canvas bindings that Vercel's runtime can't compile, causing production build failures at 2 AM on day five. Switching to unpdf — a pure JavaScript library built for serverless environments — resolved the issue in two hours and has worked reliably in production ever since.

The Technical Challenge

Problem: pdf-parse doesn't work on Vercel serverless.

pdf-parse has native dependencies that fail on Vercel's serverless runtime. I discovered this at 2 AM when the production build crashed.

Solution: Switch to unpdf.

unpdf is built for serverless from the ground up. No native dependencies, works perfectly on Vercel. The switch took 2 hours but saved the deployment.

If you're processing PDFs on Vercel, Netlify, or any serverless platform, use unpdf. Not pdf-parse. Save yourself the debugging.

The Product

StatementSync today:

Free Tier:

3 conversions/month
Single file upload
7-day history

Pro Tier ($19/month):

Unlimited conversions
Batch upload (20 files)
90-day history
Priority support

The $19/month flat rate is the differentiator. Process 50 statements? Same price. Process 200? Same price. Heavy users save money. Light users get simplicity. The full case for flat-rate over per-file pricing is in the post "flat-rate vs per-file SaaS pricing."
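How heavy a user has to be before the flat rate wins is simple arithmetic. The sketch below uses the $19/month Pro price and the $0.25-1.00 per-file range quoted earlier; the `breakEvenStatements` function is hypothetical and works in integer cents.

```typescript
// Smallest monthly statement count at which the flat plan is strictly
// cheaper than paying per file.
function breakEvenStatements(flatPriceCents: number, perFileCents: number): number {
  return Math.floor(flatPriceCents / perFileCents) + 1;
}

const FLAT_CENTS = 1900; // $19/month

const versusCheapest = breakEvenStatements(FLAT_CENTS, 25);   // $0.25 per file
const versusPriciest = breakEvenStatements(FLAT_CENTS, 100);  // $1.00 per file

console.log({ versusCheapest, versusPriciest });
```

With these inputs the flat plan is cheaper from 20 statements a month against a $1.00-per-file competitor and from 77 statements a month against a $0.25-per-file one, so the savings depend on which competitor a bookkeeper is switching from.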

Results

Metric	Value
Time to build	7 days
Processing time	3-5 seconds per statement
Extraction accuracy	99%
Supported banks	5 (Chase, BofA, Wells, Citi, Capital One)
Runtime cost per extraction	$0 (pattern-based)

What I'd Do Differently

Start with one bank - Supporting 5 banks day one was overkill. Start with Chase (most common), add others based on demand.
Skip the dashboard MVP - Users just want to upload and download. The fancy dashboard came before proving the core value.
Launch before Day 7 - Could have deployed a working version by Day 5 and iterated publicly.

The Validation Framework

The scoring rubric MicroSaaSBot used is replicable for any idea:

Severity (0–10): How much does this problem hurt? Daily annoyance = 7+. Occasional friction = below 5. Quantify the time or money cost if you can.

Persona clarity (0–10): Can you describe the person who has this problem in one sentence with specific attributes? "Freelance bookkeepers who process 50+ bank statements monthly" = 9. "People who work with documents" = 2.

Existing solutions (0–10): Are people already paying for something that partially solves this? Existing paid solutions mean validated willingness to pay. No solutions means either a greenfield opportunity or no demand—confirm which before proceeding.

Differentiation path (0–10): Do you have a specific unfair advantage or structural difference from what exists? StatementSync's pattern-based extraction enabling flat-rate pricing was the differentiation. A technically identical product with the same pricing would score a 3 here.

Threshold for go: 70+. Not 60—that was a typo in my notes that survived into documentation. At 60, the idea is salvageable but needs more validation work before coding.

Kill below 60. Build at 70+. The 10 points between 60 and 70 are where founders rationalize themselves into building things nobody wants.
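The rubric and thresholds can be sketched as a small decision function. The article does not say how the four 0-10 scores combine into a score out of 100, so the equal-weight scaling in `overallScore` is an assumption, as are the type and function names. The sketch treats a score of exactly 60 as salvageable, following the threshold paragraph above.

```typescript
type Decision = "build" | "validate-more" | "kill";

interface RubricScores {
  severity: number;          // 0-10
  personaClarity: number;    // 0-10
  existingSolutions: number; // 0-10
  differentiation: number;   // 0-10
}

// Assumption: equal weights, scaled from a 0-40 total to 0-100.
function overallScore(s: RubricScores): number {
  const total =
    s.severity + s.personaClarity + s.existingSolutions + s.differentiation;
  return Math.round((total * 100) / 40);
}

// 70+ is a go, 60-69 needs more validation, anything lower is killed.
function decide(score: number): Decision {
  if (score >= 70) return "build";
  if (score >= 60) return "validate-more";
  return "kill";
}
```

For example, `decide(78)` returns `"build"` for StatementSync's reported score, while an idea at 65 is sent back for more validation before any code is written.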

What Does Building Fast Actually Require?

Building fast requires validating before you code, making critical architecture decisions based on business constraints rather than familiarity, and launching before the product is perfect. MicroSaaSBot compressed weeks of research and implementation into seven days by eliminating the tedious execution grind — but the strategic decisions about pricing, persona, and architecture still required human judgment.

The Lesson

Building fast doesn't mean building sloppy. It means:

Validate before you code - Kill bad ideas early
Architecture matters - Pattern-based vs LLM extraction was the key decision
Launch before perfect - Iteration beats planning

MicroSaaSBot compressed weeks of work into days by handling the tedious parts automatically. I focused on the decisions that mattered.

StatementSync is proof that AI-assisted development can ship real products, not just demos.

The First Users

StatementSync's first five users came from a single Reddit comment in r/bookkeeping.

I described the problem—10+ hours transcribing bank PDFs monthly—and asked if anyone had found a good solution. Four people replied they'd tried existing tools but found them too expensive or too complex. One asked if I knew of a flat-rate option.

I shared the link. All five signed up within 24 hours. One converted to paid within 48 hours.

This is the signal the scoring rubric can't measure: whether real people in the target community respond to your positioning with recognition rather than explanation. "Finally" is a better signal than "that's interesting." Three of those first five users opened with some version of "finally."

The first month surfaced edge cases no amount of testing with sample statements would have caught. Chase statements processed correctly. Bank of America required a parser fix—their statement format changed in 2024 and the transaction date pattern didn't match. One user reported a missing transaction; it turned out to be a summary balance row that the parser was treating as a transaction.
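The summary-row bug is the kind of edge case a small post-processing filter can handle. The sketch below is an assumption about how such a fix might look: the `Row` type, the label patterns, and both function names are hypothetical, and the real list of labels would come from the statements that produced the false positives.

```typescript
interface Row {
  description: string;
  amount: number;
}

// Hypothetical labels for rows that summarize balances rather than record a transaction.
const SUMMARY_LABELS: RegExp[] = [
  /^beginning balance\b/i,
  /^ending balance\b/i,
  /^total (deposits|withdrawals)\b/i,
];

// True when the row's description starts with a known summary label.
function isSummaryRow(row: Row): boolean {
  const description = row.description.trim();
  return SUMMARY_LABELS.some((pattern) => pattern.test(description));
}

// Removes summary rows while preserving the order of real transactions.
function dropSummaryRows<T extends Row>(rows: T[]): T[] {
  return rows.filter((row) => !isSummaryRow(row));
}
```

Anchoring each pattern at the start of the description keeps the filter from discarding a real transaction that merely mentions a balance somewhere in its text.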

Each edge case became a specific parser improvement. After six weeks, accuracy was effectively 100% for the five supported banks.

The lesson: building fast puts you in front of real users quickly. Real users reveal the edge cases that exist in the messy real world, not the controlled sample PDFs you test against. The 7-day build wasn't the end of the project—it was the beginning of the iteration loop that actually made the product good.

