This is part 3 of a series on building MetaBulkify, a Shopify app for bulk editing metaobjects via CSV.
- Part 1: Excel data corruption and GID/handle resolution
- Part 2: Shopify platform traps (scopes, throttling, dev store billing)
This post is about two things: how I build software as a solo developer with AI, and how I designed pricing to let users try the app properly before committing.
How I actually build things
I have a day job. I code side projects on evenings and weekends. Time is the scarcest resource.
My workflow: I write detailed spec documents, then delegate implementation to Claude Code. Not prompts. Not "build me an app." Actual specs — 200+ line instruction documents covering data models, edge cases, error handling, security considerations.
Then Claude Code writes the code, and I review every line.
This is not "AI wrote my app." I make every architecture decision:
- Which GraphQL mutations to use
- How billing works
- Where to put the usage tracking table
- What happens when Cloud Tasks retries a failed job
- Whether to fail-open or fail-closed on API errors
Claude Code is a fast pair programmer who never gets tired and never pushes back when I ask for a third rewrite.
What a spec doc looks like
For the billing system, my spec included:
- A plan comparison table (Free vs Pro, exact limits)
- Prisma schema for the usage tracking table
- The exact GraphQL query for subscription detection
- Error handling rules (fail-closed, count attempts not successes)
- Idempotency requirements (Cloud Tasks retries must not double-count)
- UI mockups in pseudocode (which Polaris Web Components to use, where to show banners)
- Test steps (including "manually set DB value to 9999 and verify block")
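To make the "fail-closed, count attempts not successes" rules concrete, here is a minimal TypeScript sketch. All names, types, and the exact limits are illustrative, not MetaBulkify's real code:

```typescript
// Hypothetical quota check; names and behavior are illustrative, not the app's real code.
type Plan = "free" | "pro";

const ROW_LIMITS: Record<Plan, number> = {
  free: 50,     // lifetime import rows on Free
  pro: 10_000,  // import rows per month on Pro
};

// Fail-closed: if usage can't be read, deny the import rather than allow it.
function canImport(plan: Plan, usedRows: number | null, requestedRows: number): boolean {
  if (usedRows === null) return false; // usage lookup failed -> block
  return usedRows + requestedRows <= ROW_LIMITS[plan];
}

// Count attempts, not successes: charge the requested row count up front,
// so a failed import still consumes quota and retries can't bypass the limit.
function recordAttempt(usedRows: number, requestedRows: number): number {
  return usedRows + requestedRows;
}
```

Charging the full requested row count before the import runs is what makes retrying a failed import unable to sneak past the quota.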
Claude Code implemented the whole thing. I reviewed, found three issues, sent detailed feedback. Claude Code revised. I reviewed again, found one more issue, sent feedback. Third revision was clean. Shipped.
Three review rounds for billing alone. The bugs I caught:
- Counting basis mismatch — blocking based on total rows, but recording only successful rows. This lets users retry failed imports to bypass the quota.
- No idempotency guard — Cloud Tasks retries could double-count usage. Fixed with an atomic `updateMany` WHERE clause.
- Soft cap instead of hard cap — the Pro tier only warned when its quota was exceeded. Changed to a hard block because a limit you don't enforce isn't a limit.
Each of these was a product decision, not a coding error. Claude Code implemented what I asked for. I asked for the wrong thing. The review process caught it.
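The idempotency fix is easiest to see as a conditional "claim, then count" step. This sketch uses an in-memory set as a stand-in for the database; in the real pattern the claim is a single conditional UPDATE (with Prisma, an `updateMany` whose WHERE clause matches only unprocessed tasks, verified via the affected-row count), so the atomicity comes from the database, not from application code:

```typescript
// Sketch of the idempotency guard (illustrative; a real version relies on the
// database doing this check-and-set atomically, not an in-memory Set).
const claimedTasks = new Set<string>(); // stand-in for a "processed" flag in a DB table
let usedRows = 0;

// Returns true only for the first delivery of a given Cloud Tasks task ID;
// retries of the same task become no-ops instead of double-counting usage.
function recordUsageOnce(taskId: string, rows: number): boolean {
  if (claimedTasks.has(taskId)) return false; // retry: already counted
  claimedTasks.add(taskId);
  usedRows += rows;
  return true;
}
```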
The spec-driven advantage
Writing specs before coding forces you to think about edge cases upfront. "What happens if the import worker crashes after recording usage but before updating the job status?" is a question that naturally comes up when writing the spec. It rarely comes up when coding.
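One common answer to that worker-crash question, sketched here purely as an illustration and not necessarily how MetaBulkify does it, is to make the two writes a single atomic step so they become visible together or not at all:

```typescript
// Illustrative only: record usage and mark the job done as one atomic step.
// With a real database this would be a single transaction (e.g. Prisma's
// $transaction), so a crash can't leave usage recorded but the job "pending".
interface JobState {
  usedRows: number;
  jobStatus: "pending" | "done";
}

function completeImport(state: JobState, rows: number, simulateCrash = false): JobState {
  // Build the next state first; nothing is "committed" until we return it.
  const next: JobState = { usedRows: state.usedRows + rows, jobStatus: "done" };
  if (simulateCrash) throw new Error("worker crashed before commit");
  return next; // both updates become visible together
}
```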
It also creates a paper trail. When something breaks in production, I can look at the spec and see whether the implementation deviated from the plan, or whether the plan itself was wrong.
And honestly, writing specs is faster than writing code. A 200-line spec takes me 2-3 hours. The implementation takes Claude Code 20 minutes. My review takes another hour. Total: ~4 hours for a feature that would take me 2-3 days to code manually.
Pricing design: letting users try before they commit
Bulk operation apps have a specific challenge. People don't use them every day — they need them when they need them. The pricing has to reflect that.
My solution:
| | Free | Pro ($9.99/mo) |
|---|---|---|
| CSV Export | Unlimited | Unlimited |
| CSV Import | 50 rows (lifetime) | 10,000 rows/month |
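The two plans also imply different bookkeeping: Free is a lifetime counter, Pro is a per-month counter. A sketch with hypothetical names, using the limits from the table above:

```typescript
// Illustrative bookkeeping for the two quota shapes (hypothetical names):
// Free counts rows over the app's lifetime; Pro counts rows per calendar month.
interface Usage {
  lifetimeRows: number;                // Free: never resets
  monthlyRows: Record<string, number>; // Pro: keyed by "YYYY-MM"
}

function monthKey(d: Date): string {
  return `${d.getUTCFullYear()}-${String(d.getUTCMonth() + 1).padStart(2, "0")}`;
}

function remainingRows(plan: "free" | "pro", usage: Usage, now: Date): number {
  if (plan === "free") return Math.max(0, 50 - usage.lifetimeRows);
  const usedThisMonth = usage.monthlyRows[monthKey(now)] ?? 0;
  return Math.max(0, 10_000 - usedThisMonth); // Pro's window resets each month
}
```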
The thinking behind each decision:
Exports are free and unlimited. I want people to try the app with zero risk. Export your data, check the CSV format, see if it fits your workflow. If it doesn't work for you, you haven't paid anything.
50 free import rows. Enough to run a real test — pick a metaobject type, import a few entries, verify everything works in your store. You can confirm the app handles your field types, reference fields, and data format correctly before spending anything.
No free trial on Pro. Instead of a time-limited trial where you're racing the clock, the free tier gives you a permanent sandbox. Take your time. Test on Monday, come back Thursday, test again. No pressure.
$9.99/month for Pro. This is a focused tool that does one thing. The price reflects that — it's not trying to be a Swiss Army knife.
The numbers
Real talk:
- Revenue: $9.99 (one subscription — they uninstalled after 2 hours due to a billing detection bug, not a pricing problem)
- Infrastructure cost: effectively $0/month at current scale
- Development time: a few weeks of evenings and weekends
- Spec docs written: probably more words than actual code
Tech stack
For the curious:
- React Router v7 + Shopify App Remix
- Polaris Web Components (`<s-banner>`, `<s-button>` — not the React library)
- Neon Postgres + Prisma
- GCP Cloud Run + Cloud Tasks
- JSON-based i18n (English + Japanese)
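JSON-based i18n here means, roughly, per-locale dictionaries with an English fallback. A minimal sketch (the keys and strings are made up, not the app's actual message catalog):

```typescript
// Minimal sketch of JSON-based i18n with an English fallback.
// Keys and strings are made up, not the app's actual message catalog.
const messages: Record<string, Record<string, string>> = {
  en: { "import.quotaExceeded": "Import quota exceeded" },
  ja: { "import.quotaExceeded": "インポートの上限に達しました" },
};

// Look up a key for a locale; fall back to English, then to the key itself.
function t(locale: string, key: string): string {
  return messages[locale]?.[key] ?? messages.en[key] ?? key;
}
```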
The infrastructure cost is near zero at current scale. When that changes, it'll mean the app is making money — which is a good problem to have.
What I'd do differently
- Test with Excel-edited CSVs from day one. The normalization issue (Part 1) would have surfaced immediately.
- Test billing on a real store, not a dev store. The dev store billing trap (Part 2) cost me a customer.
- Ship with fewer features. I18n could have waited. The dry-run preview was essential; everything else was nice-to-have.
- Start marketing before the app is "done." It's never done. I should have posted in Shopify community forums and dev.to weeks earlier.
Try MetaBulkify
Export and import Shopify metaobjects via CSV with dry-run preview.
→ MetaBulkify on the Shopify App Store
Free plan includes unlimited exports and 50 import rows.
I'm the developer — questions and feedback welcome in the comments.
