DEV Community

Pawel Jozefiak

Posted on • Originally published at thoughts.jock.pl

When AI Meets Reality (Ep. 3) — The Failed App Experiment, $355 in 3 Weeks, and Local AI Catches Up

The failed experiment that changed how I think about AI monetization.

I told my agent to build one useful app per day. For three weeks it built unit converters, color pickers, and countdown timers. Technically correct. Completely useless. Nobody came.

The problem wasn't the execution. The execution was fine. The problem was that when execution costs drop to near zero, execution stops being the advantage. I was automating the wrong thing.

Three shifts that followed:

From apps to experiments. Instead of "build me a useful tool," I started giving specific creative direction: what the experience should feel like, what problem it solves for a specific person, what makes it interesting. One of those experiments reached #3 on Hacker News. The others are still sitting there. The difference between them isn't technical.

From building to packaging knowledge. Once execution is cheap, the new bottleneck is packaging. Most people with real expertise can't monetize it because turning knowledge into products is hard. AI agents handle the packaging: the course structure, the landing page, the email sequence. Within three weeks of redirecting the agent from building apps to packaging knowledge, I hit $355 in revenue against $400/month in AI costs. Not profit. But close enough to prove the model.

Local AI caught up faster than expected. I ran Qwen 3.5 9B on my MacBook and my iPhone without any internet connection. Both worked. The gap between cloud and local models is closing faster than the benchmarks suggest. What runs locally in late 2025 would have been cloud-only a year ago.
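For anyone who wants to try this themselves: the easiest path to running a local model offline is Ollama's local HTTP API, which listens on port 11434 by default. Here's a minimal sketch using only the Python standard library. The model tag is illustrative; substitute whichever Qwen build you've pulled for your hardware.

```python
import json
import urllib.request

# Ollama's default local endpoint; no internet connection required
# once the model weights are pulled.
OLLAMA_URL = "http://localhost:11434/api/generate"


def build_payload(model: str, prompt: str) -> dict:
    """Build the JSON body for a single, non-streaming generation request."""
    return {"model": model, "prompt": prompt, "stream": False}


def generate(model: str, prompt: str) -> str:
    """Send a prompt to a locally running model and return its response text."""
    body = json.dumps(build_payload(model, prompt)).encode()
    req = urllib.request.Request(
        OLLAMA_URL, data=body, headers={"Content-Type": "application/json"}
    )
    with urllib.request.urlopen(req) as resp:
        # With stream=False, Ollama returns one JSON object with a
        # "response" field containing the full completion.
        return json.loads(resp.read())["response"]


if __name__ == "__main__":
    # Model tag is a placeholder -- use whatever local build you run.
    print(generate("qwen2.5:7b", "Summarize: local models now run offline."))
```

The same endpoint works unchanged whether the model lives on a laptop or a phone-class device, which is what makes the cloud/local gap feel smaller in practice than benchmarks imply.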

The central insight across all three: AI does exactly what you direct it to do. With bad direction, you get unit converters. With specific human taste and vision, you get something that earns attention or revenue.

The real bottleneck was never the AI. It was having something worth building.

Full episode (audio + transcript): https://thoughts.jock.pl/p/when-ai-meets-reality-ep3

Newsletter on AI agents and practical automation: https://thoughts.jock.pl

Top comments (1)

Apex Stack

The "packaging knowledge" shift is the key insight here. Once execution costs collapse, the real advantage moves to what you know and how you structure it — not whether you can build it. I went through a similar realisation running a programmatic SEO site with 8,000+ stock tickers: the content generation pipeline (Ollama + Qwen 3.5 9B) was almost trivial to build, but the actual value was in figuring out which signals matter for each page type and structuring the prompts around that knowledge.

The Qwen 3.5 9B point resonates — we run it locally for batch content generation across 12 languages and the quality-to-cost ratio is genuinely difficult to argue against for structured, domain-specific tasks.

What was different about the experiment that hit #3 on HN? I'm curious whether it was the problem clarity (specific person, specific friction) or something about the presentation/timing. The gap between "technically correct but nobody came" and "reached top of HN" seems like the most interesting thing in this post.