DEV Community

Oleksii Sytar

How we built OUTRANKgeo: an AI search visibility tracker built by AI agents

When I started WWG, I had a specific bet: could a software company be run almost entirely by AI agents?

Not "AI-assisted" — actually run by agents. CEO, CTO, CMO, engineers, QA, marketing. Each with a role, a task queue, and a heartbeat schedule. I'd check in once a day.

OUTRANKgeo is the first product that came out of this experiment. And the way it got built is at least as interesting as what it does.

What OUTRANKgeo does

OUTRANKgeo tracks your brand's visibility in AI-generated search responses — specifically ChatGPT and Claude.

The problem it solves: 60% of searches now end without a click. AI search makes this worse — there's no list of results, just one synthesized answer. Either your brand is in that answer or it isn't. We built OUTRANKgeo because we couldn't find a tool that told us where we stood.

Enter a brand or URL. The tool runs 5+ queries in your category across ChatGPT and Claude, scores your AI visibility, and shows which competitors appear where you don't.
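To make the scoring step concrete, here's a minimal TypeScript sketch. The `ScanResult` type, `visibilityScore` name, and the 0–100 mention-share scale are all illustrative assumptions, not OUTRANKgeo's actual implementation:

```typescript
// Each scan produces one record per (query, model) pair.
type ScanResult = { query: string; model: string; answer: string };

// Score visibility as the percentage of AI answers that mention the brand.
// (Illustrative only: the real product scores more than raw mentions.)
function visibilityScore(brand: string, results: ScanResult[]): number {
  if (results.length === 0) return 0;
  const mentions = results.filter((r) =>
    r.answer.toLowerCase().includes(brand.toLowerCase())
  ).length;
  return Math.round((mentions / results.length) * 100);
}
```

The same per-answer records can then be grouped by competitor name to show which brands appear in answers where yours doesn't.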

Stack: Next.js (frontend), Supabase (database + auth), Railway (worker service), GCP, Vercel (deployment)

Free scan: https://outrankgeo.com — no credit card, results in minutes

How it was built: the AI-agent company architecture

The build team was 11 AI agents running on Paperclip — an agentic work management system. Each agent has:

  • A defined role (CEO, CTO, Code Reviewer, QA Engineer, Content Marketer, etc.)
  • A task inbox
  • A heartbeat schedule (wakes up, works, reports, sleeps)
  • A budget (Claude API costs)

Agents communicate via task comments. When the CTO is blocked, they create a task for the CEO. When code needs review, it gets routed to the Code Reviewer agent. No Slack. No meetings. No standups.
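A rough TypeScript sketch of that setup, assuming hypothetical type and field names (Paperclip's real schema isn't shown here):

```typescript
// A task carries its own discussion thread; comments replace chat.
type Task = { id: number; title: string; comments: string[] };

type Agent = {
  role: string;          // e.g. "CTO", "Code Reviewer", "QA Engineer"
  inbox: Task[];         // the agent's task queue
  heartbeatCron: string; // wake up, work, report, sleep
  budgetUsd: number;     // remaining Claude API budget
};

// One heartbeat tick: take the next task and report progress via a comment.
function heartbeat(agent: Agent): string {
  const task = agent.inbox.shift();
  if (!task) return `${agent.role}: idle`;
  task.comments.push(`${agent.role}: picked up "${task.title}"`);
  return `${agent.role}: working on "${task.title}"`;
}

// Blocked? Don't message anyone: create a task in the other agent's inbox.
function escalate(blocked: Agent, target: Agent, reason: string): void {
  target.inbox.push({
    id: Date.now(),
    title: `Unblock ${blocked.role}: ${reason}`,
    comments: [],
  });
}
```

The design choice worth noting is that every handoff leaves a written trail on a task, which is what lets agents with no shared chat history still coordinate.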

What actually worked

The agents shipped a functional product. That's the headline. Code was written, reviewed, tested, and deployed without human engineers. The CI/CD pipeline ran. Bugs were caught in QA.

The task management system (Paperclip) was the critical layer. Without structured task handoffs, agents would have lost context and duplicated work constantly. With it, they could operate across sessions with reasonable continuity.

GEO scan accuracy was validated by the QA agent running real test queries and comparing outputs. The Happy Path — sign up → add brand → run scan → see results — was verified before launch.

What didn't work (yet)

Agent memory is limited. Each heartbeat is a fresh context window. Agents sometimes repeat analysis they've already done. We're working on better memory layers.
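One mitigation direction, sketched in TypeScript: persist a short, bounded summary per agent between heartbeats and prepend it to the next fresh context window. The store and function names below are illustrative, not our production memory layer:

```typescript
// In-memory stand-in for a persistent store (e.g. a database table).
const memory = new Map<string, string[]>(); // agent role -> summary notes

// Record a note after each heartbeat, keeping memory bounded so it
// always fits at the top of a fresh context window.
function remember(role: string, note: string): void {
  const notes = memory.get(role) ?? [];
  notes.push(note);
  memory.set(role, notes.slice(-10));
}

// Rebuild the agent's working context at the start of the next heartbeat.
function recall(role: string): string {
  return (memory.get(role) ?? []).join("\n");
}
```

Even a cap this crude stops the worst repetition, because an agent can check its recalled notes before redoing analysis.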

Context loss between sessions. Complex decisions sometimes need to be reconstructed from task comments. Longer tasks require careful documentation or agents drift.

Confident wrongness. The worst failure mode: an agent making a definitive-sounding decision that's subtly incorrect. We added more review checkpoints to catch these.

The architecture decision I'd make differently

I'd build memory and context as a first-class system earlier. The agents work well on discrete tasks. They struggle with continuity across many sessions of a complex project. This is solvable — we just underinvested in it early.

Where this goes

OUTRANKgeo is the proof of concept. If an AI-agent team can ship a SaaS product that works and gets real users, the cost structure of software companies changes fundamentally. We're running that experiment live, in public.

Try the product: https://outrankgeo.com
Follow the build: updates coming to LinkedIn and here.

Questions welcome — happy to go deep on the agent architecture, the Paperclip system, or the GEO/AI visibility problem.

