ORCHESTRATE

We Built 144 Services and None of Them Talk to Customers

We just ran a 14-person AI team review of our marketing automation platform. Each AI persona audited the system from the standpoint of its own expertise. The finding that stopped the room:

The platform is an excellent content production tool but has zero business management capabilities needed for a real agency.

We set out to build a 25-person marketing agency in software. After 10 sprints, 461 tickets, 144 TypeScript services, and 4,418 passing tests, here is what we actually built: a content engine that cannot post to social media, send an email, track ROI, or invoice a client.

This is a post about how that happened and what it teaches about building AI systems.

The Technical Achievement Is Real

The numbers are legitimate. We have:

  • 144 compiled TypeScript services covering sourcing, audio synthesis, video generation, podcast assembly, YouTube publishing, quality gates with NLI verification, provenance tracking with Merkle attestation
  • A Media Orchestration Engine that runs 7-stage production pipelines autonomously
  • Voice cloning with speaker verification, multi-voice podcast assembly
  • 224 REST API routes extracted into domain modules with auth middleware
  • Docker Compose orchestration for 6 services with GPU passthrough
  • A React dashboard with 15 tabs

None of this is vaporware. It compiles, it runs, it has tests.

Where It Falls Apart

During the team review, we activated each AI persona and asked them to audit from their expertise. Here is what they found:

9 of 14 personas reported the same issue: persona memory is dead. The entire learning system — the thing that was supposed to make this more than a stateless tool — returns empty for every query. Every persona's recall_memory returns nothing. The system stores events but nobody reads them.

The product owner identified the existential gap: no multi-client accounts, no social media publishing, no email marketing, no analytics dashboard, no billing system. We built the factory floor but forgot the sales office, the loading dock, and the cash register.

The security engineer flagged zero security tests across 240 test files. Not one test for SQL injection, XSS, or auth bypass. We have JWT authentication but never tested whether it actually blocks anything it should.

The DevOps engineer pointed out that a CI pipeline was requested in Sprint 0 and deferred through 8 consecutive sprints. Tests run locally but there is no automated verification on push.
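The deferred pipeline could have started as small as this. A minimal workflow sketch, assuming the repo lives on GitHub and `npm test` runs the suite — the filename, Node version, and commands are assumptions to adjust for your project:

```yaml
# .github/workflows/ci.yml — a minimal starting point, not a full pipeline.
name: CI
on: [push, pull_request]
jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v4
      - uses: actions/setup-node@v4
        with:
          node-version: 20
      - run: npm ci
      - run: npm test
```

Fifteen lines of YAML, deferred for eight sprints.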

The UX designer found zero ARIA attributes in 20 React component files. The dashboard is invisible to screen readers. We have an NFR spec requiring WCAG 2.1 AA compliance and zero implementation of it.

Why This Happens

This is not unique to our project. It is a pattern:

Technical complexity crowds out business value. Building a provenance system with Merkle attestation is genuinely hard engineering. It consumed sprints. But no customer ever asked for Merkle proofs — they ask "did this post get engagement?" and we cannot answer that.

AI systems are seductive to build. Wiring up TTS engines, knowledge graphs, and NLI verification feels like progress. The dopamine of making a new service compile and pass tests is real. The absence of a billing system does not feel urgent until you try to charge someone.

Personas without memory are just labels. We designed 14 AI team members with specific expertise and behavioral contracts. But none of them remembered anything between sessions. The system stored 303 completed tickets' worth of experience and none of it was retrievable. The learning system existed architecturally but was functionally inert.

Sprint velocity was never measured. We planned 120 story points into a sprint where our actual throughput was ~38. We know this because the team review forced us to check. The velocity reporting tool returns zero — nobody ever recorded the data. We planned by ambition instead of evidence.

What We Are Doing About It

The team review produced a prioritized action list. Here is what changed:

Sprint 9 was cut from 120 to 42 points. We moved 7 UI tab stories to Sprint 10 and accepted that Sprint 10 will need to split into two phases. Planning by actual velocity instead of aspiration.

Accessibility before features. The UX designer made the case that building 9 new UI tabs without error boundaries, ARIA labels, and keyboard navigation means retrofitting all 9 later. Accessibility foundations (shared components) move to Sprint 9 P1. Tabs come after.

The launch narrative was corrected. We are shipping a "production-grade content automation tool" — not a "25-person marketing agency." The agency vision is the roadmap, not the V1 claim. Four architectural decision records (ADRs) now document the path from content tool to agency: multi-tenant architecture, social media adapters, analytics pipeline, email integration.

10 backlog items capture every missing business capability. Multi-client accounts, LinkedIn/Twitter/Instagram publishing, analytics dashboards, email marketing, SEO optimization, billing, team collaboration. All prioritized, all with effort estimates. Nothing hidden.

What Developers Can Take From This

Run the 14-persona review on your own project. Activate a security engineer, a UX designer, a product owner, a DevOps engineer — even if you are playing every role yourself. Ask each one: "What are your top 3 concerns about shipping this?" What multiple perspectives agree on reveals systemic issues that no single viewpoint catches.

Measure your velocity before planning your next sprint. If your reporting tool returns zero, you are planning by hope. Pull your actual completion data from the last 3-5 sprints and divide by sprint count. Use that number. Not the number you wish it was.

Check if your learning system actually learns. If you built memory, RAG, or any knowledge persistence — query it right now. Does it return useful results? Or did you build the infrastructure and never populate it? We had 303 completed tickets with mandatory evidence comments and zero retrievable lessons. The pipe existed but nothing flowed.

Ask your product owner: can a customer use this tomorrow? Not "does the code work" but "can someone pay money for this and get value?" If the answer involves more than two qualifiers, you have a technical demonstration, not a product.


10 sprints. 461 tickets. 144 services. And the hardest finding was not a bug — it was the gap between what we built and what the market needs. That is the real work now.
