Sprint 2 Preview: Five Decisions That Will Shape the Platform
The Journey So Far
This is the third post in a series documenting the ORCHESTRATE Marketing Platform build — an AI-agent-driven project where every line of code is written through Documentation-Driven Test-Driven Development (DD TDD), enforced by an MCP server that mechanically blocks methodology violations.
Sprint 0 was pure infrastructure. No features. No UI. Just the foundation: V2 migration bridge with SHA-256 audit trail, backup manager with integrity verification, credential rotation scheduler, Docker multi-service deployment, and SQLite databases with WAL mode. 18 tickets completed. 468 tests. The AI posted a blank coffee mug to LinkedIn, failed spectacularly, and we wrote about it honestly.
Sprint 1 built on that foundation with 17 tickets across 5 stories:
- CI/CD & Test Infrastructure (OAS-090): Service lifecycle management, Docker Compose V3 validation, V2-to-V3 migration verification
- Quality Gates & Standards (OAS-091): Dev.to skip guards for sensitive data, the `Result<T,E>` type for explicit error handling, startup validation
- Infrastructure Health & Monitoring (OAS-092): Database manager, integration test framework, Health.tsx dashboard component
- Publishing Reliability (OAS-076): Dev.to API verification, blog format compliance, publish-repair pipeline
- Sprint 1 Retrospective (OAS-074): Work log, persona context review, turn-based ceremony with 11 AI personas, summary and decisions, blog post
Test count climbed from 468 to 925+. Zero blocked items across both sprints.
What the Platform Does Today
The ORCHESTRATE Marketing Platform V2 is a 102-tool MCP server managing content across 4 LinkedIn pages (ORCHESTRATE Method, Run On Rhythm, LEVEL UP, I am HITL). The full capability set:
- LinkedIn: Post lifecycle, scheduling, analytics, engagement tracking, comment moderation, resharing, approval queues
- Printify: Product creation, mockup generation, storefront publishing, order tracking
- Reddit: Search, post, comment, vote across subreddits
- Dev.to: Article CRUD, search, publishing pipeline with format validation
- Replicate: AI image generation for visuals and product designs
- KDP Sales: Amazon book sales import, royalty tracking, marketplace analytics
- UTM/Attribution: Tracked short links, click recording, post-to-sale correlation
- Content Calendar: AI-generated weekly plans following 40/20/20/20 content mix
- A/B Testing: Split tests between post variants with auto-measurement
- Audience Segmentation: Auto-tagging (Champions, Regulars, Engaged, New) based on interaction frequency
- Competitor Monitoring: Cross-platform content discovery and relevance scoring
- Performance Learning: Hook analysis, posting time optimization, hashtag effectiveness with confidence scores
- Newsletter: Mailchimp-connected campaigns with auto-generated content
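The 40/20/20/20 content mix behind the calendar is easy to sketch as slot allocation. This is a hypothetical illustration, not the platform's actual planner: the category names and the `allocateSlots` function are assumptions.

```typescript
// Hypothetical sketch: allocate weekly post slots by a 40/20/20/20 mix.
// Category names are illustrative, not the platform's actual taxonomy.
type Mix = Record<string, number>;

const CONTENT_MIX: Mix = { educational: 0.4, promotional: 0.2, engagement: 0.2, curated: 0.2 };

function allocateSlots(totalSlots: number, mix: Mix): Record<string, number> {
  const floors = Object.entries(mix).map(([cat, share]) => ({ cat, exact: totalSlots * share }));
  const result: Record<string, number> = {};
  let used = 0;
  // Floor each share, then hand leftover slots to the largest fractional parts.
  for (const f of floors) {
    result[f.cat] = Math.floor(f.exact);
    used += result[f.cat];
  }
  const byRemainder = [...floors].sort((a, b) => (b.exact % 1) - (a.exact % 1));
  for (let i = 0; used < totalSlots; i++, used++) {
    result[byRemainder[i % byRemainder.length].cat]++;
  }
  return result;
}

console.log(allocateSlots(10, CONTENT_MIX)); // 10 slots → 4/2/2/2
```

The largest-remainder step matters: with 7 slots, naive flooring would only place 5 posts and silently under-fill the week.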
All of this runs in a single Docker container: Express API server, React UI, SQLite databases, MCP server, and a built-in scheduler. One `docker compose up -d --build` and the entire platform is running.
What Failed and What We Learned
The honest failures:
Duplicated test fixtures. The same V2 test data (pages, posts, activity lines, API key patterns) is copy-pasted across 3+ test files. A change to fixture format means updating every copy. This is real technical debt that slows down every new test file.
Inconsistent error handling. Some services use `try-catch`, others use the new `Result<T,E>` type. The boundary between the two patterns is implicit — you have to read the code to know which pattern a service uses. No documentation, no lint rule, no convention entry.

No migration framework. Database schema changes are manual. There's no version tracking, no forward-only migration chain, no way to know which schema version a given database is at. This blocks any multi-environment deployment.

Startup validator is a black box. It runs checks and logs pass/fail, but the events aren't structured. The health dashboard can't consume them. There's no auto-refresh, no way to see when the last check ran, no pause/resume.

String-concatenated paths. Windows path separators broke tests that used `'data/' + filename` instead of `path.join('data', filename)`. We found and fixed instances in Sprint 1, but there's no lint rule preventing new ones.

The AI said "last week" when it meant "earlier today." Our Performance Learning Memory engine was being built in the same session where the AI demonstrated exactly why it was needed. We wrote about that too.
The lessons that stuck:
- DD TDD ceremony produces auditable evidence even for tickets where no code change is needed. The process works for "verify and document" just as well as "build and test."
- Preserving disagreements verbatim in retro ceremonies leads to better-scoped decisions than forced consensus.
- The MCP enforcement model (block violations mechanically, don't warn) eliminates an entire class of "we meant to follow the process" failures.
Sprint 2: Five Retro Decisions
The Sprint 1 retrospective produced five decisions that define Sprint 2's scope. Each emerged from a specific failure or friction point observed during delivery.
D1: Shared Utilities (HIGH)
Owner: Tess Ter (QA Engineer)
Consolidate duplicated test fixtures into shared modules:
- `tests/utils/devto-test-utils.ts` — shared rate-limit handling for Dev.to API tests
- `tests/fixtures/sensitive-patterns.ts` — API key format samples used across skip-guard, format validation, and integration tests
Why: Three test files currently copy-paste the same fixture data. When we added underscore detection to API key patterns (a bug TDD caught in Sprint 1), we had to update the same data in multiple places. Shared fixtures mean one update, zero drift.
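A shared fixture module along the lines of D1 might look like this. It's a sketch: the exported names, regex shapes, and sample strings are assumptions, not the platform's real patterns.

```typescript
// tests/fixtures/sensitive-patterns.ts (sketch — names and formats are illustrative).
// One place to update when a key format changes, e.g. adding underscore detection.
export const SENSITIVE_PATTERNS = {
  // Regexes matching API-key-shaped strings that must never reach a published post.
  devtoApiKey: /\b[A-Za-z0-9_]{24}\b/,
  linkedinToken: /\bAQ[A-Za-z0-9_-]{20,}\b/,
};

// Sample strings the skip-guard, format-validation, and integration tests all share.
export const SAMPLE_KEYS = {
  withUnderscore: "abc_def_ghi_jkl_mno_pqrs", // 24 chars — the underscore-detection case
  plain: "abcdefghijklmnopqrstuvwx",          // 24 plain word characters
};
```

With this in place, the Sprint 1 underscore fix would have been one edit to one file instead of a hunt across three.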
D2: Result Type Migration (MEDIUM)
Owner: Archi Tect (Principal Solution Architect)
Convert internal services from `try-catch` to `Result<T,E>`:
- `database-manager.ts` → Result pattern
- `startup-validator.ts` → Result pattern
- HTTP-boundary services (`devto-api.ts`) stay with `try-catch`
Why: The `Result<T,E>` type introduced in Sprint 1 (OAS-091) makes error handling explicit — callers must check `isOk()` or `isErr()` before accessing values. But only new services use it. Migrating the two most critical internal services creates a clear boundary: internal = Result, HTTP = try-catch. An ADR will document this boundary.
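A minimal version of such a Result type, for readers unfamiliar with the pattern. This is a sketch, not the OAS-091 implementation — the platform exposes `isOk()`/`isErr()` as methods, while this sketch uses a standalone type guard.

```typescript
// Minimal Result<T, E> sketch: errors become values the caller must inspect.
type Result<T, E> =
  | { ok: true; value: T }
  | { ok: false; error: E };

const Ok = <T>(value: T): Result<T, never> => ({ ok: true, value });
const Err = <E>(error: E): Result<never, E> => ({ ok: false, error });

// Type guard: narrows the union so TypeScript knows `value` is accessible.
function isOk<T, E>(r: Result<T, E>): r is { ok: true; value: T } {
  return r.ok;
}

// Example: a lookup that can fail, with no thrown exception in sight.
function findPage(id: string): Result<{ name: string }, string> {
  const pages: Record<string, string> = { "orc-1": "ORCHESTRATE Method" };
  const name = pages[id];
  return name ? Ok({ name }) : Err(`no page with id ${id}`);
}

const hit = findPage("orc-1");
if (isOk(hit)) console.log(hit.value.name); // "ORCHESTRATE Method"
```

The payoff is that forgetting to handle the error path becomes a compile error rather than a runtime surprise.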
D3: Migration Framework (HIGH)
Owner: Query Quinn (Database Administrator)
Forward-only numbered SQL migrations with version tracking:
- Migration files: `001_initial.sql`, `002_add_index.sql`, etc.
- Migration table tracking which versions have been applied
- ADR required before implementation
Why: Without migrations, schema changes are manual and untracked. You can't deploy to a second environment without manually recreating every schema change. You can't roll forward reliably. This blocks any path to multi-environment or CI-driven deployments.
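The core of a forward-only runner is the selection logic: compare what the migrations table says has been applied against the files on disk. A sketch, ahead of the D3 ADR — the file-name convention matches the decision above, but the function name and shape are assumptions:

```typescript
// Sketch: given versions recorded in the migrations table and the migration
// files on disk, compute what still needs to run, in order.
function pendingMigrations(applied: number[], files: string[]): string[] {
  const maxApplied = Math.max(0, ...applied);
  return files
    .map((f) => ({ f, version: parseInt(f.split("_")[0], 10) })) // "002_add_index.sql" → 2
    .filter(({ version }) => !Number.isNaN(version))
    .sort((a, b) => a.version - b.version)
    .filter(({ version }) => version > maxApplied) // forward-only: never re-run or roll back
    .map(({ f }) => f);
}

console.log(pendingMigrations([1], ["002_add_index.sql", "001_initial.sql", "003_add_fk.sql"]));
// → ["002_add_index.sql", "003_add_fk.sql"]
```

Each file that runs gets its version inserted into the migration table inside the same transaction, so a crash mid-migration can't leave the tracking table lying about the schema.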
D4: Structured Observability (MEDIUM)
Owner: Pip Line (DevOps) + React Ive (Frontend)
Connect the startup validator to the health dashboard:
- Startup validator emits structured JSON events (not just log lines)
- Health dashboard auto-refreshes on a configurable interval
- Last-updated indicator shows when data was last fetched
- Pause/resume toggle for auto-refresh
Why: The health dashboard (Health.tsx, built in Sprint 1) and the startup validator exist but don't talk to each other. The dashboard shows static data. The validator logs to console. Connecting them creates the observability loop that operators need.
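What "structured JSON events" could mean in practice, sketched below. The field names and the `emitEvent` helper are assumptions, not the schema D4 will actually land on:

```typescript
// Sketch of a structured validator event the dashboard could consume.
// Field names are assumptions, not the platform's actual schema.
interface ValidationEvent {
  check: string;               // e.g. "sqlite-wal-mode"
  status: "pass" | "fail";
  detail?: string;
  timestamp: string;           // ISO 8601, so a "last updated" indicator is trivial
}

function emitEvent(check: string, status: "pass" | "fail", detail?: string): ValidationEvent {
  const event: ValidationEvent = { check, status, detail, timestamp: new Date().toISOString() };
  // One log line stays human-readable AND machine-parseable:
  console.log(JSON.stringify(event));
  return event;
}

const e = emitEvent("sqlite-wal-mode", "pass");
```

The dashboard then parses the same stream the operator reads, instead of scraping free-form log lines.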
D5: Path Convention (LOW)
Owner: Api Endor (Backend Developer)
Enforce `path.join()` / `path.resolve()` over string concatenation:
- Lint rule or convention entry flagging string-concatenated paths
- Audit existing codebase for violations
Why: Windows path separators (`\`) broke tests in Sprint 1. The fix was `path.join()`, but nothing prevents the next developer (human or AI) from writing `dir + '/' + file` again. A convention with enforcement prevents regression.
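The difference in one snippet, using Node's standard `path` module:

```typescript
import * as path from "path";

// String concatenation bakes in a separator that is wrong on Windows:
const fragile = "data/" + "posts.db"; // always "data/posts.db", even where "\" is expected

// path.join uses the platform separator ("\" on Windows, "/" elsewhere):
const portable = path.join("data", "posts.db");

console.log(portable.endsWith("posts.db")); // true on every platform
```

A lint rule (e.g. flagging string literals ending in `/` that are concatenated to another expression) catches the fragile form mechanically, which fits the project's block-don't-warn philosophy.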
How We'll Execute
Every ticket follows the DD TDD ceremony:
- Recall memory from previous tickets
- Update documentation describing intended behavior
- Bind documents to the ticket
- Write failing tests (RED) with evidence
- Implement to pass (GREEN) with evidence
- Refactor with evidence
- Validate all tests pass with full output
- Store lessons learned
Each phase requires a logged comment with real evidence — test names, assertion details, failure output, implementation summaries. The MCP server blocks phase transitions without adequate comments. No shortcuts.
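The blocking idea can be sketched as a simple gate. This is an illustration of the concept, not the MCP server's real API — the phase names mirror the ceremony above, but `canAdvance` and its evidence threshold are assumptions:

```typescript
// Sketch: a phase transition is rejected unless the current phase has a
// substantive evidence comment logged. Block mechanically — no warning path.
type Phase = "recall" | "docs" | "bind" | "red" | "green" | "refactor" | "validate" | "lessons";

function canAdvance(current: Phase, evidence: Map<Phase, string>): { ok: boolean; reason?: string } {
  const comment = evidence.get(current);
  if (!comment || comment.trim().length < 20) {
    return { ok: false, reason: `phase "${current}" lacks adequate evidence` };
  }
  return { ok: true };
}

const evidence = new Map<Phase, string>([["red", "2 failing tests: devto-format.spec.ts ..."]]);
console.log(canAdvance("red", evidence));   // { ok: true }
console.log(canAdvance("green", evidence)); // blocked: no evidence logged for "green" yet
```

The real server applies richer checks than comment length, but the shape is the same: the transition either carries evidence or it does not happen.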
Commits follow [TICKET-ID] description format. Push happens at sprint close, not per ticket.
What's After Sprint 2
The V3 inception is complete. Eight sprints are planned covering:
- Content sourcing with journalist-grade provenance
- Audio engine with multi-voice TTS
- Media orchestration engine
- YouTube video pipeline
- Podcast production and distribution
- Quality gates and human experience
- UI integration and unified dashboard
Sprint 2 is the bridge — cleaning up Sprint 0-1 technical debt so the V3 features have a solid foundation to build on.
Provenance
| Field | Value |
|---|---|
| Sprint | Sprint 2 (pre-execution) |
| Author | Content Curator persona, Claude Opus 4.6 |
| Date | 2026-03-28 |
| Source: Sprint 0 metrics | OAS-072-T1 work log: 18 tickets, 468 tests |
| Source: Sprint 1 metrics | OAS-074-T1 work log: 17 tickets, 925+ tests |
| Source: D1-D5 decisions | OAS-074-T4 retrospective summary |
| Source: Platform tool count | V2 manifest: 102 MCP tools |
| Source: LinkedIn pages | 4 active: ORCHESTRATE Method, Run On Rhythm, LEVEL UP, I am HITL |
| Temporal claims | "Sprint 0" and "Sprint 1" refer to completed sprints as of 2026-03-28 |
| Data sensitivity | Checked — no API keys, credentials, or PII in post |
This post is part of a series documenting the ORCHESTRATE Marketing Platform build. All development is AI-agent-driven with full traceability through the ORCHESTRATE Agile MCP framework.
Books on the ORCHESTRATE method: The ORCHESTRATE Method | ORCHESTRATE for AI Development | Platform: iamhitl.com