Wrought

Originally published at fluxforge.ai

I Built a Real-Time Artemis II 3D Tracker in One Session — Here's the Engineering Pipeline That Made It Possible

On April 1, 2026, four astronauts launched aboard Orion on Artemis II — humanity's first crewed voyage beyond low Earth orbit since Apollo 17 in 1972.

I wanted to track it. Not on a static NASA page. Not on someone else's stream overlay. I wanted an interactive 3D visualization with real telemetry, in my browser, that I built myself.

Six hours later (one afternoon of work) I had one. Live at artemis-tracker-murex.vercel.app.

Plan view of the Artemis II tracker showing the full free-return trajectory from Earth to Moon

47 files. ~8,000 lines of TypeScript. 15 unit tests. 5 serverless API proxies. Degree-8 Lagrange interpolation at 60fps. An AI mission chatbot. Deep Space Network status. Deployed on Vercel.

Built in a single session using Claude Code with a structured engineering pipeline called Wrought.

This post isn't about "look what AI can do." It's about what happens when you give an AI agent engineering discipline instead of just a prompt.


What the App Does

ARTEMIS is a real-time 3D mission tracker that combines three primary NASA data sources into one interactive visualization:

  • OEM Ephemeris Files from NASA's AROW system — actual spacecraft state vectors (position and velocity) at 4-minute intervals, interpolated to 60fps using Lagrange polynomials
  • Deep Space Network XML — live antenna status from Goldstone, Canberra, and Madrid
  • JPL Horizons API — Moon position in the same J2000 reference frame as the spacecraft data

The result: you can watch Orion move along its trajectory in real time, see its speed, distance from Earth, distance to the Moon, and which ground stations are currently talking to it.
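To make the interpolation concrete, here's a minimal sketch of degree-8 Lagrange interpolation over a 9-sample window at 4-minute spacing. The type and function names are mine, not the app's actual API:

```typescript
// Illustrative sketch: evaluate the unique degree-8 polynomial through
// 9 ephemeris samples at an arbitrary time t (standard Lagrange form).
interface Sample {
  epochMs: number; // sample time, ms since epoch (4-minute spacing in the OEM data)
  x: number;       // one position component, km
}

function lagrange9(win: Sample[], t: number): number {
  let sum = 0;
  for (let i = 0; i < win.length; i++) {
    // Lagrange basis weight for sample i at time t
    let w = 1;
    for (let j = 0; j < win.length; j++) {
      if (j !== i) {
        w *= (t - win[j].epochMs) / (win[i].epochMs - win[j].epochMs);
      }
    }
    sum += w * win[i].x;
  }
  return sum;
}
```

Because the polynomial passes exactly through all nine samples, positions between the 4-minute data points come out smooth enough to drive a 60fps animation.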

Earth view with Orion's trajectory arcing toward the Moon

There's also an AI chatbot powered by Gemini 2.5 Flash. Common questions like "How long is the mission?" resolve instantly via client-side quick-answer buttons — no API call needed. Free-text questions hit a curated knowledge base through a system prompt.

AI mission chatbot with quick-answer buttons


The Problem with "Vibe Coding"

Every week, someone posts "I built X in 20 minutes with AI." And every week, the comments are the same: Does it have tests? How's the error handling? What happens when the API is down? Did you actually read the code?

These are fair questions. The dirty secret of AI-assisted speed runs is that most of them produce code that works for a demo and breaks in production. The AI generates plausible code. You accept it. Ship it. Move on.

The issue isn't speed — it's the absence of process. No design review. No architecture decision records. No code review. No root cause analysis when something goes wrong. Just "prompt → code → deploy → pray."

I wanted to show there's a better way.


The Pipeline

Wrought is a structured engineering pipeline I built for Claude Code. It enforces a specific sequence of skills for every significant piece of work:

Finding → Research → Design → Blueprint → Implementation → Code Review

Each stage produces a documented artifact. Each artifact feeds the next stage. The AI agent can't skip ahead — the pipeline is the process.

Here's how it played out for ARTEMIS.

Stage 1: Finding

Every task starts with a Findings Tracker — a structured document that captures what you're building, why, and tracks it through every stage.

```
Finding: Interactive Artemis II Live Visualization
Type: Gap
Severity: High
Rationale: Artemis II launched 2026-04-01. No unified interactive tracker exists.
  NASA data is scattered across OEM files, XML feeds, and on-demand APIs.
  Mission window is ~10 days — time-sensitive opportunity.
```

This isn't bureaucracy. It's cross-session memory. If I stop working and come back tomorrow, the tracker tells me exactly where I left off and what decisions were already made.

Stage 2: Research

Before building a chatbot, I needed to decide how to build it. The research skill evaluated three approaches:

| Approach | Pros | Cons | Verdict |
| --- | --- | --- | --- |
| FAQ bot (pattern matching) | Zero cost, instant | Can't handle novel questions | Too rigid |
| System prompt + LLM | Simple, full knowledge in context | Per-query API cost | Selected |
| RAG (vector search) | Scales to large corpora | Massive overengineering for ~3K tokens of facts | Overkill |

The entire Artemis II knowledge base — mission timeline, crew bios, spacecraft specs, orbital mechanics — fits in about 3,000 tokens. That's smaller than this blog post. Building a RAG pipeline with embeddings, a vector database, and chunking strategy for 3,000 tokens of content would have been absurd.

The research also surfaced that Gemini 2.5 Flash has a genuinely free tier (15 requests per minute, 1,000 per day, no credit card). Claude would have been higher quality, but the $0 budget constraint made Gemini the pragmatic choice.
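A rough sketch of what "system prompt + LLM" means in practice: the full knowledge base rides along as a system instruction on every request. The request shape below follows my understanding of the Gemini `generateContent` REST API; treat the field names as assumptions and verify them against Google's docs:

```typescript
// "System prompt stuffing": no retrieval step, just the whole ~3K-token
// knowledge base attached to every request. Field names are assumptions
// based on the Gemini generateContent REST shape.
const KNOWLEDGE_BASE = `Artemis II launched 2026-04-01 with four astronauts aboard Orion.
Mission duration: ~10 days on a free-return trajectory.`; // ~3,000 tokens in the real app

type ChatMsg = { role: "user" | "model"; text: string };

function buildRequest(question: string, history: ChatMsg[]) {
  return {
    systemInstruction: {
      parts: [{ text: `Answer only from these mission facts:\n${KNOWLEDGE_BASE}` }],
    },
    contents: [
      // prior turns, then the new question
      ...history.map((m) => ({ role: m.role, parts: [{ text: m.text }] })),
      { role: "user", parts: [{ text: question }] },
    ],
  };
}
```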

Stage 3: Design

The design stage evaluated four architecture options with weighted scoring:

| Option | Stack | Score | Why not |
| --- | --- | --- | --- |
| A: Vite + R3F | Vite, React, React Three Fiber, Vercel | 8.6/10 | Selected |
| B: Vite + 2D | Vite, React, Canvas 2D | 6.2/10 | No depth perception for 3D trajectory |
| C: Next.js + R3F | Next.js, React, R3F | 7.8/10 | SSR adds hydration complexity for a pure client app |
| D: Vanilla Three.js | Three.js, no framework | 5.4/10 | Manual state management, no HMR for scene |

The design document also specified the data pipeline for all four NASA sources, the coordinate system (J2000 Earth-centered, 1 unit = 10,000 km), the HUD layout, and camera presets.

Key insight: Next.js was actively wrong for this project. There's no content to server-render. No SEO to optimize. No dynamic routes. It's a WebGL canvas that talks to APIs. Vite gives you sub-second HMR and a smaller bundle without the hydration tax.

Stage 4: Blueprint

The blueprint translated the design into an implementation spec: 48 files across 8 phases, with acceptance criteria for each phase. It specified the exact file structure, the interfaces for the OEM parser and interpolator, the Zustand store shape, and the serverless proxy signatures.

This is where the pipeline pays dividends. By the time implementation starts, the AI agent has:

  • A clear architecture to follow (not just a vague prompt)
  • Specific interfaces to implement (not ad-hoc decisions mid-code)
  • An explicit scope boundary (what to build AND what not to build)
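For illustration, here's the kind of interface a blueprint pins down before any code exists. These identifiers are hypothetical, not the project's actual ones:

```typescript
// Hypothetical blueprint artifact: the exact store shape is decided up front,
// not improvised mid-implementation. Identifiers are illustrative.
interface SpacecraftState {
  epochMs: number;                      // current mission time
  positionKm: [number, number, number]; // J2000 Earth-centered, km
  speedKmS: number;
  distEarthKm: number;
  distMoonKm: number;
}

interface MissionStore {
  spacecraft: SpacecraftState;
  setSpacecraft(next: SpacecraftState): void;
}

// A minimal object conforming to the interface (the real app uses Zustand).
function makeStore(initial: SpacecraftState): MissionStore {
  let spacecraft = initial;
  return {
    get spacecraft() { return spacecraft; },
    setSpacecraft(next) { spacecraft = next; },
  };
}
```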

Stage 5: Implementation

One iteration. All 15 tests passing. Build succeeds. Deployed to Vercel.

That's not AI magic — that's the upstream work paying off. When the design document specifies "Zustand store with scalar selectors to avoid re-render storms" and the blueprint defines the exact store interface, the implementation becomes an execution problem, not a design problem.

Stage 6: Code Review (This Is Where It Gets Interesting)

After implementation, the pipeline runs a multi-agent code review called /forge-review. Four specialized agents review the code in parallel:

  • Complexity Analyst — algorithmic time/space complexity
  • Paradigm Enforcer — consistency within files
  • Efficiency Sentinel — performance anti-patterns
  • Data Structure Reviewer — access patterns vs. data structure selection

The first review found 5 critical issues:

Critical 1: 60fps Re-Render Storm

```
The HUD reads spacecraft state as an object from Zustand.
Zustand creates a new object reference every update.
React re-renders all HUD components every frame.
At 60fps, that's 60 full React reconciliation passes per second.
```

The fix: DataDriver writes position to a shared useRef for the 3D scene (zero React overhead), and throttles Zustand store updates to 4Hz for the HUD. Each HUD card uses a scalar selector (state => state.spacecraft.speed) instead of selecting the whole object.
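The throttle half of that fix can be sketched as a pure gate (names here are mine, not the app's):

```typescript
// Sketch of the 4Hz gate: the 3D scene reads a mutable ref every frame,
// but a store update only happens when this gate opens (every 250ms).
const HUD_INTERVAL_MS = 250; // 4Hz

function makeThrottleGate(intervalMs: number) {
  let last = -Infinity;
  return (nowMs: number): boolean => {
    if (nowMs - last >= intervalMs) {
      last = nowMs;
      return true;  // push this frame's state into the store
    }
    return false;   // skip: the HUD doesn't need 60 updates a second
  };
}
```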

Critical 2: O(n) Linear Scan in Hot Path

The Lagrange interpolator was doing a linear search through all state vectors to find the nearest data point. With a 3,232-line OEM file at 60fps, that's ~194,000 comparisons per second.

The fix: binary search. O(log n). The data is already sorted by epoch.

```typescript
// Binary search for the first state vector at or after time t.
// `vectors` is sorted by epochMs, so this is O(log n) instead of O(n).
let lo = 0, hi = vectors.length - 1;
while (lo < hi) {
  const mid = (lo + hi) >> 1;
  if (vectors[mid].epochMs < t) lo = mid + 1;
  else hi = mid;
}
```

Critical 3: Per-Frame Memory Allocations

The interpolator was calling .slice() and .map() inside the hot loop — allocating new arrays every frame. At 60fps, that's 120+ garbage-collected arrays per second.

The fix: module-level reusable buffer, direct array indexing instead of .map(), and accepting epochMs as a number instead of creating Date objects.
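The reusable-buffer idea, in miniature (illustrative names; this pattern is safe only because the browser's main thread is single-threaded):

```typescript
// Sketch of allocation-free weight computation: one module-level Float64Array
// is overwritten every frame instead of allocating new arrays per call.
const WINDOW = 9;
const weights = new Float64Array(WINDOW); // reused across frames, never reallocated

function fillWeights(epochs: ArrayLike<number>, start: number, t: number): Float64Array {
  for (let i = 0; i < WINDOW; i++) {
    let w = 1;
    for (let j = 0; j < WINDOW; j++) {
      if (j !== i) {
        // direct indexing into the epoch array: no .slice(), no .map()
        w *= (t - epochs[start + j]) / (epochs[start + i] - epochs[start + j]);
      }
    }
    weights[i] = w;
  }
  return weights; // same buffer every call
}
```

The caller must consume the result before the next frame overwrites it, which is exactly the trade-off the review flagged as a benign edge case.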

Critical 4: Stale Closure in useChat

The chat hook captured messages in a closure at mount time. When a user sent a second message, the API call used the stale initial array — losing the conversation history.

The fix: useRef tracking the latest messages array, with useCallback reading from the ref instead of the closure.
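Stripped of React, the bug and the fix look like this. The plain `messagesRef` object stands in for `useRef`, and the two counters stand in for the closure-capturing and ref-reading versions of the send callback:

```typescript
// The stale-closure bug in miniature, without React. Names are illustrative.
type Msg = { role: string; text: string };

function makeSenders(initial: Msg[]) {
  const messagesRef = { current: initial };
  // BUG shape: captures the mount-time array and never sees later messages.
  const staleCount = () => initial.length;
  // FIX shape: reads through the ref, so it always sees the latest array.
  const freshCount = () => messagesRef.current.length;
  const push = (m: Msg) => {
    messagesRef.current = [...messagesRef.current, m];
  };
  return { staleCount, freshCount, push };
}
```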

Critical 5: StrictMode Double-Mount Breaking Polls

A fetchedRef guard prevented re-fetching in StrictMode's double-mount cycle, but also broke cleanup — orphaning intervals and timeouts.

The fix: remove the ref guard entirely. Use AbortController for cancellation. Clean up all intervals and timeouts in the effect's return function.
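Outside React, the corrected shape looks roughly like this. `startPolling` stands in for the effect body and the returned function for its cleanup; all names are mine:

```typescript
// Sketch of the fixed effect shape: no ref guard. Every mount starts its own
// poll, and the cleanup function cancels everything it started, so StrictMode's
// mount/unmount/mount cycle leaves no orphaned timers.
function startPolling(tick: () => void, intervalMs: number): () => void {
  const controller = new AbortController(); // would also cancel in-flight fetches
  const id = setInterval(() => {
    if (!controller.signal.aborted) tick();
  }, intervalMs);
  return () => {
    controller.abort();
    clearInterval(id);
  };
}
```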


The Re-Review

After fixing all five criticals through an RCA (root cause analysis) cycle, the pipeline ran a second code review:

| Severity | First review | Re-review |
| --- | --- | --- |
| Critical | 5 | 0 |
| Warning | 10 | 2 |
| Suggestion | 8 | 2 |

The two remaining warnings were both benign edge cases (a module-level buffer that's safe in single-threaded browsers, and a ref timing gap that's correctly compensated for). Zero criticals.

This is the pipeline's value proposition. Without the review stage, those five bugs would have shipped. The linear scan and re-render storm would have made the app noticeably janky on mobile. The stale closure would have broken multi-turn chat. And you'd never know until users complained.


The Numbers

| Metric | Value |
| --- | --- |
| Source files | 47 |
| Lines of code | ~8,000 |
| Unit tests | 15 |
| Implementation iterations | 1 |
| Critical bugs caught | 5 (all fixed) |
| Build time | ~2 seconds |
| Bundle size | 149KB app + 1.1MB Three.js/R3F |
| NASA data sources | 4 |
| Serverless proxies | 5 |
| Pipeline artifacts | 12 documents |
| Production deploys | 8 |

What I Learned

1. Upstream design eliminates downstream churn. The implementation completed in one iteration because the blueprint answered most design questions in advance. "What shape is the Zustand store?" isn't a question you want the AI deciding mid-implementation.

2. Code review catches real bugs, not style nits. The forge-review found a 60fps re-render storm and an O(n) hot path — genuine performance issues that would have shipped silently without the review stage.

3. RAG is the new microservices. Not everything needs it. A 3,000-token knowledge base doesn't need embeddings, vector search, and a retrieval pipeline. System prompt stuffing is boring and effective.

4. The audit trail is the product. Every decision — why Vite over Next.js, why Gemini over Claude, why Lagrange over Runge-Kutta — is documented in the pipeline artifacts. Six months from now, when someone asks "why did you build it this way?", the answer exists in the design doc, not in someone's memory.


Try It

The full pipeline artifacts — finding, research, design, blueprint, reviews, RCAs — are all in the repo's docs/ directory. The process is as open as the code.


Built with Claude Code + Wrought by FluxForge AI.
