I kept re-explaining my stack to Claude on every new project — so I built it one

#showdev #ai #frontend #opensource

fe-rail solved the workflow problem. Spec, build, review, PR. The agent stopped skipping steps, stopped committing .env files, stopped needing me to catch the same three mistakes every session. That part I wrote about already.

What I didn't fix was the first hour of every new project. Clone an empty repo, open Claude Code, and watch it ask me the same twenty questions it asked last time. Router or no router? Where does server state live? What's the button supposed to look like when it's loading? I'd answer them once, in prose, and by the third project I was pasting the same paragraph into the same first prompt. That's not a workflow problem. That's a missing-scaffolding problem, and no amount of process discipline fixes it. You can review a PR perfectly and still end up with a design system the agent invented from nothing, because nothing told it not to.

So I built railhead.

What I actually wanted

Not another SPA boilerplate. There are plenty of those, and none of them were built with a specific failure mode in mind: an AI agent filling gaps with its training-data defaults instead of my decisions. Left alone, Claude's default frontend looks like every other AI-generated frontend: purple-to-blue gradients, Inter everywhere, a grid of identical icon-plus-heading cards, a cream background nobody asked for. It's not wrong, exactly. It's just not mine, and it's the same "not mine" every single time.

I wanted a repo where the agent's first move is reading three files, not guessing. Where "minimize hallucination" isn't a vibe I repeat in the system prompt but an actual constraint the codebase enforces: API types come from one generated file or the build fails, not from whatever shape the agent thinks the backend probably returns.

There are exactly two rules at the top of AGENTS.md, and every other decision in the template traces back to one of them. First: minimize hallucination. One API client, one set of generated types, no inventing an endpoint that sounds plausible. Second: no over-engineering. Don't add an abstraction, a library, or a folder nobody asked for, even if it's a "best practice" the model has seen a thousand times in training data. That second rule is the one I almost skipped, and it's the one that matters more day to day. An agent that hallucinates an endpoint gets caught by tsc. An agent that quietly adds a state management library nobody needed just makes the codebase worse in a way nothing catches.
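To make "caught by tsc" concrete, here is a minimal sketch of the idea, not railhead's actual client. The Paths interface is a hand-written, heavily simplified stand-in for the generated types file, and createClient, Transport, and the example URL are hypothetical names invented for illustration.

```typescript
// Hypothetical stand-in for the generated types file. In a real project these
// types would be generated from the OpenAPI spec, not written by hand.
interface Paths {
  "/health": { get: { response: { status: "ok" | "degraded" } } };
  "/auth/login": {
    post: { body: { email: string; password: string }; response: { token: string } };
  };
}

// Only paths that declare a GET operation are accepted by client.get().
type GetPath = {
  [P in keyof Paths]: Paths[P] extends { get: unknown } ? P : never;
}[keyof Paths];

type GetResponse<P extends GetPath> = Paths[P] extends { get: { response: infer R } }
  ? R
  : never;

// The transport is injected so the sketch runs without a network or a library.
type Transport = (url: string) => Promise<unknown>;

function createClient(baseUrl: string, transport: Transport) {
  return {
    get<P extends GetPath>(path: P): Promise<GetResponse<P>> {
      // The cast is the single place where runtime data meets the generated types.
      return transport(baseUrl + path) as Promise<GetResponse<P>>;
    },
  };
}

const client = createClient("https://api.example.test", async () => ({ status: "ok" }));

// Compiles: "/health" exists in Paths and declares a GET operation.
const health = client.get("/health");

// Would not compile: "/users" is not a key of Paths, so tsc rejects the call.
// client.get("/users");
```

The point of the shape is that a plausible-sounding endpoint has nowhere to hide: if it isn't in the generated types, the call site doesn't typecheck.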

What's actually in it

React 19, TypeScript 6 in strict mode, Vite 8. React Router 7 in library mode: routing only, no data loaders fighting with the query cache. TanStack Query owns every byte of server state; Zustand only shows up for state that's genuinely shared across the tree, not as a reflex. React Hook Form plus Zod for anything with an input field. Tailwind v4 and shadcn/ui, added through the CLI so I own the component after it lands, not before. Biome instead of the ESLint-plus-Prettier stack, because two tools that can disagree with each other is one tool too many for an agent to reconcile on its own. Vitest for units, Playwright for the two paths that actually matter: the happy one and the one where the request fails.

None of that is exotic. What's different is what backs it up.

What a first run looks like

$ git clone https://github.com/sh5623/railhead.git
$ cd railhead
$ pnpm install
$ pnpm dev

  VITE v8.0.16  ready in 340 ms
  ➜  Local:   http://localhost:5173/

Open that URL without touching .env first and you get a blank white page, on purpose. src/lib/env.ts validates VITE_API_BASE_URL with Zod at startup and throws if it's missing. Loudly, in the console, instead of quietly rendering a broken app that looks fine until you click something. I didn't want the failure mode to be "works until it doesn't." I wanted it to be "doesn't work, and tells you exactly why, in one line."
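The fail-fast behavior can be sketched in a few lines. This is a dependency-free stand-in, not the real src/lib/env.ts: the actual file uses Zod, while this version uses the built-in URL parser so it runs on its own. The parseEnv and AppEnv names, the error wording, and the URL-format check are assumptions for illustration.

```typescript
// Sketch only: the real file validates with Zod. AppEnv and parseEnv are
// hypothetical names, and checking the URL format is an assumption.
interface AppEnv {
  apiBaseUrl: string;
}

function parseEnv(raw: Record<string, string | undefined>): AppEnv {
  const value = raw["VITE_API_BASE_URL"];

  // Missing or blank: stop immediately with a one-line explanation.
  if (value === undefined || value.trim() === "") {
    throw new Error("VITE_API_BASE_URL is missing. Copy .env.example to .env and set it.");
  }

  // Present but malformed: also fail at startup, not at the first request.
  try {
    new URL(value);
  } catch {
    throw new Error(`VITE_API_BASE_URL is not a valid URL: "${value}"`);
  }

  return { apiBaseUrl: value };
}

// In a Vite app this would run once at module load, with Vite's env object
// passed in, so the rest of the code only ever sees a validated AppEnv.
```

Taking the raw env as a parameter keeps the function testable and means nothing else in the app reads environment variables directly.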

$ cp .env.example .env
$ pnpm dev

Now the home route is the onboarding walkthrough: not a placeholder, but an actual page explaining the rules the template is about to hand you. Ask an agent to add a screen after that, and it's not starting from a blank file and a vague memory of what React apps usually look like. It's starting from a LoginPage and a HealthBadge it can read, an AGENTS.md it can grep, and a pnpm check && pnpm typecheck it knows will fail loudly if it drifts.

That last part isn't optional either. Husky runs lint-staged on every commit. CI runs the full sequence on every push, in a clean environment: typecheck, biome ci, unit tests, build, then Playwright. "Works on my machine" and "works in the PR" can't quietly diverge.

Three files, on purpose

Here's the thing I kept getting wrong in earlier attempts: I'd write one long README explaining everything, and the agent would either skim it or treat conventions and vibes as the same category of instruction. So railhead splits on a rule I stole from the tooling itself: if a tool can enforce it, the docs don't repeat it.

biome.json and tsconfig.json cover lint, format, import order, null-safety. Run pnpm check and pnpm typecheck and that whole category is handled. No prose required, because prose is the thing agents forget under pressure.

What's left goes in AGENTS.md, and only what's left: the single API pattern, where server state is allowed to live, and how query keys get built so cache invalidation doesn't turn into guesswork. The API pattern is one generated openapi-fetch client, no hand-rolled fetch calls, and no as any when the types don't match. If they don't match, the spec is stale, so go fix the spec. CLAUDE.md is a two-line stub that just points at AGENTS.md, so there's exactly one document to keep in sync, not two drifting copies.
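The query-key convention is easier to see than to describe. A minimal sketch, assuming a hypothetical healthKeys factory: the names and key segments are illustrative, not taken from railhead. The matchesPrefix helper is a simplified stand-in for the prefix matching that TanStack Query applies when you invalidate by query key, and it only compares primitive segments.

```typescript
// Hypothetical key factory for one feature. Every key is built here, so no
// component ever types a key array by hand.
const HEALTH = "health";

const healthKeys = {
  all: () => [HEALTH] as const,
  checks: () => [HEALTH, "checks"] as const,
  check: (service: string) => [HEALTH, "checks", service] as const,
};

// Simplified stand-in for query-key prefix matching: a query is affected when
// its key starts with the key being invalidated. Primitive segments only.
function matchesPrefix(key: readonly unknown[], prefix: readonly unknown[]): boolean {
  return prefix.length <= key.length && prefix.every((part, i) => part === key[i]);
}

// Invalidating healthKeys.checks() would reach every per-service check...
const reachesDbCheck = matchesPrefix(healthKeys.check("db"), healthKeys.checks());

// ...while invalidating one service leaves the others alone.
const reachesCacheCheck = matchesPrefix(healthKeys.check("cache"), healthKeys.check("db"));
```

Because keys are hierarchical and built in one place, "what does this invalidation touch?" has an answer you can read off the factory instead of searching the codebase for string literals.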

Then there's DESIGN.md and PRODUCT.md, which exist because "make it look good" is not an instruction, it's a wish. DESIGN.md has actual tokens: OKLCH color, a five-step elevation ladder from flat to overlay, a rule that says the brand accent gets 4 to 8 percent of any surface and nothing more. It also has a bans list, and I'll admit I enjoyed writing it more than I should have: no gradient text, no glassmorphism by default, no identical card grids, no purple-to-blue gradient as the default accent (calling that one out by name, since I've shipped it myself). If you're about to reach for one of these, the file tells you to restructure instead. PRODUCT.md is shorter: who the user is, three words for the brand voice, and a short list of things this product is explicitly not trying to be.

Three files. Not one giant one, not ten scattered ones. Enough separation that an agent reading "add a settings page" pulls the design tokens without also re-reading the API conventions it already has memorized for this session.

The tokens are OKLCH, not hex, because I wanted contrast ratios to be something you can reason about instead of eyeball. The elevation ladder maps directly to how shadcn components already render, so a new component lands on-system without anyone deciding where its shadow goes by hand. The accent cap is the part I'd point to first, though. That's the actual mechanism behind banning that purple gradient I keep bringing up. It's not banned for being ugly. It's banned because "gradient as the whole surface" and "accent as 4 percent of the surface" are two different design languages, and only one of them was a decision I made on purpose.
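The arithmetic behind the accent cap can be sketched as well. This assumes the budget is measured by area, which is one interpretation and not something DESIGN.md is quoted as saying; treating 4 percent as a floor is also an assumption. The Size type, the function names, and the pixel dimensions are hypothetical.

```typescript
// Assumption: the accent budget is an area ratio between 4% and 8%.
const ACCENT_BUDGET = { min: 0.04, max: 0.08 } as const;

interface Size {
  width: number;
  height: number;
}

type BudgetVerdict = "under" | "within" | "over";

// Fraction of the surface covered by accent-colored regions.
function accentShare(surface: Size, accentRegions: Size[]): number {
  const surfaceArea = surface.width * surface.height;
  if (surfaceArea <= 0) throw new Error("surface must have a positive area");
  const accentArea = accentRegions.reduce((sum, r) => sum + r.width * r.height, 0);
  return accentArea / surfaceArea;
}

function checkAccentBudget(share: number): BudgetVerdict {
  if (share > ACCENT_BUDGET.max) return "over";
  if (share < ACCENT_BUDGET.min) return "under";
  return "within";
}

// A 1440x900 viewport has 1,296,000 px of area.
const viewport: Size = { width: 1440, height: 900 };

// One 1440x60 accent band covers 86,400 px, about 6.7% of the surface.
const bandVerdict = checkAccentBudget(accentShare(viewport, [{ width: 1440, height: 60 }]));

// A full-bleed gradient hero covers 100% of the surface.
const heroVerdict = checkAccentBudget(accentShare(viewport, [viewport]));
```

Under those assumptions the band lands inside the budget and the full-surface gradient is more than twelve times over it, which is the difference between the two design languages expressed as a number.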

The onboarding page is the demo

I didn't want to just tell people this works, so the template ships with a page that shows it. Clone the repo, run pnpm dev, and the home route isn't a blank canvas. It's a scrolling walkthrough of the template itself, station by station, built along a little train-line metaphor because I couldn't resist the name. It covers the two hard rules, walks through the canonical patterns, and has a section titled, honestly, "installing fe-rail is optional." Because it is. railhead decides what you build with. fe-rail decides what order you build it in. They're not the same rail, and neither one requires the other.

Two of the pages in that walkthrough aren't demo content, though. They're load-bearing. src/features/auth and src/features/health are the canonical implementations of "here's the one way we call the backend" and "here's the one way we gate a route." The README is explicit that when you bootstrap a real project from this template, you delete the onboarding tour and keep those two. They're not sample code to be replaced. They're the pattern the rest of the app is supposed to copy.

What I'm still not sure about

I'll be honest about where this is thin. railhead is young. I started building it today, and as of writing it hasn't been bootstrapped into an actual production project yet, just built out and walked through by hand. Everything above is real and it works, but "works when I test it" and "survives someone else's first messy feature branch" are different claims, and I only get to make the first one right now.

The thing I still go back and forth on is the shadcn MCP server. It's committed to the repo, on purpose, so the agent can look up a component's real props instead of guessing. That's the whole "minimize hallucination" principle applied to the UI layer, not just the API layer. But it also means the first thing a new contributor sees is a permission prompt, and I'm not fully sold that the tradeoff is worth it for every team that clones this. Right now I'm leaving it in and living with the friction.

The other open question is how much of DESIGN.md survives contact with an actual designer who wants their own system instead of mine. I wrote the bans list from my own taste, and taste doesn't generalize. I think the structure — tokens plus a bans list plus a reason for each ban — generalizes better than the specific values do, but I haven't tested that claim on anyone else's project yet.

Try it

git clone https://github.com/sh5623/railhead.git
cd railhead
pnpm install
cp .env.example .env
pnpm dev

It's MIT licensed, same as fe-rail. If you're already using fe-rail, they're built to sit together: railhead gives the agent the stack and the rules, fe-rail gives it the workflow. Neither one asks you to adopt the other.

I'd like to know what breaks first when you try it on your own stack. What's the one convention your team has that isn't in AGENTS.md, DESIGN.md, or PRODUCT.md yet? Would you rather the agent inferred it, or asked?