WonderLab

Posted on Jun 28

Open Source Project #108: ai-website-cloner-template — One Command, Parallel Agents, Any Website Reversed into Next.js Code

#ai #frontend #website #development

Introduction

"Point it at a URL, run /clone-website, and your AI agent will inspect the site, extract design tokens and assets, write component specs, and dispatch parallel builders to reconstruct every section."

This is article #108 in the "One Open Source Project a Day" series. Today's project is ai-website-cloner-template — a GitHub template repository for reverse-engineering any website into a Next.js codebase using AI coding agents.

Turning a "website screenshot" into "runnable code" is a classic Vibe Coding scenario, but most implementations stop at the surface: ask the LLM to look at a screenshot, approximate the layout, fill in placeholder content. This template takes a fundamentally different approach: it defines a rigorous multi-phase agent workflow whose core principle is "completeness beats speed" — every Builder Agent must have exact getComputedStyle() values, genuinely downloaded assets, and complete interaction state specifications before touching any code.

More than 22k stars signal real demand for this use case. But the more interesting story is the engineering design, especially the use of git worktrees to achieve true parallel multi-agent construction.

What You'll Learn

The five-phase clone pipeline: reconnaissance → foundation → component specs → parallel build → assembly QA
git worktree parallel agent pattern: how each Builder Agent works in an isolated branch
Component spec file design principles: why the spec is a "contract," not a "reference"
Interaction model identification: distinguishing click-driven, scroll-driven, and time-driven behaviors before building
Cross-platform agent support: how 13 AI coding tools are all served from a single source file

Prerequisites

Experience with Claude Code, Cursor, or similar AI coding tools
Basic familiarity with Next.js
Basic understanding of git branches

Project Background

Overview

ai-website-cloner-template is a GitHub template repository (is_template: true) that pre-scaffolds a Next.js 16 + shadcn/ui + Tailwind v4 project alongside a /clone-website AI Agent Skill.

Usage pattern: use "Use this template" on GitHub to create your own repository, start your AI agent, run /clone-website <target-url>, and the agent completes the full pipeline from reconnaissance to working code.

The project's core engineering contribution isn't the tech stack choice — it's the multi-agent collaboration protocol behind the /clone-website Skill, particularly using git worktrees for genuinely parallel component builds and enforcing the "spec first, builder second" constraint.

Author / Team

Author: JCodesMore
Primary Language: TypeScript
License: MIT
Community: Discord

Project Stats

⭐ GitHub Stars: 22,100+
🍴 Forks: 3,173+ (most used to create independent projects)
📄 License: MIT
📅 Created: March 2026

Features

What It Does

Traditional "screenshot clone" approach:
Screenshot → LLM approximates layout → placeholder content → manual color/spacing corrections → iterate

ai-website-cloner-template approach:
/clone-website https://example.com
    ↓
Phase 1 Recon: screenshots + scroll/click/hover interaction sweep + design token extraction
    ↓
Phase 2 Foundation: update fonts/colors/globals.css + download all assets + extract SVG icons
    ↓
Phase 3 Specs: write spec files per section (exact CSS values + real content + interaction states)
    ↓
Phase 4 Parallel Build: git worktree per section + dispatch Builder Agents in parallel
    ↓
Phase 5 Assembly QA: merge all worktrees + wire up page + npm run build verification

Use Cases

Platform migration: Rebuild a site you own from WordPress/Webflow/Squarespace into a Next.js codebase, gaining full code ownership
Lost source code: The site is live but the repo is gone, the developer left, or the stack is too legacy to maintain — recover the code from the live version in a modern format
Learning: Deconstruct how production sites achieve specific layouts, animations, and responsive behavior by working with the actual code (not just screenshots)

Not Intended For

The README explicitly states:

Phishing or impersonation: forbidden for deceptive purposes, impersonation, or illegal activities
Passing off others' designs as your own: logos, brand assets, and original copy belong to their owners
Violating terms of service: some sites explicitly prohibit scraping or reproduction — check first

Quick Start

1. Create your own repository

On the GitHub project page, click Use this template → Create a new repository (don't clone the template repository directly).

2. Clone and install

git clone https://github.com/YOUR-USERNAME/YOUR-NEW-REPO.git
cd YOUR-NEW-REPO
npm install

3. Start Claude Code (recommended)

claude --chrome   # --chrome starts Chrome MCP for browser automation

4. Run the clone skill

/clone-website https://target-website.com

Multiple URLs can be processed in parallel:

/clone-website https://site1.com https://site2.com

5. Dev preview

npm run dev        # Start dev server
npm run check      # Run lint + typecheck + build

Supported AI Agent Platforms

| Agent | Status |
| --- | --- |
| Claude Code | Recommended (Opus 4.7) |
| Codex CLI | Supported |
| OpenCode | Supported |
| GitHub Copilot | Supported |
| Cursor | Supported |
| Windsurf | Supported |
| Gemini CLI | Supported |
| Cline | Supported |
| Roo Code | Supported |
| Continue | Supported |
| Amazon Q | Supported |
| Augment Code | Supported |
| Aider | Supported |

Cross-platform support works like this: AGENTS.md is the single source of truth for all project instructions. Running bash scripts/sync-agent-rules.sh auto-generates platform-specific copies (CLAUDE.md, GEMINI.md, .cursor/, .windsurf/, etc.) from that single file.

Deep Dive

The Five-Phase Pipeline

Phase 1: Reconnaissance

This is not just screenshots. Reconnaissance requires three mandatory tasks:

Screenshots: Full-page screenshots at 1440px (desktop) and 390px (mobile), saved to docs/design-references/. These are the visual master reference for the entire process.

Mandatory interaction sweep (the most commonly skipped step):

Scroll sweep — scroll slowly from top to bottom, observe:
  - At what scroll position does the navbar change appearance?
  - Which elements animate in when entering the viewport?
  - Which sections have scroll-snap points?
  - Is a smooth scroll library (Lenis, Locomotive Scroll) active?

Click sweep — click every element that looks interactive:
  - What content does each tab/pill switch to?
  - What modals open, what dropdowns appear?

Hover sweep — hover every element that might have hover states:
  - Color changes, scale, shadow, opacity, underlines...

Responsive sweep — test at 1440px / 768px / 390px:
  - Note layout changes and the approximate breakpoint where they occur

All findings are saved to docs/research/BEHAVIORS.md — the "behavior bible" for the entire cloning process.

Page topology: Map every distinct section from top to bottom, document each section's interaction model (static / click-driven / scroll-driven / time-driven), save as docs/research/PAGE_TOPOLOGY.md — the assembly blueprint.
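To make the topology idea concrete, here is a minimal sketch of how one entry could be modeled and rendered. The `SectionTopology` shape, its field names, and `renderTopology` are hypothetical illustrations, not the template's actual file format; only the four interaction model names come from the description above.

```typescript
// Hypothetical shape for one PAGE_TOPOLOGY.md entry (field names are assumptions).
type InteractionModel = "static" | "click-driven" | "scroll-driven" | "time-driven";

interface SectionTopology {
  name: string;                 // e.g. "hero-section"
  order: number;                // position from the top of the page, starting at 1
  interaction: InteractionModel;
  notes?: string;               // free-form observations from the sweep
}

// Render entries as plain lines, sorted top to bottom.
function renderTopology(sections: SectionTopology[]): string {
  return [...sections]
    .sort((a, b) => a.order - b.order)
    .map((s) => {
      const base = `${s.order}. ${s.name} (INTERACTION MODEL: ${s.interaction})`;
      return s.notes ? `${base} - ${s.notes}` : base;
    })
    .join("\n");
}
```

Sorting by `order` rather than trusting input order keeps the rendered file usable as an assembly blueprint even if sections were discovered out of sequence.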

Phase 2: Foundation Build

Done by the Orchestrator Agent directly — not delegated to sub-agents because it touches many files:

Update src/app/layout.tsx: configure the target site's actual fonts via next/font/google or next/font/local
Update src/app/globals.css: map the target site's color tokens (background, foreground, primary, muted...) to the shadcn variable system using oklch color space
Extract all SVG icons → save as named React components in src/components/icons.tsx
Run the asset download script (scripts/download-assets.mjs): batch-download all images and videos to public/
Verify: npm run build passes
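Browsers usually report computed colors as `rgb(...)` values, so mapping them to oklch tokens needs a conversion step. The sketch below uses the published OKLab matrices; it is an illustration of the math, not the template's own conversion, and the function names and rounding precision are assumptions.

```typescript
// Undo the sRGB transfer curve for one 0-255 channel.
function srgbToLinear(channel: number): number {
  const c = channel / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Convert sRGB (0-255 per channel) to OKLCH: lightness, chroma, hue in degrees.
function rgbToOklch(r: number, g: number, b: number): { l: number; c: number; h: number } {
  const lr = srgbToLinear(r);
  const lg = srgbToLinear(g);
  const lb = srgbToLinear(b);
  // Linear sRGB -> cone-like LMS response, then cube root (inputs are non-negative).
  const l_ = Math.pow(0.4122214708 * lr + 0.5363325363 * lg + 0.0514459929 * lb, 1 / 3);
  const m_ = Math.pow(0.2119034982 * lr + 0.6806995451 * lg + 0.1073969566 * lb, 1 / 3);
  const s_ = Math.pow(0.0883024619 * lr + 0.2817188376 * lg + 0.6299787005 * lb, 1 / 3);
  // LMS' -> OKLab
  const L = 0.2104542553 * l_ + 0.793617785 * m_ - 0.0040720468 * s_;
  const A = 1.9779984951 * l_ - 2.428592205 * m_ + 0.4505937099 * s_;
  const B = 0.0259040371 * l_ + 0.7827717662 * m_ - 0.808675766 * s_;
  // OKLab -> OKLCH (polar form)
  const c = Math.sqrt(A * A + B * B);
  const h = ((Math.atan2(B, A) * 180) / Math.PI + 360) % 360;
  return { l: L, c, h };
}

// Format as a CSS oklch() value; hue is meaningless when chroma is ~0.
function toOklchCss(r: number, g: number, b: number): string {
  const { l, c, h } = rgbToOklch(r, g, b);
  return `oklch(${l.toFixed(3)} ${c.toFixed(3)} ${h.toFixed(1)})`;
}
```

For example, `toOklchCss(255, 0, 0)` yields a value close to the commonly cited OKLCH coordinates for pure sRGB red (lightness about 0.628, chroma about 0.258, hue about 29 degrees).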

Asset discovery runs JavaScript via browser MCP to precisely enumerate all <img>, <video>, and CSS background-image elements, including absolutely-positioned overlay layers — a section that looks like a single image is often a background watercolor + foreground UI mockup PNG + an overlay icon. Missing any layer makes the clone look empty.

// Run via browser MCP to discover all assets
JSON.stringify({
  images: [...document.querySelectorAll('img')].map(img => ({
    // currentSrc is the srcset candidate the browser actually loaded
    src: img.currentSrc || img.src,
    alt: img.alt,
    parentClasses: img.parentElement?.className,
    // Number of <img> elements under the same parent (includes this one);
    // a value above 1 hints at layered images
    siblings: img.parentElement ? img.parentElement.querySelectorAll('img').length : 0,
    position: getComputedStyle(img).position,
    zIndex: getComputedStyle(img).zIndex
  })),
  backgroundImages: [...document.querySelectorAll('*')].filter(el => {
    const bg = getComputedStyle(el).backgroundImage;
    return bg && bg !== 'none';
  }).map(el => ({
    url: getComputedStyle(el).backgroundImage,
    // classList also works for SVG elements, where className is not a string
    element: el.tagName + '.' + (el.classList[0] ?? '')
  })),
  // ... videos, fonts, favicons
})
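The `backgroundImage` values collected above are raw CSS strings such as `url("...")`, possibly with several layers or gradients mixed in. Before downloading, they need to be reduced to plain URLs. A minimal sketch, assuming a hypothetical helper named `extractCssUrls` (not part of the template) that skips inline `data:` URIs because there is nothing to download for them:

```typescript
// Pull every url(...) target out of a computed background-image value.
// Gradients and "none" produce no matches; data: URIs are skipped on purpose.
function extractCssUrls(backgroundImage: string): string[] {
  // Matches url("x"), url('x'), and unquoted url(x).
  const pattern = /url\(\s*(?:"([^"]*)"|'([^']*)'|([^)\s]+))\s*\)/g;
  const urls: string[] = [];
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(backgroundImage)) !== null) {
    const url = match[1] ?? match[2] ?? match[3];
    if (url && url.indexOf("data:") !== 0) {
      urls.push(url);
    }
  }
  return urls;
}
```

Relative results such as `/b.svg` still have to be resolved against the page URL before they can be fetched.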

Phase 3: Component Specs

Before any Builder Agent is dispatched, a spec file must be written for that section in docs/research/components/<name>.md. The spec is the contract between extraction and construction — not optional.

What the spec file contains:

A screenshot crop of the section (local path)
Exact CSS values extracted via getComputedStyle() — not eyeballed estimates
Downloaded asset local paths (the public/ path, not the original URL)
Real text content (from element.textContent, not placeholders)
All interaction states (content per tab state, CSS diff before/after scroll trigger, transition animation parameters)
Responsive breakpoint behaviors
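The checklist above lends itself to a mechanical completeness check before a Builder is dispatched. The sketch below is one possible way to express it; the template stores specs as Markdown, so the `ComponentSpec` shape, its field names, and `findSpecGaps` are all hypothetical.

```typescript
// Hypothetical in-memory form of a component spec (field names are assumptions).
interface ComponentSpec {
  name: string;
  screenshotPath: string;                  // local path to the section crop
  computedStyles: Record<string, string>;  // values taken from getComputedStyle()
  assetPaths: string[];                    // should be local public/ paths
  textContent: string[];                   // real copy from the page
  interactionModel: string;                // e.g. "scroll-driven with IntersectionObserver"
  breakpoints: string[];                   // responsive behavior notes
}

// Return human-readable problems; an empty array means the spec looks complete.
function findSpecGaps(spec: ComponentSpec): string[] {
  const gaps: string[] = [];
  if (!spec.screenshotPath) gaps.push("missing screenshot crop");
  if (Object.keys(spec.computedStyles).length === 0) gaps.push("no computed CSS values");
  const remote = spec.assetPaths.filter((p) => /^https?:\/\//.test(p));
  if (remote.length > 0) gaps.push(`assets not downloaded: ${remote.join(", ")}`);
  if (spec.textContent.length === 0) gaps.push("no real text content");
  if (!spec.interactionModel) gaps.push("interaction model not declared");
  if (spec.breakpoints.length === 0) gaps.push("no responsive breakpoint notes");
  return gaps;
}
```

Returning a list of gaps instead of a boolean lets an orchestrator report exactly which extraction step needs to be repeated.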

CSS extraction script (executed via browser MCP):

// Replace SELECTOR with the CSS selector of the element to inspect
(function(selector) {
  const el = document.querySelector(selector);
  // Guard against a selector that matches nothing
  if (!el) return { error: 'No element matches ' + selector };
  const computed = getComputedStyle(el);
  const props = [
    'fontSize','fontWeight','fontFamily','lineHeight','letterSpacing',
    'color','backgroundColor','background',
    'padding','paddingTop','paddingRight','paddingBottom','paddingLeft',
    'margin','borderRadius','boxShadow','display','flexDirection',
    'gap','alignItems','justifyContent','position','zIndex',
    // ... full property list
  ];
  return Object.fromEntries(props.map(p => [p, computed[p]]));
})('SELECTOR')

Complexity budget rule: If a section's spec file exceeds ~150 lines, the section is too complex for one agent — split it into smaller pieces. This is a mechanical check that cannot be overridden with "but they're all related."
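Because the rule is mechanical, it can be checked with a line count. A minimal sketch, assuming a hypothetical helper `checkComplexityBudget`; the 150-line default mirrors the figure above, and the suggested number of parts is my own illustrative heuristic rather than anything the template prescribes.

```typescript
// Count spec lines and flag specs that exceed the budget.
function checkComplexityBudget(
  specMarkdown: string,
  maxLines: number = 150
): { lines: number; overBudget: boolean; suggestedParts: number } {
  // Ignore a single trailing newline so "a\n" counts as one line.
  const text = specMarkdown.replace(/\r?\n$/, "");
  const lines = text === "" ? 0 : text.split(/\r?\n/).length;
  return {
    lines,
    overBudget: lines > maxLines,
    // Illustrative heuristic: enough parts that each fits within the budget.
    suggestedParts: Math.max(1, Math.ceil(lines / maxLines)),
  };
}
```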

Phase 4: Parallel Build via git Worktrees

This is the key architectural decision — git worktrees give each Builder Agent an isolated working branch:

# Orchestrator creates a worktree per section
# -b creates the branch; without it, git expects the branch to exist already
git worktree add -b feature/hero-section .worktrees/hero-section
git worktree add -b feature/features-grid .worktrees/features-grid
git worktree add -b feature/pricing-section .worktrees/pricing-section

# Each Builder Agent works in its own worktree
# Builder receives the full spec file content inline in its prompt
# Builder verifies: npx tsc --noEmit passes before finishing

What each Builder Agent receives:

Full spec file content (inlined into the prompt, not a path reference)
Screenshot crop path
Downloaded asset local paths
Global style system (font variables, color tokens)

The Builder doesn't need to read other sections' code or understand the overall page structure. Its only job: implement this one component to spec, with TypeScript compiling clean.
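An orchestrator has to turn section names into one branch and one directory each. The sketch below only generates the command strings; `slugify`, `worktreeCommands`, and the `.worktrees` default are illustrative assumptions. It uses `git worktree add -b <new-branch> <path>`, which creates the branch and its worktree in one step.

```typescript
// Turn a section title into a branch-safe and path-safe slug.
function slugify(section: string): string {
  return section
    .toLowerCase()
    .trim()
    .replace(/[^a-z0-9]+/g, "-")
    .replace(/^-+|-+$/g, "");
}

// Build one "git worktree add" command per section.
function worktreeCommands(sections: string[], baseDir: string = ".worktrees"): string[] {
  return sections.map((section) => {
    const slug = slugify(section);
    return `git worktree add -b feature/${slug} ${baseDir}/${slug}`;
  });
}
```

Generating strings rather than executing them keeps the sketch free of side effects; a real orchestrator would also need to handle slugs that collide or branches that already exist.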

Phase 5: Assembly & QA

# Orchestrator merges all worktree branches
git merge feature/hero-section feature/features-grid feature/pricing-section ...
# Note: a multi-branch (octopus) merge aborts if any branch conflicts. In that case,
# merge the branches one at a time and resolve conflicts (Orchestrator has full context)

# Wire up all section components in correct order in src/app/page.tsx

# Final verification
npm run build    # Must pass — no exceptions
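The wiring step amounts to emitting imports and JSX in topology order. A minimal sketch of that idea follows; the `@/components/<Name>` import path, the named exports, and both helper names are assumptions for illustration, since the template's actual component layout may differ.

```typescript
// "hero-section" -> "HeroSection"
function toComponentName(slug: string): string {
  return slug
    .split("-")
    .filter(Boolean)
    .map((word) => word.charAt(0).toUpperCase() + word.slice(1))
    .join("");
}

// Produce page source that renders the sections in the given order.
function renderPageSource(orderedSlugs: string[]): string {
  const names = orderedSlugs.map(toComponentName);
  // Assumed convention: one named export per file under @/components.
  const imports = names.map((n) => `import { ${n} } from "@/components/${n}";`);
  const body = names.map((n) => `      <${n} />`);
  return [
    ...imports,
    "",
    "export default function Page() {",
    "  return (",
    "    <main>",
    ...body,
    "    </main>",
    "  );",
    "}",
    "",
  ].join("\n");
}
```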

The Single Most Expensive Mistake: Wrong Interaction Model

The SKILL file dedicates significant space to this because building a click-based UI when the original is scroll-driven means a complete rewrite, not a CSS tweak.

Identification protocol:

Don't click first. Scroll slowly and observe if anything changes on its own.
If yes → scroll-driven. Extract the mechanism: IntersectionObserver, scroll-snap, position: sticky, animation-timeline, or JS scroll listeners.
If no → then test click/hover-driven interactivity.
Document explicitly in the spec, for example "INTERACTION MODEL: scroll-driven with IntersectionObserver" or "INTERACTION MODEL: click-to-switch with opacity transition".
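The protocol above is essentially an ordered decision: scroll evidence wins before click evidence is even considered. A minimal sketch, where `SweepObservation` and `classifyInteraction` are hypothetical names and the idle check for time-driven behavior is my own assumption about where that case fits in the order:

```typescript
type ObservedModel = "static" | "click-driven" | "scroll-driven" | "time-driven";

// What the sweep recorded for one section (field names are assumptions).
interface SweepObservation {
  changesOnScroll: boolean;        // something changed while only scrolling
  changesWhileIdle: boolean;       // something changed with no input at all
  changesOnClickOrHover: boolean;  // something changed after clicking or hovering
}

// Apply the checks in protocol order: scroll first, click/hover last.
function classifyInteraction(o: SweepObservation): ObservedModel {
  if (o.changesOnScroll) return "scroll-driven";
  if (o.changesWhileIdle) return "time-driven"; // assumed placement in the order
  if (o.changesOnClickOrHover) return "click-driven";
  return "static";
}
```

A section that responds to both scrolling and clicking is classified as scroll-driven here, which matches the instruction not to click first.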

Classic scroll-driven patterns to watch for:

A sticky sidebar where the active item auto-changes as content scrolls past (IntersectionObserver, NOT click handlers)
Tabbed/pill content that cycles on its own as the page scrolls, which is easy to rebuild incorrectly as click-based tabs
Smooth scroll libraries (Lenis, Locomotive Scroll) — check for .lenis class or scroll container wrappers
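For the sticky sidebar case, the core logic is choosing which section counts as active from visibility data, with no click handlers involved. The sketch below isolates that choice as a pure function. `VisibilitySample` and `pickActiveSection` are hypothetical names; in a browser, the samples would come from the latest `IntersectionObserverEntry` recorded for each section (its `isIntersecting` and `intersectionRatio` fields), since an observer callback only reports entries that changed.

```typescript
// Latest known visibility for one section.
interface VisibilitySample {
  id: string;
  isIntersecting: boolean;
  intersectionRatio: number; // 0..1, share of the section inside the viewport
}

// Pick the most visible intersecting section; keep the fallback if none is visible.
function pickActiveSection(
  samples: VisibilitySample[],
  fallback: string | null = null
): string | null {
  let best: VisibilitySample | null = null;
  for (const sample of samples) {
    if (!sample.isIntersecting) continue;
    if (best === null || sample.intersectionRatio > best.intersectionRatio) {
      best = sample;
    }
  }
  return best ? best.id : fallback;
}
```

Passing the previously active id as `fallback` avoids the sidebar going blank while the viewport sits between two sections.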

Project Structure

src/
  app/              # Next.js routes
  components/
    ui/             # shadcn/ui primitives
    icons.tsx       # SVG icons extracted from target (React components)
  lib/utils.ts      # cn() utility
  types/            # TypeScript interfaces
  hooks/            # Custom React hooks
public/
  images/           # Downloaded images from target
  videos/           # Downloaded videos from target
  seo/              # Favicons, OG images, webmanifest
docs/
  research/         # Extraction output: component specs, behaviors, topology
  design-references/ # Screenshots (desktop + mobile)
scripts/
  sync-agent-rules.sh  # Sync AGENTS.md to all platform formats
  sync-skills.mjs      # Sync /clone-website to all platforms
AGENTS.md           # Single source of truth for all agent instructions
CLAUDE.md           # Claude Code config (imports AGENTS.md)
GEMINI.md           # Gemini CLI config (imports AGENTS.md)

Cross-Platform Design: One Source, Many Targets

The template supports 13 AI coding platforms without maintaining 13 separate instruction sets:

| What | Source of Truth | Sync Command |
| --- | --- | --- |
| Project instructions | `AGENTS.md` | `bash scripts/sync-agent-rules.sh` |
| `/clone-website` skill | `.claude/skills/clone-website/SKILL.md` | `node scripts/sync-skills.mjs` |

Each script generates platform-specific copies automatically. Agents that can read the source files natively need no regeneration.

This pattern — single source file plus generation scripts — is worth borrowing for any tool that needs to support multiple AI coding environments.
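The single-source pattern can be sketched as a pure planning function that maps one source file to many generated targets. This is an illustration of the idea, not a port of `sync-agent-rules.sh`: `planSync`, the generated-file header, and the plain-copy strategy are assumptions, and the real scripts may emit imports or platform-specific formats instead.

```typescript
// Compute the content of each generated file without touching the disk.
function planSync(
  sourcePath: string,
  sourceText: string,
  targetPaths: string[]
): Record<string, string> {
  // Assumed header so readers know not to edit generated copies by hand.
  const header = `<!-- Generated from ${sourcePath}. Edit the source, then re-run the sync. -->\n\n`;
  const plan: Record<string, string> = {};
  for (const target of targetPaths) {
    if (target === sourcePath) continue; // never overwrite the source of truth
    plan[target] = header + sourceText;
  }
  return plan;
}
```

Separating planning from writing makes the sync easy to test and to run in a dry-run mode before any file is overwritten.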

Resources

Official Links

🌟 GitHub: JCodesMore/ai-website-cloner-template
🎬 Demo video: YouTube link in the project README
💬 Discord: discord.gg/hrTSX5yTpB

Summary

ai-website-cloner-template's value isn't the tech stack choice (Next.js + shadcn + Tailwind is standard). It's the multi-agent collaboration protocol that's been thought through carefully.

A few design decisions worth internalizing:

"Completeness beats speed": Builder Agents receive everything before starting. No guessing mid-build. The constraint makes results more reliable at the cost of a slower reconnaissance phase — the author considers that tradeoff correct.

Complexity budget (150-line spec = split signal): A mechanical rule controlling task scope, not a judgment call. Engineering discipline, not intuition.

git worktree parallelism: Each Builder works in an isolated branch; the Orchestrator merges at the end. Parallelism isn't "run multiple tasks simultaneously" — it's isolated work with clear merge semantics.

Single source + sync scripts: AGENTS.md is the single source of truth for 13 platforms. One edit, one script run, all platforms updated. A pattern worth copying for any multi-environment AI toolchain.

The underlying principle — define the process rigorously enough that any capable agent can follow it, rather than relying on one specific agent to improvise correctly — is applicable well beyond website cloning.

Explore PrimeSkills — a curated marketplace of AI agents and skills, each validated against real enterprise workflows. No hype, just what actually works.

Visit my personal site for more insights and interesting products.
