WonderLab

Open Source Project #108: ai-website-cloner-template — One Command, Parallel Agents, Any Website Reversed into Next.js Code

Introduction

"Point it at a URL, run /clone-website, and your AI agent will inspect the site, extract design tokens and assets, write component specs, and dispatch parallel builders to reconstruct every section."

This is article #108 in the "One Open Source Project a Day" series. Today's project is ai-website-cloner-template — a GitHub template repository for reverse-engineering any website into a Next.js codebase using AI coding agents.

Turning a "website screenshot" into "runnable code" is a classic Vibe Coding scenario, but most implementations stop at the surface: ask the LLM to look at a screenshot, approximate the layout, fill in placeholder content. This template takes a fundamentally different approach: it defines a rigorous multi-phase agent workflow whose core principle is "completeness beats speed" — every Builder Agent must have exact getComputedStyle() values, genuinely downloaded assets, and complete interaction state specifications before touching any code.

22k stars signal real demand for this use case. The more interesting story, though, is the engineering design, especially the use of git worktrees to achieve true parallel multi-agent construction.

What You'll Learn

  • The five-phase clone pipeline: reconnaissance → foundation → component specs → parallel build → assembly QA
  • git worktree parallel agent pattern: how each Builder Agent works in an isolated branch
  • Component spec file design principles: why the spec is a "contract," not a "reference"
  • Interaction model identification: distinguishing click-driven, scroll-driven, and time-driven behaviors before building
  • Cross-platform agent support: how 13 AI coding tools are all served from a single source file

Prerequisites

  • Experience with Claude Code, Cursor, or similar AI coding tools
  • Basic familiarity with Next.js
  • Basic understanding of git branches

Project Background

Overview

ai-website-cloner-template is a GitHub template repository (is_template: true) that pre-scaffolds a Next.js 16 + shadcn/ui + Tailwind v4 project alongside a /clone-website AI Agent Skill.

Usage pattern: click "Use this template" on GitHub to create your own repository, start your AI agent, and run /clone-website <target-url>. The agent then completes the full pipeline from reconnaissance to working code.

The project's core engineering contribution isn't the tech stack choice — it's the multi-agent collaboration protocol behind the /clone-website Skill, particularly using git worktrees for genuinely parallel component builds and enforcing the "spec first, builder second" constraint.

Author / Team

  • Author: JCodesMore
  • Primary Language: TypeScript
  • License: MIT
  • Community: Discord

Project Stats

  • ⭐ GitHub Stars: 22,100+
  • 🍴 Forks: 3,173+ (most used to create independent projects)
  • 📄 License: MIT
  • 📅 Created: March 2026

Features

What It Does

Traditional "screenshot clone" approach:
Screenshot → LLM approximates layout → placeholder content → manual color/spacing corrections → iterate

ai-website-cloner-template approach:
/clone-website https://example.com
    ↓
Phase 1 Recon: screenshots + scroll/click/hover interaction sweep + design token extraction
    ↓
Phase 2 Foundation: update fonts/colors/globals.css + download all assets + extract SVG icons
    ↓
Phase 3 Specs: write spec files per section (exact CSS values + real content + interaction states)
    ↓
Phase 4 Parallel Build: git worktree per section + dispatch Builder Agents in parallel
    ↓
Phase 5 Assembly QA: merge all worktrees + wire up page + npm run build verification

Use Cases

  1. Platform migration: Rebuild a site you own from WordPress/Webflow/Squarespace into a Next.js codebase, gaining full code ownership
  2. Lost source code: The site is live but the repo is gone, the developer left, or the stack is too legacy to maintain — recover the code from the live version in a modern format
  3. Learning: Deconstruct how production sites achieve specific layouts, animations, and responsive behavior by working with the actual code (not just screenshots)

Not Intended For

The README explicitly states:

  • Phishing or impersonation: forbidden for deceptive purposes, impersonation, or illegal activities
  • Passing off others' designs as your own: logos, brand assets, and original copy belong to their owners
  • Violating terms of service: some sites explicitly prohibit scraping or reproduction — check first

Quick Start

1. Create your own repository

On the GitHub project page, click Use this template → Create a new repository (don't clone the template repository directly).

2. Clone and install

git clone https://github.com/YOUR-USERNAME/YOUR-NEW-REPO.git
cd YOUR-NEW-REPO
npm install

3. Start Claude Code (recommended)

claude --chrome   # --chrome starts Chrome MCP for browser automation

4. Run the clone skill

/clone-website https://target-website.com

Multiple URLs can be processed in parallel:

/clone-website https://site1.com https://site2.com

5. Dev preview

npm run dev        # Start dev server
npm run check      # Run lint + typecheck + build

Supported AI Agent Platforms

| Agent | Status |
| --- | --- |
| Claude Code | Recommended (Opus 4.7) |
| Codex CLI | Supported |
| OpenCode | Supported |
| GitHub Copilot | Supported |
| Cursor | Supported |
| Windsurf | Supported |
| Gemini CLI | Supported |
| Cline | Supported |
| Roo Code | Supported |
| Continue | Supported |
| Amazon Q | Supported |
| Augment Code | Supported |
| Aider | Supported |
Cross-platform support works like this: AGENTS.md is the single source of truth for all project instructions. Running bash scripts/sync-agent-rules.sh auto-generates platform-specific copies (CLAUDE.md, GEMINI.md, .cursor/, .windsurf/, etc.) from that single file.


Deep Dive

The Five-Phase Pipeline

Phase 1: Reconnaissance

This is not just screenshots. Reconnaissance requires three mandatory tasks:

Screenshots: Full-page screenshots at 1440px (desktop) and 390px (mobile), saved to docs/design-references/. These are the visual master reference for the entire process.

Mandatory interaction sweep (the most commonly skipped step):

Scroll sweep — scroll slowly from top to bottom, observe:
  - At what scroll position does the navbar change appearance?
  - Which elements animate in when entering the viewport?
  - Which sections have scroll-snap points?
  - Is a smooth scroll library (Lenis, Locomotive Scroll) active?

Click sweep — click every element that looks interactive:
  - What content does each tab/pill switch to?
  - What modals open, what dropdowns appear?

Hover sweep — hover every element that might have hover states:
  - Color changes, scale, shadow, opacity, underlines...

Responsive sweep — test at 1440px / 768px / 390px:
  - Note layout changes and the approximate breakpoint where they occur

All findings are saved to docs/research/BEHAVIORS.md — the "behavior bible" for the entire cloning process.

Page topology: Map every distinct section from top to bottom, document each section's interaction model (static / click-driven / scroll-driven / time-driven), save as docs/research/PAGE_TOPOLOGY.md — the assembly blueprint.
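
To make the topology step concrete, here is a minimal sketch of how a list of sections and their interaction models could be rendered into a PAGE_TOPOLOGY.md body. The entry shape, the function name, and the output layout are assumptions for illustration; the template's actual file format may differ.

```javascript
// The four interaction models named in the article.
const INTERACTION_MODELS = ["static", "click-driven", "scroll-driven", "time-driven"];

// Hypothetical helper: render ordered sections into a topology document.
function renderTopology(sections) {
  const lines = ["# Page Topology", ""];
  sections.forEach((section, index) => {
    // Reject anything outside the four known models so typos surface early.
    if (!INTERACTION_MODELS.includes(section.model)) {
      throw new Error(`Unknown interaction model for ${section.name}: ${section.model}`);
    }
    lines.push(`${index + 1}. ${section.name} (INTERACTION MODEL: ${section.model})`);
  });
  return lines.join("\n");
}

// Sections are listed top to bottom, which is also the assembly order.
const topology = renderTopology([
  { name: "navbar", model: "scroll-driven" },
  { name: "hero-section", model: "static" },
  { name: "pricing-section", model: "click-driven" },
]);
console.log(topology);
```

Keeping the model next to each section name means the assembly step and every spec can read the same decision from one place.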

Phase 2: Foundation Build

Done by the Orchestrator Agent directly — not delegated to sub-agents because it touches many files:

  1. Update src/app/layout.tsx: configure the target site's actual fonts via next/font/google or next/font/local
  2. Update src/app/globals.css: map the target site's color tokens (background, foreground, primary, muted...) to the shadcn variable system using oklch color space
  3. Extract all SVG icons → save as named React components in src/components/icons.tsx
  4. Run the asset download script (scripts/download-assets.mjs): batch-download all images and videos to public/
  5. Verify: npm run build passes
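
Step 2 depends on converting the colors the browser reports (usually sRGB) into oklch values. The sketch below shows one way to do that for 6-digit hex colors using the published OKLab conversion matrices. The function names are hypothetical, and nothing here is taken from the template itself.

```javascript
// Undo the sRGB transfer curve for one 0-255 channel.
function srgbToLinear(channel) {
  const c = channel / 255;
  return c <= 0.04045 ? c / 12.92 : Math.pow((c + 0.055) / 1.055, 2.4);
}

// Convert "#rrggbb" to OKLCH components (L in 0..1, hue in degrees).
function hexToOklch(hex) {
  const match = /^#?([0-9a-f]{6})$/i.exec(hex);
  if (!match) throw new Error(`Expected a 6-digit hex color, got: ${hex}`);
  const value = parseInt(match[1], 16);
  const r = srgbToLinear((value >> 16) & 0xff);
  const g = srgbToLinear((value >> 8) & 0xff);
  const b = srgbToLinear(value & 0xff);

  // Linear sRGB -> cone-like response, then cube root (OKLab definition).
  const l = Math.cbrt(0.4122214708 * r + 0.5363325363 * g + 0.0514459929 * b);
  const m = Math.cbrt(0.2119034982 * r + 0.6806995451 * g + 0.1073969566 * b);
  const s = Math.cbrt(0.0883024619 * r + 0.2817188376 * g + 0.6299787005 * b);

  const L = 0.2104542553 * l + 0.793617785 * m - 0.0040720468 * s;
  const A = 1.9779984951 * l - 2.428592205 * m + 0.4505937099 * s;
  const B = 0.0259040371 * l + 0.7827717662 * m - 0.808675766 * s;

  // Polar form: chroma is the distance from gray, hue is the angle.
  const chroma = Math.hypot(A, B);
  const hue = chroma < 1e-4 ? 0 : ((Math.atan2(B, A) * 180) / Math.PI + 360) % 360;
  return { L, chroma, hue };
}

// Format the components as a CSS oklch() value.
function toCssOklch({ L, chroma, hue }) {
  return `oklch(${L.toFixed(3)} ${chroma.toFixed(3)} ${hue.toFixed(1)})`;
}

console.log(toCssOklch(hexToOklch("#ff0000")));
```

Hue is forced to 0 for near-gray colors because the angle is numerically meaningless when chroma is close to zero.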

Asset discovery runs JavaScript via browser MCP to enumerate every <img> element, every <video> element, and every element with a CSS background-image, including absolutely-positioned overlay layers. A section that looks like a single image is often a background watercolor, a foreground UI mockup PNG, and an overlay icon stacked together; missing any layer makes the clone look empty.

// Run via browser MCP to discover all assets
JSON.stringify({
  images: [...document.querySelectorAll('img')].map(img => ({
    src: img.src || img.currentSrc,
    alt: img.alt,
    parentClasses: img.parentElement?.className,
    siblings: img.parentElement ? [...img.parentElement.querySelectorAll('img')].length : 0,
    position: getComputedStyle(img).position,
    zIndex: getComputedStyle(img).zIndex
  })),
  backgroundImages: [...document.querySelectorAll('*')].filter(el => {
    const bg = getComputedStyle(el).backgroundImage;
    return bg && bg !== 'none';
  }).map(el => ({
    url: getComputedStyle(el).backgroundImage,
    // classList also works for SVG elements, whose className is not a plain string
    element: el.tagName + '.' + (el.classList[0] || '')
  })),
  // ... videos, fonts, favicons
})
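
The backgroundImage values collected above are raw CSS strings such as url("...") mixed with gradients, so they need parsing before anything can be downloaded. Here is a minimal sketch of that step, plus a mapping from remote URL to a local path under public/. Both function names and the file-naming scheme are assumptions; scripts/download-assets.mjs may work differently.

```javascript
// Pull every url(...) target out of a computed background-image value.
function extractCssUrls(backgroundImage) {
  const urls = [];
  // Matches url("..."), url('...'), and unquoted url(...).
  const pattern = /url\(\s*(?:"([^"]*)"|'([^']*)'|([^)]*?))\s*\)/g;
  let match;
  while ((match = pattern.exec(backgroundImage)) !== null) {
    const url = match[1] ?? match[2] ?? match[3];
    // Inline data URIs have nothing to download, so skip them.
    if (url && !url.startsWith("data:")) urls.push(url);
  }
  return urls;
}

// Map an absolute asset URL to a local path (hypothetical naming scheme).
function localAssetPath(assetUrl, folder = "images") {
  const { pathname } = new URL(assetUrl);
  const fileName = pathname.split("/").pop() || "asset";
  return `public/${folder}/${fileName}`;
}

const layered =
  'linear-gradient(rgba(0, 0, 0, 0.4), rgba(0, 0, 0, 0)), url("https://example.com/img/hero-bg.png")';
const found = extractCssUrls(layered);
console.log(found.map((url) => localAssetPath(url)));
```

Recording the local path at this stage is what lets the later spec files reference public/ paths instead of the original URLs.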

Phase 3: Component Specs

Before any Builder Agent is dispatched, a spec file must be written for that section in docs/research/components/<name>.md. The spec is the contract between extraction and construction — not optional.

What the spec file contains:

  • A screenshot crop of the section (local path)
  • Exact CSS values extracted via getComputedStyle() — not eyeballed estimates
  • Downloaded asset local paths (the public/ path, not the original URL)
  • Real text content (from element.textContent, not placeholders)
  • All interaction states (content per tab state, CSS diff before/after scroll trigger, transition animation parameters)
  • Responsive breakpoint behaviors
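
As an illustration of how those ingredients might sit together in one file, the sketch below renders a spec from a plain object. The headings, field names, and sample values are all hypothetical; the template's real spec layout is defined by its SKILL file, not by this example.

```javascript
// Hypothetical spec layout: headings and field names are illustrative only.
function renderSpec(spec) {
  const styleLines = Object.entries(spec.styles).map(([prop, value]) => `- ${prop}: ${value}`);
  const stateLines = spec.states.map((state) => `- ${state.name}: ${state.description}`);
  return [
    `# ${spec.name}`,
    `INTERACTION MODEL: ${spec.interactionModel}`,
    `Screenshot: ${spec.screenshot}`,
    "",
    "## Computed styles",
    ...styleLines,
    "",
    "## Assets",
    ...spec.assets.map((path) => `- ${path}`),
    "",
    "## Text content",
    spec.text,
    "",
    "## Interaction states",
    ...stateLines,
  ].join("\n");
}

// Sample values are invented for the example, not extracted from a real site.
const heroSpec = renderSpec({
  name: "hero-section",
  interactionModel: "static",
  screenshot: "docs/design-references/hero-section.png",
  styles: { fontSize: "64px", fontWeight: "700" },
  assets: ["public/images/hero-bg.png"],
  text: "Build faster with less code",
  states: [{ name: "cta:hover", description: "opacity 1 to 0.8 over 150ms" }],
});
console.log(heroSpec);
```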

CSS extraction script (executed via browser MCP):

(function (selector) {
  const el = document.querySelector(selector);
  // Return a readable error instead of throwing when nothing matches
  if (!el) return { error: 'No element matches ' + selector };
  const computed = getComputedStyle(el);
  const props = [
    'fontSize','fontWeight','fontFamily','lineHeight','letterSpacing',
    'color','backgroundColor','background',
    'padding','paddingTop','paddingRight','paddingBottom','paddingLeft',
    'margin','borderRadius','boxShadow','display','flexDirection',
    'gap','alignItems','justifyContent','position','zIndex',
    // ... full property list
  ];
  // Map each property name to its computed value
  return Object.fromEntries(props.map(p => [p, computed[p]]));
})('SELECTOR'); // replace 'SELECTOR' with the section's CSS selector

Complexity budget rule: If a section's spec file exceeds ~150 lines, the section is too complex for one agent — split it into smaller pieces. This is a mechanical check that cannot be overridden with "but they're all related."
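
Because the rule is mechanical, it is easy to express as a check. The following is a minimal sketch with a hypothetical function name; it treats the article's approximate "~150 lines" as a hard threshold, which is an assumption.

```javascript
// The article gives the budget as roughly 150 lines; a hard cutoff is assumed here.
const SPEC_LINE_BUDGET = 150;

// Count lines in a spec and report whether the section should be split.
function checkComplexityBudget(specText, budget = SPEC_LINE_BUDGET) {
  const lineCount = specText.split("\n").length;
  return { lineCount, mustSplit: lineCount > budget };
}

const sampleSpec = Array(180).fill("- property: value").join("\n");
console.log(checkComplexityBudget(sampleSpec));
```

A check like this could run before any Builder is dispatched, so an oversized section is split at spec time rather than discovered mid-build.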

Phase 4: Parallel Build via git Worktrees

This is the key architectural decision — git worktrees give each Builder Agent an isolated working branch:

# Orchestrator creates a worktree per section
# (-b creates the branch; without it, the branch must already exist)
git worktree add -b feature/hero-section .worktrees/hero-section
git worktree add -b feature/features-grid .worktrees/features-grid
git worktree add -b feature/pricing-section .worktrees/pricing-section

# Each Builder Agent works in its own worktree
# Builder receives the full spec file content inline in its prompt
# Builder verifies: npx tsc --noEmit passes before finishing

What each Builder Agent receives:

  • Full spec file content (inlined into the prompt, not a path reference)
  • Screenshot crop path
  • Downloaded asset local paths
  • Global style system (font variables, color tokens)

The Builder doesn't need to read other sections' code or understand the overall page structure. Its only job: implement this one component to spec, with TypeScript compiling clean.
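
A rough sketch of the Orchestrator's dispatch step follows: one function produces the worktree command for a section, and another assembles the Builder's prompt with the spec inlined. The helper names and the prompt wording are hypothetical. The command uses git's -b flag so the branch is created together with the worktree.

```javascript
// Build the git command for one section's worktree.
function worktreeCommand(section) {
  // Restrict names so they are safe to interpolate into a shell command.
  if (!/^[a-z0-9][a-z0-9-]*$/.test(section)) {
    throw new Error(`Unsafe section name: ${section}`);
  }
  return `git worktree add -b feature/${section} .worktrees/${section}`;
}

// Assemble a Builder prompt; the spec text is inlined, not referenced by path.
function builderPrompt({ section, specText, screenshotPath, assetPaths, styleNotes }) {
  return [
    `You are building the "${section}" component in .worktrees/${section}.`,
    "Implement it exactly to the spec below. Do not read other sections.",
    "",
    "SPEC (inlined in full):",
    specText,
    "",
    `Screenshot crop: ${screenshotPath}`,
    `Assets: ${assetPaths.join(", ")}`,
    `Global styles: ${styleNotes}`,
    "",
    "Before finishing, run: npx tsc --noEmit",
  ].join("\n");
}

const builderInput = builderPrompt({
  section: "hero-section",
  specText: "# hero-section\nINTERACTION MODEL: static",
  screenshotPath: "docs/design-references/hero-section.png",
  assetPaths: ["public/images/hero-bg.png"],
  styleNotes: "font variables and color tokens from globals.css",
});
console.log(worktreeCommand("hero-section"));
console.log(builderInput);
```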

Phase 5: Assembly & QA

# Orchestrator merges the worktree branches one at a time
# (a single multi-branch "octopus" merge refuses any merge that needs manual conflict resolution)
git merge feature/hero-section
git merge feature/features-grid
git merge feature/pricing-section
# ... repeat for each remaining section branch
# Resolve merge conflicts as they arise (Orchestrator has full context for smart resolution)

# Wire up all section components in correct order in src/app/page.tsx

# Final verification
npm run build    # Must pass, no exceptions
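
The "wire up" step can be sketched as generating the page source from the ordered section list. This is illustrative only: the component names, the "@/components/..." import paths, and the Home function name are assumptions, and the Orchestrator in the real workflow edits src/app/page.tsx directly rather than running a generator.

```javascript
// "hero-section" -> "HeroSection"
function toComponentName(section) {
  return section
    .split("-")
    .map((part) => part.charAt(0).toUpperCase() + part.slice(1))
    .join("");
}

// Produce page source that imports and renders each section in order.
function renderPage(sections) {
  const names = sections.map(toComponentName);
  const imports = names.map((name) => `import { ${name} } from "@/components/${name}";`);
  const elements = names.map((name) => `      <${name} />`);
  return [
    ...imports,
    "",
    "export default function Home() {",
    "  return (",
    "    <main>",
    ...elements,
    "    </main>",
    "  );",
    "}",
  ].join("\n");
}

const pageSource = renderPage(["hero-section", "features-grid", "pricing-section"]);
console.log(pageSource);
```

The input order should come from the page topology document, since that is where top-to-bottom order was recorded.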

The Single Most Expensive Mistake: Wrong Interaction Model

The SKILL file dedicates significant space to this because building a click-based UI when the original is scroll-driven means a complete rewrite, not a CSS tweak.

Identification protocol:

  1. Don't click first. Scroll slowly and observe if anything changes on its own.
  2. If yes → scroll-driven. Extract the mechanism: IntersectionObserver, scroll-snap, position: sticky, animation-timeline, or JS scroll listeners.
  3. If no → then test click/hover-driven interactivity.
  4. Document explicitly in the spec: INTERACTION MODEL: scroll-driven with IntersectionObserver or INTERACTION MODEL: click-to-switch with opacity transition.
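
The protocol above is essentially an ordered decision: check scroll first, and consider click or hover only afterward. A minimal sketch of that ordering is shown here. The observation field names are hypothetical, and where the time-driven check belongs is an assumption, since the protocol does not say.

```javascript
// Turn sweep observations into the line that goes in the spec.
function identifyInteractionModel(observed) {
  // Steps 1-2: anything that changes while scrolling, before any click, is scroll-driven.
  if (observed.changesOnScroll) {
    const mechanism = observed.mechanism || "unidentified mechanism";
    return `INTERACTION MODEL: scroll-driven with ${mechanism}`;
  }
  // Assumption: content that changes while the page sits idle is time-driven.
  if (observed.changesWhenIdle) {
    return "INTERACTION MODEL: time-driven";
  }
  // Step 3: only now consider click or hover.
  if (observed.changesOnClick) return "INTERACTION MODEL: click-driven";
  if (observed.changesOnHover) return "INTERACTION MODEL: hover-driven";
  return "INTERACTION MODEL: static";
}

// A sticky sidebar that also responds to clicks is still classified by its scroll behavior.
console.log(
  identifyInteractionModel({
    changesOnScroll: true,
    changesOnClick: true,
    mechanism: "IntersectionObserver",
  })
);
```

Putting the scroll check first encodes the "don't click first" rule: a section that reacts to both scroll and click is never misfiled as click-driven.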

Classic scroll-driven patterns to watch for:

  • A sticky sidebar where the active item auto-changes as content scrolls past (IntersectionObserver, NOT click handlers)
  • Tabbed/pill content that cycles on its own as the page scrolls, which behaves wrongly if it is rebuilt as click-based
  • Smooth scroll libraries (Lenis, Locomotive Scroll) — check for .lenis class or scroll container wrappers

Project Structure

src/
  app/              # Next.js routes
  components/
    ui/             # shadcn/ui primitives
    icons.tsx       # SVG icons extracted from target (React components)
  lib/utils.ts      # cn() utility
  types/            # TypeScript interfaces
  hooks/            # Custom React hooks
public/
  images/           # Downloaded images from target
  videos/           # Downloaded videos from target
  seo/              # Favicons, OG images, webmanifest
docs/
  research/         # Extraction output: component specs, behaviors, topology
  design-references/ # Screenshots (desktop + mobile)
scripts/
  sync-agent-rules.sh  # Sync AGENTS.md to all platform formats
  sync-skills.mjs      # Sync /clone-website to all platforms
AGENTS.md           # Single source of truth for all agent instructions
CLAUDE.md           # Claude Code config (imports AGENTS.md)
GEMINI.md           # Gemini CLI config (imports AGENTS.md)

Cross-Platform Design: One Source, Many Targets

The template supports 13 AI coding platforms without maintaining 13 separate instruction sets:

| What | Source of Truth | Sync Command |
| --- | --- | --- |
| Project instructions | AGENTS.md | bash scripts/sync-agent-rules.sh |
| /clone-website skill | .claude/skills/clone-website/SKILL.md | node scripts/sync-skills.mjs |

Each script generates platform-specific copies automatically. Agents that can read the source files natively need no regeneration.

This pattern — single source file plus generation scripts — is worth borrowing for any tool that needs to support multiple AI coding environments.
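
For readers who want to borrow the pattern, here is a minimal sketch of the idea as pure functions: derive every target from one source string, and detect targets that have drifted. The two target names come from the project structure above. The banner text and function names are hypothetical, and the real scripts may emit import stubs rather than full copies.

```javascript
// Two of the generated targets named in the article.
const TARGETS = ["CLAUDE.md", "GEMINI.md"];

// Derive the content of every target file from the single source text.
function generateCopies(sourceText, targets = TARGETS) {
  const banner = "<!-- Generated from AGENTS.md. Edit that file, then re-run the sync. -->";
  return Object.fromEntries(targets.map((path) => [path, `${banner}\n\n${sourceText}`]));
}

// Report which targets no longer match what the source would generate.
function staleTargets(sourceText, filesOnDisk, targets = TARGETS) {
  const expected = generateCopies(sourceText, targets);
  return targets.filter((path) => filesOnDisk[path] !== expected[path]);
}

const source = "# Project rules\nRun npm run check before committing.";
const onDisk = generateCopies(source);
onDisk["GEMINI.md"] += "\nHand-edited line";
console.log(staleTargets(source, onDisk));
```

A drift check like staleTargets is useful in CI: it fails when someone edits a generated file instead of the source.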



Summary

ai-website-cloner-template's value isn't the tech stack choice (Next.js + shadcn + Tailwind is standard). It's the multi-agent collaboration protocol that's been thought through carefully.

A few design decisions worth internalizing:

"Completeness beats speed": Builder Agents receive everything before starting. No guessing mid-build. The constraint makes results more reliable at the cost of a slower reconnaissance phase — the author considers that tradeoff correct.

Complexity budget (150-line spec = split signal): A mechanical rule controlling task scope, not a judgment call. Engineering discipline, not intuition.

git worktree parallelism: Each Builder works in an isolated branch; the Orchestrator merges at the end. Parallelism isn't "run multiple tasks simultaneously" — it's isolated work with clear merge semantics.

Single source + sync scripts: AGENTS.md is the single source of truth for 13 platforms. One edit, one script run, all platforms updated. A pattern worth copying for any multi-environment AI toolchain.

The underlying principle — define the process rigorously enough that any capable agent can follow it, rather than relying on one specific agent to improvise correctly — is applicable well beyond website cloning.


