DEV Community

GSD: Zero to Productive in Claude Code Without the Faff

Steven Gonsalvez — Sun, 26 Apr 2026 19:46:42 +0000

The Context Rot Problem, Sorted

Ever had a Claude Code session go long and watched the model just... forget what it was doing three steps ago? Context rot. The window fills up, quality tanks, you're repeating yourself, and the whole thing turns into a shambles. Everyone hits it eventually.

GSD (Get Shit Done) fixes this by splitting your work into phases: discuss, plan, execute, verify. Each phase keeps its own focused context instead of dumping everything into one massive window. The model knows what to pay attention to right now rather than trying to hold your entire session in its head.

48K stars on GitHub, which is mental for a prompting framework. It ships 69 commands and 24 agents (researchers and verifiers that run in parallel), and it works across 12+ runtimes. Not just Claude Code. Cursor, Gemini, Copilot, Cline, OpenCode, the lot.

The bit I reckon matters most: you write what you want in a structured spec, GSD breaks it into atomic tasks, each task gets its own clean context, and you get proper git commits per task. No more "one massive diff that nobody wants to review" at the end of a session.
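To make the "clean context per task" idea concrete, here's a rough sketch in Python. It is not GSD's code: `Task`, `build_context`, and `commit_message` are names I've made up for illustration, and the spec and file paths are placeholders.

```python
from dataclasses import dataclass

# The four phases named above; everything else here is hypothetical.
PHASES = ("discuss", "plan", "execute", "verify")


@dataclass
class Task:
    title: str
    files: list[str]


def build_context(spec: str, task: Task) -> str:
    """Build a fresh prompt holding only what this one task needs."""
    return "\n".join([
        f"Spec: {spec}",
        f"Task: {task.title}",
        "Files: " + ", ".join(task.files),
    ])


def commit_message(task: Task) -> str:
    """One commit per task keeps the diff small enough to review."""
    return f"feat: {task.title}"


spec = "Add password reset to the auth service"
tasks = [
    Task("add reset token model", ["models/token.py"]),
    Task("add reset endpoint", ["api/reset.py"]),
]
# Each context is built from scratch, so task 1 never sees task 2's files.
contexts = [build_context(spec, task) for task in tasks]
commits = [commit_message(task) for task in tasks]
```

The point of the sketch is the shape, not the detail: nothing from one task leaks into the next prompt, and every task maps to exactly one commit.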

Built by TACHES. The GSD-2 fork takes it further with even more agents. If you've been faffing about with raw Claude Code prompts and wondering why long sessions go wonky, start here. Nick the bits that work for you, bin the rest.

ntfy + PingMe: Get Pinged When Your Agent Finishes

Steven Gonsalvez — Sun, 26 Apr 2026 19:46:35 +0000

You kick off a Claude Code task that's going to take twenty minutes. You tab away. You forget. Forty minutes later you remember, and the agent's been sat there waiting for input since minute twelve. Proper waste of time, and I kept doing it until I wired up notifications.

ntfy is the one I reach for. Self-hostable, no signup, no faff. One curl and your phone buzzes.

curl -d "Agent finished the migration" ntfy.sh/my-agent-alerts
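If you'd rather fire the same ping from a script, here's a minimal Python sketch using only the standard library. The helper names (`build_ntfy_request`, `notify`) and the default title are mine; the `Title` header is one of ntfy's documented publish headers.

```python
import urllib.request


def build_ntfy_request(topic: str, message: str,
                       title: str = "Claude Code",
                       server: str = "https://ntfy.sh") -> urllib.request.Request:
    """Build the same POST the curl one-liner sends, plus a notification title."""
    return urllib.request.Request(
        f"{server}/{topic}",
        data=message.encode("utf-8"),
        headers={"Title": title},
        method="POST",
    )


def notify(topic: str, message: str) -> int:
    """Send the notification and return the HTTP status code."""
    request = build_ntfy_request(topic, message)
    with urllib.request.urlopen(request, timeout=10) as response:
        return response.status


# Building the request is side-effect free; call notify() to actually send it.
req = build_ntfy_request("my-agent-alerts", "Agent finished the migration")
```

Swap `server` for your own host if you self-host. Topics on the public server are readable by anyone who guesses the name, so pick something unguessable.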

Someone built an ntfy MCP server if you want the MCP route, though I'd recommend calling it through mcporter rather than loading the full tool schema into your context (see why MCP is a context tax). Even simpler: agent-notify hooks ntfy into Claude Code's lifecycle events directly, so you get pinged on completion without any MCP wiring at all. That's what I run. Mint.

PingMe does something different. Single binary, env var config, blasts notifications to Slack, Telegram, Discord, Teams, Pushover, Mastodon, email, and about ten more. If your team lives across five different chat apps (and whose doesn't), PingMe covers them all from one command. There's a GitHub Action too, which is handy for CI pipelines.

I use ntfy because agent-notify means zero wiring on my end. The agent finishes, I get a ping. PingMe's the better shout if you need to fan out alerts to the whole team across platforms. Pick whichever matches your setup, but stop checking your terminal every three minutes like a muppet. That was me. Don't be me.

Claude Flow: The Multi-Agent Swarm Orchestrator Before It Got a New Name

Steven Gonsalvez — Sun, 26 Apr 2026 19:46:29 +0000

What Claude Flow Was

Right, so this tool landed in mid-2025 and it properly turned heads. Claude Flow was built by Reuven Cohen (GitHub handle ruvnet), and the pitch was straightforward: take Anthropic's Claude, strap a multi-agent orchestration layer on top of it, and let coordinated swarms of AI agents tackle software development tasks. Not one agent fumbling through your codebase on its own. Dozens of them, working in parallel, with a queen agent calling the shots.

The repo went live on GitHub around June 2025 under ruvnet/claude-flow, and the thing that set it apart from other "run multiple agents" tools at the time was SPARC.

SPARC: The Methodology Baked Into the Tool

SPARC stood for Specification, Pseudocode, Architecture, Refinement, Completion. It was a structured, test-driven approach to AI development that came packaged directly into Claude Flow. You didn't have to invent a workflow or figure out how to prompt your agents. You ran npx claude-flow sparc run and the tool walked your swarm through each phase.

Ten specialised modes. The agents knew what phase they were in, what their job was at each stage, and how to hand off work to the next phase. If you'd been struggling with the "I told the agent to build a feature and it went off the rails" problem, SPARC was the answer Claude Flow offered. Structure the work, structure the agents, structure the output.

How the Swarm Actually Worked

This is where it got interesting. Claude Flow used what they called a "hive mind" architecture. A queen agent sat at the top, coordinating sub-agents below. Each sub-agent had a specialisation. Some did research. Some wrote code. Some reviewed it. The queen figured out who should do what and when.
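Here's a toy version of that queen-and-workers shape, just to show the coordination pattern. It is not Claude Flow's implementation: the `WORKERS` table and the `queen` function are hypothetical, and the "agents" are plain functions standing in for model calls.

```python
from concurrent.futures import ThreadPoolExecutor

# Hypothetical specialists; in a real swarm each would be an LLM-backed agent.
WORKERS = {
    "research": lambda task: f"notes on {task}",
    "code": lambda task: f"patch for {task}",
    "review": lambda task: f"review of {task}",
}


def queen(assignments):
    """Dispatch (role, task) pairs to specialists in parallel, results in order."""
    with ThreadPoolExecutor(max_workers=len(WORKERS)) as pool:
        futures = [pool.submit(WORKERS[role], task) for role, task in assignments]
        return [future.result() for future in futures]


results = queen([
    ("research", "rate limiting"),
    ("code", "rate limiting"),
    ("review", "rate limiting"),
])
```

The real thing layers on memory, hand-offs between phases, and retries, but the core is the same: one coordinator decides who does what, and the specialists run side by side.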

By the time v2.7 rolled around, the platform had:

60+ specialised agents running in coordinated swarms
AgentDB memory powered by SQLite with semantic queries
Neural memory enhancement for cross-session recall
Claude Code integration via MCP (Model Context Protocol)
Parallel agent execution across tasks

The AgentDB bit was particularly clever. v2.7.x introduced 150x faster semantic queries with 56% less memory usage. So your agents could actually remember what they'd done across sessions without the whole thing grinding to a halt.
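For a feel of what "SQLite with semantic queries" means in practice, here's a deliberately tiny sketch. It is not AgentDB's code or schema: the `embed` function is a letter-frequency stand-in for a real embedding model, and the table layout is my own assumption.

```python
import json
import math
import sqlite3


def embed(text: str) -> list[float]:
    """Toy stand-in for an embedding model: a 26-slot letter-frequency vector."""
    counts = [0.0] * 26
    for ch in text.lower():
        if "a" <= ch <= "z":
            counts[ord(ch) - ord("a")] += 1.0
    return counts


def cosine(a: list[float], b: list[float]) -> float:
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


db = sqlite3.connect(":memory:")  # use a file path to persist across sessions
db.execute("CREATE TABLE memory (note TEXT, vector TEXT)")


def remember(note: str) -> None:
    db.execute("INSERT INTO memory VALUES (?, ?)", (note, json.dumps(embed(note))))


def recall(query: str, k: int = 1) -> list[str]:
    """Return the k stored notes most similar to the query."""
    q = embed(query)
    rows = db.execute("SELECT note, vector FROM memory").fetchall()
    ranked = sorted(rows, key=lambda row: cosine(q, json.loads(row[1])), reverse=True)
    return [note for note, _ in ranked[:k]]


remember("fixed the login redirect bug")
remember("upgraded numpy to 2.0")
best = recall("login redirect bug")
```

A production store would use proper embeddings and an index rather than scanning every row, which is presumably where the speed-ups come from.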

What Happened Next

Claude Flow eventually became Ruflo in early 2026. Not just a rename, a full architectural shift to Rust/WASM. But that's a separate story.

📚 Geek Corner
SPARC to Skills: The original SPARC methodology was rigid by design. Five phases, sequential, with explicit gates between them. This works brilliantly for well-defined tasks but fights against the reality that most development is messy and iterative. Ruflo's shift to skills-based orchestration reflects a pattern seen across the whole AI tooling space in 2025/2026: structured phases give way to composable capabilities. HumanLayer's RPI methodology made the same move. GitHub's spec-driven development is another flavour of it. The industry is converging on "give agents the right tools and context, then let them figure out the order" over "prescribe the exact sequence." SPARC was the right answer for mid-2025. Skills are the right answer for where we are now.

When to Use It

If you're running Claude Code and you want to throw multiple agents at a problem in parallel with proper coordination, Ruflo (the current incarnation) is one of the most mature options out there. 29,000+ stars on GitHub. Active development. Real architectural investment under the hood.

The tradeoff is complexity. This is not a "pip install and go" tool. It's an enterprise-grade platform with consensus algorithms (Raft, Byzantine, Gossip), distributed swarm intelligence, and a WASM runtime. If you need five agents working on different files simultaneously with shared memory and coordination, it's brilliant. If you need one agent to fix a bug, it's massive overkill.

For simpler multi-agent needs, have a look at Claude Squad, which takes a much lighter approach to running parallel agents.

Obsidian Skills: Let Your Agent Manage Your Second Brain

Steven Gonsalvez — Sun, 26 Apr 2026 19:46:22 +0000

What It Is

Obsidian Skills by Steph Ango (Obsidian's CEO) gives Claude Code full access to your Obsidian vault. Search notes, create new ones, update existing ones, manage links and tags. 22,000 stars.

# Install
npx skills add kepano/obsidian-skills

Your agent can now search your vault semantically, create notes from conversation context, and link related ideas together. It treats your vault as a knowledge base it can both read and write to.

Why I Rate It

I keep research notes, meeting summaries, and tool evaluations in Obsidian. Before this skill, pulling that context into an agent conversation meant copy-pasting from one app to another. Now I just say "check my notes on browser automation" and the agent searches my vault, finds the relevant pages, and uses them as context.

The write side is useful too. Agent finishes researching a topic? "Write a summary to my vault under technology/browser-tools." Done. No manual note-taking after the conversation ends.

The fact that Obsidian's own CEO built this tells you where the ecosystem is heading. Your notes app is becoming agent-accessible infrastructure, not just a place you type into.

My Security Agent Stack: How Zerocool Guards the Perimeter

Steven Gonsalvez — Sun, 26 Apr 2026 19:46:15 +0000

The Setup

My wololo setup has a security agent called Zerocool. It runs recon, scans for vulns, and reviews code for security issues before anything ships. Not a single tool. A stack of tools coordinated by one agent.

Here's what's in the stack and why.

Recon Layer

Argus (jasonxtn) for quick information gathering. Python-based, clean TUI, covers networks, web apps, and security environments. Good for the "what am I looking at" phase when Zerocool first touches a target. Not the heaviest toolkit but it's fast and the interface is proper nice for an agent to parse.

WebCopilot for attack surface mapping. Enumerates subdomains (assetfinder, sublist3r, subfinder, amass, findomain, gobuster), filters live hosts via dnsx, crawls endpoints, then uses gf patterns to extract parameters that look vulnerable to XSS, LFI, SSRF, SQLi, open redirect, and RCE. Scans them with dalfox, kxss, sqlmap. It's the automated "find every door and window" tool. Point it at a domain and it maps the whole surface.

Pentesting Layer

Shannon is the star of the stack. 37,000 stars. Autonomous white-box AI pentester by Keygraph. It reads your source code, identifies attack vectors, and then actually executes real exploits. Injection, auth bypass, SSRF, XSS. Reports only proven vulnerabilities with copy-paste PoCs. Not theoretical risk assessments. Proof.

96% on the XBOW benchmark. Handles 2FA, TOTP, SSO, browser automation, parallel exploitation. The Lite version is AGPL-3.0. Pro adds SAST, SCA, secrets scanning, and CI/CD integration.

PentAGI for fully autonomous scanning. 14,600 stars. Sandboxed Docker execution with 20+ security tools baked in (nmap, metasploit, sqlmap). Knowledge graph via Neo4j. Team of specialist AI agents for research, dev, and infra. Multi-LLM support. docker compose up and it's running.

I use Shannon for targeted white-box testing on code I control. PentAGI for broader autonomous scanning where I want the agent to find things I haven't thought of.

Code Security Layer

Ghost Security Skills for AI-native code analysis inside Claude Code. Four skills: ghost:repo-context (understand the codebase), ghost:scan-deps (dependency vulnerabilities), ghost:scan-secrets (leaked credentials), ghost:scan-code (code-level security issues). Install and your agent can security-review a PR before it merges.

This is the layer that runs on every commit. Shannon and PentAGI run on schedules or before releases. Ghost Security runs continuously.

How Zerocool Uses Them

The agent picks tools based on what phase of the security review it's in:

Recon: Argus + WebCopilot map the target
Static analysis: Ghost Security scans the code and dependencies
Dynamic testing: Shannon runs white-box exploits against the running app
Autonomous sweep: PentAGI does a broad scan for anything the targeted tools missed
Report: Zerocool compiles findings, deduplicates, and files issues
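The phase list above boils down to a small pipeline. Here's a sketch of the shape, assuming a hypothetical `run_tool` callable that wraps each tool and returns findings as dicts with `kind` and `location` keys. None of this is the tools' real output format.

```python
# Phase order and tool names come from the list above; the rest is illustrative.
PIPELINE = [
    ("recon", ["Argus", "WebCopilot"]),
    ("static analysis", ["Ghost Security"]),
    ("dynamic testing", ["Shannon"]),
    ("autonomous sweep", ["PentAGI"]),
]


def run_review(run_tool):
    """Run every phase in order and keep the first report of each finding."""
    seen, report = set(), []
    for phase, tools in PIPELINE:
        for tool in tools:
            for finding in run_tool(tool):
                key = (finding["kind"], finding["location"])
                if key in seen:
                    continue  # a later tool rediscovered the same issue
                seen.add(key)
                report.append({**finding, "phase": phase, "tool": tool})
    return report


def fake_tool(name):
    """Stand-in runner: two tools report the same SQL injection."""
    if name in ("Shannon", "PentAGI"):
        return [{"kind": "SQLi", "location": "/search?q="}]
    return []


report = run_review(fake_tool)
```

Deduplicating on `(kind, location)` is the crude version. In practice you'd want fuzzier matching, because two tools rarely describe the same hole in identical words.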

All of this is ethical/authorised testing on my own infrastructure. If you're pointing any of these at targets you don't own, that's on you.

Scrapling: Scrape Anything Without Getting Blocked

Steven Gonsalvez — Sun, 26 Apr 2026 19:46:09 +0000

Why Not Just Use Requests

Because the site blocks you. Or serves you a Cloudflare challenge page. Or fingerprints your TLS stack and returns garbage. Or redesigns their HTML and all your selectors break overnight.

Scrapling handles all of this. 35,000 stars. Three fetcher tiers depending on how hostile the target is:

Fetcher (HTTP level): fastest, uses httpx with browser-grade TLS fingerprinting. Good for APIs and sites without bot detection.
StealthyFetcher (real browser): spins up a Playwright browser with anti-detection patches. Handles JavaScript rendering, Cloudflare Turnstile, and most bot checks.
PlayWrightFetcher (full control): same browser engine but gives you direct Playwright API access for complex flows.

Pick the lightest tier that works. Escalate only when you need to. Most sites fold to Fetcher with the right TLS config.
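The "lightest tier first" rule is easy to encode. This sketch deliberately doesn't call Scrapling at all: the tiers are plain callables you'd wrap around whichever fetchers you use, and `looks_blocked` is a hypothetical check you'd tune per site.

```python
def fetch_with_escalation(url, tiers, looks_blocked):
    """Try each (name, fetch) tier in order; return the first unblocked result."""
    for name, fetch in tiers:
        try:
            body = fetch(url)
        except Exception:
            continue  # treat a hard failure like a block and escalate
        if not looks_blocked(body):
            return name, body
    raise RuntimeError(f"every tier was blocked for {url}")


# Stand-in fetchers so the sketch runs without a network.
def blocked_fetch(url):
    return "<title>Just a moment...</title>"


def stealthy_fetch(url):
    return "<article>Real content</article>"


tiers = [("http", blocked_fetch), ("stealthy", stealthy_fetch)]
tier, body = fetch_with_escalation(
    "https://example.com", tiers, lambda html: "Just a moment" in html
)
```

Ordering the list cheapest-first means you only pay for a real browser when the HTTP tier has actually failed.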

The Selector Trick

The bit that sold me: adaptive selectors. You write a selector once and Scrapling generates multiple fallback strategies (text matching, attribute similarity, structural position). When the site changes their class names or restructures the DOM, your scraper keeps working because it falls back to a selector that still matches.
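To show the fallback idea without pretending to know Scrapling's internals, here's a toy version. Elements are plain dicts, the saved "fingerprint" is my own invention, and the similarity threshold is arbitrary.

```python
from difflib import SequenceMatcher


def similarity(a: str, b: str) -> float:
    return SequenceMatcher(None, a, b).ratio()


def find_adaptive(elements, saved, threshold=0.6):
    """Try the exact selector first, then fall back to the closest-looking element."""
    # Strategy 1: the original tag and class still match.
    for el in elements:
        if el["tag"] == saved["tag"] and el["cls"] == saved["cls"]:
            return el
    # Strategy 2: same tag, most similar text to what we saw last time.
    candidates = [el for el in elements if el["tag"] == saved["tag"]]
    best = max(candidates, key=lambda el: similarity(el["text"], saved["text"]),
               default=None)
    if best is not None and similarity(best["text"], saved["text"]) >= threshold:
        return best
    return None


# What the scraper recorded when the selector last worked.
saved = {"tag": "span", "cls": "price", "text": "£19.99"}
# After a redesign the class names are gibberish, but the price still looks like a price.
redesigned = [
    {"tag": "span", "cls": "x9f2", "text": "In stock"},
    {"tag": "span", "cls": "k3b7", "text": "£21.99"},
]
match = find_adaptive(redesigned, saved)
```

The real library layers several strategies like this. The principle is the same: remember what the element looked like, not just where it lived.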

That's the difference between a scraper you maintain weekly and one you maintain monthly.

For Agents

I use it as a fallback in my /research skill when markdown.new or Jina Reader can't get through. Cloudflare blocks them, anti-bot walls go up, paywall gates slam shut. Scrapling's StealthyFetcher punches through most of it.

pip install scrapling

from scrapling import StealthyFetcher
page = StealthyFetcher.fetch("https://blocked-site.com")
print(page.css("article").text)

For a deeper comparison with every other scraping and browser tool, see the Browser Tools series.

Clawdbot: The Chrome Extension That Lets Agents Drive Your Browser

Steven Gonsalvez — Sun, 26 Apr 2026 19:46:02 +0000

The Relay Pattern

Here's a pattern that makes a lot of sense once you see it. Instead of your agent controlling a headless browser it spun up on its own (Puppeteer style), what if the agent could control your actual browser? The one you're already using. With your cookies, your sessions, your extensions, your everything.

Clawdbot, created by Peter Steinberger, does exactly this. It's a Chrome extension paired with a gateway server. The extension sits in your browser. The gateway sits between your agent and the extension. Your agent sends commands to the gateway, the gateway relays them to the extension, and the extension executes them in your real browser session.

The "relay" bit is the clever part. It can route control locally (agent on your machine, browser on your machine) or remotely (agent running somewhere else, still controlling your browser). Same protocol either way.
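Here's the relay pattern stripped down to two queues. This is a sketch of the message flow only, not the project's protocol: the `Gateway` class, the `get_title` action, and the message fields are all hypothetical.

```python
import queue


class Gateway:
    """Sits between agent and extension; neither side talks to the other directly."""

    def __init__(self):
        self.to_extension = queue.Queue()
        self.to_agent = queue.Queue()

    # Agent side
    def send_command(self, command: dict) -> None:
        self.to_extension.put(command)

    def next_result(self) -> dict:
        return self.to_agent.get_nowait()

    # Extension side
    def next_command(self) -> dict:
        return self.to_extension.get_nowait()

    def post_result(self, result: dict) -> None:
        self.to_agent.put(result)


def extension_step(gateway: Gateway, browser_tabs: dict) -> None:
    """Pretend extension: run one command against the user's real tabs."""
    command = gateway.next_command()
    if command["action"] == "get_title":
        title = browser_tabs[command["tab"]]
        gateway.post_result({"id": command["id"], "title": title})


gateway = Gateway()
gateway.send_command({"id": 1, "action": "get_title", "tab": "inbox"})
extension_step(gateway, {"inbox": "Inbox (3) - Mail"})
result = gateway.next_result()
```

Because the agent only ever sees the gateway, moving it to another machine changes the transport, not the commands. That's why local and remote control can share one protocol.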

Why This Matters

Most browser automation tools create a parallel universe. They open a fresh browser with no state, no logins, no context. Then your agent has to log in, navigate to the right page, and reconstruct all the context that already exists in the browser tab you've got open.

Clawdbot skips all that. Your agent operates inside your existing session. It sees what you see. It can interact with pages you're already authenticated on. No credential passing, no cookie juggling, no "wait, why is it asking me to log in again?"

The Name Drama

Right, so about that name. It launched as Clawdbot. Then Anthropic's lawyers got involved because, well, "Clawd" is a bit on the nose. It got renamed to Moltbot, then settled on OpenClaw. By March 2026 it had racked up 247k stars on GitHub, making it one of the fastest growing repos in the space. Wild ride for a Chrome extension.

Getting Started

Install the Chrome extension from the repo, spin up the gateway server, and point your agent at it. The OpenClaw repo has the setup instructions. It's straightforward if you've ever installed a Chrome extension from source before.

Google Stitch: AI-Native UI Design That Actually Understands Your Design System

Steven Gonsalvez — Sun, 26 Apr 2026 19:45:56 +0000

The Problem With AI-Generated UI

Every AI coding tool can spit out a UI. You ask for a dashboard, you get a dashboard. You ask for a landing page, you get a landing page. The problem is it looks like every other AI-generated dashboard and landing page. Generic colours, default spacing, system fonts. It works, technically, but it looks like nobody who cares about design touched it. And the moment you try to get two screens to look like they belong to the same product, you're back to manually tweaking CSS until something vaguely coheres.

Google Stitch takes a different approach. It's an AI-native design tool from Google Labs, built on Gemini, that generates high-fidelity UI from natural language prompts and exports real HTML/CSS. It launched at Google I/O in May 2025 as a fairly basic experiment. The March 2026 v2 update turned it into something properly useful: infinite canvas, multi-screen generation (up to five connected screens at once), interactive prototyping, and a voice canvas for talking through your design ideas. Figma's stock dropped 10% the day the update shipped, which tells you roughly how seriously the market took it.

DESIGN.md: The Bit That Actually Matters

The flashy generation stuff is nice, but the feature that got me paying attention is DESIGN.md. It's a plain markdown file that encodes your entire design system. Colour palette with semantic tokens (primary, surface, accent), typography (font families, sizes, weights, line heights), spacing scale, grid conventions, border radius, shadows. Everything your design system defines, written in a format that both humans and language models can parse without breaking a sweat.

When you prompt Stitch, it passes your DESIGN.md as context to Gemini. The model treats those values as hard constraints, not suggestions. Every generated UI follows your system. Brand colours, spacing scale, typography, all consistent across screens. It's the difference between "generate me a settings page" and "generate me a settings page that looks like it belongs in our product."

Why Developers Should Care

Here's why DESIGN.md is more interesting than yet another design tool. It's portable. Completely tool-agnostic. Drop it in your repo root and any AI coding agent that reads context files will pick it up. Claude Code reads it. Cursor reads it. Copilot reads it. Your agent generates UI that respects your design system without you having to re-explain your brand colours in every prompt.

You can extract a DESIGN.md from any existing URL. Stitch scrapes the design tokens from a live site and produces the file for you. Got a client's marketing site and need to build an internal tool that matches? Point Stitch at their URL, grab the DESIGN.md, and your coding agent generates components that feel like they belong.

Think of it as the README.md of design systems. Designers define the system, developers commit the file, agents consume it. No Figma plugin, no design token pipeline to maintain, no arguing about whether that blue is #2563EB or #3B82F6. It's in the file. Full spec is at stitch.withgoogle.com/docs/design-md/overview.

Here's what a minimal one looks like:

## Colors
- Primary: #1A73E8
- Primary Dark: #1557B0
- Background: #FFFFFF
- Surface: #F8F9FA
- Error: #EA4335
- Text Primary: #202124
- Text Secondary: #5F6368

## Typography
- Font Family: Inter, sans-serif
- Heading 1: 32px, 700 weight
- Body: 16px, 400 weight
- Caption: 12px, 400 weight

## Spacing
- Base unit: 8px
- Values: 4, 8, 16, 24, 32, 48px

## Components
- Button border radius: 8px
- Card shadow: 0 1px 3px rgba(0,0,0,0.12)
- Input border: 1px solid #DADCE0

That's it. Your agent reads this and every button, card, and heading it generates uses those exact values.
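Because the file is just headings and "key: value" bullets, turning it into tokens your own scripts can use takes a few lines. This is a minimal sketch that assumes the layout shown above. `parse_design_md` is my own helper, not part of Stitch.

```python
def parse_design_md(text: str) -> dict:
    """Parse '## Section' headings and '- Key: value' bullets into nested dicts."""
    tokens, section = {}, None
    for raw in text.splitlines():
        line = raw.strip()
        if line.startswith("## "):
            section = line[3:].strip()
            tokens[section] = {}
        elif line.startswith("- ") and section is not None and ":" in line:
            # Split on the first colon only, so values may contain colons.
            key, value = line[2:].split(":", 1)
            tokens[section][key.strip()] = value.strip()
    return tokens


DESIGN_MD = """\
## Colors
- Primary: #1A73E8
- Surface: #F8F9FA

## Spacing
- Base unit: 8px

## Components
- Card shadow: 0 1px 3px rgba(0,0,0,0.12)
"""

tokens = parse_design_md(DESIGN_MD)
```

From there you could emit CSS variables or a Tailwind theme, or just lint generated components against the values.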

Getting Started

Free at stitch.withgoogle.com with a Google account. The workflow is straightforward:

1. Generate a DESIGN.md from an existing site URL, or create one from scratch in Stitch's editor.
2. Drop it in your repo root (or .stitch/DESIGN.md if you prefer keeping things tidy).
3. Prompt your AI coding agent to build UI. It picks up the design system automatically.

For quick iteration, Stitch's canvas lets you generate multiple connected screens and prototype interactions between them before you write a line of code. When you're happy, export the HTML/CSS and hand it off to your component framework of choice.

When It Falls Short

The generated code is clean HTML/CSS but not component-ised. You'll want to refactor the output into React, Vue, or whatever you're running. Complex interactions beyond simple navigation need building by hand after export. And the five-screen limit means larger apps get done in batches.

The DESIGN.md format isn't standardised either. It's Google's convention, not a ratified standard, and other tools happen to work with it because markdown is universal. If Google bins Stitch tomorrow, the file is still useful, but the tooling around it disappears.

The Bigger Question

Are we watching the design-to-development workflow get properly rewired here?

For years, the pipeline has been: designer creates in Figma, developer squints at the spec, developer approximates it in code, designer files a ticket saying the padding is wrong, developer adjusts by 4 pixels, repeat until someone gives up or the sprint ends. The tooling between design and code has always been niche, expensive, and full of cognitive load. Figma-to-code plugins. Style dictionaries. Design token pipelines. Handoff ceremonies. None of it flows naturally.

DESIGN.md is interesting because it turns design into an intermediate DSL that makes the whole thing commodity for developers. A markdown file. In your repo. That your agent reads. No Figma plugin, no handoff, no "inspect mode." The design system is code-adjacent from the start.

Now, I've been told many times by design organisations that those extra 10 pixels on a button contribute to 1% of revenue for large companies. Maybe they do. At scale, design systems matter enormously. The pixel-level precision, the A/B testing of border radius changes, the obsessive consistency across 200 screens. That's real work with real business impact, and I'm not dismissing it.

But for the 95% of teams that aren't operating at that scale? The ones shipping MVPs, building internal tools, prototyping features? The Figma-to-dev pipeline is overhead they can't afford. DESIGN.md gives them "good enough consistency" at near-zero cost. Write the file once, every agent respects it, move on.

The question is whether this eventually scales up to replace the enterprise design workflow too, or whether there's a permanent split: DESIGN.md for speed, Figma for precision. I reckon both survive, but the percentage of work that needs Figma-level precision shrinks every time the AI-generated output gets a bit better.

Bottom line: DESIGN.md might be the most consequential thing about Stitch. Not because the design tool is revolutionary, but because a plain markdown file as the bridge between design and code is such a stupidly simple idea that it makes you wonder why we spent a decade building elaborate pipelines instead.

UI/UX Pro Max: Stop Your AI Making Everything Look the Same

Steven Gonsalvez — Sun, 26 Apr 2026 19:45:49 +0000

The Problem

Ask any AI coding tool to "build me a dashboard" and you get the same thing every time. Inter font. Purple-to-blue gradient. Cards with rounded corners. Drop shadows everywhere. It looks like every other AI-generated dashboard because the model defaults to what it's seen most often in training data.

UI/UX Pro Max is a skill that gives your agent actual design taste. 60,000+ stars. 50+ distinct UI styles, 97 colour palettes, 57 font pairings, 25 chart types, and design system generation. Install it and your agent stops defaulting to the same generic SaaS template.

How It Works

npx uipro-cli

It installs as a Claude Code skill (works with Cursor and others too). When you ask for UI, the agent picks from a curated library of styles and applies consistent design tokens instead of improvising. The difference is immediately visible. Layouts that look like a human designer touched them, not a model trained on Tailwind UI screenshots.

Pairs well with Impeccable for the anti-pattern side (what NOT to do) and Google Stitch for the DESIGN.md system. Stack all three and your agent-generated UIs stop looking like AI slop.

PinchTab: 12MB Binary That Replaces Playwright for AI Agents

Steven Gonsalvez — Sun, 26 Apr 2026 19:45:42 +0000

Playwright is brilliant for CI testing. But when you're giving an AI agent browser access, it's like handing someone a fire hose when they asked for a glass of water. The agent doesn't need the full DOM. It needs to know what's on screen and how to click things.

PinchTab gets this right. It's a 12MB Go binary, zero dependencies, that starts an HTTP server and gives your agent REST endpoints to control Chrome. The trick is it serves the Accessibility Tree instead of raw HTML. That's roughly 800 tokens per page instead of the 4,500 to 12,000 you'd get from Playwright dumping the full DOM. For agents burning through context windows, that's a proper big deal.

Elements get stable refs like e5 instead of fragile XPath selectors or pixel coordinates. Your agent says "click e5" and it clicks. Deterministic. No guessing, no "click at coordinates 340,220 and hope the layout hasn't shifted."
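To show why an accessibility-tree snapshot with refs is so much cheaper than a DOM dump, here's a toy renderer. It is not PinchTab's code or output format: the tree shape, the role names, and the way refs are numbered are assumptions for illustration.

```python
INTERACTIVE = {"button", "link", "textbox"}


def snapshot(tree: dict):
    """Render a compact text view of the tree and map refs to interactive nodes."""
    refs, lines = {}, []

    def walk(node, depth):
        indent = "  " * depth
        label = f'{indent}{node["role"]} "{node["name"]}"'
        if node["role"] in INTERACTIVE:
            ref = f"e{len(refs)}"
            refs[ref] = node
            label += f" [{ref}]"
        lines.append(label)
        for child in node.get("children", []):
            walk(child, depth + 1)

    walk(tree, 0)
    return "\n".join(lines), refs


page = {
    "role": "main", "name": "Login", "children": [
        {"role": "textbox", "name": "Email"},
        {"role": "button", "name": "Sign in"},
    ],
}
text, refs = snapshot(page)
```

The agent reads three short lines instead of a wall of markup, then says "click e1" and the server looks up the node behind that ref.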

Stealth mode is baked in. It masks navigator.webdriver and spoofs Canvas/WebGL fingerprints, which matters if you're automating sites that actively block headless browsers.

Any agent that can make HTTP calls can use it. Claude Code, Cursor, whatever. No MCP server needed, just plain REST. There's multi-instance orchestration with a dashboard if you're running several browsers at once.

One thing to flag: there was an open SSRF vulnerability (CVE-2026-30834) when I last checked. Worth looking into before you deploy it anywhere public-facing. For local agent use it's fine, but I wouldn't put it on a server without patching that first.

Runs on macOS, Linux, and Docker. MIT licensed. I reckon this or something like it is where agent browser tooling ends up, because feeding 12,000 tokens of DOM soup to an LLM for every page visit was always a bit mental.

Deepgram: $200 Free STT That Makes Voice Coding Actually Work

Steven Gonsalvez — Sun, 26 Apr 2026 19:45:36 +0000

Why This Matters for Coding Agents

Voice input for coding used to be a gimmick. Whisper was slow. Commercial options cost a fortune. The latency between speaking and text appearing was long enough to break your train of thought.

Deepgram changed the maths. Their Nova-3 model does real-time streaming transcription fast enough that the text appears as you speak, not after. And the free tier gives you $200 in credit, which is roughly 12,000 minutes of transcription. That's a lot of talking before you pay a penny.

The Vibe Coding Angle

Wire Deepgram into any STT tool (justspeaktoit, a custom script, whatever) and suddenly voice is a real input method for your coding agents. "Refactor the auth middleware to use the new token format" spoken out loud, transcribed in under 200ms, piped into Claude Code. No typing. No context switch.

The accuracy on technical speech is surprisingly good. It handles "refactor," "middleware," "useState," "async await" without flinching. Not perfect on obscure library names, but proper solid on the vocabulary you actually use while coding.

Getting Started

# Sign up at deepgram.com, grab your API key
# $200 free credit, no card required

# Quick test with curl
curl -X POST "https://api.deepgram.com/v1/listen" \
  -H "Authorization: Token YOUR_KEY" \
  -H "Content-Type: audio/wav" \
  --data-binary @audio.wav
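Here's the same request as a minimal Python sketch, for when you want the transcript in a script rather than a terminal. The helper names are mine, and selecting Nova-3 with a `model` query parameter is an assumption to check against Deepgram's docs. The response path follows the `results.channels[].alternatives[].transcript` shape of their pre-recorded API.

```python
import json
import urllib.request


def build_listen_request(api_key: str, audio: bytes,
                         model: str = "nova-3") -> urllib.request.Request:
    """Build the POST the curl example sends, with an explicit model choice."""
    return urllib.request.Request(
        f"https://api.deepgram.com/v1/listen?model={model}",
        data=audio,
        headers={
            "Authorization": f"Token {api_key}",
            "Content-Type": "audio/wav",
        },
        method="POST",
    )


def extract_transcript(payload: dict) -> str:
    """Pull the top transcript for the first channel out of the JSON response."""
    return payload["results"]["channels"][0]["alternatives"][0]["transcript"]


def transcribe(api_key: str, wav_path: str) -> str:
    """Send a WAV file and return its transcript (makes a network call)."""
    with open(wav_path, "rb") as handle:
        request = build_listen_request(api_key, handle.read())
    with urllib.request.urlopen(request, timeout=60) as response:
        return extract_transcript(json.load(response))


# Offline check of the parsing against a hand-written sample payload.
sample = {"results": {"channels": [{"alternatives": [
    {"transcript": "refactor the auth middleware"}]}]}}
req = build_listen_request("YOUR_KEY", b"RIFF")
```

This is the batch endpoint. The real-time streaming the post is excited about runs over a WebSocket, which needs a bit more plumbing than fits here.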

Or just install justspeaktoit which wraps Deepgram with a macOS menu bar app. Press hotkey, speak, text appears. Sorted.

Entire CLI: Git Blame for the AI Era

Steven Gonsalvez — Sun, 26 Apr 2026 19:45:29 +0000

Who actually wrote this code?

Git tells you what changed. Entire tells you why, and who. Or what.

Here's the problem. You're running Claude Code, Codex, Gemini CLI, whatever. The agent writes a hundred lines, you tweak ten, commit, push. Six months later someone's debugging that function and git blame says your name. But you didn't write most of it. You don't even remember what prompt produced it. The reasoning, the agent's decisions, the tool calls, all gone.

Entire hooks into your git push and captures the full AI session. Prompts, responses, tool calls, files touched, token usage. Everything gets stored on a hidden branch (entire/checkpoints/v1) so your main history stays clean. Each commit gets a 12-character Checkpoint ID linking back to the session on their dashboard.

The line-level attribution is the bit that matters. Not just "AI helped with this file" but actual percentage breakdowns of which lines were agent-written versus human-written. For audits, for onboarding, for debugging at 2am when you need to understand intent behind code you didn't write. Proper useful.
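The attribution maths itself is simple once you have a per-line author label. This sketch assumes you already have that list (getting it is the hard part Entire solves), and the `checkpoint_id` helper is purely hypothetical: the post only says the ID is 12 characters, not how it's derived.

```python
import hashlib
from collections import Counter


def attribution(line_authors: list[str]) -> dict[str, float]:
    """Percentage of lines per author label, rounded to one decimal place."""
    counts = Counter(line_authors)
    total = len(line_authors)
    return {author: round(100 * n / total, 1) for author, n in counts.items()}


def checkpoint_id(session_log: str) -> str:
    """Illustrative 12-character ID: a truncated SHA-256 of the session log."""
    return hashlib.sha256(session_log.encode("utf-8")).hexdigest()[:12]


# The scenario from earlier: the agent writes a hundred lines, you tweak ten.
line_authors = ["agent"] * 90 + ["human"] * 10
breakdown = attribution(line_authors)
cid = checkpoint_id("prompt: refactor auth middleware")
```

Six months later, "90% agent-written, see this checkpoint" is a far better starting point for debugging than a name in git blame.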

Thomas Dohmke (ex-GitHub CEO) started this with a $60M seed. Agent-agnostic by design, works with Claude Code, Codex, Gemini CLI, Cursor, the lot. I reckon every team shipping AI-assisted code daily is going to need something like this eventually.