DEV Community

Juan Torchia

Posted on • Originally published at juanchi.dev

Who owns the code Claude Code wrote? I ran git blame on a real project and the result is uncomfortable


I made a mistake that wasn't a typo or a logic bug — it was epistemic. For three months I used Claude Code like it was a glorified autocomplete, without thinking about what was happening to the authorship of everything I was committing. The code worked. Tests passed. Deploys came out clean. And I signed every commit as if I'd written every single line.

Two weeks ago I saw the HN thread "Who owns the code Claude Code wrote?" climb to 407 points and the first thing I thought was: I don't know the answer for my own repository either.

So I went to find out.


AI-generated code and intellectual property: the actual state of things

My thesis, before the numbers: IP ownership of AI-generated code is not an urgent legal problem for most devs — it's an operational accountability problem that explodes when something fails in production and nobody can sign off on the chain of decisions.

The legal framework is genuinely murky. The U.S. Copyright Office said in its 2023 guidance that AI output without human creative intervention isn't registrable as an authored work, and the USPTO takes a similar line on inventorship for patents. Related cases are still open. There's no specific case law in Argentina. And no big company is suing individual devs for using Claude Code in a SaaS project.

But that legal vacuum isn't what keeps me up at night. What keeps me up at night is this: if a critical bug appears in code Claude generated, who understands it well enough to fix it at 2am? Who can face the client? Who signs the postmortem?

That's what git blame revealed.


The experiment: git blame on AI-generated code

The project is an event processing backend — Next.js API routes, PostgreSQL on Railway, a couple of async workers. I started it in February, used it as a sandbox to go deep with Claude Code. Exactly the context I mentioned when I published the first Claude Code analysis on the Pro plan.

I ran this:

# Quick pass: per-commit change summary (the last --stat line)
git log --format='%H' | while read -r commit; do
  git show --stat "$commit" | tail -1
done

# More surgical version: line-level blame per author.
# git ls-files only walks tracked files, so node_modules and
# other ignored or auto-generated paths stay out automatically.
git ls-files '*.ts' '*.tsx' | while read -r file; do
  git blame --line-porcelain "$file" 2>/dev/null \
    | grep '^author ' \
    | sed 's/^author //'
done | sort | uniq -c | sort -rn

The output left me staring at the screen for a while:

# Real result — backend project, 4.2k lines of TypeScript
# (excluding package-lock, auto-generated migrations and fixtures)

   2587  Juan Torchia
   1634  Claude (via Claude Code)
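Turning those raw counts into percentages is a small awk pass. A sketch with the counts above hard-coded as sample input; in practice you'd pipe the `sort | uniq -c` output straight in:

```shell
# Sample input mimics the blame script's output: "<count>  <author>"
printf '%s\n' '   2587  Juan Torchia' '   1634  Claude (via Claude Code)' \
  | awk '{ count[NR] = $1; total += $1 }
         END { for (i = 1; i <= NR; i++)
                 printf "%.0f%%\n", 100 * count[i] / total }'
# Prints: 61% then 39%
```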

61% me, 39% Claude Code. But that's the average. When I filtered only the business logic files — the handlers, services, event parsers — the number flipped:

# Business logic files only (services/, handlers/, lib/)
git ls-files 'src/services/*.ts' 'src/handlers/*.ts' 'src/lib/*.ts' \
  | while read -r file; do
    git blame --line-porcelain "$file" 2>/dev/null \
      | grep '^author '
  done | sort | uniq -c | sort -rn

# Output:
#    412  Claude (via Claude Code)
#    289  Juan Torchia

59% Claude, 41% me. In the heart of the system, the AI wrote more than I did.

Now the uncomfortable question: can I explain each of those 412 lines if someone asks me in a code review? Can I debug them without reading the full diff first?

The honest answer: not always.


Where accountability breaks — not the law

This connects to something I learned last year, when an agent deleted my production database and the viral HN post about that incident left out exactly what mattered: who had the context for the rollback.

With AI-generated code, the accountability problem has three layers:

Layer 1: surface-level understanding
You accept Claude Code's output because it works and the tests pass. You don't question it because it doesn't look weird — Claude generates clean, well-structured code with reasonable variable names.

Layer 2: absence of design memory
The code exists but the decision to write it that way is nowhere. There's no comment saying "I chose this implementation because X." No commit message explaining the tradeoff. The decision lived in the conversation context with Claude that no longer exists.

Layer 3: the impossible postmortem
When something blows up, git blame gives you the commit author. But the commit is you — because you pushed it. The actual author of the logic has no email address to cc on the incident.

// This handler was generated by Claude Code in February
// I committed it unchanged because it "worked"
// Today I couldn't explain why it uses this retry strategy
// without reading the full code again

export async function processEventWithRetry(
  event: ProcessableEvent,
  maxAttempts = 3
): Promise<ProcessResult> {
  // Exponential backoff with jitter — why jitter?
  // Why this specific formula? I don't have it memorized.
  const delay = (attempt: number) =>
    Math.min(1000 * 2 ** attempt + Math.random() * 1000, 30000);

  for (let attempt = 0; attempt < maxAttempts; attempt++) {
    try {
      return await processEvent(event);
    } catch (err) {
      if (attempt === maxAttempts - 1) throw err;
      await sleep(delay(attempt));
    }
  }
  throw new Error("unreachable");
}

That code is correct. The jitter is standard practice to avoid thundering herd. But I didn't make that decision — I accepted it. The difference matters when someone asks me in production whether we can lower the max delay from 30 seconds.
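For what it's worth, that question is answerable with arithmetic. Ignoring the jitter term, the base delays under that formula come out like this (a quick awk sketch, not code from the project):

```shell
# delay(a) = min(1000 * 2^a, 30000): the formula above, minus the jitter
awk 'BEGIN {
  for (a = 0; a < 7; a++) {
    d = 1000 * 2 ^ a
    if (d > 30000) d = 30000
    printf "attempt %d: %5d ms\n", a, d
  }
}'
# The 30s cap only bites from attempt 5 onward
```

With maxAttempts = 3 only attempts 0 and 1 ever sleep (roughly 1s and 2s plus jitter), so the 30-second cap is dead code at the defaults. Which is exactly the kind of thing you only notice once you own the decision instead of just the diff.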


The mistakes I made (and that you're going to make)

Mistake 1: committing without a design message
Every time I accepted Claude Code's output without documenting why that implementation, I lost irrecoverable context. The fix I landed on: commit messages with an [AI-context] section where I note the design decision I asked Claude for.

# Format I use now for commits with AI code
git commit -m "feat: retry handler with exponential backoff

[AI-context] Asked Claude Code for a retry strategy
that would tolerate thundering herd in concurrent workers.
Chose this implementation over simple polling because
event volume can spike to 500/min at peak.

Reviewed: delay logic, non-retryable error handling.
Did not review in depth: edge cases for event ordering."
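A side effect of the marker I didn't anticipate: it makes AI-assisted commits queryable. A scratch-repo sketch (the repo setup is only there so the example is self-contained):

```shell
# Throwaway repo with one marked and one unmarked commit
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name 'Juan Torchia'
git config user.email 'juan@example.com'

git commit -q --allow-empty -m 'feat: retry handler

[AI-context] Asked Claude Code for a retry strategy.'
git commit -q --allow-empty -m 'chore: bump deps'

# --fixed-strings keeps the brackets literal instead of a regex class
git log --oneline --fixed-strings --grep='[AI-context]'
# Lists only the retry-handler commit
```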

Mistake 2: not having a complexity cutoff
I accepted whatever Claude generated as long as it passed tests. That's a recipe for code you can't maintain. My current rule: if I can't explain the implementation in two sentences without re-reading the code, I don't commit it until I understand it.

Mistake 3: confusing "code that works" with "code I understand"
Here's the real ownership problem. It's not legal — it's cognitive. Code you don't understand isn't yours, even if your name shows up in git blame. Real ownership of code is the ability to modify it with confidence.


FAQ: intellectual property of AI-generated code

Does code generated by Claude Code have copyright?
For now, in most jurisdictions: not autonomously. The U.S. Copyright Office requires human authorship. There is a gray zone when there's human "selection and creative arrangement" — meaning when you design the architecture and Claude implements it. Anthropic waived any claim over output in their terms of service. So if anyone has a claim, it's you. But that claim is weak if the human contribution was just "I wrote the prompt."

Can I use AI-generated code in a commercial project?
Yes, and most big companies are doing it. The real legal risk today isn't copyright — it's contractual indemnification. Some enterprise contracts have clauses requiring that code be "original" in the sense of not deriving from third-party works. Check the contract with a lawyer, not a blog post (including this one).

What if Claude Code reproduces code with a restrictive license?
That's the risk GitHub Copilot made visible in 2022 with the GPL block reproduction case. Anthropic says Claude was trained to avoid it, but there's no absolute technical guarantee. For critical commercial projects, tools like Amazon CodeWhisperer have a reference tracker that at least raises an alert.

Does git blame protect me or expose me?
Both. It protects you because it records that you made the decision to include that code — there's a human in the chain of responsibility, which is what emerging regulatory frameworks (including the EU AI Act) are looking for. It exposes you because if there's a legal or security issue, the commit under Juan Torchia's name says Juan Torchia consciously accepted that code.

How do I know what percentage of my code is AI-generated?
Without specific tooling, the most honest approximation is the git blame script I showed above, combined with searching your Claude Code conversation history if you have access. Some companies are starting to require this metric as part of software audits — similar to how supply chain attack reports now include dependency origin.
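Alongside the line-level blame there's also a cheaper commit-level cut: `git shortlog -sn` counts commits per author, which works if AI-generated commits carry a distinct author name, as they do in my blame output above. Scratch-repo sketch:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q

git -c user.name='Juan Torchia' -c user.email='juan@example.com' \
  commit -q --allow-empty -m 'feat: handlers'
git -c user.name='Claude (via Claude Code)' -c user.email='noreply@anthropic.com' \
  commit -q --allow-empty -m 'feat: retry logic'
git -c user.name='Juan Torchia' -c user.email='juan@example.com' \
  commit -q --allow-empty -m 'fix: edge case'

# Commits per author, most prolific first
git shortlog -sn HEAD
# 2 Juan Torchia, 1 Claude (via Claude Code)
```

It's coarser than blame (a one-line commit counts the same as a 500-line one), but it runs in milliseconds on any repo.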

Does this matter for open source projects?
Yes, and more than for private projects. Several open source organizations already have explicit policies: the FSF doesn't accept AI-generated contributions, and neither does the Linux kernel. The argument is that you can't sign the DCO (Developer Certificate of Origin) for code you didn't write. If you contribute to projects that use the DCO, check the policy before sending a PR with Claude-generated code.
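For concreteness, the DCO sign-off is mechanically just a trailer that `git commit -s` appends from your configured identity. What you certify by adding it is that you have the right to submit the change, and that is precisely what's hard to certify for lines you didn't write. Scratch-repo sketch:

```shell
repo=$(mktemp -d) && cd "$repo" && git init -q
git config user.name 'Juan Torchia'
git config user.email 'juan@example.com'

# -s appends the Signed-off-by trailer from user.name / user.email
git commit -q -s --allow-empty -m 'fix: handle empty event payload'
git log -1 --format=%B
# Last line: Signed-off-by: Juan Torchia <juan@example.com>
```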


My take, unfiltered

The question "who owns the code Claude Code wrote?" is the wrong question. The right question is: who can answer for that code when something goes wrong?

And the answer, today, has to be you.

Not because the law says so clearly — it doesn't. Not because Anthropic requires it — it can't. But because if you can't defend every line of code you push to production, you're building a system you don't control. And that, sooner or later, has real consequences that go well beyond a HN thread with 407 upvotes.

What I changed in my workflow after this experiment: I actively review the code Claude Code generates before committing it, I document design decisions in the commit message, and I keep a mental line of "can I explain this at 3am during an incident?" It's not perfect. But it's honest.

The 39% of my project that Claude Code wrote is still there. I'm not going to rewrite it — that would be wasting time on code that works. But I am going to know it better, line by line, before the next production deploy.

Same as when I reviewed my Postgres backups after the pgbackrest situation — not because there was an incident, but because I discovered I had confidence in something I hadn't audited. The pattern is the same: the tool is fine, the problem is blind trust in it.

If you're using Claude Code in production and you've never run git blame to see what percentage of lines are actually yours, now is the time. The number you find is probably going to be uncomfortable. That's fine. Discomfort is information.


Did you run the experiment on your own repo? I'm far more interested in the numbers you found than in any theoretical copyright discussion.


