Dinesh Kumar Gnanasekaran

Using Claude Code to solve Advent of Code 2025

Introduction

Let's be honest: LLMs have taken much of the fun out of Advent of Code. You can paste any puzzle into ChatGPT or Claude and get a working solution in seconds. So for Advent of Code 2025 I ran a different experiment: what if I didn't write a single line of code? I gave Claude Code a single instruction file and let it solve the puzzles completely autonomously.

The result: 20 out of 22 challenges solved (91% success rate) with zero human-written code.

Check out my repo for more details.

The Setup

I created a single file called INSTRUCTIONS.md with a 12-step process for each day:

  1. Create a folder ./day_xx/
  2. Navigate to the Advent of Code puzzle page
  3. Save the input to ./day_xx/input.txt
  4. Read Part 1 and write the strategy in ./day_xx/README.md
  5. Write ./day_xx/part1.py (see the sketch at the end of this section)
  6. Test against the examples
  7. Run against the actual input and submit the Part 1 answer
  8. Write the Part 2 strategy in the README
  9. Write ./day_xx/part2.py
  10. Test Part 2 against the examples
  11. Run Part 2 against the actual input
  12. Submit the Part 2 answer

Then I ran: claude --chrome --dangerously-skip-permissions

Note: The --dangerously-skip-permissions flag bypasses all of Claude Code's permission prompts. That would be a terrible idea in production, but it was necessary for this experiment, where the agent needed to navigate websites and submit answers autonomously.
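
To make steps 5-7 concrete: each day's script reads its sibling input.txt and prints the answer. Here's a minimal sketch of that shape - my reconstruction of the convention, not code from the repo, and solve is just a hypothetical placeholder:

```python
# Hypothetical shape of a generated ./day_xx/part1.py.
# Only the input.txt convention comes from the instruction file;
# the rest is an illustrative placeholder.
from pathlib import Path

def solve(lines: list[str]) -> int:
    # Puzzle-specific logic would go here.
    return len(lines)  # placeholder

if __name__ == "__main__":
    text = Path(__file__).with_name("input.txt").read_text()
    print(solve(text.splitlines()))
```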

What Happened

Claude Code executed the entire workflow independently:

  • Used Chrome integration to navigate Advent of Code
  • Read puzzle descriptions on its own
  • Developed solution strategies and documented them
  • Wrote and tested Python code
  • Submitted answers to the website
  • Self-corrected when answers were wrong

Zero lines of code written by me. Just the instruction file.

Results

Completed: Days 2-8 (both parts), Day 9 Part 1, Days 10-11 (both parts), Day 12 Part 1

Failed: Day 9 Part 2, Day 12 Part 2

Total: 20/22 challenges, or roughly 91% autonomous completion

The run generated approximately 42 Python files across days 2-12, with full solution code, test files, and documented reasoning for each day.

Example: Day 2 Strategy

Here's how Claude Code documented its approach for Day 2 (from the auto-generated README):

Part 1: Detect product IDs where "any substring starting from position 0 appears immediately after itself" (exactly twice repetition).

Part 2: Expand to catch IDs where "the entire string can be formed by repeating that substring at least 2 times."
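
Both checks are small enough to sketch in Python. This is my paraphrase of the documented strategy, not the agent's actual code, and the function names are mine:

```python
def repeats_exactly_twice(s: str) -> bool:
    # Part 1: the whole ID is some prefix written exactly twice in a row.
    half, rem = divmod(len(s), 2)
    return rem == 0 and s[:half] == s[half:]

def repeats_at_least_twice(s: str) -> bool:
    # Part 2: the whole ID is some substring repeated two or more times.
    # Standard periodicity trick: s is a repetition iff it appears
    # inside (s + s) with the first and last characters removed.
    return s in (s + s)[1:-1]
```

Note that Part 2's condition is a strict superset of Part 1's, which matches the "expand" framing in the auto-generated README.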

The agent independently reasoned through the problem, identified the algorithmic approach, and implemented it - all without human guidance beyond the instruction template.

Limitations

Even with 91% success, the agent failed on 2 challenges. Looking at the failures:

  • Day 9 Part 2: Complex disk defragmentation problem that likely needed algorithmic insight the agent couldn't generate
  • Day 12 Part 2: Blocked by Day 9 Part 2's failure (dependency issue)

Some problems still require human algorithmic intuition and creative problem-solving. The agent excels at execution but can struggle with novel algorithmic insights.

Conclusion

This wasn't about pair programming or AI assistance. This was about autonomous execution from start to finish.

The agent navigated websites, read natural language descriptions, formulated strategies, wrote code, debugged failures, and submitted results - all independently. The only human input was a procedural instruction file.

Are we ready for fully autonomous development? Not quite. That 9% failure rate matters, especially when complex algorithmic thinking is required. But 91% autonomous completion on varied programming challenges suggests we're closer than I expected.

The future isn't AI replacing developers. It's developers orchestrating autonomous agents - providing high-level direction while the agent handles execution, testing, and iteration.

As I watched Claude Code navigate Advent of Code independently, I realized: the question isn't "can AI code?" anymore. It's "what level of abstraction should humans work at when AI handles the implementation?"


Check out the full repository to see all the auto-generated code and conversation transcripts.
