Over the past few months, I've been living in Claude Code. I'm talking about serious, production-level work—building new services, refactoring legacy systems, rebuilding our integration test infrastructure from scratch. If I'm being honest, over 80% of the code I've shipped in that time was generated by AI.
But here's what nobody tells you: AI-generated code isn't replacing my engineering skills. It's amplifying them. And more importantly, it's exposing every gap in my own thinking.
The code Claude produces will only be as good as the design you drive it toward. After months of experimentation—vibe coding, setting draconian rules, trying every configuration imaginable—I've learned that great AI-assisted development isn't about prompting harder. It's about thinking better.
The Journey: From Chaos to Clarity
When I first started with Claude Code, I did what most developers do: I described what I wanted and hoped for magic. Sometimes it worked. Often it didn't.
Then I went in the opposite direction: every instruction spelled out. The results got better, but my productivity didn't. I was sometimes spending more time rewriting prompts than it would have taken to write the code myself.
At some point I realized I was using Claude Code wrong. The reason the code wasn't producing the results I expected was that Claude Code is, at its core, a tool that reflects your design and architecture back at you.
In short: if your design is bad, the generated code will be bad. If your principles are clear, the code will be clear.
Design First, Generate Second
Here's what changed everything for me: I stopped thinking about "getting AI to write good code" and started thinking about describing my design as clearly as possible, so that anyone, AI or human, could understand it.
In my CLAUDE.md, I now keep a section like this:
```markdown
## Code Quality Principles
**Function Design:**
- Pure functions with no side effects whenever possible
- Max 20 lines per function (guideline, not law)
- Max 3-4 parameters - use objects for more complex inputs
- Names should read like sentences: `calculateUserSubscriptionStatus()` not `checkUser()`
**Testing Philosophy:**
- Test-Driven Development - write the test first, always
- Each test should validate one behavior
- Use descriptive test names that explain the "why": `should_return_null_when_user_has_expired_trial`
**CRITICAL: Never use mocks (jest.fn(), jest.mock(), etc.) in tests.**
Instead:
- **Unit tests**: Create **fakes** (in-memory implementations) in `tests/__fakes__/`
- **Integration tests**: Use **Testcontainers** for real dependencies
```
**Fake example:**
```typescript
// tests/__fakes__/database.fake.ts
export function createFakeDatabase(): Database {
  const links = new Map<LinkId, Link>();

  return {
    async getLink(id: LinkId) {
      const link = links.get(id);
      return link
        ? { success: true, data: link }
        : { success: false, error: { type: 'NOT_FOUND', linkId: id } };
    },
    async saveLink(link: Link) {
      links.set(link.id, link);
      return { success: true, data: link };
    },
  };
}
```
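To see how the no-mocks rule plays out, here's roughly what a unit test against that fake looks like. This is a minimal sketch, not code from my project: the import paths and the `Link`/`LinkId` types are assumed to exist alongside the `Database` interface.

```typescript
// tests/links.fake-usage.test.ts (illustrative path)
import { createFakeDatabase } from './__fakes__/database.fake';
import type { Link, LinkId } from '../src/links/types'; // hypothetical location of the shared types

describe('fake database', () => {
  it('should_return_the_saved_link_when_it_exists', async () => {
    const db = createFakeDatabase();
    const link = { id: 'abc123' as LinkId, url: 'https://example.com' } as Link;

    await db.saveLink(link);
    const result = await db.getLink(link.id);

    expect(result).toEqual({ success: true, data: link });
  });

  it('should_return_not_found_when_the_link_does_not_exist', async () => {
    const db = createFakeDatabase();

    const result = await db.getLink('missing' as LinkId);

    expect(result).toEqual({ success: false, error: { type: 'NOT_FOUND', linkId: 'missing' } });
  });
});
```

No `jest.fn()` anywhere, and the test reads like a description of behavior rather than a list of stubbed calls.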
But here's the crucial part: these aren't arbitrary rules.
They're my answer to the question: what does "good" look like in practice?
When I say "pure functions with no side effects," I'm communicating that I value testability and predictability over clever stateful solutions.
When I enforce 20-line functions, I'm saying I value readability and single responsibility.
Claude Code doesn't just follow these rules—it internalizes the philosophy behind them.
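For illustration, here's the kind of difference the "pure functions" guideline is really pointing at. A toy example, not from my codebase:

```typescript
// Impure: mutates shared state and its argument, so tests have to reset
// module-level variables and worry about call order.
let totalDiscounted = 0;
export function applyDiscountImpure(order: { total: number }): void {
  totalDiscounted += order.total * 0.1;
  order.total *= 0.9;
}

// Pure: same inputs always produce the same output, nothing outside is touched.
// One assertion is enough to test it, and it composes anywhere.
export function applyDiscount(order: { total: number }, rate: number): { total: number } {
  return { total: order.total * (1 - rate) };
}
```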
The TDD Cycle with AI: Green, Reflect, Refactor
The most powerful pattern I've found combines Test-Driven Development with AI iteration.
It goes like this:
- Write the test first (or have Claude write it based on your specification)
```typescript
describe('MCP Server Authentication', () => {
  it('should reject requests without valid API keys', async () => {
    const request = createMockRequest({ apiKey: 'invalid' });
    const response = await handleMCPRequest(request);

    expect(response.status).toBe(401);
    expect(response.error).toContain('Invalid API key');
  });
});
```
- Let Claude implement
Sometimes it nails it. Sometimes it doesn't. And here's the counterintuitive part: that's okay. If the test passes but the implementation feels wrong, you've learned something valuable.
- Reflect before refactoring
When the output isn't what you expected, pause. Don't immediately ask for a rewrite. Ask yourself:
Was my architectural vision clear in the prompt? Did I provide enough context about existing patterns in the codebase? Are my guidelines actually contradictory? Did I specify what to build but forget to explain why the feature exists?
- Refactor with intention
Now go back to Claude with clarity:
"The authentication logic works, but it's mixing concerns. Let's separate API key validation into its own pure function that returns a Result type. The handler should only orchestrate, not implement business logic."
The second iteration is almost always better—not because Claude "learned," but because you clarified your thinking.
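For a sense of where that second iteration usually lands, here's a sketch of the shape I'm describing. The `Result` type and the signatures are illustrative, not the project's actual code:

```typescript
// A discriminated-union Result type: errors are values, not exceptions.
type Result<T, E> =
  | { success: true; data: T }
  | { success: false; error: E };

type AuthError = { type: 'INVALID_API_KEY'; message: string };

// Pure validation: no I/O, no logging, trivially unit-testable.
function validateApiKey(apiKey: string | undefined, knownKeys: Set<string>): Result<string, AuthError> {
  if (apiKey && knownKeys.has(apiKey)) {
    return { success: true, data: apiKey };
  }
  return { success: false, error: { type: 'INVALID_API_KEY', message: 'Invalid API key' } };
}

// The handler only orchestrates: validate, then reject or delegate.
async function handleMCPRequest(request: { apiKey?: string }, knownKeys: Set<string>) {
  const auth = validateApiKey(request.apiKey, knownKeys);
  if (!auth.success) {
    return { status: 401, error: auth.error.message };
  }
  return { status: 200 }; // ...dispatch to the actual MCP handling from here
}
```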
Real Example: Test Infrastructure Rewrite
Let me give you a concrete example from my current work. I'm rewriting our integration tests—tests that have been running against actual production-like environments for years. These tests are slow (some take 5+ minutes), flaky (they fail randomly), and frankly, embarrassing. They're pure .js files with some functions and assertions thrown in. No real structure or patterns.
The complexity is real: we need authentication strategies that work in test environments, we need to mock AWS services (S3, SQS, DynamoDB), we need database state management, and we need all of this to be fast and reliable.
First attempt (vibe coding):
"Rewrite these integration tests to use Testcontainers and proper test architecture."
Claude generated... tests. They used Testcontainers. They had setup and teardown. But they didn't even pass.
Some failed because they couldn't authenticate.
Others failed because the configuration still pointed at real AWS instead of LocalStack. Others had cross-dependencies between test cases and were flaky - exactly the opposite of what we wanted to achieve.
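That second failure mode has a mundane fix that's easy to forget to spell out: the test clients need to point at LocalStack explicitly. A minimal sketch, assuming AWS SDK v3 and LocalStack's default edge port:

```typescript
// tests/helpers/aws.ts (illustrative) - build clients against LocalStack, never real AWS.
import { S3Client } from '@aws-sdk/client-s3';

export function createTestS3Client(): S3Client {
  return new S3Client({
    region: 'us-east-1',
    endpoint: process.env.AWS_ENDPOINT_URL ?? 'http://localhost:4566', // LocalStack edge port
    forcePathStyle: true, // LocalStack serves buckets path-style
    credentials: { accessKeyId: 'test', secretAccessKey: 'test' }, // dummy credentials
  });
}
```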
Second attempt (after reflection):
I realized I was asking Claude to solve a design problem I hadn't solved myself. So I stepped back and documented the actual testing architecture:
```markdown
## Integration Testing Principles
**Test Isolation:**
- Each test runs in its own Testcontainer instance
- Database state resets between tests using transactions
- No shared state between test cases
- Tests can run in parallel without conflicts
**Authentication Strategy:**
- Mock the authentication service - we're focusing on behaviour, not authorization
**AWS Service Mocking:**
- LocalStack for S3, SQS, DynamoDB in containers
- One LocalStack instance per test file (isolated but fast)
- Test fixtures populate services in beforeAll
- Cleanup in afterAll, not in individual tests
**Test Structure:**
- Arrange: Set up test data using factory functions
- Act: Make actual HTTP requests to the service
- Assert: Verify both response and side effects (DB state, queue messages)
- Each test validates ONE behavior path
```
Then I prompted:
"Rewrite the links generation for tests following our integration testing principles. Use Testcontainers for MySQL, LocalStack for AWS Service. Show me the test structure for one complete test case."
The result? Fast, reliable, and actually maintainable.
Not because I prompted better, but because I designed better.
The best part? When I asked Claude to rewrite the next test file, it already understood the patterns. It knew exactly which setup to follow.
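For a sense of the structure it keeps reproducing, here's a condensed sketch of one test file. Container images, environment variables, and `SERVICE_URL` are illustrative, and the exact testcontainers API differs between versions:

```typescript
import { GenericContainer, StartedTestContainer } from 'testcontainers';

let mysql: StartedTestContainer;
let localstack: StartedTestContainer;

beforeAll(async () => {
  // One isolated MySQL + LocalStack pair per test file.
  mysql = await new GenericContainer('mysql:8.0')
    .withEnvironment({ MYSQL_ROOT_PASSWORD: 'test', MYSQL_DATABASE: 'links' })
    .withExposedPorts(3306)
    .start();

  localstack = await new GenericContainer('localstack/localstack')
    .withExposedPorts(4566)
    .start();

  process.env.DATABASE_URL = `mysql://root:test@${mysql.getHost()}:${mysql.getMappedPort(3306)}/links`;
  process.env.AWS_ENDPOINT_URL = `http://${localstack.getHost()}:${localstack.getMappedPort(4566)}`;
}, 120_000);

afterAll(async () => {
  await Promise.all([mysql.stop(), localstack.stop()]);
});

it('should_store_the_link_when_creation_succeeds', async () => {
  // Act: hit the real HTTP endpoint of the service under test.
  const response = await fetch(`${process.env.SERVICE_URL}/links`, {
    method: 'POST',
    headers: { 'content-type': 'application/json' },
    body: JSON.stringify({ url: 'https://example.com' }),
  });

  // Assert the response and the side effect (the row actually written to MySQL).
  expect(response.status).toBe(201);
  // ...query the database here to confirm the persisted link
});
```

Each file gets its own containers, so tests stay isolated and can run in parallel.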
The QA Multiplication Effect
Here's where this gets really interesting for teams. I'm on a 12-developer team with only 2 QAs company-wide. That's a bottleneck. My mission has been pushing toward 99% automated QA coverage.
Claude Code isn't just writing my application code—it's writing my test infrastructure. But only because I've been crystal clear about what quality means.
What This Means for Your Engineering Career
If you're worried that AI will make you obsolete, you're looking at it wrong. Claude Code isn't replacing engineers—it's raising the bar for what engineering means.
The skills that matter now:
- **System design thinking** - Can you articulate how components should interact?
- **Architectural decision-making** - Can you explain why this approach over that one?
- **Code review expertise** - Can you spot the subtle bugs, the performance issues, the maintainability problems?
- **Quality standards definition** - Can you define what "good" looks like for your domain?
These are senior-level skills. If you've been coasting on syntax knowledge and Stack Overflow copy-paste, yeah, you're in trouble. But if you're actually engineering—thinking about trade-offs, designing systems, setting quality standards—you've just gained a superpower.
The Uncomfortable Truth
Here's what I wish someone had told me when I started: Claude Code will ruthlessly expose the gaps in your own thinking.
If you can't explain your design clearly enough for an AI to implement it, you probably can't explain it to a human teammate either.
If your code quality standards are vague, Claude will produce vague code. If you don't actually understand the architectural patterns you're trying to use, the generated code will be architecturally confused.
This is uncomfortable. It's much easier to blame the tool. But it's also an incredible learning opportunity.
Every time Claude produces code that's "not quite right," you have a choice: get frustrated, or get curious. Why didn't it work? What was unclear? What assumption did I make that I didn't communicate?
Practical Advice: Start Small, Think Big
If you're just getting started with Claude Code:
**Document your non-negotiables first**
Before you generate a single line of code, write down your quality standards. What does "good code" mean in your context? Start with 5-10 core principles.

**Use TDD as your safety net**
Write tests first. Let Claude implement. If tests pass but code feels wrong, you've learned something about your specification.

**Iterate in public**
Share your CLAUDE.md files with your team. Debate them. Refine them. The best engineering standards are the ones the whole team understands and buys into.

**Measure what matters**
Track: test coverage, bug escape rate, time-to-production. Not: lines of code generated, prompts per day. Optimize for outcomes, not activity.

**Embrace the reflection cycle**
When output isn't what you wanted, resist the urge to immediately re-prompt. Take five minutes to understand why. Write it down. Update your guidelines. Then re-prompt.
The Future Is Already Here
We're in a weird transitional moment. Some developers are treating Claude Code like autocomplete++. Others are treating it like a junior developer to micromanage. Both are missing the point.
The developers who will thrive are the ones who realize this:
Claude Code isn't a replacement for engineering skills. It's a mirror that reflects the quality of your engineering thinking.
If your design is bad, your code will be bad. If your architecture is elegant, your code will be elegant. If your quality standards are high, your output will be high-quality.
Claude Code will be as good as you are.