DEV Community

Seb

How I Actually Use Claude as a Senior Dev Partner (Not Just a Code Generator)

Most developers I know use Claude the same way: paste some code, ask a question, get an answer, move on. It works. But it's leaving most of the value on the table.

After a few months of treating Claude as a genuine engineering partner — not a smarter autocomplete — here's the actual workflow that changed how I ship.


The setup that makes everything else work: CLAUDE.md

The single highest-leverage thing I did was create a CLAUDE.md file at my project root.

Claude Code reads it automatically at the start of every session. That means I never re-explain my stack, conventions, or current focus. Every session starts with Claude already knowing what I'm working on.

Mine looks roughly like this:

```markdown
# Project: [My App]

## Stack
TypeScript · Next.js · Prisma · PostgreSQL · Vercel

## Conventions
- Naming: camelCase functions, PascalCase classes, kebab-case files
- Error handling: throw typed errors, never swallow
- Testing: Vitest + RTL, co-located __tests__ folders
- Imports: absolute paths from src/

## Always
- Add JSDoc to public functions
- Handle null/undefined explicitly
- Separate route handlers from business logic

## Never
- No any types
- No console.log in production code

## Current focus
Auth refactor — migrating from JWT to session-based auth
```

Twenty-odd lines that save five minutes of context-setting every session, compounded across every working day.
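To make those conventions concrete, here is the shape of code they push Claude toward: typed errors, explicit null handling, and JSDoc on public functions. This is a hypothetical sketch, not output from a real session:

```typescript
/** Error thrown when a user record cannot be found. */
class UserNotFoundError extends Error {
  constructor(readonly userId: string) {
    super(`User not found: ${userId}`);
    this.name = "UserNotFoundError";
  }
}

interface User {
  id: string;
  name?: string | null;
}

/**
 * Returns a printable display name for a user.
 * Handles null/undefined explicitly instead of leaking it to callers.
 */
function displayName(user: User | null): string {
  if (user === null) {
    throw new UserNotFoundError("(missing)");
  }
  return user.name ?? "(anonymous)";
}
```

None of that needs restating in the prompt; the file carries it into every session.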


XML structure: why it matters and why most people skip it

Here's the thing about Claude that most generic prompt packs miss: Claude is trained to treat XML-tagged input as structured data. It's in Anthropic's own prompting documentation. The difference in output quality is real.

Compare these two prompts for a security review:

Generic:

```
Review this code for security issues
```

XML-structured:

```
<task>
Security review this code. Check for:
- Injection: SQL, command, XSS
- Authentication and authorization flaws
- Sensitive data exposure

Rate each finding: Critical / High / Medium / Low.
Suggest exact fixes. Rank by severity.
</task>

{{PASTE CODE HERE}}
```

The first gets a paragraph of vague suggestions. The second gets a structured report with severity ratings, affected lines, and copy-paste fixes. Same model, different output — because of how the input is structured.

This is the pattern I use for basically every non-trivial prompt now.


The feature development loop

My actual workflow for building a feature:

1. Orient Claude to the relevant code

Before writing anything, I run the codebase orientation prompt against the files I'll be touching:

```
<task>
You are a senior engineer onboarding me to this codebase.
Analyze the structure and produce:
1. High-level architecture summary (2-3 sentences)
2. Key entry points and primary data flow
3. Conventions and patterns used throughout
4. Non-obvious decisions or gotchas I should know
</task>
<codebase>
{{PASTE FILE TREE OR KEY FILES HERE}}
</codebase>
```

This takes 30 seconds and means Claude's first suggestion is grounded in how the code actually works — not how it guesses it works.

2. Generate with constraints, not just descriptions

When I ask for code, I include: language, framework, validation library, auth approach, layer separation, and TypeScript mode. "Explain your design decisions before writing code" is always in there — it stops Claude from jumping to implementation before understanding the problem.
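Concretely, a constrained request looks something like this (the specific constraints here are illustrative; swap in your own stack details):

```
<task>
Generate a POST /sessions route handler.
Explain your design decisions before writing code.
</task>
<constraints>
- Language: TypeScript, strict mode
- Framework: Next.js route handler; business logic in a separate service module
- Validation: zod for request body parsing
- Auth: session cookie, checked before any business logic runs
- Errors: throw typed errors; never return raw error messages to the client
</constraints>
```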

3. Review before I trust

I don't just ship Claude's output. I run a structured code review prompt against it:

```
<task>
Review this code as a senior engineer.
For each issue found:
- File and line number
- Issue type: bug / security / performance / style / maintainability
- Severity: Critical / High / Medium / Low
- Specific fix with code

Only flag real issues. Do not pad the review.
</task>

{{PASTE CODE}}
```

The "only flag real issues, do not pad" instruction is important. Without it, you get a mix of legitimate concerns and nitpicky style suggestions with no way to tell them apart.


Debugging: hypotheses, not guesses

When I'm stuck on a bug, the most useful prompt I have is the debugging hypothesis generator:

```
<task>
I have a bug. Generate 5 hypotheses about the root cause.
For each:
1. Hypothesis (what you think is happening)
2. Confidence: High / Medium / Low
3. Test plan (exactly how to verify or rule it out)
4. Fix if confirmed

Rank by confidence. Most likely first.
</task>
<bug_description>
{{DESCRIBE THE BUG AND SYMPTOMS}}
</bug_description>
<relevant_code>
{{PASTE RELEVANT CODE}}
</relevant_code>
```

This changed how I debug. Instead of going with my first instinct, I get 5 ranked hypotheses with test plans. The right answer is usually in there, and it's often not the one I would have checked first.


Architecture decisions: getting real recommendations

Most AI tools hedge on architecture questions. You ask "should I use a message queue or direct API calls?" and you get "it depends" followed by a balanced list of both options.

The fix is to tell Claude explicitly:

```
<task>
I need a recommendation, not a balanced comparison.
Analyze the options and recommend one.
Be direct. If the answer is obvious given my constraints, say so.
</task>
<context>
{{YOUR ARCHITECTURE QUESTION AND CONSTRAINTS}}
</context>
```

Adding "be direct" and "if the answer is obvious, say so" consistently produces actual recommendations over diplomatic non-answers.


The part most developers skip: PRs and docs

I generate PR descriptions from git diff now. It takes 30 seconds:

```
<task>
Write a PR description for this diff.
Include:
- What changed (1-2 sentences)
- Why (the problem being solved)
- How (approach taken)
- Testing done
- Any gotchas for reviewers

Tone: direct, no marketing language.
</task>
<diff>
{{PASTE GIT DIFF}}
</diff>
```

The "no marketing language" instruction is doing a lot of work there. Without it, you get PR descriptions full of "this robust solution elegantly addresses the underlying concern."
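If you do this often, the copy-paste step is easy to script. Here's a hypothetical Node helper that wraps a saved diff in the prompt template (function name and task text are my own; adjust to taste):

```typescript
import { readFileSync } from "node:fs";

/** Wraps a saved git diff in the PR-description prompt template. */
function buildPrPrompt(diffPath: string): string {
  const diff = readFileSync(diffPath, "utf8");
  return [
    "<task>",
    "Write a PR description for this diff.",
    "Tone: direct, no marketing language.",
    "</task>",
    "<diff>",
    diff.trim(),
    "</diff>",
  ].join("\n");
}

// Usage: save a diff with `git diff main > changes.diff`,
// then paste the result of buildPrPrompt("changes.diff") into Claude.
```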


What I actually ship vs. what I used to ship

The workflow above didn't come from one insight. It came from noticing that Claude's output quality varied enormously based on how I asked — and slowly building a set of prompts that consistently produced good output.

If you want the full set — 50 prompts across codebase onboarding, code generation, review, debugging, architecture, and docs — I packaged everything into a prompt kit with a companion workflow guide. It's at gumroad.com/l/50-claude-prompts if you want to skip the experimentation and start with what works.

Otherwise, start with CLAUDE.md and XML structure. Those two changes alone will noticeably improve every session.
