The Reality Check
We've all been there. You see those viral tweets about "vibe coding" – just tell AI what you want and boom, instant production-ready app! I was ready to ride that wave straight into the sunset.
Then I fired up my laptop, and reality came crashing down.
I ended up with a giant mess of spaghetti code I didn't understand and an app that "technically" worked sometimes if you squinted at it just right. I wasn't happy, and once you create a big mess like that, you'd rather just start over than start debugging.
Context engineering may seem like just another buzzword, but it really works. With some structure and planning, AI can actually produce working software. I turned my idea for an Azure DevOps MCP server into a testable, working product in just one day using Claude Code. Here's how.
The Secret Sauce: My Workflow
My approach wasn't revolutionary, but the magic was in guiding the AI to work like a real software engineer:
- Create a detailed PRD (Product Requirements Document)
- Write a workflow guide for how tasks should be completed
- Break features into bite-sized chunks the AI can handle
This might sound basic, and definitely boring, but the difference between throwing a paragraph of hopes and dreams at Claude and co-creating structured context is night and day.
Setting Up for Success
🏗️ Repository Structure
First tip: AI coding assistants automatically read certain files for context (like CLAUDE.md or GEMINI.md). I used this to my advantage by creating a file that referenced all my planning documents. Every time I opened Claude Code, it had the full context ready to go.
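Here's a rough sketch of what that context file can look like; the doc paths are just placeholders for wherever you keep your planning documents:

```markdown
# Project Context

This project is an MCP server for running Azure DevOps pipelines.

Before starting any task, read these planning documents:
- docs/prd.md (product requirements and user journeys)
- docs/tasks.md (current task list and status)
- docs/workflow.md (how tasks get completed: research, code, e2e tests, docs)
```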
The other handy thing is that by creating files in specific directories, you can define custom slash commands to make actions easier to perform. I'll show how to use this later in the article, but let's jump into the workflow.
📋 The Power of Planning
Instead of jumping straight into code, I spent time creating a proper PRD with Claude's help. I used prd.md, influenced by snarktank's create-prd.mdc (a great starting place), to generate the PRD. I copied the prd.md file into ~/.claude/commands/prd.md so that I could use it as a slash command like below:
/prd "Please create a design for an mcp server for azure devops. I would like it to be able to run azure pipelines, wait for them to complete and then get the resulting logs if there is a failure."
The nice thing about this template is that it doesn't just generate the PRD immediately; it prompts Claude to ask insightful questions about:
- Non-functional requirements
- User personas
- Edge cases you hadn't considered
After going through the short planning phase, the AI had generated a plan. I took some time to read it and add:
- User journey maps
- API documentation links
- Resource references
to create my planning doc
These concrete details gave Claude the right information and the right places to look when it was planning the features. Gathering all that data started to make me feel like I was stuck doing the boring part of coding (research) while Claude got to do the fun part (coding), but in the end it made a huge difference.
🔄 Think Vertically, Not Horizontally
The next step is to break down the big feature PRD into small tasks that Claude can accomplish in one shot. To do this I copied create_tasks.md into ~/.claude/commands/create_tasks.md and ran:
/create_tasks based on our prd
This then goes through the process of creating tasks and breaking them down into subtasks. An important note is how this file prompts Claude to structure those tasks.
Originally, a mistake I kept making when planning tasks was letting the AI organize them by activity instead of by feature. It would suggest:
- Task 1: Write all the code
- Task 2: Write all the tests
- Task 3: Set up CI/CD
Don't do this! It's a recipe for spaghetti!
Instead, create_tasks.md enforces vertical feature development. Each task has to deliver a complete, shippable feature (see the example after this list), including:
- The actual code
- End-to-end tests
- Observability
- Documentation
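To make that concrete, here's a hypothetical slice of what one generated task looked like in shape (not the literal output):

```markdown
## Task 2: Run a pipeline and report the result

- [ ] Research the Azure DevOps pipeline runs endpoint and its response shape
- [ ] Implement a run_pipeline tool that starts a run and polls until it finishes
- [ ] Add an end-to-end test that runs a real (tiny) test pipeline
- [ ] Manually verify the tool through MCP Inspector
- [ ] Update CI and the README for the new tool
```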
This kept the task breakdown phase focused on features as small slices of deliverable, working software, instead of bundling everything together, trying to deliver every feature at once, and never getting anywhere. Next up: actually doing the work!
The Execution Workflow That Actually Works
Once we have a PRD describing the high-level goal and tasks describing how to achieve it, we get down to actually doing the work. After some trial and error, I developed a workflow that kept Claude on track:
- Research and Exploration - Understand the APIs and requirements
- Write the Code - Implement the feature
- Write End-to-End Tests - Verify it actually works
- Manual Testing - Try it out yourself
- Update CI/CD - Keep the pipeline green
- Document Everything - Future you will thank you
The most important thing I found was to make sure it always completed the full loop. Without making Claude exercise the code or run the tests, it would end with overly triumphant messages about how great its new feature is and how well it works... but then you'd try to run the tests or use the feature and it wouldn't even compile. The hallucinations are real...
To have Claude complete tasks, once again copy complete_tasks.md to ~/.claude/commands/complete_tasks.md and then, within Claude Code, run:
/complete_tasks
and it would automatically start working on the first task the way I wanted. I will say that out of all the task files, complete_tasks.md is definitely the most opinionated of the bunch, so don't be afraid to modify it to match your own coding style and workflow.
Now that Claude is working on tasks, I'll go through the flow in a little more depth to explain why I have it work the way I do.
🔍 Research First, Code Second
I noticed Claude would often jump straight into coding and then struggle with basic API calls. So I made it slow down and research first:
- Read the actual API documentation
- Try example curl commands
- Understand the data structures
This extra step eliminated so many silly mistakes and hallucinations, and saved a ton of time that used to be spent brute-forcing things without knowing the data types or return types it needed.
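For example, before writing any server code I'd have Claude hit the Azure DevOps REST API directly, the same way you would with curl, just to see the real response shape. Here's a minimal sketch of that kind of exploratory call; the org and project names are placeholders and the PAT comes from an environment variable:

```python
import os

import requests

# Placeholders: swap in your own organization and project.
ORG, PROJECT = "my-org", "my-project"
PAT = os.environ["AZURE_DEVOPS_PAT"]  # personal access token

# List pipelines to confirm auth works and to see the actual JSON fields.
resp = requests.get(
    f"https://dev.azure.com/{ORG}/{PROJECT}/_apis/pipelines?api-version=7.1",
    auth=("", PAT),  # Azure DevOps accepts basic auth with a blank username + PAT
    timeout=30,
)
resp.raise_for_status()
for pipeline in resp.json()["value"]:
    print(pipeline["id"], pipeline["name"])
```

Seeing the actual JSON up front meant Claude wasn't guessing field names when it wrote the real client code.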
🧪 The Testing Trap
This might be a hot take, but: only let AI write end-to-end tests.
I know this turns Martin Fowler's testing pyramid upside-down, but here's why: if you let an AI mock one thing, it will mock EVERYTHING. I can't count the times I've seen green tests that looked like:
```python
from unittest.mock import MagicMock

mock_api = MagicMock()

def test_api_call():
    # Mock the API
    mock_api.return_value = "success"
    # Test passes! 🎉
    assert mock_api() == "success"
```
Cool test, bro. You tested that a mock returns what you told it to return. 🤦
End-to-end tests force the AI to write code that actually works with real APIs and real data.
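For contrast, here's the shape of the end-to-end tests I asked for instead. run_pipeline_and_wait is a hypothetical stand-in for the server's real tool; the point is that it drives a real (small) test pipeline with no mocks:

```python
def test_run_pipeline_end_to_end():
    # No mocks: kick off a real, tiny pipeline in a test project and wait for it.
    # run_pipeline_and_wait is a hypothetical helper exposed by the MCP server.
    result = run_pipeline_and_wait(org="my-org", project="my-project", pipeline_id=42)

    assert result.status in ("succeeded", "failed")
    if result.status == "failed":
        # The whole point of the server: surface real log output on failure.
        assert result.failure_logs
```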
📚 Documentation That Doesn't Suck
LLMs love writing comments. They REALLY love it. I had to explicitly tell Claude:
- Use proper docstrings only
- No inline comment novels
- Focus on the "why", not the "what"
It will still occasionally try to add comments to every block of code to explain what it is doing, but it happens a lot less if you prompt for good documentation practices.
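As a hypothetical example of the style I prompted for: one docstring that explains the why, and no play-by-play comments:

```python
def get_failed_job_logs(run_id: int) -> str:
    """Return the log text for failed jobs in a pipeline run.

    Only failed jobs are fetched because full run logs can be huge and the
    caller (usually an LLM) only needs the failure context.
    """
    ...
```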
The Results
After all this prep work, the actual "coding" day was surprisingly smooth:
- Tell Claude to start on the task list
- Review the code and tests
- Fire up MCP Inspector to try it out
- If it works, commit and move to the next task
Once Claude got into the groove, it was churning out quality, tested, documented features consistently. After about five hours of reviewing code, saying "okay, continue," and walking away, on repeat, I had an MCP server that could run Azure Pipelines, get the log results, and tell me what went wrong: exactly what I designed originally. It even had a working end-to-end test suite and documentation that both humans and LLMs could understand.
Key Takeaways
✅ Time spent planning saves 10x time debugging
✅ Vertical features > horizontal layers
✅ End-to-end tests keep the AI honest
✅ Context is king – feed your AI well
Working with LLMs is definitely a skill that's constantly evolving. But with the right structure and context, Claude makes an incredible coding partner. You just need to be the director with the vision who can provide enough context to get things done.
If you are interested in an in-depth walkthrough of using this framework to implement a new feature in an existing codebase, subscribe to get notified about my next post.
All my planning and execution templates are available in my context-engineering GitHub repo
What's your LLM workflow? Drop a comment below – I'd love to hear how others are taming their AI coding assistants! And if this helps you build something cool, definitely let me know. 🚀
PS:
If you are using Azure DevOps, check out ado-mcp to streamline your pipeline development workflow with AI.
Inspired by Cole Medin's context engineering intro and this excellent YouTube guide