A senior developer described her experience with Copilot agent mode last week:
"It feels like I'm babysitting sometimes."
She has 10 years of experience and two master's degrees. She's not bad at her job. She's using a tool she was handed without enough context about how to direct it.
That's not a talent problem. That's a training gap.
The Babysitting Problem Is Structural
When Copilot feels like babysitting, it usually means one of three things:
1. The prompts are too open-ended.
"Help me with this feature" puts all the decision-making back on you. You end up reviewing every suggestion, rejecting half, editing the rest. That IS babysitting.
Fix: Define the scope tightly. "Write the unit tests for this function. Cover happy path, null input, and the three edge cases in the spec comments. Don't touch the implementation." Now you're delegating, not supervising.
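One way to make that habit repeatable is to treat a prompt as a small structure with a task, a coverage list, and a do-not-touch list. The sketch below is only an illustration: the `ScopedPrompt` class and the `parse_order()` function name are hypothetical, and the rendered text is something you would paste into the chat yourself, not a Copilot API call.

```python
from dataclasses import dataclass, field


@dataclass
class ScopedPrompt:
    """A tightly scoped request: one task, explicit coverage, explicit limits."""

    task: str
    must_cover: list[str] = field(default_factory=list)
    do_not_touch: list[str] = field(default_factory=list)

    def render(self) -> str:
        """Return the prompt as plain text, skipping any empty section."""
        lines = [f"Task: {self.task}"]
        if self.must_cover:
            lines.append("Cover:")
            lines.extend(f"- {item}" for item in self.must_cover)
        if self.do_not_touch:
            lines.append("Do not change:")
            lines.extend(f"- {item}" for item in self.do_not_touch)
        return "\n".join(lines)


# parse_order() is a made-up function name used only for illustration.
prompt = ScopedPrompt(
    task="Write the unit tests for parse_order()",
    must_cover=[
        "happy path",
        "null input",
        "the three edge cases in the spec comments",
    ],
    do_not_touch=["the implementation of parse_order()"],
)
print(prompt.render())
```

If you can't fill in the coverage and do-not-touch lists, the task probably isn't scoped yet.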
2. There's no context about your codebase conventions.
Generic Copilot output is often technically correct but practically wrong — it doesn't know your error handling patterns, your naming conventions, or which libraries you prefer. So everything needs review.
Fix: Front-load context. A repository instructions file with your conventions (for Copilot, .github/copilot-instructions.md; Claude Code reads CLAUDE.md) means the AI starts from your standards, not its defaults.
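As a rough idea of what such a file might contain, the sketch below renders a few made-up conventions into Markdown. The rules, the section titles, and the `render_conventions` helper are all hypothetical. Save the output wherever your tool looks for instructions (for Copilot, that is .github/copilot-instructions.md).

```python
def render_conventions(sections: dict[str, list[str]]) -> str:
    """Render {section title: [rules]} as a short Markdown instructions file."""
    parts = ["# Team conventions"]
    for title, rules in sections.items():
        parts.append("")  # blank line before each section heading
        parts.append(f"## {title}")
        parts.extend(f"- {rule}" for rule in rules)
    return "\n".join(parts) + "\n"


# Made-up example rules; replace them with your team's real conventions.
conventions = {
    "Error handling": ["Raise domain-specific exceptions, never return None on failure."],
    "Naming": ["snake_case for functions, PascalCase for classes."],
    "Libraries": ["Use the standard library's logging module, not print()."],
}
print(render_conventions(conventions))
```

Short, concrete rules tend to work better here than a long style guide, because every line is context the AI has to weigh on every request.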
3. The task is the wrong size.
Copilot excels at bounded, well-defined tasks. "Refactor this module" is too large and too vague. "Extract this 80-line function into three smaller ones with these responsibilities" is the right size.
What "Not Babysitting" Looks Like
The shift happens when developers stop thinking of Copilot as a code generator and start thinking of it as a first-draft machine.
Here's what a senior dev who's past the babysitting phase sounds like:
"I described the behavior I wanted, gave it the relevant context from our codebase, told it what not to change, and ran the tests. Done in 8 minutes. Would have taken me 40."
She didn't supervise every line. She set the brief, reviewed the output, approved it. That's directing, not babysitting.
The 30-day ramp from babysitter to director is learnable. It's not about the tool getting better. It's about the human developing new prompting instincts.
Why This Is a Manager Problem, Not a Developer Problem
If your team is stuck in babysitting mode, the fix isn't "tell them to use it more."
The fix is:
- Pick one high-value, bounded workflow and train specifically for it
- Share what works — create a team prompt library, even if it's just a Notion doc
- Measure the right thing — not "are they using it" but "are they saving time on X task"
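To make "saving time on X task" concrete, a team could log how long one recurring task takes with and without Copilot and compare medians. This is a minimal sketch under that assumption; the `time_saved` function is hypothetical and the sample numbers are illustrative, not measurements.

```python
from statistics import median


def time_saved(baseline_minutes: list[float], assisted_minutes: list[float]) -> dict[str, float]:
    """Compare median task duration without and with AI assistance."""
    base = median(baseline_minutes)
    assisted = median(assisted_minutes)
    if base <= 0:
        raise ValueError("baseline median must be positive")
    saved = base - assisted
    return {
        "baseline_median": base,
        "assisted_median": assisted,
        "minutes_saved": saved,
        "percent_saved": round(100 * saved / base, 1),
    }


# Illustrative numbers only: three timings of the same task, before and after.
result = time_saved(baseline_minutes=[40, 35, 45], assisted_minutes=[8, 12, 10])
print(result)
```

Medians are used instead of means so that one unusually slow run doesn't dominate a small sample.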
One targeted 3-hour session covering role-specific workflows moves teams from 20% utilization to over 50% more quickly than any amount of self-paced video content.
The developers aren't the blocker. The missing structure is.
Where to Start
If your team is where most teams are — sporadic usage, mixed results, "it's fine but not transformative" — start here:
This week: Ask every developer on your team to pick one task they do every day that involves writing. Have them try doing it with Copilot for one week with a tighter prompt (behavior-focused, context-loaded, scoped constraints).
Next week: 30-minute team share. What worked? What didn't? What prompt patterns are emerging?
That's the seed of a team playbook. It compounds from there.
If you want the full framework — the 4-week ramp, the prompt patterns, the manager playbook — the first 3 modules are free:
👉 askpatrick.co/playbook-sample.html
And if you want to know whether your team's current utilization is where it should be:
👉 askpatrick.co/roi-calculator.html
Ask Patrick helps engineering teams actually use the AI tools they've already bought. Flat-fee training for your whole team. askpatrick.co