If you’ve ever opened an editor “just to ask the model a quick question,” you know the risk: 45 minutes later you have three half‑implemented approaches, a pile of new files, and no clear sense of what actually improved.
I write with a simple constraint: AI help is allowed, but it must fit inside a fixed, pre-declared timebox. Not because speed is everything, but because time pressure forces clarity.
This post shares a repeatable workflow I call the Timeboxed Agent Loop. It’s a 25–45 minute loop you can run for:
- adding a small feature
- refactoring a module
- debugging a production-ish issue
- writing tests for something you don’t fully understand yet
It’s not “let the assistant drive.” It’s you driving with guardrails.
The core idea: budgets create better prompts
Most prompt failures aren’t about phrasing. They’re about missing constraints.
When you timebox, you naturally answer:
- What’s the minimum acceptable outcome?
- What’s out of scope?
- What evidence will convince me it’s done?
Those answers turn into a better prompt and a better plan.
The loop (copy/paste template)
Create a scratchpad (issue comment, note, Notion page—anything) and paste this:
TIMEBOXED AGENT LOOP
Timebox: 35 minutes
Goal (1 sentence):
Definition of done (3 bullets):
Constraints (3 bullets):
Risks / unknowns (up to 3):
Inputs I can provide:
- Repo paths:
- Expected behavior:
- Example input/output:
- Logs / stack traces:
Plan:
1)
2)
3)
Stop conditions:
- If X happens, stop and ask.
- If Y is unclear, propose 2 options and wait.
Now you have a spec the assistant can actually follow.
Step 1 (5 minutes): write a “one-screen” spec
Before you ask for code, force the “one screen” rule:
- Goal: one sentence
- Done: 3 bullets maximum
- Constraints: 3 bullets maximum
Example (realistic refactor):
- Goal: Replace ad-hoc config parsing with a typed schema.
- Done:
  - `config.ts` exports `Config` + `loadConfig()`
  - missing env vars produce a single actionable error message
  - unit tests cover happy path + one failure case
- Constraints:
  - no new runtime dependency
  - keep existing env var names
  - Node 20+
That’s enough to start.
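To make the spec concrete, here is a minimal sketch of what "done" could look like, using only built-ins (the "no new runtime dependency" constraint). The env var names `PORT` and `DATABASE_URL` are hypothetical placeholders, not from the spec above:

```typescript
// Hypothetical config.ts: typed schema, no runtime dependencies.
// PORT and DATABASE_URL are placeholder env var names.

export interface Config {
  port: number;
  databaseUrl: string;
}

export function loadConfig(env: Record<string, string | undefined>): Config {
  // Collect every missing variable so the error is one actionable message,
  // not a series of failures discovered one restart at a time.
  const missing: string[] = [];
  if (env.PORT === undefined) missing.push("PORT");
  if (env.DATABASE_URL === undefined) missing.push("DATABASE_URL");
  if (missing.length > 0) {
    throw new Error(`Missing required env vars: ${missing.join(", ")}`);
  }
  return { port: Number(env.PORT), databaseUrl: env.DATABASE_URL as string };
}
```

In real code you'd call `loadConfig(process.env)` once at startup; passing the env in explicitly keeps the failure case unit-testable.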
Step 2 (10 minutes): ask for a plan + patch strategy, not a full solution
I don’t start with “write the code.” I start with:
1) a plan with checkpoints
2) the exact files the assistant wants to touch
3) a patch strategy (how to apply changes safely)
Prompt:
You are helping me inside a 35-minute timebox.
Goal: Replace ad-hoc config parsing with a typed schema.
Done:
- config.ts exports Config + loadConfig()
- missing env vars produce a single actionable error message
- tests cover happy + failure
Constraints:
- no new runtime dependency
- keep env var names
Repo context:
- current config logic: src/config/index.ts
- tests: src/config/__tests__/
Task:
1) Propose a 3-step plan with checkpoints.
2) List which files you will change.
3) For each step, describe the smallest patch that keeps tests green.
If anything is ambiguous, ask up to 3 questions.
This is the difference between “here’s 200 lines” and “here’s how we’ll not break production.”
Step 3 (15 minutes): run the “patch, verify, narrate” cycle
Inside the timebox, I repeat a mini-cycle:
- Patch (smallest diff)
- Verify (tests, lint, or a single reproduction command)
- Narrate (what changed + why)
A good assistant response in this phase looks like:
- a focused diff
- commands to run
- a short explanation
If you can’t run commands in your environment, simulate verification by demanding the assistant provide:
- expected outputs
- edge cases
- how to roll back
Here’s a “diff-first” prompt that works well:
Make the smallest possible change that moves us toward the goal.
Return:
- a unified diff
- the command(s) I should run to verify
- what success looks like
- one rollback instruction
Why a unified diff? Because file boundaries are harder to hallucinate, and the result is easier for you to review.
Step 4 (5 minutes): force a stop and create a handoff
Timeboxes fail when you keep extending them.
When the timer hits zero, stop and write a handoff note. Even if the work is unfinished, you’ll avoid the “where was I?” tax.
Handoff template:
State:
- What works now:
- What’s failing / missing:
Next steps (max 3):
1)
2)
3)
Open questions:
- ?
This is also the moment to decide whether to:
- ship as-is
- create a follow-up timebox
- revert and try a different approach
Two concrete examples
Example A: debugging a flaky test
Timebox: 25 minutes
- Goal: make `UserService` tests deterministic
- Done: flaky test removed or stabilized; root cause documented
- Constraints: don’t increase test runtime by >10%
Loop in practice:
1) Ask for a hypothesis list (max 5) based on the error + test file.
2) Pick one hypothesis and request the smallest patch.
3) Verify by running that one test in a loop (e.g. 20 runs).
4) If it’s still flaky, stop and move to hypothesis #2.
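Step 3's "run that one test in a loop" can be scripted with Node built-ins. This is a sketch; the test command is whatever runs your single flaky test (the jest invocation in the comment is a placeholder):

```typescript
// Run one test command repeatedly; stop at the first failure so you know
// whether the current hypothesis actually stabilized the test.
import { spawnSync } from "node:child_process";

// Returns the run number of the first failure, or 0 if every run passed.
export function firstFailingRun(cmd: string, args: string[], runs = 20): number {
  for (let i = 1; i <= runs; i++) {
    // e.g. cmd = "npx", args = ["jest", "UserService", "-t", "flaky case"]
    const result = spawnSync(cmd, args, { stdio: "inherit" });
    if (result.status !== 0) return i; // still flaky: record which run broke
  }
  return 0; // deterministic across all runs (for now)
}
```

A nonzero return is your cue to abandon the current hypothesis and move to the next one, per step 4.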
The key is serializing hypotheses. AI assistants love doing five things at once. Don’t let them.
Example B: adding a tiny feature
Feature: Add a `--dry-run` flag to a CLI command.
- Done:
  - flag is documented in `--help`
  - no network calls happen in dry-run mode
  - one unit test asserts the behavior
Ask for the plan, then request a patch that only touches:
- CLI arg parsing
- one function boundary where side effects happen
- tests
If the assistant wants to “restructure the whole CLI,” that’s a timebox violation.
Common failure modes (and the guardrails that fix them)
1) Scope creep via “nice-to-haves”
- Guardrail: explicitly list out-of-scope items in the prompt.
2) Big bang refactors
- Guardrail: require “smallest patch that keeps tests green.”
3) Invisible verification
- Guardrail: every patch must include a verification command and expected result.
4) Ambiguity masquerading as progress
- Guardrail: allow only 3 questions; otherwise propose 2 options and wait.
Why this works
The Timeboxed Agent Loop does two things humans are bad at under uncertainty:
- it forces explicit constraints
- it creates a cadence of evidence (patch → verify)
You still get the leverage of AI. You just don’t pay for it with your entire afternoon.
If you try this, start with a 25-minute loop on something small. The point isn’t to go fast—it’s to stay in control.