Here's how I implemented a Planner-Executor-Critic loop to stop getting generic AI responses.
We've all been there: you ask an LLM to write a landing page or a blog post, and you get a generic, soulless wall of text. It lacks structure, it misses the nuance, and it usually takes 3-4 follow-up prompts to fix.
I decided to solve this by building Devwrite—a local, agentic writing assistant that mimics a human editorial process. Instead of one "generation," it uses a multi-agent loop to plan, draft, critique, and refine content, all running locally on my machine using gemma3:12b via Ollama.
Here’s a code-level deep dive into how I built it.
The Architecture: 3 Agents, 1 Loop
The core is an orchestrator loop that manages three distinct agents.
1. The Planner Agent
This agent doesn't write content. Its only job is to return a JSON array of sections.
async function plannerAgent(userRequest) {
  const prompt = `You are a Planner Agent.
Your goal is to break down the user's request into a logical list of distinct sections.
User Request: "${userRequest}"
Return ONLY a JSON object with a "plan" key containing an array of strings.`;
  const response = await callOllama(prompt, true); // json mode returns a parsed object
  return response.plan;
}
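The Executor itself is the simplest of the three, so I haven't shown it above. A minimal sketch (the names here are illustrative, and the model client is passed in rather than hard-coded, so it can be the retry-wrapped callOllama described later):

```javascript
// Hypothetical sketch of the Executor agent. `llm` is the model client,
// e.g. the retry-wrapped callOllama helper from this post.
function buildExecutorPrompt(userRequest, section) {
  return `You are an Executor Agent.
Original request: "${userRequest}"
Write the content for the section "${section}".
Return ONLY the prose for that section, no preamble.`;
}

async function executorAgent(userRequest, section, llm) {
  // Plain-text mode: the draft is prose, not JSON
  return await llm(buildExecutorPrompt(userRequest, section));
}
```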
2. The Critic Agent
This is where the magic happens. The Critic reviews the Executor's draft and assigns a score (1-10). If it's below 8, it rejects the draft.
async function criticAgent(section, content) {
  const prompt = `You are a Critic Agent.
Review the following content for the section "${section}".
Content: "${content}"
Give a score from 1 to 10 (10 being perfect).
If the score is below 8, provide specific feedback.
Return ONLY a JSON object: { "score": number, "feedback": "string" }`;
  return await callOllama(prompt, true);
}
3. The Orchestrator Loop
The loop ties it all together. It attempts to rewrite the content up to 2 times based on feedback.
// Inside the main loop...
while (attempts < MAX_ATTEMPTS) {
  const critique = await criticAgent(section, content);
  if (critique.score >= 8) {
    break; // Approved!
  }
  // Refinement Step: the model needs the current draft, not just the feedback
  const rewritePrompt = `Original request: "${userRequest}"
Current draft: "${content}"
Critic Feedback: "${critique.feedback}"
Rewrite the content to address the feedback.`;
  content = await callOllama(rewritePrompt);
  attempts++;
}
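Putting the pieces together, the outer flow over the Planner's sections looks roughly like this. This is a sketch, not the repo's exact code: the agent functions are injected as parameters so the control flow itself can be exercised with stubs.

```javascript
const MAX_ATTEMPTS = 2; // up to 2 rewrites per section

// Drafts and refines every planned section. The agent functions
// (planner, executor, critic, refine) are injected for testability.
async function writeDocument(userRequest, { planner, executor, critic, refine }) {
  const sections = await planner(userRequest);
  const results = [];
  for (const section of sections) {
    let content = await executor(userRequest, section);
    let attempts = 0;
    while (attempts < MAX_ATTEMPTS) {
      const critique = await critic(section, content);
      if (critique.score >= 8) break; // Approved!
      content = await refine(userRequest, content, critique.feedback);
      attempts++;
    }
    results.push({ section, content });
  }
  return results;
}
```

With stubbed agents you can verify the loop exits early once the Critic approves, without touching a model at all.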
Handling Local Model Flakiness
One challenge with 12B-parameter models is that they sometimes fail to return valid JSON, or the connection drops. I implemented a robust retry wrapper for the Ollama client:
async function callOllama(prompt, json = false, retries = 3) {
  for (let i = 0; i < retries; i++) {
    try {
      // ... make request ...
      if (!response.response) throw new Error("Empty response");
      return response.response;
    } catch (error) {
      if (i === retries - 1) throw error;
      await sleep(1000 * 2 ** i); // Exponential backoff: 1s, 2s, 4s
    }
  }
}
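Retries handle dropped connections, but even a successful response sometimes wraps its JSON in markdown fences or adds a chatty preamble. A defensive parse helps for the JSON path; here's one sketch (the helper name is mine, not from the repo):

```javascript
// Best-effort JSON extraction from chatty model output.
// Handles raw JSON, ```json fences, and text surrounding the object.
function extractJson(text) {
  // Prefer a fenced block if one exists
  const fenced = text.match(/```(?:json)?\s*([\s\S]*?)```/);
  const candidate = fenced ? fenced[1] : text;
  // Fall back to the outermost { ... } span
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start === -1 || end === -1) throw new Error("No JSON object found");
  return JSON.parse(candidate.slice(start, end + 1));
}
```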
The UI (Server-Sent Events)
To make it feel alive, I stream the "thought process" using SSE (Server-Sent Events). The Node.js backend pushes updates like {"type": "critique", "score": 6} to the React frontend, which renders real-time status updates.
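On the wire, SSE is just text frames on a long-lived HTTP response: each event is a `data:` line followed by a blank line. A minimal Node sketch (Express-style handler; the function names are mine):

```javascript
// Serialize one SSE frame: `data: <json>` followed by a blank line.
function sseFrame(payload) {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Express-style handler: hold the response open and push agent events.
function sseHandler(req, res) {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });
  // Called from the orchestrator as each critique lands, e.g.:
  res.write(sseFrame({ type: "critique", score: 6 }));
}
```

On the React side, a plain `EventSource` subscription is enough to receive these frames and render the status updates.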
Devwrite proves that you don't need huge API budgets to build sophisticated agentic workflows. With a bit of logic and a decent local model, you can build systems that correct themselves.
Here's the GitHub repo.