Here's how I implemented a Planner-Executor-Critic loop to stop getting generic AI responses.
We've all been there: you ask an LLM to write a landing page or a blog post, and you get a generic, soulless wall of text. It lacks structure, it misses the nuance, and it usually takes 3-4 follow-up prompts to fix.
I decided to solve this by building Devwrite—a local, agentic writing assistant that mimics a human editorial process. Instead of one "generation," it uses a multi-agent loop to plan, draft, critique, and refine content, all running locally on my machine using gemma3:12b via Ollama.
Here’s a code-level deep dive into how I built it.
The Architecture: 3 Agents, 1 Loop
The core is an orchestrator loop that manages three distinct agents.
1. The Planner Agent
This agent doesn't write content. Its only job is to return a JSON array of sections.
async function plannerAgent(userRequest) {
  const prompt = `You are a Planner Agent.
Your goal is to break down the user's request into a logical list of distinct sections.
User Request: "${userRequest}"
Return ONLY a JSON object with a "plan" key containing an array of strings.`;
  const response = await callOllama(prompt, true); // json mode returns a parsed object
  return response.plan;
}
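The Executor itself is the simplest of the three, so I haven't shown it above. A minimal sketch (the names here are illustrative, and the model client is passed in rather than hard-coded, so it can be the retry-wrapped callOllama described later):

```javascript
// Hypothetical sketch of the Executor agent. `llm` is the model client,
// e.g. the retry-wrapped callOllama helper from this post.
function buildExecutorPrompt(userRequest, section) {
  return `You are an Executor Agent.
Original request: "${userRequest}"
Write the content for the section "${section}".
Return ONLY the prose for that section, no preamble.`;
}

async function executorAgent(userRequest, section, llm) {
  // Plain-text mode: the draft is prose, not JSON
  return await llm(buildExecutorPrompt(userRequest, section));
}
```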
2. The Critic Agent
This is where the magic happens. The Critic reviews the Executor's draft and assigns a score (1-10). If it's below 8, it rejects the draft.
async function criticAgent(section, content) {
  const prompt = `You are a Critic Agent.
Review the following content for the section "${section}".
Content: "${content}"
Give a score from 1 to 10 (10 being perfect).
If the score is below 8, provide specific feedback.
Return ONLY a JSON object: { "score": number, "feedback": "string" }`;
  return await callOllama(prompt, true);
}
3. The Orchestrator Loop
The loop ties it all together. It attempts to rewrite the content up to 2 times based on feedback.
// Inside the main loop...
while (attempts < MAX_ATTEMPTS) {
  const critique = await criticAgent(section, content);
  if (critique.score >= 8) {
    break; // Approved!
  }
  // Refinement Step: the model needs the current draft, not just the feedback
  const rewritePrompt = `Original request: "${userRequest}"
Current draft: "${content}"
Critic Feedback: "${critique.feedback}"
Rewrite the content to address the feedback.`;
  content = await callOllama(rewritePrompt);
  attempts++;
}
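Putting the pieces together, the outer flow over the Planner's sections looks roughly like this. This is a sketch, not the repo's exact code: the agent functions are injected as parameters so the control flow itself can be exercised with stubs.

```javascript
const MAX_ATTEMPTS = 2; // up to 2 rewrites per section

// Drafts and refines every planned section. The agent functions
// (planner, executor, critic, refine) are injected for testability.
async function writeDocument(userRequest, { planner, executor, critic, refine }) {
  const sections = await planner(userRequest);
  const results = [];
  for (const section of sections) {
    let content = await executor(userRequest, section);
    let attempts = 0;
    while (attempts < MAX_ATTEMPTS) {
      const critique = await critic(section, content);
      if (critique.score >= 8) break; // Approved!
      content = await refine(userRequest, content, critique.feedback);
      attempts++;
    }
    results.push({ section, content });
  }
  return results;
}
```

With stubbed agents you can verify the loop exits early once the Critic approves, without touching a model at all.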
Handling Local Model Flakiness
One challenge with 12B-parameter models is that they sometimes fail to return valid JSON, or the connection drops. I implemented a robust retry wrapper for the Ollama client:
async function callOllama(prompt, json = false, retries = 3) {
  for (let i = 0; i < retries; i++) {
    try {
      // ... make request ...
      if (!response.response) throw new Error("Empty response");
      return response.response;
    } catch (error) {
      if (i === retries - 1) throw error;
      await sleep(1000 * 2 ** i); // Exponential backoff: 1s, 2s, 4s
    }
  }
}
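Retries handle dropped connections, but even a successful response sometimes wraps its JSON in markdown fences or adds a chatty preamble. A defensive parse helps for the JSON path; here's one sketch (the helper name is mine, not from the repo):

```javascript
// Best-effort JSON extraction from chatty model output.
// Handles raw JSON, ```json fences, and text surrounding the object.
function extractJson(text) {
  // Prefer a fenced block if one exists
  const fenced = text.match(/```(?:json)?\s*([\s\S]*?)```/);
  const candidate = fenced ? fenced[1] : text;
  // Fall back to the outermost { ... } span
  const start = candidate.indexOf("{");
  const end = candidate.lastIndexOf("}");
  if (start === -1 || end === -1) throw new Error("No JSON object found");
  return JSON.parse(candidate.slice(start, end + 1));
}
```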
The UI (Server-Sent Events)
To make it feel alive, I stream the "thought process" using SSE (Server-Sent Events). The Node.js backend pushes updates like {"type": "critique", "score": 6} to the React frontend, which renders real-time status updates.
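On the wire, SSE is just text frames on a long-lived HTTP response: each event is a `data:` line followed by a blank line. A minimal Node sketch (Express-style handler; the function names are mine):

```javascript
// Serialize one SSE frame: `data: <json>` followed by a blank line.
function sseFrame(payload) {
  return `data: ${JSON.stringify(payload)}\n\n`;
}

// Express-style handler: hold the response open and push agent events.
function sseHandler(req, res) {
  res.writeHead(200, {
    "Content-Type": "text/event-stream",
    "Cache-Control": "no-cache",
    Connection: "keep-alive",
  });
  // Called from the orchestrator as each critique lands, e.g.:
  res.write(sseFrame({ type: "critique", score: 6 }));
}
```

On the React side, a plain `EventSource` subscription is enough to receive these frames and render the status updates.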
Devwrite proves that you don't need huge API budgets to build sophisticated agentic workflows. With a bit of logic and a decent local model, you can build systems that correct themselves.
Here's the GitHub repo.