Minkyu

How I ran 6 LLMs in parallel without paying a cent in API fees (Electron + DOM Injection)

Let’s be honest: trusting a single LLM with a complex problem is basically a coin toss right now.

I got incredibly tired of my daily workflow: ask ChatGPT a question -> get a confident answer -> paste the same question into Claude to fact-check -> get a contradictory answer -> ask Perplexity to break the tie. I was acting as the manual API router, and it was exhausting.

I wanted a "Peer Review" system where the AIs cross-checked each other. But I quickly ran into two massive roadblocks:

Cost: Running a 6-model cross-validation loop through official APIs (GPT-4o, Claude 3.5, DeepSeek, etc.) for every single query gets expensive fast.

Latency (The Waterfall): Chaining these sequentially means waiting minutes for an answer.

So, I decided to build AI Council — a local desktop app that bypasses APIs entirely and runs 6 AIs in parallel using their free web UIs. Here is how I built the orchestration logic without it turning into a complete disaster.

The Architecture: Electron & 6 BrowserViews
Instead of using standard REST API calls, the app is an Electron wrapper. It spins up 6 hidden (or visible, if you want the matrix vibe) BrowserView instances, loading the actual web interfaces of ChatGPT, Claude, Gemini, Perplexity, DeepSeek, and Grok.

The entire "API" is just DOM injection: parsing the HTML, finding the text area, simulating keystrokes, and clicking the "Send" button.
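In practice, that means building a script string in the main process and running it inside the page via `view.webContents.executeJavaScript(...)`. Here is a minimal sketch of what such a payload builder can look like; the selectors are illustrative guesses, not the app's actual ones (every site needs its own, and they break whenever the UI changes):

```javascript
// Hypothetical sketch: builds the script injected into a BrowserView via
// view.webContents.executeJavaScript(buildInjectionScript(prompt)).
// The selectors below are placeholders, not the real per-site ones.
function buildInjectionScript(prompt) {
  // JSON.stringify safely escapes the prompt for embedding in page JS.
  const safePrompt = JSON.stringify(prompt);
  return `
    (() => {
      const box = document.querySelector('textarea, [contenteditable="true"]');
      if (!box) return false;
      box.focus();
      // "Simulated typing": set the value, then fire an input event so the
      // site's framework (React, etc.) registers the change.
      if (box.tagName === 'TEXTAREA') {
        box.value = ${safePrompt};
      } else {
        box.textContent = ${safePrompt};
      }
      box.dispatchEvent(new Event('input', { bubbles: true }));
      const send = document.querySelector('button[type="submit"], [data-testid="send-button"]');
      if (send) send.click();
      return true;
    })();
  `;
}
```

The `input` event dispatch is the non-obvious part: just setting `.value` silently does nothing on React-based UIs, because their state never sees the change.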

Bypassing the Waterfall: Fan-Out / Fan-In
If Model A waits for Model B, and Model B waits for Model C, the UX is dead. To solve this, I used a Fan-out/Fan-in orchestration approach.

Here is the flow:

The Primary Draft: You ask a question. The Primary AI (e.g., ChatGPT) generates a first draft.

Fan-Out (Parallel Review): The app takes that draft and broadcasts it to the other 5 AI panels at the exact same time. It hits the "submit" button on all 5 BrowserViews simultaneously.

Fan-In (Compilation): The app monitors the DOM of all 5 windows. Once they all stop generating, it extracts the text, compiles the feedback, and feeds it back to the Primary AI.

The Final Output: The Primary AI rewrites the answer based on the peer review.

Basically, the orchestration is just a giant, glorious Promise.allSettled.

```javascript
// Conceptual fan-out logic
async function runParallelReview(draft) {
  const reviewers = [claudeView, geminiView, deepseekView, grokView, perplexityView];

  // Fire them all at once
  const reviewPromises = reviewers.map(view =>
    injectPromptAndWaitForCompletion(view, `Review this draft: ${draft}`)
  );

  // Wait for all models to finish physically typing
  const reviews = await Promise.allSettled(reviewPromises);

  return compileReviews(reviews);
}
```
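The fan-in half is where `Promise.allSettled` earns its keep: it never rejects, so a stalled reviewer becomes a reported failure instead of a crashed round. A sketch of what `compileReviews` can look like (the formatting and parameter names here are my assumptions, not necessarily the repo's actual implementation):

```javascript
// Hypothetical sketch of the fan-in step. Takes the settled results from
// Promise.allSettled plus the reviewer names, and builds the feedback
// prompt that gets fed back to the Primary AI.
function compileReviews(settled, names) {
  const lines = settled.map((result, i) => {
    if (result.status === 'fulfilled') {
      return `### Review from ${names[i]}\n${result.value}`;
    }
    // Rejected = timed out or UI broke; report it rather than losing the round.
    return `### ${names[i]} did not respond (${result.reason})`;
  });
  return `Here is peer feedback on your draft:\n\n${lines.join('\n\n')}\n\nPlease revise the draft accordingly.`;
}
```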

The Real Headache: Managing Web UI States
The hardest part wasn't the orchestration; it was dealing with the fact that web UIs change, and models stream text at different speeds.

How do you know when an AI is "done" typing when you don't have a clean API response?
You have to monitor the DOM state: for example, watching for the "Stop generating" button to disappear, or attaching a MutationObserver to the chat container and waiting for the mutations to stop. And if Grok's UI is being weird today, the whole promise chain could hang. I had to build robust timeout and fallback mechanisms for each specific wrapper so that one failing UI doesn't crash the entire council.
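The per-model guard boils down to two small helpers: a timeout wrapper and a polling loop. A sketch under assumed names (the real probe would run a `document.querySelector` check inside the page via `executeJavaScript`):

```javascript
// Hypothetical sketch of the per-model guard. One stuck UI should surface
// as a rejected promise, not hang the council forever.
function withTimeout(promise, ms, label) {
  let timer;
  const timeout = new Promise((_, reject) => {
    timer = setTimeout(() => reject(new Error(`${label} timed out after ${ms}ms`)), ms);
  });
  return Promise.race([promise, timeout]).finally(() => clearTimeout(timer));
}

// Polls an async "is the model done?" probe (e.g. checking that the
// "Stop generating" button is gone) until it returns true.
async function pollUntilDone(probe, intervalMs = 500) {
  while (!(await probe())) {
    await new Promise(r => setTimeout(r, intervalMs));
  }
}
```

Wired together it reads something like `withTimeout(pollUntilDone(probeForView(view)), 120000, 'grok')`, and since the caller is `Promise.allSettled`, a timeout shows up as one rejected entry instead of a crash.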

The Result
What I ended up with is a fully local, open-source app that gives me stress-tested, peer-reviewed answers. I even hooked it up to a local Telegram long-polling script: I text my council from my phone, my PC runs the BrowserViews, and it texts me the final consensus back. Zero cloud servers.
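The Telegram side is just the stock `getUpdates` long-polling API (no webhook, so nothing needs to be publicly reachable). A minimal sketch of the per-batch handler, with `askCouncil` and `sendReply` as hypothetical callbacks into the council and the Bot API:

```javascript
// Hypothetical sketch of the Telegram bridge. The real loop calls
// getUpdates with a long timeout and passes the returned offset back in;
// askCouncil and sendReply are assumed callbacks, not actual repo names.
async function handleUpdates(updates, offset, askCouncil, sendReply) {
  for (const update of updates) {
    offset = Math.max(offset, update.update_id + 1); // ack so it isn't re-delivered
    const msg = update.message;
    if (!msg || !msg.text) continue;
    // Run the full council round on the PC, then text back the consensus.
    const answer = await askCouncil(msg.text);
    await sendReply(msg.chat.id, answer);
  }
  return offset; // pass as ?offset= on the next getUpdates call
}
```

Advancing the offset is the one thing you cannot skip: without it, Telegram keeps re-delivering the same message and your council answers it in a loop.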

If you are curious about the DOM injection scripts or the Electron multi-view architecture, the entire project is open-source.

Check out the repo here: 👉 https://github.com/MinkyuTheBuilder/ai-council

Feel free to fork it, star it, or tell me why my DOM-scraping logic is terrible in the issues! I'd love to connect with anyone else building local multi-agent setups.
