DEV Community

Tsunamayo

I built a desktop app that orchestrates Claude, GPT, Gemini and local Ollama in a 3-phase pipeline

I've been building desktop AI tools for a while, and one frustration kept coming up: every AI model has different strengths, but using them together was always manual work — copy-paste between apps, switch tabs, lose context.

So I built Helix AI Studio — an open-source desktop app that lets Claude, GPT, Gemini, and local Ollama models work together in a coordinated pipeline.

GitHub: https://github.com/tsunamayo7/helix-ai-studio


The Core Idea: Multi-Phase AI Pipelines

Instead of sending one prompt to one model, Helix routes your request through multiple AI models in sequence. Each model handles what it's best at:

Your prompt
    ↓
Phase 1: Claude (analysis & reasoning)
    ↓
Phase 2: GPT / Gemini (alternative perspective)
    ↓
Phase 3: Local Ollama model (offline processing / privacy)
    ↓
Final synthesized response

You configure which models run in which phases, and the output of each phase feeds into the next.
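The phase-chaining idea can be sketched in a few lines. This is illustrative, not Helix's actual code: the phase functions here are stand-ins for real model calls, and `run_pipeline` is a hypothetical name.

```python
# Minimal sketch of a multi-phase pipeline: each phase's output
# becomes the next phase's input (function names are illustrative).

def run_pipeline(prompt, phases):
    """Run `prompt` through a list of phase functions in order."""
    context = prompt
    for phase in phases:
        context = phase(context)  # feed each phase's output forward
    return context

# Stub "models" standing in for Claude / GPT / a local Ollama model.
analyze    = lambda text: f"[analysis] {text}"
critique   = lambda text: f"[critique] {text}"
synthesize = lambda text: f"[synthesis] {text}"

result = run_pipeline("Explain CRDTs", [analyze, critique, synthesize])
print(result)  # "[synthesis] [critique] [analysis] Explain CRDTs"
```

In the real app, each phase would be an async API call with its own prompt template, but the data flow is exactly this: a fold over the configured phase list.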


What's Inside

Desktop GUI (PyQt6)

  • Three chat tabs: cloudAI (Claude/GPT/Gemini), localAI (Ollama), mixAI (the pipeline)
  • Dark-themed native app (Windows and macOS)
  • Real-time streaming responses

Built-in Web UI (React + FastAPI)

  • Access from mobile or other devices on your LAN
  • WebSocket-based streaming — same experience as the desktop
  • JWT authentication

Local LLM Support

  • Ollama integration via httpx async calls
  • Model switching without restart
  • Works fully offline

RAG Memory

  • SQLite-based conversation storage
  • Retrieval-augmented context for follow-up questions
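A toy version of SQLite-backed memory with retrieval fits in a few lines. The schema and the naive substring matching below are illustrative only; a real setup would rank with embeddings or SQLite FTS5:

```python
# Tiny sketch of conversation memory: store messages in SQLite,
# pull back the most recent ones relevant to a follow-up question.
import sqlite3

con = sqlite3.connect(":memory:")
con.execute(
    "CREATE TABLE messages (id INTEGER PRIMARY KEY, role TEXT, content TEXT)"
)

def remember(role: str, content: str) -> None:
    con.execute(
        "INSERT INTO messages (role, content) VALUES (?, ?)", (role, content)
    )

def recall(query: str, limit: int = 3) -> list[str]:
    # Naive retrieval: substring match, newest first.
    rows = con.execute(
        "SELECT content FROM messages WHERE content LIKE ? "
        "ORDER BY id DESC LIMIT ?",
        (f"%{query}%", limit),
    )
    return [r[0] for r in rows]

remember("user", "My Ollama model is mistral")
remember("assistant", "Noted: mistral is your local model")
print(recall("mistral"))
```

The retrieved snippets get prepended to the prompt for follow-up questions, which is all "retrieval-augmented context" means at its core.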

Tech Stack

| Layer | Tech |
| --- | --- |
| Desktop GUI | PyQt6 |
| Web backend | FastAPI + Uvicorn + WebSocket |
| Web frontend | React + Tailwind CSS |
| Local LLMs | Ollama |
| Cloud AIs | Anthropic SDK, OpenAI SDK, Google Generative AI |
| DB | SQLite |
| Platform | Windows 10/11 and macOS 12+ (Apple Silicon & Intel) |

Why Mix Models?

Different models genuinely excel at different things. In my testing:

  • Claude is great at structured reasoning and nuanced writing
  • GPT handles coding tasks and tool use well
  • Gemini is strong at multimodal tasks and factual retrieval
  • Local models (Mistral, Llama, Gemma) keep sensitive data on-device

By pipelining them, you get complementary strengths rather than betting everything on one model's weak spots.


Getting Started

git clone https://github.com/tsunamayo7/helix-ai-studio
cd helix-ai-studio
pip install -r requirements.txt
# Add your API keys to config/config.json
python HelixAIStudio.py    # Windows
python3 HelixAIStudio.py   # macOS

Ollama needs to be running separately if you want local model support. Everything else runs in-process.


What's Next

  • MCP (Model Context Protocol) tool integration
  • Plugin system for custom pipeline steps
  • Better multi-modal support (image inputs across models)

The project is MIT licensed. Issues, PRs, and feedback all welcome — especially from people who've tried mixing models for real workloads. Curious what combinations others find useful.

GitHub: https://github.com/tsunamayo7/helix-ai-studio

Top comments (1)

Matthew Hou

The multi-phase pipeline approach is the right direction. Single-model-does-everything is hitting a wall for anything beyond simple tasks.

One question: how do you handle disagreements between models? If Phase 1 (Claude) produces an analysis that Phase 2 (GPT) fundamentally contradicts, does the pipeline have a resolution strategy, or does the final phase just work with whatever it receives?

In my experience, the orchestration layer is where most of the engineering effort ends up — not in the model calls themselves. Routing, error handling, context compression between phases, knowing when to retry vs skip. The model calls are the easy part. Everything around them is where it gets interesting.