I run Claude Code, Gemini CLI, and Codex simultaneously. Not because I'm showing off — because that's genuinely the fastest way to build software right now. One agent refactors a module, another writes tests, a third handles the deployment config.
The problem isn't running them. The problem is managing them.
I'd have four terminal windows open, alt-tabbing constantly, trying to remember which one was waiting for my input, which one was still thinking, and which one had finished five minutes ago while I wasn't looking. I'd miss prompts. I'd interrupt agents mid-thought. I'd lose track of what each terminal was even doing.
So I built PATAPIM — a terminal IDE designed around running multiple AI coding agents at once.
The core insight: your IDE should match your workflow
Traditional IDEs are built around a code editor with a terminal panel at the bottom. That made sense when the terminal was where you ran build commands. It doesn't make sense when the terminal is where development happens.
If you're using Claude Code or Gemini CLI, the terminal isn't a secondary panel — it's your primary interface. And if you're running multiple agents in parallel, you need to see them all at the same time.
PATAPIM gives you up to 9 terminals in a grid layout. Each one can run a different agent, a different task, or just a plain shell. The left sidebar shows your projects, quick tasks, and file tree. The terminal grid is front and center.
The hardest problem: knowing what your agents are doing
Here's the thing nobody talks about when they say "just run multiple agents." How do you know when one finishes? How do you know when one needs input? Do you scroll through each terminal manually?
This was the core technical challenge. I needed to detect the state of each AI agent automatically and show it visually.
The solution is pattern matching on terminal output. Each agent has recognizable patterns:
- Working/thinking: Braille spinners, asterisk spinners, or text like "(thinking)"
- Waiting for input: A prompt character like "❯" with no spinner
- Plan mode: Text like "Plan:" or specific plan-mode indicators
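In code, this kind of detection is just a handful of regexes run against the last chunk of terminal output. A minimal sketch (the patterns and state names here are illustrative assumptions, not PATAPIM's actual ones):

```typescript
// Hypothetical agent-state detection via pattern matching on terminal output.
// These regexes are illustrative, not PATAPIM's real patterns.
type AgentState = "working" | "waiting" | "plan" | "idle";

const SPINNER = /[\u2800-\u28FF]|\(thinking\)|\*\s*(Thinking|Working)/; // Braille/asterisk spinners
const PLAN = /^Plan:/im;                                                // plan-mode indicator
const PROMPT = /❯\s*$/m;                                                // bare prompt, no spinner

function detectAgentState(tail: string): AgentState {
  if (SPINNER.test(tail)) return "working"; // agent is busy — don't touch
  if (PLAN.test(tail)) return "plan";       // review and approve
  if (PROMPT.test(tail)) return "waiting";  // agent needs your input
  return "idle";                            // plain shell / nothing recognized
}
```

Order matters: a spinner must win over a prompt character, since agents often redraw the prompt while still thinking.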
The detection runs on a 200ms throttle — fast enough to feel responsive, slow enough not to kill performance with 9 terminals pumping output simultaneously.
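A throttle like that can be sketched in a few lines (illustrative, not PATAPIM's implementation). The trailing-edge call matters: it guarantees the final burst of output still gets checked even after the stream goes quiet.

```typescript
// Hypothetical throttle: invoke fn at most once per `ms`, with a trailing
// call so the last burst of output is never missed.
function throttle<T extends unknown[]>(fn: (...args: T) => void, ms: number) {
  let last = 0;
  let timer: ReturnType<typeof setTimeout> | undefined;
  let pending: T | undefined;
  return (...args: T) => {
    const now = Date.now();
    if (now - last >= ms) {
      last = now;
      fn(...args); // leading edge: run immediately
    } else {
      pending = args; // coalesce the burst; keep only the latest args
      if (!timer) {
        timer = setTimeout(() => {
          timer = undefined;
          last = Date.now();
          fn(...(pending as T)); // trailing edge
        }, ms - (now - last));
      }
    }
  };
}
```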
Each terminal gets a color-coded border based on its detected state:
- Red → Agent is working. Don't touch it.
- Green → Agent needs your input. Go here.
- Cyan → Plan mode. Review and approve.
This single feature changed everything. Instead of checking each terminal, I glance at the grid and immediately see which ones need attention. It's like a traffic light system for your AI agents.
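The mapping itself is trivial — a sketch, with the state and color names as described above (the `idle` default is my assumption):

```typescript
// Traffic-light mapping from detected agent state to border color.
type AgentState = "working" | "waiting" | "plan" | "idle";

const BORDER_COLOR: Record<AgentState, string> = {
  working: "red",   // don't touch it
  waiting: "green", // needs your input
  plan: "cyan",     // review and approve
  idle: "gray",     // hypothetical default for a plain shell
};

function borderFor(state: AgentState): string {
  return BORDER_COLOR[state];
}
```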
Voice dictation that stays on your machine
I code on a standing desk and sometimes I want to talk to my agents instead of typing. I also have a long history of wrist issues, so voice input isn't a gimmick for me — it's a necessity.
PATAPIM includes built-in voice dictation powered by Whisper. The important part: it runs locally. The speech recognition models (from @huggingface/transformers, ONNX format) run entirely on your machine. No audio is sent anywhere.
The app auto-selects the model size based on your hardware — tiny, base, or small — so it works on modest machines too. The free tier gives you 30 minutes of dictation per session. Pro unlocks unlimited.
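A hardware-based selection heuristic can be as simple as checking RAM and core count. A hedged sketch — the thresholds below are made up for illustration, not PATAPIM's actual logic:

```typescript
import os from "node:os";

type WhisperModel = "tiny" | "base" | "small";

// Pick a Whisper model size from total RAM and CPU core count.
// Thresholds are illustrative assumptions, not PATAPIM's real heuristic.
function pickModel(
  totalBytes: number = os.totalmem(),
  cores: number = os.cpus().length,
): WhisperModel {
  const gb = totalBytes / 1024 ** 3;
  if (gb >= 16 && cores >= 8) return "small"; // beefy machine: best accuracy
  if (gb >= 8) return "base";                 // middle ground
  return "tiny";                              // modest hardware still works
}
```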
Remote access from your phone
Sometimes I step away from my desk — getting coffee, pacing while thinking through architecture — and I want to check if my agents are done or approve a plan. PATAPIM runs a local WebSocket server on your LAN. Scan a QR code with your phone, and you get full terminal access in your mobile browser.
No VPN, no cloud service, no port forwarding. It's a direct connection on your local network.
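Under the hood, "scan a QR code" just means encoding a LAN URL. Finding the machine's LAN address is a few lines of Node; the port and path below are placeholders I made up, not PATAPIM's actual values:

```typescript
import os from "node:os";

// Find the first non-internal IPv4 address and build the URL the QR code
// would encode. Port 8765 and /terminal are illustrative placeholders.
function lanUrl(port = 8765): string | undefined {
  for (const ifaces of Object.values(os.networkInterfaces())) {
    for (const iface of ifaces ?? []) {
      if (iface.family === "IPv4" && !iface.internal) {
        return `http://${iface.address}:${port}/terminal`;
      }
    }
  }
  return undefined; // no LAN interface found (e.g., offline machine)
}
```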
I use this daily. Agent running a long task? I check from my phone. Need to approve a plan while making coffee? Done from the couch.
The embedded MCP browser
This one is a bit wild but incredibly useful.
PATAPIM includes an embedded Chrome browser panel that registers as an MCP (Model Context Protocol) server. When Claude Code runs inside PATAPIM, it can use this browser to navigate real websites, click elements, fill forms, and read page content.
This isn't a headless browser or a simulation. It's a real Chrome instance your agent can control. I use it for researching docs while coding, testing web apps, and scraping data as part of development tasks.
Scheduling and automation
One pattern I found myself doing constantly: running the same check every few minutes. "Did the build pass?" "Is the server responding?"
Right-click a terminal tab and you can schedule a command to run at a specific time, or loop a command on an interval.
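Conceptually this is just `setTimeout` for the one-shot case and `setInterval` for the loop. A sketch, where `send` is a stand-in for writing the command to the terminal's PTY:

```typescript
// Schedule a command string for a terminal: once at a given time, or
// repeatedly on an interval. `send` stands in for writing to the PTY.
function scheduleOnce(send: (cmd: string) => void, cmd: string, at: Date) {
  const delay = Math.max(0, at.getTime() - Date.now()); // clamp past times to "now"
  return setTimeout(() => send(cmd + "\n"), delay);
}

function scheduleEvery(send: (cmd: string) => void, cmd: string, ms: number) {
  return setInterval(() => send(cmd + "\n"), ms);
}
```

Returning the timer handle lets the caller cancel the schedule with `clearTimeout`/`clearInterval` when the tab is closed.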
Small feature, but it eliminates a whole class of interruptions.
The philosophy behind it
Work in parallel. Run multiple agents on different tasks simultaneously.
Never sit idle. When one agent is thinking, switch to another. The color-coded borders tell you instantly where your attention is needed.
Better to speak than write. Voice dictation is faster than typing for most prompts.
The technical stack
- Electron — Desktop shell. Memory sits around 150-200MB for the full app with 9 terminals.
- xterm.js 5.3 — Terminal rendering; the gold standard for terminal emulation in JS
- node-pty 1.0 — PTY process management
- WebSocket (ws 8.16) — Remote access transport
- @huggingface/transformers 3.8.1 — Local Whisper inference with ONNX models
Try it
PATAPIM is free to use. The free tier gives you 9 terminals, 3 projects, and 30 minutes of voice dictation — no credit card, no trial period. Pro is $7/month or $30 for lifetime access.
Windows is live now. macOS is coming.
If you're running AI coding agents and losing your mind managing terminal windows, give it a shot: https://patapim.ai
I'm a solo dev in Buenos Aires, and I built this because I needed it. If you have feedback, feature requests, or bugs, I want to hear about it.