gentic news

Posted on Jul 4 • Originally published at gentic.news

Muxer: Open-Source Model Multiplexer Slashes Claude Code Costs by Routing

#ai #opensource #programming #machinelearning

Muxer reduces Claude Code costs by multiplexing models per subtask via agent frontmatter and session hooks. Keep Fable/Opus for planning; route boilerplate to Haiku.

What Changed — The Update

Muxer is an open-source Claude Code plugin that lets an expensive model orchestrate a session while cheaper models do the actual work. It hit GitHub recently and solves a pain point anyone on a Max plan knows: the orchestrator model (Fable, Opus) bills for every subtask it spawns, even trivial ones like grepping files or writing boilerplate.

Muxer works through three mechanisms:

Agent frontmatter. Each agent in agents/*.md has a model: line. The scout always runs on Haiku; the builder always runs on Opus. This is a hard guarantee, not a suggestion.
SessionStart hook. The script scripts/session-policy.sh injects a routing policy. On Fable sessions it says "keep the main loop lean"; on cheaper models it adds an escalation path up to Fable via muxer:oracle.
PreToolUse guard. scripts/guard-model.sh catches built-in subagents (Explore, Plan) that would inherit the session model and pins them to Opus unless overridden. This keeps file exploration from billing at the session's premium rate.
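
The guard's decision can be sketched as a small shell function. This is an illustration only, not the contents of Muxer's guard-model.sh: the function name pick_model, its arguments, and the fallback for other agents are all hypothetical, and the sketch leaves out how a real hook would read the tool call or hand the result back to Claude Code.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the guard's decision, NOT Muxer's actual guard-model.sh.
# usage: pick_model SUBAGENT SESSION_MODEL [OVERRIDE]
# Prints the model name the subagent should run on.
pick_model() {
  local subagent="$1" session_model="$2" override="${3:-}"

  # An explicit override always wins.
  if [ -n "$override" ]; then
    printf '%s\n' "$override"
    return 0
  fi

  case "$subagent" in
    # Built-in subagents that would otherwise inherit the session model
    # are pinned to opus, as the article describes.
    Explore|Plan)
      printf 'opus\n'
      ;;
    # Assumption: anything else keeps the model it would have inherited.
    *)
      printf '%s\n' "$session_model"
      ;;
  esac
}
```

Under these assumptions, pick_model Explore fable prints opus, while pick_model Plan fable haiku prints haiku because the override takes precedence.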

What It Means For You — Concrete Impact

If you're on a Max plan, your biggest cost driver isn't the number of prompts—it's the model running each subtask. Claude Code's built-in subagents inherit the session model by default. On a Fable session, every grep, find, and boilerplate write bills at premium rates.

Muxer's approach mirrors the strategy we covered in "Claude Fable 5 in Claude Code: The Routing Strategy That Saves Your Weekly Limit" (2026-07-02). The difference: Muxer gives you hard guarantees via agent frontmatter rather than relying on prompt engineering.

The project also has rules for quality control:

Taste-critical work (UI, CSS, game feel) never goes below Opus regardless of cost hints.
The verifier is never a cheaper model than the builder it's judging.
Work that fails review twice at one tier gets redone a tier up with a fresh brief.
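
The three rules above can be read as checks against a tier ladder. The sketch below assumes the ordering haiku < opus < fable, which the article implies but does not state; the function names are hypothetical and are not taken from the Muxer repository.

```shell
#!/usr/bin/env bash
# Hypothetical tier ladder. The ordering haiku < opus < fable is an assumption.
tier_rank() {
  case "$1" in
    haiku) echo 1 ;;
    opus)  echo 2 ;;
    fable) echo 3 ;;
    *)     echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Rule 1: taste-critical work never runs below opus.
# usage: apply_taste_floor MODEL TASTE_CRITICAL   (TASTE_CRITICAL is yes or no)
apply_taste_floor() {
  if [ "$2" = "yes" ] && [ "$(tier_rank "$1")" -lt "$(tier_rank opus)" ]; then
    echo opus
  else
    echo "$1"
  fi
}

# Rule 2: the verifier must sit at the builder's tier or above.
# usage: verifier_ok BUILDER VERIFIER   (exit status 0 means allowed)
verifier_ok() {
  [ "$(tier_rank "$2")" -ge "$(tier_rank "$1")" ]
}

# Rule 3: after two failed reviews at one tier, redo the work one tier up.
# usage: next_model MODEL FAILED_REVIEWS
next_model() {
  if [ "$2" -lt 2 ]; then
    echo "$1"
    return 0
  fi
  case "$1" in
    haiku) echo opus ;;
    opus)  echo fable ;;
    *)     echo "$1" ;;   # already at the top tier
  esac
}
```

For example, apply_taste_floor haiku yes prints opus, verifier_ok opus haiku fails, and next_model haiku 2 prints opus. The "fresh brief" part of the third rule is a prompting concern and is not modeled here.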

Try It Now — Commands, Config, and Prompts

1. Clone and Set Up

git clone https://github.com/DangerousYams/muxer.git
cd muxer
# Copy agents and hook scripts into your user-level Claude Code directory
mkdir -p ~/.claude/agents ~/.claude/hooks
cp -r agents/* ~/.claude/agents/
cp scripts/* ~/.claude/hooks/

2. Configure Agent Frontmatter

Create agents/scout.md:

---
model: haiku
---
You are a scout. Your job is to explore the codebase and gather information. Be fast and concise.

Create agents/builder.md:

---
model: opus
---
You are a builder. You implement features from detailed briefs. Never accept under-scoped briefs.
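
Since the routing guarantee depends on every agent file carrying a model: line, it may help to audit the directory after editing. The helper below is a sketch, not part of Muxer; the name list_agent_models is hypothetical, and it assumes the frontmatter starts on the first line of each file and is closed by a second --- line.

```shell
#!/usr/bin/env bash
# Print "file: model" for each agent definition in a directory.
# Only the leading --- frontmatter block is inspected; files without a
# model: line in that block are reported as <none>.
# usage: list_agent_models [DIR]   (DIR defaults to ./agents)
list_agent_models() {
  local dir="${1:-agents}" f model
  for f in "$dir"/*.md; do
    [ -e "$f" ] || continue              # skip when the glob matches nothing
    model=$(awk '
      NR == 1 && $0 != "---" { exit }    # file has no frontmatter
      NR > 1 && $0 == "---"  { exit }    # reached the end of frontmatter
      /^model:/ { sub(/^model:[ \t]*/, ""); print; exit }
    ' "$f")
    printf '%s: %s\n' "$(basename "$f")" "${model:-<none>}"
  done
}
```

Running list_agent_models ~/.claude/agents should then list one line per agent, so a missing or misspelled model: line shows up as <none>.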

3. Install Session Hook

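Save the following as ~/.claude/hooks/session-start.sh, make it executable, and register it as a SessionStart hook in your Claude Code settings (a script in that directory does not run on its own):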

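#!/bin/bash
# SessionStart hook: inject a delegation policy when the session model is Fable.
# Assumes CLAUDE_SESSION_MODEL reaches the hook's environment; verify this for your Claude Code version.
if [ "${CLAUDE_SESSION_MODEL:-}" = "fable" ]; then
  echo "Policy: Delegate all file operations to haiku. Escalate complex decisions to opus."
fi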
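
Before relying on the hook in a real session, it can be run by hand with the variable set. The sketch below is a hypothetical helper, not part of Muxer. It assumes, as the snippet above does, that the model name arrives in CLAUDE_SESSION_MODEL, which is not confirmed here.

```shell
#!/usr/bin/env bash
# Run a session hook outside Claude Code with a chosen session model.
# CLAUDE_SESSION_MODEL is taken from the article's snippet and is unverified.
# usage: run_hook SCRIPT MODEL
run_hook() {
  CLAUDE_SESSION_MODEL="$2" bash "$1"
}

# Check that the hook prints a policy on fable sessions and nothing otherwise.
# usage: check_hook SCRIPT
check_hook() {
  local script="$1" out
  [ -f "$script" ] || { echo "missing: $script" >&2; return 1; }

  out=$(run_hook "$script" fable)
  [ -n "$out" ] || { echo "FAIL: no policy printed on a fable session" >&2; return 1; }

  out=$(run_hook "$script" haiku)
  [ -z "$out" ] || { echo "FAIL: policy printed on a non-fable session" >&2; return 1; }

  echo "ok: policy is injected only on fable sessions"
}
```

A typical call would be check_hook ~/.claude/hooks/session-start.sh. This only exercises the script's own logic; it does not show whether Claude Code actually invokes the hook.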

4. Run and Watch Savings

Start a Claude Code session on Fable. Muxer prints the amount saved after each task, and you should see the scout run on Haiku, the builder on Opus, and Fable handle only planning and review.

Why It Works

Claude Code picks a model for a subtask at the moment the orchestrator spawns it. Muxer leans on that decision from three directions: agent frontmatter fixes the model for each named agent, the session hook sets the delegation policy, and the guard catches built-in subagents that would otherwise inherit the session model. With three layers, you don't have to trust prompt engineering alone; the routing is enforced.

When To Use It

Max plan users — Your biggest cost is Fable/Opus subtasks. Muxer cuts that.
Multi-model workflows — You want to route specific work to Gemini or OpenAI Codex via their CLIs.
Quality-sensitive projects — The review escalation rules ensure cheap models don't produce garbage.

Caveats

Muxer is new (2 points, 0 comments on HN as of writing). The guard script approach is experimental. Test on a small project before rolling out to production. Also, as we noted in "How Navan's MCP Server Cuts Travel Booking from 8 Steps to 1 Command" (2026-07-02), any tool that modifies session behavior can introduce unexpected interactions—monitor your first few sessions closely.

Source: github.com

[Updated 04 Jul via hn_claude_code]

A new competitor, Brick-SR1 by regolo-ai, has emerged on GitHub with a similar model-routing approach but claims to save tokens rather than direct costs. Unlike Muxer's agent frontmatter system, Brick-SR1 uses a YAML configuration file to define routing rules per task type and includes built-in fallback logic if a cheaper model fails quality checks. The project has 9 points on Hacker News but zero comments as of writing, suggesting early-stage adoption. [per regolo-ai/Brick-SR1]
