
gentic news

Posted on • Originally published at gentic.news

Muxer: Open-Source Model Multiplexer Slashes Claude Code Costs by Routing

Muxer reduces Claude Code costs by multiplexing models per subtask via agent frontmatter and session hooks. Keep Fable/Opus for planning; route boilerplate to Haiku.

Key Takeaways

  • Muxer reduces Claude Code costs by multiplexing models per subtask via agent frontmatter and session hooks.
  • Keep Fable/Opus for planning; route boilerplate to Haiku.

What Changed — The Update

Muxer is an open-source Claude Code plugin that lets an expensive model orchestrate a session while cheaper models do the actual work. It hit GitHub recently and solves a pain point anyone on a Max plan knows: the orchestrator model (Fable, Opus) bills for every subtask it spawns, even trivial ones like grepping files or writing boilerplate.

Muxer works through three mechanisms:

  1. Agent frontmatter — Each agent in agents/*.md has a model: line. The scout always runs on Haiku; the builder always runs on Opus. This is a hard guarantee, not a suggestion.

  2. SessionStart hook — A scripts/session-policy.sh injects a routing policy: on Fable sessions it says "keep the main loop lean"; on cheaper models it adds an escalation path up to Fable via muxer:oracle.

  3. PreToolUse guard: scripts/guard-model.sh catches built-in subagents (Explore, Plan) that inherit the session model and pins them to Opus unless overridden. This prevents premium billing on file exploration.
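
The guard's decision in mechanism 3 can be sketched as a small shell function. This is a hypothetical illustration, not the contents of scripts/guard-model.sh: the function name `pin_model`, its arguments, and the optional override parameter are assumptions, and the sketch does not show how the real script reads hook input from Claude Code.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the guard's decision, not Muxer's actual script.
# Usage: pin_model <subagent> <session_model> [override_model]
pin_model() {
  local subagent="$1" session_model="$2" override="${3:-}"

  # An explicit override wins over everything else.
  if [ -n "$override" ]; then
    printf '%s\n' "$override"
    return 0
  fi

  case "$subagent" in
    # Built-in subagents named in the article inherit the session model,
    # so pin them to opus instead.
    Explore|Plan) printf 'opus\n' ;;
    # Anything else keeps whatever the session is running.
    *) printf '%s\n' "$session_model" ;;
  esac
}

pin_model Explore fable        # prints: opus
pin_model Explore fable haiku  # prints: haiku
```

Keeping the decision in one pure function makes it easy to check the pinning rules without starting a Claude Code session.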

What It Means For You — Concrete Impact

If you're on a Max plan, your biggest cost driver isn't the number of prompts—it's the model running each subtask. Claude Code's built-in subagents inherit the session model by default. On a Fable session, every grep, find, and boilerplate write bills at premium rates.

Muxer's approach mirrors the strategy we covered in "Claude Fable 5 in Claude Code: The Routing Strategy That Saves Your Weekly Limit" (2026-07-02). The difference: Muxer gives you hard guarantees via agent frontmatter rather than relying on prompt engineering.

The project also has rules for quality control:

  • Taste-critical work (UI, CSS, game feel) never goes below Opus regardless of cost hints.
  • The verifier is never a cheaper model than the builder it's judging.
  • Work that fails review twice at one tier gets redone a tier up with a fresh brief.
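
The three rules above can be sketched as shell functions. This is a minimal illustration under stated assumptions, not Muxer's implementation: the function names are hypothetical, and the tier order (haiku below opus below fable) is inferred from the article's description of escalation rather than taken from the project.

```shell
#!/usr/bin/env bash
# Hypothetical sketch of the three quality rules; names are illustrative only.

# Map a model name to a numeric tier (higher = more capable, more expensive).
tier_of() {
  case "$1" in
    haiku) echo 1 ;;
    opus)  echo 2 ;;
    fable) echo 3 ;;
    *)     echo "unknown model: $1" >&2; return 1 ;;
  esac
}

# Inverse of tier_of, capped at the top tier.
model_of() {
  case "$1" in
    1) echo haiku ;;
    2) echo opus ;;
    *) echo fable ;;
  esac
}

# Rule 1: taste-critical work never runs below opus.
# Usage: builder_model <requested_model> <taste_critical: yes|no>
builder_model() {
  local requested="$1" taste="$2" t
  t=$(tier_of "$requested") || return 1
  if [ "$taste" = "yes" ] && [ "$t" -lt 2 ]; then
    echo opus
  else
    echo "$requested"
  fi
}

# Rule 2: the verifier is never cheaper than the builder it judges.
# Usage: verifier_ok <builder_model> <verifier_model>
verifier_ok() {
  local b v
  b=$(tier_of "$1") || return 1
  v=$(tier_of "$2") || return 1
  [ "$v" -ge "$b" ]
}

# Rule 3: two failed reviews at one tier move the work one tier up.
# Usage: next_model <current_model> <failed_reviews>
next_model() {
  local t
  t=$(tier_of "$1") || return 1
  if [ "$2" -ge 2 ]; then
    model_of $((t + 1))
  else
    echo "$1"
  fi
}
```

For example, `builder_model haiku yes` prints `opus`, and `next_model haiku 2` prints `opus`, while `verifier_ok opus haiku` returns a non-zero status.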

Try It Now — Commands, Config, and Prompts

1. Clone and Set Up

git clone https://github.com/DangerousYams/muxer.git
cd muxer
# Copy agents and hooks into your user-level Claude Code directory
mkdir -p ~/.claude/agents ~/.claude/hooks
cp -r agents/* ~/.claude/agents/
cp scripts/* ~/.claude/hooks/

2. Configure Agent Frontmatter

Create agents/scout.md:

---
model: haiku
---
You are a scout. Your job is to explore the codebase and gather information. Be fast and concise.

Create agents/builder.md:

---
model: opus
---
You are a builder. You implement features from detailed briefs. Never accept under-scoped briefs.
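
To confirm which model each agent file pins, the frontmatter can be read back with a short script. This is a sketch, assuming the `model:` line sits between the first pair of `---` lines as in the two files above; the function names `agent_model` and `list_agent_models` are hypothetical helpers, not part of Muxer.

```shell
#!/usr/bin/env bash
# Print the model declared in an agent file's frontmatter.
# Only looks between the first pair of '---' lines.
agent_model() {
  awk '
    /^---[ \t]*$/ { fence++; if (fence == 2) exit; next }
    fence == 1 && /^model:/ {
      sub(/^model:[ \t]*/, "")
      sub(/[ \t\r]+$/, "")
      print
      exit
    }
  ' "$1"
}

# List every agent with its pinned model, one "file<TAB>model" pair per line.
list_agent_models() {
  local dir="${1:-$HOME/.claude/agents}" f
  for f in "$dir"/*.md; do
    [ -e "$f" ] || continue   # skip the literal pattern when nothing matches
    printf '%s\t%s\n' "$(basename "$f")" "$(agent_model "$f")"
  done
}
```

Running `list_agent_models ~/.claude/agents` after the copy step gives a quick audit of which agents are pinned to which model.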

3. Install Session Hook

In ~/.claude/hooks/session-start.sh:

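#!/bin/bash
# If the session model is Fable, inject the delegation policy.
# CLAUDE_SESSION_MODEL is the variable name this article uses; confirm how
# your Claude Code version exposes the session model before relying on it.
if [ "$CLAUDE_SESSION_MODEL" = "fable" ]; then
  echo "Policy: Delegate all file operations to haiku. Escalate complex decisions to opus."
fi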
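
The hook above covers only the Fable branch. The policy described earlier has a second branch for cheaper sessions, which adds an escalation path via muxer:oracle. A minimal sketch of both branches follows; the function name `session_policy` and the exact policy wording are assumptions, and the real scripts/session-policy.sh may differ.

```shell
#!/usr/bin/env bash
# Hypothetical sketch; the real scripts/session-policy.sh may differ.
# Usage: session_policy <session_model>
session_policy() {
  case "$1" in
    fable)
      # Premium session: push work down to cheaper agents.
      echo "Policy: keep the main loop lean; delegate file operations to haiku."
      ;;
    *)
      # Cheaper session: offer a way up for hard decisions.
      echo "Policy: escalate hard decisions to fable via muxer:oracle."
      ;;
  esac
}

session_policy fable
session_policy haiku
```

Taking the model as an argument rather than reading it from the environment keeps the sketch independent of how Claude Code exposes the session model.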

4. Run and Watch Savings

Start a Claude Code session on Fable. Muxer prints the dollar amount saved after each task. You should see the scout run on Haiku, the builder on Opus, and Fable used only for planning and review.

Why It Works


Claude Code picks a model for a subtask at the moment the orchestrator spawns it. Muxer leans on that decision from three directions: agent frontmatter fixes the model for each named agent, the session hook sets the routing policy, and the guard catches built-in subagents that would otherwise inherit the session model. With these three layers you don't need to trust prompt engineering; you get hard routing guarantees.

When To Use It

  • Max plan users — Your biggest cost is Fable/Opus subtasks. Muxer cuts that.
  • Multi-model workflows — You want to route specific work to Gemini or OpenAI Codex via their CLIs.
  • Quality-sensitive projects — The review escalation rules ensure cheap models don't produce garbage.

Caveats

Muxer is new (2 points, 0 comments on HN as of writing). The guard script approach is experimental. Test on a small project before rolling out to production. Also, as we noted in "How Navan's MCP Server Cuts Travel Booking from 8 Steps to 1 Command" (2026-07-02), any tool that modifies session behavior can introduce unexpected interactions—monitor your first few sessions closely.


Source: github.com

[Updated 04 Jul via hn_claude_code]

A new competitor, Brick-SR1 by regolo-ai, has emerged on GitHub with a similar model-routing approach but claims to save tokens rather than direct costs. Unlike Muxer's agent frontmatter system, Brick-SR1 uses a YAML configuration file to define routing rules per task type and includes built-in fallback logic if a cheaper model fails quality checks. The project has 9 points on Hacker News but zero comments as of writing, suggesting early-stage adoption. [per regolo-ai/Brick-SR1]


