3 Open-Source Repos That Each Kill a Different AI Bill (Token, Infra, Creative)

#ai #llm #productivity #opensource

Your AI spend is not one number. It is three: the tokens you feed the model, the infrastructure to run agents, and the paid tools you bolt on around them. Most cost-cutting advice optimizes one and ignores the other two.

This is a short, honest roundup of three popular open-source repos that each attack a different bill. Free, self-hosted, and verified before I wrote a word. I will give you the real numbers and the real catch on each, because "free" is only a deal if the thing actually works.

TL;DR

Repo	Cuts your...	License	Stars
codebase-memory-mcp	token bill	MIT	~13.8k
flue	agent-infra bill	Apache-2.0	~6.6k
OpenMontage	creative-tool bill	AGPL-3.0	~18k

Cut your token bill: codebase-memory-mcp

codebase-memory-mcp is the one with receipts. It indexes your repo into a persistent knowledge graph (functions, classes, call chains, HTTP routes) across 158 languages, ships as a single static binary with zero dependencies, and exposes it to your agent over MCP.
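To make "knowledge graph" concrete, here is a toy sketch of the idea: symbols as nodes, calls as edges, and a query that walks the edges instead of reading files. It is an illustration only and not the project's schema or API; every name in it (`SymbolNode`, `CodeGraph`, `callersOf`, the example symbols and file paths) is hypothetical.

```typescript
// Toy illustration only: NOT codebase-memory-mcp's schema or API.
// All type, method, symbol, and file names here are hypothetical.

type SymbolKind = "function" | "class" | "route";

interface SymbolNode {
  name: string;
  kind: SymbolKind;
  file: string;
}

class CodeGraph {
  private nodes = new Map<string, SymbolNode>();
  // Maps a callee name to the names of the symbols that call it directly.
  private callers = new Map<string, string[]>();

  addSymbol(node: SymbolNode): void {
    this.nodes.set(node.name, node);
  }

  addCall(caller: string, callee: string): void {
    const list = this.callers.get(callee) ?? [];
    if (list.indexOf(caller) === -1) list.push(caller);
    this.callers.set(callee, list);
  }

  // Direct and transitive callers of a symbol, found by breadth-first search.
  callersOf(name: string): SymbolNode[] {
    const seen: string[] = [];
    const queue: string[] = [name];
    while (queue.length > 0) {
      const current = queue.shift() as string;
      for (const caller of this.callers.get(current) ?? []) {
        if (seen.indexOf(caller) === -1) {
          seen.push(caller);
          queue.push(caller);
        }
      }
    }
    return seen
      .map((n) => this.nodes.get(n))
      .filter((n): n is SymbolNode => n !== undefined);
  }
}

const graph = new CodeGraph();
graph.addSymbol({ name: "POST /orders", kind: "route", file: "routes/orders.ts" });
graph.addSymbol({ name: "createOrder", kind: "function", file: "services/orders.ts" });
graph.addSymbol({ name: "chargeCard", kind: "function", file: "services/billing.ts" });
graph.addCall("POST /orders", "createOrder");
graph.addCall("createOrder", "chargeCard");

// "What reaches chargeCard?" is answered from the graph; no file contents are read.
const reachesChargeCard = graph.callersOf("chargeCard").map((n) => n.name);
console.log(reachesChargeCard);
```

The answer to a structural question like this is a handful of names and paths rather than the full text of three files, which is where the token saving would come from.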

The point: your coding agent stops re-reading the same files into context on every question and queries the map instead. Re-feeding your codebase is one of the biggest sources of wasted token spend in agentic coding, and a knowledge graph is the clean fix.

It is not just a marketing claim. The project's preprint (arXiv:2603.27277) reports, across 31 real repositories, roughly 10x fewer tokens and about 2x fewer tool calls versus file-by-file exploration, while keeping answer quality high. It plugs into 11 coding agents and persists to a local SQLite cache.

# Index once, then your MCP-compatible agent queries the graph
codebase-memory-mcp index .

The honest catch: it is a structural backend, not an LLM. The savings come from feeding your agent less, not from it being smarter. Index your actual codebase and measure the token drop on your own tasks rather than taking the headline number on faith. Here is the verified setup with the savings measured.
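One way to do that measurement is to run the same tasks with and without the index, note the token totals your provider reports, and compare them. The sketch below assumes you record those totals yourself; the helper names (`TaskRun`, `summarize`) are hypothetical and the sample numbers are placeholders, not results.

```typescript
// Hypothetical helper for comparing token usage on the same tasks,
// run once with file-by-file exploration and once with the graph.
// You supply the counts (for example, from your provider's usage report).

interface TaskRun {
  task: string;
  baselineTokens: number; // agent explores file by file
  graphTokens: number;    // agent queries the index over MCP
}

interface Savings {
  reductionFactor: number; // baseline / graph; 10 would mean "10x fewer tokens"
  percentSaved: number;    // share of baseline tokens no longer spent
}

function summarize(runs: TaskRun[]): Savings {
  if (runs.length === 0) throw new Error("need at least one run");
  let baseline = 0;
  let graph = 0;
  for (const run of runs) {
    baseline += run.baselineTokens;
    graph += run.graphTokens;
  }
  if (baseline <= 0 || graph <= 0) throw new Error("token totals must be positive");
  return {
    reductionFactor: baseline / graph,
    percentSaved: ((baseline - graph) / baseline) * 100,
  };
}

// Placeholder numbers for illustration only, not measurements.
const runs: TaskRun[] = [
  { task: "find callers of a function", baselineTokens: 40000, graphTokens: 5000 },
  { task: "trace an HTTP route", baselineTokens: 60000, graphTokens: 15000 },
];

const savings = summarize(runs);
console.log(
  `${savings.reductionFactor.toFixed(1)}x fewer tokens, ` +
    `${savings.percentSaved.toFixed(0)}% saved`
);
```

Summing across tasks before dividing keeps one unusually cheap or expensive task from dominating the ratio.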

Cut your agent-infrastructure bill: flue

flue (from the Astro team) is a TypeScript framework for building headless agents that deploy anywhere: Node, Cloudflare, CI.

The money lever is its default sandbox. Instead of spinning up a full container for every agent, flue defaults to a lightweight virtual sandbox, which its docs pitch as far cheaper and more scalable than a container per agent. You can still opt into a local or remote container when a job genuinely needs one. At any real volume, that is the difference between paying for one box and paying for a fleet.
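The "one box versus a fleet" point is simple capacity arithmetic: how many hosts you pay for depends on how many agents fit on each. A back-of-envelope sketch follows; the function name and every density figure are placeholder assumptions, not flue benchmarks, so substitute numbers you have measured.

```typescript
// Back-of-envelope capacity model. Every number below is a placeholder
// assumption, not a flue benchmark: measure your own densities.

function hostsNeeded(concurrentAgents: number, agentsPerHost: number): number {
  if (concurrentAgents < 0 || agentsPerHost <= 0) {
    throw new Error("agents must be >= 0 and agentsPerHost must be > 0");
  }
  // Round up: a partially filled host still costs a full host.
  return Math.ceil(concurrentAgents / agentsPerHost);
}

const concurrentAgents = 500;
const containersPerHost = 10; // assumed density with one full container per agent
const sandboxesPerHost = 500; // assumed density with lightweight virtual sandboxes

const containerHosts = hostsNeeded(concurrentAgents, containersPerHost);
const sandboxHosts = hostsNeeded(concurrentAgents, sandboxesPerHost);

console.log({ containerHosts, sandboxHosts });
```

Multiply each host count by your own per-host price to turn this into a monthly figure.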

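// A whole agent: a prompt and a typed result, no container required.
// Assumption: `init` is exported alongside `createAgent`. The original snippet
// called it without importing it, so check the flue docs for your pinned version.
import { createAgent, init } from '@flue/runtime';

// Define the agent; the factory returns its configuration (here, just the model).
const translator = createAgent(() => ({ model: 'anthropic/claude-sonnet-4-6' }));
// Initialize a harness for the agent, then open a session on it.
const harness = await init(translator);
const session = await harness.session();
// Send one prompt and read the result.
const { data } = await session.prompt('Translate "hello" to French');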

The honest catch: it is explicitly experimental and the API may still change, so pin your version and expect some churn before you build something load-bearing on it. Here is the verified deploy setup.

Cut your creative-tool bill: OpenMontage

OpenMontage turns a coding assistant into a full video production system: 12 pipelines, 52 tools, and 500+ agent skills spanning scripting, asset generation, editing, and final composition with FFmpeg and Remotion. The pitch is replacing a stack of paid AI-video and editing subscriptions with one open pipeline you run yourself.

Two honest notes:

There is a genuinely free path. You can run it end to end with zero paid APIs using free local text-to-speech (Piper) and public-domain footage (Archive.org, NASA, Wikimedia). Wire in paid AI models only when you want generated assets, so paid generation is an upgrade, not a requirement.
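The bigger watch-item is the license. AGPL-3.0 is copyleft: fine for personal and internal use, but it carries real obligations if you build a commercial product on top. Most notably, if you modify it and let users interact with your version over a network, you must offer those users the source of your modified version. Read it first.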

Here is the verified free-pipeline setup.

How to choose

Pick by the bill that hurts most:

Tokens are your problem? Start with codebase-memory-mcp. It is the most direct win and the only one here with published numbers behind it.
Running agents at scale? flue's default sandbox is the infrastructure win.
Paying for a pile of creative subscriptions? OpenMontage replaces the pipeline, and there is a fully free way to run it.

All three are free to try, so measure the saving on your own usage rather than trusting the README. That habit, verify before you trust, matters more than any single tool.

Where this leaves you

These three cut the tokens, the infrastructure, and the tools. The fourth bill is the model itself. If that is the part that hurts, I just wrote up the two cheapest ways to get top-tier results without the premium price (one open model you can own, one API that runs a team of models for you): Two new ways to get top-tier AI without paying top-tier prices.

If you have a cost-cutting open-source repo I missed, drop it in the comments. I verify and add the good ones.