Mukunda Rao Katta

Posted on May 25

I shipped eight agent-stack repos in eight hours. Here's what made it possible.

#hermeschallenge #ai #llm #agents

Today I pushed eight public repositories to GitHub: five new agent-stack libraries plus three revivals of abandoned March scaffolds. 278 new tests. Three different programming languages between them. Zero runtime dependencies in four of the five new ones.

People will ask "how" and the honest answer is not "AI wrote it." The honest answer is that I have shipped 35 agent-stack libraries over the prior eight weeks and the boring patterns are now muscle memory. Today was just one round of executing the pattern more times than usual.

Here's what the pattern is.

One boring problem per library

The five fresh libraries each solve one problem.

prompt-shield: stop pasted prompt-injection text from reaching your model.
crusoe-nemotron-harness: wrap a Nemotron agent on Crusoe Cloud with cost, egress, vet, snap, trace, budget.
tool-call-cache: memoize LLM tool calls by canonical args.
perfectcorp-tryon-concierge: chat-style AI agent over the Perfect Corp YouCam APIs.
bitte-telegram-launcher: turn any Bitte agent into a working Telegram bot.

Each one fits in one sentence. None of them tries to be a platform. None of them tries to do the next problem too.

If I cannot describe the problem in one sentence, I do not start the repo. If the description starts drifting into "and also" I cut it before I write any code.
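To show how small a one-sentence scope can be, here is a minimal sketch of "memoize LLM tool calls by canonical args." It is not the actual tool-call-cache code; `canonical_key` and `ToolCallCache` are hypothetical names, and the sketch assumes the tool arguments are JSON-serializable.

```python
import hashlib
import json
from typing import Any, Callable, Dict


def canonical_key(tool_name: str, args: Dict[str, Any]) -> str:
    """Hash the tool name plus its args, serialized with sorted keys.

    Sorting keys means {"a": 1, "b": 2} and {"b": 2, "a": 1} share one key.
    Assumption: args contain only JSON-serializable values.
    """
    payload = json.dumps(
        {"tool": tool_name, "args": args},
        sort_keys=True,
        separators=(",", ":"),
    )
    return hashlib.sha256(payload.encode("utf-8")).hexdigest()


class ToolCallCache:
    """In-memory memoizer for tool calls (hypothetical, for illustration)."""

    def __init__(self) -> None:
        self._store: Dict[str, Any] = {}
        self.hits = 0
        self.misses = 0

    def call(self, tool_name: str, fn: Callable[..., Any], args: Dict[str, Any]) -> Any:
        key = canonical_key(tool_name, args)
        if key in self._store:
            self.hits += 1
            return self._store[key]
        # Cache miss: run the real tool once and remember the result.
        self.misses += 1
        result = fn(**args)
        self._store[key] = result
        return result
```

Two calls with the same arguments in a different key order hit the same cache entry, so the underlying tool runs once. Eviction, TTLs, and non-JSON arguments are left out on purpose; each of those would be an "and also."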

Tests are the proof, README is the pitch

Every one of the eight has a real test suite. The fresh libraries have 41 to 79 tests each. The three revivals went from one stub assertion or near-zero coverage to 22, 25, and 50 tests respectively.

When I say "shipped" I mean pytest -q returns green on a fresh clone with no API keys. If a judge cannot reproduce the green bar, the project is not done.

The README is the sales pitch. It claims one thing the library does. It does not claim things the library does not do.

Offline demo or it does not count

Every one of the five fresh libraries has a FakeProvider or seeded stub so the demo runs without credentials.

This is not a quality-of-life nicety. It is the difference between "the judge tried it and saw it work" and "the judge looked at the README, did not feel like provisioning a Crusoe Cloud key, and moved on."

The same pattern holds for the revivals. agentmemory v0.3 ships a visible-retrieval demo that runs against an in-memory store, no API key. ragvitals v0.2 ships a synthetic 500-document corpus and an aging knob, no API key. agentcore ships an InMemoryLLM scripted stub so the agent loop runs entirely offline.
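The offline pattern can be sketched in a few lines. This is an illustration, not the code from any of the repos: the `Provider` protocol, the `complete` method, and the `summarize` function are assumptions made up for the example. The idea is that library code depends on a small interface, and the demo and tests pass in a scripted stand-in that never touches the network.

```python
from typing import List, Protocol


class Provider(Protocol):
    """Hypothetical minimal interface the library code depends on."""

    def complete(self, prompt: str) -> str: ...


class FakeProvider:
    """Returns scripted replies in order and records every prompt it saw."""

    def __init__(self, replies: List[str]) -> None:
        self._replies = list(replies)
        self.prompts: List[str] = []

    def complete(self, prompt: str) -> str:
        self.prompts.append(prompt)
        if not self._replies:
            raise RuntimeError("FakeProvider ran out of scripted replies")
        return self._replies.pop(0)


def summarize(provider: Provider, text: str) -> str:
    """Example library function: it only ever talks to the Provider interface."""
    return provider.complete(f"Summarize in one line: {text}").strip()


def test_summarize_runs_offline() -> None:
    # No API key, no network: the fake supplies the reply.
    fake = FakeProvider(["  A short summary.  "])
    assert summarize(fake, "long text") == "A short summary."
    assert fake.prompts == ["Summarize in one line: long text"]
```

Because the fake records its prompts, the same object serves both the demo and the test suite, which is what lets pytest -q go green on a fresh clone.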

Zero runtime dependencies where possible

Of the five fresh libraries, four are zero-runtime-dependency. The one exception is perfectcorp-tryon-concierge, which needs Gradio for the UI and Pillow for image composition.

Zero dependencies means: faster pip install, no third-party supply chain surface at runtime, easier to read, and one less version-pinning headache for the user. It also means the library has to be small enough to not need them, which applies useful design pressure.

When I cannot do zero deps, I keep the dependency list to two or three well-known packages.

Revival rule: cut, do not add

The three revivals are the part of today I am proudest of.

The temptation when reviving an abandoned scaffold is to add features that finish what you originally planned. That is a trap. The original scaffold did not get finished because it was too big. Adding more makes the same mistake again.

The revival rule is: pick the ONE thing the README headline promised, make that work end-to-end, write tests for it, ship. Remove anything else that does not contribute.

agentcore had six placeholder methods (think, act, observe, plan, usetool, getcontext). Now they each do something real and they compose into one loop. The deny patterns on the Bash tool actually deny. The OpenAI-compatible LLM adapter actually hits the API. Everything else got cut.
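For a sense of what "deny patterns that actually deny" can look like, here is a minimal sketch. It is not agentcore's implementation: `DEFAULT_DENY`, `DeniedCommand`, and `check_command` are hypothetical names, and the three patterns are only examples. The check runs before any command is executed, and a regex deny list like this is a coarse guard, not a sandbox.

```python
import re
from typing import Iterable, List, Pattern

# Example patterns only; a real deny list would be tuned to the deployment.
DEFAULT_DENY = [
    r"\brm\s+-rf\b",                # recursive force delete
    r"\bsudo\b",                    # privilege escalation
    r"curl[^|]*\|\s*(sh|bash)\b",   # piping a download into a shell
]


class DeniedCommand(Exception):
    """Raised when a command matches a deny pattern."""


def compile_patterns(patterns: Iterable[str]) -> List[Pattern[str]]:
    return [re.compile(p) for p in patterns]


def check_command(command: str, deny: List[Pattern[str]]) -> None:
    """Raise DeniedCommand if any deny pattern matches; otherwise return None."""
    for pattern in deny:
        if pattern.search(command):
            raise DeniedCommand(f"blocked by pattern {pattern.pattern!r}")
```

The matching test is the point: one assertion that a harmless command passes and one that a dangerous command raises is enough to prove the placeholder became real.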

Naming as a forcing function

When I cannot name a library in one short kebab-case phrase, I do not start it. prompt-shield works. tool-call-cache works. crusoe-nemotron-harness is verbose but accurate. If the name needed two clauses to explain itself, that is a signal the scope is too big.

The name has to ship before the code does.

The agent-stack family as compounding investment

Today's eight repos are not standalone. They are the next layer on a 35-library family where each piece composes with the others.

prompt-shield slots in front of the agent input. agentguard wraps the network egress. agentvet validates tool args. agentsnap records the call trace. agenttrace rolls up cost and latency. agentcast enforces the structured output. token-budget-py caps the spend. tool-call-cache memoizes the repeat calls.

Each library is small enough to read in one sitting. Together they cover the boring infrastructure layer of any production LLM application. The investment is compounding because every new library can reuse the patterns from the prior ones.
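The composition idea can be sketched generically. The code below does not use the real APIs of any library named above; `compose`, `input_filter`, and `call_budget` are hypothetical stand-ins that show how small single-purpose layers can wrap one core handler, assuming each layer is a function that takes a handler and returns a handler.

```python
from typing import Callable, List

Handler = Callable[[str], str]
Layer = Callable[[Handler], Handler]


def compose(layers: List[Layer], core: Handler) -> Handler:
    """Wrap core so that layers[0] is the outermost layer."""
    handler = core
    for layer in reversed(layers):
        handler = layer(handler)
    return handler


def input_filter(blocked_phrase: str) -> Layer:
    """Reject input containing a blocked phrase (stand-in for an input shield)."""
    def layer(next_handler: Handler) -> Handler:
        def handle(text: str) -> str:
            if blocked_phrase.lower() in text.lower():
                raise ValueError("input rejected by filter")
            return next_handler(text)
        return handle
    return layer


def call_budget(max_calls: int) -> Layer:
    """Allow at most max_calls calls through (stand-in for a spend cap)."""
    def layer(next_handler: Handler) -> Handler:
        remaining = max_calls

        def handle(text: str) -> str:
            nonlocal remaining
            if remaining <= 0:
                raise RuntimeError("call budget exhausted")
            remaining -= 1
            return next_handler(text)
        return handle
    return layer
```

With the filter outermost, a rejected input never reaches the budget layer, so it does not consume a call. Ordering is the one design decision the composition leaves to the caller.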

The honest part

Two of the five fresh libraries have placeholder demo videos noted in the SUBMISSION.md. I have not recorded them yet.

Two of the eight repos needed a couple of follow-up cleanup commits after the first push. The CHANGELOG version header on one of the revivals has an em dash because the prior v0.1 header had one and consistency mattered more than style.

The point is: this kind of velocity is not magic and not free. It requires having shipped the pattern enough times that the friction is mostly gone. The first lib you build in this style takes a weekend. The thirty-fifth takes an hour.

Try them

All eight live at github.com/MukundaKatta. MIT, public, ready to read.

If any of them solve a problem you have today, take what you need.

DEV Community