Welcome back to the server room. The fans are humming, the GPU is running cool, and the cloud dependency is officially dead.
Kiwi-chan has officially crossed the Rubicon. We’ve fully transitioned to a 100% local deployment running Qwen 35B. No more API rate limits, no more telemetry leakage, no more waiting in queue. Just raw, unfiltered reasoning, token generation, and autonomous decision-making happening entirely behind our firewall. Watching a 35-billion-parameter model navigate a rule-bound Minecraft environment is equal parts engineering triumph and digital sitcom. Let’s break down what happened in the last four hours.
📊 The Numbers Don't Lie (But They Do Dance)
Over the past cycle, Kiwi-chan threw itself into the digital furnace. We clocked 2,064 total actions, with 898 successful executions, landing our overall success rate at a respectable 43.5%.
In traditional software, a 43.5% pass rate gets you fired. In autonomous AI agent dev, it’s a victory lap. Why? Because every failed action is a data point. The model isn’t just executing; it’s learning, failing, memorizing, and adapting in real-time. That nearly-half success rate is the sound of a local LLM figuring out how to think like a senior engineer without hand-holding.
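For anyone checking the math, the headline figure is simply successes divided by total actions, rounded to one decimal place. A throwaway sketch (the successRate helper is hypothetical, not part of Kiwi-chan):

```javascript
// Hypothetical helper: success rate as a percentage, rounded to one decimal place.
function successRate(succeeded, total) {
  if (total === 0) return 0; // avoid dividing by zero on an empty cycle
  return Math.round((succeeded / total) * 1000) / 10;
}

console.log(successRate(898, 2064)); // 43.5
```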
🧠 The Vec3 Blacklist Saga: A Groundhog Day in the Logs
If you’ve been watching the debug stream, you’ve witnessed the drama. Kiwi-chan’s core survival rules are strict: no hardcoded coordinates, dynamic block finding only, and single-task execution. But when it comes to place_crafting_table, the code generator keeps tripping over its own shoelaces.
Here’s the loop I’ve been watching for the last hour:
[18:55:30] ⚠️ Safety Check Failed: ❌ FORBIDDEN: 'Vec3' is in blacklist.. Retrying...
[19:05:47] 🚫 Auto-Blacklisting: 'Vec3'
[19:05:47] 💀 Failure Memorized: place_crafting_table
[19:06:11] 💊 Recovery Plan: ['explore_forward', 'place_crafting_table']
Qwen 35B generates the code, the safety layer intercepts it, screams "FORBIDDEN: 'Vec3' is in blacklist.", and the recovery system kicks in. Sometimes it patches the code, sometimes it hits Extraction Failed, and sometimes it just gives up and pivots to explore_forward like a seasoned dev refactoring a messy PR.
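To make that loop concrete, here is a minimal sketch of what a generate-then-check cycle could look like. Everything in it is an assumption for illustration: findForbidden, generateSafely, the MAX_RETRIES budget, and the generate callback standing in for the LLM call are hypothetical names, not Kiwi-chan's actual code.

```javascript
// Hypothetical safety layer; names and the retry budget are illustrative only.
const MAX_RETRIES = 3; // assumed retry budget

// Escape regex metacharacters so blacklist entries are matched literally.
function escapeRegExp(text) {
  return text.replace(/[.*+?^${}()|[\]\\]/g, "\\$&");
}

// Returns the first blacklisted identifier found in the generated code, or null.
// Word boundaries mean "Vec3" is caught but "myVec3Helper" is not.
function findForbidden(code, blacklist) {
  for (const name of blacklist) {
    if (new RegExp(`\\b${escapeRegExp(name)}\\b`).test(code)) return name;
  }
  return null;
}

// Runs the generate -> check loop. `generate` stands in for the LLM call.
function generateSafely(task, generate, blacklist) {
  for (let attempt = 1; attempt <= MAX_RETRIES; attempt++) {
    const code = generate(task, attempt);
    const hit = findForbidden(code, blacklist);
    if (hit === null) return { ok: true, code };
    console.log(`Safety Check Failed: FORBIDDEN: '${hit}' is in blacklist. Retrying...`);
  }
  // Out of retries: hand back a recovery plan that explores first, then retries the task.
  return { ok: false, recoveryPlan: ["explore_forward", task] };
}
```

The recovery plan mirrors the shape seen in the log above: explore first, then try the failed task again from a new position.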
But here’s the beautiful part: the system rules are actually working. When place_crafting_table fails repeatedly, the critical failure recovery rule triggers: "If any gathering task fails... YOU MUST IMMEDIATELY PROPOSE explore_forward." Kiwi-chan doesn’t panic. It aims for a random target 30-40 blocks away, generating offsets with Math.random() * 80 - 40 (which on its own spans -40 to +40 blocks per axis), audits the path, and goes exploring. It’s not hallucinating; it’s gracefully degrading while the crafting table situation simmers in the task stack.
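One way to turn the 30-40 block distance into a guarantee is to sample a direction and a distance separately instead of two independent axis offsets. This is a sketch under that assumption; pickExploreTarget and its parameters are hypothetical and not taken from Kiwi-chan's generated code.

```javascript
// Hypothetical sketch: pick an exploration target a fixed distance band away.
// `position` is any object with numeric x, y, z (e.g. bot.entity.position).
function pickExploreTarget(position, minDist = 30, maxDist = 40, random = Math.random) {
  const angle = random() * 2 * Math.PI; // random horizontal direction
  const distance = minDist + random() * (maxDist - minDist); // somewhere in [minDist, maxDist)
  return {
    x: position.x + Math.cos(angle) * distance,
    y: position.y, // keep the same height; the pathfinder handles terrain
    z: position.z + Math.sin(angle) * distance,
  };
}
```

The random parameter is injectable so the function can be exercised deterministically in a test.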
🔧 Under the Hood: How We Force the Model to Think Like a Pro
The 43.5% success rate isn’t magic. It’s the result of brutally strict coding standards baked into the system prompt. We don’t let the LLM write sloppy code. Here’s what’s actually running in Kiwi-chan’s brain:
- Single-Task Principle: One action per script. No "craft, then place" monstrosities. If it needs a crafting table, it finds one dynamically. It doesn’t try to build one from inventory mid-craft.
- No Error Hiding: try-catch is strictly banned. Errors must crash the task so the recovery system can actually inspect the stack. Silent failures are the enemy of autonomous agents.
- Dynamic Pathfinding & Placement Safety: We explicitly forbid GoalBlock for placement to avoid collision. Kiwi-chan now knows to stand 2 blocks away using GoalNear, look at the target with bot.lookAt(), and place on the top face with new Vec3(0, 1, 0). (Yes, it learns how to use the constructor after the blacklist is lifted, which is a funny but necessary distinction.)
- Inventory Count Fatal Error Prevention: The notorious blocksByName vs itemsByName trap. We explicitly taught Kiwi-chan that counting inventory with bot.registry.blocksByName['dirt'].id will always return 0. It now correctly uses bot.registry.itemsByName['dirt'].id. One line of prompt engineering, and it stopped counting ghosts.
- Movement Auditing: waitForTicks is banned for long travel. Kiwi-chan records bot.entity.position before and after pathfinder.goto(). If it moves fewer than 10 blocks, it throws a hard error and triggers recovery instead of wasting cycles hoping the pathfinder will magically work.
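The inventory, movement, and placement rules in the list above could look roughly like the helpers below. Treat this as a sketch, not the code Qwen actually emits: countItem, auditedGoto, placeOnTop, and MIN_TRAVEL are hypothetical names, and bot, goals, and Vec3 are assumed to be a Mineflayer bot with the pathfinder plugin loaded, the goals export of mineflayer-pathfinder, and the vec3 constructor, passed in as arguments.

```javascript
// Sketch only. `bot` is assumed to be a mineflayer bot with mineflayer-pathfinder loaded.
const MIN_TRAVEL = 10; // blocks; mirrors the movement-auditing rule

// Count an item by its *item* ID. Block IDs come from a different table,
// so looking the name up in blocksByName would match the wrong thing.
function countItem(bot, name) {
  const item = bot.registry.itemsByName[name];
  if (!item) throw new Error(`Unknown item: ${name}`);
  return bot.inventory
    .items()
    .filter((stack) => stack.type === item.id)
    .reduce((total, stack) => total + stack.count, 0);
}

// Move, then verify the bot actually travelled. No try-catch: errors propagate
// so the recovery system sees the real stack.
async function auditedGoto(bot, goal) {
  const start = bot.entity.position.clone();
  await bot.pathfinder.goto(goal);
  const moved = bot.entity.position.distanceTo(start);
  if (moved < MIN_TRAVEL) {
    throw new Error(`Movement audit failed: moved ${moved.toFixed(1)} blocks`);
  }
  return moved;
}

// Stand within 2 blocks of the reference block, look at its top face, then place on it.
async function placeOnTop(bot, goals, Vec3, referenceBlock) {
  const { x, y, z } = referenceBlock.position;
  await bot.pathfinder.goto(new goals.GoalNear(x, y, z, 2));
  await bot.lookAt(referenceBlock.position.offset(0.5, 1, 0.5));
  await bot.placeBlock(referenceBlock, new Vec3(0, 1, 0));
}
```

placeOnTop deliberately uses a plain goto rather than auditedGoto, since a short approach walk would otherwise trip the 10-block audit meant for long travel.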
🏆 Celebrating the Fully Local Evolution
The real win here isn’t the success rate. It’s the architecture. We’re running Qwen 35B entirely on-premise. Every reasoning step, every JSON goal alignment, every code generation pass happens locally.
- Zero Cloud Latency: Decisions happen in milliseconds, not seconds.
- Complete Privacy: No telemetry, no external LLM calls, no API keys floating around.
- True Autonomy: Kiwi-chan learns from its own failures. It auto-blacklists problematic constructors, memorizes stack traces, and updates its skill library (explore_forward is now a memorized skill). This is the closest we’ve come to an AI that doesn’t need a babysitter.
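As a rough picture of what "learning from failures" can mean in practice, here is a minimal in-memory sketch. The AgentMemory class, its field names, and the blacklist threshold are all assumptions made for illustration; the post does not describe how Kiwi-chan stores any of this.

```javascript
// Hypothetical memory store; the threshold and field names are assumptions.
class AgentMemory {
  constructor(blacklistThreshold = 3) {
    this.blacklistThreshold = blacklistThreshold;
    this.blacklist = new Set(); // identifiers the safety layer should reject
    this.failures = new Map(); // task name -> array of stack traces
    this.offenders = new Map(); // identifier -> number of failures blamed on it
    this.skills = new Map(); // skill name -> code that worked
  }

  // Memorize a failed task, optionally blaming an identifier found in its code.
  recordFailure(task, stack, identifier = null) {
    const traces = this.failures.get(task) ?? [];
    traces.push(stack);
    this.failures.set(task, traces);
    if (identifier === null) return;
    const count = (this.offenders.get(identifier) ?? 0) + 1;
    this.offenders.set(identifier, count);
    // Auto-blacklist once the same identifier has caused enough failures.
    if (count >= this.blacklistThreshold) this.blacklist.add(identifier);
  }

  // Store code that completed a task so it can be reused instead of regenerated.
  recordSuccess(task, code) {
    this.skills.set(task, code);
  }
}
```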
🔮 What’s Next?
Tomorrow, we’re refining the placement safety rules and giving Qwen 35B a fresh batch of logs to break the Vec3 deadlock. We’ll also tighten the reason-to-goal JSON alignment to reduce the occasional mismatch. Until then, watch the logs, grab a coffee, and enjoy the show. The future of local AI agents is messy, brilliant, and running entirely on our own hardware.
Stay tuned for Devlog #4. Kiwi-chan out. 🥝⛏️
