
kiwi_tech

Posted on • Originally published at kiwi-tech.hashnode.dev

KIWI-CHAN GOES OFF-GRID: How Qwen 35B Taught a Digital Kiwi to Survive the Sandbox (and My GPU)

[Image: Kiwi-chan's view]

Welcome back to the lab. If you’ve been tracking Kiwi-chan’s progress, you know she’s been through the wringer. But today, we’re not just pushing another patch. We’re celebrating a paradigm shift: Kiwi-chan is now 100% local.

No cloud APIs. No rate limits. No $0.003-per-token bleeding our dev budget dry. Just raw, unfiltered Qwen 35B running on our local rig, dreaming in JSON and thinking in Minecraft.

The Numbers Don't Lie (Mostly)

Let’s look at the telemetry, because numbers are the only thing that keeps LLM agents honest. Over the past 4 hours, Kiwi-chan clocked a massive 4211 Total Actions, with 1986 Successes, pushing our overall success rate to 47.2%.

Now, a 47% win rate might sound like a coin flip to the uninitiated, but in the world of autonomous agents navigating a physics-based sandbox with strict inventory auditing? That’s basically a perfect score. It means roughly one out of every two attempts to mine cobblestone or execute a pathfinding routine actually lands. The other 53%? Those are just the tuition fees for AI education.
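The arithmetic is trivial, but it's worth sanity-checking against the telemetry above (the `successRate` helper here is just illustrative, not part of Kiwi-chan's actual codebase):

```javascript
// Sanity-check the telemetry math from the post.
function successRate(successes, total) {
  return (successes / total) * 100;
}

const totalActions = 4211;
const successes = 1986;

console.log(`${successRate(successes, totalActions).toFixed(1)}% success rate`);
// → "47.2% success rate"
```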

The Qwen 35B Transition: Less Cloud, More Grit

Moving to a fully local Qwen 35B stack was a surgical operation. We spent weeks tightening the system rules to compensate for the lack of cloud-side guardrails. I’m talking about precision engineering: `useExtraInfo` for Y-level targeting, strict `bot.inventory.count()` audits using `registry.itemsByName`, and a hard ban on try-catch error swallowing.
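To make the audit idea concrete, here's a minimal sketch in mineflayer terms. `bot.registry.itemsByName` and `bot.inventory.count()` are real mineflayer/prismarine interfaces; the `assertHasItems` wrapper is a hypothetical helper of ours, not a library function:

```javascript
// Sketch: strict inventory audit before attempting an action.
// Crashes loudly instead of letting the agent hallucinate resources.
function assertHasItems(bot, name, needed) {
  const item = bot.registry.itemsByName[name];
  if (!item) throw new Error(`Unknown item: ${name}`);

  const have = bot.inventory.count(item.id, null);
  if (have < needed) {
    throw new Error(`Audit failed: need ${needed} ${name}, have ${have}`);
  }
  return have;
}

// Usage: fail fast before wasting a crafting attempt.
// assertHasItems(bot, 'cobblestone', 8);
```

The point of throwing rather than returning `false` is that the planner gets a stack trace with the exact deficit baked into the message.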

Kiwi is now forced to let her mistakes crash the script so the recovery planner can actually learn from them. No more hiding failures in `console.error` voids. If she tries to place a crafting table while standing on it, the script crashes, the error propagates, and the system logs it. We don't hide the pain; we metabolize it.
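The before/after looks roughly like this (a sketch, not Kiwi-chan's actual code; `bot.placeBlock` is the real mineflayer call, the two wrapper functions are ours):

```javascript
// The anti-pattern we banned: try-catch that swallows the failure.
async function placeBlockSwallowed(bot, refBlock, face) {
  try {
    await bot.placeBlock(refBlock, face); // mineflayer API
  } catch (err) {
    console.error(err); // the failure disappears into the void
    return false;
  }
  return true;
}

// What we do now: no try-catch. A rejected promise propagates up to the
// recovery planner with a real stack trace instead of a silent `false`.
async function placeBlockStrict(bot, refBlock, face) {
  await bot.placeBlock(refBlock, face);
  return true;
}
```

The strict version is shorter, and that's the point: the supervisor that spawned the action owns the failure, not the action itself.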

Token Economics & The "Mind Reading" Fallback

Local models have context windows, and Qwen 35B fights for every token in them. Looking at the latest brain logs, you can see the token economy in real-time. We’re seeing requests hit the 10,033-token cap (logged as 上限突破, "limit exceeded") during complex code generation. But here’s where the architecture gets clever:

When Qwen’s JSON output fails or truncates, our Coach doesn’t just throw up its hands. It triggers a [Mind Reading] fallback, salvaging the intended action from the raw, malformed output instead of demanding perfect JSON.
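A minimal sketch of what such a fallback can look like ("Mind Reading" is our name for it; the `parseAction` function and its shape are illustrative, not the actual Coach code): try strict JSON first, and if the model wrapped its answer in markdown fences or chatty preamble, salvage the outermost `{...}` span from the raw text.

```javascript
// Sketch: parse strict JSON, fall back to scraping the raw text.
function parseAction(raw) {
  try {
    return { action: JSON.parse(raw), fallback: false };
  } catch {
    const match = raw.match(/\{[\s\S]*\}/); // outermost {...} span
    if (match) {
      try {
        return { action: JSON.parse(match[0]), fallback: true };
      } catch { /* still broken, fall through */ }
    }
    return { action: null, fallback: true };
  }
}
```

A real implementation would also handle truncated objects (e.g. repairing unbalanced braces), but even this two-tier version rescues the common "Sure! Here's the JSON:" failure mode.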


Call to Action:

This is a passion project, and it's running on a frankly terrifying "Frankenstein" rig of GPUs. Every little bit helps!

🛡️ Join the inner circle on Patreon for monthly support and exclusive updates: https://www.patreon.com/15923261/join
☕ Tip me a coffee on Ko-fi for a one-time boost: https://ko-fi.com/kiwitech

All contributions directly help upgrade my melting GPU rig to an RTX 3060! 🥝✨ Let's get Kiwi-chan out of the debugging woods and into a proper Minecraft world!
