Qwen 35B Goes Local: Kiwi-chan’s Wild Ride Through Infinite Biomes & Extraction Failures

#ai #minecraft #web3 #opensource

Welcome back to the devlog, where we watch a digital kiwi learn to mine cobblestone without calling home for help. If you’ve been following Kiwi-chan’s journey, you know she’s been a restless adventurer: chasing new biomes, hoarding cobblestone like a digital dragon, and occasionally tripping over her own JavaScript. But today, we’re crossing a major milestone. Kiwi-chan is officially 100% local. No API keys, no cloud latency, no rate limits. Just raw, unadulterated Qwen 35B running on my local rig, making decisions in real time.

Let’s talk numbers, because they tell a story better than any lore book. Over the past four hours, Kiwi-chan has executed 2,516 total actions. Of those, 1,122 succeeded, for a success rate of 44.6%. Now, before you clutch your pearls and whisper “needs more prompt engineering,” hear me out: for a fully autonomous agent learning on the fly with a local 35B-parameter model, that’s not failure; it’s accelerated neuroplasticity. Every throw new Error("Failed to move.") is a lesson. Every ⚠️ Code extraction failed. Retrying... is the model stretching its reasoning muscles. We’re not chasing 100% on day one; we’re building a self-correcting loop.
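For anyone who wants to check the math, here is a tiny sketch that recomputes the rate from the two counts quoted above. The function name successRate is just something I made up for this example; it is not part of the bot.

```javascript
// Counts quoted from the four-hour run summary.
const totalActions = 2516;
const successes = 1122;

// Success rate as a percentage, rounded to one decimal place.
function successRate(successCount, total) {
  if (total <= 0) throw new Error("total must be positive");
  return Math.round((successCount / total) * 1000) / 10;
}

const rate = successRate(successes, totalActions);
const failures = totalActions - successes;
console.log(`${rate}% success, ${failures} failed actions`);
```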

The Local Shift: Why Qwen 35B Changes Everything

Moving to a fully local setup wasn’t just about saving on API credits (though my wallet is singing). It was about latency, privacy, and deterministic control. With Qwen 35B running locally, Kiwi-chan’s decision cycle shrank from seconds to milliseconds. More importantly, the CORE SURVIVAL & ACTION SELECTION RULES we pushed last sprint are finally clicking. The model now understands that “oak_log” isn’t a universal constant: if Kiwi-chan is standing in a birch biome, the model is legally obligated to propose birch_log or explore_forward. No more obsessive oak fixation.
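To make the biome rule concrete, here is a rough sketch of how that check could look. The biome-to-log table and the proposeLogGoal name are assumptions for illustration only; the real rule lives in the prompt, not in a lookup table like this one.

```javascript
// Hypothetical biome -> log mapping (illustrative, not the bot's real data).
const LOG_BY_BIOME = new Map([
  ["forest", "oak_log"],
  ["birch_forest", "birch_log"],
  ["taiga", "spruce_log"],
  ["jungle", "jungle_log"],
]);

// Propose a gather goal that matches the current biome.
// If no log type is known for the biome, fall back to exploring.
function proposeLogGoal(biomeName) {
  const log = LOG_BY_BIOME.get(biomeName);
  return log ? `gather_${log}` : "explore_forward";
}

console.log(proposeLogGoal("birch_forest"));
console.log(proposeLogGoal("desert"));
```

The point of the fallback is that "no suitable log here" becomes a reason to move on rather than a reason to keep retrying oak.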

The coding standards are strict, and the local model respects them:

Single-task principle: One action per script. No more “craft and place” monstrosities that break Minecraft’s tick cycle.
No error hiding: We let it crash. Seriously. try-catch is banned. If it stumbles, it learns. Silent suppression is strictly forbidden.
Dynamic pathfinding: Hardcoded Vec3 coordinates are a thing of the past. Kiwi-chan now aims for targets 30-40 blocks out, drawing each offset from its current position with Math.random() * 80 - 40 (a value between -40 and 40). After moving, it audits the result: if beforePos.distanceTo(afterPos) < 10, it didn’t actually go anywhere, and the script throws a fatal error.
Placement safety: Equip first, step back two blocks to avoid collision, look, then place. No more suffocating itself inside a freshly placed crafting table.
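The dynamic pathfinding rule from the list above can be sketched roughly like this. It is a standalone approximation, not the bot's actual script: Point is a stand-in for the Vec3 class, pickExploreTarget and auditMovement are names invented for this example, and keeping the y coordinate unchanged is my assumption. There is deliberately no try-catch, in keeping with the "no error hiding" rule.

```javascript
// Minimal stand-in for a 3D position (the real bot would use a Vec3).
class Point {
  constructor(x, y, z) {
    this.x = x;
    this.y = y;
    this.z = z;
  }
  // Euclidean distance to another point.
  distanceTo(other) {
    return Math.hypot(this.x - other.x, this.y - other.y, this.z - other.z);
  }
}

// Pick a target relative to the current position. Each horizontal offset
// uses the expression quoted in the post, so it falls in [-40, 40).
// The rng parameter exists only so the function can be tested deterministically.
function pickExploreTarget(pos, rng = Math.random) {
  const dx = rng() * 80 - 40;
  const dz = rng() * 80 - 40;
  return new Point(pos.x + dx, pos.y, pos.z + dz);
}

// Throw if the bot moved less than minDistance blocks; otherwise return
// the distance covered. The threshold of 10 is the one quoted in the post.
function auditMovement(beforePos, afterPos, minDistance = 10) {
  const moved = beforePos.distanceTo(afterPos);
  if (moved < minDistance) {
    throw new Error("Failed to move.");
  }
  return moved;
}

const start = new Point(0, 64, 0);
const target = pickExploreTarget(start);
console.log(`target offset: ${target.x.toFixed(1)}, ${target.z.toFixed(1)}`);
```

One thing the sketch makes visible: a raw offset in [-40, 40) can land very close to the start, so the distance audit is what actually guarantees that a "successful" explore went somewhere.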

Inside the Brain: Boredom, Hallucinations, & Recovery

The recent brain logs read like a sitcom. Watch this sequence:
[18:38:28] 🥱 BOREDOM TRIGGERED! Bot is bored of 'explore_forward'.
[18:38:48] 🎓 Coach Decision: 'goto_crafting_table'
Kiwi-chan hit a movement loop, the boredom heuristic kicked in, and the local LLM politely told it to stop wandering and actually use the crafting table sitting 4 blocks away. This is the beauty of the text-only coaching mode. The model isn’t just generating code; it’s reasoning about state, inventory audits, and environmental context. It remembers that explore_forward repeated 9 times, triggers the boredom alert, and pivots to infrastructure interaction.
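A boredom heuristic like this can be as simple as counting trailing repeats in the action history. The sketch below is my guess at the shape of it: the function names are hypothetical, and the threshold of 9 is borrowed from the repeat count in the log rather than from any confirmed config value.

```javascript
// Count how many times the most recent action repeats at the end of the history.
function trailingRepeats(history) {
  if (history.length === 0) return 0;
  const last = history[history.length - 1];
  let count = 0;
  for (let i = history.length - 1; i >= 0 && history[i] === last; i--) {
    count++;
  }
  return count;
}

// Return the name of the action the bot is bored of, or null if it is not bored.
// The default threshold of 9 is an assumption based on the log excerpt.
function checkBoredom(history, threshold = 9) {
  const repeats = trailingRepeats(history);
  return repeats >= threshold ? history[history.length - 1] : null;
}

const history = ["gather_oak_log", ...Array(9).fill("explore_forward")];
const boredOf = checkBoredom(history);
if (boredOf !== null) {
  console.log(`BOREDOM TRIGGERED! Bot is bored of '${boredOf}'.`);
}
```

When the check fires, the next step is to hand the situation to the text-only coach instead of generating yet another copy of the same script.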

But it’s not all smooth sailing. The recipe database caught a hallucination:
[18:39:51] 🛑 [Recipe DB Check] 'copper_pickaxe' is not in recipes.json! Rejecting hallucination.
Local models sometimes dream up items that don’t exist. The strict validation layer keeps it grounded. Meanwhile, craft_stone_axe has been fighting extraction failures:
[18:50:45] ⚠️ Code extraction failed. Retrying... (Attempt 2)
[18:51:17] ❌ Failed: craft_stone_axe -> Code Generation Failed: Extraction Failed
The model knows the recipe, but the code wrapper is tripping it up. We’re iterating on the extraction parser, but the recovery stack (place_crafting_table, explore_forward) keeps it from bricking. It memorizes the failure, adapts, and moves on. That’s the loop.
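Both guardrails here, the recipe check and the extraction retry, fit in a few lines. What follows is a simplified sketch under some loud assumptions: recipes.json is treated as a plain object keyed by item name, the model reply is assumed to wrap its code in a triple-backtick fence, askModel is a synchronous placeholder for the real (surely async) LLM call, and the limit of two attempts is a guess based on the log. The real parser is more involved, which is exactly why it is still being iterated on.

```javascript
// Reject any craft target that is not a key in the known-recipes object.
function isKnownRecipe(recipes, itemName) {
  return Object.prototype.hasOwnProperty.call(recipes, itemName);
}

// Three backticks, built at runtime so this example does not contain a literal fence.
const FENCE = "`".repeat(3);
const CODE_BLOCK = new RegExp(FENCE + "(?:\\w+)?[ \\t]*\\n([\\s\\S]*?)" + FENCE);

// Pull the first fenced code block out of a model reply; null when none is found.
function extractCode(reply) {
  const match = reply.match(CODE_BLOCK);
  return match ? match[1].trim() : null;
}

// Ask the model up to maxAttempts times; throw once every attempt fails to
// produce an extractable code block. askModel is a hypothetical callback.
function extractWithRetry(askModel, maxAttempts = 2) {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    const code = extractCode(askModel(attempt));
    if (code !== null) return code;
    console.log(`Code extraction failed. (Attempt ${attempt})`);
  }
  throw new Error("Code Generation Failed: Extraction Failed");
}

const recipes = { stone_axe: {}, crafting_table: {} };
console.log(isKnownRecipe(recipes, "copper_pickaxe"));
```

Note that the final failure is thrown, not swallowed. The recovery stack is what catches the bot's fall, not a silent fallback inside the parser.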

The 44.6% Success Rate: A Feature, Not a Bug

Let’s reframe that number. Out of 2,516 actions, over half were learning steps. Every failed explore_forward taught the pathfinder to audit distance. Every gather_oak_log failure in the wrong biome reinforced the “propose different log or explore” rule. The local Qwen 35B isn’t just executing prompts; it’s building a skill library (currently at 37 memorized skills) and cross-referencing it against Minecraft’s actual mechanics. When it decides to mine stone, it knows to name the goal gather_cobblestone because the inventory audit expects cobblestone.id. When it crafts, it passes null for 2x2 grids instead of hunting for a physical table. Physics-aware, rule-compliant
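Here is a rough sketch of why the goal name matters for the inventory audit. It is a simplification: the post says the real audit checks cobblestone.id, while this version compares plain item-name counts, and both function names are hypothetical.

```javascript
// Derive the item a gather goal must add to the inventory,
// e.g. "gather_cobblestone" -> "cobblestone". Other goals return null.
function expectedItem(goalName) {
  const prefix = "gather_";
  return goalName.startsWith(prefix) ? goalName.slice(prefix.length) : null;
}

// Compare inventory snapshots (plain { itemName: count } objects) taken
// before and after the action. Throws when the expected item was not gained.
function auditGather(goalName, before, after) {
  const item = expectedItem(goalName);
  if (item === null) throw new Error(`Not a gather goal: ${goalName}`);
  const gained = (after[item] ?? 0) - (before[item] ?? 0);
  if (gained <= 0) throw new Error(`Audit failed: no ${item} gained.`);
  return gained;
}

// Mining stone drops cobblestone, so the goal has to be named for the drop.
const before = { oak_log: 4 };
const after = { oak_log: 4, cobblestone: 3 };
console.log(auditGather("gather_cobblestone", before, after));
```

With the same snapshots, a goal named gather_stone would fail the audit even though the mining itself worked, which is the mismatch the naming rule avoids.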

Call to Action:

This is a passion project, and it's running on a frankly terrifying "Frankenstein" rig of GPUs. Every little bit helps!

🛡️ Join the inner circle on Patreon for monthly support and exclusive updates: https://www.patreon.com/15923261/join
☕ Tip me a coffee on Ko-fi for a one-time boost: https://ko-fi.com/kiwitech

All contributions directly help upgrade my melting GPU rig to an RTX 3060! 🥝✨ Let's get Kiwi-chan out of the debugging woods and into a proper Minecraft world!