There's a specific reason why plugging a free model into Claude Code usually fails, and it's not the model quality.
The Real Requirement: Tool Calling
Claude Code's agent loop works like this:
- Receive task
- Call a tool (create file, run command, read output)
- Observe result
- Continue or adjust
Step 2 requires the model to support function calling — the ability to call structured tools, not just generate text. Most free models either don't support it at all, or implement it in a way that breaks at step 2 or 3. The agent starts working and then silently fails or returns garbage.
This is why OpenRouter free tiers, most local models, and generic API proxies don't work well with Claude Code as an agent. They work fine for chat. They fail for agentic tasks.
Gemma 4 31B: Native Function Calling
Gemma 4 31B has proper native function calling support. That's what makes it actually work as a coding agent — not just a code generator that hands text back to you.
To verify: I ran two tests.
Test 1: Build a Python terminal dashboard
Create a Python script called dashboard.py that:
- Generates sample SaaS metrics (Revenue, Users, Signups, Churn)
- Prints 4 metric cards with trend arrows
- ASCII bar chart for 6 months of revenue
- Recent transactions table
- Python standard library only, run it after creating
It created the file, ran it, verified the output. Full agent loop, zero issues.
Test 2: Find and fix 3 bugs in user_report.py
The script had:
-
=instead of==in a list comprehension (syntax error) -
datetime.datetime.nowwithout()(subtle runtime error) -
user[email]instead ofuser["email"](missing quotes)
Gemma 4 found all three, explained each one, fixed them, and ran the fixed version to confirm output.
The Setup (One Command)
Install Ollama if you don't have it:
curl -fsSL https://ollama.com/install.sh | sh
Then:
ollama launch claude --model gemma4
This connects Claude Code to Gemma 4 31B running on Ollama's cloud. The download is a small routing file — not 20GB. First run asks you to authenticate with Ollama. After that, it's instant.
Honest Limitations
Free tier: Measured in GPU time, not tokens. Resets periodically. A normal coding session fits fine. Heavy daily use will hit the limit — $20/month gives 50x more headroom.
HTML generation bug: Current version produces malformed output (doubled tags) for HTML specifically. Python, shell, JSON, config files — all fine. Don't use it for frontend templating.
When This Is Useful
If you want to test Claude Code's agentic capabilities before committing to an Anthropic plan, this is the most functional free option available right now. The tool calling works, the agent loop completes, and the model is capable enough to handle real tasks.
Full writeup: ayyaztech.com/blog/gemma4-claude-code-ollama-cloud-free
Top comments (0)