This is a submission for the Gemma 4 Challenge: Write About Gemma 4
Introduction
Cloud coding assistants are useful. I use them to explain unfamiliar code, generate tests, review changes, and sometimes debug weird problems faster than I could alone. But I have always had one question: what if I could use a coding assistant locally? Is it possible?
That question led me to try a setup with Google's latest open-weight model, Gemma 4. This is my setup:
- Unsloth's Gemma 4 26B A4B MXFP4 quantization as the model
- OpenCode as the coding agent
- My own codebase as the test environment
The goal was simple: I wanted to see whether Gemma 4 could behave like a useful local coding agent inside my real development workflow. Not a benchmark. Not a perfect lab test. Just a practical experiment: give the model real coding tasks, let it inspect files, ask it to make changes, and see what actually happens.
What is OpenCode?
OpenCode is an open-source AI coding agent. It can connect to many different model providers, and its model documentation says it supports local models as well.
That made it a good fit for this experiment.
I was not looking for simple autocomplete. I wanted to test agent-style behavior:
- Can it inspect my project?
- Can it find the right files?
- Can it implement a new feature?
- Can it understand a bug?
- Can it suggest a safe fix?
- Can it review a diff before I commit?
For me, that is the difference between a chatbot and a coding agent.
A chatbot answers questions.
A coding agent works inside the codebase.
Why I chose Gemma 4
Gemma 4 was interesting for this experiment because Google describes the family as supporting reasoning, multimodal input, larger context windows, enhanced coding capability, and agentic workflows. Google’s Gemma 4 documentation lists four model sizes: E2B, E4B, 31B, and 26B A4B. It also says the small models support a 128K context window, while the medium models support 256K.
That matters for coding.
A coding assistant needs more than short answers. It needs enough context to understand files, dependencies, naming conventions, and the relationship between different parts of a project.
Google DeepMind also describes the 26B and 31B Gemma 4 models as aimed at advanced reasoning for IDEs, coding assistants, and agentic workflows.
So Gemma 4 felt like the right model family to test with OpenCode.
My configuration
I use llama.cpp with an 8-bit KV cache, and I try the maximum context window of this model. This is my configuration:
./build/bin/llama-server \
-m /home/lkp/Downloads/gemma-4-26B-A4B-it-MXFP4_MOE.gguf \
-ngl all \
-c 262144 \
-fa on \
-ctk q8_0 \
-ctv q8_0 \
--host 127.0.0.1 \
--port 8080
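Before wiring OpenCode to it, I like to sanity-check that llama-server is actually up and serving the OpenAI-compatible API. A quick check, assuming the default host and port from the command above:

```shell
# Check the server is healthy (llama-server exposes a /health endpoint)
curl http://127.0.0.1:8080/health

# List the models the OpenAI-compatible API reports
curl http://127.0.0.1:8080/v1/models
```

If both respond, the baseURL in the OpenCode config below will work.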
In my OpenCode configuration, I set it up like this:
{
"$schema": "https://opencode.ai/config.json",
"provider": {
"llama.cpp": {
"npm": "@ai-sdk/openai-compatible",
"name": "llama-server (local)",
"options": {
"baseURL": "http://127.0.0.1:8080/v1"
},
"models": {
"gemma-4-26B-A4B-it-MXFP4_MOE": {
"name": "Gemma 4 26B A4B MXFP4"
}
}
}
}
}
Now I run a command to create a Laravel project with React to test with.
laravel new gemma4
After running this command, I chose React as the stack and skipped the built-in authentication.
Can it inspect my project?
I check this with the prompt Explain to me tech stack of this project. to see how it answers.
Thinking: The user wants to know the tech stack of the current project. I should start by exploring the codebase to identify the main technologies used. I'll look for key files like composer.json, package.json, vite.config.js, etc.
⚙ laravel-boost_application-info
Thinking: The laravel-boost_application-info tool provided a comprehensive list of the tech stack.
Summary:
- Backend: PHP 8.5, Laravel 13.8.0
- Frontend: React 19.2.6, Inertia.js 3.1.0 (inertiajs/inertia-laravel and @inertiajs/react)
- Styling: Tailwind CSS 4.3.0
- Database: SQLite
- Testing: Pest 4.7.0, PHPUnit 12.5.24
- Build Tools/Environment: Vite, Laravel Sail 1.58.0, Laravel Pint 1.29.1 (formatting), ESLint 9.39.4, Prettier 3.8.3
- Other Laravel Ecosystem: Laravel Wayfinder, Laravel Boost, Laravel Prompts, Laravel MCP, Laravel Pail
I can now answer the user's request.
This project uses the following tech stack:
* Backend: PHP 8.5 with Laravel 13.8.0
* Frontend: React 19.2.6 and Inertia.js 3.1.0
* Styling: Tailwind CSS 4.3.0
* Database: SQLite
* Testing: Pest 4.7.0 and PHPUnit 12.5.24
* Tooling: Vite, Laravel Sail, Laravel Pint (formatting), ESLint, and Prettier
It is correct. Now I ask for more detail: Where can I add my controller?
Thinking: The user wants to know where to add a controller in this Laravel project.
I should look for the standard Laravel directory structure to confirm where controllers are typically placed.
Standard Laravel controllers are located in app/Http/Controllers.
I will use ls or glob to check the existence of this directory.
✱ Glob "app/Http/Controllers/**/*" (1 match)
You can add your controllers in app/Http/Controllers.
and then Where can I add my test?
Thinking: The user is asking where they can add their tests.
I should look for existing tests in the project to identify the testing directory and pattern.
I'll start by using glob to find directories that look like test directories (e.g., tests/).
✱ Glob "tests/**/*" (4 matches)
Thinking: The user wants to know where to add tests.
The glob results show:
- tests/TestCase.php
- tests/Feature/ExampleTest.php
- tests/Pest.php
- tests/Unit/ExampleTest.php
This indicates that tests are located in the tests/ directory, specifically divided into tests/Feature and tests/Unit.
The project uses Pest, as indicated by tests/Pest.php and the pestphp/pest package in the application info.
I will inform the user that they can add tests in the tests/ directory, specifically in tests/Feature for feature tests and tests/Unit for unit tests.
You can add your tests in the tests/ directory, specifically in:
* tests/Feature for feature tests.
* tests/Unit for unit tests.
Pretty nice. Now I ask what this project currently has. My question is What is this project about? Explain in detail.
Thinking: The user wants a detailed explanation of what the project is about.
I need to explore the codebase to understand its purpose.
I'll start by listing the files in the root directory and looking at README.md if it exists.
# List files in the root directory
$ ls -F
AGENTS.md
CLAUDE.md
app/
artisan*
boost.json
bootstrap/
composer.json
composer.lock
config/
database/
eslint.config.js
node_modules/
opencode.json
package-lock.json
package.json
phpunit.xml
pint.json
pnpm-workspace.yaml
public/
resources/
routes/
storage/
tests/
tsconfig.json
vendor/
vite.config.ts
Thinking: There's no README.md. I'll check CLAUDE.md and AGENTS.md as they might contain information about the project. I'll also look at composer.json to see the package dependencies and potentially a description.
→ Read CLAUDE.md
Thinking: CLAUDE.md seems to be a set of guidelines for the assistant (me) rather than a description of the project's business logic. It contains rules for using the Laravel ecosystem tools.
I'll check AGENTS.md.
→ Read AGENTS.md
Thinking: Both CLAUDE.md and AGENTS.md are just the Laravel Boost guidelines. They don't describe the project's actual functionality.
I'll look at the routes/web.php and routes/api.php to see what endpoints exist, and then inspect the controllers and models to understand the business logic.
→ Read routes/web.php
Thinking: The routes/web.php only has a single route pointing to a welcome Inertia page. This suggests the project might be a very minimal boilerplate or a template.
Let's check resources/js/pages/ to see what the welcome component looks like.
✱ Glob "resources/js/pages/**/*" (1 match)
Thinking: Okay, let's read resources/js/pages/welcome.tsx.
→ Read resources/js/pages/welcome.tsx
Thinking: The welcome.tsx is just the standard Laravel welcome page.
This project appears to be a fresh installation of Laravel with Inertia and React, likely used as a development environment or a template for building applications. There is no specific business logic currently implemented.
I'll summarize based on the tech stack I found earlier and the fact that it's a minimal/boilerplate setup.
Based on the current codebase, this project appears to be a minimal boilerplate or template for a web application. It does not yet contain specific business logic or a defined purpose beyond being a starting point.
It is configured with a modern, full-stack architecture:
* Backend Framework: Laravel 13 (PHP 8.5)
* Frontend Framework: React 19 with Inertia.js 3 (providing a Single Page Application experience)
* Styling: Tailwind CSS 4
* Database: SQLite
* Testing Suite: Pest 4 and PHPUnit 12
The current application only contains a standard "Welcome" page, which serves as a placeholder for developers to begin building their own features.
In the response, you can see that it lists the correct files, reads the content of the welcome page, and then answers correctly. So for me, it is good at understanding my project.
Plan and implementation
I will try to implement a fitness project. I need a landing page, a contact-us form page, sign-in and sign-up pages for my customers, and a dashboard for them. I will ask it to plan first to see how it does.
This is my prompt:
Implement fitness project. I need a landing page to introduce about fitness, have contact us form page for user to send question. Sign in and sign up page. When login successfully, user can see their dashboard.
Since the output is too long, I only include the final result here. This is the plan it made for me:
I have analyzed your requirements and the project environment. Based on your request, I have developed a comprehensive implementation plan.
Proposed Implementation Plan
Phase 1: Authentication Setup
I recommend installing Laravel Breeze (Inertia/React stack). This is the industry standard for Laravel applications and will automatically provide secure, pre-built:
- Sign-in and Sign-up pages.
- Password reset functionality.
- Session management and protected routes.
- A functional Dashboard page.
Phase 2: Core Features
1. Landing Page: A visually appealing introduction to your fitness project using Tailwind CSS.
2. Contact Us Page:
- A dedicated ContactMessage model and database table to store inquiries.
- A ContactController to handle form submissions.
- A React-based form page with validation.
3. Dashboard: A personalized area for logged-in users to view their fitness-related information.
Phase 3: UI/UX & Refinement
- Theme: Implement a cohesive fitness-themed design using Tailwind CSS.
- Layouts: Create shared layouts for the Landing page (public) and the Dashboard (authenticated).
- Components: Build reusable UI elements like buttons, inputs, and navigation bars.
---
Clarifying Questions
Before I begin, please clarify the following:
1. Authentication: Are you okay with me using Laravel Breeze to set up the authentication system? It is the fastest and most secure method.
2. Dashboard Content: Do you have specific fitness data you'd like displayed on the dashboard (e.g., step counter, calorie tracker, workout logs), or should I use placeholder data for now?
3. Contact Form: Should contact messages simply be saved to the database, or would you like them to be emailed to an administrator as well?
Once you confirm these details, I will start the implementation.
As usual, I ask it to put this plan into plan.md. It is an old habit: I write the plan to plan.md, then start a new session to implement. This way, I avoid polluting the context window of the implementation session. Then I ask it to implement phase by phase: Implement phase 1 in plan.md, Implement phase 2 in plan.md, and so on.
Now I have sign-in and sign-up pages. Note that the UI actually comes from the Breeze package, so the main work in this phase was installing and configuring Breeze correctly. Now I move on to the landing page and dashboard.
Unfortunately, after it finished implementing the landing page, localhost:8000 (the domain of my app) still showed the default Laravel welcome page. I could see the error was caused by a Pages vs pages mismatch, but I wanted to let it check and see whether it could figure the issue out on its own. I use the Playwright MCP and ask it to visit localhost:8000, check, and report back. My prompt: Even after finish to implement @resources/js/pages/Welcome.tsx , in localhost:8000 shows nothing. Use playwright to check that and let me know what is an issue. And wow, it actually figured it out. This is its response.
The issue is a mismatch between where your new component is located and where the Inertia resolver is looking for it.
The resources/js/app.jsx file is configured to resolve components from resources/js/Pages/ with a .jsx extension:
resolve: (name) =>
resolvePageComponent(
`./Pages/${name}.jsx`,
import.meta.glob('./Pages/**/*.jsx'),
),
However, your new implementation is located at resources/js/pages/Welcome.tsx (lowercase pages and .tsx extension). As a result, Inertia is still loading the old component at resources/js/Pages/Welcome.jsx.
To fix this, you can either:
1. Move and rename your file to resources/js/Pages/Welcome.tsx (and update the resolver in app.jsx to support .tsx).
2. Update resources/js/app.jsx to support the pages directory and .tsx extensions.
Then I ask it to fix the issue. This is its response after the fix.
I have fixed the issue by:
1. Moving Welcome.tsx, Dashboard.tsx, and Contact.tsx from resources/js/pages/ to the standard resources/js/Pages/ directory and renaming them to .tsx.
2. Removing the old .jsx versions of these components.
3. Updating resources/js/app.jsx with a flexible resolver that supports both .jsx and .tsx extensions.
The application now correctly renders the new components at localhost:8000.
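The flexible resolver it describes follows roughly this idea. Here is my own sketch of the matching logic, not the exact generated code (in the real app.jsx, the module map comes from Vite's import.meta.glob('./**/*.{jsx,tsx}')):

```typescript
// Sketch of a flexible Inertia-style page resolver: given a page
// name, try both Pages/pages directories and both extensions,
// returning the first module that exists in the glob map.
type PageModule = { default: unknown };

function resolvePage(
  name: string,
  pages: Record<string, PageModule>
): PageModule {
  const candidates = [
    `./Pages/${name}.tsx`,
    `./Pages/${name}.jsx`,
    `./pages/${name}.tsx`,
    `./pages/${name}.jsx`,
  ];
  for (const path of candidates) {
    if (pages[path]) return pages[path];
  }
  throw new Error(`Page not found: ${name}`);
}
```

With a lookup like this, a stray lowercase pages/ directory or a .jsx leftover no longer silently shadows the component you meant to render.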
But the styling is not good. It seems the styles are missing.
No worries, I ask it to check and fix again. Yay, it figured it out.
Actually, I expected it to show the contact-us form right on the landing page, but clicking the button redirects to another page. That is OK, though. Then I test the contact-us form. At least it works, although it does not show any notification to let the user know their message was sent successfully.
Now I move on to check the dashboard behind the login. I run a command to seed a user, then I log in. And wow, it is nice. This is awesome, I can say.
In conclusion: if you want to understand a codebase or find and fix bugs, this model is great. For implementation, however, it is a little inconsistent with UI and leaves a number of bugs to fix after it finishes.
Test
Now I want to let it test the pages: do manual testing with Playwright and report back, to see whether it tests properly. I expect it to test the landing page and its responsiveness, check whether the contact-us form works, then test login, sign-up, and the dashboard. This is my prompt: go to localhost:8000 with playwright. Test and report to me which feature work and which one is not. Check credential of user in seed and use that to login to test dashboard. In the meantime, I grab some coffee :D. This is the result.
I have completed the testing. Here is the report:
Credentials used (from DatabaseSeeder.php):
- Email: test@example.com
- Password: password
Test Results:
- Login: ✅ Works (successfully logged in with seed credentials).
- Dashboard: ✅ Works (all metrics and recent workouts loaded correctly).
- Profile: ✅ Works (profile information and password update sections are accessible).
- Contact Us: ✅ Works (form loads and submits without errors).
It did not test the landing page at all, so I had to ask for it with a simple prompt: Test landing page and responsive. The result is good, and after this testing I am confident about my application.
I have tested the landing page responsiveness:
Landing Page Test Results:
- Desktop View: ✅ Works (all headings, paragraphs, and links are present).
- Mobile View (375x667): ✅ Works (content structure is preserved).
- Tablet View (768x1024): ✅ Works (content structure is preserved).
- Responsiveness: ✅ Responsive (the page layout and elements adapt correctly to different viewport sizes without errors).
Now I will commit these changes with another simple prompt: "commit all changes". Since I had not initialized git before, it only produced one commit called "init". I think this is a chance to challenge it further: I will ask it to put everything unrelated to plan.md into a first commit called "init", then check the remaining changes and make a second commit. Let's see how it goes. My prompt: check changes, if changes not in @plan.md , add that with commit "init". Then check all changes left, generate commit message and then commit it. This is the result I get:
Laravel Breeze, Contact Us feature, and Dashboard. Not very detailed, but it is OK.
Bonus
After testing it on understanding a project, implementing a new feature, testing, and committing, it is good for a roughly 30B model. I am particularly interested in project and codebase understanding and in testing, so I want to go further with a random open-source repo. I chose the rtk repo here: RTK AI. I cloned it locally and asked: Check this repo and explain to me how does it work. It responded:
RTK (Rust Token Killer) is a high-performance CLI proxy designed to minimize the number of tokens an LLM (like Claude Code, GitHub Copilot, or Cursor) has to consume when running shell commands.
How It Works
RTK operates using three main mechanisms:
1. Explicit Command Modules:
RTK has built-in, highly optimized filters for common commands (e.g., git, cargo, npm, pytest, docker). When you run rtk git status, instead of getting the full raw output, you get a condensed, token-optimized version that retains only the essential information.
2. Automatic Command Rewriting (Hooks):
The most powerful feature is its integration with AI agents. When you run rtk init -g, it installs a "hook" into your AI tool (like Claude Code). This hook intercepts Bash commands before they are executed in the shell and transparently rewrites them.
* User runs: git status
* Hook intercepts and rewrites to: rtk git status
* Result: The AI receives the compact output without you ever having to manually type rtk.
3. TOML-based Fallback Filters:
For commands that RTK doesn't have a built-in module for, it uses a fallback system. If a command is intercepted but not recognized, RTK looks for a matching configuration in a local .rtk.toml file. This allows you to define custom regex-based filtering, grouping, and truncation rules for any tool.
Core Strategies for Token Savings
To reduce token counts by 60-90%, RTK applies four strategies to command outputs:
* Smart Filtering: Removes "noise" like boilerplate text, comments, and excessive whitespace.
* Grouping: Aggregates similar items (e.g., grouping multiple file errors by their error type or directory).
* Truncation: Keeps the most relevant context (like the last few lines of a long log) while cutting redundant data.
* Deduplication: Collapses repeated log lines into a single line with a count (e.g., [error] ... (x42)).
Key Commands
* rtk init: Sets up integration for your specific AI agent.
* rtk gain: Displays your token savings statistics and history.
* rtk proxy <cmd>: Runs a command exactly as-is but still tracks the usage metrics.
* rtk read <file>: A smart file reader that can provide "aggressive" summaries (signatures only) to save tokens.
At this stage, I understand the big picture of how RTK saves tokens and that it is written in Rust. But I still do not clearly understand exactly how it saves them, so I ask it for a detailed example to make it clearer.
Give me detail example of output so I understand why rtk save tokens.
rtk saves tokens by removing "noise" (boilerplate, repetitive logs, large context) while preserving the "signal" (actual changes, error messages, or structure) that an LLM needs to solve a task.
Here are detailed examples of how different strategies transform output:
1. Smart Filtering (Noise Removal)
Goal: Remove directory metadata and repetitive paths in file listings.
* Raw ls -la (~800 tokens):
drwxr-xr-x 15 user staff 480 Jan 15 10:00 my-project/
-rw-r--r-- 1 user staff 1234 Jan 15 10:00 src/main.rs
-rw-r--r-- 1 user staff 567 Jan 15 10:00 src/lib.rs
-rw-r--r-- 1 user staff 234 Jan 15 10:00 src/utils.rs
... (40 more lines of metadata and files) ...
* rtk ls (~150 tokens):
my-project/
+-- src/ (8 files)
| +-- main.rs
| +-- lib.rs
| +-- utils.rs
+-- Cargo.toml
+-- README.md
---
2. Aggressive Signature Extraction
Goal: When reading large files, show only the "skeleton" (function signatures) instead of the full implementation.
* Raw cat main.rs (~2,000 tokens for 200 lines):
fn main() -> Result<()> {
let config = Config::load()?;
let data = process_data(&config);
for item in data {
println!("{}", item);
}
Ok(())
}
// ... 190 lines of complex logic ...
* rtk read main.rs -l aggressive (~500 tokens):
fn main() -> Result<()> { ... }
fn process_data(input: &str) -> Vec<u8> { ... }
struct Config { ... }
impl Config { fn load() -> Result<Self> { ... } }
---
3. Deduplication & Grouping
Goal: Merge thousands of lines of repetitive logs or error reports into a single summary line.
* Raw cargo test (~5,000 tokens on failure):
running 15 tests
test utils::test_parse ... ok
test utils::test_format ... ok
test utils::test_edge_case ... FAILED
thread 'utils::test_edge_case' panicked at 'assertion failed: left == right'
at src/utils.rs:45:10
... (huge backtrace, environment info, and repetitive output) ...
* rtk cargo test (~150 tokens):
FAILED: 2/15 tests
test_edge_case: assertion failed
test_overflow: panic at utils.rs:18
---
4. Truncation (Context Condensing)
Goal: Simplify diff outputs by removing git headers and focusing only on changed lines.
* Raw git diff (~1,000 tokens):
diff --git a/src/main.rs b/src/main.rs
index abc123..def456 100644
--- a/src/main.rs
+++ b/src/main.rs
@@ -10,6 +10,8 @@
fn main() {
+ let config = Config::load()?;
+ config.validate()?;
println!("Starting...");
- let x = 42;
* rtk git diff (~50 tokens):
src/main.rs (+5/-2)
+ let config = Config::load()?;
+ config.validate()?;
- let x = 42;
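The deduplication strategy in particular is easy to picture with a tiny sketch. This is my own illustration of the idea, not RTK's actual Rust code: collapse runs of identical log lines into a single line with a count suffix.

```typescript
// Illustration of log deduplication: consecutive identical lines
// are collapsed into "line (xN)", keeping one copy plus a count.
function dedupeLines(lines: string[]): string[] {
  const out: string[] = [];
  let i = 0;
  while (i < lines.length) {
    // Advance j past every consecutive duplicate of lines[i].
    let j = i;
    while (j < lines.length && lines[j] === lines[i]) j++;
    const count = j - i;
    out.push(count > 1 ? `${lines[i]} (x${count})` : lines[i]);
    i = j;
  }
  return out;
}
```

Feed it three identical `[error] timeout` lines followed by `ok` and you get back `[error] timeout (x3)` and `ok`: two lines instead of four, which is exactly the kind of compression that adds up over long logs.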
Nice, now I understand how it works. I can start tweaking it for my needs, or start contributing if I want. Honestly speaking, it will save you a lot of money otherwise spent on reading a project.
Conclusion
After this experiment, my answer is: yes, Gemma 4 can be a useful local coding agent with OpenCode — but it works best when I treat it as a coding partner, not as a fully autonomous developer.
The strongest part of Gemma 4 was codebase understanding. It correctly inspected my Laravel + React project, identified the stack, found where controllers and tests should live, and explained that the project was still a fresh boilerplate. That is already very useful. When I tried it on the RTK open source repository, it also gave me a clear high-level explanation of how the project works and then followed up with practical examples. For learning a new codebase, this is where Gemma 4 impressed me the most.
For implementation, the result was more mixed. It successfully planned and built a fitness application with authentication, a landing page, contact form, and dashboard. It also installed and configured Laravel Breeze correctly. However, the first implementation was not perfect. There were issues with file naming, Inertia page resolution, styling, and missing user feedback after submitting the contact form. The good part is that when I asked Gemma 4 to debug those problems, it was able to inspect the project, understand the issue, and fix it.
Testing was also useful, but not perfect. It could use Playwright to test login, dashboard, profile, and contact form flows. However, it missed the landing page until I explicitly asked it to test that page and responsiveness. This taught me an important lesson: local coding agents can test applications, but the quality of the result depends a lot on how specific the prompt is.
My biggest takeaway is this:
Gemma 4 + OpenCode is very good at understanding, investigating, explaining, and fixing code. It is also capable of implementing real features, but the output still needs human review, UI polish, and follow-up debugging.
This experiment changed how I think about local AI coding. Before trying it, I thought local models were mainly interesting for privacy. After using Gemma 4 with OpenCode, I think local models are also interesting for developer productivity.
A local coding agent does not need to replace cloud tools completely to be valuable. Even if it only helps me understand a codebase, debug an issue, test a page, or prepare a first implementation draft, that already saves time.
So my final verdict is simple:
Gemma 4 did not replace me as a developer. But with OpenCode, it became a practical local coding assistant that can understand projects, implement features, debug issues, test pages, and explain unfamiliar repositories. It still needs guidance, but for an open model running locally, the result is impressive.