We just wrapped an action-packed session walking through Google's new Gemini 3 model and how it performs inside Kilo Code. We talked benchmarks and pricing, and even shipped an entire video game with one prompt using Kilo Deploy.
Highlights
Headline: Gemini 3 is Google's most capable model so far.
Crushes benchmarks like GPQA Diamond, Humanity's Last Exam, MathArena Apex, and tops WebDev Arena and Terminal Bench 2.0
It has a 1M+ token context window and supports up to ~65k output tokens.
It's designed to feel smart, concise, and direct. It's better at understanding intent so you don't have to over-prompt.
Why should developers care?
Google explicitly calls Gemini 3 their best vibe coding & agentic coding model yet.
In our experiments, it picked sensible architectures and libraries, wrote extensible code, and organized multi-file React/Next.js projects the way a thoughtful human engineer would.
Pricing vs Claude Sonnet 4.5
At typical prompt sizes (<200k tokens), Gemini 3 Pro is slightly cheaper per input/output token than Claude Sonnet 4.5.
In long-context use (prompts approaching 1M tokens), Gemini 3 Pro is also more cost-efficient on both input and output.
With Kilo's open model selection, you could use Gemini for heavy reasoning/design, then switch to less expensive models for smaller implementation tasks, without changing anything about your workflow.
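To make the comparison concrete, here is a minimal sketch of how per-request cost could be estimated for each model. The prices in it are assumptions (USD per 1M tokens, with a higher tier once the prompt exceeds 200k tokens) and were not quoted in the session, so check each provider's current pricing page before relying on them. The type and function names are hypothetical.

```typescript
// Illustrative cost estimate. All prices are ASSUMED list prices in USD per
// 1M tokens; verify them against the providers' pricing pages.
interface PriceTier {
  inputPerMTok: number;
  outputPerMTok: number;
}

interface ModelPricing {
  name: string;
  longContextThreshold: number; // prompt tokens above which the long tier applies
  standard: PriceTier;
  longContext: PriceTier;
}

const GEMINI_3_PRO: ModelPricing = {
  name: "Gemini 3 Pro (preview)",
  longContextThreshold: 200000,
  standard: { inputPerMTok: 2.0, outputPerMTok: 12.0 }, // assumed
  longContext: { inputPerMTok: 4.0, outputPerMTok: 18.0 }, // assumed
};

const CLAUDE_SONNET_4_5: ModelPricing = {
  name: "Claude Sonnet 4.5",
  longContextThreshold: 200000,
  standard: { inputPerMTok: 3.0, outputPerMTok: 15.0 }, // assumed
  longContext: { inputPerMTok: 6.0, outputPerMTok: 22.5 }, // assumed
};

// Cost of one request: pick the tier from the prompt size, then charge
// input and output tokens at that tier's per-million rates.
function requestCostUsd(
  model: ModelPricing,
  inputTokens: number,
  outputTokens: number
): number {
  const tier =
    inputTokens > model.longContextThreshold ? model.longContext : model.standard;
  return (
    (inputTokens / 1e6) * tier.inputPerMTok +
    (outputTokens / 1e6) * tier.outputPerMTok
  );
}

// Example: a typical prompt and a long-context prompt for both models.
for (const model of [GEMINI_3_PRO, CLAUDE_SONNET_4_5]) {
  const typical = requestCostUsd(model, 50000, 5000);
  const longCtx = requestCostUsd(model, 500000, 10000);
  console.log(`${model.name}: typical $${typical.toFixed(3)}, long-context $${longCtx.toFixed(3)}`);
}
```

Under these assumed prices the gap widens as prompts grow, which is why routing only the heavy reasoning steps to the larger model can keep overall spend down.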
Live Demo: Building & Deploying Kilo Man in One Prompt
We showed an end-to-end flow using Gemini 3 Pro Preview in Kilo Code to build a playable (and honestly, pretty fun) video game in one prompt:
Prompting Kilo in Orchestrator Mode
Here's the prompt we gave to Kilo Code, set to Orchestrator mode:
"Create a modern, detailed 2D platform game called "Kilo Man" with a humanoid character that has to avoid obstacles and moving "monsters" to reach a final goal (like Super Mario).
It should be full screen, with a parallax, aesthetic background, and a black and yellow theme. The game camera should follow the player, and there should only be one, fairly long, scrolling level.
Include a slider that adjusts the player's jumping height."
Here's what happened next:
Agent Orchestration in Kilo Code
- Kilo's **[Orchestrator mode](https://kilo.ai/docs/basic-usage/orchestrator-mode)** created a subtask for **[Architect](https://kilo.ai/docs/basic-usage/using-modes#architect-mode)** mode.
- **Architect** wrote an implementation and architecture plan to a markdown file in the project repository:
  - Tech stack: **Next.js 15, React, Tailwind, HTML5 Canvas**
  - Systems for camera, enemies, physics, parallax visuals, and a slider for jump height.
- Orchestrator then handed off to **Code** mode, which implemented the full app according to the plan.
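
The generated source isn't reproduced in this recap. As a rough illustration of what the camera, parallax, and jump-height systems in the plan involve, here is a minimal sketch. The function names, the gravity constant, and the pixel units are all hypothetical, and this is not the code Gemini 3 produced.

```typescript
// Hypothetical helpers for a canvas platformer; NOT the generated game code.

const GRAVITY = 2000; // px/s^2, an assumed value for illustration

// Jump-height slider: under constant gravity g, reaching an apex of height h
// needs a launch speed of v = sqrt(2 * g * h).
function jumpVelocityForHeight(heightPx: number, gravity: number = GRAVITY): number {
  return Math.sqrt(2 * gravity * Math.max(0, heightPx));
}

// Follow camera: keep the player horizontally centered, but never scroll
// past the start or the end of the level.
function cameraX(playerX: number, viewportWidth: number, levelWidth: number): number {
  const centered = playerX - viewportWidth / 2;
  const maxX = Math.max(0, levelWidth - viewportWidth);
  return Math.min(Math.max(centered, 0), maxX);
}

// Parallax: each background layer moves at a fraction of the camera's motion.
// A factor near 0 looks far away; a factor of 1 moves with the foreground.
function layerOffset(camX: number, factor: number): number {
  return -camX * factor;
}

// Example frame: player mid-level, 800px viewport, 5000px level.
const cam = cameraX(1000, 800, 5000);
console.log({
  cam,
  farLayer: layerOffset(cam, 0.2),
  nearLayer: layerOffset(cam, 0.6),
  jumpSpeed: jumpVelocityForHeight(150),
});
```

Tying the slider to a target height rather than to a raw velocity keeps the control intuitive: doubling the slider value doubles how high the character actually jumps.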

Result: a fully playable platformer
- Fullscreen, parallax background, patrolling enemies, restart on death, follow-camera, jump-height slider. None of this was micromanaged in the prompt.
- All generated in one shot, then ran locally:

Deploying with Kilo Deploy (early testing feature)
- We connected GitHub to Kilo via the **[Integrations](https://app.kilocode.ai/integrations)** tab in the Kilo Dashboard.
- Went to **Deploy** → "New Deployment" → selected the Kilo Man repo & branch → clicked **Create deployment**.
- Kilo built the project and gave a **shareable production URL** for the game:

You can play Kilo Man here:
And, you can be an early tester of Kilo Deploy today! Visit our Discord to learn more:
Questions We Covered (Q&A Highlights)
- Q: What's Kilo Code's long-term plan for autonomous debugging?
  - Today: Use our native Debug mode, which you can pair with any model (including Gemini 3). Also, you can use the CLI in autonomous mode with the `--auto` flag.
  - Coming soon: Sign up to be an early tester of code reviews and cloud agents!
- Q: How does Kilo Code ship so fast?
  - Internally, we call it "Kilo Speed". We have a tight team, fast shipping cadence, and quick feedback loop with our open source community.
- Q: Is the Kilo CLI actively developed and core to the product?
  - Yes, and yes!
  - Recent features include parallel CLI agents (two agents working in parallel on the same task) and autonomous mode (no approval needed, suitable for pipelines).
- Q: Is it better to access Gemini 3 via Kilo Gateway or CLI? How does tool calling work?
  - It's the same model either way, so you can choose the CLI or Extension based on your personal workflow preference.
- Q: Do parallel agents use Git worktrees under the hood?
  - Yes, they do! That's how they don't overwrite/conflict with each other.
- Q: Is Kilo based on Cline/Roo and is it up to date?
  - Kilo originally forked Roo, then pulled in all the best from Cline.
  - Kilo is a superset of the best open-source features: take the best from what already exists, and keep evolving from there.
- Q: Are you accepting community contributions? Any best practices?
  - Absolutely!
  - Contributions, bug reports, and feature requests go through GitHub Discussions, so feel free to post there with details and screenshots, if possible.
- Q: Any more events coming?
What This All Means
Gemini 3 brings frontier-level reasoning + huge context + good taste in code.
Kilo Code's agentic orchestration (Architect / Code / Ask / Debug / Orchestrator) lets you actually use that intelligence in real projects.
With Kilo Deploy, going from "idea in a prompt" → "live app on a URL" is now realistically a one-prompt + few-clicks experience.
And, watch the session replay here:


