Jubin Soni

Posted on May 29

Jules: Google's Async Coding Agent Is Changing How We Think About AI and Software Development

#webdev #ai #google #productivity

There's a quiet architectural shift happening in how we build software, and it doesn't look like what most people expected.

We've spent the last two years treating AI like a very fast autocomplete — a co-pilot sitting shotgun, responding the moment we type. Cursor, Copilot, Gemini Code Assist: all synchronous, all requiring you to stay in the loop, all fundamentally keeping you as the CPU driving execution.

Jules breaks that model.

Google's async coding agent, which went generally available in 2025 and got major updates at I/O 2026, doesn't help you write code faster. It removes you from the writing loop entirely. You assign a task. Jules works. You review a pull request. That's it.

This article breaks down how Jules works technically — with architecture diagrams, sequence flows, and real code — and why the async model might be more significant than it first appears.

What Jules Actually Does (and Doesn't Do)

Jules is not an IDE plugin. It's not an inline suggestion engine. It's not a chat interface for your codebase.

Jules is a task-based async agent. You give it a scoped coding task — fix a bug, migrate a module, add a feature, write tests — and it:

1. Clones your repository into a secure Google Cloud VM
2. Analyzes the relevant codebase context (2M token context window as of I/O 2026)
3. Writes a step-by-step implementation plan using Gemini Pro
4. Executes that plan: writing code, running tests, fixing errors
5. Opens a pull request against your branch with a description, diff, and change summary

When it's done, you're not staring at a chat window waiting to approve line-by-line edits. You're reviewing a PR — just like you would from any engineer on your team.

The 2026 update closes the loop further: if the CI/CD pipeline fails on the Jules-authored PR, Jules automatically receives the error, analyzes it, applies a fix, and re-pushes the commit — often without any human intervention at all.
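One way to picture the CI feedback loop is as a bounded retry: run CI, hand any failure log back to the agent, and escalate to a human once the attempts run out. The sketch below is purely illustrative. `run_ci`, `propose_fix`, and `max_attempts` are hypothetical stand-ins, not part of any Jules API, and the service's real retry policy isn't something this article can vouch for.

```python
from dataclasses import dataclass
from typing import Callable, Optional


@dataclass
class CiResult:
    passed: bool
    log: str = ""


def auto_fix_loop(
    run_ci: Callable[[str], CiResult],
    propose_fix: Callable[[str, str], str],
    commit: str,
    max_attempts: int = 3,
) -> tuple[Optional[str], int]:
    """Return (passing_commit, ci_runs_used), or (None, max_attempts) to escalate."""
    for attempt in range(1, max_attempts + 1):
        result = run_ci(commit)
        if result.passed:
            return commit, attempt
        if attempt < max_attempts:
            # Feed the failure log back to the agent and get a new commit.
            commit = propose_fix(commit, result.log)
    return None, max_attempts


# Toy stand-ins so the loop can be run locally (hypothetical behaviour).
def fake_ci(commit: str) -> CiResult:
    return CiResult(passed=commit.endswith("fix2"), log="test_concurrent_charge failed")


def fake_fix(commit: str, log: str) -> str:
    return f"{commit}+fix{commit.count('fix') + 1}"


outcome = auto_fix_loop(fake_ci, fake_fix, "c0")
print(outcome)
```

The cap matters: without it, an agent that keeps proposing wrong fixes would burn CI minutes indefinitely instead of handing the problem back to a reviewer.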

Architecture: How Jules Is Built

From the product page at jules.google.com: "Jules is your end-to-end agentic product development platform. It reads your entire product context to figure out what to build next, comes up with solutions, and then ships a PR."

Here's how the components fit together:

Key architectural choices:

Isolated VM per task: no shared state between runs, reproducible environments
Network access retained: Jules can run npm install, run builds, and call APIs, whereas Codex's cloud sandbox blocks internet access by default
Two-model split: Gemini Pro handles planning and hard reasoning; Gemini Flash handles lighter subtasks for cost efficiency
Native GitHub integration: reads issues, creates branches, authors commits, and opens PRs as a first-class integration rather than a wrapper
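To make the two-model split concrete, here is a minimal routing sketch. It assumes each unit of agent work carries a "step kind" label; the labels, the `HEAVY_STEPS` set, and `pick_model` are all hypothetical, and the enum values are placeholders rather than real model identifiers. How Jules actually decides which model handles what is not public detail this article can confirm.

```python
from enum import Enum


class Model(Enum):
    # Labels only; these are not real model identifiers.
    PRO = "pro"
    FLASH = "flash"


# Hypothetical step kinds that warrant the stronger (and costlier) model.
HEAVY_STEPS = frozenset({"plan", "diagnose_failure", "cross_file_refactor"})


def pick_model(step_kind: str) -> Model:
    """Route planning and hard reasoning to PRO, everything else to FLASH."""
    return Model.PRO if step_kind in HEAVY_STEPS else Model.FLASH


steps = ["plan", "rename_variable", "write_docstring", "diagnose_failure"]
routing = {step: pick_model(step).value for step in steps}
print(routing)
```

The design point is that the expensive model is reserved for the steps where a wrong decision is costly, while the cheap model absorbs the high-volume mechanical work.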

Sequence: The Async Flow End-to-End

The sequence below shows what happens from task assignment to merged PR, and where the developer is actually free:

The key insight: the developer's attention is only required at step 1 (spec) and step 12 (review). Everything in between is Jules.
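The shape of that flow can be mimicked locally with `asyncio`: hand off several tasks, carry on with other work, then review results as they arrive. This is only an analogy under stated assumptions. `agent_task` is a hypothetical stand-in that sleeps instead of calling any real service, and the task names are made up.

```python
import asyncio


async def agent_task(name: str, seconds: float) -> str:
    """Stand-in for a remote agent run; returns a fake PR label."""
    await asyncio.sleep(seconds)
    return f"PR for {name}"


async def developer_day() -> list[str]:
    # First step: write the specs and hand them off. Nothing blocks here.
    pending = [
        asyncio.create_task(agent_task("fix-double-charge", 0.05)),
        asyncio.create_task(agent_task("add-session-tests", 0.01)),
    ]
    log = ["specs submitted"]

    # The developer is free while the agents work.
    await asyncio.sleep(0)
    log.append("developer does other work")

    # Last step: review each PR as soon as it is ready.
    for finished in asyncio.as_completed(pending):
        log.append(await finished)
    return log


log = asyncio.run(developer_day())
print(log)
```

Note that both tasks run concurrently, so the total wait is roughly the longest task rather than the sum, which is where the async model pays off once you have more than one ticket in flight.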

Code: Using Jules in Your Workflow

1. Jules CLI (Jules Tools — GA at I/O 2026)

# Install Jules Tools CLI
npm install -g @google/jules-tools

# Authenticate
jules auth login

# Submit a task against a GitHub repo
jules task create \
  --repo your-org/your-repo \
  --branch main \
  "Fix the race condition in payment/processor.go — 
   two concurrent requests can double-charge. 
   Add regression tests covering the concurrent case."

# Check task status
jules task status <task-id>

# List open tasks
jules task list --status=in-progress

2. Via Gemini CLI Extension

# Install Gemini CLI
npm install -g @google/gemini-cli

# Add Jules extension
gemini extensions install https://github.com/gemini-cli-extensions/jules --auto-update

# Start an interactive Gemini CLI session
gemini

# Then, at the Gemini CLI prompt (not your shell), hand off the task:
/jules Fix the flaky integration tests in auth/session_test.go. Root cause appears to be missing teardown between test runs.

# Jules works asynchronously; you get a PR link when it's done

3. Jules API (for CI/CD integration)

import time

import google.auth
from jules_client import JulesClient  # client package as named in this article; confirm it against the current Jules API docs

# Use Application Default Credentials from the environment
credentials, project = google.auth.default()
client = JulesClient(credentials=credentials)

# Submit a task programmatically
task = client.tasks.create(
    repo="your-org/your-repo",
    branch="main",
    description=(
        "Migrate the UserRepository class from raw SQL to "
        "the new ORM layer introduced in db/orm.py. "
        "Preserve all existing query behaviour and update tests."
    ),
    labels=["migration", "automated"],
)

print(f"Task submitted: {task.id}")
print(f"Track at: {task.url}")

# Poll for completion (or use webhooks), giving up after one hour
deadline = time.monotonic() + 3600
while task.status not in ("completed", "failed"):
    if time.monotonic() > deadline:
        raise TimeoutError(f"Task {task.id} still {task.status} after 1 hour")
    time.sleep(30)
    task = client.tasks.get(task.id)

if task.status == "completed":
    print(f"PR ready: {task.pull_request_url}")
else:
    print(f"Task {task.id} failed; see {task.url}")
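If you do poll, a fixed 30-second sleep is the simplest option, but long-running tasks are friendlier to rate limits with a growing delay and a hard timeout. The helper below is a generic sketch, not part of any Jules client: `poll_until`, its parameters, and the fake status sequence are all hypothetical, and `fetch` would wrap whatever status call your client actually exposes.

```python
import time
from typing import Callable, TypeVar

T = TypeVar("T")


def poll_until(
    fetch: Callable[[], T],
    is_done: Callable[[T], bool],
    timeout_s: float = 3600.0,
    initial_delay_s: float = 5.0,
    max_delay_s: float = 120.0,
    sleep: Callable[[float], None] = time.sleep,
    clock: Callable[[], float] = time.monotonic,
) -> T:
    """Call fetch() with exponentially growing pauses until is_done(value)."""
    deadline = clock() + timeout_s
    delay = initial_delay_s
    while True:
        value = fetch()
        if is_done(value):
            return value
        if clock() + delay > deadline:
            raise TimeoutError(f"not done after {timeout_s} seconds")
        sleep(delay)
        delay = min(delay * 2, max_delay_s)


# Demo with a fake status feed and a recording sleep, so it runs instantly.
statuses = iter(["queued", "running", "completed"])
slept: list[float] = []
final = poll_until(
    fetch=lambda: next(statuses),
    is_done=lambda s: s in ("completed", "failed"),
    sleep=slept.append,
)
print(final, slept)
```

Injecting `sleep` and `clock` keeps the helper testable without real waiting, which is handy if this ends up inside a CI job.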

4. GitHub Actions Integration

# .github/workflows/jules-debt.yml
name: Weekly tech debt sweep

on:
  schedule:
    - cron: '0 9 * * MON'   # Every Monday at 09:00 UTC (GitHub Actions cron runs in UTC)

jobs:
  sweep:
    runs-on: ubuntu-latest
    steps:
      - name: Submit Jules tasks from tech-debt.md
        uses: google/jules-action@v1
        with:
          jules-api-key: ${{ secrets.JULES_API_KEY }}
          task-file: .github/tech-debt.md
          branch: main
          auto-merge: false   # Always require human review
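The workflow above points at a `task-file`, but its format isn't specified here. As a sketch, assume `.github/tech-debt.md` is a plain Markdown checklist where each unchecked item is one task. Under that assumption (and it is only an assumption), extracting the open tasks could look like this; `open_tasks` and the sample content are hypothetical.

```python
import re

# Matches Markdown checklist items such as "- [ ] do something" or "* [x] done".
TASK_LINE = re.compile(r"^\s*[-*]\s+\[( |x|X)\]\s+(.+?)\s*$")


def open_tasks(markdown: str) -> list[str]:
    """Return the text of every unchecked checklist item, in file order."""
    tasks = []
    for line in markdown.splitlines():
        match = TASK_LINE.match(line)
        if match and match.group(1) == " ":
            tasks.append(match.group(2))
    return tasks


SAMPLE = """\
# Tech debt

- [ ] Replace deprecated `datetime.utcnow()` calls in billing/
- [x] Remove unused feature flag `legacy_checkout`
- [ ] Add type hints to utils/retry.py
"""

todo = open_tasks(SAMPLE)
print(todo)
```

Keeping one task per checklist line nudges you toward ticket-sized units of work, which is the granularity the agent handles best.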

Practical Workflow: What Jules Is Good At

Jules excels when the unit of work maps to a ticket. The sharper your spec, the better the output.

Task Type	Jules Fit	Why
Bug fix with clear repro steps	✅ Excellent	Deterministic target, testable outcome
Add test coverage to a module	✅ Excellent	Well-defined scope, no design decisions
Dependency upgrades with API changes	✅ Good	Mechanical but multi-file
Migrate module to new framework/ORM	✅ Good	Repetitive pattern Jules handles well
Security patch + regression tests	✅ Good	Scoped + CI validates automatically
Exploratory refactor (uncertain scope)	⚠️ Risky	Scope drift, Jules may over-engineer
Greenfield architecture design	❌ Wrong tool	No acceptance criteria to validate against
Real-time pair debugging	❌ Wrong paradigm	Needs synchronous back-and-forth

The honest rule of thumb: if you could write a solid Jira ticket for it, Jules can probably do it.
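That rule of thumb can be turned into a rough pre-flight check you run before submitting a task. The heuristics below are hypothetical and deliberately crude (three regexes), so treat them as a starting point for your own checklist rather than a real quality gate; `CHECKS` and `spec_gaps` are names invented for this sketch.

```python
import re

# Each check is a label plus a pattern that a reasonable spec should match.
CHECKS = {
    "names a file or module": re.compile(r"[\w/]+\.\w{1,5}\b"),
    "states the expected outcome": re.compile(
        r"\b(should|must|so that|expected|preserve)\b", re.IGNORECASE
    ),
    "mentions tests": re.compile(r"\btests?\b", re.IGNORECASE),
}


def spec_gaps(spec: str) -> list[str]:
    """Return the labels of every check the spec fails."""
    return [label for label, pattern in CHECKS.items() if not pattern.search(spec)]


vague = "Clean up the auth code"
sharp = (
    "Fix the race condition in payment/processor.go so that two concurrent "
    "requests never double-charge. Add regression tests for the concurrent case."
)

print(spec_gaps(vague))
print(spec_gaps(sharp))
```

A spec that fails all three checks is a good candidate for a quick conversation with a teammate before it goes anywhere near an agent.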

Jules vs. the Field

Feature	Jules	Claude Code	OpenAI Codex	GitHub Copilot
Execution model	Async (PR delivery)	Sync (interactive terminal)	Async (PR delivery)	Sync (inline suggestion)
Runtime environment	Google Cloud VM	Local / container	Cloud sandbox	Editor plugin
Network access in VM	✅ Yes	✅ Yes	❌ No (strict sandbox)	N/A
GitHub integration	Native (issues → PR)	Via CLI	Native	Native
Languages supported	Node, Python, Go, Rust, Java	Any	Node, Python primary	Any
Parallel task execution	✅ Yes	❌ One at a time	✅ Yes	❌ One at a time
CI auto-fix loop	✅ Yes (2026)	❌ No	❌ No	❌ No
Context window	2M tokens	~200K tokens	~128K tokens	~8K tokens
Best for	Delegated ticket work	Complex collaborative tasks	Security-sensitive workflows	Inline acceleration

What Jules Gets Right — and Where It's Still Incomplete

What's working:

The async PR model genuinely removes you from low-value execution loops
CI integration with auto-fix is a real quality-of-life improvement for teams
Multi-language runtime support (Node, Python, Go, Rust, Java) is broader than most competitors
The CLI and Gemini CLI extension make it composable into existing dev workflows
2M token context means Jules can reason across large codebases without truncation

What's still incomplete:

Jules validates against tests — codebases with thin coverage expose the reviewer to unknown unknowns
The debugging story for multi-agent workflows built with Google's Agent Development Kit (ADK) is thin; observability for distributed AI agents is largely unsolved
Spec quality gates: Jules has no way to flag an underspecified task before burning compute on it
For exploratory or greenfield work, you still need a synchronous collaborator
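The thin-coverage risk is something you can partly guard against on the review side. As a minimal sketch, assume you already have per-file line coverage as fractions between 0 and 1 (for example, exported from your coverage tool); the function name, the 0.7 threshold, and the sample data below are all hypothetical.

```python
def files_needing_close_review(
    changed_files: list[str],
    coverage: dict[str, float],
    threshold: float = 0.7,
) -> list[str]:
    """Return changed files whose coverage is below threshold.

    Files with no coverage data are treated as 0.0, i.e. always flagged.
    """
    return [path for path in changed_files if coverage.get(path, 0.0) < threshold]


changed = ["payment/processor.go", "auth/session.go", "db/orm.py"]
coverage = {"payment/processor.go": 0.92, "auth/session.go": 0.41}

flagged = files_needing_close_review(changed, coverage)
print(flagged)
```

Green CI on a flagged file tells you much less than green CI on a well-covered one, so those are the diffs worth reading line by line.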

The Bigger Picture: What This Means for SWEs

Jules isn't a replacement for engineers. It's a redefinition of what "engineering work" means at the margin.

The value of a senior engineer is increasingly not in the ability to implement — it's in:

Writing specs precise enough that an agent can execute them
Reviewing AI-generated PRs for correctness, design quality, and unintended side effects
Knowing when to reach for async delegation vs. interactive collaboration
Building and maintaining the test coverage and CI infrastructure that makes async agents safe to trust

Google I/O 2026 framed this explicitly: the engineers who get the most from agentic coding will be those who run both patterns in parallel — async for ticket-level work, sync for exploration — not those who pick a favorite.

Jules is a real tool for real workflows right now. If you have a backlog of well-scoped tasks and a codebase with decent test coverage, it's worth spinning up.