ANIRUDDHA ADAK

Posted on May 20

i watched google tear down the old internet from a hostel room in kolkata

#googleiochallenge #devchallenge #ai #gemini

This is a submission for the Google I/O Writing Challenge

it was well past midnight when i finally leaned back and thought: this is not a product update. this is a different kind of computing altogether.

the fan on my lenovo was whirring steadily. the air in my hostel room in kolkata was thick and humid, the way it always gets in may. it was around ten-thirty at night when the google i/o 2026 keynote started streaming on my screen. the main event had kicked off at ten in the morning pacific time, and india standard time runs twelve and a half hours ahead of that. i kept watching until it bled past midnight, because i had a feeling i should not skip a single slide.

i work as a full-stack developer and machine learning engineer. i design scalable software systems and build specialized ai pipelines. so when i say that what google showed that night felt less like a feature drop and more like a paradigm reset, i mean that very literally.

the afternoon of may 20 had me back at my desk for the deep-dive developer sessions. i sat through google cloud live, the firebase integrations, and the antigravity platform walkthroughs, taking notes and occasionally running my own tests in google ai studio. by the time i finished, i had filled several pages with things i needed to process.

what follows is that processing — honest, technical, and written the way i actually think about software.

`gemini 3.5 flash` and the model that runs everything now

the first thing that hit me was the scale of what gemini 3.5 flash is designed to do. this is now the default model powering the gemini app and google search. it is not a research preview. it is live, and google claims it runs nearly four times faster than competing frontier models at roughly half the cost.

that matters because the use case has changed. this model is not built to answer a question once. it is built for agentic workflows — meaning it is expected to run multiple steps in a row, call external tools, hold context across long sessions, and execute things autonomously without someone typing a new prompt every five seconds.
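
to make "multiple steps in a row" concrete, here is a minimal sketch of an agent loop. everything in it is an assumption for illustration: the tool names are hypothetical, and `scripted_model` is a hard-coded stand-in for a real model call, not anything from the gemini api.

```python
from typing import Callable

# Hypothetical tool registry: plain functions standing in for external tools.
TOOLS: dict[str, Callable[[str], str]] = {
    "search_notes": lambda query: f"3 notes mention '{query}'",
    "word_count": lambda text: str(len(text.split())),
}


def scripted_model(history: list[str]) -> tuple[str, str]:
    """Stand-in for a model call: chooses the next action from the step count."""
    plan = [
        ("search_notes", "agentic workflows"),
        ("word_count", history[-1] if history else ""),
        ("finish", "done"),
    ]
    return plan[min(len(history), len(plan) - 1)]


def run_agent(max_steps: int = 5) -> list[str]:
    """Loop: ask the model for an action, run the tool, keep the result as context."""
    history: list[str] = []  # context carried across steps
    for _ in range(max_steps):
        action, argument = scripted_model(history)
        if action == "finish":
            break
        history.append(TOOLS[action](argument))
    return history


result = run_agent()
print(result)
```

the point of the sketch is the shape: no human prompt between steps, each tool result feeds the next decision, and a step cap keeps the loop from running forever.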

the benchmark numbers they showed reflected that shift in focus:

| benchmark | score | what it measures |
| --- | --- | --- |
| `terminal-bench 2.1` | 76.2% | command line execution and tool routing |
| `gdpval-aa` | 1656 elo | autonomous agent performance |
| `mcp atlas` | 83.6% | model context protocol integration |
| `charxiv reasoning` | 84.2% | visual understanding and logical reasoning |

these are not the benchmarks of a chat assistant. these are the benchmarks of a system that is supposed to do things on your behalf.

`antigravity 2.0` — the part that genuinely made me sit up

i have been using various agentic coding tools for a while now. so i came in sceptical. then varun mohan got on stage and did something that changed my reference point for what "agentic coding" actually means.

he gave antigravity 2.0 a single task: build the core framework of an operating system from scratch.

what the platform did next was this →

i) it spun up 93 separate sub-agents running in parallel
ii) those agents collectively generated 2.6 billion tokens
iii) the entire os framework was completed in roughly 12 hours
iv) the total compute cost came in at under 1,000 dollars

when i read that last figure, i had to re-read it. under a thousand dollars for a coordinated 93-agent os build in twelve hours. that cost structure changes what is financially viable to attempt.
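
i wanted to see what those four numbers imply per token and per agent, so here is the back-of-envelope arithmetic. it assumes the work was split evenly across the 93 agents and treats 1,000 dollars as a ceiling. the real split between input and output tokens was not disclosed, so these are blended averages at best.

```python
# Figures as quoted from the demo above.
SUB_AGENTS = 93
TOTAL_TOKENS = 2_600_000_000
HOURS = 12
COST_CEILING_USD = 1_000  # "under 1,000 dollars", so every result is an upper bound

# Blended cost per million tokens (input and output lumped together).
usd_per_million_tokens = COST_CEILING_USD / (TOTAL_TOKENS / 1_000_000)

# Average throughput per agent, assuming an even split of the work.
tokens_per_agent = TOTAL_TOKENS / SUB_AGENTS
tokens_per_agent_per_second = tokens_per_agent / (HOURS * 3600)

# Cost of keeping one sub-agent busy for one hour.
usd_per_agent_hour = COST_CEILING_USD / (SUB_AGENTS * HOURS)

print(f"blended cost: under {usd_per_million_tokens:.2f} usd per million tokens")
print(f"throughput: about {tokens_per_agent_per_second:.0f} tokens per agent per second")
print(f"agent-hour: under {usd_per_agent_hour:.2f} usd")
```

that works out to under roughly 0.39 dollars per million tokens and under 0.90 dollars per agent-hour, which is the part that reframes what is worth attempting.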

then mohan tried to run doom on the freshly compiled os. it failed because of missing keyboard and video drivers. so he prompted antigravity 2.0 to write the drivers live, on stage. within seconds, freedoom (the open-source variant) was running and fully playable.

i have seen a lot of live demos fall apart. this one did not. and the fact that the failure itself became part of the demo — and got fixed in real time — actually made it more convincing, not less.

antigravity 2.0 is now a standalone desktop application with full cli support. google is actively nudging developers to move away from legacy command line interfaces toward this. i am already testing the migration.

how search became something different

the search box that i have used since i was a kid looks completely different now. under the hood, it is powered by gemini 3.5 flash, and it supports something called generative ui — where instead of returning a list of links, the search engine builds an interactive application or widget for you, on demand.

the feature that stuck with me most is search with canvas artifacts. when you search for something, a side panel opens with a live, editable mini-application. you can drag elements, modify the logic, inspect the structure. it is less like searching and more like summoning a working tool.

this is where i started to feel the ground shift. the result of a search is no longer a document. it is a running program.

`gemini spark` — an agent that runs while i sleep

google introduced gemini spark as a cloud-based, always-on personal agent. it does not need my laptop to be open or my phone to be charged. it runs on dedicated virtual machines inside google cloud, continuously, in the background.

what it handles:

1) calendar organization and scheduling
2) email inbox monitoring and draft responses
3) document drafting across workspace apps
4) task routing to over 30 third-party platforms via the open model context protocol

that last point is significant. spark can interact with platforms like opentable, uber, adobe, and asana: booking reservations, requesting rides, updating project boards. the integration is through mcp, which means it is not a closed google-only pipeline.
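
for anyone who has not looked at mcp on the wire: tool invocations are json-rpc 2.0 messages with the method `tools/call`. the sketch below builds one such request. the tool name `book_table` and its arguments are hypothetical, and i am not claiming this is what spark actually sends. a real server advertises its own tools through `tools/list`.

```python
import json
from itertools import count

_request_ids = count(1)  # JSON-RPC requests need unique ids per connection


def build_tool_call(tool_name: str, arguments: dict) -> str:
    """Serialize an MCP-style `tools/call` request as a JSON-RPC 2.0 message."""
    request = {
        "jsonrpc": "2.0",
        "id": next(_request_ids),
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }
    return json.dumps(request)


# Hypothetical tool name and arguments, for illustration only.
message = build_tool_call("book_table", {"party_size": 2, "time": "19:30"})
parsed = json.loads(message)
print(message)
```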

to handle the obvious security concern, google built the agent payments protocol. this framework lets users set →

✅ strict spending limits per agent session
✅ transaction restrictions to pre-approved merchants only
✅ mandatory manual human confirmation before any purchase clears
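
the three controls above compose into a simple gate. this is my own sketch of that logic, not the agent payments protocol itself: the class name, field names, and rejection messages are all assumptions.

```python
from dataclasses import dataclass


@dataclass
class SpendingPolicy:
    """Hypothetical per-session guardrails mirroring the three controls above."""
    session_limit: float
    approved_merchants: frozenset[str]
    spent: float = 0.0

    def authorize(self, merchant: str, amount: float, human_confirmed: bool) -> tuple[bool, str]:
        # 1. Only pre-approved merchants are allowed at all.
        if merchant not in self.approved_merchants:
            return False, "merchant not pre-approved"
        # 2. The running total must stay within the session limit.
        if self.spent + amount > self.session_limit:
            return False, "session spending limit exceeded"
        # 3. Nothing clears without an explicit human confirmation.
        if not human_confirmed:
            return False, "awaiting human confirmation"
        self.spent += amount
        return True, "approved"


policy = SpendingPolicy(session_limit=50.0, approved_merchants=frozenset({"opentable", "uber"}))
first = policy.authorize("uber", 20.0, human_confirmed=True)
over_limit = policy.authorize("uber", 40.0, human_confirmed=True)
unknown = policy.authorize("adobe", 5.0, human_confirmed=True)
unconfirmed = policy.authorize("opentable", 10.0, human_confirmed=False)
print(first, over_limit, unknown, unconfirmed)
```

note that only an approved purchase changes `spent`, so a rejected attempt cannot eat into the budget.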

i appreciate that the payments protocol exists. i still think the tradeoffs here deserve careful thought, and i will come back to that later in this piece.

what i actually built in google ai studio

reading about new tools is one thing. i spent several hours testing them myself, which is the only way i trust my own analysis.

android app from a prompt

the android development pipeline inside google ai studio is, genuinely, fast. here is how it went when i built a native task management client:

i) prompt-driven kotlin generation → i selected the "build an android app" option and described what i wanted. the build agent generated production-quality kotlin code using the latest jetpack compose patterns. no scaffolding, no boilerplate hunting.

ii) real-time ui customization → i used the preview editor to draw directly on the virtual interface, adjusting margins and generating custom asset styling through the nano banana generator tool.

iii) emulator and deployment → i tested the build inside the browser using the integrated android emulator, then deployed directly to a physical device via adb. connecting my google play developer account and pushing to the internal test track was a single click.

the whole process from blank prompt to a device-deployed app took me under two hours. that includes the time i spent breaking things on purpose to see what would happen.

web portal deployed to cloud run

i also built a companion web portal to act as a command dashboard for my background agents. the workflow here was notably smooth:

1) workspace api integration → linked the portal directly to google sheets and google drive inside ai studio. no separate database connectors needed.

2) one-click serverless deployment → i hit deploy, and the app was live on google cloud run within seconds. no yaml files, no container configuration.

3) zero-cost developer tier → the first two deployed applications are free, with no credit card required. this matters for prototyping because it removes the friction of "let me check my billing first."

4) codebase export → i pulled the entire project state — conversation history and file structure included — directly into my local antigravity 2.0 desktop environment to continue scaling from there.

`gemini omni` and what it means for video

gemini omni is google's answer to multimodal generation at a serious level. what separates it from older video generators is its architecture — it is a single unified neural network that processes text, images, audio, and video simultaneously, rather than stitching together separate models.

the result is that when you edit a video using gemini omni, the physics are consistent across frames. lighting holds. fluid dynamics behave. if you add a character to a scene, they exist inside the scene's physical logic, not pasted on top of it.

the first iteration, gemini omni flash, is live now for subscribers through the gemini portal and through google flow.

on the content verification side, google has expanded synthid watermarking and c2pa content credentials across gemini omni output. every generated video carries an invisible digital watermark that supposedly persists through →

i) file compression
ii) screen recording
iii) direct editing

google search and chrome are now actively verifying these credentials. whether the watermarking holds under determined adversarial testing is something i want to see third-party researchers evaluate independently.

the things i think we need to talk about

i am genuinely excited about a lot of what was announced. but i am also a systems engineer, and i think it is important to be honest about the architectural and societal tradeoffs embedded in these announcements.

☑️ the ecosystem trap

antigravity 2.0, firebase, and google cloud run now form a very coherent, very capable development stack. the tighter that integration gets, the harder it becomes to leave. if your agents write code, host it, deploy it, and maintain it all within a single ecosystem, you are building a dependency that will cost you significantly if you ever need to migrate. this is worth thinking about before you go all in.

☑️ token consumption at scale

gemini 3.5 flash is cost-efficient for a frontier model. but complex agentic workflows eat through context windows fast. when you are coordinating multiple sub-agents that are continuously analyzing codebases and running tests, token costs can spiral. this is almost certainly the reason behind the new 100 dollars per month ai ultra subscription tier. it is not just an upsell. it reflects the actual compute cost of power users who regularly hit the baseline api quotas.

☑️ surveillance by convenience

for gemini spark and android halo to do what they are designed to do — proactive cross-app automation — they need to observe a continuous, detailed picture of your digital life. your emails. your calendar. your purchases. your workflows across thirty-plus third-party platforms.

the traditional security boundary between separate applications has to dissolve for this to work. you are not paying for these features with money alone. you are paying with the depth and continuity of your behavioral data. that is a real tradeoff, and i think every developer who builds on top of these platforms should make that tradeoff consciously rather than by default.

i am not saying do not use these tools. i am saying understand what you are agreeing to when you do.

what i am actually doing with all of this

for developers and system architects who want to move intentionally through this new landscape, here is how i am thinking about it:

→ migrate to the antigravity cli

if you are a command-line developer (and if you are reading this, you probably are), the new antigravity cli is worth switching to from the legacy gemini cli. it brings sandboxed execution, credential masking, and secure git policies as first-class features.
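
i do not know how antigravity implements credential masking internally. as a rough idea of what the feature has to do, here is a minimal sketch that redacts secrets from text before it reaches a log or a model context. the two patterns are illustrative assumptions, and real secret scanners use far larger rule sets.

```python
import re

# Illustrative patterns only: key=value style secrets and bearer tokens.
KEY_VALUE = re.compile(r"(?i)\b(api[_-]?key|token|password|secret)(\s*[=:]\s*)\S+")
BEARER = re.compile(r"(?i)\b(bearer\s+)[A-Za-z0-9._~+/=-]+")


def mask_credentials(text: str) -> str:
    """Replace the secret value with *** while keeping the key name visible."""
    text = KEY_VALUE.sub(r"\1\2***", text)
    return BEARER.sub(r"\1***", text)


command = "export API_KEY=abc123 && curl -H 'Authorization: Bearer xyz.789'"
masked = mask_credentials(command)
print(masked)
```

keeping the key name and masking only the value means the log still tells you which credential was used, just not what it was.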

→ build for webmcp compatibility

when you are building web applications now, expose structured tools — javascript functions, html forms — using the proposed webmcp open standard. if your web interface is not navigable by a browser-based ai agent, it will increasingly be invisible to the workflows people build around these tools.
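
webmcp is still a proposal, so i am not going to guess at its browser-side registration api. what i can sketch is the general idea of a structured tool: a name, a description, and a json-schema-like input contract, which is the shape mcp tools use. the `add_task` tool and the tiny validator below are hypothetical and cover only a sliver of json schema.

```python
# Hypothetical descriptor for an "add task" form, using the MCP tool shape
# (name, description, inputSchema).
ADD_TASK_TOOL = {
    "name": "add_task",
    "description": "Create a task in the user's task list.",
    "inputSchema": {
        "type": "object",
        "properties": {
            "title": {"type": "string"},
            "due": {"type": "string"},
        },
        "required": ["title"],
    },
}

# Only the primitive types this sketch needs.
PYTHON_TYPES = {"string": str, "number": (int, float), "boolean": bool}


def validate_arguments(tool: dict, arguments: dict) -> list[str]:
    """Return a list of problems; an empty list means the call is acceptable."""
    schema = tool["inputSchema"]
    problems = [
        f"missing required field: {name}"
        for name in schema.get("required", [])
        if name not in arguments
    ]
    for name, value in arguments.items():
        spec = schema["properties"].get(name)
        if spec is None:
            problems.append(f"unknown field: {name}")
        elif not isinstance(value, PYTHON_TYPES[spec["type"]]):
            problems.append(f"wrong type for {name}: expected {spec['type']}")
    return problems


ok = validate_arguments(ADD_TASK_TOOL, {"title": "ship it"})
bad = validate_arguments(ADD_TASK_TOOL, {"due": 5})
print(ok, bad)
```

an agent that can read a contract like this does not have to guess which button to click, which is the whole argument for exposing tools explicitly.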

→ manage your token footprint deliberately

configure persistent, isolated environments when calling the gemini api. resuming an existing multi-turn session, rather than re-uploading full file contexts on every call, can meaningfully reduce operational costs and keep session coherence intact.
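
the difference is easy to underestimate, so here is a toy model of it. the assumptions are mine: a fixed base context, a fixed number of new tokens per turn, and a session that is billed only for newly sent tokens once it is resumed. real pricing (for example, discounted cached tokens) will land somewhere between the two curves.

```python
def tokens_resending_context(base_context: int, tokens_per_turn: int, turns: int) -> int:
    """Every call re-uploads the base context plus the full history so far."""
    return sum(base_context + tokens_per_turn * turn for turn in range(1, turns + 1))


def tokens_resuming_session(base_context: int, tokens_per_turn: int, turns: int) -> int:
    """The base context is uploaded once; each call then sends only the new turn."""
    return base_context + tokens_per_turn * turns


# Hypothetical workload: a 200k-token codebase and 20 turns of about 2k tokens each.
resent = tokens_resending_context(200_000, 2_000, 20)
resumed = tokens_resuming_session(200_000, 2_000, 20)
print(f"resending: {resent:,} tokens, resuming: {resumed:,} tokens")
```

under these assumptions, resending grows with the square of the turn count while resuming grows linearly. for this workload that is 4,420,000 tokens against 240,000.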

→ use safe play store tracks first

if you are vibe-coding mobile apps inside google ai studio (and honestly, i have been), always connect to the internal test track before you do anything else. isolate your early prototypes completely from production builds while you are still figuring out what you actually built.

where i ended up

it was past one in the morning in kolkata when i finally closed the last session tab. the fan on my laptop wound down. the room was quiet again.

i have been a developer long enough to feel the difference between a year where things get incrementally better and a year where the underlying model of how software gets built actually changes. this felt like the second kind.

the agent is not a feature you add to your product anymore. the agent is the environment your product runs inside.

that sentence is what i kept coming back to as i fell asleep. i think it is the most accurate single-line description of what google announced at i/o 2026.

to follow the ongoing release of these tools, visit io.google/2026 or check what i am building at github.com/aniruddhaadak.

— aniruddha adak, kolkata, may 20, 2026
