My Dad Got an Electricity Bill He Couldn't Understand. Google I/O 2026 Just Made That Problem Solvable.

#devchallenge #googleiochallenge

This is a submission for the Google I/O Writing Challenge

The Moment That Started Everything

Last year my dad got our electricity bill and just stared at it.

He couldn't understand why it was so high. He had no idea which device was the problem. Was it the iron my mum uses every morning? The old fridge that runs all night? The TV nobody turns off?

He just paid it. Like he always does. Because there was no easy way to find out.

That moment stayed with me. So I built something about it: an app where you point your camera at any appliance, pinch your fingers, and instantly see what that device costs you per month in your local currency, what it does to your CO₂ footprint, and one simple habit to cut the cost immediately.

The heart of it was Gemini's vision capability. You point. Gemini looks. Gemini understands. No typing. No forms. Just a camera and a pinch gesture.

It worked. But building it taught me exactly where the limits were.

And then Google I/O 2026 happened.

What I Was Actually Asking Gemini to Do

When you point a camera at an appliance, you are not giving Gemini a clean, labelled image. You are giving it a live camera frame, often at an angle, often in bad lighting, often partially obscured by a cable or a counter.

Gemini 2.0 Flash handled it remarkably well. It could identify an iron, a refrigerator, a standing fan, a microwave, even when the image was not perfect. It returned structured data: what the device is, how many watts it typically consumes, and what that costs per month in Nigeria, Ghana, or Kenya.
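The arithmetic behind that monthly figure is simple enough to sketch. Assuming the model returns a typical wattage and the app knows roughly how many hours a day the device runs, the cost is the energy in kWh multiplied by the local tariff. Everything named below is an illustrative assumption rather than the app's actual code: the function names, the tariff values, the grid emission factor, and the iron's wattage are placeholders, not real utility figures.

```python
from dataclasses import dataclass

# Placeholder tariffs in local currency per kWh. Illustrative assumptions only;
# real tariffs vary by utility, customer band, and date.
TARIFF_PER_KWH = {"NG": 200.0, "GH": 2.0, "KE": 25.0}

# Placeholder grid emission factor in kg CO2 per kWh, also an assumption.
KG_CO2_PER_KWH = 0.4


@dataclass
class ApplianceEstimate:
    name: str
    watts: float          # typical power draw reported by the model
    hours_per_day: float  # assumed daily usage


def monthly_kwh(est: ApplianceEstimate, days: int = 30) -> float:
    """Energy used in a month: watts x hours x days, converted to kWh."""
    return est.watts * est.hours_per_day * days / 1000.0


def monthly_cost(est: ApplianceEstimate, country: str) -> float:
    """Monthly cost in the local currency of `country`."""
    return monthly_kwh(est) * TARIFF_PER_KWH[country]


def monthly_co2_kg(est: ApplianceEstimate) -> float:
    """Monthly CO2 footprint in kilograms."""
    return monthly_kwh(est) * KG_CO2_PER_KWH


# Hypothetical example: a 1000 W iron used half an hour every morning.
iron = ApplianceEstimate(name="iron", watts=1000.0, hours_per_day=0.5)
print(monthly_kwh(iron))         # 15.0
print(monthly_cost(iron, "NG"))  # 3000.0
```

The wattage is the only input that comes from vision, which is why a wrong identification matters so much: every number downstream scales with it.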

But there were moments where it hesitated. Older appliances with no visible branding. Devices that looked similar from certain angles. Situations where the model was confident but slightly wrong, calling a dehumidifier an air purifier, for instance, which changes the wattage estimate significantly.

Those were not failures. They were the edges of what vision AI could do at the time.

What Gemini 3.5 Flash Changes

Gemini 3.5 Flash improves both multimodal reasoning and response speed, which matters enormously for real-time camera experiences.

For a real-time camera application, that speed difference is not a convenience. It is the difference between an experience that feels instant and one that makes the user wait.

But the multimodal improvement matters more than the speed. Gemini 3.5 Flash is better at understanding what it is actually looking at in complex, real-world visual contexts. Not studio images. Not clean product photos. Real environments with bad lighting, odd angles, and ambiguous objects.

That is exactly the problem I was trying to solve.

An old Nigerian refrigerator from the 1990s does not look like the refrigerators in most training datasets. A locally assembled electric cooker does not have a recognizable brand logo. The visual diversity of household appliances across Nigeria, Ghana, Kenya, and India is enormous, and the gap between what a model trained on Western product images knows and what actually exists in these homes is real.

Gemini 3.5 Flash's improved multimodal understanding closes that gap. Not completely. But meaningfully.

What Gemini Omni Opens Up

Gemini Omni is a new series of models that combines Gemini's reasoning capabilities with creation, accepting image, audio, video, and text input and outputting video grounded in real-world knowledge.

This one stopped me.

Imagine pointing your camera at your kitchen and, instead of scanning one appliance at a time, the model watches a short video of the room, identifies every device it sees, and generates a complete energy audit: total monthly cost, total CO₂ footprint, and priority recommendations, all in one pass.
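Once a model has listed the devices in a room, the audit itself is mostly aggregation. Here is a minimal sketch of that last step, assuming each detected device already carries a monthly cost and CO₂ estimate. The `DetectedDevice` type, the `build_audit` function, and the sample figures are hypothetical, and nothing here calls a real Gemini API.

```python
from dataclasses import dataclass


@dataclass
class DetectedDevice:
    name: str
    monthly_cost: float    # local currency, estimated upstream
    monthly_co2_kg: float  # estimated upstream


def build_audit(devices: list[DetectedDevice], top_n: int = 3) -> dict:
    """Roll per-device estimates into one household-level summary."""
    ranked = sorted(devices, key=lambda d: d.monthly_cost, reverse=True)
    return {
        "total_monthly_cost": sum(d.monthly_cost for d in devices),
        "total_monthly_co2_kg": sum(d.monthly_co2_kg for d in devices),
        # Costliest devices first: that is where a habit change pays off most.
        "priorities": [d.name for d in ranked[:top_n]],
    }


# Hypothetical detections, as if extracted from one kitchen video.
kitchen = [
    DetectedDevice("microwave", 1200.0, 2.5),
    DetectedDevice("refrigerator", 9000.0, 18.0),
    DetectedDevice("kettle", 3000.0, 6.0),
]
audit = build_audit(kitchen, top_n=2)
print(audit["total_monthly_cost"])  # 13200.0
print(audit["priorities"])          # ['refrigerator', 'kettle']
```

Ranking by cost is the simplest possible priority rule. A real audit might also weigh how easy each habit is to change.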

That is not a feature I could have built before. That is a feature I could build now.

The pinch-to-scan interaction I built works well for focused, one-device-at-a-time scanning. But the real problem my dad had was not "tell me about this one device." It was "tell me why this whole bill is so high." Video input that generates grounded output is the architecture that answers the actual question.

The Announcement That Surprised Me Most

Google announced Antigravity 2.0 with new capabilities to orchestrate and build agents, transitioning from AI that simply assists to agents that can independently navigate complex tasks.

When I built my energy scanning app, every interaction was stateless. You scan a device, you get a result. If you scan another device, you get another result. The dashboard accumulated data, but the model had no memory of what it had already seen.

An agent that can remember, one that knows you scanned the fridge last week, that it already recommended you switch to a more efficient model, and that it is now tracking whether your bill actually went down, is a fundamentally different product.
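To make the difference from stateless scanning concrete, here is a rough sketch of the state such an agent would need to carry between sessions. It is plain Python, not Antigravity's API. The `HouseholdMemory` class and its methods are hypothetical names, the dates and amounts are made up, and a real app would persist this state instead of keeping it in memory.

```python
from dataclasses import dataclass, field
from datetime import date
from typing import Optional


@dataclass
class HouseholdMemory:
    """Hypothetical per-household state carried between sessions."""
    scans: dict[str, date] = field(default_factory=dict)           # device -> last scan
    recommendations: dict[str, str] = field(default_factory=dict)  # device -> advice given
    bills: list[tuple[date, float]] = field(default_factory=list)  # (bill date, amount)

    def record_scan(self, device: str, when: date) -> None:
        self.scans[device] = when

    def recommend(self, device: str, advice: str) -> bool:
        """Store advice; return False if this device already has advice on file."""
        if device in self.recommendations:
            return False
        self.recommendations[device] = advice
        return True

    def record_bill(self, when: date, amount: float) -> None:
        self.bills.append((when, amount))
        self.bills.sort(key=lambda b: b[0])  # keep bills in date order

    def bill_change(self) -> Optional[float]:
        """Latest bill minus the previous one; None until two bills exist."""
        if len(self.bills) < 2:
            return None
        return self.bills[-1][1] - self.bills[-2][1]


memory = HouseholdMemory()
memory.record_scan("refrigerator", date(2026, 5, 1))
memory.recommend("refrigerator", "Consider a more efficient model")
memory.record_bill(date(2026, 5, 31), 42000.0)
memory.record_bill(date(2026, 6, 30), 38500.0)
print(memory.bill_change())  # -3500.0
```

The `recommend` check is the "already told you this" memory, and `bill_change` is the follow-up: a negative number means the bill went down.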

That is not a scanning tool. That is an energy advisor that lives on your phone and actually follows up.

What This Means For Developers Building in Africa

I want to say something that the Google I/O coverage mostly missed.

Gemini 3.5 Flash delivers frontier-level capabilities at less than half the price of comparable frontier models.

For developers building in Nigeria, Ghana, Kenya, India, where margins are thin, where users cannot pay Western subscription prices, where the business model has to work at a completely different cost structure, that pricing difference is not a footnote. It is the difference between a viable product and one that cannot sustain itself.

Most of the Google I/O coverage talked about what these models can do. The more important story for developers outside Silicon Valley is what these models now cost to run. Frontier vision capability at accessible pricing means the gap between what developers in Lagos can build and what developers in San Francisco can build just got smaller.

That is the announcement I cared about most.

What I Am Building Next

My dad still checks his electricity bill every month and wonders where the money went.

With Gemini 3.5 Flash's improved multimodal understanding, Gemini Omni's video input, and Antigravity 2.0's agentic memory, I have everything I need to build the version of this tool that completely solves his problem.

Do not scan one appliance at a time. Walk through the house. Let the agent watch. Get the full picture. Follow up next month.

Google I/O 2026 did not just announce new models. For developers who are building real tools for real people in the places the tech industry usually forgets, it has moved the frontier to where we actually are.

The appliance scanning app referenced in this post was built independently for the DEV Earth Day Weekend Challenge. All Google I/O 2026 details are from official Google announcements.