DEV Community

MCP Just Landed on Your Phone: What Google AI Edge Gallery Actually Does

Daniel Nwaneri on May 20, 2026

This is a submission for the Google I/O Writing Challenge. I was already running MCP servers on my desktop, connected to Claude, wired into my ...

leob • May 21 • Edited

Google is pushing things pretty hard - but this is for the AI enthusiasts, my phone is 4 (5? 6? I'm not sure) years old, I'm not gonna buy a new one just to run this stuff :-)

But, the fact that you can run LLMs locally on your phone - that's really impressive ... nice piece of research you did!

Daniel Nwaneri • May 21

Hardware wall is real. I hit it too on my first device ("no eligible devices"). Ran it on a Pixel in the end.

The interesting part isn't the flagship models though. There's a 584MB Gemma3-1B-IT in the list, a 4-bit quantized model, which sets a lower hardware floor. Google's building toward it.
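As a rough sanity check on that figure, here is a back-of-the-envelope estimate. It assumes roughly 1 billion parameters stored at 4 bits each, and ignores anything kept at higher precision (embeddings, tokenizer, metadata), so treat it as a sketch rather than the model's actual layout:

```python
def quantized_size_mb(num_params: float, bits_per_weight: float) -> float:
    """Approximate weight storage in megabytes (1 MB = 10**6 bytes)."""
    return num_params * bits_per_weight / 8 / 1e6

# Assumed parameter count of ~1e9 for a "1B" model.
print(quantized_size_mb(1e9, 4))   # 4-bit weights
print(quantized_size_mb(1e9, 16))  # 16-bit weights, for comparison
```

The 4-bit estimate lands in the same range as the 584MB listed in the app; the remainder is plausibly overhead of the kind ignored above, though that part is a guess.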

What would you actually want an offline on-device agent to do, if the hardware wasn't the constraint?

Said • May 22

To continue where I left off in something I was doing previously. For example, if I was researching X, it would pick up where I left off.

If I was reading or listening to something, it would resume from where I stopped, and all the extra steps needed to get back on topic would be handled by the AI.

The experience would be something like continuing a story with a friend after a few days of not talking.

leob • May 21

Well currently nothing lol, I have no use cases - but just the fact that it's possible to run an LLM on local phone hardware, that's pretty baffling when you come to think about it :-)

Daniel Nwaneri • May 21

"No use cases" is actually the honest position. Most people with one don't know they have it yet.

The baffling part is the right instinct — 584MB doing function calling and calendar writes, offline, on hardware from 2020. That's not a demo. That's the capability existing before the killer app does.

Usually happens in that order...

Syed Ahmer Shah • May 24

Focusing on local, on-device execution through the AI Edge Gallery is a massive win for privacy and latency. It completely changes the game for mobile devs who want to implement MCP without inheriting the cost and lag of constant cloud roundtrips.

Daniel Nwaneri • May 24

The OAuth (WIP) flag complicates the privacy story slightly - authenticated servers have to wait. Public MCP servers are the privacy win today.

S M Tahosin • May 24

Having the Model Context Protocol running locally on Android opens up some wild possibilities. Integrating desktop-level MCP servers directly into a mobile environment means we can finally bypass cloud-API limits for mobile agents. Have you tried chaining it with any local vision tools on the Pixel yet?

Daniel Nwaneri • May 24

Haven't chained it with vision tools. That's still on the list.

Worth knowing though: Ask Image and Agent Skills are separate tiles in the app with separate model instances. Cross-tool chaining isn't built in yet. The orchestration layer to bridge them doesn't exist in the current build.

The "bypass cloud-API limits" framing is also worth unpacking. MCP over Streamable HTTP means the server call still goes out; it's the tool-selection logic that stays local. Not the same as fully air-gapped, which matters depending on what you're trying to bypass.
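To make "the server call still goes out" concrete, here is a minimal sketch of what leaves the device: a JSON-RPC 2.0 `tools/call` request, as MCP defines it. The tool name and arguments are hypothetical, and the session initialization a real client performs first is omitted:

```python
import json

def build_tool_call(request_id: int, tool_name: str, arguments: dict) -> dict:
    """Build a JSON-RPC 2.0 request for MCP's tools/call method."""
    return {
        "jsonrpc": "2.0",
        "id": request_id,
        "method": "tools/call",
        "params": {"name": tool_name, "arguments": arguments},
    }

# Hypothetical tool name and arguments, for illustration only.
payload = build_tool_call(1, "get_weather", {"city": "Lagos"})
body = json.dumps(payload)

# Streamable HTTP sends the request as a POST; the client accepts either
# a plain JSON response or a server-sent event stream.
headers = {
    "Content-Type": "application/json",
    "Accept": "application/json, text/event-stream",
}
print(body)
```

The prompt and conversation history are not part of this payload, only the tool name and its arguments. That is the distinction between local tool selection and a fully air-gapped setup.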

Manuel Bruña • Jun 15

Phone-based MCP is exciting, but the UX for consent has to be sharper than desktop. Sensors, contacts, files, and local apps have very different blast radius. I’d prefer small, revocable scopes over one broad “let the agent use my phone” permission.

Suny Choudhary • May 21

This is an interesting direction because MCP on mobile changes the mental model.

On desktop, MCP usually feels like a developer or workstation layer: files, terminals, repos, browsers, databases, cloud tools. On phones, the context is different. The device has sensors, camera, location, apps, notifications, identity, and a lot of personal data sitting very close to the user.

That makes the edge/local part important. If the model and tool layer can run locally, it opens up useful workflows without sending everything to a cloud model by default.

But it also raises the same question every agent platform eventually hits: what tools should the model actually be allowed to call, and how visible is that to the user?

MCP on phones could be powerful, but the permission model will matter more than the demo.

Daniel Nwaneri • May 21

The permission model question is already showing up in the design. Calendar read and write are split into separate skills — two toggles, not one. That's a deliberate permission boundary baked into the architecture.

But send-email is also in the list. Same interface, same toggle. The model can be enabled to send email offline, with no visible indication to the user when it decides to fire.

The skill-level toggle is the current answer to your question. Whether it's a sufficient answer is a different thing.
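A skill-level toggle can be sketched roughly as below. This is a hypothetical illustration, not how the app implements it: `SkillRegistry`, the skill names, and the audit log are all assumptions. The log is included only to show one way a call could be made visible after the fact:

```python
from dataclasses import dataclass, field

@dataclass
class SkillRegistry:
    """Hypothetical per-skill permission gate with a simple audit trail."""
    enabled: set = field(default_factory=set)
    audit_log: list = field(default_factory=list)

    def set_enabled(self, skill: str, on: bool) -> None:
        # Each skill has its own toggle; enabling one grants nothing else.
        if on:
            self.enabled.add(skill)
        else:
            self.enabled.discard(skill)

    def invoke(self, skill: str, handler, **kwargs):
        # Refuse any call to a skill the user has not switched on.
        if skill not in self.enabled:
            raise PermissionError(f"skill '{skill}' is not enabled")
        self.audit_log.append(skill)  # record every call that fires
        return handler(**kwargs)

registry = SkillRegistry()
registry.set_enabled("calendar-read", True)

events = registry.invoke("calendar-read", lambda: ["standup"])
try:
    registry.invoke("calendar-write", lambda title: title, title="lunch")
except PermissionError as exc:
    print(exc)
```

Read access here does not imply write access, which mirrors the two-toggle split. Whether a toggle plus a log is enough for something like send-email is the open question.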

Max Quimby • May 24

The "tool-selection logic runs on-device, only the structured API call leaves the phone" pattern is the part of this that's most underrated. It flips the usual cloud-LLM trust model: instead of "we trust the model provider with the full conversation context," you're only exposing the specific tool invocation. For enterprise/health/financial agent use cases that's a meaningful difference.

The separate calendar-read and calendar-write toggles also caught my eye. Most MCP server configs I've seen treat permissions at the server level — connect the server, get all its tools. Splitting capabilities into individually-grantable skills is closer to how mobile OS permissions actually work, and probably the right primitive for non-developer end users who shouldn't have to reason about server URLs.

One concern: 32K context with a 2.6 GB model plus tool outputs is going to feel tight for any multi-turn agent flow. Have you tried it on longer conversations to see how quickly it has to start dropping history? That's the realistic ceiling for "useful mobile agent" right now in my experience.

Daniel Nwaneri • May 26

The trust model framing is sharper than what I put in the article. "Exposing only the specific tool invocation" rather than the full conversation context. That's the actual privacy primitive, and it matters for regulated use cases in a way that "on-device = private" doesn't quite capture on its own.

On permissions: the View button per skill adds another layer — users can inspect the full skill definition before enabling it. Not just what it can do, but how it's implemented. Closer to informed consent than typical mobile permissions.

On the context ceiling — haven't run multi-turn inference yet (footer says so). But the concern is architecturally right. Tool outputs are verbose: Wikipedia summaries, calendar dumps, MCP response payloads eat tokens faster than conversation turns do. The 32K number behaves smaller than it looks in practice.
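A rough sketch of why tool outputs dominate the budget, and of the simplest trimming strategy (drop the oldest turns first). The 4-characters-per-token estimate is a crude assumption, not the model's tokenizer, and the sample history is made up:

```python
def estimate_tokens(text: str) -> int:
    """Crude estimate: about 4 characters per token (an assumption)."""
    return max(1, len(text) // 4)

def trim_history(messages: list, budget: int) -> list:
    """Keep the most recent messages whose estimated tokens fit the budget."""
    kept = []
    used = 0
    for msg in reversed(messages):
        cost = estimate_tokens(msg)
        if used + cost > budget:
            break  # everything older than this point is dropped
        kept.append(msg)
        used += cost
    kept.reverse()  # restore chronological order
    return kept

history = [
    "user: what's on my calendar?",       # short conversational turn
    "tool: " + "event details " * 2000,   # verbose tool output
    "user: move the 3pm meeting to 4pm",  # short conversational turn
]
print([estimate_tokens(m) for m in history])
print(len(trim_history(history, budget=4000)))
```

In this made-up history the single tool result costs hundreds of times more than either user turn, so a tight budget drops it and everything before it.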

Benjamin Nguyen • May 21

Neat! I have Gemini on my phone already.

Melvin Great • May 25

Nice research you did there. It was an interesting read.