DEV Community

Amitha Mahesh

I Finally Understand Why Mobile Tests Keep Breaking — Thanks to This Article by Jay Saadana

I've always wondered why mobile test automation feels so fragile.
You change one small thing in the UI and suddenly everything breaks
— even though the app itself is completely fine.

I got my answer after reading Jay Saadana's article on Vision Language
Models in mobile app testing.

The thing that clicked for me

We've been testing visual products by reading the invisible XML structure
underneath. The test has no idea what the screen looks like. It just
tracks element IDs and the code hierarchy.

So when a developer moves a button or renames a component, the test
panics — even though a real human looking at the screen wouldn't notice
anything wrong.
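
Here's a tiny sketch of that failure mode as I understand it. It's a toy Python model, not a real automation framework's API: the screen dictionaries, the IDs (`btn_login`, `auth_submit`) and the `find_by_id` helper are all made up for illustration.

```python
# Toy model of a UI hierarchy: each node carries an internal ID and the
# text a user actually sees. All names here are hypothetical.
screen_v1 = [
    {"resource_id": "btn_login", "type": "Button", "text": "Log in"},
    {"resource_id": "input_email", "type": "EditText", "text": ""},
]

# Same screen after a refactor: the ID changed, the visible button did not.
screen_v2 = [
    {"resource_id": "auth_submit", "type": "Button", "text": "Log in"},
    {"resource_id": "input_email", "type": "EditText", "text": ""},
]


def find_by_id(screen, resource_id):
    """Return the node with the given internal ID, or None if it is missing."""
    for node in screen:
        if node["resource_id"] == resource_id:
            return node
    return None


found_before = find_by_id(screen_v1, "btn_login")
found_after = find_by_id(screen_v2, "btn_login")

print(found_before is not None)  # True
print(found_after is not None)   # False
```

The button still says "Log in" in both versions, so a person sees no difference. Only the lookup by internal ID fails.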

Jay puts it really well in the article — we treated apps like collections
of XML nodes instead of visual interfaces built for human eyes. That one
line genuinely reframed how I was thinking about this.

Where VLMs come in

Vision Language Models fix this by looking at the screen the way a human
does. Instead of hunting for element IDs, the AI looks at a screenshot
and understands — that's a login button, that's a text field, that's a
navigation menu — purely from visual context.

So when you write a test like "tap the login button", it finds it
visually. Move the button, rename it, redesign the whole screen — the
test still works because it's looking at what's visible.
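
To make that concrete for myself, I sketched what "find it visually" could look like. Big caveat: there is no real model here. `describe_screen_stub` is a hypothetical stand-in that returns hardcoded detections where a VLM would analyze a screenshot, and `VisibleElement` and `find_tap_target` are names I invented. The string matching is also far cruder than what a real model does.

```python
from dataclasses import dataclass


@dataclass
class VisibleElement:
    """What a vision model might report for one on-screen element (hypothetical shape)."""
    role: str      # e.g. "button", "text field"
    label: str     # text visible on the element
    center: tuple  # (x, y) tap coordinates in pixels


def describe_screen_stub(layout_name):
    """Stand-in for a VLM call: returns hardcoded detections per layout."""
    layouts = {
        "old": [
            VisibleElement("button", "Log in", (540, 1800)),
            VisibleElement("text field", "Email", (540, 900)),
        ],
        "redesigned": [
            VisibleElement("text field", "Email", (540, 600)),
            VisibleElement("button", "Log in", (540, 1000)),
        ],
    }
    return layouts[layout_name]


def find_tap_target(instruction, elements):
    """Pick the first element whose role and label both appear in the instruction."""
    # Drop spaces so a label like "Log in" matches "login" in the instruction.
    text = instruction.lower().replace(" ", "")
    for el in elements:
        role = el.role.lower().replace(" ", "")
        label = el.label.lower().replace(" ", "")
        if role in text and label in text:
            return el
    return None


step = "tap the login button"
old_target = find_tap_target(step, describe_screen_stub("old"))
new_target = find_tap_target(step, describe_screen_stub("redesigned"))

print(old_target.center)  # (540, 1800)
print(new_target.center)  # (540, 1000)
```

The same plain-English step resolves on both layouts even though the button moved, because nothing in the step refers to an internal ID.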

A few numbers from the article that stuck with me:

  • 9% higher code coverage compared to traditional methods
  • 29 new bugs found in Google Play apps that existing tools completely missed
  • Tests written in plain English — no automation expertise needed

That last one is what got me. Plain English test instructions mean
testing becomes something the whole team can contribute to, not just
the person who knows the framework inside out.

What I'm taking away from this

I came into this article thinking flaky tests were just a tooling problem.
I'm leaving it thinking it was always a conceptual problem — we were
never testing what users actually see.

VLMs are the first approach that actually fixes the root cause instead
of patching around it.

Big thanks to Jay for writing this so clearly. If you're into AI, mobile
development, or just curious about where software testing is heading —
give it a read.
Worth your time.


Curious — have any of you run into the flaky test problem before?
How did you deal with it? Drop it below!

@jaysaadana
