Reading through the comments, a lot of folks have rightly pointed out the maintenance nightmare of XPaths and locator drift. But coming from a systems and backend infrastructure perspective, my biggest takeaway is the architectural decoupling and what that means for CI/CD compute overhead.
Appium’s W3C WebDriver client-server model is incredibly chatty. Every single interaction (find_element, click, send_keys) is an HTTP network hop:
Client Script -> Node Server -> Native Driver -> Device
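To make the chattiness concrete, here is a rough sketch that expands a three-step login interaction into the W3C WebDriver requests a client would send. The endpoint paths follow the W3C spec; the session ID, the element IDs, and the 40 ms per-hop figure are made-up values for illustration, not measurements.

```python
from dataclasses import dataclass

# Hypothetical per-request round-trip cost; an assumption, not a measured value.
ASSUMED_RTT_MS = 40


@dataclass(frozen=True)
class HttpCall:
    method: str
    path: str


def w3c_calls(session_id: str, steps: list[tuple[str, str]]) -> list[HttpCall]:
    """Expand (action, element_id) steps into the HTTP requests a client sends.

    Each step costs one find_element request plus one request for the action.
    """
    base = f"/session/{session_id}"
    calls = []
    for action, element_id in steps:
        # find_element -> POST /session/{session id}/element
        calls.append(HttpCall("POST", f"{base}/element"))
        if action == "click":
            calls.append(HttpCall("POST", f"{base}/element/{element_id}/click"))
        elif action == "send_keys":
            calls.append(HttpCall("POST", f"{base}/element/{element_id}/value"))
        else:
            raise ValueError(f"unsupported action: {action}")
    return calls


# Type a username, type a password, tap the login button.
steps = [("send_keys", "e1"), ("send_keys", "e2"), ("click", "e3")]
calls = w3c_calls("abc123", steps)
for call in calls:
    print(call.method, call.path)
print(f"{len(calls)} round trips, ~{len(calls) * ASSUMED_RTT_MS} ms before any device work")
```

Three user-level actions turn into six requests, and that is before the server forwards anything to the native driver.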
In an ephemeral CI/CD runner, every XPath lookup or page-source call makes the driver serialize the UI hierarchy into a large XML tree, and every command pays the full round trip on top of that. The result is added latency, flaky timing, and the occasional socket timeout. Mobile UIs are asynchronous anyway, but this overhead is a big part of why we end up littering code with WebDriverWait.
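For anyone who hasn't looked at what an explicit wait does, it is basically a polling loop. Below is a stdlib-only sketch of that pattern, not Selenium's actual implementation; `wait_until` and `fake_find_login_button` are hypothetical names, and the fake lookup stands in for a real `find_element` call.

```python
import time


def wait_until(condition, timeout=10.0, poll_interval=0.5):
    """Poll `condition` until it returns a truthy value or `timeout` elapses.

    Returns (result, attempts). Each attempt stands in for one remote lookup.
    """
    deadline = time.monotonic() + timeout
    attempts = 0
    while True:
        attempts += 1
        result = condition()
        if result:
            return result, attempts
        if time.monotonic() >= deadline:
            raise TimeoutError(f"condition not met after {attempts} attempts")
        time.sleep(poll_interval)


# Fake element lookup that only succeeds on the third call,
# standing in for a screen that is still rendering.
state = {"calls": 0}


def fake_find_login_button():
    state["calls"] += 1
    return "login-button" if state["calls"] >= 3 else None


element, attempts = wait_until(fake_find_login_button, timeout=2.0, poll_interval=0.01)
print(element, attempts)
```

Against a real server, every failed poll is another full request through the client, server, and driver chain, so slow lookups and waits compound each other.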
The context-switching tax is insane, too. Having to break out of your IDE, boot up a GUI inspector, visually hunt through a deeply nested XML tree, validate an XPath, and paste it back into your script completely destroys a developer's flow state. The fact that Drizz eliminates that entire 6-step inspection loop is a massive win for DevEx (Developer Experience) alone.
By shifting to a Vision AI approach, you aren't just eliminating locators; you are bypassing that entire heavy XML serialization pipeline. Treating the UI as a visual black box rather than a structural tree should, in principle, remove the hierarchy-parsing bottleneck that slows Appium tests down, though it swaps that cost for screenshot capture and model inference.
One question this raises for me on the infrastructure side:
Since Drizz relies on processing visuals instead of the UI's structure, what kind of servers do we need to run these tests? Does the AI require expensive, GPU-powered servers to run fast, and does the cost of renting those servers cancel out the money saved by not having engineers fix broken tests all day?
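The way I would frame that trade-off is a simple break-even calculation. Every number below is a placeholder I invented; I don't know Drizz's actual pricing, or whether inference even runs on our own runners rather than a hosted service. `net_monthly_savings` is a hypothetical helper, not part of any tool.

```python
def net_monthly_savings(
    gpu_cost_per_hour: float,
    cpu_cost_per_hour: float,
    ci_hours_per_month: float,
    engineer_cost_per_hour: float,
    maintenance_hours_saved: float,
) -> float:
    """Return net monthly savings; positive means the vision approach pays off."""
    # Extra spend from running CI on GPU runners instead of CPU runners.
    extra_compute = (gpu_cost_per_hour - cpu_cost_per_hour) * ci_hours_per_month
    # Engineer time no longer spent repairing broken locators.
    labor_saved = engineer_cost_per_hour * maintenance_hours_saved
    return labor_saved - extra_compute


# All inputs are illustrative assumptions, not real prices.
net = net_monthly_savings(
    gpu_cost_per_hour=1.50,
    cpu_cost_per_hour=0.10,
    ci_hours_per_month=400,
    engineer_cost_per_hour=75.0,
    maintenance_hours_saved=20,
)
print(f"net monthly savings: ${net:,.2f}")
```

With these invented inputs the labor savings win comfortably, but the answer flips quickly if CI hours are high or locator maintenance was never that expensive to begin with, which is why I'd love to see real numbers.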
Brilliant deep dive into the limitations of structural test coupling. It definitely makes you rethink how we architect our testing infrastructure!