DEV Community

Okeke Chukwudubem

Posted on

Day 7 of building an AI agent that controls a phone.

OCR is fast now. But the agent still can't see icons: the send button, the camera, the paperclip. I'm adding template matching with OpenCV to give it visual recognition for image-only UI elements.
