DEV Community

Okeke Chukwudubem
Okeke Chukwudubem

Posted on

Day 3 of building an AI agent that controls a phone.

The agent can now see. Tesseract OCR is running locally in Termux. Screenshots text extraction coordinate mapping.

Tested on WhatsApp. It found a contact by reading the screen. All offline.

Sign in to view linked content

Top comments (0)