Samar Shetye

Stop Copy-Pasting from Images: Build a Universal Screen Translator with Python

Lingo-Live started with a frustration I’m sure you’ve felt too.

Have you ever tried copying text from a YouTube video?
Or translating a Japanese error message inside a game?

Yeah. You can’t.
Because it’s not text — it’s just pixels.

Most of us end up doing one of two things:

  • painfully typing everything by hand, or
  • pulling out our phones and using Google Lens, holding it up to the screen like it’s 2010.

It’s clunky. It breaks focus. And honestly, we can do better.

So I built Lingo-Live — a sleek desktop app that lets you translate anything you see on your screen instantly.

The Superpower We Wanted

I didn’t want just another translation app. I wanted something that felt like a superpower.

That meant it had to be:

  • Invisible – runs quietly in the background
  • Instant – hit a hotkey, select an area, get a translation
  • Modern – glassy UI, dark mode, blur effects, no Windows-95 vibes

Press Ctrl + Alt + T, drag over any part of your screen, and boom — translated text appears on top of whatever you’re doing.

The Secret Sauce: How Lingo-Live Works

Python made this possible. It’s basically a Swiss Army knife for building tools like this.

Here’s how everything comes together.

1. The “Glass” Overlay

The trickiest part was creating a window that stays on top without being annoying.

I used CustomTkinter to build a frameless, translucent overlay that feels light and modern.

Key details:

  • Always on top so translations stay visible
  • Semi-transparent so you can still see context underneath
  • Frameless — no ugly title bar; custom drag-and-drop instead

The result feels less like an app and more like a layer on your desktop.
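
Here is a minimal sketch of those three properties, assuming CustomTkinter's `CTk` window accepts the standard Tk window-manager calls it inherits from `tkinter.Tk`. The helper names (`dragged_origin`, `build_overlay`), the 0.85 opacity, and the window size are illustrative choices, not values taken from Lingo-Live, and frameless or translucent windows can behave differently across platforms.

```python
def dragged_origin(pointer_root, grab_offset):
    """Return the window's new top-left corner while it is being dragged.

    pointer_root: current pointer position in screen coordinates.
    grab_offset:  pointer position relative to the window's top-left corner
                  at the moment the mouse button went down.
    """
    return (pointer_root[0] - grab_offset[0], pointer_root[1] - grab_offset[1])


def build_overlay(text="Translation goes here"):
    """Create a frameless, translucent, always-on-top window (needs a display)."""
    import customtkinter as ctk  # third-party: pip install customtkinter

    root = ctk.CTk()
    root.geometry("420x120+200+200")
    root.overrideredirect(True)        # frameless: no title bar or borders
    root.attributes("-topmost", True)  # stay above other windows
    root.attributes("-alpha", 0.85)    # semi-transparent (platform dependent)

    label = ctk.CTkLabel(root, text=text)
    label.pack(expand=True, fill="both", padx=16, pady=16)

    grab = {"offset": (0, 0)}

    def on_press(event):
        # Remember where inside the window the drag started.
        grab["offset"] = (event.x_root - root.winfo_x(),
                          event.y_root - root.winfo_y())

    def on_drag(event):
        x, y = dragged_origin((event.x_root, event.y_root), grab["offset"])
        root.geometry(f"+{x}+{y}")

    # With no title bar, the window has to move (and close) itself.
    label.bind("<ButtonPress-1>", on_press)
    label.bind("<B1-Motion>", on_drag)
    label.bind("<Double-Button-1>", lambda _event: root.destroy())
    return root


if __name__ == "__main__":
    build_overlay().mainloop()
```

Keeping the drag arithmetic in its own function makes it easy to check without opening a window; double-clicking the label closes the overlay in this sketch.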

2. The Eyes (OCR)

When you trigger the hotkey, Lingo-Live doesn’t try to “read the screen.”

Instead, it:

  • Lets you select a region
  • Takes a screenshot of just that area
  • Sends it to Tesseract OCR to extract text from the pixels

Conceptually, it looks like this:

from PIL import ImageGrab  # Pillow

screenshot = ImageGrab.grab(bbox=(x1, y1, x2, y2))  # capture only the selected region
text = ocr_engine.extract_text(screenshot)  # ocr_engine: conceptual wrapper around Tesseract

That’s where the magic starts — turning images into actual text.
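
One detail the snippet above glosses over: a drag can go in any direction, so the two corners have to be ordered before they become a `bbox`. Here is a minimal sketch, assuming Pillow and the `pytesseract` wrapper are installed alongside a Tesseract binary with the relevant language data. `normalize_bbox` and `read_region` are hypothetical helper names, and Lingo-Live's own OCR wrapper may look different.

```python
def normalize_bbox(start, end):
    """Order two drag corners into (left, top, right, bottom)."""
    (x1, y1), (x2, y2) = start, end
    return (min(x1, x2), min(y1, y2), max(x1, x2), max(y1, y2))


def read_region(start, end, lang="eng"):
    """Screenshot the dragged region and run Tesseract OCR on it."""
    from PIL import ImageGrab  # third-party: pip install pillow
    import pytesseract         # third-party: pip install pytesseract

    bbox = normalize_bbox(start, end)
    if bbox[0] == bbox[2] or bbox[1] == bbox[3]:
        return ""  # zero-area selection (a click without a drag)
    screenshot = ImageGrab.grab(bbox=bbox)
    return pytesseract.image_to_string(screenshot, lang=lang).strip()
```

For example, `read_region((300, 200), (100, 50), lang="jpn")` would capture the same rectangle as a top-left to bottom-right drag and read it as Japanese, provided the `jpn` language data is installed.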

3. The Brain (Translation)

Once OCR gives us something like こんにちは, we need a translation that actually makes sense.

This is where Lingo.dev comes in.

Instead of raw dictionary swaps, it handles context properly, which makes a huge difference — especially for UI text, error messages, and game dialogue.

The result feels natural, not robotic.

4. The Voice (Text-to-Speech)

Sometimes you don’t want to read. You just want to hear it.

So I added Edge TTS, which uses the same high-quality voices found in Microsoft Edge.

Now Lingo-Live can read translations out loud — great for pronunciation or just staying hands-free.

“Fish are vertebrate animals that live in water…”
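
As a rough sketch of the read-aloud step, the `edge-tts` package can synthesize a string to an MP3 file. The voice table, the `pick_voice` and `speak_to_file` helpers, and the output filename are assumptions for illustration; the article does not say which voices Lingo-Live uses or how it plays the audio.

```python
import asyncio

# Assumed language-to-voice mapping; the voices Lingo-Live actually uses are not stated.
VOICES = {
    "en": "en-US-AriaNeural",
    "ja": "ja-JP-NanamiNeural",
    "es": "es-ES-ElviraNeural",
}
DEFAULT_VOICE = VOICES["en"]


def pick_voice(lang_code):
    """Map a language code such as 'ja' or 'ja-JP' to a voice name."""
    return VOICES.get(lang_code.split("-")[0].lower(), DEFAULT_VOICE)


async def speak_to_file(text, lang_code, out_path="translation.mp3"):
    """Synthesize text with Edge TTS and save it as an MP3 (needs network access)."""
    import edge_tts  # third-party: pip install edge-tts

    communicate = edge_tts.Communicate(text, pick_voice(lang_code))
    await communicate.save(out_path)
    return out_path


if __name__ == "__main__":
    asyncio.run(speak_to_file("Fish are vertebrate animals that live in water", "en"))
```

Unknown language codes fall back to the English voice here, so a bad code degrades to a wrong accent instead of an error.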

5. Leveling Up: AI Summarization

Full translations are great, but sometimes you just want the gist.

So I added a Summarize button powered by Google Gemini.

Here’s what happens:

  • The translated text is sent to Gemini
  • It returns a clean, one-sentence summary
  • You get the point instantly

Perfect for skimming foreign articles, long error messages, or RPG dialogue dumps.
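
The steps above can be sketched with Google's `google-genai` SDK. Treat this as one possible shape, not the app's actual code: the prompt wording, the model name, and the `GEMINI_API_KEY` environment variable are assumptions, and Lingo-Live may call Gemini through a different client.

```python
import os


def build_summary_prompt(translated_text):
    """Wrap the translated text in a one-sentence summarization instruction."""
    return (
        "Summarize the following text in exactly one sentence. "
        "Reply with the summary only.\n\n"
        f"{translated_text.strip()}"
    )


def summarize(translated_text, model="gemini-2.0-flash"):
    """Ask Gemini for a one-sentence summary (needs network access and an API key)."""
    from google import genai  # third-party: pip install google-genai

    client = genai.Client(api_key=os.environ["GEMINI_API_KEY"])
    response = client.models.generate_content(
        model=model,
        contents=build_summary_prompt(translated_text),
    )
    # response.text can be None if the model returns no text part.
    return (response.text or "").strip()
```

Building the prompt in a separate function keeps the instruction easy to tweak and to check without spending API calls.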

6. Make It Yours: Settings That Actually Matter

I didn’t want Lingo-Live to feel rigid, so I built a full settings system backed by JSON.

You can:

  • Change the hotkey (Alt + Z? Sure.)
  • Switch themes (dark mode is the correct choice)
  • Pick different fonts (Roboto > Segoe UI, fight me)

Best part?
All changes apply instantly — no restarts, no reloads.
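
A JSON-backed settings layer with instant updates can be quite small. The sketch below assumes a flat settings file and an `on_change` callback that the UI would use to re-apply a value right away. The key names, the defaults other than the Ctrl + Alt + T hotkey, and both helper functions are hypothetical.

```python
import json
from pathlib import Path

# Assumed defaults; only the Ctrl + Alt + T hotkey comes from the article.
DEFAULTS = {"hotkey": "ctrl+alt+t", "theme": "dark", "font": "Roboto"}


def load_settings(path):
    """Read settings from disk, falling back to defaults for missing or bad data."""
    try:
        stored = json.loads(Path(path).read_text(encoding="utf-8"))
    except (FileNotFoundError, json.JSONDecodeError):
        stored = {}
    if not isinstance(stored, dict):
        stored = {}
    return {**DEFAULTS, **stored}


def update_setting(path, key, value, on_change=None):
    """Persist one setting and notify the UI so it can re-apply it immediately."""
    settings = load_settings(path)
    settings[key] = value
    Path(path).write_text(json.dumps(settings, indent=2), encoding="utf-8")
    if on_change is not None:
        on_change(key, value)
    return settings
```

Merging stored values over the defaults means a settings file written by an older version still loads after new keys are added.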

Conclusion

Building your own tools is one of the most satisfying parts of being a developer.

Lingo-Live solves a problem I run into constantly: text that’s trapped inside images, videos, and games. Instead of working around it, I built something that feels fast, modern, and genuinely useful.

If you’ve ever rage-typed a foreign error message at 2 AM, this app is for you.

Lingo.dev makes localization feel effortless—turning a painful, error-prone task into a smooth, developer-friendly experience.

Check out the repo at https://github.com/Samar-365/lingo_live, clone the code, and stop copy-pasting from pixels.

Special thanks to @sumitsaurabh927 and @maxprilutskiy for their continuous guidance throughout the hackathon and for giving us this great opportunity.

Happy coding!
