Sambhav Tawar

Posted on Jun 27

How I Built a Screenshot Text Editor

#productivity #javascript #security #software

I can't count how many times I've been writing documentation, preparing a tutorial, or submitting a bug report and found myself staring at a screenshot that needed a small fix. A typo in the UI mockup. An API key that has to be blurred. An arrow pointing to the one button nobody seems to notice. Simple things, yet somehow always a hassle.

The tools I had were either too much or too little. Desktop apps like Photoshop gave me endless layers and filters I didn't need—I just wanted to swap a few words. Free online tools? They came with watermarks, forced sign-ups, or asked me to upload sensitive images to some unknown server. And not a single one of them could do the thing I genuinely craved: replace text inside a screenshot and have it look exactly like the original.

The space nobody was filling

Screenshot tools live in a strange, fragmented world. On one side you have heavyweight desktop apps like Snagit and Photoshop: powerful, but expensive, install-heavy, and overkill for a quick edit. On the other side, lightweight capture tools like Lightshot or Greenshot let you annotate while you capture, but the moment you save and close, that annotation is baked in. No way to tweak it later.

Then there’s the web-based middle ground. Most of these tools fall into two camps: simple annotators (arrows, text boxes, cropping) or full-blown photo editors that sort-of-handle screenshots. Neither addresses the specific, frustrating problem of editing existing text inside a screenshot with real visual fidelity.

That’s exactly where Screenshot Editor Online lives. It’s not a general-purpose photo editor. It’s not yet another capture tool. It’s a screenshot text editor built for that moment when you already have a screenshot, you need to change the words, and you need the result to look seamless.

What it actually does

Replace text, keep the look

This is the whole reason the tool exists. You upload a screenshot. The tool uses OCR to find every text element. Click on a word or phrase, and it analyzes the pixels in that area to estimate the font family, size, color, weight, and style (bold, italic, serif or sans-serif). You type your new text, and it renders with those same properties.
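
To make that last step concrete, here is a minimal sketch of how estimated traits could be mapped onto a Fabric.js text object. The `toFabricTextOptions` helper, the `box` and `style` shapes, and the fallback font families are assumptions made for illustration, not the tool's actual code. The option names themselves (`fontFamily`, `fontSize`, `fill`, `fontWeight`, `fontStyle`) are standard Fabric.js text properties.

```javascript
// Hypothetical helper: turn an estimated style into options for a Fabric.js text object.
// The `box` and `style` shapes are assumptions made for this sketch.
function toFabricTextOptions(box, style) {
  return {
    left: box.x0,
    top: box.y0,
    // Only the broad family (serif or sans-serif) is estimated, so use a common fallback.
    fontFamily: style.serif ? "Georgia" : "Arial",
    fontSize: style.fontSize,
    fill: style.color,
    fontWeight: style.bold ? "bold" : "normal",
    fontStyle: style.italic ? "italic" : "normal",
  };
}

const options = toFabricTextOptions(
  { x0: 40, y0: 12, x1: 180, y1: 30 },
  { serif: false, fontSize: 16, color: "rgb(33, 37, 41)", bold: true, italic: false }
);
console.log(options);

// In the browser, with Fabric.js loaded and a fabric.Canvas named `canvas`,
// the replacement text could then be placed with:
//   canvas.add(new fabric.IText("New label", options));
```

Keeping the mapping in a plain function makes it easy to check without a canvas, and the original text would still need to be covered with the background color before the new text is drawn on top.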

Some examples: [images: a screenshot before and after text replacement]

All the annotation tools you expect

Beyond the text magic, there’s a full annotation toolkit. You can draw arrows and shapes, add highlights, blur or pixelate sensitive stuff (emails, phone numbers, API keys), crop, resize, and scribble with a freehand pen. Everything you need to make a screenshot communicate clearly without leaving the browser.
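
As an illustration of the pixelate option, here is a minimal sketch that averages each block of pixels in a rectangle. It works on a raw RGBA array with the same layout as `ImageData.data`. The function name, the `region` shape, and the block size are assumptions for this example, not the tool's actual implementation.

```javascript
// Pixelate a rectangle of RGBA pixel data in place.
// `data` uses the ImageData.data layout: 4 bytes (R, G, B, A) per pixel, row by row.
// Assumes blockSize >= 1.
function pixelateRegion(data, width, height, region, blockSize) {
  const xStart = Math.max(region.x, 0);
  const yStart = Math.max(region.y, 0);
  const xEnd = Math.min(region.x + region.w, width);
  const yEnd = Math.min(region.y + region.h, height);

  for (let by = yStart; by < yEnd; by += blockSize) {
    for (let bx = xStart; bx < xEnd; bx += blockSize) {
      const bw = Math.min(blockSize, xEnd - bx);
      const bh = Math.min(blockSize, yEnd - by);

      // Average the block's color, channel by channel.
      const sum = [0, 0, 0, 0];
      for (let y = by; y < by + bh; y++) {
        for (let x = bx; x < bx + bw; x++) {
          const i = (y * width + x) * 4;
          for (let c = 0; c < 4; c++) sum[c] += data[i + c];
        }
      }
      const avg = sum.map((s) => Math.round(s / (bw * bh)));

      // Paint every pixel in the block with that average.
      for (let y = by; y < by + bh; y++) {
        for (let x = bx; x < bx + bw; x++) {
          const i = (y * width + x) * 4;
          for (let c = 0; c < 4; c++) data[i + c] = avg[c];
        }
      }
    }
  }
  return data;
}

// Demo: a 2x2 grayscale image collapsed into a single block.
const gray = (v) => [v, v, v, 255];
const pixels = Uint8ClampedArray.from([gray(0), gray(100), gray(200), gray(100)].flat());
pixelateRegion(pixels, 2, 2, { x: 0, y: 0, w: 2, h: 2 }, 2);
console.log(Array.from(pixels.slice(0, 4))); // [100, 100, 100, 255]
```

In the browser, the array would typically come from `ctx.getImageData(...)` and be written back with `ctx.putImageData(...)`.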

No friction, no catch

There’s no account to create, no watermark slapped on the output, no artificial edit limits. You drag your image in, do your work, and download the result. It handles PNG, JPG, WEBP, and BMP.

Privacy by default

Everything happens locally in your browser. Your image never leaves your machine. For developers working with internal mockups or customer data, that’s not a luxury; it’s a necessity.

Under the hood

The stack

Three core technologies make this possible:

Fabric.js – gives me an object model on top of the HTML5 canvas, so I can treat shapes, text, and images as moveable, editable objects without losing my mind.

Tesseract.js – a browser-based port of the Tesseract OCR engine. It handles text detection entirely client-side, so there’s no need for a server.

Vanilla JavaScript – no frameworks, no dependencies beyond what’s essential. Plain HTML, CSS, and JS keep things fast and the bundle tiny.

The really tough bit: matching fonts

Tesseract.js tells me where text lives and what it thinks it says, but it doesn’t whisper a word about the font. I had to build custom logic that:
Upscales the image 3x before OCR – a simple trick that dramatically improves recognition of small text.
Uses page segmentation mode 11 (sparse text) to pick up words even in complex, non-linear layouts.
Filters out low-confidence detections – anything below 15% confidence gets tossed so you’re not editing nonsense.
Analyzes the pixels inside each text bounding box to estimate visual traits: stroke width (bold), slant (italic), and whether the font has serifs.
Pulls the dominant color straight from the text region for perfect color matching.
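
The confidence filter, plus one step the list implies, can be sketched as follows. Because OCR runs on the 3x upscaled image, the bounding boxes it returns are in upscaled coordinates and need to be divided by 3 before they line up with the original screenshot; that mapping is an inference from the steps above, and the `cleanWords` helper is hypothetical. The word shape (`text`, `confidence` on a 0 to 100 scale, `bbox` with `x0`, `y0`, `x1`, `y1`) follows what Tesseract.js reports for recognized words.

```javascript
const UPSCALE = 3;         // upscale factor mentioned in the article
const MIN_CONFIDENCE = 15; // Tesseract reports confidence on a 0-100 scale

// Hypothetical helper: drop low-confidence words and map their boxes
// from upscaled-image coordinates back to original-image coordinates.
function cleanWords(words, scale = UPSCALE, minConfidence = MIN_CONFIDENCE) {
  return words
    .filter((w) => w.confidence >= minConfidence)
    .map((w) => ({
      text: w.text,
      confidence: w.confidence,
      bbox: {
        // Round outward so the box never clips the glyphs.
        x0: Math.floor(w.bbox.x0 / scale),
        y0: Math.floor(w.bbox.y0 / scale),
        x1: Math.ceil(w.bbox.x1 / scale),
        y1: Math.ceil(w.bbox.y1 / scale),
      },
    }));
}

// Sample input shaped like Tesseract.js word results (values invented for the demo).
const rawWords = [
  { text: "Save", confidence: 91, bbox: { x0: 120, y0: 30, x1: 240, y1: 75 } },
  { text: "~", confidence: 9, bbox: { x0: 300, y0: 31, x1: 310, y1: 40 } },
];
const words = cleanWords(rawWords);
console.log(words);

// Sparse-text mode is a Tesseract parameter. In Tesseract.js it can be set with
//   await worker.setParameters({ tessedit_pageseg_mode: Tesseract.PSM.SPARSE_TEXT });
```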

The trade-offs of local OCR

Keeping everything in the browser is a privacy win, but it’s not free. Tesseract.js is slower than a cloud service, and its accuracy isn’t quite as high. The 3x upscale helps a lot, but it adds processing time. For clean, readable UI screenshots—the vast majority of real-world use cases—the results are solid and fast enough.
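
The cost of upscaling is easy to underestimate: scaling both dimensions by 3 multiplies the number of pixels by 9. A small sketch of the arithmetic, with a hypothetical `upscaleCost` helper and a 1920x1080 screenshot as the example input:

```javascript
// Work out how much pixel data an upscale produces.
function upscaleCost(width, height, factor) {
  const w = width * factor;
  const h = height * factor;
  return {
    width: w,
    height: h,
    pixels: w * h,
    growth: (w * h) / (width * height), // equals factor squared
    rgbaBytes: w * h * 4,               // 4 bytes per pixel in canvas ImageData
  };
}

const cost = upscaleCost(1920, 1080, 3);
console.log(cost.width, cost.height); // 5760 3240
console.log(cost.pixels, cost.growth); // 18662400 9
```

This is why cropping to the area of interest before OCR, where that is possible, can matter more than any other tweak.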

Why build this now?

The technical pieces have been around for years. Tesseract.js dates back to the mid-2010s. Fabric.js even longer. So why hasn’t this tool existed before?

Two reasons. First, combining robust in-browser OCR with on-the-fly font matching is surprisingly tricky—most developers naturally reach for a server to do the heavy lifting. Second, the established players have been desktop apps or subscription-based SaaS products. The idea of a free, browser-first tool that obsesses over text replacement as the primary feature simply hadn’t found its champion yet.

What I learned along the way

OCR isn’t magic. Tesseract.js does a great job on clean, high-contrast text, but stylized fonts, low-res images, and messy backgrounds still trip it up. Upscaling helps, but it’s not a cure-all.

Font estimation is guesswork dressed in math. You can spot bold or italic from pixels. You cannot reliably distinguish Arial from Helvetica from Open Sans without a much larger contextual model. So, I focused on what matters: reproducing the visual weight, size, color, and slant. The exact font name is a nice-to-have; the visual match is a must.
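
To show what "guesswork dressed in math" can look like, here is a minimal heuristic sketch, not the tool's actual algorithm. It assumes the most common color in a text box is the background, treats pixels far from that color as ink, takes the most common ink color as the text color, and uses the share of ink pixels as a rough proxy for weight. The function names, the 4-bit color bucketing, and the threshold of 96 are all assumptions for this example.

```javascript
// Find the most common color among pixels accepted by `keep`.
// Colors are bucketed to 4 bits per channel so near-identical shades count together.
function dominantColor(data, keep) {
  const buckets = new Map();
  for (let i = 0; i < data.length; i += 4) {
    const r = data[i], g = data[i + 1], b = data[i + 2];
    if (!keep(r, g, b)) continue;
    const key = ((r >> 4) << 8) | ((g >> 4) << 4) | (b >> 4);
    const e = buckets.get(key) || { n: 0, r: 0, g: 0, b: 0 };
    e.n++; e.r += r; e.g += g; e.b += b;
    buckets.set(key, e);
  }
  let best = null;
  for (const e of buckets.values()) if (!best || e.n > best.n) best = e;
  if (!best) return null;
  // Report the average color of the winning bucket.
  return {
    r: Math.round(best.r / best.n),
    g: Math.round(best.g / best.n),
    b: Math.round(best.b / best.n),
  };
}

// Estimate background color, text color, and ink share for one text box.
// `data` is RGBA pixel data cropped to the box; `threshold` is an assumed cutoff.
function analyzeTextRegion(data, threshold = 96) {
  const background = dominantColor(data, () => true);
  if (!background) return null; // empty region
  const isInk = (r, g, b) =>
    Math.abs(r - background.r) + Math.abs(g - background.g) + Math.abs(b - background.b) > threshold;
  let ink = 0;
  for (let i = 0; i < data.length; i += 4) {
    if (isInk(data[i], data[i + 1], data[i + 2])) ink++;
  }
  return {
    background,
    text: dominantColor(data, isInk), // null when no ink pixels are found
    inkRatio: ink / (data.length / 4),
  };
}

// Demo: 8 pixels, 6 white background and 2 dark text pixels.
const W = [255, 255, 255, 255];
const K = [20, 20, 20, 255];
const region = Uint8ClampedArray.from([W, K, K, W, W, W, W, W].flat());
const traits = analyzeTextRegion(region);
console.log(traits);
```

A higher `inkRatio` at the same font size suggests heavier strokes, but the cutoff between regular and bold would have to be calibrated against real samples.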

Perceived performance matters. OCR on a chunky image can take several seconds. If users don’t see that something is happening, they assume the tool is broken. Keeping the UI responsive and communicative during processing was non-negotiable.
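
One way to keep users informed is to surface the progress messages Tesseract.js already emits. Its `logger` callback receives objects with a `status` string and a `progress` value between 0 and 1. The sketch below turns such a message into a label; the `progressLabel` helper and the label format are assumptions for illustration.

```javascript
// Hypothetical helper: format a Tesseract.js logger message for display.
function progressLabel(message) {
  // Clamp to [0, 1] in case a message arrives without a usable progress value.
  const fraction = Math.min(Math.max(message.progress || 0, 0), 1);
  return `${message.status}: ${Math.round(fraction * 100)}%`;
}

const label = progressLabel({ status: "recognizing text", progress: 0.42 });
console.log(label); // recognizing text: 42%

// In the browser, the helper could be wired to a status element (here `statusEl`,
// a hypothetical DOM node) through the logger option. The exact createWorker
// signature differs between Tesseract.js versions, for example:
//   const worker = await Tesseract.createWorker("eng", 1, {
//     logger: (m) => { statusEl.textContent = progressLabel(m); },
//   });
```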

Where it fits in the market

Screenshot Editor Online isn’t trying to replace Snagit or CleanShot X—those are paid desktop tools with broader feature sets. It’s not competing with Canva or Fotor either; those are general-purpose design platforms.

Its real neighbors are other free web-based screenshot editors. But those tools are either annotation-only (no text editing), marred by watermarks, or require uploading your image to a server you know nothing about.

The differentiators are clear and intentionally simple:

Text replacement with font matching – no other free tool I’ve found does this.

Fully client-side – your image stays yours.

No watermarks, no caps – free means free.

No account needed – zero friction from the second you land on the page.

The people who care most? Developers, designers, technical writers, support engineers—anyone who lives in a world of screenshots and needs them to be right.

What’s coming next

The tool is live and starting to find its audience. There’s plenty left to do:

Sharper font detection – I’m continuing to refine how I estimate style, especially for tricky typefaces.

More export options – custom resolutions, additional formats.

Keyboard shortcuts – because power users deserve speed.

Batch processing – edit multiple screenshots in one session without starting from scratch.

Wrapping up

Screenshot Editor Online started the way a lot of side projects do: I was frustrated, and the thing I needed didn’t exist. Building it taught me a ton about OCR, canvas manipulation, and the quiet complexity of making something feel simple.

It’s free, it’s open-source, and it solves a real, everyday headache. If you’ve ever spent twenty minutes in Photoshop trying to match a font in a screenshot, you already know exactly why it’s here.

Give it a try at screenshoteditoronline.com. No account, no watermark, no limits. Just upload, edit, and download.