DEV Community

Tanisha Koladiya

Tesseract OCR not reading blurry or broken text from image — need accurate image-to-text method

I am currently using the Tesseract-OCR engine in my application to extract text from images. While it works well in many cases, I’m facing issues where it fails to read blurry or partially broken text, especially when the image contains:

Small or anti-aliased fonts

Blurry characters due to low resolution

Digits or symbols like /, %, . that appear broken or unclear

I’ve already tried:

Preprocessing the image using OpenCV (Emgu CV in C#): resizing, thresholding, Gaussian blur, morphology

Using OEM 1 (LSTM-only) and PSM 6 or 7

Character whitelisting (e.g., "0123456789./%")
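
For reference, the engine settings listed above map roughly onto the Tesseract command line below. This is a minimal sketch in Python, used only to keep it short. The file name `crop.png` and the helper names `build_tesseract_args` and `run_ocr` are placeholders, not part of the application described here. The same options can be set through whichever C# wrapper is in use.

```python
import shutil
import subprocess

# Characters expected in the numeric readouts (same as the whitelist above).
WHITELIST = "0123456789./%"


def build_tesseract_args(image_path, psm=7, oem=1, whitelist=WHITELIST):
    """Return the argument list for a Tesseract CLI call that prints to stdout."""
    return [
        "tesseract", image_path, "stdout",
        "--oem", str(oem),  # 1 = LSTM-only engine
        "--psm", str(psm),  # 6 = single uniform block, 7 = single text line
        "-c", f"tessedit_char_whitelist={whitelist}",
    ]


def run_ocr(image_path, psm=7):
    """Run Tesseract if it is on PATH; return the recognized text, else None."""
    if shutil.which("tesseract") is None:
        return None
    result = subprocess.run(
        build_tesseract_args(image_path, psm),
        capture_output=True, text=True, check=True,
    )
    return result.stdout.strip()


if __name__ == "__main__":
    # Hypothetical input file; replace with a real cropped screenshot.
    print(" ".join(build_tesseract_args("crop.png", psm=7)))
```

One thing worth checking when reproducing this: as far as I know, the LSTM engine in Tesseract 4.0 ignores `tessedit_char_whitelist`, and support returned in 4.1, so the installed version matters when combining OEM 1 with a whitelist.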

Still, for some images (attached below) the OCR result is inaccurate. For example, it fails to read values such as 96 / 120 and 6.67%.
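
Because the expected values have a fixed shape, a misread can at least be detected after OCR by checking the output against that shape. The sketch below assumes only the two formats mentioned above (a ratio like 96 / 120 and a percentage like 6.67%). The pattern names and `classify_reading` are hypothetical, and the check only flags suspicious output; it does not correct it.

```python
import re

# Hypothetical patterns for the two value shapes mentioned above.
RATIO = re.compile(r"^\s*(\d+)\s*/\s*(\d+)\s*$")
PERCENT = re.compile(r"^\s*(\d+(?:\.\d+)?)\s*%\s*$")


def classify_reading(text):
    """Return a parsed tuple if text matches an expected shape, else None."""
    match = RATIO.match(text)
    if match:
        return ("ratio", int(match.group(1)), int(match.group(2)))
    match = PERCENT.match(text)
    if match:
        return ("percent", float(match.group(1)))
    return None  # Unexpected shape: likely a dropped or misread symbol.


if __name__ == "__main__":
    for sample in ["96 / 120", "6.67%", "96 120", "6.67"]:
        print(repr(sample), "->", classify_reading(sample))
```

A result of `None` can be used to trigger a retry with different preprocessing or a different PSM for that region.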

What I need:
A reliable image-to-text conversion method that can:

Handle blurry/low-resolution text

Read small numeric data and symbols from digital display screenshots

Be integrated into a C# application
