DEV Community

jack zhu
Is Python OCR Inaccurate? Try These Image Preprocessing Techniques!

When using Python for OCR (Optical Character Recognition), poor image quality, such as blur, skew, or noise, can lead to low recognition accuracy. This article introduces essential image preprocessing techniques to improve OCR performance, along with recommended third-party image enhancement APIs.

✅ 1. Key Image Preprocessing Techniques

1. Adaptive Thresholding to Enhance Contrast

Use adaptive thresholding to handle uneven lighting or background:

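import cv2

# Load as grayscale; adaptiveThreshold expects a single-channel 8-bit image
img = cv2.imread('input.jpg', cv2.IMREAD_GRAYSCALE)

# Compare each pixel with the Gaussian-weighted mean of its 11x11
# neighborhood minus 2, so uneven lighting does not wash out the text
binary = cv2.adaptiveThreshold(
    img, 255,
    cv2.ADAPTIVE_THRESH_GAUSSIAN_C,
    cv2.THRESH_BINARY, 11, 2
)
cv2.imwrite('binary.jpg', binary)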

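2. Denoising and Removing Artifacts

# A light Gaussian blur softens isolated speckles
blur = cv2.GaussianBlur(binary, (3, 3), 0)
# Morphological opening removes small bright specks; for dark specks on a
# white background, cv2.MORPH_CLOSE is the counterpart
kernel = cv2.getStructuringElement(cv2.MORPH_RECT, (2, 2))
denoised = cv2.morphologyEx(blur, cv2.MORPH_OPEN, kernel)
cv2.imwrite('denoised.jpg', denoised)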

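3. Deskewing: Correct Image Rotation

import numpy as np

# minAreaRect must see the text pixels as non-zero, so invert the
# black-on-white image before collecting coordinates
coords = cv2.findNonZero(cv2.bitwise_not(denoised))
angle = cv2.minAreaRect(coords)[-1]

# Fold the angle into [-45, 45]; the raw range depends on the OpenCV version
if angle > 45:
    angle -= 90
elif angle < -45:
    angle += 90

(h, w) = denoised.shape[:2]
center = (w // 2, h // 2)
M = cv2.getRotationMatrix2D(center, angle, 1.0)
rotated = cv2.warpAffine(denoised, M, (w, h), flags=cv2.INTER_CUBIC, borderMode=cv2.BORDER_REPLICATE)

cv2.imwrite('rotated.jpg', rotated)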

4. Upscaling Low-Resolution Images

Bicubic interpolation is recommended to retain text clarity:

resized = cv2.resize(rotated, None, fx=2, fy=2, interpolation=cv2.INTER_CUBIC)
cv2.imwrite('resized.jpg', resized)

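5. Shadow and Uneven Lighting Removal

# Estimate the background of the grayscale image from step 1:
# dilation removes thin dark strokes, median blur smooths the rest
dilated = cv2.dilate(img, np.ones((7, 7), np.uint8))
bg = cv2.medianBlur(dilated, 21)
# Subtract the background so evenly lit paper becomes near-white
diff = 255 - cv2.absdiff(img, bg)
# Stretch the result to the full 0-255 range
norm = cv2.normalize(diff, None, alpha=0, beta=255, norm_type=cv2.NORM_MINMAX)
cv2.imwrite('shadow_removed.jpg', norm)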
    
🚀 2. Recommended Image Enhancement APIs

To simplify preprocessing, consider using online tools and APIs that offer the following:

• Document Correction: auto deskew and perspective correction
• Virtual Scanner: scan-like enhancement and background removal
• Shadow Removal: fix uneven lighting and shadows
• Image Enhancement: improve sharpness, contrast, and brightness
• OCR + Translation: extract and translate text automatically

🧠 3. Recommended OCR Processing Workflow

1. 📤 Upload original image → Process using API (deskew, denoise, etc.)
2. 📥 Download enhanced image
3. ⚙️ Apply further preprocessing if needed (thresholding, resize, etc.)
4. 🔠 Use OCR engine (Tesseract / PaddleOCR) to extract text
5. 🧹 Post-process output (correct errors, restore layout)

Need a full Python template to automate preprocessing and OCR? Let me know and I can provide a complete script.
