Zul Ikram Musaddik Rayat

Highlighting Image Text

Image processing and data extraction have become some of the most powerful applications of machine learning. But doing it from scratch is a pain in the a**. One thing programming taught me that nothing else did is not to reinvent the wheel every time, and to prioritize getting the job done. With that in mind, I came across an easy solution for the problem at hand.

The Problem At Hand

Raw image before highlighting

For simplicity's sake, let us consider this to be a page from a book. We want to highlight the word comment wherever it occurs. This could be an intuitive feature for image search engines to direct the users' attention to their desired content.

Solution

We are going to use an OCR (Optical Character Recognition) engine called Tesseract for the image-to-text recognition part. It is free software, released under the Apache License. Install the engine for your OS from the official website. I'm using Windows for this. Add the installation directory to your PATH environment variable so that pytesseract can find the tesseract executable.
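Before going further, it can help to confirm that the engine is actually reachable from Python. The following is a minimal sketch using only the standard library; the helper name find_tesseract is made up for this example, and it assumes the executable is called tesseract on your system.

```python
import shutil


def find_tesseract(executable="tesseract"):
    """Return the full path of the executable if it is on PATH, else None.

    `find_tesseract` is a hypothetical helper for this article, not part of
    pytesseract. It only checks PATH; it does not run the engine.
    """
    return shutil.which(executable)


path = find_tesseract()
if path is None:
    print("tesseract was not found on PATH; check your environment variables")
else:
    print(f"tesseract found at {path}")
```

If you would rather not touch PATH, pytesseract also exposes a tesseract_cmd attribute that can be set to the full path of the executable.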

Create a Python project with a virtual environment and install the necessary packages.

pip install opencv-python # for image processing
pip install pytesseract # to use the ocr engine in your project
pip install pandas # to conduct search queries

In your main.py, import the necessary libraries and define the necessary variables. Read the image from the source using the imread method. Make a copy of the original image for the overlay. Extract text information from the image. Set output_type to Output.DATAFRAME so the result comes back as a pandas DataFrame, which makes the filtering step much easier.

import cv2
from pytesseract import pytesseract, Output

ALPHA = 0.4

filename = "devto.png"
query = "comment"

img = cv2.imread(filename)
# make a copy of the original image for the highlight overlay
overlay = img.copy()

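# extract text data from the image as a pandas DataFrame object
# (lang="ben+eng" needs both the Bengali and English language data installed;
# use lang="eng" for English-only pages)
boxes = pytesseract.image_to_data(img, lang="ben+eng", output_type=Output.DATAFRAME)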

The dataframe object returned has the following structure:

level  page_num  block_num  par_num  line_num  word_num  left  top  width  height  conf       text
5      1         5          1        1         4         169   537  99     14      96.276794  comments

We are only interested in the text, left, top, width, and height columns. We need to prepare the dataframe for this specific job by applying a few filters. Drop the rows that have NaN or an empty string in the text column, since they carry no words and would only get in the way. Each remaining row usually holds a single word, so we can then filter the rows to keep only those whose text contains our query string.

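# drop rows that have NaN values in the text column
boxes = boxes.dropna(subset=["text"])
# cast to str (in case pandas inferred a non-string dtype) and trim whitespace
boxes = boxes.assign(text=boxes["text"].astype(str).str.strip())
# remove empty text rows
boxes = boxes[boxes["text"].str.len() > 0]
# keep only the rows whose text contains the query (literal, case-insensitive)
boxes = boxes[boxes["text"].str.contains(query.strip(), case=False, regex=False)]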
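To see what the filtering step does without running the OCR engine, here is a small sketch on a hand-built dataframe. The rows are invented for illustration (only the comments row mirrors the sample above), and it assumes the same column names that image_to_data returns.

```python
import pandas as pd

# Hypothetical rows shaped like the OCR output; the values are made up.
boxes = pd.DataFrame(
    {
        "left": [10, 60, 120, 169, 280],
        "top": [537, 537, 537, 537, 537],
        "width": [40, 50, 0, 99, 12],
        "height": [14, 14, 14, 14, 14],
        "text": ["Top", None, "  ", "comments", "a"],
    }
)

query = "comment"

# Drop rows with no text at all (None / NaN).
boxes = boxes.dropna(subset=["text"])
# Normalise to trimmed strings, then drop rows that end up empty.
boxes = boxes.assign(text=boxes["text"].astype(str).str.strip())
boxes = boxes[boxes["text"].str.len() > 0]

# Literal, case-insensitive substring match against the query.
matches = boxes[boxes["text"].str.contains(query.strip(), case=False, regex=False)]
print(matches[["text", "left", "top", "width", "height"]])
```

Note that this is a substring match, so a query of comment also picks up comments.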

Now we can get started with the highlighting part. We will draw rectangular highlight boxes around the matched positions.

for _, box in boxes.iterrows():
    # cast to plain Python ints before handing the coordinates to OpenCV
    left = int(box["left"])
    top = int(box["top"])
    width = int(box["width"])
    height = int(box["height"])

    # draw a yellow rectangle around the matched text
    cv2.rectangle(
        overlay,
        (left, top),
        (left + width, top + height),
        (0, 255, 255),
        -1,
    )

# Add the overlay on the original image
img_new = cv2.addWeighted(overlay, ALPHA, img, 1 - ALPHA, 0)
# Resize to a width of 1000 px, keeping the aspect ratio, for display
r = 1000.0 / img_new.shape[1]
dim = (1000, int(img_new.shape[0] * r))
resized = cv2.resize(img_new, dim, interpolation=cv2.INTER_AREA)
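The blend and the resize in the block above are simple arithmetic, which the following sketch reproduces with NumPy alone. The helper names blend and scaled_size are hypothetical and only illustrate the idea: a weighted sum of the overlay and the original, followed by an aspect-ratio-preserving size calculation. It is not a replacement for the OpenCV calls.

```python
import numpy as np

ALPHA = 0.4


def blend(overlay, img, alpha):
    """Per-pixel weighted sum: overlay * alpha + img * (1 - alpha), as uint8."""
    mixed = overlay.astype(np.float64) * alpha + img.astype(np.float64) * (1.0 - alpha)
    return np.clip(np.rint(mixed), 0, 255).astype(np.uint8)


def scaled_size(width, height, target_width=1000):
    """Return (target_width, new_height), keeping the aspect ratio."""
    ratio = target_width / float(width)
    return target_width, int(height * ratio)


# A tiny all-white "page" in BGR channel order, as OpenCV uses.
img = np.full((2, 2, 3), 255, dtype=np.uint8)
overlay = img.copy()
overlay[0, 0] = (0, 255, 255)  # one pixel painted yellow (BGR)

blended = blend(overlay, img, ALPHA)
print(blended[0, 0])  # highlighted pixel: yellow mixed into the white page
print(blended[1, 1])  # untouched pixel: same as the original
print(scaled_size(500, 300))
```

Wherever the overlay equals the original, the weighted sum returns the original pixel, so only the rectangles change. That is why the text stays readable under the highlight.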

Show the modified image using OpenCV's imshow method.

cv2.imshow("Highlighted", resized)
cv2.waitKey(0)
cv2.destroyAllWindows()

The result is this modified image with every occurrence of comment highlighted in yellow.
Highlighted image

Bonus Tip

Tesseract returns one word per row, so the search above only matches a single word (or part of one); a multi-word query such as "essential comments" will usually match nothing. If we want to highlight several words, wherever they appear on the page, we just need to filter the dataframe once per word with a little bit of pandas magic.

+ from pandas import concat

- boxes[boxes["text"].str.contains(query.strip(), case=False)]
+ boxes = concat(
        [
            boxes[boxes["text"].str.contains(word.strip(), case=False)]
            for word in query.split()
        ]
  )

With this, the user can query "essential comments" and it will highlight essential and comments even though they are not together.
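As an alternative to concat, the same idea can be sketched with a single boolean mask. This avoids duplicated rows when one word matches more than one query term, and it escapes regex metacharacters in the query. The function name match_any_word and the sample rows are made up for this example.

```python
import re

import pandas as pd


def match_any_word(boxes, query):
    """Keep rows whose text contains any whitespace-separated word of `query`.

    Hypothetical helper: builds one case-insensitive alternation pattern
    from the escaped query words and applies it in a single pass.
    """
    words = [re.escape(word) for word in query.split()]
    if not words:
        # An empty query matches nothing.
        return boxes.iloc[0:0]
    pattern = "|".join(words)
    mask = boxes["text"].astype(str).str.contains(pattern, case=False, regex=True)
    return boxes[mask]


# Invented rows that mimic the OCR dataframe's text column.
boxes = pd.DataFrame(
    {
        "text": ["Essential", "tips", "for", "comments", "C++"],
        "left": [10, 90, 130, 169, 280],
    }
)

matches = match_any_word(boxes, "essential comments")
print(matches["text"].tolist())
```

The returned dataframe can be passed to the same drawing loop as before.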