Emmanuel Onwuegbusi

Build an OCR app using the full-stack Python framework Reflex

Sometimes you need to extract text from an image, but you can't simply copy it out, and typing it by hand is tedious.

In this article, I will show you how to build an app using Reflex that will be able to extract text from images.

The following will be the output of the app:
[Screenshot: the finished OCR app extracting text from an uploaded image]

Outline

  • Create a new folder, open it with a code editor
  • Create a virtual environment and activate
  • Install requirements
  • reflex setup
  • reflex_ocr_system.py
  • state.py
  • style.py
  • .gitignore
  • Run app
  • Conclusion

Create a new folder, open it with a code editor

Create a new folder, name it reflex_ocr_system, then open it with a code editor like VS Code.

Create a virtual environment and activate

Open the terminal. Use the following commands to create a virtual environment named .venv and activate it:

python3 -m venv .venv
source .venv/bin/activate
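
If you are on Windows, the activation command is different; assuming the default venv layout, it would be:

.venv\Scripts\activate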

Install requirements

We need to install reflex to build the app, along with tesseract-ocr, pytesseract, and Pillow to process the uploaded image and extract the text from it.
Run the following commands in the terminal:

sudo apt-get install tesseract-ocr
pip install reflex==0.2.9
pip install pytesseract==0.3.10
pip install Pillow==10.1.0
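
The apt-get command assumes a Debian/Ubuntu system. To confirm that the Tesseract binary is installed and available on your PATH, you can run:

tesseract --version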

reflex setup

Now, we need to create the project using reflex. Run the following command to initialize the template app in the reflex_ocr_system directory:

reflex init

The above command will create the following file structure in the reflex_ocr_system directory:

[Screenshot: the generated project file structure]
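
Roughly, the generated layout looks like this (exact contents may vary between Reflex versions):

reflex_ocr_system/
├── assets/
├── reflex_ocr_system/
│   ├── __init__.py
│   └── reflex_ocr_system.py
├── rxconfig.py
└── .gitignore
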
You can run the app using the following command in your terminal; you should see a welcome page when you go to http://localhost:3000/ in your browser:

reflex run

reflex_ocr_system.py

Next, we will build the structure and interface of the app. Go to the reflex_ocr_system subdirectory and open the reflex_ocr_system.py file; this is where we add the components that make up the interface. Add the following code to it:

import reflex as rx

# import State and style
from reflex_ocr_system.state import State
from reflex_ocr_system import style

# color for the upload component
color = "rgb(107,99,246)"


def index():
    """The main view."""
    return rx.vstack(
        rx.heading("OCR System - Extract text from Images", style=style.topic_style),
        rx.upload(
            rx.vstack(
                rx.button(
                    "Select File",
                    color=color,
                    bg="white",
                    border=f"1px solid {color}",
                ),
                rx.text(
                    "Drag and drop files here or click to select files",
                    color="white",
                ),
            ),
            multiple=False,
            accept={
                "image/png": [".png"],
                "image/jpeg": [".jpg", ".jpeg"],
                "image/gif": [".gif"],
                "image/webp": [".webp"],
            },
            max_files=1,
            disabled=False,
            on_keyboard=True,
            border=f"1px dotted {color}",
            padding="5em",
        ),
        rx.hstack(rx.foreach(rx.selected_files, rx.text), color="white"),
        rx.button(
            "Click to Upload and Extract the text from selected Image",
            on_click=lambda: State.handle_upload(
                rx.upload_files()
            ),
            is_loading=State.is_loading,
            loading_text=State.loading_text,
            spinner_placement="start",
        ),
        rx.text(State.extracted_text_heading, text_align="center", font_weight="bold", color="white"),
        rx.text(State.extracted_text, text_align="center", style=style.extracted_text_style),
    )

# Add state and page to the app.
app = rx.App(style=style.style)
app.add_page(index)
app.compile()

The above code renders a heading, a file-upload component, the name of the selected file, a button, and two text components that display the heading and body of the extracted text.

state.py

Create a new file state.py in the reflex_ocr_system subdirectory and add the following code:

import reflex as rx

import pytesseract
from PIL import Image

class State(rx.State):
    """The app state."""

    extracted_text_heading: str

    extracted_text: str

    is_loading: bool = False

    loading_text: str = ""


    async def handle_upload(
        self, files: list[rx.UploadFile]
    ):
        """Handle the upload of files and extraction of text.

        Args:
            files: The uploaded files.
        """

        # set the following values to spin the button and
        # show text
        self.is_loading = True
        self.loading_text = "uploading and extracting text...."
        yield



        for file in files:
            upload_data = await file.read()
            outfile = rx.get_asset_path(file.filename)

            # Save the file.
            with open(outfile, "wb") as file_object:
                file_object.write(upload_data)

            # Open an image using Pillow (PIL)
            image = Image.open(outfile)

            # Use Tesseract to extract text from the image
            text = pytesseract.image_to_string(image)
            text = text.encode("ascii", "ignore")
            self.extracted_text = text.decode()

            self.extracted_text_heading = "Extracted Text👇"

            # reset the state variables
            self.is_loading = False
            self.loading_text = ""
            yield

The above code reads the uploaded file, saves it to the app's assets directory, and uses Tesseract to extract the text from the image. The is_loading variable controls the button's loading spinner, and the loading_text variable holds the text shown while the button is spinning.
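
If the extraction ever comes back empty, you can sanity-check that pytesseract and Tesseract are wired up correctly outside of Reflex with a minimal script like the one below (the image path is just a placeholder):

import pytesseract
from PIL import Image

# open any test image you have on disk (placeholder path)
image = Image.open("sample.png")

# print whatever text Tesseract finds in the image
print(pytesseract.image_to_string(image))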

style.py

Create a new file style.py in the reflex_ocr_system subdirectory and add the following code. This will add styling to the page and components:

style = {
    "background-color": "#454545",
    "font_family": "Comic Sans MS",
    "font_size": "16px",
}

topic_style = {
    "color": "white",
    "font_family": "Comic Sans MS",
    "font_size": "3em",
    "font_weight": "bold",
    "box_shadow": "rgba(190, 236, 0, 0.4) 5px 5px, rgba(190, 236, 0, 0.3) 10px 10px",
    "margin-bottom": "3rem",
}


extracted_text_style = {
    "color": "white",
    "text-align": "center",
    "font_size": "0.9rem",
    "width": "80%",
    "display": "inline-block",
    "display": "inline-block",
}

.gitignore

You can add the .venv directory to the .gitignore file to get the following:

*.db
*.py[cod]
.web
__pycache__/
.venv/

Run app

Run the following in the terminal to start the app:

reflex run

You should see the following interface when you go to http://localhost:3000/:

[Screenshot: the app's initial upload interface]
You can upload an image and then click the button to extract the text from the image.

Conclusion

Note that the accuracy of OCR can vary based on the image quality, fonts, and languages used in the image.
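
By default, Tesseract assumes English. If your images contain text in another language, pytesseract's image_to_string accepts a lang argument, provided the matching Tesseract language pack is installed (for example, tesseract-ocr-deu on Debian/Ubuntu for German):

# extract German text, assuming the 'deu' language data is installed
text = pytesseract.image_to_string(image, lang="deu")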

You can get the code: https://github.com/emmakodes/reflex_ocr_system.git

To learn more about Reflex, you can read here: https://reflex.dev/

To install tesseract-ocr on other platforms, you can check this solution: solutions to install tesseract-ocr
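
For example, on macOS you can usually install Tesseract with Homebrew:

brew install tesseract

On Windows, after installing the Tesseract executable, you may also need to point pytesseract at it (the path below is only an example and depends on your install location):

import pytesseract

# Windows only: tell pytesseract where tesseract.exe lives (example path)
pytesseract.pytesseract.tesseract_cmd = r"C:\Program Files\Tesseract-OCR\tesseract.exe"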
