Ilya Nevolin

Posted on Dec 15, 2020

WebOCR - Camera Text Extraction

#dohackathon #machinelearning #node #javascript

What I built

WebOCR is a minimalistic app for devices with a camera. Recognize and copy text from photos taken with your mobile device or any other device with camera access.

Category Submission:

Program for the People

App Link

https://nevolin.be/webocr/

https://webocr-colcw.ondigitalocean.app/

Screenshots

Description

1. Visit the app (on a PC or phone).
2. Allow camera access.
3. Aim at some text and click the button.
4. Wait a few seconds for the image to be processed.
5. The detected text is shown below.

Note: none of the video or photo content is stored. For privacy reasons, everything is processed in memory and discarded immediately after processing.
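The camera and capture steps can be sketched with standard browser APIs. This is a minimal illustration, not the code from the WebOCR repository: the helper names `startCamera` and `captureFrame` are hypothetical, and the browser objects are passed in as parameters so the functions stay easy to exercise outside a page.

```javascript
// Ask for camera access and attach the stream to a <video> element.
// facingMode: 'environment' is only a hint to prefer the rear camera on phones.
async function startCamera(video, mediaDevices) {
  const stream = await mediaDevices.getUserMedia({
    video: { facingMode: 'environment' },
    audio: false,
  });
  video.srcObject = stream;
  await video.play();
  return stream;
}

// Draw the current video frame onto an off-screen canvas and
// return it as a PNG data URL that can be handed to the OCR step.
function captureFrame(video, doc) {
  const canvas = doc.createElement('canvas');
  canvas.width = video.videoWidth;
  canvas.height = video.videoHeight;
  canvas.getContext('2d').drawImage(video, 0, 0, canvas.width, canvas.height);
  return canvas.toDataURL('image/png');
}

// In a page (hypothetical element id):
//   const video = document.querySelector('#preview');
//   await startCamera(video, navigator.mediaDevices);
//   const dataUrl = captureFrame(video, document);
```

The frame only ever lives in the canvas and the returned string, which fits the goal of not storing anything.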

The OCR system does a pretty good job, especially with numbers and special characters. However, it is not perfect and can produce inaccuracies. This solution uses TesseractJS as the underlying OCR system.

Link to Source Code

https://github.com/healzer/WebOCR

Permissive License

MIT

Background

Optical Character Recognition (OCR) is an important technology, but not many junior developers know about it. It's an intelligent system and should be used a lot more in daily business. Its learning curve is gentle, and it can easily be integrated into business pipelines.

The idea behind WebOCR is to have a tool that quickly extracts text from a picture taken with a phone. In my opinion, it should be a default app that comes with Android/iOS devices.

The system is not always 100% accurate, but it comes very close and is very convenient for extracting URLs, phone numbers, addresses, serial codes, etc.

How I built it

The front end is plain JavaScript, jQuery, and HTML, nothing fancy. The back end is NodeJS, with Express and TesseractJS as additional libraries.
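To give a feel for the server side, here is a rough sketch of an in-memory OCR endpoint. It is not the code from the repository. To stay dependency-free it uses a plain Node request handler instead of Express, and the OCR function is passed in as a parameter. The `/ocr` path and the name `createOcrHandler` are assumptions made for illustration.

```javascript
// Read the whole request body into one Buffer. Nothing is written to disk.
async function readBody(req) {
  const chunks = [];
  for await (const chunk of req) chunks.push(chunk);
  return Buffer.concat(chunks);
}

// Build a request handler around any async function (Buffer) => string.
function createOcrHandler(recognize) {
  return async function handle(req, res) {
    if (req.method !== 'POST' || req.url !== '/ocr') {
      res.writeHead(404);
      res.end();
      return;
    }
    try {
      const image = await readBody(req);   // image exists only in memory
      const text = await recognize(image); // run OCR on the buffer
      res.writeHead(200, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ text }));
    } catch (err) {
      res.writeHead(500, { 'Content-Type': 'application/json' });
      res.end(JSON.stringify({ error: 'OCR failed' }));
    }
  };
}

// Wiring it up (not executed here). Assumes the tesseract.js package is installed
// and that Tesseract.recognize(image, 'eng') resolves to { data: { text } }:
//
//   const http = require('http');
//   const Tesseract = require('tesseract.js');
//   const recognize = async (image) =>
//     (await Tesseract.recognize(image, 'eng')).data.text;
//   http.createServer(createOcrHandler(recognize)).listen(8080);
```

Once the response is sent, the buffer goes out of scope and is garbage-collected, so no image outlives its request.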

You can deploy it yourself in a matter of seconds. My app runs on a basic $5 DigitalOcean cloud app.

Additional Resources/Info

There are two ways to carry out OCR: client-side and server-side (default).

Client-side OCR runs in the browser. It is much slower, but could be tweaked by using more workers. For these configurations, consult TesseractJS's API docs. To enable client-side OCR, use the function localProcessImg() instead of serverProcessImg() inside /public/main.js.
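As a rough idea of what "more workers" can look like, the sketch below spreads several images over a pool of workers using TesseractJS's scheduler. It assumes a recent release where `createWorker('eng')` returns a ready-to-use worker. Older releases need extra load and initialize calls, so check the API docs for the version you use. The function name `recognizeAll` is hypothetical, and the library module is passed in as a parameter rather than imported.

```javascript
// `tesseract` is the TesseractJS module supplied by the caller
// (window.Tesseract in a browser, require('tesseract.js') in Node).
async function recognizeAll(tesseract, images, workerCount = 2, lang = 'eng') {
  const scheduler = tesseract.createScheduler();
  try {
    // Register the requested number of workers with the scheduler.
    for (let i = 0; i < workerCount; i++) {
      scheduler.addWorker(await tesseract.createWorker(lang));
    }
    // Queue one recognize job per image; the scheduler hands each job to a free worker.
    const results = await Promise.all(
      images.map((image) => scheduler.addJob('recognize', image))
    );
    return results.map((result) => result.data.text);
  } finally {
    // Always shut the workers down, even if a job failed.
    await scheduler.terminate();
  }
}
```

Extra workers mainly help when several images are queued at once. A single photo is still one job for one worker.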

Latest comments (1)

raddevus • Dec 16 '20

I just tried it out and that is very nice. I will be taking a close look at your source code because this has a lot of amazing abilities (using camera, doing OCR, etc). Really cool. I completed my entry to the #dohackathon also and I hope you'll take a moment to check it out and give me your feedback. thx
