Sourabh Gawande

Posted on Jun 19

How CAPTCHAs work - The Internet's Annoying but Essential Gatekeepers

If you spend time on the internet (which is practically everyone these days), you have likely run into a CAPTCHA: one of those somewhat irritating puzzles asking you to pick out traffic lights to prove you’re not a robot. Have you ever wondered how these things work and why they’re needed? Let’s take a closer look. (P.S. There's a bonus section with bad CAPTCHA examples at the end.)

What's a CAPTCHA

CAPTCHA (which stands for “Completely Automated Public Turing test to tell Computers and Humans Apart”) is a tool used by sites to identify whether an internet user is a human or a bot.

Around 1997, AltaVista (an early search engine of that decade) was struggling with a high volume of automated URL submissions that were severely skewing its ranking process. To solve this, the then-chief scientist of AltaVista, Andrei Broder, and his colleagues devised a test that later became known as a CAPTCHA. The term itself was coined in 2003 by Luis von Ahn and colleagues.

Because the test is administered by a computer, in contrast to the standard Turing test that is administered by a human, CAPTCHAs are sometimes described as reverse Turing tests.

Why is CAPTCHA needed?

TL;DR: to make sure a sensitive resource is not exploited by bots.

CAPTCHAs are used by websites that want to restrict usage by bots. For instance, if you’re running a hotel reservation site, someone could use a bot to grab every booking, leaving no slots for legitimate customers. Or you might be running an online poll and want to make sure that votes are not manipulated by bots. In scenarios like these, a CAPTCHA helps ensure that only human users can perform certain actions, while automated clients are blocked.

How it works

There are several types of CAPTCHA in use these days, each working a little differently. But all of them rely on the same principle: ask users to perform a task that is trivial for a human but hard to automate.

Here are some common CAPTCHA flavours and how they work:

Classic CAPTCHA (Text-based CAPTCHA)

These are the oldest variants of CAPTCHAs, as the name suggests. Classic CAPTCHAs ask a user to identify words or strings of characters. The text is shown distorted and blurred, in varying fonts or handwriting, which makes it hard for bots to read using OCR (optical character recognition) while remaining easy for human users to decipher.

ReCAPTCHA

With reCAPTCHA v2 (the "I'm not a robot" checkbox), Google came up with a way to identify human users that usually doesn’t require them to type anything. It just asks you to click on a checkbox.

How can you figure out if a user is a human or a bot by just looking at a checkbox click, you may wonder. The answer is not in the click but in what happens before you click the checkbox. Google has a risk analysis engine whose exact signals are not public, but it is widely believed to look at things like:

- how the cursor moved on its way to the checkbox (an organic, irregular path and acceleration vs. a cursor moving in a straight line)
- which part of the checkbox was clicked (varying places vs. dead on center every time)
- the browser fingerprint
- Google cookies and their contents
- click location history tied to your fingerprint or account, if it detects one

Together, signals like these help differentiate between organic and automated clicks.
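
As a rough illustration of the cursor-movement idea only, here is a toy heuristic in Python. It is not Google's algorithm, which is not public. The function names, the three signals and every threshold are assumptions made up for this sketch.

```python
import math
import statistics

# One cursor sample: (x, y, timestamp in seconds)
Point = tuple[float, float, float]


def path_straightness(points: list[Point]) -> float:
    """Straight-line distance divided by travelled distance (1.0 = straight)."""
    travelled = sum(math.dist(a[:2], b[:2]) for a, b in zip(points, points[1:]))
    if travelled == 0:
        return 1.0
    return math.dist(points[0][:2], points[-1][:2]) / travelled


def speed_variation(points: list[Point]) -> float:
    """Coefficient of variation of the speed between samples (0.0 = constant)."""
    speeds = [
        math.dist(a[:2], b[:2]) / (b[2] - a[2])
        for a, b in zip(points, points[1:])
        if b[2] > a[2]
    ]
    if len(speeds) < 2 or statistics.mean(speeds) == 0:
        return 0.0
    return statistics.pstdev(speeds) / statistics.mean(speeds)


def looks_automated(points: list[Point], click_offset_px: float,
                    straight: float = 0.99, steady: float = 0.05,
                    centred: float = 0.5) -> bool:
    """Flag the click if at least two of three toy signals look robotic.

    click_offset_px is the distance of the click from the checkbox centre.
    The thresholds are arbitrary illustrative values.
    """
    signals = [
        path_straightness(points) >= straight,  # ruler-straight path
        speed_variation(points) <= steady,      # perfectly even speed
        click_offset_px <= centred,             # dead-centre click
    ]
    return sum(signals) >= 2


# A scripted cursor: straight line, constant speed, centre click.
bot_path = [(0, 0, 0.0), (10, 10, 0.1), (20, 20, 0.2), (30, 30, 0.3)]
# A hand-moved cursor: wobbly path, uneven speed, off-centre click.
human_path = [(0, 0, 0.0), (8, 15, 0.05), (25, 18, 0.2), (28, 31, 0.26), (30, 30, 0.5)]

print(looks_automated(bot_path, click_offset_px=0.0))
print(looks_automated(human_path, click_offset_px=4.2))
```

A real risk engine would combine many more signals (fingerprint, cookies, history) and would likely produce a score rather than a yes/no answer. The point here is only that movement data recorded before the click can carry useful information.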

Image-recognition CAPTCHA

For an image-recognition CAPTCHA, users are presented with square images of common objects like bikes, buses, traffic lights, etc. The images may all be from the same large image, or they may each be different. A user has to identify the images that contain certain objects. If their response matches the responses from most other users who have submitted the same test, the answer is considered "correct" and the user passes the test.
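
The "match what most other users answered" rule can be sketched in a few lines of Python. This is a simplified assumption of how such a check might look, not any provider's actual logic: tiles are numbered by position, and `consensus_tiles`, `passes`, `min_share` and `tolerance` are hypothetical names.

```python
from collections import Counter


def consensus_tiles(previous_answers: list[frozenset[int]],
                    min_share: float = 0.5) -> frozenset[int]:
    """Tiles that more than min_share of earlier users selected."""
    counts = Counter(tile for answer in previous_answers for tile in answer)
    needed = min_share * len(previous_answers)
    return frozenset(tile for tile, n in counts.items() if n > needed)


def passes(selection: set[int], previous_answers: list[frozenset[int]],
           tolerance: int = 0) -> bool:
    """Pass if the selection differs from the consensus by at most `tolerance` tiles."""
    expected = consensus_tiles(previous_answers)
    # Symmetric difference counts both missed tiles and wrongly selected ones.
    return len(expected ^ frozenset(selection)) <= tolerance


# Four earlier users answered the same 3x3 grid (tiles numbered 0..8).
earlier = [
    frozenset({0, 1, 4}),
    frozenset({0, 1}),
    frozenset({0, 1, 4}),
    frozenset({1, 4, 7}),
]

print(sorted(consensus_tiles(earlier)))       # [0, 1, 4]
print(passes({0, 1, 4}, earlier))             # True
print(passes({0, 1}, earlier))                # False
print(passes({0, 1}, earlier, tolerance=1))   # True
```

Allowing a small tolerance is one way to handle ambiguous tiles, such as the sliver of a traffic light pole that spills into a neighbouring square.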

Bonus: 5 examples of terrible CAPTCHAs that will make you pull out your hair

_You need a physics degree to solve CAPTCHAs now?_

_Good luck deciphering this_

_If "you shall not pass" was an image_

_Speechless_

_Maybe the robots should take over_

DEV Community
