DEV Community

Ateeb Hussain


How bloom filters work, irl

A single SELECT for every "Username Taken" check?
Seriously??

Imagine you're building a SaaS and 10,000 bots try to sign up with random usernames in one minute.

Are you really gonna let your Postgres/MySQL handle 10,000 unnecessary hits just to say "Nope, doesn't exist"?
That’s how you kill your DB performance and blow your AWS budget on RDS IOPS.

Big systems often don't touch the database for most of these checks. They put a Bloom Filter in front of it.

The "Bouncer" at the door

Think of a Bloom Filter as a high-speed bouncer. He doesn't have a list of names; he just has a row of 1,000 switches (bits).

  1. When a user signs up, you run their name through a few hash functions (say, 3).
  2. Each hash points at one switch, and you flip those switches to ON.
  3. When the next person checks a name, the bouncer hashes it the same way and just looks at those switches.
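
Here's a minimal sketch of those three steps in Python. The class name `BloomFilter`, the choice of 3 hashes, and the trick of salting SHA-256 with an index are all assumptions for illustration, not what any particular library does (production filters usually use faster non-cryptographic hashes and a packed bit array).

```python
import hashlib


class BloomFilter:
    """Toy Bloom filter: a row of `size` switches, `num_hashes` positions per item."""

    def __init__(self, size: int = 1000, num_hashes: int = 3) -> None:
        self.size = size
        self.num_hashes = num_hashes
        self.bits = bytearray(size)  # one byte per switch, for readability

    def _positions(self, item: str) -> list[int]:
        # Derive num_hashes positions by salting SHA-256 with an index (assumed scheme).
        positions = []
        for i in range(self.num_hashes):
            digest = hashlib.sha256(f"{i}:{item}".encode("utf-8")).digest()
            positions.append(int.from_bytes(digest[:8], "big") % self.size)
        return positions

    def add(self, item: str) -> None:
        # Step 1 and 2: hash the name, flip its switches ON.
        for pos in self._positions(item):
            self.bits[pos] = 1

    def might_contain(self, item: str) -> bool:
        # Step 3: look at the same switches; all ON means "maybe".
        return all(self.bits[pos] for pos in self._positions(item))


usernames = BloomFilter()
usernames.add("ateeb")
print(usernames.might_contain("ateeb"))  # True: added names are never missed
```

Note there is no `remove`: in this basic form, you can't flip a switch back OFF without risking other names that share it.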

The Result?

  • If any of those switches is OFF: The username 100% DOES NOT exist. (Database: 0 hits. Speed: Lightning).
  • If all of them are ON: It might exist (other names can flip the same switches, so false positives happen). Now you go check the DB.
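
To make that two-way result concrete, here's a rough sketch of the lookup flow. Everything here is hypothetical: `UsernameGate` is a made-up name, the "database" is just a Python set, and `db_hits` is a counter added only to show when a lookup would have reached the real DB.

```python
import hashlib


class UsernameGate:
    """Bloom filter in front of a stand-in 'database' (a Python set)."""

    def __init__(self, size=1000, num_hashes=3):
        self.size = size
        self.num_hashes = num_hashes
        self.bits = bytearray(size)   # the switches, all OFF
        self.db_rows = set()          # hypothetical stand-in for the users table
        self.db_hits = 0              # how many lookups reached the "database"

    def _positions(self, name):
        return [
            int.from_bytes(hashlib.sha256(f"{i}:{name}".encode()).digest()[:8], "big") % self.size
            for i in range(self.num_hashes)
        ]

    def register(self, name):
        self.db_rows.add(name)
        for pos in self._positions(name):
            self.bits[pos] = 1

    def is_taken(self, name):
        if not all(self.bits[pos] for pos in self._positions(name)):
            return False              # some switch is OFF: definitely free, no DB hit
        self.db_hits += 1             # all switches ON: maybe taken, ask the database
        return name in self.db_rows


gate = UsernameGate()
gate.register("ateeb")
print(gate.is_taken("ateeb"))  # True, confirmed by the database
print(gate.db_hits)            # 1
```

The filter only ever answers "definitely not" or "maybe", so the database (and its unique constraint) stays the source of truth for the "maybe" cases.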

Why this is a Flex:

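  • Memory is Cheap: You can track 10 MILLION usernames in about 12MB of RAM at a ~1% false-positive rate (10MB gets you ~2%). Try doing that with a SQL Index.
  • Zero Disk I/O for misses: Names the filter rules out never reach the database, so it stays chill while the Bloom Filter handles the noise.
  • Privacy: Even if someone hacks your Redis, they can't read the usernames out of it. It’s just bits, baby. (They can still test guesses against the filter, so don't treat it as encryption.)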
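
If you want to sanity-check the memory claim, the standard sizing formulas are m = -n·ln(p) / (ln 2)² bits and k = (m/n)·ln 2 hash functions, for n items and a target false-positive rate p. The helper name `bloom_size` below is made up for this sketch.

```python
import math


def bloom_size(n_items: int, false_positive_rate: float) -> tuple[int, int]:
    """Return (bits, hash_count) from the standard Bloom filter sizing formulas."""
    bits = math.ceil(-n_items * math.log(false_positive_rate) / (math.log(2) ** 2))
    hashes = max(1, round(bits / n_items * math.log(2)))
    return bits, hashes


bits, hashes = bloom_size(10_000_000, 0.01)
print(f"{bits / 8 / 1_000_000:.1f} MB, {hashes} hash functions")  # 12.0 MB, 7 hash functions
```

That works out to under 10 bits per username, no matter how long the usernames are.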

Real Talk:
I know, I know... "Ateeb, I only have 500 users, I'll just use @unique in Prisma."
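Fine. At 500 users that's the right call, and a Bloom Filter would be a sledgehammer for a nut. (Keep the @unique either way: the filter only saves lookups, the constraint is what actually guarantees uniqueness.)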

But if you’re planning to scale to the moon, you need to stop being lazy with your architecture.

Next up: Trie Structures. (How to do those "Suggested Usernames" without making your server sweat).

Are you still "Select-ing" your way to a massive AWS bill, or are you actually optimizing? Fight me in the comments! 👇
