DEV Community

Cover image for What is a Counting Bloom Filter?
ZeeshanAli-0704
ZeeshanAli-0704

Posted on

What is a Counting Bloom Filter?

๐Ÿงฎ What is a Counting Bloom Filter?

A Counting Bloom Filter (CBF) replaces the bit array in a standard Bloom Filter with an array of small counters.

  • Standard Bloom Filter: Each position in the array is just a 0 or 1.
  • Counting Bloom Filter: Each position is a small integer counter (e.g., 4 bits, 8 bits, or 16 bits).

๐Ÿ‘‰ This allows incrementing when inserting and decrementing when deleting, which means deletions are supported safely.


โš™๏ธ How it Works

1. Insertion

  • Compute k hash values for the item.
  • For each hash value โ†’ increment the counter at that position.
  • Example: inserting "apple" sets counters at positions [4, 17, 29] โ†’ incremented by +1.

2. Lookup

  • Compute k hash values for the item.
  • Check if all counters > 0.
  • If yes โ†’ item may exist.
  • If any counter is 0 โ†’ item definitely doesnโ€™t exist.

3. Deletion

  • Compute k hash values.
  • For each โ†’ decrement the counter by -1.
  • If a counter goes back to 0, that slot is free again.
  • Unlike standard Bloom Filter, this does not break other itemsโ€™ membership (as long as you donโ€™t decrement below 0).

๐Ÿงพ Example Walkthrough

Say we have a CBF with 10 counters (all start at 0) and 3 hash functions:

Insert "cat" โ†’ hashes to [2, 4, 7]

  • counters[2]++, counters[4]++, counters[7]++
  • Array: [0,0,1,0,1,0,1,0,0,0]

Insert "dog" โ†’ hashes to [4, 5, 9]

  • counters[4]++, counters[5]++, counters[9]++
  • Array: [0,0,1,0,2,1,1,0,0,1]

Delete "cat" โ†’ hashes [2, 4, 7]

  • counters[2]--, counters[4]--, counters[7]--
  • Array: [0,0,0,0,1,1,0,0,0,1]
  • "dog" still exists safely!

๐Ÿ“Š Pros & Cons

โœ… Advantages

  1. Supports Deletion โ€” main benefit.
  2. Same lookup speed as Bloom Filter (O(k)).
  3. Can prevent memory bloat in dynamic datasets.

โŒ Disadvantages

  1. Higher Memory Usage โ€” counters need more space than single bits.
  • Example: if counters are 4 bits each โ†’ 4x memory cost.

    1. Counter Overflow
  • If too many items hash to the same position, counters may overflow.

  • Fix: use larger counters (but increases memory usage).

    1. Still Has False Positives
  • Like Bloom Filters, cannot avoid false positives.

  • Deletions also increase false positive risk if counters drop unevenly.


๐Ÿš€ Real-World Use Cases

  • Databases & Caches:

    • e.g., Redis, Cassandra, and HBase use Counting Bloom Filters to track keys and allow safe removals.
  • Networking:

    • Routers use them to track active flows that expire over time.
  • Security / Spam Detection:

    • Track whether URLs, IPs, or messages are seen/unseen and remove them when outdated.

๐Ÿ” Comparison with Standard Bloom Filter

Feature Standard Bloom Filter Counting Bloom Filter
Memory Usage Low (1 bit per slot) Higher (c-bit counters per slot)
Deletion โŒ Not possible โœ… Supported
False Positives โœ… โœ…
False Negatives โŒ (unless buggy) โŒ (unless buggy)
Resizing โŒ Hard โŒ Still hard
Use Case Mostly static datasets Dynamic datasets (inserts + deletes)

โœ… Summary:
A Counting Bloom Filter is like a Bloom Filter on steroids โ€” it trades extra memory for the ability to delete elements. Itโ€™s widely used in dynamic systems (databases, caches, networks) where entries need to expire or be removed.

More Details:

Get all articles related to system design
Hastag: SystemDesignWithZeeshanAli

Git: https://github.com/ZeeshanAli-0704/SystemDesignWithZeeshanAli

Top comments (0)