DEV Community: Hassan Zahar Rifat

Bloom Filters: The Data Structure That's Wrong on Purpose (and Why That's Genius)

Hassan Zahar Rifat — Mon, 08 Jun 2026 15:38:03 +0000

How databases, browsers and CDNs answer "have I seen this before?" using a fraction of the memory — by being wrong in exactly one safe direction.

Picture a sign-up form. You type a username, and before you even finish, the field flashes red: "Already taken." Instant. No spinner.

Somewhere behind that, a system just answered the question "have we seen this string before?" against tens of millions of existing usernames, and it did so without querying the database, without loading the full list into memory, and in about the time it takes to compute a few hashes and read a few bits.

It pulled that off with a data structure that has a strange, almost offensive property: it's allowed to be wrong.

Not randomly wrong. Wrong in exactly one direction, in a way you can mathematically bound, in exchange for using a fraction of the memory a normal set would. That trade is the whole trick, and once you see it you'll start noticing Bloom filters everywhere — in your database, your CDN, your browser, your favorite key-value store.

Let's build one from scratch.

First, why the obvious solutions fall apart

The question is dead simple: is element x in set S?

You already know how to answer this. A hash set. O(1) lookups, exact answers, done. So why do we need anything else?

Memory.

Say you're tracking 1 billion URLs a web crawler has already visited, so you don't crawl them twice. An average URL is ~70 bytes. Even if you only stored the raw strings:

1,000,000,000 URLs × 70 bytes ≈ 70 GB

And that's before the overhead of the hash set itself — pointers, load-factor slack, bucket arrays. Realistically you're looking at well north of 100 GB of RAM to answer a yes/no question. You can shard it across machines, sure, but now every "have I seen this?" check is a network hop.

Here's the reframe that unlocks everything:

You don't actually need to store the URLs. You only need to recognize them.

A bouncer doesn't keep a photocopy of every ID he's ever checked. He keeps a much smaller mental fingerprint. A Bloom filter is that bouncer.

The core idea: stop storing, start fingerprinting

A Bloom filter is two things:

A bit array of m bits, all starting at 0.
A set of k independent hash functions, each of which maps any input to a position in that bit array.

That's it. No keys, no values, no stored elements. Just a row of bits and some hash functions.

Two operations:

add(x): hash x with all k functions, get k positions, set those bits to 1.
contains(x): hash x with the same k functions, check those k positions. If all of them are 1, return "probably yes." If any is 0, return "definitely no."

Read that last line again, because it's the entire personality of the data structure:

A 0 is proof of absence. A 1 is only a hint of presence.

A worked example you can follow by hand

Let's use a tiny filter: m = 16 bits, k = 3 hash functions. Empty, it looks like this:

index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
bits:   0  0  0  0  0  0  0  0  0  0  0  0  0  0  0  0

Add "cat". Our three hashes spit out positions 3, 9, 13. Flip them on:

index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
bits:   0  0  0 [1] 0  0  0  0  0 [1] 0  0  0 [1] 0  0

Add "dog". Hashes give 1, 9, 14. Note 9 is already 1 — that's fine, it just stays 1:

index:  0  1  2  3  4  5  6  7  8  9 10 11 12 13 14 15
bits:   0 [1] 0 [1] 0  0  0  0  0 [1] 0  0  0 [1][1] 0

Now let's query.

contains("cat") → check positions 3, 9, 13 → all 1 → "probably yes." Correct, we added it.

contains("fish") → hashes to 2, 9, 13 → position 2 is 0 → "definitely no." Correct — and notice we got a guaranteed-correct answer by checking a single zero bit. That's the cheap, certain half of the filter.

contains("bird") → hashes to 1, 13, 14 → position 1 is 1 (set by dog), 13 is 1 (set by cat), 14 is 1 (set by dog) → all three are 1 → "probably yes."

But we never added "bird".

That's a false positive. No single element claimed those three bits. They were lit up by different elements ("dog" set 1 and 14, "cat" set 13), and "bird" happened to land on exactly that combination. The filter has no way to tell the difference between "one element set all of these" and "several elements each set some of them."
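The walkthrough above can be replayed in a few lines of Python. This is only a sketch: POSITIONS is a hypothetical lookup table standing in for the three hash functions, hard-coded so the bit positions match the ones used in the text.

```python
# Toy 16-bit filter from the walkthrough. POSITIONS is a hypothetical table
# that plays the role of the k = 3 hash functions.
POSITIONS = {
    "cat":  (3, 9, 13),
    "dog":  (1, 9, 14),
    "fish": (2, 9, 13),
    "bird": (1, 13, 14),
}

bits = [0] * 16

def add(word: str) -> None:
    for pos in POSITIONS[word]:
        bits[pos] = 1          # setting an already-set bit changes nothing

def contains(word: str) -> bool:
    return all(bits[pos] for pos in POSITIONS[word])

add("cat")
add("dog")

print(contains("cat"))   # True: it was added
print(contains("fish"))  # False: bit 2 is still 0
print(contains("bird"))  # True: a false positive, "bird" was never added
```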

This is the price. And here's the beautiful part: it only goes one way.

Why you get false positives but never false negatives

A false negative would mean: you added x, but contains(x) says no.

For that to happen, at least one of x's k bits would have to be 0. But add(x) set all of them to 1, and bits in a Bloom filter never go back to zero (we'll revisit that caveat later). So once you've added something, every one of its bits is permanently lit. A query for it can only ever find all 1s.

Added elements always return "yes." Bits never un-set. Therefore: zero false negatives, ever.

This asymmetry is what makes Bloom filters useful rather than just clever. "Definitely no" is a hard guarantee. "Probably yes" is a hint you can verify with a slower, exact check only when needed. You build systems around that shape: use the filter as a fast gate, and only pay the expensive cost when it says "maybe."

The implementation

Here's a clean Python version. The only mildly clever bit is how we get k hash functions without writing k of them — more on that right after.

import math
import hashlib

class BloomFilter:
    def __init__(self, expected_items: int, false_positive_rate: float = 0.01):
        # Size the filter from how many items you expect and the FP rate you'll tolerate.
        self.m = self._optimal_size(expected_items, false_positive_rate)
        self.k = self._optimal_hash_count(self.m, expected_items)
        self.bits = bytearray((self.m + 7) // 8)  # m bits, packed into bytes

    def _set_bit(self, index: int) -> None:
        self.bits[index // 8] |= (1 << (index % 8))

    def _get_bit(self, index: int) -> bool:
        return bool(self.bits[index // 8] & (1 << (index % 8)))

    def _hashes(self, item: str):
        """
        Kirsch-Mitzenmacher trick: derive k hashes from just TWO base hashes.
        g_i(x) = (h1(x) + i * h2(x)) mod m.  Asymptotically, this gives the same
        false-positive rate as k independent hashes.
        """
        data = item.encode("utf-8")
        h1 = int.from_bytes(hashlib.sha256(data).digest()[:8], "big")
        h2 = int.from_bytes(hashlib.md5(data).digest()[:8], "big")
        for i in range(self.k):
            yield (h1 + i * h2) % self.m

    def add(self, item: str) -> None:
        for index in self._hashes(item):
            self._set_bit(index)

    def contains(self, item: str) -> bool:
        return all(self._get_bit(index) for index in self._hashes(item))

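    @staticmethod
    def _optimal_size(n: int, p: float) -> int:
        # m = -(n * ln p) / (ln 2)^2, rounded up so the target FP rate is met
        return max(1, math.ceil(-(n * math.log(p)) / (math.log(2) ** 2)))

    @staticmethod
    def _optimal_hash_count(m: int, n: int) -> int:
        # k = (m / n) * ln 2, rounded to the nearest integer (7 for 1M items at 1%)
        return max(1, round((m / n) * math.log(2)))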

Usage:

bf = BloomFilter(expected_items=1_000_000, false_positive_rate=0.01)

bf.add("alice@example.com")  # placeholder address

bf.contains("alice@example.com")   # True  (definitely added)
bf.contains("bob@example.com")     # almost certainly False; the false "True" rate only approaches ~1% as the filter fills up

That double-hashing trick is worth stealing

Naively, k = 7 hash functions means you need 7 genuinely different, well-distributed hash functions. Annoying. The Kirsch–Mitzenmacher result says you don't: take two good base hashes h1 and h2, and generate the rest as (h1 + i·h2) mod m for i from 0 to k − 1. Asymptotically, the false-positive rate matches that of k independent hashes, and you compute two hashes instead of seven. Production libraries (such as the Bloom filters in Cassandra and Guava) use this kind of double hashing.
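One common variation, sketched here under stated assumptions, is to take both base hashes from a single digest instead of running two different hash algorithms. The function name double_hash_indexes, the choice of SHA-256, and the "force h2 odd" step are illustrative choices for this sketch, not something prescribed by the Kirsch–Mitzenmacher result.

```python
import hashlib

def double_hash_indexes(item: str, k: int, m: int) -> list[int]:
    """Derive k bit positions from two 64-bit halves of one SHA-256 digest."""
    digest = hashlib.sha256(item.encode("utf-8")).digest()
    h1 = int.from_bytes(digest[:8], "big")
    # Assumption of this sketch: OR-ing with 1 rules out h2 == 0, a value
    # that would make all k positions identical.
    h2 = int.from_bytes(digest[8:16], "big") | 1
    return [(h1 + i * h2) % m for i in range(k)]

indexes = double_hash_indexes("https://example.com/", k=7, m=9_585_059)
print(indexes)  # seven positions in range(m), identical on every run
```

Because the positions depend only on the input and on (k, m), add and contains always agree on which bits belong to an item.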

The math you actually need (no PhD required)

Four variables run the whole show:

Symbol	Meaning
n	number of items you'll insert
m	number of bits in the array
k	number of hash functions
p	false-positive probability you're willing to accept

You pick n (how much you'll store) and p (how often you'll tolerate a wrong "maybe"). The other two fall out:

Optimal bit-array size:

m = -(n × ln p) / (ln 2)²

Optimal number of hash functions:

k = (m / n) × ln 2

Plug in real numbers. For 1 million items at a 1% false-positive rate:

m ≈ 9.6 million bits ≈ 1.2 MB
k ≈ 7 hash functions

Sit with that. A hash set of those keys would be tens of megabytes minimum. The Bloom filter answers the same membership question in 1.2 MB. Want a 0.1% false-positive rate instead? It climbs to ~1.8 MB. Still tiny.
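To check those numbers yourself, the two formulas translate directly into Python. This is a small sketch; the helper name bloom_parameters and the rounding choices (round m up, round k to the nearest integer) are assumptions made here for illustration.

```python
import math

def bloom_parameters(n: int, p: float) -> tuple[int, int]:
    """Return (m, k): bits in the array and number of hash functions."""
    m = math.ceil(-n * math.log(p) / (math.log(2) ** 2))
    k = max(1, round((m / n) * math.log(2)))
    return m, k

for p in (0.01, 0.001):
    m, k = bloom_parameters(1_000_000, p)
    # Prints the bit count, the size in megabytes, and the hash count
    print(f"p={p}: m={m:,} bits (~{m / 8 / 1e6:.1f} MB), k={k}")
```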

The intuition behind the knobs:

Too few hash functions → each element is represented by too few bits → unrelated elements collide easily. High FP rate.
Too many hash functions → the array fills with 1s too fast → everything starts looking present. Also high FP rate.

There's a sweet spot (that k = (m/n) ln 2 formula), and it lands right around where roughly half the bits are set when the filter is full. That's not a coincidence: a half-full array is the point of maximum information per bit.

And the punchline that makes this practical: the memory cost per item depends only on p, not on how big each item is. A 70-byte URL and a 10 KB JSON blob cost the same number of bits in the filter, because you never store the items themselves, only the bits their hashes point to. That's why it scales where a hash set can't.
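The sweet spot in k is easy to see numerically. The sketch below uses the standard approximation p ≈ (1 − e^(−kn/m))^k for the false-positive rate, which is not spelled out in the text above, and sweeps k for the 1-million-item, 1% filter. The function name expected_fp_rate is just a label for this illustration.

```python
import math

def expected_fp_rate(m: int, n: int, k: int) -> float:
    """Standard approximation: p = (1 - e^(-k*n/m))^k."""
    return (1.0 - math.exp(-k * n / m)) ** k

m, n = 9_585_059, 1_000_000
for k in range(1, 13):
    fill = 1.0 - math.exp(-k * n / m)   # expected fraction of bits set to 1
    print(f"k={k:2d}  bits set={fill:.0%}  fp={expected_fp_rate(m, n, k):.4f}")

# The k with the lowest expected false-positive rate in the sweep
best_k = min(range(1, 13), key=lambda k: expected_fp_rate(m, n, k))
print("best k:", best_k)
```

The rate falls as k grows, bottoms out near the point where about half the bits are set, and then climbs again as the array saturates.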

Where this actually runs in production

This isn't a whiteboard toy. It's load-bearing infrastructure in systems you use daily.

Databases (Cassandra, HBase, RocksDB, LevelDB, Bigtable). These use LSM-tree storage, where data lives in many on-disk files (SSTables). To find a key, you might have to check several files — and disk reads are brutally slow compared to memory. So each SSTable gets a Bloom filter held in RAM. Before touching disk, the DB asks the filter "could this key be in this file?" A "definitely no" skips the disk read entirely. Given how lopsided the cost is (RAM lookup vs. disk seek), even a filter that's only sometimes able to say no is a massive win.

Browsers (Chrome's Safe Browsing). Chrome warns you before you visit a known-malicious site — without shipping you a list of every dangerous URL on the internet (which would be enormous and instantly stale). A compact local filter answers "is this URL probably bad?" If no → safe, proceed instantly. If maybe → then it does a real network check. The filter turns "phone home on every single navigation" into "phone home only on the rare maybe."

CDNs and caches. "Is this object worth caching?" A common trick: only cache something on its second request. A Bloom filter cheaply remembers "have I seen this URL once before?" without storing the URLs, filtering out the long tail of one-hit-wonder objects that would just pollute the cache.

Redis ships Bloom filters as a module (BF.ADD, BF.EXISTS) for exactly this class of problem — dedup, "have I shown this user this item," and so on.

Medium reportedly used one to track which articles you've already read, so its recommendations don't keep serving you the same posts. A false positive just means it occasionally skips one you hadn't seen — a totally acceptable error for a recommendation feed.

Notice the pattern in every case: the filter is a cheap front gate, and the expensive operation (disk, network, recompute) only happens on a "maybe." A false positive costs you a wasted check. It never costs you correctness.
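The front-gate pattern can be sketched as follows. TinyBloom and GatedStore are hypothetical names for this illustration, the dictionary stands in for a disk file or remote service, and the fixed m and k are arbitrary choices rather than tuned values.

```python
import hashlib

class TinyBloom:
    """Minimal Bloom filter for this sketch (fixed m and k, double hashing)."""

    def __init__(self, m: int = 8192, k: int = 5):
        self.m, self.k = m, k
        self.bits = bytearray(m // 8 + 1)

    def _indexes(self, key: str):
        d = hashlib.sha256(key.encode("utf-8")).digest()
        h1 = int.from_bytes(d[:8], "big")
        h2 = int.from_bytes(d[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, key: str) -> None:
        for i in self._indexes(key):
            self.bits[i // 8] |= 1 << (i % 8)

    def might_contain(self, key: str) -> bool:
        return all(self.bits[i // 8] & (1 << (i % 8)) for i in self._indexes(key))


class GatedStore:
    """Hypothetical slow store (think: one on-disk file) behind a filter."""

    def __init__(self):
        self._slow = {}            # stands in for disk or a remote service
        self._filter = TinyBloom()
        self.slow_reads = 0

    def put(self, key: str, value: str) -> None:
        self._slow[key] = value
        self._filter.add(key)

    def get(self, key: str):
        if not self._filter.might_contain(key):
            return None            # definite miss: the slow path is skipped
        self.slow_reads += 1       # "maybe": pay for the exact check
        return self._slow.get(key)


store = GatedStore()
store.put("user:1", "alice")
print(store.get("user:1"), store.slow_reads)   # found, after one slow read
print(store.get("user:2"), store.slow_reads)   # None, almost certainly with no extra slow read
```

Note that get still returns the right answer on a false positive: the exact lookup simply comes back empty, and the only cost is one wasted slow read.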

The catch nobody mentions first: you can't delete

Want to remove an element? You'd flip its k bits back to 0. Except — those bits might be shared with other elements you didn't remove. Zero out "cat"'s bits and you might have just told the filter that "dog" is gone too, because they overlapped at position 9.

Clear those shared bits and you've created the one thing a Bloom filter promised would never happen: a false negative. The whole guarantee collapses.

So a plain Bloom filter is add-only. If you need deletion, you reach for a variant:

Counting Bloom Filter: replace each bit with a small counter (say 4 bits). add increments, remove decrements, contains checks for non-zero. Deletion works, at ~4× the memory.
Scalable Bloom Filter: don't know n in advance? This one chains progressively larger filters as it fills, holding your target FP rate without you having to size it up front.
Cuckoo Filter: supports deletion and often uses less space than a counting filter at low FP rates, by storing tiny fingerprints in a cuckoo-hash table. The modern go-to when you need removals.

You don't need these on day one. But knowing they exist means you won't paint yourself into a corner when a requirement changes.
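As an illustration of the first variant, here is a minimal counting Bloom filter. It is a sketch only: the class name, the fixed m and k, and the use of a plain Python list for the counters are assumptions made for readability (a real implementation would pack small fixed-width counters and handle overflow).

```python
import hashlib

class CountingBloomFilter:
    """Sketch: one small counter per slot instead of one bit."""

    def __init__(self, m: int = 1024, k: int = 4):
        self.m, self.k = m, k
        self.counters = [0] * m    # a real one would pack ~4-bit counters

    def _indexes(self, item: str):
        d = hashlib.sha256(item.encode("utf-8")).digest()
        h1 = int.from_bytes(d[:8], "big")
        h2 = int.from_bytes(d[8:16], "big") | 1
        return [(h1 + i * h2) % self.m for i in range(self.k)]

    def add(self, item: str) -> None:
        for i in self._indexes(item):
            self.counters[i] += 1

    def remove(self, item: str) -> None:
        # Only remove items that were really added. Removing something that
        # merely looks present (a false positive) corrupts the counters.
        if not self.contains(item):
            return
        for i in self._indexes(item):
            self.counters[i] -= 1

    def contains(self, item: str) -> bool:
        return all(self.counters[i] > 0 for i in self._indexes(item))


cbf = CountingBloomFilter()
cbf.add("cat")
cbf.add("dog")
cbf.remove("cat")
print(cbf.contains("dog"))  # True: dog's counters are still above zero
print(cbf.contains("cat"))  # False, unless every slot of "cat" is shared with "dog"
```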

When to reach for one (and when not to)

Reach for a Bloom filter when:

The expensive thing is a disk read, network call, or recomputation, and you want to skip it on definite misses.
"Probably yes, let me double-check" is an acceptable answer.
You have way more data than RAM, and exact membership is a luxury you can't afford.

Skip it when:

You need the actual stored values back, not just yes/no. (It stores nothing. There's nothing to retrieve.)
A false positive is dangerous rather than merely wasteful, e.g. "is this transaction already processed?" where a wrong "yes" means silently dropping a real payment.
Your dataset is small enough that a plain hash set fits comfortably. Don't add probabilistic machinery to save a few megabytes you weren't short on.

The one-line takeaway

A Bloom filter trades a sliver of accuracy — in a single, controllable, never-dangerous direction — for an enormous cut in memory. It can't tell you what's in the set, and it'll occasionally claim something's there when it isn't. But when it says "definitely not," it is never, ever wrong.

In a world where the slow part is almost always disk or network, a structure that lets you confidently skip work is worth far more than one that's perfectly precise. Being wrong on purpose, in exactly the right way, turns out to be one of the most useful tricks in systems engineering.

If you build one, try logging your actual false-positive rate against the p you configured; watching the math hold up in practice is oddly satisfying. And if you want to really internalize it: fill a filter close to its expected capacity, then call contains on things you never added and watch it occasionally return True. That moment of "wait, that's a feature?" is when it clicks.
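A measurement along those lines might look like the sketch below. The key names ("member-…", "stranger-…"), the sample sizes, and the one-byte-per-bit array are arbitrary choices made to keep the example short; it is not the BloomFilter class from earlier, just the same idea inlined.

```python
import hashlib
import math

def indexes(item: str, k: int, m: int):
    d = hashlib.sha256(item.encode("utf-8")).digest()
    h1 = int.from_bytes(d[:8], "big")
    h2 = int.from_bytes(d[8:16], "big") | 1
    return [(h1 + i * h2) % m for i in range(k)]

n, p = 10_000, 0.01
m = math.ceil(-n * math.log(p) / math.log(2) ** 2)
k = max(1, round(m / n * math.log(2)))
bits = bytearray(m)  # one byte per bit: wasteful, but keeps the sketch short

# Fill the filter to its expected capacity
for i in range(n):
    for idx in indexes(f"member-{i}", k, m):
        bits[idx] = 1

def contains(item: str) -> bool:
    return all(bits[idx] for idx in indexes(item, k, m))

# Every inserted key must be found; keys never inserted should rarely be
false_negatives = sum(not contains(f"member-{i}") for i in range(n))
trials = 20_000
false_positives = sum(contains(f"stranger-{i}") for i in range(trials))
print(f"configured p={p}, measured={false_positives / trials:.4f}, "
      f"false negatives={false_negatives}")
```

The measured rate should land near the configured p, and the false-negative count should always be zero.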

I Built Cursor for Spreadsheets.. But What for?

Hassan Zahar Rifat — Tue, 01 Jul 2025 20:30:53 +0000

🚀 Why I Built It

I manage a side project with a customer base, and like a lot of solo builders, I frequently use Google Sheets to keep track of metrics, revenue, and day-to-day data.

Over time, I found myself doing the same repetitive tasks (writing formulas, cleaning up tables, copying logic across rows), and it started to feel inefficient. Not difficult, just unnecessarily manual.

That’s when I realized I didn’t want to build another product just for the sake of the hackathon. I wanted to build something that I would actually use, something that would solve a real business problem.

So I scrapped my original idea and started working on a spreadsheet that behaves more like an assistant. One where I could type plain language and get back working formulas, insights, or even full summaries without needing to remember exact syntax or jump between tabs.

That’s how Cellmate AI started.

🔧 Building with Bolt

Once I committed to the idea, I had around 15 days left in the World's Largest Hackathon presented by Bolt. To move quickly, I relied on Bolt.new to scaffold most of the application — from UI components to basic functionality.

Almost every major feature started with a Bolt prompt.

Some initial examples:

"Create a React spreadsheet grid with editable cells"
"Add a formula bar which will for now contain the cell value"
"Add a toolbar with basic formatting like color, bg, alignments"
"Add CSV import/export support"

Bolt helped me move fast, especially when I broke down prompts into focused tasks. Larger prompts often generated bloated or buggy code, so I kept things small and stitched the parts together manually.

When Bolt-generated output broke existing logic or styling, I cleaned it up myself. I avoided over-engineering and left out anything that wasn't essential.

🧠 Prompt Strategy and Workflow

My workflow eventually settled into this loop:

Write a clear, single-purpose prompt
Let Bolt generate a scaffold
Test it immediately
Patch or rewrite the pieces that broke
Move to the next task

By keeping each step tight, I avoided the usual AI-overhead and kept things predictable. This approach worked well — especially when combining AI-generated logic with my own cleanup.

📸 What the App Does

Cellmate AI is a lightweight spreadsheet app with built-in AI support — designed to make working with data faster and less manual.

Here’s what it currently supports:

Formula generation from plain text: Type something like "Sum column B if column C is complete" and it returns a working =SUMIF() formula.
Sheet-level changes via prompt: You can ask it to delete rows, add columns, or clean up sections without touching any menu.
Natural language insights: Ask questions like "Which product had the highest revenue?" or "How many users signed up last week?" and it gives you answers based on the data in the sheet.
Auto-generated summary reports: One prompt can generate a full summary of the sheet contents.
CSV import/export: Quickly upload or download data.
Supabase integration for persistence and auth: User sessions and sheet data are synced using Supabase, so it works across devices.

🧰 Tech Stack

Bolt (scaffolding + code generation)
React + TypeScript
ShadCN
TailwindCSS
Vite
OpenAI (natural language → formula/insight)
Supabase (auth + database)
Hosted on Netlify

✅ What’s Working

Spreadsheet grid with editable cells
Formula generation from plain text
Sheet-level structural changes via prompt
Insights and summary report generation
CSV import/export
User login and data sync via Supabase

Here's a demo video describing the current stage (0.75x might help):

🐞 What’s Missing (for now)

No multi-sheet/tab support
No real-time collaboration
No AI-generated charting or visualization tools
Some UX rough edges in prompt result placement

💡 What I Learned

Building from a real pain point made it easier to stay focused.
Bolt can be used beyond prototyping; it landed some great features when I gave it clear-cut instructions.
Prompt clarity mattered more than prompt length: vague requests broke things quickly. (Thankfully, there's a revert/undo option.)
I didn’t try to do everything, and it helped me finish a foundational MVP.

🔜 What’s Next

Support for Excel file uploads
Summary dashboards and report saving
Sharing and collaboration features
Possibly releasing a public version with pricing

Thanks to Bolt, DEV, and the hackathon team — the pressure helped me shift gears and build something that I'm happy about.

Try it out: https://cellmateai.xyz

Questions, feedback, bugs? Happy to hear them.

Communication Among Services in Microservices Architecture? Let's Clear it Out!

Hassan Zahar Rifat — Mon, 27 May 2024 19:10:37 +0000

When services communicate with each other in a microservices architecture, there are two common patterns: Sync and Async.

In Sync architecture, services call each other directly, typically via API calls or other request/response methods. This means that if a service is down, every service that depends on it is affected as well. On the other hand, this pattern can simplify handling data consistency and reduce data duplication, as each service can fetch data when needed.

In Async architecture, services send messages to each other via messaging systems like message queues or event streams, where messages are sent without requiring an immediate response. Services are relatively independent and don’t need to wait for each other to respond. If one service fails, the other services continue functioning, and the failed service processes its pending messages once it is restored. But this approach usually requires data duplication and extra consistency handling mechanisms, since each service most likely needs to have its own copy of the data to serve requests independently.
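The difference in failure behavior can be shown with a toy example. Everything here is hypothetical: EmailService stands in for any downstream service, and an in-process deque stands in for a real message broker, so this is a sketch of the idea rather than of any particular messaging system.

```python
from collections import deque

class EmailService:
    """Hypothetical downstream service that can be switched off."""
    def __init__(self):
        self.up = True
        self.sent = []

    def send(self, order_id: int) -> None:
        if not self.up:
            raise ConnectionError("email service is down")
        self.sent.append(order_id)

def place_order_sync(order_id: int, email: EmailService) -> str:
    email.send(order_id)           # direct call: fails if the callee is down
    return "ok"

def place_order_async(order_id: int, queue: deque) -> str:
    queue.append(order_id)         # publish and return without waiting
    return "ok"

def drain(queue: deque, email: EmailService) -> None:
    while queue and email.up:      # consumer catches up once it is healthy
        email.send(queue.popleft())

email, queue = EmailService(), deque()
email.up = False                   # simulate an outage

try:
    place_order_sync(1, email)
    sync_result = "ok"
except ConnectionError:
    sync_result = "failed"         # the caller fails along with its dependency

async_result = place_order_async(2, queue)   # still succeeds during the outage
email.up = True
drain(queue, email)
print(sync_result, async_result, email.sent)  # failed ok [2]
```

The sync order is lost unless the caller retries, while the async order waits in the queue until the email service is back.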

It might seem like the async approach is the best choice for unicorn projects, but this isn't always the case. Whether to use the Sync or Async pattern depends on various factors such as the requirements of the software, budget, market demand, etc. It's important to note that gaining one advantage usually means sacrificing another. Most large-scale software is built on a hybrid of both patterns to balance these trade-offs.

I'm building a simple application using microservices architecture in my free time for fun.

Shoot any questions you have. Stay tuned to get updates <3

10 Impeccable JavaScript Projects for Beginners

Hassan Zahar Rifat — Mon, 26 Jun 2023 09:17:32 +0000

Hey, JavaScript enthusiasts!

Congrats on your decision to learn JavaScript. You've entered an ocean of opportunities. Now, without any beating around the bush, I'll help you push your level forward from novice to semi-intermediate.

How's that possible? Well, your skills can be sharpened like a sword only by doing projects. But what projects would be the best fit for beginners? We'll uncover that now.

If you're just starting out with JavaScript, here are the 10 hottest beginner-friendly JS projects I'd suggest:

1. Drum Kit:

Create a virtual drum kit where users can play different drum sounds by pressing corresponding keys on their keyboard or by clicking on the drum pads. Enhance it with visual feedback and interactive animations.

Play the 'numb', record, and share!

2. Image Slider:

Develop an image slider that cycles through a collection of images. Implement features like navigation arrows, auto-play, and image captions. You can even add transition effects for a visually appealing slider. It'd be a great utility library.

3. Quiz Application:

Develop an interactive quiz app with multiple-choice questions. Implement a scoring system, timers, and a progress bar to make it engaging. Add different difficulty levels or integrate external APIs to fetch quiz questions from various categories.

4. Expense Tracker:

Build an expense tracker that allows users to add and categorize their expenses. Implement features like income tracking, expense filtering, and graphical representations of spending patterns. It'd help you grasp data manipulation and data visualization.

5. Chat Application:

Develop a real-time chat application using technologies like Node.js and Socket.IO. Enable users to join rooms, exchange messages, and display online/offline status. Enhance it further with features like file sharing and user authentication.

6. Recipe Finder:

Create a recipe finder app that fetches recipes from various sources using APIs. Implement search filters, sorting options, and user ratings. You can even include advanced features like dietary restrictions, personalized recommendations, and meal planning.

7. GitHub Profile Viewer:

Develop an application that fetches and displays GitHub user profiles. Include features like repositories, followers, activity feeds, and user statistics. You can also experiment with data visualization libraries to present the information.

8. Music Player:

Build a sleek and user-friendly music player that can stream and play music from various sources. Implement features like playlists, song searching, and audio visualization. Take it a step further by integrating APIs to fetch lyrics or album information.

9. E-commerce Store:

Create a fully functional e-commerce website with features like product listing, search, shopping cart, and secure payment integration. Focus on responsive design and smooth user experience.

I know it's a challenging and super common project, but worth it.

10. Social Media Dashboard:

Develop a social media dashboard that aggregates data from multiple platforms like LinkedIn, Twitter, and Instagram. Implement features like post-scheduling, analytics, and social media integration.

PS: It's my personal favorite!!

Thanks for checking out this blog.
Now the first thing to do:
Pick up one and start.
Happy coding! 🚀

Can I Show Pie Charts on My Website? - Introducing Recharts

Hassan Zahar Rifat — Sat, 05 Mar 2022 03:18:03 +0000

Pre-requisite: Basic React.js

Hello developers! Thanks in advance for your interest. Maybe at this moment, you're thinking about improving the UX of your website by visualizing data in the form of pie charts or something like that. Because at the end of the day, user impression mostly depends on the UX. So the good news is, if you're using React, you can work on data visualization easily with the Recharts package.

What is Recharts?
Hold on a minute before going to the main attraction. Do we know what Recharts is? According to the official documentation, Recharts is an npm package for React projects, built on top of SVG elements (so we can follow SVG styling rules to style it) with a lightweight dependency on D3 (a JavaScript library for visualizing data) submodules. It's customizable by changing prop values.

Installation
Okay, now! moving on to the installation.

npm install recharts

Importing Components
After installation, we can use the components of Recharts by importing them. To make a simple pie chart, we need to import ResponsiveContainer, PieChart, Pie and Tooltip. ResponsiveContainer is a wrapping container with responsive behavior. PieChart is a canvas component. Inside this component, one or many Pie components can be declared. Other features of the pie chart(s) on the canvas can also be declared inside PieChart (such as Tooltip). Pie is the component that draws a pie chart. Tooltip is used if we want to show details on hover.

import React from 'react';
import { ResponsiveContainer, PieChart, Pie, Tooltip } from 'recharts';

Structure of the raw data
Let's understand the structure of the data we have to have. In this particular example, we should have an array of objects and each object will have name and value keys with their corresponding values. name (string type) would contain the title of the data and value (number type) would be the data. For example,

const data = [
  { name: 'A', value: 400 },
  { name: 'B', value: 300 },
  { name: 'C', value: 300 },
  { name: 'D', value: 200 },
  { name: 'E', value: 100 },
  { name: 'F', value: 700 },
];

Core Components and Explanation
After that, we'll be able to draw our pie chart in the twinkling of an eye. We have to write our code inside the return of the component. Let's have a look at the code. Don't worry, I won't leave without explaining the necessary parts.

    return (
      <ResponsiveContainer width="100%" height="100%">
        <PieChart width={400} height={400}>
          <Pie
            dataKey="value"
            isAnimationActive={true}
            data={data}
            cx="50%"
            cy="50%"
            innerRadius={0}
            outerRadius={80}
            fill="#8884d8"
            label
          />
          <Tooltip />
        </PieChart>
      </ResponsiveContainer>
    )

We have assigned the canvas a size of 400x400 in the PieChart component. After that, we have a decent number of props on the Pie component, in the style of SVG attributes. cx and cy define the x and y coordinates of the pie's center. Assigning 50% to both cx and cy means the pie chart will be shown at the center. label means label={true}, and we'll get the pie chart labeled nicely with the values if label is true. isAnimationActive enables the default animations. If we don't want the animation, we have to assign false. fill sets the fill color of the sectors. outerRadius defines the outer radius of the pie. If we want to make the pie hollow, we need to set innerRadius to a value greater than 0. Most importantly, we need to pass the dataset as the prop named data. And finally, we must define the dataKey prop as value, so that it can extract the value of the value key from each object in the dataset and do the elementary math for visualizing the pie chart.

Concluding Remarks
So far, we've got enough to get started. If you like and appreciate this blog, we'll go deeper into data visualization. Note: I'm not going to attach any preview image of the pie chart. Try it yourself, show us the pies, and best of luck!

Wanna Create your Own Version of Messenger? - Learn Setting up Socket.io

Hassan Zahar Rifat — Fri, 25 Feb 2022 14:04:30 +0000

Pre-requisite: Basic React.js, Basic Express.js, CLI

Hello amazing developers! Feeling bored? How about starting to make something like Messenger, WhatsApp or a text version of Zoom? If you know the basics of React and Express, you're good to go.

Today, we'll start learning Socket.io to serve our goal, and by the end of this post, we'll be able to set up Socket.io properly.

What is Socket.io?
-> According to the official documentation, Socket.io is a library that enables real-time, bidirectional, event-based communication between browser (client side) and server.

It uses a WebSocket connection (a communications protocol providing a full-duplex channel over a single TCP connection) whenever possible and, if that fails, falls back to HTTP long polling (a half-duplex technique). A WebSocket connection [a whole other chapter] is pretty fast, as both sides can send and receive data at any time over the same connection.

One important note: Socket.io is easier to use and offers more features than plain WebSocket, and it does use WebSocket for data transport, but a Socket.io client cannot connect to a plain WebSocket server, and a plain WebSocket client cannot connect to a Socket.io server. Okay, enough theory. Let's make some real good stuff now!

Installation: We have to install socket.io, express, cors and nodemon (to restart the server automatically on file changes) on the Node server. We also need to initialize the project to generate the package.json file, and create an index.js file where we will write the code. Then we'll install react and socket.io-client for the client side. Finally, we'll run both the server and the client.

Server side:

npm init -y
echo null > index.js [using CMD]
npm install -g nodemon
npm install socket.io express cors

Then, in the generated package.json, inside "scripts": {...}, add

"start": "node index",
"start-dev": "nodemon index"

and start the server:

npm run start-dev

Client side:

npx create-react-app name-of-the-app
cd name-of-the-app
npm install socket.io-client
npm start

Now what? Now, first set up the server with some complementary work. We'll import express, cors (!important), the socket.io package and the built-in http Node module (this will be used to create an HTTP server). After that, we'll specify the port number with process.env.PORT || 5000 (process.env.PORT will be set by the hosting site after deployment).

const express = require('express');
const cors = require('cors');
const socketio = require('socket.io');
const http = require('http');
const port = process.env.PORT || 5000;

Now, we'll initialize express and use cors (Cross-Origin Resource Sharing >> helps to prevent requests from being blocked due to a different origin. For example, a request from localhost:3000 to localhost:5000 will be blocked if we don't use cors).

const app = express();
app.use(cors());

Now, we'll create an http server on top of express and connect it with socket.io.

const server = http.createServer(app);
const io = socketio(server);
// An options object can be passed as a second argument;
// it will be used in future to handle cors policy

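Now, inside the io.on() method, the 'connection' event will be declared with an instance of socket. Finally, server.listen() starts the server on the chosen port.

io.on('connection', (socket) => {
    // console.log('New connection!');
    // codes...
});

// Start listening; without this the server never accepts connections
server.listen(port, () => console.log(`Server running on port ${port}`));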

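All the code related to socket.io will be written inside the io.on() callback. Now, moving on to the client side. To set everything up, we'll import socket.io-client and pass the endpoint containing the server-side URL inside a useEffect with an empty dependency array, so that the connection is created once and stays open instead of being re-created on every render.

import { useEffect } from 'react';
import io from 'socket.io-client';
...
...
let socket; // declared outside the component so the reference survives re-renders
// Inside Component
useEffect(() => {
    socket = io('http://localhost:5000/');
    // Close the connection when the component unmounts
    return () => socket.disconnect();
}, []);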

That covers installation, client- and server-side initialization, and the basic setup.

If you like this blog, we'll definitely be going deeper towards Socket.io in my upcoming blogs. Happy developing :3

You Must Know the Answers to the 7 Most Basic Questions about React

Hassan Zahar Rifat — Wed, 12 Jan 2022 17:45:12 +0000

1. What is reactjs? Tell us about advantages and disadvantages of using react js.

-> React.js is a JavaScript library that is used to build scalable Frontend UI.

Advantages:
Easy to learn.

Gives syntactic sugar by which HTML code can be written inside JavaScript.
By writing one component once, it can be used whenever it needs to be used.
Huge community support

Disadvantages:

Core React.js (client-side rendered by default) is not SEO friendly
Huge dependency on many third party libraries.

2. What is JSX? How does it work?
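-> JSX stands for JavaScript XML. It is syntactic sugar that makes React.js easier to write. By using JSX, we can write HTML-like markup inside JavaScript without the burden of using createElement(), appendChild() or template literals. Under the hood, a compiler such as Babel turns each JSX tag into a plain JavaScript function call (React.createElement or the newer JSX runtime), so the browser only ever sees regular JavaScript.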

3. What is Virtual dom? What are the differences between virtual and real dom?
Or what is the diff algorithm? How does it work?
-> Virtual DOM is a lightweight in-memory copy of the real DOM, kept in sync with the real DOM by ReactDOM. Manipulating the real DOM is slow, so re-rendering the whole document for a small change is wasteful; updating only the affected portion is far more efficient. Whenever a change happens, React builds a new virtual DOM tree, compares it with the previous one using the diff algorithm, and then updates only the parts of the real DOM that actually changed.

4. Differences between props and state?
-> Props are read-only and are passed from a parent component down to its child components, whereas state is owned by the component itself and can be changed through its setter function.

5. What is the purpose of useState? When and why will you use it?
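-> The useState hook is used to declare a state variable in a function component. It returns the current value and a setter function, and calling the setter re-renders the component with the new value. Use it whenever the UI depends on data that changes over time. -> const [state, setState] = useState();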

6. What is prop drilling?
-> Sometimes it becomes necessary to pass a value to a child component, and from that child component on to its own child component, as props. This process of passing props down through nested components is called prop drilling.

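7. Why do we need to inject dependencies for useEffect?
-> useEffect takes a dependency array so React knows when to re-run the effect: the code inside useEffect executes after the first render and again whenever one of the listed dependencies changes. Without the array, the effect runs after every render.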