DEV Community

Cover image for Create a simple and fast search bar using cosine similarity in React โš›๏ธ๐Ÿ”
Philip
Philip

Posted on

Create a simple and fast search bar using cosine similarity in React โš›๏ธ๐Ÿ”

Introduction ๐Ÿ“œ

In this article, we are going to explore cosine similarity and how to use it to implement a very simple yet effective search bar. Although the topic includes a bit of math, the implementation will be as simplistic as possible, yet we are still going to cover exactly how things work under the hood.

The problem ๐Ÿค”

If you are here, you are probably wondering, "How do I search for things in the first place?". The answer to this question lies in a field of computer science known as Data Retrieval. We are going to use one of the most fundamental methods used in data retrieval which, while not perfect, is surprisingly easy to implement and yields satisfying results.

Cosine Similarity ๐Ÿงฎ

Cosine similarity is a measure used to calculate how similar two vectors are. In simple terms, it is the cosine of the angle formed by the two vectors in space. Consider the following vectors:

Image description

In this example, the cosine similarity between vectors A and B is the cosine of angle ฮฑ, which is cos(26.57ยฐ) = 0.8943
This means that the vectors are quite similar.

We can also use this method to calculate the similarity of words.
Imagine we had words that could only have the letters A and B.

Word 1: "aaab"

Word 2: "abb"

Users input word: "aab"

The x-Axis will represent the amount of As and y-Axis the amount of Bs in each word.

Image description

At first glance, we can already see that our input "aab" is much closer to "aaab" than to "abb". Let's go ahead and calculate the cosine of each angle:
cos(8.13ยฐ) = 0.98

cos(36.86ยฐ) = 0.80

This is how we can represent words in a 2D space. But this by itself is not useful. In our world, we use more than just "a" and "b" characters. The English alphabet alone has 26 base characters. Imagine we wanted to support special characters as well, or even multiple languages.
Here we will need to use the general formula for cosine similarity:

Here, vectors A and B are n-dimensional. Note that they should have the same dimensions.

We will compare the following words using this formula:

Word 1 : "banana"

Word 2 : "apple"

User input: "bana"

For each word, we will create a 26-dimensional vector. Each slot represents the frequency of the n-th letter of the alphabet in a given word.

"banana":ย [3,1,0,0,0,0,0,0,0,0,0,0,0,2,0,0,0,0,0,0,0,0,0,0,0,0]

"apple": ย ย ย [1,0,0,0,1,0,0,0,0,0,0,1,0,0,0,2,0,0,0,0,0,0,0,0,0,0]

"bana":ย ย ย ย  [2,1,0,0,0,0,0,0,0,0,0,0,0,1,0,0,0,0,0,0,0,0,0,0,0,0]

cos(bana,banana) = 0.982

cos(bana,apple) = 0.309

Therefore, the word "bana" is more closely related to "banana" than to "apple".

Enough math yapping, time to code ๐Ÿ’ป

We are going to create a cosine similarity search bar in React.
Let's start:

Create a new React app

npx create-react-app search-bar-app

Create a new SearchBar.tsx file

Image description

This is our main component.
Add the SearchBar component inside App.js.

import logo from './logo.svg';
import './App.css';
import { SearchBar } from './SearchBar.tsx';

function App() {
  return (
    <div className="App">
      <header className="App-header">
        <div>
          <SearchBar/>
        </div>
      </header>
    </div>
  );
}

export default App;
Enter fullscreen mode Exit fullscreen mode

From your terminal, install the compute-cosine-similarity package

npm i compute-cosine-similarity
Enter fullscreen mode Exit fullscreen mode

Let's start building our component.

We need 3 variables: one for the query given by the user, one for the results to be displayed back to them, and finally the list of searchable items.

const [query, setquery] = useState<string>("");
const [results, setResults] = useState<string[]>([]);
const items = [
        'Apples',
        'Bananas',
        'Carrots',
        'Dairy Milk',
        'Eggs',
        'Flour',
        'Grapes',
        'Honey',
        'Ice Cream',
        'Juice',
        'Kale',
        'Lettuce',
        'Milk',
        'Nuts',
        'Oranges',
        'Pasta',
        'Quinoa',
        'Rice',
        'Spinach',
        'Tomatoes',
        'Udon Noodles',
        'Vanilla Extract',
        'Watermelon',
        'Yogurt',
        'Zucchini'
    ];
Enter fullscreen mode Exit fullscreen mode

Time for our component's logic.

When a user provides new input, we have to calculate the cosine similarity between the input and every word in our list. Then we return the list of items ordered by cosine similarity in decreasing order.

To calculate the similarity, for each word we construct an array of size 26. Each position in the array represents a letter of the English alphabet. For each character of a word, if the character is an English character, we increase the value of the character's position by 1.

function isEnglishAlphabet(character: string): boolean {
            return /^[a-zA-Z]$/.test(character);
}

function calculateCosineSimilarity(firstWord:string, secondWord:string):number {
            const firstWordMatrix = new Array(26).fill(0);
            const secondWordMatrix = new Array(26).fill(0);

            for (let character of firstWord) {
                if (isEnglishAlphabet(character)) {
                    const index = character.toLowerCase().charCodeAt(0) - 97;
                    firstWordMatrix[index]++;
                }
            }

            for (let character of secondWord) {
                if (isEnglishAlphabet(character)) {
                    const index = character.toLowerCase().charCodeAt(0) - 97;
                    secondWordMatrix[index]++;
                }
            }

            console.log(cosineSimilarity(firstWordMatrix, secondWordMatrix))
            return cosineSimilarity(firstWordMatrix, secondWordMatrix) || 0;
        }
Enter fullscreen mode Exit fullscreen mode

Notice how we access the position of the letter in the matrix. If it is an English character, we get its ASCII value and subtract 97, which is the ASCII value of the letter "a".

We can now use this method on every item to return the results to the user. For each item, we create a pair of its name and cosine similarity with the user input. Then we filter out the items that are less than 0.5 similar and finally order by similarity.

function search():void {
            let newResults = items.map(item =>({
                name: item,
                similarity: calculateCosineSimilarity(query,item)
            }))
            .filter(item => item.similarity > 0.5)
            .sort((a, b) => b.similarity - a.similarity)
            .map(item => item.name);

            setResults(newResults);
        }
Enter fullscreen mode Exit fullscreen mode

We can gather all the methods above in a single useEffect that runs whenever the user input changes.

useEffect(() => {

        function isEnglishAlphabet(character: string): boolean {
            return /^[a-zA-Z]$/.test(character);
        }

        function calculateCosineSimilarity(firstWord:string, secondWord:string):number {
            const firstWordMatrix = new Array(26).fill(0);
            const secondWordMatrix = new Array(26).fill(0);

            for (let character of firstWord) {
                if (isEnglishAlphabet(character)) {
                    const index = character.toLowerCase().charCodeAt(0) - 97;
                    firstWordMatrix[index]++;
                }
            }

            for (let character of secondWord) {
                if (isEnglishAlphabet(character)) {
                    const index = character.toLowerCase().charCodeAt(0) - 97;
                    secondWordMatrix[index]++;
                }
            }

            console.log(cosineSimilarity(firstWordMatrix, secondWordMatrix))
            return cosineSimilarity(firstWordMatrix, secondWordMatrix) || 0;
        }

        function search():void {
            let newResults = items.map(item =>({
                name: item,
                similarity: calculateCosineSimilarity(query,item)
            }))
            .filter(item => item.similarity > 0.5)
            .sort((a, b) => b.similarity - a.similarity)
            .map(item => item.name);

            setResults(newResults);
        }

        search();
    }, [query])
Enter fullscreen mode Exit fullscreen mode

For the final step, we need to create the view of our component. We'll use a simple input that holds the value of "query" and whenever the value inside changes, the query will be updated. This will trigger the useEffect we created. We will map the results in a list just below the search bar.

return (
        <div>
            <input 
                type="text" 
                placeholder="Search..." 
                value={query} 
                onChange={(e) => setquery(e.target.value)} 
            />
            <ul>
                {results.map((item, index) => (
                    <li key={index}>{item}</li>
                ))}
            </ul>
        </div>
    );
Enter fullscreen mode Exit fullscreen mode

Final result โœจ๐Ÿ”

Go ahead and type something into the search bar. The top results should be very relevant to what you are looking for!

Image description

Full search bar component code:

import React, { useEffect, useState } from 'react';
import cosineSimilarity from 'compute-cosine-similarity';


export const SearchBar = () => {
    const [query, setquery] = useState<string>("");
    const [results, setResults] = useState<string[]>([]);
    const items = [
        'Apples',
        'Bananas',
        'Carrots',
        'Dairy Milk',
        'Eggs',
        'Flour',
        'Grapes',
        'Honey',
        'Ice Cream',
        'Juice',
        'Kale',
        'Lettuce',
        'Milk',
        'Nuts',
        'Oranges',
        'Pasta',
        'Quinoa',
        'Rice',
        'Spinach',
        'Tomatoes',
        'Udon Noodles',
        'Vanilla Extract',
        'Watermelon',
        'Yogurt',
        'Zucchini'
    ];

    useEffect(() => {

        function isEnglishAlphabet(character: string): boolean {
            return /^[a-zA-Z]$/.test(character);
        }

        function calculateCosineSimilarity(firstWord:string, secondWord:string):number {
            const firstWordMatrix = new Array(26).fill(0);
            const secondWordMatrix = new Array(26).fill(0);

            for (let character of firstWord) {
                if (isEnglishAlphabet(character)) {
                    const index = character.toLowerCase().charCodeAt(0) - 97;
                    firstWordMatrix[index]++;
                }
            }

            for (let character of secondWord) {
                if (isEnglishAlphabet(character)) {
                    const index = character.toLowerCase().charCodeAt(0) - 97;
                    secondWordMatrix[index]++;
                }
            }

            console.log(cosineSimilarity(firstWordMatrix, secondWordMatrix))
            return cosineSimilarity(firstWordMatrix, secondWordMatrix) || 0;
        }

        function search():void {
            let newResults = items.map(item =>({
                name: item,
                similarity: calculateCosineSimilarity(query,item)
            }))
            .filter(item => item.similarity > 0.5)
            .sort((a, b) => b.similarity - a.similarity)
            .map(item => item.name);

            setResults(newResults);
        }

        search();
    }, [query])

    return (
        <div>
            <input 
                type="text" 
                placeholder="Search..." 
                value={query} 
                onChange={(e) => setquery(e.target.value)}
            />
            <ul>
                {results.map((item, index) => (
                    <li key={index}>{item}</li>
                ))}
            </ul>
        </div>
    );
}
Enter fullscreen mode Exit fullscreen mode

Top comments (0)