Prashant Yadav

Posted on Apr 17, 2023 • Originally published at learnersbucket.com

⚡️ Use ChatGPT on any website 🔥

#chatgpt #beginners #webdev #javascript

A few days ago I created Butler-AI, a Chrome extension that lets you use ChatGPT on any website by simply typing the command “butler: whatever you want to do;”. It works like a charm on many different websites.

You can see how it works in the following image.

After receiving an overwhelming response to it, I decided to write a simple tutorial on how it works and how you can create a similar extension of your own.

So let’s get started.

Set up the Chrome extension

Chrome currently recommends Manifest V3 for defining extensions, and that is what we are going to use. manifest.json is the config file that defines any Chrome extension.

Create a manifest.json in your directory and add the following to it.

{
    "name": "Butler AI - Powered by ChatGPT",
    "description": "Use the power of ChatGPT at your fingertips, The butler will serve its master.",
    "author": "Prashant Yadav",
    "version": "0.0.1",
    "manifest_version": 3,
    "permissions": ["storage", "activeTab"],
    "host_permissions": ["<all_urls>"],
    "action": {
        "default_popup": "popup.html"
    },
    "content_scripts": [
        {
            "matches": ["<all_urls>"],
            "run_at": "document_end",
            "js": ["script.js"],
            "all_frames": true
        }
    ]
}

Most of these are self-explanatory; let’s walk through the important properties.

permissions: This defines what the extension has access to. We want activeTab to observe what is being typed on the current tab and respond to it, and storage to use the chrome.storage API for storing secrets such as the ChatGPT API key.
action: default_popup: The HTML page that opens when you click the extension’s icon.
content_scripts: This defines which JavaScript file to inject into a page and when to run it. We inject script.js at document_end (once the DOM has finished loading) on all URLs (<all_urls>) and in all frames, including iframes. All of our logic will live in script.js.

popup.html

I have written a simple message just to make sure popup.html is loading properly.

<!DOCTYPE html>
<html lang="en">
  <head>
    <meta charset="UTF-8" />
    <meta http-equiv="X-UA-Compatible" content="IE=edge" />
    <meta name="viewport" content="width=device-width, initial-scale=1.0" />
    <title>Butler AI</title>
  </head>
  <body>
    <h1>Butler AI Powered By ChatGPT</h1>
  </body>
</html>

script.js

Print a message to check whether the script is being injected properly.

console.log("ButlerAI");

Load the Chrome extension

Now that our boilerplate is ready, let’s load the extension and see whether it is working.

Remember that we have to load it in developer mode on the local machine.

Open the Chrome browser.
Go to Settings > Extensions.
Enable Developer mode in the top right-hand corner.
Click the Load unpacked button in the top left-hand corner.
Navigate to your directory and load it.

Once you have loaded the extension, open a new tab and navigate to StackEdit; the message ButlerAI should be printed in the console.

Observe the text being written on any webpage

For this tutorial, I am going to run the extension on StackEdit, a popular Markdown editor. Now that the extension is loaded, script.js has to observe what the user is typing and check whether anything of the form butler: whatever is the command; has been written.

To do this, we will listen to the keypress event on the whole window (activeTab) and, when the user stops typing, parse the HTML and look for any text that starts with butler: and ends with ;.

Observe what the user is typing on the active tab

We will do this in a debounced event handler, because we want to search only once the user has stopped typing.

// helper function to debounce function calls
function debounce(func, delay) {
    let inDebounce;
    return function () {
        const context = this;
        const args = arguments;
        clearTimeout(inDebounce);
        inDebounce = setTimeout(() => func.apply(context, args), delay);
    };
}

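// debounced function call
// (in script.js, scrapText must be defined above this line,
// since a const cannot be used before its declaration)
const debouncedScrapText = debounce(scrapText, 1000);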

// observe what the user is typing
window.addEventListener("keypress", debouncedScrapText);

Here scrapText is debounced by 1000 milliseconds: it is invoked only once the user has stopped typing for 1 second.
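To see the debouncing behavior outside the browser, here is a minimal sketch that can be run with Node. The names calls and record, and the shortened 50 ms delay, are only illustrative; the debounce helper is a compact rewrite of the one above.

```javascript
// Compact variant of the debounce helper from the tutorial.
function debounce(func, delay) {
  let timer;
  return function (...args) {
    clearTimeout(timer);
    timer = setTimeout(() => func.apply(this, args), delay);
  };
}

// Hypothetical stand-in for scrapText: it just records what it was called with.
const calls = [];
const record = debounce((key) => calls.push(key), 50);

// Three rapid "keystrokes"; each one restarts the timer.
record("a");
record("b");
record("c");

// Nothing has run yet because the delay has not elapsed.
console.log(calls.length); // 0

// After the delay, only the last call has gone through.
setTimeout(() => console.log(calls), 100); // [ 'c' ]
```

The same thing happens with keystrokes: however many keys are pressed in a burst, the page is scanned once, after the typing pauses.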

Finding text that starts with butler

We first have to collect the text from the page and then check whether any of it starts with butler: and ends with ;. If such text is found, we extract the command and keep the HTML node that contains it, so that we can populate the node with the response to the command.

To narrow down the search, rather than parsing the whole page, we will parse only the HTML elements that accept text, that is, the ones where the user can type or provide input.

On StackEdit, the area where the user writes has the attribute contenteditable="true", so we can select this element, get its text, and parse it.

// regex to check the text is in the form "butler: command;"
const getTextParsed = (text) => {
    const parsed = /butler:(.*?)\;/gi.exec(text);
    return parsed ? parsed[1] : "";
};

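// helper function to walk the nodes and return the first [node, command] pair found
const getTextContentFromDOMElements = (nodes, textarea = false) => {
  if (!nodes || nodes.length === 0) {
    return null;
  }

  for (const node of nodes) {
    if (!node) continue;
    // textareas expose their content via value, other elements via textContent
    const value = textarea ? node.value : node.textContent;
    if (value) {
      const text = getTextParsed(value);
      // keep looking at the remaining nodes if this one has no command
      if (text) return [node, text];
    }
  }

  // no node contained a command
  return null;
};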

// function to find the text on active tab
const scrapText = () => {
  const ele = document.querySelectorAll('[contenteditable="true"]');
  const parsedValue = getTextContentFromDOMElements(ele);
  if (parsedValue) {
    const [node, text] = parsedValue;
    makeChatGPTCall(text, node);
  }
};

Here we get all the editable HTML elements, extract their text, and check whether it matches the pattern we expect. Once there is a match, we get that node (HTML element) and the command.
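To make the pattern concrete, here is a small standalone check of what it accepts. The helper name parseButlerCommand and the sample strings are hypothetical, and trimming the captured command is an addition that the code above does not do.

```javascript
// Hypothetical helper: returns the command between "butler:" and the first ";".
const parseButlerCommand = (text) => {
  // i: case-insensitive; (.*?) is lazy, so it stops at the first semicolon
  const match = /butler:(.*?);/i.exec(text);
  return match ? match[1].trim() : "";
};

console.log(parseButlerCommand("butler: write a haiku about tea;"));
// write a haiku about tea

console.log(parseButlerCommand("Notes. Butler: summarize this; more text; here"));
// summarize this

console.log(parseButlerCommand("butler: no semicolon yet") === "");
// true
```

Note that . does not match line breaks, so with this pattern a command has to fit on a single line.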

Different websites accept input in different ways, which is why the helper checks whether the node is a textarea and reads its value accordingly.
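As a rough sketch of how both kinds of input could be covered in one pass, the snippet below looks at contenteditable elements first and then at textareas and text inputs. The helper name findButlerCommand, the selectors, and the returned shape are assumptions for illustration, not the extension's actual code.

```javascript
// Hypothetical helper: find the first element that contains a butler command.
// `root` defaults to the page's document; any object with querySelectorAll works.
const findButlerCommand = (root = document) => {
  const pattern = /butler:(.*?);/i;

  // Each source pairs a selector with the way its text has to be read.
  const sources = [
    { selector: '[contenteditable="true"]', read: (node) => node.textContent },
    { selector: "textarea, input[type='text']", read: (node) => node.value },
  ];

  for (const { selector, read } of sources) {
    for (const node of root.querySelectorAll(selector)) {
      const match = pattern.exec(read(node) || "");
      if (match) return { node, command: match[1].trim() };
    }
  }

  // Nothing on the page contains a command.
  return null;
};
```

Calling findButlerCommand() inside the content script returns either null or an object with the matching node and its command, which can then be handed to the API call.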

Once we have both, we pass them on to make the ChatGPT API call.

Get the command response with ChatGPT API

Create a function that accepts the command and the node and populates the node with the ChatGPT API’s response to the command.

We are going to use the completions API with the text-davinci-003 model. You can use any API and model you prefer, but remember that the free tier has a limited number of tokens, so keep an eye on usage while testing to avoid exhausting the limit. Explore the options in the ChatGPT playground.

You have to pass your API key in the Authorization header to make it work.
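One way to supply the key, sketched under the assumption that it is kept in chrome.storage.local (available through the storage permission in the manifest): the key name openaiApiKey and the helpers saveApiKey and loadApiKey are hypothetical.

```javascript
// Hypothetical storage key; pick any name you like.
const API_KEY_NAME = "openaiApiKey";

// `storage` defaults to chrome.storage.local. It is a parameter only so the
// helpers can be tried outside the browser with a stand-in object.
const saveApiKey = (key, storage = chrome.storage.local) =>
  storage.set({ [API_KEY_NAME]: key });

const loadApiKey = async (storage = chrome.storage.local) => {
  const stored = await storage.get(API_KEY_NAME);
  // Fall back to an empty string when no key has been saved yet.
  return stored[API_KEY_NAME] || "";
};
```

With this in place, the API function could start with const apikey = await loadApiKey(); and the key could be saved once, for example from the popup.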

const makeChatGPTCall = async (text, node) => {
  try {
    const myHeaders = new Headers();
    myHeaders.append("Content-Type", "application/json");
    // apikey must hold your OpenAI API key, defined or loaded before this call
    myHeaders.append("Authorization", `Bearer ${apikey}`);

    // set request payload
    const raw = JSON.stringify({
      model: "text-davinci-003",
      prompt: text,
      max_tokens: 2048,
      temperature: 0,
      top_p: 1,
      n: 1,
      stream: false,
      logprobs: null,
    });

    // set request options
    const requestOptions = {
      method: "POST",
      headers: myHeaders,
      body: raw,
      redirect: "follow",
    };

    // make the api call
    let response = await fetch("https://api.openai.com/v1/completions", requestOptions);
    response = await response.json();
    const { choices } = response;

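    // remove the leading and trailing spaces from the response text
    // (named answer because a const text here would shadow the text parameter
    // and make the prompt above throw a ReferenceError)
    const answer = choices[0].text.replace(/^\s+|\s+$/g, "");

    // populate the node with the response
    node.textContent = answer;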
  } catch (e) {
    console.error("Error while calling openai api", e);
  }
};
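If you would rather use the chat completions endpoint, the request and response have a slightly different shape. The following is a minimal sketch; the helper names buildChatRequest and extractReply are hypothetical, and the model name is only an example.

```javascript
// Build the fetch arguments for the chat completions endpoint.
// The default model is only an example; use whichever chat model you have access to.
const buildChatRequest = (apiKey, prompt, model = "gpt-3.5-turbo") => ({
  url: "https://api.openai.com/v1/chat/completions",
  options: {
    method: "POST",
    headers: {
      "Content-Type": "application/json",
      Authorization: `Bearer ${apiKey}`,
    },
    body: JSON.stringify({
      model,
      // chat models take a list of messages instead of a single prompt
      messages: [{ role: "user", content: prompt }],
      temperature: 0,
    }),
  },
});

// Pull the reply out of the parsed JSON, surfacing API errors instead of
// failing later on an undefined `choices`.
const extractReply = (json) => {
  if (json.error) throw new Error(json.error.message);
  return json.choices[0].message.content.trim();
};

// Usage inside an async function:
//   const { url, options } = buildChatRequest(apikey, text);
//   const reply = extractReply(await (await fetch(url, options)).json());
```

Keeping the request building and the response parsing in separate functions makes it easier to switch models or endpoints without touching the code that writes to the page.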

That’s it. Reload the extension and see the magic; it should work like a charm on StackEdit.

The most challenging part of this extension is reading the values from different websites as the user writes and then populating them back with the response. For security purposes, websites do many things internally and block values from being updated by simply changing the content through JavaScript.
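As a starting point for experimenting, here is a hedged sketch of one possible approach: replace only the command instead of the whole node, then dispatch an input event so that listeners on the page get a chance to notice the change. The helper names replaceCommand and writeBack are hypothetical, and this is not guaranteed to work on every site.

```javascript
// Hypothetical helper: swap only the "butler: ...;" command for the response,
// leaving the rest of the user's text in place. The function form of the
// replacement keeps sequences such as "$&" in the response literal.
const replaceCommand = (fullText, response) =>
  fullText.replace(/butler:.*?;/i, () => response);

// Hypothetical helper: write the new text, then announce the change with an
// input event, the same event that real typing produces.
const writeBack = (node, response) => {
  if ("value" in node) {
    // textarea and input elements
    node.value = replaceCommand(node.value, response);
  } else {
    // contenteditable elements (note: this flattens any rich formatting)
    node.textContent = replaceCommand(node.textContent, response);
  }
  node.dispatchEvent(new Event("input", { bubbles: true }));
};
```

Some editors keep their own internal state and will still ignore or overwrite changes made this way, so each site may need its own handling.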

I made this work on many websites. You can get the source code of Butler-AI for $10, but I leave it up to you to try and make it work on as many websites as possible.

You can also watch the tutorial on my YouTube channel.

Also, follow me on Twitter for tips and tricks on solving coding interviews and more solved examples of algorithms. I write 2-3 posts weekly on my blog, learnersbucket.com.
