Posted on Aug 28, 2024 • Originally published at linkedin.com

The Chrome Extensions Handbook: Memory-Heavy to Production-Ready

#webdev #javascript #tutorial #beginners

Are Web Extensions Slept On?

Absolutely! You can customize your entire browser, and honestly, I didn't realize just how powerful they could be until now.

If you're like me, you've always wanted to have a web extension published—maybe one or two. But here's the thing: all my ideas were either too costly in terms of money or demanded too many computer resources.

We all know the browser is a sandbox. We're at the mercy of the vendors, and for a good reason! There's no way to access a user's device without the browser's permission, which makes building resource-heavy extensions quite the challenge.

Imagine you're working on a screen recorder, and at every 10-minute mark, your CPU is struggling. You think, "I’ll just offload some data to the hard drive." But then, you realize you'll need to ask the user, "Hey, can I save this video chunk on your system for a minute?" And later, "Oh, can I read them all back and can I save them as a final video too?"

That’s a terrible user experience.

This is why I shelved so many web extension ideas—until now! What I discovered is incredible. It not only eliminates the costs I was so worried about, but it also addresses the CPU resource issue.

This devlog might turn into a series, a guidebook on building a new browser experience, walking you through the decisions I had to make along the way.

Getting Started

I assume you have some development experience since you’re reading this! If you know JavaScript, getting started with web extensions is super easy—it’s just a manifest file away. But a quick article introducing extensions won't hurt! I’ve got you covered here:

Chrome Extensions 101

But Why Web Extensions?
The Sandbox Curse
The Solutions
- IndexedDB
- The Server
Unshelving the Extensions?
Native Messaging
Network Programming
In Conclusion

But Why Web Extensions?

Remember that game you loved so much as a kid? You wished there were more features, and you dreamed about how it could be better. Fast forward, and you became an avid internet user and discovered mods, only to realize your original ideas weren’t so original. Someone had already thought of them and done something about it. Yes, game mods, short for modifications.

That’s what web extensions are to the browser. Ever dreamed of a feature or wished the browser did something or looked a certain way? You can modify it with web extensions. They provide an API to mod the browser, all the way to blocking certain links (like ad blockers) or replacing text with memes and gifs (like 7TV in a live stream).

Not to mention the ease of access—everyone has a browser, and minimal installation is required. You know that feeling you have as a developer, where nothing is really out of reach? That’s what makes web extensions so powerful.

So, I wanted to build my extensions, but like I mentioned earlier, I had to shelve them.

The Sandbox Curse

Browsers provide beautiful abstractions and APIs to interact with the system. If you’ve been around long enough, you’ve probably come across a few!

For example, capturing a screen is as simple as:

navigator.mediaDevices.getDisplayMedia({ video: true })

Capturing a webcam:

navigator.mediaDevices.getUserMedia({ video: true })

An audio track:

navigator.mediaDevices.getUserMedia({ audio: true })

Extensions that do this are all over the extension market, but I wanted something ambitious. The catch is that, by default, everything you record from these streams is held in memory. If you record a long enough video, your browser will crash. Let me put this into perspective: a 14-minute video recorded on a 144Hz display with lots of movement is about 400MB. Let’s agree on an average of 100MB per recording.

Now, imagine all this data in memory. Oh, and by the way, this is how you record a video stream in the browser:

const mediaRecorder = new MediaRecorder(stream);
mediaRecorder.start();
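
To make the memory problem concrete, here is a minimal sketch of how recorded data can be surfaced in pieces rather than as one large Blob at the end. It assumes the standard `MediaRecorder` API; the function name `recordInChunks`, the `onChunk` callback, and the injectable `RecorderCtor` parameter are hypothetical and exist only for illustration.

```javascript
// Sketch: hand each recorded chunk to a callback instead of buffering it.
// `RecorderCtor` defaults to the browser's MediaRecorder; it is injectable
// only so the function can be exercised outside a browser (an assumption).
function recordInChunks(stream, onChunk, timesliceMs = 1000,
                        RecorderCtor = globalThis.MediaRecorder) {
  const recorder = new RecorderCtor(stream);

  // Fired roughly every `timesliceMs` with a Blob of encoded media.
  recorder.ondataavailable = (event) => {
    if (event.data && event.data.size > 0) {
      onChunk(event.data);
    }
  };

  // Without a timeslice, a single large Blob is delivered when stop() is called.
  recorder.start(timesliceMs);
  return recorder;
}
```

Whatever `onChunk` does with each piece decides where the data ends up: pushing chunks into an array keeps them all in memory, which is exactly the problem described here.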

The only way my dream extension could become a reality was to store the video chunks somewhere else.

Not giving up, I came up with a few options—some good, and some I just wasn’t willing to go through with.

The Solutions

IndexedDB, Server, or Just Shelve It!

IndexedDB

The browser does provide simple storage options like localStorage and cookies, but they are limited to small amounts of data. Then comes IndexedDB, which can store far more. Each origin is allowed to use a share of the user’s disk, which can run to gigabytes: exactly what I wanted.
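
How much space an origin actually gets varies by browser and device, so it can be worth checking at runtime. The sketch below uses the standard `navigator.storage.estimate()` call; the helper name `remainingStorageBytes` and the injectable `storage` parameter are assumptions made for illustration.

```javascript
// Sketch: ask the browser roughly how many bytes this origin may still store.
// `storage` defaults to navigator.storage and is injectable for testing only.
async function remainingStorageBytes(storage = globalThis.navigator?.storage) {
  // estimate() resolves to approximate { quota, usage } values in bytes.
  const { quota = 0, usage = 0 } = await storage.estimate();
  return Math.max(quota - usage, 0);
}
```

The numbers are estimates, so treat them as a rough guide rather than a guarantee that a write of that size will succeed.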

I could capture the screen, save the video chunks temporarily in IndexedDB, and when the user is done, read them back up and try to stitch them together into a full video.

const dbRequest = indexedDB.open('videoChunks', 1);
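
A slightly fuller sketch of that idea, assuming the standard IndexedDB API, might look like the following. The object store name `chunks` and the helper names `openChunkDb` and `saveChunk` are hypothetical, and this is not the code from the extension itself.

```javascript
// Sketch: open the database and create a store for chunks on first use.
// `idb` defaults to the global indexedDB and is injectable for testing only.
function openChunkDb(idb = globalThis.indexedDB) {
  return new Promise((resolve, reject) => {
    const request = idb.open('videoChunks', 1);
    request.onupgradeneeded = () => {
      // Runs only when the database is created or its version changes.
      request.result.createObjectStore('chunks', { autoIncrement: true });
    };
    request.onsuccess = () => resolve(request.result);
    request.onerror = () => reject(request.error);
  });
}

// Sketch: append one recorded chunk (a Blob) to the store.
function saveChunk(db, chunk) {
  return new Promise((resolve, reject) => {
    const tx = db.transaction('chunks', 'readwrite');
    tx.objectStore('chunks').add(chunk);
    tx.oncomplete = () => resolve();
    tx.onerror = () => reject(tx.error);
  });
}
```

With `autoIncrement` keys, chunks can later be read back in insertion order, which is what stitching them into one video relies on.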

It worked perfectly about 70% of the time, but there were two problems. First, a video chunk would occasionally get corrupted (I don't even know how to debug this). Second, if the video is large enough, we still have to read all the chunks back out at the end and stitch them together, meaning the whole video ends up in memory one way or another.

Back to square one!

The server was looking like the only sane option! And I am mostly a backend engineer—yes, full stack, but I handle the backend, ML, and graph data science more. I am familiar with streaming data to or from the server!

But…

The Server

Okay, I'll admit it: I am a very cheap, minimal developer! Ever since I watched this video by Fireship, where he created a live chat application and only paid $5 for a server, I’ve been determined to exhaust all my free options.

This is so prevalent in my career that after five days on the job, with consent and some convincing, I was allowed to migrate a legacy AWS server, which was consuming $5,000 per month, to a $25 Supabase BaaS solution.

As you can tell by now, the server option wasn’t cutting it for me. I had to shelve my ideas and extensions.

RIP for real this time!

Maybe it wasn’t meant to be?

Until one day, I downloaded this extension that came with a “co-app.” Wait, hold on—a co-app? I was puzzled and ready to figure out how this was even possible.

Unshelving the Extensions?

Enter IDM—the famous, blazing-fast download manager. It’s two-pronged: a web extension and a native app. If you’ve never used it, the extension detects any videos on a webpage and shows some UI options to download it.

When you hit download, it launches a native executable app. Clearly, the extension isn’t doing the downloading; there’s no way a browser can download that fast! Some magic?

The gears in my head started turning—wait, I can use an extension to send my video chunks to the system?

I don’t know how to paint a picture of how much this drove me crazy, trying to wrap my head around this!

I spent hours thinking—sockets? But how do they initiate the initial connection, and how does the native app know to launch itself and grab the video link?

I went so deep I allegedly tried reverse engineering a “similar” tool. That was not pretty—assembly? You know what I mean!

All I learned was that this tool can grab or serve HTML, for some reason. Other than that, I am not reading that ghoulishly beautiful code:


 __unwind { // sub_14125D6B4
.text:0000000140096B80                 mov     [rsp+arg_0], rbx
.text:0000000140096B85                 push    rdi
.text:0000000140096B86                 sub     rsp, 150h
.text:0000000140096B8D                 mov     rax, cs:__security_cookie

To be fair, I know some assembly—I did take the Nand2Tetris course a while back! But I wasn’t ready to read thousands of lines of that. Oh, hell no!

After this, I was back to the beginning—but now with a newfound interest in how they did it.

Yep, it was a “Web extension communicate with native app” Google search away! I spent hours, and the solution was seconds away.

Native Messaging

Web extensions provide an API to communicate with native applications. This was the breakthrough I needed!

The setup might seem tedious, but once you get it, you're just a few lines of code away from connecting to a native application.

const port = chrome.runtime.connectNative('com.my_company.my_application');

That’s really all it takes. I nearly got lost in thousands of lines of assembly code from dwhelper, only to realize their companion app was open-source Python code—easy to read!

So, what do you need? Just a manifest file, a registry key pointing to it, and an executable. It sounds complex, but it’s mostly just tedious, and every platform covers it well:

Chrome - link
Mozilla - link
Edge - link

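Here's how it works: the native messaging host manifest (not to be confused with your extension's own manifest.json) acts as a blueprint for the native app, telling the browser where to find it and which extensions are allowed to talk to it. Here’s an example manifest: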

{
  "name": "com.my_company.my_application",
  "description": "My Application",
  "path": "C:\\Program Files\\My Application\\chrome_native_messaging_host.exe",
  "type": "stdio",
  "allowed_origins": ["chrome-extension://knldjmfmopnpolahpmmgbagdohdnhkik/"]
}

The key part is the path—it points directly to your native application. This file is packaged with the native app.

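To let Chrome find this manifest on Windows, you use a registry key (on macOS and Linux, the manifest file is placed in a browser-specific directory instead). Think of it as setting an environment variable, but for your application: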

REG ADD "HKCU\Software\Google\Chrome\NativeMessagingHosts\com.my_company.my_application" /ve /t REG_SZ /d "C:\path\to\nmh-manifest.json" /f

This command adds a key to the registry that points to your manifest file. Now, when your extension calls:

const port = chrome.runtime.connectNative('com.my_company.my_application');

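Chrome will look up com.my_company.my_application in the registry, read the manifest, check that your extension's ID is listed in allowed_origins (your extension also needs the nativeMessaging permission in its own manifest), and, if allowed, launch your native app. You can then communicate with it like this:

// Messages arrive already parsed from JSON, so log the object itself
port.onMessage.addListener(function (msg) {
  console.log('Received:', msg);
});
// Fires when the native app exits or could not be launched
port.onDisconnect.addListener(function () {
  console.log('Disconnected', chrome.runtime.lastError);
});
port.postMessage({text: 'Hello, my_application'});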

It might seem overwhelming at first, but with practice, it’ll click. If you’re still feeling stuck, don’t worry—I’ll cover this in more detail in a dedicated article.

Breakthrough! That was it. I quickly put together a web extension and a native WPF app. I had to dust off my C# skills—MS never misses in language design! C# is a super elegant language, just look at TypeScript!

In a few hours, I had the basics down, including the communication. But of course, it wouldn't be software development without problems.

The port communicates via standard input and output. Remember when you first learned programming and were excited to get user input via the CLI? Yes, that standard input and output.
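
On the wire, each native messaging message is UTF-8 JSON preceded by a 32-bit length in the machine's native byte order. The sketch below shows that framing in Node.js purely for illustration (the native app in this article is C#); `encodeMessage` and `decodeMessage` are hypothetical helper names.

```javascript
const os = require('node:os');

const littleEndian = os.endianness() === 'LE';

// Build one frame: 4-byte length header (native byte order) + UTF-8 JSON body.
function encodeMessage(message) {
  const body = Buffer.from(JSON.stringify(message), 'utf8');
  const header = Buffer.alloc(4);
  if (littleEndian) header.writeUInt32LE(body.length, 0);
  else header.writeUInt32BE(body.length, 0);
  return Buffer.concat([header, body]);
}

// Parse the first complete frame in `buffer`.
// Returns null when more bytes are needed, else { message, rest }.
function decodeMessage(buffer) {
  if (buffer.length < 4) return null;
  const length = littleEndian ? buffer.readUInt32LE(0) : buffer.readUInt32BE(0);
  if (buffer.length < 4 + length) return null;
  const message = JSON.parse(buffer.subarray(4, 4 + length).toString('utf8'));
  return { message, rest: buffer.subarray(4 + length) };
}
```

Because every message has to be JSON, binary video data must first be turned into text (for example base64) before it can travel this way. That overhead is one plausible reason, though not a confirmed one, why large chunks do not flow well through this channel.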

For small chunks of data, it works perfectly. However, video data is large, and things got so bad that the native app would still be reading a chunk from a minute ago off standard input while the extension had already produced over 50 more video chunks.

This is really bad! I spent hours trying to debug, spawning threads in the native app (again, tunnel vision!).

And the answer was simple: network programming!

Network Programming

The basic unit in a network application is a socket, not to be confused with a WebSocket, which is a higher-level protocol that runs on top of a TCP socket.

Here's a basic Node TCP client socket:

const net = require("node:net");

// Connect to whatever is listening on localhost:3306 (MySQL's default port)
const stream = net.connect(3306, "localhost"); // stream is a socket

stream.on("data", (data) => {
  console.log(data); // raw binary data:
  // <Buffer 4a 00 00 00 0a 38 2e 30 2e 33 38 00 12 00 00 00  ... 28 more bytes>
});

A socket is how devices communicate over the network. It's a low-level OS concept. All a socket needs to connect is a host (an IP address or hostname) and a port. I already have the host, since both the native WPF app and the web extension run on the same machine (localhost).

I’ll be diving deep into Sockets, Multiplexers, and Demultiplexers in an upcoming series, The Backend Engineer’s Handbook: From Sockets to Express, Django, Ruby on Rails, and Java Spring in 4 Weeks. If you're eager to master these foundational backend concepts, make sure to follow my kofi page for updates!

All I needed was a port from either party.

If you have backend experience, you know there's no practical limit to how much data you can stream over a socket connection!

Everything is history now. My data-passing issue was solved. The only thing I used standard input and output for was to pass the port that the native app would be listening on. Since extension code can't open raw TCP sockets, the native app hosts a WebSocket server on that port:

_port = GetAvailablePort();
_wss = new WebSocketServer($"ws://127.0.0.1:{_port}");
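
The body of GetAvailablePort isn't shown here. One common way to find a free port, sketched below in Node.js rather than C#, is to bind to port 0 and let the operating system pick one. This is an assumption about the approach, not the article's actual implementation, and `getAvailablePort` is a hypothetical name.

```javascript
const net = require('node:net');

// Sketch: ask the OS for a free TCP port by binding to port 0, then release it.
function getAvailablePort(host = '127.0.0.1') {
  return new Promise((resolve, reject) => {
    const probe = net.createServer();
    probe.once('error', reject);
    probe.listen(0, host, () => {
      const { port } = probe.address(); // the port the OS assigned
      // Another process could grab the port between close() and reuse,
      // so callers should be ready to retry if binding later fails.
      probe.close(() => resolve(port));
    });
  });
}
```

Binding to 127.0.0.1 rather than all interfaces keeps the server reachable only from the local machine, which is all the extension needs.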

Once the native app has a running WebSocket server, I notify the extension:

private void NotifyExtension(int port)
{
    var ackMessage = new
    {
        type = "port",
        port = port,
    };

    string json = JsonConvert.SerializeObject(ackMessage);
    byte[] jsonBytes = Encoding.UTF8.GetBytes(json);
    byte[] lengthPrefix = BitConverter.GetBytes(jsonBytes.Length);

    using (var stdout = Console.OpenStandardOutput())
    {
        stdout.Write(lengthPrefix, 0, lengthPrefix.Length);
        stdout.Write(jsonBytes, 0, jsonBytes.Length);
        stdout.Flush();
    }
}

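The extension receives the port over the same native messaging connection, which carries whatever the native app writes to standard output. The beautiful thing is that this connection is kept alive and is bidirectional, so one listener can handle acknowledgments, errors, and the port message: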

port.onMessage.addListener(function (msg) {
  console.log('Received', msg); // msg is an object, so don't concatenate it into a string

  if (msg.type === 'ack' && msg.chunkId) {
    // Handle acknowledgment
  } else if (msg.type === 'error') {
    // Handle error
  } else if (msg.type === 'port') {
    sendMessagetoPopup({"status": "port", ...msg});
  } else {
    console.log("MSG?", msg);
  }
});
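
Once the port is known, the extension side can open a WebSocket to the native app and push chunks through it. The sketch below assumes the standard browser `WebSocket` API; the name `createChunkSender`, the queueing behavior, and the injectable `WebSocketCtor` parameter are illustrative assumptions, not the extension's real code.

```javascript
// Sketch: connect to the native app's local WebSocket server and send chunks.
// `WebSocketCtor` defaults to the browser's WebSocket; injectable for testing only.
function createChunkSender(port, WebSocketCtor = globalThis.WebSocket) {
  const socket = new WebSocketCtor(`ws://127.0.0.1:${port}`);
  const pending = []; // chunks produced before the connection opened

  // Flush anything that was queued while the socket was still connecting.
  socket.onopen = () => {
    while (pending.length > 0) {
      socket.send(pending.shift());
    }
  };

  return {
    socket,
    send(chunk) {
      if (socket.readyState === 1) { // 1 === WebSocket.OPEN
        socket.send(chunk); // accepts a Blob, ArrayBuffer, or string
      } else {
        pending.push(chunk);
      }
    },
  };
}
```

A possible usage is `const sender = createChunkSender(msg.port);` followed by `sender.send(event.data)` inside the recorder's `ondataavailable` handler, so each chunk leaves the browser as soon as it is produced.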

Now the video chunks are sent over this WebSocket connection without missing a beat. Here’s a video showing the first iteration:

This opened a new world of possibilities and ideas. My extension is ready for a facelift and production.

And I achieved two of my goals: publishing a Windows application and a web extension. With costs and computer resources minimized, there’s no excuse not to build any extension ever again.

The possibilities are limitless.

In Conclusion

Building this extension has been quite the adventure, and I hope you found it as rewarding as I did. There's something special about seeing your code come to life, especially when it's tackling real-world problems. I know I mentioned this would be a series, and it looks like we’ve covered quite a lot already! If there’s any update, I’ll drop another article—so be on the lookout. You can also follow me here to stay updated.

But this is just the beginning—there’s a whole world of advanced features and optimizations waiting for you. Let’s keep learning and growing together; I can’t wait to see what you’ll create next!

Socials:

Kofi : future exclusive content

Twitter