Introduction
When building web apps, we sometimes find ourselves needing to persist large JSON objects, whether directly in the browser using Web Storage or externally through the File or Fetch APIs.
This problem led me to ask: is there a better way to store raw or encoded JSON? Can we achieve some level of compression?
As it turns out, the answer is Yes and it's surprisingly simple with the Compression Streams API.
This example was created for the specific needs of Web Storage, File and Solid Pod persistence of a JSON representation of application state that grows over time.
Background
To set the scene a little, the project I am working on doesn't communicate with a server. Application data is structured transactionally in a Merkle tree (think git or a blockchain). The tree naturally grows with each interaction with the application, and its memory and storage footprint grows along with it.
To keep an eye on this growth, I built a small indicator in the control bar to visualise the size. As the JSON object grew, the data reached a size I wasn't completely comfortable with, so I decided to take a look and see if I could do better.
This brought me to the Compression Streams API: a browser API that enables gzip compression on a ReadableStream, directly in the browser.
TL;DR: After enabling gzip compression on my JSON data, I saw a size reduction of 98%, from 943kb to 10kb.
This wasn't a scientific test and the data didn't scale the way it would organically in the app, but it serves as a fair indication that I will get the results I want from this API.
Solution
As the Compression Streams API is part of the Streams API, we first need to transform our JSON data object to a ReadableStream.
A Blob can be constructed from text, and JSON can be serialized to text, so to start we can create a Blob from our JSON and get our stream from there.
// Convert JSON to Stream
const stream = new Blob([JSON.stringify(data)], {
type: 'application/json',
}).stream();
Now that we have our ReadableStream, we can use the ReadableStream.pipeThrough method to pipe our data through the gzip CompressionStream transform.
// gzip stream
const compressedReadableStream = stream.pipeThrough(
new CompressionStream("gzip")
);
Then to get this compressed stream back to our main code, we can create a new Response and handle the data like any other we would receive from the Fetch API.
// create Response
const compressedResponse = new Response(compressedReadableStream);
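For convenience, the steps so far can be bundled into a single helper. This is just a sketch (compressJSON is my own name for it, not a built-in API) that resolves with the compressed bytes as an ArrayBuffer.
// Compress any JSON-serialisable value to gzip bytes (a sketch of the steps above)
export async function compressJSON(data: unknown): Promise<ArrayBuffer> {
  const stream = new Blob([JSON.stringify(data)], {
    type: "application/json",
  }).stream();
  const compressed = stream.pipeThrough(new CompressionStream("gzip"));
  // Read the compressed stream back out via a Response
  return new Response(compressed).arrayBuffer();
}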
So what can we do with this data now it's compressed?
Well, quite a lot actually.
We could base64 encode it to a string and then store it in localStorage.
// Get response Blob
const blob = await compressedResponse.blob();
// Get the ArrayBuffer
const buffer = await blob.arrayBuffer();
// convert ArrayBuffer to base64 encoded string
const compressedBase64 = btoa(
String.fromCharCode(
...new Uint8Array(buffer)
)
);
// Set in localStorage
localStorage.setItem('compressedData', compressedBase64);
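One caveat with the snippet above: spreading a very large Uint8Array into String.fromCharCode can exceed the engine's argument limit and throw a RangeError. A chunked variant, sketched here as a hypothetical b64encode helper, avoids that.
// Base64-encode an ArrayBuffer in chunks to stay under argument-length limits
export function b64encode(buffer: ArrayBuffer): string {
  const bytes = new Uint8Array(buffer);
  const chunkSize = 0x8000; // 32k bytes per String.fromCharCode call; arbitrary
  let binary = "";
  for (let i = 0; i < bytes.length; i += chunkSize) {
    binary += String.fromCharCode(...bytes.subarray(i, i + chunkSize));
  }
  return btoa(binary);
}
Keep in mind that localStorage itself has a quota (typically around 5MB per origin), so very large payloads may still need a different storage target.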
We could save it as a File.
// Get response Blob
const blob = await compressedResponse.blob();
// Create a programmatic download link
const elem = window.document.createElement("a");
elem.href = window.URL.createObjectURL(blob);
elem.download = '';
document.body.appendChild(elem);
elem.click();
document.body.removeChild(elem);
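If you want the download to land with a sensible name and to clean up after itself, the same snippet can set a filename and revoke the temporary object URL (the filename here is only an example).
// Same download flow, with an example filename and object URL cleanup
const url = window.URL.createObjectURL(blob);
const elem = window.document.createElement("a");
elem.href = url;
elem.download = "state.json.gz"; // example name; use whatever fits your app
document.body.appendChild(elem);
elem.click();
document.body.removeChild(elem);
window.URL.revokeObjectURL(url);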
Or we could take a more traditional approach, sending it over HTTPS with the Fetch API.
// Get response Blob
const blob = await compressedResponse.blob();
await fetch(url, {
method: 'POST',
body: blob,
headers: {
'Content-Encoding': 'gzip',
}
});
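As with any network call, the upload can fail, so it's worth guarding it. Here's a minimal sketch; the url is a placeholder and the error handling is whatever suits your app.
// Upload the compressed blob, surfacing network and HTTP errors
try {
  const res = await fetch(url, {
    method: "POST",
    body: blob,
    headers: {
      "Content-Encoding": "gzip",
    },
  });
  if (!res.ok) {
    throw new Error(`Upload failed with status ${res.status}`);
  }
} catch (err) {
  console.error("Could not persist compressed data", err);
}
The server also needs to be prepared to decode a gzip-encoded request body, of course.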
And I'm sure there are many other techniques that can be used with our newly compressed JSON object.
Pretty cool, right? But how do we get our data back into a readable form in the browser?
To decompress, we first want to get our compressed data back into a ReadableStream.
Either directly from our Blob (or file/fetch Response):
const stream = (await compressedResponse.blob()).stream();
Or decode back from base64.
// base64 encoding to Blob
const stream = new Blob([b64decode(compressedBase64)], {
type: "application/json",
}).stream();
Edit: The b64decode function I'm using looks like this.
// Decode a base64 string back into raw bytes
export function b64decode(str: string): Uint8Array {
const binary_string = window.atob(str);
const len = binary_string.length;
const bytes = new Uint8Array(new ArrayBuffer(len));
for (let i = 0; i < len; i++) {
bytes[i] = binary_string.charCodeAt(i);
}
return bytes;
}
Then, once again with our compressed ReadableStream, we go back through the ReadableStream.pipeThrough method, this time piping our data through the gzip DecompressionStream.
const decompressedReadableStream = stream.pipeThrough(
new DecompressionStream("gzip")
);
We then go back into a Response, take the Blob, and call the Blob.text() method to get the content back as a string.
const resp = new Response(decompressedReadableStream);
const blob = await resp.blob();
All that's left to do now is parse this string back to JSON, and we have our decompressed object back in JavaScript and ready to use.
const data = JSON.parse(await blob.text());
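Mirroring the compression helper from earlier, the whole decompression path fits into one function too. Again, this is a sketch (decompressJSON is my own naming), accepting a Blob, ArrayBuffer or any other BlobPart.
// Decompress gzip bytes (or a Blob) back into a JSON value (a sketch)
export async function decompressJSON<T = unknown>(compressed: BlobPart): Promise<T> {
  const stream = new Blob([compressed]).stream();
  const decompressed = stream.pipeThrough(new DecompressionStream("gzip"));
  const text = await new Response(decompressed).text();
  return JSON.parse(text) as T;
}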
Demo and Results
I've put together a small CodeSandbox that takes a few online JSON sources and runs them through the gzip CompressionStream transform described in the solution above.
Oh, and yes. ❤️ VueJS.
Conclusion
Firstly, I had a blast writing this piece and exploring the Compression Streams API. It's simple to use and feels like a really nice addition to the JavaScript ecosystem.
In terms of how useful the API itself is, it has solved a real problem for me in being able to reduce my persistent data footprint.
I have already built this into a small feature in my app, but I plan to integrate it deeper and bake it into the core functionality soon. That should be another interesting problem to solve, as the app uses client-side encryption with Age (via rage / rage-wasm). But that's for another day...
It's a little disappointing that this API isn't fully supported yet, with only 72.53% support coverage according to CanIUse, but I still think it's very worthwhile. We can implement this to fail gracefully where the Compression Streams API is not supported and simply forgo the compression benefit, so that's not a deal breaker for me.
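That graceful fallback only needs a feature check; something along these lines, reusing the compressJSON and b64encode sketches from above, with the fallback branch left as whatever your app already does.
// Compress when the API is available, otherwise store the raw JSON
if ("CompressionStream" in globalThis) {
  const buffer = await compressJSON(data);
  localStorage.setItem("compressedData", b64encode(buffer));
} else {
  localStorage.setItem("rawData", JSON.stringify(data));
}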
I'm not sure how often the Compression Streams API will be a tool I reach for, but I'm glad to have learnt it and I'm sure this won't be the last time I use it, or at the very least consider it.
Top comments (8)
Very nice post, the compression works as explained here in the Chrome browser. Decompression does not, because there is no b64decode function. This should also work somehow with the btoa function; could you please add code with btoa? Thanks!
Thanks for pointing that out @pl4n3. I've added the b64decode function I'm using to the post now.
Thanks a lot!
Hello @samternent, thank you for this very interesting post.
I've tried to implement this in my Angular app. The compression process went well and I stored the data in local storage. During the decompression process I got an error, "Error: Uncaught (in promise): TypeError: Failed to fetch", on these lines:
const resp = await new Response(compressedReadableStream);
const blob = await resp.blob();
To be specific, it happens at the second line.
Do you have any idea? Have you ever met this error? Any help would be appreciated! Thanks.
Hey @sam89,
I'm unsure without seeing a reproducible demo of the issue.
Failed to fetch suggests a fetch call failed; you could try wrapping some of your code in a try/catch to identify the issue.
I noticed the same error on the CodeSandbox in this post. I debugged it and found that my XHR to a Reddit endpoint was failing, so removing that from my demo fixed the issue.
I used the compression for a string and the base64 output ended up longer than the original string. How is that supposed to happen with compression?
I've included an API example at the bottom of the code sandbox that will request a random string (from https://baconipsum.com/api/?type=all-meat&sentences=1&start-with-lorem=1)
All of the JSON endpoints get consistent compression, due to the repetitive nature of a JSON structure. This one, however, will sometimes achieve compression, but will sometimes see an increase in size.
AFAIK, the gzip compression algorithm is designed to look for repeated strings and replace them with a reference in order to achieve compression (great for large JSON objects and repeated data structures).
I'm not sure what string you tested this with, but it possibly wasn't in need of compression, which would likely give you the increased size.
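A quick way to see the effect is to compare blob sizes for a short string against a repetitive structure, e.g. with the compressJSON sketch from the post (a rough illustration only; exact numbers will vary).
// Short strings pick up gzip header overhead; repetitive JSON compresses well
const short = "hello world";
const repetitive = Array.from({ length: 500 }, (_, i) => ({
  id: i,
  name: "item",
  active: true,
}));
for (const value of [short, repetitive]) {
  const original = new Blob([JSON.stringify(value)]).size;
  const compressed = (await compressJSON(value)).byteLength;
  console.log({ original, compressed });
}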