<?xml version="1.0" encoding="UTF-8"?>
<rss version="2.0" xmlns:atom="http://www.w3.org/2005/Atom" xmlns:dc="http://purl.org/dc/elements/1.1/">
  <channel>
    <title>DEV Community: Jakub Kopańko</title>
    <description>The latest articles on DEV Community by Jakub Kopańko (@pcktm).</description>
    <link>https://dev.to/pcktm</link>
    <image>
      <url>https://media2.dev.to/dynamic/image/width=90,height=90,fit=cover,gravity=auto,format=auto/https:%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Fuser%2Fprofile_image%2F624330%2Fee75ff5f-4b55-4561-a367-05676eb05cc8.jpeg</url>
      <title>DEV Community: Jakub Kopańko</title>
      <link>https://dev.to/pcktm</link>
    </image>
    <atom:link rel="self" type="application/rss+xml" href="https://dev.to/feed/pcktm"/>
    <language>en</language>
    <item>
      <title>Always Winning at Juwenalia: Hacking Rewards from the Festival App's Mini-Games</title>
      <dc:creator>Jakub Kopańko</dc:creator>
      <pubDate>Mon, 26 May 2025 10:45:32 +0000</pubDate>
      <link>https://dev.to/pcktm/always-winning-at-juwenalia-hacking-rewards-from-the-festival-apps-mini-games-ngm</link>
      <guid>https://dev.to/pcktm/always-winning-at-juwenalia-hacking-rewards-from-the-festival-apps-mini-games-ngm</guid>
      <description>&lt;p&gt;Ah, Juwenalia. If you're a student in Poland, particularly here in Kraków, you know exactly what that means. It's that time of the year when the city transforms into a student playground. Uni classes are suspended and replaced by days filled with concerts, events, and a significant amount of beer.&lt;/p&gt;

&lt;h4&gt;
  
  
  The backstory
&lt;/h4&gt;

&lt;p&gt;Last year, alongside the usual festivities, the organizers introduced "JuweAppka". Developed by a third-party contractor, the app promised drops of Juwenalia merchandise — tote bags, t-shirts, in-house Juwe beer, and other cool prizes. The mechanic was simple — each device could claim a single "lottery ticket" per drop. At a specific time of the day, you'd tap a button in the app, and if luck was on your side, you'd win a prize tied to your device ID.&lt;/p&gt;

&lt;p&gt;This presented an interesting opportunity. Since prizes were linked only to a device ID, and reinstalling the app would generate a new ID, I realized I could automate this process. A quick peek with &lt;code&gt;mitmproxy&lt;/code&gt; revealed the simplicity of the backend. The endpoints were completely unprotected - they consumed the device ID and returned either the details of the prize if you won, or an empty array if you didn't. There was no real authentication or validation.&lt;/p&gt;

&lt;p&gt;I quickly whipped up a small Node.js script that would generate random device IDs, send a request to the lottery endpoint at the designated time, and see if it won. By running this script many times with different IDs, I could accumulate numerous real prizes across these virtual devices.&lt;/p&gt;

&lt;p&gt;But how to collect all these prizes on my actual phone? This is where &lt;code&gt;mitmproxy&lt;/code&gt; came back into play, this time for its ability to modify traffic in-flight. I could intercept the request the real app made to fetch the list of won prizes for my &lt;em&gt;real&lt;/em&gt; device ID. Then, I could modify the response from the server, injecting all the prize data I had collected from my multitude of virtual devices. The app, none the wiser, would display this combined list of prizes, allowing me to claim them all at once. It felt less like cheating and more like... efficient prize collection, leveraging the system's design. After all, anyone &lt;em&gt;could&lt;/em&gt; have reinstalled the app repeatedly; I just automated the process.&lt;/p&gt;

&lt;p&gt;Fast forward to this year. The JuweAppka is back, but this time it's an in-house production, built with React Native and backed by Firebase. The simple lottery is gone, replaced by more engaging mini-games — think cookie clicker mechanics or falling fruit challenges, complete with leaderboards. The top players at the end of the day win the coveted prizes.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2a58aej0e8roeqi4gcr.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fb2a58aej0e8roeqi4gcr.webp" width="800" height="666"&gt;&lt;/a&gt;&lt;br&gt;A screenshot showing one of the mini-games.
  &lt;/p&gt;

&lt;p&gt;Crucially, they've implemented an account system. This is a significant change. While it prevents the simple device ID spoofing of last year, it also simplifies the prize collection. I can now potentially manipulate my score within an emulator and then simply log into my account on my physical phone to collect any prizes won, without needing to spoof the final prize list request.&lt;/p&gt;

&lt;p&gt;My initial thought process was straightforward: the app must send my score to the server via a POST request to update the leaderboard. If I could intercept this request using &lt;code&gt;mitmproxy&lt;/code&gt;, I could simply edit the score value before it reached the server, artificially boosting my rank on the leaderboard. Easy, right?&lt;/p&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;While I'll walk you through my process, this post is for educational purposes and isn't a step-by-step guide for replication.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  The setup
&lt;/h4&gt;

&lt;p&gt;Ok, so let's get into this. Here's the list of ingredients you'll need to follow along:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;
&lt;strong&gt;Android Emulator&lt;/strong&gt;: You'll need a &lt;strong&gt;rooted&lt;/strong&gt; Android emulator, preferably running a release image with Google Play services installed. I would target an older, but still supported, Android version. Installing Magisk on the emulator is highly recommended; you can use &lt;a href="https://github.com/shakalaca/MagiskOnEmulator" rel="noopener noreferrer"&gt;MagiskOnEmulator&lt;/a&gt; to do this.&lt;/li&gt;
&lt;li&gt;
&lt;code&gt;mitmproxy&lt;/code&gt; and &lt;code&gt;frida&lt;/code&gt; tools: Ensure you have both installed on your host machine. For mitmproxy, you'll also need to install its CA cert into the Android system certificate store on your emulator; Magisk makes this significantly easier with a module. You'll also need to configure your emulator to route all traffic through the proxy running on your machine. While there are multiple ways to approach MITM attacks, I generally add the CA to the system store and use Frida to bypass SSL certificate pinning anyway, as it's often surprisingly effortless. Magisk is also invaluable here; use the &lt;a href="https://github.com/ViRb3/magisk-frida" rel="noopener noreferrer"&gt;magisk-frida&lt;/a&gt; module.&lt;/li&gt;
&lt;/ol&gt;

&lt;blockquote&gt;
&lt;p&gt;&lt;em&gt;Note that the following can also be done on a non-rooted, stock Android device using &lt;a href="https://frida.re/docs/gadget/" rel="noopener noreferrer"&gt;Frida Gadget&lt;/a&gt; patched directly into the APK.&lt;/em&gt;&lt;/p&gt;
&lt;/blockquote&gt;

&lt;h4&gt;
  
  
  The execution
&lt;/h4&gt;

&lt;p&gt;The goal was simple: play a game, capture the network traffic when the score was submitted, and identify the relevant request.&lt;/p&gt;

&lt;p&gt;I fired up one of the mini-games, played for a bit to get a non-zero score, and finished the round. As expected, &lt;code&gt;mitmproxy&lt;/code&gt; immediately showed network activity. Sifting through the requests, I quickly spotted a &lt;code&gt;POST&lt;/code&gt; request sent to a Firebase endpoint, likely containing my score update.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight http"&gt;&lt;code&gt;&lt;span class="nf"&gt;POST&lt;/span&gt; &lt;span class="nn"&gt;https://REDACTED_URL/result&lt;/span&gt; &lt;span class="k"&gt;HTTP&lt;/span&gt;&lt;span class="o"&gt;/&lt;/span&gt;&lt;span class="m"&gt;2.0&lt;/span&gt;
&lt;span class="na"&gt;authorization&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;Bearer REDACTED_JWT&lt;/span&gt;
&lt;span class="na"&gt;authtoken&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;f9c32a74560ef09523a21f7798c836dcc17448a7f0bc7b03d121015779922bf4&lt;/span&gt;
&lt;span class="na"&gt;game&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;lap_co_leci&lt;/span&gt;
&lt;span class="na"&gt;content-type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;application/json&lt;/span&gt;
&lt;span class="na"&gt;content-length&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;12&lt;/span&gt;
&lt;span class="na"&gt;accept-encoding&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;gzip&lt;/span&gt;
&lt;span class="na"&gt;user-agent&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="s"&gt;okhttp/4.9.2&lt;/span&gt;

&lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="nl"&gt;"score"&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt;&lt;span class="mi"&gt;91&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;&lt;span class="w"&gt;
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;My initial thought was confirmed: there's a &lt;code&gt;score&lt;/code&gt; field in the JSON payload. Modifying that with the proxy is trivially easy. But then I noticed it — the &lt;code&gt;authtoken&lt;/code&gt; header. There's already an &lt;code&gt;authorization&lt;/code&gt; header carrying a JWT, and this looks... different. A long string of hex characters.&lt;/p&gt;

&lt;p&gt;After playing the game a couple more times with different scores, I observed a pattern: the &lt;code&gt;authtoken&lt;/code&gt; value changed whenever the score changed, but two games with the &lt;em&gt;exact same score&lt;/em&gt; resulted in the &lt;em&gt;exact same authtoken&lt;/em&gt;. This strongly suggested the &lt;code&gt;authtoken&lt;/code&gt; was not a random session token, but rather a value derived directly from the score itself, likely some form of hash. Given its length and appearance, a SHA-256 hash seemed like a prime candidate.&lt;/p&gt;

&lt;p&gt;This is a common, albeit often insufficient, security pattern. The server likely recalculates the hash on its end using the received score and a secret key (a salt or a prefix) and compares it to the &lt;code&gt;authtoken&lt;/code&gt; provided by the client. If they match, the score is considered valid.&lt;/p&gt;

&lt;p&gt;My simple plan of just changing the score in the request body hit a roadblock. If I just changed the score, the &lt;code&gt;authtoken&lt;/code&gt; wouldn't match the server's calculation, and the request would be rejected. I needed to generate a &lt;em&gt;valid&lt;/em&gt; &lt;code&gt;authtoken&lt;/code&gt; for my desired, artificially high score.&lt;/p&gt;

&lt;p&gt;This is precisely where Frida shines — it allows us to peer inside the running application process. Since the app is performing this hashing operation locally before sending the request, I could use Frida to hook into the native crypto functions being called by the React Native application. This approach is quite elegant in that &lt;strong&gt;we don't have to modify or even read messy compiled Hermes bytecode — we modify the platform instead!&lt;/strong&gt;&lt;/p&gt;

&lt;p&gt;By intercepting the call to the hashing function (like a SHA-256 implementation), I could log the input data being fed into it — this input would likely be the score combined with the secret salt/prefix. Once I had that input format, I could replicate the hashing process myself and generate a valid &lt;code&gt;authtoken&lt;/code&gt; for any score I wanted.&lt;/p&gt;

&lt;p&gt;Writing a Frida script to hook native crypto functions is surprisingly straightforward once you know what to look for. Frida injects a full JavaScript engine into the target process, giving you incredible power to inspect, modify, and trace function calls at runtime.&lt;/p&gt;

&lt;p&gt;Here's the script I wrote that got injected into the app:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight javascript"&gt;&lt;code&gt;&lt;span class="kd"&gt;function&lt;/span&gt; &lt;span class="nf"&gt;bytesToString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;str&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="dl"&gt;''&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="o"&gt;++&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="nx"&gt;str&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nb"&gt;String&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;fromCharCode&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;buffer&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;i&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="nx"&gt;str&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;

&lt;span class="nx"&gt;Java&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;perform&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="c1"&gt;// Find the class that we want to hook&lt;/span&gt;
  &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;MessageDigest&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;Java&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;use&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;java.security.MessageDigest&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

  &lt;span class="c1"&gt;// Replace native implementation with our JS hook&lt;/span&gt;
  &lt;span class="nx"&gt;MessageDigest&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;update&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;overload&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;[B&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nx"&gt;implementation&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;function &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inputBytes&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;algorithm&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;getAlgorithm&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
    &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;algorithm&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;toUpperCase&lt;/span&gt;&lt;span class="p"&gt;()&lt;/span&gt; &lt;span class="o"&gt;===&lt;/span&gt; &lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;SHA-256&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;[*] MessageDigest.update called with SHA-256 algorithm.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;

      &lt;span class="kd"&gt;var&lt;/span&gt; &lt;span class="nx"&gt;stringInput&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nf"&gt;bytesToString&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inputBytes&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
      &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;[*] SHA-256 input (string): &lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;stringInput&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="p"&gt;}&lt;/span&gt;

    &lt;span class="c1"&gt;// Call the original method&lt;/span&gt;
    &lt;span class="k"&gt;this&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;update&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;inputBytes&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
  &lt;span class="p"&gt;};&lt;/span&gt;

  &lt;span class="nx"&gt;console&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;log&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;[*] MessageDigest hooks installed for SHA-256.&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="p"&gt;});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Immediately after injecting the script and playing the game, the console output from Frida revealed the secret:&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight console"&gt;&lt;code&gt;&lt;span class="gp"&gt;$&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;frida &lt;span class="nt"&gt;-U&lt;/span&gt; &lt;span class="nt"&gt;-f&lt;/span&gt; com.kuvus.juwenalia_krakowskie &lt;span class="nt"&gt;-l&lt;/span&gt; unpin.js &lt;span class="nt"&gt;-l&lt;/span&gt; sha_hook.js
&lt;span class="go"&gt;Spawning `com.kuvus.juwenalia_krakowskie`...
Spawned `com.kuvus.juwenalia_krakowskie`. Resuming main thread!
&lt;/span&gt;&lt;span class="gp"&gt;[Android Emulator 5554::com.kuvus.juwenalia_krakowskie ]-&amp;gt;&lt;/span&gt;&lt;span class="w"&gt; &lt;/span&gt;&lt;span class="o"&gt;[&lt;/span&gt;&lt;span class="k"&gt;*&lt;/span&gt;&lt;span class="o"&gt;]&lt;/span&gt; MessageDigest hooks installed &lt;span class="k"&gt;for &lt;/span&gt;SHA-256.
&lt;span class="go"&gt;(...)

[*] MessageDigest.update called with SHA-256 algorithm. 
[*] SHA-256 input (string): jsjkek-91
&lt;/span&gt;&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;There it was! The input string to the SHA-256 hash was the literal string "jsjkek-" concatenated with the score.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="nf"&gt;sha256&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;jsjkek-91&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;-&amp;gt;&lt;/span&gt;
&lt;span class="dl"&gt;"&lt;/span&gt;&lt;span class="s2"&gt;f9c32a74560ef09523a21f7798c836dcc17448a7f0bc7b03d121015779922bf4&lt;/span&gt;&lt;span class="dl"&gt;"&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This was the key. With &lt;code&gt;mitmproxy&lt;/code&gt; to intercept and modify the request body (setting the desired high score) and the ability to generate the correct &lt;code&gt;authtoken&lt;/code&gt; using the discovered prefix, I could now, in theory, submit any score I wanted to the leaderboard. This effectively bypasses the score validation mechanism and allows for arbitrary score submission.&lt;/p&gt;

&lt;p&gt;As for whether or not I actually used this method to climb the leaderboard and claim prizes... well, I've been advised not to comment on that.&lt;/p&gt;

</description>
      <category>mobile</category>
      <category>programming</category>
      <category>security</category>
      <category>javascript</category>
    </item>
    <item>
      <title>The $75 Photoshop Plugin That Cost Me... Nothing (I Built It Instead)</title>
      <dc:creator>Jakub Kopańko</dc:creator>
      <pubDate>Wed, 07 May 2025 12:59:33 +0000</pubDate>
      <link>https://dev.to/pcktm/the-75-photoshop-plugin-that-cost-me-nothing-i-built-it-instead-3ieb</link>
      <guid>https://dev.to/pcktm/the-75-photoshop-plugin-that-cost-me-nothing-i-built-it-instead-3ieb</guid>
      <description>&lt;p&gt;I was scrolling through Instagram Reels recently when an ad, presented as a "tutorial," caught my eye. It was for a Photoshop plugin called "DITHERTONE Pro," showcasing a rather striking visual effect – lines that seemed to follow an image's contours, giving it a distinct, almost cyberpunk aesthetic. It genuinely looked impressive. Then, the price appeared: $75.&lt;/p&gt;

&lt;p&gt;That figure made me pause. Not because I'm against paying for well-crafted software, but because the effect itself felt incredibly familiar. It was a strong sense of déjà vu, like I'd seen this exact visual style generated somewhere else, in a context far removed from polished, commercial plugins.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Ghost of Glitch
&lt;/h3&gt;

&lt;p&gt;My mind immediately jumped back a few years. I used to be quite involved in the datamoshing and glitch art scene – not so much these days, but I still have a soft spot for creative data bending. And in that world, a collection of Processing scripts by Tomasz Sulej, called "GenerateMe," was pretty well-known. He'd developed a range of tools, from pixel sorting to very accurate VHS tape emulations.&lt;/p&gt;

&lt;p&gt;And wouldn't you know it, one of those scripts, "FM Modulation," produced an effect that was practically a dead ringer for what "DITHERTONE Pro" was selling. It created those same intricate, flowing lines that followed the image's inherent structure. The core aesthetic was undeniably there, born from a free, open-source script.&lt;/p&gt;

&lt;p&gt;Now, those GenerateMe scripts were (and are) clever pieces of work. But using them today isn't exactly a seamless experience. They are, after all, products of a slightly different era in creative coding.&lt;/p&gt;

&lt;p&gt;For instance, you'll likely need to download a specific, older version of Processing to ensure they run correctly, as they often haven't been updated for the latest Processing releases. The user experience is also very minimal – typically just an image preview. Controlling the FM Modulation script involved moving the mouse cursor to adjust parameters, which, while interesting, isn't the most precise method. And if you wanted to process your own image? That meant editing the source code to change the input file path. So, while not ancient, they require a fair bit of fiddling and aren't what you'd call user-friendly by today's standards – definitely a bit of a pain.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs7oe3o8xgwk4i8phl0km.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fs7oe3o8xgwk4i8phl0km.png" alt="Example result" width="800" height="500"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Why Not Rebuild It for the Web? And Make it Free.
&lt;/h3&gt;

&lt;p&gt;So, there's this cool effect, locked away in an old script that’s a hassle to use, or available via a $75 plugin. The plugin undoubtedly offers a more polished experience and likely more features, but for that core, captivating FM modulation effect? The price felt steep, especially knowing its open-source roots.&lt;/p&gt;

&lt;p&gt;"I could probably build that," I thought. The idea started to form: take the essence of that FM Modulation script and rewrite it for the web. Make it accessible, easy to use, and, importantly, free for anyone to try. I'd done something similar before with an old AVI glitch tool that messed with frame indexes (I wrote about that here), and the challenge was appealing. If the core logic was sound, why not modernize its delivery?&lt;/p&gt;

&lt;h3&gt;
  
  
  Choosing the Right Weapons
&lt;/h3&gt;

&lt;p&gt;Alright, so the goal was set: bring FM modulation to the web. The next question was how. My gut feeling, based on what I knew about the FM modulation algorithm – lots of loops, complex math, angles, and bits of signal processing theory – was that a straightforward JavaScript implementation might not cut it. Especially for larger images, I had visions of the browser grinding to a halt, fans spinning up, and a generally laggy, unpleasant experience.&lt;/p&gt;
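&lt;p&gt;To give a flavor of that math, here's a toy one-dimensional sketch — my own simplification for illustration, not Sulej's actual algorithm. It treats a scanline's brightness as the modulating signal of an FM carrier, which is roughly where the flowing, contour-following lines come from.&lt;/p&gt;

```javascript
// Toy 1-D illustration (a simplification, not the GenerateMe algorithm):
// brightness frequency-modulates a carrier, producing wavy line patterns.
function fmModulateScanline(brightness, carrierFreq = 0.1, deviation = 0.5) {
  let phase = 0;
  return brightness.map((value) => {
    // instantaneous frequency rises with local brightness (0..255)
    phase += 2 * Math.PI * (carrierFreq + deviation * (value / 255));
    // map the carrier from [-1, 1] back to an 8-bit pixel value
    return Math.round((Math.sin(phase) + 1) * 127.5);
  });
}
```

&lt;p&gt;Run that per scanline over a full-resolution image, at high sample rates, and you can see why a naive JavaScript implementation starts to hurt.&lt;/p&gt;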

&lt;p&gt;This is where the idea of using Rust compiled to WebAssembly (WASM) came in. Rust offers near-native performance, which is perfect for CPU-intensive tasks like heavy image processing. WASM allows that Rust code to run in the browser at speeds that JavaScript alone often can't match for these kinds of operations. Plus, the plan was to offload this heavy lifting to a Web Worker, keeping the main browser thread free and the UI responsive. It felt like the right architectural choice to deliver both the complex visual effect and a smooth user experience.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Build: Rust to WASM Delivers
&lt;/h3&gt;

&lt;p&gt;With the tech stack decided, it was time to get to work. I set up the Rust workspace: fm_core for the actual image processing, and a separate module to interface with wasm-pack for the WebAssembly compilation. I have to say, the experience of getting Rust code to run on the web via WASM was surprisingly smooth. The Rust community has built an impressive ecosystem here.&lt;/p&gt;

&lt;p&gt;Honestly, I was expecting more of a battle – the usual snags and configuration issues when you're trying to make different technologies play nice. But wasm-pack handled its part elegantly, and the necessary glue code between JavaScript and the compiled WASM module was straightforward. Getting the Rust logic up and running in the browser happened with minimal friction, which was a very welcome development and speaks volumes about the maturity of Rust's WASM tooling.&lt;/p&gt;

&lt;h3&gt;
  
  
  Vibe coding critique
&lt;/h3&gt;

&lt;p&gt;Now, about AI tools. I did use GitHub Copilot in agent mode here and there, mostly for helping translate some of the gnarlier math from Processing to Rust. It can be a great assistant for many tasks, and that agentic workflow has great potential for speeding things up significantly. But let's be brutally honest: AI isn't writing your next masterpiece, and it sure as hell isn't a substitute for actual engineering skill. The current crop of "AI can code better than engineers" tech bros and founders, churning out their AI-powered software rot machines, are in for a rude awakening. You can't just prompt your way to a solid, maintainable, secure application – that takes understanding, discipline, and a grasp of fundamentals that these tools simply don't possess. Relying on AI as a crutch without that oversight is just asking for a dumpster fire of a codebase down the road.&lt;/p&gt;

&lt;h3&gt;
  
  
  The Payoff
&lt;/h3&gt;

&lt;p&gt;After all that wrestling with Rust, WASM, and Web Workers, I'm happy to say the project came together. I'm calling it the "Image Frequency Modulation tool," and you can try it out right now at &lt;a href="https://modulate.kopanko.com/?utm_source=devto" rel="noopener noreferrer"&gt;modulate.kopanko.com&lt;/a&gt;.&lt;/p&gt;

&lt;p&gt;The front end is built with React and shadcn/ui, giving it a clean and modern interface. When you upload an image and tweak the parameters, the actual heavy lifting – the FM modulation processing – is done by the Rust/WASM module running inside a Web Worker. This means the image processing happens asynchronously, off the main browser thread, so the UI stays snappy and responsive even while your image is being transformed. It’s designed to do that one core thing – generate those cool FM modulation effects – and do it well, without any fuss. And, of course, it's completely free to use.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ogezzrawz9w8zsxklv8.webp" class="article-body-image-wrapper"&gt;&lt;img src="https://media2.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2F8ogezzrawz9w8zsxklv8.webp" alt="A screenshot of the tool's interface" width="800" height="539"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Conclusion: So, About That $75 Plugin...
&lt;/h3&gt;

&lt;p&gt;And there you have it. From a slightly overpriced Instagram ad to a free, performant web tool. It was a fun challenge, a good excuse to dive deeper into Rust and WASM, and ultimately, a way to bring a cool visual effect to more people without a price tag.&lt;/p&gt;

&lt;p&gt;While the commercial plugin might offer a broader suite of features for professional workflows, if you're just looking to experiment with this specific FM modulation aesthetic, my little tool gets the job done. Sometimes, the satisfaction of building it yourself (and then sharing it) is worth more than any plugin.&lt;/p&gt;

&lt;p&gt;So go ahead, give it a spin &lt;a href="https://modulate.kopanko.com/?utm_source=devto" rel="noopener noreferrer"&gt;here&lt;/a&gt;. Upload an image, play with the settings, and see what kind of visual weirdness you can create. Maybe you'll save yourself $75 in the process.&lt;/p&gt;

</description>
      <category>webdev</category>
      <category>programming</category>
      <category>rust</category>
      <category>design</category>
    </item>
    <item>
      <title>Finding a Needle in the Image Stack: A Deep Dive into Content-Based Image Retrieval</title>
      <dc:creator>Jakub Kopańko</dc:creator>
      <pubDate>Tue, 28 Mar 2023 22:37:29 +0000</pubDate>
      <link>https://dev.to/pcktm/finding-a-needle-in-the-image-stack-a-deep-dive-into-content-based-image-retrieval-1g0o</link>
      <guid>https://dev.to/pcktm/finding-a-needle-in-the-image-stack-a-deep-dive-into-content-based-image-retrieval-1g0o</guid>
      <description>&lt;p&gt;As an undergraduate with a keen interest in computer vision, I found myself fascinated by the subject of digital image processing, particularly Content-Based Image Retrieval (CBIR). When my professor encouraged me to work on a project of my choice, I eagerly delved into the fascinating field of digital image processing beyond the standard lab exercises. While my peers were focused on their routine lab work, I was immersed in the world of CBIR. In this blog post, I will share the knowledge I gained, from conventional methods to state-of-the-art neural networks. I'll also discuss how these techniques can help you find the right image every time. So join me on an enlightening journey as I reveal what I've learned!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---3RLbEQA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3x2rb5tevx2taxqqrro3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---3RLbEQA--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3x2rb5tevx2taxqqrro3.png" alt="Pictures of dogs on a beach" width="880" height="122"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;It is important to emphasize that the primary goal of all content-based image retrieval techniques is to transform images into numerical representations, typically in the form of n-dimensional vectors. This allows us to compare and manipulate images using mathematical operations. By using various distance metrics and clustering algorithms, we can assess the similarity between images and categorize them into relevant groups.&lt;/p&gt;
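&lt;p&gt;To make that concrete, here's a minimal NumPy sketch (the four-dimensional vectors are made up, standing in for real image descriptors) of ranking candidate images by their distance to a query:&lt;/p&gt;

```python
import numpy as np

def euclidean(a, b):
    # Straight-line distance between two feature vectors.
    return float(np.linalg.norm(a - b))

def cosine_distance(a, b):
    # 1 - cosine similarity; 0 means the vectors point the same way.
    return float(1.0 - np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

# Toy 4-dimensional "image descriptors".
query = np.array([0.9, 0.1, 0.0, 0.3])
candidates = np.array([
    [0.8, 0.2, 0.1, 0.3],   # similar to the query
    [0.0, 0.9, 0.9, 0.1],   # very different
])

# Rank candidates by distance to the query (smaller = more similar).
ranking = sorted(range(len(candidates)),
                 key=lambda i: cosine_distance(query, candidates[i]))
print(ranking)  # [0, 1]: the similar image comes first
```

&lt;p&gt;Everything that follows, from color histograms to CNN embeddings, is just different ways of producing the vectors that feed this comparison.&lt;/p&gt;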

&lt;p&gt;The code for my experiments can be found at &lt;a href="https://github.com/pcktm/image-retrieval-experiments"&gt;https://github.com/pcktm/image-retrieval-experiments&lt;/a&gt;, and I wholeheartedly encourage you to check it out and explore the fascinating world of image retrieval!&lt;/p&gt;

&lt;h3&gt;
  
  
  Classical methods
&lt;/h3&gt;

&lt;p&gt;I decided to start with the classical methods of CBIR, which were invented in the early days of digital image processing in the 1990s and early 2000s. These methods are based on extracting global features from images, such as color, texture, and other statistics. Global features represent the overall characteristics of an image and can be used to describe and compare images. Some of the most common global features used in classical CBIR methods are color histograms, texture features based on Gabor filters, and statistical features such as mean, variance, and entropy.&lt;/p&gt;
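&lt;p&gt;As an illustration, here is a small pure-NumPy sketch of such a global descriptor (random pixels stand in for a real photo): per-channel color histograms concatenated with the mean, variance, and histogram entropy of the grayscale intensities.&lt;/p&gt;

```python
import numpy as np

def global_features(image, bins=8):
    """A simple global descriptor: per-channel colour histograms plus
    mean, variance and entropy of the grayscale intensities."""
    chans = []
    for c in range(image.shape[2]):
        hist, _ = np.histogram(image[..., c], bins=bins, range=(0, 256))
        chans.append(hist / hist.sum())       # normalised colour histogram
    hist_vec = np.concatenate(chans)

    gray = image.mean(axis=2)
    ghist, _ = np.histogram(gray, bins=bins, range=(0, 256))
    p = ghist / ghist.sum()
    p = p[p > 0]
    entropy = -np.sum(p * np.log2(p))         # Shannon entropy in bits
    return np.concatenate([hist_vec, [gray.mean(), gray.var(), entropy]])

# A random 32x32 RGB array standing in for a real image.
rng = np.random.default_rng(0)
img = rng.integers(0, 256, size=(32, 32, 3), dtype=np.uint8)
vec = global_features(img)
print(vec.shape)  # (27,): 3 channels x 8 bins + 3 statistics
```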

&lt;p&gt;Excited to get started, I began by following the "&lt;a href="https://arxiv.org/pdf/1608.03811v1.pdf"&gt;Content-based image retrieval tutorial&lt;/a&gt;" by Joani Mitro. The report covered classic CBIR techniques such as color histogram quantization and RGB means and moments, which I was able to implement with ease. However, I hit a roadblock at the next step: implementing a color auto-correlogram.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--hQFAg6_N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4yrif6be0ga56vokrvhx.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--hQFAg6_N--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4yrif6be0ga56vokrvhx.png" alt="Histograms" width="880" height="444"&gt;&lt;/a&gt;&lt;br&gt;
&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--bdBtoVMR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qoj38ycbglhj6o6xkmy6.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--bdBtoVMR--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/qoj38ycbglhj6o6xkmy6.png" alt="HSV Channels" width="880" height="445"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;A color auto-correlogram is a powerful technique invented by Huang, J., Ravi Kumar, S., Mitra, M. et al. in their seminal paper, "&lt;a href="https://link.springer.com/article/10.1023/A:1008108327226"&gt;Spatial Color Indexing and Applications&lt;/a&gt;" published back in 1999. In short, a color auto-correlogram is a tool that allows us to measure the spatial correlation between color values in an image. For example, if we compute the color auto-correlogram for a given image and find that the value for the red-blue pair is high, this indicates that red is likely to be found near blue in the image. This information can be used to identify other images that have similar color patterns and texture characteristics.&lt;/p&gt;
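&lt;p&gt;For intuition, here's a naive Python sketch of the definition on a palettized (color-indexed) image. The paper's actual algorithm is far more efficient, and real implementations also consider diagonal neighbours; I stick to horizontal and vertical offsets here for brevity.&lt;/p&gt;

```python
import numpy as np

def auto_correlogram(quantized, n_colors, distances=(1, 3)):
    """Naive colour auto-correlogram: for each colour c and distance d,
    estimate the probability that a pixel at distance d from a colour-c
    pixel also has colour c (horizontal/vertical offsets only)."""
    h, w = quantized.shape
    result = np.zeros((len(distances), n_colors))
    for di, d in enumerate(distances):
        matches = np.zeros(n_colors)
        totals = np.zeros(n_colors)
        for dy, dx in ((0, d), (0, -d), (d, 0), (-d, 0)):
            # Overlapping windows: src pixels and their (dy, dx) neighbours.
            src = quantized[max(0, -dy):h - max(0, dy), max(0, -dx):w - max(0, dx)]
            dst = quantized[max(0, dy):h + min(0, dy), max(0, dx):w + min(0, dx)]
            for c in range(n_colors):
                mask = src == c
                totals[c] += mask.sum()
                matches[c] += (dst[mask] == c).sum()
        result[di] = np.where(totals > 0, matches / np.maximum(totals, 1), 0)
    return result.ravel()

# On a uniform image of colour 0, every neighbour matches: probability 1.
uniform = np.zeros((8, 8), dtype=int)
print(auto_correlogram(uniform, n_colors=2, distances=(1,)))  # [1. 0.]
```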

&lt;p&gt;The problem was that when it came time to implement the color auto-correlogram, I quickly found that the &lt;a href="https://github.com/raj1603chdry/CSE3018-Content-Based-Image-and-Video-Retrieval-Lab/tree/master/WEEK4%20"&gt;only implementation available&lt;/a&gt; was in MATLAB. Unfortunately, I was working in Python with OpenCV2 and had no knowledge of MATLAB, which left me with the monumental job of rewriting the entire algorithm in Python.&lt;/p&gt;

&lt;p&gt;It took days of hard work and countless hours of debugging (with some help from Copilot at one point), but I finally got it to almost work: instead of a vector of 512 values, though, I got a vector of absurd size. After digging around, I found that MATLAB has a function that palettizes the image down to just 64 colors. Unfortunately, OpenCV2 has no equivalent, and I couldn't just trim the least significant bits or use k-means clustering to create a palette for each image, because the resulting vectors wouldn't be comparable across images.&lt;/p&gt;

&lt;p&gt;After some trial and error, I finally came up with a solution: I took a PNG of what I believe was the Windows 2000 system color palette and used it to palettize all my images. And finally, it clicked! The search began to work, and I was able to retrieve images with a color distribution similar to my query's!&lt;/p&gt;
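&lt;p&gt;In code, the shared-palette trick is just a nearest-neighbour lookup. A NumPy sketch with a hypothetical four-colour palette standing in for the real 64-colour one:&lt;/p&gt;

```python
import numpy as np

def palettize(image, palette):
    """Map every pixel to the index of its nearest palette colour, so
    vectors computed from different images share one colour space."""
    pixels = image.reshape(-1, 3).astype(np.float64)
    # Squared distance from every pixel to every palette entry.
    dists = ((pixels[:, None, :] - palette[None, :, :]) ** 2).sum(axis=2)
    return dists.argmin(axis=1).reshape(image.shape[:2])

# A hypothetical fixed 4-colour palette: black, red, green, blue.
palette = np.array([[0, 0, 0], [255, 0, 0], [0, 255, 0], [0, 0, 255]],
                   dtype=np.float64)
img = np.array([[[250, 10, 5], [3, 2, 240]],
                [[0, 0, 0], [10, 250, 10]]], dtype=np.uint8)
print(palettize(img, palette))  # [[1 3]
                                #  [0 2]]
```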

&lt;p&gt;Now, let's put these extracted features to work by testing different distance metrics and searching for images in the &lt;a href="https://shannon.cs.illinois.edu/DenotationGraph/"&gt;flickr30k&lt;/a&gt; dataset using a single global feature vector that combines all of them. Here are some of the results:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--tI3HaOJi--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/l7psvobcbmj7nd0s41gt.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--tI3HaOJi--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/l7psvobcbmj7nd0s41gt.png" alt="Some search results" width="880" height="543"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--daaST7aF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3u21hhmef8l9ochbizba.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--daaST7aF--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/3u21hhmef8l9ochbizba.png" alt="Some search results" width="880" height="543"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;h3&gt;
  
  
  Local features based methods
&lt;/h3&gt;

&lt;p&gt;Next, I delved into local feature-based methods, specifically examining the SIFT and ORB keypoint detectors and descriptor generators. I chose these algorithms because they are seamlessly integrated into OpenCV2 and quite easy to use, making them ideal for what I wanted to explore.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--i0_CQ4-j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4im32wnvi1xjncfw97h5.jpg" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--i0_CQ4-j--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/4im32wnvi1xjncfw97h5.jpg" alt="SIFT and ORB keypoints and features" width="880" height="330"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;The bag-of-words approach is a popular method for representing local features in image retrieval. It involves clustering local descriptors (in this case, SIFT or ORB descriptors) into a set of visual words using K-means clustering. A histogram-like representation of the image is then created by assigning each descriptor to the nearest visual word. These histograms are called Bag of Visual Words vectors, which can then be used for similarity searches using plain old distance metrics.&lt;/p&gt;

&lt;p&gt;Okay, so the Term Frequency - Inverse Document Frequency (TF-IDF) technique was originally invented for natural language processing, but it turns out that it can be just as useful for image applications! Essentially, TF-IDF is a weighting scheme that assigns weights to words in a text document based on how often they occur in that document and how often they occur across all documents in a corpus. In the context of image retrieval, we can think of an image as a "document" and the visual features as "words". By calculating the TF-IDF scores for each feature, we can identify the most "distinctive" features, those most likely to be useful in distinguishing one image from another.&lt;/p&gt;

&lt;p&gt;To give an example of how IDF works, let's consider the visual word "sky". Imagine there are thousands of images containing the visual word "sky", but only a small percentage of those images contain another visual word "eye". If we use IDF weighting, the visual word "eye" will have a higher weight compared to "sky" because it is less common in the total set of images.&lt;/p&gt;

&lt;p&gt;The whole algorithm looks like this:&lt;/p&gt;

&lt;ol&gt;
&lt;li&gt;&lt;p&gt;First, we extract local feature descriptors from the image (such as SIFT, SURF, or ORB). I personally use &lt;a href="https://doi.org/10.1007/s00521-018-3677-9"&gt;both SIFT and ORB at the same time&lt;/a&gt;.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We then feed all the descriptors into a K-means classifier and obtain n clusters. For each image, we calculate how many descriptors belong to each cluster, resulting in an n-dimensional Bag of Visual Words vector.&lt;/p&gt;&lt;/li&gt;
&lt;li&gt;&lt;p&gt;We transform this Bag of Words vector into a TF-IDF vector, which we can use to measure the distance between images using any distance metric (such as cosine or Euclidean distance).&lt;/p&gt;&lt;/li&gt;
&lt;/ol&gt;
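&lt;p&gt;The steps above can be sketched in NumPy. To keep the example self-contained, the vocabulary below is a few hand-written 2-D points instead of K-means centroids over real SIFT/ORB descriptors:&lt;/p&gt;

```python
import numpy as np

def bovw_vector(descriptors, vocab):
    """Assign each local descriptor to its nearest visual word and count
    occurrences: the Bag of Visual Words histogram."""
    d2 = ((descriptors[:, None, :] - vocab[None, :, :]) ** 2).sum(axis=2)
    words = d2.argmin(axis=1)
    return np.bincount(words, minlength=len(vocab)).astype(np.float64)

def tfidf(bovw_matrix):
    """Weight a stack of BoVW vectors: term frequency times inverse
    document frequency, exactly as in text retrieval."""
    tf = bovw_matrix / np.maximum(bovw_matrix.sum(axis=1, keepdims=True), 1)
    df = (bovw_matrix > 0).sum(axis=0)          # images containing each word
    idf = np.log(len(bovw_matrix) / np.maximum(df, 1))
    return tf * idf

# Toy 2-D "descriptors" and a 3-word vocabulary.
vocab = np.array([[0.0, 0.0], [1.0, 0.0], [0.0, 1.0]])
images = [
    np.array([[0.1, 0.0], [0.9, 0.1], [1.1, 0.0]]),   # words: 0, 1, 1
    np.array([[0.0, 0.9], [0.1, 1.1]]),               # words: 2, 2
    np.array([[0.0, 0.1], [0.05, 0.0]]),              # words: 0, 0
]
bovw = np.stack([bovw_vector(d, vocab) for d in images])
weighted = tfidf(bovw)  # rows are now comparable with any distance metric
```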

&lt;p&gt;Again, here are some of the results on the flickr30k dataset:&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--niTjaAeN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n7u0a74ks5j58xxczapd.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--niTjaAeN--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/n7u0a74ks5j58xxczapd.png" alt="Despite SIFT and ORB not considering color information, the images are still similar." width="880" height="399"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;small&gt;
Despite SIFT and ORB not considering color information, the images are still similar.
&lt;/small&gt;&lt;/center&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s---BaIgNiV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6syctearcy35o3ybwvrh.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s---BaIgNiV--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/6syctearcy35o3ybwvrh.png" alt='Clearly the common visual element across these photos is "dots".' width="880" height="415"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;small&gt;
Clearly the common visual element across these photos is "dots".
&lt;/small&gt;&lt;/center&gt;

&lt;h3&gt;
  
  
  The biggest problem with CBIR
&lt;/h3&gt;

&lt;p&gt;A notable challenge in content-based image retrieval is the scarcity of meaningful datasets. Many scientific articles tend to reduce the complexity of the problem by treating it as a classification problem, rather than focusing on measuring the similarity of image content. This raises important questions: How can we effectively evaluate the performance of our models? Which approach is better - classical methods or SIFT+ORB?&lt;/p&gt;

&lt;p&gt;So I decided to simplify the problem to a classification task. To do this, I used the Landscapes dataset (which, as you'll soon see, was not the best choice). I assigned each image in the test set a category based on its 3 nearest neighbors from the training set, allowing me to evaluate the different techniques under this classification framework.&lt;/p&gt;
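&lt;p&gt;This evaluation scheme is just k-nearest-neighbour classification over feature vectors. A minimal sketch, with made-up two-dimensional features and invented category names:&lt;/p&gt;

```python
import numpy as np

def knn_classify(query_vec, train_vecs, train_labels, k=3):
    """Label a test image by majority vote of its k nearest training
    images (Euclidean distance on feature vectors)."""
    dists = np.linalg.norm(train_vecs - query_vec, axis=1)
    nearest = np.argsort(dists)[:k]
    votes = [train_labels[i] for i in nearest]
    return max(set(votes), key=votes.count)

# Made-up 2-D feature vectors for two landscape categories.
train = np.array([[0.0, 0.1], [0.1, 0.0], [0.2, 0.1],   # "mountain"
                  [1.0, 1.1], [1.1, 0.9], [0.9, 1.0]])  # "beach"
labels = ["mountain"] * 3 + ["beach"] * 3
print(knn_classify(np.array([0.05, 0.05]), train, labels))  # mountain
```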

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--woVXWi-V--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i6qx3fwiqvob2nc3jaf3.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--woVXWi-V--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/i6qx3fwiqvob2nc3jaf3.png" alt="Here are the results of using SIFT for searching in the landscape dataset." width="880" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;small&gt;
Here are the results of using SIFT for searching in the landscape dataset.
&lt;/small&gt;&lt;/center&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--h-Q0ned9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nfdxvz09psiz6zavazq9.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--h-Q0ned9--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/nfdxvz09psiz6zavazq9.png" alt="I may have inadvertently created a watermark detector!" width="880" height="408"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;small&gt;
I may have inadvertently created a watermark detector!
&lt;/small&gt;&lt;/center&gt;

&lt;p&gt;Yeah... the dataset is of low quality, which hurts the accuracy. However, it's worth noting that the content of the retrieved images is actually very similar, which is exactly what we're looking for!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Q17gDxLB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lpdy6ip0f05loek387ke.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Q17gDxLB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/lpdy6ip0f05loek387ke.png" alt="Confusion matrix for classical methods." width="880" height="664"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;small&gt;
Confusion matrix for classical methods.
&lt;/small&gt;&lt;/center&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0vm-wHmB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/cgp1ekl5dsj0js1pnb45.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0vm-wHmB--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/cgp1ekl5dsj0js1pnb45.png" alt="Confusion matrix for SIFT+ORB." width="880" height="665"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;center&gt;&lt;small&gt;
Confusion matrix for SIFT+ORB.
&lt;/small&gt;&lt;/center&gt;

&lt;h3&gt;
  
  
  Convolutional Conquerors
&lt;/h3&gt;

&lt;p&gt;As the finals were fast approaching, I shifted my focus to neural networks. With limited time, I couldn't afford to train one from scratch, and frankly, I wasn't sure how best to approach it. However, I realized that convolutional neural networks produce an output vector that could potentially be used to measure the distance between images.&lt;/p&gt;

&lt;p&gt;I thought, why not use a pre-trained neural network, "chop off its head", and simply utilize those vectors? So, I went ahead and acquired an &lt;a href="https://tfhub.dev/google/imagenet/efficientnet_v2_imagenet21k_l/feature_vector/2"&gt;EfficientNet without the classification layers&lt;/a&gt;, hoping to evaluate its potential... And it failed spectacularly, producing seemingly random vectors that made the images virtually incomparable.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--0f3-L7ft--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7phdri2sf9cahk50mjon.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--0f3-L7ft--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/7phdri2sf9cahk50mjon.png" alt="Image description" width="880" height="651"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Convinced that the vectors were not random, but simply not comparable using standard distance measures, I tried to train a similarity-measuring neural network. A rather desperate move, I admit. In the end, the results were far from usable. The classifier consistently picked certain buildings while completely ignoring others, and that wasn't even the primary objective! So I decided to stop wasting time and let it go. I even experimented with machine learning techniques other than neural networks, such as Random Forests and KNN, but unfortunately these efforts were also fruitless.&lt;/p&gt;

&lt;p&gt;With the project deadline looming and time running out, I was almost ready to throw in the towel. I wondered if maybe EfficientNet was to blame for my struggles. In a last-ditch effort, I decided to give the &lt;a href="https://tfhub.dev/google/bit/m-r50x1/1%20"&gt;BiT-M R50x1&lt;/a&gt; model a try and evaluate its performance. To my amazement, it worked like a charm on the first try! I achieved spectacular results with a remarkable 93% accuracy!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--9O79SCbh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bz6n5xzvv9appmcdrdho.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--9O79SCbh--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/bz6n5xzvv9appmcdrdho.png" alt="Image description" width="880" height="655"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Thrilled by the impressive performance of the BiT-M model, I decided to test it on the entire Flickr30k dataset, intending to search across these vectors. After waiting patiently for a few hours as the model gradually processed all 30,000 images, I finally had my database. With great anticipation, I ran my first search and was blown away by the results! The images retrieved were not only visually similar to the query, but they also shared the same CONTENT.&lt;/p&gt;
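&lt;p&gt;Once every image is reduced to a single embedding, search is plain linear algebra: normalise everything to unit length, and one matrix product yields all cosine similarities at once. A sketch of the lookup, with random vectors standing in for the real BiT-M embeddings:&lt;/p&gt;

```python
import numpy as np

def top_k(query_emb, database, k=3):
    """Cosine-similarity search over a matrix of CNN embeddings."""
    db = database / np.linalg.norm(database, axis=1, keepdims=True)
    q = query_emb / np.linalg.norm(query_emb)
    sims = db @ q                     # one dot product per database image
    return np.argsort(-sims)[:k]      # indices of the k most similar images

rng = np.random.default_rng(42)
database = rng.normal(size=(1000, 128))      # stand-in for the 30k vectors
query = database[7] + rng.normal(scale=0.01, size=128)  # near-duplicate of item 7
print(top_k(query, database, k=3))           # item 7 ranks first
```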

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--Mth3GPpm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/d7yoyjrioewvi3489xdg.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--Mth3GPpm--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/d7yoyjrioewvi3489xdg.png" alt="Image description" width="880" height="448"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--fTQO_-Wj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rnxsat0ezaut6olirqmo.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--fTQO_-Wj--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/rnxsat0ezaut6olirqmo.png" alt="Image description" width="880" height="438"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;&lt;a href="https://res.cloudinary.com/practicaldev/image/fetch/s--OwpnG0UL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/xrm72s67rsklsoux47ng.png" class="article-body-image-wrapper"&gt;&lt;img src="https://res.cloudinary.com/practicaldev/image/fetch/s--OwpnG0UL--/c_limit%2Cf_auto%2Cfl_progressive%2Cq_auto%2Cw_880/https://dev-to-uploads.s3.amazonaws.com/uploads/articles/xrm72s67rsklsoux47ng.png" alt="Image description" width="880" height="438"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;In conclusion, my journey into the world of content-based image retrieval has been nothing short of eye-opening. From exploring classic methods like color auto-correlograms, to diving into local features with SIFT and ORB, and finally delving into the power of neural networks, I've learned a great deal about the different approaches to CBIR.&lt;/p&gt;

&lt;p&gt;Although I faced some challenges along the way, the remarkable success of the BiT-M model on the Flickr30k dataset was truly rewarding. It demonstrated the incredible potential of neural networks to retrieve images based not only on visual similarity, but also on content. As we continue to push the boundaries of what's possible with CBIR, the future of image search is undoubtedly bright and full of innovation.&lt;/p&gt;

&lt;p&gt;I hope you've enjoyed joining me on this exciting journey and that you've gained some valuable insights into the fascinating world of content-based image retrieval.&lt;/p&gt;

</description>
      <category>python</category>
      <category>tutorial</category>
      <category>ai</category>
      <category>programming</category>
    </item>
    <item>
      <title>How to glitch video in the age of web</title>
      <dc:creator>Jakub Kopańko</dc:creator>
      <pubDate>Thu, 23 Sep 2021 22:19:11 +0000</pubDate>
      <link>https://dev.to/pcktm/how-to-glitch-video-files-in-the-age-of-web-6a8</link>
      <guid>https://dev.to/pcktm/how-to-glitch-video-files-in-the-age-of-web-6a8</guid>
      <description>&lt;p&gt;&lt;em&gt;The tool described in this post is available at &lt;a href="https://ezglitch.kopanko.com/?mtm_campaign=dev.to"&gt;ezglitch.kopanko.com&lt;/a&gt;&lt;/em&gt;&lt;/p&gt;

&lt;p&gt;For years I've been interested in datamoshing and glitch art, but mainly for the computer aspect of it, like, you know, you edit some parts of the file, and it plays differently? How cool is that, right?&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa71xhpf9u48m2kidroet.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fa71xhpf9u48m2kidroet.gif" alt="one of the resulting videos"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;But if you wanted to get into glitching, there's an obvious barrier! Most tutorials rely on old and buggy software or require you to download countless environments and tools onto your computer! Some people argue that if you don't do it with buggy software, it ain't &lt;em&gt;glitch&lt;/em&gt;-art at all!&lt;/p&gt;

&lt;p&gt;In the past, I had made my own tools to break files for me, like &lt;a href="https://github.com/pcktm/glitchbox-cli" rel="noopener noreferrer"&gt;glitchbox&lt;/a&gt;, which was basically a JavaScript interface to &lt;a href="https://ffglitch.org/" rel="noopener noreferrer"&gt;ffglitch&lt;/a&gt; (back when it had none), always trying to make things as easy as possible for the end user.&lt;/p&gt;

&lt;p&gt;So, one evening, I sat down and set about rewriting my go-to AVI glitching tool, &lt;a href="https://github.com/itsKaspar/tomato" rel="noopener noreferrer"&gt;tomato&lt;/a&gt;, for the web. Let me start by explaining how an AVI file is actually constructed. AVI files consist of three basic parts:&lt;/p&gt;

&lt;ul&gt;
&lt;li&gt;hdrl buffer - a header of sorts that contains data such as the total number of frames, the width and height of the video, and so on.&lt;/li&gt;
&lt;li&gt;
&lt;strong&gt;movi buffer&lt;/strong&gt; - this is the part we actually care about as it contains raw frame data.&lt;/li&gt;
&lt;li&gt;idx1 buffer - holds the index.&lt;/li&gt;
&lt;/ul&gt;

&lt;p&gt;Now, the frames in the movi buffer are arranged as they will be played by the player. Audio data starts with the string &lt;code&gt;01wb&lt;/code&gt; and compressed video with &lt;code&gt;00dc&lt;/code&gt;. They end just before the next such tag or just before the &lt;code&gt;idx1&lt;/code&gt; buffer tag.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe48ban3j37k6ajcp4afn.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fe48ban3j37k6ajcp4afn.png" alt="actual data illustrated"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;For the fun part - if we rearrange or copy those frames around, the player will play them right as it sees them. We don't need to know the exact structure of the frame, its DCT coefficients, or some other complicated technical stuff - we just need to be able to move bytes around! Fortunately for us, that is entirely possible in modern browsers!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;buf&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;await&lt;/span&gt; &lt;span class="nx"&gt;file&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;arrayBuffer&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;moviBuffer&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;buf&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;moviMarkerPos&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;idx1MarkerPos&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
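&lt;p&gt;The snippet above assumes &lt;code&gt;moviMarkerPos&lt;/code&gt; and &lt;code&gt;idx1MarkerPos&lt;/code&gt; are already known. Locating them is just a byte search; here's the idea sketched in Python on a fake, heavily simplified buffer (a robust parser would walk the RIFF chunk tree instead of searching):&lt;/p&gt;

```python
def find_marker(buf: bytes, tag: bytes, start: int = 0) -> int:
    """Locate an AVI chunk tag (e.g. b'movi' or b'idx1') in a byte buffer."""
    pos = buf.find(tag, start)
    if pos < 0:
        raise ValueError(f"tag {tag!r} not found")
    return pos

# A fake, heavily simplified AVI-like layout (not a playable file).
fake_avi = b"RIFF....AVI LIST....hdrl....LIST....movi00dc\x01\x0201wb\x03idx1...."
movi_pos = find_marker(fake_avi, b"movi")
idx1_pos = find_marker(fake_avi, b"idx1", movi_pos)
movi_buffer = fake_avi[movi_pos:idx1_pos]   # everything between movi and idx1
```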



&lt;p&gt;Now that we have the entire &lt;code&gt;movi&lt;/code&gt; buffer, we need to construct a frame table. We use some &lt;a href="https://en.wikipedia.org/wiki/Boyer%E2%80%93Moore_string-search_algorithm" rel="noopener noreferrer"&gt;string-search algorithm&lt;/a&gt; to find all occurrences of &lt;code&gt;00dc&lt;/code&gt; or &lt;code&gt;01wb&lt;/code&gt; in the buffer - they mark the beginning of every frame.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="c1"&gt;// this is just "00dc" in hexadecimal&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;pattern&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mh"&gt;0x30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mh"&gt;0x30&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mh"&gt;0x64&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mh"&gt;0x63&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;indices&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;BoyerMoore&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;pattern&lt;/span&gt;&lt;span class="p"&gt;).&lt;/span&gt;&lt;span class="nf"&gt;findIndexes&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;moviBuffer&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;bframes&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;indices&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;v&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;&lt;span class="na"&gt;type&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="s1"&gt;video&lt;/span&gt;&lt;span class="dl"&gt;'&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="na"&gt;index&lt;/span&gt;&lt;span class="p"&gt;:&lt;/span&gt; &lt;span class="nx"&gt;v&lt;/span&gt;&lt;span class="p"&gt;}});&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;We do the same for audio frames (&lt;code&gt;01wb&lt;/code&gt;), combine the two, and sort them by index. Then, we need to get each frame's byte size (which will come in very handy in a moment):&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;table&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;sorted&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;map&lt;/span&gt;&lt;span class="p"&gt;((&lt;/span&gt;&lt;span class="nx"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt;&lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;if &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt; &lt;span class="o"&gt;&amp;lt;&lt;/span&gt; &lt;span class="nx"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;length&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt;
    &lt;span class="nx"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;arr&lt;/span&gt;&lt;span class="p"&gt;[&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="mi"&gt;1&lt;/span&gt;&lt;span class="p"&gt;].&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;else&lt;/span&gt;
    &lt;span class="nx"&gt;size&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;moviBuffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;byteLength&lt;/span&gt; &lt;span class="o"&gt;-&lt;/span&gt; &lt;span class="nx"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="k"&gt;return&lt;/span&gt; &lt;span class="p"&gt;{...&lt;/span&gt;&lt;span class="nx"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;size&lt;/span&gt;&lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;})&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;The process has been pretty linear and dull so far, but now comes the genuine fun: we get to come up with a function that messes with the frames! Let's do the simplest thing and just reverse the whole array.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;final&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;table&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;final&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;reverse&lt;/span&gt;&lt;span class="p"&gt;();&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
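&lt;p&gt;Reversing is just one possible transform, of course. As a purely illustrative sketch (the &lt;code&gt;Frame&lt;/code&gt; interface and the &lt;code&gt;reverseInChunks&lt;/code&gt; helper below are my own assumptions, inferred from the snippets above, not part of the original tool), here is a deterministic alternative that reverses the order only within fixed-size chunks:&lt;/p&gt;

```typescript
// Hypothetical Frame shape, inferred from the frame-table snippets above.
interface Frame {
  index: number; // byte offset of the frame inside the movi buffer
  size: number;  // byte length of the frame
}

// Reverse the frame order inside every chunk of `chunkSize` frames.
// Short runs of motion survive, but the overall timeline is scrambled.
function reverseInChunks(table: Frame[], chunkSize: number): Frame[] {
  const out: Frame[] = [];
  for (let i = 0; i < table.length; i += chunkSize) {
    out.push(...table.slice(i, i + chunkSize).reverse());
  }
  return out;
}
```

&lt;p&gt;With &lt;code&gt;chunkSize&lt;/code&gt; equal to the table length, this degenerates into the full reverse we just did.&lt;/p&gt;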



&lt;p&gt;This will, obviously, make the video play backward. But the frames that encode motion were compressed assuming the original order, so reversing it effectively flips the motion vectors inside them, which leads to a very odd effect in playback. Keep in mind the frames themselves are still valid and their data hasn't changed - only their order inside the file has.&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhmluuyb0mzn6m7pbqpml.png" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fhmluuyb0mzn6m7pbqpml.png" alt="illustration of frame order inside the movi tag"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;OK, so that's it? Well, not yet. We still need to reconstruct the new movi buffer from the frame table and combine it with the &lt;code&gt;hdrl&lt;/code&gt; and &lt;code&gt;idx1&lt;/code&gt; buffers. How do we approach that?&lt;/p&gt;

&lt;p&gt;The best way is to compute the final size of the movi buffer and allocate that much memory up front - a &lt;code&gt;Uint8Array&lt;/code&gt; can't grow, so we'd otherwise have to reallocate and copy it.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;expectedMoviSize&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
&lt;span class="nx"&gt;final&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;forEach&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;frame&lt;/span&gt; &lt;span class="o"&gt;=&amp;gt;&lt;/span&gt; &lt;span class="nx"&gt;expectedMoviSize&lt;/span&gt;&lt;span class="o"&gt;+=&lt;/span&gt;&lt;span class="nx"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;Wait, why &lt;code&gt;expectedMoviSize = 4&lt;/code&gt;? Because the chunk has to begin with the 4-byte &lt;code&gt;movi&lt;/code&gt; tag itself. We initialize the TypedArray with the final size and write those four bytes first.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;finalMovi&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;expectedMoviSize&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;finalMovi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;([&lt;/span&gt;&lt;span class="mh"&gt;0x6D&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mh"&gt;0x6F&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mh"&gt;0x76&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="mh"&gt;0x69&lt;/span&gt;&lt;span class="p"&gt;]);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;This is the final stretch - for every frame in the frame table, we read its data from the original file and write it at the correct offset in the final movi tag. We advance the write head by each frame's byte size so that the frames land sequentially.&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;head&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="mi"&gt;4&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt; &lt;span class="c1"&gt;// guess why we start at 4&lt;/span&gt;

&lt;span class="k"&gt;for &lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;frame&lt;/span&gt; &lt;span class="k"&gt;of&lt;/span&gt; &lt;span class="nx"&gt;final&lt;/span&gt;&lt;span class="p"&gt;))&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
  &lt;span class="k"&gt;if&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt; &lt;span class="o"&gt;&amp;amp;&amp;amp;&lt;/span&gt; &lt;span class="nx"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt; &lt;span class="o"&gt;!=&lt;/span&gt; &lt;span class="mi"&gt;0&lt;/span&gt;&lt;span class="p"&gt;)&lt;/span&gt; &lt;span class="p"&gt;{&lt;/span&gt;
    &lt;span class="kd"&gt;const&lt;/span&gt; &lt;span class="nx"&gt;data&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="nx"&gt;moviBuffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;slice&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;index&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;finalMovi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;data&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;head&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
    &lt;span class="nx"&gt;head&lt;/span&gt; &lt;span class="o"&gt;+=&lt;/span&gt; &lt;span class="nx"&gt;frame&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;size&lt;/span&gt;&lt;span class="p"&gt;;&lt;/span&gt;
  &lt;span class="p"&gt;}&lt;/span&gt;
&lt;span class="p"&gt;}&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;
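&lt;p&gt;To convince yourself the bookkeeping is right, it helps to run the same copy logic on toy data and check that the head lands exactly at the end of the buffer. A self-contained miniature (the byte values are made up, and I drop the zero-index guard for brevity):&lt;/p&gt;

```typescript
// Toy movi payload: two 2-byte "frames" stored back-to-back.
const moviBuffer = new Uint8Array([0xaa, 0xbb, 0xcc, 0xdd]).buffer;

// Frame table after "glitching": the second frame now comes first.
const final = [
  { index: 2, size: 2 },
  { index: 0, size: 2 },
];

const finalMovi = new Uint8Array(4 + 4); // 4-byte tag + 4 bytes of payload
finalMovi.set([0x6d, 0x6f, 0x76, 0x69]); // "movi"

let head = 4;
for (const frame of final) {
  const data = moviBuffer.slice(frame.index, frame.index + frame.size);
  finalMovi.set(new Uint8Array(data), head);
  head += frame.size;
}
// head is now 8 === finalMovi.byteLength, and the payload reads CC DD AA BB.
```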



&lt;p&gt;Now all that's left is to recombine it with the original &lt;code&gt;hdrl&lt;/code&gt; and &lt;code&gt;idx1&lt;/code&gt; and we're done!&lt;br&gt;
&lt;/p&gt;

&lt;div class="highlight js-code-highlight"&gt;
&lt;pre class="highlight typescript"&gt;&lt;code&gt;&lt;span class="kd"&gt;let&lt;/span&gt; &lt;span class="nx"&gt;out&lt;/span&gt; &lt;span class="o"&gt;=&lt;/span&gt; &lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hdrlBuffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;byteLength&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;finalMovi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;byteLength&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;idx1Buffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;byteLength&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt; 
&lt;span class="nx"&gt;out&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;hdrlBuffer&lt;/span&gt;&lt;span class="p"&gt;));&lt;/span&gt;
&lt;span class="nx"&gt;out&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;finalMovi&lt;/span&gt;&lt;span class="p"&gt;,&lt;/span&gt; &lt;span class="nx"&gt;moviMarkerPos&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;span class="nx"&gt;out&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nf"&gt;set&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="k"&gt;new&lt;/span&gt; &lt;span class="nc"&gt;Uint8Array&lt;/span&gt;&lt;span class="p"&gt;(&lt;/span&gt;&lt;span class="nx"&gt;idx1Buffer&lt;/span&gt;&lt;span class="p"&gt;),&lt;/span&gt; &lt;span class="nx"&gt;hdrlBuffer&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;byteLength&lt;/span&gt; &lt;span class="o"&gt;+&lt;/span&gt; &lt;span class="nx"&gt;finalMovi&lt;/span&gt;&lt;span class="p"&gt;.&lt;/span&gt;&lt;span class="nx"&gt;byteLength&lt;/span&gt;&lt;span class="p"&gt;);&lt;/span&gt;
&lt;/code&gt;&lt;/pre&gt;

&lt;/div&gt;



&lt;p&gt;That's it - we can now save the complete modified file and enjoy the result!&lt;/p&gt;

&lt;p&gt;&lt;a href="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkdpxyh8kucrofsltio7y.gif" class="article-body-image-wrapper"&gt;&lt;img src="https://media.dev.to/dynamic/image/width=800%2Cheight=%2Cfit=scale-down%2Cgravity=auto%2Cformat=auto/https%3A%2F%2Fdev-to-uploads.s3.amazonaws.com%2Fuploads%2Farticles%2Fkdpxyh8kucrofsltio7y.gif" alt="Resulting video"&gt;&lt;/a&gt;&lt;/p&gt;

&lt;p&gt;Again, you can find the complete tool &lt;a href="https://ezglitch.kopanko.com/?mtm_campaign=dev.to"&gt;here&lt;/a&gt;.&lt;br&gt;
Thanks for reading, glitch on ✨! &lt;/p&gt;

</description>
      <category>webdev</category>
      <category>showdev</category>
      <category>javascript</category>
      <category>programming</category>
    </item>
  </channel>
</rss>
