DEV Community

benank

Posted on • Updated on • Originally published at benank.com

Building More Than Just a YouTube Video Downloader

For the dance game I'm making, I want players to be able to dance to any YouTube video. Put simply, they'll paste a YouTube link into the game, the game will analyze the video, and they'll be scored in realtime as they dance along. (Take a look at the previous blog posts if you need more context!)

I don't want to embed YouTube videos in my game. There are a few reasons for this:

  1. TensorFlow.js (the machine learning platform that I'm using) can't analyze an embedded YouTube video. The video needs to be inside of an accessible <video> (or similar) element on the webpage and cannot be embedded.
  2. Improving the editor user experience - to create new dance charts from YouTube videos, a player needs to go to the Create tab and make a new project, using a YouTube video as the source. Using an already downloaded video would ensure that there isn't any buffering or lag when editing.
  3. Improving the play experience - when playing a chart, the YouTube video will have to load and play as they dance. If there's a momentary connection issue, their rhythm will be thrown off and the video will pause. This would also lead to scoring complications if the video is paused.
  4. Greater control over the experience - I don't want users to be able to pause and play videos while they play the game. The video should automatically play right when the chart starts and continue without any interruptions so the player can have a smooth dance session.

Getting Started

Everyone's wanted to download a YouTube video at some point, but the methods for doing so have often been less than ideal. My usual strategy would be to search "youtube to mp4" and then click on the first result. I would paste in the YouTube link and wait for it to give me a download link.

Many of these sites use an underlying program called youtube-dl. youtube-dl is a program that's capable of downloading videos from YouTube and many other sites.

I'm writing my server in Node.js, so ideally I would have a wrapper around youtube-dl to make it extra easy to use. Luckily, someone's already done that, with youtube-dl-wrap! It can even download the executable file for you (the youtube-dl program itself), so you don't have to manage that at all.

Video Metadata

Let's say that a user wants to download the video from the link: https://www.youtube.com/watch?v=pdsGv5B9OSQ. First, we have to verify that the link they provided is an actual video that can be downloaded.

We can do this by retrieving the video's metadata using youtube-dl. The metadata for a video is a collection of attributes about the video, such as its webpage url, thumbnail, video length, file size, title, description, upload date, and so on.

If we're able to get the metadata, this means that the video is a valid video that can be downloaded. If the link is invalid or doesn't point to a real video, youtube-dl will tell us and we can give the user an error.
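Here's a minimal sketch of that validity check. The helper name checkVideo and the shape of its return value are made up for illustration. The metadata fetcher is passed in as a function, so it could be a call into a youtube-dl wrapper (youtube-dl-wrap documents a getVideoInfo method, but treat that wiring as an assumption) or a stub in a test.

```javascript
// Ask a metadata fetcher about a URL and report whether the video looks downloadable.
// `fetchMetadata` is injected: any function that takes a URL and resolves to
// youtube-dl style metadata (or rejects for an invalid link) will work.
async function checkVideo(url, fetchMetadata) {
    try {
        const metadata = await fetchMetadata(url);
        const formats = Array.isArray(metadata?.formats) ? metadata.formats : [];
        if (formats.length === 0) {
            return { ok: false, error: "No downloadable formats found." };
        }
        return { ok: true, metadata };
    } catch (err) {
        // The fetcher failed, so treat the link as invalid and surface the reason.
        return { ok: false, error: `Could not load video: ${err.message}` };
    }
}
```

The caller can then show result.error to the user whenever result.ok is false.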

The video metadata has an important section called formats, which is a list of audio and video formats that can be downloaded. These have varying qualities, such as audio-only, 360p, 480p, and others. This makes it pretty easy to download the video at the quality that you want - just tell youtube-dl to download the 360p video.

{
    format_note: '360p',
    acodec: 'none',
    url: '...',
    ext: 'mp4',
    tbr: 177.301,
    format: '134 - 640x360 (360p)',
    filesize: 3244599,
    vcodec: 'avc1.4d401e',
    quality: 2,
    asr: null,
    container: 'mp4_dash',
    downloader_options: [Object],
    vbr: 177.301,
    height: 360,
    http_headers: [Object],
    width: 640,
    format_id: '134',
    protocol: 'https',
    fps: 30
}

Above: an example of one entry in the formats section of the metadata.

However, there's a catch: most of the time, the highest quality video doesn't have audio. That's just how YouTube seems to work, with the audio and video separate. So in order to download the highest quality video (with audio), they'll have to be downloaded separately. In many cases, you'd want to combine the two into one file so you have the highest quality video and audio. ffmpeg is one way to do that. But in my case, I can simply play both the audio and video at the same time and it will work!
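Picking the separate video and audio streams out of the formats list could look something like the sketch below. The function name pickFormats is hypothetical, and it assumes audio-only entries mark vcodec as 'none' in the same way the entry above marks acodec as 'none'. It prefers the tallest video-only format at or below a target height and the audio-only format with the highest tbr.

```javascript
// Choose one video-only and one audio-only format from a youtube-dl style formats list.
// Assumes entries look like the example above (vcodec, acodec, height, tbr).
function pickFormats(formats, maxHeight) {
    // Higher total bitrate first; entries without tbr sort last.
    const byBitrate = (a, b) => (b.tbr ?? 0) - (a.tbr ?? 0);

    const video = formats
        .filter((f) => f.vcodec !== "none" && f.acodec === "none" && f.height <= maxHeight)
        .sort((a, b) => b.height - a.height || byBitrate(a, b))[0];

    const audio = formats
        .filter((f) => f.vcodec === "none" && f.acodec !== "none")
        .sort(byBitrate)[0];

    // null means nothing in the list matched.
    return { video: video ?? null, audio: audio ?? null };
}
```

Calling pickFormats(metadata.formats, 360) would then give one video entry and one audio entry whose url fields can be downloaded separately.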

Downloading the Video

After we have the video metadata, we need to have the user download the video. There are a few ways to do this:

  • (Option 1) Send the user the direct links to the video/audio files from YouTube and have them download the files.
  • (Option 2) Download the video/audio files myself and upload them to a cloud storage provider, then serve those files to the user.

Option 1 sounds like less work, and while it might be good for a while, it has a lot of limitations. YouTube could block or rate-limit downloads from their server that originate from another domain (hint: not YouTube.com). YouTube could also change something entirely on their backend to prevent users from downloading directly while on my website.

So to combat that and have more control over the process, I opted for Option 2. Option 2 has a catch, though: storing and serving video files through a cloud storage provider isn't free. However, the files aren't streamed from cloud storage every time a user wants to use a video; each video is downloaded once and then stored locally, so the user can access it later without needing to download it again.

This means that we'll only need to store and serve the files for a limited amount of time. Using different lifecycle rules, I can automatically configure the cloud storage to optimize for high/low usage for each file, and then delete the file if it hasn't been downloaded for a few days. If another user needs the same file again later, it will just have to be downloaded again from YouTube and uploaded back into cloud storage.

On a similar note, the server will also store recent video requests in memory for a little while. This will ensure that subsequent requests for the same video will be super fast (waiting for metadata from YouTube takes ~5 seconds or so).
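A small in-memory cache with an expiry time is one way to picture this. The class below is only a sketch: the name RecentRequests, the time-to-live value, and the injectable clock are all assumptions for illustration, not the game's actual code.

```javascript
// Keep recent video lookups in memory for a limited time.
// `now` is injectable so the expiry logic can be exercised without waiting.
class RecentRequests {
    constructor(ttlMs, now = () => Date.now()) {
        this.ttlMs = ttlMs;
        this.now = now;
        this.entries = new Map(); // videoId -> { value, expiresAt }
    }

    set(videoId, value) {
        this.entries.set(videoId, { value, expiresAt: this.now() + this.ttlMs });
    }

    get(videoId) {
        const entry = this.entries.get(videoId);
        if (!entry) return undefined;
        if (this.now() >= entry.expiresAt) {
            // Expired: forget it so the next request goes back to YouTube.
            this.entries.delete(videoId);
            return undefined;
        }
        return entry.value;
    }
}
```

With something like this in place, the server only waits on YouTube when get(videoId) comes back undefined.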

And one more note: the reason I don't simply serve the files from the same server that downloads them is that I don't want the heavy file traffic hitting the same server that handles the API requests. Files should be served from one place and API requests handled in another.

Downloading Without a Download Prompt

When you download files from the internet, most of the time there is a popup asking if you'd like to download the file, and if so, where you would like to save it. This paradigm isn't conducive to a smooth user experience for my game, so I am using a different download method.

Using XMLHttpRequests, I can download any file from the internet without needing to prompt the user. Once it's downloaded, I can store it in the user's IndexedDB, which is a local storage solution on a per-website basis, intended for storing large amounts of structured data. That's perfect for storing video and audio files. As per usual, I wanted a wrapper for IndexedDB to keep things extra simple, so I opted to use Dexie.js.

The video and audio files are downloaded as blobs, which as the name might suggest, are just big blobs of raw data for any type of arbitrary file. Blobs are great for storing video and audio files.

After storing the data in the IndexedDB, retrieval and usage is pretty easy. Create a URL that links to the blob:

const url = URL.createObjectURL(blob);

and then use that URL in the video or audio element:

<video src={url} />

And that's it! Now we've got locally downloaded media files that the user can play anytime without any buffering, lag, or ads!

I also wanted to download and store the thumbnails for videos as well, and this used a similar process, except with one important change.

The XMLHttpRequest has a property called responseType, which indicates the type of data that we intend to download. I set this to blob for all of the media types, but for thumbnails (which are JPEGs), it didn't work. I created an <img> element and inserted the downloaded thumbnail in, and it didn't show up.

The trick is to call the overrideMimeType method on the XMLHttpRequest before sending it. This lets us state explicitly what kind of data we're dealing with, instead of relying on what the server tells us. In my case, since I am dealing with JPEG images, I used this line to set the MIME type accordingly:

xhr.overrideMimeType("image/jpeg");

and voilà, the thumbnails magically worked! Overriding the MIME type doesn't seem to be necessary for video/audio files, but this is good to keep in mind in case those don't work in the future. There are many types of video and audio formats to keep track of.
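Putting the pieces of this section together, a prompt-free download could be sketched like this. The helper name downloadBlob and its options object are hypothetical. The XHR option exists only so the function can be tried outside a browser with a stand-in class, and it defaults to the browser's own XMLHttpRequest.

```javascript
// Download a file as a blob without showing a save prompt.
// `mimeType` is optional; pass it (e.g. "image/jpeg") to override what the server reports.
function downloadBlob(url, { mimeType, XHR = globalThis.XMLHttpRequest } = {}) {
    return new Promise((resolve, reject) => {
        const xhr = new XHR();
        xhr.open("GET", url);
        xhr.responseType = "blob";
        if (mimeType) {
            // Must be called before send().
            xhr.overrideMimeType(mimeType);
        }
        xhr.onload = () => {
            if (xhr.status >= 200 && xhr.status < 300) {
                resolve(xhr.response);
            } else {
                reject(new Error(`Download failed with status ${xhr.status}`));
            }
        };
        xhr.onerror = () => reject(new Error("Network error while downloading"));
        xhr.send();
    });
}
```

The resolved blob can then be stored in IndexedDB and later handed to URL.createObjectURL as described above.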

Structuring the API Server

I've never made an API server before, but it sounds pretty fun! I can get any kind of data I want, just by visiting a URL in my browser. In our case, I want to have an API server to get information about a video (and later, dance charts + more). This information will include its current status, progress (if it is currently being downloaded), and download links (if it is ready to be downloaded).

It's actually pretty easy to do with express. You can set up your app and begin specifying what to return to users when you receive a GET request:


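const express = require('express');

const app = express();
app.use(express.json());

// ...

// Return the current status (and, once ready, the download links) for a video.
app.get('/api/video/:id', apiLimiter, isAuthenticated, (req, res) => {
    mediaManager.getMedia(req.params.id)
        .then((media_info) => {
            // res.send() also ends the response, so a separate res.end() isn't needed.
            res.send(media_info);
        })
        .catch(() => {
            // Without this, a failed lookup would leave the request hanging.
            res.status(500).send({ error: 'Could not retrieve media info' });
        });
});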

And that's all there is to it! The function inside is the one I created in the above section where the metadata for the video is queried and then the video is downloaded and uploaded. During those steps, this returns JSON with an appropriate status. Once it's ready for download, the JSON is updated with download links for the media and an appropriate status. Users can continuously send GET requests to the API to check on the status of a video. Pretty cool, right?
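To make the status idea a bit more concrete, here's a sketch of how the JSON for one video might be built from an internal job record. Every field name and status string below is hypothetical; the real response from my API may look different.

```javascript
// Build the JSON body for GET /api/video/:id from an internal job record.
// All field names and status strings here are made up for illustration.
function describeMedia(job) {
    if (!job) {
        return { status: "not_found" };
    }
    if (job.error) {
        return { status: "error", message: job.error };
    }
    if (job.links) {
        // Uploaded to cloud storage: the client can start downloading.
        return { status: "ready", links: job.links };
    }
    // Still working: report which step we're on and how far along it is.
    return { status: job.stage, progress: job.progress ?? 0 };
}
```

A client polling the endpoint would keep asking until the status is "ready" or "error".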

I'll add more API endpoints later so that specific dance charts can be queried or created.

Adding Passwordless Authentication with JSON Web Tokens

Having an exposed, unauthenticated API server on the internet is a little spooky. Someone could spam requests or flood it with garbage so that it crashes or becomes slow. I've added some rate limiting, which limits the number of requests that a user can send to the server in a given period, but there's still more that we can do!
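For a feel of what rate limiting does under the hood, here's a tiny fixed-window limiter. It's a simplified sketch and not necessarily how the apiLimiter middleware in my server works; the function name, options, and injectable clock are assumptions.

```javascript
// Allow at most `max` requests per `windowMs` for each key (e.g. an IP address).
// `now` is injectable so the window logic can be checked without real delays.
function createRateLimiter({ windowMs, max, now = () => Date.now() }) {
    const hits = new Map(); // key -> { count, windowStart }

    return function isAllowed(key) {
        const time = now();
        const entry = hits.get(key);
        if (!entry || time - entry.windowStart >= windowMs) {
            // First request from this key, or the previous window has ended.
            hits.set(key, { count: 1, windowStart: time });
            return true;
        }
        entry.count += 1;
        return entry.count <= max;
    };
}
```

In an Express middleware, a false result would typically turn into a 429 (Too Many Requests) response.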

Eventually, every person who plays my game will have their own profile so that they can track all their high scores. This will require some sort of authentication system. I could use an existing provider, such as Google, to do this, but I wanted to learn a new way to do this.

Enter JSON Web Tokens!

You can read more about them in the link above, but they're basically little pieces of data that tell the server about who's accessing the page. In my case, I only need one piece of information about them: their email.

No password required! Users can visit the site and get a "magic link" emailed to them. Embedded in this magic link is a JSON Web Token that my server has generated using a secret key, so the link looks something like this:

https://mysite.com/login?token=98132nbglda9832y9rg2n3jk4d

When a user clicks on that link, they're taken to my website, where the token is stored as a cookie in their browser. Now every time they visit the site, I'll read the cookie to figure out who they are. No passwords required! It's a pretty neat way to do logins. If someone clears their cookies or wants to log in on another device, they can just enter their email again and get a new magic link.
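To show what's inside such a token, here's a hand-rolled sketch of signing and verifying an HS256 JSON Web Token with Node's built-in crypto module. In practice a library such as jsonwebtoken would normally handle this; the function names, the exp-based expiry, and the idea of returning just the email are assumptions for illustration, not my server's actual code.

```javascript
const crypto = require("node:crypto");

// Encode a JSON-serializable value as base64url, the encoding JWT segments use.
const encode = (value) => Buffer.from(JSON.stringify(value)).toString("base64url");

// HMAC-SHA256 over "header.payload", keyed with the server's secret.
const sign = (data, secret) =>
    crypto.createHmac("sha256", secret).update(data).digest("base64url");

// Create a token carrying the user's email and an expiry (seconds since epoch).
function createLoginToken(email, secret, ttlSeconds, nowMs = Date.now()) {
    const header = encode({ alg: "HS256", typ: "JWT" });
    const payload = encode({ email, exp: Math.floor(nowMs / 1000) + ttlSeconds });
    return `${header}.${payload}.${sign(`${header}.${payload}`, secret)}`;
}

// Return the email if the signature is valid and the token hasn't expired, else null.
function verifyLoginToken(token, secret, nowMs = Date.now()) {
    const parts = String(token).split(".");
    if (parts.length !== 3) return null;
    const [header, payload, signature] = parts;

    const expected = Buffer.from(sign(`${header}.${payload}`, secret));
    const actual = Buffer.from(signature);
    // timingSafeEqual throws on a length mismatch, so compare lengths first.
    if (actual.length !== expected.length || !crypto.timingSafeEqual(actual, expected)) {
        return null;
    }

    let claims;
    try {
        claims = JSON.parse(Buffer.from(payload, "base64url").toString("utf8"));
    } catch {
        return null;
    }
    if (!claims || typeof claims.exp !== "number") return null;
    if (claims.exp <= Math.floor(nowMs / 1000)) return null;
    return claims.email ?? null;
}
```

The server would put the result of createLoginToken into the emailed link, and call verifyLoginToken on the cookie value for each request.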

The Result

There's been a lot of talk but not a lot of show so far, but here's what it looks like in action:

The button that I click to start the download is just a test button - in the future, the downloads will start when you need to download a song to play or create a dance chart.

The design for the Downloads page of my game is pretty basic, and I'll be diving deeper into the struggles with creating a design that looks semi-decent in the future.

Top comments (2)

Palmer Oliveira

Thanks for sharing! I will use this strategy in one of my projects...thanks again

benank

Happy to hear that you got some use out of this! Thanks for reading.