DEV Community

Amen Ayach


YouTube playlist `console` scraping

The motivation 🔥

Sometimes you need seed data before you can build anything. In my case, I needed course data in the form of a batch of videos, so my first thought was to use YouTube playlists, since I only needed a couple.

Why use the browser console? 😎

Of course, I could have used Selenium to grab info from a large number of playlists, or dug into the YouTube API, but my need was much simpler. So I decided to write a small JavaScript snippet in the browser console on the playlist page, based on the "current" HTML tree, and it turned out to be simple.

The recipe 😋

  • Each video in the list is a web component called ytd-playlist-panel-video-renderer.
  • The video title has the id video-title.
  • The video length sits inside the ytd-thumbnail-overlay-time-status-renderer element, under the span.ytd-thumbnail-overlay-time-status-renderer selector.
  • And finally, the relative URL comes from the href attribute of the element with the id thumbnail, inside the main web component.

Then, with a couple of grains of salt, the getSeconds and getEmbedUrl helper functions, our finished dish is ready.

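// Convert a duration such as "12:34" or "1:02:03" into seconds.
let getSeconds = secondsText =>
    secondsText.trim().split(':')
        .reduce((total, part) => total * 60 + parseInt(part, 10), 0);

// Build the embed URL from a relative link such as "/watch?v=ID&list=...".
let getEmbedUrl = url => {
    let vIndex = url.indexOf('?v=');
    let lIndex = url.indexOf('&', vIndex);
    // Without a following parameter, the ID runs to the end of the string.
    return 'https://www.youtube.com/embed/' +
        url.substring(vIndex + 3, lIndex === -1 ? url.length : lIndex);
};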

// Collect one {title, seconds, url} record per playlist entry and print it as JSON.
JSON.stringify([...document.querySelectorAll('ytd-playlist-panel-video-renderer')]
.map(x => {
    let title = x.querySelector('#video-title').innerText;
    // The duration label sits in a span inside the thumbnail overlay.
    let secondsText = 
        x.querySelector('ytd-thumbnail-overlay-time-status-renderer')
         .querySelector('span.ytd-thumbnail-overlay-time-status-renderer').innerText;
    let seconds = getSeconds(secondsText);
    // The thumbnail link holds the relative watch URL.
    let url = getEmbedUrl(x.querySelector('#thumbnail')
              .getAttribute('href'));
    return {title, seconds, url};
}), null, 2);

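If the string slicing in getEmbedUrl feels fragile, one possible alternative is to let the standard URL API read the v query parameter. This is only a sketch: the name toEmbedUrl is hypothetical and not part of the original script, and it assumes the href is a relative link resolved against https://www.youtube.com.

```javascript
// Hypothetical alternative to getEmbedUrl (the name is illustrative only).
// Assumes href is a relative watch link such as "/watch?v=ID&list=...".
const toEmbedUrl = href => {
    // Resolve the relative link against the site origin so URL can parse it.
    const videoId = new URL(href, 'https://www.youtube.com').searchParams.get('v');
    // Return null when the link carries no "v" parameter.
    return videoId ? 'https://www.youtube.com/embed/' + videoId : null;
};

console.log(toEmbedUrl('/watch?v=abc123&list=PLexample&index=2'));
// → https://www.youtube.com/embed/abc123
```

Because the parameter is looked up by name, the order of the query parameters no longer matters, and a link without a list parameter still works.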

Last note 👋

Usually, I use the console's built-in copy function to copy the resulting data to the clipboard, but in this case I used JSON.stringify to print the result instead, as the YouTube site seems to disable the copy function!
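When the printed output is too long to select by hand, another option that might work is saving the data as a file. The sketch below is an assumption on my side, not something I used for this post: downloadJson and the default file name playlist.json are hypothetical, and the page's own policies could still block the download.

```javascript
// Hypothetical helper (not part of the original script): offer data as a JSON file.
const downloadJson = (data, filename = 'playlist.json') => {
    const json = JSON.stringify(data, null, 2);
    // Wrap the text in a Blob and expose it through a temporary object URL.
    const blob = new Blob([json], { type: 'application/json' });
    const href = URL.createObjectURL(blob);
    // A detached <a download> element is enough to trigger the save.
    const link = document.createElement('a');
    link.href = href;
    link.download = filename;
    link.click();
    // Release the temporary URL once the click has been dispatched.
    URL.revokeObjectURL(href);
    return json;
};
```

In the console, you would pass it the array of video objects instead of wrapping that array in JSON.stringify.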
