Abhiprojectz

Undetectable YouTube automation - The Challenge | Part -2

Welcome to the second part of this series on the tool. If you haven't gone through the first part, please read it first.

In this part we will be discussing:

  • The issues that come our way when automating such giant platforms.
  • Channel analytics after fully automating via the tool.
  • Content generation through the tool.
  • Alternative ways to do it and the issues that come with them.

To start automating content generation on YouTube, you'll need to have a basic understanding of Python programming. Python is a versatile programming language that can be used to automate a wide range of tasks, including content generation.

Issues automating such giant platforms

[Image: diagram]

As the diagram above shows, getting detected and eventually blocked is a major issue when automating multiple social media accounts.

YouTube is a dynamic platform, not a static one, which makes it difficult to automate. Apart from the APIs, browser automation is the only way to do so.

But it comes with a lot of trouble. For a glimpse, see the image below:

[Image: upload errors]

The errors above usually appear after a while, when YouTube detects that something is wrong and simply blocks the uploads. This means the channel will be flagged as spam or terminated.

These issues have been specially taken care of!

Channel Analytics

Viewership depends on many factors and is not our focus, so results may vary a lot. The tool's main concern is the automation part.

The analytics are shown mainly to illustrate how the tool performed, its extremes, where it fell short, and which issues it faced.

The tests were run over a span of months under a variety of conditions, including headless servers, which are a major challenge because YouTube is very quick to catch them. The environments were:

  • VPS hosting servers
  • Google Colab
  • Heroku
  • CircleCI
  • A local computer

Before moving ahead, let's look at the analytics of one of my personal channels that was automated a few months ago.

The latest channel views:

[Image: latest channel views]

Note the titles of the Shorts and videos for now; we will come back to them later.

[Image: Screenshot_2023-03-08-09-14-59-30]


Content generation through the tool.

Now let's see how content generation can be automated, and how we will push the limits of creativity as we proceed through this series.

As always, let's start with the basics.

[Image]

The base of the content generation:

  • Scraping interesting content

Don't worry, we will not be using any third-party service for scraping. Instead, we will build a better and smarter solution.

Our aim is to build a dataset of both textual content, such as quotes, facts, and educational phrases, and free stock vectors that meet our conditions.

There are thousands of such platforms on the web.

But, as usual, I couldn't find an existing way to scrape just the target links, or only the main textual content, from different websites.

Some services don't support dynamic websites, others take a lot of time to scrape or to build rules for, and of course some get blocked as well.

So, frankly speaking, such scraping solutions are generally useless for developers like us.

For example, look at this dataset of vectors created from Pexels within a few minutes (about 3 minutes).

[Image: dataset of vectors from Pexels]

It contains the IDs of the vectors that we want, and they are royalty-free.


Now look at this as well:

[Image]

It is not limited to links. Here is a dataset of Hindi quotes and facts collected from dozens of websites.

[Image: dataset of Hindi quotes and facts]

This wouldn't be possible without a custom scraping solution that is simple yet powerful.

We will be using ScrapeGen, a custom rule-based generator that uses certain profiles to scrape data from several websites.

The concept

Our concept is simple: build a tool that scrapes the target data and follows the mighty Django's DRY principle, that is, don't repeat yourself.

[Image]

Generally, for scraping content we require one of:

  • CSS selectors
  • JS selectors
  • XPath

Finding any one of these for the target data is very easy and takes less than 10 seconds.

So, our idea is to first collect the desired selectors through the browser's dev tools and put them somewhere as a sort of seed.

In ScrapeGen, this is called a rules template.

```yaml
script_name: kk.py
main:
  url: xx
  m_function: main

selector_rules:
  - name:
    type: single
    selector: ''
    extract: text

  - name:
    type: single
    selector: ''
    extract: text

props_rules:
  - name:
    parent: ''
    parent_two: ""
    selectors:
      - xx

proxies:
```


  • The selector_rules are the ones that need to be fetched once, like a single list item.

  • The props_rules are collections of different selector_rules, or a list of selector_rules itself, like a list of lists of items.

  • The parent, parent-1, parent-n, etc. (parent and parent_two in the template) are BS4 HTML elements, like a parent div container.

The extract key defines what to extract: text, a list of texts, links, or even any HTML attribute.
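To make the rule fields concrete, here is a minimal sketch of how a list of selector rules could be applied to a page. This is not ScrapeGen's actual code. It swaps BS4 for the standard library's `xml.etree.ElementTree` and its limited XPath support so that it runs without dependencies. The sample markup, the rule values, the `list` type, and the function names `extract_value` and `apply_rules` are all assumptions made for illustration (the template only shows `single`).

```python
import xml.etree.ElementTree as ET

# Made-up, well-formed sample page (ElementTree needs valid XML).
HTML = """<html><body>
<h1 class="title">Daily quotes</h1>
<ul>
  <li class="quote"><a href="/q/1">Stay hungry.</a></li>
  <li class="quote"><a href="/q/2">Keep going.</a></li>
</ul>
</body></html>"""

# Hypothetical rules in the spirit of selector_rules; "list" is an assumed type.
RULES = [
    {"name": "title", "type": "single",
     "selector": ".//h1[@class='title']", "extract": "text"},
    {"name": "quotes", "type": "list",
     "selector": ".//li[@class='quote']/a", "extract": "text"},
    {"name": "links", "type": "list",
     "selector": ".//li[@class='quote']/a", "extract": "href"},
]


def extract_value(element, extract):
    """Return the element's text, or the named attribute for any other value."""
    if extract == "text":
        return "".join(element.itertext()).strip()
    return element.get(extract)


def apply_rules(root, rules):
    """Run every rule against the parsed page and collect results by name."""
    result = {}
    for rule in rules:
        matches = root.findall(rule["selector"])
        values = [extract_value(match, rule["extract"]) for match in matches]
        if rule["type"] == "single":
            result[rule["name"]] = values[0] if values else None
        else:
            result[rule["name"]] = values
    return result


root = ET.fromstring(HTML)
data = apply_rules(root, RULES)
print(data)
```

The point of the design is that adding a new website only means writing new rule entries; the extraction loop itself never changes, which is the DRY idea the tool is built around.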

So, we will build rules for different websites. Have a look:

[Image: rules template for a website]


Running the command on these rules builds the main scraping program, which can be saved for later use.

Have a look here as well:

[Image]
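The idea of building a scraping program from rules can be sketched as plain source-code generation. The following is only an illustration under assumptions, not ScrapeGen's real generator: `FUNCTION_TEMPLATE`, `build_script`, and the example rules are hypothetical names and values. The generated code calls BS4's `select_one` and `get_text`, but the sketch itself only produces the source text and checks that it parses, so it runs with the standard library alone.

```python
import ast

# Source template for one generated extractor function.
FUNCTION_TEMPLATE = '''def get_{name}(soup):
    """Generated from the rule named {name!r}."""
    element = soup.select_one({selector!r})
    return element.get_text(strip=True) if element else None
'''


def build_script(rules, m_function="main"):
    """Return Python source with one extractor per rule plus an entry point."""
    parts = ["from bs4 import BeautifulSoup", ""]
    for rule in rules:
        parts.append(FUNCTION_TEMPLATE.format(name=rule["name"],
                                              selector=rule["selector"]))
    calls = ", ".join(f'"{r["name"]}": get_{r["name"]}(soup)' for r in rules)
    parts.append(f"def {m_function}(html):")
    parts.append('    soup = BeautifulSoup(html, "html.parser")')
    parts.append(f"    return {{{calls}}}")
    return "\n".join(parts) + "\n"


# Hypothetical rules; rule names must be valid Python identifiers.
rules = [
    {"name": "title", "selector": "h1.title"},
    {"name": "price", "selector": "span.price"},
]
source = build_script(rules)
ast.parse(source)  # raises SyntaxError if the generated code is invalid
print(source)
```

Because the output is an ordinary Python file, it can be written to disk under the `script_name` from the rules and reused later without regenerating it.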

This could be further extended to different industries, such as:

  • E-commerce
  • Medicine
  • Stocks & crypto
  • Sports
  • News
  • Kids
  • etc.

The generated scrapers can still be used months later as well.

I have built different datasets for various fields. Have a look at this one from the Mixkit stock platform:

  • Labels such as drone, nature, science, universe, etc.

[Image: Mixkit dataset]

After each epoch, the tool creates multiple segments in case something goes wrong.
Have a look:

[Image: segment files]

The content of each segment looks like:

```python
[
    '/free-stock-video/a-scientist-studying-3-d-printed-skull-24126/',
    '/free-stock-video/scientist-with-virtual-reality-headsets-talking-24090/',
    '/free-stock-video/a-man-wearing-a-biohazard-suit-and-a-mask-23902/',
    '/free-stock-video/people-in-biohazard-suits-walking-in-the-forest-23901/',
    '/free-stock-video/scientists-with-facemasks-experimenting-in-the-lab-23899/',
    '/free-stock-video/a-mad-scientist-watching-chemical-reaction-23897/',
    '/free-stock-video/scientist-working-with-flasks-in-the-lab-23617/',
    '/free-stock-video/scientist-discussing-hurricane-and-weather-23616/',
    '/free-stock-video/a-couple-of-friends-stargazing-using-the-telescope-23615/',
    '/free-stock-video/teacher-and-kids-exploring-cardboard-prosthesis-23542/',
    '/free-stock-video/scientist-checking-laboratory-equipment-23530/',
    '/free-stock-video/scientist-working-with-vr-glasses-and-3d-printer-23305/',
    '/free-stock-video/technology-classroom-with-teacher-and-students-23298/',
    '/free-stock-video/scientists-wearing-a-face-mask-talking-in-the-lab-23163/',
    '/free-stock-video/kids-exploring-robotic-cardboard-mechanism-23122/',
    '/free-stock-video/couple-of-scientists-on-a-factory-22992/',
    '/free-stock-video/scientist-looking-through-a-microscope-21452/',
    '/free-stock-video/reviewing-a-test-sample-with-a-microscope-21093/',
    '/free-stock-video/scientist-writing-in-the-lab-17699/',
    '/free-stock-video/scientist-mixing-components-on-flasks-17683/',
    '/free-stock-video/chemistry-flask-on-table-17600/',
    '/free-stock-video/a-scientist-experimenting-with-tubes-on-the-lab-17541/',
    '/free-stock-video/laboratory-flasks-with-gray-liquid-17514/',
    '/free-stock-video/doctor-in-the-lab-looking-at-the-computer-screen-17504/',
]
```
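One way to read the segment idea is that each epoch's links are written to their own file, so a crash only loses the epoch in progress and the rest can be merged afterwards. The sketch below illustrates that reading only; the file naming, the JSON format, and the helpers `save_segment` and `load_segments` are assumptions, not the tool's actual behaviour.

```python
import json
import tempfile
from pathlib import Path


def save_segment(out_dir, epoch, links):
    """Write one epoch's links to its own JSON file and return the path."""
    path = Path(out_dir) / f"segment_{epoch:04d}.json"
    path.write_text(json.dumps(links), encoding="utf-8")
    return path


def load_segments(out_dir):
    """Merge all saved segments in epoch order, dropping duplicate links."""
    merged = []
    seen = set()
    for path in sorted(Path(out_dir).glob("segment_*.json")):
        for link in json.loads(path.read_text(encoding="utf-8")):
            if link not in seen:
                seen.add(link)
                merged.append(link)
    return merged


with tempfile.TemporaryDirectory() as tmp:
    save_segment(tmp, 0, ["/free-stock-video/chemistry-flask-on-table-17600/"])
    save_segment(tmp, 1, [
        "/free-stock-video/scientist-writing-in-the-lab-17699/",
        "/free-stock-video/chemistry-flask-on-table-17600/",  # repeated link
    ])
    links = load_segments(tmp)

print(links)
```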

Now have a look at one of the generated Python functions, which fetches Amazon search results using the rules.

[Image: generated Python function for Amazon search results]

It may be hard to believe, but all those Python functions were written automatically within 30 seconds!

Creating such a solution as a traditional Scrapy spider (from Scrapinghub) would take you a couple of hours, with lots of headaches and errors as a bonus.

This scraping tool is inspired by mlscraper, but unlike it, this tool actually achieves what that project's author set out to do.

If you are interested, leave a comment and I'll release it as well.


The Aim & Steps

You might be wondering what type of content we will be generating.

The tool supports various areas where content can be automated, where viewers actually like the content, and, last but not least, where the result makes some sense!

Genres and areas:

  • Motivational Quotes
  • Motivational speeches
  • Educational Facts
  • ASMR Videos
  • ASMR stress relief content
  • Sports news
  • General News Headlines
  • Celebrity news and facts
  • Comparison Videos

These genres can be fully automated, and the best part is that there is a massive audience eager to watch the content.

Comparison videos, based on my research over the last month (as of writing this), have massive viewership and engagement of over 100M.

Let us have a look at what I am talking about:

[Image: comparison video]

The above is just one video that was fully automated, from data source to final clip, using the tool itself and posted on a YouTube channel without any human intervention.

Meanwhile, many others charge hundreds of dollars for the same thing.

For example, see this service: https://ytbasics.com/comparison-video-maker/. It basically works only with Ranker and a few similar websites, and charges $99 for the tool.

No doubt such content can be automated. The challenge is automating it along with the channel itself.


This genre is not open-sourced and is not included in this tool. If you are interested, leave a comment or raise an issue.


Steps in content generation

The image below sums up the steps used in the generation.

[Image: steps in content generation]

Let's go through them one by one:

  1. Picks a random motivational quote from the massive quotes database collected from across the web. This is not a traditional, boring quotes database, but one the audience will actually want to listen to.
  2. Generates audio for the text using text-to-speech.
  3. Downloads related infographics from the web.
  4. Preprocesses the images and infographics. This resizes them and adjusts their aspect ratio, since not all images are the same size or have the right aspect ratio. It also blends the images with the background and scales the content up or down wherever required.
  5. Combines the audio with the infographics.
  6. Editing: adds appropriate visual effects for images/videos.
  7. Uploads to the target channel with the respective metadata.
  8. Sends a notification after a successful launch.
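The preprocessing step comes down to a small piece of arithmetic: scale each image by the smaller of the two width and height ratios so it fits the frame without distortion, then centre it on the background. Here is a minimal sketch of that calculation. The function name `fit_within` and the 1080x1920 vertical canvas are assumptions for illustration; with Pillow, the resulting size and offset would typically be passed to `Image.resize` and `Image.paste`.

```python
def fit_within(image_size, canvas_size):
    """Scale image_size to fit inside canvas_size, keeping the aspect ratio.

    Returns ((new_width, new_height), (offset_x, offset_y)), where the
    offset centres the scaled image on the canvas.
    """
    img_w, img_h = image_size
    can_w, can_h = canvas_size
    # The smaller ratio is the largest scale that still fits both dimensions.
    scale = min(can_w / img_w, can_h / img_h)
    new_w = max(1, round(img_w * scale))
    new_h = max(1, round(img_h * scale))
    offset = ((can_w - new_w) // 2, (can_h - new_h) // 2)
    return (new_w, new_h), offset


CANVAS = (1080, 1920)  # assumed vertical frame size, not taken from the tool
size, offset = fit_within((800, 600), CANVAS)
print(size, offset)
```

The same function handles both enlarging small images and shrinking large ones, since the scale factor can be above or below 1.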

The whole process runs within the desired workflow for each channel, under its respective pipeline.

Tools and libraries

Below are the Python libraries for each purpose.

For content generation

  • Scraping and text extraction

NLTK, BS4, Selenium

  • Text to speech purposes

googletrans and gTTS

Don't worry, in a later post we will discuss alternative custom solutions for human-like speech generation.

  • Visual preprocessing

Pillow

  • Video Creation

MoviePy, FFmpeg, textwrap3
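As a small illustration of where text wrapping fits into video creation, the sketch below breaks a quote into short lines that would fit on a frame. It uses the standard library's `textwrap` module, which textwrap3 is a back-port of. The helper `wrap_quote`, the sample quote, and the width and line limits are assumptions, not values from the tool.

```python
import textwrap

QUOTE = "Small steps taken every day add up to big results over time."


def wrap_quote(quote, max_chars=28, max_lines=6):
    """Break a quote into lines of at most max_chars characters."""
    lines = textwrap.wrap(quote, width=max_chars)
    if len(lines) > max_lines:
        raise ValueError("quote is too long for a single frame")
    return "\n".join(lines)


caption = wrap_quote(QUOTE)
print(caption)
```

Rejecting quotes that need too many lines is one simple way to keep captions readable; the limit would depend on the font size and frame used.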

For Channel automation

Mentioned in the introductory post of the tool.

Conclusion

This is not the end; there is a lot more to explore in YTGenX.

The main challenge, the automation part, still lies ahead.

So, don't forget to share it with your friends and stay tuned for the next part, where we will discuss the challenges in automating the platforms and how things work under the hood.

Regards
