Evan Lin

Originally published at evanlin.com

Avoiding YouTube Blocking on GCP (Using a Proxy)

title: [Google Cloud] Avoiding YouTube Blocking Network Traffic in the Same Region on GCP (Solved via Proxy)
published: false
date: 2025-03-27 00:00:00 UTC
tags: 
canonical_url: https://www.evanlin.com/youtube-transcript-proxy/
---

![LINE 2025-03-28 15.31.08](https://www.evanlin.com/images/2022/LINE%202025-03-28%2015.31.08.png)

# Preface

An earlier article, "[[Google Cloud] How to Obtain YouTube Information via LangChain on GCP Cloud Run](https://dev.to/evanlin/google-cloudfirebase-ru-he-zai-gcp-cloud-run-shang-mian-tou-guo-langchain-qu-de-youtube-de-xiang-guan-zi-xun-1b45-temp-slug-7524546)", covered using Secret Manager and LangChain's YouTube packages on GCP to fetch YouTube data. Recently, however, YouTube changed how it handles these requests, and the original method stopped working. This post records the main error message and how I solved it.

## Main Problem

One day, fetching YouTube subtitles started to fail, and the following appeared in the logs:


    During handling of the above exception, another exception occurred:
    youtube_transcript_api._errors.RequestBlocked:
    Could not retrieve a transcript for the video https://www.youtube.com/watch?v=ViA4-YWx8Y4!
    This is most likely caused by:

    YouTube is blocking requests from your IP. This usually is due to one of the following reasons:

      • You have done too many requests and your IP has been blocked by YouTube
      • You are doing requests from an IP belonging to a cloud provider (like AWS, Google Cloud Platform, Azure, etc.). Unfortunately, most IPs from cloud providers are blocked by YouTube.

    There are two things you can do to work around this:

      1. Use proxies to hide your IP address, as explained in the "Working around IP bans" section of the README (https://github.com/jdepoix/youtube-transcript-api?tab=readme-ov-file#working-around-ip-bans-requestblocked-or-ipblocked-exception).
      2. (NOT RECOMMENDED) If you authenticate your requests using cookies, you will be able to continue doing requests for a while. However, YouTube will eventually permanently ban the account that you have used to authenticate with! So only do this if you don't mind your account being banned!

    If you are sure that the described cause is not responsible for this error and that a transcript should be retrievable, please create an issue at https://github.com/jdepoix/youtube-transcript-api/issues. Please add which version of youtub

In short, the problem comes down to the following:

- In LangChain, `GoogleApiYoutubeLoader` relies on [youtube-transcript-api](https://github.com/jdepoix/youtube-transcript-api/) to fetch transcripts.
- Because the service is deployed on GCP, its requests come from cloud provider IPs, and YouTube has started blocking those.
- The package suggests using a proxy service such as [Webshare](https://www.webshare.io/?referral_code=1yl49cgzfedr) to fetch the data.
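
The log says the error was raised "during handling of the above exception", so `RequestBlocked` may end up wrapped inside another exception by the time your own code sees it. Below is a minimal sketch of a helper that walks the exception chain and looks for the two class names from the log and the README link (`RequestBlocked`, `IpBlocked`). The helper name `is_ip_block_error` is hypothetical, and matching on class names (instead of importing the classes) is only an assumption made to keep the sketch dependency-free.

```python
# Class names taken from the log output and the README anchor; matching by
# name is an assumption of this sketch, not the library's recommended usage.
BLOCK_ERROR_NAMES = {"RequestBlocked", "IpBlocked"}


def is_ip_block_error(exc: BaseException) -> bool:
    """Return True if exc, or any exception chained to it, looks like an IP block."""
    seen = set()
    current = exc
    while current is not None and id(current) not in seen:
        seen.add(id(current))
        # Check the class and all of its base classes by name.
        if any(cls.__name__ in BLOCK_ERROR_NAMES for cls in type(current).__mro__):
            return True
        # Follow explicit (raise ... from ...) and implicit chaining.
        current = current.__cause__ or current.__context__
    return False
```

A check like this lets the service log a clear "blocked by YouTube, check the proxy" message instead of a generic failure.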

## Webshare

![image-20250328170602359](https://www.evanlin.com/images/2022/image-20250328170602359.png)

[Webshare](https://www.webshare.io/?referral_code=1yl49cgzfedr) is a third-party paid proxy service. Routing your web requests through it helps in cases such as:

- Accessing (crawling) Google services (Maps, YouTube, Google Search) from GCP
- Crawling sites behind CDNs that block cloud service provider (CSP) IPs more aggressively (e.g., Cloudflare)

It also offers a free tier:

- Five proxies
- 1 GB of traffic per month

For YouTube subtitles, that amount of traffic is more than enough.

It's also quite simple to use. Here's the code for fetching [YouTube Transcript](https://github.com/jdepoix/youtube-transcript-api/) through [Webshare](https://www.webshare.io/?referral_code=1yl49cgzfedr):


    from youtube_transcript_api import YouTubeTranscriptApi
    from youtube_transcript_api.proxies import WebshareProxyConfig
    import os


    def get_transcripts(video_id, languages):
        # Get proxy credentials from environment variables
        proxy_username = os.environ.get("PROXY_USERNAME")
        proxy_password = os.environ.get("PROXY_PASSWORD")

        # Route all transcript requests through the Webshare proxy
        ytt_api = YouTubeTranscriptApi(
            proxy_config=WebshareProxyConfig(
                proxy_username=proxy_username,
                proxy_password=proxy_password,
            )
        )
        transcript_list = ytt_api.fetch(video_id, languages=languages)
        # Join the text of every snippet into a single string
        transcript_texts = [snippet["text"] for snippet in transcript_list.to_raw_data()]
        return " ".join(transcript_texts)


    # Example usage (only runs when script is executed directly)
    if __name__ == "__main__":
        video_id = "YOUR_VIDEO_ID"
        languages = ["en", "de"]
        transcript_text = get_transcripts(video_id, languages)
        print(transcript_text)


It is a bit slower than a direct connection, but it does fetch the data successfully.
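
The code above reads `PROXY_USERNAME` and `PROXY_PASSWORD` with `os.environ.get`, which silently returns `None` when a variable is missing. As a minimal sketch (the helper name `load_proxy_credentials` is hypothetical and not part of the original code), you could fail early with a clear message before any request is made:

```python
import os
from typing import Mapping, Optional, Tuple


def load_proxy_credentials(env: Optional[Mapping[str, str]] = None) -> Tuple[str, str]:
    """Return (username, password), or raise if either variable is unset or empty."""
    # Default to the real process environment; a dict can be passed for testing.
    env = os.environ if env is None else env
    missing = [
        name for name in ("PROXY_USERNAME", "PROXY_PASSWORD") if not env.get(name)
    ]
    if missing:
        raise RuntimeError("Missing proxy settings: " + ", ".join(missing))
    return env["PROXY_USERNAME"], env["PROXY_PASSWORD"]
```

The returned pair can then be passed to `WebshareProxyConfig` in place of the two `os.environ.get` calls.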

## Pitfall to Avoid (Extra Fees)

Although [Webshare](https://www.webshare.io/?referral_code=1yl49cgzfedr) has a free tier and is easy to use, if you fetch too frequently, your requests may still get blocked even when going through the proxy, and the error message then tells you to buy a paid "Residential" plan. Be careful here.


    youtube_transcript_api._errors.RequestBlocked:

    Could not retrieve a transcript for the video https://www.youtube.com/watch?v=ViA4-YWx8Y4! This is most likely caused by:

    YouTube is blocking your requests, despite you using Webshare proxies. Please make sure that you have purchased "Residential" proxies and NOT "Proxy Server" or "Static Residential", as those won't work as reliably! The free tier also uses "Proxy Server" and will NOT work!

    The only reliable option is using "Residential" proxies (not "Static Residential"), as this allows you to rotate through a pool of over 30M IPs, which means you will always find an IP that hasn't been blocked by YouTube yet!

    You can support the development of this open source project by making your Webshare purchases through this affiliate link: https://www.webshare.io/?referral_code=1yl49cgzfedr
    Thank you for your support! <3
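
Since fetching too frequently is what triggers this, one way to reduce the number of proxied requests is to cache transcripts and space out the calls that do go through. The following is only a rough sketch under that assumption: `CachedTranscriptFetcher` and its parameters are hypothetical names, the one-second default interval is arbitrary, and the cache lives in memory only.

```python
import time
from typing import Callable, Dict, Optional, Sequence, Tuple


class CachedTranscriptFetcher:
    """Wrap a fetch function with an in-memory cache and a minimum call interval."""

    def __init__(
        self,
        fetch: Callable[[str, Sequence[str]], str],
        min_interval_seconds: float = 1.0,  # arbitrary default, tune as needed
        clock: Callable[[], float] = time.monotonic,
        sleep: Callable[[float], None] = time.sleep,
    ) -> None:
        self._fetch = fetch
        self._min_interval = min_interval_seconds
        self._clock = clock
        self._sleep = sleep
        self._cache: Dict[Tuple[str, Tuple[str, ...]], str] = {}
        self._last_call: Optional[float] = None

    def get(self, video_id: str, languages: Sequence[str]) -> str:
        key = (video_id, tuple(languages))
        # Serve repeated requests from memory without touching the proxy.
        if key in self._cache:
            return self._cache[key]
        # Wait until at least min_interval has passed since the last real fetch.
        if self._last_call is not None:
            wait = self._min_interval - (self._clock() - self._last_call)
            if wait > 0:
                self._sleep(wait)
        self._last_call = self._clock()
        text = self._fetch(video_id, languages)
        self._cache[key] = text
        return text
```

With the function from earlier in this post, usage would look like `fetcher = CachedTranscriptFetcher(get_transcripts)` followed by `fetcher.get(video_id, ["en", "de"])`. This does not make the free tier reliable; it only lowers how often requests are sent.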
