Olamide Olaniyan

Build a Viral Hook Generator using YouTube Transcripts & OpenAI

The first 15 seconds of a YouTube video dictate its success. If the "hook" fails, the viewer clicks away, retention plummets, and the algorithm buries the video.

What if you could analyze the exact words top creators use in their first 15 seconds, and use AI to generate similar hooks for your own niche?

In this tutorial, we'll build a Python script that:

  1. Takes the ID of a viral YouTube video.
  2. Fetches its transcript (subtitles).
  3. Isolates the first 15 seconds of spoken text (the Hook).
  4. Uses the OpenAI API to reverse-engineer the psychology of the hook and generate new ones for your topic.

The Tech Stack

  • Python 3
  • SociaVault API (to fetch YouTube transcripts without dealing with headless browsers or CAPTCHAs)
  • OpenAI API (to analyze and generate text)

Step 1: Setup

Install the required libraries:

pip install requests openai python-dotenv

Create a .env file:

SOCIAVAULT_API_KEY=your_sociavault_key
OPENAI_API_KEY=your_openai_key
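
The script reads these keys with `os.getenv`, which returns `None` when a variable is missing, so a typo in `.env` would only surface later as an authentication error. Here is a minimal sketch of a fail-fast check. `require_env` is a hypothetical helper written for this tutorial, not part of python-dotenv or either API.

```python
import os


def require_env(name, env=None):
    """Return the value of an environment variable, or raise if it is unset or blank.

    Hypothetical helper: `env` defaults to os.environ and can be swapped
    for a plain dict when testing.
    """
    env = os.environ if env is None else env
    value = (env.get(name) or "").strip()
    if not value:
        raise RuntimeError(f"Missing required environment variable: {name}")
    return value


# Demonstration with an in-memory mapping instead of the real environment
demo_env = {"SOCIAVAULT_API_KEY": "your_sociavault_key", "OPENAI_API_KEY": ""}
print(require_env("SOCIAVAULT_API_KEY", demo_env))
```

In the script, you could call it once `load_dotenv()` has run, for example `SOCIAVAULT_KEY = require_env("SOCIAVAULT_API_KEY")`.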

Step 2: Fetching the Transcript

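YouTube's official Data API is a poor fit for this job: its caption download endpoint requires OAuth authorization and, in general, edit access to the video, so it does not help with other creators' videos. We'll use SociaVault's YouTube Transcript endpoint instead, which returns the transcript as JSON: a list of text segments with timestamps.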

import os
import requests
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

SOCIAVAULT_KEY = os.getenv("SOCIAVAULT_API_KEY")
OPENAI_KEY = os.getenv("OPENAI_API_KEY")

client = OpenAI(api_key=OPENAI_KEY)

def get_video_transcript(video_id):
    print(f"Fetching transcript for video: {video_id}...")

    url = "https://api.sociavault.com/v1/youtube/video/transcript"
    headers = {"Authorization": f"Bearer {SOCIAVAULT_KEY}"}

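    # A timeout keeps the script from hanging if the API is unresponsive
    response = requests.get(url, headers=headers, params={"video_id": video_id}, timeout=30)

    if response.status_code == 200:
        # The transcript segments live under the "data" key
        return response.json().get("data", [])

    raise RuntimeError(
        f"Failed to fetch transcript ({response.status_code}): {response.text}"
    )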
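
`get_video_transcript` expects a bare video ID, but in practice you usually have a full URL copied from the browser. The sketch below shows one way to pull the ID out using only the standard library. `extract_video_id` is a hypothetical helper, and the URL shapes it handles (`watch?v=`, `youtu.be/`, `/shorts/`, `/embed/`) are the common ones rather than an exhaustive list.

```python
import re
from urllib.parse import urlparse, parse_qs


def extract_video_id(url_or_id):
    """Return the video ID from a YouTube URL, or the input itself if it is a bare ID."""
    text = url_or_id.strip()

    # Assumption: a token made only of URL-safe characters is already an ID
    if re.fullmatch(r"[A-Za-z0-9_-]+", text):
        return text

    parsed = urlparse(text)
    host = (parsed.hostname or "").lower()
    if host.startswith("www."):
        host = host[len("www."):]

    video_id = ""
    if host == "youtu.be":
        # Short links look like https://youtu.be/<id>
        video_id = parsed.path.lstrip("/").split("/")[0]
    elif host in ("youtube.com", "m.youtube.com"):
        if parsed.path == "/watch":
            # Standard links carry the ID in the "v" query parameter
            video_id = parse_qs(parsed.query).get("v", [""])[0]
        else:
            for prefix in ("/shorts/", "/embed/"):
                if parsed.path.startswith(prefix):
                    video_id = parsed.path[len(prefix):].split("/")[0]
                    break

    if not video_id:
        raise ValueError(f"Could not find a video ID in: {url_or_id!r}")
    return video_id


print(extract_video_id("https://www.youtube.com/watch?v=dQw4w9WgXcQ&t=10s"))
```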

Step 3: Isolating the Hook

The `data` field of the SociaVault response holds a list of segments like this:

[
  {"text": "In 2008, a mysterious programmer...", "start": 0.5, "duration": 4.2},
  {"text": "changed the financial world forever.", "start": 4.7, "duration": 3.1}
]
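
The extraction function in this step stops at the first segment past the cutoff, which only works if segments arrive in chronological order and each one has usable `text` and `start` values. The API may already guarantee both. If you would rather not rely on that, here is a defensive sketch that assumes the field names from the sample above. `normalize_segments` is a hypothetical helper.

```python
def _to_float(value, default=None):
    """Convert value to float, returning default when it is missing or malformed."""
    try:
        return float(value)
    except (TypeError, ValueError):
        return default


def normalize_segments(raw_segments):
    """Drop malformed segments, coerce timestamps to float, and sort by start time."""
    cleaned = []
    for seg in raw_segments:
        if not isinstance(seg, dict):
            continue
        text = seg.get("text")
        start = _to_float(seg.get("start"))
        if start is None or not isinstance(text, str) or not text.strip():
            continue  # skip segments with no timestamp or no spoken text
        cleaned.append({
            "text": text.strip(),
            "start": start,
            "duration": _to_float(seg.get("duration"), 0.0),
        })
    return sorted(cleaned, key=lambda s: s["start"])


# Out-of-order input with one string timestamp and two unusable entries
sample = [
    {"text": "changed the financial world forever.", "start": "4.7", "duration": 3.1},
    {"text": "In 2008, a mysterious programmer...", "start": 0.5, "duration": 4.2},
    {"text": "", "start": 9.0},
    {"start": 12.0},
]
for seg in normalize_segments(sample):
    print(seg["start"], seg["text"])
```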

We want to extract only the text spoken in the first 15-20 seconds. The function below uses a 20-second cutoff by default.

def extract_hook(transcript_data, max_seconds=20):
    hook_text = []

    for segment in transcript_data:
        # If the segment starts after our max time, stop collecting
        if segment['start'] > max_seconds:
            break
        hook_text.append(segment['text'])

    return " ".join(hook_text).replace("\n", " ")
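
Note that `extract_hook` keeps any segment that starts at or before the cutoff, so a long segment beginning at 18 seconds is included in full even if it runs well past 20. If you want a tighter window, one option is to keep only the segments that also finish inside it. The sketch below assumes every segment carries the `duration` field shown in the sample. `extract_hook_strict` is a hypothetical name, and the third sample segment is made up for illustration.

```python
def extract_hook_strict(transcript_data, max_seconds=20):
    """Join the text of segments that both start and end within max_seconds."""
    hook_text = []
    for segment in transcript_data:
        end = segment["start"] + segment.get("duration", 0)
        if end > max_seconds:
            break  # assumes segments are in chronological order
        hook_text.append(segment["text"])
    return " ".join(hook_text).replace("\n", " ")


sample = [
    {"text": "In 2008, a mysterious programmer...", "start": 0.5, "duration": 4.2},
    {"text": "changed the financial world forever.", "start": 4.7, "duration": 3.1},
    # Made-up segment that starts before 20s but ends at 23s
    {"text": "But nobody knew\nwho he was.", "start": 18.0, "duration": 5.0},
]
print(extract_hook_strict(sample))
```

With the default cutoff the third segment is left out. Raising `max_seconds` to 30 brings it back in.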

Step 4: AI Analysis & Generation

Now we pass the viral hook to OpenAI. We'll ask the AI to analyze why the hook works (the psychology, the curiosity gap) and then generate 3 new hooks for our own topic using the same framework.

def generate_new_hooks(viral_hook, my_topic):
    print("\nAnalyzing hook with AI...")

    prompt = f"""
    Here is the opening hook (first 20 seconds) of a highly viral YouTube video:
    "{viral_hook}"

    Step 1: Analyze the psychology of this hook. Why does it grab attention? What curiosity gap does it open?
    Step 2: Using the exact same psychological framework and pacing, write 3 new YouTube hooks for a video about: "{my_topic}".

    Format the output clearly.
    """

    response = client.chat.completions.create(
        model="gpt-4-turbo",
        messages=[
            {"role": "system", "content": "You are an expert YouTube retention strategist and scriptwriter."},
            {"role": "user", "content": prompt}
        ],
        temperature=0.7
    )

    return response.choices[0].message.content
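
Because the prompt above is a triple-quoted string inside an indented function, every line sent to the model carries leading spaces, and the prompt cannot be inspected without calling the API. One way to tidy this up is to build the prompt in its own function with the standard library's `textwrap.dedent`. This is a sketch: `PROMPT_TEMPLATE` and `build_hook_prompt` are hypothetical names, and the wording is copied from the prompt above.

```python
import textwrap

# Dedent the template once so no line starts with stray indentation
PROMPT_TEMPLATE = textwrap.dedent("""\
    Here is the opening hook (first {seconds} seconds) of a highly viral YouTube video:
    "{viral_hook}"

    Step 1: Analyze the psychology of this hook. Why does it grab attention? What curiosity gap does it open?
    Step 2: Using the exact same psychological framework and pacing, write {count} new YouTube hooks for a video about: "{my_topic}".

    Format the output clearly.
""")


def build_hook_prompt(viral_hook, my_topic, seconds=20, count=3):
    """Fill the template, refusing to build a prompt around an empty hook."""
    if not viral_hook.strip():
        raise ValueError("viral_hook is empty; the transcript may have no text in the window")
    return PROMPT_TEMPLATE.format(
        viral_hook=viral_hook.strip(),
        my_topic=my_topic,
        seconds=seconds,
        count=count,
    )


print(build_hook_prompt("I spent 100 days locked in a room.", "How to learn Python fast in 2026"))
```

The returned string can be passed as the user message in `generate_new_hooks`. The empty-hook check also avoids paying for an API call when the transcript came back empty.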

Step 5: Running the Generator

Let's test it. We'll point the script at a viral video (for example, a MrBeast or documentary-style video) and ask it to generate hooks for a video about "How to learn Python fast in 2026".

def main():
    # Placeholder ID: replace it with the ID of the viral video you want to analyze
    target_video_id = "dQw4w9WgXcQ"
    my_video_topic = "How to learn Python fast in 2026"

    try:
        # 1. Get transcript
        transcript = get_video_transcript(target_video_id)

        # 2. Extract the hook
        viral_hook = extract_hook(transcript, max_seconds=20)
        print(f"\nExtracted Viral Hook:\n\"{viral_hook}\"")

        # 3. Generate new hooks
        results = generate_new_hooks(viral_hook, my_video_topic)

        print("\n" + "="*50)
        print("AI GENERATED RESULTS")
        print("="*50)
        print(results)

    except Exception as e:
        print(f"Error: {e}")

if __name__ == "__main__":
    main()
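
`main()` above hardcodes the video ID and topic, so trying a different video means editing the file. Here is a small sketch of reading them from the command line with the standard library's `argparse`. The argument names (`video_id`, `--topic`, `--max-seconds`) are illustrative choices, not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse command-line arguments; argv=None means read from sys.argv."""
    parser = argparse.ArgumentParser(
        description="Generate YouTube hooks modeled on an existing video's opening."
    )
    parser.add_argument("video_id", help="ID of the video whose hook should be analyzed")
    parser.add_argument("--topic", required=True, help="Topic of the video you are writing hooks for")
    parser.add_argument(
        "--max-seconds",
        type=float,
        default=20,
        help="Length of the opening window in seconds (default: 20)",
    )
    return parser.parse_args(argv)


# Passing an explicit list makes the parser easy to try without a shell
args = parse_args(["dQw4w9WgXcQ", "--topic", "How to learn Python fast in 2026"])
print(args.video_id, args.topic, args.max_seconds)
```

Inside `main()`, `args.video_id`, `args.topic`, and `args.max_seconds` would replace the hardcoded values. Video IDs that begin with a hyphen need a `--` separator in front of them so that argparse does not read them as options.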

The Output

The script will print something like this (abridged; the generated text varies from run to run):

Extracted Viral Hook:
"I spent 100 days locked in a room with nothing but a laptop. And what I discovered about human psychology completely terrified me."

==================================================
AI GENERATED RESULTS
==================================================
Analysis:
This hook uses the "Extreme Commitment" framework combined with a "Negative Discovery" curiosity gap. It establishes immediate authority (100 days) and promises a shocking revelation (terrified me).

New Hooks for "How to learn Python fast in 2026":

Hook 1: "I spent 30 days analyzing the code of the top 1% of software engineers. And the secret they use to learn new languages completely broke my understanding of programming."

Hook 2: "I locked myself in my office for a week to learn Python from scratch. And the shortcut I discovered makes traditional coding bootcamps look like a complete scam."

Why This is a Game Changer

Instead of guessing what makes a good intro, you pull the opening lines of videos that have already performed well and apply their underlying frameworks to your own content.

By using SociaVault, you bypass the nightmare of YouTube's official API quotas and transcript scraping blocks. You get clean data, instantly.

Get your free API key at SociaVault.com and start reverse-engineering virality today.
