Olamide Olaniyan
Build a Video-to-Blog Post Converter That Works Across Every Platform

You published a banger 10-minute video last week. Thousands of views. Good engagement. And then it's gone — buried in the algorithm.

Meanwhile, a written version of that same content would rank on Google for years. But who has time to manually transcribe and rewrite every video?

Let's build a Video-to-Blog Converter that:

  1. Grabs the transcript from any video (TikTok, YouTube, Instagram, Twitter/X, Facebook)
  2. Cleans and structures it automatically
  3. Uses AI to rewrite it as an SEO-optimized blog post
  4. Suggests titles, meta descriptions, and internal links

One video → one blog post → years of organic traffic. Let's go.

Why Every Video Should Become a Blog Post

Here's something most creators miss: video and search audiences barely overlap.

Google indexes text, not video. Your YouTube video might get 50K views in the first week and then plateau. But a blog post covering the same topic can compound traffic for years.

The math:

  • A 10-minute video = ~1,500 words of spoken content
  • A 1,500-word blog post = competitive for most long-tail keywords
  • Time to manually transcribe + edit = 2 hours
  • Time with this tool = 30 seconds
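That ~1,500-word figure assumes an average speaking pace of roughly 150 words per minute — a common rule of thumb, not a fixed constant; real pace varies by speaker. As a sanity check:

```python
# Rough estimate of how much written content a video yields,
# assuming an average speaking pace of ~150 words per minute.
WORDS_PER_MINUTE = 150

def estimated_words(video_minutes: float, wpm: int = WORDS_PER_MINUTE) -> int:
    """Approximate spoken-word count for a video of the given length."""
    return int(video_minutes * wpm)

print(estimated_words(10))  # a 10-minute video ≈ 1500 words
```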

The Stack

  • Python: Language
  • SociaVault API: Multi-platform transcript endpoints
  • OpenAI: Content transformation
  • python-dotenv: Config management

Step 1: Setup

mkdir video-to-blog
cd video-to-blog
pip install requests openai python-dotenv

Create .env:

SOCIAVAULT_API_KEY=your_key_here
OPENAI_API_KEY=your_openai_key
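One thing worth doing before any API calls: fail fast if a key didn't load. A minimal stdlib-only sketch (`missing_env` is my own helper; the variable names match the `.env` above):

```python
import os

REQUIRED_KEYS = ("SOCIAVAULT_API_KEY", "OPENAI_API_KEY")

def missing_env(required=REQUIRED_KEYS) -> list:
    """Return the names of required variables that are unset or empty."""
    return [name for name in required if not os.getenv(name)]

# Call this right after load_dotenv(), before constructing any clients:
# if missing_env():
#     raise SystemExit(f"Missing env vars: {', '.join(missing_env())}")
```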

Step 2: Universal Transcript Fetcher

The beauty here — SociaVault has transcript endpoints for every major platform. One function handles them all.

Create converter.py:

import os
import re
import json
import requests
from openai import OpenAI
from dotenv import load_dotenv

load_dotenv()

API_BASE = "https://api.sociavault.com"
HEADERS = {"Authorization": f"Bearer {os.getenv('SOCIAVAULT_API_KEY')}"}
client = OpenAI(api_key=os.getenv("OPENAI_API_KEY"))


def detect_platform(url: str) -> str:
    """Auto-detect which platform a URL belongs to."""
    url_lower = url.lower()

    if "tiktok.com" in url_lower:
        return "tiktok"
    elif "youtube.com" in url_lower or "youtu.be" in url_lower:
        return "youtube"
    elif "instagram.com" in url_lower:
        return "instagram"
    elif "twitter.com" in url_lower or "x.com" in url_lower:
        return "twitter"
    elif "facebook.com" in url_lower or "fb.watch" in url_lower:
        return "facebook"
    else:
        raise ValueError(f"Unsupported platform: {url}")


def get_transcript(url: str) -> dict:
    """Fetch transcript from any supported platform."""
    platform = detect_platform(url)

    endpoints = {
        "tiktok": "/v1/scrape/tiktok/transcript",
        "youtube": "/v1/scrape/youtube/transcript",
        "instagram": "/v1/scrape/instagram/transcript",
        "twitter": "/v1/scrape/twitter/transcript",
        "facebook": "/v1/scrape/facebook/transcript",
    }

    endpoint = endpoints.get(platform)
    if not endpoint:
        raise ValueError(f"No transcript endpoint for {platform}")

    print(f"📥 Fetching {platform} transcript...")

    response = requests.get(
        f"{API_BASE}{endpoint}",
        params={"url": url},
        headers=HEADERS,
        timeout=30,  # don't hang forever on a slow scrape
    )
    response.raise_for_status()

    data = response.json().get("data", {})

    # Normalize transcript format across platforms
    transcript_text = ""

    if isinstance(data, str):
        transcript_text = data
    elif isinstance(data, dict):
        transcript_text = data.get("transcript", "") or data.get("text", "")

        # Handle timestamped segments
        if not transcript_text and "segments" in data:
            transcript_text = " ".join(
                seg.get("text", "") for seg in data["segments"]
            )
    elif isinstance(data, list):
        transcript_text = " ".join(
            item.get("text", "") if isinstance(item, dict) else str(item)
            for item in data
        )

    word_count = len(transcript_text.split())
    print(f"  ✓ Got {word_count:,} words from {platform}")

    return {
        "platform": platform,
        "url": url,
        "transcript": transcript_text,
        "word_count": word_count,
        "raw_data": data,
    }
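The normalization branch is the fiddliest part of `get_transcript`, so it helps to see the same logic in isolation. A standalone sketch covering the three response shapes (plain string, dict, list of segments):

```python
def normalize_transcript(data) -> str:
    """Flatten the three response shapes into plain text (mirrors get_transcript)."""
    if isinstance(data, str):
        return data
    if isinstance(data, dict):
        text = data.get("transcript", "") or data.get("text", "")
        if not text and "segments" in data:
            text = " ".join(seg.get("text", "") for seg in data["segments"])
        return text
    if isinstance(data, list):
        return " ".join(
            item.get("text", "") if isinstance(item, dict) else str(item)
            for item in data
        )
    return ""

# The same logic covers all three shapes:
assert normalize_transcript("hello world") == "hello world"
assert normalize_transcript({"segments": [{"text": "a"}, {"text": "b"}]}) == "a b"
assert normalize_transcript([{"text": "x"}, "y"]) == "x y"
```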

Step 3: AI Blog Post Generator

Here's where the transcript becomes a polished blog post:

def transcript_to_blog(
    transcript_data: dict,
    target_keyword: str | None = None,
    tone: str = "conversational but authoritative",
    word_count_target: int = 1500,
) -> dict:
    """Convert a video transcript into an SEO-optimized blog post."""

    transcript = transcript_data["transcript"]
    platform = transcript_data["platform"]

    if not transcript.strip():
        raise ValueError("Empty transcript — video might not have speech")

    print(f"\n✍️  Converting {transcript_data['word_count']} words to blog post...")

    keyword_instruction = ""
    if target_keyword:
        keyword_instruction = f"""
        Target SEO keyword: "{target_keyword}"
        - Include it in the title, first paragraph, and 2-3 subheadings
        - Use it naturally 3-5 times in the body
        - Include related long-tail variations
        """

    completion = client.chat.completions.create(
        model="gpt-4o-mini",
        messages=[{
            "role": "user",
            "content": f"""Convert this video transcript into a polished blog post.

TRANSCRIPT (from {platform}):
{transcript[:6000]}

INSTRUCTIONS:
- Write ~{word_count_target} words
- Tone: {tone}
- Structure with H2 and H3 headings
- Add an engaging introduction that hooks the reader
- Break into scannable sections
- Add a clear conclusion with a takeaway
- Write like a human, not AI — use short sentences, contractions, and personality
- Keep the speaker's original insights and examples
- Don't add information that wasn't in the transcript
- Remove verbal tics (um, uh, like, you know, so basically)
{keyword_instruction}

Return JSON:
{{
  "title": "SEO-optimized blog title",
  "meta_description": "155 character meta description",
  "slug": "url-friendly-slug",
  "blog_post": "Full markdown blog post with headings",
  "tags": ["5 relevant tags"],
  "estimated_read_time": "X min read",
  "seo_suggestions": ["3 internal linking opportunities"],
  "social_snippets": {{
    "twitter": "Tweet-length summary",
    "linkedin": "LinkedIn post summary (2-3 sentences)"
  }}
}}"""
        }],
        response_format={"type": "json_object"}
    )

    result = json.loads(completion.choices[0].message.content)

    actual_words = len(result["blog_post"].split())
    print(f"  ✓ Generated {actual_words}-word blog post")
    print(f"  📝 Title: {result['title']}")
    print(f"  🔗 Slug: /{result['slug']}")
    print(f"  ⏱️  {result['estimated_read_time']}")

    return result
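One caveat: the model-generated `slug` gets used as a filename later, and nothing guarantees it's filesystem-safe. A small defensive sketch (`safe_slug` is my own helper, not part of either API):

```python
import re

def safe_slug(slug: str, max_length: int = 80) -> str:
    """Sanitize a model-generated slug before using it as a filename."""
    slug = slug.lower().strip()
    slug = re.sub(r"[^a-z0-9]+", "-", slug)    # collapse anything unsafe to hyphens
    slug = re.sub(r"-{2,}", "-", slug).strip("-")
    return slug[:max_length] or "untitled"

print(safe_slug("Build a REST API!"))  # build-a-rest-api
```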

Step 4: Batch Converter for Content Libraries

Got a backlog of videos? Convert them all:

def batch_convert(urls: list, output_dir: str = "blog_posts") -> list:
    """Convert multiple videos to blog posts."""

    os.makedirs(output_dir, exist_ok=True)
    results = []

    print(f"\n🔄 Converting {len(urls)} videos to blog posts...\n")

    for i, url in enumerate(urls, 1):
        print(f"{'=' * 50}")
        print(f"[{i}/{len(urls)}] {url}")

        try:
            transcript = get_transcript(url)

            if transcript["word_count"] < 50:
                print(f"  ⚠️  Skipping — only {transcript['word_count']} words")
                continue

            blog = transcript_to_blog(transcript)

            # Save as markdown file
            filename = f"{blog['slug']}.md"
            filepath = os.path.join(output_dir, filename)

            frontmatter = f"""---
title: "{blog['title']}"
description: "{blog['meta_description']}"
tags: {json.dumps(blog['tags'])}
source_video: "{url}"
source_platform: "{transcript['platform']}"
read_time: "{blog['estimated_read_time']}"
---

"""

            with open(filepath, "w", encoding="utf-8") as f:
                f.write(frontmatter + blog["blog_post"])

            print(f"  💾 Saved: {filepath}")

            results.append({
                "url": url,
                "title": blog["title"],
                "slug": blog["slug"],
                "file": filepath,
                "word_count": len(blog["blog_post"].split()),
            })

        except Exception as e:
            print(f"  ❌ Error: {e}")
            results.append({"url": url, "error": str(e)})

    # Summary
    successful = [r for r in results if "title" in r]
    print(f"\n{'=' * 50}")
    print(f"✅ Converted: {len(successful)}/{len(urls)} videos")

    total_words = sum(r["word_count"] for r in successful)
    print(f"📝 Total content: {total_words:,} words")
    print(f"📁 Output: {output_dir}/")

    return results
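A subtle failure mode in the frontmatter f-strings: a title containing a double quote produces invalid YAML. One way to harden it (`yaml_quote` is my own helper; `json.dumps` escaping is valid for YAML double-quoted scalars, since YAML 1.2 is a superset of JSON):

```python
import json

def yaml_quote(value: str) -> str:
    """Quote a string safely for YAML frontmatter."""
    return json.dumps(value)

# Instead of title: "{blog['title']}", build the line like this:
title = 'How to "Really" Ship Fast'
frontmatter_line = f"title: {yaml_quote(title)}"
print(frontmatter_line)
```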

Step 5: Content Calendar Generator

Plan which videos to convert based on performance:

def suggest_conversion_priority(video_urls_with_views: list) -> list:
    """Rank videos by conversion priority."""

    print("\n📊 Analyzing conversion priority...\n")

    scored = []

    for item in video_urls_with_views:
        url = item["url"]
        views = item.get("views", 0)

        # Higher views = more proven topic
        # Longer videos = more content to work with
        transcript = get_transcript(url)

        word_count = transcript["word_count"]

        # Score: views (topic validation) × words (content depth)
        score = 0
        if word_count >= 500:
            score += 40  # Enough content for a full post
        elif word_count >= 200:
            score += 20
        else:
            score += 5

        if views >= 100000:
            score += 50
        elif views >= 10000:
            score += 35
        elif views >= 1000:
            score += 20
        else:
            score += 10

        scored.append({
            "url": url,
            "views": views,
            "word_count": word_count,
            "score": score,
            "platform": transcript["platform"],
        })

    scored.sort(key=lambda x: x["score"], reverse=True)

    print("Priority ranking:")
    for i, item in enumerate(scored, 1):
        emoji = "🟢" if item["score"] >= 60 else "🟡" if item["score"] >= 30 else "🔴"
        print(f"  {emoji} {i}. [{item['platform']}] Score: {item['score']}")
        print(f"     {item['url']}")
        print(f"     {item['views']:,} views | {item['word_count']} words")
        print()

    return scored
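The scoring thresholds are buried inside the loop; factoring them into a pure helper makes them easy to tune and test. Same numbers as above, just extracted:

```python
def content_score(views: int, word_count: int) -> int:
    """Score a video for conversion priority (same thresholds as the loop above)."""
    # Content depth: enough words for a full post?
    score = 40 if word_count >= 500 else 20 if word_count >= 200 else 5
    # Topic validation: views prove audience demand.
    if views >= 100_000:
        score += 50
    elif views >= 10_000:
        score += 35
    elif views >= 1_000:
        score += 20
    else:
        score += 10
    return score

print(content_score(150_000, 800))  # 90 — viral + long-form tops the list
```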

Step 6: Full Pipeline CLI

def main():
    import sys

    if len(sys.argv) < 2:
        print("Video-to-Blog Converter")
        print()
        print("Usage:")
        print("  python converter.py convert <video_url>")
        print("  python converter.py convert <url> --keyword 'target keyword'")
        print("  python converter.py batch urls.txt")
        print("  python converter.py transcript <url>")
        print()
        print("Supported platforms: TikTok, YouTube, Instagram, Twitter/X, Facebook")
        return

    command = sys.argv[1]

    if command == "transcript":
        url = sys.argv[2]
        result = get_transcript(url)
        print(f"\n--- TRANSCRIPT ({result['word_count']} words) ---\n")
        preview = result["transcript"][:2000]
        print(preview)
        if len(result["transcript"]) > 2000:
            print(f"\n... ({len(result['transcript']) - 2000:,} more characters)")

    elif command == "convert":
        url = sys.argv[2]
        keyword = None
        if "--keyword" in sys.argv:
            idx = sys.argv.index("--keyword")
            keyword = sys.argv[idx + 1]

        transcript = get_transcript(url)
        blog = transcript_to_blog(transcript, target_keyword=keyword)

        # Save it
        filename = f"{blog['slug']}.md"
        frontmatter = f"""---
title: "{blog['title']}"
description: "{blog['meta_description']}"
tags: {json.dumps(blog['tags'])}
source: "{url}"
---

"""
        with open(filename, "w", encoding="utf-8") as f:
            f.write(frontmatter + blog["blog_post"])

        print(f"\n💾 Saved to {filename}")

        print(f"\n🐦 Twitter: {blog['social_snippets']['twitter']}")
        print(f"\n💼 LinkedIn: {blog['social_snippets']['linkedin']}")

    elif command == "batch":
        filepath = sys.argv[2]
        with open(filepath) as f:
            urls = [line.strip() for line in f if line.strip()]
        batch_convert(urls)

    else:
        print(f"Unknown command: {command}")


if __name__ == "__main__":
    main()

Running It

# Convert a single YouTube video
python converter.py convert "https://youtube.com/watch?v=abc123"

# Convert with SEO keyword targeting
python converter.py convert "https://youtube.com/watch?v=abc123" --keyword "python web scraping"

# Just grab the transcript
python converter.py transcript "https://tiktok.com/@user/video/123"

# Batch convert from a file of URLs
python converter.py batch my_videos.txt

Sample Output

From a 12-minute YouTube tutorial:

📥 Fetching youtube transcript...
  ✓ Got 1,847 words from youtube

✍️  Converting 1,847 words to blog post...
  ✓ Generated 1,623-word blog post
  📝 Title: How to Build a REST API with Express.js in 2025
  🔗 Slug: /build-rest-api-express-2025
  ⏱️  7 min read

💾 Saved to build-rest-api-express-2025.md

🐦 Twitter: Just converted my Express.js tutorial into a blog
   post automatically. 1,800 words of video → 1,600 words of
   SEO-optimized content in 30 seconds.

💼 LinkedIn: Published a comprehensive guide on building REST
   APIs with Express.js. Originally a video tutorial, now
   optimized for search with structured headings and examples.

The Content Multiplication Strategy

Here's the real play:

  1. Record one video with your expertise
  2. Convert to blog post (this tool)
  3. Extract tweet threads from key sections
  4. Pull LinkedIn posts from the social snippets
  5. Create email newsletter from the summary

One video → 5 pieces of content. Every week.

The creators winning right now aren't creating 5x more content. They're repurposing 5x more effectively.
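Step 3 of the strategy (tweet threads) can be bootstrapped from the markdown itself: split on H2 headings and trim each section. A naive sketch of my own — the real thread still deserves a human editing pass:

```python
import re

def extract_thread(markdown_post: str, max_tweets: int = 8) -> list:
    """Naive thread builder: one tweet per H2 section, snippet trimmed to
    ~200 chars so heading + text stays comfortably under Twitter's limit."""
    sections = re.split(r"^## +", markdown_post, flags=re.MULTILINE)[1:]
    tweets = []
    for i, section in enumerate(sections[:max_tweets], 1):
        heading, _, body = section.partition("\n")
        snippet = " ".join(body.split())[:200]
        tweets.append(f"{i}/ {heading.strip()}: {snippet}")
    return tweets

post = "Intro\n\n## First point\nDetails here.\n\n## Second point\nMore details."
print(extract_thread(post))
```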

Cost Comparison

  Method                  Cost            Time
  Manual transcription    Free            2-3 hours
  Rev.com transcription   $1.50/min       12-hour turnaround
  Descript                $24/mo          30 min editing
  This tool               ~$0.02/video    30 seconds

The SociaVault transcript costs 1 credit per video. OpenAI rewrite costs about $0.01. Total: pennies per blog post.
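At those prices, the arithmetic for a whole backlog is almost embarrassing (treating the ~$0.01 OpenAI figure as the only dollar cost, which is an approximation):

```python
# Back-of-envelope dollar cost for converting a backlog,
# using the ~$0.01-per-rewrite estimate above.
OPENAI_COST_PER_POST = 0.01

def rewrite_cost(num_videos: int, cost_per_post: float = OPENAI_COST_PER_POST) -> float:
    return round(num_videos * cost_per_post, 2)

print(rewrite_cost(100))  # a 100-video backlog ≈ $1.00 in OpenAI spend
```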

Get Started

  1. Get your API key at sociavault.com
  2. Pick your best-performing video
  3. Convert it and publish — it'll probably rank within a month

Your best content is already recorded. It's just stuck in video format where Google can't find it.


Every video you've ever made is a blog post waiting to happen. Stop leaving SEO traffic on the table.

#python #contentcreation #seo #webdev
