Hackceleration

Posted on Mar 24 • Originally published at hackceleration.com

Building an AI-Powered YouTube Publisher with n8n, OpenAI Whisper, and Google Drive

#n8n #api #automation #tutorial

You've got audio files sitting in Google Drive. You need them transcribed, analyzed, and published to YouTube with SEO-optimized metadata. Here's how to architect an automation that handles the entire pipeline using n8n, OpenAI's Whisper API, and the YouTube Data API.

Architecture Overview

This integration chains five API calls across three services (Google Drive, OpenAI, and YouTube):

1. Google Drive API → Retrieve audio/video files
2. OpenAI Whisper API → Transcribe audio to text
3. OpenAI GPT-4.1-mini → Generate metadata from transcript
4. Google Drive API → Download video file
5. YouTube Data API v3 → Upload with generated metadata

Why this stack? Google Drive provides centralized file storage with robust search capabilities. Whisper offers high-accuracy transcription across languages. GPT-4.1-mini balances quality and cost for metadata generation. The YouTube API handles programmatic uploads with scheduling.

Alternative considered: Zapier's YouTube integration lacks structured output parsing for AI-generated content. n8n's JSON schema validation ensures consistent metadata formatting.

API Integration Deep-Dive

Google Drive API: File Search and Download

Authentication: OAuth2 via Google Cloud Console. Create credentials at console.cloud.google.com, enable the Google Drive API, and configure the OAuth consent screen.

Search Request:

GET https://www.googleapis.com/drive/v3/files
Headers: { "Authorization": "Bearer {access_token}" }
Query Parameters: {
  "q": "'{folder_id}' in parents",
  "fields": "files(id, name, mimeType)"
}

Response Structure:

{
  "files": [
    {
      "id": "1AbC2DeF3GhI4JkL5MnO6PqR7StU8VwX9YzA",
      "name": "episode-42.mp3",
      "mimeType": "audio/mpeg"
    }
  ]
}

n8n Configuration:

Resource: File/Folder
Operation: Search
Filter by Folder ID (found in Drive URL)
Return All: Enabled
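
For manual API calls outside n8n, the search request above can be sketched in Python. This is a minimal sketch, not the n8n node's implementation: the helper names build_search_url and pick_media are hypothetical, and the extra "trashed = false" clause is an assumption added here to skip deleted files.

```python
from urllib.parse import urlencode

DRIVE_FILES_URL = "https://www.googleapis.com/drive/v3/files"


def build_search_url(folder_id: str) -> str:
    """Build a files.list URL for items directly inside one folder."""
    params = {
        # Assumption: also exclude trashed files (not in the original query).
        "q": f"'{folder_id}' in parents and trashed = false",
        "fields": "files(id, name, mimeType)",
    }
    return f"{DRIVE_FILES_URL}?{urlencode(params)}"


def pick_media(files: list[dict]) -> list[dict]:
    """Keep only entries whose MIME type is audio/* or video/*."""
    return [
        f for f in files
        if f.get("mimeType", "").startswith(("audio/", "video/"))
    ]
```

pick_media takes the "files" array from the response structure shown earlier, so non-media files in the folder never reach the transcription step.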

Download Request:

GET https://www.googleapis.com/drive/v3/files/{fileId}?alt=media
Headers: { "Authorization": "Bearer {access_token}" }

Returns a binary data stream. n8n stores it in the binary property named data.

Rate Limits: 1,000 queries per 100 seconds per user (confirm the current quota for your project in the Google Cloud Console). Edge case: handle 403 userRateLimitExceeded with exponential backoff.
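
A minimal sketch of exponential backoff with jitter for that edge case. The RateLimited exception is hypothetical: it stands in for whatever your HTTP layer raises when Drive returns 403 userRateLimitExceeded. The retry count, base delay, and cap are illustrative defaults, not values from Google's documentation.

```python
import random
import time


class RateLimited(Exception):
    """Hypothetical error raised by the caller on 403 userRateLimitExceeded."""


def backoff_delays(max_retries: int = 5, base: float = 1.0, cap: float = 32.0):
    """Yield waits of base * 2**attempt plus up to 1 s of jitter, capped."""
    for attempt in range(max_retries):
        yield min(cap, base * 2 ** attempt + random.random())


def call_with_backoff(request, max_retries: int = 5, sleep=time.sleep):
    """Run request(); on RateLimited, wait and retry up to max_retries times."""
    for delay in backoff_delays(max_retries):
        try:
            return request()
        except RateLimited:
            sleep(delay)
    # Final attempt: if it still fails, RateLimited propagates to the caller.
    return request()
```

The sleep function is injectable so the retry logic can be exercised without real waiting.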

OpenAI Whisper API: Audio Transcription

Authentication: API key from platform.openai.com. Add to request header as Authorization: Bearer {api_key}.

Request Format:

POST https://api.openai.com/v1/audio/transcriptions
Headers: {
  "Authorization": "Bearer {api_key}",
  "Content-Type": "multipart/form-data"
}
Body (form-data): {
  "file": <binary_audio_data>,
  "model": "whisper-1"
}

Response:

{
  "text": "Welcome to episode 42 where we discuss API integration patterns..."
}

Critical Parameters:

File size limit: 25 MB
Supported formats: mp3, mp4, mpeg, mpga, m4a, wav, webm
Cost: $0.006 per minute

n8n Configuration:

Resource: Audio
Operation: Transcribe a Recording
Input Data Field Name: data

Error Handling: 413 Payload Too Large → compress audio or split files. Missing binary data → verify Google Drive download node output.
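
Checking the file before sending it avoids the 413 round trip. The sketch below validates against the limits listed under Critical Parameters; the function name preflight is hypothetical, and treating 25 MB as 25 * 1024 * 1024 bytes is an assumption.

```python
from pathlib import Path

# Assumption: "25 MB" interpreted as 25 * 1024 * 1024 bytes.
MAX_BYTES = 25 * 1024 * 1024
SUPPORTED = {"mp3", "mp4", "mpeg", "mpga", "m4a", "wav", "webm"}


def preflight(filename: str, size_bytes: int) -> list[str]:
    """Return a list of problems; an empty list means the file looks sendable."""
    problems = []
    ext = Path(filename).suffix.lower().lstrip(".")
    if ext not in SUPPORTED:
        problems.append(f"unsupported format: {ext or '(none)'}")
    if size_bytes > MAX_BYTES:
        problems.append("file exceeds 25 MB: compress or split before upload")
    return problems
```

A non-empty result can route the item to a compression or splitting branch instead of the transcription node.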

OpenAI GPT-4.1-mini: Structured Metadata Generation

Request Structure:

POST https://api.openai.com/v1/chat/completions
Headers: {
  "Authorization": "Bearer {api_key}",
  "Content-Type": "application/json"
}
Body: {
  "model": "gpt-4.1-mini",
  "messages": [
    {
      "role": "system",
      "content": "You are a YouTube SEO expert. Generate title, description, and tags."
    },
    {
      "role": "user",
      "content": "{transcript_text}"
    }
  ],
  "response_format": { "type": "json_schema", "json_schema": {...} }
}

JSON Schema for Structured Output:

{
  "type": "object",
  "properties": {
    "title": { "type": "string" },
    "description": { "type": "string" },
    "tags": { "type": "array", "items": { "type": "string" } }
  },
  "required": ["title", "description", "tags"]
}

Response:

{
  "choices": [
    {
      "message": {
        "content": "{\"title\":\"How to Integrate APIs with n8n\",\"description\":\"Learn API integration...\",\"tags\":[\"api\",\"n8n\",\"automation\"]}"
      }
    }
  ]
}
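
Note that the metadata arrives as a JSON string inside message.content, so it needs a second parse. A minimal sketch of that step, checking the three required fields from the schema above (parse_metadata is a hypothetical helper name, not part of any SDK):

```python
import json

REQUIRED = {"title": str, "description": str, "tags": list}


def parse_metadata(response: dict) -> dict:
    """Extract and validate the metadata object from a chat completion."""
    content = response["choices"][0]["message"]["content"]
    data = json.loads(content)  # content is a JSON string, not an object
    for key, expected in REQUIRED.items():
        if not isinstance(data.get(key), expected):
            raise ValueError(f"missing or mistyped field: {key}")
    if not all(isinstance(tag, str) for tag in data["tags"]):
        raise ValueError("tags must all be strings")
    return data
```

In n8n the JSON Schema output parser performs the equivalent check; this is only relevant when calling the API directly.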

n8n AI Agent Setup:

Model: gpt-4.1-mini
Output Parser: JSON Schema
Auto-Fix Format: Enabled

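Cost Optimization: GPT-4.1-mini is billed per token, with separate input and output rates (see Estimated Costs under Prerequisites). The generated metadata itself typically runs 500-1000 tokens per video; the transcript you send as input adds to that in proportion to its length.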

YouTube Data API v3: Video Upload

Authentication: OAuth2 with youtube.upload scope. Create credentials in Google Cloud Console, enable YouTube Data API v3.

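Upload Request (resumable upload). This first request sends only the metadata and starts the upload session; the video bytes then go in a PUT to the session URL returned in the Location response header: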

POST https://www.googleapis.com/upload/youtube/v3/videos?uploadType=resumable&part=snippet,status
Headers: {
  "Authorization": "Bearer {access_token}",
  "Content-Type": "application/json"
}
Body: {
  "snippet": {
    "title": "{ai_generated_title}",
    "description": "{ai_generated_description}",
    "tags": ["{tag1}", "{tag2}"],
    "categoryId": "28"
  },
  "status": {
    "privacyStatus": "private",
    "publishAt": "2025-06-20T18:00:00Z"
  }
}

Response (returned once the video bytes have finished uploading):

{
  "id": "dQw4w9WgXcQ",
  "snippet": {
    "title": "How to Integrate APIs with n8n",
    "publishedAt": "2025-06-20T18:00:00Z"
  }
}
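
AI-generated metadata can exceed YouTube's field limits, so it helps to clamp values before building the request body. The sketch below assumes the documented limits of 100 characters for the title, 5,000 bytes for the description, and 500 characters for the combined tags (commas count, and tags containing spaces count two extra characters for implied quotes). build_video_resource and tag_cost are hypothetical helper names.

```python
def tag_cost(tag: str) -> int:
    """Characters a tag consumes; tags with spaces count 2 extra for quotes."""
    return len(tag) + (2 if " " in tag else 0)


def build_video_resource(title, description, tags, publish_at, category_id="28"):
    """Build the snippet/status body, clamped to the assumed field limits."""
    kept, used = [], 0
    for tag in tags:
        cost = tag_cost(tag) + (1 if kept else 0)  # +1 for the separating comma
        if used + cost > 500:
            break  # stop at the first tag that would exceed the budget
        kept.append(tag)
        used += cost
    # Truncate by bytes, dropping any partial UTF-8 character at the cut.
    clipped = description.encode("utf-8")[:5000].decode("utf-8", errors="ignore")
    return {
        "snippet": {
            "title": title[:100],
            "description": clipped,
            "tags": kept,
            "categoryId": category_id,
        },
        "status": {"privacyStatus": "private", "publishAt": publish_at},
    }
```

Tags are kept in order, so the model should be prompted to list the most important tags first.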

n8n Configuration:

Resource: Video
Operation: Upload
Title: {{ $('AI Agent').item.json.output.title }}
Description: {{ $('AI Agent').item.json.output.description }}
Privacy Status: private
Publish At: ISO 8601 datetime string

Rate Limits: 10,000 quota units per day. One upload = 1,600 units. Handle 403 quotaExceeded by queueing uploads across days.
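
With those numbers, 10,000 // 1,600 gives 6 uploads per day. A minimal sketch of spreading a backlog across days under that budget (schedule_uploads is a hypothetical helper; it ignores quota spent on other API calls, which would lower the real ceiling):

```python
from datetime import date, timedelta

DAILY_QUOTA = 10_000
UPLOAD_COST = 1_600
UPLOADS_PER_DAY = DAILY_QUOTA // UPLOAD_COST  # 6


def schedule_uploads(file_ids, start: date, per_day: int = UPLOADS_PER_DAY):
    """Assign file IDs to consecutive days, at most per_day uploads each."""
    plan = {}
    for index, file_id in enumerate(file_ids):
        day = start + timedelta(days=index // per_day)
        plan.setdefault(day, []).append(file_id)
    return plan
```

For example, 14 queued files starting on a given date are split into batches of 6, 6, and 2 over three days.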

Implementation Gotchas

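Missing Transcript Data: If Whisper returns empty text (for example, a silent recording), the AI agent receives no input. Add conditional logic: {{ $json.text ? $json.text : 'No audio detected' }}. Because the agent would otherwise generate metadata for the placeholder string, branch on that value (for example with an IF node) and skip the upload.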

OAuth Token Expiration: Google OAuth tokens expire after 1 hour. n8n's credential system auto-refreshes, but manual API calls need refresh token handling.
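
For manual API calls, one approach is to track when the access token was issued and refresh slightly before the hour is up. This is a sketch under assumptions: AccessToken and refresh_form are hypothetical names, the 60-second safety margin is arbitrary, and the form fields follow Google's standard OAuth2 refresh-token grant sent to the token endpoint.

```python
import time
from dataclasses import dataclass

TOKEN_URL = "https://oauth2.googleapis.com/token"


@dataclass
class AccessToken:
    value: str
    issued_at: float        # epoch seconds when the token was obtained
    expires_in: int = 3600  # lifetime in seconds, from the token response

    def needs_refresh(self, now=None, skew: int = 60) -> bool:
        """True once the token is within `skew` seconds of expiring."""
        now = time.time() if now is None else now
        return now >= self.issued_at + self.expires_in - skew


def refresh_form(client_id: str, client_secret: str, refresh_token: str) -> dict:
    """Form fields to POST to TOKEN_URL for a new access token."""
    return {
        "client_id": client_id,
        "client_secret": client_secret,
        "refresh_token": refresh_token,
        "grant_type": "refresh_token",
    }
```

Calling needs_refresh() before each request keeps long-running batch jobs from failing midway with a 401.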

YouTube Category IDs: Category 28 = Science & Technology. An invalid category ID causes the upload to be rejected. Validate against YouTube's category list for your region before deployment.

Binary Data Size: Large video files (>2GB) can time out. Set n8n's EXECUTIONS_TIMEOUT environment variable to 3600 seconds for long uploads.

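Scheduled Publish Failures: publishAt must be at least 6 hours in the future, use ISO 8601 format with a timezone, and is only honored while privacyStatus is private. In n8n expressions, Luxon DateTime values such as $now convert with .toISO() (for example {{ $now.plus({ hours: 8 }).toISO() }}); native JavaScript Date objects use .toISOString() instead.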
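
Outside n8n, the same timestamp can be produced with the standard library. A minimal sketch (publish_at is a hypothetical helper; the 6-hour default mirrors the lead time mentioned above):

```python
from datetime import datetime, timedelta, timezone


def publish_at(hours_ahead: int = 6, now: datetime = None) -> str:
    """Return an ISO 8601 UTC timestamp `hours_ahead` hours from `now`."""
    now = now or datetime.now(timezone.utc)
    if now.tzinfo is None:
        raise ValueError("now must be timezone-aware")
    target = (now + timedelta(hours=hours_ahead)).astimezone(timezone.utc)
    # Drop microseconds and use the compact "Z" suffix for UTC.
    return target.replace(microsecond=0).isoformat().replace("+00:00", "Z")
```

Rejecting naive datetimes up front avoids silently scheduling in the server's local time.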

AI Hallucination: GPT-4.1-mini occasionally generates tags unrelated to content. Add validation: check if tags exist in transcript text before uploading.
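
One simple way to implement that check is a word-level match: keep a tag only if every word in it appears somewhere in the transcript. This is a deliberately strict sketch (grounded_tags is a hypothetical name); it will also drop legitimate tags that paraphrase the content, so treat it as a starting point.

```python
import re

WORD = re.compile(r"[a-z0-9]+")


def grounded_tags(tags: list[str], transcript: str) -> list[str]:
    """Keep tags whose every word occurs in the transcript (case-insensitive)."""
    words = set(WORD.findall(transcript.lower()))
    kept = []
    for tag in tags:
        tag_words = WORD.findall(tag.lower())
        if tag_words and all(w in words for w in tag_words):
            kept.append(tag)
    return kept
```

Tags that fail the check can be logged for review rather than silently discarded.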

Prerequisites

Required Accounts:

n8n instance (self-hosted or cloud)
OpenAI API account with credits
Google Cloud project with Drive + YouTube APIs enabled
YouTube channel with upload permissions

API Credentials Needed:

OpenAI API key: platform.openai.com/api-keys
Google OAuth2 credentials: console.cloud.google.com/apis/credentials
YouTube OAuth consent configured with youtube.upload scope

Estimated Costs:

Whisper: $0.006/minute audio
GPT-4.1-mini: $0.40/1M input tokens, $1.60/1M output tokens (list price; verify on OpenAI's pricing page)
Per video: ~$0.10-0.50 depending on audio length
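
The per-video figure is straightforward arithmetic over those rates. A minimal sketch, with every rate passed in explicitly so the numbers can be updated when pricing changes (estimate_cost is a hypothetical helper):

```python
def estimate_cost(
    audio_minutes: float,
    input_tokens: int,
    output_tokens: int,
    whisper_per_minute: float,
    input_per_million: float,
    output_per_million: float,
) -> float:
    """Estimated USD cost of transcribing and describing one video."""
    transcription = audio_minutes * whisper_per_minute
    metadata = (
        input_tokens / 1_000_000 * input_per_million
        + output_tokens / 1_000_000 * output_per_million
    )
    return transcription + metadata
```

At $0.006 per minute, a 30-minute episode costs $0.18 for transcription alone, which shows that audio length, not metadata generation, drives the total.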

Official Documentation:

Google Drive API: developers.google.com/drive/api/v3/reference
OpenAI Whisper: platform.openai.com/docs/guides/speech-to-text
YouTube Data API: developers.google.com/youtube/v3/docs

Get the Complete Workflow Configuration

This tutorial covers the API integration architecture and critical parameters. For the complete n8n workflow JSON file with all node configurations, system prompts, and error handling logic, check out the full implementation guide.
