Generate cinematic AI videos from text prompts using xAI’s latest model and Python.
A command-line tool that takes a text prompt, sends it to xAI's Grok Imagine model via fal.ai's API, monitors the generation in real time, and downloads the finished video, complete with synchronized audio.
Prerequisites
Before starting, ensure you have:
- Python 3.8+ installed
- A fal.ai account with API key from the Dashboard
Step 1: Environment Setup
Install the required packages:
pip install fal-client python-dotenv requests
Create a .env file in your project directory:
FAL_KEY=your_api_key_here
Now let’s build the application piece by piece.
Step 2: Imports and Configuration
import asyncio
import os
from pathlib import Path

import fal_client
import requests
from dotenv import load_dotenv

load_dotenv()
We use asyncio because fal.ai's client is async-first. Video generation takes time, and async patterns let us stream progress updates while we wait. The load_dotenv() call pulls your FAL_KEY into the environment automatically.
Step 3: The Core Generation Function
This is where the magic happens. We’ll define an async function that handles the entire pipeline:
async def generate_and_download_video(
    prompt: str,
    output_dir: str = "videos",
    duration: int = 6,
    aspect_ratio: str = "16:9",
    resolution: str = "480p",
) -> str:
    """
    Generate a video using Grok Imagine and download it.

    Args:
        prompt: Text description of the desired video
        output_dir: Directory to save the downloaded video
        duration: Video duration in seconds (default: 6)
        aspect_ratio: 16:9, 4:3, 3:2, 1:1, 2:3, 3:4, or 9:16
        resolution: Output resolution - 480p or 720p

    Returns:
        Path to the downloaded video file, or None on failure
    """
    Path(output_dir).mkdir(parents=True, exist_ok=True)

    print(f"🎬 Generating video with prompt: {prompt}")
    print(f"   Duration: {duration}s | Aspect Ratio: {aspect_ratio} | Resolution: {resolution}")
    print("-" * 60)
The function accepts configurable parameters with sensible defaults. We create the output directory if it doesn’t exist, then print the configuration for visibility.
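The Path.mkdir pattern is safe to run repeatedly, which is why we don't need to check for the directory first. Here is that idea isolated into a tiny helper (the function name ensure_output_dir is my own, not part of the script):

```python
from pathlib import Path


def ensure_output_dir(output_dir: str) -> Path:
    """Create the output directory (and any parents) if missing.

    parents=True creates intermediate directories; exist_ok=True makes
    repeated calls a no-op instead of raising FileExistsError.
    """
    path = Path(output_dir)
    path.mkdir(parents=True, exist_ok=True)
    return path
```

Calling it twice with the same path simply returns the same directory both times.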
Step 4: Submitting the Request
    # Submit the async request to Grok Imagine
    handler = await fal_client.submit_async(
        "xai/grok-imagine/text-to-video",
        arguments={
            "prompt": prompt,
            "duration": duration,
            "aspect_ratio": aspect_ratio,
            "resolution": resolution,
        },
    )

    print(f"📤 Request submitted. Request ID: {handler.request_id}")
    print("⏳ Waiting for video generation...")
submit_async sends your request to fal.ai's queue and returns immediately with a handler. The request_id is useful for debugging and tracking jobs. The model endpoint xai/grok-imagine/text-to-video tells fal.ai which model to run.
Available parameters:
- duration: 6–15 seconds (longer = slower generation)
- aspect_ratio: 16:9, 9:16, 1:1, 4:3, 3:4, 3:2, 2:3 (use 9:16 for vertical/mobile)
- resolution: 480p or 720p (720p = better quality, slower)
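Since an out-of-range value means waiting on a job that will fail, it can be worth validating locally before submitting. A minimal sketch, assuming the ranges in the list above (the helper name and error messages are my own, not part of the fal.ai client):

```python
VALID_ASPECT_RATIOS = {"16:9", "9:16", "1:1", "4:3", "3:4", "3:2", "2:3"}
VALID_RESOLUTIONS = {"480p", "720p"}


def validate_video_params(duration: int, aspect_ratio: str, resolution: str) -> None:
    """Raise ValueError early instead of waiting for the API to reject the job."""
    if not 6 <= duration <= 15:
        raise ValueError(f"duration must be 6-15 seconds, got {duration}")
    if aspect_ratio not in VALID_ASPECT_RATIOS:
        raise ValueError(f"unsupported aspect_ratio: {aspect_ratio}")
    if resolution not in VALID_RESOLUTIONS:
        raise ValueError(f"unsupported resolution: {resolution}")
```

You could call this at the top of generate_and_download_video, before submit_async.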
Step 5: Real-Time Progress Monitoring
    # Stream real-time logs from the GPU workers
    async for event in handler.iter_events(with_logs=True):
        if hasattr(event, "message"):
            print(f"   📝 {event.message}")
        elif hasattr(event, "logs"):
            for log in event.logs:
                print(f"   📋 {log.get('message', log)}")
This is one of fal.ai's best features. Instead of waiting blindly, iter_events opens a WebSocket connection and streams status updates from the GPU workers. You'll see messages like "Enqueuing", "Processing", and generation progress in real time.
Step 6: Retrieving Results
    # Retrieve the completed result
    result = await handler.get()
    print("-" * 60)
    print("✅ Video generation complete!")

    # Extract video metadata
    video_info = result.get("video", {})
    video_url = video_info.get("url")
    file_name = video_info.get("file_name", "grok_video.mp4")

    print(f"   📊 Resolution: {video_info.get('width')}x{video_info.get('height')}")
    print(f"   ⏱️ Duration: {video_info.get('duration', 'N/A')}s")
    print(f"   🎞️ FPS: {video_info.get('fps', 'N/A')}")
    print(f"   📹 Frames: {video_info.get('num_frames', 'N/A')}")
Once generation completes, handler.get() returns the full response. The API provides rich metadata including actual dimensions, frame count, and a temporary URL to download the video.
Response structure:
{
  "video": {
    "url": "https://...",
    "file_name": "generated_video.mp4",
    "width": 854,
    "height": 480,
    "duration": 6.0,
    "fps": 24,
    "num_frames": 144
  }
}
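The defensive .get pattern used above can be packaged into a small helper; here is a sketch (the helper name summarize_video is illustrative, not part of the fal.ai client) run against a response shaped like the sample:

```python
def summarize_video(result: dict) -> dict:
    """Pull the fields we care about from a fal.ai result, with safe defaults.

    Using .get everywhere means a missing or partial "video" object yields
    None / fallback values instead of a KeyError.
    """
    video = result.get("video", {})
    return {
        "url": video.get("url"),
        "file_name": video.get("file_name", "grok_video.mp4"),
        "size": f"{video.get('width')}x{video.get('height')}",
        "fps": video.get("fps", "N/A"),
    }
```

Feeding it an empty dict still returns usable defaults, which keeps the printing code above from crashing on unexpected responses.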
Step 7: Downloading with Progress
    # Download with progress tracking
    if video_url:
        output_path = Path(output_dir) / file_name
        print(f"\n📥 Downloading video to: {output_path}")

        response = requests.get(video_url, stream=True)
        response.raise_for_status()

        total_size = int(response.headers.get("content-length", 0))
        downloaded = 0

        with open(output_path, "wb") as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
                downloaded += len(chunk)
                if total_size > 0:
                    progress = (downloaded / total_size) * 100
                    print(f"\r   Progress: {progress:.1f}%", end="", flush=True)

        print(f"\n✅ Video saved to: {output_path}")
        return str(output_path)

    print("❌ No video URL found in the response")
    return None
We use streaming downloads with stream=True to handle large files efficiently without loading everything into memory.
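If you want friendlier progress output than a bare percentage, a small byte-formatting helper works well. This is my own optional addition, not part of the script above:

```python
def format_size(num_bytes: float) -> str:
    """Render a byte count as a human-readable string, e.g. 8192 -> '8.0 KB'."""
    for unit in ("B", "KB", "MB", "GB"):
        if num_bytes < 1024:
            return f"{num_bytes:.1f} {unit}"
        num_bytes /= 1024
    return f"{num_bytes:.1f} TB"
```

Inside the download loop you could then print f"{format_size(downloaded)} / {format_size(total_size)}" alongside the percentage.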
Step 8: The Main Section
def main():
    if not os.getenv("FAL_KEY"):
        print("❌ Error: FAL_KEY not found in environment variables.")
        print("   Please add your API key to the .env file:")
        print("   FAL_KEY=your_api_key_here")
        return

    # Interactive prompt input
    prompt = input("🎬 Enter your video prompt: ").strip()
    if not prompt:
        prompt = "A cat playing with a ball of yarn in a cozy living room"
        print(f"   Using default prompt: {prompt}")

    # Configuration options with sensible defaults
    print("\n📐 Video settings (press Enter for defaults):")
    duration_input = input("   Duration in seconds (6): ").strip()
    try:
        duration = int(duration_input) if duration_input else 6
    except ValueError:
        print("   Invalid duration, falling back to 6 seconds.")
        duration = 6

    print("   Aspect ratios: 16:9, 4:3, 3:2, 1:1, 2:3, 3:4, 9:16")
    aspect_ratio = input("   Aspect ratio (16:9): ").strip() or "16:9"

    print("   Resolutions: 480p, 720p")
    resolution = input("   Resolution (480p): ").strip() or "480p"
    print()

    # Execute the async generation pipeline
    video_path = asyncio.run(
        generate_and_download_video(
            prompt=prompt,
            duration=duration,
            aspect_ratio=aspect_ratio,
            resolution=resolution,
        )
    )

    if video_path:
        print(f"\n🎉 Done! Your video is ready at: {video_path}")


if __name__ == "__main__":
    main()
The main() function provides an interactive CLI. It validates the API key first, then collects user input with sensible defaults. asyncio.run() bridges the sync/async boundary, executing our async function from synchronous code.
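If you'd rather drive the script from flags than interactive input, the same questions map naturally onto argparse. This is a hypothetical variant, not part of the tutorial script; the flag names are my own:

```python
import argparse


def build_parser() -> argparse.ArgumentParser:
    """CLI flags mirroring the interactive questions in main()."""
    parser = argparse.ArgumentParser(
        description="Generate a video with Grok Imagine via fal.ai"
    )
    parser.add_argument("prompt", help="Text description of the desired video")
    parser.add_argument("--duration", type=int, default=6,
                        help="Video length in seconds (6-15)")
    parser.add_argument("--aspect-ratio", dest="aspect_ratio", default="16:9",
                        help="e.g. 16:9, 9:16, 1:1")
    parser.add_argument("--resolution", default="480p", choices=["480p", "720p"])
    return parser
```

In this variant, main() would call build_parser().parse_args() and pass the resulting fields straight into asyncio.run(generate_and_download_video(...)).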
Complete Code
Here’s everything together, ready to copy and run:
import asyncio
import os
from pathlib import Path

import fal_client
import requests
from dotenv import load_dotenv

load_dotenv()


async def generate_and_download_video(
    prompt: str,
    output_dir: str = "videos",
    duration: int = 6,
    aspect_ratio: str = "16:9",
    resolution: str = "480p",
) -> str:
    Path(output_dir).mkdir(parents=True, exist_ok=True)

    print(f"🎬 Generating video with prompt: {prompt}")
    print(f"   Duration: {duration}s | Aspect Ratio: {aspect_ratio} | Resolution: {resolution}")
    print("-" * 60)

    handler = await fal_client.submit_async(
        "xai/grok-imagine/text-to-video",
        arguments={
            "prompt": prompt,
            "duration": duration,
            "aspect_ratio": aspect_ratio,
            "resolution": resolution,
        },
    )

    print(f"📤 Request submitted. Request ID: {handler.request_id}")
    print("⏳ Waiting for video generation...")

    async for event in handler.iter_events(with_logs=True):
        if hasattr(event, "message"):
            print(f"   📝 {event.message}")
        elif hasattr(event, "logs"):
            for log in event.logs:
                print(f"   📋 {log.get('message', log)}")

    result = await handler.get()
    print("-" * 60)
    print("✅ Video generation complete!")

    video_info = result.get("video", {})
    video_url = video_info.get("url")
    file_name = video_info.get("file_name", "grok_video.mp4")

    print(f"   📊 Resolution: {video_info.get('width')}x{video_info.get('height')}")
    print(f"   ⏱️ Duration: {video_info.get('duration', 'N/A')}s")
    print(f"   🎞️ FPS: {video_info.get('fps', 'N/A')}")
    print(f"   📹 Frames: {video_info.get('num_frames', 'N/A')}")

    if video_url:
        output_path = Path(output_dir) / file_name
        print(f"\n📥 Downloading video to: {output_path}")

        response = requests.get(video_url, stream=True)
        response.raise_for_status()

        total_size = int(response.headers.get("content-length", 0))
        downloaded = 0

        with open(output_path, "wb") as f:
            for chunk in response.iter_content(chunk_size=8192):
                f.write(chunk)
                downloaded += len(chunk)
                if total_size > 0:
                    progress = (downloaded / total_size) * 100
                    print(f"\r   Progress: {progress:.1f}%", end="", flush=True)

        print(f"\n✅ Video saved to: {output_path}")
        return str(output_path)

    print("❌ No video URL found in the response")
    return None


def main():
    if not os.getenv("FAL_KEY"):
        print("❌ Error: FAL_KEY not found in environment variables.")
        print("   Please add your API key to the .env file:")
        print("   FAL_KEY=your_api_key_here")
        return

    prompt = input("🎬 Enter your video prompt: ").strip()
    if not prompt:
        prompt = "A cat playing with a ball of yarn in a cozy living room"
        print(f"   Using default prompt: {prompt}")

    print("\n📐 Video settings (press Enter for defaults):")
    duration_input = input("   Duration in seconds (6): ").strip()
    try:
        duration = int(duration_input) if duration_input else 6
    except ValueError:
        print("   Invalid duration, falling back to 6 seconds.")
        duration = 6

    print("   Aspect ratios: 16:9, 4:3, 3:2, 1:1, 2:3, 3:4, 9:16")
    aspect_ratio = input("   Aspect ratio (16:9): ").strip() or "16:9"

    print("   Resolutions: 480p, 720p")
    resolution = input("   Resolution (480p): ").strip() or "480p"
    print()

    video_path = asyncio.run(
        generate_and_download_video(
            prompt=prompt,
            duration=duration,
            aspect_ratio=aspect_ratio,
            resolution=resolution,
        )
    )

    if video_path:
        print(f"\n🎉 Done! Your video is ready at: {video_path}")


if __name__ == "__main__":
    main()
Tips for Better Results
Write descriptive prompts. Include subject, action, environment, and style:
A steaming cup of coffee on a wooden table, morning light streaming through window blinds, gentle steam rising, cinematic depth of field
Match aspect ratio to platform. Use 9:16 for TikTok/Reels, 16:9 for YouTube, 1:1 for Instagram feed.
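That platform-to-ratio mapping is easy to encode if you generate for several platforms programmatically (the dictionary and helper name below are my own illustration):

```python
PLATFORM_ASPECT_RATIOS = {
    "tiktok": "9:16",
    "reels": "9:16",
    "youtube": "16:9",
    "instagram_feed": "1:1",
}


def aspect_ratio_for(platform: str) -> str:
    """Look up the recommended aspect ratio, defaulting to 16:9."""
    return PLATFORM_ASPECT_RATIOS.get(platform.lower(), "16:9")
```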
Start with 480p. It’s faster for iteration. Switch to 720p for final renders.
Next Steps
This foundation can be extended into automated content pipelines, creative tools, or integrations with other services. fal.ai also offers additional tooling such as fal Workflows, which lets you chain multiple models together for deeper customization and automation. While this tutorial uses the Grok Imagine text-to-video model, you can easily adapt it to the other Grok Imagine endpoints:
https://fal.ai/models/xai/grok-imagine-image/
https://fal.ai/models/xai/grok-imagine-image/edit
https://fal.ai/models/xai/grok-imagine-video/text-to-video
https://fal.ai/models/xai/grok-imagine-video/image-to-video
https://fal.ai/models/xai/grok-imagine-video/edit-video
Hope this was helpful! Happy Prompting!!