Everyone's talking about Seedance 2.0 — ByteDance's new AI video model that generates cinema-grade video with native audio sync. But most articles stop at "look what it can do." This one shows you how to actually build with it — real API calls, real code, real results.
What you'll walk away with:
- Generate AI videos from text, images, video references, and audio — all via API
- Poll async tasks and retrieve your output video URL
- Set up MCP integration so your AI editor can call the API for you
- Use Cursor Skills to automate entire storyboard-to-video workflows
Prerequisites
- Register at SuTui AI (the API platform for Seedance 2.0)
- Create an API Key on the API Key page
- Grab some credits — a 5-second Fast video costs 50 credits (~$0.50)
Base URL: https://api.xskill.ai
Auth header:
Authorization: Bearer sk-your-api-key
Core Concepts: Two Modes, One Model
Seedance 2.0 (st-ai/super-seed2) supports two function modes:
| Mode | What It Does | Input Assets |
|---|---|---|
| omni_reference | Multi-modal mixing: combine images, videos, and audio freely | image_files, video_files, audio_files |
| first_last_frames | Control start/end frames for smooth transitions | filePaths |
The @ Reference System
Upload assets and reference them by position in your prompt:
@image_file_1 character performs the dance from @video_file_1, cinematic lighting
- @image_file_N → the N-th element of the image_files array
- @video_file_N → the N-th element of the video_files array
- @audio_file_N → the N-th element of the audio_files array
Up to 9 images + 3 videos + 3 audio clips in a single request.
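These limits are easy to enforce client-side before spending credits. A minimal sketch (the limits come from the text above; the validate_assets helper is our own name, not part of the API):

```python
# Per-request asset limits, per the Seedance 2.0 docs above.
LIMITS = {"image_files": 9, "video_files": 3, "audio_files": 3}

def validate_assets(params: dict) -> None:
    """Raise ValueError if any asset array exceeds its per-request limit."""
    for key, limit in LIMITS.items():
        count = len(params.get(key, []))
        if count > limit:
            raise ValueError(f"{key}: {count} assets exceeds the limit of {limit}")

validate_assets({"image_files": ["a.png"] * 9})  # OK: exactly at the limit
```

Run this check before building the request body and you fail fast locally instead of burning a round trip on a request the API will reject.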
Step 1: Generate Your First Video (cURL)
Two endpoints, two steps — create a task, then poll for results.
Create a task
curl -X POST "https://api.xskill.ai/api/v3/tasks/create" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-your-api-key" \
-d '{
"model": "st-ai/super-seed2",
"params": {
"model": "seedance_2.0_fast",
"prompt": "A golden sunset over the ocean, waves gently crashing on shore, cinematic drone shot",
"functionMode": "first_last_frames",
"ratio": "16:9",
"duration": 5
}
}'
Response:
{
"code": 200,
"data": {
"task_id": "task_abc123",
"price": 50
}
}
Poll for results
curl -X POST "https://api.xskill.ai/api/v3/tasks/query" \
-H "Content-Type: application/json" \
-H "Authorization: Bearer sk-your-api-key" \
-d '{"task_id": "task_abc123"}'
When complete (note that the output video URL is returned under the images key):
{
"code": 200,
"data": {
"status": "completed",
"result": {
"output": {
"images": ["https://your-video-output.mp4"]
}
}
}
}
Status values:
pending → processing → completed (or failed)
Step 2: Python — Full Create + Poll Loop
Here's a reusable create-and-poll pattern you can drop into any Python project:
import requests
import time
API_KEY = "sk-your-api-key"
BASE = "https://api.xskill.ai"
HEADERS = {
"Content-Type": "application/json",
"Authorization": f"Bearer {API_KEY}"
}
def create_video(prompt, **kwargs):
"""Create a Seedance 2.0 video task."""
payload = {
"model": "st-ai/super-seed2",
"params": {
"model": kwargs.get("speed", "seedance_2.0_fast"),
"prompt": prompt,
"functionMode": kwargs.get("mode", "first_last_frames"),
"ratio": kwargs.get("ratio", "16:9"),
"duration": kwargs.get("duration", 5),
}
}
# Add optional asset arrays
for key in ("image_files", "video_files", "audio_files", "filePaths"):
if key in kwargs:
payload["params"][key] = kwargs[key]
    resp = requests.post(
        f"{BASE}/api/v3/tasks/create", json=payload, headers=HEADERS, timeout=30
    )
    resp.raise_for_status()
    return resp.json()["data"]["task_id"]
def poll_result(task_id, interval=5, timeout=300):
"""Poll until task completes or timeout."""
elapsed = 0
while elapsed < timeout:
resp = requests.post(
f"{BASE}/api/v3/tasks/query",
json={"task_id": task_id},
headers=HEADERS
)
        resp.raise_for_status()
        data = resp.json()["data"]
status = data["status"]
print(f"[{elapsed}s] Status: {status}")
if status == "completed":
return data["result"]["output"]["images"][0]
        if status == "failed":
            raise RuntimeError(f"Task {task_id} failed")
time.sleep(interval)
elapsed += interval
raise TimeoutError("Task timed out")
# --- Example: Text-to-Video ---
task_id = create_video("A cat wearing sunglasses walks down a neon-lit Tokyo street at night")
video_url = poll_result(task_id)
print(f"Video ready: {video_url}")
Step 3: JavaScript — Async/Await Pattern
const API_KEY = "sk-your-api-key";
const BASE = "https://api.xskill.ai";
const headers = {
"Content-Type": "application/json",
Authorization: `Bearer ${API_KEY}`,
};
async function createVideo(prompt, opts = {}) {
const resp = await fetch(`${BASE}/api/v3/tasks/create`, {
method: "POST",
headers,
body: JSON.stringify({
model: "st-ai/super-seed2",
params: {
model: opts.speed || "seedance_2.0_fast",
prompt,
functionMode: opts.mode || "first_last_frames",
ratio: opts.ratio || "16:9",
duration: opts.duration || 5,
...opts.assets,
},
}),
});
  if (!resp.ok) throw new Error(`Create task failed: HTTP ${resp.status}`);
  const { data } = await resp.json();
  return data.task_id;
}
async function pollResult(taskId, interval = 5000, timeout = 300000) {
const start = Date.now();
while (Date.now() - start < timeout) {
const resp = await fetch(`${BASE}/api/v3/tasks/query`, {
method: "POST",
headers,
body: JSON.stringify({ task_id: taskId }),
});
    if (!resp.ok) throw new Error(`Query failed: HTTP ${resp.status}`);
    const { data } = await resp.json();
console.log(`Status: ${data.status}`);
if (data.status === "completed")
return data.result.output.images[0];
if (data.status === "failed")
throw new Error("Task failed");
await new Promise((r) => setTimeout(r, interval));
}
throw new Error("Timeout");
}
// --- Example: Image-to-Video (Omni Reference) ---
const taskId = await createVideo(
"@image_file_1 character slowly turns and smiles, breeze moves hair, golden hour lighting",
{
mode: "omni_reference",
assets: {
image_files: ["https://your-character-image.png"],
},
}
);
const videoUrl = await pollResult(taskId);
console.log("Video ready:", videoUrl);
Real-World Use Cases
Here are the patterns that unlock Seedance 2.0's full power:
Motion Transfer (Image + Video)
Make a character from a photo perform actions from a reference video:
{
"prompt": "@image_file_1 character performs following @video_file_1 motion and camera style, cinematic lighting",
"functionMode": "omni_reference",
"image_files": ["https://character.png"],
"video_files": ["https://dance-reference.mp4"]
}
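In Python, the same motion-transfer request is just the params above wrapped in the create-task envelope from Step 1. A sketch with placeholder URLs, building only the payload so you can inspect it before POSTing:

```python
# Motion-transfer request: same fields as the JSON above, wrapped in the
# create-task envelope from Step 1. URLs are placeholders.
payload = {
    "model": "st-ai/super-seed2",
    "params": {
        "model": "seedance_2.0_fast",
        "prompt": (
            "@image_file_1 character performs following @video_file_1 "
            "motion and camera style, cinematic lighting"
        ),
        "functionMode": "omni_reference",
        "image_files": ["https://character.png"],
        "video_files": ["https://dance-reference.mp4"],
    },
}
# POST this to /api/v3/tasks/create, then poll /api/v3/tasks/query as in Step 2.
```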
Audio-Driven Lip Sync (Image + Audio)
A character speaks with phoneme-level lip sync in 8+ languages:
{
"prompt": "@image_file_1 character speaks naturally, matching @audio_file_1 with expressive lip sync",
"functionMode": "omni_reference",
"image_files": ["https://character.png"],
"audio_files": ["https://voiceover.mp3"]
}
Multi-Character Scene
Two characters interacting in one shot:
{
"prompt": "@image_file_1 and @image_file_2 face each other in conversation, warm indoor lighting",
"functionMode": "omni_reference",
"image_files": ["https://person-a.png", "https://person-b.png"]
}
First/Last Frame Transition
Control exactly where your video starts and ends:
{
"prompt": "Smooth cinematic transition, elegant camera movement, natural lighting shift",
"functionMode": "first_last_frames",
"filePaths": ["https://sunrise.png", "https://sunset.png"],
"duration": 10
}
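Note that this mode takes filePaths rather than the image_files/video_files arrays used by omni_reference. A payload sketch (URLs are placeholders, and the array ordering of first frame then last frame is our reading of the example above):

```python
# First/last-frame transition payload. We assume filePaths is ordered
# [first_frame, last_frame], matching the sunrise/sunset example above.
payload = {
    "model": "st-ai/super-seed2",
    "params": {
        "model": "seedance_2.0_fast",
        "prompt": (
            "Smooth cinematic transition, elegant camera movement, "
            "natural lighting shift"
        ),
        "functionMode": "first_last_frames",
        "filePaths": ["https://sunrise.png", "https://sunset.png"],
        "duration": 10,
    },
}
```

Submit and poll it exactly as in Step 2.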
Level Up: MCP Integration
Don't want to write API calls manually? Use MCP (Model Context Protocol) to let your AI editor handle everything.
One-Click Install
Mac / Linux:
curl -fsSL https://api.xskill.ai/install-mcp.sh | bash -s -- YOUR_API_KEY
Windows (PowerShell):
irm https://api.xskill.ai/install-mcp.ps1 | iex
Manual Setup (Cursor)
Create .cursor/mcp.json in your project root:
{
"mcpServers": {
"sutui-ai": {
"command": "npx",
"args": [
"-y",
"@anthropic/mcp-client",
"https://api.xskill.ai/api/v3/mcp-http"
],
"env": {
"SUTUI_API_KEY": "YOUR_API_KEY"
}
}
}
}
Now you can just chat:
You: Generate a 10-second video of an astronaut walking on Mars, 16:9, cinematic style
Agent: Sure! Submitting to Seedance 2.0... your video is ready: [link]
Works with Cursor, Claude Desktop, and any MCP-compatible editor.
Bonus: AI Storyboarding with Cursor Skills
This is where it gets wild. Install the Seedance Storyboard Skill and the AI Agent automates the entire creative pipeline:
Your Idea → Info Gathering → Reference Image Generation → Storyboard → Video → Done
Install
git clone https://github.com/siliconflow/seedance2-api.git
cp -r seedance2-api/.cursor/skills/seedance-storyboard/ \
your-project/.cursor/skills/seedance-storyboard/
What Happens
Just describe what you want in plain English:
You: Make a 15-second coffee brand commercial for "Lucky Coffee"
Agent:
1. Gathers info (duration, ratio, style)
2. Generates reference images with Seedream 4.5
3. Builds a professional shot list:
0-3s: Macro — coffee pouring, steam rising
3-6s: Medium orbit — hand holding cup, sunlight
6-10s: Push into coffee beans falling
10-12s: Black transition
12-15s: Brand text "Lucky Coffee" fades in
4. Submits to Seedance 2.0
5. Returns your finished video
The Skill includes built-in storyboard templates (narrative, product, action, scenic) and a camera movement glossary — the Agent picks the best fit automatically.
Pricing Quick Reference
| Mode | Cost |
|---|---|
| Fast 5s (text-to-video) | 50 credits |
| Fast 5s (with video input) | 100 credits |
| Standard 5s (with video input) | 200 credits |
| Per-second: Fast (no video) | 10 credits/sec |
| Per-second: Fast (with video) | 20 credits/sec |
Duration range: 4-15 seconds. Speed modes: seedance_2.0_fast or seedance_2.0 (standard).
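The Fast rows reduce to a simple per-second formula. A sanity-check sketch (Standard per-second rates aren't listed in the table, so only Fast is covered; fast_cost is our own name):

```python
# Fast-mode credit cost, from the per-second rates in the table:
# 10 credits/sec without video input, 20 credits/sec with video input.
def fast_cost(duration: int, has_video_input: bool = False) -> int:
    if not 4 <= duration <= 15:
        raise ValueError("duration must be between 4 and 15 seconds")
    return duration * (20 if has_video_input else 10)

print(fast_cost(5))        # 50 -- matches the Fast 5s text-to-video row
print(fast_cost(5, True))  # 100 -- matches the Fast 5s with-video row
```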
Wrapping Up
Seedance 2.0 is not just another AI video model — the multi-modal input system, native audio sync, and @ reference syntax make it genuinely programmable. Whether you're building a video generation feature into your SaaS, automating content pipelines, or just experimenting — the API is straightforward and the MCP/Skills layer makes it even easier.
Resources:
- GitHub Repo (code examples + Cursor Skill)
- SuTui AI Platform
- Seedance 2.0 Model Page
- API Key Management
Have questions or built something cool with Seedance 2.0? Drop a comment below — I'd love to see what you're making.