Aloysius Chan

Posted on Mar 19 • Originally published at insightginie.com

Understanding the OpenClaw rawugc‑api Skill: AI Video, Image and Music Generation Made Simple


Introduction

The OpenClaw skill named rawugc‑api provides agents with a direct gateway to
the RawUGC platform, a suite of generative AI tools for video, image, and
music creation. By wrapping the RawUGC REST API, the skill lets you generate
AI‑powered media, manage custom characters, build audience personas, and
schedule social‑media content without leaving your agent workflow.

What Is the OpenClaw rawugc‑api Skill?

This skill is a procedural knowledge module that teaches an autonomous agent
how to call any endpoint of the RawUGC API. It does not contain the API
itself; instead it defines the required environment variable, authentication
headers, base URL, versioning rules, and the exact request/response shapes for
each endpoint. When an agent follows the skill, it can construct correct HTTP
requests, handle asynchronous polling, and interpret results.

Key Features

Unified access to video, image, and music generation endpoints.
Support for file uploads that generate temporary URLs usable in generation calls.
CRUD operations for personas, messaging templates, and custom characters.
Built‑in handling of API versioning via the RawUGC-Version header.
Automatic credit tracking and balance reporting in responses.
Guidance on handling asynchronous jobs with polling GET endpoints.

Authentication and Setup

The skill requires a single environment variable called RAWUGC_API_KEY. This
variable must hold a bearer token obtained from the RawUGC dashboard. Agents
should read the variable at runtime, place it in the Authorization header as
Bearer <RAWUGC_API_KEY>, and never log or hard‑code it. If the variable is missing or empty,
the skill instructs the agent to inform the user to obtain a key from the
dashboard.
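
The rule above can be sketched in a few lines of Python. This is a minimal illustration, not code from the skill: the helper name build_auth_headers, the MissingApiKeyError exception, and the JSON Content-Type header are assumptions made for the example.

```python
import os


class MissingApiKeyError(RuntimeError):
    """Raised when RAWUGC_API_KEY is unset or empty (hypothetical helper type)."""


def build_auth_headers(env=None):
    """Return request headers that carry the bearer token.

    `env` defaults to os.environ; passing a plain dict makes the function
    easy to exercise without touching the real environment.
    """
    env = os.environ if env is None else env
    api_key = (env.get("RAWUGC_API_KEY") or "").strip()
    if not api_key:
        # Mirrors the skill's guidance: tell the user where to get a key.
        raise MissingApiKeyError(
            "RAWUGC_API_KEY is not set; obtain a key from the RawUGC dashboard."
        )
    # The key is only placed in the header; it is never printed or logged.
    return {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",  # assumption: JSON request bodies
    }
```

Reading the key inside the function, rather than at import time, keeps the secret out of module-level state and lets the agent report a missing key at the moment a call is attempted.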

Video Generation

The skill exposes the POST /videos/generate endpoint for starting a video job.
Required fields are model (one of sora-2-text-to-video,
sora-2-image-to-video, kling-2.6/motion-control, veo3, or veo3_fast) and, depending on the model,
either a prompt string for text‑to‑video or imageUrls for image‑to‑video.
Optional parameters include aspectRatio, nFrames (Sora only),
selectedCharacter, characterOrientation, mode, and more. The response returns
a videoId, status, creditsUsed, newBalance, and estimatedCompletionTime. To
check progress, agents call GET /videos/:videoId; when status becomes
succeeded, the response includes a url to the generated video. Additional
endpoints allow adding captions (POST /videos/captions) or text overlays (POST
/videos/overlay).
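
As a rough sketch of the request body described above, the Python function below assembles a payload for POST /videos/generate. The function name, the check that at least one of prompt or imageUrls is present, and the prefix test used to detect Sora models are assumptions for illustration; the field names and model identifiers are the ones listed in this section.

```python
VIDEO_MODELS = {
    "sora-2-text-to-video",
    "sora-2-image-to-video",
    "kling-2.6/motion-control",
    "veo3",
    "veo3_fast",
}


def build_video_payload(model, prompt=None, image_urls=None,
                        aspect_ratio=None, n_frames=None,
                        selected_character=None):
    """Build a JSON-serializable body for a video generation job."""
    if model not in VIDEO_MODELS:
        raise ValueError(f"unknown video model: {model!r}")
    # Assumption: every model needs a prompt, imageUrls, or both.
    if not prompt and not image_urls:
        raise ValueError("provide a prompt, imageUrls, or both")
    # The article marks nFrames as Sora only.
    if n_frames is not None and not model.startswith("sora-"):
        raise ValueError("nFrames applies to Sora models only")

    optional = {
        "prompt": prompt,
        "imageUrls": image_urls,
        "aspectRatio": aspect_ratio,
        "nFrames": n_frames,
        "selectedCharacter": selected_character,
    }
    payload = {"model": model}
    # Leave out anything the caller did not set.
    payload.update({k: v for k, v in optional.items() if v is not None})
    return payload
```

Validating locally before sending avoids spending a round trip on a request the API would reject anyway.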

Image Generation

Image creation uses POST /images/generate. The model can be nano-banana-2 for
text‑to‑image (4 credits) or google/nano-banana-edit for image editing (2
credits). Required fields are model and prompt. For editing, imageUrls must
contain the source images. Optional parameters let you set aspectRatio,
imageSize, resolution, outputFormat, and enable googleSearch grounding for the
nano-banana-2 model. The response returns an imageId and similar metadata to
video jobs. Status is retrieved via GET /images/:imageId, and a completed
image provides a direct url.
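
The model rules in this section (editing needs source images, googleSearch applies only to nano-banana-2) can be checked before a request is sent. The sketch below is one possible way to do that in Python; the function name, the boolean type of googleSearch, and the choice to return the expected credit cost alongside the payload are assumptions, and the credit figures are simply the ones quoted above.

```python
# Credit costs as quoted in this article; treat them as subject to change.
IMAGE_MODEL_CREDITS = {
    "nano-banana-2": 4,            # text-to-image
    "google/nano-banana-edit": 2,  # image editing
}


def build_image_payload(model, prompt, image_urls=None,
                        google_search=False, **options):
    """Return (payload, expected_credits) for POST /images/generate.

    Extra keyword arguments (for example aspectRatio or outputFormat)
    are copied into the payload unchanged.
    """
    if model not in IMAGE_MODEL_CREDITS:
        raise ValueError(f"unknown image model: {model!r}")
    if not prompt:
        raise ValueError("prompt is required")
    if model == "google/nano-banana-edit" and not image_urls:
        raise ValueError("editing requires imageUrls with the source images")
    if google_search and model != "nano-banana-2":
        raise ValueError("googleSearch grounding is only for nano-banana-2")

    payload = {"model": model, "prompt": prompt}
    if image_urls:
        payload["imageUrls"] = list(image_urls)
    if google_search:
        payload["googleSearch"] = True  # assumption: a boolean flag
    payload.update(options)
    return payload, IMAGE_MODEL_CREDITS[model]
```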

Music Generation

The skill wraps POST /music/generate, which creates AI music using Suno
models. The only required field is prompt, which describes the desired track.
The optional model field accepts V3_5, V4, V4_5, V4_5PLUS, V4_5ALL, or V5 (the
default). Setting instrumental to false adds vocals; title and style enable
custom mode. Each
generation costs 3 credits. The response includes musicId, status,
creditsUsed, and estimatedCompletionTime. Poll GET /music/:musicId to retrieve
audioUrl, albumArtUrl, duration, and other metadata when the track is ready.
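
One way to consume the polling response is to turn it into a small typed record once the track is ready. The following Python sketch assumes the GET /music/:musicId body is a flat JSON object that echoes musicId and uses the status values succeeded and failed; the MusicTrack class and parse_music_status function are hypothetical names, and the unit of duration is not stated in the source.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass(frozen=True)
class MusicTrack:
    music_id: str
    audio_url: str
    album_art_url: Optional[str]
    duration: Optional[float]  # unit not specified in the article


def parse_music_status(response: dict) -> Optional[MusicTrack]:
    """Return a MusicTrack when the job succeeded, None while it is pending.

    Raises RuntimeError if the job reports a failed status.
    """
    status = response.get("status")
    if status == "failed":
        raise RuntimeError(f"music job {response.get('musicId')} failed")
    if status != "succeeded":
        return None  # still in progress; the caller should poll again
    return MusicTrack(
        music_id=response["musicId"],
        audio_url=response["audioUrl"],
        album_art_url=response.get("albumArtUrl"),
        duration=response.get("duration"),
    )
```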

File Upload

Before using external media in generation requests, agents can upload a file
via POST /upload. The endpoint accepts multipart/form-data with a file field;
allowed types are MP4, QuickTime, WebM video, and PNG, JPEG, WebP images.
Maximum size is 100 MB. The response supplies a temporary URL, contentType,
and size, which can be referenced in the imageUrls or videoUrls parameters of
generation calls.
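
A pre-flight check can reject unsupported files before any bytes are sent. The Python sketch below encodes the limits listed above; the helper name, the extension-to-MIME mapping, and the reading of 100 MB as 100 × 1024 × 1024 bytes are assumptions, since the source does not say how the service identifies file types or counts megabytes.

```python
from pathlib import Path

# Assumption: "100 MB" is interpreted as 100 MiB.
MAX_UPLOAD_BYTES = 100 * 1024 * 1024

# Assumed mapping from file extension to the allowed content types.
ALLOWED_TYPES = {
    ".mp4": "video/mp4",
    ".mov": "video/quicktime",
    ".webm": "video/webm",
    ".png": "image/png",
    ".jpg": "image/jpeg",
    ".jpeg": "image/jpeg",
    ".webp": "image/webp",
}


def check_upload(filename, size_bytes):
    """Return the content type to send, or raise ValueError if not allowed."""
    ext = Path(filename).suffix.lower()
    content_type = ALLOWED_TYPES.get(ext)
    if content_type is None:
        raise ValueError(f"unsupported file type: {ext or '(no extension)'}")
    if size_bytes > MAX_UPLOAD_BYTES:
        raise ValueError(
            f"file is {size_bytes} bytes; the limit is {MAX_UPLOAD_BYTES}"
        )
    return content_type
```

For a real file, the size can be taken from Path(filename).stat().st_size before calling the check.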

Characters

RawUGC provides a library of AI‑driven characters. The skill exposes GET
/characters to list all built‑in and custom characters, returning username,
displayName, description, videoPreviewUrl, type (admin/user), and activity
status. To fetch a single character, use GET /characters/:characterId. This
enables agents to inject a specific persona into video generation via the
selectedCharacter parameter.
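
To illustrate how a listing might feed the selectedCharacter parameter, here is a small Python sketch that searches an already-decoded character list. It assumes the list is a sequence of JSON objects with the field names given above; the function name pick_character is hypothetical.

```python
def pick_character(characters, display_name, character_type=None):
    """Return the username of the first character matching display_name.

    `character_type` optionally restricts the search to "admin" or "user"
    entries. Returns None when nothing matches.
    """
    wanted = display_name.casefold()
    for character in characters:
        if character_type is not None and character.get("type") != character_type:
            continue
        if character.get("displayName", "").casefold() == wanted:
            return character.get("username")
    return None
```

The returned username is the value an agent would then pass as selectedCharacter when starting a video job.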

Personas (CRUD)

Personas define target audiences for content planning. GET /personas returns a
paginated list with count. To create a new persona, POST /personas with a name
(max 200) and description (max 5000). The response includes the newly created
_id. Individual personas can be retrieved, updated (PATCH), or removed
(DELETE) using /personas/:personaId. This helps agents maintain a library of
audience profiles for later campaign planning.
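
As an illustration of the create call, the Python sketch below builds (but does not send) a POST /personas request with the standard library. The function name is hypothetical, the base URL is the one given later in this article, and treating the 200 and 5000 limits as character counts is an assumption.

```python
import json
from urllib.request import Request

BASE_URL = "https://rawugc.com/api/v1"


def build_create_persona_request(api_key, name, description, api_version=None):
    """Return a urllib Request for POST /personas without sending it."""
    # Assumption: the documented maximums are character counts.
    if not name or len(name) > 200:
        raise ValueError("name must be 1 to 200 characters")
    if len(description) > 5000:
        raise ValueError("description must be at most 5000 characters")

    headers = {
        "Authorization": f"Bearer {api_key}",
        "Content-Type": "application/json",
    }
    if api_version:
        headers["RawUGC-Version"] = api_version  # optional version pin

    body = json.dumps({"name": name, "description": description}).encode("utf-8")
    return Request(f"{BASE_URL}/personas", data=body, headers=headers, method="POST")
```

Passing the returned object to urllib.request.urlopen would perform the call; keeping construction separate from sending makes the request easy to inspect first.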

Messaging (CRUD)

Messaging stores brand or positioning templates. As with personas, GET
/messaging lists all templates. Create a template with POST /messaging
supplying name (max 200) and body (max 5000). Updates and deletions follow the
same PATCH / DELETE pattern on /messaging/:messageId. Agents can retrieve
these templates to inject consistent copy into generated videos or social
posts.
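
The update and delete calls can be sketched the same way. In the Python example below, the function name is hypothetical, and it is an assumption that PATCH accepts a partial object containing only name and/or body; the path and length limits come from this section.

```python
import json
from urllib.request import Request

BASE_URL = "https://rawugc.com/api/v1"


def build_messaging_request(api_key, method, message_id, changes=None):
    """Return a urllib Request for PATCH or DELETE on /messaging/:messageId."""
    method = method.upper()
    if method not in {"PATCH", "DELETE"}:
        raise ValueError("method must be PATCH or DELETE")

    headers = {"Authorization": f"Bearer {api_key}"}
    data = None
    if method == "PATCH":
        changes = dict(changes or {})
        # Assumption: only name and body are updatable.
        if not changes or set(changes) - {"name", "body"}:
            raise ValueError("PATCH needs name and/or body, and nothing else")
        if len(changes.get("name", "")) > 200:
            raise ValueError("name must be at most 200 characters")
        if len(changes.get("body", "")) > 5000:
            raise ValueError("body must be at most 5000 characters")
        headers["Content-Type"] = "application/json"
        data = json.dumps(changes).encode("utf-8")

    url = f"{BASE_URL}/messaging/{message_id}"
    return Request(url, data=data, headers=headers, method=method)
```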

How to Call the Skill from an Agent

An agent using the skill follows these steps:

Read RAWUGC_API_KEY from the environment.
Set the Authorization header to Bearer <RAWUGC_API_KEY>.
Optionally add RawUGC-Version: 2026-03-06 to lock the API version.
Construct the JSON payload according to the endpoint’s specification (refer to the skill’s tables).
Send the request to the appropriate path under https://rawugc.com/api/v1/.
If the endpoint returns a job ID (videoId, imageId, musicId), poll the corresponding GET endpoint until status equals succeeded or failed.
Extract the result URL or metadata and continue the workflow.

The skill also provides error‑handling guidance: if RAWUGC_API_KEY is missing,
return a clear message asking the user to configure it; if an API call fails
with a non‑2xx status, surface the failCode and failMessage from the response.
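
The polling step above can be sketched as a small loop. This Python example is an illustration under stated assumptions: poll_job and JobFailed are hypothetical names, the interval and timeout defaults are arbitrary, and it assumes a failed job body may carry failCode and failMessage. The caller supplies fetch_status, a function that performs the GET and returns the decoded JSON object.

```python
import time


class JobFailed(RuntimeError):
    """Raised when a generation job ends with status "failed"."""


def poll_job(fetch_status, interval_s=5.0, timeout_s=600.0,
             sleep=time.sleep, clock=time.monotonic):
    """Call fetch_status() until the job succeeds, fails, or times out.

    `sleep` and `clock` are injectable so the loop can be exercised
    without real waiting.
    """
    deadline = clock() + timeout_s
    while True:
        job = fetch_status()
        status = job.get("status")
        if status == "succeeded":
            return job
        if status == "failed":
            # Assumption: failure details use the same failCode/failMessage
            # fields the skill mentions for non-2xx responses.
            raise JobFailed(f"{job.get('failCode')}: {job.get('failMessage')}")
        if clock() >= deadline:
            raise TimeoutError(f"job still {status!r} after {timeout_s} seconds")
        sleep(interval_s)
```

A fixed interval keeps the sketch short; an agent making many calls might prefer to lengthen the delay between attempts.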

Example: Generating a TikTok‑style Video

Imagine an agent tasked with creating a 15‑second promotional clip for a new
product. The workflow could be:

Upload a product demo video via POST /upload to obtain a temporary URL.
Call POST /videos/generate with model set to veo3, prompt describing the desired scene, imageUrls containing the uploaded URL, aspectRatio set to 9:16, and selectedCharacter set to rawugc.mia for a branded presenter.
Receive videoId and start polling GET /videos/:videoId.
When the video succeeds, add captions with POST /videos/captions (language en) to improve accessibility.
Optionally overlay a call‑to‑action using POST /videos/overlay with text "Learn More" at the bottom.
Download the final URL and schedule the clip for posting via a social‑media integration.

Each step consumes credits reported in the responses, allowing the agent to
track budget in real time.
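
Budget tracking across a workflow like this one could look like the following Python sketch. The CreditLedger class is hypothetical; it relies only on the creditsUsed and newBalance fields that the generation responses are said to include, and it treats a response without those fields as costing nothing.

```python
class CreditLedger:
    """Accumulates credit usage reported by RawUGC responses."""

    def __init__(self):
        self.spent = 0
        self.balance = None  # unknown until a response reports newBalance

    def record(self, response):
        """Fold one decoded API response into the running totals."""
        self.spent += response.get("creditsUsed", 0)
        if "newBalance" in response:
            self.balance = response["newBalance"]

    def can_afford(self, cost):
        """True if the last known balance covers `cost` (or is unknown)."""
        return self.balance is None or self.balance >= cost
```

Calling record after every generation request gives the agent a running total, and can_afford lets it stop before starting a job the balance would not cover.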

Best Practices and Tips

Always set the RawUGC-Version header to ensure reproducible results across runs.
Keep prompts concise yet descriptive; the models respond best to clear, specific instructions.
When generating multiple variants, reuse the same videoId or imageId as a reference to avoid re‑uploading assets.
Monitor credit usage via the newBalance field to prevent unexpected depletion.
Store generated URLs in secure temporary storage; RawUGC URLs may expire after a set period.
Use the characters endpoint to select a presenter that matches your brand’s tone before generating video.
For editing tasks, prefer google/nano-banana-edit when you need to modify an existing image; it costs fewer credits than a full regeneration.

Limitations and Considerations

The skill relies on the availability and pricing of the RawUGC service. If the
service experiences downtime, API calls will return errors that the agent must
handle. Some advanced features, such as custom model fine‑tuning, are not
exposed via the public API and therefore are not accessible through this
skill. Additionally, the skill does not abstract away the asynchronous nature
of generation; agents must implement polling logic or use webhook‑style
callbacks if supported by their environment. Finally, because the skill only
defines the procedural steps, any errors in constructing JSON payloads (e.g.,
mismatched data types) will lead to validation failures that the agent should
catch and report.
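
Reporting such failures can be kept to one small function. The Python sketch below assumes an error body is a flat JSON object carrying the failCode and failMessage fields mentioned earlier; the function name and the wording of the messages are illustrative only.

```python
import json


def describe_api_error(status_code, body):
    """Turn a non-2xx response into a short message for the user.

    `body` may be str or bytes; anything that is not a JSON object
    falls back to a generic message.
    """
    try:
        data = json.loads(body)
    except (TypeError, ValueError):
        data = None
    if isinstance(data, dict):
        code = data.get("failCode")
        message = data.get("failMessage")
        if code or message:
            return (
                f"RawUGC request failed (HTTP {status_code}): "
                f"{code or 'unknown'}: {message or 'no message'}"
            )
    return f"RawUGC request failed (HTTP {status_code}) with no error details"
```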

Conclusion

The OpenClaw rawugc‑api skill equips agents with a comprehensive, ready‑to‑use
interface to RawUGC’s generative AI suite. By following the skill’s guidance,
agents can autonomously create high‑quality videos, images, and music, manage
custom characters, define audience personas, and maintain consistent brand
messaging—all while keeping track of credit consumption and adhering to best
practices. Whether the goal is rapid content prototyping, automated
social‑media pipelines, or sophisticated multimedia campaigns, this skill
reduces the integration overhead and lets developers focus on creativity
rather than API boilerplate.

Skill can be found at:
https://github.com/openclaw/skills/tree/main/skills/tfcbot/ai-ugc/SKILL.md
