AI video generation looks simple from the user interface: enter a prompt, upload an image, wait a bit, and get a video. The backend is not that simple.
If you are building a product that calls an AI video API, you need to handle long-running jobs, retries, provider callbacks, file storage, user-facing status, and review before anything gets published. This article walks through one practical architecture for that workflow.
I will use SeeVido as an example of the kind of external AI video service a product might call, but the pattern applies to any provider that accepts prompt-to-video or image-to-video requests.
## The Problem: Video Generation Is Not a Normal Request
Most web requests are short. A user clicks a button, your server does some work, and the response comes back quickly.
AI video generation is different. It can take time, fail halfway, produce an output that needs review, or return a callback after the user has left the page. Treating it like a normal synchronous request usually creates a poor user experience and a hard-to-debug system.
Instead, model video generation as a job.
A job has:
- an owner;
- an input type, such as text-to-video or image-to-video;
- a prompt or source asset;
- a provider request ID;
- a current status;
- an event history;
- one or more output artifacts;
- a review decision.
That gives your product a reliable way to answer the most common support question: "What happened to my video?"
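Sketched as a plain record, the job might look like this (a minimal in-memory shape; field names are illustrative assumptions, not a provider schema):

```javascript
// A minimal in-memory representation of a video generation job.
// Field names are illustrative, not a provider schema.
function newVideoJob({ userId, workflowType, sourceAssetId = null }) {
  return {
    userId,                  // the owner
    workflowType,            // "text-to-video" or "image-to-video"
    sourceAssetId,           // source image for image-to-video, else null
    providerName: "seevido", // or any other AI video provider
    providerRequestId: null, // filled in once the provider accepts the job
    currentStatus: "queued",
    events: [],              // append-only history of status changes
    artifacts: [],           // output files, stored as metadata
    reviewDecision: null,    // "approved" | "rejected" | null
  };
}
```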
## A Useful Architecture
A minimal production-friendly system can look like this:
- Frontend submits a prompt or source image.
- API server creates a job record in the database.
- Worker picks up the job from a queue.
- Worker sends the request to the AI video provider.
- Provider returns a request ID.
- Provider sends status updates to a webhook.
- Completed files are copied to object storage.
- A review dashboard approves or rejects the output.
- User sees the final status in the app.
The important decision is separation. The API request should create the job, not wait for the generated video.
## Suggested Job States
Keep the status model boring and explicit.
```text
queued -> submitted -> processing -> completed -> approved
                                               -> rejected
                                   -> failed
```

You may also need:

- `canceled`
- `expired`
- `retrying`
Avoid using a single vague status such as `done`. A generated video may be technically completed but still not approved for publishing.
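One way to keep the model explicit is a small transition map that rejects any status change you have not deliberately allowed. A sketch, using the states above (adjust the map to your own lifecycle):

```javascript
// Allowed status transitions for a video generation job.
// Any transition not listed here is rejected.
const TRANSITIONS = {
  queued:     ["submitted", "canceled", "failed"],
  submitted:  ["processing", "failed"],
  processing: ["completed", "failed", "expired"],
  completed:  ["approved", "rejected"],
  failed:     ["retrying"],
  retrying:   ["submitted"],
};

function canTransition(from, to) {
  return (TRANSITIONS[from] || []).includes(to);
}
```

Enforcing this in one place makes illegal jumps (such as `completed` back to `queued`) show up as explicit errors instead of silent data corruption.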
## Database Tables
You do not need a complex event-sourcing system to start. A current-state table plus an event table is usually enough.
```sql
CREATE TABLE video_generation_jobs (
    id BIGINT PRIMARY KEY,
    public_id UNIQUEIDENTIFIER NOT NULL,
    user_id BIGINT NOT NULL,
    workflow_type VARCHAR(40) NOT NULL,
    provider_name VARCHAR(80) NOT NULL,
    provider_request_id VARCHAR(200) NULL,
    current_status VARCHAR(40) NOT NULL,
    source_asset_id BIGINT NULL,
    prompt_hash VARBINARY(32) NULL,
    created_at_utc DATETIME2 NOT NULL,
    updated_at_utc DATETIME2 NOT NULL,
    completed_at_utc DATETIME2 NULL
);

CREATE TABLE video_generation_events (
    id BIGINT PRIMARY KEY,
    job_id BIGINT NOT NULL,
    event_type VARCHAR(80) NOT NULL,
    previous_status VARCHAR(40) NULL,
    new_status VARCHAR(40) NULL,
    event_source VARCHAR(40) NOT NULL,
    message NVARCHAR(1000) NULL,
    error_code VARCHAR(100) NULL,
    created_at_utc DATETIME2 NOT NULL
);
```
The job table answers "where is it now?" The event table answers "how did it get there?"
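The worker and webhook code later in this article call `markStatus` and `addEvent` helpers. One hedged sketch of what they might do, with the data layer abstracted behind an injected `db` object (the real implementation should wrap both writes in one transaction):

```javascript
// Sketch of the status helpers used by the worker and webhook handlers.
// `db` is an injected in-memory stand-in; swap in your real data layer.
function makeStatusHelpers(db) {
  function addEvent(jobId, { type, previousStatus, newStatus, message, errorCode }) {
    db.events.push({
      jobId,
      eventType: type,
      previousStatus: previousStatus ?? null,
      newStatus: newStatus ?? null,
      message: message ?? null,
      errorCode: errorCode ?? null,
      createdAtUtc: new Date().toISOString(),
    });
  }

  function markStatus(jobId, newStatus, message, extra = {}) {
    const job = db.jobs.find((j) => j.id === jobId);
    if (!job) throw new Error(`Unknown job ${jobId}`);
    const previousStatus = job.currentStatus;
    job.currentStatus = newStatus; // update "where is it now?"...
    addEvent(jobId, {              // ...and record "how did it get there?"
      type: "status_changed",
      previousStatus,
      newStatus,
      message,
      errorCode: extra.errorCode,
    });
  }

  return { markStatus, addEvent };
}
```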
## API Endpoint: Create the Job
The first endpoint should validate the request, create a job, and return immediately.
```javascript
app.post("/api/video-jobs", async (req, res) => {
  const { workflowType, prompt, sourceAssetId } = req.body;

  if (!["text-to-video", "image-to-video"].includes(workflowType)) {
    return res.status(400).json({ error: "Unsupported workflow type" });
  }

  const job = await db.videoJobs.create({
    userId: req.user.id,
    workflowType,
    providerName: "seevido",
    currentStatus: "queued",
    sourceAssetId,
    promptHash: hashPrompt(prompt),
  });

  // Store the raw prompt separately with tighter access controls; the
  // worker's loadPrompt(job.id) reads it back. savePrompt is a
  // hypothetical helper -- only the hash lives in the job table.
  await savePrompt(job.id, prompt);

  await queue.publish("video.generate", { jobId: job.id });

  res.status(202).json({
    jobId: job.publicId,
    status: "queued",
  });
});
```
The `202 Accepted` response is intentional. It tells the frontend that the request has been accepted, not completed.
## Worker: Submit to the Provider
The worker owns provider communication. This keeps slow or unreliable external calls away from the request-response path.
```javascript
queue.consume("video.generate", async ({ jobId }) => {
  const job = await db.videoJobs.findById(jobId);
  // Ignore jobs that were canceled or already picked up (queue redelivery).
  if (!job || job.currentStatus !== "queued") return;

  await markStatus(job.id, "submitted", "Worker picked up job");

  try {
    const providerRequest = await seevidoClient.createVideo({
      workflowType: job.workflowType,
      prompt: await loadPrompt(job.id),
      sourceAsset: await loadSourceAsset(job.sourceAssetId),
      webhookUrl: `${process.env.APP_URL}/webhooks/video-provider`,
    });

    await db.videoJobs.update(job.id, {
      providerRequestId: providerRequest.id,
      currentStatus: "processing",
    });

    await addEvent(job.id, {
      type: "provider_request_created",
      newStatus: "processing",
      message: "Provider request created",
    });
  } catch (error) {
    await markStatus(job.id, "failed", error.message, {
      errorCode: error.code || "provider_submit_failed",
    });
  }
});
```
The provider client can point to SeeVido or any other AI video API. Keep it behind an interface so you can swap providers without rewriting the queue and review logic.
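One minimal way to set up that interface is a factory over clients that all share the same `createVideo(params)` shape. A sketch with illustrative names (the client implementations here are stubs, not a real SDK):

```javascript
// Tiny factory: look up a provider client by name. All clients share
// the same createVideo(params) interface, so the worker never needs
// provider-specific branches.
function makeProviderClient(providerName, clients) {
  const client = clients[providerName];
  if (!client) throw new Error(`Unknown provider: ${providerName}`);
  return client;
}

// Example registry with stubbed clients behind the same interface.
const clients = {
  seevido: { createVideo: async (params) => ({ id: "sv-123", params }) },
  other:   { createVideo: async (params) => ({ id: "ot-456", params }) },
};
```

Because the job row already stores `provider_name`, the worker can resolve the right client per job, which also makes gradual provider migrations possible.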
## Webhooks Need Idempotency
Webhooks are often delivered more than once. Your handler should treat duplicate events as normal.
Use a table like this:
```sql
CREATE TABLE provider_webhook_receipts (
    id BIGINT PRIMARY KEY,
    provider_name VARCHAR(80) NOT NULL,
    provider_event_id VARCHAR(200) NOT NULL,
    provider_request_id VARCHAR(200) NULL,
    received_at_utc DATETIME2 NOT NULL,
    payload_hash VARBINARY(32) NULL,
    CONSTRAINT uq_provider_event UNIQUE (provider_name, provider_event_id)
);
```
Then insert the webhook event before processing it.
```javascript
app.post("/webhooks/video-provider", async (req, res) => {
  // Verify the provider's signature on the raw body before trusting
  // the payload (see the security notes below).
  const event = req.body;

  const inserted = await tryInsertWebhookReceipt({
    providerName: "seevido",
    providerEventId: event.id,
    providerRequestId: event.requestId,
    payloadHash: hashPayload(req.body),
  });

  // Duplicate delivery: acknowledge it without reprocessing.
  if (!inserted) {
    return res.status(200).json({ ok: true, duplicate: true });
  }

  const job = await db.videoJobs.findByProviderRequestId(event.requestId);
  if (!job) return res.status(202).json({ ok: true });

  if (event.status === "completed") {
    await copyArtifactsToStorage(event.outputs);
    await markStatus(job.id, "completed", "Provider completed job");
  }

  if (event.status === "failed") {
    await markStatus(job.id, "failed", event.message, {
      errorCode: event.errorCode,
    });
  }

  res.status(200).json({ ok: true });
});
```
Returning 200 for duplicates prevents unnecessary retries.
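A sketch of what `tryInsertWebhookReceipt` might look like, with the database's unique constraint simulated by an in-memory `Set` (names are illustrative):

```javascript
// Sketch: the unique constraint on (provider_name, provider_event_id)
// is simulated here with an in-memory Set keyed by both values.
const seenEvents = new Set();

function tryInsertWebhookReceipt({ providerName, providerEventId }) {
  const key = `${providerName}:${providerEventId}`;
  if (seenEvents.has(key)) return false; // duplicate delivery
  seenEvents.add(key);
  return true; // first delivery: safe to process
}
```

Against a real database, attempt the `INSERT` and catch the unique-constraint violation rather than check-then-insert, so two concurrent deliveries of the same event cannot both pass.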
## Store Artifacts, Not Videos, in the Database
Generated videos can be large. Store them in object storage and keep metadata in your database.
```sql
CREATE TABLE video_generation_artifacts (
    id BIGINT PRIMARY KEY,
    job_id BIGINT NOT NULL,
    artifact_type VARCHAR(40) NOT NULL,
    storage_uri NVARCHAR(1000) NOT NULL,
    mime_type VARCHAR(100) NULL,
    file_size_bytes BIGINT NULL,
    duration_seconds DECIMAL(9,3) NULL,
    width INT NULL,
    height INT NULL,
    created_at_utc DATETIME2 NOT NULL
);
```
This makes the database useful for search, support, and reporting without turning it into a media file store.
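After uploading a file to object storage, mapping the provider's output entry to an artifact row can be a small pure function. A sketch; the provider field names on the input side are assumptions:

```javascript
// Sketch: turn a provider output entry into an artifact metadata row.
// Only the storage URI goes in the database, never the video bytes.
function toArtifactRow(jobId, output) {
  return {
    jobId,
    artifactType: output.type ?? "video",
    storageUri: output.storageUri,          // object-storage location
    mimeType: output.mimeType ?? null,
    fileSizeBytes: output.sizeBytes ?? null,
    durationSeconds: output.durationSeconds ?? null,
    width: output.width ?? null,
    height: output.height ?? null,
  };
}
```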
## Add Human Review Before Publishing
A generated video can be technically successful and still be wrong.
For example:
- product shape changed;
- label text became unreadable;
- face or hand details distorted;
- source image rights are unclear;
- output looks like real footage and needs disclosure;
- clip implies a product feature that was never approved.
Add a review state before public use.
```text
completed -> approved
completed -> rejected
```

In many products, `completed` should only mean "the provider returned a usable file." It should not mean "safe to publish."
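The review step can then be a guarded state change: only jobs the provider actually finished are eligible for a decision. A minimal sketch:

```javascript
// Sketch of applying a reviewer's decision. Only "completed" jobs can
// be approved or rejected; anything else is an error, not a silent no-op.
function applyReviewDecision(job, decision) {
  if (job.currentStatus !== "completed") {
    throw new Error(`Cannot review a job in status "${job.currentStatus}"`);
  }
  if (decision !== "approved" && decision !== "rejected") {
    throw new Error(`Unknown review decision "${decision}"`);
  }
  return { ...job, currentStatus: decision };
}
```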
## What to Show the User
Users do not need every internal status. Map technical states to simple UI states.
| Internal Status | User-Facing Copy |
|---|---|
| queued | Waiting to start |
| submitted / processing | Generating video |
| completed | Ready for review |
| approved | Ready to use |
| rejected | Needs changes |
| failed | Generation failed |
If a job fails, give the user a clear next step. "Try again with a shorter prompt" is better than "provider_error_409."
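The table above translates directly into a small mapping function, with a generic fallback for any internal status you add later:

```javascript
// Map internal statuses to the user-facing copy from the table above.
const USER_COPY = {
  queued: "Waiting to start",
  submitted: "Generating video",
  processing: "Generating video",
  completed: "Ready for review",
  approved: "Ready to use",
  rejected: "Needs changes",
  failed: "Generation failed",
};

function userFacingStatus(internalStatus) {
  // Fall back to neutral copy so new internal states never leak raw names.
  return USER_COPY[internalStatus] ?? "Processing";
}
```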
## Observability Queries
Track operational health from the beginning.
Useful metrics include:
- average time from `queued` to `completed`;
- failure rate by provider and workflow type;
- rejection rate by review reason;
- duplicate webhook count;
- jobs stuck in `processing`;
- storage size by generated artifact type.
These metrics help you decide whether the product needs better prompts, better provider handling, or better user guidance.
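The stuck-job check in particular is worth automating early, since provider callbacks can simply never arrive. A sketch of the detection logic over in-memory job rows (in production this would be a query plus an alert):

```javascript
// Sketch: flag jobs that have sat in "processing" longer than a
// threshold, e.g. because a provider webhook was lost.
function findStuckJobs(jobs, nowMs, maxAgeMs) {
  return jobs.filter(
    (j) => j.currentStatus === "processing" && nowMs - j.updatedAtMs > maxAgeMs
  );
}
```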
## Security and Privacy Notes
Do not treat prompts as harmless strings. Prompts may include customer names, campaign plans, product information, or private context.
Consider:
- hashing prompts in operational tables;
- storing raw prompts separately with tighter access controls;
- signing webhook requests;
- validating uploaded file types;
- scanning output files if users can download them;
- expiring unused generated assets;
- logging reviewer decisions.
This is especially important if your app supports business users.
## Final Checklist
Before shipping an AI video workflow, make sure you have:
- a job table;
- an event table;
- a queue;
- a worker;
- webhook idempotency;
- object storage;
- artifact metadata;
- review states;
- user-friendly status copy;
- failure and stuck-job monitoring.
## Conclusion
AI video generation is not just an API call. It is an asynchronous workflow with provider latency, callbacks, files, review decisions, and user expectations.
Whether you use SeeVido or another provider, the backend pattern is the same: create a job, process it through a queue, handle webhooks idempotently, store artifacts outside the database, and review the output before publishing.
That architecture gives your users a better experience and gives your team a system that can be debugged when something goes wrong.

