The SaaS product demo video is one of the highest-leverage assets in B2B marketing. It's the video that converts cold traffic to trials. It's the email attachment that wakes up a stalled deal. It's the App Store preview that decides whether a paid install happens or doesn't. And yet most B2B teams ship demo videos roughly once a year, because the production loop (brief, script, screen capture, voiceover, edits, three rounds of stakeholder feedback) is so heavy that the video can't keep up with the product. Six months in, the demo is showing a UI that no longer exists.
That changes when the production loop collapses from two weeks to one day. This guide walks through the actual workflow we've seen B2B teams use to ship demo videos with an AI agent: pick the format, write the script, brief the agent, do one human pass, ship. The longest step is the script. The agent does the rest.
Step 1 — Pick One of Three Formats (Don't Mix Them)
Before you write a single word of script, decide which format you're making. The single most common mistake on a SaaS demo video is trying to do all three jobs in one asset and ending up with a five-minute video that nobody watches to the end. Pick one.
Format A — The 30-second hero demo
Lives at the top of your homepage. Autoplays muted, with captions. Job: in 30 seconds, communicate what your product is and what changes for the user when they use it. Not features. Not pricing. Not the founder's story. Just the before/after of the user's day. The hero demo is the video that determines whether someone scrolls or hits "Start free trial."
Format B — The 90-second to 2-minute feature tour
Lives on a /product or /features page. Sometimes embedded in sales emails. Job: walk through the three to five core features in the order a real user would touch them. This is the format most teams default to without thinking. It's only the right call when the user already knows roughly what your product is and is evaluating whether the specific capabilities match their needs.
Format C — The 3-5 minute onboarding / first-day video
Lives inside the product (post-signup welcome screen, empty state, help center) and in the activation email sequence. Job: get a brand-new user from "I just signed up" to "I've completed my first valuable action." This is the format that drives activation rate, not signup rate.
If you're starting from zero on demo videos, ship Format A first. It moves the conversion metric that matters most for early-stage SaaS. Format B and Format C come second and third.
Step 2 — Write the Script Using the 3-Act Formula
This is the formula that survives every product change, every messaging refresh, and every stakeholder review. Three acts, in order, with a clear job for each.
Act 1 — The pain (15-25% of runtime). Open on the user's current reality, not on your product. Show the spreadsheet they're maintaining manually, the inbox they're drowning in, the dashboard that takes 40 minutes to build every Monday. The viewer needs to recognize their own day in the first 5 seconds. If they don't, they bounce.
Act 2 — The product enters (50-60% of runtime). Now your product appears, and the viewer sees the same task get done in a fraction of the time, with a fraction of the steps. This is where you show the actual UI doing actual work. Critically: do not narrate features. Narrate outcomes. "Connect your data sources in two clicks" beats "OAuth-based connector library with 200+ integrations" every time, even though the second one is technically more accurate.
Act 3 — The closing loop (15-25% of runtime). Show the after-state and the call to action. The Monday dashboard is now built in 4 minutes, not 40. The inbox is at zero. The team is shipping. End on a single, unambiguous CTA: "Start free" / "Book a demo" / "Try it on your data." Pick one. Never two.
The 3-act formula works for all three formats. The runtime changes, the proportions stay roughly the same. Format A compresses Act 1 to 5 seconds and Act 3 to 5 seconds. Format C stretches Act 2 into a step-by-step walkthrough. The structure holds.
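To see how the proportions translate into seconds, here is a minimal sketch that splits a runtime across the three acts. The function name `act_durations` and the default shares (20% / 60% / 20%, chosen from inside the ranges above) are assumptions for illustration, not part of any tool.

```python
def act_durations(runtime_s, shares=(0.20, 0.60, 0.20)):
    """Split a runtime in seconds across the three acts.

    The default shares are illustrative picks from the ranges in this guide
    (Act 1: 15-25%, Act 2: 50-60%, Act 3: 15-25%) and must sum to 1.
    """
    if abs(sum(shares) - 1.0) > 1e-9:
        raise ValueError("shares must sum to 1.0")
    names = ("act_1_pain", "act_2_product", "act_3_close")
    # Round to one decimal so the result reads as seconds on a timeline.
    return {name: round(runtime_s * share, 1) for name, share in zip(names, shares)}


if __name__ == "__main__":
    for label, runtime in (("Format A", 30), ("Format B", 90), ("Format C", 180)):
        print(label, act_durations(runtime))
```

With these shares, a 30-second hero demo gets about 6 seconds of pain, 18 seconds of product, and 6 seconds of close. The 5 / 20 / 5 split mentioned above for Format A is a slightly tighter variant of the same shape.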
Step 3 — Brief the AI Agent (Use This Template)
Agents render exactly what you describe. Vague briefs produce vague videos. The brief below takes about 20 minutes to fill in once you have the script, and it's the unit of work that the agent operates on.
Product context (3 sentences). What the product does, who uses it, what it replaces. Example: "Acme is a B2B billing platform for usage-based SaaS companies. It's used by finance and revops teams at $5M-$50M ARR companies. It replaces homegrown billing scripts plus Stripe Billing." Three sentences. No more.
Target viewer (1 sentence). The single person you want to convert. Example: "Head of finance at a Series B SaaS company who's currently maintaining usage-based billing in spreadsheets and a Stripe webhook glue layer."
Format and runtime. "Format A — 30-second hero demo, vertical 9:16 for social, horizontal 16:9 for homepage embed."
The script. Paste the full Act 1 / Act 2 / Act 3 script. Mark each act explicitly with a header. Include the exact voiceover line and the on-screen action it pairs with on each beat.
Visual style. Pick three adjectives. Example: "clean, technical, confident." Then one paragraph elaborating: "Clean = generous whitespace, no unnecessary motion graphics. Technical = real product UI, real data, real numbers — no fake placeholder data. Confident = no apologetic language, no 'we hope', no soft sell."
Brand assets. Logo file, primary color HEX, secondary color HEX, font name (or font file). If you have a voice profile or character reference for an on-camera presenter, include it.
Distribution channel. Where this video will live. This tells the agent the right aspect ratio, captioning style, and opening 3 seconds. A homepage embed reads differently from a LinkedIn ad, which reads differently from an in-product activation modal.
Must-include and must-avoid. Two short lists. Must-include: specific UI screens, specific phrases, specific CTAs. Must-avoid: competitor names, regulatory claims you can't substantiate, the founder's pet phrase that nobody else likes.
Save this brief as a reusable template. Future demo videos for the same product reuse most of the fields and only swap script and channel.
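One way to keep the template reusable is to store it as structured data and run a quick completeness check before handing it to the agent. The sketch below is hypothetical: the `DemoBrief` class, its field names, and the three checks are assumptions that mirror the fields in this guide, not the input format of any particular agent. The sample values are placeholders.

```python
from dataclasses import dataclass, field
from typing import Dict, List


@dataclass
class DemoBrief:
    """Hypothetical container for the brief fields listed above."""
    product_context: str            # what it does, who uses it, what it replaces
    target_viewer: str              # the single person you want to convert
    format_and_runtime: str         # e.g. "Format A, 30-second hero demo, 16:9"
    script: str                     # full script with explicit Act 1/2/3 headers
    style_adjectives: List[str]     # exactly three adjectives
    style_notes: str                # one paragraph elaborating the adjectives
    brand_assets: Dict[str, str]    # logo file, HEX colors, font name
    channel: str                    # where the video will live
    must_include: List[str] = field(default_factory=list)
    must_avoid: List[str] = field(default_factory=list)

    def problems(self) -> List[str]:
        """Return a list of issues; an empty list means the brief looks complete."""
        issues = []
        if len(self.style_adjectives) != 3:
            issues.append("visual style needs exactly three adjectives")
        for act in ("Act 1", "Act 2", "Act 3"):
            if act not in self.script:
                issues.append(f"script has no explicit '{act}' header")
        if not self.channel.strip():
            issues.append("distribution channel is empty")
        return issues


brief = DemoBrief(
    product_context="Acme is a B2B billing platform for usage-based SaaS companies.",
    target_viewer="Head of finance at a Series B SaaS company.",
    format_and_runtime="Format A, 30-second hero demo, 16:9",
    script="Act 1\n(pain beats)\nAct 2\n(product beats)\nAct 3\n(close and CTA)",
    style_adjectives=["clean", "technical", "confident"],
    style_notes="Generous whitespace, real product UI, no soft sell.",
    brand_assets={"primary_hex": "#000000"},  # placeholder value
    channel="homepage embed",
)
print(brief.problems())  # an empty list when nothing is missing
```

For the next demo of the same product, only `script`, `format_and_runtime`, and `channel` would typically change, which is what makes the saved template cheap to reuse.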
Step 4 — Generate, Then Do One Human Pass
The agent runs the production loop end-to-end: script-to-shots, shots-to-audio, audio-to-edit, edit-to-finished export. For a Format A 30-second video, the first generation is usually ready in roughly 10-20 minutes. For a Format C 3-5 minute onboarding video, expect 30-60 minutes.
Don't ship the first generation. Do one structured human pass before publishing.
Watch the video three times in a row, each time looking for one specific class of issue:
- Pass 1 — message fidelity. Does Act 2 actually show the outcome described in the script, or did the agent default to feature-listing? Does the CTA in Act 3 match the channel? Watch with the script open next to the video.
- Pass 2 — brand fidelity. Are the colors right? Is the logo placement right? Does the voiceover sound like your brand voice? Are the on-screen UI screens recognizable as your product?
- Pass 3 — first-3-seconds test. Mute the video. Watch only the first 3 seconds. Would the target viewer recognize their own day in those 3 seconds? If no, the hook is broken — fix Act 1 in the brief and regenerate.
If Pass 3 fails, regenerate. If Pass 1 or Pass 2 fails in small ways, edit the brief and request a partial regeneration of the affected segment rather than the whole video. If everything passes, ship.
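The decision rule above is small enough to write down as a function, which can help when more than one person does the review. This is a sketch only: the name `next_action` and the returned labels are made up for illustration, and it treats any Pass 1 or Pass 2 failure as the "small" kind that a partial regeneration can fix.

```python
def next_action(message_ok: bool, brand_ok: bool, hook_ok: bool) -> str:
    """Map the results of the three review passes to the next step."""
    if not hook_ok:
        # Pass 3 failed: the hook is broken, so fix Act 1 in the brief
        # and regenerate the whole video.
        return "regenerate_full"
    if not (message_ok and brand_ok):
        # Pass 1 or Pass 2 failed: edit the brief and regenerate only
        # the affected segment.
        return "regenerate_segment"
    return "ship"


if __name__ == "__main__":
    print(next_action(message_ok=True, brand_ok=False, hook_ok=True))
```

The hook check runs first on purpose: a broken first 3 seconds triggers a full regeneration regardless of how the other two passes went.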
Step 5 — Embed in the Five Places That Drive Signups
A demo video that lives only on the homepage is doing 20% of its potential job. The same video, with the right cuts, drives signups on five distinct surfaces:
- Homepage hero. Format A, 30 seconds, autoplay muted, looping, with burned-in captions. Above the fold.
- Product / features page. Format B, 90 seconds to 2 minutes. Click-to-play, with audio on by default. Place it below the hero pitch and above the feature grid.
- Onboarding email sequence. Format A in email 1 (welcome), Format C broken into 90-second segments across emails 2-4. Use animated GIF previews that link out to the full video — embedded video in email is unreliable across clients.
- App Store / extension store listing. Format A reformatted to the store's exact spec (App Store: vertical, 30 seconds max, captions on). The store preview is one of the highest-leverage 30 seconds in your funnel and the place teams most commonly skip.
- Sales decks and outbound. Format B as a Loom-style asset that AEs paste into outreach. The same video, captioned, on the second slide of every sales deck. Reps who use it report meeting-acceptance rates 1.5-2x higher than reps who don't.
The five-surface plan is what turns a single demo video from a marketing artifact into a real conversion lever. Most teams skip three of the five and wonder why their demo video "didn't move the needle."
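A simple checklist makes it obvious which surfaces are still uncovered and which formats they need. The sketch below is illustrative: the surface keys and both helper names are assumptions, and the format letters follow the list above.

```python
from typing import Dict, Iterable, List, Set, Tuple

# Surface names are illustrative labels; formats follow the five-surface list.
SURFACE_FORMATS: Dict[str, Tuple[str, ...]] = {
    "homepage_hero": ("A",),
    "product_page": ("B",),
    "onboarding_email": ("A", "C"),
    "store_listing": ("A",),
    "sales_deck_outbound": ("B",),
}


def missing_surfaces(live: Iterable[str]) -> List[str]:
    """Surfaces from the plan that do not have a video yet, in plan order."""
    live_set = set(live)
    return [name for name in SURFACE_FORMATS if name not in live_set]


def formats_needed(surfaces: Iterable[str]) -> Set[str]:
    """Formats that must exist to cover the given surfaces."""
    return {fmt for name in surfaces for fmt in SURFACE_FORMATS[name]}


if __name__ == "__main__":
    todo = missing_surfaces(["homepage_hero"])
    print(todo)
    print(sorted(formats_needed(todo)))
```

A team that has only shipped the homepage hero would see four surfaces left and would need all three formats to cover them.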
Common Pitfalls (and How to Avoid Them)
Feature-dumping in Act 2. The most common failure mode. The script says "show our integrations library" and the video becomes a 45-second tour of every logo. Fix in the brief: replace every feature noun with an outcome verb. "200+ integrations" becomes "your data flows in five minutes after signup."
Over-narrating. The voiceover talks for the entire runtime, with no breathing room. Real demo videos have moments of silence where the UI does the work. Fix in the script: write 25-30% less voiceover than feels comfortable, then trust the visuals.
Stakeholder consensus on the CTA. Marketing wants "Start free trial," sales wants "Book a demo," product wants "Read the docs." Three CTAs in the same video means zero CTAs. Pick one based on the channel, not on the org chart.
Letting the demo go stale. Six months in, the UI in the video doesn't match the product. The video that converts now becomes the video that confuses customers later. Fix structurally: regenerate the demo every quarter, not every year. With an agent and a saved brief template, the regeneration takes an afternoon.
Skipping captions. 85% of social and embed views are muted. A demo video without burned-in captions is a video that 85% of viewers don't understand. Captions are not optional.
How Genra Fits Into This Workflow
The workflow above is tool-agnostic — any end-to-end AI video agent can run it. Genra is the agent we built and the one this guide is calibrated against. What Genra contributes specifically to a SaaS demo workflow:
- Brief-first input. The brief template above is a real artifact in Genra, not a chat prompt. You can save it, reuse it for the next demo, and version it as the product evolves.
- Brand asset library. Logo, color palette, voice profile, and any on-camera presenter reference get uploaded once and reused on every generation. The 30-second hero demo and the 3-minute onboarding video stay visually consistent without per-video babysitting.
- End-to-end production. Brief in, finished video out — captions, audio, edit, export. No clip-stitching, no separate voiceover step, no hand-off to an editor.
- Multi-format output. Generate Format A 30s, Format B 90s, and Format C 3min from related briefs in one session, all sharing the same brand library and visual style.
If you want to ship your first AI-made SaaS demo this week, Genra has 40 free credits with no card required. Start at genra.ai.
Key Takeaways
- Pick one format. Format A (30s hero) for homepage, Format B (90s tour) for product page, Format C (3-5min) for in-product onboarding. Don't mix.
- Use the 3-act script formula: pain → product enters → after-state with one CTA. Narrate outcomes, not features.
- The brief is the unit of work. Spend 20 minutes on a structured brief; spend 0 minutes on agency back-and-forth.
- One human pass before shipping: message fidelity, brand fidelity, first-3-seconds test. Regenerate if pass 3 fails.
- Embed in 5 surfaces, not 1: homepage, product page, onboarding email, App Store listing, sales deck.
- Regenerate quarterly. A stale demo costs more than a fresh one.
- Captions are mandatory. 85% of views are muted.
Frequently Asked Questions
How long does it take to make a SaaS demo video with AI?
For a Format A 30-second hero demo: roughly half a day end-to-end. That breaks down to about 2 hours on the script, 20 minutes on the brief, 20 minutes for the agent to generate, and 30 minutes for the human review pass. For a Format C 3-5 minute onboarding video, plan for a full day. The longest step is always the script. The agent doesn't shorten that part, because the script is human work.
Can I use AI for a demo if my product has a complex UI?
Yes, with one nuance. AI agents are excellent at the narrative and outcome layer of a demo (Act 1 pain, Act 3 after-state, voiceover, captions, brand polish). For the actual UI walkthrough portion of Act 2, many teams use a hybrid: real screen recording of the product UI for the walkthrough segments, AI-generated everything else (intro, outro, voiceover, transitions, motion graphics). The agent stitches the real UI footage into the rest of the production. This is the dominant pattern for technical SaaS demos.
What's the right length for a SaaS demo video?
By format: hero demo 30 seconds, feature tour 90 seconds to 2 minutes, onboarding video 3 to 5 minutes. The instinct to make demos longer is almost always wrong. Watch-through rate drops sharply after 30 seconds on social, after 90 seconds on a product page, and after 3 minutes anywhere else. If you can't make the case in those windows, the script is bloated, not the runtime.
How often should I refresh the demo video?
Quarterly for early-stage SaaS where the UI is changing fast. Twice a year for late-stage products with stable UIs. The trigger isn't a calendar — it's whether the UI in the video still matches the product the user lands in after signup. The moment those diverge meaningfully, the demo starts hurting conversion instead of helping it.
Do I need a voiceover?
For Format A (30s hero) and Format B (feature tour), yes — voiceover plus captions outperforms captions-only by a wide margin in muted-and-unmuted viewing combined. For Format C (in-product onboarding), it depends: if the video is embedded in the product, voiceover is optional because the user already has the UI in front of them. If it's in an email, voiceover is mandatory because the email viewer often isn't logged in.
How does Genra handle SaaS-specific demos differently from generic video tools?
Genra is built brief-first, which matters for B2B because B2B demos require precise messaging fidelity. The brief template (product context, target viewer, format, script, visual style, brand assets, channel, must-include, must-avoid) is a real artifact in the tool, not a chat prompt. The brand asset library means demo number 14 looks consistent with demo number 1 without per-video QA. The end-to-end production loop means you don't hand off between three tools to get from script to finished export. Genra offers 40 free credits with no card required if you want to run a pilot demo this week. Start at genra.ai.