Debadatta Panda
Building an AI-Powered Video Ad Creator with AWS Nova and Strands Agents

"Here's how I built a complete video ad creator using AWS's Nova models and Strands Agents: a 5-step AI pipeline that takes text input and outputs professional video with synchronized voiceover.

This is developed with the Strands Agents - an open-source SDK designed to make it dramatically easier to build such smart, autonomous systems

Creating a video ad involves multiple AI services that need to work together seamlessly. Here's how pipeline is designed with Strands Agent

Phase 1: Content Planning

# Input: "Luxury electric car driving through mountains"
# Output: Structured strategy for all subsequent steps

strategy = {
    "image_prompt": "Professional commercial photograph of luxury electric car on mountain road, golden hour lighting, cinematic composition, 1280x720",
    "video_prompt": "6-second commercial showing sleek electric car driving through scenic mountain curves, smooth camera tracking, sunset lighting, premium feel",  
    "audio_script": "Experience the future of driving. Luxury meets sustainability."
}
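The planning step can be sketched with the strands-agents SDK, using Nova Lite as the planner model. The helper names (`build_planning_prompt`, `plan_ad`) and the JSON-returning instruction are my own assumptions, not the exact code from the repo:

```python
import json

def build_planning_prompt(idea: str) -> str:
    # Hypothetical prompt builder: asks the model for the three strategy fields.
    return (
        "You are an ad-campaign planner. For the product concept below, "
        "return only JSON with keys image_prompt, video_prompt, audio_script.\n"
        f"Concept: {idea}"
    )

def plan_ad(idea: str) -> dict:
    # Lazy import so the prompt helper stays usable without the SDK installed.
    from strands import Agent  # strands-agents SDK
    planner = Agent(model="us.amazon.nova-lite-v1:0")  # cheap model for strategy
    result = planner(build_planning_prompt(idea))
    return json.loads(str(result))  # the agent result stringifies to the reply
```

In practice you'd also want to validate that all three keys came back before handing the strategy to the later phases.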

Phase 2: Visual Foundation

Service: Amazon Nova Canvas
Purpose: Create high-quality reference image that sets visual style
Output: Image stored in S3
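A minimal sketch of this phase with boto3: Nova Canvas is called synchronously via `InvokeModel` with a `TEXT_IMAGE` request body, and the base64 image from the response is written to S3. The function names, bucket/key arguments, and region are illustrative assumptions:

```python
import base64
import json

def canvas_payload(prompt: str, width: int = 1280, height: int = 720) -> dict:
    # Nova Canvas text-to-image request body (TEXT_IMAGE task type).
    return {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": prompt},
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "width": width,
            "height": height,
            "quality": "premium",
        },
    }

def generate_reference_image(prompt: str, bucket: str, key: str) -> str:
    import boto3  # lazy import keeps the payload helper testable offline
    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
    resp = bedrock.invoke_model(
        modelId="amazon.nova-canvas-v1:0",
        body=json.dumps(canvas_payload(prompt)),
    )
    image_b64 = json.loads(resp["body"].read())["images"][0]
    boto3.client("s3").put_object(
        Bucket=bucket, Key=key, Body=base64.b64decode(image_b64)
    )
    return f"s3://{bucket}/{key}"
```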

Phase 3: Video Generation

Service: Amazon Nova Reel
Input: Video prompt + reference image
Process: Async generation (2-5 minutes)
Output: 6-second professional video footage
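Because Nova Reel runs asynchronously, this phase uses `StartAsyncInvoke` plus a `GetAsyncInvoke` polling loop (the same Bedrock actions listed in the IAM policy below). A sketch, with my own function names and a hypothetical 15-second poll interval:

```python
import time

def reel_payload(prompt: str, image_b64: str) -> dict:
    # Nova Reel request body: text prompt plus the Phase 2 reference image.
    return {
        "taskType": "TEXT_VIDEO",
        "textToVideoParams": {
            "text": prompt,
            "images": [{"format": "png", "source": {"bytes": image_b64}}],
        },
        "videoGenerationConfig": {
            "durationSeconds": 6,
            "fps": 24,
            "dimension": "1280x720",
        },
    }

def generate_video(prompt: str, image_b64: str, output_s3_uri: str) -> str:
    import boto3  # lazy import keeps the payload helper testable offline
    bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
    job = bedrock.start_async_invoke(
        modelId="amazon.nova-reel-v1:0",
        modelInput=reel_payload(prompt, image_b64),
        outputDataConfig={"s3OutputDataConfig": {"s3Uri": output_s3_uri}},
    )
    arn = job["invocationArn"]
    while True:  # poll until the async job finishes (typically 2-5 minutes)
        status = bedrock.get_async_invoke(invocationArn=arn)["status"]
        if status != "InProgress":
            return status  # "Completed" or "Failed"
        time.sleep(15)
```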

Phase 4: Voice Enhancement

Service: Amazon Polly (neural voices)
Input: Audio script from Phase 1
Output: Professional voiceover with natural intonation
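The voiceover step maps directly onto Polly's `SynthesizeSpeech` API with the neural engine. The voice choice (`Joanna`), output path, and the script-length guard are assumptions on my part:

```python
POLLY_MAX_CHARS = 3000  # SynthesizeSpeech billed-character limit per request

def validate_script(script: str) -> str:
    # Guard against empty scripts and scripts exceeding Polly's request limit.
    script = script.strip()
    if not script or len(script) > POLLY_MAX_CHARS:
        raise ValueError(f"script must be 1-{POLLY_MAX_CHARS} characters")
    return script

def synthesize_voiceover(script: str, out_path: str = "voiceover.mp3") -> str:
    import boto3  # lazy import keeps validate_script testable offline
    polly = boto3.client("polly")
    resp = polly.synthesize_speech(
        Text=validate_script(script),
        OutputFormat="mp3",
        VoiceId="Joanna",   # any neural-capable voice works here
        Engine="neural",
    )
    with open(out_path, "wb") as f:
        f.write(resp["AudioStream"].read())
    return out_path
```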

Phase 5: Final Assembly

Tool: MoviePy + FFmpeg
Process: Merge video and audio with proper timing
Output: Complete video advertisement
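The assembly step can be sketched with MoviePy (the snippet below uses the MoviePy 1.x `editor` API, where attaching audio is `set_audio`; 2.x renames it to `with_audio`). Trimming the voiceover to the video's duration keeps the audio from running past the 6-second clip:

```python
def clip_duration(video_s: float, audio_s: float) -> float:
    # Trim the voiceover so it never runs past the end of the video.
    return min(video_s, audio_s)

def assemble_ad(video_path: str, audio_path: str,
                out_path: str = "final_ad.mp4") -> str:
    # Lazy import keeps clip_duration testable without MoviePy installed.
    from moviepy.editor import AudioFileClip, VideoFileClip
    video = VideoFileClip(video_path)
    audio = AudioFileClip(audio_path)
    audio = audio.subclip(0, clip_duration(video.duration, audio.duration))
    final = video.set_audio(audio)
    # FFmpeg does the actual muxing under the hood.
    final.write_videofile(out_path, codec="libx264", audio_codec="aac")
    return out_path
```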

Why This Tech Stack?

Strands Agents: AWS's new framework for building AI agents with a model-first approach
Amazon Nova: State-of-the-art multimodal models (Canvas for images, Reel for videos)
Streamlit: Rapid prototyping with beautiful, interactive UIs
S3: Reliable storage for all generated media files
Amazon Polly: Neural text-to-speech for professional voiceovers

The full code is available on GitHub:

# Clone the repository
git clone https://github.com/debadatta30/aws-strand-streamlit
cd aws-strand-streamlit

# Install dependencies
pip install -r requirements.txt

# Configure AWS credentials
aws configure

# Set up environment variables
cp .env.example .env
# Edit .env with your S3 bucket name

# Launch the app
streamlit run streamlit_agent.py

AWS Permissions Required:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:StartAsyncInvoke",
        "bedrock:GetAsyncInvoke"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:*"],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["polly:SynthesizeSpeech"],
      "Resource": "*"
    }
  ]
}

Bedrock Model Access:
Request access to these models in the AWS Bedrock console:

amazon.nova-canvas-v1:0 (Image generation)
amazon.nova-reel-v1:0 (Video generation)
us.amazon.nova-lite-v1:0 (Content strategy)
