Building an AI-Powered Video Ad Creator with AWS Nova and Strands Agents

**MyrinNew** · 08-19-2025, 08:40 AM

Here's how I built a complete video ad creator using AWS's Nova models and Strands Agents: a 5-step AI pipeline that takes text input and outputs professional video with a synchronized voiceover.

It is built with Strands Agents, an open-source SDK designed to make it much easier to build smart, autonomous systems like this one.

Creating a video ad involves multiple AI services that need to work together seamlessly. Here's how the pipeline is designed with Strands Agents:

Phase 1: Content Planning

# Input: "Luxury electric car driving through mountains"
# Output: Structured strategy for all subsequent steps

strategy = {
    "image_prompt": "Professional commercial photograph of luxury electric car on mountain road, golden hour lighting, cinematic composition, 1280x720",
    "video_prompt": "6-second commercial showing sleek electric car driving through scenic mountain curves, smooth camera tracking, sunset lighting, premium feel",
    "audio_script": "Experience the future of driving. Luxury meets sustainability.",
}
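The strategy comes back from a language model as text, so it is worth validating before the later phases rely on it. Below is a minimal sketch of that check, assuming the model is asked to reply with a JSON object holding the three keys shown above. The function name `parse_strategy` is illustrative and not taken from the repo.

```python
import json

# The three fields every later phase depends on.
REQUIRED_KEYS = ("image_prompt", "video_prompt", "audio_script")


def parse_strategy(model_output: str) -> dict:
    """Extract and validate the strategy JSON from a model's text reply."""
    # Models often wrap JSON in extra prose, so keep only the outermost braces.
    start = model_output.find("{")
    end = model_output.rfind("}")
    if start == -1 or end <= start:
        raise ValueError("no JSON object found in model output")

    strategy = json.loads(model_output[start:end + 1])

    # Every key must be present and hold a non-empty string.
    missing = [
        key for key in REQUIRED_KEYS
        if not isinstance(strategy.get(key), str) or not strategy[key].strip()
    ]
    if missing:
        raise ValueError(f"strategy is missing: {', '.join(missing)}")

    return {key: strategy[key].strip() for key in REQUIRED_KEYS}
```

Failing early here is cheaper than discovering an empty video prompt after a multi-minute video job has already started.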

Phase 2: Visual Foundation

Service: Amazon Nova Canvas

Purpose: Create a high-quality reference image that sets the visual style

Output: Image stored in S3
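As a rough sketch of this phase, the snippet below builds a Nova Canvas text-to-image request, decodes the base64 image from the response, and uploads it to S3. It assumes the caller passes in boto3 `bedrock-runtime` and `s3` clients. The helper names, the bucket/key arguments and the PNG content type are my assumptions, not the repo's actual code.

```python
import base64
import json

CANVAS_MODEL_ID = "amazon.nova-canvas-v1:0"


def build_canvas_request(prompt: str, width: int = 1280, height: int = 720, seed: int = 0) -> dict:
    """Build a text-to-image request body for Nova Canvas."""
    return {
        "taskType": "TEXT_IMAGE",
        "textToImageParams": {"text": prompt},
        "imageGenerationConfig": {
            "numberOfImages": 1,
            "width": width,
            "height": height,
            "seed": seed,
        },
    }


def generate_reference_image(bedrock_runtime, s3, bucket: str, key: str, prompt: str) -> bytes:
    """Generate one image with Nova Canvas and store it in S3."""
    response = bedrock_runtime.invoke_model(
        modelId=CANVAS_MODEL_ID,
        body=json.dumps(build_canvas_request(prompt)),
    )
    payload = json.loads(response["body"].read())
    if payload.get("error"):
        raise RuntimeError(f"image generation failed: {payload['error']}")

    # Images come back as base64-encoded strings.
    image_bytes = base64.b64decode(payload["images"][0])
    s3.put_object(Bucket=bucket, Key=key, Body=image_bytes, ContentType="image/png")
    return image_bytes
```

With real clients this would be called as `generate_reference_image(boto3.client("bedrock-runtime"), boto3.client("s3"), "your-bucket-name", "ads/reference.png", strategy["image_prompt"])`. The 1280x720 default matches the size named in the image prompt.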

Phase 3: Video Generation

Service: Amazon Nova Reel

Input: Video prompt + reference image

Process: Async generation (2-5 minutes)

Output: 6-second professional video footage

Phase 4: Voice Enhancement

Service: Amazon Polly Neural voices

Input: Audio script

Output: Professional voiceover with natural intonation
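A minimal sketch of the voiceover step, assuming a boto3 `polly` client is passed in. The function name, the MP3 output format and the `Joanna` voice are illustrative choices, not necessarily what the repo uses.

```python
def synthesize_voiceover(polly, script: str, out_path: str, voice_id: str = "Joanna") -> str:
    """Turn the audio script into an MP3 file using a Polly neural voice."""
    response = polly.synthesize_speech(
        Text=script,
        OutputFormat="mp3",
        VoiceId=voice_id,
        Engine="neural",  # request the neural engine rather than the standard one
    )
    # AudioStream is a streaming body; read it fully and write it to disk.
    audio = response["AudioStream"].read()
    with open(out_path, "wb") as f:
        f.write(audio)
    return out_path
```

With a real client this would be `synthesize_voiceover(boto3.client("polly"), strategy["audio_script"], "voiceover.mp3")`. Keep the script short enough to fit inside the 6-second video.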

Phase 5: Final Assembly

Tools: MoviePy + FFmpeg

Process: Merge video and audio with proper timing

Output: Complete video advertisement
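To illustrate the merge step, here is a sketch that calls the `ffmpeg` CLI directly through `subprocess` instead of going through MoviePy, so it does not depend on a particular MoviePy version. It assumes `ffmpeg` is on the PATH. The function names are hypothetical and this is not the repo's implementation.

```python
import subprocess


def build_merge_command(video_path: str, audio_path: str, output_path: str) -> list[str]:
    """Build an ffmpeg command that puts the voiceover onto the video."""
    return [
        "ffmpeg", "-y",          # overwrite the output file if it exists
        "-i", video_path,        # input 0: generated video
        "-i", audio_path,        # input 1: voiceover
        "-map", "0:v:0",         # take the video stream from input 0
        "-map", "1:a:0",         # take the audio stream from input 1
        "-c:v", "copy",          # copy video without re-encoding
        "-c:a", "aac",           # encode audio as AAC for MP4
        "-shortest",             # stop at the end of the shorter stream
        output_path,
    ]


def merge_video_and_audio(video_path: str, audio_path: str, output_path: str,
                          run=subprocess.run) -> str:
    """Run ffmpeg and return the path of the finished advertisement."""
    run(build_merge_command(video_path, audio_path, output_path), check=True)
    return output_path
```

The `-shortest` flag ends the output when the shorter stream ends, which keeps the result from running past the 6-second video if the voiceover is longer.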

Why This Tech Stack?

Strands Agents: AWS's new framework for building AI agents with a model-first approach

Amazon Nova: State-of-the-art multimodal models (Canvas for images, Reel for videos)

Streamlit: Rapid prototyping with beautiful, interactive UIs

S3: Reliable storage for all generated media files

Amazon Polly: Neural text-to-speech for professional voiceovers

You can find the code on GitHub:

`# Clone the repository
git clone https://github.com/debadatta30/aws-strand-streamlit
cd aws-strand-streamlit

# Install dependencies
pip install -r requirements.txt

# Configure AWS credentials
aws configure

# Set up environment variables
cp .env.example .env
# Edit .env with your S3 bucket name

# Launch the app
streamlit run streamlit_agent.py`

AWS Permissions Required:

{
  "Version": "2012-10-17",
  "Statement": [
    {
      "Effect": "Allow",
      "Action": [
        "bedrock:InvokeModel",
        "bedrock:StartAsyncInvoke",
        "bedrock:GetAsyncInvoke"
      ],
      "Resource": "*"
    },
    {
      "Effect": "Allow",
      "Action": ["s3:*"],
      "Resource": [
        "arn:aws:s3:::your-bucket-name",
        "arn:aws:s3:::your-bucket-name/*"
      ]
    },
    {
      "Effect": "Allow",
      "Action": ["polly:SynthesizeSpeech"],
      "Resource": "*"
    }
  ]
}

Bedrock Model Access:

Request access to these models in the Amazon Bedrock console:

amazon.nova-canvas-v1:0 (Image generation)

amazon.nova-reel-v1:0 (Video generation)

us.amazon.nova-lite-v1:0 (Content strategy)
