Matthieu Lienart

Posted on Jun 5

From LEGO to Video: Building an AI Storytelling App for the AWS Community Builder Community

#aws #langchain #bedrock #awscommunitybuilders

As an AWS Community Builder, I wanted to build something fun that would bring the community together — not just another demo, but an actual game people could play. The result is AWS Community Builders Fantasy Quest: take a photo of a LEGO creation, and the app turns it into a 36-second epic AWS architect adventure video, generated by Amazon Nova models and narrated by Amazon Polly. Stories are scored on a community leaderboard so everyone can compete for the title of best AWS cloud hero storyteller.

I first demoed it at the AWS User Group Bern meetup on May 6th where attendees teamed up with LEGO blocks and laughed at the resulting videos of cloud heroes battling latency dragons and Lambda knights on serverless quests.
The code is now public at github.com/mlnrt/cb-fantasy-quest.

The app was kept online for two weeks for the community to play with and is now decommissioned.

Note: This project, especially the frontend, was heavily developed using AI.

⚠️ Before you read further: Nova Reel is going EOL

The video generation in this project relies entirely on Amazon Nova Reel v1.1. AWS has marked it as Legacy and it will reach end-of-life on September 30, 2026, after which it will stop accepting requests. As far as I can tell, there is currently no replacement video generation model announced for Amazon Bedrock. So if you want to deploy and run this demo, you will need to find another video generation model and provider. See the Nova Reel model card for the official notice.

What was built

The full stack connects an Angular web app to a serverless AWS backend. The user flow is:

Take a photo of a LEGO scene via the browser camera
The image is uploaded to S3, which triggers story generation with Bedrock Nova through LangChain
A Step Functions pipeline produces a 36-second video: 6 scenes, each with a Nova Reel video clip and an Amazon Polly narration track, stitched together with FFmpeg
The finished video lands on the community leaderboard for rating

The infrastructure is fully defined in AWS CDK (TypeScript) across four stacks: Core, API, Video, and Monitoring. Every Lambda function is Python 3.13, using AWS Lambda Powertools for structured logging.

The story generation pipeline

When the photo lands in S3, it triggers a Lambda function that calls Bedrock Nova via LangChain. The model analyses the LEGO scene — detecting themes, characters, and visual elements — and generates a structured 6-scene story using Pydantic for reliable output.
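To illustrate what "structured output" buys you here, below is a minimal sketch of validating a model reply against a fixed 6-scene shape. The project itself uses Pydantic; this version swaps in plain standard-library dataclasses to stay dependency-free, and the field names (`title`, `narration`, `video_prompt`) are hypothetical, not taken from the repository.

```python
import json
from dataclasses import dataclass

SCENE_COUNT = 6  # from the article: six scenes of six seconds each


@dataclass(frozen=True)
class Scene:
    narration: str     # hypothetical field: text that would be narrated
    video_prompt: str  # hypothetical field: prompt for the video model


@dataclass(frozen=True)
class Story:
    title: str
    scenes: tuple[Scene, ...]


def _clean(value: object) -> str:
    """Return a stripped string, or an empty string for anything unusable."""
    return value.strip() if isinstance(value, str) else ""


def parse_story(raw_json: str) -> Story:
    """Parse the model's JSON reply and reject anything that is not a 6-scene story."""
    data = json.loads(raw_json)
    if not isinstance(data, dict):
        raise ValueError("story must be a JSON object")
    title = _clean(data.get("title"))
    if not title:
        raise ValueError("story needs a non-empty title")
    raw_scenes = data.get("scenes")
    if not isinstance(raw_scenes, list) or len(raw_scenes) != SCENE_COUNT:
        raise ValueError(f"expected exactly {SCENE_COUNT} scenes")
    scenes = []
    for index, item in enumerate(raw_scenes, start=1):
        if not isinstance(item, dict):
            raise ValueError(f"scene {index} must be a JSON object")
        narration = _clean(item.get("narration"))
        video_prompt = _clean(item.get("video_prompt"))
        if not narration or not video_prompt:
            raise ValueError(f"scene {index} is missing narration or video_prompt")
        scenes.append(Scene(narration, video_prompt))
    return Story(title, tuple(scenes))
```

Failing fast on a malformed reply means a retry can be triggered before anything is persisted or any video job is started.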

Each scene is designed to fit 6 seconds of video, giving a total of 36 seconds. Once the story is persisted to DynamoDB, an EventBridge event triggers the video pipeline.

The video pipeline

A Step Functions workflow orchestrates the 6-scene production. For each scene it runs two things in sequence:

generate the Polly audio track,
submit the Nova Reel video generation job, and poll for completion.

The Bedrock video generation job writes the final video clip, as well as all 6 individual clips, directly to S3.

The six finished clips are then passed to a Composer Lambda that uses FFmpeg to merge each clip's video and audio into the final 36-second file. If a scene's audio is longer than its video, the Composer holds the last video frame until the audio finishes, creating a freeze-frame effect while avoiding audio overlap issues. The Composer Lambda also generates the VTT subtitle file from the original story text and uploads both the final video and the subtitles to S3 for the frontend to display.
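The freeze-frame step can be sketched as an FFmpeg argument builder. This is an illustration, not the Composer Lambda's actual code: it assumes each clip is exactly 6 seconds, that the audio duration is already known to the caller (how it is obtained is left open), and it uses FFmpeg's `tpad` filter to clone the last frame. The function names and file paths are hypothetical.

```python
CLIP_SECONDS = 6.0  # from the article: each scene's video clip is 6 seconds long


def freeze_pad_seconds(audio_seconds: float, clip_seconds: float = CLIP_SECONDS) -> float:
    """Return how long the last frame must be held so the narration can finish."""
    return max(0.0, round(audio_seconds - clip_seconds, 3))


def build_merge_command(
    video_path: str, audio_path: str, output_path: str, audio_seconds: float
) -> list[str]:
    """Build an ffmpeg command that muxes one clip with its narration track."""
    pad = freeze_pad_seconds(audio_seconds)
    command = ["ffmpeg", "-y", "-i", video_path, "-i", audio_path]
    if pad > 0:
        # tpad with stop_mode=clone repeats the final frame for stop_duration seconds.
        command += ["-vf", f"tpad=stop_mode=clone:stop_duration={pad}"]
    # Take the video from the first input and the audio from the second.
    command += ["-map", "0:v:0", "-map", "1:a:0", output_path]
    return command
```

For a 7.5-second narration, `build_merge_command("clip.mp4", "voice.mp3", "scene.mp4", 7.5)` adds a 1.5-second hold; for narration shorter than the clip, no filter is added and the video simply outlasts the audio. The resulting list can be passed to `subprocess.run`.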

Cost controls and rate limiting

Exposing this app publicly in my AWS account, with real AWS costs behind it, required controls to manage expenses. The app enforces several layers:

Global daily cap: a DynamoDB counter with a 24-hour TTL limits the total number of videos to 5 per day. When the cap is hit, the frontend shows a friendly message and stops accepting submissions.
One story per pseudo: each Community Builder name can only generate one story, preventing a single person from consuming the entire daily quota.
Distributed generation lock: a DynamoDB item with a 5-minute TTL acts as a mutex, blocking concurrent generation attempts. The frontend polls a lock status endpoint before allowing a submission, giving users honest feedback rather than a silent failure.
API Gateway protection: every request must carry an x-api-key header, validated by a Lambda Authorizer before it ever reaches a Lambda function. The authorizer reads the expected key from SSM Parameter Store (stored as a SecureString). API Gateway caches the authorization result for 5 minutes to reduce Lambda invocations. On top of that, API Gateway throttling (5 req/s, burst 10) adds a final backstop against evil hammering.
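As an illustration of the API key check, here is a minimal sketch of the decision logic a Lambda Authorizer could use. It assumes a REST API REQUEST-type authorizer (the event carries `headers` and `methodArn`) and that the expected key has already been loaded, for example from SSM Parameter Store, by code not shown here. The function names and the `principalId` value are hypothetical.

```python
import hmac


def build_policy(effect: str, method_arn: str, principal_id: str = "community-builder") -> dict:
    """Build the IAM policy document API Gateway expects back from an authorizer."""
    return {
        "principalId": principal_id,
        "policyDocument": {
            "Version": "2012-10-17",
            "Statement": [
                {"Action": "execute-api:Invoke", "Effect": effect, "Resource": method_arn}
            ],
        },
    }


def authorize(event: dict, expected_key: str) -> dict:
    """Allow the request only if its x-api-key header matches the expected key."""
    # Header names are case-insensitive, so normalise them before the lookup.
    headers = {str(k).lower(): str(v) for k, v in (event.get("headers") or {}).items()}
    presented = headers.get("x-api-key", "")
    # compare_digest avoids leaking how many leading characters matched.
    allowed = bool(expected_key) and hmac.compare_digest(
        presented.encode(), expected_key.encode()
    )
    return build_policy("Allow" if allowed else "Deny", event["methodArn"])
```

Because API Gateway caches the returned policy, a policy scoped to a single `methodArn` can lead to surprising denials on other routes within the cache window; whether that matters depends on how the authorizer's identity source and resource scope are configured.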

An example of what was generated

Here is an example of a story generated from a LEGO scene at the Bern meetup — cloud heroes, latency dragons, and Lambda knights included:

📖 The Crystal Vault of Cloudreach Kingdom

📝 Plot Summary: In Cloudreach Kingdom, Knight Sir Archon and his Ninja sidekick Kaito must decipher a secret flag message to stop the villainous Knight Malakar before he unleashes chaos — with AWS CloudFront, VPC, and S3 as their allies.

💬 Favorite Quote: "In the realm of the cloud, wisdom crystallizes like ice in the vault of S3."

☁️ AWS Services Mastered: CloudFront, VPC, S3

👥 Characters in Your Story: Knight (hero), Ninja (sidekick), Knight with flag (hero), Character on broomstick (neutral), Knight in red (villain)

Caveats

Building this end-to-end revealed a few things worth noting for anyone attempting something similar:

Nova Reel false-positive guardrail failures are real and unhandled. The model occasionally rejects perfectly innocent LEGO prompts due to Amazon's internal content filters triggering on ambiguous visual descriptions. There is no automated retry in the current code — it is listed as an open TODO. For a conference demo this was manageable; for production it would need a proper fallback strategy.
FFmpeg in Lambda works here but has limits. The FFmpeg distribution is made up of two binaries: ffmpeg (the encoder and processor) and ffprobe (the media analyser, used to inspect codecs, duration, stream metadata, and so on). Together they push the layer size beyond Lambda's 250 MB unzipped limit, so only ffmpeg is included in the layer. For the video composition task in this project that is sufficient: merging clips, overlaying audio, and freeze-framing the last frame are all ffmpeg operations. But if your use case requires inspecting video files before processing them, you would need to either strip down the ffmpeg binary further using a custom build with only the required codecs, or use a service other than AWS Lambda for the composition step.
The app controls are good enough for a short-lived demo, not for production. Nothing prevents a determined user from clearing their browser cache and picking a new pseudo to generate another story. Also, a pseudo only exists in DynamoDB once a story is created under it, so two people choosing the same name at the same moment will both be allowed through, creating a race condition. For a demo app that runs for a few days with what I consider a trusted audience, these gaps were an acceptable trade-off. They would need proper server-side identity and atomic pseudo reservation before this could be considered robust.
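One way to close the pseudo race condition mentioned above is a conditional write, so that only the first request for a given name succeeds. The sketch below only builds the arguments for a DynamoDB `put_item` call; the key layout (`pk`, `PSEUDO#...`) and attribute names are hypothetical, and it does not address the separate problem of server-side identity.

```python
def build_pseudo_reservation(pseudo: str, now_epoch: int) -> dict:
    """Build put_item arguments that reserve a pseudo only if nobody holds it yet."""
    # Collapse whitespace and ignore case so "Cloud  Knight" and "cloud knight" collide.
    normalized = " ".join(pseudo.split()).casefold()
    if not normalized:
        raise ValueError("pseudo must not be empty")
    return {
        "Item": {
            "pk": f"PSEUDO#{normalized}",   # hypothetical key layout
            "display_name": pseudo.strip(),
            "reserved_at": now_epoch,
        },
        # The write fails if an item with this key already exists,
        # which makes "check and reserve" a single atomic operation.
        "ConditionExpression": "attribute_not_exists(pk)",
    }
```

With a boto3 `Table` resource this would be used as `table.put_item(**build_pseudo_reservation(name, now))`; when the name is already taken, the call raises a `ClientError` whose error code is `ConditionalCheckFailedException`, which the API can translate into a "name already in use" response.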

Conclusion

AWS Community Builders Fantasy Quest was a fun side project that combines image analysis, LLM story generation, text-to-video, text-to-speech, and video composition into one event-driven pipeline — all on AWS, all serverless, all infrastructure-as-code. It was a lot of fun to build and even more fun to play with at the Bern meetup.

The code is open source and available at github.com/mlnrt/cb-fantasy-quest.
