🦄 Making great presentations more accessible.
This project enhances multilingual accessibility and discoverability while preserving the original content. Detailed transcriptions and keyframes capture the nuances and technical insights that convey the full value of each session.
Note: A comprehensive list of re:Invent 2025 transcribed articles is available in this Spreadsheet!
Overview
📖 AWS re:Invent 2025 - How Cires21 revolutionized advanced video workflows using AI on AWS (SMB203)
In this video, Stefano Sandrini from AWS and David González, CTO of Cires21, discuss how Cires21 revolutionizes video workflows using AI on AWS. David explains MediaCopilot, a unified AI platform built on serverless architecture using Lambda, API Gateway, Step Functions, and SageMaker. The platform integrates custom AI models for speech recognition, diarization, and scene detection, along with Amazon Bedrock for metadata generation. Recently, they developed AI agents using AWS AgentCore with MCP server and Gateway, enabling complex tasks like creating viral clips with automated subtitles. Key lessons include selecting appropriate SageMaker deployment models, implementing segmented video processing for parallel inference, and developing specialized agents to optimize token usage. Future plans focus on real-time live processing and integrating visual models for enhanced context.
; This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.
Main Part
AWS Media and Entertainment Solutions: Driving Innovation Through Cloud, AI, and Partner Ecosystems
Thanks everyone for joining this late afternoon session and welcome again to re:Invent. My name is Stefano Sandrini, and I currently lead a team of solutions architects covering small and medium businesses in Europe South and France, specifically Italy, France, Spain, and Portugal. Today, I'm joined by David González, CTO of Cires21. Together, we will show you how Cires21 is revolutionizing advanced video workflows using AI on AWS.
Over the last seven or eight years, I've been fortunate to work with many media and entertainment customers at AWS. Based on our conversations globally with these customers, we see that industry leaders and decision makers choose to use AWS to run their media and entertainment workflows for three main key reasons.
The first reason is the deep expertise and knowledge that AWS has in this specific industry. Decision makers want to leverage this expertise to drive and accelerate the digital transformation of their company. The second reason is about technology. AWS provides purpose-built services as well as pre-built solutions that customers can use to leverage cloud data and AI technology to accelerate innovation in their products, services, platforms, and operations. The third reason is about partners. There is a large ecosystem of AWS partners heavily focused on media and entertainment use cases. When we work together with customers in the media and entertainment industry, they can find the right partner for their use case, the right partner that fits their business needs and unique requirements. Together with AWS and the partners, we can accelerate the time to value for those customers.
When we start working together as AWS, the customer, and the partner, we see impact in six main solution areas for media and entertainment. There are areas related to classic media and entertainment use cases like content production, media supply chain, archive, broadcast, live production, direct to consumer, and streaming. However, there are two areas I want to highlight because they relate to a very common and important topic nowadays in the media and entertainment industry: reinventing monetization across all channels.
This is mainly due to the fact that the media and entertainment industry is going through a transformation. Media and entertainment customers want to use data and analytics to better understand their customer base and enhance the customer experience while consuming media content. They also want to provide a hyper-personalized set of content for customers or cohorts of customers. Additionally, data is used to acquire new customers through specific, targeted advertising based on the data we can collect together.
Beyond these six main areas, we must talk about the impact of generative AI in the media and entertainment industry. There are five main and common use cases of generative AI applied to media and entertainment: archival restoration, enhanced broadcasting, and localization, which is very important for market penetration. You create content for a specific language and geography, and then you want to penetrate other markets by doing automatic dubbing or automatic closed captioning. We already discussed hyper-personalization, but there is one more thing I really like to talk about: video semantic understanding. This means understanding the semantic meaning of video content so you can search using semantic search and create new content.
For example, I can create new content based on what I already created in the past for repurposing, for social media and social channels. We will see some of these use cases together with David. So without any further ado, I would like to ask David to join me on stage to show the MediaCopilot from Cires21.
MediaCopilot: Building a Unified AI-Powered Platform with Serverless Architecture and AgentCore
Thank you, Stefano. Hi, everyone. Cires21 started its business almost 20 years ago. We are a company focused on providing live streaming services in Spain, and our clients are broadcasters. Two years ago, we started to develop our own AI pipelines. Because we work very closely with our customers and clients, we know their workflows very well. We noticed that they were using a very fragmented ecosystem of applications for their regular operations. This caused slow content delivery, high costs, and a lot of duplicated work.
That is why we decided to develop MediaCopilot. MediaCopilot is a unified platform that integrates many different AI model capabilities. It is orchestrated by agents and offers faster content delivery at lower cost. Let me dive into the details of the architecture. This is a classic serverless architecture. We decided to go serverless because it offers a much faster time to market as well as reliability and scalability. The API layer is built on Lambda and API Gateway. We use Step Functions as the core orchestrator of all MediaCopilot workflows, and SageMaker, which we will talk about later. We use Amazon Cognito, which gives us two-factor authentication and single sign-on, and Amazon CloudFront with S3, which offer not only worldwide delivery but also content protection.
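To make the orchestration role of Step Functions concrete, here is a minimal sketch of what a state machine for this kind of pipeline could look like in the Amazon States Language. The state names, Lambda function names, and ordering are illustrative assumptions, not Cires21's actual resources.

```python
import json

# Illustrative Step Functions state machine (Amazon States Language) for a
# MediaCopilot-style VOD workflow: transcode, then speech recognition, then
# metadata generation. All resource names are hypothetical placeholders.
state_machine = {
    "Comment": "VOD ingest -> transcode -> AI enrichment (illustrative)",
    "StartAt": "Transcode",
    "States": {
        "Transcode": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": "StartMediaConvertJob"},
            "Next": "SpeechRecognition",
        },
        "SpeechRecognition": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": "RunSageMakerASR"},
            "Next": "GenerateMetadata",
        },
        "GenerateMetadata": {
            "Type": "Task",
            "Resource": "arn:aws:states:::lambda:invoke",
            "Parameters": {"FunctionName": "BedrockMetadataLambda"},
            "End": True,
        },
    },
}

print(json.dumps(state_machine, indent=2))
```

Each "Task" state would hand off to a Lambda function, which keeps the orchestration logic declarative and the compute fully serverless.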
One of the main reasons why we chose AWS was media services. Media services offer us the services we need to transcode video. We use MediaConvert, for instance, to transcode all the assets that have been unlocked by our clients to set up and send the video to the rest of the AI models. We use MediaLive and MediaPackage to receive live feeds, and we use harvest jobs to do live clipping of the live feed. On top of this, we have developed a video editor that is quite useful for live clipping creation and also adds the possibility to add subtitling and branding and styling capabilities.
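As a rough illustration of the transcoding step, the settings below sketch a MediaConvert job that turns an ingested asset into an MP4 proxy for the downstream AI models. The bucket names, role ARN, and output resolution are placeholders I chose for the example, not real resources.

```python
import json

# Illustrative MediaConvert job settings: transcode a source asset into a
# 720p MP4 proxy. Role ARN and S3 paths are hypothetical placeholders.
job_settings = {
    "Role": "arn:aws:iam::123456789012:role/MediaConvertRole",  # placeholder
    "Settings": {
        "Inputs": [{"FileInput": "s3://example-ingest/asset.mxf"}],
        "OutputGroups": [{
            "Name": "File Group",
            "OutputGroupSettings": {
                "Type": "FILE_GROUP_SETTINGS",
                "FileGroupSettings": {"Destination": "s3://example-proxies/"},
            },
            "Outputs": [{
                "ContainerSettings": {"Container": "MP4"},
                "VideoDescription": {"Width": 1280, "Height": 720},
            }],
        }],
    },
}

# With AWS credentials and the account-specific endpoint, the job would be
# submitted roughly like this:
# import boto3
# mc = boto3.client("mediaconvert", endpoint_url=account_endpoint)
# mc.create_job(**job_settings)
print(json.dumps(job_settings["Settings"]["Inputs"][0]))
```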
As I said previously, we have developed AI pipelines. We have custom automatic speech recognition models, voice activity detection, diarization, and scene detection, so we process not only audio but also video. We also use Amazon Bedrock to create more complex metadata like subtitles, highlights, and summaries. All the capabilities of MediaCopilot are available through the API. Most of our clients use the API instead of the UI because they integrate MediaCopilot into their system workflow, into their CMS, their media asset management, or their live clipping service. So MediaCopilot acts as an AI layer for them.
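To illustrate the Bedrock metadata step, here is a hypothetical request body for summarizing a transcript and proposing highlights, in the Anthropic messages format that Bedrock accepts for Claude models. The model ID and the prompt wording are my assumptions, not what Cires21 actually uses.

```python
import json

# Hypothetical Bedrock request for summary + highlight suggestions from a
# transcript produced by the ASR/diarization pipeline. Prompt and model
# choice are illustrative only.
transcript = "SPEAKER 1: ... SPEAKER 2: ..."  # output of the diarization step
body = {
    "anthropic_version": "bedrock-2023-05-31",
    "max_tokens": 512,
    "messages": [{
        "role": "user",
        "content": (
            "Summarize this session transcript and propose 3 highlight "
            "moments with timestamps:\n" + transcript
        ),
    }],
}

# With AWS credentials configured, the call would look roughly like:
# import boto3
# rt = boto3.client("bedrock-runtime")
# resp = rt.invoke_model(
#     modelId="anthropic.claude-3-haiku-20240307-v1:0",  # example model
#     body=json.dumps(body),
# )
print(body["max_tokens"])
```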
Two months ago, more or less, we started to develop agents for MediaCopilot. Fortunately, AWS released AgentCore, a very interesting service that offers capabilities like runtime, gateway, identity, observability, and memory, which are very scalable, secure, and extensible. So it was very suitable for us.
We started by developing the MCP server. An MCP server is basically a server that exposes tools allowing an agent to connect to different elements like databases, APIs, and code. The first version of the MCP server we developed was created using Gateway, a service that lets us deploy an MCP server in minutes instead of hours: you only need to provide the OpenAPI specification of your API, and it automatically creates all the tools and makes them available to applications and agents.
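The OpenAPI-to-tools idea can be pictured with a tiny spec fragment: each operation in the spec becomes a callable tool for the agent, typically named after its `operationId`. The paths and operation names below are hypothetical, not the real MediaCopilot API.

```python
# Minimal illustrative OpenAPI fragment of the kind a gateway service can
# ingest; each operationId would surface as an MCP tool. Paths and names
# are invented for this example.
openapi_spec = {
    "openapi": "3.0.3",
    "info": {"title": "MediaCopilot API (illustrative)", "version": "1.0"},
    "paths": {
        "/clips": {
            "post": {
                "operationId": "createClip",
                "summary": "Create a clip from an asset",
            }
        },
        "/subtitles/{assetId}": {
            "get": {
                "operationId": "getSubtitles",
                "summary": "Fetch generated subtitles",
            }
        },
    },
}

# Tool names an MCP client might see after the spec is ingested:
tools = [op["operationId"]
         for path in openapi_spec["paths"].values()
         for op in path.values()]
print(tools)  # ['createClip', 'getSubtitles']
```

This is why providing an existing OpenAPI document is so much faster than hand-writing tool definitions: the spec already carries the names, parameters, and descriptions the agent needs.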
Then we started to develop the agent. We developed it with AgentCore Runtime, which is very interesting because it is stateless and handles the session automatically. For each session, AgentCore deploys specific resources for that session, encapsulating the session context. This is critical for us because content protection and privacy are our main priorities, so this is a very cool feature. In this specific example, the user asks the agent to perform a complex task: find the best, most viral moment of an interview, create a vertical clip for social media, burn in the subtitles, and export everything, with the captions as external metadata, automatically. So this is one simple example of what the agent can do.
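The "viral clip" request above decomposes into a chain of tool calls. The pure-Python mock below illustrates that decomposition; the tool names, timestamps, and return shapes are all invented for the example and stand in for calls the agent would make against the real API.

```python
# Mock of the tool chain an agent might execute for the "viral clip"
# request. Everything here is illustrative stand-in logic.
def find_highlight(asset_id):
    # in reality: semantic search over transcript and scene metadata
    return {"start": 125.0, "end": 152.5}

def create_vertical_clip(asset_id, segment):
    return {"clip_id": f"{asset_id}-clip", "aspect": "9:16", **segment}

def burn_subtitles(clip):
    return {**clip, "subtitles": "burned-in"}

def export_with_captions(clip):
    return {**clip, "captions_file": clip["clip_id"] + ".vtt"}

# The agent resolves the user's single request into this call sequence:
segment = find_highlight("interview-42")
clip = export_with_captions(
    burn_subtitles(create_vertical_clip("interview-42", segment))
)
print(clip["captions_file"])  # interview-42-clip.vtt
```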
We also use memory, which is a service of AgentCore that provides short-term memory and long-term memory. This is very useful to recover all the context of past sessions. We use long-term memory to store user preferences so the agent can learn from the user what their preferences are in terms of styling or text generation. Observability is also a very important service for us because it allows us to monitor everything that is happening in sessions of the agent. So we can identify bottlenecks or optimize the workflow.
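The short-term versus long-term split can be sketched in a few lines: session turns live only as long as the session, while extracted preferences persist per user. This toy code is not the AgentCore Memory API, just an illustration of the separation, with a deliberately naive preference-extraction stand-in.

```python
# Toy sketch of short-term vs long-term memory. Not the real service API;
# only the conceptual split is shown.
short_term = {}   # session_id -> list of turns (discarded with the session)
long_term = {}    # user_id -> durable preferences learned over time

def record_turn(session_id, user_id, text):
    short_term.setdefault(session_id, []).append(text)
    if "prefer" in text:  # naive stand-in for real preference extraction
        long_term.setdefault(user_id, []).append(text)

record_turn("s1", "u1", "Make the clip 9:16")
record_turn("s1", "u1", "I prefer yellow subtitles")
print(long_term["u1"])  # ['I prefer yellow subtitles']
```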
The lessons we have learned so far are the following. Developing MediaCopilot has been very challenging. One lesson is that it is quite important to select the right deployment model in SageMaker. SageMaker offers different deployment models. We are using an asynchronous inference endpoint, which is very good for VOD, but it is not enough for live. For live, you need real-time or serverless inference. We are also starting to adopt segmented video processing: if you can segment your video and process the segments in parallel, you can reduce inference time. And serverless was key for us, as I mentioned previously.
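The segmented-processing idea can be sketched with the standard library: split the video's duration into fixed-size segments and fan out inference over them in parallel. The `run_inference` stub below stands in for a per-segment call to a SageMaker endpoint; segment length and worker count are arbitrary example values.

```python
from concurrent.futures import ThreadPoolExecutor

# Sketch of segmented video processing for parallel inference.
def segments(duration_s, segment_s):
    starts = range(0, duration_s, segment_s)
    return [(s, min(s + segment_s, duration_s)) for s in starts]

def run_inference(segment):
    # stand-in for invoking a SageMaker endpoint on one video segment
    start, end = segment
    return {"segment": segment, "result": f"processed {start}-{end}s"}

parts = segments(duration_s=600, segment_s=120)  # 10-min video, 2-min parts
with ThreadPoolExecutor(max_workers=5) as pool:
    results = list(pool.map(run_inference, parts))  # order is preserved

print(len(results))  # 5
```

With five segments processed concurrently instead of one 10-minute pass, wall-clock inference time approaches that of the slowest single segment.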
So the next steps are real-time processing for live, so we can make decisions while the live event is happening. We are also starting to develop specialized AI agents: we find that the fewer tools an agent exposes, the more tokens you save on each call, so several agents with specialized capabilities work better than one large one. Finally, we are integrating new models, especially visual models, to add more context to the agent. Thank you.
; This article is entirely auto-generated using Amazon Bedrock.