
Kazuya

Posted on

AWS re:Invent 2025 - Transforming AI storage economics with Amazon S3 Vectors (STG318)

🦄 Making great presentations more accessible.
This project aims to enhance multilingual accessibility and discoverability while maintaining the integrity of original content. Detailed transcriptions and keyframes preserve the nuances and technical insights that make each session compelling.

Overview

📖 AWS re:Invent 2025 - Transforming AI storage economics with Amazon S3 Vectors (STG318)

In this video, Mark Twomey and Vijaya Chakraborty from AWS, joined by Frank Ouyang from March Networks, introduce Amazon S3 Vectors, the first cloud object store with native vector storage and query capabilities. They explain how vector embeddings encode meaning into mathematics for AI systems, but create cost and complexity challenges at scale. S3 Vectors addresses this by delivering sub-second query performance (down to 100ms for warm queries), supporting up to 2 billion vectors per index, and reducing costs by up to 90% compared to alternatives. The session covers technical details including metadata filtering, hierarchical clustering for efficient search, and integrations with Amazon Bedrock Knowledge Bases and OpenSearch. March Networks demonstrates a real-world application using S3 Vectors for AI-powered video surveillance, enabling semantic search across billions of video snapshots to identify safety compliance issues like blocked exit doors across hundreds of retail locations.


This article is entirely auto-generated while preserving the original presentation content as much as possible. Please note that there may be typos or inaccuracies.

Main Part

Thumbnail 0

Introduction to Amazon S3 Vectors: Solving the AI Memory Problem

Hello everyone. My name is Mark Twomey. I am a Senior Solutions Architect for Amazon S3. I'm here today with my colleague Vijaya Chakraborty, and with Frank Ouyang from March Networks, and we are going to talk to you about Amazon S3 Vectors. Thank you for joining us at AWS re:Invent. I don't know about you, but my fitness tracker thinks I'm running for my life with all the steps between sessions.

Thumbnail 20

What we're going to discuss today is a problem that we have solved and a problem we have created. The problem we have solved is that we have created the Star Trek computer. We now have something where you can ask it a question, it understands what you mean, it will generate an answer, and then try to kill you in the holodeck, like every episode of Star Trek. But in reality, we now have artificial intelligence systems that understand meaning. They understand meaning for a simple reason: when we look at things like text and images, computers have no idea how to analyze or synthesize these things. What we have is vectors.

Vectors are how we encode meaning into mathematics. They are ordered arrays of numbers that allow us to encode meaning into mathematics—text, images, sounds, movies. They allow artificial intelligence systems to analyze and synthesize. Your brain does this automatically. Evolution has built it that way. You're doing it right now. Computers are mathematical engines, so we need some form of intermediate layer for a computer to understand meaning. That intermediate layer is what we call a vector embedding.

What are vector embeddings? They are the universal language of meaning in computer science. If it doesn't have a vector embedding right now, it will have one soon. This is how we are turning the world, allowing artificial intelligence systems to understand the world and understand us. We have to use some form of mathematics to do that. A vector embedding is the equivalent of a brain scan of an embedding model. You have these incredibly powerful embedding models available from Amazon through Titan or Nova and other models, as well as third-party models available in Bedrock. You pass it something—it could be a sentence, a paragraph, a document, media, an image, a whole bunch of pixels—and what the embedding model thinks about that becomes a vector embedding.

Thumbnail 130

These things are very tightly aligned as a result of capturing a layer inside an embedding model and converting that into a large amount of mathematics, which you probably want to store somewhere. It is like the long-term memory of artificial intelligence. That's what vectors are. If you look at the graphics here, we have embeddings of different music tracks. You'll notice that they're similar colors and they cluster together. That's what vector embeddings do when they understand stuff—they cluster together.

Thumbnail 220

What makes vector embeddings special? I mentioned already that they convert data into an ordered array of numbers. The features of something, what an embedding model thinks about what you have shown it, gets encoded across all of these numbers. The individual numbers we call dimensions, usually 32-bit floating point numbers, and you end up with this vector embedding, this one string that could go on for tens of pages. That entire thing is evaluated all at once. Vector embeddings quickly start stacking up. That is the problem that we have created. We have given artificial intelligence this long-term memory, but in doing so we have created a cost and complexity problem, because these things need to be stored, managed, and made available. That's the problem that we look to address with Amazon S3 Vectors.

Thumbnail 290

The Mathematics of Meaning: How Vector Embeddings Enable AI Understanding

Now, because vector embeddings are mathematical, you can perform arithmetic in them. If you were to present the idea of a dollar—I don't have one on me right now because everyone's using cards or some form of tap—but if you were to present a dollar, it is a unit of exchange.

It is a monetary instrument. It's legal tender. It may or may not be a physical object, because it can also be a virtual object through some form of code. What an embedding does is convert all of that into a string of numbers. If you were to take the meaning of a dollar and subtract the meaning of the United States from it, but add the meaning of Europe, and then ask the embedding model what this number is, you're going to get back a set of results. In this case, the most similar result would be a euro. This isn't a party trick. It's how vector embeddings work. You can do this with other ideas. It's a principle of computer science. You can add, subtract, multiply these numbers, and get answers from them.
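Under the hood this is ordinary arithmetic on arrays. Here is a minimal sketch in Python; the tiny 4-dimensional vectors and their values are invented purely for illustration (real embedding models produce hundreds or thousands of dimensions):

```python
import math

# Toy 4-dimensional "embeddings", hand-made so the arithmetic is visible.
emb = {
    "dollar": [0.9, 0.8, 0.1, 0.1],  # dims: currency-ness, US-ness, EU-ness, other
    "euro":   [0.9, 0.1, 0.8, 0.1],
    "usa":    [0.1, 0.9, 0.1, 0.1],
    "europe": [0.1, 0.1, 0.9, 0.1],
    "pizza":  [0.1, 0.1, 0.3, 0.9],
}

def cosine(a, b):
    """Cosine similarity: how aligned two vectors are, independent of length."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(y * y for y in b))
    return dot / (na * nb)

# dollar - usa + europe: the arithmetic the speaker describes.
query = [d - u + e for d, u, e in zip(emb["dollar"], emb["usa"], emb["europe"])]

# Rank every stored embedding by similarity to the computed point.
ranked = sorted(emb, key=lambda k: cosine(query, emb[k]), reverse=True)
print(ranked[0])  # -> euro
```

With these toy values, the point nearest to "dollar minus USA plus Europe" is indeed "euro", which is the whole trick: meaning becomes geometry, so questions become distance calculations.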

Thumbnail 360

The internet is the world's greatest repository of unstructured data. Outside of the Great Library of Alexandria, I would say it is the greatest source of unstructured information in history. Amazon S3 is the cloud-scale unstructured repository, the largest one in the world. You can take that as a data source and chunk it up. You're not going to provide everything; you decide to take all this data, which could be massive movies, text, images, or whatever else is going on. You could have scanned images of the world's art. Then you present that to an embedding model. AWS supports many embedding models, and we have many third-party partners as well as our own models. The embedding model then generates those vectors, but they need to be stored somewhere, and that usually goes into some form of vector store. Amazon offers several vector stores.

Thumbnail 440

How do you get the meaning back out once you've stored it? I've taken the vectors, I know what the model is thinking about or what it is perceiving at the time. How do I do that? Well, you need to search across all of these vectors, and this is a mathematical operation. Traditional databases are absolutely useless for this work. We are in the world of meaning, we are in the world of similarity, we are in the world of mathematical distance. Traditional databases with rows, tables, columns, and joins don't work in this case. You need specialized algorithms. This is about distance. To work with distance effectively and get an answer back in a reasonable amount of time, you need to narrow the search space, measure distances between all these different mathematical points that you've stored, and then rank and return a set of results.

Thumbnail 500

Vector Search in Action: From Recommendation Engines to the Vector Explosion

We have the idea of a nearest neighbor search. Vectors are in the domain of similarity, finding things that are similar, not exact. If you want exact matching lexical searches, that's what traditional databases are for. Recommendation engines are the backbone of the internet and your experience with the internet right now. You see them for things like Prime Video or any other streaming service. These systems recommend you something that you might like. Let's say I watch a superhero satire show like The Boys, an Amazon show. Maybe I want something more like that. How would I find other shows like that? How would the system recommend these things for me?

Since my account tends to be the default one, other people watch things in my account, things that maybe I would not be interested in, which means I somehow get weird results. For example, I could watch historical epics or superhero satires, and I will get recommendations for The Summer I Turned Pretty. This is a show aimed at young women and teenage girls. There are also various Korean dramas, which are watched a lot in my house as well. No matter how many times you tell someone to use their own account, recommendations are always interesting. I mentioned earlier that vector embeddings cluster together. Well, if you were to turn a television show into a vector, that also will cluster together, and that's how a system can recommend something to me.

Thumbnail 600

Let's say we narrow the sphere here. We have narrowed our search area. I'm just interested in superhero satires. This index has other things in there too. There could be production teams who have worked on different shows, there could be actors that are common.

However, we have narrowed the search space and gotten in as close as we can, as quickly as we can, using specialized algorithms to do that. Then we're going to calculate distances between everything in here to rank and rate what is interesting to me.

Thumbnail 630

Once we rank these things and return results, I've done a vector search for whatever The Boys is. It has brought back and recommended other superhero shows to me, like Invincible, another Amazon show. How amazing, at an Amazon event and I'm plugging all the Amazon shows. But there are other shows here that are not Amazon shows, and it recommends those too. It also recommends The Summer I Turned Pretty no matter what I do to try and prevent it from doing that.

Thumbnail 660

All of this, turning everything we have into recommendations and into meaning, has led to this explosion of vectors. With these embedding models, they tend to be specialized. Some do text, some do audio, some do video. But giving people the ability to search across all of these media, you could create billions and billions of things out there, as Carl Sagan used to say about the stars in the galaxy. You create these billions of vectors, and all of this adds up.

A single vector embedding can be 4 KB. Then there's granularity: maybe you embed whole movies, or maybe you embed individual scenes from movies. Depending on how many movies you have, you can be approaching gigabytes, tens of gigabytes for just that one data set. Granularity matters here. The more granular you decide to get, the more vector embeddings you're going to generate, and the more storage you're going to use.
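The granularity decision above is easy to put into numbers. A quick back-of-envelope sketch (the catalog sizes are invented for illustration; the 4 KB figure follows from 1,024 dimensions of 32-bit floats):

```python
# Sizing the granularity decision: whole movies vs. individual scenes.
dims = 1024                              # dimensions per embedding
vector_kb = dims * 4 / 1024              # 32-bit floats -> 4 KB per embedding

movies = 100_000                         # hypothetical catalog size
scenes_per_movie = 200                   # hypothetical granularity choice

per_movie_gb = movies * vector_kb / 1024 ** 2                     # one embedding per movie
per_scene_gb = movies * scenes_per_movie * vector_kb / 1024 ** 2  # one per scene

print(round(per_movie_gb, 2), round(per_scene_gb, 2))  # ~0.38 GB vs ~76.29 GB
```

Same catalog, same model: choosing scene-level granularity multiplies storage by the number of scenes, which is exactly how data sets climb from gigabytes toward terabytes.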

The mathematics compounds fast. Vector storage can reach gigabytes and terabytes, and there's always someone out there who's going to do a petabyte. The moment I say no one is ever going to hit that number, someone out there is going to say, I'm going to hit that number, right? The vector explosion is here.

Thumbnail 750

Addressing Scale and Cost Challenges with Amazon S3 Vectors

We have more stuff coming all the time, and as artificial intelligence proliferates, you're just going to have more of these, not less. There will be more vector embeddings for you to manage, not less going forward. For that, you need a specialized vector store. I mentioned that you can have pages of numbers. We call that dimensionality. The more collections of numbers you have, the more dimensions you have.

What are dimensions? Dimensions are features of something. Everything gets encoded into one string of numbers. So you need something that can handle pages and pages and pages of numbers for one embedding. It needs to be able to do that search where you can zoom in very quickly, calculate distances between everything that might be interesting to you, and return results in milliseconds, under one to two seconds. You need to be able to do that fast.

Thumbnail 800

It also needs massive scalability. You need massive scale in this stuff and specialized indexing algorithms to make sure that the data you put in is there and available immediately. Storing vectors at scale gets expensive fast. There are compute costs, baseline compute. You're not doing any queries, but you're still burning CPU cycles and you're paying for that.

There are memory requirements. RAM is about to get very, very scarce in our industry. I've seen today some providers actually pulling out of the consumer market because they're focusing on the enterprise market for RAM production. But memory requirements for vector stores tend to be incredibly large. We're talking tens of gigabytes, hundreds of gigabytes for vector performance.

Thumbnail 860

Thumbnail 880

Then there's licensing costs. Enterprise-style licensing for these vector stores means you're going to pay for backup, you're going to pay for your queries, you're going to pay for anything else that's going on there. You pay, pay, pay. That's why in July of this year, in preview, we introduced Amazon S3 Vectors, the first cloud object store with native support to store and query vectors. We've taken S3 scale and S3 performance and made that available for vectors. When you consider internet scale and cloud scale, S3 is the thing to think about, and we're doing that for vectors.

What did we deliver in July? We delivered sub-second vector query performance at scale, up to 90% lower costs for storing, uploading, and querying, and the ability to store billions of vectors. We had an index size limit of 50 million vectors when we first came out. That has now changed.

Thumbnail 910

General Availability Announcement: Enhanced Performance and Global Expansion

And to talk about what has changed and where Amazon S3 Vectors is today, I'd like to invite Vijaya on stage, product manager for Amazon S3 Vectors, who's going to take you through vectors now and vectors going forward. Vijaya, thank you. Thanks, Mark. Hi everyone, my name is Vijaya Chakraborty, and I'm really excited to talk to you all about S3 Vectors. Just like Mark said, Amazon S3 Vectors is the first cloud object store to have native support for storing, querying, as well as uploading vectors. At public preview, we were delivering subsecond latency, which is critical for you to build AI applications that need instant query results. We also shipped a specialized set of APIs that are purpose-built for vector workloads, making it really straightforward for you to build search applications or RAG workflows. For those of you that are building RAG workflows right at preview, S3 Vectors integrated with Bedrock Knowledge Bases as well as Amazon OpenSearch to build powerful hybrid search applications.

Thumbnail 980

Thumbnail 1010

Since that public preview date in the last four months, we've had tens of thousands of customers using S3 Vectors. In fact, we've seen more than 250,000 indexes created. They've ingested more than 40 billion vectors on S3 Vectors, and in the past two weeks we crossed the 1 billion mark of queries that customers have run on S3 Vectors. Let's look into two separate examples of how customers are building on top of S3 Vectors. Here's an example of a biotech firm that's using S3 Vectors to build semantic search on scientific literature.

In this particular biotech firm, we have a team of PhD scientists as well as entrepreneurs whose job it is to find the next breakthrough hypotheses and turn them into billion-dollar companies. But here's the challenge. When you're looking for the next discovery, you need to first know what's already been researched, and we're not talking about looking through a few documents or even thousands of documents. We're talking about 30 million research papers, scientific journals, information from PubMed, clinical trials, all of that. Previously, it used to take the scientists weeks to manually review these databases and piece together insights that they could then put into their next project.

Here's how they're changing that with Amazon S3 Vectors. As you can see on the left-hand side, they're taking all of their scientific documents. These are all those PDF documents, research papers, scientific journals. They're putting them into a general-purpose Amazon S3 bucket. They're processing them through machine learning models and then generating embeddings on the other side. Now they're ingesting all of those embeddings into S3 Vectors, and we're talking about hundreds of millions of embeddings in S3 Vectors to represent all of the scientific literature.

Thumbnail 1120

Now let's say a scientist comes along and they want to do a search like "find me novel protein targets for autoimmune diseases." They're not having to do manual keyword searches. They're using S3 Vectors to power a top-K nearest neighbor search that returns results within a second and also is doing metadata filtering. What we've done here is we've compressed that week-long process or weeks-long process into a subsecond semantic search.

Here's a different one. A lot of us are building agentic applications. We're thinking about agents, and I'm sure a lot of you in this room are thinking about or building agents yourself. With agents, you have a new version of the same exact problem. Instead of scientists having to go through millions of documents, it's the agents that now need to go through thousands or hundreds of thousands of tools. In fact, when you're building agents for your organization, you first need to give your agents access to all the tools in your organization. These are APIs, Lambda functions, database connections; it could be your Slack integration or Salesforce integration, it could be anything, right? And what we've seen is that when you give your agents access to that many tools, the agent can get confused. It can even hallucinate and come up with tool ideas that don't even exist.

Here's how Amazon Bedrock Agent Core gateway is changing that with S3 Vectors. When you build an agentic application or you build an agent with Agent Core, Agent Core takes your tool information, for example, the tool name, description, parameters, and converts those into embeddings and stores those into Amazon S3 Vectors. So now let's say you're trying to accomplish a task through the agent like find order information or send customer an email. You're not having to get back those thousands of tools.

S3 Vectors is doing a subsecond semantic search and giving Agent Core back the 5 or 10 relevant results. Those are just two different examples of how customers are building on S3 Vectors. It's been overwhelming looking at how customers are finding different ways of building on top of S3 Vectors. We've seen customers build all sorts of searches on text, on images, on videos. There are obviously RAG workflows that customers are using it for. They're using it for building agent memory. We have life sciences companies also using it to store embeddings of molecular representations and chemical notations.

Thumbnail 1260

As they've been building with S3 Vectors, they've been giving us a lot of feedback. Here on this slide are the top 4 things that we've been hearing from customers since public preview. First of all, they've asked for lower latencies. As customers are building user-facing applications, they're building chatbots, they're building more interactive search experiences or recommendation engines. Every millisecond matters. Second, they're asking for bigger size of indexes as their vector applications are growing. They want the infrastructure underneath to scale with it.

Thumbnail 1310

Third, faster ingestion of vectors. A lot of the customers want the flexibility of sending tiny batches of vectors to S3 Vectors but want to do it at a much higher concurrency to accelerate the ingestion process. And then more AWS regions. They wanted to see S3 Vectors closer to their data, closer to their users, and in compliance with their regional requirements. We listened to the customer feedback and in the past 4 months since public preview we have continuously improved the product. We're really proud to announce the general availability release of S3 Vectors.

We're down to 100 milliseconds of warm query latency. We've grown the index to 2 billion vectors per index, and you can create 10,000 indexes per vector bucket, so you can in fact store trillions of vectors. For those of you that wanted the flexibility of sending smaller batches or a vector at a time to S3 Vectors, we've improved the write throughput to 1,000 transactions per second. And we've done all of this while keeping the headline the same: with S3 Vectors you can save up to 90 percent costs for storing, uploading, and querying vectors as compared to alternative vector solutions.

Thumbnail 1370

At GA we've grown 40x in fact since public preview. We went from 50 million vectors per index to 2 billion vectors per index. And you can create 10,000 vector indexes per vector bucket and 10,000 vector buckets per AWS account. So essentially you're storing not billions but trillions of vectors and not having to worry about costs, and this is directly on S3. We're still at subsecond query latency for cold queries, and this is the thing that makes it really cost effective for batch workloads or infrequent workloads of queries.

Thumbnail 1440

However, for warm queries we're down to as low as 100 milliseconds. This is really good for when you have repeated or similar query data. S3 Vectors also automatically caches this data for several minutes since last access. We've added additional capabilities, for example, index level encryption, tagging for you to do cost allocation and access control, CloudFormation, as well as PrivateLink for those of you that want private connectivity into S3 Vectors. And we've added S3 Vectors to 9 additional AWS regions, bringing the total up to 14 AWS regions since public preview.

Core Components and Architecture: Vector Buckets, Indexes, and Query APIs

Let's look at some of the core components that make this all work. First of all, you have vector buckets. Think of these as specialized buckets that are purpose-built for vector workloads. Unlike traditional S3 buckets that are purpose-built for objects, these buckets are able to meet the unique requirements of storing, uploading, and querying high-dimensional vector data. Second, vector index. This is a new data structure within the vector bucket, and this contains your vectors as well as the metadata associated with those vectors.

The vector index is the thing that's maintaining the relationship across these vectors to power that sort of similarity search. The metadata that you associate with these vectors could really be anything: document ID, timestamp, release, genre, anything that you want to use to filter your search on or associate with the vectors. And then third, vector APIs. We've shipped a complete set of dedicated APIs that are purpose-built for vector workloads. These are built from the ground up and make it straightforward for you to upload vectors, list vectors, query vectors, and delete vectors when you need them.

Thumbnail 1520

Let's specifically talk about what the query vectors API looks like. This is what you'll use to do similarity search as well as filter on metadata. Query vectors in S3 Vectors doesn't just do a similarity search—it does more than that. It performs similarity search while filtering on metadata simultaneously.

On the right-hand side of the slide, you can see what the query vector API looks like. To start a search, you provide a few things to S3 Vectors. First, you provide the query vector. Second, you provide the number of nearest neighbor results you want back, which we also call the top K. You also provide the vector index that you need to query, as well as optionally the metadata that you want to filter on. S3 Vectors doesn't do a similarity search and then filter those results against metadata. Instead, it performs metadata filtering while doing the similarity search, so you're more likely to get the results you actually want—similar items that also meet your business logic.
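As a rough sketch, the request described above can be assembled like this. The field names (`vectorBucketName`, `queryVector`, `topK`, and so on) follow the S3 Vectors API as presented in the session, and the bucket, index, and filter values are invented; verify all names against the current SDK documentation before use:

```python
# Shape of a QueryVectors request (names are assumptions from the talk;
# check the current boto3 "s3vectors" client reference).
request = {
    "vectorBucketName": "media-vectors",              # hypothetical bucket
    "indexName": "movies",                            # hypothetical index
    "queryVector": {"float32": [0.12, -0.07, 0.33]},  # embedding of the search text
    "topK": 10,                                       # nearest neighbors to return
    "filter": {"genre": "superhero-satire"},          # optional metadata filter
    "returnMetadata": True,
    "returnDistance": True,
}

# With boto3 this would be sent as:
#   response = boto3.client("s3vectors").query_vectors(**request)
print(sorted(request))
```

Note that `filter` travels with the query itself, which is what lets the service apply metadata filtering during the similarity search rather than after it.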

Thumbnail 1610

Thumbnail 1620

By default with the query vectors API, you get back the vector keys. Optionally, you can also get back the vector embeddings, the cosine distance like we saw in the Prime Video example, the actual scores, as well as optionally the metadata associated with these vectors. I'm going to use this particular example of a movie index to talk about the different types of metadata that you can associate with vectors. There are really two things: filterable metadata and non-filterable metadata.

Filterable metadata, as you can see in this example, includes things like genre, release year, and rating—all the things you want to filter your search on to narrow down your results. We provide a bunch of operators to use with your filterable metadata, for example, equality, boolean conditions, and lists. Non-filterable metadata is interesting because this is something you're not filtering against. However, you might want this data back along with the vector keys as part of your nearest neighbor results. This can contain things like original text and the source URL. The source URL can be used to point back to the asset that's being described by this particular vector embedding.
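For the movie index above, a compound filter combining those operators might look like the following. The `$`-prefixed operator names (`$and`, `$in`, `$gte`) follow the MongoDB-style filter syntax the service documents; treat the exact set of supported operators as something to confirm against the current S3 Vectors documentation:

```python
# Compound metadata filter for the hypothetical movie index.
movie_filter = {
    "$and": [
        {"genre": {"$in": ["superhero", "satire"]}},  # list membership
        {"release_year": {"$gte": 2019}},             # numeric comparison
        {"rating": {"$gte": 7.5}},                    # numeric comparison
    ]
}
print(len(movie_filter["$and"]))
```

A filter like this would be passed as the `filter` field of a query, narrowing the nearest-neighbor search to vectors whose filterable metadata satisfies all three conditions.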

Thumbnail 1690

The number of key-value pairs or metadata in total that you can associate with each vector is 50—that's 10 times what we had at public preview. By default, all of those key-value pairs or all of that metadata is filterable. However, you can specify 10 of those key-value pairs to be non-filterable.

So how does this nearest neighbor search work on S3 Vectors? When you're ingesting vectors into a vector index, S3 Vectors organizes your vector space using advanced partitioning techniques to power semantic searches. Instead of brute-force searching across millions, billions, or trillions of embeddings, S3 Vectors divides that vector space into hierarchical clusters of vectors. When the query vector comes in, S3 Vectors navigates those clusters, pruning away the irrelevant clusters and zooming in on the most promising candidates. As a result, you perform the cosine distance computation only across a subset of your vectors, maybe a few thousand, rather than across millions and billions of vector embeddings.
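The cluster-then-prune idea can be shown with a deliberately tiny toy. This is not S3 Vectors' actual algorithm, just a sketch of the principle: compare against a coarse level first, then do exact distance math only inside the surviving clusters:

```python
import random

random.seed(0)

def dist2(a, b):
    """Squared Euclidean distance between two points."""
    return sum((x - y) ** 2 for x, y in zip(a, b))

# Toy index: 3,000 two-dimensional vectors grouped around three centroids
# (real systems do this with high-dimensional embeddings and many levels).
centroids = [(0.0, 0.0), (10.0, 10.0), (20.0, 0.0)]
clusters = {
    i: [(cx + random.random(), cy + random.random()) for _ in range(1000)]
    for i, (cx, cy) in enumerate(centroids)
}

def query(q, top_k=5, probe=1):
    # Step 1: navigate the coarse level; keep only the `probe` closest clusters.
    nearest = sorted(range(len(centroids)), key=lambda i: dist2(q, centroids[i]))[:probe]
    # Step 2: exact distance computation only inside the surviving clusters.
    candidates = [v for i in nearest for v in clusters[i]]
    return sorted(candidates, key=lambda v: dist2(q, v))[:top_k]

# Only ~1,000 of the 3,000 stored vectors are ever compared against the query.
hits = query((10.2, 10.3))
```

Two of the three clusters are pruned before any fine-grained distance math happens, which is why the approach scales: the cost of a query grows with the size of the promising region, not the size of the whole index.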

Thumbnail 1750

Thumbnail 1770

The outcome is that you can easily perform similarity search across billions or trillions of vectors with subsecond query latency for cold queries and as low as 100 milliseconds for warm queries, and you can do all of this with very high recall accuracy. This is the last thing I want to touch on regarding the internals of S3 Vectors. S3 Vectors shares a lot of the characteristics of Amazon S3 that have made S3 the foundation for millions of applications for nearly two decades. First, availability—your vector data has the same four nines of availability as S3. Durability—yes, we are 11 nines durable, so your vector data has the same amount of redundancy and protection as any other object in S3. Performance—S3 Vectors intelligently caches similar or repeated query data to get you down to that subsecond latency.

This is similar to how Amazon S3 optimizes reads based on access patterns. Finally, S3 Vectors adapts to your request patterns transparently. When your vector applications scale from 10,000 vectors to millions or billions, the infrastructure underneath adapts with it. What we love about this architecture is the simplicity. You're not managing infrastructure, worrying about scaling up your applications, or managing complex cluster configurations. This is all S3—the same scalable, reliable, cost-effective storage that we all love and trust, just purpose-built for vector workloads.

Thumbnail 1870

Building with S3 Vectors: Implementation Best Practices and Pricing Model

We've talked about some of the internals of how all of this works together. Now let's see what it takes to build a similarity search with S3 Vectors. The first thing you're going to do is start with your source data, which could be anything—text, images, videos, customer testimonials, or any information you're trying to build a search on top of. You're going to take all of that data, run it through machine learning models, and generate embeddings. Then you're going to use a CLI or SDK to ingest those vectors into an S3 vector index.

There are two performance best practices I'd like you to keep in mind. Even though the vector index has grown to 2 billion vectors per index, if you are able to isolate your vector data by users, by regions, or by some sort of business logic, we still recommend that you distribute your vector data across multiple indexes. This can really help accelerate your ingestion because you're able to write in parallel across multiple vector indexes at a time. Also, when you query, you're always querying a subset of your vector space, which can potentially improve your query latency as well as lower your query costs.

The second thing to keep in mind is that you have the ability to send several hundred vectors in a single batch to S3 Vectors, or to send one vector at a time in micro-batches at up to 1,000 transactions per second to accelerate that ingestion process. Once you've done those things and inserted those vectors, you're ready for the search side to come in. A user sends the search text, which also needs to be run through the embedding model, the same embedding model that you used for your source data. You generate an embedding from that and send that search embedding to S3 Vectors, and there you have it. S3 Vectors sends you back the top K nearest neighbors, as well as, optionally, the metadata if you chose that. It really is as straightforward as it looks.
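The ingestion side of that pipeline can be sketched as follows. The payload field names (`vectorBucketName`, `data`, `float32`) follow the S3 Vectors API as presented and should be verified against the current SDK documentation; the bucket name, index name, and the stand-in embedding function are all invented for illustration:

```python
def build_put_vectors_request(bucket, index, chunks, embed):
    """Assemble a PutVectors-style payload (field names are assumptions from
    the talk; check the current boto3 "s3vectors" client reference)."""
    return {
        "vectorBucketName": bucket,
        "indexName": index,
        "vectors": [
            {
                "key": f"chunk-{i:06d}",            # stable per-chunk key
                "data": {"float32": embed(text)},   # the embedding itself
                "metadata": {"source_text": text},  # payload returned with hits
            }
            for i, text in enumerate(chunks)
        ],
    }

# Stand-in embedding model; a real pipeline would call the same Bedrock
# embedding model at both ingestion time and query time.
def fake_embed(text):
    return [float(len(text)), 0.0, 1.0]

req = build_put_vectors_request("docs-vectors", "papers", ["hello", "world!"], fake_embed)
# client = boto3.client("s3vectors"); client.put_vectors(**req)
```

Batching many such vectors per request, across multiple indexes in parallel, is what the ingestion best practices above are about.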

Thumbnail 1980

We're going to change gears a little bit and talk about S3 Vectors pricing. S3 Vectors changes the cost model for vector storage costs with three really simple pay-per-use components. First is put operations. You're only paying for the writes or uploads when you're doing them. You're not paying for them otherwise. Second is storage. You're paying a fraction of the cost for storage because you're taking advantage of S3's industry-leading cost effectiveness. Third, you're only paying for queries when you run those queries. You don't have compute spun up, you're not managing compute infrastructure, and you're not managing idle capacity. You're only paying for queries when you run those queries. Additionally, you get a lower query tier if you're managing bigger indexes.

Thumbnail 2010

We're going to walk through a particular example to see what the math looks like. Here we're building out a RAG workflow. In this example, let's say you have 10 million vectors, each with 1,024 dimensions, which comes to roughly 4 kilobytes per vector. Now we're going to add 2 kilobytes of metadata to each vector—1 kilobyte filterable and 1 kilobyte non-filterable—so we're at roughly 6 kilobytes per vector. We're going to distribute these 10 million vectors across 40 indexes, so each index holds 250,000 vectors.

Thumbnail 2070

As our applications and data change, we're going to want to keep these vector indexes updated.

Let's assume that we're going to roughly upload or rewrite 25% of each of those indexes every month. And then finally, assuming we're running 1 million queries per month across all of these indexes, let's see what the cost looks like.

Thumbnail 2140

First, let's look at the storage costs. We talked about roughly 6 kilobytes per vector including metadata, which comes to about 59 GB of total vector storage. At 6 cents per GB, that's $3.54 per month in storage cost.

Thumbnail 2160

For PUT cost, there are two components. First, there's the upfront cost of writing all of these vectors into S3 Vectors for the first time, which is about $11.80 at the 20-cents-per-GB rate. Then remember we talked about refreshing the vector indexes—25% of each index per month—which comes to roughly $2.95 recurring per month.

Thumbnail 2200

Thumbnail 2210

For 1 million queries, let's see what that amounts to. First there's the request charge for the 1 million queries, which is $2.50. Then there's the tiered query charge: for the first 100,000 vectors in each index you pay the tier-one rate, which comes to $1.95, and you take advantage of the lower tier for the remaining 150,000 vectors, which comes to roughly $1.46 across the 1 million queries. So let's bring that all together. We're looking at a total upfront cost of $11.80, and a recurring cost for storage, writes, and queries that comes to about $12.41 per month. Even if I put all of that together, it still costs less than your lunch.
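The arithmetic above can be checked directly. The rates and the 59 GB figure are taken from the talk, and the two query-tier amounts are used as quoted rather than derived from per-vector rates:

```python
# Monthly cost sketch using the figures quoted in the talk:
# $0.06/GB-month storage, $0.20/GB PUT, $2.50 per million query requests.
GB = 59                           # ~10M vectors x ~6 KB each, as quoted

storage = GB * 0.06               # $3.54 per month
initial_put = GB * 0.20           # $11.80 one-time, first full write
refresh_put = GB * 0.25 * 0.20    # 25% of each index rewritten monthly -> $2.95

query_requests = 2.50             # 1M queries x $2.50 per million
query_tier1 = 1.95                # first 100K vectors per index, as quoted
query_tier2 = 1.46                # remaining 150K vectors at the lower tier

recurring = storage + refresh_put + query_requests + query_tier1 + query_tier2
print(f"one-time: ${initial_put:.2f}, recurring: ${recurring:.2f}/month")
```

Summing the quoted components gives roughly $12.40 per month; the ~$12.41 quoted in the talk reflects rounding of the intermediate figures.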

Thumbnail 2230

AWS Ecosystem Integrations: Bedrock Knowledge Bases and OpenSearch

For the next few slides, we're going to talk about the integrations that we've had since public preview. There are two integrations that I want to cover. S3 Vectors integrates with Amazon Bedrock Knowledge Bases, allowing you to use an S3 vector bucket as the backing vector store; you can create Bedrock knowledge bases directly in Bedrock or in SageMaker Unified Studio. S3 Vectors also integrates with Amazon OpenSearch, for when you're looking for lower latencies or want to build powerful hybrid search applications.

Thumbnail 2260

Here's what that first integration looks like. Whenever you're starting from source data—whatever that source data is—processing it through machine learning models, generating embeddings, and ingesting them into a vector store, there are a lot of steps and variables to consider. Amazon Bedrock Knowledge Bases is your fully managed end-to-end workflow for doing that, and now you have the ability to choose S3 Vectors as the backing vector store behind Bedrock Knowledge Bases.

Thumbnail 2290

Here's the second integration that we talked about. S3 Vectors integrates with managed OpenSearch as well as with OpenSearch Serverless. The first pattern we tend to see is for cost reduction. Those of you in the room who have used managed OpenSearch will know the cost of compute can really add up. This is why you have the option to use S3 Vectors as the backing vector store for similarity search while still using managed OpenSearch's hybrid search capabilities.

The second pattern we see is for better performance and lower latencies, as well as for hybrid search. You ingest all your vectors into S3 vector bucket indexes, and you have the ability to export those vectors into OpenSearch Serverless collections. You get the cost benefits of using S3 Vectors, and you can still use the OpenSearch APIs for hybrid search or anything else you prefer from OpenSearch.

Thumbnail 2360

We've had really good customer engagement since public preview with the Bedrock Knowledge Bases integration as well as the OpenSearch integrations. With that, I've covered everything I wanted to say about S3 Vectors. I'd love to bring Frank Ouyang on stage to talk about how March Networks is using S3 Vectors to build AI smart search.

March Networks Case Study: AI-Powered Video Surveillance at Scale

Thank you. Hello everyone, my name is Frank Ouyang. I'm a Vice President of R&D at March Networks. Thank you for joining us. March Networks is a leader in intelligent video surveillance solutions. What sets us apart in the industry is our ability to provide business intelligence to a wide range of teams across an organization, from security and IT to compliance and marketing. Thanks to our advanced AI analytics, we transform video footage into actionable insights.

March Networks' vision is simple: to lead the future of intelligent video solutions. With AWS as a key partner, we are making that vision a reality. Today, I'm here to share a recent collaboration between March Networks and AWS in building video AI smart search. I hope you'll walk away with some practical insights that you can apply to your own projects.

Thumbnail 2450

For decades, video surveillance has primarily focused on recording events, not interpreting them. Security teams often spend hours reviewing footage, trying to find meaningful moments. Generative AI has changed the game. It's transforming what video systems can do. They are no longer passive watchers, but intelligent partners that help organizations operate smarter, safer, and more efficiently.

With generative AI, there's no need to spend hours scrolling through footage. You can simply ask questions such as "show me blocked exit doors" and receive results in seconds. Vector databases are a key element of that magic. The challenge is making the system cost-effective and scalable. Many of our customers produce thousands of hours of video every hour. That's millions of vectors, and they often want to search up to two years of footage, so the number of vectors runs into the billions.

Thumbnail 2520

Let's use one use case, blocked exit door detection, to appreciate the challenge and impact. It is a safety compliance requirement for retailers to ensure exit doors are not blocked, and the fines for violations can be hefty; we have a customer who paid a few million dollars last year. Instead of reactively paying fines, retailers want to actively look for blocked exits, identify hotspots, and then do targeted training and improve operational procedures.

For a retailer with hundreds of locations, checking in person is not possible. There are cameras pointed at exit doors, but they produce thousands of hours of video, and identifying blocked exit doors by watching video is like searching for a needle in a haystack. AI-powered search backed by vector databases can do semantic search across massive video content, allowing retailers to identify blocked exit doors across hundreds of locations with a single query: "Show me blocked exit doors." This turns passive video surveillance into an active compliance tool for building more compliant and safer operations.
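Under the hood, "semantic search" reduces to nearest-neighbor ranking of embedding vectors. Here's a toy sketch using 3-dimensional stand-ins (real Titan embeddings have 1,024 dimensions, and S3 Vectors performs this ranking for you at scale); the snapshot keys are invented for illustration:

```python
import math

def cosine(a, b):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)

def top_k(query, snapshots, k=2):
    """Rank snapshot embeddings by similarity to the query embedding."""
    scored = [(key, cosine(query, vec)) for key, vec in snapshots.items()]
    return sorted(scored, key=lambda kv: kv[1], reverse=True)[:k]

# Toy 3-d embeddings standing in for 1,024-d model output
snapshots = {
    "cam12/door-blocked.jpg": [0.9, 0.1, 0.0],
    "cam12/door-clear.jpg":   [0.1, 0.9, 0.0],
    "cam07/aisle.jpg":        [0.0, 0.2, 0.9],
}
query = [1.0, 0.0, 0.1]  # embedding of "show me blocked exit doors"
results = top_k(query, snapshots)  # blocked-door snapshot ranks first
```

Because snapshots with similar visual content land near each other in embedding space, the blocked-door frame scores highest even though the query never mentions a camera, file name, or timestamp.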

Thumbnail 2620

We worked very closely with AWS, and this partnership significantly accelerated our time to market for AI smart search. Video snapshots from all cameras are streamed to an S3 bucket. The total number of cameras a customer has can be quite large: we have a bank customer with over 50,000 cameras. In a single hour, those cameras produce 50,000 hours of video. If a movie is two hours, that's 25,000 movies, more than Netflix's entire catalog. And that's just one hour; the cameras run 24/7, so the data volume is huge.

Thumbnail 2660

Thumbnail 2680

Video snapshots are stored in an S3 bucket. Bedrock with the Titan embedding model generates vectors, and those vectors are stored in Amazon S3 Vectors and indexed for fast retrieval. Meanwhile, users can ask questions in natural language. The question is converted to a vector by Bedrock as well, and our search service compares it against the stored snapshot vectors and returns the best matches. Combining March Networks' deep expertise in video surveillance with the power and flexibility of the AWS AI and cloud platform,

Thumbnail 2720

this analytical pipeline allows tens of thousands of users to run queries on millions of snapshots and get results back in seconds. Based on our experience, S3 Vectors offers some key advantages. First, it is cost-effective: it stores billions of vectors at a very low cost, making our system economically viable at massive scale. Second is the AWS integration: it is seamlessly integrated with S3 storage and Bedrock, which allows our team to focus on building AI features instead of managing infrastructure. The third advantage is performance and stability: S3 at massive scale provides eleven nines of durability and high throughput, which enables sub-second semantic search across an entire video archive.

Thumbnail 2780

It is important to note that the technology is advancing rapidly and we are still at an early stage. At this point, human verification is key because the results are not always perfect. However, accuracy is improving and new features are rolling out at incredible speed. The foundation we are building today will enable far more sophisticated and impactful applications in the near future. With that in mind, let's see the technology in action.

Thumbnail 2820

Thumbnail 2830

Thumbnail 2840

Thumbnail 2850

Thumbnail 2860

Thumbnail 2870

Thumbnail 2880

Thumbnail 2890

Thumbnail 2910

Running a business is hard enough. Finding operational issues and opportunities shouldn't be, especially when potential fines, theft, and liability cannot wait. Searching hundreds of cameras and thousands of video snapshots is now made easy with one simple search. Introducing AI Smart Search, the industry's first generative AI-based, voice-activated image search tool. With one easy voice prompt, you can say "Show me unattended cash on the desk" or "Find long queue lines," audit your operations to find exactly what you're looking for, then review the snapshots to find and fix issues fast. Without needing to be on site, you can instantly verify: "Show me empty donut shelves." Check marketing signage changes, customer service opportunities, slip-and-fall hazards, and whether shelves are stocked, and ensure that safety and security protocols are followed. Search for specific operational and safety issues by uploading an image. Filter your results by location, date, and time. Increase search sensitivity to find exactly what you're looking for. Ask "Show me a blurry image" to monitor your system for faulty devices. Get instant insights into your operations: "Show me wet floors." Find opportunities to improve and eliminate potential risks to your business: "Show me open back doors." Find what matters faster with AI Smart Search by March Networks.

Thumbnail 2940

Conclusion and Resources: Workshop Invitation and Next Steps

Thank you for your attention. Now I'd like to turn it back to Mark to wrap up. Thank you very much, Frank, and thank you to March Networks for your great partnership during the preview phase; the work you did really helped us build a better product. So, bringing it all together: I mentioned earlier the problem we defined and solved. We can now ask very interesting questions and get answers, but that created another problem of cost and complexity. Our answer is to bring S3 economics to that problem with a managed service.

We now understand the meaning of what we are storing, machines understand that meaning, and we can search across those meanings, as Frank showed with video. Frank mentioned customers generating Netflix-scale volumes of data, and that is just going to continue. We are at the start here; there is more coming, and more means more. S3 Vectors is our solution here: cloud-scale, durable, cost-optimized vector indexes at massive scale. I have up there a number which I calculated with a calculator one night: two billion vectors per index, ten thousand indexes per bucket, ten thousand buckets. That is two hundred quadrillion vectors per AWS account. Quadrillion.
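That back-of-the-envelope number checks out from the limits quoted:

```python
# Per-account ceiling from the limits quoted in the talk.
vectors_per_index = 2_000_000_000
indexes_per_bucket = 10_000
buckets_per_account = 10_000

total = vectors_per_index * indexes_per_bucket * buckets_per_account
# 2e9 * 1e4 * 1e4 = 2e17 = 200 quadrillion (a quadrillion is 10**15)
print(f"{total:,} vectors per account")
```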

Thumbnail 3020

I look forward to one of you being the first person to come up saying, "Could I have an increase in my quota, please? I've hit 200 quadrillion." Now, we have a session tomorrow. Everything we've discussed, all our integrations, is at the MGM. Raise your phones if you want to grab the details.

We have this workshop where you get to sit down and bring your laptop. You'd be surprised at how many people don't. We had someone complete it on their phone though, which was fascinating, probably more fascinating for them. This workshop goes through all the integrations from start to finish. You'll build chatbots, you'll integrate with OpenSearch, you'll run queries, all amazing stuff.

Thumbnail 3050

Even if you can't make it tomorrow, the workshop is published. Written by myself and my colleagues, it's available now. Take it home, share it with your friends. It makes a great Christmas gift, right? If you're bored over the holidays, gather the kids around and have them do an S3 Vectors workshop. They'll have something to resent you for in your old age.

That's available out there, published right now. You can use it and do it. Also, if you'd like, you can talk to your Amazon solutions architect, whoever's working on the account, and they can set up an event for you if a bunch of you would like to do it in a room together. But it's available for you to do alone as well.

Thumbnail 3090

Again, written by myself and my colleagues, we have two blog posts that we published at launch: the Bedrock with S3 Vectors integration and the OpenSearch with S3 Vectors integration. These are posts we put real work into. We also have a new set of blog posts coming. We've done our first wave of integrations, and people ask us, will there be more? Of course there will be more.

These things take time to cook. But when they're available, you'll know about it. With that, I would like to thank you for your time and your attention. Enjoy the rest of re:Invent. If you have any questions, we will be outside in the corridor. Thank you very much.


; This article is entirely auto-generated using Amazon Bedrock.
