GitHub repository: Strands Agent Samples
Part 3: Adding Persistent Memory with Amazon S3 Vector Store
Building sophisticated AI agents doesn't have to be complex. As we demonstrated in the first post of this series, the Strands Agent open source framework makes it remarkably simple to create multi-modal AI agents with just a few lines of code. Whether you're processing images, documents, or videos, Strands maintains this simplicity while providing powerful capabilities.
This post continues my AI agent development series, building on our previous exploration of multi-modal AI agents with the Strands Agent framework using FAISS for local memory storage. Now, we advance to enterprise-scale capabilities with Amazon S3 Vectors, the AWS object storage service with native vector support that transforms how you build scalable AI applications without sacrificing the simplicity that makes Strands Agent so accessible.
From Local to Cloud: Production-Ready AI Memory
While FAISS works effectively for local development and prototyping, production AI agents require scalable capabilities. Amazon S3 Vectors provides these capabilities with cost advantages compared to traditional vector databases while maintaining subsecond query performance at scale.
Core Amazon S3 Vectors Capabilities
Amazon S3 Vectors offers cloud object storage with native vector support. This approach reduces the complexity of managing separate vector infrastructure while providing:
- Massive scalability: Each vector index supports large-scale vector storage
- Managed service: Fully managed with automatic optimization
- Enterprise security: Native AWS Identity and Access Management (AWS IAM) integration with user isolation
- Global availability: Multi-region support with built-in disaster recovery
Step 1: S3 Bucket Configuration
Before you begin, you need to create an S3 bucket that will serve as the backend for your vector memory.
Important configuration points:
- Ensure the bucket is in the same region where you'll run your application
- Enable versioning for additional security
- Configure appropriate IAM access policies
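If you prefer to script this step, here is a minimal sketch that creates the vector bucket with the boto3 s3vectors client (assuming a recent boto3 release that includes the S3 Vectors preview API); the bucket name is a placeholder and should match the VECTOR_BUCKET_NAME used later:
import boto3

# Create the vector bucket that will back the agent's memory.
# Assumes a boto3 version that ships the "s3vectors" client (S3 Vectors preview).
s3vectors = boto3.client("s3vectors", region_name="us-east-1")

s3vectors.create_vector_bucket(
    vectorBucketName="your-vector-bucket"  # placeholder bucket name
)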
Vector Buckets and Indexes
S3 Vectors introduces a bucket type specifically designed for vector data. Within each bucket, you can create multiple vector indexes, each capable of storing vectors with attached metadata for sophisticated filtering.
Distance Metrics and Performance
Amazon S3 Vectors supports both Cosine and Euclidean distance calculations.
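As a rough illustration, creating an index with your chosen metric might look like the sketch below; the 1024 dimension is an assumption that matches Amazon Titan Text Embeddings v2, so adjust it to your embedding model:
import boto3

s3vectors = boto3.client("s3vectors", region_name="us-east-1")

# Create a vector index inside the vector bucket.
# The dimension must match the output size of your embedding model.
s3vectors.create_index(
    vectorBucketName="your-vector-bucket",
    indexName="agent-memory-index",
    dataType="float32",
    dimension=1024,          # assumes Titan Text Embeddings v2
    distanceMetric="cosine"  # or "euclidean"
)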
Enhanced Multi-Modal Content Processing Agent
Our updated Strands agent now supports S3 Vectors natively using the s3_vector_memory tool, providing:
Core Memory Operations
- store(): Persist agent conversations and learned insights
- retrieve(): Query similar experiences and context
- list(): Enumerate stored memories with metadata
- auto_store_and_retrieve(): Intelligent context management
- auto_context(): Dynamic conversation continuity
Multi-Modal Content Processing
Similar to our FAISS implementation, the new agent maintains full support for processing images, documents, and videos with persistent, cross-session memory.
Setting Up the Enhanced Agent
First, you need to set the variables:
# Environment configuration for S3 Vectors
VECTOR_BUCKET_NAME = "your-vector-bucket"
VECTOR_INDEX_NAME = "agent-memory-index"
USER_ID = "unique-user-identifier" # For user isolation
Let's start by configuring our agent with S3 memory capabilities:
# Model configuration
model = BedrockModel(
    model_id="us.anthropic.claude-3-5-sonnet-20241022-v2:0",
    region="us-east-1"
)
# System prompt for multi-modal processing with memory
MULTIMODAL_SYSTEM_PROMPT = """You are an AI assistant with multi-modal processing capabilities and persistent memory.
Your capabilities:
- **Multi-Modal Analysis**: Process images, documents, videos, and text
- **Persistent Memory**: Remember preferences, previous analyses, and conversation history
- **Context Awareness**: Use memory to provide personalized and contextual responses
- **Continuous Learning**: Build understanding over time through memory accumulation
Memory Usage Guidelines:
- Check for relevant memories before responding
- Store important insights, preferences, and analysis results
- Reference previous conversations when relevant
- Maintain conversation continuity across sessions
When processing content:
1. First retrieve relevant memories for context
2. Analyze the new content thoroughly
3. Store key insights and findings
4. Provide comprehensive responses using both new analysis and memory context
"""
# Create the multi-modal agent with S3 Vectors memory
multimodal_agent = Agent(
    model=model,
    tools=[
        s3_vector_memory,  # Our S3 Vectors memory tool
        image_reader,      # Image processing
        file_read,         # Document processing
        video_reader,      # Video processing
        use_llm            # Advanced reasoning
    ],
    system_prompt=MULTIMODAL_SYSTEM_PROMPT
)
Memory Operations in Action
1. Storing Initial User Context
First, let's store some basic information about our user:
response1 = multimodal_agent(
f"""Hello, I'm Elizabeth Fuentes. You can call me Eli, I'm a developer advocate at AWS, I like to work early in the morning,
I prefer Italian coffee, and I want to understand what's in images, videos, and documents to improve my day-to-day work.
I'm also very interested in artificial intelligence and work in the financial sector.
Please save this information about my preferences for future conversations.
USER_ID: {USER_ID}"""
)
2. Image Analysis with Memory Storage
Now let's analyze an image and automatically store the results:
print("=== πΈ IMAGE ANALYSIS WITH MEMORY ===")
image_result = multimodal_agent(f"""
Analyze the image data-sample/diagram.jpg in detail and describe everything you observe.
USER_ID: {user_id}"""
)
The agent will:
- Process the image using image_reader
- Analyze the architectural diagram
- Automatically store the analysis in memory using s3_vector_memory
- Provide a detailed description
3. Video Analysis with Memory
Let's process a video and store its content:
print("=== π¬ VIDEO ANALYSIS WITH MEMORY ===")
video_result = multimodal_agent(
"Analyze the video data-sample/moderation-video.mp4 and describe in detail "
"the actions and scenes you observe. Store this information in your memory."
)
print(video_result)
4. Document Processing with Memory
Process and remember document content:
print("=== π DOCUMENT ANALYSIS WITH MEMORY ===")
doc_result = multimodal_agent(
"Summarize as json the content of the document data-sample/Welcome-Strands-Agents-SDK.pdf "
"and store this information in your memory."
)
print(doc_result)
Memory Retrieval and Management
Retrieving Specific Memories
# Retrieve memories related to a specific query
memory_result = s3_vector_memory(
    action="retrieve",
    query="preferences and interests",
    user_id=USER_ID
)
Listing All Stored Memories
# List all memories
memory_result = s3_vector_memory(
    action="list",
    user_id=USER_ID
)
print(f"Total memories in system: {memory_result['total_found']}")
Production Use Cases
AI Agent Memory Systems
Store conversation context, user preferences, and learned behaviors across multiple users with automatic scaling and enterprise-grade security.
Retrieval-Augmented Generation (RAG)
Build cost-effective knowledge bases that grow with your business requirements without infrastructure management overhead.
Semantic Search Applications
Process and search through large content datasets with optimized response times.
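Under the hood, a semantic search against an S3 Vectors index amounts to embedding the query text and calling query_vectors. The sketch below assumes Amazon Titan Text Embeddings v2 for the query embedding and reuses the bucket and index names from the setup above; field names may vary slightly across SDK versions:
import json
import boto3

bedrock = boto3.client("bedrock-runtime", region_name="us-east-1")
s3vectors = boto3.client("s3vectors", region_name="us-east-1")

# Embed the search query (assumes Titan Text Embeddings v2, 1024 dimensions).
response = bedrock.invoke_model(
    modelId="amazon.titan-embed-text-v2:0",
    body=json.dumps({"inputText": "architecture diagrams I analyzed"})
)
embedding = json.loads(response["body"].read())["embedding"]

# Query the index for the closest matches, returning metadata and distances.
results = s3vectors.query_vectors(
    vectorBucketName="your-vector-bucket",
    indexName="agent-memory-index",
    queryVector={"float32": embedding},
    topK=5,
    returnMetadata=True,
    returnDistance=True
)
for match in results["vectors"]:
    print(match["key"], match.get("distance"))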
Personalized Recommendations
Maintain user behavior patterns and preferences with built-in multi-region availability.
Cost and Availability
Amazon S3 Vectors follows the AWS pay-per-use model with no upfront infrastructure costs. The service is currently available in preview across US East (N. Virginia), US East (Ohio), US West (Oregon), Europe (Frankfurt), and Asia Pacific (Sydney).
Getting Started
- Clone the repository:
git clone https://github.com/elizabethfuentes12/strands-agent-samples
cd strands-agent-samples/notebook
- Install dependencies:
pip install -r requirements.txt
- Configure AWS credentials for Bedrock access
- Try the notebook:
multi-understanding-with-s3-memory.ipynb
The Strands Agent Advantage
What makes Strands Agent unique is its commitment to simplicity without compromising on power. Whether you're building a prototype with local FAISS memory or deploying a production system with Amazon S3 Vectors, the core development experience remains consistent and approachable.
This concludes our three-part series exploring the capabilities of the Strands Agent framework. Throughout this series, we've maintained the core principle that building powerful AI agents should remain simple and accessible, even as we scale to enterprise-level capabilities.
What We've Covered in This Series:
Part 1: Multi-Modal Content Processing with Just a Few Lines of Code - We showed how Strands Agent makes it effortless to build agents that understand images, documents, and videos.
Part 2: Adding FAISS Memory for Local Development - We enhanced our agents with persistent memory capabilities for local development and prototyping.
Part 3: Scaling with Amazon S3 Vectors - We evolved to production-ready, enterprise-scale memory with cloud infrastructure.
Ready to create your own Strands agent? Here are some resources:
- Amazon S3 Vectors documentation
- Strands Agent Framework
- Amazon Bedrock Knowledge Bases
- Previous post: Multi-Modal Content Processing with FAISS
- Complete Code Examples
- Getting Started with Strands Agents
Stay tuned for more Strands Agent implementations!
Thank you!