Speaker: Liu Cao @ AWS Amarathon 2025
Summary by Amazon Nova
Search
Search Engine Architecture Flow
Input:
- Query
Retrieval Methods (from Query):
Inverted-index-based lexical matching
Item-based collaborative filtering
Embedding-based retrieval
Merge Stage: Combines outputs from the three retrieval methods (a minimal merge sketch follows this flow) into:
- Deduplicated, unordered product set
Ranking Stages:
Pre-ranking
Relevance ranking
Ranking
Additional Inputs to Mix-Ranking:
Advertising
Content media
Final Ranking Stage:
- Mix-ranking
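To make the merge stage concrete, here is a minimal Python sketch, assuming each retrieval method returns a list of product IDs; the three retriever functions are illustrative stubs, not part of any real engine.

```python
# Minimal sketch of the merge stage: three retrieval methods feed one
# deduplicated, unordered candidate set. All retrievers here are stubs.

def lexical_retrieve(query: str) -> list[str]:
    # Stand-in for inverted-index lexical matching.
    return ["p1", "p2", "p3"]

def collaborative_retrieve(query: str) -> list[str]:
    # Stand-in for item-based collaborative filtering.
    return ["p2", "p4"]

def embedding_retrieve(query: str) -> list[str]:
    # Stand-in for embedding-based (vector) retrieval.
    return ["p3", "p5"]

def merge_candidates(query: str) -> set[str]:
    """Union the three retrieval outputs into an unordered, duplicate-free set."""
    candidates: set[str] = set()
    for retrieve in (lexical_retrieve, collaborative_retrieve, embedding_retrieve):
        candidates.update(retrieve(query))
    return candidates  # later stages (pre-ranking, relevance ranking, ...) order this set

print(merge_candidates("wireless mouse"))  # {'p1', 'p2', 'p3', 'p4', 'p5'}
```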
System architecture for augmenting product matching using semantic matching
Input:
- Query
Matching Components:
Behavioral Data
Keywords Matching
Semantic Matching
Data Flow:
Query feeds into Keywords Matching and Semantic Matching
Behavioral Data provides a behavioral signal to Ranking
Keywords Matching and Behavioral Data form an unordered match set
Semantic Matching provides a semantic similarity score to Ranking
Processing Stage:
- Ranking
Output:
- Ordered product list (a hybrid-scoring sketch follows)
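A minimal sketch of this scoring flow, where the unordered match set is ordered by blending the semantic similarity score with the behavioral signal; the products, scores, and the 0.7/0.3 weights are all illustrative assumptions.

```python
# Sketch of the ranking stage above. Scores and weights are placeholders.

match_set = {"p1", "p2", "p3"}                             # from keyword matching (unordered)
semantic_score = {"p1": 0.42, "p2": 0.91, "p3": 0.77}      # from semantic matching
behavioral_signal = {"p1": 0.30, "p2": 0.10, "p3": 0.55}   # e.g. click-through rate

W_SEM, W_BEH = 0.7, 0.3  # assumed weights; a real system would learn these

def rank(products: set[str]) -> list[str]:
    """Order the match set by a weighted blend of the two signals."""
    return sorted(
        products,
        key=lambda p: W_SEM * semantic_score.get(p, 0.0)
                    + W_BEH * behavioral_signal.get(p, 0.0),
        reverse=True,
    )

print(rank(match_set))  # ordered product list: ['p3', 'p2', 'p1']
```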
LLM Agent Workflow
Process Flow:
Query --> LLM --> Plan
Plan --> LLM --> Actions
Actions --> "Is the plan complete?" (Decision Point)
If "yes" --> LLM --> Finalize answer
If "no" --> Loop back to "Actions" (via an LLM step)
Decision Logic:
- Uses an LLM to determine whether the plan is complete (a minimal loop sketch follows).
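A minimal sketch of the loop above, with `call_llm` as a stub standing in for a real model call; the PLAN/ACT/DONE prompt tags are assumptions for illustration only.

```python
# Minimal plan-act loop matching the flow above. call_llm is a stub standing
# in for a real model call; replace it with your provider's client.

def call_llm(prompt: str) -> str:
    # Placeholder: a real implementation would call an LLM API here.
    if prompt.startswith("PLAN:"):
        return "1. look up specs 2. compare prices"
    if prompt.startswith("ACT:"):
        return "looked up specs; compared prices"
    if prompt.startswith("DONE?"):
        return "yes"
    return "final answer based on gathered results"

def run_agent(query: str, max_steps: int = 5) -> str:
    plan = call_llm(f"PLAN: {query}")                  # Query --> LLM --> Plan
    results = ""
    for _ in range(max_steps):
        results = call_llm(f"ACT: {plan}\n{results}")  # Plan --> LLM --> Actions
        done = call_llm(f"DONE? plan={plan} results={results}")  # decision point
        if done.strip().lower().startswith("yes"):
            break                                      # "yes" --> finalize
    return call_llm(f"FINALIZE: {query}\n{results}")   # LLM --> final answer

print(run_agent("find a budget laptop"))
```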
Differences
01 Traditional Search
Relies on keyword matching
Webpage link weight ranking
Goal is to quickly retrieve information from massive sources
02 AI Search
Semantic understanding
Related information retrieved via vectors + keywords and injected into the model's context
Large models synthesize the retrieved information
Goal is to accurately answer user questions
Memory
LLM with Context Storage Architecture
Main Components
User
Chat APP
LLM (Stateless)
Context Storage
Workflow
Send message (from User to Chat APP)
Carry complete context (from Chat APP to LLM)
Return response (from LLM to Chat APP)
Store history (from Chat APP to Context Storage)
Return result (from Chat APP to User)
Context Storage Content (re-sent in full on every turn; a minimal sketch follows this list)
System Prompt
Round 1: User Question
Round 1: AI Answer
Round 2: User Question
Round 2: AI Answer
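A minimal sketch of this workflow, assuming an OpenAI-style message list; `call_llm` is a stub for a real chat-completion call.

```python
# Sketch of the stateless-LLM pattern above: the chat app re-sends the full
# stored context on every turn, because the LLM itself keeps no state.

def call_llm(messages: list[dict]) -> str:
    # Placeholder: a real call would send `messages` to a chat-completion API.
    return f"(answer to: {messages[-1]['content']})"

context_storage = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_message: str) -> str:
    context_storage.append({"role": "user", "content": user_message})    # store question
    answer = call_llm(context_storage)   # carry COMPLETE context to the stateless LLM
    context_storage.append({"role": "assistant", "content": answer})     # store answer
    return answer

print(chat("Round 1: What is RAG?"))
print(chat("Round 2: Give an example."))  # model sees round 1 via stored context
```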
Architecture for an agentic system
- Uses LLMs to manage memories and knowledge graphs extracted from conversations:
Data Flow & Extraction
Input: Conversations (messages) are fed into the system.
Extraction LLM: A Large Language Model processes the messages to generate structured data.
Outputs: "New memories" and "new entities & relations".
Storage Components
Vector Database: Stores "existing + new memories" for retrieval and updates.
Graph Database: Stores "existing + new entities & relations" for updates.
Update & Management
Update LLM: A second LLM manages the data updates back into the databases.
Functions: Manages "store updates" for both databases.
Operations: Includes explicit "Add", "Delete", and "Update" actions for the stored data.
Overall Goal
- Memory ADD: the entire system facilitates persistent storage and management of conversation context and extracted knowledge (a pipeline sketch follows).
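A minimal sketch of this pipeline under loose assumptions: both "LLMs" are stubs, and a dict plus a triple list stand in for the vector and graph databases.

```python
# Illustrative pipeline for the diagram above: an extraction step produces new
# memories and entity relations, then an update step applies Add/Update/Delete
# operations against the two stores.

vector_db: dict[str, str] = {}             # memory_id -> memory text
graph_db: list[tuple[str, str, str]] = []  # (subject, relation, object) triples

def extraction_llm(messages: list[str]) -> tuple[list[str], list[tuple[str, str, str]]]:
    # Placeholder: a real extraction LLM would read the conversation and emit
    # structured memories and entity relations.
    return (["user prefers lightweight laptops"],
            [("user", "prefers", "lightweight laptops")])

def update_llm(op: str, memory_id: str, text: str = "") -> None:
    # Placeholder "update LLM": decides how new facts merge with stored ones.
    if op in ("ADD", "UPDATE"):
        vector_db[memory_id] = text
    elif op == "DELETE":
        vector_db.pop(memory_id, None)

def process_conversation(messages: list[str]) -> None:
    memories, relations = extraction_llm(messages)  # extraction step
    for i, mem in enumerate(memories):
        update_llm("ADD", f"mem-{i}", mem)          # store update (vector DB)
    graph_db.extend(relations)                      # store update (graph DB)

process_conversation(["I always travel, so heavy laptops are a no."])
print(vector_db, graph_db)
```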
Enhanced Context Storage
System Components & Definitions
System Prompt: System instructions or configuration.
Dialog History: Record of the conversation.
Tool Definitions: Available functions for the AI agent.
weather_api(): requires location, date
calculator(): requires expression
Thinking: The agent's reasoning process.
Memory Interaction
Extraction (from Thinking to Memory)
Backfill (from Memory to Thinking)
Memory: Long-term or short-term knowledge base.
Example Workflow: "Query weather (Fahrenheit)" (a runnable sketch follows the trace)
User: Query weather (Fahrenheit)
AI: Need to call weather API
Tool Usage History:
Call: weather_api(location='Beijing', date='today')
Return: {'temperature': 25, 'condition': 'sunny'}
Call: calculator(expression='25*1.8+32')
Return: 77
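A runnable sketch of this trace; both tools are local stubs hard-coded to return the values shown above.

```python
# Sketch of the tool-calling trace above: the agent calls weather_api, then
# converts Celsius to Fahrenheit with calculator.

def weather_api(location: str, date: str) -> dict:
    # Stub matching the trace: 25 degrees C and sunny.
    return {"temperature": 25, "condition": "sunny"}

def calculator(expression: str) -> float:
    # Tiny evaluator for the C-to-F formula; fine for this fixed demo input.
    return eval(expression, {"__builtins__": {}})

tool_usage_history = []  # persisted alongside dialog history in context storage

def query_weather_fahrenheit(location: str) -> str:
    weather = weather_api(location=location, date="today")
    tool_usage_history.append(("weather_api", weather))
    fahrenheit = calculator(expression=f"{weather['temperature']}*1.8+32")
    tool_usage_history.append(("calculator", fahrenheit))
    return f"{location}: {fahrenheit} F, {weather['condition']}"

print(query_weather_fahrenheit("Beijing"))  # Beijing: 77.0 F, sunny
```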
Benefit:
Long-term retention and efficient management
Continuous knowledge update
Personalized service
Complex task support
Improve interaction quality
From stateless to stateful
AgentCore Memory
Amazon Bedrock AgentCore
- Build, deploy, and operate high-performance agents securely and at scale, using any framework and model, without managing infrastructure.
Components
Runtime
Memory
Identity
Gateway
Code Interpreter
Browser Tool
Observability
Simplified Memory System Management
Abstracts away memory infrastructure
Serverless architecture with automatic scaling
Automatically stores and manages agent context across multiple sessions
Enterprise-Level Services
Dedicated storage for each customer to fully guarantee data privacy
Encryption and regionalized data storage to meet enterprise-level security needs
Deep Customization
Customize memory modes for specific application scenarios
Set long-term memory extraction rules
Select the appropriate model and customize the prompts to optimize long-term memory extraction
Agent Memory Components
Agent Core:
Agent Reasoning
Agent State
Knowledge Components
Tool Calling
Policy Definition
Short-Term Memory:
Context Window
Conversation History
Tool Calling History
Long-Term Memory:
Events Summary
User Profile Information
Document Information
Connections:
Storage: between Agent Core and the Memory components
Retrieve: between the Memory components and the Automatic Memory Retrieval Module (a retrieval sketch follows this section)
Automatic Memory Retrieval Module
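A minimal sketch of these store/retrieve connections; the promotion rule and the keyword-overlap retrieval are assumptions standing in for automatic extraction and embedding search.

```python
# Sketch: the agent core writes events to short-term memory, a summarizer
# promotes them to long-term memory, and a retrieval module pulls relevant
# entries back by naive keyword overlap (a stand-in for embedding search).

short_term: list[str] = []   # context window / conversation / tool history
long_term: list[str] = []    # event summaries, user profile, documents

def store(event: str) -> None:
    short_term.append(event)
    if len(short_term) > 3:                       # assumed promotion rule
        long_term.append("summary: " + "; ".join(short_term))
        short_term.clear()

def retrieve(query: str, k: int = 2) -> list[str]:
    """Automatic memory retrieval: score entries by word overlap with the query."""
    words = set(query.lower().split())
    scored = sorted(long_term + short_term,
                    key=lambda m: len(words & set(m.lower().split())),
                    reverse=True)
    return scored[:k]

for e in ["user asked about laptops", "budget is 5000 yuan",
          "needs design software", "prefers light weight"]:
    store(e)
print(retrieve("laptop budget"))
```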
Agent Architecture & Benefits with Amazon Bedrock AgentCore Memory
Agent Functionality
The intelligent agent automatically breaks down complex user needs (e.g., "Recommend a laptop under 5000 yuan for a university student that runs design software") into multiple parallel sub-tasks.
Examples of sub-tasks: "University student usage scenario analysis", "Budget filtering", "Software running requirement matching".
Benefits of AgentCore Memory
Relies on the technical advantages of AgentCore Memory's long-term memory.
Significantly speeds up the Agent development process.
Achieves multiple real-world application values:
Greatly reduced token usage for profile-based smart search.
Measurably improved search result accuracy.
A significant gain in technical competitiveness and cost-effectiveness.
System Diagram Flow
- User Query leads to Agent Runtime Environment (Strands Agents).
Strands Agents interacts with:
Tools (User Preference Retrieval)
Amazon Bedrock LLM (Processes Output)
AgentCore Memory
AgentCore Memory manages:
Short Term Memory (Interaction Events)
Automatic Memory Extraction
Long Term Memory (User Preferences)
The process results in an Agent Response to the user (a schematic sketch follows).
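A schematic sketch of this flow. The summary does not show the actual Strands Agents or AgentCore Memory SDK calls, so every function below is a local stub; treat all names as hypothetical.

```python
# Schematic of the diagram above. Real code would use the Strands Agents SDK
# and the AgentCore Memory APIs, whose call signatures are not shown here.

def retrieve_user_preferences(user_id: str) -> list[str]:
    # Hypothetical stand-in for the long-term memory lookup (AgentCore Memory).
    return ["budget-conscious", "uses design software"]

def record_interaction(user_id: str, event: str) -> None:
    # Hypothetical stand-in for writing an interaction event to short-term
    # memory; long-term memories would later be extracted automatically.
    print(f"[memory] {user_id}: {event}")

def bedrock_llm(prompt: str) -> str:
    # Hypothetical stand-in for an Amazon Bedrock model invocation.
    return f"(recommendation derived from: {prompt[:60]}...)"

def handle_query(user_id: str, query: str) -> str:
    prefs = retrieve_user_preferences(user_id)   # tools: user preference retrieval
    # Decompose into parallel sub-tasks, as described above.
    subtasks = ["usage scenario analysis", "budget filtering",
                "software requirement matching"]
    partials = [bedrock_llm(f"{t} | prefs={prefs} | {query}") for t in subtasks]
    record_interaction(user_id, query)           # short-term memory event
    return bedrock_llm("assemble final answer: " + " ".join(partials))

print(handle_query("u-42", "Recommend a laptop under 5000 yuan for a "
                   "university student that runs design software"))
```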