
From "Matching" to "Understanding": Personalized AI Search Practice Driven by AgentCore Memory

Speaker: Liu Cao @ AWS Amarathon 2025

Summary by Amazon Nova



Search

Search Engine Architecture Flow

Input: 

  • Query

Retrieval Methods (from Query):

  • Inverted index based lexical matching

  • Item-based collaborative filtering

  • Embedding-based retrieval

Merge Stage: Combines outputs from the three retrieval methods into:

  • A deduplicated, unordered product set

Ranking Stages:

  • Pre-ranking

  • Relevance ranking

  • Ranking

Additional Inputs to Mix-Ranking:

  • Advertising

  • Content media

Final Ranking Stage:

  • Mix-ranking
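The merge stage described above can be sketched in a few lines: each retrieval path returns its own candidate list, and the merge takes their union into a deduplicated, unordered set before any ranking is applied. The product IDs and per-path results below are illustrative, not from the talk.

```python
def merge_candidates(*retrieval_results):
    """Union the product IDs from each retrieval method, removing duplicates."""
    merged = set()
    for result in retrieval_results:
        merged.update(result)
    return merged  # unordered, deduplicated

lexical = ["p1", "p2", "p3"]     # inverted-index lexical matching
collaborative = ["p2", "p4"]     # item-based collaborative filtering
embedding = ["p3", "p5"]         # embedding-based retrieval

candidates = merge_candidates(lexical, collaborative, embedding)
print(sorted(candidates))  # ['p1', 'p2', 'p3', 'p4', 'p5']
```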


System architecture for augmenting product matching using semantic matching

Input: 

  • Query

Matching Components:

  • Behavioral Data

  • Keywords Matching

  • Semantic Matching

Data Flow:

  • Query feeds into Keywords Matching and Semantic Matching

  • Behavioral Data provides a behavioral signal to Ranking

  • Keywords Matching and Behavioral Data form an unordered match set

  • Semantic Matching provides a semantic similarity score to Ranking

Processing Stage:

  • Ranking

Output:

  • Ordered product list
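One way to read the data flow above is that ranking blends the behavioral signal with the semantic similarity score to order the match set. A minimal sketch, assuming a simple weighted linear blend (the weights and scores are illustrative assumptions, not values from the talk):

```python
def rank(products, behavioral, semantic, w_behavior=0.4, w_semantic=0.6):
    """Return products ordered by a blended relevance score (highest first)."""
    def score(p):
        return w_behavior * behavioral.get(p, 0.0) + w_semantic * semantic.get(p, 0.0)
    return sorted(products, key=score, reverse=True)

match_set = ["p1", "p2", "p3"]                  # unordered match set
behavioral = {"p1": 0.9, "p2": 0.2, "p3": 0.5}  # e.g. click-through signal
semantic = {"p1": 0.3, "p2": 0.8, "p3": 0.5}    # query-product similarity

print(rank(match_set, behavioral, semantic))  # ['p2', 'p1', 'p3']
```

In production the blend would typically be a learned model rather than fixed weights, but the shape of the interface — unordered candidates in, ordered product list out — is the same.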


LLM Agent Workflow

Process Flow:

  • Query --> LLM --> Plan

  • Plan --> LLM --> Actions

  • Actions --> "Is the plan complete?" (Decision Point)

  • If "yes" --> LLM --> Finalize answer

  • If "no" --> loop back to "Actions" (via an LLM step)

Decision Logic:

  • Uses an LLM to determine if the plan is complete.
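The loop above can be sketched as follows. All `llm_*` functions are stubs standing in for model calls (their names and return values are assumptions for illustration); a real agent would replace each with a prompt to an LLM.

```python
def llm_plan(query):
    # Stub: a real implementation asks the model to decompose the query.
    return ["look up data", "summarize findings"]

def llm_act(step):
    # Stub: a real implementation executes the step via tools or a model call.
    return f"result of '{step}'"

def llm_is_complete(plan, done):
    # Stub for the decision point "Is the plan complete?"
    return len(done) >= len(plan)

def llm_finalize(query, done):
    return f"Answer to '{query}' based on {len(done)} actions"

def run_agent(query, max_iters=10):
    plan = llm_plan(query)
    done = []
    for _ in range(max_iters):  # guard against infinite loops
        if llm_is_complete(plan, done):
            break
        done.append(llm_act(plan[len(done)]))  # "no" branch: loop back to Actions
    return llm_finalize(query, done)           # "yes" branch: finalize answer

print(run_agent("best laptops under 5000 yuan"))
```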

Differences

01 Traditional Search

  • Relies on keyword matching

  • Webpage link weight ranking

  • Goal is to quickly retrieve massive information

02 AI Search

  • Semantic understanding

  • Related information retrieved via vector embeddings plus keywords

  • Large models assemble information

  • Goal is to accurately answer user questions



Memory

LLM with Context Storage Architecture

Main Components

  • User

  • Chat APP

  • LLM (Stateless)

  • Context Storage

Workflow

  • Send message (from User to Chat APP)

  • Carry complete context (from Chat APP to LLM)

  • Return response (from LLM to Chat APP)

  • Store history (from Chat APP to Context Storage)

  • Return result (from Chat APP to User)

Context Storage Content

  • System Prompt

  • Round 1: User Question

  • Round 1: AI Answer

  • Round 2: User Question

  • Round 2: AI Answer
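The workflow above hinges on one point: the LLM is stateless, so the chat app must carry the complete stored context (system prompt plus every prior round) with each request. A minimal sketch, with `call_llm` as a stand-in for a real model endpoint:

```python
def call_llm(messages):
    # Stub: a real call would send `messages` to a model endpoint.
    return f"(answer given {len(messages)} context messages)"

# Context Storage: system prompt plus accumulated rounds.
context_storage = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_message):
    context_storage.append({"role": "user", "content": user_message})
    response = call_llm(context_storage)  # carry complete context to the LLM
    context_storage.append({"role": "assistant", "content": response})  # store history
    return response  # return result to the user

print(chat("Round 1 question"))  # model sees 2 messages
print(chat("Round 2 question"))  # model sees 4 messages
```

Note how the context grows every round — the motivation for the memory extraction and management discussed next.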



Architecture for an agentic system that uses LLMs to manage memories and knowledge graphs from conversations

Data Flow & Extraction

  • Input: Conversations (messages) are fed into the system.

  • Extraction LLM: A Large Language Model processes the messages to generate structured data.

  • Outputs: "New memories" and "new entities & relations".

Storage Components

  • Vector Database: Stores "existing + new memories" for retrieval and updates.

  • Graph Database: Stores "existing + new entities & relations" for updates.

Update & Management

  • Update LLM: A second LLM manages the data updates back into the databases.

  • Functions: Manages "store updates" for both databases.

  • Operations: Includes explicit "Add", "Delete", and "Update" actions for the stored data.

Overall Goal

  • The entire system facilitates the persistent storage and management of conversation context and extracted knowledge, adding memories over time.
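The update stage above can be sketched with in-memory stands-ins for the two databases. The decision function is a stub for the update LLM, and the Add/Update/Delete dispatch mirrors the operations listed; store layouts and field names are illustrative assumptions.

```python
vector_db = {}    # memory_id -> memory text (stands in for a vector database)
graph_db = set()  # (entity, relation, entity) triples (stands in for a graph database)

def decide_operation(new_memory, existing):
    # Stub for the update LLM: in this sketch, a repeated id means UPDATE.
    # A real update LLM could also return "DELETE" for contradicted memories.
    return "UPDATE" if new_memory["id"] in existing else "ADD"

def apply_memory(new_memory):
    op = decide_operation(new_memory, vector_db)
    if op in ("ADD", "UPDATE"):
        vector_db[new_memory["id"]] = new_memory["text"]
    elif op == "DELETE":
        vector_db.pop(new_memory["id"], None)
    for triple in new_memory.get("triples", []):
        graph_db.add(triple)  # merge new entities & relations into the graph
    return op

print(apply_memory({"id": "pref-1",
                    "text": "User prefers lightweight laptops",
                    "triples": [("user", "prefers", "lightweight laptops")]}))  # ADD
```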


Enhanced Context Storage

System Components & Definitions

  • System Prompt: System instructions or configuration.

  • Dialog History: Record of the conversation.

  • Tool Definitions: Available functions for the AI agent.

  • weather_api(): requires location, date

  • calculator(): requires expression

  • Thinking: The agent's reasoning process.

Memory Interaction

  • Extraction (from Thinking to Memory)

  • Backfill (from Memory to Thinking)

  • Memory: Long-term or short-term knowledge base.

Example Workflow: "Query weather (Fahrenheit)"

  • User: Query weather (Fahrenheit)

  • AI: Need to call weather API

  • Tool Usage History:

  • Call: weather_api(location='Beijing', date='today')

  • Return: {'temperature': 25, 'condition': 'sunny'}

  • Call: calculator(expression='25*1.8+32')

  • Return: 77
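The tool-usage transcript above can be replayed as runnable code. Both tools are local stubs matching the signatures listed under Tool Definitions, returning the values from the example:

```python
def weather_api(location, date):
    # Stub returning the values from the example transcript.
    return {"temperature": 25, "condition": "sunny"}

def calculator(expression):
    # A real agent would sandbox this; eval() suffices for a fixed demo input.
    return eval(expression)

weather = weather_api(location="Beijing", date="today")
fahrenheit = calculator(f"{weather['temperature']}*1.8+32")  # 25°C -> °F
print(f"{weather['condition']}, {fahrenheit}°F")  # sunny, 77.0°F
```

The user's Fahrenheit preference is exactly the kind of fact the extraction step would write to memory, so later sessions can backfill it without being asked again.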

Benefit:

  • Long-term retention and efficient management

  • Continuous knowledge update

  • Personalized service

  • Complex task support

  • Improve interaction quality

  • From stateless to stateful



AgentCore Memory

Amazon Bedrock AgentCore

  • Build, deploy, and operate high-performance agents securely and at scale, using any framework and model, without managing infrastructure.

Components

  • Runtime

  • Memory

  • Identity

  • Gateway

  • Code Interpreter

  • Browser Tool

  • Observability

Simplified Memory System Management

  • Abstract memory infrastructure

  • Based on serverless architecture for automatic scaling

  • Automatically store and manage the context information of Agents across multiple sessions

Enterprise-Level Services

  • Provide dedicated storage for each customer to fully guarantee data privacy

  • Provide encryption protection and regionalized data storage to meet enterprise-level security needs

Deep Customization

  • Customize memory modes according to specific application scenarios

  • Set long-term memory extraction rules

  • Select an appropriate model and customize the prompts to optimize long-term memory extraction



Agent Memory Components

Agent Core:

  • Agent Reasoning

  • Agent State

  • Knowledge Components

  • Tool Calling

  • Policy Definition

Short-Term Memory:

  • Context Window

  • Conversation History

  • Tool Calling History

Long-Term Memory:

  • Events Summary

  • User Profile Information

  • Document Information

Connections:

  • Storage - between Agent Core and Memory components

  • Retrieve - between Memory components and Automatic Memory Retrieval Module

  • Automatic Memory Retrieval Module
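The split above can be expressed as plain data structures. Field names mirror the slide; the types and example values are illustrative assumptions. The key design point is the separation: short-term memory holds the current session, while long-term memory persists summaries, profile facts, and documents across sessions.

```python
from dataclasses import dataclass, field

@dataclass
class ShortTermMemory:
    context_window: list = field(default_factory=list)       # current session turns
    conversation_history: list = field(default_factory=list)
    tool_calling_history: list = field(default_factory=list)

@dataclass
class LongTermMemory:
    events_summary: list = field(default_factory=list)       # cross-session summaries
    user_profile: dict = field(default_factory=dict)         # extracted preferences
    documents: dict = field(default_factory=dict)

stm = ShortTermMemory()
ltm = LongTermMemory()
stm.context_window.append({"role": "user", "content": "Query weather"})
ltm.user_profile["preferred_unit"] = "Fahrenheit"  # fact extracted by the memory module
print(ltm.user_profile)
```

An automatic retrieval module would query both stores and inject the relevant entries back into the agent's reasoning context.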



Agent Architecture & Benefits with Amazon Bedrock AgentCore Memory

Agent Functionality

  • The intelligent agent automatically breaks down complex user needs (e.g., "Recommend a laptop under 5000 yuan for a university student that runs design software") into multiple parallel sub-tasks.

  • Examples of sub-tasks: "University student usage scenario analysis", "Budget filtering", "Software running requirement matching".

Benefits of AgentCore Memory

  • Relies on the technical advantages of AgentCore Memory's long-term memory.

  • Significantly speeds up the Agent development process.

  • Achieves multiple real-world application values:

  • Token usage for smart searches based on user profiles is greatly reduced.

  • Search result accuracy rate is effectively improved.

  • Delivers a significant improvement in technical competitiveness and cost-effectiveness.

System Diagram Flow

  • User Query leads to Agent Runtime Environment (Strands Agents).

Strands Agents interacts with:

  • Tools (User Preference Retrieval)

  • Amazon Bedrock LLM (Processes Output)

  • AgentCore Memory

AgentCore Memory manages:

  • Short Term Memory (Interaction Events)

  • Automatic Memory Extraction

  • Long Term Memory (User Preferences)

  • The process results in an Agent Response to the user.
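The decomposition step above can be sketched by running the three named sub-tasks concurrently. The per-task analysis is a stub; in the real architecture each sub-task would be delegated to an LLM or tool call, with user preferences backfilled from AgentCore Memory rather than re-derived.

```python
from concurrent.futures import ThreadPoolExecutor

# The three parallel sub-tasks named in the talk.
SUB_TASKS = [
    "University student usage scenario analysis",
    "Budget filtering",
    "Software running requirement matching",
]

def run_sub_task(task):
    # Stub standing in for an LLM or tool invocation per sub-task.
    return f"{task}: done"

def answer(query):
    # Fan the sub-tasks out in parallel; map() preserves input order.
    with ThreadPoolExecutor(max_workers=len(SUB_TASKS)) as pool:
        return list(pool.map(run_sub_task, SUB_TASKS))

for line in answer("Recommend a laptop under 5000 yuan for a university "
                   "student that runs design software"):
    print(line)
```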



Team:

AWS FSI Customer Acceleration Hong Kong

AWS Amarathon Fan Club

AWS Community Builder Hong Kong
