
From "Matching" to "Understanding": Personalized AI Search Practice Driven by AgentCore Memory

Speaker: Liu Cao @ AWS Amarathon 2025

Summary by Amazon Nova



Search

Search Engine Architecture Flow

Input: 

  • Query

Retrieval Methods (from Query):

  • Inverted index based lexical matching

  • Item-based collaborative filtering

  • Embedding-based retrieval

Merge Stage: Combines outputs from the three retrieval methods into:

  • A deduplicated, unordered product set

Ranking Stages:

  • Pre-ranking

  • Relevance ranking

  • Ranking

Additional Inputs to Mix-Ranking:

  • Advertising

  • Content media

Final Ranking Stage:

  • Mix-ranking
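The merge stage described above can be sketched in a few lines: each retrieval path returns its own candidate list, and the merge takes their union into a deduplicated, unordered set before any ranking is applied. The product IDs and per-path results below are illustrative, not from the talk.

```python
def merge_candidates(*retrieval_results):
    """Union the product IDs from each retrieval method, removing duplicates."""
    merged = set()
    for result in retrieval_results:
        merged.update(result)
    return merged  # unordered, deduplicated

lexical = ["p1", "p2", "p3"]     # inverted-index lexical matching
collaborative = ["p2", "p4"]     # item-based collaborative filtering
embedding = ["p3", "p5"]         # embedding-based retrieval

candidates = merge_candidates(lexical, collaborative, embedding)
print(sorted(candidates))  # ['p1', 'p2', 'p3', 'p4', 'p5']
```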


System architecture for augmenting product matching using semantic matching

Input: 

  • Query

Matching Components:

  • Behavioral Data

  • Keywords Matching

  • Semantic Matching

Data Flow:

  • Query feeds into Keywords Matching and Semantic Matching

  • Behavioral Data provides a behavioral signal to Ranking

  • Keywords Matching and Behavioral Data form an unordered match set

  • Semantic Matching provides a semantic similarity score to Ranking

Processing Stage:

  • Ranking

Output:

  • Ordered product list
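One way to read the data flow above is that ranking blends the behavioral signal with the semantic similarity score to order the match set. A minimal sketch, assuming a simple weighted linear blend (the weights and scores are illustrative assumptions, not values from the talk):

```python
def rank(products, behavioral, semantic, w_behavior=0.4, w_semantic=0.6):
    """Return products ordered by a blended relevance score (highest first)."""
    def score(p):
        return w_behavior * behavioral.get(p, 0.0) + w_semantic * semantic.get(p, 0.0)
    return sorted(products, key=score, reverse=True)

match_set = ["p1", "p2", "p3"]                  # unordered match set
behavioral = {"p1": 0.9, "p2": 0.2, "p3": 0.5}  # e.g. click-through signal
semantic = {"p1": 0.3, "p2": 0.8, "p3": 0.5}    # query-product similarity

print(rank(match_set, behavioral, semantic))  # ['p2', 'p1', 'p3']
```

In production the blend would typically be a learned model rather than fixed weights, but the shape of the interface — unordered candidates in, ordered product list out — is the same.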


LLM Agent Workflow

Process Flow:

  • Query --> LLM --> Plan

  • Plan --> LLM --> Actions

  • Actions --> "Is the plan complete?" (Decision Point)

  • If "yes" --> LLM --> Finalize answer

  • If "no" --> loop back to "Actions" (via an LLM step)

Decision Logic:

  • Uses an LLM to determine if the plan is complete.
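The loop above can be sketched as follows. All `llm_*` functions are stubs standing in for model calls (their names and return values are assumptions for illustration); a real agent would replace each with a prompt to an LLM.

```python
def llm_plan(query):
    # Stub: a real implementation asks the model to decompose the query.
    return ["look up data", "summarize findings"]

def llm_act(step):
    # Stub: a real implementation executes the step via tools or a model call.
    return f"result of '{step}'"

def llm_is_complete(plan, done):
    # Stub for the decision point "Is the plan complete?"
    return len(done) >= len(plan)

def llm_finalize(query, done):
    return f"Answer to '{query}' based on {len(done)} actions"

def run_agent(query, max_iters=10):
    plan = llm_plan(query)
    done = []
    for _ in range(max_iters):  # guard against infinite loops
        if llm_is_complete(plan, done):
            break
        done.append(llm_act(plan[len(done)]))  # "no" branch: loop back to Actions
    return llm_finalize(query, done)           # "yes" branch: finalize answer

print(run_agent("best laptops under 5000 yuan"))
```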

Differences

01 Traditional Search

  • Relies on keyword matching

  • Webpage link weight ranking

  • Goal is to quickly retrieve massive information

02 AI Search

  • Semantic understanding

  • Related information retrieved via vector embeddings plus keywords

  • Large models assemble information

  • Goal is to accurately answer user questions



Memory

LLM with Context Storage Architecture

Main Components

  • User

  • Chat APP

  • LLM (Stateless)

  • Context Storage

Workflow

  • Send message (from User to Chat APP)

  • Carry complete context (from Chat APP to LLM)

  • Return response (from LLM to Chat APP)

  • Store history (from Chat APP to Context Storage)

  • Return result (from Chat APP to User)

Context Storage Content

  • System Prompt

  • Round 1: User Question

  • Round 1: AI Answer

  • Round 2: User Question

  • Round 2: AI Answer
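The workflow above hinges on one point: the LLM is stateless, so the chat app must carry the complete stored context (system prompt plus every prior round) with each request. A minimal sketch, with `call_llm` as a stand-in for a real model endpoint:

```python
def call_llm(messages):
    # Stub: a real call would send `messages` to a model endpoint.
    return f"(answer given {len(messages)} context messages)"

# Context Storage: system prompt plus accumulated rounds.
context_storage = [{"role": "system", "content": "You are a helpful assistant."}]

def chat(user_message):
    context_storage.append({"role": "user", "content": user_message})
    response = call_llm(context_storage)  # carry complete context to the LLM
    context_storage.append({"role": "assistant", "content": response})  # store history
    return response  # return result to the user

print(chat("Round 1 question"))  # model sees 2 messages
print(chat("Round 2 question"))  # model sees 4 messages
```

Note how the context grows every round — the motivation for the memory extraction and management discussed next.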



Architecture for an agentic system that uses LLMs to manage memories and knowledge graphs from conversations

Data Flow & Extraction

  • Input: Conversations (messages) are fed into the system.

  • Extraction LLM: A Large Language Model processes the messages to generate structured data.

  • Outputs: "New memories" and "new entities & relations".

Storage Components

  • Vector Database: Stores "existing + new memories" for retrieval and updates.

  • Graph Database: Stores "existing + new entities & relations" for updates.

Update & Management

  • Update LLM: A second LLM manages the data updates back into the databases.

  • Functions: Manages "store updates" for both databases.

  • Operations: Includes explicit "Add", "Delete", and "Update" actions for the stored data.

Overall Goal

  • The entire system facilitates the persistent storage and management of conversation context and extracted knowledge, adding memories over time.
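The update stage above can be sketched with in-memory stands-ins for the two databases. The decision function is a stub for the update LLM, and the Add/Update/Delete dispatch mirrors the operations listed; store layouts and field names are illustrative assumptions.

```python
vector_db = {}    # memory_id -> memory text (stands in for a vector database)
graph_db = set()  # (entity, relation, entity) triples (stands in for a graph database)

def decide_operation(new_memory, existing):
    # Stub for the update LLM: in this sketch, a repeated id means UPDATE.
    # A real update LLM could also return "DELETE" for contradicted memories.
    return "UPDATE" if new_memory["id"] in existing else "ADD"

def apply_memory(new_memory):
    op = decide_operation(new_memory, vector_db)
    if op in ("ADD", "UPDATE"):
        vector_db[new_memory["id"]] = new_memory["text"]
    elif op == "DELETE":
        vector_db.pop(new_memory["id"], None)
    for triple in new_memory.get("triples", []):
        graph_db.add(triple)  # merge new entities & relations into the graph
    return op

print(apply_memory({"id": "pref-1",
                    "text": "User prefers lightweight laptops",
                    "triples": [("user", "prefers", "lightweight laptops")]}))  # ADD
```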


Enhanced Context Storage

System Components & Definitions

  • System Prompt: System instructions or configuration.

  • Dialog History: Record of the conversation.

  • Tool Definitions: Available functions for the AI agent.

  • weather_api(): requires location, date

  • calculator(): requires expression

  • Thinking: The agent's reasoning process.

Memory Interaction

  • Extraction (from Thinking to Memory)

  • Backfill (from Memory to Thinking)

  • Memory: Long-term or short-term knowledge base.

Example Workflow: "Query weather (Fahrenheit)"

  • User: Query weather (Fahrenheit)

  • AI: Need to call weather API

  • Tool Usage History:

  • Call: weather_api(location='Beijing', date='today')

  • Return: {'temperature': 25, 'condition': 'sunny'}

  • Call: calculator(expression='25*1.8+32')

  • Return: 77
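The tool-usage transcript above can be replayed as runnable code. Both tools are local stubs matching the signatures listed under Tool Definitions, returning the values from the example:

```python
def weather_api(location, date):
    # Stub returning the values from the example transcript.
    return {"temperature": 25, "condition": "sunny"}

def calculator(expression):
    # A real agent would sandbox this; eval() suffices for a fixed demo input.
    return eval(expression)

weather = weather_api(location="Beijing", date="today")
fahrenheit = calculator(f"{weather['temperature']}*1.8+32")  # 25°C -> °F
print(f"{weather['condition']}, {fahrenheit}°F")  # sunny, 77.0°F
```

The user's Fahrenheit preference is exactly the kind of fact the extraction step would write to memory, so later sessions can backfill it without being asked again.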

Benefit:

  • Long-term retention and efficient management

  • Continuous knowledge update

  • Personalized service

  • Complex task support

  • Improve interaction quality

  • From stateless to stateful



AgentCore Memory

Amazon Bedrock AgentCore

  • Build, deploy, and operate high-performance agents securely and at scale, using any framework and model, without managing infrastructure.

Components

  • Runtime

  • Memory

  • Identity

  • Gateway

  • Code Interpreter

  • Browser Tool

  • Observability

Simplified Memory System Management

  • Abstract memory infrastructure

  • Based on serverless architecture for automatic scaling

  • Automatically store and manage the context information of Agents across multiple sessions

Enterprise-Level Services

  • Provide dedicated storage for each customer to fully guarantee data privacy

  • Provide encryption protection and regionalized data storage to meet enterprise-level security needs

Deep Customization

  • Customize memory modes according to specific application scenarios

  • Set long-term memory extraction rules

  • Select an appropriate model and customize the prompts to optimize long-term memory extraction



Agent Memory Components

Agent Core:

  • Agent Reasoning

  • Agent State

  • Knowledge Components

  • Tool Calling

  • Policy Definition

Short-Term Memory:

  • Context Window

  • Conversation History

  • Tool Calling History

Long-Term Memory:

  • Events Summary

  • User Profile Information

  • Document Information

Connections:

  • Storage - between Agent Core and Memory components

  • Retrieve - between Memory components and Automatic Memory Retrieval Module

  • Automatic Memory Retrieval Module
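The split above can be expressed as plain data structures. Field names mirror the slide; the types and example values are illustrative assumptions. The key design point is the separation: short-term memory holds the current session, while long-term memory persists summaries, profile facts, and documents across sessions.

```python
from dataclasses import dataclass, field

@dataclass
class ShortTermMemory:
    context_window: list = field(default_factory=list)       # current session turns
    conversation_history: list = field(default_factory=list)
    tool_calling_history: list = field(default_factory=list)

@dataclass
class LongTermMemory:
    events_summary: list = field(default_factory=list)       # cross-session summaries
    user_profile: dict = field(default_factory=dict)         # extracted preferences
    documents: dict = field(default_factory=dict)

stm = ShortTermMemory()
ltm = LongTermMemory()
stm.context_window.append({"role": "user", "content": "Query weather"})
ltm.user_profile["preferred_unit"] = "Fahrenheit"  # fact extracted by the memory module
print(ltm.user_profile)
```

An automatic retrieval module would query both stores and inject the relevant entries back into the agent's reasoning context.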



Agent Architecture & Benefits with Amazon Bedrock AgentCore Memory

Agent Functionality

  • The intelligent agent automatically breaks down complex user needs (e.g., "Recommend a laptop under 5000 yuan for a university student that runs design software") into multiple parallel sub-tasks.

  • Examples of sub-tasks: "University student usage scenario analysis", "Budget filtering", "Software running requirement matching".

Benefits of AgentCore Memory

  • Relies on the technical advantages of AgentCore Memory's long-term memory.

  • Significantly speeds up the Agent development process.

  • Achieves multiple real-world application values:

  • Token usage for smart searches based on user profiles is greatly reduced.

  • Search result accuracy rate is effectively improved.

  • Delivers a significant improvement in technical competitiveness and cost-effectiveness.

System Diagram Flow

  • User Query leads to Agent Runtime Environment (Strands Agents).

Strands Agents interacts with:

  • Tools (User Preference Retrieval)

  • Amazon Bedrock LLM (Processes Output)

  • AgentCore Memory

AgentCore Memory manages:

  • Short Term Memory (Interaction Events)

  • Automatic Memory Extraction

  • Long Term Memory (User Preferences)

  • The process results in an Agent Response to the user.
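The decomposition step above can be sketched by running the three named sub-tasks concurrently. The per-task analysis is a stub; in the real architecture each sub-task would be delegated to an LLM or tool call, with user preferences backfilled from AgentCore Memory rather than re-derived.

```python
from concurrent.futures import ThreadPoolExecutor

# The three parallel sub-tasks named in the talk.
SUB_TASKS = [
    "University student usage scenario analysis",
    "Budget filtering",
    "Software running requirement matching",
]

def run_sub_task(task):
    # Stub standing in for an LLM or tool invocation per sub-task.
    return f"{task}: done"

def answer(query):
    # Fan the sub-tasks out in parallel; map() preserves input order.
    with ThreadPoolExecutor(max_workers=len(SUB_TASKS)) as pool:
        return list(pool.map(run_sub_task, SUB_TASKS))

for line in answer("Recommend a laptop under 5000 yuan for a university "
                   "student that runs design software"):
    print(line)
```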



Team:

AWS FSI Customer Acceleration Hong Kong

AWS Amarathon Fan Club

AWS Community Builder Hong Kong
