when you are building a memory system for your agents you will likely go through the following abstractions during development. they are listed in order of the manual instrumentation required, going from fully deterministic actions to increasingly probabilistic ones.
a few definitions to make sure we are on the same page:
memory: any piece of data that is passed to an agent, a model, or the user
memory system: the data storage layer; could be any combination of sql, nosql, and blob storage
agents: a catch-all term for agents, workflows, and single api calls
why would we even want to build such a system?
when operating at a small scale you can keep shovelling the entire corpus for the model to chew on. think applications such as 'chat with pdf', or data analytics over a single spreadsheet. but as the data and complexity grow you need to be more careful with your context management, both to maintain accuracy and to reduce cost/token consumption. here are some scenarios where you might want a dedicated memory system.
due to the limits of a model's context window, it's better to provide specific information as part of the prompt instead of shovelling in the entire corpus.
you might be doing data analysis over a stream of unstructured data, in which case it's better to gather and provide specific entities instead of the entire blob.
you might be building a user preference profile, so that responses can be tailored to the user instead of being generic.
the abstractions
query represents any piece of code/statement that you may execute in order to read or update data, or to make changes to the storage system. it's a catch-all term for all the crud operations you may perform on a storage system at any level (schema, table, object)
let us now construct a pseudo memory interface
memory = new Memory()
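to make the rest of the walkthrough concrete, here is a minimal python sketch of what that interface could look like. the class, method names, and signatures are assumptions layered on the pseudocode above, not a prescribed design.

from typing import Any, Callable

class Memory:
    """pseudo memory interface; everything here is illustrative."""

    def __init__(self, storage: Any = None, llm: Callable[[str], str] | None = None):
        self.storage = storage  # e.g. a sql connection, a nosql client, or a blob store
        self.llm = llm          # any prompt-in / text-out callable, used from abstraction 2 onwards

    def execute_query(self, query: str) -> Any:
        """abstraction 1: run a fully specified query against the storage system."""
        ...

    def generate_query(self, requirement: str) -> list[str]:
        """abstraction 2: derive the query (or queries) needed to fulfil a requirement."""
        ...

    def generate_requirement(self, vague_intent: str) -> str:
        """abstraction 3: turn an ambiguous intent into a concrete requirement."""
        ...

    def execute(self, intent: str) -> Any:
        """minimal exposure: run the whole pipeline internally."""
        ...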
1. exact query known
in this scenario you know the exact query that needs to be run. it's completely deterministic.
memory.execute_query(_query_)
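as a concrete (and assumed) example, backing the memory system with a local sqlite file, abstraction 1 is just plain query execution:

import sqlite3

class SqliteMemory:
    """sketch of abstraction 1, assuming sqlite as the storage system."""

    def __init__(self, db_path: str = "memory.db"):
        self.conn = sqlite3.connect(db_path)

    def execute_query(self, query: str, params: tuple = ()):
        # the caller supplies the exact statement; nothing is derived or guessed
        cursor = self.conn.execute(query, params)
        self.conn.commit()
        return cursor.fetchall()

memory = SqliteMemory()
memory.execute_query("CREATE TABLE IF NOT EXISTS preferences (user_id TEXT, key TEXT, value TEXT)")
memory.execute_query("INSERT INTO preferences VALUES (?, ?, ?)", ("u1", "tone", "concise"))
memory.execute_query("SELECT key, value FROM preferences WHERE user_id = ?", ("u1",))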
2. user intent (read vs update) known
in this scenario you only know the requirement. a requirement here is ideally a combination of user input/intent and knowledge about the existing system. the exact query will have to be derived and then executed.
query = memory.generate_query(_requirement_)
memory.execute_query(_query_)
we may choose to expose only an 'execute' function, which would run both steps internally
memory.execute(_requirement_)
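one way this derive-then-execute flow could look, written as standalone functions for brevity. the llm callable, the schema hint, and the prompt wording are all assumptions; in practice these would live on the memory class:

from typing import Callable

LLM = Callable[[str], str]  # any prompt-in / text-out callable; the provider doesn't matter here
SCHEMA_HINT = "tables: preferences(user_id TEXT, key TEXT, value TEXT)"

def generate_query(requirement: str, llm: LLM, schema: str = SCHEMA_HINT) -> str:
    """derive the exact query from the requirement plus knowledge about the existing system."""
    prompt = (
        f"schema:\n{schema}\n"
        f"requirement: {requirement}\n"
        "return a single sql statement that fulfils the requirement, and nothing else."
    )
    return llm(prompt).strip()

def execute(requirement: str, memory, llm: LLM):
    """minimal exposure: derive the query internally, then run it."""
    return memory.execute_query(generate_query(requirement, llm))

# e.g. execute("fetch every stored preference for user u1", memory, llm=some_llm_callable)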
3. ambiguous intent known
in this scenario you have bits of data and some semblance of intent. you have knowledge about the existing system, but you are not sure about the user intent. it could be a read intent for existing data, or an intent involving data that does not exist yet; in the latter case the data will have to be generated.
requirement = memory.generate_requirement(_vague_intent_)
List<query> queries = memory.generate_query(_requirement_)
# because the intent is vague we may end up with a list of queries that'll have to be executed in order to fulfil the action
for query in queries:
    memory.execute_query(_query_)
we may choose to expose a minimal function called execute, which runs all the steps internally
memory.execute(_vague_intent_)
this minimal exposure is the ideal state, because the developer doesn't have to worry about anything. it is also the one most prone to error, because every step is guided by llms and the errors from those steps compound.
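a sketch of what that full pipeline could look like, reusing the same assumed llm callable and schema hint from the previous sketch; the prompts and the one-statement-per-line convention are illustrative only:

from typing import Callable

LLM = Callable[[str], str]
SCHEMA_HINT = "tables: preferences(user_id TEXT, key TEXT, value TEXT)"

def generate_requirement(vague_intent: str, llm: LLM, schema: str = SCHEMA_HINT) -> str:
    """firm up an ambiguous intent into a concrete requirement."""
    prompt = (
        f"schema:\n{schema}\n"
        f"vague intent: {vague_intent}\n"
        "restate this as a concrete requirement: what to read, update, or create."
    )
    return llm(prompt).strip()

def generate_queries(requirement: str, llm: LLM, schema: str = SCHEMA_HINT) -> list[str]:
    """a vague intent may fan out into several queries, e.g. create a table and then insert into it."""
    prompt = (
        f"schema:\n{schema}\n"
        f"requirement: {requirement}\n"
        "return one sql statement per line, in execution order, and nothing else."
    )
    return [line.strip() for line in llm(prompt).splitlines() if line.strip()]

def execute(vague_intent: str, memory, llm: LLM):
    """minimal exposure: every step here is llm-guided, so an error in any step compounds downstream."""
    requirement = generate_requirement(vague_intent, llm)
    for query in generate_queries(requirement, llm):
        memory.execute_query(query)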