Memory Service for Agentic Applications
Language models are stateless: they do not persist information across calls. Language agents, in contrast, may store and maintain information internally to support multi-step interaction with the world. They explicitly organize information (mainly textual, though other modalities are also allowed) into multiple memory modules, each holding a different form of information. These include short-term working memory and three long-term memories: episodic, semantic, and procedural.
1. Working Memory: Maintains active and readily available information for the current decision cycle. This includes perceptual inputs, active knowledge, and core information carried over from previous cycles.
When to Implement:
When the agent needs to hold temporary information that is actively being used for decision-making processes.
For Intermediate Reasoning: When generating intermediate steps or reasoning paths that need to be referenced within the same decision cycle.
As a Central Hub: To connect and synchronize interactions between the LLM, long-term memories, and grounding interfaces.
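As a rough illustration of the points above, working memory can be sketched as a per-cycle scratchpad. This assumes a plain Python dictionary is enough to hold the current input and intermediate reasoning; the class name `WorkingMemory` and its methods are hypothetical and chosen only for this example.

```python
from typing import Any


class WorkingMemory:
    """Hypothetical per-cycle scratchpad; all names here are illustrative."""

    def __init__(self) -> None:
        self._slots: dict[str, Any] = {}

    def update(self, **items: Any) -> None:
        """Add or overwrite entries used in the current decision cycle."""
        self._slots.update(items)

    def get(self, key: str, default: Any = None) -> Any:
        """Read one entry, returning `default` if it is absent."""
        return self._slots.get(key, default)

    def snapshot(self) -> dict[str, Any]:
        """Return a copy, for example to build a prompt or log the cycle."""
        return dict(self._slots)

    def start_new_cycle(self, carry_over: tuple[str, ...] = ()) -> None:
        """Drop everything except the keys that must survive into the next cycle."""
        self._slots = {k: v for k, v in self._slots.items() if k in carry_over}


wm = WorkingMemory()
wm.update(user_message="Tell me about Python", reasoning_step="look up general facts")
wm.start_new_cycle(carry_over=("user_message",))
```

After `start_new_cycle`, only the carried-over entry remains, which mirrors the idea that intermediate reasoning is scoped to a single decision cycle.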
2. Episodic Memory: Stores experiences from earlier decision cycles, such as training input-output pairs, history event flows, and game trajectories.
When to Implement:
For Learning from Experience: When the agent needs to learn from past experiences and use this information to inform future decisions.
During Planning Stages: To retrieve relevant episodes into working memory to support reasoning and decision-making.
As Historical Records: To maintain a history of the agent’s interactions and decisions over time.
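One way to picture episodic memory is as an append-only log with a retrieval step. The sketch below assumes episodes are simple message/response pairs and uses naive word overlap for retrieval; a real system would more likely use embeddings or another similarity measure. `Episode` and `EpisodicMemory` are hypothetical names.

```python
from dataclasses import dataclass, field


@dataclass
class Episode:
    user_message: str
    llm_response: str


@dataclass
class EpisodicMemory:
    """Hypothetical append-only log of past decision cycles."""

    episodes: list[Episode] = field(default_factory=list)

    def add(self, user_message: str, llm_response: str) -> None:
        """Record one finished decision cycle."""
        self.episodes.append(Episode(user_message, llm_response))

    def retrieve(self, query: str, limit: int = 3) -> list[Episode]:
        """Return the most recent episodes sharing at least one word with the query."""
        query_words = set(query.lower().split())
        matches = [
            ep for ep in self.episodes
            if query_words & set(ep.user_message.lower().split())
        ]
        return matches[-limit:]


memory = EpisodicMemory()
memory.add("Tell me about Python", "Python is a high-level programming language.")
related = memory.retrieve("What can I use Python for?")
unrelated = memory.retrieve("hello there")
```

Here `related` contains the earlier episode because both messages share the word "python", while `unrelated` is empty.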
3. Semantic Memory: Stores the agent’s knowledge about the world and itself. This includes structured and unstructured information that can be retrieved to support reasoning and decision-making.
When to Implement:
For Knowledge Retrieval: When the agent needs to retrieve factual information or knowledge to support its reasoning processes.
To Build World Knowledge: When the agent needs to incrementally learn and store new knowledge obtained from LLM reasoning or external sources.
As a Knowledge Base: To provide a rich source of information that the agent can reference during its operations.
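Semantic memory can be approximated, for illustration, as a topic-keyed fact store that supports both retrieval and incremental learning. The sketch assumes substring matching on topic names is an acceptable stand-in for real retrieval; `SemanticMemory`, `learn`, and `retrieve` are hypothetical names, not part of any library.

```python
class SemanticMemory:
    """Hypothetical fact store keyed by topic; not a real library API."""

    def __init__(self) -> None:
        self._facts: dict[str, list[str]] = {}

    def learn(self, topic: str, fact: str) -> None:
        """Incrementally add a fact, skipping exact duplicates."""
        facts = self._facts.setdefault(topic.lower(), [])
        if fact not in facts:
            facts.append(fact)

    def retrieve(self, query: str) -> list[str]:
        """Return the facts of every stored topic mentioned in the query."""
        text = query.lower()
        return [
            fact
            for topic, facts in self._facts.items()
            if topic in text
            for fact in facts
        ]


kb = SemanticMemory()
overview = "Python is a high-level programming language known for its readability and versatility."
kb.learn("Python", overview)
kb.learn("Python", overview)  # duplicate, ignored
facts = kb.retrieve("Tell me about Python")
missing = kb.retrieve("Tell me about Rust")
```

The `learn` method covers the "build world knowledge" case, and `retrieve` covers the knowledge-base lookup; an unknown topic simply yields an empty list.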
4. Procedural Memory: Contains implicit knowledge stored in the LLM weights and explicit knowledge written in the agent’s code. This includes procedures for actions and decision-making.
When to Implement:
For Action Implementation: When the agent needs to perform specific actions such as reasoning, retrieval, grounding, and learning.
For Decision-Making: To define the procedures and logic for how the agent makes decisions.
To Bootstrap the Agent: Procedural memory must be initialized with proper code from the start to ensure the agent can function correctly.
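The explicit part of procedural memory (the code side, not the LLM weights) could be sketched as a registry of named functions that is populated when the agent starts. The `PROCEDURES` registry, the `procedure` decorator, and both registered functions are assumptions made for this example.

```python
from typing import Callable

# Hypothetical registry mapping procedure names to plain Python functions.
PROCEDURES: dict[str, Callable[..., str]] = {}


def procedure(name: str) -> Callable[[Callable[..., str]], Callable[..., str]]:
    """Decorator that registers a function under the given name."""
    def register(func: Callable[..., str]) -> Callable[..., str]:
        PROCEDURES[name] = func
        return func
    return register


@procedure("retrieve")
def retrieve_fact(topic: str, knowledge: dict[str, str]) -> str:
    # Action procedure: look up a fact, returning an empty string if unknown.
    return knowledge.get(topic, "")


@procedure("decide")
def decide_response(fact: str) -> str:
    # Decision procedure: answer with the fact if one was found.
    return fact if fact else "I don't know about that yet."


knowledge = {"python": "Python is a high-level programming language."}
fact = PROCEDURES["retrieve"]("python", knowledge)
reply = PROCEDURES["decide"](fact)
```

Because the decorators run at import time, the registry is filled before the first decision cycle, which is one simple way to satisfy the bootstrapping requirement.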
Scenario:
Imagine a virtual assistant named “PythonBot” designed to help users learn about Python programming. PythonBot uses a sophisticated memory system to provide accurate and contextually relevant responses.
Memory Initialization
- Semantic Memory:
Store general knowledge about Python programming.
Example Content: “Python is a high-level programming language known for its readability and versatility. It is used for web development, data analysis, artificial intelligence, scientific computing, and more.”
- Procedural Memory:
Define the procedures and actions the bot can perform, such as retrieving information, reasoning, and making decisions.
Example Procedures:
a. Action Procedure: Retrieve information about Python concepts.
b. Decision Procedure: Determine the best response based on user queries and available knowledge.
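The initialization described above might look like the following sketch. It assumes all four memories can be bundled in one container, with working and episodic memory starting empty; `AgentMemory`, `init_pythonbot_memory`, and the two procedure functions are hypothetical names.

```python
from dataclasses import dataclass, field
from typing import Callable


def retrieve_python_info(semantic: dict[str, str], topic: str) -> str:
    """Action procedure: look up a stored fact (empty string if missing)."""
    return semantic.get(topic, "")


def choose_response(fact: str) -> str:
    """Decision procedure: fall back to a default when nothing was retrieved."""
    return fact or "I don't have information on that yet."


@dataclass
class AgentMemory:
    working: dict[str, str] = field(default_factory=dict)
    episodic: list[dict[str, str]] = field(default_factory=list)
    semantic: dict[str, str] = field(default_factory=dict)
    procedural: dict[str, Callable[..., str]] = field(default_factory=dict)


def init_pythonbot_memory() -> AgentMemory:
    """Seed semantic and procedural memory; the other two start empty."""
    return AgentMemory(
        semantic={
            "python": (
                "Python is a high-level programming language known for its "
                "readability and versatility. It is used for web development, "
                "data analysis, artificial intelligence, scientific computing, "
                "and more."
            )
        },
        procedural={"retrieve": retrieve_python_info, "decide": choose_response},
    )


memory = init_pythonbot_memory()
```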
User Interaction Flow
Step 1: User Sends a Message
User Message: “Tell me about Python.”
Working Memory:
Update: Store the user message as the current input.
Content: {"user_message": "Tell me about Python"}
Episodic Memory:
Retrieve: Check if there are any past interactions related to Python.
Result: No relevant past interactions found (initial interaction).
Semantic Memory:
Retrieve: Fetch general knowledge about Python.
Result: “Python is a high-level programming language known for its readability and versatility.”
Procedural Memory:
Execute: Use the decision procedure to generate a response based on the user message and retrieved knowledge.
Result: “Python is a high-level programming language known for its readability and versatility. It is used for web development, data analysis, artificial intelligence, scientific computing, and more.”
Update Working Memory:
Content: {"user_message": "Tell me about Python", "llm_response": "Python is a high-level programming language known for its readability and versatility. It is used for web development, data analysis, artificial intelligence, scientific computing, and more."}
Update Episodic Memory:
Content: [{"user_message": "Tell me about Python", "llm_response": "Python is a high-level programming language known for its readability and versatility. It is used for web development, data analysis, artificial intelligence, scientific computing, and more."}]
Response to User: “Python is a high-level programming language known for its readability and versatility. It is used for web development, data analysis, artificial intelligence, scientific computing, and more.”
Step 2: User Sends a Follow-Up Message
User Message: “What can I use Python for?”
Working Memory:
Update: Store the new user message.
Content: {"user_message": "What can I use Python for?"}
Episodic Memory:
Retrieve: Check past interactions related to Python usage.
Result: The previous interaction about Python is retrieved.
Semantic Memory:
Retrieve: Fetch additional knowledge about Python applications.
Result: “Python is used for web development, data analysis, artificial intelligence, scientific computing, automation, and more.”
Procedural Memory:
Execute: Use the decision procedure to generate a response combining past interactions and additional knowledge.
Result: “In addition to what I mentioned earlier, Python is used for web development, data analysis, artificial intelligence, scientific computing, automation, and more.”
Update Working Memory:
Content: {"user_message": "What can I use Python for?", "llm_response": "In addition to what I mentioned earlier, Python is used for web development, data analysis, artificial intelligence, scientific computing, automation, and more."}
Update Episodic Memory:
Content: [{"user_message": "Tell me about Python", "llm_response": "Python is a high-level programming language known for its readability and versatility. It is used for web development, data analysis, artificial intelligence, scientific computing, and more."}, {"user_message": "What can I use Python for?", "llm_response": "In addition to what I mentioned earlier, Python is used for web development, data analysis, artificial intelligence, scientific computing, automation, and more."}]
Response to User: “In addition to what I mentioned earlier, Python is used for web development, data analysis, artificial intelligence, scientific computing, automation, and more.”
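The two steps above can be sketched as a single decision-cycle function called twice. This is a simulation under loud assumptions: no real LLM is called, the response is assembled from a string template, and the rule that routes a query to the "uses" fact (the presence of the word "use") is a crude placeholder. `SEMANTIC` and `decision_cycle` are hypothetical names.

```python
# Assumed semantic memory contents, taken from the scenario text.
SEMANTIC = {
    "overview": (
        "Python is a high-level programming language known for its readability "
        "and versatility. It is used for web development, data analysis, "
        "artificial intelligence, scientific computing, and more."
    ),
    "uses": (
        "Python is used for web development, data analysis, artificial "
        "intelligence, scientific computing, automation, and more."
    ),
}


def decision_cycle(user_message: str, episodic: list[dict[str, str]]) -> str:
    # Working memory: holds only what this cycle needs.
    working = {"user_message": user_message}

    # Episodic retrieval: any earlier turn that mentioned Python.
    past = [ep for ep in episodic if "python" in ep["user_message"].lower()]

    # Semantic retrieval: placeholder routing rule, not a real retriever.
    topic = "uses" if "use" in user_message.lower() else "overview"
    fact = SEMANTIC[topic]

    # Decision procedure (stand-in for the LLM): acknowledge earlier turns.
    if past:
        response = "In addition to what I mentioned earlier, " + fact
    else:
        response = fact

    # Update working memory, then append the finished cycle to episodic memory.
    working["llm_response"] = response
    episodic.append(dict(working))
    return response


episodic: list[dict[str, str]] = []
first = decision_cycle("Tell me about Python", episodic)
second = decision_cycle("What can I use Python for?", episodic)
```

On the first call episodic memory is empty, so the overview is returned as is; on the second call the stored first turn is found, which is what triggers the "In addition to what I mentioned earlier" prefix.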
Summary
In this example, PythonBot uses:
Working Memory: To store current user inputs and intermediate results.
Episodic Memory: To keep track of past interactions and use them to inform future responses.
Semantic Memory: To store and retrieve factual information about Python.
Procedural Memory: To define how to process user inputs, retrieve relevant information, and generate coherent responses.
This approach ensures that PythonBot can provide accurate, contextually relevant, and coherent responses by leveraging the different types of memory effectively. It demonstrates how a language agent can simulate understanding and continuity in a conversation, making the interaction more meaningful and helpful for the user.
Reference:
1. Cognitive Architectures for Language Agents — https://arxiv.org/pdf/2309.02427