-
Notifications
You must be signed in to change notification settings - Fork 1.5k
Description
Required prerequisites
- I have searched the Issue Tracker and Discussions that this hasn't already been reported. (+1 or comment there if it has.)
- Consider asking first in a Discussion.
Motivation
The Agent's context could get quite challenging to manage after a number of rounds of conversations and messages. Aside from creating context overload, and higher inference price, it could create problems for accurate agent response.
Solution
To effectively manage the context of the agent's history, we can use markdown files so the agent could offload its memory. Markdown Agent Context Handling can use markdown files to save memory into disk, rather than the LLM's context window.
We would save two .md
files:
- the complete history of the LLM conversation, in case it would be used by the agent later on
- a summary file which includes the key points about all the critical information that the agent will need, potentially: what the agent has done so far, what tools it used for what, user-specific information and preferences, what needs to be done further and how.
The summary.md
is also a reflection opportunity for the agent to reflect on the work done so far and the plan for the future, potentially similar to CoT.
The Implementation:
MarkdownMemoryToolkit
for handling the core task of working with memory files. In the backend it uses existingNoteTakingToolkit
for actual file operations.MarkdownAgentMemory
(new memory class): to handle the memory lifecycle, trigger memory saving, and clear active memory after save.
What happens to agent's memory after saving context?
The idea is to start fresh: clear the memory, initialize it with the summary.md
file. The context would be fresh, and include all the key information to know.
When is the Context saved into Markdown?
A hybrid of two approaches could be done:
- The agent can save the memory using a function call from the MarkdownMemoryToolkit to save the memory, clear context, and start it with summary all in one step (to avoid multi function confusion).
- programmatically we can save the memory by setting a threshold on the number of messages in the memory (like 30, as number of messages increase, performance degrades) or some other measure.
Alternatives
No response
Additional context
No response