Skip to content

[Feature Request] Markdown Agent Context Handling #2919

@hesamsheikh

Description

@hesamsheikh

Required prerequisites

Motivation

The Agent's context could get quite challenging to manage after a number of rounds of conversations and messages. Aside from creating context overload, and higher inference price, it could create problems for accurate agent response.

Solution

To effectively manage the context of the agent's history, we can use markdown files so the agent could offload its memory. Markdown Agent Context Handling can use markdown files to save memory into disk, rather than the LLM's context window.

We would save two .md files:

  • the complete history of the LLM conversation, in case it would be used by the agent later on
  • a summary file which includes the key points about all the critical information that the agent will need, potentially: what the agent has done so far, what tools it used for what, user-specific information and preferences, what needs to be done further and how.

The summary.md is also a reflection opportunity for the agent to reflect on the work done so far and the plan for the future, potentially similar to CoT.

The Implementation:

  1. MarkdownMemoryToolkit for handling the core task of working with memory files. In the backend it uses existing NoteTakingToolkit for actual file operations.
  2. MarkdownAgentMemory (new memory class): to handle the memory lifecycle, trigger memory saving, and clear active memory after save.

What happens to agent's memory after saving context?
The idea is to start fresh: clear the memory, initialize it with the summary.md file. The context would be fresh, and include all the key information to know.

When is the Context saved into Markdown?
A hybrid of two approaches could be done:

  1. The agent can save the memory using a function call from the MarkdownMemoryToolkit to save the memory, clear context, and start it with summary all in one step (to avoid multi function confusion).
  2. programmatically we can save the memory by setting a threshold on the number of messages in the memory (like 30, as number of messages increase, performance degrades) or some other measure.

Alternatives

No response

Additional context

No response

Metadata

Metadata

Assignees

Projects

No projects

Milestone

No milestone

Relationships

None yet

Development

No branches or pull requests

Issue actions