[Theia AI] Persistence for chats #15031

@JonasHelming

Chat Persistence

The goal of this feature is to persist and restore chat sessions within Theia AI.

Requirements

Functionality

  • Persist chat sessions automatically
  • Restore chat sessions automatically
  • The user should be able to seamlessly continue restored chat sessions

Framework

  • Storage mechanism must be configurable
  • Adopter enhancements of Chats must be able to participate
    • Custom ResponseParts
    • Custom data

To be determined

  • Offer a workspace-specific history or a combined history over all workspaces

Analysis of chat persistence in VS Code

ChatModel

The ChatModel in VS Code roughly corresponds to the MutableChatModel in Theia. Both represent a "Chat Session".
The ChatModel accepts serialized data in the constructor from which it can restore all fields, requests, responses etc. See here.

Correspondingly the constructors of the requests and responses handle the case in which they are restored too, see for example here.

Serialization

The ChatModel offers toExport() and toJSON() methods which produce a serialized version of the ChatModel. toJSON calls toExport internally and just adds some additional metadata. See here.

Storage

By default, VS Code stores chat sessions in the workspace storage. On Linux they can be found in ~/.config/Code/User/workspaceStorage/<hash-of-workspace>/chatSessions, with a separate JSON file for each chat session.

The store code can be found here.

Versioning

As the ChatModel is very actively developed, its structure changes often, and old serialized data then no longer fits the changed model. The serialized data is therefore versioned. If an old format is encountered, it is transformed to the new format on the fly as well as possible. See here and here.
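The on-the-fly upgrade described above can be sketched as follows. This is a minimal illustration, not VS Code's actual format: the two serialized shapes (`SerializedChatV1`, `SerializedChatV2`) and the `normalize` function are hypothetical names chosen for this example.

```typescript
// Hypothetical serialized shapes; neither VS Code's nor Theia's actual format.
interface SerializedChatV1 {
    version: 1;
    messages: string[];
}

interface SerializedChatV2 {
    version: 2;
    requests: { text: string }[];
}

type AnySerializedChat = SerializedChatV1 | SerializedChatV2;

// Upgrade any known older format to the current one while loading.
function normalize(data: AnySerializedChat): SerializedChatV2 {
    if (data.version === 1) {
        // V1 stored plain strings; V2 wraps each message in an object.
        return { version: 2, requests: data.messages.map(text => ({ text })) };
    }
    // Already the current format, nothing to do.
    return data;
}
```

The rest of the code then only ever has to deal with the newest shape, while old files stay readable.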

Summary

Chat persistence is relatively straightforward in VS Code, as VS Code controls its complete model. In Theia we foresee custom response parts by adopters. We also allow storing and retrieving arbitrary data throughout our MutableChatModel. Therefore the persistence must be implemented more generically.

Versioning also makes sense and should be built into Theia from the beginning.

Theia has no workspaceStorage comparable to VS Code's; instead we typically use localStorage. For chat persistence we could do the same, let the user configure a directory, or store the sessions within the Theia user storage.

Implementation in Theia

Serialization

Instead of implementing the serialization within the MutableChatModel, as it's done in VS Code, we should use a ChatModelSerializer service. This will allow adopters to override the behavior as they see fit. The ChatModelSerializer's responsibility is to serialize a MutableChatModel to JSON and to deserialize JSON to a MutableChatModel. For the latter we might need to extend MutableChatModel a bit, for example to accept serialized data in its constructor, similar to VS Code.
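As a rough sketch of the proposed service, assuming a heavily simplified stand-in for MutableChatModel: the names `SimpleChatModel`, `SerializedChatModel` and `DefaultChatModelSerializer` are hypothetical, and the real model has many more fields.

```typescript
// Simplified stand-in for Theia's MutableChatModel (assumption for this sketch).
class SimpleChatModel {
    constructor(readonly id: string, readonly requests: string[] = []) {}
}

// Hypothetical serialized form, versioned from the start.
interface SerializedChatModel {
    version: number;
    id: string;
    requests: string[];
}

// Hypothetical service contract following the proposal in the text.
interface ChatModelSerializer {
    serialize(model: SimpleChatModel): SerializedChatModel;
    deserialize(data: SerializedChatModel): SimpleChatModel;
}

class DefaultChatModelSerializer implements ChatModelSerializer {
    serialize(model: SimpleChatModel): SerializedChatModel {
        return { version: 1, id: model.id, requests: [...model.requests] };
    }

    deserialize(data: SerializedChatModel): SimpleChatModel {
        // The model accepts the restored fields in its constructor.
        return new SimpleChatModel(data.id, [...data.requests]);
    }
}
```

Because the logic lives in a service rather than in the model, an adopter could replace the default implementation with their own, for example through Theia's dependency injection.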

MutableChatRequestModel (de)serialization should be performed by a ChatRequestModelSerializer.

MutableChatResponseModel (de)serialization should be performed by a ChatResponseModelSerializer. The (de)serialization of each ChatResponseContent entry must be delegated based on the ChatResponseContent.kind. This could be done with a registry, for example.
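A kind-based registry could look roughly like this. Only the `kind` discriminator comes from the text; `ResponseContent`, `ResponseContentSerializer`, `ResponseContentSerializerRegistry` and the `text` example are hypothetical names for this sketch.

```typescript
// Simplified stand-in for ChatResponseContent; only `kind` is taken from the text.
interface ResponseContent {
    kind: string;
}

interface TextContent extends ResponseContent {
    kind: 'text';
    text: string;
}

// Hypothetical contribution contract: one serializer per content kind.
interface ResponseContentSerializer {
    readonly kind: string;
    serialize(content: ResponseContent): unknown;
    deserialize(data: unknown): ResponseContent;
}

// Each serialized entry records its kind so it can be routed on restore.
interface SerializedContent {
    kind: string;
    data: unknown;
}

class ResponseContentSerializerRegistry {
    private readonly serializers = new Map<string, ResponseContentSerializer>();

    register(serializer: ResponseContentSerializer): void {
        this.serializers.set(serializer.kind, serializer);
    }

    serialize(content: ResponseContent): SerializedContent {
        return { kind: content.kind, data: this.get(content.kind).serialize(content) };
    }

    deserialize(entry: SerializedContent): ResponseContent {
        return this.get(entry.kind).deserialize(entry.data);
    }

    private get(kind: string): ResponseContentSerializer {
        const serializer = this.serializers.get(kind);
        if (!serializer) {
            throw new Error(`No serializer registered for kind '${kind}'`);
        }
        return serializer;
    }
}

// Example contribution for a plain text part.
const textSerializer: ResponseContentSerializer = {
    kind: 'text',
    serialize: content => ({ text: (content as TextContent).text }),
    deserialize: data => ({ kind: 'text', text: (data as { text: string }).text } as TextContent)
};
```

Adopters with custom response parts would register their own serializer under their own kind.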

MutableChatModel, MutableChatRequestModel and MutableChatResponseModel can additionally contain arbitrary data. Data which is meant to be (de)serialized will either need to be serializable itself, or specify an id. An injectable ChatDataSerializer could then (de)serialize data with an id.
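One possible reading of this rule, as a sketch: values that are plain JSON are stored as they are, while other values carry an id naming the serializer responsible for them. The `serializerId` property, the `SerializedEntry` shape and the `Timestamp` example are assumptions made for illustration only.

```typescript
// Hypothetical marker: data that is not plain JSON names its serializer.
interface IdentifiedData {
    serializerId: string;
}

// Hypothetical contract for the injectable serializer mentioned in the text.
interface ChatDataSerializer {
    readonly id: string;
    serialize(data: IdentifiedData): unknown;
    deserialize(raw: unknown): IdentifiedData;
}

interface SerializedEntry {
    id?: string;
    value: unknown;
}

function hasSerializerId(value: unknown): value is IdentifiedData {
    return typeof value === 'object' && value !== null
        && typeof (value as { serializerId?: unknown }).serializerId === 'string';
}

function serializeData(data: Map<string, unknown>, serializers: Map<string, ChatDataSerializer>): Record<string, SerializedEntry> {
    const result: Record<string, SerializedEntry> = {};
    data.forEach((value, key) => {
        if (!hasSerializerId(value)) {
            result[key] = { value }; // assumed to be JSON-serializable itself
            return;
        }
        const serializer = serializers.get(value.serializerId);
        if (serializer) {
            result[key] = { id: serializer.id, value: serializer.serialize(value) };
        }
        // Data whose serializer is unavailable is skipped in this sketch.
    });
    return result;
}

function deserializeData(raw: Record<string, SerializedEntry>, serializers: Map<string, ChatDataSerializer>): Map<string, unknown> {
    const result = new Map<string, unknown>();
    Object.keys(raw).forEach(key => {
        const entry = raw[key];
        if (entry === undefined) {
            return;
        }
        if (entry.id === undefined) {
            result.set(key, entry.value);
            return;
        }
        const serializer = serializers.get(entry.id);
        if (serializer) {
            result.set(key, serializer.deserialize(entry.value));
        }
    });
    return result;
}

// Example: a value holding a Date, which JSON alone would not restore as a Date.
class Timestamp implements IdentifiedData {
    readonly serializerId = 'timestamp';
    constructor(readonly date: Date) {}
}

const timestampSerializer: ChatDataSerializer = {
    id: 'timestamp',
    serialize: data => (data as Timestamp).date.toISOString(),
    deserialize: raw => new Timestamp(new Date(raw as string))
};
```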

Storage

For the MVP we could just store the last 10 sessions in localStorage. Later we can switch to a file based storage, similar to VS Code.
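A minimal sketch of such an MVP store, assuming the most recently saved session comes first and older ones are dropped beyond the limit. `KeyValueStorage` mirrors the `getItem`/`setItem` subset of the Web Storage API, so the browser's `localStorage` could be passed in directly; `ChatSessionStore`, `StoredSession` and the storage key are hypothetical.

```typescript
// Minimal subset of the Web Storage API; the browser's localStorage satisfies it.
interface KeyValueStorage {
    getItem(key: string): string | null;
    setItem(key: string, value: string): void;
}

// In-memory implementation so the sketch also runs outside a browser.
class InMemoryStorage implements KeyValueStorage {
    private readonly items = new Map<string, string>();
    getItem(key: string): string | null {
        return this.items.get(key) ?? null;
    }
    setItem(key: string, value: string): void {
        this.items.set(key, value);
    }
}

interface StoredSession {
    id: string;
    data: unknown;
}

const STORAGE_KEY = 'theia-ai.chat-sessions'; // hypothetical key
const MAX_SESSIONS = 10;

class ChatSessionStore {
    constructor(private readonly storage: KeyValueStorage) {}

    load(): StoredSession[] {
        const raw = this.storage.getItem(STORAGE_KEY);
        if (raw === null) {
            return [];
        }
        try {
            const parsed: unknown = JSON.parse(raw);
            return Array.isArray(parsed) ? parsed as StoredSession[] : [];
        } catch {
            // Corrupt data is treated as an empty history.
            return [];
        }
    }

    save(session: StoredSession): void {
        // Move the saved session to the front and drop the oldest beyond the limit.
        const others = this.load().filter(s => s.id !== session.id);
        const sessions = [session, ...others].slice(0, MAX_SESSIONS);
        this.storage.setItem(STORAGE_KEY, JSON.stringify(sessions));
    }
}
```

Keeping the storage behind a small interface would also make the later switch to file-based storage a matter of providing another implementation.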

Versioning and structural changes

Similar to VS Code, we should consider versioning from the start. Additionally, we could think about gracefully handling cases in which serialized data can't be restored because the corresponding Theia extension (i.e. its deserializer) is not available.
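One way to handle a missing deserializer gracefully is to restore a placeholder that keeps the raw data, so nothing is lost if the session is saved again. This is only a sketch of that idea; `SerializedPart`, `UnknownPart` and `restorePart` are hypothetical names.

```typescript
interface SerializedPart {
    kind: string;
    data: unknown;
}

interface RestoredPart {
    kind: string;
}

// Placeholder for content whose deserializer is not installed.
interface UnknownPart extends RestoredPart {
    kind: 'unknown';
    originalKind: string;
    raw: unknown;
}

type PartDeserializer = (data: unknown) => RestoredPart;

function restorePart(part: SerializedPart, deserializers: Map<string, PartDeserializer>): RestoredPart {
    const deserialize = deserializers.get(part.kind);
    if (!deserialize) {
        // Keep the raw data so it can be written back unchanged later.
        const placeholder: UnknownPart = { kind: 'unknown', originalKind: part.kind, raw: part.data };
        return placeholder;
    }
    return deserialize(part.data);
}
```

The UI could then render such a placeholder as a hint that the part requires an extension which is currently not installed.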


Labels: theia-ai (issues related to TheiaAI)
