
Conversation

eneufeld
Contributor

@eneufeld eneufeld commented Mar 3, 2025

What it does

  • Added 'Start new Task' and 'Start new Task from current' actions
  • Implemented summary generation for task continuation
  • Added CoderSummaryPrompt template

How to test

Follow-ups

Breaking changes

  • This PR introduces breaking changes and requires careful review. If yes, the breaking changes section in the changelog has been updated.

Attribution

Review checklist

Reminder for reviewers

@eneufeld eneufeld mentioned this pull request Mar 4, 2025
@JonasHelming
Contributor

@planger Three questions for the UX watchmen and Context expert :-)

Here is a screenshot of the current draft:
[screenshot]

The background for this feature is that we want to proactively motivate the user to manage threads by
a) starting a new thread if they start a new task
b) starting a follow-up task with some history information to consolidate the chat history

  1. Placement of the follow-up buttons
    Currently they are just appended to every answer. This has the advantage of always reminding the user of these two options (which I believe might be really necessary). We could of course also place them somewhere else, but I fear that users might forget about them then. We already have a button to create a new thread, but in practice it is often forgotten. So I believe they should be where they are, but WDYT? We could of course also let users turn these buttons off in a setting.

  2. Visualization of the follow-up buttons
    Currently they are pretty big as they contain text. We could of course also display them smaller or even make them icons, with the risk, of course, that they are then overlooked. WDYT?

  3. How to add the history into a new thread
    Currently, the history of a previous task is just added as the first message of the follow-up task, but of course we would like to add this to the system prompt instead.
    For this, it would make sense to use a context variable. This is a bit related to "Support prompt fragments and variable dependencies with recursive resolving" #15196 (the still unsupported "add to system prompt" case).

We have several choices how to do this:

a) We define a context element "Text" and use it to add the history as plain text, with the value "History" and the contextValue "History description". This would be in memory only. This of course immediately raises the question of how the user can see the content of this from the chat and how the user can potentially modify it (a rough sketch follows this list).

b) We use the prompt variable for this: we create a new prompt file with the history and add it with the existing variable (we need the system variant then, see #15196). This way the user has at least some way of seeing and modifying it, and we use an existing mechanism, but this might get pretty confusing if we consider prompt fragments to be rather static (I am not sure if we do).

c) We create a mechanism similar to the prompt fragments, but explicitly for summaries of previous chats. We can put them into a directory or in one yaml file. This could then be accessed via its own variable, e.g. "#history".

d) We add "categories" to prompt fragments, e.g. #prompt/history:myhistory
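
To make option (a) a bit more concrete, here is a rough sketch. All identifiers are made up for illustration; none of this is existing Theia API:

```ts
// Rough sketch of option (a): an in-memory "Text" context element (hypothetical names).

interface TextContextElement {
    variable: 'Text';      // a generic plain-text context variable
    value: string;         // the label shown in the chat UI, e.g. 'History'
    contextValue: string;  // the text actually handed to the LLM, i.e. the history description
}

// When starting a follow-up task, the new chat session could be seeded with:
function createHistoryElement(summaryOfPreviousTask: string): TextContextElement {
    return {
        variable: 'Text',
        value: 'History',
        contextValue: summaryOfPreviousTask
    };
}
```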

WDYT :-)

@planger
Contributor

planger commented Mar 17, 2025

Thanks for the cool feature and for letting me weigh in.

Placement and Visualization

I think using buttons at the end of the response could be perceived by users as suggested next actions related to the agent's response. To me it would be clearer if the suggestion to create a new chat were a hint above the chat input, similar to the sketch in the screenshot below.

[screenshot]

This would also be very visible to users, would rather be associated with the global session, would be more clearly separated from the actual agent response to avoid confusion, and would avoid polluting every response of the chat model. At the same time it would feel less aggressive.

Also, I'd use links instead of buttons, to make them feel more like suggestions instead of mandatory next actions.
Alternatively, we could use link-like buttons similar to code lens actions above the chat input, like "Start new Coder task | Start follow up Coder task".

[screenshot]

Implementation of the Follow-Up Buttons

I'd find it versatile if we introduced a hints property on the chat session, where agents could set a React node that is displayed above the chat input (as shown in the screenshot above). Agents could use that for suggesting next steps, alternative queries, or actions to start a new task.
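
A minimal sketch of what I mean, assuming a hypothetical hints property (none of these names exist today):

```ts
// Minimal sketch of a hypothetical `hints` property on the chat session (not existing API).
import * as React from 'react';

interface ChatSessionWithHints {
    /** React node rendered above the chat input; undefined hides the hint area. */
    hints?: React.ReactNode;
}

// An agent could set link-style suggestions once a response has completed:
function showCoderTaskHints(
    session: ChatSessionWithHints,
    startNewTask: () => void,
    startFollowUpTask: () => void
): void {
    session.hints = React.createElement(
        'div',
        { className: 'chat-hints' },
        React.createElement('a', { key: 'new', onClick: startNewTask }, 'Start new Coder task'),
        ' | ',
        React.createElement('a', { key: 'follow', onClick: startFollowUpTask }, 'Start follow-up Coder task')
    );
}
```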

Adding the History in a New Chat

To me, option (a) feels best; a context variable matches this use case well, imho. It'd be rather straightforward to implement:
Add a context variable provider for a chat session summary, e.g. #chatSummary:<session-id>, that summarizes the chat session with the given session id into the contextValue.
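
A rough sketch of such a resolver, with made-up interfaces standing in for the actual variable service:

```ts
// Rough sketch of a #chatSummary resolver (all interfaces are made up for illustration).

interface ResolvedContextVariable {
    value: string;        // label shown in the chat input, e.g. 'chatSummary:abc123'
    contextValue: string; // content that ends up in the request sent to the LLM
}

interface ChatSessionRegistry {
    getSession(id: string): { getMessages(): string[] } | undefined;
}

async function resolveChatSummary(
    sessionId: string,
    sessions: ChatSessionRegistry,
    summarize: (messages: string[]) => Promise<string> // e.g. an LLM call with a summary prompt
): Promise<ResolvedContextVariable | undefined> {
    const session = sessions.getSession(sessionId);
    if (!session) {
        return undefined; // unknown session id, nothing to summarize
    }
    const summary = await summarize(session.getMessages());
    return { value: `chatSummary:${sessionId}`, contextValue: summary };
}
```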

It indeed has the disadvantage that its content is currently not visible to the user. But that'd also be rather straightforward to implement: we could add an open() handler to context variable elements that is executed on click (showing an editor with an in-memory resource, similar to what we use for the change set file elements). That would be nice to have anyway.

If we want to make this editable too, we'd have to find a way to store it somewhere, because context variables are only resolved when the request is submitted. Until then they are only represented as a context variable request (defined as variable type and args) in the chat input. One rather logical place to store it would be the data map of the original chat session that this variable request should summarize. So the #chatSummary variable provider could look into the data map of the chat session first, and only generate a summary if there is none yet.

If the user clicks on the context element and we open it in an editor, we could do the same (look up in the data map or generate), and once the user modifies and saves the value, store it in the data map, to be retrieved when the user submits the request with the context variable request.
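
Sketched with a made-up data map API, the look-up-or-generate logic could be as simple as:

```ts
// Look-up-or-generate sketch (hypothetical data map API on the chat session).

const SUMMARY_KEY = 'chatSummary';

interface SessionWithData {
    data: Map<string, unknown>;
    getMessages(): string[];
}

async function getOrCreateSummary(
    session: SessionWithData,
    summarize: (messages: string[]) => Promise<string>
): Promise<string> {
    const existing = session.data.get(SUMMARY_KEY);
    if (typeof existing === 'string') {
        return existing; // possibly already edited and saved by the user
    }
    const summary = await summarize(session.getMessages());
    session.data.set(SUMMARY_KEY, summary); // later edits and the final submit see the same text
    return summary;
}
```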

I haven't thought this through in detail, but at first glance this seems to introduce rather versatile generic mechanisms at little additional cost and to fulfill the requirements of this PR. But I might always be wrong. :-)

@eneufeld
Contributor Author

eneufeld commented Mar 18, 2025

Just to add my 2 cents:
I like the suggestion of hints, maybe add them to the agent header? That would work if the last agent header is 'sticky' and does not scroll away if the answer is long.

Regarding the context, I also like Philip's suggestion of a context addition and the ability to modify it in memory.

@eneufeld eneufeld closed this Mar 18, 2025
@eneufeld eneufeld reopened this Mar 18, 2025
@JonasHelming
Contributor

> Just to add my 2 cents: I like the suggestion of hints, maybe add them to the agent header? That would work if the last agent header is 'sticky' and does not scroll away if the answer is long.

What is the "Agent header"?

@eneufeld
Contributor Author

[screenshot]
Where it says "Coder", to indicate that it is the answer of the Coder agent.

@JonasHelming
Contributor

> [screenshot] Where it says "Coder", to indicate that it is the answer of the Coder agent.

Hmm, this could get easily lost when the response is long, couldn't it?

@eneufeld
Contributor Author

Yes, that's why I wrote:

> That would work if the last agent header is 'sticky' and does not scroll away if the answer is long.

@planger
Contributor

planger commented Mar 18, 2025

> maybe add them to the agent header?

Hm, also a good idea. I guess above the chat input is more "prominent" though, in line with what e.g. Claude is doing, and is probably easier to style. So I'd still be in favor of adding it on top of the chat input, if you agree?

@eneufeld
Contributor Author

eneufeld commented Mar 19, 2025

I'm fine with your suggestion. It was just one more idea.

@JonasHelming
Contributor

@colin-grant-work One additional comment on the UI that Philip described: it would be good if the UI element above the chat input is flexible enough to generally allow the agent to provide "prompt suggestions". We could then, for example, also use it to propose "good first prompts" such as "Fix the errors in the currently opened file" or "Generate a quiz game as a node-based web application" for people to get started with specific agents. I think this is pretty much covered by Philip's suggestion: the agent can define these prompt suggestions for the user. Just keep this additional use case in mind.

@colin-grant-work
Contributor

@JonasHelming, when you say

> to generally allow the agent to provide "prompt suggestions".

do you mean that the LLM backing the agent will be providing prompt suggestions, or that the developer creating the agent will be responsible for prompt suggestions, but we want to make sure that that functionality is flexible?

I ask because the current concrete use case, providing the suggestion to start a new chat, involves executing a command, which not all agents are aware of - but some are. Do we want to try to write functionality that allows the LLM itself to dynamically generate suggestions, or, for now, limit the functionality to things developers program ahead of time?

It would be fairly straightforward to provide an API that exposes something like a function with two arguments (or an array of pairs), the suggestion text and the corresponding prompt text, and say that if the user selects the option, we'll prompt the LLM with the prompt text. Exposing fully dynamic suggestion provision would be trickier, and would have to be tailored to each agent - at least in how the prompt describes the functionality - but might be feasible.
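
Something along these lines (all names are placeholders, not an existing API):

```ts
// Placeholder sketch of the static suggestion API described above.

interface PromptSuggestion {
    /** Text shown to the user above the chat input, e.g. 'Start follow-up Coder task'. */
    label: string;
    /** Prompt submitted on the user's behalf when the suggestion is selected. */
    prompt: string;
}

interface SuggestionCapableSession {
    /** Replaces the suggestions currently shown above the chat input. */
    setSuggestions(suggestions: PromptSuggestion[]): void;
}
```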

@JonasHelming
Contributor

I think there should be two layers, like for the change set. The first layer is accessible to the agent (via the session) and allows the agent to place these suggestions. In the current Coder use case, these suggestions would be static, e.g. always shown after at least one message has been answered, and the LLM is totally unaware of them.
Based on this layer, we could then create tool functions that also allow the LLM to provide suggestions, but this should then be decided by the agent, and I would consider this a second step for sure.

So what I would like now is an API and UI element to display suggestions, and the application of both for the Coder use case in a static way, not involving the LLM for now.
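
For illustration only, the static first layer could look roughly like this from the agent's side (the session API and the prompt strings are made up):

```ts
// Sketch only: how the Coder agent could statically place suggestions after a
// response completes, with no LLM involvement. All names below are hypothetical.

interface PromptSuggestion {
    label: string;  // shown above the chat input
    prompt: string; // submitted on the user's behalf when selected
}

interface SuggestionCapableSession {
    setSuggestions(suggestions: PromptSuggestion[]): void;
}

function onCoderResponseCompleted(session: SuggestionCapableSession): void {
    session.setSuggestions([
        { label: 'Start new Coder task', prompt: 'Start a new task.' },
        { label: 'Start follow-up Coder task', prompt: 'Start a follow-up task based on this chat.' }
    ]);
}
```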

Does this make sense?

@colin-grant-work
Contributor

Re: storage of the summary data, @planger proposed

> One rather logical place to store it would be the data map of the original chat session that this variable request should summarize. So the #chatSummary variable provider could look into the data map of the chat session first, and only generate a summary if there is none yet.

To me, that raises potential lifecycle and ownership concerns, namely, whether we want a session's summary to be accessible only as long as the session is alive, since if the user deleted a given session, its summary would become inaccessible. I can imagine that a user might decide to start a new chat from an existing chat, then discard the old chat, and in that case, we would want to keep the variable alive.

@planger
Contributor

planger commented Apr 4, 2025

> Re: storage of the summary data, @planger proposed
>
> > One rather logical place to store it would be the data map of the original chat session that this variable request should summarize. So the #chatSummary variable provider could look into the data map of the chat session first, and only generate a summary if there is none yet.
>
> To me, that raises potential lifecycle and ownership concerns, namely, whether we want a session's summary to be accessible only as long as the session is alive, since if the user deleted a given session, its summary would become inaccessible. I can imagine that a user might decide to start a new chat from an existing chat, then discard the old chat, and in that case, we would want to keep the variable alive.

Hm, you are right. This could indeed be an issue. How about storing the summary of the "base session" (the session from which we start a new session) in the new chat session's data map, with the base chat session id as part of the key (session-summaries/<base-chat-session-id>/summary), instead of in the base chat session? This would mean we would need to create a new summary for each new chat, but on the other hand ownership and lifecycle are secured.
Alternatively, we could store it twice, in the base session and in the session that uses it, and the lookup would follow this hierarchy (current session -> base session -> create summary).
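
Sketched with a made-up data map API, that hierarchy could look like:

```ts
// Sketch of the lookup hierarchy: current session -> base session -> create summary.
// The data map API and key scheme below are made up for illustration.

interface SessionData {
    data: Map<string, string>;
}

async function lookupSummary(
    currentSession: SessionData,
    baseSession: SessionData | undefined, // undefined if the base session was deleted
    baseSessionId: string,
    createSummary: () => Promise<string>
): Promise<string> {
    const key = `session-summaries/${baseSessionId}/summary`;
    const fromCurrent = currentSession.data.get(key);
    if (fromCurrent !== undefined) {
        return fromCurrent;
    }
    const fromBase = baseSession?.data.get(key);
    if (fromBase !== undefined) {
        currentSession.data.set(key, fromBase); // copy so it survives deletion of the base session
        return fromBase;
    }
    const summary = await createSummary();
    currentSession.data.set(key, summary);
    baseSession?.data.set(key, summary); // the "store it twice" variant
    return summary;
}
```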

WDYT?

@JonasHelming JonasHelming mentioned this pull request Apr 5, 2025
@JonasHelming
Contributor

JonasHelming commented Apr 7, 2025

Just adding another thought here: I'm thinking we might want to store the summaries for follow-up tasks as actual files in a dedicated directory. That way, they'd persist across sessions and be independent of the chat/session lifecycle. This would not only address the lifecycle/ownership concern raised above but also open up new use cases, for example generating a good PR at the end based on the task description, or storing UI testing instructions that evolve with the task.

Having file-based summaries would allow for easy editing, full transparency, and reusability, inside and outside of Theia. In that sense, these summaries become more like scoped prompt fragments. If we do decide to introduce persistent chat sessions, we should probably align both mechanisms (chat data and file-based summaries) and maybe use the same directory for consistency.

This would essentially align with option c) (or d)) discussed above, maybe with the difference that we still want to add this dynamically to the system prompt, not only as a chat message.
I just wanted to bring it up again, as I believe it might be the most versatile and robust solution.
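
As a minimal sketch of what such a file-based store could look like (the directory name and file layout are made up):

```ts
// Minimal sketch of a file-based summary store; SUMMARY_DIR is a made-up location.
import { promises as fs } from 'fs';
import * as path from 'path';

const SUMMARY_DIR = '.prompts/task-summaries'; // hypothetical, relative to the workspace root

async function saveSummary(workspaceRoot: string, taskId: string, summary: string): Promise<string> {
    const dir = path.join(workspaceRoot, SUMMARY_DIR);
    await fs.mkdir(dir, { recursive: true });
    const file = path.join(dir, `${taskId}.md`);
    await fs.writeFile(file, summary, 'utf8');
    return file; // plain markdown: easy to edit, diff, and reuse outside of Theia
}

async function loadSummary(workspaceRoot: string, taskId: string): Promise<string | undefined> {
    try {
        return await fs.readFile(path.join(workspaceRoot, SUMMARY_DIR, `${taskId}.md`), 'utf8');
    } catch {
        return undefined; // no summary stored yet for this task
    }
}
```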

WDYT? @planger @colin-grant-work

@JonasHelming
Contributor

@eneufeld This was the template for #15427.
It helped a lot in clarifying the requirements, but I think we can close it now. Is this fine?

@eneufeld eneufeld closed this Apr 17, 2025
@JonasHelming JonasHelming mentioned this pull request May 5, 2025