Closed
Description
Use Case: Sending an Image to the Chat Assistant
Goal
A user sends an image to the LLM via the chat interface. The system validates model compatibility and informs the user if vision is unsupported. If supported, the image is retained in the session for context.
Steps
1. User Opens Chat Interface
- The user opens the chat panel within the tool or IDE.
2. User Adds an Image
The user includes an image in the chat message using one of the following methods (listed by priority):
2.1 Paste from Clipboard
- User pastes an image (e.g., screenshot) directly into the chat input using Ctrl+V/Cmd+V.
2.2 Drag and Drop
- User drags an image file (e.g., .png, .jpg) into the chat window.
2.3 Upload via Button
- User clicks a dedicated upload button (e.g., paperclip icon), selects an image from the file system, and attaches it.
2.4 Select from Workspace
- User browses project/workspace files via a file variable and selects an image to attach.
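For the file-based methods (drag and drop, upload, workspace selection), the framework needs some way to tell whether the chosen file is an image. The following is a minimal sketch of one possible check, assuming detection by file extension; the function name `imageMimeTypeFor` and the set of accepted extensions are hypothetical and only mirror the `.png` and `.jpg` examples above.

```typescript
// Assumed mapping from file extension to MIME type (illustrative, not exhaustive).
const IMAGE_MIME_BY_EXTENSION = new Map<string, string>([
  ["png", "image/png"],
  ["jpg", "image/jpeg"],
  ["jpeg", "image/jpeg"],
]);

// Returns the image MIME type for a file name, or undefined if the
// extension is missing or not in the accepted set.
function imageMimeTypeFor(fileName: string): string | undefined {
  const dot = fileName.lastIndexOf(".");
  if (dot < 0) {
    return undefined;
  }
  const extension = fileName.slice(dot + 1).toLowerCase();
  return IMAGE_MIME_BY_EXTENSION.get(extension);
}
```

A file for which this returns `undefined` would simply not be offered as an image attachment. Pasted clipboard content usually carries its own MIME type, so it would not need this lookup.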
3. Image Appears in Chat Draft
- The image is visually rendered as a thumbnail in the chat input area, along with any accompanying text the user types.
4. User Sends Message
- User hits Enter or clicks “Send” to submit the message and attached image.
5. Framework Checks Model Capabilities
- The chat framework checks if the currently selected LLM supports image inputs.
6. Model Does Not Support Images
If the model is not vision-capable:
- The message is not sent.
- A warning is shown:
"The selected model does not support image input. Please remove the image or select a vision-capable model."
- The message and image remain in the input for user correction.
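Steps 5 and 6 can be sketched as a guard that runs before the message is sent. This is only an illustration under stated assumptions: `ModelInfo`, its `supportsVision` flag, `Draft`, and `trySend` are hypothetical names, not part of any existing API.

```typescript
// Hypothetical descriptor of the selected model; `supportsVision` is an assumed flag.
interface ModelInfo {
  id: string;
  supportsVision: boolean;
}

// Hypothetical chat draft: the typed text plus attached image identifiers.
interface Draft {
  text: string;
  images: string[];
}

type SendResult = { sent: true } | { sent: false; warning: string };

const VISION_WARNING =
  "The selected model does not support image input. " +
  "Please remove the image or select a vision-capable model.";

// Blocks sending when the draft contains images and the model cannot accept them.
// The draft is never modified, so text and image stay in the input for correction.
function trySend(model: ModelInfo, draft: Draft): SendResult {
  if (draft.images.length > 0 && !model.supportsVision) {
    return { sent: false, warning: VISION_WARNING };
  }
  return { sent: true };
}
```

Returning a result object instead of throwing keeps the draft untouched and lets the UI decide how to display the warning.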
7. Model Supports Images
If the model does support vision:
- The image is sent as part of the message using the LLM’s required format (e.g., base64 or multipart).
- The image is retained in the current conversation session and can be referred to later by the LLM.
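As a rough sketch of step 7, the image bytes could be encoded as a base64 data URL and stored with the message in the session, so later requests can still include it. All names here (`ImageAttachment`, `ChatMessage`, `ChatSession`, `toAttachment`) are hypothetical, and the data-URL form is just one of the formats a given LLM API might require.

```typescript
// Hypothetical shapes for illustration only.
interface ImageAttachment {
  mimeType: string;
  dataUrl: string;
}

interface ChatMessage {
  role: "user" | "assistant";
  text: string;
  images: ImageAttachment[];
}

const B64 = "ABCDEFGHIJKLMNOPQRSTUVWXYZabcdefghijklmnopqrstuvwxyz0123456789+/";

// Standard base64 encoding, written out to avoid runtime-specific helpers.
function toBase64(bytes: Uint8Array): string {
  let out = "";
  for (let i = 0; i < bytes.length; i += 3) {
    const b0 = bytes[i];
    const b1 = i + 1 < bytes.length ? bytes[i + 1] : 0;
    const b2 = i + 2 < bytes.length ? bytes[i + 2] : 0;
    out += B64.charAt(b0 >> 2);
    out += B64.charAt(((b0 & 0x03) << 4) | (b1 >> 4));
    out += i + 1 < bytes.length ? B64.charAt(((b1 & 0x0f) << 2) | (b2 >> 6)) : "=";
    out += i + 2 < bytes.length ? B64.charAt(b2 & 0x3f) : "=";
  }
  return out;
}

// Wraps raw image bytes as a data URL attachment.
function toAttachment(bytes: Uint8Array, mimeType: string): ImageAttachment {
  return { mimeType, dataUrl: `data:${mimeType};base64,${toBase64(bytes)}` };
}

// Keeps every message, images included, for the lifetime of the session.
class ChatSession {
  private readonly messages: ChatMessage[] = [];

  addUserMessage(text: string, images: ImageAttachment[]): void {
    this.messages.push({ role: "user", text, images });
  }

  // Full history; a follow-up request built from it still carries earlier images.
  history(): readonly ChatMessage[] {
    return this.messages;
  }
}
```

Because the attachment stays in the history, a follow-up question without a new image can still be answered with the earlier image as context.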
8. Conversation Continues
- The assistant processes the image and responds.
- The image remains visible in the conversation history.
JonasHelming