fix grok browser_user issue due to not being able to process webp images #5278
Related Issue
Issue: #XXXX
Description
The browser use tool does not work for Grok models because screenshots default to the webp format, which Grok does not support. Before this change, sending browser screenshots to Grok through their provider returned this error:
400 "Invalid request content: Downloaded response does not contain a valid JPG or PNG image."
We fixed this by using the png format for Grok models.
Test Procedure
Tested the browser use tool with grok-4, sonnet-4, and gemini-2.5-pro to confirm that browser use still works for all three. Also tested with and without the remote browser connection.
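A minimal sketch of how the per-model format choice might look, using the three tested model IDs as inputs. This is not the code from this PR: the string-matching rule, the simplified `modelDoesntSupportWebp(modelId)` signature, and the `screenshotFormatFor` helper are assumptions made for illustration.

```typescript
type ScreenshotFormat = "webp" | "png"

// Assumption: a Grok model can be recognised by "grok" appearing in its ID.
// The real check in model-utils.ts may use a different signature or rule.
function modelDoesntSupportWebp(modelId: string): boolean {
	return modelId.toLowerCase().includes("grok")
}

// Hypothetical helper: pick png for models that reject webp, webp otherwise.
function screenshotFormatFor(modelId: string): ScreenshotFormat {
	return modelDoesntSupportWebp(modelId) ? "png" : "webp"
}

for (const id of ["grok-4", "sonnet-4", "gemini-2.5-pro"]) {
	console.log(`${id} -> ${screenshotFormatFor(id)}`)
}
```

Under these assumptions, only grok-4 is routed to png; the other two models keep the webp default.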
Type of Change
Pre-flight Checklist
Tests pass (npm test) and code is formatted and linted (npm run format && npm run lint)
Changeset created with npm run changeset (required for user-facing changes)
Screenshots
Using grok-4 in browser use:
grok.single.click.mov
Additional Notes
Important
Enhance ToolExecutor and BrowserSession to handle webp image support based on model capabilities, adding the modelDoesntSupportWebp() function.
ToolExecutor now includes the Grok4 model family in the isNextGenModel checks in pushToolResult() and handleStreamingJsonReplacement().
BrowserSession constructor now accepts a useWebp parameter to determine the screenshot format.
BrowserSession uses modelDoesntSupportWebp() to decide between webp and png for screenshots.
Adds modelDoesntSupportWebp() in model-utils.ts to check whether a model supports webp images.
Updates doAction() and navigateToUrl() in BrowserSession to retry with png if webp fails.
This description was generated automatically for be364d9.
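The retry behaviour described in the summary (try webp, fall back to png) could be sketched roughly as follows. The `Capture` callback type and the `captureWithPngFallback` function are hypothetical names introduced here; only the `useWebp` flag comes from the summary, and the actual logic in doAction() may differ.

```typescript
type ScreenshotFormat = "webp" | "png"

// Hypothetical callback standing in for the browser's screenshot call.
// It resolves to base64 image data, or undefined when capture fails.
type Capture = (format: ScreenshotFormat) => Promise<string | undefined>

async function captureWithPngFallback(capture: Capture, useWebp: boolean): Promise<string> {
	let data: string | undefined

	// Try webp first only when the model is expected to accept it.
	if (useWebp) {
		try {
			data = await capture("webp")
		} catch {
			data = undefined // treat a thrown error like an empty result
		}
	}

	// Fall back to png when webp was skipped, failed, or returned nothing.
	if (!data) {
		data = await capture("png")
	}

	if (!data) {
		throw new Error("Failed to take screenshot.")
	}
	return data
}

// Usage: a fake capture whose webp attempt always comes back empty.
const fakeCapture: Capture = async (format) => (format === "webp" ? undefined : "png-data")
captureWithPngFallback(fakeCapture, true).then((data) => console.log(data))
```

Passing `useWebp` as `false` skips the webp attempt entirely, which matches the intent of never sending webp to a model that rejects it.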