Bug fix: VSCode LM API token counting for Claude models #5051
Conversation
- Reorder imports for better organization
- Add `extractTextFromMessage` helper method
- Add `isClaudeModel` detection method
- Use 4:1 character-to-token ratio for Claude models instead of VSCode's inaccurate counting
- Fall back to existing VSCode LM token counting for non-Claude models
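A minimal sketch of the approach described in the list above, assuming a recent `vscode` LM API where message content is an array of parts. The helper names mirror the PR's, but the signatures and the Claude-detection heuristic are assumptions, not copied from the diff:

```typescript
import * as vscode from "vscode"

// Assumed heuristic: detect Claude models from the model's id or family.
function isClaudeModel(model: vscode.LanguageModelChat): boolean {
	return model.id.toLowerCase().includes("claude") || model.family.toLowerCase().includes("claude")
}

// Concatenate only the text parts of a chat message's content.
function extractTextFromMessage(msg: vscode.LanguageModelChatMessage): string {
	return msg.content
		.filter((part): part is vscode.LanguageModelTextPart => part instanceof vscode.LanguageModelTextPart)
		.map((part) => part.value)
		.join("")
}

async function countMessageTokens(
	model: vscode.LanguageModelChat,
	msg: vscode.LanguageModelChatMessage,
	token: vscode.CancellationToken,
): Promise<number> {
	const text = extractTextFromMessage(msg)
	if (isClaudeModel(model)) {
		// Claude models: estimate with the 4:1 character-to-token ratio.
		return Math.ceil(text.length / 4)
	}
	// Non-Claude models: fall back to VSCode's own counter, passing the
	// extracted text rather than the message object.
	return model.countTokens(text, token)
}
```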
🦋 Changeset detected. Latest commit: 17ecc59. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
Pull Request Overview
This PR fixes critical token counting issues in the VSCode LM API provider that were causing context window management failures and conversation breaks. The bug prevented Cline from properly monitoring context usage and automatically condensing conversations when reaching the 80% threshold.
Key changes include:
- Eliminates double-counting of the system prompt in token calculations
- Implements character-to-token ratio estimation for Claude models instead of relying on VSCode's inaccurate API
- Adds proper text extraction from VSCode language model chat messages
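As a rough illustration of the first point, the corrected total could take the following shape (a hypothetical sketch, not the actual diff; it reuses `countMessageTokens` from the earlier snippet and assumes the system prompt is already included in `messages` once):

```typescript
// Hypothetical sketch: the system prompt travels inside `messages` and is
// counted once; no separate systemPrompt amount is added on top (the old
// behavior that produced the double count).
async function calculateTotalInputTokens(
	model: vscode.LanguageModelChat,
	messages: vscode.LanguageModelChatMessage[],
	token: vscode.CancellationToken,
): Promise<number> {
	let total = 0
	for (const msg of messages) {
		// Per-message counting as sketched earlier: ratio estimate for Claude,
		// VSCode's counter over the extracted text otherwise.
		total += await countMessageTokens(model, msg, token)
	}
	return total
}
```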
Reviewed Changes
Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.
| File | Description |
|---|---|
| `src/api/providers/vscode-lm.ts` | Core token counting logic fixes and Claude model detection |
| `.changeset/smart-jobs-impress.md` | Changeset for the bug fix release |
Put debug statements in your code and you will see that for the GPT-4 family of models the token counting calculations are inaccurate, so please fix that.
This is not about your PR, as the Claude 4 family of models is correct, btw.
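For instance, a quick probe along those lines might look like this (hypothetical debug fragment, not part of the PR; `model`, `msg`, and `token` assumed in scope, `extractTextFromMessage` as in this PR):

```typescript
// Compare VSCode's reported count against a rough 4:1 character estimate.
const text = extractTextFromMessage(msg)
const apiCount = await model.countTokens(text, token)
console.debug(`[token-debug] chars=${text.length} api=${apiCount} approx=${Math.ceil(text.length / 4)}`)
```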
Not quite sure I understood your message.
I don't use the rest of the models, so it's hard for me to verify that.
Yes
That would be nice, but I understand that this would be beyond the scope of this PR.
Related Issue
Issues: #4584, #4027
Description
When using Cline with the VSCode LM API as provider, the token counting is wrong:

1. The system prompt tokens are counted twice.
2. Messages are counted by passing to the `.countTokens()` API the `msg` object instead of the `msg`'s content itself, for which the count is always `4` (see the snippet below).

The impact of these two issues is that Cline cannot monitor the context window usage, and therefore cannot condense the conversation automatically when reaching the 80% threshold. This leads to exceeding the context window at some point, which breaks the entire Cline conversation: when the context window is exceeded, the GHCP API truncates the conversation from the beginning, omitting Cline's system prompt.
See issue #4027 for example.
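A small repro of the second point, as I understand it (hypothetical snippet; `model`, `msg`, and `token` assumed in scope, `extractTextFromMessage` as in this PR):

```typescript
// Passing the whole message object: VSCode reports a constant count (~4).
const wrong = await model.countTokens(msg, token)
// Passing the message's text content yields a meaningful count.
const right = await model.countTokens(extractTextFromMessage(msg), token)
```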
Test Procedure
For the past several days, I have been using a private build of Cline that incorporates this change, in order to validate that it works properly.
Type of Change
Pre-flight Checklist
- Tests are passing (`npm test`) and code is formatted and linted (`npm run format && npm run lint`)
- A changeset has been created using `npm run changeset` (required for user-facing changes)

Screenshots
Additional Notes
Important
Fixes token counting for Claude models in `VsCodeLmHandler` by using a character-to-token ratio and removing double counting of system prompt tokens.
- Fixes token counting for Claude models in `VsCodeLmHandler` by using a 4:1 character-to-token ratio.
- Removes double counting of system prompt tokens in `calculateTotalInputTokens()`.
- Adds `extractTextFromMessage()` to extract text from `vscode.LanguageModelChatMessage`.
- Adds `isClaudeModel()` to check if the model is a Claude model.
- Updates `countTokens()` to use the character-to-token ratio for Claude models.
- Removes the `systemPrompt` parameter from `calculateTotalInputTokens()` in `vscode-lm.ts`.