
Bug fix: VSCode LM API token counting for Claude models #5051


Merged
6 commits merged into cline:main on Jul 28, 2025

Conversation

@johnib (Contributor) commented Jul 20, 2025

Related Issue

Issues: #4584, #4027

Description

When using Cline with the VSCode LM API as the provider, token counting is wrong:

  1. It counts the system prompt twice.
  2. It passes the message object, rather than the message's text content, to the VSCode LM API's .countTokens() method, so the reported count is always 4.

The impact of these two bugs is that Cline cannot monitor context window usage, and therefore cannot automatically condense the conversation when it reaches the 80% threshold. Eventually the context window is exceeded and the entire Cline conversation breaks, because when the window is exceeded, the GitHub Copilot (GHCP) API truncates the conversation from the beginning, omitting Cline's system prompt.

See issue #4027 for an example.
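For illustration, here is a minimal sketch of the buggy pattern described above. The names and shapes are hypothetical, not the actual Cline source (which lives in src/api/providers/vscode-lm.ts):

```typescript
import * as vscode from "vscode"

// Hypothetical sketch of the two bugs; not the actual Cline implementation.
async function calculateTotalInputTokens(
	model: vscode.LanguageModelChat,
	systemPrompt: string,
	messages: vscode.LanguageModelChatMessage[],
): Promise<number> {
	// Bug 1: the system prompt is counted here even though it is already
	// included in `messages`, so its tokens are counted twice.
	let total = await model.countTokens(systemPrompt)
	for (const msg of messages) {
		// Bug 2: the message object is passed instead of its text content;
		// per this PR, countTokens() then always reports 4.
		total += await model.countTokens(msg)
	}
	return total
}
```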

Test Procedure

For the past several days I have been using a private build of Cline that incorporates this change, in order to validate that it works properly.

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • ♻️ Refactor Changes
  • 💅 Cosmetic Changes
  • 📚 Documentation update
  • 🏃 Workflow Changes

Pre-flight Checklist

  • Changes are limited to a single feature, bugfix or chore (split larger changes into separate PRs)
  • Tests are passing (npm test) and code is formatted and linted (npm run format && npm run lint)
  • I have created a changeset using npm run changeset (required for user-facing changes)
  • I have reviewed contributor guidelines

Screenshots

Additional Notes


Important

Fixes token counting for Claude models in VsCodeLmHandler by using a character-to-token ratio and removing double counting of system prompt tokens.

  • Behavior:
    • Fixes token counting for Claude models in VsCodeLmHandler by using a 4:1 character-to-token ratio.
    • Removes double counting of system prompt tokens in calculateTotalInputTokens().
  • Functions:
    • Adds extractTextFromMessage() to extract text from vscode.LanguageModelChatMessage.
    • Adds isClaudeModel() to check if the model is a Claude model.
    • Modifies countTokens() to use a character-to-token ratio for Claude models (see the sketch below).
  • Misc:
    • Removes systemPrompt parameter from calculateTotalInputTokens() in vscode-lm.ts.

This description was created by Ellipsis for 17ecc59.
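A minimal sketch of the fixed flow, assuming the helper names from the summary above; the actual implementation in src/api/providers/vscode-lm.ts may differ in detail:

```typescript
import * as vscode from "vscode"

// Extract plain text from a chat message so we count content, not the
// wrapper object. Assumes `content` is an array of parts (VSCode 1.90+).
function extractTextFromMessage(msg: vscode.LanguageModelChatMessage): string {
	return msg.content
		.filter((part): part is vscode.LanguageModelTextPart => part instanceof vscode.LanguageModelTextPart)
		.map((part) => part.value)
		.join("")
}

// Heuristic Claude detection from the model metadata exposed by VSCode.
function isClaudeModel(model: vscode.LanguageModelChat): boolean {
	return `${model.id} ${model.family} ${model.name}`.toLowerCase().includes("claude")
}

// Count tokens for one message: ~4 characters per token for Claude models,
// VSCode's built-in counter for everything else.
async function countTokens(
	model: vscode.LanguageModelChat,
	msg: vscode.LanguageModelChatMessage,
): Promise<number> {
	const text = extractTextFromMessage(msg)
	if (isClaudeModel(model)) {
		return Math.ceil(text.length / 4)
	}
	return model.countTokens(text)
}
```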

Jonathan Barazany added 6 commits July 7, 2025 10:11
- Reorder imports for better organization
- Add extractTextFromMessage helper method
- Add isClaudeModel detection method
- Use 4:1 character-to-token ratio for Claude models instead of VSCode's inaccurate counting
- Fallback to existing VSCode LM token counting for non-Claude models
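As a quick, hypothetical usage example of the heuristic (reusing the countTokens() sketch above; the vendor and family ids are assumptions that depend on the installed Copilot extension):

```typescript
async function demo(): Promise<void> {
	// Pick a Copilot-provided Claude model; the family id is an assumption.
	const [model] = await vscode.lm.selectChatModels({ vendor: "copilot", family: "claude-3.5-sonnet" })
	if (!model) return // no matching model available

	const msg = vscode.LanguageModelChatMessage.User("Summarize the current file.")
	// 27 characters → ceil(27 / 4) = 7 estimated tokens for a Claude model.
	console.log(await countTokens(model, msg))
}
```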
Copilot AI review requested due to automatic review settings, July 20, 2025 08:59

changeset-bot bot commented Jul 20, 2025

🦋 Changeset detected

Latest commit: 17ecc59

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package:

  • claude-dev (Patch)


Copilot AI (Contributor) left a comment


Pull Request Overview

This PR fixes critical token counting issues in the VSCode LM API provider that were causing context window management failures and conversation breaks. The bug prevented Cline from properly monitoring context usage and automatically condensing conversations when reaching the 80% threshold.

Key changes include:

  • Eliminates double-counting of the system prompt in token calculations
  • Implements character-to-token ratio estimation for Claude models instead of relying on VSCode's inaccurate API
  • Adds proper text extraction from VSCode language model chat messages

Reviewed Changes

Copilot reviewed 2 out of 2 changed files in this pull request and generated 2 comments.

Files reviewed:

  • src/api/providers/vscode-lm.ts: Core token counting logic fixes and Claude model detection
  • .changeset/smart-jobs-impress.md: Changeset for the bug fix release

@arafatkatze (Contributor) left a comment

Put debug statements in your code and you will see that for the GPT-4 family of models the token counting calculations are inaccurate, so please fix that.

This is not about your PR, as the Claude 4 family of models is correct, btw.

@johnib (Contributor, Author) commented Jul 28, 2025

Put debug statements in your code and you will see that for the GPT-4 family of models the token counting calculations are inaccurate, so please fix that.

This is not about your PR, as the Claude 4 family of models is correct, btw.

Not quite sure I understood your message.

  1. Are you saying that token counting for Claude models is correct in this branch?
  2. Do you want to extend this PR to fix token counting for all models provided by the VSCode LM API?

I don't use the rest of the models, so it's hard for me to verify that.
I can do that, but I would rather separate it from this PR, if that's okay with you.

@arafatkatze (Contributor) commented

Are you saying that token counting for Claude models is correct in this branch?

Yes

Do you want to extend this PR to fix token counting for all models provided by the VSCode LM API?

That would be nice, but I understand that this would be beyond the scope of this PR.

arafatkatze self-requested a review, July 28, 2025 09:09
arafatkatze merged commit 65c21e7 into cline:main, Jul 28, 2025
8 checks passed