fix: Change Vscode LM token counts to use approx counting method #5280
Conversation
Pull Request Overview
This PR replaces the VS Code Language Model API's `countTokens` method with an approximate character-to-token ratio calculation. The change addresses unreliable token counts from the native API that could lead to model hallucinations.
- Removes complex error handling and API calls for token counting
- Implements uniform 3:1 character-to-token ratio for all non-Claude models
- Simplifies the token counting logic significantly
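The approximation described above can be sketched as follows, with the 4:1 Claude and 3:1 default ratios taken from the PR description; the function name is illustrative, not the PR's actual code:

```typescript
// Hedged sketch of approximate token counting: a flat
// characters-per-token ratio instead of the native countTokens API.
// Ratios (4:1 for Claude, 3:1 otherwise) come from the PR text.
function approxTokens(text: string, modelId: string): number {
  // Claude models are assumed to tokenize more coarsely, so use a
  // larger divisor for them.
  const charsPerToken = modelId.toLowerCase().includes("claude") ? 4 : 3;
  return Math.ceil(text.length / charsPerToken);
}
```

A flat ratio is a rough estimate (it undercounts dense code and overcounts some scripts), but unlike the broken native counter it grows monotonically with input length.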
Coverage Report
Extension Coverage: base branch 47%, PR branch 48% ✅ Coverage increased or remained the same
Webview Coverage: base branch 17%, PR branch 17% ✅ Coverage increased or remained the same
Overall Assessment: ✅ Test coverage has been maintained or improved
Last updated: 2025-08-04T06:40:36.112715
Left some comments inline. My main concern is that we should use tiktoken lite for our use case, and that we are not freeing memory after reusing the same encoder, which could lead to memory issues.
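The memory concern can be addressed by creating the encoder once and releasing it explicitly. A minimal sketch, with the encoder injected so the snippet stands alone; in real code the encoder would come from tiktoken's `get_encoding("cl100k_base")`, whose WASM-backed encoder exposes a `free()` method:

```typescript
// Sketch of the reviewer's suggestion, assuming tiktoken's encoder
// shape (encode/free): create one encoder per handler, reuse it
// across calls, and free it on dispose. Names are illustrative.
interface Encoder {
  encode(text: string): ArrayLike<number>;
  free(): void;
}

class TokenCounter {
  private encoder: Encoder | null = null;

  constructor(private readonly makeEncoder: () => Encoder) {}

  count(text: string): number {
    // Lazily create a single encoder instead of one per call.
    if (!this.encoder) {
      this.encoder = this.makeEncoder();
    }
    return this.encoder.encode(text).length;
  }

  dispose(): void {
    // A WASM encoder holds memory outside the JS heap, so it must be
    // released explicitly rather than left to the garbage collector.
    if (this.encoder) {
      this.encoder.free();
      this.encoder = null;
    }
  }
}
```

With the real library this would be constructed as `new TokenCounter(() => get_encoding("cl100k_base"))`, with `dispose()` wired into the handler's disposal path.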
Why not js-tiktoken + ranks
Why not “tiktoken-lite” (3rd-party fork)
What we shipped instead
Accuracy vs. cost
Bottom line
Co-authored-by: Daniel Riccio <ricciodaniel98@gmail.com>
Solves GitHub issue #4584, reflecting the tiktoken-based solution implemented in the VSCode LM provider:
Description
Problem:
The VSCode Language Model API's `countTokens()` method is unreliable and often returns incorrect token counts for non-Claude models. For example, it frequently returns a constant value (such as 4) regardless of the actual text length. This leads to inaccurate token usage reporting, potential context-management issues, and incorrect cost calculations.
Solution:
This PR replaces the unreliable `countTokens()` method in the VSCode LM provider with a robust tiktoken-based token counter. The new implementation uses the `tiktoken` library's `cl100k_base` encoding to provide accurate token estimates across all model types. If tiktoken fails, it gracefully falls back to a character-based estimation (a 4:1 character-to-token ratio for Claude models, 3:1 for others). This ensures consistent and reliable token counting behavior.
Changes:
- Adds a `TokenCounter` utility in `src/utils/tokenCounter.ts` that uses the `cl100k_base` encoding for accurate token counting.
- Updates `countTokens()` in `VsCodeLmHandler` to call `estimateTokens()` from the new `TokenCounter` utility.
- Updates `package.json` to include the `tiktoken` dependency.

This ensures accurate token usage reporting and prevents issues caused by the broken `countTokens()` method in the VSCode LM API.
Test Procedure
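One way to sanity-check the behavior is a standalone sketch of the fallback chain from the Description; the names here are hypothetical, and the optional `encoder` parameter stands in for tiktoken's `cl100k_base` encoder so the snippet runs without the package installed:

```typescript
// Illustrative fallback chain: use the tiktoken encoder when one is
// available, and fall back to the character ratio if it is missing
// or throws. The Enc interface mirrors tiktoken's encoder loosely.
interface Enc {
  encode(text: string): ArrayLike<number>;
}

function countTokensWithFallback(
  text: string,
  modelId: string,
  encoder?: Enc,
): number {
  if (encoder) {
    try {
      return encoder.encode(text).length;
    } catch {
      // Encoder failed; fall through to the character-ratio estimate.
    }
  }
  // Fallback ratios from the PR description: 4:1 for Claude, 3:1 otherwise.
  const charsPerToken = modelId.toLowerCase().includes("claude") ? 4 : 3;
  return Math.ceil(text.length / charsPerToken);
}
```

In the real provider the encoder would be created from `tiktoken`'s `cl100k_base` encoding rather than passed in.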
Testing approach:
What could break:
Token usage reporting might show different (but more accurate) values compared to before, which could affect cost calculations. However, this is an improvement since the previous values were incorrect.
Confidence:
High: the implementation uses a proven library (`tiktoken`) and includes a robust fallback mechanism. The changes have been thoroughly tested across different scenarios.
Type of Change
Pre-flight Checklist
- Code is tested (`npm test`) and formatted and linted (`npm run format && npm run lint`).
- A changeset was created with `npm run changeset` (required for user-facing changes).
Screenshots
Screenshot showing the code changes implementing the tiktoken-based token counter in the VSCode LM provider.
Additional Notes
This fix addresses a fundamental issue with the VSCode Language Model API's `countTokens()` method, which returns unreliable results. Replacing it with a tiktoken-based solution ensures accurate token counting across all model types, improving the reliability of token usage tracking and context management in the VSCode LM provider.
Important
Replaces unreliable `countTokens()` in `VsCodeLmHandler` with a heuristic-based method for improved token counting accuracy.
- Replaces `countTokens()` in `VsCodeLmHandler` with a heuristic-based token counting method.
- Removes the complex error handling and API calls from the old `countTokens()` implementation.
- Simplifies token counting logic in `VsCodeLmHandler`.