Feat: Prompt Caching in SAP AI Core #5399
…entation (and make implicit caching clear)
remove: caching support flag for older claude models
🦋 Changeset detected. Latest commit: 5e6de55. The changes in this PR will be included in the next version bump. This PR includes changesets to release 1 package.
@ncryptedV1 Thanks for the PR. For SAP I would need an enterprise account to set it up and use it locally. We have two options
@tjandy98 can you help to test this out, thanks.
@tjandy98 please let us know when we are good to go to merge this -- we cannot test this ourselves.
@tjandy98 Since you have confirmed that this works I am approving and merging this PR.
@tjandy98 You also mentioned
So feel free to make a followup PR if something needs to change there.
Related Issue
None, this is just a minor improvement to the SAP AI Core inference provider.
Description
This PR adds prompt caching to the SAP AI Core provider for recent Claude models (Claude 4 Sonnet/Opus and Claude 3.7 Sonnet), which significantly reduces costs and improves response times. For models served through Vertex AI and Azure OpenAI, caching already happens automatically under the hood, just as in the native providers.
Additionally, as part of adding cache support, the provider has been cleaned up and refactored to align more closely with the original providers, which makes it easier to integrate future changes to those providers.
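The model gating described above can be pictured with a small sketch. It is not taken from the PR: the model IDs, the `exampleModels` table, and the `shouldUseExplicitCaching` helper are hypothetical names chosen for illustration, and the `supportsPromptCache` flag is assumed to behave like the caching support flag mentioned in the commit list.

```typescript
// Hypothetical per-model capability table. The IDs below are illustrative
// placeholders, not the actual entries registered for SAP AI Core.
interface ModelInfo {
	supportsPromptCache: boolean
}

const exampleModels: Record<string, ModelInfo> = {
	"example-claude-4-sonnet": { supportsPromptCache: true },
	"example-claude-3.7-sonnet": { supportsPromptCache: true },
	"example-claude-3.5-sonnet": { supportsPromptCache: false },
}

// Returns true only when the model is known and flagged as supporting
// explicit prompt caching; unknown models fall back to false.
function shouldUseExplicitCaching(modelId: string): boolean {
	return exampleModels[modelId]?.supportsPromptCache ?? false
}
```

A caller would consult such a helper before adding cache markers to a request, so that models without the flag receive an unchanged payload.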
Test Procedure
I tested inference with all registered SAP AI Core models using the regular Cline workflow on two separate AI Core instances. Only Claude 4 Opus did not work, as it is not yet available in SAP AI Core. This is the only place where the changes could break existing functionality.
Type of Change
Pre-flight Checklist
Tests pass (npm test) and code is formatted and linted (npm run format && npm run lint)
Changeset created (npm run changeset) (required for user-facing changes)
Screenshots
/
Additional Notes
While a PR for this already exists in #4683, this PR additionally includes a slight refactoring that aligns payload preparation with the native providers, which makes it easier to integrate future changes to those providers. @tjandy98 @lizzzcai @schardosin
Important
Adds prompt caching for SAP AI Core Claude models and refactors provider for better alignment with native providers.
sapaicore.ts: prompt caching for Claude models (Claude 4 Sonnet/Opus & Claude 3.7 Sonnet).
SapAiCoreHandler: refactored to align with native providers.
Bedrock and Gemini namespaces in sapaicore.ts for caching functions.
prepareSystemMessages, applyCacheControlToMessages, and formatMessagesForConverseAPI in Bedrock.
processStreamChunk and prepareRequestPayload in Gemini.
sapAiCoreModels in api.ts: updated to support prompt caching for specific models.
This description was created automatically for c323fd4.
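As a rough illustration of what a helper such as applyCacheControlToMessages might do, the sketch below appends Converse-style `cachePoint` blocks to the most recent user messages. It is not the PR's implementation: the `withCachePoints` name, the simplified message types, and the choice to mark the last two user turns are all assumptions made for this example.

```typescript
// Simplified Converse-style content blocks (assumed shapes for illustration).
type ContentBlock = { text: string } | { cachePoint: { type: "default" } }

interface ConverseMessage {
	role: "user" | "assistant"
	content: ContentBlock[]
}

// Hypothetical helper: returns a copy of `messages` in which the last
// `maxPoints` user messages end with a cachePoint block. The input array
// and its messages are left unmodified.
function withCachePoints(messages: ConverseMessage[], maxPoints = 2): ConverseMessage[] {
	if (maxPoints <= 0) {
		return messages
	}
	// Collect the indices of user messages and keep only the most recent ones.
	const userIndices = messages
		.map((m, i) => (m.role === "user" ? i : -1))
		.filter((i) => i >= 0)
		.slice(-maxPoints)
	const targets = new Set(userIndices)

	return messages.map((m, i): ConverseMessage =>
		targets.has(i) ? { ...m, content: [...m.content, { cachePoint: { type: "default" } }] } : m,
	)
}
```

Marking only a few recent turns keeps the number of cache markers per request small while still letting the growing conversation prefix be reused; the exact limit and placement in the actual provider may differ.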