Feat: Prompt Caching in SAP AI Core #5399

Merged
merged 7 commits into from
Aug 9, 2025

Contributor

@ncryptedV1 ncryptedV1 commented Aug 6, 2025

Related Issue

None, this is just a minor improvement to the SAP AI Core inference provider.

Description

This PR adds prompt caching to the SAP AI Core provider for recent Claude models (Claude 4 Sonnet/Opus and Claude 3.7 Sonnet), which significantly reduces costs and improves response times. For models served through Vertex AI and Azure OpenAI, caching already happens automatically under the hood, just as in the native providers.
As part of adding cache support, the provider has also been cleaned up and refactored to align more closely with the original providers, which should make it easier to integrate future changes to them.
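
For readers unfamiliar with how a provider can restrict caching to specific models, here is a minimal sketch. It is not taken from this PR's diff: the `ModelInfo` shape, the `supportsPromptCache` flag, the model IDs, and the helper name are all assumptions made for illustration.

```typescript
// Hypothetical model metadata; the real type in api.ts may look different.
interface ModelInfo {
	maxTokens: number
	supportsPromptCache: boolean
}

// Illustrative entries only. The IDs and limits below are assumptions.
const exampleModels: Record<string, ModelInfo> = {
	"anthropic--claude-4-sonnet": { maxTokens: 8192, supportsPromptCache: true },
	"example-model-without-cache": { maxTokens: 4096, supportsPromptCache: false },
}

// Returns true only for models that explicitly opt in to prompt caching.
// Unknown model IDs fall back to "no caching" rather than throwing.
function shouldApplyPromptCache(modelId: string): boolean {
	return exampleModels[modelId]?.supportsPromptCache ?? false
}
```

Defaulting to `false` for unknown IDs keeps the request payload unchanged for any model that has not been verified to accept cache markers.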

Test Procedure

I tested inference with all registered SAP AI Core models using the regular Cline workflow against two separate AI Core instances. Only Claude 4 Opus did not work, because it is not yet available in SAP AI Core. Inference is the only place where these changes could break existing functionality.

Type of Change

  • 🐛 Bug fix (non-breaking change which fixes an issue)
  • ✨ New feature (non-breaking change which adds functionality)
  • 💥 Breaking change (fix or feature that would cause existing functionality to not work as expected)
  • ♻️ Refactor Changes
  • 💅 Cosmetic Changes
  • 📚 Documentation update
  • 🏃 Workflow Changes

Pre-flight Checklist

  • Changes are limited to a single feature, bugfix or chore (split larger changes into separate PRs)
  • Tests are passing (npm test) and code is formatted and linted (npm run format && npm run lint)
  • I have created a changeset using npm run changeset (required for user-facing changes)
  • I have reviewed contributor guidelines

Screenshots

/

Additional Notes

While a PR for this already exists (#4683), this PR additionally includes a small refactoring that aligns payload preparation with the native providers, making it easier to integrate future changes to those providers. @tjandy98 @lizzzcai @schardosin


Important

Adds prompt caching for SAP AI Core Claude models and refactors provider for better alignment with native providers.

  • Behavior:
    • Adds prompt caching for SAP AI Core in sapaicore.ts for Claude models (Claude 4 Sonnet/Opus & Claude 3.7 Sonnet).
    • Refactors SapAiCoreHandler to align with native providers.
  • Caching:
    • Introduces Bedrock and Gemini namespaces in sapaicore.ts for caching functions.
    • Implements prepareSystemMessages, applyCacheControlToMessages, and formatMessagesForConverseAPI in Bedrock.
    • Implements processStreamChunk and prepareRequestPayload in Gemini.
  • Models:
    • Updates sapAiCoreModels in api.ts to support prompt caching for specific models.
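
As a rough illustration of what applying cache control to a Converse-style payload can look like, the sketch below appends `cachePoint` blocks to the system prompt and to selected messages. The `cachePoint` block shape follows the Bedrock Converse API; the function names, the types, and the choice to mark the last two user messages are assumptions for illustration and are not copied from sapaicore.ts.

```typescript
// Minimal subset of Converse-style content blocks (assumed for this sketch).
type ContentBlock = { text: string } | { cachePoint: { type: "default" } }

interface ConverseMessage {
	role: "user" | "assistant"
	content: ContentBlock[]
}

const CACHE_POINT: ContentBlock = { cachePoint: { type: "default" } }

// System prompt as content blocks, with a trailing cache point when enabled.
function withSystemCachePoint(systemPrompt: string, enableCaching: boolean): ContentBlock[] {
	const blocks: ContentBlock[] = [{ text: systemPrompt }]
	if (enableCaching) {
		blocks.push(CACHE_POINT)
	}
	return blocks
}

// Indices of the last `count` user messages, newest first.
function lastUserIndices(messages: ConverseMessage[], count: number): number[] {
	const found: number[] = []
	for (let i = messages.length - 1; i >= 0 && found.length < count; i--) {
		if (messages[i].role === "user") {
			found.push(i)
		}
	}
	return found
}

// Returns a copy of `messages` with a cache point appended at the given indices.
// The input array and its messages are left unmodified.
function withMessageCachePoints(messages: ConverseMessage[], indices: number[]): ConverseMessage[] {
	const targets = new Set(indices)
	return messages.map((message, index) =>
		targets.has(index) ? { ...message, content: [...message.content, CACHE_POINT] } : message,
	)
}
```

A caller would typically combine these as `withMessageCachePoints(messages, lastUserIndices(messages, 2))`, so that the stable prefix of the conversation can be reused on the next turn.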

This description was created by Ellipsis for c323fd4.

changeset-bot bot commented Aug 6, 2025

🦋 Changeset detected

Latest commit: 5e6de55

The changes in this PR will be included in the next version bump.

This PR includes changesets to release 1 package
Name Type
claude-dev Minor


@arafatkatze
Contributor

@ncryptedV1 Thanks for the PR. For SAP, I would need an enterprise account to set it up and use it locally. We have two options:

  1. If you have tested it thoroughly yourself and can verify that it works, send some screenshots of it working (in a debugger, etc.) and I will approve and merge the PR.
  2. If you would like me to test this locally, you can email me at ara@cline.bot and we can have a separate conversation about credentials setup, etc.

@lizzzcai
Contributor

lizzzcai commented Aug 7, 2025

@tjandy98 can you help test this out? Thanks.

@saoudrizwan
Contributor

@tjandy98 please let us know when we are good to go to merge this -- we cannot test this ourselves.

@tjandy98
Contributor

tjandy98 commented Aug 9, 2025

Hello, I have tested the changes. Caching is working as expected for Gemini. SAP AI Core does not return cache usage information (converse stream) for Claude models at this point.
image
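
Since cache usage fields may be absent from the stream, a handler can treat them as optional. The following is a minimal sketch under stated assumptions: the field names `cacheReadInputTokens` and `cacheWriteInputTokens` mirror the Bedrock Converse token usage shape, while the `UsageChunk` type and the `toUsageChunk` helper are hypothetical and not taken from this PR.

```typescript
// Usage block as it might arrive in a stream metadata event.
// All fields are optional because the service may omit them.
interface ConverseUsage {
	inputTokens?: number
	outputTokens?: number
	cacheReadInputTokens?: number
	cacheWriteInputTokens?: number
}

// Hypothetical normalized usage record emitted by a provider handler.
interface UsageChunk {
	type: "usage"
	inputTokens: number
	outputTokens: number
	cacheReadTokens: number
	cacheWriteTokens: number
}

// Normalizes usage data, defaulting every missing counter to zero so that
// downstream cost accounting keeps working when cache fields are not returned.
function toUsageChunk(usage: ConverseUsage | undefined): UsageChunk {
	return {
		type: "usage",
		inputTokens: usage?.inputTokens ?? 0,
		outputTokens: usage?.outputTokens ?? 0,
		cacheReadTokens: usage?.cacheReadInputTokens ?? 0,
		cacheWriteTokens: usage?.cacheWriteInputTokens ?? 0,
	}
}
```

With this shape, a response that carries only `inputTokens` and `outputTokens` still produces a complete record, and the cache counters would start reporting non-zero values without code changes if the service begins returning them.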

Contributor

@arafatkatze arafatkatze left a comment

@tjandy98 Since you have confirmed that this works, I am approving and merging this PR.

@arafatkatze arafatkatze merged commit 759ef87 into cline:main Aug 9, 2025
7 of 8 checks passed
@arafatkatze
Contributor

@tjandy98 You also mentioned:

SAP AI Core does not return cache usage information (converse stream) for Claude models at this point.

So feel free to open a follow-up PR if something needs to change there.
