feat(config-yaml): Add promptCaching to Default Completion Options and enable Bedrock Tools Caching #5371
@chezsmithy thanks for splitting up the schema/implementation work here. I worry that the What's your take? Do you find value in having it this granular? Is there any scenario where you wouldn't just want the system message, chat, and tool configs to all be cached together?
@Patrick-Erichsen agreed. If you are going to cache, you might as well enable all of them. One pain point with adding a global cache setting is testing across the various providers. I don't have access to them all, so this kind of change would be somewhat of a breaking change. Maybe we add the individual settings and slowly deprecate them in favor of a single setting that replaces them all? Where do you think that should live? Under env?
I would agree, better to get this shipped and see how it's used than to try to figure out all the edge cases of generalizing it to all providers. I approved but just realized there's a small merge conflict from your other caching PR; once that's resolved let's merge 👍 Also, what are your thoughts on implementing the actual behavior in this PR as well, instead of merging the config and then implementing separately?
@Patrick-Erichsen I have the changes ready and tested for the Bedrock provider. I can add those to the PR. I don't have access to Vertex or the Claude API directly, so I'm less inclined to add those as I can't test them.
+1 to the idea of a single enable/disable cache setting. If we add all of these granular permissions, we're going to be stuck supporting them across a very large number of providers. Agree that this PR should add the options and the implementation for Bedrock, but can skip the implementation for Vertex and the Claude API.
a445759 to d05bcee
@Patrick-Erichsen I've pushed the changes for review and have added the Bedrock implementation. I'll open separate issues as enhancement requests for the other providers, as a reminder to add them or to find a community member who can.
@sestinj quick ping on how you would like to proceed?
@chezsmithy Sorry if this didn't come across in the last message. I don't think we should merge a specific setting for caching tools at all if we are just going to add a general cache setting right after and then deprecate it. I think we should skip straight to the creation of the general caching setting. Probably lives in
Should this PR stay open?
@tomasz-stefaniak stand by. I'm going to push an update shortly based on feedback.
55dfae4 to 6b8345e
@tomasz-stefaniak it's ready for review. The e2e tests seem to be failing for all recently pushed PRs for some reason. I noted it on Discord.
Description
Provides initial configuration support for this new feature: #5340
Adds the ability to specify whether the model / provider should enable caching for the tools configuration. The changes to enable the use of this configuration in providers will be done via separate PRs. The documentation for this feature will be updated on a per model / provider basis.
Enabling tools caching for AWS Bedrock Claude Models.
Note, I'm not planning to make significant updates to the JSON schema to support this feature.
Testing instructions
defaultCompletionOptions:
  promptCaching: true
Add this configuration to a model and validate through debugging that it's loaded and available to the model / provider in question downstream.
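For anyone validating this downstream, here is a minimal sketch of how a provider might consume the flag. It is not the code in this PR: `CompletionOptions`, `ToolSpec`, and `buildToolConfig` are hypothetical, simplified names, and the `cachePoint` entry reflects my understanding of how the Bedrock Converse API marks a cache boundary after the tool definitions, so please check it against the AWS documentation before relying on it.

```typescript
// Hypothetical, simplified shapes for illustration; not Continue's actual types.
interface CompletionOptions {
  promptCaching?: boolean;
}

interface ToolSpec {
  name: string;
  description: string;
  inputSchema: { json: Record<string, unknown> };
}

// Assumption: a Converse-style tools array may hold either a tool spec
// or a cache point marker.
type BedrockTool =
  | { toolSpec: ToolSpec }
  | { cachePoint: { type: "default" } };

// Builds the tool configuration, appending a cache point after the tool
// definitions only when promptCaching is enabled.
function buildToolConfig(
  tools: ToolSpec[],
  options: CompletionOptions,
): { tools: BedrockTool[] } | undefined {
  if (tools.length === 0) {
    return undefined; // nothing to send, nothing to cache
  }
  const entries: BedrockTool[] = tools.map((toolSpec) => ({ toolSpec }));
  if (options.promptCaching) {
    entries.push({ cachePoint: { type: "default" } });
  }
  return { tools: entries };
}
```

With `promptCaching: true` loaded from the config, a breakpoint inside a function like this should show the extra cache point entry at the end of the tools array; with the option unset, the array should contain only the tool specs.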