-
Notifications
You must be signed in to change notification settings - Fork 30.3k
[generate] return past_key_values #17574
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. |
if return_dict_in_generate: | ||
past_key_values += (model_kwargs["past"],) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Assumes that if model_kwargs["past"
is not None
-> output_past_key_values == True
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
LGTM 👍
# past_key_values = model_kwargs["past"] if output_past_key_values else None | ||
|
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
# past_key_values = model_kwargs["past"] if output_past_key_values else None |
(probably forgotten :) )
@@ -1433,6 +1433,17 @@ def _check_outputs(self, output, input_ids, config, use_cache=False, num_return_ | |||
use_cache=use_cache, | |||
) | |||
|
|||
# Past Key Value States | |||
past_key_values = output.past_key_values |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Very cool to put it in every test here - good job!
We'll just need to fix the failing tests now :-) Think you'll have to overwrite this "checking" function in the respective individual test files |
Hey there, sorry to nag, but any chance of moving this along? Anything I can do to help? |
This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread. Please note that issues that do not follow the contributing guidelines are likely to be ignored. |
(@patrickvonplaten @patil-suraj should I take over this PR? :) ) |
If ok for you @gante this would be amazing! |
Hi, Thank you all for working on this feature! Is this going to be merged into the main branch soon? |
@shunzh I haven't started working on it and it's hard to give estimates -- hopefully less than a month :) |
Was this closed because it's now possible to retrieve |
@gilljon it is not closed :) |
@gante I'm sorry for the confusion! Any idea when it will be merged? |
hi @gante . Any idea when this will be merged? Interested in using it and building something on top of it. I'll happy to put on the finishing touches if needed too! |
Hey! Just a friendly reminder. Any chance to get it merged soon? |
I would absolutely love this feature! This would open up so much for me, because I have prompts like:
This is so expensive without So this PR is now Merge-Conflicting, and I tried applying the patch but upon inspection, it's quite severely out of date now. Is there another way to accomplish this? I notice that EDIT: IIUC, transformers/src/transformers/generation/utils.py Line 1301 in 0b192de
|
Is this ticket dead because some other technique exists already for returning and reusing |
The following PR is more up to date: #25086 |
(deprecated in favor of #25086) |
Hey folks 👋 #25086 was merged. If you install from You can then pass |
I cant get it to work with intel neural_chat what vartion was this on? |
What does this PR do?
Allows returning
past_key_values
fromgenerate
whenuse_cache=True
.Like other returned values,
past_key_values
are also returned asTuple
, one element per generated token.Fixes #17016