
Conversation

@patil-suraj (Contributor)

What does this PR do?

Allows returning past_key_values from generate when use_cache=True.

Like other returned values, past_key_values are also returned as a tuple, with one element per generated token.

Fixes #17016
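
For illustration, a minimal sketch of the interface this PR describes (the checkpoint and prompt below are arbitrary placeholders, and the exact shape of the returned cache may differ from what eventually shipped):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
inputs = tokenizer("Hello", return_tensors="pt")

outputs = model.generate(
    **inputs,
    max_new_tokens=5,
    use_cache=True,
    return_dict_in_generate=True,
)
# Under this proposal, outputs.past_key_values is a tuple with one entry per
# generated token, like the other per-step outputs (scores, attentions, ...).
print(len(outputs.past_key_values))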

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint.

Comment on lines +2354 to +2355
if return_dict_in_generate:
    past_key_values += (model_kwargs["past"],)
Contributor Author

Assumes that if model_kwargs["past"] is not None, then output_past_key_values == True.

@gante (Member) left a comment

LGTM 👍

Comment on lines +2384 to +2385
# past_key_values = model_kwargs["past"] if output_past_key_values else None

Member

Suggested change
# past_key_values = model_kwargs["past"] if output_past_key_values else None

(probably forgotten :) )

@@ -1433,6 +1433,17 @@ def _check_outputs(self, output, input_ids, config, use_cache=False, num_return_
use_cache=use_cache,
)

# Past Key Value States
past_key_values = output.past_key_values
Contributor

Very cool to put it in every test here - good job!

@patrickvonplaten (Contributor)

We'll just need to fix the failing tests now :-) I think you'll have to override this "checking" function in the respective individual test files.

@dblakely (Contributor) commented Jul 29, 2022

Hey there, sorry to nag, but any chance of moving this along? Anything I can do to help?

@github-actions (bot)

This issue has been automatically marked as stale because it has not had recent activity. If you think this still needs to be addressed please comment on this thread.

Please note that issues that do not follow the contributing guidelines are likely to be ignored.

@gante (Member) commented Aug 25, 2022

(@patrickvonplaten @patil-suraj should I take over this PR? :) )

@patrickvonplaten (Contributor)

If ok for you @gante this would be amazing!

@huggingface huggingface deleted a comment from github-actions bot Sep 27, 2022
@gante gante added the WIP Label your PR/Issue with WIP for some long outstanding Issues/PRs that are work in progress label Sep 27, 2022
@shunzh commented Oct 1, 2022

Hi, thank you all for working on this feature! Is this going to be merged into the main branch soon?

@gante (Member) commented Oct 3, 2022

@shunzh I haven't started working on it and it's hard to give estimates -- hopefully less than a month :)

@gilljon commented Jan 24, 2023

Was this closed because it's now possible to retrieve past_key_values or was there another reason?

@gante (Member) commented Jan 24, 2023

@gilljon it is not closed :)

@gilljon commented Jan 24, 2023

@gante I'm sorry for the confusion! Any idea when it will be merged?

@sijunhe (Contributor) commented May 5, 2023

Hi @gante. Any idea when this will be merged? I'm interested in using it and building something on top of it. I'd be happy to put on the finishing touches if needed too!

@slyalin commented Jun 16, 2023

Hey! Just a friendly reminder. Any chance to get it merged soon?

@freckletonj commented Oct 2, 2023

I would absolutely love this feature! This would open up so much for me, because I have prompts like:

prompt = '''
Stuff
* <generate X>
* <generate Y>

Stuff
You said [X], and [Y] previously, now:
* <generate Z>
'''

This is so expensive without past_key_values.

So this PR now has merge conflicts, and I tried applying the patch, but upon inspection it's quite severely out of date.

Is there another way to accomplish this?

I notice that model.forward can typically return past_key_values. But then I... have to implement a sampling algorithm myself? Would this be the best way without needing upstream changes, and if so, how can I chain together model.forward and a sampler?
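
A rough sketch of that manual loop, for reference (this is not from the PR; the checkpoint, prompt, and greedy decoding below are arbitrary illustrative choices):

import torch
from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")
model.eval()

generated = tokenizer("Stuff", return_tensors="pt").input_ids
past_key_values = None

with torch.no_grad():
    for _ in range(20):
        if past_key_values is None:
            # First pass: run the full prompt and build the cache.
            outputs = model(input_ids=generated, use_cache=True)
        else:
            # Later passes: feed only the newest token and reuse the cache.
            outputs = model(
                input_ids=generated[:, -1:],
                past_key_values=past_key_values,
                use_cache=True,
            )
        past_key_values = outputs.past_key_values
        # Greedy choice; swap in top-k / nucleus sampling as needed.
        next_token = outputs.logits[:, -1, :].argmax(dim=-1, keepdim=True)
        generated = torch.cat([generated, next_token], dim=-1)

print(tokenizer.decode(generated[0]))
# past_key_values can now be reused to keep appending to this same prefix.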

EDIT: IIUC, generation_utils is where model.generate comes from, so the new place to make these edits is:

@freckletonj

Is this ticket dead because some other technique exists already for returning and reusing past_key_values? This is a killer feature.

@freckletonj

The following PR is more up to date: #25086

@gante (Member) commented Oct 18, 2023

(deprecated in favor of #25086)

@gante gante closed this Oct 18, 2023
@gante (Member) commented Nov 2, 2023

Hey folks 👋

#25086 was merged.

If you install from main and add return_dict_in_generate=True to generate, past_key_values will be part of the output, assuming your model is configured with use_cache=True (the default).

You can then pass past_key_values to generate to continue generating!
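
A minimal sketch of that workflow (assuming a recent install from main as described above; the checkpoint and prompt are placeholders):

from transformers import AutoModelForCausalLM, AutoTokenizer

tokenizer = AutoTokenizer.from_pretrained("gpt2")
model = AutoModelForCausalLM.from_pretrained("gpt2")

inputs = tokenizer("The quick brown fox", return_tensors="pt")

# use_cache=True is the default, so the cache lands on the returned object.
out = model.generate(**inputs, max_new_tokens=10, return_dict_in_generate=True)
cache = out.past_key_values

# Continue generating from the previous output, reusing the cache instead of
# re-running the forward pass over the tokens that were already processed.
continued = model.generate(
    out.sequences,
    past_key_values=cache,
    max_new_tokens=10,
    return_dict_in_generate=True,
)
print(tokenizer.decode(continued.sequences[0]))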

@nevakrien

I can't get it to work with Intel neural_chat. What version was this on?

Successfully merging this pull request may close these issues: Optionally return past key values from generate.