Conversation
mathemakitten (Contributor) commented:
If the user doesn't explicitly pass in max_length, we shouldn't truncate the inputs to perplexity at all. model.config.max_length is unreliable since it's named differently across models (and is sometimes absent entirely).

I'd also be in support of removing the truncation option entirely, but this seems like a good compromise for retaining the existing functionality.

Closes #332.
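The behavior described above can be sketched as follows. This is an illustrative Python sketch, not the actual `evaluate` implementation: the function name `truncate_inputs` and its signature are hypothetical, and the point is only that `max_length=None` now means "no truncation" rather than falling back to `model.config.max_length`.

```python
def truncate_inputs(token_ids, max_length=None):
    """Truncate token ids only if the caller explicitly passed max_length.

    Hypothetical sketch of the PR's logic: previously a missing
    max_length fell back to model.config.max_length, which is named
    inconsistently across models; here None means keep everything.
    """
    if max_length is None:
        # User did not request truncation: return inputs unchanged.
        return token_ids
    return token_ids[:max_length]


# Example usage:
# truncate_inputs([1, 2, 3, 4, 5])                -> [1, 2, 3, 4, 5]
# truncate_inputs([1, 2, 3, 4, 5], max_length=3)  -> [1, 2, 3]
```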

HuggingFaceDocBuilderDev commented Oct 31, 2022:

The documentation is not available anymore as the PR was closed or merged.

@lvwerra (Member) left a comment:


Looks good, thanks for fixing!

@mathemakitten mathemakitten merged commit 9f0f888 into main Nov 1, 2022
@mathemakitten mathemakitten deleted the hn-perplexity-cutoff branch November 1, 2022 14:41
NimaBoscarino pushed a commit to NimaBoscarino/evaluate that referenced this pull request Nov 9, 2022
…gface#333)

* Stop using model-defined truncation

* Formatting

* If start token and also max length defined
Linked issue that merging this pull request may close:

Bug in computing perplexity about max_length