OyvindTafjord
Contributor

As reported in #685, this fixes a serious bug in the bpb evaluations for oe-eval tasks. It was found by luck on sciq: the answer there is always "D", and the bug meant that only questions with answer "A" were actually scored!
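
For readers unfamiliar with the metric, here is a minimal sketch of why this shows up so clearly on sciq. It is not the code from this PR. The field names (`gold_index`, `gold_text`, `gold_logprob`), the helper names, and the `only_gold_index` filter are hypothetical, and the filter only mimics the symptom described above rather than the actual cause. It assumes bpb is the negative log-likelihood of the gold answer, converted to bits and divided by its UTF-8 byte length.

```python
import math


def bits_per_byte(logprob_nats: float, text: str) -> float:
    """Convert a summed log-likelihood (in nats) to bits per UTF-8 byte."""
    num_bytes = len(text.encode("utf-8"))
    return -logprob_nats / (num_bytes * math.log(2))


def mean_bpb(examples, only_gold_index=None):
    """Average bpb over the gold answers.

    `only_gold_index` is a hypothetical knob that imitates the symptom:
    when set, only examples whose gold answer sits at that index are scored.
    """
    scores = [
        bits_per_byte(ex["gold_logprob"], ex["gold_text"])
        for ex in examples
        if only_gold_index is None or ex["gold_index"] == only_gold_index
    ]
    # No scored examples means there is nothing to average.
    return sum(scores) / len(scores) if scores else None


# Toy data shaped like sciq: the gold answer is always "D" (index 3).
examples = [
    {"gold_index": 3, "gold_text": "oxygen", "gold_logprob": -6 * math.log(2)},
    {"gold_index": 3, "gold_text": "gravity", "gold_logprob": -14 * math.log(2)},
]

intended = mean_bpb(examples)                     # every question is scored
symptom = mean_bpb(examples, only_gold_index=0)   # only answer "A" is scored
```

With this toy data, `intended` averages over both questions, while `symptom` is `None` because no question has gold answer "A". A task whose gold answers are spread across A-D would hide the problem by still producing a number, which fits the description of the bug being caught by luck.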

This PR also adds the 0-shot RC versions of csqa and social_iqa for parity with old evaluations.

@OyvindTafjord OyvindTafjord merged commit 46f06cb into main Aug 27, 2024
11 of 12 checks passed
@OyvindTafjord OyvindTafjord deleted the ot-fix-oe-eval-bpb branch August 27, 2024 23:39
2 participants