Add metric for judging numeric relative error #459
bigbench/api/test_task_metrics.py
Outdated
targets, responses = zip(*test_data)
scores = metrics.exact_str_match_fn(targets=targets, responses=responses)

expected_scores = {"exact_str_match": 9/18}
This comment was marked as resolved.
Fixed, thanks!
bigbench/api/test_task_metrics.py
Outdated
(0, "i am not a number"), # Bad
]
targets, responses = zip(*test_data)
scores = metrics.exact_str_match_fn(targets=targets, responses=responses)
This comment was marked as resolved.
Fixed, thanks!
568b803 to 21175ac
@RomanPlusPlus : I've fixed the issues you pointed out above and this should be ready for review again.
b5ca56a to 4f0a818
@RomanPlusPlus : All tests pass.
@r-barnes, I'm not an assigned reviewer, sorry for being unclear. Not sure why I'm listed as a reviewer in the
Hello, thanks a lot for adding this! It is great to have submitters who add custom metrics! It looks great, and I added some comments for minor modifications.
@lewkowycz : Thanks for your review. I've made the changes you requested.
4f0a818 to ec4e54b
ec4e54b to a6bb46d
Fixed the metric's name in Waiting for this to merge now.
@lewkowycz : Can we merge this after the tests pass? The tests for #372 won't pass until this lands and our reviewer is withholding acceptance until the tests pass.
Yes, thanks for adding this again!
…etric Add metric for judging numeric relative error
Also alphabetizes some import lists
Unfortunately, I was unable to test this locally.
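For readers unfamiliar with the idea, here is a minimal sketch of what a numeric relative-error metric could look like. It is not the code merged in this PR: the names `relative_error` and `fraction_within_tolerance`, the `tolerance` parameter, and the handling of non-numeric responses and zero targets are all assumptions made for illustration.

```python
import math
from typing import Sequence


def relative_error(target: float, response: str) -> float:
    """Return |response - target| / |target|, or inf if the response is unusable.

    Hypothetical helper for illustration; not part of the BIG-bench API.
    """
    try:
        value = float(response)
    except (TypeError, ValueError):
        # Non-numeric response, e.g. "i am not a number".
        return math.inf
    if math.isnan(value):
        return math.inf
    if target == 0:
        # Relative error is undefined for a zero target (assumed convention):
        # an exact hit scores 0, anything else counts as an unbounded error.
        return 0.0 if value == 0 else math.inf
    return abs(value - target) / abs(target)


def fraction_within_tolerance(
    targets: Sequence[float],
    responses: Sequence[str],
    tolerance: float = 0.01,
) -> float:
    """Fraction of responses whose relative error is at most `tolerance`."""
    if len(targets) != len(responses):
        raise ValueError("targets and responses must have the same length")
    if not targets:
        return 0.0
    hits = sum(
        relative_error(t, r) <= tolerance for t, r in zip(targets, responses)
    )
    return hits / len(targets)
```

A test in the style of the snippets above would unpack `(target, response)` pairs with `zip(*test_data)` and compare the returned score against a hand-computed expected value.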