Conversation

KennethEnevoldsen
Contributor

@KennethEnevoldsen KennethEnevoldsen commented Feb 4, 2025

External models had a revision. This led to duplicate scores (two results from the same model on the same revision).

I have replaced the revision with "no_revision_available". I also think this more accurately reflects reality (we don't know which version they used when they ran the model).

Checklist

  • Run tests locally to make sure nothing is broken using make test.
  • Run the results files checker make pre-push.

@Samoed
Member

Samoed commented Feb 4, 2025

Can you update the load external script too?

@KennethEnevoldsen
Contributor Author

I think we should avoid using load_external (model submission should happen here), so I have actually deleted it.

@KennethEnevoldsen
Contributor Author

KennethEnevoldsen commented Feb 4, 2025

Let me know if you agree.

I also added a few extra tests; these should make the revisions much more consistent in the future (they also disallow "external").

Essentially, you are allowed to use a SHA-1 revision ID or an integer (1, 2, 3) in the case of APIs.
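
To make the rule above concrete, here is a minimal sketch of such a check. It is not the test code added in this PR: the function name `is_valid_revision`, the 40-hex-character SHA-1 pattern, and the decision to accept the "no_revision_available" placeholder are all assumptions made for illustration.

```python
import re

# Assumption: a SHA-1 revision id is exactly 40 lowercase hex characters.
SHA1_PATTERN = re.compile(r"[0-9a-f]{40}")
# Assumption: API model versions are positive integers stored as strings ("1", "2", ...).
INTEGER_PATTERN = re.compile(r"[1-9][0-9]*")
# Placeholder used when the revision of an external model is unknown.
PLACEHOLDER = "no_revision_available"


def is_valid_revision(revision: str) -> bool:
    """Return True if the revision is a SHA-1 id, an integer version, or the placeholder."""
    if revision == PLACEHOLDER:
        return True
    return bool(
        SHA1_PATTERN.fullmatch(revision) or INTEGER_PATTERN.fullmatch(revision)
    )
```

With these assumptions, a value such as "external" is rejected because it is neither a 40-character hex string nor an integer.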

@KennethEnevoldsen
Contributor Author

(@x-tabdeveloping just so that you see this)

@Samoed
Member

Samoed commented Feb 4, 2025

Yes, I agree that with the new leaderboard this script can be deleted.

@KennethEnevoldsen KennethEnevoldsen enabled auto-merge (squash) February 5, 2025 07:56
@KennethEnevoldsen KennethEnevoldsen merged commit 7bfb6d9 into main Feb 5, 2025
2 checks passed
2 participants