-
Notifications
You must be signed in to change notification settings - Fork 464
Description
The next version of the MTEB leaderboard will soon be released and with it a new zero-shot filter. However, we are currently planning to use the following definition of zero-shot. This issue is to open that discussion up to the community to ensure that we consider all relevant views.
Zero Shot
A model is considered zero-shot if it is not trained on other splits of the dataset used to derive the task.
E.g., if a model is trained on Natural Questions, it cannot be considered zero-shot on benchmarks containing the task “NQ” which is derived from Natural Questions.
This definition creates a few edge cases. For instance, multiple models are typically trained on Wikipedia title and body pairs, but we do not define this as leakage on, e.g., “WikipediaRetrievalMultilingual” and “WikiClusteringP2P” as these datasets are not based on title-body pairs.
Distilled, further fine-tunes or in other ways, derivative models inherit the datasets of their parent models.
Based on community feedback and research findings, This definition could change in the future.