[v2] Merge main #2412

Samoed · 2025-03-22T12:22:49Z

Code Quality

Code Formatted: Format the code using make lint to maintain consistent style.

Documentation

Updated Documentation: Add or update documentation to reflect the changes introduced in this PR.

Testing

New Tests Added: Write tests to cover new functionality. Validate with make test-with-coverage.
Tests Passed: Run tests locally using make test or make test-with-coverage to ensure no existing functionality is broken.

Adding datasets checklist

Reason for dataset addition: ...

I have run the following models on the task (adding the results to the PR). These can be run using the mteb -m {model_name} -t {task_name} command.
- sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
- intfloat/multilingual-e5-small
I have checked that the performance is neither trivial (both models score close to perfect) nor random (both models score close to chance).
If the dataset is too big (e.g. >2048 examples), consider using self.stratified_subsampling() under dataset_transform()
I have filled out the metadata object in the dataset file (find documentation on it here).
Run tests locally to make sure nothing is broken using make test.
Run the formatter to format the code using make lint.
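
The subsampling item in the checklist above can be illustrated with a minimal, standalone sketch. This is not the mteb implementation of self.stratified_subsampling(); the function name stratified_subsample, its parameters, and the toy data are hypothetical and only show the idea of shrinking a dataset to roughly 2048 examples while keeping label proportions about the same.

```python
import random
from collections import defaultdict


def stratified_subsample(examples, label_key="label", n_samples=2048, seed=42):
    """Hypothetical helper: sample about n_samples items, preserving label ratios."""
    # Small datasets are returned unchanged.
    if len(examples) <= n_samples:
        return list(examples)

    rng = random.Random(seed)  # fixed seed so the subset is reproducible

    # Group examples by their label.
    by_label = defaultdict(list)
    for example in examples:
        by_label[example[label_key]].append(example)

    sampled = []
    for label in sorted(by_label, key=str):
        group = by_label[label]
        # Each label gets a share proportional to its frequency (at least one item).
        quota = max(1, round(n_samples * len(group) / len(examples)))
        sampled.extend(rng.sample(group, min(quota, len(group))))

    rng.shuffle(sampled)
    return sampled


# Toy data: 4000 examples, one quarter with label 1 and the rest with label 0.
data = [{"text": f"doc {i}", "label": 1 if i % 4 == 0 else 0} for i in range(4000)]
subset = stratified_subsample(data, n_samples=2048)
```

Because each label's quota is rounded independently, the result can differ from n_samples by a few items when the proportions do not divide evenly.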

Adding a model checklist

I have filled out the ModelMeta object to the extent possible
I have ensured that my model can be loaded using
- mteb.get_model(model_name, revision) and
- mteb.get_model_meta(model_name, revision)
I have tested the implementation works on a representative set of tasks.

* fix: renaming Zeroshot -> ZeroShot Addresses #2078 * fix: minor style changes Addresses #2078 * fix: Major updates to documentation This PR does the following: - Introduces other modalities more clearly in the documentation and makes it easier to transition to a full documentation site later. - Adds minor code updates due to inconsistencies discovered between docs and code. - Adds the MMTEB citation where applicable. - Makes the docs ready to move torchvision to an optional dependency. * Moved VISTA example * rename 1 * rename 2 * format * fixed error * fix: make torchvision optional (#2399) * fix: make torchvision optional * format * add docs * minor fix * remove transform from Any2TextMultipleChoiceEvaluator --------- Co-authored-by: Isaac Chung <chungisaac1217@gmail.com> * move Running SentenceTransformer model with prompts to usage --------- Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>


KennethEnevoldsen and others added 16 commits March 18, 2025 22:53

fix: Ensure BrightRetrieval is valid to run (#2334)

cf26764

* fix: Ensure BrightRetrieval is valid to run Not sure this is the best way to fix this. Let me know if you can find a better fix. fixes #2327 * fix: convert brightretrieval to two tasks * fix collecting error

Update tasks table

042d6e7

1.36.26

b3a9191

Automatically generated by python-semantic-release

Pass task name to all evaluators (#2389)

5ebee24

* pass task name to all tasks * add test * fix loader

fix: renaming Zeroshot -> ZeroShot (#2395)

e7b04a6

* fix: renaming Zeroshot -> ZeroShot Addresses #2078 * rename 1 * rename 2 * format * fixed error

1.36.27

349d5a8

Automatically generated by python-semantic-release

fix: Update AmazonPolarityClassification license (#2402)

cf84a79

Update AmazonPolarityClassification.py

fix b1ade name (#2403)

a0990cb

1.36.28

d2dc2f6

Automatically generated by python-semantic-release

Minor style changes (#2396)

8be95b7

* fix: renaming Zeroshot -> ZeroShot Addresses #2078 * fix: minor style changes Addresses #2078 * rename 1 * rename 2 * format * fixed error --------- Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

Added new dataset and tasks - ClusTREC-covid, clustering of thematic covid related scientific papers (#2302)

e2476d2

* Clustrec covid new dataset and task * fix * fix * fix * fix * fix * descriptive stats * change all mentions of clustrec-covidp2p to clustrec-covid * change ' to "

Update tasks table

5b0bd56

1.36.29

9c459a8

Automatically generated by python-semantic-release

remove Arabic_Triplet_Matryoshka_V2.py (#2405)

811dbf6

merge

9782c87

Samoed changed the base branch from main to v2.0.0 March 22, 2025 12:23

Samoed added 2 commits March 22, 2025 15:43

update imports

aecdd5e

add stats

110c6bc

Samoed merged commit 97b94d9 into v2.0.0 Mar 22, 2025
10 checks passed

Samoed deleted the merge_main branch March 22, 2025 19:16
