Main merge for maeb -> 1.38.52 #3109

Merged

isaac-chung merged 113 commits into maeb from main-merge-for-maeb

Sep 1, 2025

+848 −223

Collaborator

isaac-chung commented Sep 1, 2025

If you add a model or a dataset, please add the corresponding checklist:

makram93 and others added 30 commits

July 11, 2025 22:06


          model: add image support for jina embeddings v4 (#2893)

17be7e5

* feat: unify text and image embeddings for all tasks

* fix: uniform batch size

* fix: update error message

* fix: update code task

* fix: update max length

* fix: apply review suggestions


          model: add kalm_models (kalm-emb-v2) ModelMeta (new PR) (#2889)

9ecac21

* feat: add KaLM_Embedding_X_0605 in kalm_models

* Update kalm_models.py for lint format

* kalm-emb-v2

* kalm-emb-v2

* kalm-emb-v2

* kalm-emb-v2

* kalm-emb-v2

---------

Co-authored-by: xinshuohu <xinshuohu@tencent.com>
Co-authored-by: Xinshuo Hu <yanshek.woo@gmail.com>


          Add Classification Evaluator unit test (#2838)

4a47f90

* Adding Classification Evaluator test

* Modifications due to the comments

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Modifications due to the comments

* Modifications due to the comments

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>


          fix: update colpali engine models (#2905)

9864e2a

* adding vidore benchmarks

* fix typo

* clean vidore names + per lang eval

* lint

* vidore names

* bibtex fix

* fix revision

* vidore v2 citation

* update citation format and fix per-language mappings

* lint: citations

* typo citations

* fix revisions

* lint

* fix colnomic3b revision

* fix colqwen2.5 revision + latest repo version

* fix query augmentation tokens

* colsmol revision


          1.38.35

5a8ccec

Automatically generated by python-semantic-release


          Evaluator tests (#2910)

c7078af

* Adding Classification Evaluator test

* Modifications due to the comments

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update tests/test_evaluators/test_ClassificationEvaluator.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Modifications due to the comments

* Modifications due to the comments

* Adding STSEvaluator and SummarizationEvaluator tests

* Correcting due to the comments

* Correcting due to the comments

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>


          Classification dataset cleaning (#2900)

aef1e33

* Classification dataset cleaning

* Update pull request number

* Fix metadata test

* fix formatting

* add script for cleaning


          Update tasks & benchmarks tables

56c98ed


          dataset: Add JapaneseSentimentClassification (#2913)

57438c2

Add JapaneseSentimentClassification


          Update tasks & benchmarks tables

372fc4c


          fix: change passage prompt to document (#2912)

a298fa9

* change document to passage

* fix prompt names

* fix kwargs check

* fix default prompt


          1.38.36

8eb4f6d

Automatically generated by python-semantic-release


          model: Add OpenSearch inf-free sparse encoding models (#2903)

5a868e3

add opensearch inf-free models

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>


          dataset: add BarExamQA dataset (#2916)

1dcc6dc

* Add BarExamQA retrieval task

* ran linter

* updated details

* updated details

* fixed subtype name

* fixed changes

* ran linter again


          Use mteb.get_model in adding_a_dataset.md (#2922)

c1922c8

Update adding_a_dataset.md


          fix: specify revision for opensearch (#2919)

0ac0231

specify revision for opensearch


          1.38.37

b12b926

Automatically generated by python-semantic-release


          Update the link for gemini-embedding-001 (#2928)

533ce59


          fix: replace with passage (#2934)

5ed6c90


          fix: Only import SparseEncoder once sentence-transformer version have been checked (#2940)

79a43af

* fix: Only import SparseEncoder once sentence-transformer version have been checked

fixes #2936

* Update mteb/models/opensearch_neural_sparse_models.py

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

---------

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
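
The deferred-import idea behind this fix can be sketched as follows. This is a minimal illustration, not the code from `opensearch_neural_sparse_models.py`: the helper names `parse_version` and `load_after_version_check` are hypothetical, and the point is only that the import runs inside the function, after the installed version has been checked.

```python
from __future__ import annotations

from importlib import import_module
from importlib.metadata import PackageNotFoundError, version


def parse_version(text: str) -> tuple[int, ...]:
    """Turn '5.0.1' into (5, 0, 1), stopping at the first non-numeric part."""
    parts = []
    for piece in text.split("."):
        if not piece.isdigit():
            break
        parts.append(int(piece))
    return tuple(parts)


def load_after_version_check(package: str, module: str, attr: str, minimum: str):
    """Import `attr` from `module` only once `package` is known to be new enough.

    Hypothetical helper: raises ImportError when the package is missing or too old.
    """
    try:
        installed = version(package)
    except PackageNotFoundError as err:
        raise ImportError(f"{package} is not installed") from err
    if parse_version(installed) < parse_version(minimum):
        raise ImportError(f"{package}>={minimum} is required, found {installed}")
    # The import happens here, never at module import time.
    return getattr(import_module(module), attr)
```

A call might look like `load_after_version_check("sentence-transformers", "sentence_transformers", "SparseEncoder", "5.0.0")`, where the minimum version shown is an assumption rather than a value taken from this PR.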


          fix: Prevent incorrectly passing "selector_state" to get_benchmark (#2939)

8496ec2

The leaderboard would have (silent) errors where `get_benchmark` led to a KeyError because "selector_state" was passed as a default value. Setting `DEFAULT_BENCHMARK_NAME` as the value solves this issue.
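
A minimal sketch of the failure mode described above, under stated assumptions: the registry, its contents, and `on_select` are hypothetical stand-ins and not the leaderboard's actual code. It only shows why a placeholder string used as a default ends in a `KeyError`, while a real benchmark name does not.

```python
from __future__ import annotations

# Hypothetical registry; names and contents are illustrative only.
BENCHMARKS = {
    "demo-benchmark": ["TaskA", "TaskB"],
    "other-benchmark": ["TaskC"],
}
DEFAULT_BENCHMARK_NAME = "demo-benchmark"


def get_benchmark(name: str) -> list[str]:
    """Look up a benchmark by name; unknown names raise KeyError."""
    return BENCHMARKS[name]


def on_select(selected: str = DEFAULT_BENCHMARK_NAME) -> list[str]:
    """Callback whose default is a valid benchmark name, not a placeholder."""
    return get_benchmark(selected)
```

With the placeholder as the default, `get_benchmark("selector_state")` raises `KeyError` because no benchmark has that name; with a valid default, `on_select()` resolves normally.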


          docs: Update adding_a_dataset.md (#2947)

a78debf

* docs: Update adding_a_dataset.md

* Update docs/adding_a_dataset.md


          ci: bump semantic release

4ef8571


          1.38.38

03a0582

Automatically generated by python-semantic-release


          dataset: Add BSARD v2, fixing the data loading issues of v1 (#2935)

* BSARD loader fixed

* BSARDv2 metadata fixed

* Update mteb/tasks/Retrieval/fra/BSARDRetrieval.py

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>


          Update tasks & benchmarks tables

da46c8e


          dataset: add GovReport dataset (#2953)

42dfe0d

* Added govreport task

* Updated description


          dataset: add BillSum datasets (#2943)

007d19f

* Added BillSum datasets

* fixed billsumca

* Updated BillSumCA description

* Updated BillSumUS description

* Update mteb/tasks/Retrieval/eng/BillSumCA.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update mteb/tasks/Retrieval/eng/BillSumUS.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* lint

* lint

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>


          Update tasks & benchmarks tables

e4f30e9


          fix: Add new benchmark beRuSciBench along with AbsTaskTextRegression (#2716)

36df9ca

* Add RuSciBench

* fix bitext mining lang

* Add regression task

* fix init

* add missing files

* Improve description

* Add superseded_by

* fix lint

* Update regression task to match with v2

* Add stratified_subsampling for regression task

* Add bootstrap for regression task

* Rename task class, add model as evaluator argument

* fix import

* fix import 2

* fixes

* fix

* Rename regression model protocol

fzoll and others added 28 commits

August 24, 2025 10:18


          Correcting the JINA models with SentenceTransformerWrapper (#3071)

70724e7


          ci: Add stale workflow (#3066)

df719cc

* add stale workflow

* add permissions

* add bug label to bug issue template

* revert bug issue and only look at more info needed issues

* more accurate name

* override default


          fix: open_clip package validation (#3073)

1f9641a


          1.38.45

f210ac1

Automatically generated by python-semantic-release


          fix: Update revision for qzhou models (#3069)

63a0c60


          1.38.46

Automatically generated by python-semantic-release


          Fix the reference link for CoDi-Embedding-V1 (#3075)

d2c3570

Fix reference link


          fix: Add beta version of RTEB related benchmarks (#3048)

* Add RTEB related benchmarks

* Add RTEB related benchmarks

* Correcting the task names in the RTEB benchmarks

* Update mteb/leaderboard/benchmark_selector.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Adding the CURE dataset to RTEB benchmarks

* Use the right language subset

* Fix broken finance icon URL in RTEB benchmarks

Replace broken libre-finance-dollar.svg with working libre-gui-price-tag.svg
Validated all icon URLs and confirmed accessibility compliance

* Add the rteb_benchmarks to the BENCHMARK_REGISTRY

* Add the rteb_benchmarks to the BENCHMARK_REGISTRY

* Add the rteb_benchmarks to the BENCHMARK_REGISTRY

* Add the rteb_benchmarks to the BENCHMARK_REGISTRY

* Add the rteb_benchmarks to the BENCHMARK_REGISTRY

* Add the rteb_benchmarks to the BENCHMARK_REGISTRY

* Add the rteb_benchmarks to the BENCHMARK_REGISTRY

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>


          1.38.47

bce7471

Automatically generated by python-semantic-release


          fix: run ruff check on all files during ci (#3086)

b46b633

* fix: run `ruff check` on all files during ci

* format


          1.38.48

6db355e

Automatically generated by python-semantic-release


          Move dev to dependency groups (#3088)

cd14ef6

add dependency groups


          fix: Improving validate_task_to_prompt_name logs and error messages (#3079)

139fc73

* Improving validate_task_to_prompt_name logs and error messages

* linter fixes

* Adding None prompts tests

* Update test_benchmark_sentence_transformer

* Update mteb/leaderboard/benchmark_selector.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>


          fix: duplicate mteb multilingual variables (#3080)

27be671

* fix benchmark naming

* format

* lint


          Update tasks & benchmarks tables

5bf303b


          model: mdbr-leaf models (#3081)

e4c2a95

* added MDBR leaf models

* fixed revision for mdbr-leaf-ir

* added model prompts

* updated training datasets

* fixed linting

* lotte task reference

---------

Co-authored-by: Robin Vujanic <robin.vujanic@mongodb.com>


          1.38.49

2b7089a

Automatically generated by python-semantic-release


          CI: Set upper limit for xdist version (#3098)

17fa697

* Comment out bibtex formatting

* Remove `-n auto`

* get back bibtex

* try limiting versions

* revert coverage

* revert coverage

---------

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>


          Combine Plots and Tables into a Single Tab (#3047)

* feat - Combine Plots and Tables into a Single Tab #3009

* feat - Resize the plot to make it more readable

* feat - Remove the (radar chart)

* feat - Add a comment stating that it only shows the Top 5 models in the table.

* feat - adjust layout

* Update mteb/leaderboard/app.py

* format

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>


          fix: Updating the default batch size calculation in the voyage models (#3091)

5851c7a


          1.38.50

80966c2

Automatically generated by python-semantic-release


          fix: Add @classmethod for @field_validators in TaskMetadata (#3100)


          Align task prompt dict with PromptType (#3101)

7303c15

* align task prompt dict with `PromptType`

* use value instead of enum
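
One way to read "use value instead of enum" is that prompt dictionaries are keyed by the enum's string values, so plain strings from user configuration can be checked against them. The sketch below assumes that reading; the `PromptType` members and the `check_prompt_dict` helper are hypothetical and not copied from mteb.

```python
from __future__ import annotations

from enum import Enum


class PromptType(str, Enum):
    # Hypothetical members; the real enum may use different names.
    query = "query"
    document = "document"


def check_prompt_dict(prompts: dict[str, str]) -> None:
    """Raise ValueError if a key is not one of the PromptType values."""
    allowed = {member.value for member in PromptType}
    unknown = set(prompts) - allowed
    if unknown:
        raise ValueError(
            f"Unknown prompt types {sorted(unknown)}; expected a subset of {sorted(allowed)}"
        )


def prompt_key(task_name: str, prompt_type: PromptType) -> str:
    """Build a key from the enum's value, which is stable across Python versions."""
    return f"{task_name}-{prompt_type.value}"
```

Using `.value` explicitly avoids relying on how an enum member is rendered inside an f-string, which differs between Python versions for mixed-in string enums.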


          1.38.51

b7b5d11

Automatically generated by python-semantic-release


          model: Add ModelMeta for OrdalieTech/Solon-embeddings-mini-beta-1.1 (#3090)

4774b74

* Add ModelMeta for OrdalieTech/Solon-embeddings-mini-beta-1.1

* Add training_datasets (common_corpus, fineweb, wiki_fr, private LLM-synth)

* Format with ruff + add loader per review

* Apply ruff format/fixes

* Update mteb/models/ordalietech_solon_embeddings_mini_beta_1_1.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Update mteb/models/ordalietech_solon_embeddings_mini_beta_1_1.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Register OrdalieTech/Solon-embeddings-mini-beta-1.1 in overview (ModelMeta + loader)

* Update mteb/models/ordalietech_solon_embeddings_mini_beta_1_1.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* fix import

* Add memory_usage_mb=808.0 and required fields to ModelMeta

* Fix parameter count (210 million)

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>


          fix: Allow closed datasets (#3059)

5844cc7

* - Added an include_private parameter to the get_tasks() function that defaults to False
  - This ensures that by default, tests only run on public datasets
  - Tests can explicitly set include_private=True when needed to test private datasets

  - Added is_public: bool | None = None field to TaskMetadata
  - The field is optional and defaults to None (treated as public)
  - Updated the is_filled() method to exclude is_public from required fields
  - Added documentation

* - Added an include_private parameter to the get_tasks() function that defaults to False
  - This ensures that by default, tests only run on public datasets
  - Tests can explicitly set include_private=True when needed to test private datasets

  - Added is_public: bool | None = None field to TaskMetadata
  - The field is optional and defaults to None (treated as public)
  - Updated the is_filled() method to exclude is_public from required fields
  - Added documentation

* Correcting due to comments

* Update mteb/abstasks/TaskMetadata.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Update mteb/overview.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Removing the not used filter_tasks_by_privacy function

* Correcting due to comments

* Correcting due to comments

* Correcting due to comments

* Removing the test case

* Rename the include_private parameter to exclude_private

* Rename the include_private parameter to exclude_private

* Add private tasks tests

* Add private tasks tests

* Update tests/test_tasks/test_private_tasks.py

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Add private tasks tests

* Add private tasks tests

* Add private tasks tests

---------

Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>
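
The filtering behaviour described in this commit can be sketched roughly as below. This is an illustration under assumptions, not the code in `mteb/overview.py`: `TaskInfo` is a hypothetical stand-in for `TaskMetadata`, the function takes the task list as an argument for self-containment, and the `exclude_private=True` default is inferred from the original `include_private=False` default before the rename.

```python
from __future__ import annotations

from dataclasses import dataclass


@dataclass(frozen=True)
class TaskInfo:
    """Hypothetical stand-in for task metadata."""

    name: str
    is_public: bool | None = None  # None is treated as public


def get_tasks(tasks: list[TaskInfo], exclude_private: bool = True) -> list[TaskInfo]:
    """Return all tasks, dropping explicitly private ones unless asked not to."""
    if not exclude_private:
        return list(tasks)
    # Only is_public=False marks a task as private; None and True both pass.
    return [task for task in tasks if task.is_public is not False]
```

For example, given one task with `is_public=None`, one with `is_public=False`, and one with `is_public=True`, the default call keeps the first and third, while `exclude_private=False` returns all three.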


          1.38.52

07bf861

Automatically generated by python-semantic-release


          Merge branch 'maeb' into main-merge-for-maeb

5e0bf22

isaac-chung merged commit 53b8b62 into maeb

9 checks passed

isaac-chung deleted the main-merge-for-maeb branch

September 1, 2025 18:50

27 participants