Merge main into MAEB #2488

Merged
isaac-chung merged 247 commits into maeb from merge-main-into-maeb-20250404 on Apr 4, 2025

Conversation

isaac-chung
Collaborator

@isaac-chung isaac-chung commented Apr 4, 2025

Merge in the latest changes, especially CI.
Odd that the same commits can still be seen from #2471

Code Quality

  • Code Formatted: Format the code using make lint to maintain consistent style.

Documentation

  • Updated Documentation: Add or update documentation to reflect the changes introduced in this PR.

Testing

  • New Tests Added: Write tests to cover new functionality. Validate with make test-with-coverage.
  • Tests Passed: Run tests locally using make test or make test-with-coverage to ensure no existing functionality is broken.

isaac-chung and others added 30 commits February 13, 2025 23:35
* add ImageClassificationDescriptiveStatistics

* add MNIST descriptive stats

* use tuples instead

* add label count and update docstrings

* update MNIST example
* fix: Add column descriptions to leaderboard

* removed existing overlap
#2041)

* fix: Add BRIGHT Long

Fixes #1978

* fix: Add BRIGHT(long)

* fix bug in task results

* updated bright

* updated tests for TaskResults
Automatically generated by python-semantic-release
* add image clustering descriptive stats and run
* finish off last one
* remove script
Automatically generated by python-semantic-release
* add gigaembeddings

* use jasper

* fix name

* create sentence_transformer instruct wrapper

* apply instruction template

* fix jasper

* update meta
…plementation (#2059)

* add image clustering descriptive stats and run

* finish off last one

* remove script

* add ImageMultilabelClassificationDescriptiveStatistics

* add VOC2007

* add zeroshot and mnist example
* add visualsts stats

* add last dataset
* fix: Added gte models

* fix: Add mixbai models (#1540)

for #1515
* Updated ClimateFEVER dataset with new version

* Adds Fill in the empty metadata.

* Updates the date tuple

* Update class name

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

* Update domains

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

* Update task_subtypes

* Update annotations_creators for the first version

* Update date

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

* Update task subtypes

* Update path

* Update description

---------

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Mina Parham <minaparham@Keatext.local>
* change reference revisions to align with paper

* Update author list

* Added code for main results table

* updated minor changes

* added external as a "no_revision_available" case

* revert unintended changes

* format
Automatically generated by python-semantic-release
#1911)

* adding clustering tasks (built-bench-clustering S2S & P2P)

* updated built-bench-clustering tasks

* Updated BuiltBenchClustering tasks

* Added "Engineering" as new domain to TaskMetadata.py
* Updated tasks table in docs
* Updated task metadata for BuiltBenchClustering S2S and P2P

* updated metadata for clustering tasks

* Add/update BuiltBench tasks

- Add BuiltBenchRetrieval task
- Add BuiltBenchReranking task
- Update metadata for BuiltBenchClusteringP2P
- Update metadata for BuiltBenchClusteringS2S

* update BuiltBench benchmark

* Update mteb/benchmarks/benchmarks.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Update mteb/tasks/Clustering/eng/BuiltBenchClusteringS2S.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Update mteb/tasks/Clustering/eng/BuiltBenchClusteringP2P.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Update mteb/benchmarks/benchmarks.py

Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

* Fix formatting via ruff

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>
* update model names to adjust for adding to results repo

* update model meta script
* add most image classification descr stats

* revert changes to encoder

* add stats

---------

Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>
* fix: rerun tests that fail - Networking

* update tests to use tmp_path

* set versions for dev dependencies

* add pytest options to pyproject.toml

* add rerun json.decoder.JSONDecodeError

* remove JSONDecodeError from pyproject.toml

* add huggingface_hub.errors.HfHubHTTPError

* add huggingface_hub.errors.LocalEntryNotFoundError
https://github.com/embeddings-benchmark/mteb/actions/runs/13298535701/job/37139767443?pr=2044

* FileNotFoundError
https://github.com/embeddings-benchmark/mteb/actions/runs/13302915091/job/37147507251?pr=2029

* add doc to pytest rerun

---------

Co-authored-by: sam021313 <40773225+sam021313@users.noreply.github.com>
* fix: generate metadata

* use logging not print for script

* lint

* add iso639 to dev pyproject

* fix import

* add memory_usage_mb

* set version for iso639

Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

---------

Co-authored-by: sam021313 <40773225+sam021313@users.noreply.github.com>
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Automatically generated by python-semantic-release
add missing training datasets
Automatically generated by python-semantic-release
github-actions bot and others added 24 commits March 26, 2025 19:35
* Added meta information about SearchMap_Preview model to the model_dir

* Added meta information about SearchMap_Preview model to the model_dir

* updated revision name

* Device loading and cuda cache cleaning step left out

* removed task instructions since it's not necessary

* changed sentence transformer loader to mteb default loader and passed instructions as model prompts

* Included searchmap to the models overview page

* Included searchmap to the models overview page

* added metadata information about where model was adapted from

* Update mteb/models/searchmap_models.py

* fix lint

* lint

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>
* Add Background Gradients in Summary and Task Table

* Remove warnings and add light green cmap

* Address comments

* Separate styling function

* address comments

* added comments
* add ops_moa_models

* add custom implementations

* Simplify custom implementation and format the code

* support SentenceTransformers

* add training datasets

* Update mteb/models/ops_moa_models.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* update training_datasets

---------

Co-authored-by: kunka.xgw <kunka.xgw@taobao.com>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
ci: cache ~/.cache/huggingface

Co-authored-by: sam021313 <40773225+sam021313@users.noreply.github.com>
…mplement ImageCoDe (#2468)

* reimplement ImageCoDe with ImageTextPairClassification

* add missing stats file
* feat: added pubmedbert model2vec models

* fix: attribute model_name

* fix: fixed commit hash for pubmed_bert model2vec models

* fix: changes requested in PR 2443
* add_nb_sbert_model

* Update nb_sbert.py

added n_parameters and release_date

* Update mteb/models/nb_sbert.py

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>

* Update nb_sbert.py

fix: make lint

* added nb_sbert to overview.py + ran make lint

* Update nb_sbert.py

Fix error: Input should be a valid date or datetime, month value is outside expected range of 1-12

---------

Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Automatically generated by python-semantic-release
* suppress logging warnings

* remove loggers

* return blocks

* rename function

* fix gme models

* add server name

* update after merge

* fix ruff
Automatically generated by python-semantic-release
rename VisionCentric to VisionCentricQA
Automatically generated by python-semantic-release
* Fix Task Lang Table

* added tasks.md

* fix
@isaac-chung isaac-chung requested a review from Samoed April 4, 2025 18:38
@isaac-chung isaac-chung changed the base branch from main to maeb April 4, 2025 18:38
@isaac-chung isaac-chung merged commit 5acab7f into maeb Apr 4, 2025
8 of 9 checks passed
@isaac-chung isaac-chung deleted the merge-main-into-maeb-20250404 branch April 4, 2025 19:39