Merge main maeb 07 10 #2894
Merged
* add custom instructions
* fixed
* lint
* fix last instruction
---------
Co-authored-by: Kolodin Egor <eikolodin@sberbank.ru>
Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* add Seed-1.6-embedding model
* Update seed_1_6_embedding_models.py
* update model meta info
* support image encoder interface
* error fix
* fix: format seed_1_6_embedding_models.py with Ruff

* fix: Update model selection for the leaderboard
  fixes #2834
  This removed the lower bound selection, but generally I don't think people should care about the models being too small.
* fix 1M --> 1B
* format
* rename model_size -> max_model_size

update seed1.6 model training data info

* add model meta
* linting
* fix: add check for code lora
* fix: apply review comments

* Adding Hinvec Model's Meta data.
* Adding hinvec_model.py
* Update mteb/models/hinvec_models.py
  Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>
* formated code with Black and lint with Ruff
---------
Co-authored-by: Kenneth Enevoldsen <kennethcenevoldsen@gmail.com>

* nvidia_llama_nemoretriever_colembed
* correct 3b reference
* lint fix
* add training data and license for nvidia/llama_nemoretriever_colembed
* lint
---------
Co-authored-by: Isaac Chung <chungisaac1217@gmail.com>

* fix sbert `v5`
* add comment

* add listconranker modelmeta
* fix bugs
* use linter
* lint
---------
Co-authored-by: Roman Solomatin <36135455+Samoed@users.noreply.github.com>

* feat: add KaLM_Embedding_X_0605 in kalm_models
* Update kalm_models.py for lint format
---------
Co-authored-by: xinshuohu <xinshuohu@tencent.com>

comment kalm model

* Add JaCWIR and JQaRA for reranking
* Fix ANLP Journal datasets
* Add NLPJournalAbsArticleRetrieval and JaCWIRRetrieval
* tackle test cases
* Remove _evaluate_subset usage
* Separate v1 and v2
* Update info for NLP Journal datasets

* add tooka v2s
* add mcinext models
* update mcinext.py
* Apply PR review suggestions
* Update mteb/models/mcinext_models.py
---------
Co-authored-by: mehran <mehan.sarmadi16@gmail.com>
Co-authored-by: Kenneth Enevoldsen <kenevoldsen@pm.me>

* Added DadoEvalCoarseClassification
* Removed unnecessary columns from DadoEvalCoarseClassification
* Added EmitClassification task
* added SardiStanceClassification task
* Added GeoLingItClassification task
* Added DisCoTexPairClassification tasks
* Added EmitClassification, DadoEvalCoarseClassification, GeoLingItClassification, SardiStanceClassification inside the inits
* changed import in DisCoTexPairClassification
* removed GeoLingItClassification dataset
* fixed citation formatting, missing metadata parameters and lint formatting
* - Added XGlueWRPReranking task
  - Added missing __init__.py files
* fixed metadata in XGlueWRPReranking
* Added MKQARetrieval task
* fixed type in XGlueWRPReranking
* changed MKQARetrieval from cross-lingual to monolingual
* formatted MKQARetrieval file
* removed unused const
---------
Co-authored-by: Mattia Sangermano <MattiaSangermano@users.noreply.huggingface.co>

fix datasets version
# Conflicts:
#	README.md
#	docs/adding_a_model.md
#	docs/mieb/readme.md
#	mteb/abstasks/Audio/AbsTaskAudioZeroshotClassification.py
#	mteb/abstasks/TaskMetadata.py
#	mteb/benchmarks/benchmarks.py
#	mteb/custom_validators.py
#	mteb/descriptive_stats/BitextMining/WebFAQBitextMiningQAs.json
#	mteb/descriptive_stats/BitextMining/WebFAQBitextMiningQuestions.json
#	mteb/descriptive_stats/Image/Any2AnyRetrieval/ROxfordEasyI2IRetrieval.json
#	mteb/descriptive_stats/Image/Any2AnyRetrieval/ROxfordHardI2IRetrieval.json
#	mteb/descriptive_stats/Image/Any2AnyRetrieval/ROxfordMediumI2IRetrieval.json
#	mteb/descriptive_stats/Image/Any2AnyRetrieval/RParisEasyI2IRetrieval.json
#	mteb/descriptive_stats/Image/Any2AnyRetrieval/RParisHardI2IRetrieval.json
#	mteb/descriptive_stats/Image/Any2AnyRetrieval/RParisMediumI2IRetrieval.json
#	mteb/models/overview.py
#	pyproject.toml
#	scripts/mmteb_create_author_list.ipynb
#	scripts/task_selection/europe_tasks.csv
#	scripts/task_selection/indic_tasks.csv
#	scripts/task_selection/mult_tasks.csv
#	scripts/task_selection/task_selection_eng_lite.ipynb
#	scripts/task_selection/task_selection_eu.ipynb
#	scripts/task_selection/task_selection_example.ipynb
#	scripts/task_selection/task_selection_indic.ipynb
#	scripts/task_selection/task_selection_mult.ipynb
#	tests/test_benchmark/mock_models.py
isaac-chung approved these changes on Jul 10, 2025
Merge main branch fix fixed datasets version