Conversation

ll0ruc
Contributor

@ll0ruc ll0ruc commented May 30, 2025

Checklist

Reason for dataset addition: R2MED (https://huggingface.co/R2MED), the first benchmark explicitly designed for reasoning-driven medical retrieval. More details in the paper and on the homepage.

  • I did not add a dataset, or if I did, I added the dataset checklist to the PR and completed it.
  • I have tested that the dataset runs with the mteb package.
  • I have run the following models on the task (adding the results to the PR). These can be run using the mteb run -m {model_name} -t {task_name} command.
    • sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
    • intfloat/multilingual-e5-small
  • I have checked that the performance is neither trivial (both models gain close to perfect scores) nor random (both models gain close to random scores).
  • I have considered the size of the dataset and reduced it if it is too big (2048 examples is typically large enough for most tasks)
  • I did not add a model, or if I did, I added the model checklist to the PR and completed it.
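As a hedged sketch, the checklist's model runs could be invoked like this via the mteb CLI (model names are taken from the checklist above; the task name is one of the R2MED tasks mentioned later in this thread and is used here purely as an illustration):

```shell
# Run the two baseline models from the checklist on an R2MED task.
# R2MEDBiologyRetrieval is an illustrative task name from this PR.
mteb run -m sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 -t R2MEDBiologyRetrieval
mteb run -m intfloat/multilingual-e5-small -t R2MEDBiologyRetrieval
```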

@KennethEnevoldsen
Contributor

Related to: embeddings-benchmark/results#209

ll0ruc and others added 2 commits June 3, 2025 18:23
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
Co-authored-by: Roman Solomatin <samoed.roman@gmail.com>
@KennethEnevoldsen
Contributor

@ll0ruc seems like linting fails. You can fix this by running make lint

@ll0ruc
Contributor Author

ll0ruc commented Jun 5, 2025

@ll0ruc seems like linting fails. You can fix this by running make lint

I ran ruff format ./mteb/benchmarks/benchmarks.py and ruff check --fix on the three changed files (./mteb/benchmarks/benchmarks.py, ./mteb/tasks/Retrieval/__init__.py, ./mteb/tasks/Retrieval/eng/R2MEDRetrieval.py), compared them against mteb, and updated them, but linting still seems to fail. I don't know how to solve it.
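For context, the fix suggested above is to run the repository's lint target rather than per-file ruff invocations, so that the exact same checks as CI are applied to the whole tree. A minimal sketch (the manual ruff commands are an assumption about what the Makefile target does):

```shell
# Run the repo's lint target from the repository root; this is what CI expects.
make lint
# Roughly equivalent manual invocation (assumption about the Makefile's contents):
ruff format .
ruff check . --fix
```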

@KennethEnevoldsen KennethEnevoldsen changed the title Add R2MED Benchmark fix: Add R2MED Benchmark Jun 5, 2025
@KennethEnevoldsen
Contributor

@ll0ruc seems like it passed - I have enabled auto-merge on this one. @ll0ruc, given that it is now multiple tasks, you will need to rerun the models to obtain the correct result format.

@KennethEnevoldsen KennethEnevoldsen enabled auto-merge (squash) June 5, 2025 13:23
@ll0ruc
Contributor Author

ll0ruc commented Jun 5, 2025

@ll0ruc seems like it passed - I have enabled auto-merge on this one. @ll0ruc, given that it is now multiple tasks, you will need to rerun the models to obtain the correct result format.

OK, I will upload the new results in embeddings-benchmark/results#212 according to the current code version.

auto-merge was automatically disabled June 5, 2025 15:24

Head branch was pushed to by a user without write access

@Samoed Samoed changed the title fix: Add R2MED Benchmark dataset: Add R2MED Benchmark Jun 5, 2025
@KennethEnevoldsen
Contributor

The tests still fail with the error:

The metadata of the following datasets is not filled: ['R2MEDBiologyRetrieval', 'R2MEDBioinformaticsRetrieval', 'R2MEDMedicalSciencesRetrieval', 'R2MEDMedXpertQAExamRetrieval', 'R2MEDMedQADiagRetrieval', 'R2MEDPMCTreatmentRetrieval', 'R2MEDPMCClinicalRetrieval', 'R2MEDIIYiClinicalRetrieval'].

@KennethEnevoldsen KennethEnevoldsen merged commit 631b4ef into embeddings-benchmark:main Jun 9, 2025
4 of 9 checks passed
@KennethEnevoldsen
Contributor

I attempted the merge, since I thought the error was on our side. That is not the case; we are still missing data. Opened a new PR at #2795 (can't re-open this one).
