
Conversation

@SighingSnow

  • I have outlined why this dataset fills an existing gap in mteb.

  • I have tested that the dataset runs with the mteb package.

  • I have run the following models on the task (adding the results to the PR). These can be run using the mteb run -m {model_name} -t {task_name} command; the expanded invocations are sketched after this list.

    • sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2
    • intfloat/multilingual-e5-small
  • I have checked that the performance is neither trivial (both models achieving close-to-perfect scores) nor random (both models achieving close-to-random scores).
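For reference, the checklist command expands to the following invocations for the two models above; {task_name} is kept as a placeholder, since the concrete IFIR task names depend on how they are registered in mteb:

mteb run -m sentence-transformers/paraphrase-multilingual-MiniLM-L12-v2 -t {task_name}
mteb run -m intfloat/multilingual-e5-small -t {task_name}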

@SighingSnow (Author) commented Jun 12, 2025

Could anyone help me review my code? If the code looks good, I will run some evaluations and provide results on this benchmark.
Also, I will close the previous PR, as it's outdated.

@SighingSnow SighingSnow mentioned this pull request Jun 12, 2025
@Samoed changed the title from "Add IFIR benchmark." to "dataset: Add IFIR benchmark" on Jun 12, 2025
scores_dict = {"level_1": [], "level_2": [], "level_3": []}
for k, v in scores.items():
if "v1" in k:
scores_dict["level_1"].append(v["ndcg_cut_20"])
Member

Suggested change
-scores_dict["level_1"].append(v["ndcg_cut_20"])
+scores_dict["level_1"].append(v[self.metadata.main_score])
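The suggestion generalizes: reading the key from self.metadata.main_score keeps the aggregation in sync with whatever main score the task declares, instead of hard-coding ndcg_cut_20. A minimal sketch of the surrounding loop under that change (the v2/v3 branches are assumed by symmetry with the v1 branch shown above):

scores_dict = {"level_1": [], "level_2": [], "level_3": []}
for k, v in scores.items():
    # Route each subtask's score to its level bucket.
    if "v1" in k:
        scores_dict["level_1"].append(v[self.metadata.main_score])
    elif "v2" in k:
        scores_dict["level_2"].append(v[self.metadata.main_score])
    elif "v3" in k:
        scores_dict["level_3"].append(v[self.metadata.main_score])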

Comment on lines 25 to 29
task_subtypes=None,
license=None,
annotations_creators=None,
dialect=None,
sample_creation=None,
Member

Can you fill in the missing metadata?
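For context, these are standard TaskMetadata fields. The values below are illustrative only; the correct entries depend on the IFIR dataset and should be checked against its paper and license:

task_subtypes=[],                # fill with the matching subtype(s), if any
license="apache-2.0",            # assumption: use the dataset's actual license identifier
annotations_creators="derived",  # assumption: pick the value matching how labels were produced
dialect=[],
sample_creation="found",         # assumption: "created" if queries were written for the benchmark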

Author

Oh, no problem! Thanks for pointing out the errors!

@Samoed (Member) commented Jun 12, 2025

Also, you should calculate the metadata of your tasks. You can do it with:

import mteb

# Compute descriptive statistics for every task in the benchmark.
benchmark = mteb.get_benchmark("IFIR")
for task in benchmark.tasks:
    task.calculate_metadata_metrics()

And format the BibTeX citations with:

python scripts/format_citations.py benchmarks
python scripts/format_citations.py tasks

@SighingSnow (Author)
I will provide some evaluation results later. Could you please wait 1-3 days?

@Samoed requested a review from @KennethEnevoldsen on June 27, 2025 07:33
@@ -0,0 +1,64 @@
from __future__ import annotations

from mteb.abstasks.TaskMetadata import TaskMetadata
Member

I think you should merge v2 into your branch and change the import to:

Suggested change
-from mteb.abstasks.TaskMetadata import TaskMetadata
+from mteb.abstasks.task_metadata import TaskMetadata
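If the branch has to work against both layouts while the v2 merge is in flight, a guarded import is one option (a sketch, not something the review asked for):

try:
    from mteb.abstasks.task_metadata import TaskMetadata  # v2 module layout
except ImportError:
    from mteb.abstasks.TaskMetadata import TaskMetadata  # pre-v2 layout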

@SighingSnow (Author) commented Jun 27, 2025

Yes, thank you for pointing that out. I have tried the v2.0.0 branch without my revision, but it still fails. Is there a bug in the v2.0.0 code?

I used the following commands:

git clone https://github.com/embeddings-benchmark/mteb.git
git switch v2.0.0
pip install -e .
python run_mteb.py 

Then I get an error from
https://github.com/embeddings-benchmark/mteb/blob/v2.0.0/mteb/models/colpali_models.py#L11
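A minimal way to reproduce the failure in isolation, without going through run_mteb.py, is to re-run the failing import directly (a diagnostic sketch):

import importlib

# If this raises, the problem is in colpali_models.py itself, not in the IFIR files.
importlib.import_module("mteb.models.colpali_models")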

Contributor

@SighingSnow it seems like the error is still on the import (did you push the changes?); once that's fixed we can take a look at the new error. Can you also run the lint?

Author

> it seems like the error is still on the import (did you push the changes?)

I just pushed a new version, but it seems that the error is not from the IFIR-related files.

> Can you also run the lint?

Yes, of course. I have fixed the formatting in the new commit.
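For reference, the lint can be run locally via the repository's Makefile (assuming the standard mteb tooling; check the contributing guide if the target name differs):

make lint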

@KennethEnevoldsen (Contributor) left a comment

Code + metadata look fine to me.

@SighingSnow force-pushed the v2.0.0 branch 2 times, most recently from b44cbb3 to 7e368e2, on June 28, 2025 07:53
Signed-off-by: SighingSnow <songtingyu220@gmail.com>
@isaac-chung (Collaborator)

I see some force-pushes after approval. @Samoed, would you mind checking if this is ready to merge?

@Samoed merged commit ed69e60 into embeddings-benchmark:v2.0.0 on Jun 28, 2025
8 checks passed
@SighingSnow deleted the v2.0.0 branch on June 28, 2025 10:43