benchmark [ADD] VN-MTEB benchmark and Leaderboard #2995

Merged
merged 14 commits into from
Aug 20, 2025

BaoLocPham
Contributor

Hi, this PR is related to #2994 and #2964.
It was created with the intent of extending MTEB to Vietnamese.

  • I'm aware that my dataset was not created by humans.
  • This is version 1. After this benchmark and dataset go online, I will start on version 2, collecting and processing more datasets for Vietnamese embeddings.

@KennethEnevoldsen
Contributor

Waiting on: #2964

@BaoLocPham BaoLocPham marked this pull request as ready for review August 9, 2025 15:48
@BaoLocPham
Contributor Author

Hi @KennethEnevoldsen, I have named my benchmark VN-MTEB. Is this okay, or do I need to change it to follow your convention?

Contributor

@KennethEnevoldsen KennethEnevoldsen left a comment

I think the current name is fine

Before we can add it to the leaderboard, we will need to run some models on it, as it is currently empty.

You can render the leaderboard using:

make run-leaderboard

You should see the dashboard as follows:

Screenshot 2025-08-17 at 11 08 59

If you want, we can start by simply merging the benchmark (without adding it to the leaderboard).

@BaoLocPham
Contributor Author

Hi @KennethEnevoldsen, I suppose that after the success of PR embeddings-benchmark/results#257, my sub-leaderboard is now ready to merge, right?
My local benchmark now looks like this:
Screenshot 2025-08-20 at 22 47 04

@KennethEnevoldsen
Contributor

Yes indeed! Great to have it!

@KennethEnevoldsen KennethEnevoldsen merged commit 0a6e855 into embeddings-benchmark:main Aug 20, 2025
9 checks passed
2 participants