Conversation

@Samoed (Member) commented Feb 2, 2025

To make the old leaderboard fully compatible with the new one, we can add model memory usage as a property on ModelMeta.
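
Concretely, the idea is a derived property on ModelMeta. A minimal sketch, assuming the memory is estimated from n_parameters (the property name memory_usage_mb and the FP32 assumption are illustrative, not necessarily the final implementation):

from __future__ import annotations
from dataclasses import dataclass

@dataclass
class ModelMeta:  # illustrative stand-in for mteb's ModelMeta
    n_parameters: int | None = None

    @property
    def memory_usage_mb(self) -> float | None:
        # Memory is derived from the parameter count; an unknown count gives None.
        if self.n_parameters is None:
            return None
        # Assume FP32 weights: 4 bytes per parameter, reported in mebibytes.
        return self.n_parameters * 4 / 1024**2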

Code Quality

  • Code Formatted: Format the code using make lint to maintain consistent style.

Documentation

  • Updated Documentation: Add or update documentation to reflect the changes introduced in this PR.

Testing

  • New Tests Added: Write tests to cover new functionality. Validate with make test-with-coverage.
  • Tests Passed: Run tests locally using make test or make test-with-coverage to ensure no existing functionality is broken.

@isaac-chung (Collaborator) commented Feb 3, 2025

Have you already discussed this with @KennethEnevoldsen or @x-tabdeveloping? I'd prefer not to have this kind of back-and-forth: adding something, removing it, then adding it back. #1729

@Samoed (Member, Author) commented Feb 3, 2025

Opened issue #1935 for this. It was previously removed because it was not being filled in and should instead be auto-calculated from the model's parameter count.

@isaac-chung (Collaborator) left a comment

I understand now: we're not adding this back to be filled in manually, we're auto-calculating it here. Great stuff! Thanks for the initiative! Just two clarifications, then I think we're ready.

@isaac-chung (Collaborator) left a comment

Sweet!

@x-tabdeveloping (Collaborator) left a comment

Left a couple of minor comments. Thanks for adding this!

if self.n_parameters is None:
    return None
# Model memory in bytes. For FP32, each parameter is 4 bytes.
model_memory_bytes = self.n_parameters * 4
Collaborator:

Is this a good assumption to make? Do all models have FP32 parameters?

Member Author (@Samoed):

Large models (>1B parameters) are usually loaded in fp16/bf16, but I don't know how to handle this automatically.
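
For reference, a dtype-aware estimate would only change the bytes-per-parameter factor. A sketch (the dtype-to-bytes mapping and the FP32 default are assumptions, not something decided in this PR):

# Approximate bytes per parameter for common weight dtypes.
BYTES_PER_PARAM = {"F32": 4, "F16": 2, "BF16": 2, "I8": 1}

def estimate_memory_mb(n_parameters: int, dtype: str = "F32") -> float:
    # Memory in mebibytes for a given parameter count and weight dtype.
    return n_parameters * BYTES_PER_PARAM[dtype] / 1024**2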

Collaborator:

I suppose you could get this information using huggingface_hub.hf_api.get_safetensors_metadata
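
For example, the safetensors metadata reports how many parameters are stored in each dtype, which could drive the estimate. A sketch, assuming the repository publishes safetensors weights (the dtype byte widths are the usual sizes, and fetch_memory_mb is a hypothetical helper name):

from huggingface_hub import HfApi

def fetch_memory_mb(repo_id: str) -> float:
    # parameter_count maps dtype names (e.g. "F32", "BF16") to parameter counts.
    meta = HfApi().get_safetensors_metadata(repo_id)
    bytes_per_dtype = {"F64": 8, "F32": 4, "F16": 2, "BF16": 2, "I8": 1}
    total_bytes = sum(
        count * bytes_per_dtype.get(dtype, 4)
        for dtype, count in meta.parameter_count.items()
    )
    return total_bytes / 1024**2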

Collaborator:

And then make it a cached_property so that it doesn't have to be fetched every time.
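
Something along these lines, so the Hub is queried at most once per ModelMeta instance. A sketch reusing the hypothetical fetch_memory_mb helper above; model_name stands in for however ModelMeta stores the Hub repo id:

from dataclasses import dataclass
from functools import cached_property

@dataclass
class ModelMeta:  # illustrative stand-in
    model_name: str

    @cached_property
    def memory_usage_mb(self) -> float:
        # Computed on first access, then cached on the instance.
        return fetch_memory_mb(self.model_name)  # see the previous sketch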

Member Author (@Samoed):

I think integrating this could slow down leaderboard building. Maybe we could set memory usage manually instead, or record the number of parameters for each model's weights?

Collaborator:

I agree. Perhaps we could fetch all of these in a one-off script and keep the counts in ModelMeta manually?
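
A one-off script in that spirit could print the values to paste into each ModelMeta. A sketch, assuming a registry helper like mteb.get_model_metas() and the hypothetical fetch_memory_mb helper above; exact names may differ from the codebase:

import mteb

for meta in mteb.get_model_metas():
    try:
        # meta.name is the Hub repo id for most models, e.g. "intfloat/e5-small".
        print(f"{meta.name}: {fetch_memory_mb(meta.name):.0f} MiB")
    except Exception as exc:  # e.g. no safetensors metadata available for this repo
        print(f"{meta.name}: could not fetch ({exc})")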

Member Author (@Samoed):

Calculated them

@Samoed (Member, Author) commented Feb 6, 2025

@x-tabdeveloping Can this PR be merged?

@Samoed changed the title from "feat: add model memory usage" to "add model memory usage" on Feb 7, 2025
@KennethEnevoldsen (Contributor) left a comment

Looks good on my end, great addition. Do we want to create a column for it on the leaderboard? If so, let's open an issue for that.

@Samoed (Member, Author) commented Feb 7, 2025

@KennethEnevoldsen I've already created issue #1935

Merge commit resolving conflicts in mteb/models/gritlm_models.py.
@Samoed enabled auto-merge (squash) February 7, 2025 16:15
@Samoed merged commit e46539a into main Feb 7, 2025
9 checks passed
@Samoed deleted the add_memory_usage branch February 7, 2025 16:26