zhaochenyang20
Collaborator

Checklist Before Starting

  • Search for similar PR(s).

What does this PR do?

I updated several comments and the usage of the `with` block. One question is left here:

        # The format of the model weights to be loaded.
        # TODO(chenyang): why we set `load_format` to `dummy`?

        # “auto” will try to load the weights in the safetensors format and
        # fall back to the pytorch bin format if safetensors format is not
        # available.
        # “pt” will load the weights in the pytorch bin format.
        # “safetensors” will load the weights in the safetensors format.
        # “dummy” will initialize the weights with random values, which is
        # mainly for profiling.
        # “bitsandbytes” will load the weights using bitsandbytes quantization.
        # “npcache” will load the weights in pytorch format and store a numpy
        # cache to speed up the loading.
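To make the options in the comment above easier to compare, here is a minimal sketch that restates them as a lookup table. `LOAD_FORMATS` and `loads_real_weights` are hypothetical names made up for illustration; they are not part of vLLM or of this PR, and the descriptions are copied from the comment rather than from the loader's actual code.

```python
# Hypothetical helper for illustration only: it restates the comment above
# as data, so the options can be looked up or validated in one place.
LOAD_FORMATS = {
    "auto": "try safetensors first, fall back to the pytorch bin format",
    "pt": "pytorch bin format",
    "safetensors": "safetensors format",
    "dummy": "random weights, mainly for profiling",
    "bitsandbytes": "load using bitsandbytes quantization",
    "npcache": "pytorch format plus a numpy cache to speed up loading",
}


def loads_real_weights(load_format: str) -> bool:
    """Return True if the option reads weights from a checkpoint.

    Per the comment above, "dummy" is the only option that initializes
    the weights with random values instead of loading them.
    """
    if load_format not in LOAD_FORMATS:
        raise ValueError(
            f"unknown load_format {load_format!r}; "
            f"expected one of {sorted(LOAD_FORMATS)}"
        )
    return load_format != "dummy"


if __name__ == "__main__":
    for name, description in LOAD_FORMATS.items():
        print(f"{name:>12}: {description} (real weights: {loads_real_weights(name)})")
```

Read this way, the TODO comes down to one point: with `dummy` the engine starts from random values, so real weights would have to arrive some other way before generation is meaningful.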

@zhaochenyang20
Collaborator Author

Well, I am cool with the load_format, but I think we seldom use dummy 😂

non_pad_index = torch.nonzero(prompt_token_ids != pad_token_id, as_tuple=False)[0][0]
non_pad_index = torch.nonzero(prompt_token_ids != pad_token_id, as_tuple=False)[0][
0
]
Owner

That's not graceful, plz roll back to single line
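For readers unfamiliar with the expression under review, here is a dependency-free sketch of what it computes, assuming `prompt_token_ids` is a 1-D, left-padded sequence. `first_non_pad_index` and `strip_left_padding` are hypothetical names used only here; the real code works on a torch tensor and uses `torch.nonzero(...)[0][0]` instead of a loop.

```python
from typing import List


def first_non_pad_index(prompt_token_ids: List[int], pad_token_id: int) -> int:
    """Return the position of the first token that is not padding.

    Plain-Python counterpart of
    torch.nonzero(prompt_token_ids != pad_token_id, as_tuple=False)[0][0]
    for a 1-D input.
    """
    for index, token_id in enumerate(prompt_token_ids):
        if token_id != pad_token_id:
            return index
    # The torch expression would fail with an IndexError here; this sketch
    # raises ValueError with an explicit message instead.
    raise ValueError("prompt contains only padding tokens")


def strip_left_padding(prompt_token_ids: List[int], pad_token_id: int) -> List[int]:
    """Drop the leading padding tokens and keep everything after them."""
    start = first_non_pad_index(prompt_token_ids, pad_token_id)
    return prompt_token_ids[start:]


if __name__ == "__main__":
    padded = [0, 0, 0, 17, 42, 0, 9]
    print(first_non_pad_index(padded, pad_token_id=0))
    print(strip_left_padding(padded, pad_token_id=0))
```

Only the leading run of padding is skipped; a pad token that appears later in the prompt is left in place.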

@zyzshishui zyzshishui merged commit 7be85f7 into refactor May 25, 2025
2 checks passed
zyzshishui pushed a commit that referenced this pull request May 27, 2025
Co-authored-by: Bihan  Rana <bihan@Bihans-MacBook-Pro.local>
Co-authored-by: peterschmidt85 <andrey.cheptsov@gmail.com>
2 participants