
Conversation

@Chen-XiaoBing (Contributor) commented Feb 19, 2025

Motivation

In the grouped top-k implementation in the MoE layer, the scores of masked groups are set to 0, which may lead to selecting incorrect experts in certain scenarios.

Modifications

In SGLang, the scores of masked groups are set to 0. If the scores in the unmasked groups are negative, the masked entries (at 0) can outrank them, so top-k may select experts from masked groups, leading to incorrect or suboptimal expert selection. See the sketch below.
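A minimal sketch of the failure mode (not SGLang's actual implementation; the tensor shapes, the `masked_fill` approach, and the `-inf` fill as the fix are illustrative assumptions). With all unmasked scores negative, filling masked positions with 0 lets them win the top-k, while filling with `-inf` keeps them out:

```python
import torch

# Toy router scores for 4 experts; all valid scores are negative.
scores = torch.tensor([[-0.3, -0.1, -0.5, -0.2]])
# Suppose grouped top-k keeps only the group holding experts 0 and 1.
keep = torch.tensor([[True, True, False, False]])

# Masking with 0: the masked entries (0.0) outrank every negative
# valid score, so top-2 incorrectly picks the masked experts 2 and 3.
zero_masked = scores.masked_fill(~keep, 0.0)
print(zero_masked.topk(2, dim=-1).indices)  # tensor([[2, 3]]) -- wrong

# Masking with -inf: masked entries can never be selected, so top-2
# correctly picks experts 1 and 0 from the unmasked group.
inf_masked = scores.masked_fill(~keep, float("-inf"))
print(inf_masked.topk(2, dim=-1).indices)  # tensor([[1, 0]]) -- correct
```

Note that if the routing scores are softmax or sigmoid outputs, they are non-negative and the 0-fill happens to be harmless; the problem only bites when scores can go negative (e.g., after a score correction bias is applied).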


@zhyncs zhyncs merged commit d5d80ab into sgl-project:main Feb 20, 2025
@zhyncs (Member) commented Feb 20, 2025

Thanks!

@Chen-XiaoBing deleted the fix-moe-topk branch February 20, 2025 23:58
@ispobock (Collaborator) commented Feb 21, 2025

@Chen-XiaoBing Actually, in the official modeling code it's set to 0, and we didn't see accuracy issues with the previous version. So I am not sure whether this change might introduce an accuracy issue.

cc: @zhyncs
