merrymercy
Contributor

@merrymercy merrymercy commented Dec 31, 2024

This is used for speculative decoding in #2150

Co-authored-by: yukavio <kavioyu@gmail.com>

@merrymercy merrymercy changed the title Support target verify in the attention backend Support target model verification in the attention backend Dec 31, 2024
@merrymercy merrymercy merged commit f44d143 into main Dec 31, 2024
16 of 17 checks passed
@merrymercy merrymercy deleted the pr-attn branch December 31, 2024 06:58
@zhyncs
Member

zhyncs commented Dec 31, 2024

@merrymercy test_dp_attention failed after this PR

python3 test_dp_attention.py

@zhyncs
Member

zhyncs commented Dec 31, 2024

FYI: after reverting this commit (git revert f44d143949f6c6fbca6cb96c52381b8bc1769a87), the test works well. cc @ispobock

@merrymercy merrymercy changed the title Support target model verification in the attention backend Eagle speculative decoding part 1. Support target model verification in the attention backend Dec 31, 2024
@merrymercy merrymercy changed the title Eagle speculative decoding part 1. Support target model verification in the attention backend Eagle speculative decoding part 1: Support target model verification in the attention backend Dec 31, 2024
@merrymercy
Contributor Author

The hanging issue is fixed by #2684

XiaotongJiang pushed a commit to XiaotongJiang/sglang that referenced this pull request Jan 3, 2025
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025