
Conversation

@Ying1123 (Member) commented Mar 13, 2025

  • Use a faster kernel for the temperature = 0 (greedy) case (see the sketch below)
  • Support CUDA graph padding (see the second sketch further below)
  • Simplify redundant Python code
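
For the temperature = 0 item, the usual way this path gets faster is by replacing the softmax + multinomial sampling kernels with a plain argmax over the logits. The snippet below is a minimal sketch of that idea in PyTorch; the function name `sample_next_tokens` is illustrative and is not the actual sampler API in this repository.

```python
import torch

def sample_next_tokens(logits: torch.Tensor, temperature: float) -> torch.Tensor:
    """Illustrative sampler; logits has shape [batch_size, vocab_size]."""
    if temperature == 0:
        # Greedy decoding: a single argmax kernel, no softmax and no
        # multinomial sampling, which is cheaper per decode step.
        return torch.argmax(logits, dim=-1)
    # Otherwise fall back to temperature-scaled multinomial sampling.
    probs = torch.softmax(logits / temperature, dim=-1)
    return torch.multinomial(probs, num_samples=1).squeeze(-1)
```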

Llama 2 7B: 390 token/s -> 400 token/s with this PR
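
For the CUDA graph padding item: captured graphs replay with fixed tensor shapes, so a batch whose size falls between the captured sizes has to be padded up to the nearest captured size, with the extra rows dropped from the output afterwards. Below is a minimal sketch of that padding step; the captured batch sizes and the function name are assumptions for illustration, not values taken from this PR.

```python
import bisect
import torch

# Batch sizes for which CUDA graphs were captured (illustrative choice).
CAPTURED_BATCH_SIZES = [1, 2, 4, 8, 16, 32]

def pad_for_cuda_graph(input_ids: torch.Tensor) -> tuple[torch.Tensor, int]:
    """Pad a batch up to the nearest captured size so a fixed-shape graph
    can serve it; returns the padded batch and the real batch size so the
    caller can slice the padded rows off the output."""
    real_bs = input_ids.shape[0]
    idx = bisect.bisect_left(CAPTURED_BATCH_SIZES, real_bs)
    if idx == len(CAPTURED_BATCH_SIZES):
        # Batch is larger than any captured graph: run the eager path instead.
        return input_ids, real_bs
    padded_bs = CAPTURED_BATCH_SIZES[idx]
    if padded_bs == real_bs:
        return input_ids, real_bs
    pad = input_ids.new_zeros((padded_bs - real_bs, *input_ids.shape[1:]))
    return torch.cat([input_ids, pad], dim=0), real_bs
```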

Co-authored-by: Sehoon Kim <sehoon@x.ai>

@merrymercy merged commit 1b85929 into main on Mar 16, 2025
4 of 22 checks passed
@merrymercy deleted the ying-eagle branch on March 16, 2025 at 09:48