coolhok (Contributor) commented Jan 4, 2025

Motivation

Fix several Eagle2 bugs that appear under multiple concurrent requests:

  • In prepare_for_decode, hidden_states from different concurrent requests interfere with each other.
  • In merge_batch, there is some probability of a crash when req1 has finished and req2 is arriving.
  • Add filter_batch to handle AbortReq.
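
As a rough illustration of the merge_batch and filter_batch cases above, here is a minimal sketch in plain Python. It is not the code from this PR: `DraftState`, its `merge` and `filter` methods, and the list-of-rows representation are hypothetical stand-ins for the speculative-decoding state, chosen only to show why a side whose hidden_states is None has to be handled explicitly.

```python
from dataclasses import dataclass
from typing import List, Optional


@dataclass
class DraftState:
    """Hypothetical stand-in for per-batch draft state (not the sglang class)."""

    # One row of hidden states per request, or None if the batch has none yet.
    hidden_states: Optional[List[List[float]]]

    def merge(self, other: "DraftState") -> None:
        # If the other side has no hidden states, there is nothing to append.
        if other.hidden_states is None:
            return
        # If this side has none (e.g. its only request already finished),
        # adopt the other side's rows instead of concatenating with None.
        if self.hidden_states is None:
            self.hidden_states = list(other.hidden_states)
            return
        self.hidden_states = self.hidden_states + other.hidden_states

    def filter(self, keep_indices: List[int]) -> None:
        # Keep only the rows of requests that are still running
        # (for example, after an abort removes some requests).
        if self.hidden_states is None:
            return
        self.hidden_states = [self.hidden_states[i] for i in keep_indices]


finished = DraftState(hidden_states=None)            # req1 is done
incoming = DraftState(hidden_states=[[0.1], [0.2]])  # new decode requests
finished.merge(incoming)
finished.filter([1])
```

Without the None checks, the concatenation in `merge` would raise a TypeError, which is one plausible shape for the kind of crash described here.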

Worked with @jjjjohnson and @bisunny.

Depends on #2723.

Modifications

  • In prepare_for_decode, add the batch offset to selected_input_index when there are multiple concurrent requests.
  • Handle the case where req1 has finished and req2 is a new decode request, so req1's hidden_states is None.
  • Add filter_batch to handle AbortReq.
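
The batch-offset fix in the first item can be sketched as follows. This is an illustration under stated assumptions, not the PR's implementation: `flatten_selected_indices` and `tokens_per_request` are hypothetical names, and the sketch assumes every request owns a fixed-size, contiguous slice of one flattened hidden-state buffer.

```python
from typing import List


def flatten_selected_indices(
    per_request_indices: List[List[int]], tokens_per_request: int
) -> List[int]:
    """Map per-request local indices to positions in a flattened batch buffer.

    Assumption: request b owns slots
    [b * tokens_per_request, (b + 1) * tokens_per_request).
    """
    flat: List[int] = []
    for batch_idx, local_indices in enumerate(per_request_indices):
        # Without this offset, every request would index into request 0's slots.
        offset = batch_idx * tokens_per_request
        flat.extend(offset + i for i in local_indices)
    return flat


# Two concurrent requests, each owning 4 slots in a shared flat buffer.
hidden = [f"req0_h{i}" for i in range(4)] + [f"req1_h{i}" for i in range(4)]
selected = flatten_selected_indices([[1, 3], [0, 2]], tokens_per_request=4)
picked = [hidden[i] for i in selected]
print(picked)  # ['req0_h1', 'req0_h3', 'req1_h0', 'req1_h2']
```

If the offset were omitted, the second request's local indices 0 and 2 would select req0_h0 and req0_h2, which matches the interference between requests described in the Motivation.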

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.

@coolhok changed the title from "[Eagle2]Some errors when dealing with multiple concurrency." to "[Eagle2]Fix multiple concurrent request crashes" on Jan 4, 2025
@zhyncs mentioned this pull request on Jan 5, 2025
merrymercy (Contributor) commented Jan 6, 2025

@coolhok Thanks for the contribution. I left some comments at #2723. Let us merge that first and then I will review this.
In the meantime, can you add a test case that can trigger the bug in the previous code?

merrymercy (Contributor) commented

#2723 is merged. Please fix the conflicts. Thanks!

cc @yukavio Can you also review this?

finished_reason, FINISH_ABORT
):
abort_count += 1
if not self.reqs[i].finished():

Why can't the code under this if statement be placed under the if statement of line 1055?

coolhok (Contributor, Author) commented Jan 9, 2025

My offline testing shows that the AbortReq issue is already resolved on the main branch. I am checking whether filter_batch still needs to be modified.

@zhyncs requested a review from yukavio on January 9, 2025 11:59
yukavio (Contributor) commented Jan 10, 2025

LGTM cc @zhyncs @merrymercy

@merrymercy merged commit a47bf39 into sgl-project:main on Jan 10, 2025 (15 checks passed)
merrymercy (Contributor) commented

@coolhok @yukavio Thanks. It is merged.

merrymercy (Contributor) commented

@coolhok Can you take a look at this failed test case? https://github.com/sgl-project/sglang/actions/runs/12724186117/job/35470111598?pr=2839#step:4:545

coolhok (Contributor, Author) commented Jan 13, 2025

@coolhok Can you take a look at this failed test case? https://github.com/sgl-project/sglang/actions/runs/12724186117/job/35470111598?pr=2839#step:4:545

If health_generate in popen_launch_server runs before _wait_and_warmup has completed, the API server crashes. Initializing 'finic_xend_len' to [] solves the problem.
