
Conversation

@zhyncs zhyncs commented Sep 2, 2024

Motivation

Modifications

Checklist

  • Format your code according to the Contributor Guide.
  • Add unit tests as outlined in the Contributor Guide.
  • Update documentation as needed, including docstrings or example tutorials.


zhyncs commented Sep 2, 2024

# H100 TP 2, latest v0.2.15
python3 -m sglang.launch_server --model neuralmagic/Qwen2-72B-Instruct-FP8 --quantization fp8  --trust-remote-code --tp 2 --kv-cache-dtype fp8_e5m2
python3 -m sglang.bench_serving --backend sglang
[14:47:01 TP0] Exception in ModelTpServer:
Traceback (most recent call last):
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker.py", line 244, in exposed_step
    self.forward_step()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker.py", line 260, in forward_step
    self.forward_prefill_batch(new_batch)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker.py", line 507, in forward_prefill_batch
    sample_output, logits_output = self.model_runner.forward(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 584, in forward
    return self.forward_extend(batch)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 542, in forward_extend
    input_metadata = InputMetadata.from_schedule_batch(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/forward_batch_info.py", line 215, in from_schedule_batch
    ret.init_flashinfer_handlers(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/forward_batch_info.py", line 245, in init_flashinfer_handlers
    update_flashinfer_indices(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/forward_batch_info.py", line 374, in update_flashinfer_indices
    model_runner.flashinfer_prefill_wrapper_paged.begin_forward(
  File "/usr/local/lib/python3.10/dist-packages/flashinfer/prefill.py", line 832, in plan
    self._wrapper.plan(
RuntimeError: Failed to allocate memory for batch_prefill_tmp_v with size 599785472 and alignment 16 in AlignedAllocator

  File "/usr/lib/python3.10/multiprocessing/process.py", line 314, in _bootstrap
    self.run()
  File "/usr/lib/python3.10/multiprocessing/process.py", line 108, in run
    self._target(*self._args, **self._kwargs)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker.py", line 896, in run_tp_server
    model_server.exposed_step(recv_reqs)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker.py", line 244, in exposed_step
    self.forward_step()
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker.py", line 260, in forward_step
    self.forward_prefill_batch(new_batch)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/managers/tp_worker.py", line 507, in forward_prefill_batch
    sample_output, logits_output = self.model_runner.forward(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 584, in forward
    return self.forward_extend(batch)
  File "/usr/local/lib/python3.10/dist-packages/torch/utils/_contextlib.py", line 116, in decorate_context
    return func(*args, **kwargs)
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/model_runner.py", line 542, in forward_extend
    input_metadata = InputMetadata.from_schedule_batch(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/forward_batch_info.py", line 215, in from_schedule_batch
    ret.init_flashinfer_handlers(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/forward_batch_info.py", line 245, in init_flashinfer_handlers
    update_flashinfer_indices(
  File "/usr/local/lib/python3.10/dist-packages/sglang/srt/model_executor/forward_batch_info.py", line 374, in update_flashinfer_indices
    model_runner.flashinfer_prefill_wrapper_paged.begin_forward(
  File "/usr/local/lib/python3.10/dist-packages/flashinfer/prefill.py", line 832, in plan
    self._wrapper.plan(
RuntimeError: Failed to allocate memory for batch_prefill_tmp_v with size 599785472 and alignment 16 in AlignedAllocator

It works fine without --kv-cache-dtype fp8_e5m2. @ispobock @yzh119, could you help take a look?
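
For reference, a minimal sketch of how a flashinfer paged-prefill wrapper is typically constructed around a pre-allocated workspace buffer (the buffer size and wrapper setup below are illustrative assumptions, not the actual sglang code). The RuntimeError above is raised while plan()/begin_forward() tries to carve a ~600 MB temporary (batch_prefill_tmp_v) out of that workspace, which is why the failure only shows up once the fp8_e5m2 KV-cache path needs more scratch space than the buffer provides.

import torch
import flashinfer

# Illustrative sketch, not the sglang implementation: the prefill wrapper
# plans its temporaries (e.g. batch_prefill_tmp_v) inside this workspace
# buffer, so a buffer smaller than the ~600 MB requested in the traceback
# triggers the AlignedAllocator error seen above.
workspace_buffer = torch.empty(
    768 * 1024 * 1024, dtype=torch.uint8, device="cuda"  # size is hypothetical
)
prefill_wrapper = flashinfer.BatchPrefillWithPagedKVCacheWrapper(
    workspace_buffer, kv_layout="NHD"
)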

@zhyncs zhyncs removed the wip label Sep 2, 2024

zhyncs commented Sep 2, 2024

fix #1272

@zhyncs zhyncs self-assigned this Sep 2, 2024
@zhyncs zhyncs enabled auto-merge (squash) September 2, 2024 14:57
@zhyncs zhyncs disabled auto-merge September 2, 2024 15:18
@zhyncs zhyncs merged commit 2561ed0 into main Sep 2, 2024
10 checks passed
@zhyncs zhyncs deleted the night branch September 2, 2024 15:18
@zhyncs zhyncs mentioned this pull request Sep 2, 2024
@ispobock ispobock mentioned this pull request Sep 7, 2024
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025