Conversation

merrymercy
Contributor

No description provided.

@merrymercy merrymercy merged this pull request into main Jan 16, 2024
@merrymercy merrymercy deleted the readme branch January 16, 2024 05:37
merrymercy added a commit that referenced this pull request Jan 16, 2024
Ying1123 pushed a commit that referenced this pull request Sep 13, 2024
timethink pushed a commit to timethink/sglang that referenced this pull request Mar 9, 2025
yanbing-j pushed a commit to yanbing-j/sglang that referenced this pull request Mar 12, 2025
* add a sgl_kernel.cpu wrapper for CPU OPs in sgl-kernel

* add wrapper for attention OPs

* set default value of is_vnni to True
chunyuan-w added a commit to chunyuan-w/sglang that referenced this pull request Mar 14, 2025
* add a sgl_kernel.cpu wrapper for CPU OPs in sgl-kernel

* add wrapper for attention OPs

* set default value of is_vnni to True
NorthmanPKU pushed a commit to NorthmanPKU/sglang that referenced this pull request May 16, 2025
yanbing-j pushed a commit to yanbing-j/sglang that referenced this pull request May 16, 2025
* add a sgl_kernel.cpu wrapper for CPU OPs in sgl-kernel

* add wrapper for attention OPs

* set default value of is_vnni to True
chunyuan-w added a commit to chunyuan-w/sglang that referenced this pull request May 28, 2025
* add a sgl_kernel.cpu wrapper for CPU OPs in sgl-kernel

* add wrapper for attention OPs

* set default value of is_vnni to True
yanbing-j pushed a commit to yanbing-j/sglang that referenced this pull request May 30, 2025
* add a sgl_kernel.cpu wrapper for CPU OPs in sgl-kernel

* add wrapper for attention OPs

* set default value of is_vnni to True
sleepcoo pushed a commit to shuaills/sglang that referenced this pull request Jun 24, 2025
fix model loading and add eagle3 inference
yichiche pushed a commit to yichiche/sglang that referenced this pull request Jul 7, 2025
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
yichiche pushed a commit to yichiche/sglang that referenced this pull request Jul 23, 2025
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
yichiche pushed a commit to yichiche/sglang that referenced this pull request Jul 25, 2025
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
yichiche pushed a commit to yichiche/sglang that referenced this pull request Jul 30, 2025
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
yichiche pushed a commit to yichiche/sglang that referenced this pull request Aug 1, 2025
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
yichiche pushed a commit to yichiche/sglang that referenced this pull request Aug 6, 2025
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
yichiche pushed a commit to yichiche/sglang that referenced this pull request Aug 7, 2025
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
yichiche pushed a commit to yichiche/sglang that referenced this pull request Aug 11, 2025
Signed-off-by: Ivan Butygin <ivan.butygin@gmail.com>
liupeng374 pushed a commit to liupeng374/sglang that referenced this pull request Aug 11, 2025
* chore: download related file from cache

Signed-off-by: mywaaagh_admin <pkwarcraft@gmail.com>

* fix ci

Signed-off-by: mywaaagh_admin <pkwarcraft@gmail.com>

---------

Signed-off-by: mywaaagh_admin <pkwarcraft@gmail.com>
CatherineSue pushed a commit that referenced this pull request Aug 12, 2025
…d-on-v0.4.10.post2 (#10)

* Cherry-pick conflicts: features-based-on-v0.4.6.post5 → features-based-on-v0.4.7.post1 (#4)

* [Docker] Use local source file instead of git clone sglang (#4)

* [OAI] Support non-normalized logprobs in OpenAI server (#5961)

* feat: Improve Mistral and Qwen25 function call parsing (#6597)

* [Bugfix]: Fix call for function_call_parser.multi_format_detector in adapter.py (#6650)

* fix(tool call): Fix tool_index in PythonicDetector and issues with mixed output in non-streaming (#6678)

* bugfix(OAI): Fix image_data processing for jinja chat templates (#6877)

* compressed-tensors check_accelerate error need install accelerate

* Fix OOM in Llama4 with large images

* Fix lint

* ✅ Resolved: Cherry-pick

---------

Co-authored-by: Chao Yang <chao.c.yang@oracle.com>
Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>

* ✅ Resolved: Cherry-pick from features-based-on-v0.4.7.post1 to features-based-on-v0.4.10.post2

* Delete openai_api folder

---------

Co-authored-by: Chao Yang <chao.c.yang@oracle.com>
Co-authored-by: Simo Lin <linsimo.mark@gmail.com>
Co-authored-by: github-actions[bot] <github-actions[bot]@users.noreply.github.com>
Xia-Weiwen pushed a commit to Xia-Weiwen/sglang that referenced this pull request Sep 9, 2025
* Revert "port prefill optimization (sgl-project#7)"

This reverts commit ea0d028.

* improve bfloat16 gemm performance for prefilling

before:
```
gemm_bf16(native): 4.772 ms, gemm_fp8(opt): 0.000 ms, gemm_int8(opt): 0.000 ms, gemm_bf16(opt): 15.328 ms
```

after:
```
gemm_bf16(native): 4.847 ms, gemm_fp8(opt): 0.000 ms, gemm_int8(opt): 0.000 ms, gemm_bf16(opt): 3.927 ms
```

* improve fp8 gemm performance with large M

* enable amx-int8 for gemm, fused moe, shared moe and qkv_proj kernels on PyTorch 2.7

* improve int8 gemm performance with large M

* improve bf16 and int8 moe performance with large nbatches

* update naming for nb0 and nb1 in fused gemm and silu_mul kernel

* improve fp8 moe performance with large nbatches

* remove hardcode numbers

---------

Co-authored-by: mingfeima <mingfei.ma@intel.com>

1 participant