[Feature] Support Deepseek-VL2 #2798
Conversation
@@ -0,0 +1,127 @@
from typing import List, Optional, Tuple, Union
rename the file to deepseek_vl2?
rename done
self.layers = modules

def forward(self, x):
I have not yet implemented the forward part of the DeepseekV2ForCausalLM. I will finish all the implementations and add the unit test this weekend.
@ccw1996 Do you need our help?
Has support for deepseek vl2 been implemented?
if config.projector_type == "downsample_mlp_gelu":
    mlp_depth = config.depth
    mlp_ratio = config.mlp_ratio
    modules = [nn.Linear(config.input_dim * config.downsample_ratio * config.downsample_ratio, config.n_embed * mlp_ratio)]
    for _ in range(1, mlp_depth - 1):
        modules.append(nn.GELU())
        modules.append(nn.Linear(config.n_embed * mlp_ratio, config.n_embed * mlp_ratio))
    modules.append(nn.GELU())
    modules.append(nn.Linear(config.n_embed * mlp_ratio, config.n_embed))
    modules = nn.Sequential(*modules)
@ccw1996 I'm happy to take the rest of the work to parallelize the remaining functions. Could you give me access to your branch?
@ccw1996 Apologies for the delay. Would you like me to help with the rest of it?
@ccw1996 I see, I think you can copy those layers from timm into python/sglang/srt/models/deepseekvl2.py and then replace the layers with sgl classes. I'm interested in helping if you can give me access.
@yizhang2077 @ispobock Looks like we'll have to copy lots of code from timm. Now it's mostly just the linear layers with variable depth left to parallelize; will finish soon.
Sure, can you mark the problematic part?
if config.projector_type == "downsample_mlp_gelu":
    mlp_depth = config.depth
    mlp_ratio = config.mlp_ratio
    modules = [
        nn.Linear(
            config.input_dim
            * config.downsample_ratio
            * config.downsample_ratio,
            config.n_embed * mlp_ratio,
        )
    ]
    for _ in range(1, mlp_depth - 1):
        modules.append(nn.GELU())
        modules.append(
            nn.Linear(config.n_embed * mlp_ratio, config.n_embed * mlp_ratio)
        )
    modules.append(nn.GELU())
    modules.append(nn.Linear(config.n_embed * mlp_ratio, config.n_embed))
    modules = nn.Sequential(*modules)
We need to parallelize this part with Column and Row linear layers.
@yizhang2077 Actually with GELU we'll have to gather output for each TP linear. Should we use replicated linear instead?
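A minimal sketch of the replicated option raised in this thread, kept as plain nn.Linear rather than Column/Row parallel layers; the class and argument names are illustrative, not the PR's final code:

```python
import torch
from torch import nn


class DownsampleMLPGeluProjector(nn.Module):
    """Illustrative replicated version of the "downsample_mlp_gelu" projector."""

    def __init__(self, input_dim, n_embed, depth, mlp_ratio, downsample_ratio):
        super().__init__()
        hidden = n_embed * mlp_ratio
        layers = [nn.Linear(input_dim * downsample_ratio * downsample_ratio, hidden)]
        for _ in range(1, depth - 1):
            layers += [nn.GELU(), nn.Linear(hidden, hidden)]
        layers += [nn.GELU(), nn.Linear(hidden, n_embed)]
        # Plain nn.Linear keeps the projector replicated on every TP rank, so,
        # per the discussion above, no gathers are needed around the GELUs.
        self.layers = nn.Sequential(*layers)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        return self.layers(x)
```

Since the projector is tiny compared with the language model, replicating it trades a little memory for avoiding collectives; sglang's own replicated linear layer could be swapped in later if quantization support is wanted.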
Two problems: one is that the radix cache makes the input wrong; I will try to fix it. The second is that the output does not seem to use the image embeddings. Can you help me debug it?
Let me try tomorrow
logger.info(
    "Automatically turn off --chunked-prefill-size and disable radix cache for deepseek-vl2."
)
server_args.chunked_prefill_size = -1
server_args.disable_radix_cache = True
The language part still supports radix cache.
The language part relies on the input embeddings. If the radix cache is used, the input embeddings are wrong. I will try to debug it.
I see, I think you're right. Llava and qwen_vl also don't use radix attn
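For reference, a minimal sketch of what the automatic override above amounts to when building the server arguments by hand (the model path is an illustrative assumption; the two fields are the ones set in the diff):

```python
from sglang.srt.server_args import ServerArgs

# Chunked prefill off and radix cache disabled, since cached prefixes
# would bypass the image embeddings for this multimodal model.
args = ServerArgs(
    model_path="deepseek-ai/deepseek-vl2",  # illustrative model path
    chunked_prefill_size=-1,                # -1 turns chunked prefill off
    disable_radix_cache=True,               # avoid reusing wrong input embeds
)
```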
Has this been done? @ccw1996 @yizhang2077 Basically, do not paste the code but rather import it when needed.
Sorry, I missed this comment.
I initially thought it was unnecessary to add a dependency just for one model, since you'll also have to update CI 😂 but it's fine if you guys agree.
I think it's better to import it.
@zhaochenyang20 @yizhang2077 Done. Do I need to update the branch?
@yizhang2077 I cannot update this sheet
You can paste the result here, and then I'll update it
sglang is 0.442 and hf is not supported
@yizhang2077 @ccw1996 What shall we do next? I have the access and I can merge after yi's approval.
@zhaochenyang20 @yizhang2077 I resolved the merge conflict with gemma3. Now waiting for approval.
Sure!
@yizhang2077 @mickqian could you help give a quick overview?
@ccw1996 fix lint plz
LGTM
@ccw1996 Sure. I will run the CI for you; do not rebase any more, leave it to us.
@ccw1996 please add pip install timm in
I've done it
Motivation
Add the Deepseek-VL2 model to SGLang, as requested in #2653
Modifications
Checklist