stduhpf
Contributor

@stduhpf stduhpf commented Jun 17, 2025

Adds a way to choose which backend device to use for the CLIP/vision encoder by setting the MTMD_BACKEND_DEVICE env variable to the device name.
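
As a rough illustration of the behaviour described above, here is a self-contained C++ sketch that reads MTMD_BACKEND_DEVICE and picks a device by name, warning and falling back to a default when the name is not found. The function names, the plain list of device names, and the fallback order are assumptions made for this example; the actual change goes through the ggml backend API rather than a list of strings.

```cpp
#include <cstdio>
#include <cstdlib>
#include <optional>
#include <string>
#include <vector>

// Hypothetical helper: returns the device name to initialise for the vision
// encoder, or std::nullopt to stay on the CPU.
//   requested - value of MTMD_BACKEND_DEVICE, may be nullptr when unset
//   available - assumed device names reported by the runtime
//   use_gpu   - mirrors the existing "use GPU" switch
std::optional<std::string> pick_vision_device(
        const char * requested,
        const std::vector<std::string> & available,
        bool use_gpu) {
    if (!use_gpu) {
        return std::nullopt;
    }
    if (requested != nullptr) {
        for (const auto & name : available) {
            if (name == requested) {
                return name; // manual selection succeeded
            }
        }
        // Stand-in for the project's own logging: warn, then fall back.
        std::fprintf(stderr, "warning: device \"%s\" not found, using default\n", requested);
    }
    if (!available.empty()) {
        return available.front(); // assumed default: first listed device
    }
    return std::nullopt;
}

// Convenience wrapper that reads the environment variable named in this PR.
std::optional<std::string> pick_vision_device_from_env(
        const std::vector<std::string> & available,
        bool use_gpu) {
    return pick_vision_device(std::getenv("MTMD_BACKEND_DEVICE"), available, use_gpu);
}
```

With the variable unset, the wrapper behaves like the default selection, so existing setups would be unaffected under these assumptions.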

@stduhpf
Contributor Author

stduhpf commented Jun 26, 2025

@ngxson Can you take a look at this? I think it's very easy to review and can be useful in some situations.

@hjc4869
Contributor

hjc4869 commented Jun 26, 2025

It's indeed a useful feature. In my configuration I have two GPUs running different layers of the model with -ot, and the current implementation uses both of them to run the vision encoder. For my use case it would be nice to be able to pin it to a specific device.

hjc4869 added a commit to hjc4869/llama.cpp that referenced this pull request Jun 28, 2025
hjc4869 added a commit to hjc4869/llama.cpp that referenced this pull request Jul 5, 2025
@stduhpf stduhpf force-pushed the clip-gpu-select branch from 205e4cb to 47e9237 Compare July 7, 2025 12:49
hjc4869 added a commit to hjc4869/llama.cpp that referenced this pull request Jul 8, 2025
hjc4869 added a commit to hjc4869/llama.cpp that referenced this pull request Jul 11, 2025
hjc4869 added a commit to hjc4869/llama.cpp that referenced this pull request Jul 16, 2025
hjc4869 added a commit to hjc4869/llama.cpp that referenced this pull request Jul 22, 2025
@stduhpf
Contributor Author

stduhpf commented Jul 22, 2025

Can any of the maintainers take a look at this?

Member

@ggerganov ggerganov left a comment

It would be better to pass the device as a parameter instead of using an env variable. Any reason to not introduce a new parameter?

backend = ctx_params.use_gpu
    ? ggml_backend_init_by_type(GGML_BACKEND_DEVICE_TYPE_GPU, nullptr)
    : nullptr;
backend = nullptr;
Collaborator

I think this should be defined earlier:

ggml_backend_t backend = nullptr;
ggml_backend_t backend_cpu = nullptr;

@ngxson
Collaborator

ngxson commented Jul 22, 2025

> It would be better to pass the device as a parameter instead of using an env variable. Any reason to not introduce a new parameter?

Adding an arg can be a breaking change, because mtmd_context_params will be modified. But this is not a very big problem, so I have no preference on whether or not we add an arg.

For the arg, maybe we can add --mmproj-device, since all other multimodal-related args are prefixed with --mmproj-*
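
For comparison, the parameter-based alternative discussed here could look roughly like the sketch below. Everything in it is hypothetical: the struct is not the real mtmd_context_params layout, and --mmproj-device is only the flag name proposed in this thread, not an existing option. It only shows how a device name might travel from the command line into a params struct instead of coming from the environment.

```cpp
#include <cstddef>
#include <optional>
#include <string>
#include <vector>

// Hypothetical parameter struct; NOT the real mtmd_context_params.
struct vision_params {
    bool        use_gpu = true;
    std::string device; // empty means "let the runtime choose"
};

// Parses only the proposed --mmproj-device flag and ignores everything else.
// Returns std::nullopt if the flag is given without a value.
std::optional<vision_params> parse_vision_params(const std::vector<std::string> & args) {
    vision_params params;
    for (std::size_t i = 0; i < args.size(); ++i) {
        if (args[i] == "--mmproj-device") {
            if (i + 1 >= args.size()) {
                return std::nullopt; // missing device name
            }
            params.device = args[++i];
        }
    }
    return params;
}
```

Adding a field like this is what would change the params struct, which is the breaking-change concern raised above.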

@stduhpf
Contributor Author

stduhpf commented Jul 22, 2025

> Any reason to not introduce a new parameter?

No particular reason, other than that using an env variable is more straightforward to implement.

@ngxson ngxson merged commit c8ade30 into ggml-org:master Jul 22, 2025
47 checks passed
gabe-l-hart added a commit to gabe-l-hart/llama.cpp that referenced this pull request Jul 23, 2025
* origin/master: (49 commits)
ci : correct label refactor->refactoring (ggml-org#14832)
CUDA: fix quantized KV cache + multiple sequences (ggml-org#14822)
tests : add non-cont K,V FA tests
memory : handle saving/loading null layers in recurrent memory (ggml-org#14675)
ggml: fix loongarch quantize_row_q8_1 error (ggml-org#14827)
CANN: weight format to NZ for Ascend310P3 (ggml-org#14407)
CUDA: add fused rms norm (ggml-org#14800)
ggml : model card yaml tab->2xspace (ggml-org#14819)
vulkan: fix rms_norm_mul to handle broadcasting dim0 (ggml-org#14817)
llama : add model type detection for rwkv7 7B&14B (ggml-org#14816)
imatrix: add option to display importance score statistics for a given imatrix file (ggml-org#12718)
Mtmd: add a way to select device for vision encoder (ggml-org#14236)
cuda : implement bf16 cpy ops and enable bf16 cont (ggml-org#14763)
opencl: remove unreachable `return` (ggml-org#14806)
server : allow setting `--reverse-prompt` arg (ggml-org#14799)
cuda: remove linking to cublasLt (ggml-org#14790)
opencl: fix `im2col` when `KW!=KH` (ggml-org#14803)
opencl: add conv2d kernel (ggml-org#14403)
sycl: Fix im2col (ggml-org#14797)
kleidiai: add support for get_rows (ggml-org#14676)
...
taronaeo pushed a commit to taronaeo/llama.cpp-s390x that referenced this pull request Jul 25, 2025
* Mtmd: add a way to select device for vision encoder

* simplify

* format

* Warn user if manual device selection failed

* initialize backend to nullptr