Fix llama_cpp cuda usage in Containerfile #1025

n1hility · 2024-04-27T23:37:33Z

Problem

The current CUDA builds of instructlab are unaccelerated for serve/generate.

Changes

Sync requirements.txt of llama_cpp_python with instructlab

The step that pip-installs llama_cpp_python is overwritten by the instructlab install, because instructlab pins a version lower than the one installed by the preceding llama_cpp_python step. Since the instructlab install does not specify the CUDA CMAKE args, the package gets rebuilt without CUDA support. By aligning the requirements, we ensure that llama_cpp_python will not be rebuilt.
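To make the mismatch concrete, here is a minimal sketch of how one might look up the version that a requirements file pins for llama_cpp_python, so it can be compared with what the earlier step installed. The helper names and the sample pins are hypothetical and are not taken from the instructlab repository.

```python
import re
from typing import Optional


def normalize(name: str) -> str:
    # Treat "llama_cpp_python" and "llama-cpp-python" as the same project name.
    return re.sub(r"[-_.]+", "-", name).lower()


def pinned_version(requirements_text: str, package: str) -> Optional[str]:
    """Return the version of an exact '==' pin for `package`, or None if there is none."""
    wanted = normalize(package)
    for raw in requirements_text.splitlines():
        line = raw.split("#", 1)[0].strip()  # drop comments and surrounding whitespace
        # name, optional [extras], then an exact pin
        match = re.match(r"([A-Za-z0-9._-]+)(?:\[[^\]]*\])?\s*==\s*([^\s;]+)", line)
        if match and normalize(match.group(1)) == wanted:
            return match.group(2)
    return None


if __name__ == "__main__":
    sample = "llama_cpp_python[server]==0.2.55\nclick>=8.1.7\n"  # hypothetical pins
    print(pinned_version(sample, "llama-cpp-python"))
```

If the returned pin differs from the version reported by `importlib.metadata.version("llama_cpp_python")`, a later `pip install` of instructlab would be expected to replace the package, which is the situation described above.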

Before this change:

>>> llama_cpp.llama_supports_gpu_offload()      
False

After:

>>> llama_cpp.llama_supports_gpu_offload()      
True
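
The same check can be scripted, for example as a smoke test after the image is built. This is only a sketch: the wrapper function and its return convention are assumptions, and the only llama_cpp call used is `llama_supports_gpu_offload()` as shown in the REPL output above.

```python
import importlib
from typing import Optional


def gpu_offload_supported(module_name: str = "llama_cpp") -> Optional[bool]:
    """Return True/False for GPU offload support, or None if the module is not installed."""
    try:
        module = importlib.import_module(module_name)
    except ImportError:
        return None
    return bool(module.llama_supports_gpu_offload())


if __name__ == "__main__":
    status = gpu_offload_supported()
    messages = {
        None: "llama_cpp is not installed",
        False: "llama_cpp was built without GPU offload",
        True: "llama_cpp supports GPU offload",
    }
    print(messages[status])
```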

Signed-off-by: Jason T. Greene <jason.greene@redhat.com>

tiran · 2024-04-28T09:55:49Z

@n1hility @russellb
The workaround in this PR is problematic. It uses the requirements.txt from the main branch rather than the current one. This is likely to cause issues when we create a stable branch and modify requirements in main. Also, pip install -r ... will install all dependencies in the requirements file.

I'm using this approach in my container files, which I believe is better:

copy requirements file from context into the container
convert the requirements file to a constraints file by stripping off optional dependencies
pip install with -c instead of -r

COPY requirements.txt /tmp
RUN sed 's/\[.*\]//' /tmp/requirements.txt >/tmp/constraints.txt
RUN CMAKE_ARGS="-DLLAMA_CUBLAS=on" python3.11 -m pip install -c /tmp/constraints.txt --force-reinstall --no-cache-dir llama-cpp-python
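
For readers who do not use sed often, the conversion step can be sketched in Python. The sample requirement lines are hypothetical. As far as I know, recent pip versions reject constraints entries that carry extras, which is why the bracketed part is stripped before passing the file to `-c`.

```python
import re

# Same pattern as the sed expression: the first "[...]" span on each line, greedy.
_EXTRAS = re.compile(r"\[.*\]")


def to_constraints(requirements_text: str) -> str:
    """Remove the extras group from every requirement line."""
    lines = [_EXTRAS.sub("", line, count=1) for line in requirements_text.splitlines()]
    return "\n".join(lines) + "\n"


if __name__ == "__main__":
    sample = "llama_cpp_python[server]==0.2.55\nclick>=8.1.7\n"  # hypothetical pins
    print(to_constraints(sample), end="")
```

Used with `-c`, the resulting file only restricts versions. It does not cause any of the listed packages to be installed, unlike `-r`.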

russellb · 2024-04-28T13:28:46Z

Thanks @tiran -- good feedback. You're right.

However, this Containerfile isn't building instructlab from the current source tree. It's explicitly pulling stable from git, so the added line also pulls requirements.txt from stable.

I do think this Containerfile should be enhanced to allow building from the current source tree, as well.

Fix llama_cpp cuda install

306f732

Signed-off-by: Jason T. Greene <jason.greene@redhat.com>

russellb approved these changes Apr 28, 2024

mergify bot merged commit 3ebaa26 into instructlab:main Apr 28, 2024
