Closed
Labels
module: cuda (Related to torch.cuda, and CUDA support in general) · module: third_party · oncall: pt2 · triaged (This issue has been looked at by a team member, and triaged and prioritized into an appropriate module) · upstream triton (Upstream Triton Issue)
Description
🐛 Describe the bug
PyTorch 2.4 uses a new version of Triton that adds a call to the cuTensorMapEncodeTiled
API (triton-lang/triton@7289a23#diff-0d645ca31937abba9a3357062ee2c3708f6d49f66d7842d5f6577a2044f962f5).
This API requires a sufficiently new NVIDIA driver; otherwise Triton refuses to compile anything. To reproduce:
- Use a machine with pre-CUDA 12 drivers.
- Create a venv and install the PyTorch 2.4 cu118 RC with pip.
- Run the official Triton tutorial https://github.com/triton-lang/triton/blob/main/python/tutorials/06-fused-attention.py
- Observe the error:
Traceback (most recent call last):
File "/users/XXXg/home/projects/triton/python/tutorials/06-fused-attention.py", line 81, in <module>
configs = [
File "/users/XXXg/home/projects/triton/python/tutorials/06-fused-attention.py", line 85, in <listcomp>
for s in ([1] if is_hip() else [3, 4, 7])\
File "/users/XXXg/home/projects/triton/python/tutorials/06-fused-attention.py", line 22, in is_hip
return triton.runtime.driver.active.get_current_target().backend == "hip"
File "/home/XXXg/.pyenv/versions/torch24/lib/python3.10/site-packages/triton/runtime/driver.py", line 23, in __getattr__
self._initialize_obj()
File "/home/XXXg/.pyenv/versions/torch24/lib/python3.10/site-packages/triton/runtime/driver.py", line 20, in _initialize_obj
self._obj = self._init_fn()
File "/home/XXXg/.pyenv/versions/torch24/lib/python3.10/site-packages/triton/runtime/driver.py", line 9, in _create_driver
return actives[0]()
File "/home/XXXg/.pyenv/versions/torch24/lib/python3.10/site-packages/triton/backends/nvidia/driver.py", line 371, in __init__
self.utils = CudaUtils() # TODO: make static
File "/home/XXXg/.pyenv/versions/torch24/lib/python3.10/site-packages/triton/backends/nvidia/driver.py", line 80, in __init__
mod = compile_module_from_src(Path(os.path.join(dirname, "driver.c")).read_text(), "cuda_utils")
File "/home/XXXg/.pyenv/versions/torch24/lib/python3.10/site-packages/triton/backends/nvidia/driver.py", line 62, in compile_module_from_src
mod = importlib.util.module_from_spec(spec)
File "<frozen importlib._bootstrap>", line 571, in module_from_spec
File "<frozen importlib._bootstrap_external>", line 1176, in create_module
File "<frozen importlib._bootstrap>", line 241, in _call_with_frames_removed
ImportError: /home/XXXg/.triton/cache/2920354f453efffb492e73b112abcee1d2d301a37ade21e318a1ba26fa4fcd7c/cuda_utils.so: undefined symbol: cuTensorMapEncodeTiled
My driver version is: NVIDIA-SMI 470.161.03, Driver Version: 470.161.03. Note that this driver had been running older PyTorch cu118 wheels without problems.
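To make the driver-version constraint concrete, here is a minimal sketch (not part of PyTorch or Triton) that maps a Linux driver version string to whether it ships the CUDA 12 driver API. The 525 threshold is an assumption taken from NVIDIA's CUDA compatibility table, where 525.60.13 is listed as the minimum Linux driver for CUDA 12.0, the release that introduced cuTensorMapEncodeTiled; the 470 branch reported above only supports up to CUDA 11.4:

```python
def supports_tensormap(driver_version: str) -> bool:
    """Heuristic: True if the Linux driver branch is new enough to export
    cuTensorMapEncodeTiled (i.e. it ships the CUDA 12 driver API).

    Assumes NVIDIA's published minimum of 525.60.13 for CUDA 12.0;
    this is an illustration, not an official compatibility check.
    """
    major = int(driver_version.split(".")[0])
    return major >= 525  # first Linux driver branch carrying the CUDA 12 API

# The driver in this report fails the check, matching the ImportError above:
print(supports_tensormap("470.161.03"))  # False
print(supports_tensormap("525.60.13"))   # True
```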
Related issue: triton-lang/triton#2062
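Since the failure is an undefined symbol at import time, one can probe the installed driver directly for the missing symbol without installing PyTorch 2.4 at all. This is an illustrative diagnostic sketch, not something PyTorch or Triton provide; it uses ctypes to load libcuda and check whether cuTensorMapEncodeTiled is exported:

```python
import ctypes
import ctypes.util


def driver_has_tensormap_api() -> bool:
    """Return True if the installed libcuda exports cuTensorMapEncodeTiled.

    A False result on a machine with an NVIDIA driver means the driver
    predates the CUDA 12 series and the Triton shipped with PyTorch 2.4
    will fail with the undefined-symbol ImportError shown above.
    """
    path = ctypes.util.find_library("cuda")  # typically resolves to libcuda.so.1
    if path is None:
        return False  # no NVIDIA driver library installed at all
    try:
        lib = ctypes.CDLL(path)
    except OSError:
        return False
    # ctypes raises AttributeError when the shared object lacks the symbol --
    # the same condition that surfaces as the ImportError in the traceback.
    return hasattr(lib, "cuTensorMapEncodeTiled")


if __name__ == "__main__":
    print("cuTensorMapEncodeTiled available:", driver_has_tensormap_api())
```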
Versions
PyTorch version: 2.4.0+cu118
Is debug build: False
CUDA used to build PyTorch: 11.8
ROCM used to build PyTorch: N/A
OS: Ubuntu 20.04.6 LTS (x86_64)
GCC version: (Ubuntu 9.4.0-1ubuntu1~20.04.3) 9.4.0
Clang version: Could not collect
CMake version: version 3.16.3
Libc version: glibc-2.31
Python version: 3.10.0 (default, Dec 18 2023, 03:34:21) [GCC 9.4.0] (64-bit runtime)
Python platform: Linux-5.4.250-2-velinux1u1-amd64-x86_64-with-glibc2.31
Is CUDA available: True
CUDA runtime version: 11.8.89
CUDA_MODULE_LOADING set to: LAZY
Nvidia driver version: 470.161.03
cuDNN version: Probably one of the following:
/usr/lib/x86_64-linux-gnu/libcudnn.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_infer.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_adv_train.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_infer.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_cnn_train.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_infer.so.8.7.0
/usr/lib/x86_64-linux-gnu/libcudnn_ops_train.so.8.7.0
cc @ptrblck @msaroufim @ezyang @anijain2305 @chauhang @penguinwu @bertmaher @int3 @davidberard98 @nmacchioni @chenyang78 @embg @malfet @seemethere