python -c "import torch;print(torch.nn.GELU()(torch.rand(2)))" crashes on aarch64 #115482

@malfet

🐛 Describe the bug

The oneDNN 3.3.2 update (made by #112700) reintroduced the regression reported in #111695 and fixed by pytorch/builder@c5e331c.

One-line reproducer (must be run on an Apple silicon Mac or in an AWS Lambda environment where /sys is not accessible):

 % docker run --rm -it python:3.11.4 bash -c "pip install torch==2.2.0 --index-url https://download.pytorch.org/whl/test/cpu; python -c 'import torch;print(torch.nn.GELU()(torch.rand(2)))'"
...
/usr/local/lib/python3.11/site-packages/torch/nn/modules/transformer.py:20: UserWarning: Failed to initialize NumPy: No module named 'numpy' (Triggered internally at /pytorch/torch/csrc/utils/tensor_numpy.cpp:84.)
  device: torch.device = torch.device(torch._C._get_default_device()),  # torch.device('cpu'),
bad err=11 in Xbyak::Error
Traceback (most recent call last):
  File "<string>", line 1, in <module>
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1511, in _wrapped_call_impl
    return self._call_impl(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/module.py", line 1520, in _call_impl
    return forward_call(*args, **kwargs)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
  File "/usr/local/lib/python3.11/site-packages/torch/nn/modules/activation.py", line 682, in forward
    return F.gelu(input, approximate=self.approximate)
           ^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
RuntimeError: internal error
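
Before running the reproducer, it can help to check whether the current environment matches the conditions described above (an ARM 64-bit machine where /sys is not accessible). The following is only a rough sketch: the helper name `may_hit_gelu_crash` is hypothetical, and treating "/sys cannot be listed" as the trigger is an assumption drawn from the description, not a confirmed root cause.

```python
import os
import platform


def may_hit_gelu_crash(machine=None, sys_path="/sys"):
    """Heuristic check (hypothetical helper, not part of PyTorch).

    Returns True when the machine is ARM 64-bit and `sys_path` cannot be
    listed, which is assumed here to match the failing environments.
    """
    machine = (machine or platform.machine()).lower()
    if machine not in ("aarch64", "arm64"):
        return False
    try:
        os.listdir(sys_path)
    except OSError:
        # Missing or unreadable /sys: matches the reported condition.
        return True
    return False


if __name__ == "__main__":
    print("environment matches reported conditions:", may_hit_gelu_crash())
```

A False result does not guarantee that the GELU call succeeds; it only means this particular condition was not detected.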

Versions

2.2.0 RC1, nightly

cc @ezyang @gchanan @zou3519 @kadeng @gujinghui @PenghuiCheng @XiaobingSuper @jianyuh @jgong5 @mingfeima @sanchitintel @ashokei @jingxu10 @min-jean-cho @yanbing-j @Guobing-Chen @Xia-Weiwen @snadampal

Labels

has workaround
high priority
module: arm (Related to ARM architectures builds of PyTorch. Includes Apple M1)
module: mkldnn (Related to Intel IDEEP or oneDNN (a.k.a. mkldnn) integration)
module: regression (It used to work, and now it doesn't)
triaged (This issue has been looked at a team member, and triaged and prioritized into an appropriate module)
