all_gather with gloo backend does not work in inference mode

### 🐛 Describe the bug

A minimal reproducible example:

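```python
import torch
import torch.distributed as dist

# Assumes one process per rank (e.g. launched via torchrun), so the default
# env:// initialization can read RANK and WORLD_SIZE.
dist.init_process_group(backend='gloo')
# Switching to NCCL makes the same script pass:
# dist.init_process_group(backend='nccl')
# torch.cuda.set_device(dist.get_rank())
with torch.inference_mode():
    # Tensors created inside this block are inference tensors.
    data = [torch.ones((3, 3))] * dist.get_world_size()
    obj = data[dist.get_rank()]
    dist.all_gather(data, obj)  # raises the RuntimeError below with gloo
    # dist.broadcast(obj, src=0)  # works
```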

The error is:

> E           RuntimeError: Inplace update to inference tensor outside InferenceMode is not allowed.You can make a clone to get a normal tensor before doing inplace update.See https://github.com/pytorch/rfcs/pull/17 for more details.

Strangely, the same code works with the `nccl` backend, and `broadcast` works as well. Only `all_gather` with the `gloo` backend fails.

### Versions

PyTorch 2.3.0

cc @mrshenli @pritamdamania87 @zhaojuanmao @satgera @gqchen @aazzolini @osalpekar @jiayisuse @H-Huang @kwen2501 @awgu @penguinwu @fegin @XilunWu @wanchaol @fduwjj @wz337 @tianyu-l @wconstab @yf225 @chauhang @d4l3k

all_gather with gloo backend does not work in inference mode #126032