How is ncclAllReduce implemented? #530

@szhengac

I have two questions:

  1. Does ncclAllReduce handle intra-node and inter-node communication differently, or does it build multiple flat rings that connect all the GPUs?
  2. Is hierarchical allreduce (i.e., intra-node reduce-scatter -> multiple inter-node allreduces -> intra-node allgather) currently supported in the latest NCCL?
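
To make the second question concrete, here is a minimal sketch of the three-step scheme I have in mind, simulated in plain Python with lists standing in for GPU buffers. It makes no claim about how NCCL works internally; the function name `hierarchical_allreduce` and the `gpus_per_node` parameter are made up for illustration, and it assumes a sum reduction with the buffer length divisible by the number of GPUs per node.

```python
def hierarchical_allreduce(buffers, gpus_per_node):
    """Simulate a sum-allreduce over buffers[rank] (equal-length lists).

    Hypothetical helper for illustration only, not an NCCL API.
    Ranks are assumed to be laid out node by node.
    """
    num_ranks = len(buffers)
    length = len(buffers[0])
    assert num_ranks % gpus_per_node == 0
    assert length % gpus_per_node == 0
    num_nodes = num_ranks // gpus_per_node
    chunk = length // gpus_per_node

    # Step 1: intra-node reduce-scatter. Local GPU g on each node ends up
    # holding the node-local sum of chunk g.
    partial = {}  # (node, local_gpu) -> reduced chunk
    for node in range(num_nodes):
        ranks = range(node * gpus_per_node, (node + 1) * gpus_per_node)
        for g in range(gpus_per_node):
            lo, hi = g * chunk, (g + 1) * chunk
            partial[(node, g)] = [
                sum(buffers[r][i] for r in ranks) for i in range(lo, hi)
            ]

    # Step 2: one inter-node allreduce per local GPU index, each covering
    # only 1/gpus_per_node of the data.
    for g in range(gpus_per_node):
        total = [
            sum(partial[(node, g)][i] for node in range(num_nodes))
            for i in range(chunk)
        ]
        for node in range(num_nodes):
            partial[(node, g)] = list(total)

    # Step 3: intra-node allgather. Every GPU collects all chunks held
    # by the GPUs on its own node.
    result = []
    for node in range(num_nodes):
        full = []
        for g in range(gpus_per_node):
            full.extend(partial[(node, g)])
        for _ in range(gpus_per_node):
            result.append(list(full))
    return result


# Two nodes with two GPUs each, four elements per buffer.
buffers = [
    [1, 2, 3, 4],
    [10, 20, 30, 40],
    [100, 200, 300, 400],
    [1000, 2000, 3000, 4000],
]
out = hierarchical_allreduce(buffers, gpus_per_node=2)
```

Every rank should end up with the same element-wise sum that a single flat allreduce over all four ranks would give, while only a fraction of each buffer crosses the node boundary per GPU in step 2.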
