[REQ] Query number of SMs in the Device #584

@sivahari

Description

Ability to query the number of SMs in the device

Context

In CUDA, we can get it by calling cudaGetDeviceProperties(cudaDeviceProp* prop, int device) and then reading prop->multiProcessorCount.

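This feature helps with right-sizing the grid. Sometimes we would like to avoid tail effects, which arise, for example, when work is distributed across 11 blocks on a 10-SM GPU: assuming one resident block per SM, the first 10 blocks run concurrently and the 11th runs alone in a second wave while the other 9 SMs sit idle. Being able to query the number of SMs would let us avoid such tail effects.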
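
To make the tail effect concrete, here is a minimal host-side sketch of the arithmetic in C++. The helper names (num_waves, utilization, right_sized_grid) and the blocks_per_sm parameter are hypothetical and not part of any existing API. The sketch assumes all blocks take roughly the same time and that the SM count is passed in as a plain integer; in CUDA that value would come from cudaDeviceProp::multiProcessorCount.

```cpp
#include <algorithm>

// Hypothetical helper: number of sequential "waves" needed to run `blocks`
// thread blocks when sm_count * blocks_per_sm blocks can be resident at once.
// Assumes every block takes about the same time to finish.
int num_waves(int blocks, int sm_count, int blocks_per_sm = 1) {
    const int slots = sm_count * blocks_per_sm;
    return (blocks + slots - 1) / slots;  // ceiling division
}

// Hypothetical helper: fraction of block slots doing useful work,
// summed over all waves (1.0 means no tail effect).
double utilization(int blocks, int sm_count, int blocks_per_sm = 1) {
    const int slots = sm_count * blocks_per_sm;
    const int waves = num_waves(blocks, sm_count, blocks_per_sm);
    return static_cast<double>(blocks) / (static_cast<double>(waves) * slots);
}

// Hypothetical helper: cap the grid at one full wave. Each block would then
// loop over the remaining work items itself (a grid-stride style loop).
int right_sized_grid(int work_items, int sm_count, int blocks_per_sm = 1) {
    return std::min(work_items, sm_count * blocks_per_sm);
}
```

With the numbers from the example above, num_waves(11, 10) gives 2 and utilization(11, 10) gives 0.55, while right_sized_grid(11, 10) caps the launch at 10 blocks so that everything fits in a single wave.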
