Conversation

slayton58
Contributor

Adds basic CUDA 9 support: adds the Volta arch and makes the appropriate modifications for the half-precision datatype changes.

New fp16 datatype, *_70 builds, fix include ordering in a test
Contributor

@pietern pietern left a comment


Just a question

gloo/types.h Outdated
@@ -56,17 +56,27 @@ struct __attribute__((__aligned__(2))) float16 {
     return x == res.x;
   }
 #ifdef __CUDA_ARCH__
-  float16(half h) {
+  float16(__half h) {
Contributor


Is it OK that the constructor for half is gone now? (I don't know where that type is defined, TBH.)

Contributor Author


I'm not 100% sure either :) I've always known the type as __half, and that's what's defined in cuda_fp16.h.

Contributor


I think half and __half are interchangeable. cuda_fp16.h has the following code:

#ifndef CUDA_NO_HALF
typedef __half half;
typedef __half2 half2;
#endif /*CUDA_NO_HALF*/

Should we stick to one for consistency? I vote for half. :)

Contributor Author


Yeah, some more directed grepping found that for me too. I don't care which one is used

-set(gloo_known_gpu_archs "20 21(20) 30 35 50 52 60 61")
-set(gloo_known_gpu_archs7 "20 21(20) 30 35 50 52")
+set(gloo_known_gpu_archs "30 35 50 52 60 61 70")
+set(gloo_known_gpu_archs7 "30 35 50 52")
Contributor


Could you add a gloo_known_gpu_archs8 so that we don't break the CUDA 8 builds? AFAICT this is what's tripping up the Travis build.
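A sketch of what such a split could look like (the variable names follow this thread, but the version check itself is an assumption for illustration, not the actual patch):

```cmake
# Hypothetical sketch: one arch list per CUDA toolkit generation.
# gloo_known_gpu_archs8 keeps CUDA 8 builds working, since CUDA 8's
# nvcc does not know the *_70 (Volta) targets.
set(gloo_known_gpu_archs  "30 35 50 52 60 61 70")  # CUDA 9+
set(gloo_known_gpu_archs8 "30 35 50 52 60 61")     # CUDA 8: no Volta
set(gloo_known_gpu_archs7 "30 35 50 52")           # CUDA 7: no Pascal either

# Select the list that matches the detected toolkit version.
if(CUDA_VERSION VERSION_LESS "8.0")
  set(gloo_known_gpu_archs ${gloo_known_gpu_archs7})
elseif(CUDA_VERSION VERSION_LESS "9.0")
  set(gloo_known_gpu_archs ${gloo_known_gpu_archs8})
endif()
```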

Contributor Author


Done and pushed

@facebook-github-bot

@pietern has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.

@pietern
Contributor

pietern commented Jun 23, 2017

Thanks for adding this @slayton58!

ezyang pushed a commit to ezyang/ATen that referenced this pull request Apr 19, 2018
Summary:
Adds basic CUDA 9 support, including adding Volta arch, and making appropriate modifications for half precision datatype changes
Closes pytorch/gloo#49

Differential Revision: D5315336

Pulled By: pietern

fbshipit-source-id: 6468b0f357206d604bdcfec69ba82509a2c91407
cosmicEarth pushed a commit to cosmicEarth/gloo that referenced this pull request Apr 9, 2024
4 participants