CUDA 9 #49
Conversation
New fp16 datatype, *_70 builds, fix include ordering in a test
Just a question
gloo/types.h
@@ -56,17 +56,27 @@ struct __attribute__((__aligned__(2))) float16 {
     return x == res.x;
   }
 #ifdef __CUDA_ARCH__
-  float16(half h) {
+  float16(__half h) {
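For context: CUDA 9 made __half an opaque struct, so a wrapper like this can no longer read its internals directly and must go through __half_raw. Below is a minimal sketch of such a conversion, assuming a float16 wrapper whose x member holds the raw bits (as in the diff above); it is illustrative, not the actual gloo code.

#include <cuda_fp16.h>
#include <cstdint>

struct float16 {
  uint16_t x;  // raw IEEE-754 binary16 bit pattern
#ifdef __CUDA_ARCH__
  // CUDA 9: __half is opaque, so convert through __half_raw,
  // which exposes the underlying 16-bit pattern as .x.
  __device__ float16(__half h) {
    x = __half_raw(h).x;
  }
  // Rebuild a __half from the stored bits for the reverse direction.
  __device__ operator __half() const {
    __half_raw r;
    r.x = x;
    return __half(r);
  }
#endif
};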
Is it OK that the constructor for half is gone now? (I don't know where that type is defined, TBH.)
I'm not 100% sure either :) I've always known the type as __half, and that's what is defined in cuda_fp16.h.
I think half and __half are interchangeable. cuda_fp16.h has the following code:
#ifndef CUDA_NO_HALF
typedef __half half;
typedef __half2 half2;
#endif /*CUDA_NO_HALF*/
Should we stick to one for consistency? I vote for half. :)
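If it helps, a compile-time check confirms they are literally the same type. This is just a sketch, assuming nvcc with C++11 and CUDA_NO_HALF not defined:

#include <cuda_fp16.h>
#include <type_traits>

// Both asserts hold because half and half2 are plain typedefs
// of __half and __half2 in cuda_fp16.h.
static_assert(std::is_same<half, __half>::value, "half is an alias for __half");
static_assert(std::is_same<half2, __half2>::value, "half2 is an alias for __half2");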
Yeah, some more directed grepping found that for me too. I don't care which one is used.
set(gloo_known_gpu_archs "20 21(20) 30 35 50 52 60 61") | ||
set(gloo_known_gpu_archs7 "20 21(20) 30 35 50 52") | ||
set(gloo_known_gpu_archs "30 35 50 52 60 61 70") | ||
set(gloo_known_gpu_archs7 "30 35 50 52") |
Could you add a gloo_known_gpu_archs8 so that we don't break the CUDA 8 builds? AFAICT this is what's tripping up the Travis build.
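Hypothetically, the requested split might look like the sketch below. This is not the actual patch: CUDA_VERSION comes from CMake's FindCUDA, and the exact CUDA 8 arch list here is an assumption.

# Sketch only: per-toolkit arch lists so an older nvcc never sees
# compute capabilities it cannot compile.
set(gloo_known_gpu_archs  "30 35 50 52 60 61 70")  # CUDA 9
set(gloo_known_gpu_archs7 "30 35 50 52")           # CUDA 7.x
set(gloo_known_gpu_archs8 "30 35 50 52 60 61")     # CUDA 8.x (assumed)
# Pick the list the detected toolkit can actually compile.
if(CUDA_VERSION VERSION_LESS "8.0")
  set(gloo_known_gpu_archs ${gloo_known_gpu_archs7})
elseif(CUDA_VERSION VERSION_LESS "9.0")
  set(gloo_known_gpu_archs ${gloo_known_gpu_archs8})
endif()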
Done and pushed
@pietern has imported this pull request. If you are a Facebook employee, you can view this diff on Phabricator.
Thanks for adding this @slayton58!
Summary: Adds basic CUDA 9 support, including adding the Volta arch and making appropriate modifications for half-precision datatype changes.
Closes pytorch/gloo#49
Differential Revision: D5315336
Pulled By: pietern
fbshipit-source-id: 6468b0f357206d604bdcfec69ba82509a2c91407