ℹ️ Unify autocast behavior to torch.autocast and make it cover XPU #3541
Conversation
device-agnostic to cover xpu
Signed-off-by: YAO Matrix <matrix.yao@intel.com>
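For context, the PR replaces CUDA-specific autocast usage with the device-agnostic `torch.autocast` context manager so XPU is covered too. A minimal sketch of the pattern (the device-selection logic here is illustrative, not the PR's exact code):

```python
import torch

# Pick an accelerator if one is present; fall back to CPU (illustrative logic).
if torch.cuda.is_available():
    device_type = "cuda"
elif hasattr(torch, "xpu") and torch.xpu.is_available():
    device_type = "xpu"
else:
    device_type = "cpu"

# torch.autocast takes the backend as a string, so the same code path
# works on CUDA, XPU, and CPU, unlike torch.cuda.amp.autocast.
with torch.autocast(device_type=device_type, dtype=torch.bfloat16):
    x = torch.randn(4, 4)
    y = x @ x  # matmul runs under the autocast policy where supported
```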
```diff
- training_args = DPOConfig(..., optimize_cuda_cache=True)
+ training_args = DPOConfig(..., optimize_device_cache=True)
```
There is no `optimize_cuda_cache` anymore, so update the doc here.
```diff
@@ -82,7 +82,7 @@ class ScriptArguments:
     batch_size=script_args.batch_size,
     mini_batch_size=script_args.mini_batch_size,
     gradient_accumulation_steps=script_args.gradient_accumulation_steps,
-    optimize_cuda_cache=True,
+    optimize_device_cache=True,
```
Same as above.
```diff
-    torch.cuda.empty_cache()
-elif torch_device == "xpu":
-    torch.xpu.empty_cache()
+backend_empty_cache(torch_device)
```
Use the device-agnostic utility from `transformers.testing_utils` rather than if/else.
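`backend_empty_cache` dispatches to the right backend internally; a simplified sketch of that kind of device-agnostic helper (the function name below is illustrative, not the actual transformers implementation) could look like:

```python
import torch

def empty_cache_for(device: str) -> None:
    """Release cached allocator memory on the given backend (simplified sketch)."""
    if device == "cuda" and torch.cuda.is_available():
        torch.cuda.empty_cache()
    elif device == "xpu" and hasattr(torch, "xpu") and torch.xpu.is_available():
        torch.xpu.empty_cache()
    # On CPU there is no caching allocator to clear, so this is a no-op.

empty_cache_for("cpu")  # safe no-op on machines without an accelerator
```

Centralizing the dispatch this way keeps test bodies free of per-backend if/else chains.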
```python
@unittest.skipIf(
    get_device_properties()[0] == "cuda" and get_device_properties()[1] < 8,
    "Skipping because bf16 not supported on CUDA GPU with capability < 8.0",
)
```
Add `skipIf` per the comments and remove the condition-less skip.
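The skip condition assumes `get_device_properties()` returns something like a `(device_type, major_capability)` tuple. An equivalent check written with plain `torch` (a sketch with an illustrative name, not the transformers helper) would be:

```python
import torch

def bf16_ok_on_this_device() -> bool:
    """Mirror of the skip condition: only CUDA devices with compute
    capability below 8.0 are gated; other backends pass through (sketch)."""
    if not torch.cuda.is_available():
        return True  # non-CUDA backends aren't gated by this check
    major, _minor = torch.cuda.get_device_capability()
    return major >= 8
```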
```diff
-if is_torch_xpu_available():
-    return f"xpu:{state.local_process_index}"
+if torch.cuda.is_available() or is_torch_xpu_available():
     return state.local_process_index
 elif is_torch_npu_available():
```
We don't need this workaround anymore; XPU now supports integer device indices.
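The dropped branch existed because XPU once required an explicit `"xpu:{index}"` string where CUDA accepted a bare integer. A quick sketch of why the unified integer path now works (the behavior shown is assumed from the comment above):

```python
import torch

# An explicit backend string plus integer index has always worked:
explicit = torch.device("xpu", 0)

# Recent PyTorch also resolves a bare integer ordinal against the active
# accelerator backend, so trainers can return state.local_process_index
# directly instead of formatting an f"xpu:{index}" string (illustrative).
bare = torch.device(0)
```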
Nice, thanks! Just one comment.
Co-authored-by: Quentin Gallouédec <45557362+qgallouedec@users.noreply.github.com>
@qgallouedec is the test failing due to the CI issue?
Yes, fixing it in #3551
The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.
Let's wait for #3553 to be merged
@kashif, please help review and comment, thanks very much.