Fixes clipping #601
Conversation
@ananyahjha93 please don't merge anything into main without a review. This PR touches some critical components, and I'm pretty sure your changes will force an unnecessary host-device sync on every batch, which could have a big negative impact on training throughput.
```diff
-    group, max_norm_ratio, global_step, all_metrics, collect_param_metrics=collect_param_metrics
+    group, max_norm_ratio, global_step, all_metrics, collect_param_metrics=True
```
Why this change? This will force a host-device sync on every batch.
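A minimal sketch of the concern (illustrative only, not the repo's actual clipping code or API): when per-parameter metrics are collected unconditionally, scalar values typically get pulled to the host via `.item()` or `.cpu()`, and any such call on a CUDA tensor blocks the CPU until all queued kernels finish, adding a synchronization point to every step.

```python
import torch

def clip_grads(params, max_norm, collect_param_metrics=False):
    # Hypothetical clipping routine, named only for illustration.
    metrics = {}
    grads = [p.grad for p in params if p.grad is not None]
    total_norm = torch.linalg.vector_norm(
        torch.stack([torch.linalg.vector_norm(g) for g in grads])
    )
    clip_coef = (max_norm / (total_norm + 1e-6)).clamp(max=1.0)
    for g in grads:
        g.mul_(clip_coef)  # stays on device; no sync required
    if collect_param_metrics:
        # .item() copies the scalar to the host, which forces the CPU to wait
        # for the GPU -- a host-device sync on every batch this branch runs.
        metrics["grad/total_norm"] = total_norm.item()
    return metrics
```

Keeping `collect_param_metrics` as a pass-through flag lets the trainer skip that branch on most steps, which is why hard-coding it to `True` is costly.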
Let me send in a PR for this fix!
```diff
-    group, max_norm, all_metrics, collect_param_metrics=collect_param_metrics
+    group, max_norm, all_metrics, collect_param_metrics=True
```
Same question here.
Added tests (CPU and GPU) that compare torch clipping against OLMo clipping, and fixed clipping for DDP and FSDP no_shard.
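A hedged sketch of the kind of parity test described above; the real tests would exercise the repo's clipping path, which is stood in for here by a small manual clip so the example runs on its own.

```python
import copy
import torch

def test_clipping_matches_torch(max_norm: float = 0.1):
    torch.manual_seed(0)
    ref_model = torch.nn.Linear(8, 8)
    test_model = copy.deepcopy(ref_model)

    x = torch.randn(4, 8)
    ref_model(x).sum().backward()
    test_model(x).sum().backward()

    # Reference: PyTorch's built-in clipping.
    torch.nn.utils.clip_grad_norm_(ref_model.parameters(), max_norm)

    # Candidate: in the real tests this would be the repo's clipping routine
    # (replaced here by a manual clip purely for illustration).
    grads = [p.grad for p in test_model.parameters()]
    total_norm = torch.linalg.vector_norm(
        torch.stack([torch.linalg.vector_norm(g) for g in grads])
    )
    clip_coef = (max_norm / (total_norm + 1e-6)).clamp(max=1.0)
    for g in grads:
        g.mul_(clip_coef)

    # The two clipping implementations should leave identical gradients.
    for p_ref, p_test in zip(ref_model.parameters(), test_model.parameters()):
        torch.testing.assert_close(p_ref.grad, p_test.grad)
```

The same check can be parameterized over device (CPU/GPU) and over DDP / FSDP no_shard wrappers to cover the cases this PR fixes.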
@epwalsh can we merge this PR so that I can push the DDP one after this?