@qgallouedec (Member) commented Mar 25, 2025

This PR introduces the following changes affecting the CI:

Unless there is a good reason, use the default config values, for greater readability.

E.g., in test_dpo_trainer_with_weighting:

  training_args = DPOConfig(
      output_dir=tmp_dir,
-     per_device_train_batch_size=2,
-     max_steps=3,
-     remove_unused_columns=False,
-     gradient_accumulation_steps=1,
      learning_rate=9e-1,
-     eval_strategy="steps",
-     beta=0.1,
-     loss_type="sigmoid",
-     precompute_ref_log_probs=False,
      use_weighting=True,
      report_to="none",
  )

For slow tests, use a slow marker

Instead of keeping them in a dedicated subfolder:

+ @pytest.mark.slow
  def my_slow_test(self): 

Don't expose the import_utils to the root of the lib

These are mainly for internal use:

- from trl import is_diffusers_available
+ from trl.import_utils import is_diffusers_available
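These availability helpers typically follow the standard find_spec pattern; a minimal sketch of that pattern (illustrative only, not TRL's actual implementation):

```python
import importlib.util

def is_diffusers_available() -> bool:
    # True if the "diffusers" package can be imported in this environment.
    # find_spec only probes the import machinery, so the check is cheap and
    # has no import side effects.
    return importlib.util.find_spec("diffusers") is not None
```

Internal modules can then guard optional code paths with a plain if is_diffusers_available(): check, without forcing every user of the library root to pay for the lookup.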

For low priority tests, use a low_priority marker

E.g., for the AlignProp trainer:

+ @pytest.mark.low_priority
  def my_low_priority_test(self): 
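For both markers to be usable without unknown-mark warnings, they need to be registered; a minimal conftest.py sketch (the marker descriptions here are assumptions, not taken from this PR):

```python
# conftest.py -- register the custom markers so pytest (especially with
# --strict-markers) accepts them instead of warning about unknown marks.
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "slow: slow tests, deselected in the fast CI job"
    )
    config.addinivalue_line(
        "markers", "low_priority: tests that only need to run in scheduled CI"
    )
```

The fast CI job can then deselect both groups with: pytest -m "not slow and not low_priority"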

Refactor BCO tests

The tests are mostly identical to before, but cleaned up and optimized. Other refactors will follow.

Comment on lines -419 to -425
# check that the components of the trainer.model are monkey patched:
self.assertTrue(any("Liger" in type(module).__name__ for module in trainer.model.model.modules()))
Member Author:

This assertion doesn't pass anymore. The Liger integration has been moved to transformers anyway.

@@ -170,55 +176,3 @@ def test_peft_training(self):

self.assertTrue(critic_weights_updated, "Critic weights were not updated during training")
self.assertTrue(policy_weights_updated, "Policy LoRA weights were not updated during training")

def test_with_num_train_epochs(self):
Member Author:

seems like a duplicated test

Comment on lines -39 to -47
"import_utils": [
"is_deepspeed_available",
"is_diffusers_available",
"is_llm_blender_available",
"is_mergekit_available",
"is_rich_available",
"is_unsloth_available",
"is_vllm_available",
],
Member Author:

Having that at the root is not a good thing, IMO.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member Author:

Heavy refactor of this one. To keep the review easy, I'll refactor the others in follow-up PRs.

Member Author:

Refactor.

Before: 1 min 08 s
Now: 44 s

@@ -6,14 +6,14 @@ ACCELERATE_CONFIG_PATH = `pwd`/examples/accelerate_configs
COMMAND_FILES_PATH = `pwd`/commands

test:
pytest -n auto --dist=loadfile -s -v --reruns 5 --reruns-delay 1 --only-rerun '(OSError|Timeout|HTTPError.*502|HTTPError.*504||not less than or equal to 0.01)' ./tests/
Member Author:

--dist=loadfile isn't necessary, and probably slows the tests down when used.

@qgallouedec qgallouedec requested review from shirinyamani, kashif and lewtun and removed request for shirinyamani April 4, 2025 23:56
@@ -228,6 +228,19 @@ def __init__(
if processing_class is None:
processing_class = AutoTokenizer.from_pretrained(model_id)

# Model
Member Author:

This part should be before the collator part

@qgallouedec qgallouedec changed the title 🏃 Faster CI 🏃 Fix and make CI faster Apr 6, 2025
@lewtun (Member) left a comment:

Nice clean up! LGTM with a question on whether we can speed things up even further by using setUpClass() to init the model / tokenizer once per set of trainer tests

@@ -20,76 +20,83 @@
from accelerate import Accelerator
from datasets import load_dataset
from parameterized import parameterized
from transformers import AutoModel, AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer
Member:

Just so I understand, we're removing testing for seq2seq models in this PR right? This is fine with me since no significant LLM with that arch has been released for several years, but maybe good to mention in the PR description so users can find the PR we dropped testing for this if needed.

Member Author:

In fact, we defined an AutoModelForSeq2SeqLM model in the setUp, but no test used it, so technically we're not removing any tests.

eval_strategy="steps" if eval_dataset else "no",
beta=0.1,
precompute_ref_log_probs=pre_compute,
remove_unused_columns=False, # warning raised if not set to False
Member:

Not for this PR, but should we then set this as the default value in BCOConfig?

Member Author:

Or even better, make BCO support this arg. Like we did for DPO in #2233

@require_sklearn
def test_bco_trainer(self, name, pre_compute, eval_dataset, config_name):
def test_train(self, config_name):
model_id = "trl-internal-testing/tiny-Qwen2ForCausalLM-2.5"
@lewtun (Member) commented Apr 8, 2025:

Could we save a bit of time by using setUpClass() to init the model / tokenizer once? I guess one caveat here is that training might mutate the model, in which case it's better to keep it as you have it.

Member Author:

> I guess one caveat here is that training might mutate the model

yes, see #3160 (comment)

self.assertIn("model.safetensors", os.listdir(tmp_dir + "/checkpoint-5"))

def test_sft_trainer_infinite_with_model_epochs(self):
def test_with_model_neftune(self):
Member:

Not for this PR, but doesn't neftune now live in transformers (and could be removed from our tests)?

Member Author:

agree, let's remove it in a follow-up PR

@qgallouedec (Member Author):

> Nice clean up! LGTM with a question on whether we can speed things up even further by using setUpClass() to init the model / tokenizer once per set of trainer tests

The issue with using setUpClass is that the objects are shared between test cases, and this can interfere with tests. Initially we used it, but we've changed to setUp instead. See here: #1895
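The interference risk can be sketched with plain unittest (names are illustrative, not from TRL): an object built once in setUpClass is shared across test methods, so an in-place mutation in one test leaks into the next, while setUp rebuilds the object per test:

```python
import unittest

class FakeModel:
    """Stand-in for a model whose weights a trainer mutates in place."""
    def __init__(self):
        self.weights = [0.0]

class SharedStateDemo(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.shared_model = FakeModel()   # built once, shared by all tests

    def setUp(self):
        self.fresh_model = FakeModel()   # rebuilt before every test

    def test_a_train_mutates(self):
        # "Training" mutates both models in place.
        self.shared_model.weights[0] += 1.0
        self.fresh_model.weights[0] += 1.0

    def test_b_sees_leak(self):
        # Runs after test_a (unittest orders methods alphabetically):
        # the shared model still carries test_a's mutation, the fresh one doesn't.
        self.assertEqual(self.shared_model.weights, [1.0])
        self.assertEqual(self.fresh_model.weights, [0.0])

result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(SharedStateDemo)
)
```

Both assertions in test_b pass only because the leak is expected here; in a real trainer suite the same leak would make test outcomes depend on execution order, which is why per-test setUp is the safer default.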

@qgallouedec qgallouedec merged commit b6bcafb into main Apr 8, 2025
10 checks passed
@qgallouedec qgallouedec deleted the faster-ci branch April 8, 2025 13:12
yxliu-TAMU pushed a commit to mincheolseong/ECEN743-GRPO-Project-Proposal that referenced this pull request Apr 20, 2025