@qgallouedec (Member) commented Mar 25, 2025

This PR introduces the following changes affecting the CI:

Unless there is a good reason, use the default config values, for greater readability.

E.g., in test_dpo_trainer_with_weighting:

  training_args = DPOConfig(
      output_dir=tmp_dir,
-     per_device_train_batch_size=2,
-     max_steps=3,
-     remove_unused_columns=False,
-     gradient_accumulation_steps=1,
      learning_rate=9e-1,
-     eval_strategy="steps",
-     beta=0.1,
-     loss_type="sigmoid",
-     precompute_ref_log_probs=False,
      use_weighting=True,
      report_to="none",
  )

For slow tests, use a slow marker

Instead of keeping them in a dedicated subfolder:

+ @pytest.mark.slow
  def my_slow_test(self): 

Don't expose the import_utils to the root of the lib

These are mainly for internal use:

- from trl import is_diffusers_available
+ from trl.import_utils import is_diffusers_available
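These availability helpers typically follow the standard find_spec pattern; a minimal sketch of that pattern (illustrative only, not TRL's actual implementation):

```python
import importlib.util

def is_diffusers_available() -> bool:
    # True if the "diffusers" package can be imported in this environment.
    # find_spec only probes the import machinery, so the check is cheap and
    # has no import side effects.
    return importlib.util.find_spec("diffusers") is not None
```

Internal modules can then guard optional code paths with a plain if is_diffusers_available(): check, without forcing every user of the library root to pay for the lookup.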

For low priority tests, use a low_priority marker

E.g., for the AlignProp trainer:

+ @pytest.mark.low_priority
  def my_low_priority_test(self): 
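For both markers to be usable without unknown-mark warnings, they need to be registered; a minimal conftest.py sketch (the marker descriptions here are assumptions, not taken from this PR):

```python
# conftest.py -- register the custom markers so pytest (especially with
# --strict-markers) accepts them instead of warning about unknown marks.
def pytest_configure(config):
    config.addinivalue_line(
        "markers", "slow: slow tests, deselected in the fast CI job"
    )
    config.addinivalue_line(
        "markers", "low_priority: tests that only need to run in scheduled CI"
    )
```

The fast CI job can then deselect both groups with: pytest -m "not slow and not low_priority"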

Refactor BCO tests

The tests are mostly identical to before, but cleaned up and optimized. Other refactors will follow.

Comment on lines -419 to -425
# check that the components of the trainer.model are monkey patched:
self.assertTrue(any("Liger" in type(module).__name__ for module in trainer.model.model.modules()))
Member Author:

This assertion doesn't pass anymore. The Liger integration has been moved to transformers anyway.

@@ -170,55 +176,3 @@ def test_peft_training(self):

self.assertTrue(critic_weights_updated, "Critic weights were not updated during training")
self.assertTrue(policy_weights_updated, "Policy LoRA weights were not updated during training")

def test_with_num_train_epochs(self):
Member Author:

seems like a duplicated test

Comment on lines -39 to -47
"import_utils": [
"is_deepspeed_available",
"is_diffusers_available",
"is_llm_blender_available",
"is_mergekit_available",
"is_rich_available",
"is_unsloth_available",
"is_vllm_available",
],
Member Author:

Having that at the root is not a good thing, IMO.

@HuggingFaceDocBuilderDev

The docs for this PR live here. All of your documentation changes will be reflected on that endpoint. The docs are available until 30 days after the last update.

Member Author:

Heavy refactor of this one. To keep the review easy, I'll refactor the others in follow-up PRs.

Member Author:

Refactor.

Before: 1 min 08 s
Now: 44 s

@@ -6,14 +6,14 @@ ACCELERATE_CONFIG_PATH = `pwd`/examples/accelerate_configs
COMMAND_FILES_PATH = `pwd`/commands

test:
pytest -n auto --dist=loadfile -s -v --reruns 5 --reruns-delay 1 --only-rerun '(OSError|Timeout|HTTPError.*502|HTTPError.*504||not less than or equal to 0.01)' ./tests/
Member Author:

--dist=loadfile isn't necessary, and probably slows the tests down when used.

@qgallouedec qgallouedec requested review from shirinyamani, kashif and lewtun and removed request for shirinyamani April 4, 2025 23:56
@@ -228,6 +228,19 @@ def __init__(
if processing_class is None:
processing_class = AutoTokenizer.from_pretrained(model_id)

# Model
Member Author:

This part should be before the collator part

@qgallouedec qgallouedec changed the title 🏃 Faster CI 🏃 Fix and make CI faster Apr 6, 2025
@lewtun (Member) left a comment:

Nice clean up! LGTM with a question on whether we can speed things up even further by using setUpClass() to init the model / tokenizer once per set of trainer tests

@@ -20,76 +20,83 @@
from accelerate import Accelerator
from datasets import load_dataset
from parameterized import parameterized
from transformers import AutoModel, AutoModelForCausalLM, AutoModelForSeq2SeqLM, AutoTokenizer
Member:

Just so I understand, we're removing testing for seq2seq models in this PR right? This is fine with me since no significant LLM with that arch has been released for several years, but maybe good to mention in the PR description so users can find the PR we dropped testing for this if needed.

Member Author:

In fact, we defined an AutoModelForSeq2SeqLM model in the setUp, but no test used it, so technically we're not removing any tests.

eval_strategy="steps" if eval_dataset else "no",
beta=0.1,
precompute_ref_log_probs=pre_compute,
remove_unused_columns=False, # warning raised if not set to False
Member:

Not for this PR, but should we then set this as the default value in BCOConfig?

Member Author:

Or even better, make BCO support this arg. Like we did for DPO in #2233

@require_sklearn
def test_bco_trainer(self, name, pre_compute, eval_dataset, config_name):
def test_train(self, config_name):
model_id = "trl-internal-testing/tiny-Qwen2ForCausalLM-2.5"
@lewtun (Member) commented Apr 8, 2025:

Could we save a bit of time by using setUpClass() to init the model / tokenizer once? I guess one caveat here is that training might mutate the model, in which case it's better to keep it as you have it.

Member Author:

> I guess one caveat here is that training might mutate the model

yes, see #3160 (comment)

self.assertIn("model.safetensors", os.listdir(tmp_dir + "/checkpoint-5"))

def test_sft_trainer_infinite_with_model_epochs(self):
def test_with_model_neftune(self):
Member:

Not for this PR, but doesn't neftune now live in transformers (and could be removed from our tests)?

Member Author:

agree, let's remove it in a follow-up PR

@qgallouedec (Member Author):

> Nice clean up! LGTM with a question on whether we can speed things up even further by using setUpClass() to init the model / tokenizer once per set of trainer tests

The issue with using setUpClass is that the objects are shared between test cases, and this can interfere with tests. Initially we used it, but we've changed to setUp instead. See here: #1895
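The interference risk can be sketched with plain unittest (names are illustrative, not from TRL): an object built once in setUpClass is shared across test methods, so an in-place mutation in one test leaks into the next, while setUp rebuilds the object per test:

```python
import unittest

class FakeModel:
    """Stand-in for a model whose weights a trainer mutates in place."""
    def __init__(self):
        self.weights = [0.0]

class SharedStateDemo(unittest.TestCase):
    @classmethod
    def setUpClass(cls):
        cls.shared_model = FakeModel()   # built once, shared by all tests

    def setUp(self):
        self.fresh_model = FakeModel()   # rebuilt before every test

    def test_a_train_mutates(self):
        # "Training" mutates both models in place.
        self.shared_model.weights[0] += 1.0
        self.fresh_model.weights[0] += 1.0

    def test_b_sees_leak(self):
        # Runs after test_a (unittest orders methods alphabetically):
        # the shared model still carries test_a's mutation, the fresh one doesn't.
        self.assertEqual(self.shared_model.weights, [1.0])
        self.assertEqual(self.fresh_model.weights, [0.0])

result = unittest.TextTestRunner(verbosity=0).run(
    unittest.defaultTestLoader.loadTestsFromTestCase(SharedStateDemo)
)
```

Both assertions in test_b pass only because the leak is expected here; in a real trainer suite the same leak would make test outcomes depend on execution order, which is why per-test setUp is the safer default.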

@qgallouedec qgallouedec merged commit b6bcafb into main Apr 8, 2025
10 checks passed
@qgallouedec qgallouedec deleted the faster-ci branch April 8, 2025 13:12
yxliu-TAMU pushed a commit to mincheolseong/ECEN743-GRPO-Project-Proposal that referenced this pull request Apr 20, 2025