
Conversation

zhiqiangdon
Contributor

@zhiqiangdon zhiqiangdon commented Dec 13, 2024

Issue #, if available:

Description of changes:

This PR implements a bag of tricks for multimodal AutoML, including multimodal model fusion strategies, multimodal data augmentation, cross-modal alignment, converting tabular data into text, handling missing modalities with modality dropout and learnable embeddings, and an ensemble learner that integrates the bag of tricks for optimal performance. To better implement these tricks, we also refactored AutoMM's codebase accordingly. For technical details, see this doc

Here is one example of using bag of tricks:

from autogluon.multimodal import MultiModalPredictor
predictor = MultiModalPredictor(label="label", use_ensemble=True)
predictor.fit(train_data=train_data)

For users who used AutoMM previously, please update your configurations as below. No action is needed if you didn't customize these hyperparameters.

optimization -> optim
optimization.learning_rate -> optim.lr
optimization.efficient_finetune -> optim.peft
optimization.loss_function -> optim.loss_func
env.num_workers_evaluation -> env.num_workers_inference
env.eval_batch_size_ratio -> env.inference_batch_size_ratio
data.label.numerical_label_preprocessing -> data.label.numerical_preprocessing
model.categorical_mlp.drop_rate -> model.categorical_mlp.dropout
model.numerical_mlp.drop_rate -> model.numerical_mlp.dropout
model.numerical_mlp.d_token -> model.numerical_mlp.token_dim
model.timm_image.max_img_num_per_col -> model.timm_image.max_image_num_per_column
model.clip.max_img_num_per_col -> model.clip.max_image_num_per_column
model.fusion_mlp.weight -> model.fusion_mlp.aux_loss_weight
model.fusion_mlp.drop_rate -> model.fusion_mlp.dropout
model.fusion_transformer.n_blocks -> model.fusion_transformer.num_blocks
model.fusion_transformer.attention_n_heads -> model.fusion_transformer.attention_num_heads
model.fusion_transformer.ffn_d_hidden -> model.fusion_transformer.ffn_hidden_size
model.ft_transformer.attention_n_heads -> model.ft_transformer.attention_num_heads
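As an illustration of the renames above, old-style hyperparameter dicts can be translated mechanically. The helper below is hypothetical (it is not AutoMM's actual migration code) and covers only a subset of the mapping:

```python
# Hypothetical helper illustrating the renames above; the key names come from
# the migration list, but this is NOT AutoMM's actual migration logic.
RENAMED_KEYS = {
    "optim.learning_rate": "optim.lr",
    "optim.efficient_finetune": "optim.peft",
    "optim.loss_function": "optim.loss_func",
    "env.num_workers_evaluation": "env.num_workers_inference",
    "env.eval_batch_size_ratio": "env.inference_batch_size_ratio",
    "model.fusion_mlp.drop_rate": "model.fusion_mlp.dropout",
}

def migrate_hyperparameters(hparams: dict) -> dict:
    """Return a copy of hparams with deprecated key names replaced."""
    migrated = {}
    for key, value in hparams.items():
        # The top-level section rename applies to every "optimization.*" key.
        if key.startswith("optimization."):
            key = "optim." + key[len("optimization."):]
        migrated[RENAMED_KEYS.get(key, key)] = value
    return migrated

old = {"optimization.learning_rate": 1e-4, "optimization.max_epochs": 10}
print(migrate_hyperparameters(old))
# {'optim.lr': 0.0001, 'optim.max_epochs': 10}
```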

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Check out this pull request on ReviewNB to see visual diffs and provide feedback on Jupyter notebooks. (Powered by ReviewNB)

@zhiqiangdon
Contributor Author

zhiqiangdon commented Dec 13, 2024

@Innixma @tonyhoo @FANGAreNotGnu @suzhoum Please help review and approve. Please also help add the necessary labels to run multi-GPU jobs, since my developer permission was revoked by someone.

George suggested adding the bag of tricks to AutoMM. The corresponding paper describing the details will be published soon. This PR is large due to an unexpected backlog.


Job PR-4737-2e80a51 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4737/2e80a51/index.html

@hohoCode

Great work! Just a curious question (not a blocker in any way): is it really necessary to change the parameter names ("optimization -> optim", "learning_rate -> lr", etc.)? This will make the updated AutoMM library not backward compatible.

@zhiqiangdon
Contributor Author

> Great work! Just a curious question (not a blocker in any way): is it really necessary to change the parameter names ("optimization -> optim", "learning_rate -> lr", etc.)? This will make the updated AutoMM library not backward compatible.

Great observation and question! To better implement the bag of tricks, we refactored several parts of the data preprocessor and processors, which are not backward compatible. We therefore took this opportunity to pay down tech debt and optimize some designs, including the config name changes. Since the optimization configs have multi-level hyperparameters such as optim.lemda.consist_threshold, shortening some keyword lengths makes users' lives easier. Note that AutoGluon's previous release notes (< 1.2) usually mentioned "Loading models trained on older versions of AutoGluon is not supported." Moreover, the optim acronym is already used by PyTorch (https://pytorch.org/docs/stable/optim.html).

@FANGAreNotGnu FANGAreNotGnu added the run-multi-gpu Run multimodal multi-gpu tests label Dec 16, 2024
@tonyhoo tonyhoo added module: multimodal model list checked You have updated the model list after modifying multimodal unit tests/docs labels Dec 17, 2024
@hohoCode

Got it. Great @zhiqiangdon !

@Innixma
Contributor

Innixma commented Dec 17, 2024

Thanks @zhiqiangdon! Taking a look at this now as I was at NeurIPS last week.

Do you have a recommendation for how we should benchmark this PR? Will the existing best_quality preset incorporate the changes automatically, or do you have a recommended configuration we should try?

This PR also contains quite a few changes and is thus challenging to review (>10k LoC changed across 188 files). Do you have a recommendation for how we should approach the review?

@Innixma
Contributor

Innixma commented Dec 17, 2024

Given the changes/additions to the hyperparameters that a user may specify, it would be good if we could write a migration guide so users know how to modify existing scripts to be compatible with this version of the code, for example, mentioning the change optimization -> optim, etc.

    labels: np.ndarray,
):
    weighted_ensemble = EnsembleSelection(
        ensemble_size=self._ensemble_size * len(predictions),
Contributor


Why * len(predictions)? This probably isn't necessary and would slow down the training a lot. I'd recommend just always using a smallish value such as ensemble_size=40.

Contributor Author

@zhiqiangdon zhiqiangdon Dec 17, 2024


len(predictions) is the number of models, not the number of samples. For example, if we train 10 models, then it would be 10. ensemble_size is a multiplier on this number, making the actual ensemble size adaptable to the number of models.

Contributor


Yeah, I was aware that len(predictions) is the number of models, but I'll mention that in my testing with TabRepo, there is virtually zero benefit to going beyond ensemble_size=40. In fact, going beyond this value is harmful due to validation overfitting on small datasets, and with a very high ensemble_size you are more likely to lose the sparseness property, which will slow down your inference speed.

Refer to fig 4 here: https://github.com/autogluon/tabrepo/blob/main/data/sensitivity.png

Contributor


Similarly, it isn't necessary to use very small values. If I'm not mistaken, with 2 models the code currently only uses ensemble_size=4. This will be suboptimal, as 4 isn't enough to get good weights between the 2 models. I'd recommend always using between 25 and 40, but selecting a single value for simplicity. Tabular uses 40.

Contributor Author

@zhiqiangdon zhiqiangdon Dec 18, 2024


Great suggestion! We used 2 * model_num in previous experiments. We can try 40 in follow-up PRs to see whether it improves performance.
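For context, the ensemble_size under discussion is the number of greedy rounds in Caruana-style ensemble selection, which is what EnsembleSelection implements. A minimal sketch (assuming mean-squared-error scoring; this is not AutoGluon's actual implementation):

```python
import numpy as np

def greedy_ensemble_selection(predictions, labels, ensemble_size=40):
    """Minimal Caruana-style greedy ensemble selection sketch.

    predictions: list of per-model prediction arrays, each shaped (n_samples,)
    labels: ground-truth array, shaped (n_samples,)
    Each round greedily adds (with replacement) the model that most reduces
    the ensemble's validation MSE; the selection counts become the weights.
    """
    counts = np.zeros(len(predictions))
    current = np.zeros_like(labels, dtype=float)
    for round_idx in range(ensemble_size):
        # Error of the averaged ensemble if each candidate model were added.
        errors = [
            np.mean(((current + p) / (round_idx + 1) - labels) ** 2)
            for p in predictions
        ]
        best = int(np.argmin(errors))
        counts[best] += 1
        current += predictions[best]
    return counts / counts.sum()

preds = [np.array([0.0, 1.0]), np.array([1.0, 0.0]), np.array([0.5, 0.5])]
labels = np.array([0.5, 0.5])
weights = greedy_ensemble_selection(preds, labels, ensemble_size=40)
# The third model predicts the labels exactly, so it receives all the weight.
```

Because models are selected with replacement, small ensemble_size values tend to yield sparse weights, which is why very large values trade away sparseness (and inference speed) for little accuracy gain.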

ensemble_mode
    The mode of conducting ensembling:
    - `one_shot`: the classic ensemble selection.
    - `sequential`: iteratively calling the classic ensemble selection, each time growing the model zoo by the next best model.
Contributor


sequential is a cool idea! It might make the most sense for this to be added in core so it isn't exclusive to AutoMM? That way we could simply call EnsembleSelection(mode="sequential") to do it in Tabular and TimeSeries as well.

Contributor


Note: can be a follow-up PR

Contributor Author


It's not exclusive to AutoMM. We can integrate it into EnsembleSelection later if needed.
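A rough sketch of the two modes described in the docstring above (hypothetical code with a squared-error scorer, not the PR's implementation):

```python
import numpy as np

def ensemble_error(predictions, labels, weights):
    """Validation MSE of a weighted ensemble (stand-in scorer)."""
    blended = sum(w * p for w, p in zip(weights, predictions))
    return float(np.mean((blended - labels) ** 2))

def classic_selection(predictions, labels, rounds=25):
    """`one_shot` mode: Caruana-style greedy selection over a fixed zoo."""
    counts = np.zeros(len(predictions))
    current = np.zeros_like(labels, dtype=float)
    for r in range(rounds):
        errs = [np.mean(((current + p) / (r + 1) - labels) ** 2) for p in predictions]
        best = int(np.argmin(errs))
        counts[best] += 1
        current += predictions[best]
    return counts / counts.sum()

def sequential_selection(predictions, labels, rounds=25):
    """`sequential` mode: grow the zoo one model at a time, rerunning the
    classic selection after each addition, and keep the best result seen."""
    remaining = list(range(len(predictions)))
    zoo = []
    best_weights, best_err = None, float("inf")
    while remaining:
        # Try each remaining model as the next addition to the zoo.
        trials = []
        for m in remaining:
            candidate = zoo + [m]
            w = classic_selection([predictions[i] for i in candidate], labels, rounds)
            full_w = np.zeros(len(predictions))
            full_w[candidate] = w
            trials.append((ensemble_error(predictions, labels, full_w), m, full_w))
        err, m, full_w = min(trials, key=lambda t: t[0])
        zoo.append(m)
        remaining.remove(m)
        if err < best_err:
            best_err, best_weights = err, full_w
    return best_weights

preds = [np.array([1.0, 0.0]), np.array([0.0, 1.0]), np.array([0.5, 0.5])]
labels = np.array([1.0, 0.0])
w = sequential_selection(preds, labels)
# The first model predicts the labels exactly, so it receives all the weight.
```

Because the scorer and selection loop are generic over prediction arrays, nothing here is AutoMM-specific, which is why folding the `sequential` mode into core's EnsembleSelection would be straightforward.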

@zhiqiangdon
Contributor Author

> Given the changes/additions to the hyperparameters that a user may specify, it would be good if we could write a migration guide so users know how to modify existing scripts to be compatible with this version of the code, for example, mentioning the change optimization -> optim, etc.

Good point. I have updated the PR description by adding the migration guideline.

@@ -4,4 +4,4 @@ AutoGluon_multimodal_best:
params: # MultimodalPredictor.fit(params) # can add the actual job time limit here
presets: best_quality
hyperparameters:
optimization.max_epochs: 10 # 10 is default
optim.max_epochs: 10 # 10 is default
Contributor


General question: minor bug fixes / updates have been made over the last several months in AutoMM mainline. Does this PR incorporate those changes as well (via rebasing during the period of development), or does the PR overwrite them?

If it rebased on them, were there any major merge conflicts resolved that are part of this PR worth mentioning?

Contributor Author


This PR is based on the most recent AutoMM mainline. I didn't notice other major conflicts except for the configuration changes.

@zhiqiangdon
Contributor Author

> Thanks @zhiqiangdon! Taking a look at this now as I was at NeurIPS last week.
>
> Do you have a recommendation for how we should benchmark this PR? Will the existing best_quality preset incorporate the changes automatically, or do you have a recommended configuration we should try?
>
> This PR also contains quite a few changes and is thus challenging to review (>10k LoC changed across 188 files). Do you have a recommendation for how we should approach the review?

Good questions. Here's some context and recommendations:

Regarding benchmarking, the best_quality preset currently trains only a single model. To incorporate the bag of tricks, you’ll need to set use_ensemble=True when initializing the predictor. At this point, autogluon-bench does not support benchmarking the bag of tricks due to the additional setup required for datasets and certain model checkpoints (e.g., meta-transformer). For now, I suggest referring to the results documented here.

This PR is large because of management issues earlier this year that led to an unexpected backlog. All unit tests and tutorial builds have passed successfully. To proceed, you can benchmark the three existing presets (single model without bag of tricks) and compare the results with previous releases. Many parts of this PR involving configuration changes should be straightforward to skim through.

@Innixma Innixma added this to the 1.3 Release milestone Dec 19, 2024
@Innixma Innixma added the priority: 0 Maximum priority label Dec 19, 2024
@Innixma
Contributor

Innixma commented Jan 9, 2025

While referring to the paper results is good, they are still a snapshot in time that doesn't necessarily represent the state of the system in practice today. The current idea I have is that once a few more people are back from vacation in mid-January, we can run a benchmark to verify stability and performance with the presets.

If it looks good and there aren't any other major concerns that come up, we can look to merge. This could take some time though, as the changes are significant.


Job PR-4737-23f5671 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4737/23f5671/index.html

@suzhoum
Contributor

suzhoum commented Feb 20, 2025

Hi @zhiqiangdon @Innixma, to address your comments, I have reprocessed the results and am attaching them here. I have reverted the negative signs for RMSE, and I have removed single-modality datasets from the mainline benchmark, so there are in total 10 multi-modality datasets (2 image-tabular, 1 image-text, 2 image-text-tabular, 5 text-tabular) for evaluation. There was, however, one benchmark missing from mainline_high_quality, but it should not affect the overall performance evaluation much.
AutoGluon_medium_master_g4_12x_ag_bench_bot_2025_02_19.csv
AutoGluon_high_master_g4_12x_ag_bench_bot_2025_02_19.csv
AutoGluon_best_master_g4_12x_ag_bench_bot_2025_02_19.csv

@Innixma
Contributor

Innixma commented Feb 21, 2025

@suzhoum can you compare all 6 in the same table? I'm curious whether, e.g., en_medium > mainline_high.

@Innixma
Contributor

Innixma commented Feb 21, 2025

In general results look very good, nice work @zhiqiangdon! And thanks for running the benchmarks @suzhoum!

I think we are nearly there for merging. The next steps are to minimize friction for existing users.

@zhiqiangdon The renaming of hyperparameters should include backwards compatibility, at least until the v1.4 release. Can you create a separate PR that points to this PR (i.e., a new branch building on top of this PR's branch) and adds the backwards compatibility for hyperparameters? I suggest a separate PR so it is easy to review in isolation from the other changes in this PR.

This way we won't have a gap period between merging this PR and adding the backwards compatibility, during which existing users running AutoGluon via a source install would have their code broken by the breaking change.

You can consider this PR to be mostly finalized; I don't expect to request many changes before merge.

@Innixma
Contributor

Innixma commented Feb 21, 2025

Re backwards compatibility, one idea is to check the dictionary specified by the user at the start of the predictor.fit call, warn the user if an old hyperparameter name is used, and then replace it with the new one.

I had more thoughts on this earlier here: #4737 (comment)

@suzhoum
Contributor

suzhoum commented Feb 21, 2025

> @suzhoum can you compare all 6 in the same table? I'm curious whether, e.g., en_medium > mainline_high.

Please see attached.
AutoGluon_bot_vs_master_g4_12x_ag_bench_bot_2025_02_19.csv

@Innixma
Contributor

Innixma commented Feb 21, 2025

Thanks @suzhoum! This seems to indicate that en_medium > main_best by a notable margin. If we can resolve respecting the time limit while maintaining most of the benefit in a future PR, this could be a compelling option for users.

@zhiqiangdon
Contributor Author

> In general results look very good, nice work @zhiqiangdon! And thanks for running the benchmarks @suzhoum!
>
> I think we are nearly there for merging. The next steps are to minimize friction for existing users.
>
> @zhiqiangdon The renaming of hyperparameters should include backwards compatibility, at least until the v1.4 release. Can you create a separate PR that points to this PR (i.e., a new branch building on top of this PR's branch) and adds the backwards compatibility for hyperparameters? I suggest a separate PR so it is easy to review in isolation from the other changes in this PR.
>
> This way we won't have a gap period between merging this PR and adding the backwards compatibility, during which existing users running AutoGluon via a source install would have their code broken by the breaking change.
>
> You can consider this PR to be mostly finalized; I don't expect to request many changes before merge.

@Innixma I have added the backward compatibility in this PR; opening another PR doesn't seem to save time. You can check the logic in the new commit.


Job PR-4737-d89efbe is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4737/d89efbe/index.html

Comment on lines 862 to 869
for old_k, new_k in key_pairs.items():
    if k == old_k:
        overrides[new_k] = overrides.pop(provided_k)
        logger.warning(
            f"The hyperparameter name {provided_k} is deprecated. "
            f"We recommend using the new name {new_k} instead."
        )
        break
Contributor


nit: if k in key_pairs: is simpler and avoids the for loop.

Contributor Author


Thanks! Adopted the suggestion.
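The adopted pattern can be sketched as follows. The key names and mapping here are illustrative only; the real table and function live in AutoMM's config utilities:

```python
import logging

logger = logging.getLogger(__name__)

# Illustrative subset of the old -> new mapping; not the full AutoMM table.
KEY_PAIRS = {
    "optimization.learning_rate": "optim.lr",
    "env.num_workers_evaluation": "env.num_workers_inference",
}

def make_overrides_backward_compatible(overrides: dict) -> dict:
    """Replace deprecated hyperparameter names, warning once per renamed key."""
    overrides = dict(overrides)  # shallow copy so the caller's dict is untouched
    for k in list(overrides):
        if k in KEY_PAIRS:  # direct membership test; no inner for loop needed
            new_k = KEY_PAIRS[k]
            overrides[new_k] = overrides.pop(k)
            logger.warning(
                f"The hyperparameter name {k} is deprecated. "
                f"We recommend using the new name {new_k} instead."
            )
    return overrides

out = make_overrides_backward_compatible({"optimization.learning_rate": 1e-3})
# out == {"optim.lr": 1e-3}; a deprecation warning is logged
```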

@@ -812,3 +813,59 @@ def update_ensemble_hyperparameters(
hyperparameters = presets_hyperparameters

return hyperparameters


def make_overrides_backward_compatible(overrides: Dict):
Contributor


Should we do overrides = copy.deepcopy(overrides) at the start of this function to avoid in-place mutation of the user's dictionary?

Contributor Author


A copy.deepcopy is applied before calling apply_omegaconf_overrides, so there is no need to copy again. See line 188 here: https://github.com/autogluon/autogluon/pull/4737/files#diff-a10c8783a894991a02709f19d3b1a2653731c04d798ec8b5234b0c9dc2d55bcdR188

Comment on lines 865 to 868
logger.warning(
f"The hyperparameter name {provided_k} is depreciated. "
f"We recommend using the new name {new_k} instead."
)
Contributor


Suggested change
    logger.warning(
        f"The hyperparameter name {provided_k} is deprecated. "
        f"We recommend using the new name {new_k} instead."
    )
    logger.warning(
        f"The hyperparameter name {provided_k} is deprecated. "
        f"We recommend using the new name {new_k} instead. "
        f"The deprecated hyperparameter will raise an exception starting in AutoGluon 1.4.0."
    )

Contributor Author


Updated.

@Innixma
Contributor

Innixma commented Feb 24, 2025

@zhiqiangdon thanks! I added comments to the backward compatibility code


Job PR-4737-5c75857 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4737/5c75857/index.html

@Innixma
Contributor

Innixma commented Feb 28, 2025

@zhiqiangdon can you resolve the minor merge conflict? Once resolved, I think we are good to merge.

hyperparameters={"AG_AUTOMM": {"env.num_workers": 0, "env.num_workers_evaluation": 0}},
time_limit=60,
hyperparameters={"AG_AUTOMM": {"env.num_workers": 0, "env.num_workers_inference": 0}},
Contributor


Please keep the time_limit=60; otherwise this takes around 10 minutes, which slows down the CI.

Contributor Author


Updated


github-actions bot commented Mar 1, 2025

Job PR-4737-0787513 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4737/0787513/index.html

@zhiqiangdon
Contributor Author

> @zhiqiangdon can you resolve the minor merge conflict? Once resolved, I think we are good to merge.

@Innixma The conflict was resolved, and tests have passed.

Contributor

@Innixma Innixma left a comment


Looks good to me! Thanks @zhiqiangdon for the major improvements and cleanup, as well as the quick follow-ups!

@Innixma Innixma merged commit d44ccf5 into autogluon:master Mar 3, 2025
34 checks passed