prateekdesai04
Contributor

Issue #, if available: #3952

Description of changes:
This feature compares the versions of all the dependencies installed while installing AutoGluon.

  1. A file listing all the dependencies is prepared for every CI run.
  2. The file from the most recent previous CI run is fetched and compared with the file from the current run, and the differences are listed.
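
The two steps above can be sketched roughly as follows. This is a minimal illustration, not the script added in this PR: the helper names, the timestamped naming scheme, and the use of a local directory in place of the S3 bucket are all assumptions.

```shell
#!/usr/bin/env bash
# Sketch only: helper names and the file naming scheme are assumptions,
# and a local directory stands in for the S3 bucket.

# Step 1: record the installed packages in a timestamped file and print its path.
write_snapshot() {
  local out_dir="$1"
  local out_file
  out_file="${out_dir}/package_versions_$(date -u +%Y%m%d_%H%M%S).txt"
  python -m pip freeze > "$out_file"
  printf '%s\n' "$out_file"
}

# Step 2a: pick the most recent snapshot in a directory.
# The timestamp format above sorts chronologically as plain text.
latest_snapshot() {
  local dir="$1"
  find "$dir" -maxdepth 1 -type f -name 'package_versions_*.txt' | sort | tail -n 1
}

# Step 2b: list the lines that differ between the previous and current snapshot.
# Note that diff exits with status 1 when the files differ.
compare_snapshots() {
  local previous="$1" current="$2"
  diff "$previous" "$current"
}
```

In a CI job, latest_snapshot would be called on the downloaded previous files before write_snapshot adds the new one, and the two resulting paths would then be passed to compare_snapshots.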

Currently, the differences can be viewed in the package_diff task of the CI run.
I will also open a follow-up PR that improves the user experience by displaying the diff in GitHub comments once a PR is raised.
By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

Ubuntu and others added 2 commits March 4, 2024 23:23
[timeseries] Add method for plotting forecasts (#3889)

[timeseries] Add support for categorical covariates (#3874)

github-actions bot commented Mar 5, 2024

Job PR-3962-d29ed93 is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-3962/d29ed93/index.html

Contributor

@Innixma Innixma left a comment

LGTM, excited to see this one in action! Thanks for the contribution

@prateekdesai04 prateekdesai04 merged commit 25600a5 into autogluon:master Mar 5, 2024
Contributor

@zhiqiangdon zhiqiangdon left a comment

LGTM. Thanks for enabling this automatic detection. Looking forward to follow-up PRs to make this easier to use.

- name: Fetch previous version file and compare
  run: |
    /bin/bash ./.github/workflow_scripts/version_diff.sh
  continue-on-error: true
Collaborator

Why should we proceed with the workflow in spite of a step failing? Is it acceptable for issues to remain unnoticed for a period of time?

Contributor Author

If the folder in the S3 bucket is empty, there is no initial file to compare against. In that case we should still proceed to the next step, which uploads the current pip freeze file to S3 so that it becomes the first file.
A failing package diff is acceptable because we only use this step to produce a diff, not to pass or fail the entire workflow.

Collaborator

My primary interest lies in learning how to distinguish between expected failures, such as package discrepancies or non-existent S3 files, and unexpected ones, like tool malfunctions or missing credentials. Without this differentiation, and without the necessary manual upkeep, the tool is likely to cease functioning over time, primarily due to these unforeseen issues.

Contributor Author

The (continue-on-error: true) setting applies only to the [Fetch previous version file and compare] step; it does not apply to any step before or after it. If one of those steps fails, the workflow fails and reports the specific error, which helps us identify unexpected failures (such as missing credentials or an error while configuring credentials).
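
As an illustration of the distinction raised in this thread, the comparison script itself could separate expected outcomes from unexpected ones. The sketch below is hypothetical and is not what this PR implements. It relies on the documented exit statuses of diff: 0 for identical files, 1 for differing files, and 2 for trouble. The function name and file arguments are assumptions.

```shell
#!/usr/bin/env bash
# Hypothetical helper, not part of the PR: treat "no previous file" and
# "files differ" as expected, and let real diff errors fail the step.
diff_or_fail() {
  local previous="$1" current="$2" output="$3"

  # Expected on the very first run: nothing has been uploaded yet.
  if [ ! -f "$previous" ]; then
    echo "No previous snapshot found; nothing to compare yet."
    return 0
  fi

  local status=0
  diff "$previous" "$current" > "$output" || status=$?

  case "$status" in
    0) echo "No package changes." ;;
    1) echo "Package changes written to $output." ;;
    *) echo "diff failed with status $status." >&2
       return "$status" ;;
  esac
}
```

With a helper along these lines, the step would fail only when diff itself could not run, so continue-on-error would no longer be needed to get past the first-run case.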

old_latest_file="old_${latest_file}"
mv "./$latest_file" "./$old_latest_file"

diff ./package_versions_* ./old_package_versions_* > ./diff_output.txt
Collaborator

The diff command compares files line by line in a sequential manner. If the packages are listed in a different order between two files, diff will report these as differences, even if the same packages and versions are present. This can make the output noisy and harder to understand, especially when packages are added or deleted.

Contributor Author

pip freeze always lists the packages in sorted order, so the diff command compares two sorted files.
If a package is added or deleted, then yes, the output may be noisy. In that case we would download the current and previous files, remove the line containing the added or deleted package, and then diff them; this effort may have to be manual. That seems okay, as we do not add or delete packages often.
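
For the added or deleted package case discussed here, a package-aware comparison is one possible alternative to manual cleanup. The following is a minimal sketch, assuming both files contain pip freeze style lines of the form name==version. The function name and output wording are hypothetical and not part of this PR.

```shell
#!/usr/bin/env bash
# Hypothetical helper: report added, removed, and changed packages between
# two pip-freeze style files. Lines without "==" (for example editable or
# URL-based installs) are skipped in this sketch.
package_changes() {
  local previous="$1" current="$2"
  awk -F'==' '
    NF < 2 { next }                                  # not a name==version line
    FILENAME == ARGV[1] { old[$1] = $2; next }       # remember previous versions
    {
      seen[$1] = 1
      if (!($1 in old))        print "added " $1 " " $2
      else if (old[$1] != $2)  print "changed " $1 " " old[$1] " -> " $2
    }
    END {
      # Iteration order is unspecified in awk; pipe through sort if it matters.
      for (name in old) if (!(name in seen)) print "removed " name " " old[name]
    }
  ' "$previous" "$current"
}
```

Because the comparison is keyed on the package name rather than on line position, an added or removed package shows up as a single line and unchanged packages are not reported at all.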

Collaborator

The current implementation is acceptable as it primarily functions as a support tool to help authors determine if functional failures are associated with package version changes.

ddelange added a commit to ddelange/autogluon that referenced this pull request Mar 21, 2024
…tch-4

* 'master' of https://github.com/awslabs/autogluon: (46 commits)
  [core] move transformers to setup_utils, bump dependency version (autogluon#3984)
  [AutoMM] Fix one lightning upgrade issue (autogluon#3991)
  [CI][Feature] Create a package version table (autogluon#3972)
  [v.1.1][Upgrade] PyTorch 2.1 and CUDA 12.1 upgrade (autogluon#3982)
  [WIP] Code implementation of Conv-LoRA (autogluon#3933)
  [timeseries] Ensure that all metrics handle missing values in the target (autogluon#3966)
  [timeseries] Fix path and device bugs (autogluon#3979)
  [AutoMM]Remove grounding-dino (autogluon#3974)
  [Docs] Update install modules content (autogluon#3976)
  Add note on pd.to_datetime (autogluon#3975)
  [AutoMM] Improve DINO performance (autogluon#3970)
  Minor correction in differ to pick correct environment (autogluon#3968)
  Fix windows python 3.11 issue by removing ray (autogluon#3956)
  [CI][Feature] Package Version Comparator (autogluon#3962)
  [timeseries] Add support for categorical covariates (autogluon#3874)
  [timeseries] Add method for plotting forecasts (autogluon#3889)
  Update conf.py copyright to reflect current year (autogluon#3932)
  [Timeseries][CI]Refactor CI to skip AutoMM and Tabular tests w.r.t timeseries changes (autogluon#3942)
  Fix HPO crash in memory check (autogluon#3931)
  [AutoMM][CI] Capping scikit-learn to avoid HPO test failure (autogluon#3947)
  ...
LennartPurucker pushed a commit to LennartPurucker/autogluon that referenced this pull request Jun 1, 2024
Co-authored-by: Ubuntu <ubuntu@ip-172-31-9-154.us-west-2.compute.internal>