[tabular] Improve NN_TORCH runtime estimate #4247

Innixma · 2024-06-05T21:03:04Z

Issue #, if available:

Description of changes:

Improve NN_TORCH runtime estimate.

When NN_TORCH runs in a ray process, the first batch includes significant time overhead from initializing torch. The mainline time estimate multiplies this overhead by the number of batches in an epoch, so it becomes increasingly pessimistic as the data grows. For large datasets this produced extreme estimates, which caused neural network training to be skipped entirely in many cases where it would have been useful.

The fix avoids using the first batch as part of the runtime estimate of future batches. This allows the v2 time estimate to be far more accurate.

Example on the adult income dataset:

(_ray_fit pid=1972457) v1 estimate: 46.5966s | v2 estimate: 2.3756s
(_ray_fit pid=1972457) True Time: 2.3813s
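
To illustrate the idea, here is a minimal sketch and not the actual AutoGluon code. The function names, the `batch_times` list, and the sample timings are all hypothetical. It assumes the first batch carries a one-time warm-up cost, so the v2-style estimate counts that batch once and extrapolates the remaining batches from the steady-state average.

```python
from typing import Sequence


def estimate_epoch_time_v1(batch_times: Sequence[float], num_batches: int) -> float:
    """Naive estimate (hypothetical): extrapolate from the first batch only."""
    return batch_times[0] * num_batches


def estimate_epoch_time_v2(batch_times: Sequence[float], num_batches: int) -> float:
    """Estimate (hypothetical) that treats the first batch as one-time warm-up."""
    if len(batch_times) < 2:
        raise ValueError("need at least two timed batches")
    # Steady-state per-batch time, measured without the warm-up batch.
    steady = batch_times[1:]
    per_batch = sum(steady) / len(steady)
    # Count the first batch once at its measured cost, then extrapolate the rest.
    return batch_times[0] + per_batch * (num_batches - 1)


# Illustrative timings only: a slow first batch followed by fast ones.
batch_times = [2.0, 0.01, 0.01, 0.01]
num_batches = 100

v1 = estimate_epoch_time_v1(batch_times, num_batches)
v2 = estimate_epoch_time_v2(batch_times, num_batches)
print(f"v1 estimate: {v1:.4f}s | v2 estimate: {v2:.4f}s")
```

With a slow first batch, the naive estimate grows linearly with the number of batches, while the second estimate stays close to the warm-up cost plus the steady-state work.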

By submitting this pull request, I confirm that you can use, modify, copy, and redistribute this contribution, under the terms of your choice.

yinweisu · 2024-06-05T21:12:36Z

Previous CI Run	Current CI Run
botocore==1.34.118	botocore==1.34.120
fastcore==1.5.43	fastcore==1.5.44
pytest==8.2.1	pytest==8.2.2
cryptography==42.0.7	cryptography==42.0.8
huggingface-hub==0.23.2	huggingface-hub==0.23.3
boto3==1.34.118	boto3==1.34.120
typer==0.9.4	typer==0.12.3
smart-open==6.4.0	smart-open==7.0.4
thinc==8.2.3	thinc==8.2.4
ruff==0.4.7	ruff==0.4.8
prompt_toolkit==3.0.45	prompt_toolkit==3.0.46
cloudpathlib==0.16.0	cloudpathlib==0.18.1
weasel==0.3.4	weasel==0.4.1
spacy==3.7.4	spacy==3.7.5
-	shellingham==1.5.4

rey-allan

LGTM!

github-actions · 2024-06-06T00:09:52Z

Job PR-4247-4341b4b is done.
Docs are uploaded to http://autogluon-staging.s3-website-us-west-2.amazonaws.com/PR-4247/4341b4b/index.html

[tabular] Improve NN_TORCH runtime estimate

4341b4b

Innixma added the bug and module: tabular labels Jun 5, 2024

Innixma added this to the 1.1.1 Release milestone Jun 5, 2024

Innixma requested review from rey-allan and suzhoum June 5, 2024 21:03

rey-allan approved these changes Jun 5, 2024

suzhoum approved these changes Jun 5, 2024

Innixma merged commit 9b44d85 into autogluon:master Jun 5, 2024

Innixma deleted the tabular_improve_nn_time_estimate branch April 16, 2025 21:20
