Hi. Sorry for cross-posting (I asked previously on Gitter but got no responses).
I updated my Halide installation from 14.0.0 to 15.0.1, and it looks like I'm missing something pretty basic about the new API of the Adams2019 autoscheduler. For a plain 2D convolution test case, the Adams2019 autoscheduler with the same level of parallelism now generates a much worse schedule, which takes twice as long to execute. In addition, schedule generation now finishes in almost no time, as the following log shows:
generate_schedule for target=x86-64-linux-avx-avx2-f16c-fma-no_runtime-sse41-user_context
Adams2019.parallelism:8
Adams2019.beam_size:32
Adams2019.random_dropout:100
Adams2019.random_dropout_seed:0
Adams2019.weights_path:
Adams2019.disable_subtiling:0
Adams2019.disable_memoized_features:0
Adams2019.disable_memoized_blocks:0
Adams2019.memory_limit:-1
Loading weights from built-in data...
Pass 0 of 5, cost: 2.52664, time (ms): 11
Pass 1 of 5, cost: 2.52664, time (ms): 5
Pass 2 of 5, cost: 2.52664, time (ms): 5
Pass 3 of 5, cost: 2.52664, time (ms): 5
Pass 4 of 5, cost: 2.52664, time (ms): 5
Best cost: 2.52664
Cache (block) hits: 939
Cache (block) misses: 135
Cost evaluated this many times: 2485
In the old Halide (v14), it behaved differently:
generate_schedule for target=x86-64-linux-avx-avx2-f16c-fma-no_runtime-sse41-user_context
Pass 0 of 5, cost: 1.45978, time (ms): 17335
Pass 1 of 5, cost: 1.45978, time (ms): 1301
Pass 2 of 5, cost: 1.45978, time (ms): 1307
Pass 3 of 5, cost: 1.45978, time (ms): 1371
Pass 4 of 5, cost: 1.45978, time (ms): 1364
Best cost: 1.45978
Cache (block) hits: 537
Cache (block) misses: 135
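To make the two logs easier to compare, here is a minimal Python sketch that extracts the per-pass cost and search time from each of them. The helper name `summarize` and the regular expression are my own illustration, not part of Halide. Note that the reported cost is the autoscheduler's own predicted cost, not a measured runtime.

```python
import re

# Matches log lines such as "Pass 0 of 5, cost: 2.52664, time (ms): 11".
PASS_RE = re.compile(r"Pass (\d+) of (\d+), cost: ([\d.]+), time \(ms\): (\d+)")


def summarize(log):
    """Return (best cost, total search time in ms) over all 'Pass' lines."""
    passes = [(float(m.group(3)), int(m.group(4))) for m in PASS_RE.finditer(log)]
    best_cost = min(cost for cost, _ in passes)
    total_ms = sum(ms for _, ms in passes)
    return best_cost, total_ms


# Pass lines copied from the Halide 15.0.1 log.
V15_LOG = """
Pass 0 of 5, cost: 2.52664, time (ms): 11
Pass 1 of 5, cost: 2.52664, time (ms): 5
Pass 2 of 5, cost: 2.52664, time (ms): 5
Pass 3 of 5, cost: 2.52664, time (ms): 5
Pass 4 of 5, cost: 2.52664, time (ms): 5
"""

# Pass lines copied from the Halide 14.0.0 log.
V14_LOG = """
Pass 0 of 5, cost: 1.45978, time (ms): 17335
Pass 1 of 5, cost: 1.45978, time (ms): 1301
Pass 2 of 5, cost: 1.45978, time (ms): 1307
Pass 3 of 5, cost: 1.45978, time (ms): 1371
Pass 4 of 5, cost: 1.45978, time (ms): 1364
"""

v15_cost, v15_ms = summarize(V15_LOG)
v14_cost, v14_ms = summarize(V14_LOG)
cost_ratio = v15_cost / v14_cost

print(f"v15: best cost {v15_cost}, total search time {v15_ms} ms")
print(f"v14: best cost {v14_cost}, total search time {v14_ms} ms")
print(f"predicted cost ratio v15/v14: {cost_ratio:.2f}")
```

In total, the v15 search takes 31 ms across the five passes while the v14 search takes 22678 ms, and the v15 predicted cost is about 1.73 times the v14 one.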
I tried changing the autoscheduler.beam_size parameter. Setting it to 1 for greedy search produces an even worse result.
When beam_size is set to a huge value (e.g., 8192), schedule generation does take considerable time, but the resulting schedule is still bad.
Could you please explain how the new autoscheduler API should be used correctly? Thank you very much.