Duplicated suggestions generated when early stopping is enabled #2002

@shaowei-su

/kind bug

What steps did you take and what happened:

When early stopping is enabled with the bayesianoptimization algorithm, the suggestion service generates duplicate suggestions/trials.
See the suggestion service log:

INFO:pkg.suggestion.v1beta1.skopt.base_service:Optimizer tell method takes 0 seconds
INFO:pkg.suggestion.v1beta1.skopt.base_service:List of recorded Trials names: ['exp-6f14cce0-5ac6-11ed-bd7e-acde48001122-6wcpxc58', 'exp-6f14cce0-5ac6-11ed-bd7e-acde48001122-ss88lpl9', 'exp-6f14cce0-5ac6-11ed-bd7e-acde48001122-8t8ftjlj', 'exp-6f14cce0-5ac6-11ed-bd7e-acde48001122-szmjrvjc']

INFO:pkg.suggestion.v1beta1.skopt.base_service:Running Optimizer ask to query new parameters for Trials

INFO:pkg.suggestion.v1beta1.skopt.base_service:New suggested parameters for Trial: [0.0008577340571058496, 0.23212435323286384, 'True', 13]
INFO:pkg.suggestion.v1beta1.skopt.base_service:GetSuggestions returns 1 new Trials


INFO:pkg.suggestion.v1beta1.skopt.base_service:----------------------------------------------------------------------------------------------------

INFO:pkg.suggestion.v1beta1.skopt.base_service:New GetSuggestions call

INFO:pkg.suggestion.v1beta1.skopt.base_service:Succeeded Trials didn't change: 4

INFO:pkg.suggestion.v1beta1.skopt.base_service:Running Optimizer ask to query new parameters for Trials

INFO:pkg.suggestion.v1beta1.skopt.base_service:New suggested parameters for Trial: [0.0008577340571058496, 0.23212435323286384, 'True', 13]
INFO:pkg.suggestion.v1beta1.skopt.base_service:GetSuggestions returns 1 new Trials

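The key line is "Succeeded Trials didn't change: 4": the second GetSuggestions call sees no new completed trials, and the optimizer returns the same parameters as in the previous call.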
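The behaviour in the log can be illustrated with a toy stand-in for the optimizer. This is a minimal sketch, not Katib's or skopt's code: `ToyOptimizer`, `get_suggestion`, the status strings, and the trial dictionaries are all hypothetical. It assumes the service only feeds observations to the optimizer when the number of succeeded trials changes, as the log message suggests.

```python
class ToyOptimizer:
    """Hypothetical stand-in for a model-based optimizer (not skopt's API).

    ask() is deterministic given the observations seen so far, so asking
    twice without an intervening tell() yields the same point.
    """

    def __init__(self):
        self.observations = []

    def tell(self, x, loss):
        self.observations.append((x, loss))

    def ask(self):
        # The next point depends only on how many observations were recorded.
        return 0.1 * (len(self.observations) + 1)


def get_suggestion(optimizer, trials, state):
    """Assumed service logic: only succeeded trials are fed to the optimizer."""
    succeeded = [t for t in trials if t["status"] == "Succeeded"]
    if len(succeeded) != state["succeeded_count"]:
        for t in succeeded[state["succeeded_count"]:]:
            optimizer.tell(t["x"], t["loss"])
        state["succeeded_count"] = len(succeeded)
    # Otherwise the succeeded count did not change, so no tell() happens.
    return optimizer.ask()


optimizer = ToyOptimizer()
state = {"succeeded_count": 0}
trials = [{"x": 0.05, "loss": 1.0, "status": "Succeeded"}]

first = get_suggestion(optimizer, trials, state)
# The suggested trial is early stopped, so the succeeded count stays the same.
trials.append({"x": first, "loss": 0.9, "status": "EarlyStopped"})
second = get_suggestion(optimizer, trials, state)

print(first == second)
```

In this toy model the second call returns the same point as the first, because the early-stopped trial never reaches `tell()`.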

What did you expect to happen:

The suggestion service should see early-stopped trials in the service request and update skopt_suggested and loss_for_skopt accordingly: https://github.com/kubeflow/katib/blob/master/pkg/suggestion/v1beta1/skopt/base_service.py#L105
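One way to picture the expected behaviour is a filter that treats early-stopped trials as finished observations. The sketch below is an assumption-laden illustration, not the code in base_service.py: `collect_observations`, `COMPLETED_STATUSES`, the status strings, and the dictionary keys are hypothetical, the trial data is invented, and it assumes an early-stopped trial still reports a usable objective value.

```python
# Statuses treated as "finished with a usable metric" (hypothetical names).
COMPLETED_STATUSES = {"Succeeded", "EarlyStopped"}


def collect_observations(trials, already_recorded):
    """Return (assignments, losses) for finished trials not yet recorded.

    `trials` is a list of dicts with hypothetical keys "name", "status",
    "assignments", and "loss". `already_recorded` is a set of trial names
    and is updated in place.
    """
    suggested, losses = [], []
    for trial in trials:
        if trial["status"] not in COMPLETED_STATUSES:
            continue  # still running, failed, etc.
        if trial["name"] in already_recorded:
            continue  # this observation was recorded earlier
        suggested.append(trial["assignments"])
        losses.append(trial["loss"])
        already_recorded.add(trial["name"])
    return suggested, losses


# Invented example data, for illustration only.
trials = [
    {"name": "t1", "status": "Succeeded", "assignments": [0.01, 8], "loss": 0.40},
    {"name": "t2", "status": "EarlyStopped", "assignments": [0.001, 13], "loss": 0.55},
    {"name": "t3", "status": "Running", "assignments": [0.1, 4], "loss": None},
]
recorded = {"t1"}
skopt_suggested, loss_for_skopt = collect_observations(trials, recorded)
```

With this filter the early-stopped trial `t2` contributes a new observation, so the optimizer would receive new data before the next suggestion is requested.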

Anything else you would like to add:

Environment:

  • Katib version (check the Katib controller image version): v0.14.0
  • Kubernetes version: (kubectl version): v1.21.14
  • OS (uname -a): Ubuntu1804

Impacted by this bug? Give it a 👍 We prioritize the issues with the most 👍
