Conversation

andyfaff
Contributor

@andyfaff andyfaff commented Apr 1, 2019

Adds constraints for differential_evolution following Lampinen.
Depends on #9990 being merged first.
The first commit adds the machinery; tests are being created.

@andyfaff andyfaff added enhancement A new feature or improvement scipy.optimize labels Apr 1, 2019
@andyfaff
Contributor Author

andyfaff commented Apr 1, 2019

TODO:

  • what happens if there is more than one constraint presented and each of the constraints has a different M?

This is solved by concatenating constraint violations for each constraint.
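A minimal sketch of the concatenation idea (the arrays and sizes here are made up for illustration, not taken from the PR):

```python
import numpy as np

# Two constraints with different output dimensions M: their violation
# vectors are simply concatenated into one combined violation vector.
viol_a = np.array([0.0, 1.2])   # violations from a constraint with M = 2
viol_b = np.array([0.3])        # violations from a constraint with M = 1
total = np.concatenate([viol_a, viol_b])

# a candidate is feasible only when every entry of the combined vector is zero
feasible = bool(np.all(total == 0))
```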

@andyfaff andyfaff requested a review from mdhaber April 1, 2019 03:33
Contributor

@nmayorov nmayorov left a comment

Some minor notes about docstrings.

@andyfaff
Contributor Author

CI failure is a timeout.

Contributor

@tylerjereddy tylerjereddy left a comment

Some of the CI failures seem related to me?

Also, can you please mark the resolved conversations / revisions appropriately so it is easier to see what still needs doing?

@andyfaff
Contributor Author

@tylerjereddy, there were a couple of method calls that needed to be renamed after #9990 got merged. I've marked conversations that have been addressed as resolved (which is all of them).

Contributor

@tylerjereddy tylerjereddy left a comment

  • would be nice if an optimize guru signs off--I see @mdhaber has been pinged on that
  • that said, no old tests have been modified, so that's good behavior-wise, and CI is passing now
  • coverage on new code looks pretty solid too

Contributor

@mdhaber mdhaber left a comment

I'm afraid I'm not too much help here! I suppose I would just agree with @tylerjereddy's comments about the tests. And beyond hitting missed lines, a few more end to end solve() tests with constrained problems would be good, especially in the absence of thorough domain-expert review. Maybe test with multiple constraint types, LinearConstraint, at least one other objective function, etc... I can't see why they wouldn't play nice with one another, but I think it's important to test anyway.
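One way to exercise such a mix is a small end-to-end run combining a LinearConstraint and a NonlinearConstraint in one call. This is a hypothetical toy problem of my own, not one of the PR's tests, and it assumes the constraints keyword this PR adds:

```python
import numpy as np
from scipy.optimize import (differential_evolution, LinearConstraint,
                            NonlinearConstraint)

# Toy problem: minimise x0^2 + x1^2 subject to a linear and a
# nonlinear constraint at the same time.
def objective(x):
    return x[0]**2 + x[1]**2

lc = LinearConstraint([[1, 1]], 1, np.inf)                    # x0 + x1 >= 1
nlc = NonlinearConstraint(lambda x: x[0] * x[1], -np.inf, 5)  # x0 * x1 <= 5

res = differential_evolution(objective, bounds=[(-2, 2), (-2, 2)],
                             constraints=(lc, nlc), seed=1)
# the constrained optimum is x = (0.5, 0.5) with f = 0.5
```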

@andyfaff
Contributor Author

I'm not going to be able to make any changes until next Monday, perhaps the milestone needs to be bumped.

@tylerjereddy
Contributor

I'll bump it, but let's get this in soon--my review notes show very few reasons not to merge at this point apart from not having an expert sign off on review.

@tylerjereddy tylerjereddy modified the milestones: 1.3.0, 1.4.0 Apr 25, 2019
@mdhaber
Contributor

mdhaber commented Apr 25, 2019

@andyfaff I could add the benchmark problems from Lampinen as tests. At that point I'll see if I can follow the paper; if so, maybe I could check the implementation. I wish I could have completed this in time for 1.3, but I didn't notice the review request until yesterday.

@andyfaff
Contributor Author

@mdhaber, perhaps add a few - at least one LinearConstraint in the tests, and a different example in the docstring?

@mdhaber
Contributor

mdhaber commented Apr 27, 2019

@andyfaff I submitted a PR to your fork. Is that the best way to do this?
I added the first test from Lampinen, both as a single LinearConstraint and as a tuple of two LinearConstraints and two NonlinearConstraints. It takes about two minutes on my machine, so we'll have to decide whether that's acceptable for a test. In any case, both formulations of the problem are solved successfully. differential_evolution satisfies the constraints and typically gets the decision variables within about 0.03 of their optimal values, putting it about 0.22 shy of the optimal function value. This is better than Koziel (Lampinen reference 6), but not as good as Lampinen, which reports finding the global optimum (-15) in 0.57 s. Is there a difference in the underlying strategy that would explain this? I used the parameters reported in Lampinen (mutation, crossover, etc.) but didn't change the default strategy.

@mdhaber
Contributor

mdhaber commented Apr 28, 2019

Probably going to want to reformat most of these as benchmarks. There are at least two, though, that are fast enough to be tests.

@andyfaff
Contributor Author

@mdhaber I made some comments on your PR. For the additions to be scipy tests they shouldn't take more than ~10 s; a couple of minutes is too long.

A lot of the test tolerances are set quite loosely, and I'm surprised that differential_evolution doesn't get closer. This may merit further inspection (though I couldn't access many of the Lampinen citations, so that will be hard). Possible explanations include:

  1. the minimisation is stochastic (perhaps seeding is necessary for testing), so for very hard problems a certain proportion of runs can be expected to 'fail'.
  2. the total population sizes are different. In the Lampinen paper Np refers to the total population size, whereas in the scipy implementation popsize is a multiplying factor: the total population size is np.size(x) * popsize.
  3. the way that the population is initialised could be important. scipy uses a Latin Hypercube approach to sample the available space uniformly.
  4. the updating keyword could play an important role here. The scipy default is to continually update the best solution, whereas updating once per generation is the traditional approach (updating='deferred').
  5. the default mutation strategy in scipy is 'DE/best/1/bin', whereas the strategy in the Lampinen paper is 'DE/rand/1/bin'. To use the same mutation strategy one would need the strategy='rand1bin' keyword. The 'best' in 'best1bin' means that a trial vector is constructed by addition to the best solution so far; I conjecture (on no strong basis) that this may not be a good approach for a strongly constrained system.
  6. the Lampinen paper makes multiple runs:

"For each problem, the worst, the average and the best objective function values out of 1000 independent runs were recorded."

and

"In all cases the reported optimum solution were found."

whereas the scipy tests are a single run-through.
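For reference, points 2, 4 and 5 can be combined in a call like the following. The objective, bounds and constraint here are placeholders of my own, not a Lampinen problem:

```python
import numpy as np
from scipy.optimize import differential_evolution, NonlinearConstraint

# Illustrative constrained problem: minimise (x0-1)^2 + (x1-2)^2
# subject to x0 + x1 <= 2.5.
def f(x):
    return (x[0] - 1)**2 + (x[1] - 2)**2

nlc = NonlinearConstraint(lambda x: x[0] + x[1], -np.inf, 2.5)

result = differential_evolution(
    f, bounds=[(-5, 5), (-5, 5)],
    strategy='rand1bin',    # Lampinen uses DE/rand/1/bin (scipy default is 'best1bin')
    updating='deferred',    # update once per generation, the traditional scheme
    popsize=10,             # multiplier: total population = 10 * 2 = 20 here
    constraints=(nlc,),
    seed=1,
)
# the constrained optimum is x = (0.75, 1.75), f = 0.125
```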

@andyfaff
Contributor Author

It may be just as well the PR wasn't merged for 1.3.0. I've just found a large slowdown in the constraints implementation when NonlinearConstraint is employed. This type of constraint eventually calls VectorFunction.fun, which also implicitly evaluates the Hessian and Jacobian of the constraint function with finite differences. This is totally unneeded (and takes a lot of time); all that's needed is the value of the evaluated function. This implicit evaluation only occurs if the hess parameter for NonlinearConstraint is an instance of HessianUpdateStrategy.

There are a few ways around this:

  1. If the constraint supplied to differential_evolution is a NonlinearConstraint, call the constraint directly without sending it through PreparedConstraint.
  2. Temporarily change the hess attribute of the NonlinearConstraint to something that won't cause implicit updating of the Hessian and Jacobian when a PreparedConstraint is used.

I think option 1 is cleanest. In fact it may be cleanest to write private versions of PreparedConstraint/VectorFunction/LinearVectorFunction/IdentityVectorFunction that simply allow the constraint to be evaluated without further calculation.
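A rough sketch of what option 1 could look like. The class name and methods below are mine for illustration, not the PR's _ConstraintWrapper: it evaluates only the constraint value and the amount by which it is violated, with no Jacobian or Hessian machinery.

```python
import numpy as np
from scipy.optimize import LinearConstraint, NonlinearConstraint

class SimpleConstraintWrapper:
    """Hypothetical wrapper: evaluate a constraint's value only."""

    def __init__(self, constraint):
        if isinstance(constraint, NonlinearConstraint):
            self.fun = lambda x: np.atleast_1d(constraint.fun(x))
        elif isinstance(constraint, LinearConstraint):
            A = np.atleast_2d(constraint.A)
            self.fun = lambda x: A @ np.asarray(x)
        self.lb = np.atleast_1d(constraint.lb)
        self.ub = np.atleast_1d(constraint.ub)

    def violation(self, x):
        # amount by which each component of fun(x) falls outside [lb, ub];
        # zero everywhere means the point is feasible
        v = self.fun(x)
        excess_lo = np.clip(self.lb - v, 0, None)
        excess_hi = np.clip(v - self.ub, 0, None)
        return excess_lo + excess_hi
```

For example, wrapping `NonlinearConstraint(lambda x: x[0] + x[1], 0, 1)` and evaluating at `[2, 0]` would report a violation of 1, with no finite-difference derivative evaluations triggered.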

@andyfaff
Contributor Author

Use of a _ConstraintWrapper class means that PreparedConstraint can be avoided. Speedups come from:

  1. Not having implicit calculation of constraint Jacobian and Hessian
  2. Not having to suppress warnings.

@mdhaber
Contributor

mdhaber commented Apr 28, 2019

I was wondering whether that was happening because occasionally I'd get notifications that a LinearConstraint would be more appropriate if the Jacobian is constant. Nice find. That should really speed things up!

@andyfaff
Contributor Author

andyfaff commented May 6, 2019

I've tidied up the commit history; this PR needed squashing. I think it's in a position to merge.

@mdhaber
Contributor

mdhaber commented May 6, 2019

I restarted the test with the build failure.
I should check this out one more time before merging. So are all the tests fast enough now? Are you getting the accuracy you expected?
Besides skipping the Jacobian evaluation, what did you change?

@andyfaff
Contributor Author

andyfaff commented May 6, 2019

Sorry, the individual commits were lost when I squashed in preparation for a merge.

Test changes

  1. x_opt was added to all of the tests to check f_opt vs f(x_opt).
  2. All the tests use the rand1bin strategy.
  3. All the tests are seeded for reproducibility.
  4. One test was marked as slow.
  5. The default for maxiter was used.
  6. Smaller popsizes (corresponding to the Lampinen paper) were used, which speeds the minimisation up proportionally. All the tests complete in under 3 s, apart from L8.
  7. Because differential_evolution doesn't get close enough to the optimal solution without a large number of iterations (the convergence statistic is satisfied too early, as discussed in an earlier conversation), I introduced trust-constr minimisation as a polishing routine. This gets the solution quite close to optimal. More iterations can be requested by setting the tolerances more tightly.
  8. The polishing enabled the test tolerances to be set much tighter; all tests pass, but the tolerances can be relaxed if there are future changes.

differential_evolution code

  1. A polish using trust-constr is now done.
  2. PreparedConstraint was discarded for a plain _ConstraintWrapper, which allows simple evaluation of constraints without the constraint Jacobian being calculated. This is a big speedup.
  3. The acceptance-test inequality was fixed (for the case where both solutions are feasible).
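The polishing step amounts to something like the following sketch (illustrative problem and variable names of my own, not the PR's code): a trust-constr minimize call seeded with DE's best solution.

```python
import numpy as np
from scipy.optimize import minimize, NonlinearConstraint

# Illustrative problem: minimise (x0-1)^2 + (x1-2)^2 s.t. x0 + x1 <= 2.5
f = lambda x: (x[0] - 1)**2 + (x[1] - 2)**2
nlc = NonlinearConstraint(lambda x: x[0] + x[1], -np.inf, 2.5)

x_de = np.array([0.8, 1.6])   # pretend this is the best solution DE found
polished = minimize(f, x_de, method='trust-constr',
                    bounds=[(-5, 5), (-5, 5)], constraints=[nlc])
# the constrained optimum is x = (0.75, 1.75), f = 0.125
```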

@mdhaber
Contributor

mdhaber commented May 10, 2019

I haven't forgotten! I'll try to get to this soon.

Contributor

@mdhaber mdhaber left a comment

I didn't look much at the recent changes, but I tested them. As you mentioned, it's much faster without the unnecessary Jacobian evaluations and the accuracy is great when forced to iterate longer than the default. Based on the performance and earlier code review, I'm pretty confident that this is implementing the algorithm as intended.

@andyfaff
Contributor Author

If this passes now I think it's good to go.

@andyfaff
Contributor Author

We're all good now. The pypy3 test fail can be ignored.

@andyfaff
Contributor Author

@mdhaber, could we merge this?

@mdhaber mdhaber merged commit 02b0001 into scipy:master May 19, 2019
@andyfaff
Contributor Author

@mdhaber thanks for your sterling review and test development work on this.

@andyfaff andyfaff deleted the de_constr branch May 20, 2019 02:25
@mdhaber
Contributor

mdhaber commented May 20, 2019

NP and thanks for the PR. In the past, I've only used local minimizers for nonlinear problems (mostly optimal control), so it was great to play around with DE for the first time. I was impressed!
