Skip to content

Conversation

ilayn
Copy link
Member

@ilayn ilayn commented May 28, 2023

Reference issue

Part of #18566, as one of the low-hanging fruits.

What does this implement/fix?

  • Pythonized Nonnegative Least Squares function optimize.nnls
  • Implemented the faster NNLS instead of the classical NNLS algorithm which is typically very slow. For large problems the performance difference is very obvious.
  • Added a tolerance keyword atol to control the inner iteration behavior. Since the algorithm errors out, the user needs something to relax the problem. I used atol just in case in the future we need rtol, so we don't need to deprecate tol keyword.
  • Added two more tests and that's 100% improvement on the number of tests.
  • Deleted the fortran77 code.

Since I am not super up-to-date with optimize etiquette, please let me know if I'm going against the grain.

This PR now opens up the possibility to solve multiple RHS through vBenthem, Keenan, https://doi.org/10.1002/cem.889 but since there are too many things happening, I'd like to leave it to later as an enhancement. Also now code is a bit more readable (or at least familiar syntax) so probably if needed can be Cythonizable.

@ilayn ilayn added enhancement A new feature or improvement scipy.optimize maintenance Items related to regular maintenance tasks labels May 28, 2023
@ilayn ilayn requested a review from andyfaff as a code owner May 28, 2023 00:06
@ilayn ilayn mentioned this pull request May 28, 2023
37 tasks
@dschmitz89
Copy link
Contributor

Good work here! One question from me: Are there benchmarks for small matrices? nnls can be the bottleneck in nested optimization problems. I am not sure if a pure numpy implementation is as fast there as the classic fortran code. For larger problems I expect the new implementation to be faster though due to the algorithmic improvements.

@ilayn
Copy link
Member Author

ilayn commented May 29, 2023

Ah good one, let me run some benchmarks

"""
m, n = A.shape

AtA = A.T @ A
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Condition number? Looks like an SVD use case.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Didn't understand your comment. Where do you want to use SVD?

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

You are constructing A.T @ A, which has the condition number a square of the condition number of A. And you're using it as a l.h.s. of a linear solve routine.

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

True, albeit partial solves. The algorithm goes through submatrices of AtA to only deal with active constraints hence the A[P, P] dance.

@ilayn
Copy link
Member Author

ilayn commented May 30, 2023

image

For small arrays indeed there is some performance loss with this PR. The quickest way is to cythonize it obviously. But not sure if it is worth it since I don't know how much this is being used..

@ev-br
Copy link
Member

ev-br commented May 30, 2023

Also it'd be useful IMO to compare to https://github.com/scipy/scipy/blob/v1.10.1/scipy/optimize/_lsq/bvls.py, which is user-facing through lsq_linear.

@dschmitz89
Copy link
Contributor

image

For small arrays indeed there is some performance loss with this PR. The quickest way is to cythonize it obviously. But not sure if it is worth it since I don't know how much this is being used..

This is impressive. For small problems there is a minor performance loss but I don't think it must be a blocker here. Cythonization can be done in a follow up.

@ilayn
Copy link
Member Author

ilayn commented Jun 8, 2023

What do the optimize residents think about this rewrite? Should we go full Cython or get this in?

@andyfaff
Copy link
Contributor

andyfaff commented Jun 8, 2023

Sorry, not familiar with this area. Id say if the python code is clean, and there's a performance benefit, then a merge should occur as is

@ilayn
Copy link
Member Author

ilayn commented Jun 10, 2023

OK let's get it in. Since we still have some time for v.1.12 we will have the chance to remedy should something arise. Thanks everyone.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
enhancement A new feature or improvement maintenance Items related to regular maintenance tasks scipy.optimize
Projects
None yet
Development

Successfully merging this pull request may close these issues.

5 participants