Skip to content

Conversation

dschult
Copy link
Contributor

@dschult dschult commented Nov 16, 2024

Reference issue
Toward #21833

What does this implement/fix?
This allows for use of new keyword rng alongside random_state in sparse.linalg.svds to prepare for the deprecation and removal of random_state and its legacy behavior.

In short:

if you pass the RNG argument by position or keyword random_state, it should work however it did before
if you pass a value val by keyword rng, it should behave as if you had passed np.random.default_rng(val) into the function.

@dschult dschult added scipy.sparse.linalg maintenance Items related to regular maintenance tasks labels Nov 16, 2024
@dschult
Copy link
Contributor Author

dschult commented Nov 16, 2024

There's a doc_string test (line ~292 in scipy/sparse/linalg/_eigen/_svds.py ) that I've kept using kwarg random_state=rng because rng is set to be RandomState(0) and if I change that, the resulting values must be changed a few lines below.

    >>> rng = np.random.RandomState(0)
    >>> X_dense = rng.random(size=(100, 100))
    >>> X_dense[:, 2 * np.arange(50)] = 0
    >>> X = sparse.csr_array(X_dense)
    >>> _, singular_values, _ = svds(X, k=5, random_state=rng)
    >>> print(singular_values)
    [ 4.3293...  4.4491...  4.5420...  4.5987... 35.2410...]

I could change the first line I've shown here to use default_rng, change the kwarg and change the values printed. But IIUC its better to leave random_state in order to minimize these changes. We'll deal with them upon removal.

Do I understand correctly? Let me know if I should change the test from using RandomState to using default_rng.

@mdhaber
Copy link
Contributor

mdhaber commented Nov 17, 2024

Do I understand correctly? Let me know if I should change the test from using RandomState to using default_rng.

Please do change it, actually. It is good to do it now because we want new users who refer to the examples to use rng. And if it were not done now, it would need to be done in two versions when the decorator begins to emit warnings about use of random_state.

Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Looks close. Besides the comments above, I'd just like to preserve at least one use of random_state as a sanity check that the decorator is doing its job.

@dschult
Copy link
Contributor Author

dschult commented Nov 17, 2024

I think this addresses all points brought up -- and lead to a couple others which were easy to fix.

One last point is that if random_array PR gh-21888 gets merged first, this one could change quite a few random_array calls to use rng= instead of random_state=. Both should work during the transition, so I think that's OK... but if we switch them now it makes the next transition simpler.

@dschult
Copy link
Contributor Author

dschult commented Nov 17, 2024

I made all all svds tests deterministric and all propack tests that directly call _svpd deterministic.
The number of changes to make the tests deterministic is fairly large.

Every function call to _svdp now specifies rng. It is still an "optional" argument though, so I added code to normalize the input rng inside _svpd unless it is a RandomState. I could put back the check_random_state code if that is preferred.

@mdhaber mdhaber mentioned this pull request Nov 18, 2024
50 tasks
Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks great! I've suggested marking the one remaining use of random_state in a test so it's obvious that it's not a mistake, and there are two other little things, but I think we can merge this shortly. I do think this will merge before gh-21888, if that's OK.

Copy link
Contributor Author

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I removed the rng normalization from _svdp and replaced it with a check that rng input is a Generator instance. I updated the tests and I think we're all ready to go here.

I had to move the sanity check of random_state= kwarg to a different test that doesn't use _svdp. :}

@mdhaber
Copy link
Contributor

mdhaber commented Nov 19, 2024

@dschult I'm not sure I understand - does _svdp only work with aGenerator? Why does it need to have normalization inside it?

@dschult
Copy link
Contributor Author

dschult commented Nov 19, 2024

Oops.. somehow I didn't commit my last set of changes.
pushed before committing i guess.

_svdp should work without any normalization because everything that calls it has already normalized it. That said, if someone has been using it on their own despite it being private, if/when they don't provide an rng arg is defaults to None, which creates a strange error message. So, I put code to check if rng is None and raise a ValueError if that happens. But it would only happen if someone if directly using the private function.

Just to be clear: It should work with a RandomState or a Generator instance as rng. What we are removing is the handling of int and None, which should be OK because it is private.

Copy link
Contributor

@mdhaber mdhaber left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

So, I put code to check if rng is None and raise a ValueError if that happens. But it would only happen if someone if directly using the private function.

OK. Note that it is not required to provide helpful error messages to users in private functions. That is why I have been advocating for removing it entirely. But if you think it might be helpful to developers in the future, this isn't causing problems, so I wouldn't ask for it to be changed now.

Let's see if tests come back green, and if so, I'll go ahead and merge.

Thanks @dschult!

@mdhaber mdhaber merged commit a77debb into scipy:main Nov 19, 2024
37 checks passed
@dschult dschult deleted the sparse_linalg_rng branch November 20, 2024 00:16
@ilayn
Copy link
Member

ilayn commented Jun 11, 2025

I am bit confused with this one here, I am receiving some errors while trying to enable the rng in #22748 but the check_random_state is generating instances of np.random.mtrand._rand. Shouldn't it create default_rng() when rng argument is None?

@mdhaber
Copy link
Contributor

mdhaber commented Jun 12, 2025

Immediately changing to that behavior would have been backward incompatible. The decorator _transition_to_rng converts arguments passed by rng keyword to a default_rng; otherwise, the decorator emits a warning and check_random_state does what it always has.

@ilayn
Copy link
Member

ilayn commented Jun 12, 2025

True but if rng=None then it does not matter the type of the PRNG object we create since there is no argument given. So instead of RandomState, check_random_state can produce a Generator already. Or am I misunderstanding?

@mdhaber
Copy link
Contributor

mdhaber commented Jun 12, 2025

It depends on if the global seed has been set. If so, it needs to use the global RandomState produced by check_random_state. If not, it is ok to use an unseeded RandomState produced by check_random_state because that is interchangeable, as far as the function should be concerned, with an unseeded Generator. In that case, it would also be ok for the decorator to produce an unseeded Generator, but for simplicity, it only changes behavior now if rng is passed by keyword.

@ilayn
Copy link
Member

ilayn commented Jun 12, 2025

Ah OK so this is from the np.random.seed(int) cases. Thanks.

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
maintenance Items related to regular maintenance tasks scipy.sparse.linalg
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants