Skip to content

Conversation

WarrenWeckesser
Copy link
Member

Updates for diags_array and diags:

  • Change the default value of the dtype parameter to _NoValue, so we know whether or not the caller has given an explicit value.
  • If dtype is None, the new behavior (output dtype matches the data type of the given diagonals) is used.
  • If the caller doesn't specify dtype (so its value is _NoValue), the old behavior is maintained, but a FutureWarning is generated.
  • For existing tests that used integer inputs where the input data type didn't actually matter, the inputs are changed to floating point, to avoid the warning.
  • A couple new tests are added to ensure that the old deprecated behavior and the new behavior are implemented correctly.

Closes gh-23102.

…ags'.

Updates for diags_array and diags:

* Change the default value of the dtype parameter to _NoValue,
  so we know whether or not the caller has given an explicit
  value.
* If dtype is None, the new behavior (output dtype matches the
  data type of the given diagonals) is used.
* If the caller doesn't specify dtype (so its value is _NoValue),
  the old behavior is maintained, but a FutureWarning is
  generated.
* For existing tests that used integer inputs where the input data
  type didn't actually matter, the inputs are changed to floating
  point, to avoid the warning.
* A couple new tests are added to ensure that the old deprecated
  behavior and the new behavior are implemented correctly.

Closes scipygh-23102.
Use dtype=None to avoid the new warning.  That also means there
is no need to cast the result to int16.
Copy link
Contributor

@dschult dschult left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks good.
It adds better default behavior in the long run with an option for getting that default behavior now.

I only have minor suggestions for word changes. They can be changed and/or ignored without my input.
Thanks!

WarrenWeckesser and others added 2 commits July 20, 2025 11:29
@WarrenWeckesser
Copy link
Member Author

WarrenWeckesser commented Jul 20, 2025

Some notes about this PR to consider before merging:

  • I indicate that the deprecation period will end with SciPy 1.19, so the deprecation is active for versions 1.17 and 1.18. I think that is the standard deprecation period. If not, we should update the messages before merging.

  • The goal is that after the deprecation period is over, the default behavior, indicated with dtype=None, is to use np.result_type(*diagonals), so we no longer always convert to an inexact type. To activate that behavior during the deprecation period, dtype=None must be given explicitly. After the deprecation period is over, such a change will continue to work (i.e. no further changes are needed), but the explicit argument could be removed, since the default will become dtype=None.

  • The arguments to warnings.warn() are dependent on the Python version. For Python 3.12+, the warn() parameter skip_file_prefixes is used, which will set the stacklevel automatically. For Python 3.11, stacklevel=2 is used. This is correct for diags_array(), but not for diags(). There is a discussion about this here. When the warning is generated by a call of diags() with Python 3.11, the code displayed by the warning will not be the caller's code, which is confusing. To ameliorate this, an extra bit of text is included in the warning with a suggestion that diags() might be the source of the warning. This case only occurs with a call of diags() with Python 3.11 when the dtype of the output does not match what the future default behavior will be.

  • There is one potential "gotcha" with using dtype=_NoValue during the deprecation period. If a user happens to already explicitly specify dtype=None, they will get the new behavior without any warning.

    A github code search for diags_array in Python files found over 300 occurrences, but in only one file (with two calls) was the dtype parameter used, and in that case, the dtypes were float64 or complex. None used dtype=None explicitly.

    Searching for diags is tougher. A simple search results in over 30000 occurrences. A search for "diags(" "dtype=None" language:Python gave almost 700 hits, and showed one occurrence that could be a pattern used elsewhere that has a case where dtype=None would not be surprising. This is the case where a user's (or some library's) function accepts a dtype parameter and passes it on to diags(). E.g.

    def compute_matrix(data, foo, bar, dtype=None):
        d = [... compute diagonals ...]
        offsets = [... compute offsets ...]
        m = diags(d, offsets=offsets, dtype=dtype)
        return m
    

    Any call of compute_matrix() that doesn't give dtype will result in a call of diags() with dtype=None given explicitly. This pattern appears here, for example:

    https://github.com/recspert/SATF/blob/448f61c536ff702abcc5a85e69dfe92e3871aea3/models/gsatf.py#L129

    (That exact code shows up twice in the search results; I didn't try to determine the original source of the code.)

    This is only a problem if the function is expected to handle, say, an integer type and the code depends on the conversion of integer input types to floating point output type.

    Does anyone thinks this possibility is worth worrying about?

@lucascolley lucascolley changed the title MAINT: sparse: change default dtype behavior of 'diags_array' and 'diags'. DEP/MAINT: sparse: change default dtype behavior of 'diags_array' and 'diags'. Jul 20, 2025
@lucascolley lucascolley added the deprecated Items related to behavior that has been deprecated label Jul 20, 2025
Copy link
Member

@h-vetinari h-vetinari left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This looks like high-quality work to me! 👏

I'm fine with changing whatever vanishingly small number of existing users already pass dtype=None.

Comment on lines +225 to +227
else:
warn_kwargs = {'skip_file_prefixes': (os.path.dirname(__file__),)}
extra_msg = ""
Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

minor nit: the else branched could be pulled out as being unconditional before the if sys.version_info < (3, 12).

Copy link
Member Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If you mean convert this

        if sys.version_info < (3, 12):
            warn_kwargs = {'stacklevel': 2}
            extra_msg = ("\nNote: In Python 3.11, this warning can be generated by "
                         "a call of scipy.sparse.diags(), but the code indicated in "
                         "the warning message will refer to an internal call of "
                         "scipy.sparse.diags_array(). If that happens, check your "
                         "code for the use of diags().")
        else:
            warn_kwargs = {'skip_file_prefixes': (os.path.dirname(__file__),)}
            extra_msg = ""

to this

        warn_kwargs = {'skip_file_prefixes': (os.path.dirname(__file__),)}
        extra_msg = ""
        if sys.version_info < (3, 12):
            warn_kwargs = {'stacklevel': 2}
            extra_msg = ("\nNote: In Python 3.11, this warning can be generated by "
                         "a call of scipy.sparse.diags(), but the code indicated in "
                         "the warning message will refer to an internal call of "
                         "scipy.sparse.diags_array(). If that happens, check your "
                         "code for the use of diags().")

I actually prefer to have the assignments occur once in any given code path, rather than assign values, and then possibly reassign based on some condition. I'm sure I've done the latter many times in code that I've written; dabbling in functional programming is mutating my world view these days. Indeed, I might even prefer a single assignment that uses the ternary operator:

        warn_kwargs, extra_msg = (
            {'stacklevel': 2},
            "\nNote: In Python 3.11, this warning can be generated by "
            "a call of scipy.sparse.diags(), but the code indicated in "
            "the warning message will refer to an internal call of "
            "scipy.sparse.diags_array(). If that happens, check your "
            "code for the use of diags()."
        ) if sys.version_info < (3, 12) else (
            {'skip_file_prefixes': (os.path.dirname(__file__),)},
            ""
        )

But I'm afraid that would trigger other (not unreasonable) nitpicking over whitespace or readability.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Sure, it's just a minor stylistic quibble. Go ahead either way.

To explain my POV: While I might agree in general, for compatibility code my personal preference is to structure things as "the default/future way" vs. "the compat code we'll throw out as soon as we can", and - on the margin (i.e. between choices that are ~equally readable) - choosing the option that's less noisy in future git history, when the compat branch gets removed.

dabbling in functional programming is mutating my world view these days

Ironic 😁

Copy link
Member Author

@WarrenWeckesser WarrenWeckesser Jul 30, 2025

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The distinctions you are making are pretty subtle, and I'm not sure I agree with them--or at least, I don't think the argument is very strong in this case, so I'm inclined to leave it as is.

Copy link
Member

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

To explain my POV: While I might agree in general, for compatibility code my personal preference is to structure things as "the default/future way" vs. "the compat code we'll throw out as soon as we can", and - on the margin (i.e. between choices that are ~equally readable) - choosing the option that's less noisy in future git history, when the compat branch gets removed.

This makes a lot of sense to me, but no need to hold up this PR for it I suppose.

@WarrenWeckesser WarrenWeckesser added this to the 1.17.0 milestone Jul 30, 2025
Copy link
Member

@lucascolley lucascolley left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

thanks all!

@lucascolley lucascolley merged commit a9bbf47 into scipy:main Aug 1, 2025
42 checks passed
@WarrenWeckesser WarrenWeckesser deleted the gh-23102-diags branch August 1, 2025 11:46
@j-bowhay j-bowhay mentioned this pull request Jun 26, 2025
32 tasks
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
deprecated Items related to behavior that has been deprecated maintenance Items related to regular maintenance tasks scipy.sparse.csgraph scipy.sparse.linalg scipy.sparse
Projects
None yet
Development

Successfully merging this pull request may close these issues.

DOC: Undocumented dtype change with sp.sparse.csr_array(sp.sparse.spdiags(...)) vs. sp.sparse.diags_array(...)
4 participants