-
-
Notifications
You must be signed in to change notification settings - Fork 5.5k
ENH: stats.zmap
/zscore
/gzscore
: add array API support
#21068
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
Conversation
Array API strict failures seem to be unrelated (gh-21073). |
scipy/stats/_stats_py.py
Outdated
|
||
def _convert_common_float(*arrays, xp=None): | ||
xp = array_namespace(*arrays) if xp is None else xp | ||
arrays = [xp.asarray(array) for array in arrays] |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I think this might need to be
arrays = [xp.asarray(array) for array in arrays] | |
arrays = [_asarray(array, subok=True) for array in arrays] |
otherwise line 3158 will break subclasses for gzscore
.
Also with this change _zmap_array_api
will preserve subclasses so could we only keep the new shorter implementation as it seems much cleaner and by the test comment seems to fix a bug or is it backwards incompatible?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Mmm maybe. I'll investigate. But in the meantime I've followed the approach of zmap
and just branched here depending on SCIPY_ARRAY_API
to avoid any possible unintended interactions.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
@j-bowhay the second to last commit removes the old implementation. Besides using _asarrray
, I needed to replace a use of xp.where
, which doesn't preserve array subclasses.
Note that _xp_mean
and _xp_var
"just worked" as far as the existing unit tests are concerned, but note that they weren't written with masked arrays in mind, and the unit tests are very weak. I don't have a preference if you're willing to test/review more carefully to check that removing the old implementation is safe (including for other subclasses, like matrix
), but personally, I'd just as soon keep both implementations and remove the old one when we no longer have to support subclasses.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok thanks I will look more closely over the next couple of days
[lint only]
z = stats.gzscore(mx) | ||
desired = ([-1.526072095151, -0.194700599824, np.inf, 0.584101799472, | ||
1.136670895503]) | ||
assert_allclose(desired, z) |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This test had the actual and desired values mixed up, and it was checking for an inf
where the element should be masked. (I can't see why inf
would be the right answer anyway - probably was just a fill_value
that got put in during the calculation?) I corrected the test to consider the fact that the output should be a masked array.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Having played with this a bit I think the best option might be to revert to 396e12c as the like_zscore
branch in _zmap
doesn't not preserve subclasses, presumably due to xp.where
not preserving subclasses.
In [1]: import numpy as np
In [2]: class ExampleTensor(np.ndarray):
...: def __new__(cls, input_array):
...: return np.asarray(input_array).view(cls)
...:
...: def __array_finalize__(self, obj) -> None:
...: if obj is None: return
...: # This attribute should be maintained!
...: default_attributes = {"attr": 1}
...: self.__dict__.update(default_attributes) # another way to set attributes
...:
In [3]: a = ExampleTensor([0.5, 2.0, 2.5, 3])
In [4]: from scipy.stats import zmap
In [5]: zmap(a, a)
Out[5]: array([-1.60356745, 0. , 0.53452248, 1.06904497])
In [6]: b = ExampleTensor([0, 1, 2, 3, 4])
In [7]: zmap(a, b)
Out[7]: ExampleTensor([-1.06066017, 0. , 0.35355339, 0.70710678])
Sorry for the wild goose chase, I am just keen to keep the implementation as clean as possible where reasonable to do so.
Well, I can probably try using item assignment there instead? JAX is already skipped sometimes. |
Yes I think that would also work |
OK, tried that so you can test more thoroughly. Also replaced the use of NumPy to work with Please comment on the differences the things the old code took extra care on, such as warnings (or lack thereof) and results when elements of a slice are not unique. |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I tried zmap
with some examples with nans and nan_policy="omit" and saw no issues. Also tried an a
which was constant along axis
and that worked fine so I think this is ok to proceed.
I assume we can now just move the contents of _zmap
into zmap
?
@melissawm any guess on whether the reading sources... [ 99%] tutorial/stats/discrete_planck
reading sources... [ 99%] tutorial/stats/discrete_poisson
reading sources... [ 99%] tutorial/stats/discrete_randint
reading sources... [ 99%] tutorial/stats/discrete_yulesimon
reading sources... [ 99%] tutorial/stats/discrete_zipf
reading sources... [ 99%] tutorial/stats/discrete_zipfian
reading sources... [ 99%] tutorial/stats/hypothesis_bartlett
/home/circleci/repo/doc/source/tutorial/stats/hypothesis_bartlett.md: Executing notebook using local CWD [mystnb]
/home/circleci/repo/doc/source/tutorial/stats/hypothesis_bartlett.md: Executed notebook in 9.00 seconds [mystnb]
Traceback (most recent call last):
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/sphinx/cmd/build.py", line 337, in build_main
app.build(args.force_all, args.filenames)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/sphinx/application.py", line 378, in build
self.builder.build_update()
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/sphinx/builders/__init__.py", line 297, in build_update
self.build(to_build,
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/sphinx/builders/__init__.py", line 318, in build
updated_docnames = set(self.read())
^^^^^^^^^^^
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/sphinx/builders/__init__.py", line 425, in read
self._read_serial(docnames)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/sphinx/builders/__init__.py", line 477, in _read_serial
self.read_doc(docname)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/sphinx/builders/__init__.py", line 536, in read_doc
publisher.publish()
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/docutils/core.py", line 234, in publish
self.document = self.reader.read(self.source, self.parser,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/sphinx/io.py", line 106, in read
self.parse()
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/docutils/readers/__init__.py", line 76, in parse
self.parser.parse(self.input, document)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/myst_nb/sphinx_.py", line 157, in parse
mdit_renderer.render(mdit_tokens, mdit_parser.options, mdit_env)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/myst_parser/mdit_to_docutils/base.py", line 243, in render
self._render_tokens(list(tokens))
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/myst_parser/mdit_to_docutils/base.py", line 222, in _render_tokens
self.rules[f"render_{child.type}"](child)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/myst_nb/core/render.py", line 117, in render_nb_cell_markdown
self.render_children(token)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/myst_parser/mdit_to_docutils/base.py", line 396, in render_children
self.rules[f"render_{child.type}"](child)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/myst_parser/mdit_to_docutils/base.py", line 752, in render_fence
return self.render_restructuredtext(token)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/myst_parser/mdit_to_docutils/base.py", line 1680, in render_restructuredtext
MockRSTParser().parse(pseudosource, newdoc)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/myst_parser/mocking.py", line 552, in parse
super().parse(inputstring, document)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/docutils/parsers/rst/__init__.py", line 184, in parse
self.statemachine.run(inputlines, document, inliner=self.inliner)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/docutils/parsers/rst/states.py", line 169, in run
results = StateMachineWS.run(self, input_lines, input_offset,
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/docutils/statemachine.py", line 233, in run
context, next_state, result = self.check_line(
^^^^^^^^^^^^^^^^
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/docutils/statemachine.py", line 445, in check_line
return method(match, context, next_state)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/docutils/parsers/rst/states.py", line 2357, in explicit_markup
nodelist, blank_finish = self.explicit_construct(match)
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/docutils/parsers/rst/states.py", line 2369, in explicit_construct
return method(self, expmatch)
^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/docutils/parsers/rst/states.py", line 2106, in directive
return self.run_directive(
^^^^^^^^^^^^^^^^^^^
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/docutils/parsers/rst/states.py", line 2156, in run_directive
result = directive_instance.run()
^^^^^^^^^^^^^^^^^^^^^^^^
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/site-packages/jupyterlite_sphinx/jupyterlite_sphinx.py", line 368, in run
shutil.copy(notebook, notebooks_dir)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/shutil.py", line 431, in copy
copyfile(src, dst, follow_symlinks=follow_symlinks)
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/shutil.py", line 236, in copyfile
raise SameFileError("{!r} and {!r} are the same file".format(src, dst))
shutil.SameFileError: '/home/circleci/repo/doc/source/_contents/hypothesis_bartlett.ipynb' and PosixPath('/home/circleci/repo/doc/source/_contents/hypothesis_bartlett.ipynb') are the same file
Exception occurred:
File "/home/circleci/.pyenv/versions/3.11.9/lib/python3.11/shutil.py", line 236, in copyfile
raise SameFileError("{!r} and {!r} are the same file".format(src, dst))
shutil.SameFileError: '/home/circleci/repo/doc/source/_contents/hypothesis_bartlett.ipynb' and PosixPath('/home/circleci/repo/doc/source/_contents/hypothesis_bartlett.ipynb') are the same file
The full traceback has been saved in /tmp/sphinx-err-ndbm02yd.log, if you want to report the issue to the developers.
Please also report this if it was a user error, so that a better error message can be provided next time.
A bug report can be filed in the tracker at <https://github.com/sphinx-doc/sphinx/issues>. Thanks!
make: *** [Makefile:114: html-build] Error 2
Skipping build
Exited with code exit status 1 |
The same failure showed up in #21209 so looks unrelated |
I believe this has to do with a new release of jupyterlite-sphinx. @steppi is looking into it, thanks for the ping! |
Docs CI should be functional again, cheers! |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Thanks @mdhaber
Reference issue
Toward gh-20544
What does this implement/fix?
Adds array API support to
zmap
,zscore
, andgzscore
.