Skip to content

Conversation

ev-br
Copy link
Contributor

@ev-br ev-br commented Jan 9, 2022

Here's an initial WIP attempt at addressing #4510 (comment), gh-1214 and gh-1459; see also mesonbuild/meson#9049

This PR is deliberately simplistic : given a foo.pyx which cimport bar.pxd and includes baz.pxi,

$ cythonize -M foo.pyx

produces foo.c.dep file in with foo.pyx dependencies along with foo.c.

An immediate question to mainainers: is this a reasonable place to plug it? Is this simple interface OK?
I'd also appreciate a pointer to how to best add tests for this. I can of course write out several tempfiles, or is there a better way?

Also, @eli-schwartz what would meson need --- in the example above, does it need a list of dependencies for the foo.c file (thus foo.c: foo.pyx bar.pxd baz.pxi) or for the pyx file (thus foo.pyx: bar.pxd baz.pxi)?

@eli-schwartz
Copy link
Contributor

First, a note on the correct place for this argument.

There are two tools:

  • cython, which converts a pyx to a C source
  • cythonize, which also tries to use setuptools to build it

It is possible to use cythonize without building, as they are both just providing an interface into similar (shared) functionality, but the UX of each CLI command is, I think, geared towards its primary use. ;)

Hence I believe both tools might benefit from the command line argument, and particularly, meson currently uses cython not cythonize.

@eli-schwartz
Copy link
Contributor

eli-schwartz commented Jan 9, 2022

does it need a list of dependencies for the foo.c file (thus foo.c: foo.pyx bar.pxd baz.pxi) or for the pyx file (thus foo.pyx: bar.pxd baz.pxi)?

The produced file is foo.c, and you want to add rules for what files force foo.c to be regenerated.

foo.pyx is checked into git, so you do not need dependency rules that say "bar.pxd baz.pxi got modified, so foo.pyx needs to be regenerated".

tl;dr

The rule to create is foo.c: foo.pyx bar.pxd baz.pxi

@scoder
Copy link
Contributor

scoder commented Jan 9, 2022

Thanks. Regarding tests, you can add a srctree file under tests/build/. Those are just a bunch of text files stuffed into one, that the test runner splits into a directory structure. Look at the other files there as an example.

@eli-schwartz
Copy link
Contributor

Addendum to "the correct place for the argument": this was discussed a bit in mesonbuild/meson#8706 (comment) and @scoder's comment there is relevant.

That's reasonable. Seems like it doesn't matter which you choose then, and cython wins by the simpler interface.

@scoder
Copy link
Contributor

scoder commented Jan 9, 2022

* cython, which converts a pyx to a C source
* cythonize, which also tries to use setuptools to build it

I'd rather consider the main feature of cythonize that it accepts file path wildcards and checks the dependencies to rebuild only from modified sources, like "here are my files, here is what I want, do what's necessary". And yes, it also builds the final extension module(s) for you if that's what you want. It's more or less a CLI for the cythonize() function that people use in their setup.py files.

I agree that both tools should support dependency file generation. I guess that cythonize could also make use of depfiles itself, although in order to determine the target file extension (c/cpp), it may still have to look at the source file first (specifically a # distutilts: ... header line that defines the target language).

@scoder
Copy link
Contributor

scoder commented Jan 9, 2022

… That said, it's probably easier to make cythonize generate depfiles than cython, because cython does not care about dependencies at all. I'd be ok with making only cythonize generate the files, if adding this to cython proves harder.

(As I mentioned before.)

@eli-schwartz
Copy link
Contributor

Ah yeah, I guess a better way of phrasing it would be "also implements a bit of a basic build system, powered under the hood by setuptools, which collects all the files for you, and optionally builds them while detecting out of date files etc. etc."

I agree that an MVP to have cythonize create depfiles is useful on its own (meson could always switch to that, if it needs to), just thought I'd make sure the cython tool didn't get overlooked without even meaning to. :D

So that
$ cythonize -M foo.pyx

produces `foo.c.dep` file in with foo.pyx dependencies (`foo.c` is produced as well).
@ev-br
Copy link
Contributor Author

ev-br commented Jan 11, 2022

Regarding tests, you can add a srctree file under tests/build/

Thanks! I've updated the PR with a simple srctree test. It passes locally, but needs somebody with the commit bit to please approve the CI run.

Re: correct place for this. So far it's only implemented in cythonize. Duplicating the functionality in the cython executable looks easy if Cython/Compile is allowed to depend on Cython/Build : cython uses the former while cythonize uses the latter. I don't know if the difference is purely historical or there's some other reason these two should be kept separate.

@ev-br ev-br changed the title WIP: add a depfile support to cythonize add a depfile support to cythonize Jan 11, 2022
ev-br and others added 3 commits January 12, 2022 16:02
Co-authored-by: scoder <stefan_ml@behnel.de>
Co-authored-by: scoder <stefan_ml@behnel.de>
Also rework the paths in depfiles: paths below the source are relative, others
are absolute (e.g. system dependencies, numpy, cython itself etc)
@ev-br
Copy link
Contributor Author

ev-br commented Jan 12, 2022

Thanks Stefan. The PR's updated with Stefan's edits and two more tests : with numpy headers and cythonizing a package (just a slightly modified cythonize_script_package.srctree).
Tests seem to pass locally, but CI sadly needs an approval again (GH is having best intentions, I'm sure; who knows maybe I'm trying to mine some crypto under the hood).

From my side, the PR is ready for review.

@rgommers
Copy link
Contributor

CI sadly needs an approval again (GH is having best intentions, I'm sure; who knows maybe I'm trying to mine some crypto under the hood).

submit a trivial typo fix PR to sidestep this issue on your next pushes?

@@ -957,6 +957,8 @@ def cythonize(module_list, exclude=None, nthreads=0, aliases=None, quiet=False,
:param compiler_directives: Allow to set compiler directives in the ``setup.py`` like this:
``compiler_directives={'embedsignature': True}``.
See :ref:`compiler-directives`.

:param depfile: produce depfiles for the sources if True.
Copy link
Contributor

@scoder scoder Jan 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It's funny that we list options here that are not actually arguments of the function. I'm not sure if that's really a good idea. I think we should stick to the explicit options that we allow and only state that we also accept arbitrary CompilationOptions.

This new option is not currently a part of CompilationOptions. It would probably be better if it was, but then we'd have to make sure that we support it wherever we accept CompilationOptions in the programmatic interface. And that's currently not the case. So …

Choices I see:

  • make this an explicit argument only for cythonize and leave it as a frontend feature (which could then optionally be copied to the cython command – not decided yet if we should do that)
  • move it into CompilationOptions and make it a general compiler feature (which would then also be available to the cython command) – requires a different implementation then, though.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Would you be open to considering the current implementation (of a scotch-tape-and-chewing-gum variety, yes) to unblock downstream and deferring a cleanup to a follow-up?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The interface is fine and won't change in the future. Let's keep it for now, stick with cythonize only, and see if users can adapt. We can still extend it or reimplement it later if they can't.

# paths below the base_dir are relative, otherwise absolute
paths = []
for fname in dependencies:
if src_base_dir in fname:
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

This doesn't seem safe. Rather use prefix comparisons. I'm sure we already do this … somewhere. Probably something to have in Cython/Utils.py

Comment on lines 1059 to 1061
# manually add missing *.pyx files for each *.pxd :
# `cimport foo` means a possible dependence on a `foo.pyx`
# but `all_dependencies(...)` above only list a `foo.pxd`
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Changes in a .pyx file should not trigger a rebuild of files that only depend on the .pxd. That's intentional. If anything changes in the binary interface, that should be reflected by the .pxd (or fail when building the other module itself).

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Not sure I follow. As a silly example (tests/build/depfile_package.srctree below) : pkg/sub/test.pyx does from ..test cimport get_str; the get_str function is declared in pkg/test.pxd and implemented in pkg/test.pyx. If the implementation changes (i.e. pkg/test.pyx), I'd naively expect a rebuild of the module which is built from pkg/sub/test.pyx, is not so?

Copy link
Contributor

@scoder scoder Jan 12, 2022

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

If the implementation changes, but the interface doesn't, then nothing that uses the interface needs to be rebuilt.

Note: You may get different results at runtime, but that's not a build issue.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Ah, so a full build system notices that pkg/test.pyx changes and marks its targets for a rebuild. Thanks, am rolling back this part of the patch. I think it's nearly done now, the last thing is to figure out what you meant by the prefix comparisons. Will push after that.

@ev-br
Copy link
Contributor Author

ev-br commented Jan 12, 2022

OK, review comments addressed, test finally pass locally, let's see if CI agrees (needs approval, again).

@ev-br
Copy link
Contributor Author

ev-br commented Jan 14, 2022

What I'm not sure about though is 'np_prefix/init.cython-30.pxd'

Checked locally on python 2.7 + numpy 1.16 : it's __init__.pxd. Not sure about python 3.x with x<5

@scoder
Copy link
Contributor

scoder commented Jan 14, 2022

What I'm not sure about though is 'np_prefix/__init__.cython-30.pxd'

Checked locally on python 2.7 + numpy 1.16 : it's __init__.pxd. Not sure about python 3.x with x<5

Oh, it doesn't depend on the Python version, only on the Cython version (0.x or 3.x) and NumPy version (whether it ships the cython-30 file or not).

Thinking about that a bit more, it means that our test would potentially fail if NumPy decided to stop shipping that file or if it started providing an additional cython-31 etc. file. That's not a decision that should break tests on our side. We should rather filter out the concrete version suffix.

- Avoid hadcoding the version in np_prefix/__init__.cython-30.pxd
- Recognize cy_prefix/Include/numpy/__init__.pxd fallbacks for legacy numpy versions.
@ev-br
Copy link
Contributor Author

ev-br commented Jan 14, 2022

Trying it on python 2.7 was mostly to pick a definitely old numpy version (which needs a fallback Numpy/__init__.pxd). A desire to be able to debug things a bit faster than on the CI was a secondary factor :-).
Anyway, the last commit tries to both filter the version in cython-30 and recognize various locations for the numpy-related pxd file.

I've to say I'm amazed by how easy it is to switch Cython between python 2.7 and 3.8. Wow.

@scoder
Copy link
Contributor

scoder commented Jan 14, 2022

I'd be happy with just stripping out any .cython-[0-9]+ parts. We already test that lookup mechanism elsewhere and shouldn't depend on NumPy's exact choice here.

And yes, older NumPy versions that do not ship a numpy/__init__.pxd at all make Cython use the one it provides itself. Yeah, that's a bit annoying to validate. We can happily allow both interchangeably, as you pretty much did.

@scoder
Copy link
Contributor

scoder commented Jan 14, 2022

That being said, what about the header files? If there is a cdef extern from "someheader.h": ... in a source file (doesn't have to be a .pxd), shouldn't that show up in the list of dependencies as well? For NumPy, for example, that would be numpy/arrayobject.h. Your PR doesn't generate .dep files for .pxd files, does it? I think we should make sure that all #include files are added to the dependencies list of the .pyx file as well.

EDIT: No, hang on. Actually not. They are dependencies of the .c file, not of the .pyx file. We should not rerun Cython when a header file changes, but only when a .pxd file etc. changes. If a header file changes, we only need to trigger a run of the C compiler, not of Cython. So it's right to leave them out.

@ev-br
Copy link
Contributor Author

ev-br commented Jan 14, 2022

just stripping out any .cython-[0-9]+ parts. We already test that lookup mechanism elsewhere

My search-foo failed to find it, mind giving me a pointer? Or is it OK the way it is at the moment?

@scoder
Copy link
Contributor

scoder commented Jan 14, 2022

just stripping out any .cython-[0-9]+ parts. We already test that lookup mechanism elsewhere

https://github.com/cython/cython/blob/master/tests/run/versioned_pxds.srctree

@scoder
Copy link
Contributor

scoder commented Jan 14, 2022

Or is it OK the way it is at the moment?

I think a re.sub() that removes any [.]cython-[0-9]+ would be better (safer, more future proof…).

@ev-br
Copy link
Contributor Author

ev-br commented Jan 14, 2022

Thanks. Did not know about pxd versioning! The last commit uses re.sub, as requested.

@scoder scoder added this to the 3.0 milestone Jan 14, 2022
@scoder
Copy link
Contributor

scoder commented Jan 14, 2022

Thanks. Let's see what users make of this.

@scoder scoder merged commit 9db1fc3 into cython:master Jan 14, 2022
@ev-br
Copy link
Contributor Author

ev-br commented Jan 14, 2022

Thanks Stefan!

@rgommers
Copy link
Contributor

Awesome, thanks all!

@scoder can I ask how you feel about backporting a PR like this? We're going to release SciPy 1.9.0 with Meson as the build system around June, and for that we could really use this feature and gh-4548. And it's unlikely we'll be able to start depending on a 3.0 alpha or beta release without creating significant issues for packagers.

@eli-schwartz
Copy link
Contributor

Yes please. It would be great if we can include support for this in meson as soon as possible .

@scoder
Copy link
Contributor

scoder commented Jan 16, 2022

can I ask how you feel about backporting a PR like this?

It feels like a new feature, but I'd say that build system improvements are important enough to make an exception. And they don't hurt. Users who need them can just depend on the point version and others will (hopefully) not be impacted.

However, the patch doesn't apply cleanly since it is based off a couple of earlier changes in the master branch (which shouldn't be backported). Anyone up for a backport PR?

ev-br added a commit to ev-br/cython that referenced this pull request Jan 16, 2022
$ cythonize -M foo.pyx
produces a file `foo.c.dep` with dependencies of foo.pyx, in addition to `foo.c`.
Try to write relative paths as much as possible.

Backport of cythongh-4563
@ev-br
Copy link
Contributor Author

ev-br commented Jan 16, 2022

Anyone up for a backport PR?

#4576

scoder pushed a commit that referenced this pull request Jan 28, 2022
Add a new command line option so that
$ cythonize -M foo.pyx
produces a file `foo.c.dep` with the dependencies of foo.pyx, in addition to `foo.c`.
Try to write relative paths as much as possible.

Backport of #4563
@eli-schwartz
Copy link
Contributor

Update: after exploration, it turns out that the cythonize tool does not allow specifying the output files to use (makes sense if you think about it, it is more or less setuptools). So it doesn't seem feasible to use it directly.

In the meantime, I poked around at the frontends and hacked depfile support into the cython tool. I also fixed a bug in the Makefile generation.

@eli-schwartz
Copy link
Contributor

See #4916

Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Projects
None yet
Development

Successfully merging this pull request may close these issues.

4 participants