Skip to content

Conversation

da-woods
Copy link
Contributor

@da-woods da-woods commented Feb 28, 2020

  1. Allow the type of self in a cdef class to be specified

    As in this comment Decorated methods in cdef classes should be ordinary functions #1274 (comment).
    I've tried to follow existing behaviour with respect to type checking - self arg

    This only works in Python 3 - reflecting the different behaviour of pure Python code:

    class C:
      def f(self):
        pass
    C.f(1)
    

    This works fine in Python 3 but fails in Python 2. I don't see an easy workaround for this. Maybe to create non-__PYX_CCLASS functions?

  2. Relax some restriction on cdef class properties/staticmethods/classmethods

    Where it isn't able to do the efficient thing it falls back to calling the builtin method property/staticmethod/classmethod.

    Properties are only special-cased where the property decorator is the only decorator. Having mismatching function names now generates a warning, not an error, and follows Python 3 (which is slightly counter-intuitive but fairly easily tested). Python 2 behaviour appears to be different. compiler crash for mismatch in property names #1905.

    staticmethod/classmethod are only special-cased when the decorator is innermost. Fixes Combining @staticmethod with other decorators is broken #1434

  3. Fixes Decorated methods in cdef classes should be ordinary functions #1274 - arbitrarily decorated methods fall back to "ordinary" functions as in the issue.

    Where a method is arbitrarily decorated we don't really know the type of the self argument - it's probably just the type of the cdef class, but it could be anything else. Therefore, we produce a fused function with the self type either that of the cdef class, or object. Therefore both the likely case (where the self is the cdef class and we want access to cdef methods and attributes) is handled and the alternative case where self is some other object should also work. This behaviour can be overridden by specifying the type manually.

    This is mostly transparent to the user, but if they look hard they will be able to tell it's a fused function, mostly because the function has __getitem__ defined.

    I don't think the should break anything, but it might be possible to end up with one or the fused function definitions unable to compile, and that might be a problem....

    I suspect it fixes
    classmethod is unable to decorate a callable that is function (and not method) #3072
    class-level classmethod() can only be called on a method_descriptor or instance method. #3044
    but the associated test-cases were large and fairly confusing

    Fixes property decorator doesn't work when aliased #2181

to be consistent with general CyFunctions.

```
 # cython: binding=True
def f(arg):
  pass

class C:  # or cdef class...
  f = f
  def g(self, ...):
    pass

C.f(something) or C().f() # doesn't enforce any checks on the type of arg -
 # with a fused function it does.

C.g(something) # assumes that self is "C" (at least for a cdef class)
 # but doesn't check it. A fused function enforces that it is C.

C.f() # fails with a fused function claiming too few arguments, even though
 # default arguments may make it a valid call
```

Obviously removing checks does make things a little less safe, but it
is consistent with the more general function behaviour. (I'm doing
this as part of a broad plan to abuse fused functions to be a bit
cleverer about decorators, but I don't think the motivation hugely
matters for this change)
A few things work differently on Python 2 (so omit them).
Some of the TypeError formatting changes around Python3.6 so just
accept any TypeError
For cdef classes.

The main advantage being that name look-up is available
so it's very easy to see that names like property or
staticmethod have been overridden
to be consistent with general CyFunctions.

```
 # cython: binding=True
def f(arg):
  pass

class C:  # or cdef class...
  f = f
  def g(self, ...):
    pass

C.f(something) or C().f() # doesn't enforce any checks on the type of arg -
 # with a fused function it does.

C.g(something) # assumes that self is "C" (at least for a cdef class)
 # but doesn't check it. A fused function enforces that it is C.

C.f() # fails with a fused function claiming too few arguments, even though
 # default arguments may make it a valid call
```

Obviously removing checks does make things a little less safe, but it
is consistent with the more general function behaviour. (I'm doing
this as part of a broad plan to abuse fused functions to be a bit
cleverer about decorators, but I don't think the motivation hugely
matters for this change)
A few things work differently on Python 2 (so omit them).
Some of the TypeError formatting changes around Python3.6 so just
accept any TypeError
Uses fused functions to handle expected case where the cdef class
is passed directly though and unexpected case where something
else is passed through
Largely works on Python 3. A bit broken on Python 2 but for
reasons that I think are due to changes in the language and which
I largely understand.

Some errors when Cython is compiled which need investigating
@da-woods da-woods changed the title [WIP] Arbitrary decorators on cdef classes [WIP] Arbitrary decorators on cdef class def functions Feb 28, 2020
Sorted out an issue with staticmethod being put out of order
Changed __pyx classmethod implementation to accept basically
everything (like Python does)
Significant changes to how decorators are structured round fused
types are.
@da-woods da-woods changed the title [WIP] Arbitrary decorators on cdef class def functions Arbitrary decorators on cdef class def functions Mar 1, 2020
@da-woods
Copy link
Contributor Author

@scoder Pinging w.r.t. 3.0 alpha. This potentially fixes a long-standing incompatibility with Python so I think this would be a good thing to include. I realise it is fairly big though.

Copy link
Contributor

@scoder scoder left a comment

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I think this needs some more thought.

@@ -815,6 +815,7 @@ class CArgDeclNode(Node):
# kw_only boolean Is a keyword-only argument
# is_dynamic boolean Non-literal arg stored inside CyFunction
# pos_only boolean Is a positional-only argument
# type_set_manually boolean Used in overriding the type of self arguments - otherwise unimportant
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced of this feature yet. I think it's worth a separate discussion. There might be other ways to achieve the original intention, for example.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Do you mean "you're not convinced it should be possible to override the type of self" or "this way of doing it is probably not the right way"?

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not convinced it should be possible to override the type of self. Being able to call a type's method on something that is not the type? How is that not a static method then?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

The main counter-argument is that it is possible in Python 3 where functions and methods are the same thing.

The difference between it and a static method is that in staticmethod always drops the self argument whether you call it with an instance or not, but it is acting very similar to a static method.

I definitely don't think this would be a good thing to do regularly. @jdemeyer claimed to have use cases for it so maybe could suggest a good reason? I'd mainly implemented it here because it seemed like a necessary step for the arbitrary decorator support.

@@ -2742,7 +2753,8 @@ class DefNode(FuncDefNode):
# specialized_cpdefs [DefNode] list of specialized cpdef DefNodes
# py_cfunc_node PyCFunctionNode/InnerFunctionNode The PyCFunction to create and assign
#
# decorator_indirection IndirectionNode Used to remove __Pyx_Method_ClassMethod for fused functions
# decorator_call_tree StatList node containing LetNodes containing SimpleCallNodes
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I'm not so sure that the DefNode should know its decorator calls. They should be part of the name assignment, and not necessarily the DefNode itself.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Yes I agree although it's replacing something similarly intrusive.

Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

although it's replacing something similarly intrusive.

Sorry for that. It's just that adding more tech debt on existing debt is the wrong direction. Changes in suboptimal code and designs should aim for leaving the code cleaner afterwards, even if that means more work along the way.

It's definitely not your fault – I'm totally aware that it's difficult to see what others (specifically me) might consider cruft or bad design for whatever reason, when all you're trying to do is to resolve that one problem at hand, following as good as you can whatever help and guiding principles you can extract from a fairly large code base. You're doing great work on that front, seriously!

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I found a much clearer way of doing it that I'll push in the next commit so it was a useful thing to point out anyway...

func = decorator.decorator
if func.is_name:
self.is_classmethod |= func.name == 'classmethod'
self.is_staticmethod |= func.name == 'staticmethod'

if self.is_classmethod and env.lookup_here('classmethod'):
cm_entry = env.lookup('classmethod')
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Why is lookup() more correct than lookup_here() in this case?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

lookup_here only checks inside the class scope so it's almost never right. It won't notice if you've re-defined classmethod at module level, which has to be much more common. e.g.

classmethod = lambda x: (lambda: 1)

cdef class C:
    @classmethod
    def cm(cls):
        pass

# classmethod() was overridden - not much we can do here ...
self.is_classmethod = False
if self.is_staticmethod and env.lookup_here('staticmethod'):
if self.is_staticmethod and env.lookup('staticmethod'):
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Shouldn't env.lookup('staticmethod') always succeed?

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

It doesn't although I can't quite work out why right now. I think it's special-cased somewhere. I'll work out why and add an explanatory comment. As with classmethod it's necessary to at least look outwards to the module scope.

else:
arg.is_self_arg = 1
arg.hdr_type = arg.type = env.parent_type
if self.is_in_arbitrary_decorator:
usual_type = PyrexTypes.FusedType([env.parent_type, PyrexTypes.py_object_type],
Copy link
Contributor

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

Phew – I'm not sure if it's always a good idea to change a "normal" function into a fused one. That's seems bit too non-obvious for users, also regarding performance characteristics etc.

Copy link
Contributor Author

Choose a reason for hiding this comment

The reason will be displayed to describe this comment to others. Learn more.

I do see the point here. This seemed less painful than a custom dispatch mechanism just for this use.

A few possible options:

  • I already added a warning message for arbitrary decorators without an explicit type (https://github.com/cython/cython/pull/3383/files#r405058184). It could be more detailed about what it's done and mention the fused function.

  • Compiler option to control this (so keep current behaviour as the default, but allow this extension with a compiler options)

  • It might be possible to make a simpler dispatch mechanism - just duplicate the tree inside the function so it's effectively in an if statement. Obviously that might well have its own complications:

     ```
     if isinstance(self, CdefClassType):
         # code
     else
          # copy of code with "self" -> "<object>self"
    

da-woods and others added 2 commits April 7, 2020 21:41
Co-Authored-By: Stefan Behnel <stefan_ml@behnel.de>
Biggest change being a simpler way of removing the inner
"staticmethod" or "classmethod" from fused function
@da-woods
Copy link
Contributor Author

da-woods commented Jun 7, 2020

I think what I might try to do with this is to split it into 2 or 3 bits (depending on how nicely it splits) to make this a more manageable size, and also to separate out controversial bits from the bits that should be pretty straightforward.

I'll mark this a WIP just to block it from being merged (not that this was an imminent risk...). All being well this PR will probably end up obsolete in favour of the smaller chunks, but I'll leave it open until that happens.

@da-woods da-woods marked this pull request as draft June 7, 2020 08:57
@mattip
Copy link
Contributor

mattip commented Oct 26, 2020

The issue this was to solve is marked as a blocker for 3.0. Is there a way to help with breaking this into a set of manageable PRs?

@da-woods
Copy link
Contributor Author

The issue this was to solve is marked as a blocker for 3.0. Is there a way to help with breaking this into a set of manageable PRs?

I think part of the problem is that I haven't looked at it since deciding that it needed to be split. I think my plan was to split it along the lines of the three points in the PR summary. I think "3" depends on "1", but "2" might be genuinely stand-alone.

There's a variety of conflicts with the master branch. I think most are trivial but the conflict with cdef property looked a bit tricky last time I looked at it.

I think it was also debatable whether the "fused type for self" approach was the right way to solve it.


I don't know what (if anything) you should do with it. I can't see me having time to really look at it at least in the next month or so, so if you can see an easy way to break off one or two parts then you're welcome to give it a go. I don't think it's a hugely important blocker for Cython 3 though.

@da-woods
Copy link
Contributor Author

I've now split this into three smaller PRs so I'll close this one.

@da-woods da-woods closed this Feb 27, 2021
Sign up for free to join this conversation on GitHub. Already have an account? Sign in to comment
Labels
None yet
Projects
None yet
3 participants