py/objtype: Add support for __set_name__. (hazard version) #15503
6b650a8
to
6194bb5
Compare
Codecov Report
All modified and coverable lines are covered by tests ✅

Additional details and impacted files (coverage diff):

| | master | #15503 | +/- |
|---|---|---|---|
| Coverage | 98.44% | 98.44% | |
| Files | 171 | 171 | |
| Lines | 22192 | 22217 | +25 |
| Hits | 21847 | 21872 | +25 |
| Misses | 345 | 345 | |
|
Code size report:
|
Thanks for the contribution. I think this is a good addition. It's important to support this; I'm just wondering about the performance hit. It means that each class that's defined will need to iterate through all of its members. I wonder if there's a way to make use of |
I do think the extra loop is necessary, for reasons described below -- but also, this code only runs at class-creation time, something that in most cases should only happen a small fixed number of times during setup. (At least, outside of intractable runtime dynamic metaclassing scenarios for which performance is already a lost cause...)
That could be done if we are only interested in addressing instance-attribute-like descriptors, but that would miss classattr- and classmethod-like descriptors that only use
A check for a My opinion is that it's better to just keep the loop separate. It does mean chasing down member pointers a second time, but the expensive part of the loop is looking up |
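For readers less familiar with the hook being discussed, here is a minimal plain-Python sketch of a non-data descriptor that defines only `__get__` plus `__set_name__`, which is the kind of member a per-class loop has to find at class-creation time. The names `ClassAttr` and `Board` are hypothetical illustrations and do not come from this PR.

```python
class ClassAttr:
    """Non-data descriptor: defines only __get__ plus the __set_name__ hook."""

    def __init__(self, factory):
        self.factory = factory
        self.name = None

    def __set_name__(self, owner, name):
        # Invoked once, when the owning class is created.
        self.name = name

    def __get__(self, instance, owner=None):
        # Works for both class access and instance access.
        return self.factory(owner, self.name)


class Board:
    # The hook tells this member that it is bound under the name "label".
    label = ClassAttr(lambda owner, name: owner.__name__ + "." + name)


print(Board.label)
```

Because the descriptor has no `__set__`, nothing about instance-attribute handling would reveal it; only a pass over the class members finds the hook.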
4f808a1
to
89f23c2
Compare
Update: I'm currently investigating some corner-case behavior I recently found around what happens (and what should happen) when a At the moment I'm still waiting on python/cpython#122381 to define what the exact semantics should be in that case -- but this PR is still valid for a merge. Whatever the correct semantics are, they'll need at least a small code-size increase. And with how rarely that gets used, I feel like it'll probably be better to gate them behind the @dpgeorge should I spend the time adding a cpydiff to this PR about what the exact mismatch is? |
Yes please. Then at least all this work/knowledge is encoded in a test and can be improved later on. |
Finally found some time to step back in to finish what I started here. It looks like python/cpython#122381 is probably going nowhere and will just leave that behavior unspecified for now, or at best will just officially document CPython's current behavior.
Unfortunately, I've not been able to find a way to actually cpydiff this :(. While I'm rebasing this onto the current master, I'll see if I can add a cpydiff that remains failing even when the iteration order happens to be the same, along with other tests that verify that hazard-free class namespace editing scenarios work. |
d0cefd3
to
7475965
Compare
Not exactly proud of needing to resort to a stochastic test to reliably show off the bug, but I have a reliable cpydiff now for the specific sequence hazard I was worried about. Fundamentally, it's a modify-while-iterating bug, of exactly the same kind that creates this diff:

```python
d = {'a':1, 'b':2, 'c':3}
for k,v in d.items():
    d[k+k]=v+v
print(d)
```

(CPython errors with CPython avoids this bug in its As a result, in my
A similar effect can happen when There's no hazard in cases where there are no class-mutating descriptors, and this might not be a use-case it's worthwhile to support. I'll see if I can create an alternative pull-request for a version that exactly matches CPython, though, so we can compare the performance cost of creating that copy on more than just speculation. |
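To make the hazard and the copy-based workaround concrete, here is a small sketch in plain Python. The function names are hypothetical. The second function iterates over a flat snapshot of the items, which is the general idea behind copying the class dict before iterating. On CPython the first function is expected to raise `RuntimeError`; other implementations may behave differently.

```python
def naive_double(d):
    # Mutates d while iterating over a live view of it (the hazard).
    for k, v in d.items():
        d[k + k] = v + v
    return d


def snapshot_double(d):
    # Iterates over a flat copy of the items, so insertions into d
    # cannot disturb the iteration.
    for k, v in list(d.items()):
        d[k + k] = v + v
    return d


print(snapshot_double({"a": 1, "b": 2, "c": 3}))

try:
    naive_double({"a": 1, "b": 2, "c": 3})
except RuntimeError as exc:
    # Expected on CPython; an implementation without this check
    # would fall through silently instead.
    print("refused:", exc)
```

The cost of the snapshot is one extra allocation proportional to the size of the dict, which is the trade-off being weighed in this thread.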
7475965
to
b6b4c35
Compare
As one more potential alternative to leaving it an undefined behavior... we could temporarily set |
I've done some benchmarking using a new suite of internalbench class creation benchmarks (see PR #16825) and have compared the benchmark times from the base branch to this branch. There's a lot of noise in this (I'll see if I can run these on real hardware at some point), but overall this patch makes processing classes take about 12% longer:
There were also two other benchmark tests that got concerningly slower:
None of the other tests in their families seemed to be affected, but my tests for #16806 and #16816 both showed |
Signed-off-by: Anson Mansfield <amansfield@mantaro.com>
Closed in favor of #16806 |
This PR adds support for the `__set_name__` data model method specified by PEP 487 - Simpler customisation of class creation.

This includes support for methods that mutate the owner class, and avoids the modify-while-iterating hazard possible in a naive implementation like micropython#15503.

Note that based on the benchmarks in micropython#16825, this is also as fast or faster than the naive implementation, thanks to clever data layout in `setname_list_t`, and the way this allows the capture step to run during an existing loop through the class dict.

Other rejected approaches for dealing with the hazard include:

- python/cpython#72983
  During the implementation of this feature for MicroPython, it was discovered that some versions of CPython also have this naive hazard. CPython resolved this bug in BPO-28797 and now makes a complete flat copy of the class's dict to iterate. This design decision doesn't make much sense for a microcontroller though, even if it's perfectly reasonable in the desktop world where memcpy might actually be cheaper than a hard-to-branch-predict conditional; and it's also motivated in their case by error-tracing considerations.
- micropython#16816
  This is an equivalent implementation to CPython's approach that places this copy directly on the stack; however it is both slower and has larger code size than the approach taken here.
- micropython#15503
  The simplest implementation is to just not worry about it and let the user face the consequences if they mutate the owner class. That's not a very friendly behavior, though, and it's not actually much more performant than this implementation on either time or code size.
- micropython#17693
  Another alternative is to do the same as micropython#15503 but leverage MicroPython's existing `is_fixed` field in its dict type to convert attempted mutations of the owner dict into `AttributeError`s. This is safer than just leaving the open hazard, but there are still important use-cases for owner-mutating descriptors, and the performance gain is small enough that it isn't worth missing support for those cases.
- combined micropython#17693 with this
  Another version of this feature used a new feature define, `MICROPY_PY_METACLASSES_LITE`, to control whether this algorithm or the naive version is used. This was rejected in favor of simplicity, based on the very limited performance margin the naive version has (which in some cases even goes _against_ it).

Signed-off-by: Anson Mansfield <amansfield@mantaro.com>
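The capture-then-call idea mentioned in the commit message can be modelled in plain Python roughly as follows. This is only a sketch of the general technique, not the C implementation and not the actual layout of `setname_list_t`; `FakeOwner`, `AddsAlias`, and `run_hooks` are hypothetical names, and a plain dict stands in for the class dict so that the host interpreter does not invoke the hooks itself.

```python
class FakeOwner:
    """Stand-in for a class under construction; `members` models its dict."""

    def __init__(self, members):
        self.members = dict(members)


class AddsAlias:
    """Owner-mutating hook: registers an extra member on the owner."""

    def __set_name__(self, owner, name):
        owner.members[name + "_alias"] = self


def run_hooks(owner):
    # Phase 1 (capture): one pass over the members, recording every
    # value whose type provides a __set_name__ hook. Nothing is called
    # yet, so the dict cannot change during this loop.
    pending = []
    for name, value in owner.members.items():
        hook = getattr(type(value), "__set_name__", None)
        if hook is not None:
            pending.append((hook, value, name))

    # Phase 2 (call): the hooks run from the captured list, so they are
    # free to add members to the owner.
    for hook, value, name in pending:
        hook(value, owner, name)
    return [name for _, _, name in pending]


owner = FakeOwner({"x": AddsAlias(), "y": 1, "z": AddsAlias()})
called = run_hooks(owner)
print(called, sorted(owner.members))
```

Only the members that actually carry a hook are recorded, so in the common case of a class with no such members the captured list stays empty and no copy of the whole dict is needed.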
Summary
This PR implements the feature described in #15501, adding support for the `__set_name__` data model method.

Testing
This PR includes and passes the unit test originally submitted in #15500 to verify the feature's absence.
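As an illustration of how presence or absence of the feature can be detected from Python, here is a small probe. It is a hypothetical sketch and is not the test submitted in #15500; the names `supports_set_name`, `Probe`, and `Holder` are made up for this example.

```python
def supports_set_name():
    """Return True if the interpreter calls __set_name__ at class creation."""
    calls = []

    class Probe:
        def __set_name__(self, owner, name):
            calls.append((owner.__name__, name))

    class Holder:
        member = Probe()

    # With the feature present, the hook ran exactly once for Holder.member.
    return calls == [("Holder", "member")]


print("supported" if supports_set_name() else "SKIP")
```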