
py/objtype: Add support for PEP487 __set_name__. #16806


Open
wants to merge 3 commits into
base: master

AJMansfield
Contributor

@AJMansfield AJMansfield commented Feb 24, 2025

Summary

This PR adds support for the __set_name__ data model method. This includes support for methods that mutate the owner class, and avoids the modify-while-iterating hazard on the class locals dictionary encountered by a more naive implementation like #15503. It's also able to do this faster than the naive implementation, thanks to clever data layout in the linked list it uses and the way this approach allows the capture step to run during an existing step that already iterates the class dict.
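
For readers less familiar with the hook, here is a minimal sketch of what `__set_name__` does, including a hook that mutates its owner class. The names `Field`, `Model` and the `has_<name>` attributes are illustrative assumptions, not taken from this PR's tests.

```python
class Field:
    """Class attribute that learns its own name via __set_name__."""

    def __set_name__(self, owner, name):
        # Called once for each attribute when the owner class is created.
        self.name = name
        # Owner mutation: add a new attribute to the class being built.
        # (Illustrative only; the attribute name is an assumption.)
        setattr(owner, "has_" + name, True)


class Model:
    x = Field()
    y = Field()


print(Model.x.name, Model.y.name, Model.has_x, Model.has_y)
```

The `setattr(owner, ...)` call is the kind of owner mutation that an implementation iterating the live class dict has to tolerate.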

Testing

This feature was developed test-first with comprehensive tests of the feature (originally cpydiffs proposed in #15500). It also includes additional tests that were needed to guarantee test sensitivity to the modify-while-iterating hazard.

I've run the test suite against both Unix and RP2 targets. I've verified that the tests are sensitive to the feature's presence, and that this implementation passes the full test suite.

Trade-offs and Alternatives

Performing __set_name__ in a way that both allows owner-class mutation and avoids the modify-while-iterating hazard requires allocating additional memory for a structure that can be iterated; within that constraint, there are several ways this can be done.
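
A rough Python model of the hazard and of the capture-then-call alternative may help here. It is a sketch only: a plain dict stands in for the class locals dict, and `Mutator`, `naive_set_names` and `captured_set_names` are hypothetical names. CPython's dict raises `RuntimeError` when its size changes during iteration; a C-level dict without that guard could fail in less visible ways.

```python
class Mutator:
    """Toy attribute whose hook adds a key to the namespace it lives in."""

    def __set_name__(self, owner, name):
        owner[name + "_extra"] = name


def naive_set_names(namespace):
    # Hazard: hooks run while the live dict is still being iterated.
    for name, value in namespace.items():
        hook = getattr(type(value), "__set_name__", None)
        if hook is not None:
            hook(value, namespace, name)


def captured_set_names(namespace):
    # Phase 1: capture what is needed for each call; no user code runs yet.
    pending = []
    for name, value in namespace.items():
        hook = getattr(type(value), "__set_name__", None)
        if hook is not None:
            pending.append((hook, value, name))
    # Phase 2: invoke the hooks; mutating `namespace` is now harmless.
    for hook, value, name in pending:
        hook(value, namespace, name)


try:
    naive_set_names({"a": Mutator()})
    naive_failed = False
except RuntimeError:
    naive_failed = True

safe_ns = {"a": Mutator(), "b": Mutator()}
captured_set_names(safe_ns)
print(naive_failed, sorted(safe_ns))
```

The `pending` list is the "additional memory" referred to above; the alternatives discussed in this PR differ mainly in where that memory lives and how it is laid out.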

Also considered was combining #17693 and this PR, using a new feature define MICROPY_PY_METACLASSES_LITE to control whether this PR's algorithm or the naive version that blocks owner mutations is used. But, based on the very limited performance margin the naive version has (which in some cases even goes against it), it was determined to remove that and only support a full implementation.


github-actions bot commented Feb 24, 2025

Code size report:

   bare-arm:    +0 +0.000% 
minimal x86:    +0 +0.000% 
   unix x64:  +232 +0.027% standard[incl +32(data)]
      stm32:  +132 +0.034% PYBV10
     mimxrt:  +136 +0.036% TEENSY40
        rp2:  +136 +0.015% RPI_PICO_W
       samd:  +152 +0.057% ADAFRUIT_ITSYBITSY_M4_EXPRESS
  qemu rv32:  +145 +0.032% VIRT_RV32


codecov bot commented Feb 24, 2025

Codecov Report

All modified and coverable lines are covered by tests ✅

Project coverage is 98.39%. Comparing base (e993f53) to head (832f292).

Additional details and impacted files
@@            Coverage Diff             @@
##           master   #16806      +/-   ##
==========================================
- Coverage   98.41%   98.39%   -0.02%     
==========================================
  Files         171      171              
  Lines       22210    22228      +18     
==========================================
+ Hits        21857    21871      +14     
- Misses        353      357       +4     


@AJMansfield AJMansfield force-pushed the set-name-2 branch 4 times, most recently from 5d73521 to c6bd944 Compare February 24, 2025 22:36
@AJMansfield AJMansfield changed the title py/objtype: Add hazard-free support for __set_name__. py/objtype: Add hazard-free support for __set_name__. (list version) Feb 25, 2025
@AJMansfield AJMansfield changed the title py/objtype: Add hazard-free support for __set_name__. (list version) py/objtype: Add support for __set_name__. (list version) Feb 26, 2025
@AJMansfield
Contributor Author

AJMansfield commented Feb 26, 2025

I've done some benchmarking using a new suite of internalbench class creation benchmarks (see PR #16825) and have compared the benchmark times from the base branch to this branch.

There's a lot of noise in this (I'll see if I can run these on real hardware at some point), but overall this patch makes processing classes take about 32% longer:

internal_bench/class_create:
    0.349 -> 0.450 (+29%) internal_bench/class_create-0-empty.py
    0.476 -> 0.630 (+33%) internal_bench/class_create-1-slots.py
    0.490 -> 0.622 (+27%) internal_bench/class_create-1.1-slots5.py
    0.432 -> 0.551 (+28%) internal_bench/class_create-2-classattr.py
    0.776 -> 0.970 (+25%) internal_bench/class_create-2.1-classattr5.py
    0.461 -> 0.594 (+29%) internal_bench/class_create-3-instancemethod.py
    0.477 -> 0.634 (+33%) internal_bench/class_create-4-classmethod.py
    0.452 -> 0.619 (+37%) internal_bench/class_create-4.1-classmethod_implicit.py
    0.494 -> 0.651 (+32%) internal_bench/class_create-5-staticmethod.py
    0.458 -> 0.599 (+31%) internal_bench/class_create-6-getattribute.py
    0.474 -> 0.572 (+21%) internal_bench/class_create-6.1-getattr.py
    0.386 -> 0.518 (+34%) internal_bench/class_create-6.2-descriptor.py
    0.517 -> 0.722 (+39%) internal_bench/class_create-6.3-descriptor_setname.py
    0.419 -> 0.573 (+37%) internal_bench/class_create-6.4-property.py
    0.363 -> 0.477 (+31%) internal_bench/class_create-7-inherit.py
    0.368 -> 0.491 (+33%) internal_bench/class_create-7.1-inherit_initsubclass.py

There were also three other benchmark tests that got concerningly slower:

internal_bench/arrayop:
    0.174 -> 0.196 (+13%) internal_bench/arrayop-3-bytearray_inplace.py
internal_bench/func_builtin:
    0.209 -> 0.232 (+11%) internal_bench/func_builtin-2-enum_kw.py
internal_bench/loop_count:
    0.268 -> 0.340 (+27%) internal_bench/loop_count-2-range_iter.py

None of the other tests in their families seemed to be affected, but my tests for #15503 and #16816 both showed loop_count-2 slowed by near-enough the exact same margins as class creation.

@dpgeorge dpgeorge added the py-core Relates to py/ directory in source label Mar 13, 2025
@AJMansfield AJMansfield force-pushed the set-name-2 branch 3 times, most recently from 7a169aa to 0dc3ccb Compare July 16, 2025 17:41
@AJMansfield
Contributor Author

Ended up switching out mp_obj_list_t for a bespoke setname_list_t linked list structure, in order to be able to capture all of the per-attr values needed for each __set_name__ call into a struct with the same memory layout as the eventual call list. (Using the position for the owner argument for the linked list pointer.)
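
To illustrate the slot-reuse idea in the comment above, here is a Python model, not the C code from this PR. Each node is a four-slot list laid out like the eventual call (function, self, owner, name), and the owner slot temporarily holds the link to the next node. The helper names and the prepend-at-head ordering are assumptions of this sketch.

```python
# Slot indices; the third slot holds `next` until the node is called.
FUN, SELF, OWNER_OR_NEXT, NAME = range(4)


def capture(head, value, name):
    """Prepend a node for `value` if its type defines __set_name__."""
    hook = getattr(type(value), "__set_name__", None)
    if hook is None:
        return head
    return [hook, value, head, name]


def call_all(head, owner):
    """Walk the list, turning each node into a ready-made argument list."""
    while head is not None:
        node = head
        head = node[OWNER_OR_NEXT]      # follow the link first...
        node[OWNER_OR_NEXT] = owner     # ...then overwrite it with owner
        fun, *args = node
        fun(*args)                      # fun(self, owner, name)


calls = []


class Named:
    def __set_name__(self, owner, name):
        calls.append((owner, name))


head = None
for attr_name, attr_value in {"a": Named(), "n": 5, "b": Named()}.items():
    head = capture(head, attr_value, attr_name)
call_all(head, "Owner")
print(calls)
```

Because this sketch prepends, the hooks run in reverse capture order; that is a property of the sketch, and the thread does not say which order the real structure uses.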

@AJMansfield AJMansfield force-pushed the set-name-2 branch 2 times, most recently from d710e5a to b69bfbd Compare July 18, 2025 17:57
@AJMansfield
Contributor Author

AJMansfield commented Jul 18, 2025

Also, since it's only binding rather than calling __set_name__ during the locals dict loop, it was possible to combine that with the loop that does the special accessors check, per #15503 (comment)

There's already check_for_special_accessors() that's run on class creation in a loop over all members. Maybe that can be reused instead of essentially duplicating the work?
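
As a sketch of why binding (rather than calling) makes the merge possible: a single pass can do both the accessor check and the capture, because no user code runs inside the loop. This is a Python model under stated assumptions; `is_special_accessor` is a hypothetical stand-in and does not reproduce the real `check_for_special_accessors()` logic.

```python
def is_special_accessor(value):
    # Placeholder test (assumption): treat anything with __set__ as special.
    return hasattr(type(value), "__set__")


def scan_class_dict(namespace):
    """One pass: record the accessor flag and capture __set_name__ hooks."""
    has_special_accessors = False
    pending = []
    for name, value in namespace.items():
        if is_special_accessor(value):
            has_special_accessors = True
        hook = getattr(type(value), "__set_name__", None)
        if hook is not None:
            pending.append((hook, value, name))  # bind now, call later
    return has_special_accessors, pending


class Accessor:
    def __set__(self, obj, value):
        pass


class Named:
    def __set_name__(self, owner, name):
        self.name = name


flag, pending = scan_class_dict({"acc": Accessor(), "d": Named(), "n": 1})
print(flag, [name for _, _, name in pending])
```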

@AJMansfield AJMansfield force-pushed the set-name-2 branch 3 times, most recently from c0934f9 to 96e0b67 Compare July 18, 2025 18:19
@AJMansfield
Contributor Author

Spurious coverage change for 8179697, along with three different spurious failures on that same extmod/select_poll_eintr.py test.

I've also manually tested the path gated behind MICROPY_PY_DESCRIPTORS && !MICROPY_PY_METACLASSES_LITE to verify it passes the tests from #17693, but I'm not sure there's actually any way to write a SKIP condition for them that isn't just checking the very thing the tests are meant to discriminate.

@dpgeorge
Member

Would be good to see the change in benchmark results with the latest version here.

@AJMansfield
Contributor Author

AJMansfield commented Jul 20, 2025

I ran an updated version of the benchmarks in #16825 on my Pico2 board to compare the performance of master vs this PR with MICROPY_PY_METACLASSES_LITE=0 ("is_fixed") and vs MICROPY_PY_METACLASSES_LITE=1 ("setname_list").

| test | master | is_fixed | is_fixed vs master | setname_list | setname_list vs master | setname_list vs is_fixed |
|---|---|---|---|---|---|---|
| class_create-0-empty.py | 0.171s | 0.178s | +4.09% | 0.179s | +4.68% | +0.58% |
| class_create-1-slots.py | 0.241s | 0.248s | +2.90% | 0.242s | +0.41% | -2.49% |
| class_create-1.1-slots5.py | 0.242s | 0.248s | +2.48% | 0.242s | +0.00% | -2.48% |
| class_create-2-classattr.py | 0.207s | 0.218s | +5.31% | 0.218s | +5.31% | 0.00% |
| class_create-2.1-classattr5.py | 0.329s | 0.354s | +7.60% | 0.354s | +7.60% | 0.00% |
| class_create-2.3-classattr5objs.py | 0.435s | 0.469s | +7.82% | 0.465s | +6.90% | -0.92% |
| class_create-3-instancemethod.py | 0.227s | 0.228s | +0.44% | 0.230s | +1.32% | +0.88% |
| class_create-4-classmethod.py | 0.241s | 0.246s | +2.07% | 0.253s | +4.98% | +2.90% |
| class_create-4.1-classmethod_implicit.py | 0.230s | 0.231s | +0.43% | 0.232s | +0.87% | +0.43% |
| class_create-5-staticmethod.py | 0.241s | 0.246s | +2.07% | 0.247s | +2.49% | +0.41% |
| class_create-6-getattribute.py | 0.227s | 0.229s | +0.88% | 0.230s | +1.32% | +0.44% |
| class_create-6.1-getattr.py | 0.227s | 0.229s | +0.88% | 0.230s | +1.32% | +0.44% |
| class_create-6.2-property.py | 0.222s | 0.225s | +1.35% | 0.228s | +2.70% | +1.35% |
| class_create-6.3-descriptor.py | 0.218s | 0.220s | +0.92% | 0.219s | +0.46% | -0.46% |
| class_create-7-inherit.py | 0.175s | 0.189s | +8.00% | 0.192s | +9.71% | +1.71% |
| class_create-7.1-inherit_initsubclass.py | 0.175s | 0.190s | +8.57% | 0.192s | +9.71% | +1.14% |
| class_create-8-metaclass_setname.py | 0.257s | 0.275s | +7.00% | 0.277s | +7.78% | +0.78% |
| class_create-8.1-metaclass_setname5.py | 0.439s | 0.548s | +24.83% | 0.545s | +24.15% | -0.68% |
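
For anyone checking the percentage columns, this sketch reproduces the first row of the table. One inference, not stated in the thread: the final column appears consistent with the difference between the two variants taken relative to the master time, rather than relative to is_fixed. The published timings are rounded to three decimals, so other rows may differ in the last digit.

```python
def pct_change(new, base):
    """Relative change of `new` against `base`, in percent."""
    return (new - base) / base * 100.0


# First row of the table: class_create-0-empty.py
master, is_fixed, setname_list = 0.171, 0.178, 0.179

fixed_vs_master = pct_change(is_fixed, master)
list_vs_master = pct_change(setname_list, master)
# Assumption: last column is normalised by the master time.
list_vs_fixed = (setname_list - is_fixed) / master * 100.0

print(f"{fixed_vs_master:+.2f}% {list_vs_master:+.2f}% {list_vs_fixed:+.2f}%")
```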
Raw Test Results

(Note: these test runs were terminated at the start of the internal_bench/var tests, because something about those tests, which I haven't bothered to diagnose, didn't work correctly on hardware.)

master: Feature Not Implemented

Run from v1.25.0-390-g270b00215 on the rp2 port, on my overclocked Pico2.

internal_bench/arrayop:
    0.049s (+00.00%) internal_bench/arrayop-1-list_inplace.py
    0.098s (+100.19%) internal_bench/arrayop-2-list_map.py
    0.058s (+18.15%) internal_bench/arrayop-3-bytearray_inplace.py
    0.115s (+134.49%) internal_bench/arrayop-4-bytearray_map.py
internal_bench/bytealloc:
    0.132s (+00.00%) internal_bench/bytealloc-1-bytes_n.py
    0.353s (+166.45%) internal_bench/bytealloc-2-repeat.py
internal_bench/bytebuf:
    0.058s (+00.00%) internal_bench/bytebuf-1-inplace.py
    0.454s (+680.66%) internal_bench/bytebuf-2-join_map_bytes.py
    0.115s (+98.44%) internal_bench/bytebuf-3-bytarray_map.py
internal_bench/class_create:
    0.171s (+00.00%) internal_bench/class_create-0-empty.py
    0.241s (+41.46%) internal_bench/class_create-1-slots.py
    0.242s (+41.48%) internal_bench/class_create-1.1-slots5.py
    0.207s (+21.36%) internal_bench/class_create-2-classattr.py
    0.329s (+92.92%) internal_bench/class_create-2.1-classattr5.py
    0.435s (+155.09%) internal_bench/class_create-2.3-classattr5objs.py
    0.227s (+32.82%) internal_bench/class_create-3-instancemethod.py
    0.241s (+41.11%) internal_bench/class_create-4-classmethod.py
    0.230s (+34.48%) internal_bench/class_create-4.1-classmethod_implicit.py
    0.241s (+41.28%) internal_bench/class_create-5-staticmethod.py
    0.227s (+33.08%) internal_bench/class_create-6-getattribute.py
    0.227s (+33.16%) internal_bench/class_create-6.1-getattr.py
    0.222s (+30.20%) internal_bench/class_create-6.2-property.py
    0.218s (+27.73%) internal_bench/class_create-6.3-descriptor.py
    0.175s (+02.61%) internal_bench/class_create-7-inherit.py
    0.175s (+02.74%) internal_bench/class_create-7.1-inherit_initsubclass.py
    0.257s (+50.62%) internal_bench/class_create-8-metaclass_setname.py
    0.439s (+157.22%) internal_bench/class_create-8.1-metaclass_setname5.py
internal_bench/from_iter:
    0.014s (+00.00%) internal_bench/from_iter-1-list_bound.py
    0.092s (+540.42%) internal_bench/from_iter-2-list_unbound.py
    0.018s (+22.85%) internal_bench/from_iter-3-tuple_bound.py
    0.092s (+538.86%) internal_bench/from_iter-4-tuple_unbound.py
    0.019s (+29.40%) internal_bench/from_iter-5-bytes_bound.py
    0.135s (+839.60%) internal_bench/from_iter-6-bytes_unbound.py
    0.018s (+23.16%) internal_bench/from_iter-7-bytearray_bound.py
    0.107s (+646.66%) internal_bench/from_iter-8-bytearray_unbound.py
internal_bench/func_args:
    0.526s (+00.00%) internal_bench/func_args-1.1-pos_1.py
    0.561s (+06.59%) internal_bench/func_args-1.2-pos_3.py
    0.587s (+11.66%) internal_bench/func_args-2-pos_default_2_of_3.py
    0.628s (+19.39%) internal_bench/func_args-3.1-kw_1.py
    0.891s (+69.32%) internal_bench/func_args-3.2-kw_3.py
internal_bench/func_builtin:
    0.084s (+00.00%) internal_bench/func_builtin-1-enum_pos.py
    0.093s (+10.24%) internal_bench/func_builtin-2-enum_kw.py
internal_bench/funcall:
    0.183s (+00.00%) internal_bench/funcall-1-inline.py
    0.596s (+226.21%) internal_bench/funcall-2-funcall.py
    0.538s (+194.47%) internal_bench/funcall-3-funcall-local.py
internal_bench/loop_count:
    0.177s (+00.00%) internal_bench/loop_count-1-range.py
    0.102s (-42.46%) internal_bench/loop_count-2-range_iter.py
    0.183s (+03.38%) internal_bench/loop_count-3-while_up.py
    0.182s (+02.63%) internal_bench/loop_count-4-while_down_gt.py
    0.198s (+11.65%) internal_bench/loop_count-5-while_down_ne.py
    0.199s (+12.41%) internal_bench/loop_count-5.1-while_down_ne_localvar.py

is_fixed: MICROPY_PY_METACLASSES_LITE=0

Run from v1.25.0-397-gcd65afdda on the rp2 port with the following variant, on my overclocked Pico2.

# mpconfigvariant_NOMETA.cmake
set(PICO_PLATFORM "rp2350")
list(APPEND MICROPY_DEF_BOARD
    MICROPY_PY_METACLASSES_LITE=0
)
internal_bench/arrayop:
    0.049s (+00.00%) internal_bench/arrayop-1-list_inplace.py
    0.106s (+115.98%) internal_bench/arrayop-2-list_map.py
    0.058s (+18.25%) internal_bench/arrayop-3-bytearray_inplace.py
    0.115s (+134.01%) internal_bench/arrayop-4-bytearray_map.py
internal_bench/bytealloc:
    0.132s (+00.00%) internal_bench/bytealloc-1-bytes_n.py
    0.353s (+166.49%) internal_bench/bytealloc-2-repeat.py
internal_bench/bytebuf:
    0.058s (+00.00%) internal_bench/bytebuf-1-inplace.py
    0.508s (+773.84%) internal_bench/bytebuf-2-join_map_bytes.py
    0.115s (+97.90%) internal_bench/bytebuf-3-bytarray_map.py
internal_bench/class_create:
    0.178s (+00.00%) internal_bench/class_create-0-empty.py
    0.248s (+39.40%) internal_bench/class_create-1-slots.py
    0.248s (+39.41%) internal_bench/class_create-1.1-slots5.py
    0.218s (+22.27%) internal_bench/class_create-2-classattr.py
    0.354s (+98.67%) internal_bench/class_create-2.1-classattr5.py
    0.469s (+163.13%) internal_bench/class_create-2.3-classattr5objs.py
    0.228s (+28.17%) internal_bench/class_create-3-instancemethod.py
    0.246s (+37.92%) internal_bench/class_create-4-classmethod.py
    0.231s (+29.38%) internal_bench/class_create-4.1-classmethod_implicit.py
    0.246s (+37.90%) internal_bench/class_create-5-staticmethod.py
    0.229s (+28.46%) internal_bench/class_create-6-getattribute.py
    0.229s (+28.50%) internal_bench/class_create-6.1-getattr.py
    0.225s (+26.17%) internal_bench/class_create-6.2-property.py
    0.220s (+23.46%) internal_bench/class_create-6.3-descriptor.py
    0.189s (+06.31%) internal_bench/class_create-7-inherit.py
    0.190s (+06.40%) internal_bench/class_create-7.1-inherit_initsubclass.py
    0.275s (+54.38%) internal_bench/class_create-8-metaclass_setname.py
    0.548s (+207.65%) internal_bench/class_create-8.1-metaclass_setname5.py
internal_bench/from_iter:
    0.014s (+00.00%) internal_bench/from_iter-1-list_bound.py
    0.092s (+537.66%) internal_bench/from_iter-2-list_unbound.py
    0.018s (+22.86%) internal_bench/from_iter-3-tuple_bound.py
    0.092s (+535.57%) internal_bench/from_iter-4-tuple_unbound.py
    0.019s (+29.82%) internal_bench/from_iter-5-bytes_bound.py
    0.163s (+1029.18%) internal_bench/from_iter-6-bytes_unbound.py
    0.018s (+23.18%) internal_bench/from_iter-7-bytearray_bound.py
    0.107s (+642.94%) internal_bench/from_iter-8-bytearray_unbound.py
internal_bench/func_args:
    0.525s (+00.00%) internal_bench/func_args-1.1-pos_1.py
    0.559s (+06.61%) internal_bench/func_args-1.2-pos_3.py
    0.586s (+11.69%) internal_bench/func_args-2-pos_default_2_of_3.py
    0.627s (+19.44%) internal_bench/func_args-3.1-kw_1.py
    0.889s (+69.50%) internal_bench/func_args-3.2-kw_3.py
internal_bench/func_builtin:
    0.087s (+00.00%) internal_bench/func_builtin-1-enum_pos.py
    0.093s (+05.99%) internal_bench/func_builtin-2-enum_kw.py
internal_bench/funcall:
    0.182s (+00.00%) internal_bench/funcall-1-inline.py
    0.594s (+226.31%) internal_bench/funcall-2-funcall.py
    0.537s (+194.82%) internal_bench/funcall-3-funcall-local.py
internal_bench/loop_count:
    0.177s (+00.00%) internal_bench/loop_count-1-range.py
    0.101s (-42.84%) internal_bench/loop_count-2-range_iter.py
    0.183s (+03.38%) internal_bench/loop_count-3-while_up.py
    0.183s (+03.01%) internal_bench/loop_count-4-while_down_gt.py
    0.199s (+12.03%) internal_bench/loop_count-5-while_down_ne.py
    0.200s (+12.78%) internal_bench/loop_count-5.1-while_down_ne_localvar.py

setname_list: MICROPY_PY_METACLASSES_LITE=1

Run from v1.25.0-397-gcd65afdda on the rp2 port, on my overclocked Pico2.

internal_bench/arrayop:
    0.049s (+00.00%) internal_bench/arrayop-1-list_inplace.py
    0.090s (+81.91%) internal_bench/arrayop-2-list_map.py
    0.058s (+18.14%) internal_bench/arrayop-3-bytearray_inplace.py
    0.099s (+100.45%) internal_bench/arrayop-4-bytearray_map.py
internal_bench/bytealloc:
    0.132s (+00.00%) internal_bench/bytealloc-1-bytes_n.py
    0.353s (+166.28%) internal_bench/bytealloc-2-repeat.py
internal_bench/bytebuf:
    0.058s (-00.00%) internal_bench/bytebuf-1-inplace.py
    0.503s (+763.99%) internal_bench/bytebuf-2-join_map_bytes.py
    0.099s (+69.62%) internal_bench/bytebuf-3-bytarray_map.py
internal_bench/class_create:
    0.179s (+00.00%) internal_bench/class_create-0-empty.py
    0.242s (+35.34%) internal_bench/class_create-1-slots.py
    0.242s (+35.34%) internal_bench/class_create-1.1-slots5.py
    0.218s (+22.21%) internal_bench/class_create-2-classattr.py
    0.354s (+98.17%) internal_bench/class_create-2.1-classattr5.py
    0.465s (+160.14%) internal_bench/class_create-2.3-classattr5objs.py
    0.230s (+28.65%) internal_bench/class_create-3-instancemethod.py
    0.253s (+41.38%) internal_bench/class_create-4-classmethod.py
    0.232s (+30.05%) internal_bench/class_create-4.1-classmethod_implicit.py
    0.247s (+38.22%) internal_bench/class_create-5-staticmethod.py
    0.230s (+28.91%) internal_bench/class_create-6-getattribute.py
    0.230s (+28.90%) internal_bench/class_create-6.1-getattr.py
    0.228s (+27.66%) internal_bench/class_create-6.2-property.py
    0.219s (+22.50%) internal_bench/class_create-6.3-descriptor.py
    0.192s (+07.61%) internal_bench/class_create-7-inherit.py
    0.192s (+07.68%) internal_bench/class_create-7.1-inherit_initsubclass.py
    0.277s (+55.05%) internal_bench/class_create-8-metaclass_setname.py
    0.545s (+204.96%) internal_bench/class_create-8.1-metaclass_setname5.py
internal_bench/from_iter:
    0.014s (+00.00%) internal_bench/from_iter-1-list_bound.py
    0.075s (+420.75%) internal_bench/from_iter-2-list_unbound.py
    0.018s (+22.72%) internal_bench/from_iter-3-tuple_bound.py
    0.075s (+418.89%) internal_bench/from_iter-4-tuple_unbound.py
    0.019s (+29.41%) internal_bench/from_iter-5-bytes_bound.py
    0.146s (+908.48%) internal_bench/from_iter-6-bytes_unbound.py
    0.018s (+23.10%) internal_bench/from_iter-7-bytearray_bound.py
    0.091s (+526.54%) internal_bench/from_iter-8-bytearray_unbound.py
internal_bench/func_args:
    0.527s (+00.00%) internal_bench/func_args-1.1-pos_1.py
    0.561s (+06.58%) internal_bench/func_args-1.2-pos_3.py
    0.588s (+11.64%) internal_bench/func_args-2-pos_default_2_of_3.py
    0.629s (+19.37%) internal_bench/func_args-3.1-kw_1.py
    0.891s (+69.24%) internal_bench/func_args-3.2-kw_3.py
internal_bench/func_builtin:
    0.083s (+00.00%) internal_bench/func_builtin-1-enum_pos.py
    0.089s (+06.61%) internal_bench/func_builtin-2-enum_kw.py
internal_bench/funcall:
    0.184s (+00.00%) internal_bench/funcall-1-inline.py
    0.671s (+264.67%) internal_bench/funcall-2-funcall.py
    0.615s (+234.37%) internal_bench/funcall-3-funcall-local.py
internal_bench/loop_count:
    0.178s (+00.00%) internal_bench/loop_count-1-range.py
    0.103s (-42.30%) internal_bench/loop_count-2-range_iter.py
    0.184s (+03.37%) internal_bench/loop_count-3-while_up.py
    0.182s (+02.25%) internal_bench/loop_count-4-while_down_gt.py
    0.198s (+11.23%) internal_bench/loop_count-5-while_down_ne.py
    0.199s (+11.98%) internal_bench/loop_count-5.1-while_down_ne_localvar.py

The important cases to pay attention to here IMO are class_create-2.3-classattr5objs.py and its comparison to class_create-8.1-metaclass_setname5.py, where the former demonstrates the overhead attributable to just performing the extra lookups of __set_name__ on concrete objects, while metaclass_setname5 demonstrates that plus the overhead of storing and invoking or just invoking the functions it finds.

The performance penalty compared to not implementing the feature isn't large --- certain cases are ~8% slower, but others are unaffected. The differences between is_fixed vs setname_list are also extremely small, to the point where I question their statistical significance. The code size difference is also quite small; setname_list is only larger by 16 bytes on rp2, and similarly by small amounts on other ports.

@AJMansfield
Contributor Author

Coverage change for 03b37bd is spurious.

@dpgeorge
Member

Thanks for running the benchmarks, that's quite comprehensive and certainly helps to evaluate this PR.

(Note that I think you have the sign the wrong way around in the final column "vs is_fixed".)

The important cases to pay attention to here IMO are class_create-2.3-classattr5objs.py and its comparison to class_create-8.1-metaclass_setname5.py

So, both of those are faster with the list version, compared to is-fixed. That makes sense to me because the list version only needs to iterate once through the locals dict, whereas the is-fixed version iterates twice.

The performance penalty compared to not implementing the feature isn't large --- certain cases are ~8% slower, but others are unaffected

Yes, that's good. And it's worth explicitly pointing out that this is about class creation, not instance (of a class) creation. The former is done once, the latter many times. So making class creation a little bit slower in order to add this feature is acceptable (and it won't affect instance creation performance).
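
A small sketch of that distinction, using illustrative names (`Tracked`, `Widget`): the hook fires once per attribute when the class statement executes, and creating instances afterwards triggers no further calls.

```python
calls = []


class Tracked:
    def __set_name__(self, owner, name):
        calls.append(name)


class Widget:
    a = Tracked()
    b = Tracked()


calls_after_class = len(calls)

# Creating many instances does not invoke __set_name__ again.
widgets = [Widget() for _ in range(1000)]
calls_after_instances = len(calls)

print(calls_after_class, calls_after_instances)
```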

The code size difference is also quite small; setname_list is only larger by 16 bytes on rp2, and similarly by small amounts on other ports.

Yes, indeed. I tested esp8266 and the list version is only +40 bytes. And on PYBV10 it's only +24 bytes.

And from a quick static code analysis, I can't fully rule out that locals_dict is in ROM, so it may require additional code to check that is_fixed is not already set to 1 before changing it to 1 temporarily.
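
The extra check being described could look roughly like this in a Python model. `FixableDict` and `temporarily_fixed` are hypothetical stand-ins for the C dict and its `is_fixed` flag; the point is only that the flag is left untouched when it was already set.

```python
from contextlib import contextmanager


class FixableDict(dict):
    """Toy dict with an is_fixed flag that blocks item assignment."""

    is_fixed = False

    def __setitem__(self, key, value):
        if self.is_fixed:
            raise AttributeError("dict is fixed")
        super().__setitem__(key, value)


@contextmanager
def temporarily_fixed(d):
    if d.is_fixed:
        # Already fixed (for example a read-only table): never write the flag.
        yield d
        return
    d.is_fixed = True
    try:
        yield d
    finally:
        d.is_fixed = False


ram = FixableDict(a=1)
with temporarily_fixed(ram):
    try:
        ram["b"] = 2
        blocked = False
    except AttributeError:
        blocked = True
ram["b"] = 2  # allowed again after the block

rom = FixableDict()
rom.is_fixed = True
with temporarily_fixed(rom):
    pass
print(blocked, ram.is_fixed, rom.is_fixed)
```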


My conclusion: I think we should go with this approach to __set_name__, and remove the MICROPY_PY_METACLASSES_LITE=0 option (ie it's always just the list version and always enabled if MICROPY_PY_DESCRIPTORS is enabled).

@AJMansfield
Contributor Author

(Note that I think you have the sign the wrong way around in the final column "vs is_fixed".)

Indeed, fixed it.

And from a quick static code analysis, I can't fully rule out that locals_dict is in ROM, so it may require additional code to check that is_fixed is not already set to 1 before changing it to 1 temporarily.

Not a hazard I realized might be at play.

My conclusion: I think we should go with this approach to __set_name__, and remove the MICROPY_PY_METACLASSES_LITE=0 option (ie it's always just the list version and always enabled if MICROPY_PY_DESCRIPTORS is enabled).

I was surprised at how close the two are too; but yeah I agree, since they're so close we might as well leave out the less-functional version.

@dpgeorge dpgeorge added this to the release-1.26.0 milestone Jul 21, 2025
@dpgeorge
Member

This looks really good now.

I will do a final review, and might have a little play to see if there is any way to reduce code size.

@AJMansfield AJMansfield changed the title py/objtype: Add support for __set_name__. (list version) py/objtype: Add support for PEP487 __set_name__. Jul 21, 2025
@AJMansfield
Contributor Author

I've rewritten the PR summary to document the conclusions so far.
@dpgeorge Should I reword to include this summary in the commit message? Also, which commit should have that summary? (Or, should I just squash 43ed928 and 0126fad together?)

@dpgeorge
Member

Or, should I just squash 43ed928 and 0126fad together?

Yes, please squash those commits.

This PR could easily be just 3 commits: implementation, tests and docs.

Should I reword to include this summary in the commit message?

Yes please.

Including the stochastic tests needed to guarantee sensitivity to the
potential iterate-while-modifying hazard a naive implementation might
have.

Signed-off-by: Anson Mansfield <amansfield@mantaro.com>
This PR adds support for the `__set_name__` data model method
specified by PEP487 - Simpler customisation of class creation.

This includes support for methods that mutate the owner class,
and avoids the modify-while-iterating hazard possible in a naive
implementation like micropython#15503.

Note that based on the benchmarks in micropython#16825, this
is also as fast or faster than the naive implementation, thanks to
clever data layout in `setname_list_t`, and the way this allows the
capture step to run during an existing loop through the class dict.

Other rejected approaches for dealing with the hazard include:

- python/cpython#72983
During the implementation of this feature for MicroPython, it was
discovered that some versions of CPython also have this naive hazard.
CPython resolved this bug in BPO-28797 and now makes a complete flat
copy of the class's dict to iterate. This design decision doesn't make
much sense for a microcontroller though, even if it's perfectly
reasonable in the desktop world where memcpy might actually be cheaper
than a hard-to-branch-predict conditional; and it's also motivated in
their case by error-tracing considerations.

- micropython#16816
This is an equivalent implementation to CPython's approach that places
this copy directly on the stack; however it is both slower and has
larger code size than the approach taken here.

- micropython#15503
The simplest implementation is to just not worry about it and let the
user face the consequences if they mutate the owner class.
That's not a very friendly behavior, though, and it's not actually much
more performant than this implementation on either time or code size.

- micropython#17693
Another alternative is to do the same as micropython#15503 but leverage
MicroPython's existing `is_fixed` field in its dict type to convert
attempted mutations of the owner dict into `AttributeError`s.
This is safer than just leaving the open hazard, but there are still
important use-cases for owner-mutating descriptors, and the performance
gain is small enough that it isn't worth missing support for those cases.

- combined micropython#17693 with this
Another version of this feature used a new feature define,
`MICROPY_PY_METACLASSES_LITE`, to control whether this algorithm or the
naive version is used. This was rejected in favor of simplicity, based
on the very limited performance margin the naive version has (which in
some cases even goes _against_ it).

Signed-off-by: Anson Mansfield <amansfield@mantaro.com>
Signed-off-by: Anson Mansfield <amansfield@mantaro.com>
@AJMansfield
Contributor Author

AJMansfield commented Jul 21, 2025

I ... might have a little play to see if there is any way to reduce code size.

One thought: are there any allocator modes that would make it safe to unconditionally call m_del_obj on the list's nodes? i.e. without having to guard against passing a location on the stack.

It might also be ok to just omit the call to m_del_obj entirely --- with the traversal algorithm already unlinking next when it sets the owner argument, what the garbage collector is left to clean up isn't actually that pessimistically structured.
