-
-
Notifications
You must be signed in to change notification settings - Fork 32.5k
gh-136421: Load _datetime
static types during interpreter initialization
#136583
New issue
Have a question about this project? Sign up for a free GitHub account to open an issue and contact its maintainers and the community.
By clicking “Sign up for GitHub”, you agree to our terms of service and privacy statement. We’ll occasionally send you account related emails.
Already on GitHub? Sign in to your account
gh-136421: Load _datetime
static types during interpreter initialization
#136583
Conversation
I have no idea if this will fix the Windows build, but let's hope.
My concern is unnecessarily losing portability between the builtin module and the non-builtin one. This PR avoids a crash not by locking but forcing the main interpreter initialize the PyMODINIT_FUNC
PyInit__datetime(void)
{
+ if (PyDateTime_DateType.tp_subclasses == NULL) {
+ PyInterpreterState *interp = PyInterpreterState_Get();
+ // main interp currently checks a module before subinterp imports
+ assert(_Py_IsMainInterpreter(interp));
+ init_static_types(interp, 0);
+ }
return PyModuleDef_Init(&datetimemodule);
} |
Hmm, that feels really hacky. I wouldn't call the movement of |
The crash can be avoided even on pure Python level, and managed static extsension types are already special only for
Not sure about disabling. On Windows, |
Right, there are several ways to fix this that work, but I'm trying to find the fix that is the most correct. Relying on implementation details like
We have to find a balance between correctness and simplicity, and I'm not yet sure which that is. |
This is at least a feature change more than a bug fix. |
Yeah, you're right. I'd be fine with a bandaid fix for 3.14 if you want to make a PR. The |
Still rough example, but checking During the |
|
A small concern is the startup time. Can you do a micro performance benchmark to measure the time cost of |
#136620 is ready for review.
I haven't come up with how to measure so far. |
status = _PyDateTime_InitTypes(interp); | ||
if (_PyStatus_EXCEPTION(status)) { | ||
return status; | ||
} |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
status = _PyDateTime_InitTypes(interp); | |
if (_PyStatus_EXCEPTION(status)) { | |
return status; | |
} | |
if (!_Py_IsMainInterpreter(interp)) { | |
status = _PyDateTime_InitTypes(interp); | |
if (_PyStatus_EXCEPTION(status)) { | |
return status; | |
} | |
} |
Reply from #136620 (comment)
Can you run test_concurrent_initialization()
with this change? Based on the crashes that come from the change, I said this PR and my example are "almost equivalent." I'm not sure right now what this PR ensures.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Hm, what are you trying to achieve here? This will just break the types for the main interpreter.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This will just break the types for the main interpreter.
Note that test_concurrent_initialization()
does not load the _datetime
in the main inter interpreter at all.
Correction: Run the script of the test without running test_datetime
.
After discussion on DPO, the consensus is that we are okay with making @vstinner, would you mind doing a review? |
Lib/test/datetimetester.py
Outdated
@support.cpython_only | ||
def test_concurrent_initialization_subinterpreter(self): | ||
# Run in a subprocess to ensure we get a clean version of _datetime |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please put an anchor instead of the well-known explanation of assert_python_ok()
. Also, move the test to ExtensionModuleTests
(@support.cpython_only
is redundant there). TestDateTime
is the place to test the datetime class.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry, I'm not sure what you mean by "an anchor".
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Sorry about that. I meant the gh-issue number or the url.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Ok, did both.
Lib/test/datetimetester.py
Outdated
print('a', end='') | ||
|
||
with InterpreterPoolExecutor() as executor: | ||
for _ in range(8): |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Eh, I think this is fine. Many other systems have 8 cores, not 10.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Okay, but it sounds like you are talking about the max_workers
argument rather than the submit count here.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
max_workers
just chooses the number of threads on the system when not set. We could try to submit os.cpu_count()
number of futures, but we don't need to overcomplicate this; we just need something that stresses several subinterpreters trying to import _datetime
concurrently.
Modules/_datetimemodule.c
Outdated
|
||
// `&...` is not a constant expression according to a strict reading | ||
// of C standards. Fill tp_base at run-time rather than statically. | ||
// See https://bugs.python.org/issue40777 |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
This PR does not address the possible races in PyDateTime_*.tp_base = &PyDateTime_*Type;
below, right?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess not. I don't think there's an easy way to do this here, because atomically storing tp_base
will continue to race with all the non-atomic reads elsewhere.
Do we even need to load it at runtime like this? We have other examples of directly storing it in the PyTypeObject
structure:
cpython/Objects/methodobject.c
Line 396 in 958657b
.tp_base = &PyCFunction_Type, |
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not needed now because the module is statically linked, that issue happens only with dynamic loaded modules so you can define it statically now.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
I guess a _Py_IsMainInterPreter()
check will suffice in this PR?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
It is not needed now because the module is statically linked, that issue happens only with dynamic loaded modules so you can define it statically now.
Ah, TIL. That's definitely the best option here then.
I guess a _Py_IsMainInterPreter() check will suffice in this PR?
For what?
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
For ensuring tp_base
is set only once after Py_Initialize()
. But I missed the Kumar 's comment.
There was a problem hiding this comment.
Choose a reason for hiding this comment
The reason will be displayed to describe this comment to others. Learn more.
Please remove PyDateTime_DateTimeType.tp_base = &PyDateTime_DateType;
as well.
🤖 New build scheduled with the buildbot fleet by @kumaraditya303 for commit 3520514 🤖 Results will be shown at: https://buildbot.python.org/all/#/grid?branch=refs%2Fpull%2F136583%2Fmerge If you want to schedule another build, you need to add the 🔨 test-with-buildbots label again. |
I'll merge this once buildbots pass, thanks for the reviews everyone! |
Thanks @ZeroIntensity for the PR 🌮🎉.. I'm working now to backport this PR to: 3.14. |
…tialization (pythonGH-136583) `_datetime` is a special module, because it's the only non-builtin C extension that contains static types. As such, it would initialize static types in the module's execution function, which can run concurrently. Since static type initialization is not thread-safe, this caused crashes. This fixes it by moving the initialization of `_datetime`'s static types to interpreter startup (where all other static types are initialized), which is already properly protected through other locks. (cherry picked from commit a109606) Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
GH-136943 is a backport of this pull request to the 3.14 branch. |
…itialization (GH-136583) (GH-136943) gh-136421: Load `_datetime` static types during interpreter initialization (GH-136583) `_datetime` is a special module, because it's the only non-builtin C extension that contains static types. As such, it would initialize static types in the module's execution function, which can run concurrently. Since static type initialization is not thread-safe, this caused crashes. This fixes it by moving the initialization of `_datetime`'s static types to interpreter startup (where all other static types are initialized), which is already properly protected through other locks. (cherry picked from commit a109606) Co-authored-by: Peter Bierma <zintensitydev@gmail.com>
_datetime
is a special module, because it's the only non-builtin C extension that contains static types. As such, it would initialize static types in the module's execution function, which can run concurrently. Since static type initialization is not thread-safe, this caused crashes. This fixes it by moving the initialization of_datetime
's static types to interpreter startup (where all other static types are initialized), which is already properly protected through other locks._datetime
in sub-interpreters in the same time may crash the process #136421