MNT: add linter for thread-unsafe C API uses #28634
Conversation
(force-pushed from 62be0b4 to b77b43a)
Does this need a whole new CI job? I know I suggested that in the issue, but it occurs to me that this could be integrated into one of the existing lint steps. Also, rather than just checking diffs, we could do this codebase-wide and add e.g. opt-out comments for the existing uses. If it ends up being really invasive to add the lint opt-outs, we can reconsider.

If the lint fails, it should suggest what to do to fix the issue: either switch to a thread-safe alternative, or add an opt-out to the lint if the use is known to be safe.
The linter should also probably check all of the functions listed in the free-threaded HOWTO in the CPython docs: https://docs.python.org/3/howto/free-threading-extensions.html#borrowed-references

Also see here for more details about why we need to do this and why we can't just globally replace all these functions: https://py-free-threading.github.io/porting-extensions/#cpython-c-api-usage

We probably should also be linting uses of fast accessor macros that don't do any locking. Also, FWIW, there are probably still lurking issues; I'm aware of ...
I think this goes in the right direction, but may need a bit of care. Overall, perhaps good to keep it simple, and maybe just use git grep for all of them, like:

git grep -e PyList_GetItem\( --or -e PyList_GET_ITEM\( ... origin/main...HEAD

Or perhaps better, write the list to a file and use -f.

Notes:
- It is important to add the trailing (, otherwise correct functions like PyList_GetItemRef will be matched too.
- You have to be sure to match all C files, so also .c.src and .cpp, etc. My suggestion would be to just match all files and perhaps remove .rst ... since this only runs on the differences, false positives are not very likely.

I almost wonder if one could do a custom clang-tidy style check (not sure if that can work with the templated .c.src files, though). Plus that means we have an established pattern for allowing it: just add a NOLINT-style comment.
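A minimal Python sketch of the grep-based check discussed above, which makes the point about the trailing ( concrete. The function names come from this thread; the helper name and structure are purely illustrative:

```python
import re

# Flag calls to thread-unsafe getters. Requiring a following "(" keeps
# safe variants such as PyList_GetItemRef from matching.
UNSAFE = ["PyList_GetItem", "PyList_GET_ITEM", "PyDict_GetItem"]
PATTERN = re.compile(
    r"\b(" + "|".join(re.escape(name) for name in UNSAFE) + r")\s*\("
)

def find_unsafe_calls(source):
    """Return (line_number, function_name) pairs for flagged calls."""
    return [
        (lineno, match.group(1))
        for lineno, line in enumerate(source.splitlines(), start=1)
        for match in PATTERN.finditer(line)
    ]
```

On a snippet containing both PyList_GetItem(lst, 0) and PyList_GetItemRef(lst, 0), only the former is reported, because the open parenthesis never directly follows the safe variant's name.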
(force-pushed from e69e629 to 3cf055d)
This is looking much closer to being mergeable. Let me know if you need a hand with getting the CircleCI build fixed.

... or maybe it's unrelated and would go away with a rebase?
(force-pushed from bd4afe5 to 5e57f30)
(force-pushed from 7811388 to 0ea4d9c)
(force-pushed from af12173 to dccef23)
(force-pushed from 7bc287b to 3e90169)
(force-pushed from d155200 to 56b3d77)
Let me know if you'd like a hand with some git surgery :)
(force-pushed from d2828c9 to b5f5ffb)
@ngoldbaum looks like all the tests are passing now.
ALL_FILES=$(find numpy -type f \( -name "*.c" -o -name "*.h" -o -name "*.c.src" -o -name "*.cpp" \) ! -path "*/pythoncapi-compat/*")

Let me know if you need any additional changes. Thanks for the help!
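For what it's worth, the same file selection could be sketched in Python; the path layout is assumed from the find command above, and is_lintable is a hypothetical name:

```python
from pathlib import PurePosixPath

# C-family source extensions the lint should cover.
C_SUFFIXES = (".c", ".h", ".c.src", ".cpp")

def is_lintable(path):
    """Mirror the find command: C-family sources under numpy/,
    excluding the vendored pythoncapi-compat directory."""
    parts = PurePosixPath(path).parts
    if not parts or parts[0] != "numpy":
        return False
    if "pythoncapi-compat" in parts:
        return False
    return path.endswith(C_SUFFIXES)
```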
Awesome, I'm glad this is shaping up.
I guess my main issue with this is that we should be relatively sure that all of the ones we're marking as OK really are OK, otherwise we'll be misleading future readers.
I need to sit down with this PR and try to stare at all the uses and see if I can come up with possible problems.
We might need a new NOQA category for uses that are known to be problematic but need manual fixes. That would allow us to catch and triage new uses in CI while not needing to go through and fix absolutely everything that's still in the library.
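A sketch of how such NOQA categories could be recognized. The two marker spellings are taken from the diffs in this PR; the classifier itself (its name, return values, and the extra "unknown" bucket) is hypothetical:

```python
import re

# Marker spellings used in this PR's diffs; a third category for
# "known problem, fix later" could be added the same way.
NOQA_RE = re.compile(r"//\s*noqa:\s*borrowed-ref\s*(?:-\s*)?(?P<note>.*)")

def classify_noqa(line):
    """Return "ok", "manual-fix", or None for a C source line."""
    m = NOQA_RE.search(line)
    if m is None:
        return None
    note = m.group("note").strip().lower()
    if note.startswith("ok"):
        return "ok"
    if "manual fix" in note:
        return "manual-fix"
    return "unknown"
```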
I will try to do this soon but may not have a ton of time before PyCon.
@@ -108,7 +108,7 @@ PyUFuncOverride_GetOutObjects(PyObject *kwds, PyObject **out_kwd_obj, PyObject *
      * PySequence_Fast* functions. This is required for PyPy
      */
     PyObject *seq;
-    seq = PySequence_Fast(*out_kwd_obj,
+    seq = PySequence_Fast(*out_kwd_obj,  // noqa: borrowed-ref OK
I actually have some changes I need to upstream related to this from back in February, see #28046 (comment). Maybe I should finally get around to sending in what I have...
oh I think I missed this one.
It sounds like you have a manual fix for this so I'll label this as:
// noqa: borrowed-ref - manual fix needed
Hi all, I'm really sorry for the delay here. Totally my fault for letting it sit for so long. I'll try to set aside some time to get this into shape.
@ngoldbaum No worries, I should've followed up on your comments a lot sooner.
The two links I shared above are where you should look. You could also watch the PyCon talk @lysnikolaou and I gave about free-threaded support: https://www.youtube.com/watch?v=EuU3ksI1l04, which covers this. In short, borrowed references to items in mutable containers (e.g. dicts, lists) are unsafe if the container is visible to another thread, because the item might be de-allocated if its refcount happens to go to zero. Whether or not the container is visible to other threads is context-dependent. We don't want to wholesale move everything to strong reference APIs because of performance and the risk of bugs. Performance because borrowed references skip an incref and a decref, which might be very expensive in a tight loop. Bugs because reference counting logic in C is very tricky, particularly in code that has complicated control flow, and is easy to screw up.
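The lifetime hazard can be mimicked at the Python level with a weak reference. This is purely illustrative: in C, the strong reference would come from e.g. PyList_GetItemRef, and immediate refcount-based deallocation on CPython is assumed.

```python
import weakref

class Payload:
    """Plain object that supports weak references."""

lst = [Payload()]
probe = weakref.ref(lst[0])   # observe the object's lifetime

item = lst[0]   # analogous to taking a *strong* reference
del lst[0]      # the container drops its reference...
assert probe() is not None    # ...but `item` keeps the object alive

del item                      # last strong reference released
assert probe() is None        # object deallocated; a borrowed pointer
                              # taken earlier would now be dangling
```

With only a borrowed pointer (no `item`), the object would already be gone the moment another thread removed it from the list.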
Makes sense to me.
spotted some issues with how you've set the markers so far, let me know if you have questions about my logic
Cool, thank you for reviewing. I had a loose rubric for each use of thread-unsafe functions, but your input was helpful. I updated the comments according to your input. For reference, my rubric is outlined below.
(force-pushed from ec4741f to ad699bf)
I left a few comments below. Sorry I didn't make it clearer on the last review that I wasn't exhaustive in my comments, and I'm also not exhaustive here in pointing out every instance of a pattern I commented on. Please make sure that all of the ones dealing with fields below are marked as OK, since it's effectively append-only and I don't think it's worth the reference counting churn to fix every spot you currently have marked as needing a fix.
numpy/_core/src/multiarray/buffer.c
@@ -268,7 +268,7 @@ _buffer_format_string(PyArray_Descr *descr, _tmp_string_t *str,
     int ret;

     name = PyTuple_GET_ITEM(ldescr->names, k);
-    item = PyDict_GetItem(ldescr->fields, name);
+    item = PyDict_GetItem(ldescr->fields, name);  // noqa: borrowed-ref - manual fix needed
It looks like you missed an access to fields here. Can you go through and make sure all of them are marked OK per my explanation from the last round of review?
@@ -130,7 +130,7 @@ PyArray_IntpConverter(PyObject *obj, PyArray_Dims *seq)
      * dimension_from_scalar as soon as possible.
      */
     if (!PyLong_CheckExact(obj) && PySequence_Check(obj)) {
-        seq_obj = PySequence_Fast(obj,
+        seq_obj = PySequence_Fast(obj,  // noqa: borrowed-ref OK
I think this isn't OK, because another thread can access obj.
@@ -1135,7 +1135,7 @@ PyArray_IntpFromSequence(PyObject *seq, npy_intp *vals, int maxvals)
 {
     PyObject *seq_obj = NULL;
     if (!PyLong_CheckExact(seq) && PySequence_Check(seq)) {
-        seq_obj = PySequence_Fast(seq,
+        seq_obj = PySequence_Fast(seq,  // noqa: borrowed-ref OK
ditto
@@ -2749,7 +2749,7 @@ nonstructured_to_structured_resolve_descriptors(

     Py_ssize_t pos = 0;
     PyObject *key, *tuple;
-    while (PyDict_Next(to_descr->fields, &pos, &key, &tuple)) {
+    while (PyDict_Next(to_descr->fields, &pos, &key, &tuple)) {  // noqa: borrowed-ref - manual fix needed
fields again, this is fine, ditto for all the other ones that go through PyDataType_FIELDS below.
@@ -1522,7 +1522,7 @@ arr_add_docstring(PyObject *NPY_UNUSED(dummy), PyObject *const *args, Py_ssize_t
     PyTypeObject *new = (PyTypeObject *)obj;
     _ADDDOC(new->tp_doc, new->tp_name);
     if (new->tp_dict != NULL && PyDict_CheckExact(new->tp_dict) &&
-            PyDict_GetItemString(new->tp_dict, "__doc__") == Py_None) {
+            PyDict_GetItemString(new->tp_dict, "__doc__") == Py_None) {  // noqa: borrowed-ref - manual fix needed
I left an incorrect comment earlier; on second thought this is theoretically an issue, if there's a race to set __doc__.
ok sounds good. Not sure what happened to my branch. Triaging atm, will try to re-open. Not sure why the PR closed when I committed.
Hi @ngoldbaum, really sorry for the inconvenience but are you able to re-open this PR on your end? I am unable to re-open the PR.
I'm not sure why my git push --force automatically closed the PR, but I must've done something incorrectly. I'm reviewing my branch again to implement your suggested changes.

If re-opening on your end is not an option, I can submit a new PR. Again, sorry for the inconvenience.
Just open a new PR. I think if you hit 0 commits, this can happen.
I just tried and I can't re-open it. Breaking GitHub is a little bit of a "learning advanced git" rite of passage, no worries on our end about this.
Adding a linter to audit PRs.

This would flag PRs that use problematic functions such as:

- PyList_GetItem
- PyList_GET_ITEM
- PyDict_GetItem
- PyDict_GetItemWithError
- PyDict_Next
- PyDict_GetItemString
- _PyDict_GetItemStringWithError

Attempts to resolve #26159
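A hedged sketch of what auditing a PR could look like, scanning only the lines a unified diff adds. The function list matches the one above; lint_diff and the opt-out handling are illustrative:

```python
import re

FLAGGED = [
    "PyList_GetItem", "PyList_GET_ITEM", "PyDict_GetItem",
    "PyDict_GetItemWithError", "PyDict_Next",
    "PyDict_GetItemString", "_PyDict_GetItemStringWithError",
]
# Lookbehind instead of \b so names starting with "_" also anchor
# correctly; the "(" excludes safe *Ref variants.
PATTERN = re.compile(
    r"(?<![A-Za-z0-9_])(" + "|".join(map(re.escape, FLAGGED)) + r")\s*\("
)

def lint_diff(diff_text):
    """Flag thread-unsafe calls on lines a PR adds ('+' lines of a
    unified diff), unless the line carries a noqa opt-out."""
    problems = []
    for line in diff_text.splitlines():
        if not line.startswith("+") or line.startswith("+++"):
            continue  # only newly added code, skip file headers
        if "noqa:" in line:
            continue  # explicit opt-out
        for match in PATTERN.finditer(line):
            problems.append(match.group(1))
    return problems
```

Removed lines ("-") and opted-out lines are ignored, so the audit only ever complains about code a PR newly introduces.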