BLD: add freethreading_compatible Cython markers #2478
Codecov Report
All modified and coverable lines are covered by tests ✅
Additional details and impacted files
@@           Coverage Diff           @@
##           master    #2478   +/-   ##
=======================================
  Coverage   90.32%   90.32%
=======================================
  Files          16       16
  Lines        2439     2439
=======================================
  Hits         2203     2203
  Misses        236      236
|
Thanks! So am I understanding correctly that binaries built with this change - including the wheels we publish - would leave the GIL off in free-threaded Python, potentially exposing the user to bugs if our code is not completely thread safe? Is there a way we could make this a convenient option to set at build time, but still have the wheels we publish on PyPI require the GIL for now? I.e. I want the first round of people testing h5py without the GIL to build h5py from source, as a strong signal that you're testing something that may not work. I can see the messaging in the Python docs is that free-threading is experimental, but I expect people are going to be really excited about this. I'd much rather enthusiastic non-experts see h5py as incompatible for now, than have it claim compatibility when it might not work. |
Free-threaded Python actually requires special wheels altogether, and these are still not built by default with cibuildwheel. This changes nothing for regular wheels, and only makes a difference for anyone who wants to build from source on free-threaded Python. I very much agree that downstream testers should build from source (or maybe get a nightly if we can set this up for Ubuntu), and I am not proposing we publish stable cp313t wheels for now! |
Great, thanks! So these markers allow it to build freethreading compatible binaries if you ask. How do you ask for that? |
Not quite: you can already build free-threading binaries without them, it's just that, by default the GIL is re-enabled when one imports any extension that doesn't explicitly declare that it's fine with having the GIL released, and you get the following warning: RuntimeWarning: The global interpreter lock (GIL) has been enabled to load module 'h5py._errors', which has not declared that it can run safely without the GIL. To override this behavior and keep the GIL disabled (at your own risk), run with PYTHON_GIL=0 or -Xgil=0. So it is already possible to get the same effect (disable the warning) on the user side by invoking python with the appropriate flags (and maybe we should actually just let users do that for now, up for debate).
Locally, one needs to install Python 3.13t. Cython 3.1.0 is also needed (but not released yet), so it should be built from source or installed from https://pypi.anaconda.org/scientific-python-nightly-wheels/simple/ . Build isolation should also be turned off for this reason. In cibuildwheel, free-threaded builds are controlled by a flag which is off by default. Also note that NumPy 2.1.0 has cp313 wheels for mac and linux, but not for windows, so if anyone wants to try it there, they'll also need to build numpy from source (which I guess will likely not work out of the box). |
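To see from inside the interpreter which of these situations applies, a small check along the following lines may help. This is only a sketch: it assumes `sysconfig.get_config_var("Py_GIL_DISABLED")` and `sys._is_gil_enabled()` (the latter exists from Python 3.13 on), and the helper name `gil_status` is made up for illustration.

```python
import sys
import sysconfig


def gil_status():
    """Return (free_threaded_build, gil_enabled_now) for the running interpreter."""
    # Py_GIL_DISABLED is 1 on free-threaded builds (e.g. 3.13t), 0 or None otherwise.
    free_threaded_build = bool(sysconfig.get_config_var("Py_GIL_DISABLED"))
    # sys._is_gil_enabled() only exists on Python 3.13+; before that the GIL is always on.
    check = getattr(sys, "_is_gil_enabled", None)
    gil_enabled_now = True if check is None else bool(check())
    return free_threaded_build, gil_enabled_now


if __name__ == "__main__":
    # Import the extension under test first, so that any automatic
    # re-enabling of the GIL has already happened when we look.
    try:
        import h5py  # noqa: F401
    except ImportError:
        print("h5py is not installed; reporting interpreter state only")
    print("free-threaded build: %s, GIL enabled: %s" % gil_status())
```

Running the script once normally and once with `PYTHON_GIL=0` on a 3.13t interpreter should show whether importing h5py switched the GIL back on.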
Right, sorry, replace 'free-threading binaries' with 'extension modules that don't trigger re-enabling the GIL'. Thanks, I hadn't realised that it was possible to override the automatic re-enabling of the GIL. That will presumably be the easiest way to test h5py until we're ready to publish wheels that claim to be compatible with free-threading. |
Good point, all I really want to do is to make sure that testing is possible downstream, but this way may give off the wrong message that we're completely ready. I'll try setting up a CI job instead, and draft this for now. |
I think the first piece of the puzzle would be something like
diff --git a/.github/workflows/build_wheels.yml b/.github/workflows/build_wheels.yml
index 2e9a1b6a..6fff93df 100644
--- a/.github/workflows/build_wheels.yml
+++ b/.github/workflows/build_wheels.yml
@@ -118,6 +118,7 @@ jobs:
TOX_TEST_LIMITED: ${{ env.TOX_TEST_LIMITED }}
CIBW_PRERELEASE_PYTHONS: ${{ env.CIBW_PRERELEASE_PYTHONS }}
CIBW_BEFORE_TEST: ${{ env.CIBW_BEFORE_TEST }}
+ CIBW_FREE_THREADED_SUPPORT: ${{ (github.event_name == 'schedule' || github.event_name == 'workflow_dispatch') }}
if: steps.triage.outputs.skip != '1'
# And upload the results
However, as I mentioned earlier, building requires Cython 3.1 (unreleased), and it looks like it could land soon enough that it's not worth special-casing it now.
|
I do not think we should mark as "free-threading-safe" until we have done some testing forcing YOLO mode (e.g. To start with we should add a cp313t job to the test matrix that runs the current test suite with |
I expect this will become significantly easier to set up soon: |
FYI, v5.5.0 is available now and has full free-threading support.
Note that this just changed to |
7eb52f7 to cc8f973
thanks for reminding me Ralf ! |
Are you planning to wait until the Cython 3.1.0 release before merging this? Even then, it's probably pretty aggressive to require that immediately as the minimum supported version. If not, you're probably a lot better off passing the
from Cython.Build import cythonize
from Cython.Compiler.Version import version as cython_version
from packaging.version import Version
from setuptools import setup

# `extensions` is the project's existing list of Extension objects.
compiler_directives = {}
if Version(cython_version) >= Version("3.1.0b1"):
    compiler_directives["freethreading_compatible"] = True
setup(
    ext_modules=cythonize(
        extensions,
        compiler_directives=compiler_directives,
    )
)
(from https://py-free-threading.github.io/porting-extensions/#__tabbed_1_3)
That way this PR will be mergeable now, and you don't unnecessarily force distros onto a brand new Cython release. |
Thanks for the suggestion. Indeed, I intend to polish this before undrafting; bumping our Cython requirement is a shortcut I'm taking to experiment quickly, not something I want to ship. |
Failing on all platforms with error messages similar to
I would normally suspect an outdated |
If you read the log carefully you'll see that the initial install of the built wheel is actually fine. It fails in the
|
Ah ! I agree that running tox in a containerized context is weird, I just failed to realize that it might be where the problem was, thanks for pointing it out ! |
I have mixed feelings about tox, but I do see a definite advantage in trying to keep as much as possible of the 'how to build and test this' info in a format that's not tied to one specific CI provider, or even to running in CI at all. h5py has used quite a few different CI platforms over the years, and while for now it looks like GitHub Actions might be a good answer, it still seems sensible not to tie ourselves to it more than we have to. The flip side of course is that we haven't actually done a great job of this; more of that detail lives in CI config than in tox.ini anyway. 🤷 |
No idea about preferences/consensus in this project, but I'd say it's far from radical at least. I quite dislike In terms of environment/dependency management, there's effectively very little of interest in |
FYI from colleagues I heard that in a couple of projects switching from |
5e907f3 to 18db6d3
tox-dev/tox#3391 was just resolved in tox 4.26.0, and Cython 3.1 was shipped last week. Let's see where we're at. |
9ed5197 to 3eb5d62
This is starting to look good. It'd be great to also set |
This looks pretty safe to merge (at least to me). I guess if no-one else merges it in a week, I'll do it? |
If I've understood this correctly, the markers and the I think the first step is to get some realistic testing of h5py with the GIL disabled, before we claim it's compatible. Running our test suite is a good first step - I think this PR does this, though it would be good to ensure the GIL is not getting accidentally reenabled during the test run. But I don't think most of our existing tests really exercise multiple threads. We should either look for a realistic example using threads & h5py, or create our own stress-tests doing HDF5 operations from multiple threads. The callback-based iterators like This doesn't necessarily need to be part of our regular test suite for now; it could be a separate script that we run manually for starters. But I think testing with |
I meant this as a first step, mainly to unblock interested parties downstream so we can get feedback from actual applications. In other words, building cp3**t wheels doesn't equate to "claiming support". Well, maybe to some level, but in any case we'd need to clarify in docs/release notes what's actually supported and what's not. Following the introduction of free-threading PyPI troves, my intention would be to get a release (or maybe just nightlies?) with the level 1 support ("Experimental"). I don't have deep enough knowledge of h5py to know where thread-safety issues may arise, so I'm not in a good position to add concurrency tests without help, but I'd gladly work on making some if you can give me a general idea of what should be stress-tested. |
Correct me if I've misunderstood, but I believe there are two levels of
free-threading compatibility: building for the cp313t ABI, and the flag
that tells Python not to re-enable the GIL when the extension module is
loaded. I think this PR currently does both, and I don't want to do the
second step just yet. People who want to take their chances and test
without the GIL can do so anyway by setting PYTHON_GIL=0. I'd like to hear
from some people trying that before going to step 2. I think the flag
within the extension modules is more prominent than docs or trove
classifiers.
In terms of testing, I think the starting point is just to do a bunch of
HDF5 operations from concurrent threads: create and delete groups,
datasets, attributes, write into chunked datasets so chunks have to be
allocated, change any flexible metadata, whatever you can think of. The
places where HDF5 can call back into Python could be a particular
challenge. I hope that having our own big lock in h5py will mean it all
works, but it's a pretty big complex thing, and I don't know where we might
be unwittingly relying on the GIL.
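As a rough starting point for that kind of stress test, something like the sketch below could be run by hand. The names `run_concurrently` and `stress_shared_file` are hypothetical, not existing h5py helpers, and the particular mix of operations is just one guess at what is worth exercising. Each thread uses its own object names, so legitimate HDF5 "name already exists" errors should not show up.

```python
import threading


def run_concurrently(worker, n_threads=4):
    """Run worker(index) in n_threads threads at once; return the exceptions raised."""
    barrier = threading.Barrier(n_threads)
    errors = []
    errors_lock = threading.Lock()

    def target(index):
        try:
            barrier.wait()  # release all threads together to maximise overlap
            worker(index)
        except Exception as exc:
            with errors_lock:
                errors.append(exc)

    threads = [threading.Thread(target=target, args=(i,)) for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


def stress_shared_file(path, n_threads=4, n_iter=50):
    """Create, write to and delete objects in one shared file from several threads."""
    import h5py  # imported lazily so the harness above has no h5py dependency

    with h5py.File(path, "w") as f:

        def worker(index):
            for i in range(n_iter):
                # Names are prefixed with the thread index to avoid clashes.
                grp = f.create_group(f"t{index}_g{i}")
                dset = grp.create_dataset("d", shape=(100,), chunks=(10,), dtype="i4")
                dset[i % 100] = i  # writing forces a chunk to be allocated
                dset.attrs["step"] = i
                del f[grp.name]

        return run_concurrently(worker, n_threads)
```

Calling `stress_shared_file("stress.h5")` with `PYTHON_GIL=0` and printing the returned list gives a first impression; an empty list only means nothing raised, not that the code is thread-safe.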
|
Sounds reasonable. In fact, running our test suite with |
Actually, if I only use 2 threads, I get clearer error messages.
I think they are all legit exceptions from HDF5 itself ? I guess these tests should be marked as |
It's clear now that most of this PR is premature, so, switching back to draft mode. |
Yeah, if two threads try to create an object in the same file with the same name, it's expected that one of them will fail (with or without the GIL in play). That's not unsafe, just the tests are written with an assumption that doesn't hold when running multiple of the same in parallel. |
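One possible way to avoid that kind of collision in a test, shown here only as a sketch with a hypothetical helper (`unique_name` is not part of h5py or its test suite), is to build object names from something unique per call rather than hard-coding them:

```python
import threading
import uuid


def unique_name(prefix="obj"):
    """Return a name that will not collide across threads or repeated calls."""
    # The thread id helps when reading logs; the UUID guarantees uniqueness
    # even if the same thread runs the same test body twice.
    return f"{prefix}-{threading.get_ident()}-{uuid.uuid4().hex}"


def collect_names(n_threads=4, per_thread=10):
    """Generate names from several threads and return them all in one list."""
    names = []
    lock = threading.Lock()

    def worker():
        local = [unique_name("dset") for _ in range(per_thread)]
        with lock:
            names.extend(local)

    threads = [threading.Thread(target=worker) for _ in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return names
```

A test would then call something like `f.create_dataset(unique_name("dset"), ...)` instead of a fixed name, so parallel copies of the same test no longer compete for one path in the file.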
Of course, my mistake. I'm trying to get rid of this assumption in #2588 |
I'm bouncing around my dependencies checking on 3.14 and free-threading support and found this PR. One thing you could add to this PR is the free-threading classifier in pyproject.toml:
See https://py-free-threading.github.io/porting/#define-and-document-thread-safety-guarantees Thanks for working on this! |
at this stage I would rather go with |
Partially addresses #2475
All this does is avoid re-enabling the GIL when the extension modules are imported on Python 3.13t. It is not a claim of complete thread safety. Rather, it is a declaration of intent: we want to get there, and this makes it easier for downstream testers to at least try it out, so we can prioritize real-world applications when we start actually implementing thread safety (if needed).