BUG: memmap segfaults on on-disk collision on the free-threaded build #29126

crusaderky opened this issue Jun 4, 2025 · 4 comments
Labels
00 - Bug 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703)

Comments

@crusaderky
Contributor

crusaderky commented Jun 4, 2025

Describe the issue:

memmap segfaults on Python 3.13t when two threads try to open the same file (one that didn't previously exist) with mode='w+'.

Expected behaviour

Ideally, all threads read from and write to the same file.
In theory, this should be possible with three syscalls:

  1. open(fname, O_CREAT)
  2. resize, unless another thread has already resized it
  3. mmap
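The three steps above can be sketched with the standard library alone. This is a minimal illustration, not how numpy.memmap is implemented: the helper name open_shared_mmap is hypothetical, and it assumes every thread asks for the same size, so a redundant resize is harmless.

```python
import mmap
import os


def open_shared_mmap(path, nbytes):
    """Hypothetical helper: open or create `path` and map `nbytes` read/write."""
    # 1. open, creating the file if needed. O_CREAT without O_TRUNC never
    #    discards data that another thread has already written.
    fd = os.open(path, os.O_CREAT | os.O_RDWR, 0o600)
    try:
        # 2. resize only if the file is still too small. Two threads may both
        #    get here, but extending to the same length twice loses nothing.
        if os.fstat(fd).st_size < nbytes:
            os.ftruncate(fd, nbytes)
        # 3. mmap. The mapping keeps its own duplicate of the descriptor.
        return mmap.mmap(fd, nbytes, access=mmap.ACCESS_WRITE)
    finally:
        os.close(fd)
```

The key difference from opening with a truncating mode is that no thread ever shrinks the file, so a mapping created by one thread is not left pointing past the end of the file by another.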

If the above is impossible for some reason (I particularly expect trouble on NFS and similar filesystems), the segfault should be replaced with a meaningful exception that doesn't leak resources.

Reproduce the code example:

  1. deploy numpy through https://github.com/rgommers/pixi-dev-scipystack/
  2. apply patch (see DEV: stop monkeypatching and use of autouse=True fixtures by default in test suite #29090):
--- a/numpy/conftest.py
+++ b/numpy/conftest.py
@@ -147,9 +147,10 @@ def check_fpu_mode(request):
 def add_np(doctest_namespace):
     doctest_namespace['np'] = numpy
 
+
 @pytest.fixture(autouse=True)
-def env_setup(monkeypatch):
-    monkeypatch.setenv('PYTHONHASHSEED', '0')
+def env_setup():
+    os.environ['PYTHONHASHSEED'] = '0'
  3. pixi run test-nogil -- -k TestMemmap --parallel-threads=32

Error message:

All tests that use tmp_path segfault; e.g.

def test_open_with_filename(self, tmp_path):
    tmpname = tmp_path / 'mmap'
    fp = memmap(tmpname, dtype=self.dtype, mode='w+',
                shape=self.shape)
    fp[:] = self.data[:]
    del fp

These tests are thread-unsafe due to Quansight-Labs/pytest-run-parallel#14, which causes all threads to call memmap on the same file name at once.

Python and NumPy Versions:

python 3.13t
numpy git tip 2025-06-04

Runtime Environment:

32-core x86_64 Linux host
/tmp is an ext4 mount point.

@crusaderky crusaderky changed the title BUG: <Please write a comprehensive title after the 'BUG: ' prefix> BUG: nogil: memmap segfaults on on-disk collision Jun 4, 2025
@ngoldbaum
Member

On my Mac dev machine I get numpy/_core/tests/test_memmap.py::TestMemmap::test_del - AttributeError: 'TestMemmap' object has no attribute 'tmpfp' when I try to reproduce this, although I set up my environment manually without using pixi.

@ngoldbaum ngoldbaum added the 39 - free-threading PRs and issues related to support for free-threading CPython (a.k.a. no-GIL, PEP 703) label Jun 4, 2025
@ngoldbaum ngoldbaum changed the title BUG: nogil: memmap segfaults on on-disk collision BUG: memmap segfaults on on-disk collision on the free-threaded build Jun 4, 2025
@ngoldbaum
Member

ngoldbaum commented Jun 4, 2025

It'd be nice to write a multithreaded test script that uses e.g. run_threaded from numpy.testing._private.utils to trigger the segfault, so that reproducing this doesn't rely on running the numpy tests in a special way. That'd be an easy addition to the existing multithreaded tests too.
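A standalone script along those lines might look like the sketch below. It uses only threading from the standard library rather than run_threaded, whose signature is not shown in this thread. The names run_concurrently and main are hypothetical, the dtype and shape are arbitrary placeholders, and whether this actually triggers the crash on a given machine is untested.

```python
import threading


def run_concurrently(func, n_threads=32):
    """Call func(i) from n_threads threads released together; return exceptions."""
    barrier = threading.Barrier(n_threads)
    errors = []

    def worker(i):
        barrier.wait()  # line all threads up so the calls collide
        try:
            func(i)
        except Exception as exc:
            errors.append(exc)

    threads = [threading.Thread(target=worker, args=(i,))
               for i in range(n_threads)]
    for t in threads:
        t.start()
    for t in threads:
        t.join()
    return errors


def main():
    # Imported here so the helper above stays usable without NumPy.
    import os
    import tempfile
    import numpy as np

    with tempfile.TemporaryDirectory() as tmp:
        fname = os.path.join(tmp, "mmap")

        def open_and_write(i):
            # Every thread creates the same file with mode='w+'.
            fp = np.memmap(fname, dtype="f4", mode="w+", shape=(3, 4))
            fp[:] = i
            del fp

        errors = run_concurrently(open_and_write)
        print(f"{len(errors)} thread(s) raised an exception")
```

Calling main() on a free-threaded interpreter is the intended use. A hard crash would kill the process before the final print, so the absence of output is itself the signal.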

@crusaderky
Contributor Author

> On my Mac dev machine I get numpy/_core/tests/test_memmap.py::TestMemmap::test_del - AttributeError: 'TestMemmap' object has no attribute 'tmpfp' when I try to reproduce this, although I set up my environment manually without using pixi.

That sounds like something is wrong with how pytest-run-parallel calls teardown_method. At any rate, it's unrelated to the segfaults.

@seberg
Member

seberg commented Jun 4, 2025

Never realized that memmap was still unsafe here. There may be some subtlety here that makes this tricky, but the core of the fix would be to replace:

            self = ndarray.__new__(subtype, shape, dtype=descr, buffer=mm,
                                   offset=array_offset, order=order)

with np.frombuffer(). But while that should be safe these days, I believe, I suppose it might also run into problems with memview.close() refusing to work.
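The close() concern can be illustrated with the standard library. The sketch below assumes the worry is that an mmap object refuses to close while something still holds a buffer export on it, which is how a buffer-protocol consumer keeps the mapping alive. It uses an anonymous mapping and a plain memoryview as a stand-in for that consumer, and the function name is hypothetical.

```python
import mmap


def close_with_live_export():
    """Return True if closing an mmap is refused while a buffer export is alive."""
    mm = mmap.mmap(-1, 16)   # anonymous 16-byte mapping, no file needed
    view = memoryview(mm)    # stand-in for any consumer holding an export
    try:
        mm.close()
        refused = False
    except BufferError:
        # CPython refuses to unmap memory that a live export still points at.
        refused = True
    finally:
        view.release()       # drop the export ...
        mm.close()           # ... after which close() succeeds
    return refused
```

This refusal is what makes holding an export safe, since the memory cannot be unmapped underneath a reader. It also means any code path that closes the mapping explicitly would need the exports released first.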
