ENH: Add madvise for memmap objects #24489

Draft
wants to merge 1 commit into
base: main

Conversation

agoscinski

Adds madvise to memmap. See issue #13172 for more information

@seberg we briefly discussed at the sprint day last week a TypeError being raised when the memmap is not backed by any memory, but I could not find a reasonable case where this can happen: it is always backed when constructed. So I changed the access from base to _mmap so that I can write a test for the error, since base is not writable. The function is then also more consistent with __getitem__, which also uses _mmap. But now the error message is misleading, because _mmap can be None while base is not None, in which case the memmap is still backed. I don't think this None check actually makes sense in the current version; I would remove it and switch to using base everywhere in memmap.

What can actually happen is that the mmap is closed (my_memmap.base.close(), which only uses functions marked as public), after which invalid accesses can cause a segfault. Maybe we test for this instead?

if self.base.closed:
    raise TypeError("Memory map is closed and can therefore not be accessed.")

What do you think?
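
For reference, a minimal sketch of such a check using only the standard-library mmap module (this is not the PR's code). It relies on the `closed` attribute of `mmap.mmap` objects, which reports whether `close()` has been called; the helper name `ensure_open` is hypothetical.

```python
import mmap
import tempfile


def ensure_open(mm):
    """Raise if the mmap object has already been closed (hypothetical helper)."""
    if mm.closed:
        raise TypeError(
            "Memory map is closed and can therefore not be accessed.")


with tempfile.TemporaryFile() as f:
    f.write(b"\0" * mmap.PAGESIZE)
    f.flush()
    mm = mmap.mmap(f.fileno(), 0)  # map the whole file
    ensure_open(mm)                # passes: the map is still open
    mm.close()
    try:
        ensure_open(mm)
        closed_detected = False
    except TypeError:
        closed_detected = True
```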

@seberg
Member

seberg commented Sep 6, 2023

Using _mmap seems fine, just tweak the error message a bit. I suspect that you are right and we should not lose _mmap unless we also use the memmap class itself.

I am not quite sure what happens for view = arr[1:] if you then try to use madvise, etc.? Would this always fail due to the pagesize problem? In general users should maybe mostly use madvise on the original array and not the view.
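
To make the page-size concern concrete: the underlying madvise call generally expects the start of the region to be page-aligned, while a view such as arr[1:] usually begins at a byte offset that is not a multiple of mmap.PAGESIZE. One possible way to handle views (an assumption, not something this PR implements) is to round the view's byte range outward to page boundaries. The function name and parameters below are hypothetical.

```python
import mmap


def page_aligned_range(offset, nbytes, total, pagesize=mmap.PAGESIZE):
    """Return the smallest page-aligned (start, length) that covers the bytes
    [offset, offset + nbytes) inside a mapping of `total` bytes."""
    start = (offset // pagesize) * pagesize          # round start down
    stop = -(-(offset + nbytes) // pagesize) * pagesize  # round end up
    stop = min(stop, total)                          # stay inside the mapping
    return start, stop - start


# A view starting 8 bytes into a 10000-byte mapping, with 4096-byte pages:
start, length = page_aligned_range(8, 100, 10000, pagesize=4096)
```

Rounding outward means the advice also covers bytes that lie outside the view, which may or may not be acceptable; rejecting views avoids that question entirely.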

Not sure I feel we need to clean up the error for Python (Python's mmap should maybe do this), but OTOH, it doesn't seem terrible?

> What can actually happen is that the mmap is closed, and one can get a segfault because of invalid accesses

Right, but this is an existing problem that I don't think we can do anything about.

Member

@seberg seberg left a comment

Sorry we never got back to this, there was a new issue that may have been related. By now, madvise always exists (but I am not sure if the method exists on all systems, so the error is probably still right).

I don't really remember where we seemed a bit stuck now?! In the end the only slight complexity seems behavior for views (and if the error is a bit annoying, that isn't the end of the world).

I would be fine with views working on their memory range, but even just rejecting them for now seems also OK.

@@ -322,10 +326,72 @@ def flush(self):
        See Also
        --------
        memmap
        """
        if self._mmap is None:
Member

I think this makes sense, but we should double check whether _mmap is propagated on views. isinstance(self.base, mmap) may also make sense actually.

        """
        if self.base is not None and hasattr(self.base, 'flush'):
            self.base.flush()
        if self._mmap is None:
Member

I was wondering about views in that comment (if said view is also a memmap, e.g. slice).

I think that is a worry, although we could side-step it by rejecting madvise on views (again via checking if self.base is self._mmap, not 100% sure that always exists).

                    "multiple of mmap.PAGESIZE.")
            else:
                extra_info = ""
            raise OSError(e.errno, e.strerror + extra_info)
Member

Maybe add from e, although I would also be OK to just put this into the documentation.
Unless this can be triggered by an offset= during mmap creation?
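
A small sketch of what "add from e" would look like, using only the standard library. The wrapper name `call_with_hint`, the hint text, and the simulated failure are hypothetical; the point is only that `raise ... from e` keeps the original exception reachable as `__cause__`.

```python
import errno


def call_with_hint(func, hint):
    """Call func(); on EINVAL, re-raise an OSError with `hint` appended to the
    message, chained to the original exception."""
    try:
        return func()
    except OSError as e:
        extra_info = hint if e.errno == errno.EINVAL else ""
        raise OSError(e.errno, (e.strerror or "") + extra_info) from e


def failing():
    # Stand-in for a call that fails the way a misaligned madvise might.
    raise OSError(errno.EINVAL, "Invalid argument")


try:
    call_with_hint(failing, " (start must be a multiple of mmap.PAGESIZE)")
except OSError as err:
    caught = err
```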

madvise
Flush any changes in memory to file on disk.
When you delete a memmap object, flush is called first to write
changes to disk.
Member

That's the wrong docs :).

        fp._mmap = None
        expected_err_msg = ("madvise cannot be used, because the memory map it "
                            "is not backed by any memory.")
        with pytest.raises(TypeError, match=re.escape(expected_err_msg)):
Member

Wouldn't worry with escaping for that one . at the end, but fair :).
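
For context, pytest's `match=` argument is applied with `re.search`, so an unescaped trailing `.` still matches the real message; escaping only makes the pattern stricter. A quick standard-library illustration (the strings are shortened stand-ins for the actual error message):

```python
import re

msg = "is not backed by any memory."
other = "is not backed by any memory!"

# Unescaped, the final "." is a wildcard, so it also matches "!".
loose = re.search(msg, other) is not None
# Escaped, the final "." must be a literal period.
strict = re.search(re.escape(msg), other) is not None
# Both forms match the exact message.
both_match_exact = (re.search(msg, msg) is not None
                    and re.search(re.escape(msg), msg) is not None)
```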

        if self._mmap is None:
            raise TypeError(
                "flush cannot be used, because the memory map it "
                "is not backed by any memory.")
Member

The error seems a bit confusing. Maybe just "not actually backed by a memory map"?

@seberg
Member

seberg commented Apr 25, 2025

(If you are not interested anymore, don't hesitate to just close, maybe someone will pick it up eventually.)

Projects
Status: Pending authors' response
2 participants