ENH: Enable custom compression levels in np.savez_compressed
#29294
I shared this feedback over Slack but realized it makes more sense here. I’m not sure why there’s code in this PR to support old Python versions like Python 3.3; we support Python 3.11 or newer. It’s not clear to me why this feature is needed. NumPy is relatively conservative about adding new API surface, so why add this? You might want to ping the mailing list to get people to take a look. One of the contribution guidelines is to ping the mailing list before opening a PR that adds a new feature.
So one reason we were a bit hesitant in the past is that it'll break an array named … My gut feeling might be that a …
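To illustrate the naming concern: `np.savez_compressed` stores every extra keyword argument as an array under that keyword's name, so a new named parameter would take that name away from callers. A minimal sketch, assuming the array in question is called `compression` (the name used in the PR description); the in-memory buffer is only for illustration:

```python
import io

import numpy as np

buf = io.BytesIO()
# Every extra keyword currently becomes an array name inside the archive,
# so "compression" here is data, not an option.
np.savez_compressed(buf, compression=np.arange(3), data=np.ones(2))

buf.seek(0)
with np.load(buf) as npz:
    names = sorted(npz.files)
    stored = npz["compression"]

print(names)
```

If `compression` became a real parameter of the function, the same call would be interpreted as a setting rather than as an array to save.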
Thank you for looking into this and sharing your feedback.
You’re right. That Python version check was part of my initial implementation. I’ve since realized NumPy no longer supports versions earlier than Python 3.11, so I’ve already removed those blocks from the current PR version.
The idea behind adding
I fully understand that. My intention wasn’t to expand the public API surface unnecessarily; rather, this is a small, backward-compatible enhancement of the existing …
…n options
- Refactored `savez_compressed` to accept a `zipfile_kwargs` dictionary for compression settings, replacing the previous `compression` and `compression_opts` parameters.
- Updated related tests to utilize the new `zipfile_kwargs` structure for specifying compression methods and levels.
- Improved validation for compression levels and methods within the new structure.
@seberg Thanks for pointing that out!
- Improved default behavior for compression settings, ensuring DEFLATED is used when compressing unless specified otherwise.
- Added support for translating string-based compression methods to their corresponding integer constants.
- Enhanced validation for `compresslevel`, ensuring it is an integer or None, and updated error messages for clarity.
- Refactored internal logic to streamline compression method handling and validation.
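The alias translation and level validation described in this commit could look roughly like the sketch below. The helper names `resolve_compression` and `check_compresslevel` are hypothetical and not taken from the PR; the level ranges are the ones documented for `zipfile.ZipFile` (0 to 9 for DEFLATED, 1 to 9 for BZIP2, ignored for STORED and LZMA).

```python
import zipfile

# Hypothetical alias table; the PR's actual internals may differ.
_ALIASES = {
    "stored": zipfile.ZIP_STORED,
    "deflated": zipfile.ZIP_DEFLATED,
    "bzip2": zipfile.ZIP_BZIP2,
    "lzma": zipfile.ZIP_LZMA,
}

# Level ranges documented for zipfile.ZipFile; other methods ignore the level.
_LEVELS = {
    zipfile.ZIP_DEFLATED: range(0, 10),
    zipfile.ZIP_BZIP2: range(1, 10),
}


def resolve_compression(method):
    """Map a string alias (case-insensitive) or zipfile constant to a constant."""
    if isinstance(method, str):
        try:
            return _ALIASES[method.lower()]
        except KeyError:
            raise ValueError(f"unknown compression method: {method!r}") from None
    if method in _ALIASES.values():
        return method
    raise ValueError(f"unknown compression method: {method!r}")


def check_compresslevel(method, level):
    """Raise if `level` is not None, not an int, or out of range for `method`."""
    if level is None:
        return
    if isinstance(level, bool) or not isinstance(level, int):
        raise TypeError("compresslevel must be an int or None")
    allowed = _LEVELS.get(method)
    if allowed is not None and level not in allowed:
        raise ValueError(
            f"compresslevel {level} is outside {allowed.start}..{allowed.stop - 1}"
        )


method = resolve_compression("BZip2")
check_compresslevel(method, 9)
```

Rejecting `bool` explicitly is a design choice of this sketch, since `True` would otherwise pass as the integer 1.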
- Simplified exception handling for unavailable compression methods in tests.
- Removed legacy tests for Python versions <3.3, as NumPy now targets >=3.11.
- Added new tests for invalid compression types and levels, ensuring robust validation.
- Introduced case-insensitive handling for compression aliases in tests.
7ed85d9 to 0e074b1
This pull request adds flexible compression support to `np.savez_compressed` while preserving backward compatibility for existing code.
Implementation
- `zipfile_kwargs` parameter that is forwarded verbatim to `zipfile.ZipFile`.
- … (`compression`, `compresslevel`, etc.) can now be passed through.
- … `compression=ZIP_DEFLATED` when `savez_compressed` is used and the caller does not specify one.
- … (`"stored"`, `"deflated"`, `"bzip2"`, `"lzma"`) are mapped case-insensitively to the appropriate `zipfile` constants.
- … `compresslevel` ranges per algorithm and for invalid types.
- … `compression=` / `compression_opts=` are removed, so existing code that stores an array called `"compression"` will continue to work.
Tests
- … `TestSavezCompressed` suite:
  - … `zipfile_kwargs` everywhere.
  - … `compression` / `compresslevel` types and out-of-range levels.
API changes
- … `zipfile_kwargs` is `None`, it falls back to `{"compression": ZIP_DEFLATED}`.
Backward compatibility
- … `zipfile_kwargs` continue to work unchanged.
- … `"compression"` or `"compression_opts"` are no longer shadowed by function parameters.
Checklist
- … `_npyio_impl.py` & `zipfile_factory`.
- … `TestSavezCompressed`.
- … (`pytest -q`) on Linux/macOS/Windows, CPython 3.11+.
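For readers who want to see what forwarding options to `zipfile.ZipFile` amounts to, here is a rough sketch built only on the standard library and `np.lib.format.write_array`. The function `savez_with_zip_options` is a hypothetical stand-in written for this illustration; it is not the PR's code and not NumPy API. It assumes the default described above (`{"compression": ZIP_DEFLATED}` when nothing else is given).

```python
import io
import zipfile

import numpy as np


def savez_with_zip_options(file, arrays, zipfile_kwargs=None):
    """Hypothetical helper: write arrays as .npy members of a zip archive.

    `zipfile_kwargs` is passed straight to zipfile.ZipFile.
    """
    options = {"compression": zipfile.ZIP_DEFLATED}  # assumed default
    options.update(zipfile_kwargs or {})
    with zipfile.ZipFile(file, mode="w", **options) as zf:
        for name, arr in arrays.items():
            # Each array becomes one "<name>.npy" member, as in a regular .npz.
            with zf.open(name + ".npy", "w", force_zip64=True) as member:
                np.lib.format.write_array(
                    member, np.asanyarray(arr), allow_pickle=False
                )


buf = io.BytesIO()
original = np.arange(1000)
# An array may be called "compression" because options travel in a dict.
savez_with_zip_options(buf, {"compression": original}, {"compresslevel": 9})

buf.seek(0)
with np.load(buf) as npz:
    restored = npz["compression"]
```

Keeping the options in a single dictionary rather than as separate keyword parameters is what leaves every keyword name free for array data.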