gh-59999: Add option to preserve permissions in ZipFile.extract #32289
Thanks for making this. I didn't really get back to my PR after it took a year to be reviewed. What I did differently is that I made preserving the permissions the default. The reason for this is that the unzip command that comes with any Linux distribution, macOS or BSD system preserves permissions by default as well, so users will probably expect that to be the default.
It's no problem.
On the BPO issue (#59999) I proposed not doing this, to maintain backwards compatibility.
Makes some sense, thanks for clarifying. Keep up the good work!
Running test_zipfile I get two errors:
Traceback (most recent call last):
FAIL: test_extractall_preserve_none (test.test_zipfile.TestsPermissionExtraction.test_extractall_preserve_none)
Ran 274 tests in 94.070s
OK, interesting. What OS and file system are you running this on? Do the other test_extract tests pass, or were some of them skipped? And do the tests still fail on 6470201?
I am running Fedora 35. I did a clean start (compiling Python from scratch, just in case), ran the test again, and got the same result. Other test_extract tests are OK, except I don't know what you mean by 6470201; I didn't see any tests there.
Oh, of course, they didn't exist in that revision. I've just checked, and the default mode for created files differs on Fedora: on CI and my system it's 644, while on Fedora it seems to be 664. I'll rewrite the test so it checks against the system default instead of the 0o644 constant.
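For illustration, here is one way a test could discover the system default rather than hard-coding 0o644. This is only a sketch, not the code that landed in the PR: the helper name `default_file_mode` and the probe filename are made up for this example, and it assumes a POSIX file system.

```python
import os
import stat


def default_file_mode(directory):
    """Return the permission bits a newly created file gets in *directory*.

    Hypothetical test helper: it creates a probe file the same way
    ZipFile.extract does (a plain open() for writing), reads back the
    mode, and removes the probe again.
    """
    probe = os.path.join(directory, "mode_probe")
    with open(probe, "wb"):
        pass
    try:
        # S_IMODE strips the file-type bits and keeps the permission bits.
        return stat.S_IMODE(os.stat(probe).st_mode)
    finally:
        os.unlink(probe)
```

A test for the "preserve nothing" case could then compare the extracted file's mode with `default_file_mode(tmpdir)`, which would give 0o644 under a 022 umask and 0o664 under a 002 umask.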
Co-authored-by: Éric <merwok@netwok.org>
Looks great. I just have a few suggestions, mostly nitpicks.
Lib/zipfile/__init__.py
Outdated
    # Ignore permissions if the archive was created on Windows
    if member.create_system == 0 or preserve_permissions == PreserveMode.NONE:
        return targetpath

    if preserve_permissions == PreserveMode.SAFE:
        mode = (member.external_attr >> 16) & 0o777
    elif preserve_permissions == PreserveMode.ALL:
        mode = (member.external_attr >> 16) & 0xFFFF

    os.chmod(targetpath, mode)
Looking at the diff today, it's easy to see this code is related to a single purpose. After merging, however, it becomes an inline part of a larger narrative.
For future readers, it might be clearer to have this behavior extracted as its own method.
e.g.
    def _apply_permissions(self, member, path, mode):
        if mode == PreserveMode.NONE:
            return path
        # Ignore permissions if the archive was created on Windows
        if member.create_system == 0:
            return path
        mask = {
            PreserveMode.SAFE: 0o777,
            PreserveMode.ALL: 0xFFFF,
        }
        new_mode = (member.external_attr >> 16) & mask[mode]
        os.chmod(path, new_mode)
        return path
Then in this method, return self._apply_permissions(member, targetpath, preserve_permissions).
I also took the liberty to address a couple of nitpicks in the approach, such as to align the comment with the check, and to only perform the new mode calculation in a single place. WDYT?
Definitely agree, although I changed _apply_permissions to not return anything. How do you feel about that?
cpython/Lib/zipfile/__init__.py
Lines 1898 to 1903 in a2d7e77
        with self.open(member, pwd=pwd) as source, \
             open(targetpath, "wb") as target:
            shutil.copyfileobj(source, target)
        self._apply_permissions(member, targetpath, preserve_permissions)
        return targetpath
cpython/Lib/zipfile/__init__.py
Lines 1835 to 1852 in a2d7e77
    def _apply_permissions(self, member, path, mode):
        """
        Apply ZipFile permissions to a file on the filesystem with
        specified PreserveMode
        """
        if mode == PreserveMode.NONE:
            return
        # Ignore permissions if the archive was created on Windows
        if member.create_system == 0:
            return
        mask = {
            PreserveMode.SAFE: 0o777,
            PreserveMode.ALL: 0xFFFF,
        }
        new_mode = (member.external_attr >> 16) & mask[mode]
        os.chmod(path, new_mode)
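To make the two masks concrete, here is a small sketch of what `external_attr >> 16` holds for a Unix-created member and what each mask keeps. The member name and mode bits are invented for the example; only `ZipInfo`, `create_system` and `external_attr` come from the zipfile module.

```python
import stat
import zipfile

# Hypothetical member: a regular file with the setuid bit and rwxr-xr-x.
info = zipfile.ZipInfo("bin/tool")
info.create_system = 3  # 3 means the entry was written on Unix
info.external_attr = (stat.S_IFREG | stat.S_ISUID | 0o755) << 16

# The high 16 bits of external_attr carry the Unix st_mode.
unix_mode = info.external_attr >> 16

safe_mode = unix_mode & 0o777   # rwx permission bits only
all_mode = unix_mode & 0xFFFF   # also keeps setuid/setgid/sticky and file-type bits
```

Note that the 0xFFFF mask lets the file-type bits through as well; `os.chmod` only needs the low 12 bits, which is what `stat.S_IMODE(all_mode)` would return.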
Co-authored-by: Jason R. Coombs <jaraco@jaraco.com>
(Please follow the devguide: avoid force pushes. They create notifications with broken links for reviewers and make it harder to see what changed since the previous review. Thanks!)
Seems another look is needed from a reviewer?
extracted as accurately as possible. *member* can be a filename or a
:class:`ZipInfo` object.

*path*, *pwd*, and *preserve_permissions* have the same meaning as for :meth:`extract`.
This looks like a mis-copy perhaps?
            return

        # Ignore permissions if the archive was created on Windows
        if member.create_system == 0:
I think the check should probably be more conservative than this: there is also 10, which is Windows NTFS, and 20-255 are unused and could be invalid for this use case. An allow-list would be safer.
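A minimal sketch of such an allow-list follows. The names `UNIX_LIKE_SYSTEMS` and `has_unix_permissions` are hypothetical, and the choice of which host systems to trust is an assumption: 3 (UNIX) and 19 (OS X Darwin) are the "version made by" values from the ZIP APPNOTE that I would expect to carry a Unix mode, but the PR may settle on a different set.

```python
import zipfile

# Assumed allow-list of create_system values whose external_attr is
# expected to hold a Unix st_mode in its high 16 bits:
# 3 = UNIX, 19 = OS X (Darwin). Everything else is ignored.
UNIX_LIKE_SYSTEMS = frozenset({3, 19})


def has_unix_permissions(member):
    """Return True if *member* (a ZipInfo) was created on an allow-listed system."""
    return member.create_system in UNIX_LIKE_SYSTEMS
```

With this shape, MS-DOS (0), Windows NTFS (10) and the unused range 20-255 are all rejected without having to be listed individually.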
I suggested some improvements, but if we follow @thatch's guidance and only support x bits then &-ing bitfields is not correct; what we would want to do instead is exactly what pip does here:
    is_executable = mode and stat.S_ISREG(mode) and mode & 0o111
    mode = 0o777 if is_executable else 0o666
so that any x bit in u/g/o is extended to 0o111 then ANDed with umask.
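Wrapped into a function, the two lines above might look like the sketch below. The function name `extraction_mode` is hypothetical; the logic is the snippet quoted in the comment, taking the raw `external_attr` of a member as input.

```python
import stat


def extraction_mode(external_attr):
    """Return the mode to request for an extracted file, x bits only.

    Any executable bit (user, group or other) on a regular file widens
    the request to 0o777; everything else gets the usual 0o666. The
    umask is expected to trim the result when the file is created.
    """
    mode = external_attr >> 16
    is_executable = bool(mode and stat.S_ISREG(mode) and mode & 0o111)
    return 0o777 if is_executable else 0o666
```

For example, a member stored as 0o100700 would be requested as 0o777 and end up as 0o755 under a 022 umask, while a member with no mode information at all (`external_attr == 0`) falls back to 0o666.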
class PreserveMode(enum.Enum):
    """Options for preserving file permissions upon extraction."""
    NONE = enum.auto()
    SAFE = enum.auto()
    ALL = enum.auto()
There needs to be a mode for x bits only - see #59999 (comment). (Also this is the behaviour I want and how I found this issue.)
It is not very natural to use an enum here - much more general is to define preserve_mode taking an int so that I can pass 0o333 or something to preserve wx bits and default the r bit to 1 (from a default of 0o666).
The constants here could directly be
    PRESERVE_NONE = 0
    PRESERVE_SAFE = 0o0777
    PRESERVE_ALL = 0o7777
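One possible reading of this proposal, sketched below: bits selected by the mask come from the archive, and bits outside the mask fall back to the 0o666 default. The function `combine_mode` and its `default` parameter are hypothetical, and this interpretation of "default the r bit to 1" is my assumption rather than something spelled out in the comment.

```python
PRESERVE_NONE = 0
PRESERVE_SAFE = 0o0777
PRESERVE_ALL = 0o7777


def combine_mode(archive_mode, preserve_mode, default=0o666):
    """Take the bits selected by *preserve_mode* from *archive_mode*.

    Bits outside the mask are taken from *default* instead, so a mask
    of 0o333 preserves w and x from the archive while r stays set.
    """
    return (archive_mode & preserve_mode) | (default & ~preserve_mode)
```

Under this reading, `combine_mode(0o600, 0o333)` gives 0o644: the archive's user write bit survives, no execute bit is added, and read is granted to everyone by the default.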
import importlib.util
import io
import os
import pathlib
I don't think this is used?
@@ -4,9 +4,12 @@
XXX references to utf-8 need further investigation.
"""
import binascii
import contextlib
Unused?
            PreserveMode.SAFE: 0o777,
            PreserveMode.ALL: 0xFFFF,
        }
        new_mode = (member.external_attr >> 16) & mask[mode]
I think this should honour the current umask. It's clumsy to get the current umask, so the easiest way to do this is not to chmod the file after creation but to pass the requested mode bits into os.open() with os.O_CREAT when opening the target file for writing. It's also one fewer syscall, so it might be slightly faster.
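A rough sketch of that approach, assuming a POSIX system; the helper name `open_for_extraction` is invented here and this is not code from the PR.

```python
import os


def open_for_extraction(path, mode):
    """Open *path* for binary writing, requesting *mode* at creation time.

    Because the mode is passed to os.open() together with O_CREAT, the
    kernel applies the process umask itself; no later chmod is needed.
    """
    fd = os.open(path, os.O_WRONLY | os.O_CREAT | os.O_TRUNC, mode)
    # fdopen takes ownership of the descriptor and closes it with the file.
    return os.fdopen(fd, "wb")
```

One caveat worth keeping in mind: the mode argument only takes effect when the file is actually created, so extracting over an existing file would leave that file's current permissions in place.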
        mask = {
            PreserveMode.SAFE: 0o777,
            PreserveMode.ALL: 0xFFFF,
        }
You could define
    class PreserveMode(enum.IntEnum):
        NONE = 0
        SAFE = 0o777
        ALL = 0o7777
then here it's just
    (member.external_attr >> 16) & mode
(However, see my other comment about not restricting this to specific values - users may have arbitrary requirements that an int can capture better.)
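Spelled out as a runnable sketch, the IntEnum variant could look like this. The helper `masked_mode` is hypothetical; the enum values are the ones suggested in the comment.

```python
import enum


class PreserveMode(enum.IntEnum):
    """Each member doubles as the bit mask applied to the stored mode."""
    NONE = 0
    SAFE = 0o777
    ALL = 0o7777


def masked_mode(external_attr, mode):
    """Return the stored Unix mode bits selected by *mode* (an int or PreserveMode)."""
    return (external_attr >> 16) & mode
```

Because IntEnum members are ints, the dictionary lookup disappears and a caller could also pass a plain integer mask. Note that ALL is 0o7777 here rather than the 0xFFFF used in the diff, so the file-type bits are dropped before the value would reach os.chmod.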
Co-authored by Alexey Boriskin
https://bugs.python.org/issue15795
TODO