bpo-37538: Zipfile refactor #14957

Closed
wants to merge 29 commits into from
Changes from 1 commit
Commits
a0db1c9
Add descriptive global variables for general purpose bit flags
danifus Jul 10, 2019
6710baf
Add global variable for zip64 extra data header id
danifus Jul 10, 2019
3777389
Add flag properties to ZipInfo
danifus Jul 10, 2019
f435f08
Restructure how ZipExtFile gets created from ZipFile.open
danifus Jul 10, 2019
ca41137
Fix bug when seeking on encrypted zip files
danifus Jul 10, 2019
00c87ee
Refactor _ZipDecrypter with a BaseZipDecrypter class
danifus Jul 11, 2019
b8364a6
Move compressor and decompressor selection code into classes
danifus Jul 12, 2019
6b256c0
Add zipinfo_cls, zipextfile_cls and zipwritefile_cls to ZipFile
danifus Jul 12, 2019
af8864b
Fix typo datadescripter -> datadescriptor
danifus Jul 13, 2019
42c4be6
Add dosdate and dostime properties to ZipInfo
danifus Jul 13, 2019
801d966
Move encoding datadescriptor to ZipInfo
danifus Jul 13, 2019
46604e0
Refactor how ZipInfo encodes the local file header.
danifus Jul 13, 2019
7d28d8f
Move central directory encoding to ZipInfo
danifus Jul 14, 2019
c784d7f
Move struct packing of central directory record to a ZipInfo method
danifus Jul 14, 2019
f84e481
Refactor _decodeExtra to allow subclasses to support new extra fields
danifus Jul 14, 2019
1a07518
Change the way zipfile _decodeExtra loops through the extra bytes
danifus Jul 14, 2019
6de1a9a
Decouple updating and checking crc when reading a zipfile
danifus Jul 14, 2019
6b90dfd
Move writing zipfile local header to _ZipWriteFile
danifus Jul 14, 2019
4417cc5
Move writing local header to within _ZipWriteFile
danifus Jul 15, 2019
bfa8a7e
Add some comments to zipfile's LZMACompressor
danifus Jul 15, 2019
a211abe
Add comments to ZipFile._write_end_record describing structs
danifus Jul 17, 2019
3eff8be
Small performance fix to zipfile.CRCZipDecrypter
danifus Jul 22, 2019
7220ef9
Refactor ZipFile encoding approach
danifus Jul 22, 2019
0a718f7
Change ZipInfo encoding of local extra data
danifus Jul 22, 2019
cb826d6
Allow ZipFile _open_to_write() and _open_to_read() to take kwargs
danifus Jul 26, 2019
5a88b2d
Change ZipFile._open_to_write() to accept pwd argument.
danifus Jul 26, 2019
fa374ee
ZipFile remove special case path for ZIP_STORED
danifus Jul 26, 2019
5bb4c17
📜🤖 Added by blurb_it.
blurb-it[bot] Jul 26, 2019
366f79f
bpo-37538: Small clean up of zipfile refactor
danifus Jul 27, 2019
bpo-37538: Small clean up of zipfile refactor
This clean up fixes some short-comings identified when implementing the
AES code used to show the utility of this refactor.
danifus committed Jul 27, 2019
commit 366f79f47aa880b161b22449f5ce9b065754de62
39 changes: 23 additions & 16 deletions Lib/zipfile.py
@@ -628,9 +628,9 @@ def FileHeader(self, zip64=None):
     def get_central_directory_kwargs(self):
         min_version = 0
         # Strip the zip 64 extra block if present
-        extra_data = _strip_extra(self.extra, (EXTRA_ZIP64,))
+        extra = _strip_extra(self.extra, (EXTRA_ZIP64,))
 
-        (zip64_extra_data,
+        (zip64_extra,
          file_size,
          compress_size,
          header_offset,
@@ -642,7 +642,7 @@ def get_central_directory_kwargs(self):
         # There are reports that windows 7 can only read zip 64 archives if the
         # zip 64 extra block is the first extra block present. So we make sure
         # the zip 64 block is first.
-        extra_data = zip64_extra_data + extra_data
+        extra = zip64_extra + extra
 
         if self.compress_type == ZIP_BZIP2:
             min_version = max(BZIP2_VERSION, min_version)
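The strip-then-prepend ordering in the two hunks above can be sketched in isolation. This is a minimal illustration, not the PR's code: `iter_extra_blocks` and `zip64_first` are hypothetical helpers written for this note (the PR itself relies on the module-private `_strip_extra`), and the sample blocks are made-up data. The only facts assumed are the standard extra-field layout (2-byte header ID, 2-byte size, then the data) and that `0x0001` is the zip64 header ID.

```python
import struct

EXTRA_ZIP64 = 0x0001  # header ID of the zip64 extended information block


def iter_extra_blocks(extra):
    """Yield (header_id, raw_block) for each block in an extra field."""
    pos = 0
    while pos + 4 <= len(extra):
        header_id, size = struct.unpack_from("<HH", extra, pos)
        yield header_id, extra[pos:pos + 4 + size]
        pos += 4 + size


def zip64_first(extra, zip64_block):
    """Drop any existing zip64 block from `extra`, then prepend `zip64_block`."""
    rest = b"".join(raw for header_id, raw in iter_extra_blocks(extra)
                    if header_id != EXTRA_ZIP64)
    return zip64_block + rest


# Hypothetical input: a one-byte extended-timestamp block (ID 0x5455)
# followed by a stale zip64 block that should be replaced.
timestamp = struct.pack("<HHB", 0x5455, 1, 0)
stale_zip64 = struct.pack("<HHQ", EXTRA_ZIP64, 8, 1)
fresh_zip64 = struct.pack("<HHQQ", EXTRA_ZIP64, 16, 5 << 32, 5 << 32)

ordered = zip64_first(timestamp + stale_zip64, fresh_zip64)
first_id = struct.unpack_from("<H", ordered)[0]
```

Stripping before prepending matters: without it, an archive rewritten from an existing entry could end up carrying two zip64 blocks.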
@@ -671,7 +671,7 @@ def get_central_directory_kwargs(self):
             "internal_attr": self.internal_attr,
             "external_attr": self.external_attr,
             "header_offset": header_offset,
-            "extra_data": extra_data,
+            "extra": extra,
             "comment": self.comment,
         }

@@ -680,7 +680,7 @@ def _encode_central_directory(self, filename, create_version,
                                   flag_bits, compress_type, dostime, dosdate,
                                   crc, compress_size, file_size, disk_start,
                                   internal_attr, external_attr, header_offset,
-                                  extra_data, comment):
+                                  extra, comment):
         try:
             centdir = struct.pack(
                 structCentralDir,
@@ -697,7 +697,7 @@ def _encode_central_directory(self, filename, create_version,
                 compress_size,
                 file_size,
                 len(filename),
-                len(extra_data),
+                len(extra),
                 len(comment),
                 disk_start,
                 internal_attr,
@@ -713,11 +713,11 @@ def _encode_central_directory(self, filename, create_version,
                    create_system, extract_version, reserved,
                    flag_bits, compress_type, dostime, dosdate,
                    crc, compress_size, file_size,
-                   len(filename), len(extra_data), len(comment),
+                   len(filename), len(extra), len(comment),
                    disk_start, internal_attr, external_attr,
                    header_offset), file=sys.stderr)
             raise
-        return centdir + filename + extra_data + comment
+        return centdir + filename + extra + comment
 
     def central_directory(self):
         params = self.get_central_directory_kwargs()
@@ -844,7 +844,10 @@ class BaseDecrypter:
     def start_decrypt(self, fileobj):
         """Initialise or reset the decrypter.
 
-        Returns the number of bytes in the "encryption header" section.
+        Returns the number of bytes used for encryption that should be excluded
+        from the _compress_size counter (eg. the "encryption header" section
+        and any bytes after the "file data" used for encryption, such as the
+        HMAC value for winzip's AES encryption).
 
         By the end of this method fileobj should be at the start of the
         "file data" section.
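To make the revised `start_decrypt` contract concrete, here is a small sketch of a decrypter that reports both its header and its trailing bytes, and of a caller that subtracts that total from its remaining-bytes counter. `ToyDecrypter`, its `decrypt` method, and the 18-byte/10-byte sizes are assumptions chosen for illustration (they mirror a 16-byte salt plus 2-byte password check and a 10-byte authentication code); no real cryptography is performed and none of this is the PR's AES implementation.

```python
import io


class ToyDecrypter:
    """Hypothetical decrypter following the start_decrypt() contract above."""

    HEADER_LEN = 18  # assumed: 16-byte salt + 2-byte password check
    FOOTER_LEN = 10  # assumed: authentication code stored after the file data

    def start_decrypt(self, fileobj):
        # Consume the encryption header so fileobj sits at the file data.
        self.header = fileobj.read(self.HEADER_LEN)
        # Report every non-data byte that is counted in the compressed size.
        return self.HEADER_LEN + self.FOOTER_LEN

    def decrypt(self, data):
        return data  # placeholder: this sketch does not decrypt anything


payload = b"file data"
member = b"H" * 18 + payload + b"F" * 10  # header + data + footer

fileobj = io.BytesIO(member)
compress_left = len(member)  # what the central directory would record

decrypter = ToyDecrypter()
compress_left -= decrypter.start_decrypt(fileobj)
data = decrypter.decrypt(fileobj.read(compress_left))
```

If the return value covered only the header, the reader would treat the 10 trailing bytes as file data and hand them to the decompressor.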
@@ -1275,13 +1278,14 @@ def start_decrypter(self):
 
         # self._decrypter is responsible for reading the
         # "encryption header" section if present.
-        encryption_header_length = self._decrypter.start_decrypt(self._fileobj)
+        encryption_header_footer_length = self._decrypter.start_decrypt(self._fileobj)
         # By here, self._fileobj should be at the start of the "file data"
         # section.
 
         # Adjust read size for encrypted files by the length of the
-        # "encryption header" section.
-        self._compress_left -= encryption_header_length
+        # "encryption header" section and any bytes after the encrypted
+        # data.
+        self._compress_left -= encryption_header_footer_length
 
     def __repr__(self):
         result = ['<%s.%s' % (self.__class__.__module__,
@@ -1579,16 +1583,19 @@ def write(self, data):
         self._fileobj.write(data)
         return nbytes
 
+    def flush_data(self):
+        if self._compressor:
+            buf = self._compressor.flush()
+            self._compress_size += len(buf)
+            self._fileobj.write(buf)
+
     def close(self):
         if self.closed:
             return
         try:
             super().close()
+            self.flush_data()
             # Flush any data from the compressor, and update header info
-            if self._compressor:
-                buf = self._compressor.flush()
-                self._compress_size += len(buf)
-                self._fileobj.write(buf)
             self._zinfo.compress_size = self._compress_size
             self._zinfo.CRC = self._crc
             self._zinfo.file_size = self._file_size
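Pulling the flush logic out of `close()` gives subclasses a hook for writing bytes after the compressed data, which is presumably what the AES work mentioned in the commit message needed. The sketch below shows that pattern under loud assumptions: `ToyWriteFile` is a cut-down stand-in for the PR's write-file class, `TrailerWriteFile` and its all-zero 10-byte `TRAILER` are hypothetical, and only `flush_data` is copied from the diff.

```python
import io
import zlib


class ToyWriteFile:
    """Simplified stand-in for the PR's write-file class (not the real one)."""

    def __init__(self, fileobj):
        self._fileobj = fileobj
        # Raw deflate stream (negative wbits), as stored in zip members.
        self._compressor = zlib.compressobj(
            zlib.Z_DEFAULT_COMPRESSION, zlib.DEFLATED, -15)
        self._compress_size = 0

    def write(self, data):
        buf = self._compressor.compress(data)
        self._compress_size += len(buf)
        self._fileobj.write(buf)

    def flush_data(self):
        # Same body as the flush_data() added in the hunk above.
        if self._compressor:
            buf = self._compressor.flush()
            self._compress_size += len(buf)
            self._fileobj.write(buf)

    def close(self):
        self.flush_data()
        # The real close() copies this into the ZipInfo; here it is an attribute.
        self.compress_size = self._compress_size


class TrailerWriteFile(ToyWriteFile):
    """Hypothetical subclass that appends bytes after the compressed data."""

    TRAILER = b"\x00" * 10  # placeholder for an authentication code

    def flush_data(self):
        super().flush_data()
        self._compress_size += len(self.TRAILER)
        self._fileobj.write(self.TRAILER)


out = io.BytesIO()
member = TrailerWriteFile(out)
member.write(b"hello " * 100)
member.close()
```

Because the trailer is written inside `flush_data`, it is already included in `_compress_size` by the time `close()` records the header fields.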