"loc: wrong sig" in ZipFileSystem on a valid ZIP file (extra data field of exactly 5 bytes)
Lance Andersen
LANCE.ANDERSEN at ORACLE.COM
Wed Oct 21 16:37:02 UTC 2020
Hi David,
From a quick look at ZipFileSystem this appears to be an optimization to avoid looking at the LOC extended Timestamp Extra field
If a Info-ZIP Extended Timestamp (0x5455)is found then:
If the "zipinfo-time" entry was set to “false” in the Map specified when creating the Zip FileSystem,
FileSystems.newFileSystem(zipFile, Map.of("zipinfo-time", "false")
and the data size of the CEN extended Timestamp is 5 (flag + mod time)
The modified time is used from the CEN Extended Timestamp extra field
Otherwise get the modified time, creation time, and access time from the LOC Extended Timestamp extra field.
—————
Extended Timestamp Extra Field:
==============================
The following is the layout of the extended-timestamp extra block.
(Last Revision 19970118)
Local-header version:
Value Size Description
----- ---- -----------
(time) 0x5455 Short tag for this extra block type ("UT")
TSize Short total data size for this block
Flags Byte info bits
(ModTime) Long time of last modification (UTC/GMT)
(AcTime) Long time of last access (UTC/GMT)
(CrTime) Long time of original creation (UTC/GMT)
Central-header version:
Value Size Description
----- ---- -----------
(time) 0x5455 Short tag for this extra block type ("UT")
TSize Short total data size for this block
Flags Byte info bits (refers to local header!)
(ModTime) Long time of last modification (UTC/GMT)
The central-header extra field contains the modification time only,
or no timestamp at all. TSize is used to flag its presence or
absence. But note:
If "Flags" indicates that Modtime is present in the local header
field, it MUST be present in the central header field, too!
This correspondence is required because the modification time
value may be used to support trans-timezone freshening and
updating operations with zip archives.
The time values are in standard Unix signed-long format, indicating
the number of seconds since 1 January 1970 00:00:00. The times
are relative to Coordinated Universal Time (UTC), also sometimes
referred to as Greenwich Mean Time (GMT). To convert to local time,
the software must know the local timezone offset from UTC/GMT.
The lower three bits of Flags in both headers indicate which time-
stamps are present in the LOCAL extra field:
bit 0 if set, modification time is present
bit 1 if set, access time is present
bit 2 if set, creation time is present
bits 3-7 reserved for additional timestamps; not set
Those times that are present will appear in the order indicated, but
any combination of times may be omitted. (Creation time may be
present without access time, for example.) TSize should equal
(1 + 4*(number of set bits in Flags)), as the block is currently
defined. Other timestamps may be added in the future.
--------------
It's hard to comment on why you received the error that you did but it is possible the tool that was used for writing the entry did something unexpected.
Out of curiosity, have you tried using ZipFile/ZipEntry to access the entry?
Best,
Lance
> On Oct 21, 2020, at 4:55 AM, Dawid Weiss <dawid.weiss at gmail.com> wrote:
>
> Hello,
>
> We've encountered a seemingly valid ZIP file (zipinfo -v shows all its
> entries are intact) that causes a runtime exception when scanning its
> contents with ZipFileSystem. The exception indicates an invalid
> signature when parsing EXTID_EXTT. I don't quite understand this
> comment in the code:
>
> case EXTID_EXTT:
> // spec says the Extened timestamp in cen only has mtime
> // need to read the loc to get the extra a/ctime, if flag
> // "zipinfo-time" is not specified to false;
> // there is performance cost (move up to loc and read) to
> // access the loc table foreach entry;
> if (zipfs.noExtt) {
> if (sz == 5)
> mtime = unixToJavaTime(LG(extra, pos + 1));
> break;
> }
> ...
>
> but this ZIP file has the extra data block of exactly 5 bytes, as
> indicated by zipinfo:
>
> Central directory entry #6:
> ---------------------------
> ...
> file system or operating system of origin: Unix
> version of encoding software: 3.0
> minimum file system compatibility required: MS-DOS, OS/2 or NT FAT
> minimum software version required to extract: 2.0
> compression method: deflated
> ...
> extended local header: no
> file last modified on (DOS date/time): 2018 Mar 1 04:56:20
> file last modified on (UT extra field modtime): 2018 Mar 1 05:56:19 local
> file last modified on (UT extra field modtime): 2018 Mar 1 04:56:19 UTC
> ...
> Unix file attributes (100000 octal): ----------
> MS-DOS file attributes (01 hex): read-only
>
> The central-directory extra field contains:
> - A subfield with ID 0x5455 (universal time) and 5 data bytes.
> The local extra field has UTC/GMT modification/access times.
>
> The above conditional block checking for length == 5 would have worked
> in ZipFileSystem but it's surrounded by a condition over an
> externally-provided property - zipfs.noExtt is only set to true if:
>
> this.noExtt = "false".equals(env.get("zipinfo-time"));
>
> I can't share this particular ZIP file with you and I don't know how
> it was created but it seems like that "zipinfo-time" flag could be
> omitted if the length of the extra data field is exactly 5?
>
> Dawid
Best
Lance
------------------
Lance Andersen| Principal Member of Technical Staff | +1.781.442.2037
Oracle Java Engineering
1 Network Drive
Burlington, MA 01803
Lance.Andersen at oracle.com
More information about the core-libs-dev
mailing list