RFR(S): 8073497: Lazy conversion of ZipEntry time

Claes Redestad claes.redestad at oracle.com
Mon Feb 23 15:41:48 UTC 2015


On 02/22/2015 10:16 PM, Xueming Shen wrote:
> On 2/21/15 6:11 PM, Claes Redestad wrote:
>> Hi Sherman,
>>
>> On 2015-02-21 19:49, Xueming Shen wrote:
>>> Hi Claes,
>>>
>>> This change basically undo the "fix" for 4759491 [1], for better 
>>> performance ...
>>>
>>
>> my intent was never to break current behavior, but that mistake can 
>> be rectified
>> without missing out on the startup benefit of laziness:
>>
>> http://cr.openjdk.java.net/~redestad/jdk9/8073497/webrev.1/
>>
>> The time/mtime getters now use the mtime field if it exists, while 
>> the setters will
>> update both fields. Since getLastModified already fell back to 
>> converting the
>> time field rather than null if mtime wasn't set, setting mtime to a 
>> FileTime when
>> calling setTime seems consistent and a cheap way to preserve the time 
>> precision.
>>
>> I guess a range check to skip the FileTime creation in setTime if the 
>> time is within
>> the DOS time bounds would be a valid optimization, since that will 
>> typically
>> always be the case.
>
>
> It's a reasonable solution. I would assume we probably need that 
> "range check" optimization
> to avoid setting the "mtime" field in normal use scenario. ZOS now 
> outputs the more accurate
> "mtime" to the zip file using "extend timestamp" if it's not null (the 
> entry gets bigger). The
> assumption now is that we only output the extended timestamp if the 
> time stamps set
> via the setXYZTime() explicitly.
>
> ZOS.java
>
> ...
>   432         int elenEXTT = 0;               // info-zip extended timestamp
>   433         int flagEXTT = 0;
>   434         if (e.mtime != null) {
>   435             elenEXTT += 4;
>   436             flagEXTT |= EXTT_FLAG_LMT;
>   437         }
>   438         if (e.atime != null) {
>   439             elenEXTT += 4;
>   440             flagEXTT |= EXTT_FLAG_LAT;
>   441         }
>   442         if (e.ctime != null) {
>   443             elenEXTT += 4;
>   444             flagEXTT |= EXTT_FLAT_CT;
>   445         }
>   446         if (flagEXTT != 0)
>   447             elen += (elenEXTT + 5);    // headid(2) + size(2) + flag(1) + data
>   448         writeShort(elen);
> ...

Here's my attempt to resolve this:

webrev: http://cr.openjdk.java.net/~redestad/jdk9/8073497/webrev.2/
incremental: 
http://cr.openjdk.java.net/~redestad/jdk9/8073497/webrev.1_to_2/

Calculating exact upper and lower bounds would only make sense for the 
current timezone
and would have to be recalculated, while approximate bounds should 
suffice for the practical
purpose of avoiding the FileTime creation in the typical case for 
performance reasons.

I rather arbitrarily chose an upper bound around 2098. Lower bound check 
could test for
some value near 1980, but since there's already code to check if 
Date.getYear() is less
than 1980 I figured that might suffice...

... that's when I stumbled on a few subtle bugs in the 
dosToJava/javaToDos conversion methods,
first and foremost that a Date representing times before the era 
(typically year 1), getYear
returns positive years. While a pretty artificial scenario, this is a 
bug in the javaToDosTime
conversion which breaks some tests. A simple check that millisecond long 
is larger than 0
is sufficient to resolve this negative-year-overflow for the current 
implementation.

We also don't check for year overflow for instants after 2107 in the 
local time: these will start
over at 1980. The old behavior is to simply accept the overflow. I've 
kept the behavior for the
dostime conversion, while the upper bound check ensure that the mtime 
will be calculated in
these cases, which seems reasonable.

This code does come with loss of some precision when setting time via 
setTime(long) in the
valid DOS time range (it will be rounded to 2-second precision), but 
with setLastModifiedTime
available to get the better precision this might be an acceptable 
behavioral regression of
setTime (while slightly weird, it doesn't seem to violate the 
specification).

/Claes

>
> -Sherman
>
>
>>> If we go with this change, I think we should also update the field 
>>> comment back to the
>>> original one to clearly indicates the "time" is "in DOS time".
>>>
>>> -    long time = -1;     // modification time (in DOS time)
>>> +    long mtime = -1;    // last modification time
>>>
>>
>> Done.
>>
>>>
>>> The set/getLastModifiedTime() pair also need update to set/get the 
>>> "time" field
>>> correctly to/from the dos time.
>>
>> Done.
>>
>> Thanks!
>>
>> /Claes
>>
>>>
>>> -Sherman
>>>
>>> [1] http://hg.openjdk.java.net/jdk8/jdk8/jdk/rev/90df6756406f
>>>
>>> On 2/21/15 6:34 AM, Claes Redestad wrote:
>>>> Hi all,
>>>>
>>>> please review this patch to re-introduce laziness in the 
>>>> java-to-dos time
>>>> conversions for the ZipEntry.time field.
>>>>
>>>> Webrev: http://cr.openjdk.java.net/~redestad/jdk9/8073497/webrev.0/
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8073497
>>>>
>>>> See bug for more details.
>>>>
>>>> This behavior was actually the case before 8-b94, when the time field
>>>> was removed in favor of a set of FileTime fields, but when it was 
>>>> later
>>>> re-introduced to address some compatibility issues the conversion was
>>>> implemented in an eager fashion. This inadvertently affects VM startup
>>>> ever so little, since for every entry read via a ZipFile or 
>>>> ZipInputStream
>>>> we'll do a relatively expensive call creating a Date and doing 
>>>> timezone
>>>> conversion.
>>>>
>>>> Some gains from loading fewer classes during VM startup, as well.
>>>>
>>>> Thanks!
>>>>
>>>> /Claes
>>>
>>
>




More information about the core-libs-dev mailing list