RFR: 8152733: Avoid creating Manifest when checking for Multi-Release attribute

Claes Redestad claes.redestad at oracle.com
Fri Mar 25 16:37:59 UTC 2016



On 2016-03-25 16:03, Paul Sandoz wrote:
>> On 25 Mar 2016, at 13:56, Claes Redestad <claes.redestad at oracle.com> wrote:
>>
>>> You've replaced the precomputed good suffix shift which i think makes the Math.max in math a bit harder to read. A comment or just move a local for "j < len - 1 ? len : 1" would make it a bit easier to read.
>>>
>>> It would be good to move the value of MULTIRELEASE_CHARS to its own line as the current line is requires scrolling when looking at the side-by-side diffs.
>> Sure: http://cr.openjdk.java.net/~redestad/8152733/webrev.02/
>>
> +1

Thanks!

>
>
> We could marginally improve the bad shift too:
>
>    int badShift = (c < 128) ? lastOcc[c] : 0;
>
> rather than masking (which might lop off the top bit of bytes of value encoded characters), that’s clearer code-wise to me. In fact i bet we could reduce the last occurrence array by half if we canonicalize to upper-case characters and bound the range to [‘ ‘, ‘Z’].
>
> byte c = b[i + j];
> int goodShift = 0;
> int badShift = 0;
> if (c >= ‘ ‘ && c <= ‘z') {
>    if (c >= ‘a’) c -= 32; // Canonicalize
>
>    if (c != src[j]) {
>      // no match
>      goodShift = (j < len - 1) ? len : 1;
>      badShift = lastOcc[c - 32];
>    }
> } else {
>      // no match, character not valid for name
>      goodShift = (j < len - 1) ? len : 1;
> }
> if (goodShift > 0) {
>      i += Math.max(j + 1 - badShift, goodShift);
>      continue next;
> }
>
> Just idle musings on my part, feel free to ignore ‘em.

Interesting, my only concern is we'd add more code/complexity, but it 
seems the shift will always be == len when not matching a character in 
src (the proof is left out as an exercise), so this can be simplified 
further to:

byte c = b[i + j];
if (c >= ' ' && c <= 'z') {
     if (c >= 'a') c -= 32; // Canonicalize

     if (c != src[j]) {
         // no match
         int goodShift = (j < len - 1) ? len : 1;
         int badShift = lastOcc[c - 32];
         i += Math.max(j + 1 - badShift, goodShift);
         continue next;
     }
} else {
     // no match, character not valid for name
     i += len;
     continue next;
}

http://cr.openjdk.java.net/~redestad/8152733/webrev.03/

(included Steve's suggestion)

Since this avoids reading from lastOcc for anything but the chars 
actually matching, this actually seems to improve things a bit locally, 
so I'm willing to accept the pile-on. I'll run some numbers in the lab...

/Claes
>
> Paul.
>
>
>> Difference from previous for convenience:
>>
>> diff -r fc748fa355e0 src/java.base/share/classes/java/util/jar/JarFile.java
>> --- a/src/java.base/share/classes/java/util/jar/JarFile.java    Fri Mar 25 13:38:37 2016 +0100
>> +++ b/src/java.base/share/classes/java/util/jar/JarFile.java    Fri Mar 25 13:47:31 2016 +0100
>> @@ -887,11 +887,15 @@
>>      }
>>
>>      // Statics for hand-coded Boyer-Moore search
>> -    private static final byte[] CLASSPATH_CHARS = {'c','l','a','s','s','-','p','a','t','h', ':', ' '};
>> +    private static final byte[] CLASSPATH_CHARS =
>> +            {'c','l','a','s','s','-','p','a','t','h', ':', ' '};
>> +
>>      // The bad character shift for "class-path:"
>>      private static final byte[] CLASSPATH_LASTOCC;
>>
>> -    private static final byte[] MULTIRELEASE_CHARS = {'m','u','l','t','i','-','r','e','l','e', 'a', 's', 'e', ':', ' '};
>> +    private static final byte[] MULTIRELEASE_CHARS =
>> +            {'m','u','l','t','i','-','r','e','l','e', 'a', 's', 'e', ':', ' '};
>> +
>>      // The bad character shift for "multi-release: "
>>      private static final byte[] MULTIRELEASE_LASTOCC;
>>
>> @@ -959,6 +963,8 @@
>>      /**
>>       * Returns true if the pattern {@code src} is found in {@code b}.
>>       * The {@code lastOcc} array is the precomputed bad character shifts.
>> +     * Since there are no repeated substrings in our search strings,
>> +     * the good character shifts can be replaced with a comparison.
>>       */
>>      private boolean match(byte[] src, byte[] b, byte[] lastOcc) {
>>          int len = src.length;
>> @@ -972,7 +978,8 @@
>>                      c += 32;
>>                  }
>>                  if (c != src[j]) {
>> -                    i += Math.max(j + 1 - lastOcc[c&0x7F], j < len - 1 ? len : 1);
>> +                    int goodShift = (j < len - 1) ? len : 1;
>> +                    i += Math.max(j + 1 - lastOcc[c&0x7F], goodShift);
>>                      continue next;
>>                  }
>>              }
>>
>> Thanks!
>>
>> /Claes




More information about the core-libs-dev mailing list