RFR 8080640: Reduce copying when reading JAR/ZIP entries

Wed May 20 17:57:15 UTC 2015

On 05/18/2015 06:44 PM, Staffan Friberg wrote:
> Hi,
>
> Wanted to get reviews and feedback on this performance improvement for reading from JAR/ZIP files during classloading by reducing unnecessary copying and reading the entry in one go instead of in small portions. This shows a significant improvement when reading a single entry and for a large application with 10k classes and 500+ JAR files it improved the startup time by 4%.
>
> For more details on the background and performance results please see the RFE entry.
>
> RFE - https://bugs.openjdk.java.net/browse/JDK-8080640
> WEBREV - http://cr.openjdk.java.net/~sfriberg/JDK-8080640/webrev.0
>
> Cheers,
> Staffan

Hi Staffan,

Unless I am missing something, from your use scenario it appears that the only thing you really
need to boost performance is

     byte[] ZipFile.getAllBytes(ZipEntry ze);

You are allocating a byte[] on the use side and wrapping it with a ByteBuffer if the size is small
enough; otherwise, you are letting ZipFile allocate a big enough one for you. It does not look like
you can re-use that byte[] (it has to be wrapped by the ByteArrayInputStream and returned), so why do
you need two different methods here? The logic would be much simpler if ZipFile just allocated
a buffer of the appropriate size, filled in the bytes and returned it, with an "OOME" if the entry
size is bigger than 2g.
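
As a rough sketch of the single-method idea, written against the public java.util.zip API only:
the class name ZipAllBytes and the static helper are hypothetical stand-ins, not the proposed JDK
method, and the real change would read from the native jzentry rather than an InputStream.

```java
import java.io.EOFException;
import java.io.IOException;
import java.io.InputStream;
import java.util.zip.ZipEntry;
import java.util.zip.ZipFile;

// Hypothetical helper for illustration; not part of the JDK or the webrev.
class ZipAllBytes {

    // Returns the full uncompressed content of the entry named by ze.
    static byte[] getAllBytes(ZipFile zf, ZipEntry ze) throws IOException {
        // Only the name of the caller's entry is used; the size is taken
        // from the entry that ZipFile itself looks up.
        ZipEntry e = zf.getEntry(ze.getName());
        if (e == null) {
            throw new IOException("no such entry: " + ze.getName());
        }
        long size = e.getSize();
        if (size < 0) {
            // Defensive check only; not expected for entries obtained from ZipFile.
            throw new IOException("unknown size: " + e.getName());
        }
        if (size > Integer.MAX_VALUE) {
            // A byte[] cannot hold more than 2g.
            throw new OutOfMemoryError("entry too large: " + e.getName());
        }
        byte[] bytes = new byte[(int) size];
        try (InputStream in = zf.getInputStream(e)) {
            int off = 0;
            // read() may return fewer bytes than requested, so loop until full.
            while (off < bytes.length) {
                int n = in.read(bytes, off, bytes.length - off);
                if (n < 0) {
                    throw new EOFException("truncated entry: " + e.getName());
                }
                off += n;
            }
        }
        return bytes;
    }
}
```

A caller would then do something like `new ByteArrayInputStream(ZipAllBytes.getAllBytes(zf, ze))`,
with no second code path for small versus large entries.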

The only thing we use from the input ze is its name; the size/csize come from the jzentry. I don't
think jzentry.csize/size can be "unknown", since they come from the "cen" table.

If the real/final use of the bytes is to wrap them with a ByteArrayInputStream, why bother using a
ByteBuffer here? Wouldn't a direct byte[] with exactly the size of the entry serve better?
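
To illustrate the point about the wrapper, here is a small sketch comparing the two routes to a
ByteArrayInputStream. The class and method names are hypothetical, and the ByteBuffer variant is
only an assumption about the general shape of such code, not a copy of the webrev.

```java
import java.io.ByteArrayInputStream;
import java.io.InputStream;
import java.nio.ByteBuffer;

// Hypothetical comparison for illustration; names are not from the webrev.
class WrapDemo {

    // Route 1: go through a heap ByteBuffer, then unwrap its backing array.
    static InputStream viaByteBuffer(byte[] bytes) {
        ByteBuffer bb = ByteBuffer.wrap(bytes);
        // bb.array() is the same byte[] that was passed in, not a copy.
        return new ByteArrayInputStream(
                bb.array(), bb.arrayOffset() + bb.position(), bb.remaining());
    }

    // Route 2: wrap the exactly-sized byte[] directly.
    static InputStream direct(byte[] bytes) {
        return new ByteArrayInputStream(bytes);
    }
}
```

Both streams expose the same bytes, so when the array already has exactly the size of the entry
the intermediate ByteBuffer adds nothing.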

-Sherman