RFR: 8304020: Speed up test/jdk/java/util/zip/ZipFile/TestTooManyEntries.java and clarify its purpose [v6]
Eirik Bjorsnos
duke at openjdk.org
Mon Mar 13 09:31:26 UTC 2023
On Sun, 12 Mar 2023 21:25:46 GMT, Eirik Bjorsnos <duke at openjdk.org> wrote:
>> Please review this PR which speeds up TestTooManyEntries and clarifies its purpose:
>>
>> - The name 'TestTooManyEntries' does not clearly convey the purpose of the test. What is tested is ZipFile's validation that the total CEN size fits in a Java byte array. Suggested rename: CenSizeTooLarge
>> - The test creates DEFLATED entries, which incurs zlib costs and writes File Data / Data Descriptors for no additional benefit. We can use STORED instead.
>> - By creating a single LocalDateTime and setting it with `ZipEntry.setTimeLocal`, we can avoid repeated time zone calculations.
>> - Entry names are currently generated by calling UUID.randomUUID; a simple counter would do instead.
>> - The produced file is unnecessarily large. Since we know how large a CEN record is, we can take advantage of that to create a file of minimal size.
>> - By adding a maximally large extra field to the CEN entries, we get away with fewer CEN records and save memory.
>> - The summary and comments of the test can be improved to help explain the purpose of the test and how we reach the limit being tested.
>>
>> These speedups reduced the runtime from 4 min 17 sec to 4 seconds on my MacBook Pro. The produced ZIP size was reduced from 5.7 GB to 2 GB. Memory consumption is down from 8 GB to roughly 12 MB.
>
> Eirik Bjorsnos has updated the pull request incrementally with two additional commits since the last revision:
>
> - MAX_EXTRA_FIELD_SIZE can be better expressed as 0xFFFF
> - Bring back '@requires sun.arch.data.model == 64' for now
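To recap the speedups quoted above in code form: a CEN record is a fixed 46-byte header plus the entry name and the extra field, so with a 0xFFFF-byte extra field each record weighs in at roughly 64 KB, and on the order of 33,000 entries push the total CEN size past what a Java byte array can hold. Here is a minimal sketch of such an entry-writing loop; the writeEntries name and its parameters are illustrative, not the test verbatim:

import java.io.IOException;
import java.io.OutputStream;
import java.time.LocalDateTime;
import java.util.zip.ZipEntry;
import java.util.zip.ZipOutputStream;

// Sketch: write numEntries empty STORED entries, each carrying the same
// (all-zero) maximal extra field so that few records reach the limit
static void writeEntries(OutputStream out, byte[] extra, int numEntries)
        throws IOException {
    LocalDateTime timeLocal = LocalDateTime.now(); // created once, reused for all entries
    try (ZipOutputStream zip = new ZipOutputStream(out)) {
        for (int i = 0; i < numEntries; i++) {
            ZipEntry entry = new ZipEntry(Integer.toString(i)); // counter, not UUID
            entry.setMethod(ZipEntry.STORED);  // no zlib cost, no data descriptor
            entry.setSize(0);                  // empty entry, so size is 0
            entry.setCrc(0);                   // ... and the CRC-32 of no data is 0
            entry.setTimeLocal(timeLocal);     // avoids repeated time zone calculations
            entry.setExtra(extra);             // inflates each CEN record
            zip.putNextEntry(entry);
        }
    }
}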
The test now runs fast and with much less memory, but it still consumes 2 GB of disk space.
I brought back `@requires (sun.arch.data.model == "64")`; is this required for files larger than 2 GB?
We could bring the consumed disk space down to 131 MB by using a sparse file. Whether this is worth pursuing depends on whether the 2 GB file is considered problematic.
Here's the SparseOutputStream used to bring the size down to 131 MB:
/**
 * By writing mostly extra fields as sparse 'holes', we can save disk space
 * used by this test from ~2GB to ~131MB
 */
private static class SparseOutputStream extends FilterOutputStream {
    private final byte[] extra;
    private final FileChannel channel;

    public SparseOutputStream(byte[] extra, FileChannel channel) {
        super(new BufferedOutputStream(Channels.newOutputStream(channel)));
        this.extra = extra;
        this.channel = channel;
    }

    @Override
    public void write(byte[] b, int off, int len) throws IOException {
        if (b == extra && off == 0 && len == extra.length) {
            // Write the extra field header (EXTRA_HEADER_LENGTH is defined
            // elsewhere in the test), then flush so the channel position
            // reflects everything written so far
            out.write(b, off, EXTRA_HEADER_LENGTH);
            out.flush();
            // The remaining data is all zeros, so we can advance the
            // channel position instead of writing, leaving a sparse hole
            channel.position(channel.position() + len - EXTRA_HEADER_LENGTH);
        } else {
            out.write(b, off, len);
        }
    }
}
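Wired up, this could look something like the following; the file name, entry count, and the writeEntries helper from the sketch above are illustrative, and StandardOpenOption.SPARSE is only a hint that some file systems ignore:

import java.io.IOException;
import java.io.OutputStream;
import java.nio.channels.FileChannel;
import java.nio.file.Path;
import static java.nio.file.StandardOpenOption.*;

public static void main(String[] args) throws IOException {
    Path zip = Path.of("cen-size-too-large.zip"); // illustrative file name
    byte[] extra = new byte[0xFFFF];              // same array passed to setExtra
    int numEntries = 33_000;                      // roughly enough to exceed the limit
    try (FileChannel channel = FileChannel.open(zip, CREATE_NEW, WRITE, SPARSE);
         OutputStream out = new SparseOutputStream(extra, channel)) {
        writeEntries(out, extra, numEntries);
    }
}

The b == extra identity check relies on ZipOutputStream writing out the same array instance that was passed to ZipEntry.setExtra, and the flush() before repositioning is what keeps the buffered stream and the channel position consistent.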
-------------
PR: https://git.openjdk.org/jdk/pull/12991