RFR: 8321620: Optimize JImage decompressors
Glavo
duke at openjdk.org
Fri Dec 8 22:43:34 UTC 2023
On Wed, 8 Nov 2023 11:55:22 GMT, Glavo <duke at openjdk.org> wrote:
> This PR significantly speeds up decompressing resources in jimage files and greatly reduces temporary memory allocations in the process.
>
> This will improve startup speed for runtime images generated using `jlink --compress 1` and `jlink --compress 2`.
>
> I generated a runtime image containing javac using `jlink --compress 1 --add-modules jdk.compiler` and tested the time it took to compile a simple HelloWorld program 20 times using `perf stat -r20 javac /dev/shm/HelloWorld.java`. This PR reduces the total time taken from 17830ms to 13598ms (31.12% faster).
I generated runtime images using `jlink --compress (1|2) --add-modules java.se,jdk.unsupported,jdk.management` and then ran the following JMH benchmark:
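import java.nio.ByteBuffer;
import java.util.concurrent.TimeUnit;

import jdk.internal.jimage.ImageLocation;
import jdk.internal.jimage.ImageReader;
import jdk.internal.jimage.ImageReaderFactory;
import org.openjdk.jmh.annotations.*;

@Warmup(iterations = 10, time = 2)
@Measurement(iterations = 5, time = 3)
@Fork(value = 1, jvmArgsAppend = {"-XX:+UseG1GC", "-Xms8g", "-Xmx8g", "--add-exports=java.base/jdk.internal.jimage=ALL-UNNAMED"})
@BenchmarkMode(Mode.AverageTime)
@OutputTimeUnit(TimeUnit.NANOSECONDS)
@State(Scope.Benchmark)
public class Decompress {
    // Reader for the runtime image of the JVM that runs the benchmark.
    private static final ImageReader READER = ImageReaderFactory.getImageReader();
    private static final ImageLocation LOC = READER.findLocation("java.base", "java/lang/String.class");

    @Benchmark
    public ByteBuffer test() {
        // Reads the resource, decompressing it if the image is compressed.
        return READER.getResourceBuffer(LOC);
    }
}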
This is the result:
String Share

                                                  (Baseline)                        (Current)
Benchmark                           Mode  Cnt       Score        Error   Units       Score         Error   Units
Decompress.test                     avgt    5  184243.403 ±  3643.983    ns/op   35176.514 ±     507.618    ns/op  (-80.91%)
Decompress.test:gc.alloc.rate       avgt    5    3263.730 ±    64.431   MB/sec    3143.058 ±      45.330   MB/sec
Decompress.test:gc.alloc.rate.norm  avgt    5  630544.422 ±     0.008     B/op  115936.081 ±       0.001     B/op  (-81.61%)
Decompress.test:gc.count            avgt    5      10.000               counts       9.000                 counts
Decompress.test:gc.time             avgt    5      14.000                   ms      13.000                     ms

Zip

                                                  (Baseline)                        (Current)
Benchmark                           Mode  Cnt       Score        Error   Units       Score         Error   Units
Decompress.test                     avgt    5  194237.534 ±  1026.180    ns/op  152855.728 ±   16058.780    ns/op  (-21.30%)
Decompress.test:gc.alloc.rate       avgt    5    1197.700 ±     6.306   MB/sec     464.278 ±      47.465   MB/sec
Decompress.test:gc.alloc.rate.norm  avgt    5  243953.338 ±     5.810     B/op   74376.291 ±       2.175     B/op  (-69.51%)
Decompress.test:gc.count            avgt    5       2.000               counts       1.000                 counts
Decompress.test:gc.time             avgt    5       4.000                   ms       3.000                     ms
The results show that memory allocation per operation is reduced by about 82% for String Share and about 70% for Zip, while decompression time drops by about 81% and 21% respectively.
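The percentages in parentheses in the tables are relative changes of the current score against the baseline score. A minimal sketch that recomputes them from the reported numbers (the class and method names here are only illustrative and are not part of the PR):

```java
import java.util.Locale;

public class PercentChange {
    // Relative change of `current` against `baseline`, in percent.
    // A negative value means the current score is lower than the baseline.
    static double percentChange(double baseline, double current) {
        return (current - baseline) / baseline * 100.0;
    }

    // Formats the change with two decimals, matching the style used in the tables.
    static String format(double baseline, double current) {
        return String.format(Locale.ROOT, "%.2f%%", percentChange(baseline, current));
    }

    public static void main(String[] args) {
        // Scores copied from the JMH results: time in ns/op, allocation in B/op.
        System.out.println("String Share time:       " + format(184243.403, 35176.514));
        System.out.println("String Share allocation: " + format(630544.422, 115936.081));
        System.out.println("Zip time:                " + format(194237.534, 152855.728));
        System.out.println("Zip allocation:          " + format(243953.338, 74376.291));
    }
}
```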
I ran the tier1 and tier2 tests and there were no new errors.
I generated runtime images containing javac using `jlink --compress (0|1|2) --add-modules jdk.compiler` and tested the time it took to compile a simple HelloWorld program 20 times using `perf stat -r20 javac /dev/shm/HelloWorld.java`. This is the result:
Baseline, No Compress: 10829ms
String Share:
* Baseline: 17830ms
* This PR: 13598ms
Zip:
* Baseline: 12350ms
* This PR: 12279ms
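You can see that in this test, this PR made the runtime image compressed using string share about 31% faster (17830ms to 13598ms), while the Zip-compressed image is essentially unchanged (12350ms to 12279ms).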
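For reference, "faster" here is the ratio of old time to new time, which is not the same number as the reduction in elapsed time. The sketch below (illustrative helper names, not part of the PR) computes both from the `perf stat` totals, plus the remaining overhead relative to the uncompressed image:

```java
public class StartupOverhead {
    // Speedup as a throughput gain in percent: how much more work per unit time.
    static double speedupPercent(double beforeMs, double afterMs) {
        return (beforeMs / afterMs - 1.0) * 100.0;
    }

    // Reduction in elapsed time in percent, relative to the old time.
    static double timeReductionPercent(double beforeMs, double afterMs) {
        return (beforeMs - afterMs) / beforeMs * 100.0;
    }

    // Extra time a compressed image costs compared with the uncompressed one, in percent.
    static double overheadPercent(double uncompressedMs, double compressedMs) {
        return (compressedMs / uncompressedMs - 1.0) * 100.0;
    }

    public static void main(String[] args) {
        double noCompress = 10829, shareBaseline = 17830, sharePr = 13598;
        System.out.printf("String share speedup:        %.2f%%%n", speedupPercent(shareBaseline, sharePr));
        System.out.printf("String share time reduction: %.2f%%%n", timeReductionPercent(shareBaseline, sharePr));
        System.out.printf("Overhead vs uncompressed, baseline: %.2f%%%n", overheadPercent(noCompress, shareBaseline));
        System.out.printf("Overhead vs uncompressed, this PR:  %.2f%%%n", overheadPercent(noCompress, sharePr));
    }
}
```

With these inputs the speedup comes out at roughly 31% and the time reduction at roughly 24%, so the 31.12% figure quoted earlier is the speedup ratio.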
Does anyone want to take a look at this PR?
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16556#issuecomment-1801743767
PR Comment: https://git.openjdk.org/jdk/pull/16556#issuecomment-1802425032
PR Comment: https://git.openjdk.org/jdk/pull/16556#issuecomment-1804817957
PR Comment: https://git.openjdk.org/jdk/pull/16556#issuecomment-1847870676