RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2]
Ioi Lam
iklam at openjdk.org
Mon Jul 10 05:39:08 UTC 2023
On Fri, 7 Jul 2023 23:16:00 GMT, Ioi Lam <iklam at openjdk.org> wrote:
> I hope to implement a fast path for relocation that avoids using the hash tables at all. If we can get the total alloc + reloc time to be about 1.5ms, then it would be just as fast as before when relocation is enabled.
I've implemented a fast relocation lookup. It currently uses a table of the same size as the archived heap objects, but I can reduce that to 1/2 the size.
See https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1
This is implemented in about 330 lines of code in archiveHeapLoader.cpp. The code is templatized so I can try out different approaches (like `-XX:+NahlRawAlloc` and `-XX:+NahlUseAccessAPI`); once an approach is settled on, it can be simplified further.
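To illustrate the table-based lookup mentioned above, here is a minimal sketch of a flat, offset-indexed relocation table (hypothetical names, not the actual code from the branch). It assumes the archived region is contiguous and that heap objects are at least two words long, which is what makes the half-size table sufficient:

```c++
// Sketch only: a flat table indexed by the object's offset inside the archived
// region, replacing a hash-table probe keyed by the old oop address.
//
// Assumptions (not from the patch): the archived heap region is one contiguous
// blob of 'region_size' bytes, objects start on word boundaries, and objects
// are at least 2 heap words long, so one slot per 2 words cannot collide --
// which is why the table can be half the size of the archived region.

#include <cstddef>
#include <cstdint>

class RelocTable {
  uintptr_t  _region_base;   // start of the in-memory copy of the archived region
  size_t     _region_size;   // size of the archived region, in bytes
  uintptr_t* _new_addr;      // _new_addr[i] = runtime address of the object that
                             // starts at archived offset i * 2 * sizeof(void*)
 public:
  RelocTable(uintptr_t region_base, size_t region_size)
    : _region_base(region_base), _region_size(region_size) {
    size_t slots = region_size / (2 * sizeof(void*));  // half-size table
    _new_addr = new uintptr_t[slots]();                // zero = not yet allocated
  }
  ~RelocTable() { delete[] _new_addr; }

  // Record the runtime copy of the object that was archived at 'archived_obj'.
  void set(uintptr_t archived_obj, uintptr_t runtime_obj) {
    _new_addr[slot_for(archived_obj)] = runtime_obj;
  }

  // O(1) lookup on the hot relocation path, no hashing or probing.
  uintptr_t get(uintptr_t archived_obj) const {
    return _new_addr[slot_for(archived_obj)];
  }

 private:
  size_t slot_for(uintptr_t archived_obj) const {
    size_t offset = archived_obj - _region_base;
    return offset / (2 * sizeof(void*));
  }
};
```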
There's only one thing that's not yet implemented -- the equivalent of `ArchiveHeapLoader::patch_native_pointers()`. I'll do that next.
$ java -XX:+NewArchiveHeapLoading -Xmx128m -Xlog:cds+gc --version
[0.004s][info][cds,gc] Delayed allocation records alloced: 640
[0.004s][info][cds,gc] Load Time: 1388458
The whole allocation + relocation takes about 1.4 ms. That's about 1.25 ms slower in the worst case, which is when the "old" code doesn't have to relocate (see the `(**)` rows in the table below), and about 0.8 ms slower when the "old" code does have to relocate.

All times are in ms, for `java --version`:
====================================
Dump: java -Xshare:dump -Xmx128m

G1 (run -Xmx)       old      new      diff
  128m              14.476   15.754   +1.277 (**)
  8192m             15.359   16.085   +0.726

Serial (run -Xmx)   old      new      diff
  128m              13.442   14.241   +0.798
  8192m             13.740   14.532   +0.791

====================================
Dump: java -Xshare:dump -Xmx8192m

G1 (run -Xmx)       old      new      diff
  128m              14.975   15.787   +0.812
  2048m             16.239   17.035   +0.796
  8192m             14.821   16.042   +1.221 (**)

Serial (run -Xmx)   old      new      diff
  128m              13.444   14.167   +0.723
  8192m             13.717   14.502   +0.785
====================================
While the code is slower than before, it's a lot simpler, and it works with all collectors. I tested it with ZGC, and I expect Shenandoah to work as well.
The cost is about 1.3 ms per MB of archived heap objects. That may be acceptable, as it's a small fraction of JVM bootstrap time: we have about 1 MB of archived objects today, and we don't expect this size to increase drastically in the near future.
The extra memory cost is:
- a temporary in-memory copy of the archived heap objects
- a temporary table of 1/2 the size of the archived heap objects
The former can be reduced by reading the archived objects as a stream instead of all at once. The latter can be reduced by a more elaborate relocation algorithm that assumes most of the allocated objects end up in a contiguous block. Such changes may cause a further slowdown.
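As a rough illustration of the contiguous-block idea (hypothetical names and structure, not code from the branch): if most runtime copies land in one block in archive order, relocating a pointer is just adding a constant delta, and only the out-of-line objects need entries in a small side table:

```c++
// Sketch only: most archived objects are assumed to have been copied, in
// archive order, into one contiguous runtime block, so relocation is a fixed
// delta; the few exceptions go into a small side table instead of a full-size
// relocation table.

#include <cassert>
#include <cstdint>
#include <unordered_map>

class DeltaReloc {
  uintptr_t _archived_base;  // start of the in-memory copy of the archived region
  intptr_t  _delta;          // runtime_block_base - _archived_base
  std::unordered_map<uintptr_t, uintptr_t> _exceptions;  // archived -> runtime address

 public:
  DeltaReloc(uintptr_t archived_base, uintptr_t runtime_base)
    : _archived_base(archived_base),
      _delta((intptr_t)(runtime_base - archived_base)) {}

  // Called for the (hopefully rare) objects that were allocated out of line.
  void add_exception(uintptr_t archived_obj, uintptr_t runtime_obj) {
    _exceptions[archived_obj] = runtime_obj;
  }

  uintptr_t relocate(uintptr_t archived_obj) const {
    auto it = _exceptions.find(archived_obj);
    if (it != _exceptions.end()) {
      return it->second;           // rare: object is not in the main block
    }
    return archived_obj + _delta;  // common: fixed offset into the contiguous block
  }
};
```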
-------------
PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1628247922