RFR: 8310160: Make GC APIs for handling archive heap objects agnostic of GC policy [v2]

Ioi Lam iklam at openjdk.org
Mon Jul 10 05:39:08 UTC 2023


On Fri, 7 Jul 2023 23:16:00 GMT, Ioi Lam <iklam at openjdk.org> wrote:

> I hope to implement a fast path for relocation that avoids using the hash tables at all. If we can get the total alloc + reloc time to be about 1.5ms, then it would be just as fast as before when relocation is enabled.

I've implemented a fast relocation lookup. It currently uses a table of the same size as the archived heap objects, but I can reduce that to 1/2 the size.
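
As an illustration of how a table-based lookup can avoid hashing, here is a minimal sketch. It is not the code from the linked branch: `RelocTable`, `record`, `lookup` and the `granule` parameter are hypothetical names. It assumes one pointer-sized slot per granule of archived data, indexed by the object's offset in the archive. With an 8-byte granule and 8-byte slots the table is the same size as the archived objects. A 16-byte granule halves it, under the further assumption that no two objects start within the same 16-byte granule.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical sketch: maps an object's offset in the archived heap
// to the address of its newly allocated copy, without any hashing.
class RelocTable {
 public:
  // One slot per `granule` bytes of archived data (assumption, see lead-in).
  explicit RelocTable(size_t archived_bytes, size_t granule = 8)
      : _granule(granule),
        _slots((archived_bytes + granule - 1) / granule) {}

  // Remember where the object at `old_offset` was materialized.
  void record(size_t old_offset, uintptr_t new_addr) {
    assert(old_offset / _granule < _slots.size());
    _slots[old_offset / _granule] = new_addr;
  }

  // Constant-time lookup: a divide and an array load.
  uintptr_t lookup(size_t old_offset) const {
    assert(old_offset / _granule < _slots.size());
    return _slots[old_offset / _granule];
  }

  // Memory used by the table itself.
  size_t table_bytes() const { return _slots.size() * sizeof(uintptr_t); }

 private:
  size_t _granule;
  std::vector<uintptr_t> _slots;
};
```

For example, `RelocTable t(1 << 20, 16)` would cover 1MB of archived objects with a table half the size of `RelocTable t(1 << 20, 8)`.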

See https://github.com/openjdk/jdk/compare/master...iklam:jdk:8310823-materialize-cds-heap-with-regular-alloc?expand=1

This is implemented in about 330 lines of code in archiveHeapLoader.cpp. The code is currently templatized to try out different approaches (like `-XX:+NahlRawAlloc` and `-XX:+NahlUseAccessAPI`), so it can be simplified further.

There's only one thing that's not yet implemented -- the equivalent of `ArchiveHeapLoader::patch_native_pointers()`. I'll do that next.


$ java -XX:+NewArchiveHeapLoading -Xmx128m -Xlog:cds+gc --version
[0.004s][info][cds,gc] Delayed allocation records alloced: 640
[0.004s][info][cds,gc] Load Time: 1388458


The whole allocation + reloc takes about 1.4ms. It's about 1.25ms slower in the worst case (when the "old" code doesn't have to relocate -- see the rows marked `(**)` in the tables below). It's about 0.8ms slower when the "old" code has to relocate.


All times are in ms, for "java --version"

====================================
Dump: java -Xshare:dump -Xmx128m

G1         old        new       diff
 128m   14.476     15.754     +1.277 (**)
8192m   15.359     16.085     +0.726


Serial     old        new       diff
 128m   13.442     14.241     +0.798
8192m   13.740     14.532     +0.791

====================================
Dump: java -Xshare:dump -Xmx8192m

G1         old        new       diff
 128m   14.975     15.787     +0.812
2048m   16.239     17.035     +0.796
8192m   14.821     16.042     +1.221 (**)


Serial     old        new       diff
 128m   13.444     14.167     +0.723
8192m   13.717     14.502     +0.785


While the code is slower than before, it's a lot simpler, and it works on all collectors. I tested on ZGC, and I think Shenandoah should work as well.

The cost is about 1.3 ms per MB of archived heap objects. This may be acceptable as it's a small fraction of JVM bootstrap time. We have about 1MB of archived objects now, and we don't expect this size to increase drastically in the near future.

The extra memory cost is:

- a temporary in-memory copy of the archived heap objects
- a temporary table of 1/2 the size of the archived heap objects

The former can be reduced by reading the archived objects in a stream. The latter can be reduced by a more elaborate relocation algorithm that assumes that most of the allocated objects are in a contiguous block. Such changes may cause further slowdown.
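
To make the contiguous-block idea concrete, here is a rough sketch of one way it could look. This is an assumption for illustration, not a proposed implementation, and `ContiguousReloc` and its members are hypothetical names. Objects that landed at `block_base + old_offset` need no table entry at all. Only the outliers are stored, at the cost of an extra lookup and branch on every relocation.

```cpp
#include <cstddef>
#include <cstdint>
#include <map>

// Hypothetical sketch: relocation that assumes most objects were
// allocated in one contiguous block, in archive order.
class ContiguousReloc {
 public:
  // `block_base` is the new address corresponding to archive offset 0.
  explicit ContiguousReloc(uintptr_t block_base) : _block_base(block_base) {}

  // Store an entry only if the object is NOT where the block predicts.
  void record(size_t old_offset, uintptr_t new_addr) {
    if (new_addr != _block_base + old_offset) {
      _outliers[old_offset] = new_addr;
    }
  }

  // Check the (hopefully small) outlier map first, else compute directly.
  uintptr_t lookup(size_t old_offset) const {
    auto it = _outliers.find(old_offset);
    return it != _outliers.end() ? it->second : _block_base + old_offset;
  }

  // Number of objects that needed an explicit entry.
  size_t outlier_count() const { return _outliers.size(); }

 private:
  uintptr_t _block_base;
  std::map<size_t, uintptr_t> _outliers;  // old offset -> new address
};
```

The memory saved depends entirely on how many objects fall outside the block. The per-lookup map search is where the extra slowdown would come from.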

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14520#issuecomment-1628247922

