JEP 522 performance regression with large pages
Brian S O'Neill
bronee at gmail.com
Thu Aug 28 14:42:14 UTC 2025
I'm experimenting with the changes in PR 23739 (8342382: Implement JEP
522: G1 GC: Improve Throughput by Reducing Synchronization) and I'm
seeing a small performance regression when large pages are configured.
The test is fairly complicated, and most of the memory it uses is
off-heap. The GC heap size is set to 3GB (min and max), which is much
larger than is actually required. A bunch of objects are allocated up
front and remain in the old gen for the duration of the test run.
Between GC cycles, almost all the old gen objects will have been updated
to reference a young object. Each young gen object lives for about 2
microseconds, after which the reference from the old gen object is
cleared.
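To make the access pattern concrete, here is a minimal sketch of that kind of old-to-young reference churn. It is not the actual test, which is more complicated and mostly off-heap; the class name, the Holder type, the 16-byte payload and the batch size are all hypothetical choices made for illustration.

```java
/** Hypothetical sketch of the reference-churn pattern; not the actual test. */
public class OldToYoungChurn {
    /** Long-lived object, expected to be promoted to the old generation. */
    static final class Holder {
        Object ref; // briefly points at a short-lived (young) object
    }

    /** Allocates all holders up front so they stay live for the whole run. */
    static Holder[] allocateHolders(int count) {
        Holder[] holders = new Holder[count];
        for (int i = 0; i < count; i++) {
            holders[i] = new Holder();
        }
        return holders;
    }

    /**
     * One pass over the holders, in batches: each holder is pointed at a
     * freshly allocated object, then the references of that batch are
     * cleared. Returns the number of reference stores performed.
     */
    static long churn(Holder[] holders, int batch) {
        long stores = 0;
        for (int start = 0; start < holders.length; start += batch) {
            int end = Math.min(start + batch, holders.length);
            for (int i = start; i < end; i++) {
                holders[i].ref = new byte[16]; // store of a young reference
                stores++;
            }
            for (int i = start; i < end; i++) {
                holders[i].ref = null;         // clear it again shortly after
                stores++;
            }
        }
        return stores;
    }

    public static void main(String[] args) {
        Holder[] holders = allocateHolders(1_000_000); // assumed count
        long total = 0;
        for (int pass = 0; pass < 100; pass++) {
            total += churn(holders, 64);               // assumed batch size
        }
        System.out.println("reference stores: " + total);
    }
}
```

Setting and clearing in small batches keeps each young object alive only briefly while preventing the two stores to the same field from being trivially merged. Whether this reproduces the timings reported here is untested.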
Here are the baseline results when running with "normal" pages:
ParallelGC: 235.6 seconds
G1GC JEP 522: 238.7 seconds
G1GC: 241.5 seconds
ZGC: 246.2 seconds
With JEP 522, there's a small performance improvement, about 1%, which
is nice to see. Here are the results when running with large pages
(-XX:+UseLargePages -XX:+UseTransparentHugePages, with shmem_enabled
set to advise):
ParallelGC: 228.9 seconds
G1GC: 235.1 seconds
ZGC: 239.3 seconds
G1GC JEP 522: 239.7 seconds
All of the current GCs show a performance improvement when using large
pages, but G1 with JEP 522 does not: it is about 2% slower than the
current version of G1 (JDK 24), and slightly slower than its own result
with normal pages.
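For reference, the relative differences can be computed directly from the timings listed above. The helper below is just illustrative arithmetic (the class and method names are made up); it gives roughly 1.2% faster for JEP 522 with normal pages and roughly 2.0% slower with large pages.

```java
/** Illustrative arithmetic only; the timings are the ones reported above. */
public class RelativeChange {
    /** Percent change of candidate relative to baseline; positive means slower. */
    static double percentChange(double baseline, double candidate) {
        return (candidate - baseline) / baseline * 100.0;
    }

    public static void main(String[] args) {
        // G1GC (baseline) vs. G1GC JEP 522 (candidate), in seconds.
        System.out.printf("normal pages: %+.2f%%%n", percentChange(241.5, 238.7));
        System.out.printf("large pages:  %+.2f%%%n", percentChange(235.1, 239.7));
    }
}
```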
I don't know why there's a performance regression. Is this to be
expected with large pages, or is there a missing configuration
somewhere? I'm not configuring anything other than -Xms, -Xmx, and the
large page settings. Also note that the test is run ten times (without
restarting the JVM) and the average time is reported.
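The measurement procedure amounts to something like the sketch below: run the workload ten times in the same JVM and report the mean wall-clock time. This is an assumed shape, not the actual harness; the class and method names are hypothetical. It also prints the JVM input arguments and the active collectors using the standard java.lang.management API, which is one way to double-check that the intended flags were picked up.

```java
import java.lang.management.GarbageCollectorMXBean;
import java.lang.management.ManagementFactory;
import java.util.List;

/** Hypothetical timing harness; not the one used for the numbers above. */
public class RepeatedTiming {
    /** Runs the workload the given number of times and returns the mean seconds. */
    static double averageSeconds(Runnable workload, int runs) {
        if (runs <= 0) {
            throw new IllegalArgumentException("runs must be positive");
        }
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            workload.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1e9 / runs;
    }

    /** Summarizes the JVM flags and the garbage collectors in use. */
    static String describeJvm() {
        List<String> flags = ManagementFactory.getRuntimeMXBean().getInputArguments();
        StringBuilder sb = new StringBuilder("flags=" + flags + " collectors=");
        for (GarbageCollectorMXBean gc : ManagementFactory.getGarbageCollectorMXBeans()) {
            sb.append(gc.getName()).append(';');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.println(describeJvm());
        // Placeholder workload; the real test would go here.
        Runnable workload = () -> {
            byte[][] garbage = new byte[1024][];
            for (int i = 0; i < garbage.length; i++) {
                garbage[i] = new byte[1024];
            }
        };
        System.out.println("average seconds: " + averageSeconds(workload, 10));
    }
}
```

Because the JVM is not restarted between runs, the average includes the slower warm-up runs as well as the later, fully compiled ones.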