JEP 522 performance regression with large pages

Thu Aug 28 14:42:14 UTC 2025

I'm experimenting with the changes in PR 23739 (8342382: Implement JEP 
522: G1 GC: Improve Throughput by Reducing Synchronization) and I'm 
seeing a small performance regression when large pages are configured.

The test is fairly complicated, and most of the memory it uses is off 
heap. The GC heap size is set to 3GB (min and max), which is much larger 
than is actually required. A bunch of objects are allocated up front and 
remain in the old gen for the duration of the test run. Between GC 
cycles, almost all of the old gen objects will have been updated to 
reference a young object. The young gen objects live for about 2 
microseconds, after which the references from the old gen objects are 
cleared.
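
A minimal sketch of that access pattern, not the actual test (which is 
much more complicated and mostly off heap). The class name, the Holder 
type, and the sizes are hypothetical and chosen only for illustration. 
A JIT compiler may also optimize a loop this simple more aggressively 
than it can the real workload.

```java
public class OldToYoungChurn {
    // Hypothetical long-lived object with a single reference field.
    static final class Holder {
        Object ref;
    }

    // Allocate the holders up front. They are kept reachable for the whole
    // run, so they are expected to be promoted to the old gen over time.
    static Holder[] allocateHolders(int count) {
        Holder[] holders = new Holder[count];
        for (int i = 0; i < count; i++) {
            holders[i] = new Holder();
        }
        return holders;
    }

    // One pass: point each holder at a fresh short-lived object, then clear
    // the reference again. Both stores are reference writes into a
    // long-lived object.
    static long churn(Holder[] holders) {
        long checksum = 0;
        for (Holder h : holders) {
            byte[] young = new byte[16];
            h.ref = young;             // old-to-young reference store
            checksum += young.length;
            h.ref = null;              // reference cleared shortly afterwards
        }
        return checksum;
    }

    public static void main(String[] args) {
        Holder[] holders = allocateHolders(1_000_000);
        long total = 0;
        for (int pass = 0; pass < 10; pass++) {
            total += churn(holders);
        }
        System.out.println("checksum: " + total);
    }
}
```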

Here are the baseline results when running with "normal" pages:

ParallelGC:   235.6 seconds
G1GC JEP 522: 238.7 seconds
G1GC:         241.5 seconds
ZGC:          246.2 seconds

With JEP 522, there's a small performance improvement over current G1, 
about 1%, which is nice to see. Here are the results when running with 
large pages (-XX:+UseLargePages -XX:+UseTransparentHugePages, with 
shmem_enabled set to advise):

ParallelGC:   228.9 seconds
G1GC:         235.1 seconds
ZGC:          239.3 seconds
G1GC JEP 522: 239.7 seconds

The existing GCs all show a performance improvement when using large 
pages, but G1 with JEP 522 does not: it's about 2% slower than the 
current version (JDK 24), and slightly slower than its own result with 
normal pages.

I don't know why there's a performance regression. Is this to be 
expected with large pages, or is there a missing configuration 
somewhere? I'm not configuring anything other than -Xms, -Xmx, and the 
large page settings. Also note that the test is run ten times (without 
restarting the JVM) and the average time is reported.
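
A minimal sketch of how an in-JVM average like this could be measured, 
assuming wall-clock timing with System.nanoTime. RepeatTimer and the 
placeholder task are hypothetical names, not the actual harness.

```java
public class RepeatTimer {
    // Runs the task `runs` times in the same JVM and returns the mean
    // wall-clock time per run, in seconds.
    static double averageSeconds(Runnable task, int runs) {
        long totalNanos = 0;
        for (int i = 0; i < runs; i++) {
            long start = System.nanoTime();
            task.run();
            totalNanos += System.nanoTime() - start;
        }
        return totalNanos / 1e9 / runs;
    }

    public static void main(String[] args) {
        // Placeholder workload standing in for one full test run.
        Runnable placeholder = () -> {
            long sum = 0;
            for (int i = 0; i < 1_000_000; i++) {
                sum += new byte[16].length;
            }
            if (sum == 0) {
                throw new IllegalStateException("unreachable");
            }
        };
        System.out.printf("average: %.3f seconds%n",
                averageSeconds(placeholder, 10));
    }
}
```

Because the JVM isn't restarted between runs, the early runs include 
JIT warm-up, so the average mixes warm-up and steady-state behavior.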