<html><head></head><body><div class="yahoo-style-wrap" style="font-family:Helvetica Neue, Helvetica, Arial, sans-serif;font-size:13px;"><div dir="ltr" data-setdir="false">Hi,</div><div dir="ltr" data-setdir="false"><br></div><div dir="ltr" data-setdir="false">I am interested in using -XX:+AlwaysPreTouch and have been measuring the delay it adds to JVM startup. I am also using Transparent Huge Pages (THP).<br></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span>THP config:</span></div><div dir="ltr" data-setdir="false"><div><div>[cheva-virtualmachine ~]# cat /sys/kernel/mm/transparent_hugepage/shmem_enabled</div><div>always within_size [advise] never deny force</div><div>[cheva-virtualmachine ~]# cat /sys/kernel/mm/transparent_hugepage/enabled </div><div>[always] madvise never</div><div><br></div></div></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span>I am running a VM with 8 cores and I observed that pre-touching is much faster with G1 than with Gen ZGC. 
The main reason is that G1 uses 8 concurrent threads for the job, while Gen ZGC uses only 2 (I ran top -d 1 and watched which threads were busy).</span></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span>I ran this test with a 32G heap, but the production environment runs with 300G, so I expect the gap to be even larger.</span></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span># G1 - Using 8 cores (GC Thread #0 ..#8)</span></div><div dir="ltr" data-setdir="false"><div><div dir="ltr" data-setdir="false">$> time java -Xmx32G -Xms32G -XX:-UseTransparentHugePages -XX:+AlwaysPreTouch -version</div><div>openjdk version "21.0.4" 2024-07-16 LTS</div><div>OpenJDK Runtime Environment Zulu21.36+17-CA (build 21.0.4+7-LTS)</div><div>OpenJDK 64-Bit Server VM Zulu21.36+17-CA (build 21.0.4+7-LTS, mixed mode, sharing)</div><div>java -Xmx32G -Xms32G -XX:-UseTransparentHugePages -XX:+AlwaysPreTouch -versio 0,44s user 12,28s system 688% cpu 1,848 total</div><div><br></div></div></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span># Gen ZGC - Using 1 thread, then at some point switching to 2 threads (ZGCWorker#0 and #1)</span></div><div dir="ltr" data-setdir="false">$> time java -Xmx34G -Xms34G -XX:+UseZGC -XX:+ZGenerational -XX:-UseTransparentHugePages -XX:+AlwaysPreTouch -version<br></div><div dir="ltr" data-setdir="false"><div>openjdk version "21.0.4" 2024-07-16 LTS</div><div>OpenJDK Runtime Environment Zulu21.36+17-CA (build 21.0.4+7-LTS)</div><div>OpenJDK 64-Bit Server VM Zulu21.36+17-CA (build 21.0.4+7-LTS, mixed mode, sharing)</div><div>java -Xmx34G -Xms34G -XX:+UseZGC -XX:+ZGenerational -XX:+AlwaysPreTouch 1,08s user 11,92s system 136% cpu 9,530 total</div><div><br></div><div><br></div><div dir="ltr" 
data-setdir="false">Non-generational ZGC is even slower.</div></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span>In this case, Gen ZGC is about 5 times slower than G1 (9.5 s vs 1.8 s wall-clock), and it is NOT using all available cores to do the job.</span></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span>Is this expected behaviour? Could it be optimized, or is there a reason to avoid using more threads?</span></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span>Thanks in advance,</span></div><div dir="ltr" data-setdir="false"><span><br></span></div><div dir="ltr" data-setdir="false"><span>Evaristo</span></div><div dir="ltr" data-setdir="false"><span><br></span></div></div></body></html>