Using JFR together with ZGC degrades application throughput

Fabrice Bibonne fabrice.bibonne at courriel.eco
Tue Jan 13 04:36:58 UTC 2026


Thank you for your advice; I will just add a few clarifications below:

* for `-XX:+UseCompressedOops`, I must admit I did not know this option:
I added it because JDK Mission Control warned me about it in "Automated
analysis result" after a first try (<<Compressed Oops is turned off
[...]. Use the JVM argument '-XX:+UseCompressedOops' to enable this
feature>>). See the quick check after this list.

* it is true that the application wastes time in GC pauses (46.6% of
the time with G1): I wanted an example app which uses the GC a lot.
Maybe this is a little too much compared to real apps (even if, for
some of them, we may wonder...).

* the stack I showed about finding a new empty page/region for
allocation is present in both cases (with JFR and without JFR). But in
the case with JFR, it is much wider: it accounts for many more samples.
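
If I understand the flag handling correctly, one can check what the JVM
actually does with `-XX:+UseCompressedOops` by running (just a quick
check, not part of my measurements):

    java -XX:+UseZGC -XX:+UseCompressedOops -XX:+PrintFlagsFinal -version | grep UseCompressedOops

With ZGC this should report `UseCompressedOops = false` whatever the
command line says, which matches your first point below.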

Best regards,

Fabrice

On 2026-01-12 14:18, Thomas Schatzl wrote:

> Hi,
> 
> While I am not able to answer the question about why using JFR takes 
> so much additional time, reading about your benchmark setup the 
> following things came to my mind:
> 
> * -XX:+UseCompressedOops for ZGC does nothing (ZGC does not support 
> compressed oops at all), and G1 will automatically use it. You can 
> leave it off.
> 
> * G1 having a significantly worse throughput than ZGC is very rare: 
> even so, the extent you show is quite large. Taking some of the 
> content together (4g heap, Maps, huge string variables) indicates that 
> you might have run into a well-known pathology of G1 with large 
> objects: the application might waste up to 50% of your heap due to 
> these humongous objects [0].
> G1 might work better in JDK 26 too, as an enhancement for one 
> particular case has been added. More is being worked on.
> 
> TL;DR: Your application might run much better with a large(r) 
> G1HeapRegionSize setting. Or just upgrading to JDK 26.
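> 
> For example (just a sketch; 32m is an arbitrary illustrative value, 
> the best one depends on the size of the large objects):
> 
>     java -Xmx4g -XX:+UseG1GC -XX:G1HeapRegionSize=32m \
>         -classpath target/classes poc.java.perf.write.TestPerf
> 
> Objects larger than half a region are allocated as humongous objects 
> by G1, so a larger region size lets fewer allocations take that path.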
> 
> * While ZGC does not have that (in some cases extreme) memory wastage 
> for large allocations, there is still some. Adding JFR might just push 
> it over the edge (the stacks you showed are about finding a new empty 
> page/region for allocation, failing to do so, doing a GC, stalling and 
> waiting).
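> 
> These stalls also show up in the GC log: adding e.g. -Xlog:gc to the 
> command line should print "Allocation Stall" lines when threads block 
> in ZPageAllocator::alloc_page_stall:
> 
>     java -Xmx4g -XX:+UseZGC -Xlog:gc \
>         -classpath target/classes poc.java.perf.write.TestPerf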
> 
> Hth,
> Thomas
> 
> [0] https://tschatzl.github.io/2021/11/15/heap-regions-x-large.html
> 
> On 11.01.26 19:23, Fabrice Bibonne wrote:
> 
>> Hi all,
>> 
>> I would like to report a case where starting JFR for an application 
>> running with ZGC causes a significant throughput degradation (compared 
>> to when JFR is not started).
>> 
>> My context: I was writing a little web app to illustrate a case where 
>> the use of ZGC gives a better throughput than G1. I benchmarked with 
>> Grafana k6 my application running with G1 and my application running 
>> with ZGC: the runs with ZGC gave better throughput. I wanted to go a 
>> bit further in the explanation, so I ran my benchmarks again with JFR 
>> to be able to illustrate the GC gains in JMC. When I ran my web app 
>> with ZGC+JFR, I noticed a significant throughput degradation in my 
>> benchmark (which was not the case with G1+JFR).
>> 
>> Although I did not measure the overhead as such, I still wanted to 
>> report this issue because the degradation in throughput with JFR is 
>> such that it would not be usable as is on a production service.
>> 
>> I wrote a little application (not a web one) to reproduce the 
>> problem: the application calls a little conversion service 200 times 
>> in parallel with random numbers (to behave like a web app under load 
>> and to put pressure on the GC). The conversion service (a method named 
>> `convertNumberToWords`) converts the number to a String by looking up 
>> the String in a Map with the number as the key. In order to 
>> instantiate and destroy many objects at each call, the map is built by 
>> parsing a huge String at each call. The application ends after 200 
>> calls.
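>> 
>> In rough outline, the code looks like this (the real code is in the 
>> repository linked below; the names apart from `convertNumberToWords` 
>> and `TestPerf` are simplified, and the real app loads the huge String 
>> from a 36 MB resource rather than generating it):
>> 
>>     import java.util.HashMap;
>>     import java.util.Map;
>>     import java.util.concurrent.ThreadLocalRandom;
>>     import java.util.stream.IntStream;
>> 
>>     public class TestPerf {
>> 
>>         // stand-in for the huge String built from numbers200k.zip
>>         private static final String HUGE_STRING = buildHugeString();
>> 
>>         public static void main(String[] args) {
>>             // 200 parallel calls with random numbers, like a web app under load
>>             IntStream.range(0, 200).parallel().forEach(i ->
>>                 convertNumberToWords(ThreadLocalRandom.current().nextInt(200_000)));
>>         }
>> 
>>         static String convertNumberToWords(int number) {
>>             // the map is rebuilt from the huge String at each call, so many
>>             // short-lived objects are created and discarded (GC pressure)
>>             Map<Integer, String> map = new HashMap<>();
>>             for (String line : HUGE_STRING.split("\n")) {
>>                 int sep = line.indexOf(' ');
>>                 map.put(Integer.valueOf(line.substring(0, sep)),
>>                         line.substring(sep + 1));
>>             }
>>             return map.get(number);
>>         }
>> 
>>         static String buildHugeString() {
>>             StringBuilder sb = new StringBuilder();
>>             for (int i = 0; i < 200_000; i++) {
>>                 sb.append(i).append(' ').append("number-").append(i).append('\n');
>>             }
>>             return sb.toString();
>>         }
>>     }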
>> 
>> Here are the steps to reproduce:
>> 1. Clone https://framagit.org/FBibonne/poc-java/-/tree/jfr+zgc_impact 
>> (make sure to be on the branch jfr+zgc_impact)
>> 2. Compile it (you must include numbers200k.zip in the resources: it 
>> contains a 36 MB text file whose contents are used to create the huge 
>> String variable)
>> 3. In the root of the repository:
>> 3a. Run `time java -Xmx4g -XX:+UseZGC -XX:+UseCompressedOops 
>> -classpath target/classes poc.java.perf.write.TestPerf #ZGC without 
>> JFR`
>> 3b. Run `time java -Xmx4g -XX:+UseZGC -XX:+UseCompressedOops 
>> -XX:StartFlightRecording -classpath target/classes 
>> poc.java.perf.write.TestPerf #ZGC with JFR`
>> 4. The real time of the second run (with JFR) will be considerably 
>> higher than that of the first
>> 
>> I ran these tests on my laptop:
>> - Dell Inc. Latitude 5591
>> - openSUSE Tumbleweed 20260108
>> - Kernel: 6.18.3-1-default (64-bit)
>> - 12 × Intel(R) Core(TM) i7-8850H CPU @ 2.60GHz
>> - RAM: 16 GiB
>> - openjdk version "25.0.1" 2025-10-21
>> - OpenJDK Runtime Environment (build 25.0.1+8-27)
>> - OpenJDK 64-Bit Server VM (build 25.0.1+8-27, mixed mode, sharing)
>> - many tabs open in Firefox!
>> 
>> I also ran it in a container (eclipse-temurin:25) on my laptop and 
>> on a Windows laptop and came to the same conclusions. Here are the 
>> measurements from the container:
>> 
>> | Run with  | Real time (s) |
>> |-----------|---------------|
>> | ZGC alone | 7.473         |
>> | ZGC + JFR | 25.075        |
>> | G1 alone  | 10.195        |
>> | G1 + JFR  | 10.450        |
>> 
>> After all these tests I tried to run the app with another profiling 
>> tool in order to understand where the issue comes from. I attach the 
>> flamegraph from the JFR+ZGC run: for the worker threads of the 
>> ForkJoinPool used by the parallel Stream, the stack traces of a 
>> majority of samples share the same top frames:
>> - PosixSemaphore::wait
>> - ZPageAllocator::alloc_page_stall
>> - ZPageAllocator::alloc_page_inner
>> - ZPageAllocator::alloc_page
>> 
>> So many threads seem to spend their time waiting in 
>> ZPageAllocator::alloc_page_stall when JFR is on. The JFR periodic 
>> tasks thread also has a few samples where it waits in 
>> ZPageAllocator::alloc_page_stall. I hope this will help you find 
>> the issue.
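>> 
>> In case it helps to narrow things down, I suppose one could repeat 
>> the run with a recording that has no events enabled (assuming I read 
>> the documentation of the `settings=none` option correctly), to 
>> separate the cost of the recording machinery from the cost of a 
>> specific event:
>> 
>>     time java -Xmx4g -XX:+UseZGC -XX:StartFlightRecording:settings=none \
>>         -classpath target/classes poc.java.perf.write.TestPerf
>> 
>> If the slowdown disappears with no events enabled, some specific 
>> event is responsible rather than the recording itself.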
>> 
>> Thank you very much for reading this email to the end. I hope this 
>> is the right place for such feedback. Let me know if I should report 
>> my problem elsewhere. Feel free to ask me more questions if needed.
>> 
>> Thank you all for this amazing tool!



