<html><head><meta http-equiv="Content-Type" content="text/html; charset=UTF-8" /></head><body style='font-size: 10pt; font-family: Verdana,Geneva,sans-serif'>
<p>Here is a single source file for the reproducer (the big String is now generated at startup, as you suggested). This changes the results slightly, but the run with ZGC + JFR still takes a long time.</p>
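<p>For readers who cannot open the attachment, the shape of the reproducer can be sketched roughly as follows. This is only an illustrative sketch, not the attached file: the class name `TestPerfSketch`, the `number=word` line format and the 200,000-entry size are assumptions. Only `convertNumberToWords`, the 200 parallel calls and the rebuilding of the map on every call come from the description in the quoted message.</p>
<pre>
```java
import java.util.Objects;
import java.util.concurrent.ThreadLocalRandom;
import java.util.stream.Collectors;
import java.util.stream.IntStream;

public class TestPerfSketch {
    // Assumed size and line format; the data in the real reproducer differs.
    static final int ENTRIES = 200_000;

    // Big String generated once at startup instead of being read from a resource file.
    static final String NUMBERS = IntStream.range(0, ENTRIES)
            .mapToObj(i -> i + "=word" + i)
            .collect(Collectors.joining("\n"));

    // Rebuilds the whole map on every call, on purpose, to create allocation pressure.
    static String convertNumberToWords(int number) {
        var wordsByNumber = NUMBERS.lines()
                .map(line -> line.split("=", 2))
                .collect(Collectors.toMap(parts -> Integer.parseInt(parts[0]), parts -> parts[1]));
        return wordsByNumber.get(number);
    }

    public static void main(String[] args) {
        // 200 calls with random numbers, run in parallel on the common ForkJoinPool.
        long done = IntStream.range(0, 200)
                .parallel()
                .mapToObj(i -> convertNumberToWords(ThreadLocalRandom.current().nextInt(ENTRIES)))
                .filter(Objects::nonNull)
                .count();
        System.out.println(done + " conversions done");
    }
}
```
</pre>
<p>Run this way, every call allocates and discards a map of 200,000 entries, which is the kind of allocation pressure the reproducer relies on.</p>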
<div id="signature"></div>
<p>Thank you for having a look.</p>
<p>Fabrice</p>
<p><br /></p>
<p><br /></p>
<p id="reply-intro">On 2026-01-12 10:56, Erik Gahlin wrote:</p>
<blockquote type="cite" style="padding: 0 0.4em; border-left: #1010ff 2px solid; margin: 0">
<div class="pre" style="margin: 0; padding: 0; font-family: monospace">Hi Fabrice,<br /><br />Thanks for reporting!<br /><br />Could you post the source code for the reproducer here? The 36 MB file could probably be replaced with a String::repeat expression.<br /><br />JFR does use some memory, which could impact available heap and performance, although the degradation you’re seeing seems awfully high.<br /><br />Thanks<br />Erik<br /><br />________________________________________<br />From: hotspot-jfr-dev &lt;<a href="mailto:hotspot-jfr-dev-retn@openjdk.org">hotspot-jfr-dev-retn@openjdk.org</a>&gt; on behalf of Fabrice Bibonne &lt;<a href="mailto:fabrice.bibonne@courriel.eco">fabrice.bibonne@courriel.eco</a>&gt;<br />Sent: Sunday, January 11, 2026 7:23 PM<br />To: <a href="mailto:hotspot-jfr-dev@openjdk.org">hotspot-jfr-dev@openjdk.org</a><br />Subject: Using JFR both with ZGC degrades application throughput<br /><br />Hi all,<br /><br /> I would like to report a case where starting JFR for an application running with ZGC causes a significant throughput degradation (compared to when JFR is not started).<br /><br /> My context: I was writing a small web app to illustrate a case where ZGC gives better throughput than G1. I benchmarked my application with Grafana k6, first running with G1 and then with ZGC: the runs with ZGC gave better throughput. I wanted to take the explanation a bit further, so I reran my benchmarks with JFR to be able to illustrate the GC gains in JMC. 
When I ran my web app with ZGC+JFR, I noticed a significant throughput degradation in my benchmark (which was not the case with G1+JFR).<br /><br /> Although I did not measure an increase in overhead as such, I still wanted to report this issue because the degradation in throughput with JFR is such that it would not be usable as is on a production service.<br /><br />I wrote a small application (not a web one) to reproduce the problem: the application calls a small conversion service 200 times with random numbers in parallel (to mimic a web app under load and to put pressure on the GC). The conversion service (a method named `convertNumberToWords`) converts the number to a String by looking it up in a Map keyed by the number. In order to create and discard many objects on each call, the map is rebuilt by parsing a huge String on each call. The application ends after 200 calls.<br /><br />Here are the steps to reproduce:<br />1. Clone <a href="https://framagit.org/FBibonne/poc-java/-/tree/jfr+zgc_impact" target="_blank" rel="noopener noreferrer">https://framagit.org/FBibonne/poc-java/-/tree/jfr+zgc_impact</a> (make sure you are on branch jfr+zgc_impact)<br />2. Compile it (you must include numbers200k.zip in resources: it contains a 36 MB text file whose contents are used to create the huge String variable)<br />3. In the root of the repository:<br />3a. Run `time java -Xmx4g -XX:+UseZGC -XX:+UseCompressedOops -classpath target/classes poc.java.perf.write.TestPerf #ZGC without JFR`<br />3b. Run `time java -Xmx4g -XX:+UseZGC -XX:+UseCompressedOops -XX:StartFlightRecording -classpath target/classes poc.java.perf.write.TestPerf #ZGC with JFR`<br />4. The real time of the second run (with JFR) will be considerably higher than that of the first<br /><br />I ran these tests on my laptop:<br />- Dell Inc. 
Latitude 5591<br />- openSUSE Tumbleweed 20260108<br />- Kernel : 6.18.3-1-default (64-bit)<br />- 12 × Intel® Core™ i7-8850H CPU @ 2.60GHz<br />- RAM 16 GiB<br />- openjdk version "25.0.1" 2025-10-21<br />- OpenJDK Runtime Environment (build 25.0.1+8-27)<br />- OpenJDK 64-Bit Server VM (build 25.0.1+8-27, mixed mode, sharing)<br />- many tabs open in Firefox!<br /><br />I also ran it in a container (eclipse-temurin:25) on my laptop and on a Windows laptop, and came to the same conclusions. Here are the measurements from the container:<br /><br />| Run with | Real time (s) |<br />|-----------|---------------|<br />| ZGC alone | 7.473 |<br />| ZGC + JFR | 25.075 |<br />| G1 alone | 10.195 |<br />| G1 + JFR | 10.450 |<br /><br />After all these tests, I ran the app with another profiling tool to understand where the issue is. I attach the flame graph from a JFR+ZGC run: for the worker threads of the Stream ForkJoinPool, the stack traces of most samples share the same top frames:<br />- PosixSemaphore::wait<br />- ZPageAllocator::alloc_page_stall<br />- ZPageAllocator::alloc_page_inner<br />- ZPageAllocator::alloc_page<br /><br />So many threads seem to spend their time waiting in ZPageAllocator::alloc_page_stall when JFR is on. The JFR periodic tasks thread also has a few samples where it waits in ZPageAllocator::alloc_page_stall. I hope this will help you find the issue.<br /><br />Thank you very much for reading this email to the end. I hope this is the right place for such feedback. Let me know if I should report my problem elsewhere. Feel free to ask me more questions if needed.<br /><br />Thank you all for this amazing tool!</div>
</blockquote>
</body></html>