G1 concurrent refinement regression

Brian S O'Neill bronee at gmail.com
Mon Sep 29 15:27:56 UTC 2025


Here's the timings under ideal conditions (no logging overhead, priority 
boost):

Pre JEP 522 no conc refinement:    499 seconds
Pre JEP 522 with conc refinement:  492 seconds
Post JEP 522 no conc refinement:   488 seconds
Post JEP 522 with conc refinement: 518 seconds

In my case it seems that with JEP 522, concurrent refinement should be 
disabled to see any benefit.


On 2025-09-29 02:14 AM, Thomas Schatzl wrote:
> Hi again,
> 
>    here is a more comprehensive answer using some example application:
> 
> On 28.09.25 18:19, Brian S O'Neill wrote:
>> I'm observing an interesting regression between build 26-ea+16-1649 
>> and build 26-ea+17-1764 with respect to concurrent refinement. I 
>> suspect the cause is JEP 522.
>>
>> I ran a stress test in which GC pauses are about 25ms in the steady 
>> state. With 26-ea+16-1649, concurrent refinement is never initiated.
>>
>> With 26-ea+17-1764 (JEP 522) concurrent refinement starts after the 
>> test has run for a few minutes. Eventually, concurrent refinement runs 
>> back to back forever, wasting a CPU core and affecting overall 
>> performance.
> 
> These concurrent refinement log messages were added with JEP 522. 
> Previously there were no refinement logs at info level; just because 
> there were no logs, it does not mean that no refinement happens.
> 
> Typically, JEP 522 reduces refinement work by a very significant amount, 
> making your application (typically faster).
> 
> Let me show you on an example: using specjbb2015 with a fixed load as a 
> test application (because I do not know yours), using 8 parallel gc 
> threads and a 50ms pause time goal (to induce some more refinement 
> activity).
> 
> I used a pre-JEP 522 build (just the one before, 
> ca182912a305e1e226d97d9613c7baf8a3d22780), and latest tip 
> (75269fdb49aeb9d37acbbc1502c446a822fd30e3). Further, I will show results 
> with refinement disabled (using -XX:-G1UseConcRefinement).
> 
> Pause time stats:
> 
> pre-522:   Total: 16,062s, Avg: 50,32ms
> post-522:  Total: 17,245s, Avg: 48,31ms
> no-refine: Total: 24,940s, Avg: 50,80ms
> 
> Post-522 spends more time in GC (this may be due to heuristics changes 
> since pushing it), the no-refinement build spends much more time due to 
> more work.
> 
> Additionally I obtained CPU usage per thread type periodically every 5 
> seconds (using "jcmd <pid> PerfCounter.print"). Let's discuss the 
> results for the 50th occurence (~250s into the run):
> 
> CPU time values are in s (I just added a comma - originally they were in 
> ns), first is cpu time for concurrent mark, then concurrent refinement, 
> then the parallel workers (~gc pause cpu time) and then the service thread.
> 
> pre-jep522:
> 
> sun.threads.cpu_time.gc_conc_mark=29,602801037
> sun.threads.cpu_time.gc_conc_refine=85,135331304
> sun.threads.cpu_time.gc_parallel_workers=130,768855722
> sun.threads.cpu_time.gc_service=0,059843947
> 
> post-jep522:
> 
> sun.threads.cpu_time.gc_conc_mark=29,524970498
> sun.threads.cpu_time.gc_conc_refine=53,489763400
> sun.threads.cpu_time.gc_conc_refine_control=0,777702253
> sun.threads.cpu_time.gc_parallel_workers=139,873904419
> sun.threads.cpu_time.gc_service=0,256793143
> 
> The first observation is that, the post-jep522 build split refinement 
> cpu time into "conc_refine" and "conc_refine_control" threads, but 
> that's an implementation detail - just add them together for comparison.
> 
> CPU spent in concurrent marking is the same, as expected. This is a 
> fixed load test after all.
> 
> There is a significant difference in time spent in refinement between 
> these builds. Pre-JEP 522 around 85s are spent there, around 54.5s post- 
> JEP 522.
> 
> So actually post-JEP 522 builds spend ~36% _less_ time in concurrent 
> refinement.
> 
> Post-JEP 522 spends around 7% more in garbage collection.
> 
> However, summing up all components, post-JEP 522 spends around 9.7% less 
> time in GC related activities. So it seems a win.
> 
> no-refine (post-jep522 build):
> 
> sun.threads.cpu_time.gc_conc_mark=29,373202848
> sun.threads.cpu_time.gc_conc_refine=0
> sun.threads.cpu_time.gc_conc_refine_control=0
> sun.threads.cpu_time.gc_parallel_workers=152,750078728
> sun.threads.cpu_time.gc_service=0,237132193
> 
> When disabling concurrent refinement as -XX:-G1UseConcRefinement does, 
> more work is moved into the pauses. From a throughput POV it is a win, 
> spending only ~182s in total in GC threads, vs. 223s (post-JEP 522) and 
> 245s (pre-JEP 522).
> 
> G1 then spends more time in pauses though, potentially affecting 
> latencies more.
> 
>>
>> [244.479s][info][gc,start    ] GC(60) Pause Young (Normal) (G1 
>> Evacuation Pause)
>  > [...]> [403.761s][info][gc,refine      ] Concurrent Refine Complete 
> Work 0.022ms
>> [403.761s][info][gc,refine      ] Concurrent Refine Cycle 160.341ms
>>
>> The problem goes away when I set -XX:MaxGCPauseMillis=500 or higher. 
>> Considering that the pauses are always well below the 200ms default, 
>> I'm surprised that concurrent refinement occurs at all.
> 
> As already mentioned in the other email, by default G1 allocates 10% of 
> pause time for refinement in the pause. The rest is (attempted to be) 
> moved out to the concurrent phase. By increasing the maximum pause time 
> you increase that allowance.
> 
> Concurrent work adds some inefficiencies depending on your application, 
> but typically reduces time spent in pause where the application comes to 
> a complete halt.
> 
> G1 tries to provide a balance between pauses and concurrent work even 
> when nominally it is allowed to consume 200ms per pause; the (fixed) 
> value of -XX:G1RSetUpdatingPauseTimePercent is also likely not optimal.
> 
> Still nowadays people do not really want a 200ms pause even if they 
> implicitly allow it (by not specifying any pause time goal), so by 
> default some concurrent refinement is enabled and often actually performed.
> 
> (Fwiw, by default G1 is allowed a pause time goal of 200ms within any 
> interval of 201ms - it is unlikely that people really want that).
> 
> It would also be nice if you verified that in your test application G1 
> really does not do any refinement. If it really does not, it would be 
> fairly surprising, and we would be interested in investigated this more.
> 
> The logging I asked for (gc+refine=debug and gc+phases=debug) will help 
> us to analyze what is happening in your case.
> 
> (If you enabled more detailed refinement logging in the pre-JEP 522 
> build, I think one could also see refinement activity, i.e. use 
> gc+refine=debug).
> 
> Hth,
>    Thomas



More information about the hotspot-gc-dev mailing list