G1 concurrent refinement regression
Brian S O'Neill
bronee at gmail.com
Mon Sep 29 15:27:56 UTC 2025
Here's the timings under ideal conditions (no logging overhead, priority
boost):
Pre JEP 522 no conc refinement: 499 seconds
Pre JEP 522 with conc refinement: 492 seconds
Post JEP 522 no conc refinement: 488 seconds
Post JEP 522 with conc refinement: 518 seconds
In my case it seems that with JEP 522, concurrent refinement should be
disabled to see any benefit.
On 2025-09-29 02:14 AM, Thomas Schatzl wrote:
> Hi again,
>
> here is a more comprehensive answer using some example application:
>
> On 28.09.25 18:19, Brian S O'Neill wrote:
>> I'm observing an interesting regression between build 26-ea+16-1649
>> and build 26-ea+17-1764 with respect to concurrent refinement. I
>> suspect the cause is JEP 522.
>>
>> I ran a stress test in which GC pauses are about 25ms in the steady
>> state. With 26-ea+16-1649, concurrent refinement is never initiated.
>>
>> With 26-ea+17-1764 (JEP 522) concurrent refinement starts after the
>> test has run for a few minutes. Eventually, concurrent refinement runs
>> back to back forever, wasting a CPU core and affecting overall
>> performance.
>
> These concurrent refinement log messages were added with JEP 522.
> Previously there were no refinement logs at info level; just because
> there were no logs, it does not mean that no refinement happens.
>
> Typically, JEP 522 reduces refinement work by a very significant amount,
> making your application (typically faster).
>
> Let me show you on an example: using specjbb2015 with a fixed load as a
> test application (because I do not know yours), using 8 parallel gc
> threads and a 50ms pause time goal (to induce some more refinement
> activity).
>
> I used a pre-JEP 522 build (just the one before,
> ca182912a305e1e226d97d9613c7baf8a3d22780), and latest tip
> (75269fdb49aeb9d37acbbc1502c446a822fd30e3). Further, I will show results
> with refinement disabled (using -XX:-G1UseConcRefinement).
>
> Pause time stats:
>
> pre-522: Total: 16,062s, Avg: 50,32ms
> post-522: Total: 17,245s, Avg: 48,31ms
> no-refine: Total: 24,940s, Avg: 50,80ms
>
> Post-522 spends more time in GC (this may be due to heuristics changes
> since pushing it), the no-refinement build spends much more time due to
> more work.
>
> Additionally I obtained CPU usage per thread type periodically every 5
> seconds (using "jcmd <pid> PerfCounter.print"). Let's discuss the
> results for the 50th occurence (~250s into the run):
>
> CPU time values are in s (I just added a comma - originally they were in
> ns), first is cpu time for concurrent mark, then concurrent refinement,
> then the parallel workers (~gc pause cpu time) and then the service thread.
>
> pre-jep522:
>
> sun.threads.cpu_time.gc_conc_mark=29,602801037
> sun.threads.cpu_time.gc_conc_refine=85,135331304
> sun.threads.cpu_time.gc_parallel_workers=130,768855722
> sun.threads.cpu_time.gc_service=0,059843947
>
> post-jep522:
>
> sun.threads.cpu_time.gc_conc_mark=29,524970498
> sun.threads.cpu_time.gc_conc_refine=53,489763400
> sun.threads.cpu_time.gc_conc_refine_control=0,777702253
> sun.threads.cpu_time.gc_parallel_workers=139,873904419
> sun.threads.cpu_time.gc_service=0,256793143
>
> The first observation is that, the post-jep522 build split refinement
> cpu time into "conc_refine" and "conc_refine_control" threads, but
> that's an implementation detail - just add them together for comparison.
>
> CPU spent in concurrent marking is the same, as expected. This is a
> fixed load test after all.
>
> There is a significant difference in time spent in refinement between
> these builds. Pre-JEP 522 around 85s are spent there, around 54.5s post-
> JEP 522.
>
> So actually post-JEP 522 builds spend ~36% _less_ time in concurrent
> refinement.
>
> Post-JEP 522 spends around 7% more in garbage collection.
>
> However, summing up all components, post-JEP 522 spends around 9.7% less
> time in GC related activities. So it seems a win.
>
> no-refine (post-jep522 build):
>
> sun.threads.cpu_time.gc_conc_mark=29,373202848
> sun.threads.cpu_time.gc_conc_refine=0
> sun.threads.cpu_time.gc_conc_refine_control=0
> sun.threads.cpu_time.gc_parallel_workers=152,750078728
> sun.threads.cpu_time.gc_service=0,237132193
>
> When disabling concurrent refinement as -XX:-G1UseConcRefinement does,
> more work is moved into the pauses. From a throughput POV it is a win,
> spending only ~182s in total in GC threads, vs. 223s (post-JEP 522) and
> 245s (pre-JEP 522).
>
> G1 then spends more time in pauses though, potentially affecting
> latencies more.
>
>>
>> [244.479s][info][gc,start ] GC(60) Pause Young (Normal) (G1
>> Evacuation Pause)
> > [...]> [403.761s][info][gc,refine ] Concurrent Refine Complete
> Work 0.022ms
>> [403.761s][info][gc,refine ] Concurrent Refine Cycle 160.341ms
>>
>> The problem goes away when I set -XX:MaxGCPauseMillis=500 or higher.
>> Considering that the pauses are always well below the 200ms default,
>> I'm surprised that concurrent refinement occurs at all.
>
> As already mentioned in the other email, by default G1 allocates 10% of
> pause time for refinement in the pause. The rest is (attempted to be)
> moved out to the concurrent phase. By increasing the maximum pause time
> you increase that allowance.
>
> Concurrent work adds some inefficiencies depending on your application,
> but typically reduces time spent in pause where the application comes to
> a complete halt.
>
> G1 tries to provide a balance between pauses and concurrent work even
> when nominally it is allowed to consume 200ms per pause; the (fixed)
> value of -XX:G1RSetUpdatingPauseTimePercent is also likely not optimal.
>
> Still nowadays people do not really want a 200ms pause even if they
> implicitly allow it (by not specifying any pause time goal), so by
> default some concurrent refinement is enabled and often actually performed.
>
> (Fwiw, by default G1 is allowed a pause time goal of 200ms within any
> interval of 201ms - it is unlikely that people really want that).
>
> It would also be nice if you verified that in your test application G1
> really does not do any refinement. If it really does not, it would be
> fairly surprising, and we would be interested in investigated this more.
>
> The logging I asked for (gc+refine=debug and gc+phases=debug) will help
> us to analyze what is happening in your case.
>
> (If you enabled more detailed refinement logging in the pre-JEP 522
> build, I think one could also see refinement activity, i.e. use
> gc+refine=debug).
>
> Hth,
> Thomas
More information about the hotspot-gc-dev
mailing list