RFR: 8359348: G1: Improve cpu usage measurements for heap sizing [v3]
Man Cao
manc at openjdk.org
Wed Jul 23 07:18:57 UTC 2025
On Thu, 17 Jul 2025 13:16:29 GMT, Ivan Walulya <iwalulya at openjdk.org> wrote:
>> Hi,
>>
>> Please review this patch which takes into account the impact of concurrent GC activity on mutator threads when computing the time spent on GC activity in a time interval. Previously, only the GC pause times were considered, and the overhead imposed by concurrent GC worker threads was not included.
>>
>> With this change, we now estimate the impact of concurrent GC by dividing `elapsed_gc_cpu_time` by the number of CPUs. This provides an approximation of the additional time attributable to GC activity, assuming a fair CPU resource sharing. Although this method does not account for contention on other shared resources (such as memory bandwidth or caches), it offers a reasonable estimate for most scenarios.
>>
>> Testing: Tier 1
>
> Ivan Walulya has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains nine commits:
>
> - cleanup after merge
> - Merge remote-tracking branch 'upstream/master' into ConcCPUImpact
> - Merge branch 'NormalizeTiming' into ConcCPUImpact
> - Thomas suggestion
> - Thomas Review
> - reorder
> - concurrent gc impact
> - clean out
> - init
src/hotspot/share/gc/g1/g1Analytics.cpp line 173:
> 171: // activity. We do not account for contention on other shared resources such as memory bandwidth and
> 172: // caches, therefore underestimate the impact of the concurrent GC activity on mutator threads.
> 173: uint num_cpus = (uint)os::active_processor_count();
It is a good idea to convert CPU time to wall-clock time "lost" by mutator threads. However, it could be challenging in some environments. E.g. a background controller process that periodically runs `taskset` to change CPU affinity mask for the Java process, leading to `os::active_processor_count()` returning variable values.
Another scenario is that the Java process runs in a CPU-constraint container, but for some reasons has to run with `-XX:-UseContainerSupport`. Then all of `os::active_processor_count()`, `ConcGCThreads`, `ParallelGCThreads` could be much higher than the max # of available CPU cores.
Our internal infra has both issues above, and I don't have a good idea to mitigate them in G1. I think the current approach is acceptable: due to the latter `-XX:-UseContainerSupport` issue, it will mostly lead to under-counting concurrent GC CPU time, which is not a show stopper.
src/hotspot/share/gc/g1/g1Analytics.cpp line 174:
> 172: // caches, therefore underestimate the impact of the concurrent GC activity on mutator threads.
> 173: uint num_cpus = (uint)os::active_processor_count();
> 174: num_cpus = MIN2(num_cpus, MAX2(ConcGCThreads, ParallelGCThreads));
A minor issue is that `UseDynamicNumberOfGCThreads` may not activate all `ConcGCThreads`. I think it is acceptable the wall-clock time "lost" by mutator threads is a best-effort approximation anyway.
src/hotspot/share/gc/g1/g1Policy.cpp line 678:
> 676: double cur_gc_cpu_time_ms = _g1h->elapsed_gc_cpu_time() * MILLIUNITS;
> 677:
> 678: double concurrent_gc_cpu_time_ms = cur_gc_cpu_time_ms - prev_gc_cpu_pause_end_ms;
`CollectedHeap::elapsed_gc_cpu_time()` includes `_vmthread_cpu_time` and `string_dedup_cpu_time`. Should we exclude VM threads and StringDedup threads' CPU time from `concurrent_gc_cpu_time_ms`? Otherwise non-GC activities could affect G1's heap sizing decisions.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/26351#discussion_r2224597028
PR Review Comment: https://git.openjdk.org/jdk/pull/26351#discussion_r2224629012
PR Review Comment: https://git.openjdk.org/jdk/pull/26351#discussion_r2224541165
More information about the hotspot-gc-dev
mailing list