RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v8]
Volker Simonis
simonis at openjdk.org
Mon Sep 11 15:14:50 UTC 2023
On Sat, 9 Sep 2023 00:39:38 GMT, Man Cao <manc at openjdk.org> wrote:
>> Jonathan Joo has updated the pull request incrementally with one additional commit since the last revision:
>>
>> Fix includes
>
> src/hotspot/share/gc/g1/g1ConcurrentMark.cpp line 461:
>
>> 459:
>> 460: _g1_concurrent_mark_threads_cpu_time =
>> 461: PerfDataManager::create_counter(SUN_THREADS, "g1_conc_mark_thread.cpu_time",
>
> I find a bit odd to have "." in the name. "." should be the separator for namespace, but not within the counter name.
>
> I think @simonis 's suggestion about `sun.gc.collector.<n>.cpu_time` or `sun.gc.cpu_time` is to have a single, aggregated counter named `cpu_time`. If we don't do such aggregation, the names should probably be `g1_conc_mark_cpu_time`, `parallel_gc_workers_cpu_time`, etc.
I suggested to have a single, aggregated counter for all GC threads **in addition** to the specific counters. This aggregated counter should be the same for all GCs which implement CPU times. It could easily be used by tools without knowing which GC is enabled and even more important, it would be immune to implementation changes (e.g. if a GC would establish a new subset of GC worker threads). From what I understand this is also your primary use case for your adaptable heap sizing feature.
The aggregated counter should be in a generic place for *all* GCs (e.g. `sun.gc.cpu_time` or `sun.threads.gc_cpu_time`). For the JIT we could then add the corresponding `sun.ci.cpu_time` or `sun.threads.jit_cpu_time` (in addition to the more specific counters for C1, C2, etc.).
For the specific GC counters: instead of putting the `.cpu_time` in the counter name, you can create a sub-namespace instead (e.g. `sun.gc.cpu_time.*` or `sun.threads.cpu_time.*`) and add all the other counters under that namespace (e.g. `sun.gc.cpu_time.total`, `sun.gc.cpu_time.conc_mark`, `sun.gc.cpu_time.parallel_gc_workers`, etc... What I want to avoid in any case is to have the name of the GC in every single counter name because that's redundant information.
> src/hotspot/share/runtime/thread.hpp line 663:
>
>> 661: // hsperfdata counter.
>> 662: class ThreadTotalCPUTimeClosure: public ThreadClosure {
>> 663: private:
>
> It might be preferable to move this class to share/memory/iterator.hpp (where `ThreadClosure` is defined) or runtime/perfData.hpp (where a similar class `PerfTraceTime` is defined).
>
> Also we probably want to move the body of `do_thread()` and `~ThreadTotalCPUTimeClosure()` to the corresponding .cpp file, to minimize new include statements in .hpp.
If you move it, I'd vote for `runtime/perfData.hpp`.
> src/hotspot/share/runtime/thread.hpp line 680:
>
>> 678: // must ensure the thread exists and has not terminated.
>> 679: assert(os::is_thread_cpu_time_supported(), "os must support cpu time");
>> 680: _time_diff = os::thread_cpu_time(thread);
>
> This does not look correct, the `_time_diff` is not a delta with the previous value of the CPU time. It also no longer accumulates the CPU time across a set of threads. I think we should stay with the previous approach of using `PerfVariable`, accumulating CPU time with `_total += os::thread_cpu_time(thread)`, then call ` _counter->set_value(_total)` in the destructor.
>
> To @simonis's point about using `PerfCounter` instead of `PerfVariable`, I agree ideally CPU time could use monotonically increasing `PerfCounter`. However, it would require computing a diff with the previously observed CPU time, which is essentially: `_counter->inc(_total - _counter->get_value())`. It looks unnecessary and is not as clean as ` _counter->set_value(_total)`.
I agree that the current version doesn't look correct, but I don't see a reason to change the `PerfCounter` back to `PerfVariable`. Just accumulate the time as suggested by @caoman (i.e. `_total += os::thread_cpu_time(thread)`), then call `_counter->inc(_total - _counter->get_value())` in the destructor.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1321703912
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1321650203
PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1321669415
More information about the hotspot-dev
mailing list