RFR: 8315149: Add hsperf counters for CPU time of internal GC threads [v48]

Jonathan Joo jjoo at openjdk.org
Fri Dec 1 22:40:55 UTC 2023


On Fri, 1 Dec 2023 21:01:29 GMT, Man Cao <manc at openjdk.org> wrote:

>> I agree that the counter is valuable if always up-to-date, but if it is out of sync compared to the "concurrent counters" I think it will confuse some users. So if we want to keep it I think we should try to keep it in sync. 
>> 
>> I suppose adding a lock for updating `gc_total` should be ok. In the safepoint case we should never contend on that lock and for the concurrent updates it should not be a big deal. Basically what I think would be to change `update_counter(...)` to do something like:
>> 
>> if (CPUTimeGroups::is_gc_counter(name)) {
>>   <lock gc_total_lock>
>>   instance->get_counter(CPUTimeGroups::CPUTimeType::gc_total)->inc(net_cpu_time);
>> }
>> 
>> 
>> This way we would also be able to remove the publish part above, right? Any other problems with this approach?
>
> I think the ideal approach to simplify this is to support Atomic operation on a `PerfCounter`.
> We could either introduce a `PerfAtomicCounter`/`PerfAtomicLongCounter` class, or perform `Atomic::add()` on the `PerfData::_valuep` pointer. There's already `PerfData::get_address()`, so we might be able to do:
> 
> 
> Atomic::add((volatile jlong *)(instance->get_counter(CPUTimeGroups::CPUTimeType::gc_total)->get_address()), net_cpu_time);
> 
> 
> However, a new class `PerfAtomicCounter`  is likely cleaner. E.g., we may also want to make `PerfAtomicCounter::sample()` use a CAS. It is probably better to introduce `PerfAtomicCounter` in a separate RFE later.
> 
> Would the `Atomic::add()` with `PerfData::get_address()` approach be OK for now, or would we rather introduce a lock, or leave the `gc_total` mechanism as-is and address the out-of-sync-ness in a follow-up RFE?
> 
> IMO the out-of-sync-ness problem is minor, because users are likely to either look at a single `gc_total` counter, or look at each individual GC  CPU counter and disregard `gc_total`.

In the interest of the RDP1 deadline, should we leave improving the sync issues with gc_total to a separate RFE? (Especially given that a "correct" design may take some time to come up with, and that gc_total being slightly out of sync is not a major issue.)

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15082#discussion_r1412631873


More information about the hotspot-gc-dev mailing list