RFR: Implement JEP 509: JFR CPU-Time Profiling [v45]

Johannes Bechberger jbechberger at openjdk.org
Mon May 5 08:15:58 UTC 2025


On Fri, 2 May 2025 15:47:04 GMT, Zhengyu Gu <zgu at openjdk.org> wrote:

>> Johannes Bechberger has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Remove assertions
>
> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.hpp line 66:
> 
>> 64:       }
>> 65:     } while (Atomic::cmpxchg(&_head, elementIndex, elementIndex + 1) != elementIndex);
>> 66:     _data[elementIndex] = element;
> 
> I question the correctness of the implementation. Consider following scenario:
> T1: equeue(): after complete CAS successfully, e.g. `_head = 3, elementIndex = 2`
> T2: dequeue(): after complete CAS successfully, `_head = 2, elementIndex = 3`
> T2: read `_data[--elementIndex]` // _data[2] has yet set 
> T1: write `_data[elementIndex] = element` // set _data[2] value

You're right. But this seems to be an inherent problem of stacks. I'm going to use the previous lockless queue implementation for the fresh frames queue. The problem should not occur with the thread-local queues though?

> src/hotspot/share/runtime/handshake.hpp line 133:
> 
>> 131: 
>> 132:   bool can_run();
>> 133:   bool can_run(bool allow_suspend, bool check_async_exception);
> 
> You may want to inline these three methods, looks like that they can be on hot pathes.

Would that make a difference? They only defer to other methods and haven't been inlined before.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073020459
PR Review Comment: https://git.openjdk.org/jdk/pull/20752#discussion_r2073022843


More information about the hotspot-dev mailing list