RFR: 8342818: Implement JEP 509: JFR CPU-Time Profiling [v26]

Sun Jun 1 18:13:00 UTC 2025

On Sun, 1 Jun 2025 15:23:06 GMT, Johannes Bechberger <jbechberger at openjdk.org> wrote:

>> src/hotspot/share/jfr/periodic/sampling/jfrCPUTimeThreadSampler.cpp line 581:
>> 
>>> 579: 
>>> 580:   if (jt->thread_state() == _thread_in_native &&
>>> 581:       queue.size() > queue.capacity() * 2 / 3) {
>> 
>> Is this logic still valid? You are only asking for async processing assistance depending on the load factor of the queue?
>
> Yes, so I only start the thread walking if necessary

I see. With a bounded queue, as used in this solution, it can work quite nicely, provided the thread is actually on CPU in native and not just waiting. If it is waiting (which is most likely), pending requests could take a long time to be sent to consumers.
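
To make sure we mean the same condition, here is a minimal standalone sketch of the load-factor test as I read it. `ThreadState` and `should_request_async_walk` are hypothetical names for illustration only, not the HotSpot types or the code in the PR:

```cpp
#include <cstddef>

// Hypothetical stand-in for the JavaThread state; not the HotSpot enum.
enum class ThreadState { in_java, in_native, in_vm };

// Ask for asynchronous processing only when the thread is in native and the
// bounded queue is more than two thirds full. The integer arithmetic mirrors
// the quoted condition: capacity * 2 / 3 truncates toward zero.
inline bool should_request_async_walk(ThreadState state,
                                      std::size_t size,
                                      std::size_t capacity) {
  return state == ThreadState::in_native && size > capacity * 2 / 3;
}
```

With a capacity of 100 the threshold is 66, so assistance is first requested at 67 queued requests. A thread that is in native but makes no further progress toward that threshold never triggers it, which is the case I am worried about.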

I also better understand the optimization you tried as part of the async walk in native and frames. It is also quite nice: walk from the last JfrSampleRequest and compare for equality to "batch" the top JFR sample requests that are the same (i.e. taken for the ljf). Maybe you can try that again, but then you need to save the sid AND the tid so they can be reused for the top equal requests (you only need stacktrace.record_inner() for one request). It's a nice optimization.
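
A rough sketch of the batching idea, under stated assumptions: `SampleRequest`, `EmittedSample`, `same_top_frame`, and `drain_batched` are hypothetical stand-ins, and the `record_stack` callable stands in for the expensive stack walk (the role stacktrace.record_inner() plays). It is not the PR's code, only an illustration of reusing one sid and tid for the trailing run of equal requests:

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Hypothetical stand-in for JfrSampleRequest; the real struct differs.
struct SampleRequest {
  const void* sp;
  const void* pc;
  const void* bcp;
  std::int64_t sample_ticks;
};

// What would be written per request: its own timestamp plus stack and thread ids.
struct EmittedSample {
  std::int64_t sample_ticks;
  std::uint64_t sid;
  std::uint64_t tid;
};

// Two requests are batchable when they describe the same top frame.
static bool same_top_frame(const SampleRequest& a, const SampleRequest& b) {
  return a.sp == b.sp && a.pc == b.pc && a.bcp == b.bcp;
}

// Walk backwards from the last request to find the trailing run of equal
// requests, record the stack once for that run, and reuse the saved sid and
// tid for every request in it. Older requests are recorded one by one.
template <typename RecordStack>
std::vector<EmittedSample> drain_batched(const std::vector<SampleRequest>& queue,
                                         std::uint64_t tid,
                                         RecordStack record_stack) {
  std::vector<EmittedSample> out;
  if (queue.empty()) {
    return out;
  }
  const SampleRequest& last = queue.back();
  std::size_t first_equal = queue.size() - 1;
  while (first_equal > 0 && same_top_frame(queue[first_equal - 1], last)) {
    --first_equal;
  }
  for (std::size_t i = 0; i < first_equal; ++i) {
    const std::uint64_t sid = record_stack(queue[i]);
    out.push_back({queue[i].sample_ticks, sid, tid});
  }
  const std::uint64_t shared_sid = record_stack(last);  // one walk for the whole run
  for (std::size_t i = first_equal; i < queue.size(); ++i) {
    out.push_back({queue[i].sample_ticks, shared_sid, tid});
  }
  return out;
}
```

Each emitted sample keeps its own timestamp; only the stack walk and the id lookups are shared across the run.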

>> src/hotspot/share/jfr/periodic/sampling/jfrThreadSampling.cpp line 362:
>> 
>>> 360:   drain_enqueued_requests(now, tl, jt, current);
>>> 361: #ifdef LINUX
>>> 362:   if (tl->has_cpu_time_jfr_requests()) {
>> 
>> You are having all threads traverse over this test, even though the cpu time sampler is disabled by default. Can it be improved?
>
> Not without allocating in the signal handler

How so?

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119385303
PR Review Comment: https://git.openjdk.org/jdk/pull/25302#discussion_r2119389715