RFR: 8326012: JFR: Event for time to safepoint [v9]
Denghui Dong
ddong at openjdk.org
Wed Feb 21 02:26:55 UTC 2024
On Tue, 20 Feb 2024 05:27:22 GMT, Denghui Dong <ddong at openjdk.org> wrote:
>> There are already some JFR events related to safepoints. When time-to-safepoint (TTSP) is too long, however, these events are not very helpful, because they do not tell us which threads caused the delay or what those threads were doing.
>>
>> Users can pass `-XX:+SafepointTimeout -XX:SafepointTimeoutDelay=100` to see which threads fail to reach the safepoint in time, but without stack traces. `-XX:+AbortVMOnSafepointTimeout` does capture the stack traces, but it crashes the process, so enabling it in production is not sensible.
>>
>> ~~This patch adds a new JFR event `EventSafepointTimeout` to record the threads that cause ttsp too long.~~
>>
>> ~~This event includes two fields:~~
>>
>> ~~- safepointId: the relevant safepoint id~~
>> ~~- timeExceeded: the amount of time exceeding `SafepointTimeoutDelay` used by the thread to reach safepoint~~
>>
>> ~~In the current version, this event records the stack of those problematic threads when they finally reach safepoint. Hence, there is a bias, but it's still helpful to deduce the root place.~~
>>
>> A better implementation would record a more accurate stack, but that would increase complexity. The native stack may also matter for this problem, but JFR does not currently support it.
>>
>> Any input would be greatly appreciated.
>>
>> Testing: jdk/jdk/jfr
>
> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision:
>
> delete _entries when disabled
src/hotspot/share/jfr/support/jfrTimeToSafepoint.hpp line 42:
> 40: JavaThread* thread;
> 41: JfrTicks end;
> 42: int iterations;
Maybe we can think about putting them into JfrThreadLocal.
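
As a rough illustration of the idea, here is a standalone model, not HotSpot code. It assumes the bookkeeping at lines 41-42 (`end`, `iterations`) can live in a per-thread structure, in which case the `thread` pointer at line 40 becomes implicit and no separate entries container has to be allocated or deleted. `ModelThread`, `ModelThreadLocal`, `TtspState`, `record_late_thread` and `take_pending` are hypothetical names standing in for `JavaThread`, `JfrThreadLocal` and the patch's own types; the meaning of each field is assumed, not taken from the patch.

```cpp
#include <cstdint>

// Hypothetical stand-in for the per-entry data, minus the thread pointer,
// which is implied once the data is stored inside the thread itself.
struct TtspState {
  std::int64_t end_ticks = 0;  // models `JfrTicks end` (assumed meaning)
  int iterations = 0;          // models `int iterations` (assumed meaning)
  bool pending = false;        // true while a record is waiting to be consumed
};

// Hypothetical stand-in for JfrThreadLocal: one instance per thread.
struct ModelThreadLocal {
  TtspState ttsp;
};

// Hypothetical stand-in for JavaThread.
struct ModelThread {
  int id;
  ModelThreadLocal jfr_local;
};

// Stores the bookkeeping directly in the late thread's own local data.
void record_late_thread(ModelThread& t, std::int64_t now, int iterations) {
  t.jfr_local.ttsp.end_ticks = now;
  t.jfr_local.ttsp.iterations = iterations;
  t.jfr_local.ttsp.pending = true;
}

// Copies a pending record into `out` and resets the per-thread state.
// Returns false, leaving `out` untouched, if nothing was recorded.
bool take_pending(ModelThread& t, TtspState& out) {
  if (!t.jfr_local.ttsp.pending) {
    return false;
  }
  out = t.jfr_local.ttsp;
  t.jfr_local.ttsp = TtspState{};
  return true;
}
```

In this model the state is created and destroyed together with the thread, so disabling the event needs no explicit cleanup. The synchronization between the thread that records and the thread that consumes is deliberately left out and would have to be worked out in the real code.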
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/17888#discussion_r1496788017