RFR: 8326012: JFR: Event for safepoint timeout [v2]
Erik Gahlin
egahlin at openjdk.org
Fri Feb 16 14:26:54 UTC 2024
On Fri, 16 Feb 2024 14:03:21 GMT, Denghui Dong <ddong at openjdk.org> wrote:
>> Several JFR events already relate to safepoints. However, when time-to-safepoint (ttsp) is too long, these events are of limited help, because they do not show which threads cause the delay or what those threads are doing.
>>
>> Users can pass `-XX:+SafepointTimeout -XX:SafepointTimeoutDelay=100` to see which threads fail to reach the safepoint in time, but the output contains no stack traces. `-XX:+AbortVMOnSafepointTimeout` does capture stack traces, but it crashes the process, so it is not sensible to enable in production.
>>
>> This patch adds a new JFR event, `EventSafepoint`, that records the threads responsible for an overly long ttsp.
>>
>> This event includes two fields:
>>
>> - safepointId: the relevant safepoint id
>> - timeExceeded: the time beyond `SafepointTimeoutDelay` that the thread took to reach the safepoint
>>
>> In the current version, the event records the stack of each problematic thread when it finally reaches the safepoint. The recorded stack is therefore biased, but it still helps narrow down the root cause.
>>
>> A better implementation would record a more accurate stack, at the cost of added complexity. The native stack may also matter for this problem, but JFR does not currently support it.
>>
>> Any input would be greatly appreciated.
>>
>> Testing: jdk/jdk/jfr
>
> Denghui Dong has updated the pull request incrementally with one additional commit since the last revision:
>
> remove debug code
Can the event use the settings enabled and threshold instead of -XX:+SafepointTimeout -XX:SafepointTimeoutDelay=x? Then the event could be configured from the command line, a .jfc file, and remotely over JMX. Perhaps the event could be enabled by default with a high threshold? Would this work, or am I missing something?
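
As a rough illustration of the settings-based approach, here is a minimal sketch of how such an event might be configured through the jdk.jfr API. The event name "jdk.SafepointTimeout" and the "500 ms" threshold are assumptions made for the example, not names or values taken from the patch; only the `Recording` calls are existing API.

```java
import java.util.LinkedHashMap;
import java.util.Map;
import jdk.jfr.Recording;

final class SafepointTimeoutSettings {
    // Hypothetical event name, assumed for illustration; the patch may register another.
    static final String EVENT = "jdk.SafepointTimeout";

    // Builds the "<event>#<setting>" key/value map accepted by Recording.setSettings.
    static Map<String, String> settings(boolean enabled, String threshold) {
        Map<String, String> m = new LinkedHashMap<>();
        m.put(EVENT + "#enabled", Boolean.toString(enabled));
        m.put(EVENT + "#threshold", threshold);
        return m;
    }

    public static void main(String[] args) {
        try (Recording recording = new Recording()) {
            // Replaces the recording's settings with the two entries above.
            recording.setSettings(settings(true, "500 ms"));
            recording.start();
            // ... run the workload of interest here ...
            recording.stop();
        }
    }
}
```

The same two settings could equally be written as setting elements for the event in a .jfc file, which is what would let one definition serve the command line, a .jfc file, and JMX.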
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17888#issuecomment-1948475336