RFR: 8308614: Enabling JVMTI ClassLoad event slows down vthread creation by factor 10 [v4]
Daniel D. Daugherty
dcubed at openjdk.org
Thu Nov 30 17:36:13 UTC 2023
On Thu, 30 Nov 2023 02:08:39 GMT, Serguei Spitsyn <sspitsyn at openjdk.org> wrote:
>> This is a fix of a performance/scalability related issue. The `JvmtiThreadState` objects for virtual thread filtered events enabled globally are created eagerly because it is needed when the `interp_only_mode` is enabled. Otherwise, some events which are generated in `interp_only_mode` from the debugging version of interpreter chunks can be missed.
>> However, it has to be okay to avoid eager creation of these object if no `interp_only_mode` has ever been requested.
>> It seems to be an extremely important optimization to create JvmtiThreadState objects lazily in such cases.
>> It is done by introducing the flag `JvmtiThreadState::_seen_interp_only_mode` which indicates when the `JvmtiThreadState` objects have to be created eagerly.
>>
>> Additionally, the fix includes the following related changes:
>> - Use condition double checking idiom for `MutexLocker mu(JvmtiThreadState_lock)` in the function `JvmtiVTMSTransitionDisabler::VTMS_mount_end` which is on a performance-critical path and looks like this:
>>
>> JvmtiThreadState* state = thread->jvmti_thread_state();
>> if (state != nullptr && state->is_pending_interp_only_mode()) {
>> MutexLocker mu(JvmtiThreadState_lock);
>> state = thread->jvmti_thread_state();
>> if (state != nullptr && state->is_pending_interp_only_mode()) {
>> JvmtiEventController::enter_interp_only_mode();
>> }
>> }
>>
>>
>> - Add extra check of `JvmtiExport::can_support_virtual_threads()` when virtual thread mount and unmount are posted.
>> - Minor: Added a `ThreadsListHandle` to the `JvmtiEventControllerPrivate::enter_interp_only_mode`. It is needed because of the dynamic creation of compensating carrier threads which is racy for JVMTI `SetEventNotificationMode` implementation.
>>
>> Performance mesurements:
>> - Without this fix the test provided by the bug submitter gives execution numbers:
>> - no ClassLoad events enabled: 3251 ms
>> - ClassLoad events are enabled: 40534 ms
>>
>> - With the fix:
>> - no ClassLoad events enabled: 3270 ms
>> - ClassLoad events are enabled: 3385 ms
>>
>> Testing:
>> - Ran mach5 tiers 1-6, no regressions are noticed
>
> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:
>
> review: remove newly added ThreadsListHandle from enter_interp_only_mode
Removing the TLH is the right thing to do. If we get that failure mode again, then we
can file a new bug and hopefully have an hs_err_pid file with it.
I don't think we should change the guarantee() in `Handshake::execute()`. When the
three parameter version of `execute()` is called with `tlh == nullptr`, the caller is
saying that there is supposed to be a ThreadsListHandle in the calling context. Yes,
if the target thread is the calling thread, then a ThreadsListHandle is not needed,
but that's why we have this code to prevent the call to `Handshake::execute()`:
if (target->is_handshake_safe_for(current)) {
hs.do_thread(target);
In other words, I think `Handshake::execute()` is working the way it is supposed to
when `tlh == nullptr` is passed.
-------------
PR Comment: https://git.openjdk.org/jdk/pull/16686#issuecomment-1834244635
More information about the serviceability-dev
mailing list