RFR: 8297286: runtime/vthread tests crashing after JDK-8296324 [v9]

Patricio Chilano Mateo pchilanomate at openjdk.org
Wed Mar 29 22:18:36 UTC 2023


On Wed, 29 Mar 2023 18:02:38 GMT, Serguei Spitsyn <sspitsyn at openjdk.org> wrote:

>> The fix is to enable virtual threads support for late binding JVMTI agents.
>> The fix includes:
>> - New function `JvmtiEnvBase::enable_virtual_threads_notify_jvmti()` which does enabling JVMTI VTMS transition notifications in case of agent loaded into running VM. This function executes a VM operation counting VTMS transition bits in all `JavaThread`'s to correctly set the static counter `_VTMS_transition_count` needed for VTMS transition protocol.
>> - New function `JvmtiEnvBase::disable_virtual_threads_notify_jvmti()` which is needed for testing. It is used by the `WhiteBox` API.
>> - New WhiteBox function `WB_SetVirtualThreadsNotifyJvmtiMode(JNIEnv* env, jobject wb, jboolean enable)` needed for testing of this update.
>> - New regression test: `serviceability/jvmti/vthread/ToggleNotifyJvmtiTest`
>> 
>> Testing:
>> - New test: `serviceability/jvmti/vthread/ToggleNotifyJvmtiTest`
>> - The originally failed tests are expected to pass now:
>>   `runtime/vthread/RedefineClass.java`
>>   `runtime/vthread/TestObjectAllocationSampleEvent.java` 
>> - In progress: Run the tiers 1-6 to make sure there are no regression.
>
> Serguei Spitsyn has updated the pull request incrementally with one additional commit since the last revision:
> 
>   review: updated correction of jt->jvmti_thread_state() links in VM_SetNotifyJvmtiEventsMode

Hi Serguei,

I took a look at the patch and looks good to me. I have a couple of comments though.

Thanks,
Patricio

src/hotspot/share/prims/jvmtiEnvBase.cpp line 1554:

> 1552:       }
> 1553:       // Correct jt->jvmti_thread_state() and jt->jvmti_vthread() if necessary.
> 1554:       // It was not maintained while notifyJvmti was disabled.

While trying to understand which exact situation we are trying to guard against with this code, I run the test without the sleeps and without this restore code and I got a crash when deleting a JvmtiThreadState (null dereference of _thread in the ~()). Probably the same crash you mentioned you had. But when debugging the crash I see that the problem is that the assumption that disabling the flag is done when no virtual threads are running is not guaranteed (see my comment there). So I think we are trying to address a case that shouldn't happen in the first place. Also not sure if applying this restore in all cases will be correct, since we might be somewhere at a transition. For example, a thread could have blocked right in the return from notifyJvmtiUnmount() in yieldContinuation(). It will looked like virtual because unmount() was not executed yet, and the jvmti_thread_state should be that of the platform thread because we never changed it when mounting. We should leave the state a
 s is but in here we would change it to the virtual thread's jvmti state. The only case I think it makes sense to do this restore steps when enabling the flag is for those threads that are outside a transition with a mounted virtual thread, since we want to adjust the jvmti_thread_state so that it looks right on the next unmount.
But in any case this is also only needed when using the WhiteBox methods right? In the intended case (no WhiteBox method used), after we execute this operation to enable the events, we will create the JvmtiThreadStates later in JvmtiExport::get_jvmti_interface() and the correct jvmti_thread_state and jvmti_vthread will be already set for each JavaThread. In that case can we only execute this restore code when using the WhiteBox API?

test/hotspot/jtreg/serviceability/jvmti/vthread/ToggleNotifyJvmtiTest/ToggleNotifyJvmtiTest.java line 142:

> 140:                 TestedThread thread = threads[i];
> 141:                 if (thread == null) {
> 142:                     break;

Bailing out here means we could later disable the flag while there are virtual threads running. If I comment out the first two sleeps in the main thread I can see that issue happening. To avoid relying on timing I suggest using a semaphore to wait at the beginning of finishThreads(), and signal at the end of startThreads().

-------------

PR Review: https://git.openjdk.org/jdk/pull/13133#pullrequestreview-1362140557
PR Review Comment: https://git.openjdk.org/jdk/pull/13133#discussion_r1152508618
PR Review Comment: https://git.openjdk.org/jdk/pull/13133#discussion_r1152514856


More information about the serviceability-dev mailing list