RFR: 8306467: Fix nsk/jdb/kill/kill001 to work with new JVMTI StopThread support for virtual threads.

Chris Plummer cjplummer at openjdk.org
Sat May 13 00:17:44 UTC 2023


On Fri, 12 May 2023 23:15:50 GMT, Alex Menkov <amenkov at openjdk.org> wrote:

>> Currently kill001 assumes that JVMTI StopThread (via JDI ThreadReference.stop) is not supported for virtual threads. [JDK-8306034](https://bugs.openjdk.org/browse/JDK-8306034) is adding support for StopThread on a virtual thread as long as it is suspended and mounted. This means, for example, it will work for virtual threads in the following conditions:
>>  - Debuggee in a loop and suspended
>>  - Debuggee at a breakpoint and suspended
>> 
>> But will continue to not work in the following situations:
>> - Debuggee in a loop but not suspended
>> - Debuggee suspended but unmounted, such as during a call the Thread.sleep()
>> 
>> kill001 suspends all threads at a breakpoint and then does a "jdb kill" on each thread, which translate to `ThreadReference.stop()`, so this is expected to work now.
>> 
>> Most of the changes involve undoing the virtual thread specific code that was added to the test as part of [JDK-8282385](https://bugs.openjdk.org/browse/JDK-8282385). However, there is an additional issue that also needs fixing. The test relies on the fact that the async exception is normally not caught, and that jdb normally stops when an uncaught exception is thrown. With virtual threads there ends up being an exception handler in `java.lang.VirtualThread.run()`, resulting in jdb not stopping when the async exception is thrown. This is fixed by having the test issue a jdb "catch all <classname>" command for each async exception type that the test throws.
>
> test/hotspot/jtreg/vmTestbase/nsk/jdb/kill/kill001/kill001a.java line 165:
> 
>> 163:         // and therefore sleep the full time, resulting in a test timeout if too long.
>> 164:         try {
>> 165:             Thread.currentThread().sleep(kill001a.vthreadMode ? 10000 : kill001a.waitTime);
> 
> The PR description says:
> `kill001 suspends all threads at a breakpoint and then does a "jdb kill" on each thread, which translate to ThreadReference.stop(), so this is expected to work now.`
> As far as I see only main thread is suspended at "breakHere", all test threads wait in Thread.sleep().
> I'd expect test threads to be unmounted in Thread.sleep(), so it's unclear how the test works in vthread mode

You're right. I misunderstood what the test is doing. At the time of the `kill` commands, the threads are actually blocked here:

        synchronized (kill001a.lock) {}

Currently blocking in this manner pins the virtual thread to the carrier thread, so it is mounted. However, we are not yet in the try block at the time the async exception is thrown. After all the threads have had the async exception thrown at them, the main thread is resumed and the sychronized block is exited. The log shows that the exception is thrown on the sleep() line, which is the first statement after the synchronized, which just so happens to be within the try and the right thing happens.

However, even if I add another statement outside the synchronized and just before the try (which ends up being the line where the exception is thrown), it still passes. This is due to yet another bug. The exception is blowing us out of the thread, so we never hit the following block:


        if (!killed) {
            // Need to make sure the increment is atomic
            synchronized (kill001a.lock) {
                kill001a.notKilled++;
            }
        }


The debugger side checks for the proper notKilled value (it should be 0). So the test still passes even though it never confirmed that the correct async exception was thrown and caught. It just knows that the thread did not exit cleanly. Probably instead of tracking `notKilled`, the test should track `killed` and only increment when the correct exception is actually caught.

And one more thing, the following comment before the sleep() wrong:

        // Sleep during waitTime to give debugger a chance to kill debugee's thread.

I'm not so sure the sleep is even necessary since we block in the synchronized indefinitely until the async exception is thrown. I think just a simple statement at the start of the try block is good enough. Maybe a println(). But this is somewhat fragile because we need to make sure there is no code between the synchronized and the try, so we probably need to move the synchronized to within the try also.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/13967#discussion_r1192879626


More information about the serviceability-dev mailing list