Re: JDK-8334085: Cannot reproduce failing test

Iñigo Mediavilla imediava at gmail.com
Tue Jul 23 20:38:05 UTC 2024


Thanks again Patricio,

I guess I was struggling to make the connection between the synchronized
blocks and the calls to thaw. Based on your last email, I think that I
understand that connection, but feel free to correct me if I'm wrong:

- There cannot be a call to thaw after a synchronized block has been entered
and before its execution has completed. Thinking about why this is not
possible: a thaw cannot follow something like Thread.yield inside the block,
because in that case the yield would fail to unmount and the virtual thread
would be pinned. Nor can it come from a return barrier, because a thaw of
that kind only happens when we need to load a higher frame in the stack,
which by definition cannot happen inside a synchronized method / block.

- held_monitor_count is only affected by synchronized methods / blocks or
by JNI's MonitorEnter / MonitorExit. Other synchronization primitives in
the JVM do not affect that variable.
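
To check my own understanding of the two points above at the Java level, here
is a minimal sketch. It only uses Thread.holdsLock as an observable stand-in
for "this thread owns the object's monitor"; that this lines up with what
held_monitor_count tracks inside the VM is my assumption, not something the
sketch proves. The class and method names are made up for illustration.

```java
import java.util.concurrent.locks.ReentrantLock;

public class MonitorReleaseDemo {
    private static final Object LOCK = new Object();

    // The monitor is owned while execution is inside the synchronized block.
    static boolean heldInsideBlock() {
        synchronized (LOCK) {
            return Thread.holdsLock(LOCK);
        }
    }

    // Leave the synchronized block abruptly (by exception) and report
    // whether the monitor is still owned afterwards.
    static boolean heldAfterAbruptExit() {
        try {
            synchronized (LOCK) {
                throw new IllegalStateException("abrupt completion");
            }
        } catch (IllegalStateException expected) {
            return Thread.holdsLock(LOCK);
        }
    }

    // A java.util.concurrent lock is not an object monitor: locking it
    // does not make the current thread own the lock object's monitor.
    static boolean monitorHeldUnderReentrantLock() {
        ReentrantLock lock = new ReentrantLock();
        lock.lock();
        try {
            return Thread.holdsLock(lock);
        } finally {
            lock.unlock();
        }
    }

    public static void main(String[] args) {
        System.out.println("inside synchronized block: " + heldInsideBlock());
        System.out.println("after abrupt exit:         " + heldAfterAbruptExit());
        System.out.println("under ReentrantLock:       " + monitorHeldUnderReentrantLock());
    }
}
```

The monitor is owned only while execution is inside the block, and it is
released on abrupt completion too. A monitor taken with JNI MonitorEnter has
no such scope, which is why it can still be held at the next thaw.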



On Tue, Jul 23, 2024 at 6:09 PM Patricio Chilano Mateo <
patricio.chilano.mateo at oracle.com> wrote:

>
> On 7/23/24 5:59 AM, Iñigo Mediavilla wrote:
>
> Hello Patricio,
>
> Thanks a lot for your explanation.
>
> Why is it safe for Thaw code to assume that all non-JNI monitors will be
> released at that point, but the same assumption cannot be made for JNI
> monitors?
>
> What would happen if:
>
> 1. A virtual thread is unmounted
> 2. We thaw a few frames and execute code that acquires a non-JNI monitor
> 3. We call thaw again
>
> Or is that not possible?
>
> That would not be possible unless there is a bug. All monitors acquired
> from synchronized methods/blocks should have been released once execution
> of the synchronized method/block completes, either normally or abruptly
> (see [1]).
> Monitors acquired through the JNI function MonitorEnter, though, are not
> automatically exited; a call to the JNI function MonitorExit is required,
> unless DetachCurrentThread is used to implicitly release them
> (see [2]).
>
> [1]
> https://docs.oracle.com/javase/specs/jls/se22/html/jls-17.html#jls-17.1
> [2]
> https://docs.oracle.com/en/java/javase/22/docs/specs/jni/functions.html#monitor-operations
>
> Patricio
>
> Thanks
> Íñigo
>
> On Mon, Jul 22, 2024 at 4:11 PM Patricio Chilano Mateo <
> patricio.chilano.mateo at oracle.com> wrote:
>
>> Hi Iñigo,
>>
>> The problem is that we can unmount a virtual thread, then mount it again,
>> thaw a few frames, execute code that acquires a JNI monitor, and then call
>> thaw again without releasing that monitor. Thaw code assumes all monitors
>> must be released at that point but doesn't consider JNI acquired ones. In
>> this test this will happen if the vthread is unmounted
>> in System.out.println("Thread doing JNI call: " ...) because of contention
>> with the main thread doing System.out.println("Main waiting for event.").
>> You can reproduce this issue by adding Thread.yield() before
>> jniMonitorEnterAndLetObjectDie().
>>
>> Thanks,
>> Patricio
>>
>> On 7/22/24 7:30 AM, Iñigo Mediavilla wrote:
>>
>> Hello Serguei,
>>
>> Thanks a lot for sharing the update and for solving the issue. Do you
>> think that you could help me understand exactly what's happening?
>>
>> Based on the DBG output shared in JBS, my understanding is that what
>> happens in the test is the following:
>>
>> Main                            Thread
>> ------------------------------  ------------------------------
>> 1. acquire java lock
>> 2. starting thread
>>                                 3. jni call
>>                                 4. MonitorContendedEnter
>> 5. release java lock
>>                                 6. acquire java lock
>>                                 7. MonitorContendedEntered
>>                                 8. Thread in sync section
>>                                 9. release java lock
>>                                 10. why doesn't freeze pin?
>>
>> What I'm struggling to understand is why, after the thread releases the
>> java lock, the virtual thread is still frozen, and especially why it
>> freezes while holding a JNI monitor. I've run tests locally trying to
>> freeze a virtual thread holding a JNI lock, and my virtual threads are
>> always pinned to the carrier with reason "holding a lock".
>>
>> Thanks in advance
>>
>> Íñigo
>>
>> On Fri, Jul 19, 2024 at 10:41 PM Serguei Spitsyn <
>> serguei.spitsyn at oracle.com> wrote:
>>
>>> Hi Iñigo,
>>>
>>> Patricio helped to reproduce this issue and also identified the problem
>>> (please, see in the bug report).
>>> The fix is a one-liner. I’ll post a PR after some mach5 testing.
>>>
>>> Thank you for your involvement in this issue!
>>>
>>> Thanks,
>>>
>>> Serguei
>>>
>>>
>>>
>>> *From: *Iñigo Mediavilla <imediava at gmail.com>
>>> *Date: *Saturday, July 13, 2024 at 12:08 AM
>>> *To: *Chris Plummer <chris.plummer at oracle.com>
>>> *Cc: *dholmes at openjdk.org <dholmes at openjdk.org>, loom-dev at openjdk.org <
>>> loom-dev at openjdk.org>, sspitsyn at openjdk.org <sspitsyn at openjdk.org>
>>> *Subject: *Re: JDK-8334085: Cannot reproduce failing test
>>>
>>> I see. In that case, Serguei, would you still want to own this JBS
>>> issue, or would you be OK if I try to have a look at it?
>>>
>>>
>>>
>>> Iñigo
>>>
>>> On Fri, Jul 12, 2024, 19:11, Chris Plummer <chris.plummer at oracle.com>
>>> wrote:
>>>
>>> Failures are very intermittent. We last saw a failure in our CI testing
>>> on 2024-07-03. What command are you using to run the test?
>>>
>>> Chris
>>>
>>> On 7/12/24 2:34 AM, Iñigo Mediavilla wrote:
>>> > Hello,
>>> >
>>> > While looking at possible JBS tickets to work on, I saw JDK-8334085,
>>> > where an assertion was reported to be failing for the
>>> > GetOwnedMonitorInfoTest. Before asking whether this issue was already
>>> > being looked at, I tried to reproduce the failure locally, but I could
>>> > not make the test fail. Is this still an issue in JDK 24? David, can
>>> > you still reproduce the failing test?
>>> >
>>> > Best
>>> >
>>> > Íñigo Mediavilla Saiz
>>> >
>>> >
>>>
>>>
>>
>