Re: JDK-8334085: Cannot reproduce failing test

Tue Jul 23 23:13:23 UTC 2024

On 7/23/24 4:38 PM, Iñigo Mediavilla wrote:
> Thanks again Patricio,
>
> I guess I was struggling to make the connection between the 
> synchronized blocks and the calls to thaw. Based on your last email, I 
> think that I understand that connection, but feel free to correct me 
> if I'm wrong:
>
> - There cannot be a call to thaw after a synchronized block has been
> entered and before its execution has completed. If I think about why
> this is not possible, I think that a thaw cannot happen after
> something like Thread.yield because in that case the yield would fail
> and the virtual thread would be pinned. It also should not happen
> because of a return barrier, because a thaw of that kind only happens
> when we need to load a higher frame in the stack, which by definition
> cannot happen inside a synchronized method / block.
Right.

> - held_monitor_count is only affected by synchronized methods / blocks 
> or by JNI's MonitorEnter / MonitorExit. Other synchronization 
> primitives in the JVM do not affect that variable.
There is also ObjectLocker which is used to acquire a Java monitor from 
inside the VM. Also note that the counter can be modified during 
deoptimization if we need to relock objects for which synchronization 
was eliminated.

Patricio
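
A minimal Java sketch, not taken from the JDK sources or from the failing
test, of the JLS 17.1 behaviour referenced in this thread: a monitor acquired
by a synchronized block is released when the block completes, either normally
or abruptly. The class and method names (MonitorReleaseDemo,
releasedAfterNormalExit, releasedAfterAbruptExit) are hypothetical and only
serve the illustration; Thread.holdsLock is the standard API used to observe
ownership.

```java
// Hypothetical demo class: observes monitor ownership with Thread.holdsLock.
class MonitorReleaseDemo {
    private static final Object LOCK = new Object();

    // Normal completion: the monitor is held inside the block and
    // no longer held once the block has been exited.
    static boolean releasedAfterNormalExit() {
        boolean heldInside;
        synchronized (LOCK) {
            heldInside = Thread.holdsLock(LOCK);
        }
        return heldInside && !Thread.holdsLock(LOCK);
    }

    // Abrupt completion: the block is left through an exception, and the
    // monitor is still released before the handler runs.
    static boolean releasedAfterAbruptExit() {
        boolean heldInside = false;
        try {
            synchronized (LOCK) {
                heldInside = Thread.holdsLock(LOCK);
                throw new IllegalStateException("abrupt exit");
            }
        } catch (IllegalStateException expected) {
            // Nothing to do: the exception only exists to leave the block abruptly.
        }
        return heldInside && !Thread.holdsLock(LOCK);
    }

    public static void main(String[] args) {
        System.out.println("released after normal exit: " + releasedAfterNormalExit());
        System.out.println("released after abrupt exit: " + releasedAfterAbruptExit());
    }
}
```

Both checks are expected to report true. As noted in the thread, a monitor
taken with the JNI function MonitorEnter has no such automatic release and
stays held until MonitorExit (or DetachCurrentThread) is called.
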
> On Tue, Jul 23, 2024 at 6:09 PM Patricio Chilano Mateo 
> <patricio.chilano.mateo at oracle.com> wrote:
>
>
>     On 7/23/24 5:59 AM, Iñigo Mediavilla wrote:
>>     Hello Patricio,
>>
>>     Thanks a lot for your explanation.
>>
>>     Why is it safe for Thaw code to assume that all non-JNI monitors
>>     will be released at that point, but the same assumption cannot be
>>     made for JNI monitors?
>>
>>     What would happen if:
>>
>>     1. A virtual thread is unmounted
>>     2. We thaw a few frames and execute code that acquires a non-JNI
>>     monitor
>>     3. We call thaw again
>>
>>     Or is that not possible ?
>     That would not be possible unless there is a bug. All monitors
>     acquired from synchronized methods/blocks should have been
>     released once execution of the synchronized method/block
>     completes, either normally or abruptly (see [1]).
>     Monitors that are acquired through JNI function MonitorEnter
>     though are not automatically exited and a call to JNI function
>     MonitorExit is required, unless DetachCurrentThread is used to
>     implicitly release them (see [2]).
>
>     [1]
>     https://docs.oracle.com/javase/specs/jls/se22/html/jls-17.html#jls-17.1
>     [2]
>     https://docs.oracle.com/en/java/javase/22/docs/specs/jni/functions.html#monitor-operations
>
>     Patricio
>>     Thanks
>>     Íñigo
>>
>>     On Mon, Jul 22, 2024 at 4:11 PM Patricio Chilano Mateo
>>     <patricio.chilano.mateo at oracle.com> wrote:
>>
>>         Hi Iñigo,
>>
>>         The problem is that we can unmount a virtual thread, then
>>         mount it again, thaw a few frames, execute code that acquires
>>         a JNI monitor, and then call thaw again without releasing
>>         that monitor. Thaw code assumes all monitors must be released
>>         at that point but doesn't consider JNI acquired ones. In this
>>         test this will happen if the vthread is unmounted
>>         in System.out.println("Thread doing JNI call: " ...) because
>>         of contention with the main thread
>>         doing System.out.println("Main waiting for event."). You can
>>         reproduce this issue by adding Thread.yield() before
>>         jniMonitorEnterAndLetObjectDie().
>>
>>         Thanks,
>>         Patricio
>>
>>         On 7/22/24 7:30 AM, Iñigo Mediavilla wrote:
>>>         Hello Serguei,
>>>
>>>         Thanks a lot for sharing the update and for solving the
>>>         issue. Do you think that you could help me understand
>>>         exactly what's happening ?
>>>
>>>         Based on the DBG output shared in JBS, my understanding is
>>>         that what happens in the test is the following:
>>>
>>>         Main                       Thread
>>>         -------------------------  ----------------------------
>>>         1. acquire java lock
>>>         2. starting thread
>>>                                    3. jni call
>>>                                    4. MonitorContendedEnter
>>>         5. release java lock
>>>                                    6. acquire java lock
>>>                                    7. MonitorContendedEntered
>>>                                    8. Thread in sync section
>>>                                    9. release java lock
>>>                                    10. why freeze doesn't pin?
>>>
>>>         What I'm struggling to understand is why, after the thread
>>>         releases the java lock, the virtual thread is still frozen,
>>>         and especially why it freezes while holding a JNI monitor.
>>>         I've run tests locally trying to freeze a virtual thread
>>>         holding a JNI lock and my virtual threads are always being
>>>         pinned to the carrier with reason ("holding a lock").
>>>
>>>         Thanks in advance
>>>
>>>         Íñigo
>>>
>>>         On Fri, Jul 19, 2024 at 10:41 PM Serguei Spitsyn
>>>         <serguei.spitsyn at oracle.com> wrote:
>>>
>>>             Hi Iñigo,
>>>
>>>             Patricio helped to reproduce this issue and also
>>>             identified the problem (please see the bug report).
>>>             The fix is a one-liner. I’ll post a PR after some mach5
>>>             testing.
>>>
>>>             Thank you for your involvement in this issue!
>>>
>>>             Thanks,
>>>
>>>             Serguei
>>>
>>>             *From: *Iñigo Mediavilla <imediava at gmail.com>
>>>             *Date: *Saturday, July 13, 2024 at 12:08 AM
>>>             *To: *Chris Plummer <chris.plummer at oracle.com>
>>>             *Cc: *dholmes at openjdk.org <dholmes at openjdk.org>,
>>>             loom-dev at openjdk.org <loom-dev at openjdk.org>,
>>>             sspitsyn at openjdk.org <sspitsyn at openjdk.org>
>>>             *Subject: *Re: JDK-8334085: Cannot reproduce failing test
>>>
>>>             I see. In that case, Serguei, would you still want to own
>>>             this JBS issue, or would you be OK if I had a look at it?
>>>
>>>             Iñigo
>>>
>>>             On Fri, Jul 12, 2024, 19:11, Chris Plummer
>>>             <chris.plummer at oracle.com> wrote:
>>>
>>>                 Failures are very intermittent. We last saw a failure
>>>                 in our CI testing on 2024-07-03. What command are you
>>>                 using to run the test?
>>>
>>>                 Chris
>>>
>>>                 On 7/12/24 2:34 AM, Iñigo Mediavilla wrote:
>>>                 > Hello,
>>>                 >
>>>                 > While looking at possible JBS tickets to work on, I saw
>>>                 > JDK-8334085, where an assertion was reported to be
>>>                 > failing for the GetOwnedMonitorInfoTest. Before I even
>>>                 > asked around to find out whether this issue was already
>>>                 > being looked at, I tried to reproduce the failure
>>>                 > locally, but I could not make the test fail. Is this
>>>                 > still an issue in JDK 24? David, can you still reproduce
>>>                 > the failing test?
>>>                 >
>>>                 > Best
>>>                 >
>>>                 > Íñigo Mediavilla Saiz
>>>                 >
>>>                 >
>>>
>>
>