RFR: 8338383: Implement JEP 491: Synchronize Virtual Threads without Pinning [v9]

David Holmes dholmes at openjdk.org
Fri Oct 25 06:09:25 UTC 2024


On Thu, 24 Oct 2024 21:08:26 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:

>> This is the implementation of JEP 491: Synchronize Virtual Threads without Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for further details.
>> 
>> In order to make the code review easier the changes have been split into the following initial 4 commits:
>> 
>> - Changes to allow unmounting a virtual thread that is currently holding monitors.
>> - Changes to allow unmounting a virtual thread blocked on synchronized trying to acquire the monitor.
>> - Changes to allow unmounting a virtual thread blocked in `Object.wait()` and its timed-wait variants.
>> - Changes to tests, JFR pinned event, and other changes in the JDK libraries.
>> 
>> The changes fix pinning issues for all 4 ports that currently implement continuations: x64, aarch64, riscv and ppc. Note: ppc changes were added recently and stand in its own commit after the initial ones.
>> 
>> The changes fix pinning issues when using `LM_LIGHTWEIGHT`, i.e. the default locking mode, (and `LM_MONITOR` which comes for free), but not when using `LM_LEGACY` mode. Note that the `LockingMode` flag has already been deprecated ([JDK-8334299](https://bugs.openjdk.org/browse/JDK-8334299)), with the intention to remove `LM_LEGACY` code in future releases.
>> 
>> 
>> ## Summary of changes
>> 
>> ### Unmount virtual thread while holding monitors
>> 
>> As stated in the JEP, currently when a virtual thread enters a synchronized method or block, the JVM records the virtual thread's carrier platform thread as holding the monitor, not the virtual thread itself. This prevents the virtual thread from being unmounted from its carrier, as ownership information would otherwise go wrong. In order to fix this limitation we will do two things:
>> 
>> - We copy the oops stored in the LockStack of the carrier to the stackChunk when freezing (and clear the LockStack). We copy the oops back to the LockStack of the next carrier when thawing for the first time (and clear them from the stackChunk). Note that we currently assume carriers don't hold monitors while mounting virtual threads.
>> 
>> - For inflated monitors we now record the `java.lang.Thread.tid` of the owner in the ObjectMonitor's `_owner` field instead of a JavaThread*. This allows us to tie the owner of the monitor to a `java.lang.Thread` instance, rather than to a JavaThread which is only created per platform thread. The tid is already a 64 bit field so we can ignore issues of the counter wrapping around.
>> 
>> #### General notes about this part:
>> 
>> - Since virtual th...
>
> Patricio Chilano Mateo has updated the pull request incrementally with four additional commits since the last revision:
> 
>  - Rename set/has_owner_anonymous to set/has_anonymous_owner
>  - Fix comments in javaThread.hpp and Thread.java
>  - Rename nonce/nounce to seqNo in VirtualThread class
>  - Remove ObjectMonitor::set_owner_from_BasicLock()

Next batch of comments ...

src/hotspot/share/classfile/javaClasses.cpp line 2082:

> 2080: }
> 2081: 
> 2082: bool java_lang_VirtualThread::set_onWaitingList(oop vthread, OopHandle& list_head) {

Some comments here about the operation would be useful. The "waiting list" here is just a list of virtual threads that need unparking by the Unblocker thread - right?

I'm struggling to understand how a thread can already be on this list?

src/hotspot/share/classfile/javaClasses.cpp line 2086:

> 2084:   jboolean vthread_on_list = Atomic::load(addr);
> 2085:   if (!vthread_on_list) {
> 2086:     vthread_on_list = Atomic::cmpxchg(addr, (jboolean)JNI_FALSE, (jboolean)JNI_TRUE);

It is not clear who the racing participants are here. How can the same thread be being placed on the list from two different actions?

src/hotspot/share/code/nmethod.cpp line 711:

> 709:   // handle the case of an anchor explicitly set in continuation code that doesn't have a callee
> 710:   JavaThread* thread = reg_map->thread();
> 711:   if ((thread->has_last_Java_frame() && fr.sp() == thread->last_Java_sp()) JVMTI_ONLY(|| (method()->is_continuation_enter_intrinsic() && thread->on_monitor_waited_event()))) {

Suggestion:

  if ((thread->has_last_Java_frame() && fr.sp() == thread->last_Java_sp()) 
      JVMTI_ONLY(|| (method()->is_continuation_enter_intrinsic() && thread->on_monitor_waited_event()))) {

src/hotspot/share/runtime/objectMonitor.cpp line 132:

> 130: 
> 131: // -----------------------------------------------------------------------------
> 132: // Theory of operations -- Monitors lists, thread residency, etc:

This comment block needs updating now owner is not a JavaThread*, and to account for vthread usage

src/hotspot/share/runtime/objectMonitor.cpp line 1140:

> 1138: }
> 1139: 
> 1140: bool ObjectMonitor::resume_operation(JavaThread* current, ObjectWaiter* node, ContinuationWrapper& cont) {

Explanatory comment would be good - thanks.

src/hotspot/share/runtime/objectMonitor.cpp line 1532:

> 1530:   } else if (java_lang_VirtualThread::set_onWaitingList(vthread, vthread_cxq_head())) {
> 1531:     // Virtual thread case.
> 1532:     Trigger->unpark();

So ignoring for the moment that I can't see how `set_onWaitingList` could return false here, the check is just an optimisation to reduce the number of unparks issued i.e. only unpark if the list has changed?

src/hotspot/share/runtime/objectMonitor.cpp line 1673:

> 1671: 
> 1672:   ContinuationEntry* ce = current->last_continuation();
> 1673:   if (interruptible && ce != nullptr && ce->is_virtual_thread()) {

So IIUC this use of `interruptible` would be explained as follows:

// Some calls to wait() occur in contexts that still have to pin a vthread to its carrier.
// All such contexts perform non-interruptible waits, so by checking `interruptible` we know 
// this is a regular Object.wait call.

src/hotspot/share/runtime/objectMonitor.cpp line 1698:

> 1696:     // on _WaitSetLock so it's not profitable to reduce the length of the
> 1697:     // critical section.
> 1698: 

Please restore the blank line, else it looks like the comment block pertains to the `wait_reenter_begin`, but it doesn't.

src/hotspot/share/runtime/objectMonitor.cpp line 2028:

> 2026:   // First time we run after being preempted on Object.wait().
> 2027:   // Check if we were interrupted or the wait timed-out, and in
> 2028:   // that case remove ourselves from the _WaitSet queue.

I'm not sure how to interpret this comment block - is this really two sentences because the first is not actually a sentence. Also unclear what "run" and "First time" relate to.

src/hotspot/share/runtime/objectMonitor.cpp line 2054:

> 2052:   // Mark that we are at reenter so that we don't call this method again.
> 2053:   node->_at_reenter = true;
> 2054:   assert(!has_owner(current), "invariant");

The position of this assert seems odd as it seems to be something that should hold at entry to this method.

src/hotspot/share/runtime/objectMonitor.hpp line 174:

> 172: 
> 173:   int64_t volatile _owner;  // Either tid of owner, NO_OWNER, ANONYMOUS_OWNER or DEFLATER_MARKER.
> 174:   volatile uint64_t _previous_owner_tid;  // thread id of the previous owner of the monitor

Looks odd to have the current owner as `int64_t` but we save the previous owner as `uint64_t`. ??

src/hotspot/share/runtime/objectMonitor.hpp line 207:

> 205: 
> 206:   static void Initialize();
> 207:   static void Initialize2();

Please add comment why this needs to be deferred - and till after what?

src/hotspot/share/runtime/objectMonitor.hpp line 312:

> 310:   void      set_successor(JavaThread* thread);
> 311:   void      set_successor(oop vthread);
> 312:   void      clear_successor();

Needs descriptive comments, or at least a preceding comment explaining what a "successor" is.

src/hotspot/share/runtime/objectMonitor.hpp line 349:

> 347:   ObjectWaiter* first_waiter()                                         { return _WaitSet; }
> 348:   ObjectWaiter* next_waiter(ObjectWaiter* o)                           { return o->_next; }
> 349:   JavaThread* thread_of_waiter(ObjectWaiter* o)                        { return o->_thread; }

This no longer looks correct if the waiter is a vthread. ??

src/hotspot/share/runtime/objectMonitor.inline.hpp line 110:

> 108: }
> 109: 
> 110: // Returns null if DEFLATER_MARKER is observed.

Comment needs updating

src/hotspot/share/runtime/objectMonitor.inline.hpp line 130:

> 128: // Returns true if owner field == DEFLATER_MARKER and false otherwise.
> 129: // This accessor is called when we really need to know if the owner
> 130: // field == DEFLATER_MARKER and any non-null value won't do the trick.

Comment needs updating

src/hotspot/share/runtime/synchronizer.cpp line 670:

> 668:   // Top native frames in the stack will not be seen if we attempt
> 669:   // preemption, since we start walking from the last Java anchor.
> 670:   NoPreemptMark npm(current);

Don't we still pin for JNI monitor usage?

src/hotspot/share/runtime/synchronizer.cpp line 1440:

> 1438: }
> 1439: 
> 1440: ObjectMonitor* ObjectSynchronizer::inflate_impl(JavaThread* inflating_thread, oop object, const InflateCause cause) {

`inflating_thread` doesn't sound right as it is always the current thread that is doing the inflating. The passed in thread may be a different thread trying to acquire the monitor ... perhaps `contending_thread`?

src/hotspot/share/runtime/synchronizer.hpp line 172:

> 170: 
> 171:   // Iterate ObjectMonitors where the owner is thread; this does NOT include
> 172:   // ObjectMonitors where owner is set to a stack lock address in thread.

Comment needs updating

-------------

PR Review: https://git.openjdk.org/jdk/pull/21565#pullrequestreview-2393922768
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815838204
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815839094
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815840245
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815985700
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815998417
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816002660
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816009160
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816014286
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816017269
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816018848
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815956322
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816040287
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815959203
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815960013
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815967260
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1815969101
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816043275
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816047142
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1816041444


More information about the nio-dev mailing list