RFR: 8338383: Implementation of Synchronize Virtual Threads without Pinning [v3]

Axel Boldt-Christmas aboldtch at openjdk.org
Tue Oct 22 13:50:32 UTC 2024


On Tue, 22 Oct 2024 02:14:23 GMT, Patricio Chilano Mateo <pchilanomate at openjdk.org> wrote:

>> This is the implementation of JEP 491: Synchronize Virtual Threads without Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for further details.
>> 
>> In order to make the code review easier the changes have been split into the following initial 4 commits:
>> 
>> - Changes to allow unmounting a virtual thread that is currently holding monitors.
>> - Changes to allow unmounting a virtual thread blocked on synchronized trying to acquire the monitor.
>> - Changes to allow unmounting a virtual thread blocked in `Object.wait()` and its timed-wait variants.
>> - Changes to tests, JFR pinned event, and other changes in the JDK libraries.
>> 
>> The changes fix pinning issues for all 4 ports that currently implement continuations: x64, aarch64, riscv and ppc. Note: the ppc changes were added recently and stand in their own commit after the initial ones.
>> 
>> The changes fix pinning issues when using `LM_LIGHTWEIGHT`, i.e. the default locking mode (and `LM_MONITOR`, which comes for free), but not when using `LM_LEGACY` mode. Note that the `LockingMode` flag has already been deprecated ([JDK-8334299](https://bugs.openjdk.org/browse/JDK-8334299)), with the intention to remove `LM_LEGACY` code in future releases.
>> 
>> 
>> ## Summary of changes
>> 
>> ### Unmount virtual thread while holding monitors
>> 
>> As stated in the JEP, currently when a virtual thread enters a synchronized method or block, the JVM records the virtual thread's carrier platform thread as holding the monitor, not the virtual thread itself. This prevents the virtual thread from being unmounted from its carrier, since the ownership information would otherwise become incorrect. In order to fix this limitation we will do two things:
>> 
>> - We copy the oops stored in the LockStack of the carrier to the stackChunk when freezing (and clear the LockStack). We copy the oops back to the LockStack of the next carrier when thawing for the first time (and clear them from the stackChunk). Note that we currently assume carriers don't hold monitors while mounting virtual threads.
>> 
>> - For inflated monitors we now record the `java.lang.Thread.tid` of the owner in the ObjectMonitor's `_owner` field instead of a JavaThread*. This allows us to tie the owner of the monitor to a `java.lang.Thread` instance, rather than to a JavaThread, which is only created per platform thread. The tid is already a 64-bit field, so we can ignore issues of the counter wrapping around.
>> 
>> #### General notes about this part:
>> 
>> - Since virtual th...
>
> Patricio Chilano Mateo has updated the pull request incrementally with six additional commits since the last revision:
> 
>  - Fix comments in objectMonitor.hpp
>  - Move frame::saved_thread_address() to platform dependent files
>  - Fix typo in jvmtiExport.cpp
>  - remove usage of frame::metadata_words in possibly_adjust_frame()
>  - Fix comments in c2 locking paths
>  - Revert and simplify changes to c1_Runtime1 on aarch64 and riscv

src/hotspot/share/runtime/continuationFreezeThaw.cpp line 2247:

> 2245:     _thread->lock_stack().move_from_address(tmp_lockstack, lockStackSize);
> 2246: 
> 2247:     chunk->set_lockstack_size(0);

After some discussion here at the office we think there might be an issue with simply hiding the oops without clearing them. Below in `recurse_thaw` we `do_barriers`, but that does not touch the lockstack. Missing the SATB store barrier is probably fine from a liveness perspective, because the oops in the lockstack must also be in the frames. But removing the oops without a barrier and without clearing them will probably lead to problems down the line.

Something like the following would probably handle this. Or we could even fuse `copy_lockstack` and `clear_lockstack` together into some kind of `transfer_lockstack` which both loads and clears the oops (a rough sketch of that variant follows after the diff).


```
diff --git a/src/hotspot/share/oops/stackChunkOop.cpp b/src/hotspot/share/oops/stackChunkOop.cpp
index d3d63533eed..f737bd2db71 100644
--- a/src/hotspot/share/oops/stackChunkOop.cpp
+++ b/src/hotspot/share/oops/stackChunkOop.cpp
@@ -470,6 +470,28 @@ void stackChunkOopDesc::copy_lockstack(oop* dst) {
   }
 }
 
+void stackChunkOopDesc::clear_lockstack() {
+  const int cnt = lockstack_size();
+  const bool requires_gc_barriers = is_gc_mode() || requires_barriers();
+  const bool requires_uncompress = has_bitmap() && UseCompressedOops;
+  const auto clear_obj = [&](intptr_t* at) {
+    if (requires_uncompress) {
+      HeapAccess<>::oop_store(reinterpret_cast<narrowOop*>(at), nullptr);
+    } else {
+      HeapAccess<>::oop_store(reinterpret_cast<oop*>(at), nullptr);
+    }
+  };
+
+  if (requires_gc_barriers) {
+    intptr_t* lockstack_start = start_address();
+    for (int i = 0; i < cnt; i++) {
+      clear_obj(&lockstack_start[i]);
+    }
+  }
+  set_lockstack_size(0);
+  set_has_lockstack(false);
+}
+
 void stackChunkOopDesc::print_on(bool verbose, outputStream* st) const {
   if (*((juint*)this) == badHeapWordVal) {
     st->print_cr("BAD WORD");
diff --git a/src/hotspot/share/oops/stackChunkOop.hpp b/src/hotspot/share/oops/stackChunkOop.hpp
index 28e0576801e..928e94dd695 100644
--- a/src/hotspot/share/oops/stackChunkOop.hpp
+++ b/src/hotspot/share/oops/stackChunkOop.hpp
@@ -167,6 +167,7 @@ class stackChunkOopDesc : public instanceOopDesc {
   void fix_thawed_frame(const frame& f, const RegisterMapT* map);
 
   void copy_lockstack(oop* start);
+  void clear_lockstack();
 
   template <typename OopT, class StackChunkLockStackClosureType>
   inline void iterate_lockstack(StackChunkLockStackClosureType* closure);
diff --git a/src/hotspot/share/runtime/continuationFreezeThaw.cpp b/src/hotspot/share/runtime/continuationFreezeThaw.cpp
index 5b6e48a02f3..e7d505bb9b1 100644
--- a/src/hotspot/share/runtime/continuationFreezeThaw.cpp
+++ b/src/hotspot/share/runtime/continuationFreezeThaw.cpp
@@ -2244,8 +2244,7 @@ NOINLINE intptr_t* Thaw<ConfigT>::thaw_slow(stackChunkOop chunk, Continuation::t
     chunk->copy_lockstack(tmp_lockstack);
     _thread->lock_stack().move_from_address(tmp_lockstack, lockStackSize);
 
-    chunk->set_lockstack_size(0);
-    chunk->set_has_lockstack(false);
+    chunk->clear_lockstack();
     retry_fast_path = true;
   }
 ```
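
For the fused variant, here is a rough sketch of what a `transfer_lockstack` could look like, modeled on the `copy_lockstack`/`clear_lockstack` code above (the accessor names and barrier checks are taken from that code; this is only an illustration, not a tested patch):

```
void stackChunkOopDesc::transfer_lockstack(oop* dst) {
  const int cnt = lockstack_size();
  const bool requires_gc_barriers = is_gc_mode() || requires_barriers();
  const bool requires_uncompress = has_bitmap() && UseCompressedOops;

  // Load the oop at 'at' and clear the slot in the same pass, so that the
  // removal from the chunk goes through the GC barriers.
  const auto transfer_obj = [&](intptr_t* at) -> oop {
    if (requires_gc_barriers) {
      if (requires_uncompress) {
        oop value = HeapAccess<>::oop_load(reinterpret_cast<narrowOop*>(at));
        HeapAccess<>::oop_store(reinterpret_cast<narrowOop*>(at), nullptr);
        return value;
      }
      oop value = HeapAccess<>::oop_load(reinterpret_cast<oop*>(at));
      HeapAccess<>::oop_store(reinterpret_cast<oop*>(at), nullptr);
      return value;
    }
    // Chunk never required barriers: a raw load is enough and the slot
    // does not need to be cleared.
    return *reinterpret_cast<oop*>(at);
  };

  intptr_t* lockstack_start = start_address();
  for (int i = 0; i < cnt; i++) {
    dst[i] = transfer_obj(&lockstack_start[i]);
  }
  set_lockstack_size(0);
  set_has_lockstack(false);
}
```

The caller in `thaw_slow` would then replace the `copy_lockstack` call and the two setters with a single `chunk->transfer_lockstack(tmp_lockstack)`.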

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1810764911
