RFR: 8338383: Implement JEP 491: Synchronize Virtual Threads without Pinning

Wed Nov 6 17:40:03 UTC 2024

On Sat, 26 Oct 2024 01:40:41 GMT, Dean Long <dlong at openjdk.org> wrote:

>> This is the implementation of JEP 491: Synchronize Virtual Threads without Pinning. See [JEP 491](https://bugs.openjdk.org/browse/JDK-8337395) for further details.
>> 
>> In order to make the code review easier the changes have been split into the following initial 4 commits:
>> 
>> - Changes to allow unmounting a virtual thread that is currently holding monitors.
>> - Changes to allow unmounting a virtual thread blocked on synchronized trying to acquire the monitor.
>> - Changes to allow unmounting a virtual thread blocked in `Object.wait()` and its timed-wait variants.
>> - Changes to tests, JFR pinned event, and other changes in the JDK libraries.
>> 
>> The changes fix pinning issues for all 4 ports that currently implement continuations: x64, aarch64, riscv and ppc. Note: ppc changes were added recently and stand in its own commit after the initial ones.
>> 
>> The changes fix pinning issues when using `LM_LIGHTWEIGHT`, i.e. the default locking mode, (and `LM_MONITOR` which comes for free), but not when using `LM_LEGACY` mode. Note that the `LockingMode` flag has already been deprecated ([JDK-8334299](https://bugs.openjdk.org/browse/JDK-8334299)), with the intention to remove `LM_LEGACY` code in future releases.
>> 
>> 
>> ## Summary of changes
>> 
>> ### Unmount virtual thread while holding monitors
>> 
>> As stated in the JEP, currently when a virtual thread enters a synchronized method or block, the JVM records the virtual thread's carrier platform thread as holding the monitor, not the virtual thread itself. This prevents the virtual thread from being unmounted from its carrier, as ownership information would otherwise go wrong. In order to fix this limitation we will do two things:
>> 
>> - We copy the oops stored in the LockStack of the carrier to the stackChunk when freezing (and clear the LockStack). We copy the oops back to the LockStack of the next carrier when thawing for the first time (and clear them from the stackChunk). Note that we currently assume carriers don't hold monitors while mounting virtual threads.
>> 
>> - For inflated monitors we now record the `java.lang.Thread.tid` of the owner in the ObjectMonitor's `_owner` field instead of a JavaThread*. This allows us to tie the owner of the monitor to a `java.lang.Thread` instance, rather than to a JavaThread which is only created per platform thread. The tid is already a 64 bit field so we can ignore issues of the counter wrapping around.
>> 
>> #### General notes about this part:
>> 
>> - Since virtual th...
>
> src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 1555:
> 
>> 1553:   // Make VM call. In case of preemption set last_pc to the one we want to resume to.
>> 1554:   adr(rscratch1, resume_pc);
>> 1555:   str(rscratch1, Address(rthread, JavaThread::last_Java_pc_offset()));
> 
> Is it really needed to set an alternative last_Java_pc()?  I couldn't find where it's used in a way that would require a different value.

Its indeed difficult to see how the value is propagaged. I think it goes like this:

- read from the frame anchor and set as pc of `_last_frame`: https://github.com/pchilano/jdk/blob/66d5385f8a1c84e73cdbf385239089a7a9932a9e/src/hotspot/share/runtime/continuationFreezeThaw.cpp#L517
- copied to the result of `new_heap_frame`: https://github.com/pchilano/jdk/blob/66d5385f8a1c84e73cdbf385239089a7a9932a9e/src/hotspot/cpu/aarch64/continuationFreezeThaw_aarch64.inline.hpp#L99
- Written to the frame here: https://github.com/pchilano/jdk/blob/66d5385f8a1c84e73cdbf385239089a7a9932a9e/src/hotspot/cpu/aarch64/continuationFreezeThaw_aarch64.inline.hpp#L177
- Here it's done when freezing fast: https://github.com/pchilano/jdk/blob/66d5385f8a1c84e73cdbf385239089a7a9932a9e/src/hotspot/share/runtime/continuationFreezeThaw.cpp#L771

> src/hotspot/cpu/aarch64/interp_masm_aarch64.cpp line 1567:
> 
>> 1565: 
>> 1566:   // In case of preemption, this is where we will resume once we finally acquire the monitor.
>> 1567:   bind(resume_pc);
> 
> If the idea is that we return directly to `resume_pc`, because of `last_Java_pc`(), then why do we poll `preempt_alternate_return_offset` above?

The address at `preempt_alternate_return_offset` is how to continue immediately after the call was preempted. It's where the vthread frames are popped off the carrier stack.

At `resume_pc` execution continues when the vthread becomes runnable again. Before its frames were thawed and copied to its carriers stack.

> src/hotspot/cpu/x86/stubGenerator_x86_64.cpp line 3796:
> 
>> 3794:   __ movbool(rscratch1, Address(r15_thread, JavaThread::preemption_cancelled_offset()));
>> 3795:   __ testbool(rscratch1);
>> 3796:   __ jcc(Assembler::notZero, preemption_cancelled);
> 
> If preemption was canceled, then I wouldn't expect patch_return_pc_with_preempt_stub() to get called.  Does this mean preemption can get canceled (asynchronously be a different thread?) even afgter patch_return_pc_with_preempt_stub() is called?

The comment at the `preemption_cancelled` label explains that a second attempt to acquire the monitor succeeded after freezing. The vthread has to continue execution. For that its frames (removed just above) need to be thawed again.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1817702223
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1817702986
PR Review Comment: https://git.openjdk.org/jdk/pull/21565#discussion_r1817703994