Safepoint between load and LRB

Fri May 14 20:55:31 UTC 2021

Ah yes, indeed! It seemed obvious to me, but it is in fact far from 
obvious (as is anything to do with C2).

There is one little detail that is worth mentioning in this context: 
there is one permitted use of the from-space oop, namely in NULL-checks. 
If we find that an obj is *only* used in comparison with NULL, then we 
allow the LRB to be elided. And it's obviously ok: it doesn't matter 
whether we compare the from-space oop or the to-space oop with NULL, the 
result is the same. This is frequent enough to make a measurable 
difference in performance.
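
To illustrate why the NULL-check case is safe, here is a toy Java model. 
It is only a sketch and not C2 or Shenandoah code: the FORWARDING map and 
the lrb() helper are made-up stand-ins for the real forwarding pointer 
and barrier. It shows that a NULL comparison gives the same answer with 
or without the barrier.

```java
import java.util.IdentityHashMap;
import java.util.Map;

public class NullCheckElisionSketch {
    // Hypothetical stand-in for GC forwarding information:
    // maps a from-space copy to its to-space copy.
    static final Map<Object, Object> FORWARDING = new IdentityHashMap<>();

    // Toy model of a load-reference barrier: null stays null,
    // a forwarded object resolves to its to-space copy.
    static Object lrb(Object ref) {
        if (ref == null) {
            return null;
        }
        Object forwarded = FORWARDING.get(ref);
        return forwarded != null ? forwarded : ref;
    }

    // NULL-check performed on the value after the barrier.
    static boolean isNullWithBarrier(Object loaded) {
        return lrb(loaded) == null;
    }

    // NULL-check performed directly on the loaded value (barrier elided).
    static boolean isNullWithoutBarrier(Object loaded) {
        return loaded == null;
    }

    public static void main(String[] args) {
        Object fromSpace = new Object();
        Object toSpace = new Object();
        FORWARDING.put(fromSpace, toSpace);

        // The two checks must agree for null, from-space and to-space values.
        for (Object loaded : new Object[] { null, fromSpace, toSpace }) {
            if (isNullWithBarrier(loaded) != isNullWithoutBarrier(loaded)) {
                throw new AssertionError("null checks disagree");
            }
        }
        System.out.println("null checks agree with and without the barrier");
    }
}
```

The barrier in this model never turns a non-NULL reference into NULL or 
the other way around, which is the only property the elision relies on.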

Thanks and regards,
Roman

> Hi Zhengyu and Roman,
> 
> I think I figured out my puzzle.
> 
> The LRB is generated at the node level, so the from_space_oop
> and to_space_oop are two different SSA values, and from_space_oop will
> never be used later except for being stored into a stack slot. I
> incorrectly treated them as the same register. There should be no problem.
> 
> Thanks a lot for the explanation!
> 
> Liang
> 
> 
> 
> 
>     ------------------------------------------------------------------
>     From:Zhengyu Gu <zgu at redhat.com>
>     Send Time:2021 May 14 (Fri.) 20:31
>     To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; shenandoah-dev
>     <shenandoah-dev at openjdk.java.net>; Roman Kennke <rkennke at redhat.com>
>     Subject:Re: Safepoint between load and LRB
> 
> 
> 
>     On 5/13/21 11:34 PM, Liang Mao wrote:
>      > Hi Zhengyu and Roman,
>      >
>      > I'm still a little confused by the "non-atomic" loads and LRB.
>      >
>      > Say we have the following scenario:
>      >
>      > We had an oop_A:
>      >
>      > live_range_1:
>      > step 1: load "oop_A_from_space" from heap
>      > step 2: safe_point: oop_A_from_space spill into stack
>      > step 3: LRB: oop_A_from_space updated to oop_A_to_space
>      > step 4: end of live_range_1
>      >       // oop_A_to_space will not be spilled into stack again
>      >
>      > live_range_2:
>      > step 1: load oop_A (register) from stack, which is oop_A_from_space
>      >
>      > So it looks like we may still get a from-space oop, right?
> 
>     I don't understand what live_range_2 is, and how that can happen. Can
>     you elaborate?
> 
>     -Zhengyu
> 
> 
> 
> 
>      >
>      > Thanks,
>      > Liang
>      >
>      >
>      >
>      >
>      >     ------------------------------------------------------------------
>      >     From:Zhengyu Gu <zgu at redhat.com>
>      >     Send Time:2021 May 13 (Thu.) 20:19
>      >     To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; shenandoah-dev
>      >     <shenandoah-dev at openjdk.java.net>; Roman Kennke <rkennke at redhat.com>
>      >     Subject:Re: Shenandoah hangs on specjvm2008 due to reentrant handshake
>      >
>      >
>      >
>      >      >
>      >      > Is there a chance that threads will load a stale oop in from-space and
>      >      > hold it after fix-ref is done, and then store the stale oop into the heap before
>      >      > entry_update_thread_roots?
>      >      >
>      >
>      >     No, LRB guarantees no from-space oops can be written to heap when it is
>      >     armed.
>      >
>      >      > Is there a plan to re-use the barrier data to implement LRB, so as to completely
>      >      > avoid the "non-atomic" load and LRB, like ZGC does?
>      >      >
>      >
>      >     ZGC uses late barrier expansion to solve the problem, which Shenandoah
>      >     has trouble doing. Roman (cc'd) can explain better.
>      >
>      >     -Zhengyu
>      >
>      >      > Thanks,
>      >      > Liang
>      >      >
>      >      > ------------------------------------------------------------------
>      >      > From:Zhengyu Gu <zgu at redhat.com>
>      >      > Send Time:2021 May 12 (Wed.) 20:25
>      >      > To:"MAO, Liang" <maoliang.ml at alibaba-inc.com>; shenandoah-dev
>      >      > <shenandoah-dev at openjdk.java.net>
>      >      > Subject:Re: Shenandoah hangs on specjvm2008 due to reentrant handshake
>      >      >
>      >      > Hi Liang,
>      >      >
>      >      >
>      >      > On 5/12/21 6:53 AM, Liang Mao wrote:
>      >      >  > Hi Shenandoah team,
>      >      >  >
>      >      >  > We found this issue while running Shenandoah with jdk/master:
>      >      >  > https://bugs.openjdk.java.net/browse/JDK-8266963 in which we had a simple fix
>      >      >  > so far.
>      >      >
>      >      > Thanks for reporting. Could you please post your fix?
>      >      >
>      >      >  >
>      >      >  > The hanging comes from
>      >      >  > https://bugs.openjdk.java.net/browse/JDK-8262443 which
>      >      >  > introduced the possibility of a reentrant handshake, and it looks like only Shenandoah
>      >      >  > would hit the scenario. I have a question specific to Shenandoah: why do we
>      >      >  > need the "entry_update_thread_roots" phase?
>      >      >  > I think we already concurrently relocate the oops in thread roots by
>      >      >  > "entry_thread_roots".
>      >      >
>      >      > entry_thread_roots fixes on-stack frames, but cannot prevent threads
>      >      > from loading from-space oops as they continue executing. Ideally,
>      >      > Shenandoah's LRB would prevent this from happening if the LRB were "atomic"
>      >      > (no safepoints between load and LRB); unfortunately, the current LRB is not.
>      >      >
>      >      >  > And "entry_update_thread_roots" doesn't use a general concurrent stack processing
>      >      >  > via stack watermark but just a handshake.
>      >      >
>      >      > At the point we execute entry_update_thread_roots, the cset has been completely
>      >      > evacuated; therefore, there are no more from-space oops in the heap. There are very
>      >      > rare cases where threads can still hold stale oops, which can be flushed out
>      >      > very quickly, especially if we can handshake individual Java threads and
>      >      > not cause a global safepoint.
>      >      >
>      >      > -Zhengyu
>      >      >
>      >      >  >
>      >      >  > Thanks,
>      >      >  > Liang
>      >      >  >
>      >      >
>      >
>      > 
> 
>