[aarch64-port-dev ] RFR: 8217368: AArch64: C2 recursive stack locking optimisation not triggered

Andrew Haley aph at redhat.com
Fri Jan 18 09:36:31 UTC 2019


On 1/18/19 8:40 AM, Nick Gasson (Arm Technology China) wrote:
> Hi,
> 
> While I was cleaning up the patch for 8216350 I noticed an issue in the 
> implementation of recursive locking in aarch64_enc_fast_lock:
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8217368
> Webrev: http://cr.openjdk.java.net/~ngasson/8217368/webrev.0/
> 
> First we load the markOop of the object we want to lock and OR it with 
> markOopDesc::unlocked_value (1). Then we do a CAS to exchange the 
> address of the box on our thread's stack with the object's header word 
> iff it's equal to the (markOop | 1) we just computed. If this fails, 
> then we should check for a recursive lock by comparing
> 
>    (~(page size - 1) | 3) & (markOop - SP) == 0
> 
> Where "markOop" is the current object header word loaded by the failed 
> CAS. This checks that the lock bits are zero (locked) and the stack 
> address of the displaced header is within one page of the current SP. 
> But on AArch64 we actually do this:
> 
>    (~(page size - 1) | 3) & ((old markOop | 1) - SP) == 0
> 
> Where "old markOop | 1" is the compare-to value used for the CAS. This 
> is always false as the result has at least bit #0 set. This only affects 
> C2, the C1_MacroAssembler version has the correct test.
> 
> The diff looks big but all it does is swap the usage of registers `tmp' 
> and `disp_hdr' in the first section so the markOop loaded by the CAS 
> ends up in disp_hdr and tmp holds the (markOop | 1) compare-to value.
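
For illustration only (this is a model, not the HotSpot code), the two tests described above can be compared in a few lines of Python. The 4096-byte page size, the example SP value, and the function names are assumptions made for this sketch. 64-bit wraparound is emulated with an explicit mask.

```python
MASK64 = (1 << 64) - 1
PAGE_SIZE = 4096   # assumed page size for this sketch
LOCK_MASK = 3      # low two bits of the header word (the lock bits)

def recursive_test(value, sp, page_size=PAGE_SIZE):
    """True when (~(page_size - 1) | 3) & (value - sp) == 0 in 64-bit arithmetic."""
    mask = (~(page_size - 1) | LOCK_MASK) & MASK64
    return (mask & ((value - sp) & MASK64)) == 0

def intended_check(mark_after_cas, sp):
    # Uses the header word that the failed CAS actually loaded.
    return recursive_test(mark_after_cas, sp)

def buggy_check(old_mark, sp):
    # Uses the CAS compare-to value (old mark | 1) instead.
    return recursive_test(old_mark | 1, sp)

# Hypothetical values: an aligned SP and a lock box 0x40 bytes above it.
sp = 0x7fff00001000
box = sp + 0x40
```

With these values `intended_check(box, sp)` is true, while `buggy_check(box, sp)` is false: OR-ing in 1 leaves bit 0 set in the difference whenever SP is even, so the masked result can never be zero.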

The patch looks good. However, I don't understand why we aren't using
MacroAssembler::cmpxchgptr here. It looks like we should be, and you'd
end up with a less complex result.

> Two other minor things:
> 
> * Does anyone know what the comment "// Load Compare Value application 
> register." means? It's present in the PPC and S390 ports too.

Probably no-one can remember. We'll have inherited it from x86.

> * The x86 port #ifdef LP64 uses "7 - os::vm_page_size()" as the mask in 
> the recursive lock test. I think the "7" here is 
> markOopDesc::biased_lock_mask and is presumably there to prevent a 
> silent mutual exclusion failure if a markOop with the biased locking bits 
> set ends up in the fast_lock path (although this should never happen). 
> Should we change markOopDesc::lock_mask_in_place to 
> markOopDesc::biased_lock_mask_in_place in the AArch64 port too?

I wouldn't think so. You're describing a change that by definition we
can't test.
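
As a side note on the arithmetic only, the sketch below shows that "7 - page size" is the same two's-complement value as ~(page size - 1) | 7, and how a mask with low bits 7 differs from one with low bits 3. The page size, SP value, and helper names are assumptions for illustration. It says nothing about whether such a header word can actually reach this path.

```python
MASK64 = (1 << 64) - 1

def x86_style_mask(page_size):
    # The form quoted above: 7 - os::vm_page_size()
    return 7 - page_size

def explicit_mask(page_size, low_bits):
    # The form used in the recursive-lock test: ~(page size - 1) | low_bits
    return ~(page_size - 1) | low_bits

def passes(mask, value, sp):
    # Emulate the 64-bit test: mask & (value - sp) == 0
    return ((mask & (value - sp)) & MASK64) == 0

sp = 0x7fff00001000          # hypothetical, aligned SP
mask3 = explicit_mask(4096, 3)
mask7 = explicit_mask(4096, 7)
```

Here `x86_style_mask(4096)` equals `mask7`. A value whose difference from SP has only bit 2 set in its low three bits, such as `sp + 0x44`, passes the test with `mask3` but fails it with `mask7`.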

-- 
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

