[aarch64-port-dev ] RFR: 8217368: AArch64: C2 recursive stack locking optimisation not triggered
Andrew Haley
aph at redhat.com
Fri Jan 18 09:36:31 UTC 2019
On 1/18/19 8:40 AM, Nick Gasson (Arm Technology China) wrote:
> Hi,
>
> While I was cleaning up the patch for 8216350 I noticed an issue in the
> implementation of recursive locking in aarch64_enc_fast_lock:
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8217368
> Webrev: http://cr.openjdk.java.net/~ngasson/8217368/webrev.0/
>
> First we load the markOop of the object we want to lock and OR it with
> markOopDesc::unlocked_value (1). Then we do a CAS to exchange the
> address of the box on our thread's stack with the object's header word
> iff it's equal to the (markOop | 1) we just computed. If this fails,
> then we should check for a recursive lock by comparing
>
> (~(page size - 1) | 3) & (markOop - SP) == 0
>
> Where "markOop" is the current object header word loaded by the failed
> CAS. This checks that the lock bits are zero (locked) and the stack
> address of the displaced header is within one page of the current SP.
> But on AArch64 we actually do this:
>
> (~(page size - 1) | 3) & ((old markOop | 1) - SP) == 0
>
> Where "old markOop | 1" is the compare-to value used for the CAS. This
> is always false as the result has at least bit #0 set. This only affects
> C2, the C1_MacroAssembler version has the correct test.
>
> The diff looks big but all it does is swap the usage of registers `tmp'
> and `disp_hdr' in the first section so the markOop loaded by the CAS
> ends up in disp_hdr and tmp holds the (markOop | 1) compare-to value.
The patch looks good. However, I don't understand why we aren't using
MacroAssembler::cmpxchgptr here. It looks like we should be, and you'd
end up with a less complex result.
> Two other minor things:
>
> * Does anyone know what the comment "// Load Compare Value application
> register." means? It's present in the PPC and S390 ports too.
Probably no-one can remember. We'll have inherited it from x86.
> * The x86 port #ifdef LP64 uses "7 - os::vm_page_size()" as the mask in
> the recursive lock test. I think the "7" here is
> markOopDesc::biased_lock_mask and is presumably there to prevent a
> silent mutual exclusion failure if a markOop with the bias locking bits
> set ends up the fast_lock path (although this should never happen).
> Should we change markOopDesc::lock_mask_in_place to
> markOopDesc::biased_lock_mask_in_place in the AArch64 port too?
I wouldn't think so. You're describing a change that by definition we
can't test.
--
Andrew Haley
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
More information about the hotspot-compiler-dev
mailing list