[aarch64-port-dev ] 8153107: Unbalanced recursive locking

Daniel D. Daugherty daniel.daugherty at oracle.com
Sat Jun 16 20:08:49 UTC 2018


Hi Andrey,

I updated the bug report with the Mach5 results so the only thing I need
now is confirmation that ARM testing went fine. So whenever you or aph
or both have some ARM test results we should be good to go.

In particular it would be good to verify that your test that reproduces
this hang no longer reproduces the hang in some relatively high number
of iterations. So if your test repros the hang in 5/100 runs, I would
be happy if the hang didn't repro in a 1000 runs.

Dan


On 6/16/18 12:11 PM, Andrey Petushkov wrote:
> Hi Dan,
>
> Thank you for sponsorship and yes,
> > I don't really know ARM assembly, but I'm guessing that the 'eq'
> > parameter
> >     controls whether the store of Rscratch into "Address(Rbox, ...)"
> > occurs or
> >     not
> You guess is perfectly correct!
>
> Thanks again,
> Andrey
>
>
> On Fri, Jun 15, 2018 at 9:00 PM Daniel D. Daugherty 
> <daniel.daugherty at oracle.com <mailto:daniel.daugherty at oracle.com>> wrote:
>
>     Andrey,
>
>     I'm about to send the following changeset thru Mach5 testing:
>
>     $ hg log -v -r tip
>     changeset:   50592:dc9e3b56c8ab
>     tag:         tip
>     user:        apetushkov
>     date:        Fri Jun 15 13:57:32 2018 -0400
>     files:       src/hotspot/cpu/arm/macroAssembler_arm.cpp
>     src/hotspot/share/runtime/sharedRuntime.cpp
>     description:
>     8153107: enabling ObjectSynchronizer::quick_enter() on ARM64
>     causes hangs
>     Summary: Always set the markword for recursive monitors in
>     MacroAssembler::fast_lock().
>     Reviewed-by: aph, drwhite, dcubed
>
>
>     I don't expect any problems, but I'm paranoid so...
>
>     Dan
>
>
>     On 6/15/18 1:29 PM, Daniel D. Daugherty wrote:
>     > > http://cr.openjdk.java.net/~apetushkov/8153107/
>     <http://cr.openjdk.java.net/%7Eapetushkov/8153107/>
>     >
>     > src/hotspot/share/runtime/sharedRuntime.cpp
>     >     No comments.
>     >
>     > src/hotspot/cpu/arm/macroAssembler_arm.cpp
>     >     Looks correct. Here's my analysis:
>     >
>     >     The X64 version of this code does this:
>     >
>     >     src/hotspot/cpu/x86/macroAssembler_x86.cpp:
>     >
>     >       // Recursive locking.
>     >       // The object is stack-locked: markword contains stack
>     pointer
>     > to BasicLock.
>     >       // Locked by current thread if difference with current SP is
>     > less than one page.
>     >       subptr(tmpReg, rsp);
>     >       // Next instruction set ZFlag == 1 (Success) if difference is
>     > less then one page.
>     >       andptr(tmpReg, (int32_t) (NOT_LP64(0xFFFFF003) LP64_ONLY(7 -
>     > os::vm_page_size())) );
>     >       movptr(Address(boxReg, 0), tmpReg);
>     >       if (counters != NULL) {
>     >         cond_inc32(Assembler::equal,
>     > ExternalAddress((address)counters->fast_path_entry_count_addr()));
>     >       }
>     >       jmp(DONE_LABEL);
>     >
>     >     This code path always updates "Address(boxReg, 0)" with the
>     >     "tmpReg" value so the markword always gets updated with a zero
>     >     or non-zero value.
>     >
>     >     The previous AMD64 code did this:
>     >
>     >       L3011: #ifdef AARCH64
>     >       L3012:   intptr_t mask = ((intptr_t)3) -
>     > ((intptr_t)os::vm_page_size());
>     >       L3013:   Assembler::LogicalImmediate imm(mask, false);
>     >       L3014:   mov(Rscratch, SP);
>     >       L3015:   sub(Rscratch, Rmark, Rscratch);
>     >       L3016:   ands(Rscratch, Rscratch, imm);
>     >       L3017:   b(done, ne); // exit with failure
>     >       L3018:   str(Rscratch, Address(Rbox,
>     > BasicLock::displaced_header_offset_in_bytes())); // set to zero
>     >       L3019:   b(done);
>     >       L3020:
>     >       L3021: #else
>     >       L3022:   // -1- test low 2 bits
>     >       L3023:   movs(Rscratch, AsmOperand(Rmark, lsl, 30));
>     >       L3024:   // -2- test (hdr - SP) if the low two bits are 0
>     >       L3025:   sub(Rscratch, Rmark, SP, eq);
>     >       L3026:   movs(Rscratch, AsmOperand(Rscratch, lsr,
>     > exact_log2(os::vm_page_size())), eq);
>     >       L3027:   // If still 'eq' then recursive locking OK
>     >       L3028:   str(Rscratch, Address(Rbox,
>     > BasicLock::displaced_header_offset_in_bytes()), eq); // set to zero
>     >       L3029:   b(done);
>     >       L3030: #endif
>     >
>     >     For the AARCH64 version, the key bug is the "b(done, ne); //
>     exit
>     > with failure"
>     >     which results in the markword not getting set to a specific
>     value.
>     >     Someone mentioned that we're getting a random value here and
>     that
>     >     contributes to the intermittent nature of the hang. Removing
>     L3017
>     >     so that we always set the markword makes perfect sense.
>     >
>     >     For the non-AARCH64 version, the key bug is a little more
>     obscure
>     >     (at least to me): "str(Rscratch, Address(Rbox, ...), eq); //
>     set
>     > to zero".
>     >     I don't really know ARM assembly, but I'm guessing that the
>     'eq'
>     > parameter
>     >     controls whether the store of Rscratch into "Address(Rbox,
>     ...)"
>     > occurs or
>     >     not. Removing the "eq" parameter in the fix so that we
>     always set the
>     >     markword makes perfect sense.
>     >
>     >
>     > So thumbs up on the fix and I can sponsor it. Please provide me
>     with an
>     > importable changeset and I'll take care of that after
>     confirmation that
>     > the ARM folks have tested the fix and are happy with it.
>     >
>     > Dan
>     >
>     >
>     > On 6/14/18 6:08 PM, White, Derek wrote:
>     >> Hi Andrey,
>     >>
>     >> It took me looking at the code three times, but I finally saw what
>     >> you mean about condition code register. OK, looks fine to me.
>     >>
>     >> You still need a (R)eviewer’s OK and then a sponsor.
>     >>
>     >>
>     >>    *   Derek
>     >>
>     >> From: Andrey Petushkov [mailto:andrey.petushkov at gmail.com
>     <mailto:andrey.petushkov at gmail.com>]
>     >> Sent: Thursday, June 14, 2018 4:55 PM
>     >> To: White, Derek <Derek.White at cavium.com
>     <mailto:Derek.White at cavium.com>>
>     >> Cc: Andrew Haley <aph at redhat.com <mailto:aph at redhat.com>>;
>     >> hotspot-runtime-dev at openjdk.java.net
>     <mailto:hotspot-runtime-dev at openjdk.java.net>;
>     >> aarch64-port-dev at openjdk.java.net
>     <mailto:aarch64-port-dev at openjdk.java.net>; AArch32 Port Project
>     >> <aarch32-port-dev at openjdk.java.net
>     <mailto:aarch32-port-dev at openjdk.java.net>>
>     >> Subject: Re: [aarch64-port-dev ] 8153107: Unbalanced recursive
>     locking
>     >>
>     >>
>     >> External Email
>     >> Hi Derek,
>     >>
>     >> that shall be ands since it expected to leave the result in the
>     flags
>     >> register, see the cmpFastLock instruct in aarch64.ad
>     <http://aarch64.ad><http://aarch64.ad>
>     >> to my taste this version is error-prone, since it's quite hard to
>     >> deduce that contract FastLock node involves writing of non 0 value
>     >> into DHW of stack lock when it has failed. however I don't
>     insist. so
>     >> if everyone agrees could someone please commit. sorry, I don't
>     have
>     >> necessary status
>     >>
>     >>
>     >> Regards,
>     >> Andrey
>     >>
>     >> On Thu, Jun 14, 2018 at 9:13 PM White, Derek
>     >> <Derek.White at cavium.com
>     <mailto:Derek.White at cavium.com><mailto:Derek.White at cavium.com
>     <mailto:Derek.White at cavium.com>>> wrote:
>     >> Hi Andrey,
>     >>
>     >> I like this version.
>     >>
>     >> My only suggestion is minor - the "ands" at line 3016 could be a
>     >> simple "and". It causes no harm but could be confusing to a reader:
>     >>
>     >> 3016   ands(Rscratch, Rscratch, imm);
>     >>
>     >> No need to see a new webrev.
>     >> - Derek
>     >>
>     >>> -----Original Message-----
>     >>> From: hotspot-runtime-dev
>     >>> [mailto:hotspot-runtime-dev-
>     <mailto:hotspot-runtime-dev-><mailto:hotspot-runtime-dev-
>     <mailto:hotspot-runtime-dev->>
>     >>> bounces at openjdk.java.net
>     <mailto:bounces at openjdk.java.net><mailto:bounces at openjdk.java.net
>     <mailto:bounces at openjdk.java.net>>] On Behalf
>     >>> Of Andrey Petushkov
>     >>> Sent: Thursday, June 14, 2018 8:25 AM
>     >>> To: Andrew Haley <aph at redhat.com
>     <mailto:aph at redhat.com><mailto:aph at redhat.com
>     <mailto:aph at redhat.com>>>
>     >>> Cc:
>     >>> hotspot-runtime-dev at openjdk.java.net
>     <mailto:hotspot-runtime-dev at openjdk.java.net><mailto:hotspot-runtime-dev at openjdk.java.net
>     <mailto:hotspot-runtime-dev at openjdk.java.net>>;
>     >>> aarch64-port-
>     >>> dev at openjdk.java.net
>     <mailto:dev at openjdk.java.net><mailto:dev at openjdk.java.net
>     <mailto:dev at openjdk.java.net>>; AArch32 Port
>     >>> Project <aarch32-port-
>     >>> dev at openjdk.java.net
>     <mailto:dev at openjdk.java.net><mailto:dev at openjdk.java.net
>     <mailto:dev at openjdk.java.net>>>
>     >>> Subject: Re: [aarch64-port-dev ] 8153107: Unbalanced recursive
>     locking
>     >>>
>     >>> External Email
>     >>>
>     >>> Hm, strange. It displays well for me in the archives page.
>     Anyway,
>     >>> I've put
>     >>> the webrev at http://cr.openjdk.java.net/~apetushkov/8153107/
>     <http://cr.openjdk.java.net/%7Eapetushkov/8153107/>
>     >>>
>     >>> On Thu, Jun 14, 2018 at 3:15 PM Andrew Haley
>     >>> <aph at redhat.com <mailto:aph at redhat.com><mailto:aph at redhat.com
>     <mailto:aph at redhat.com>>> wrote:
>     >>>
>     >>>> On 06/14/2018 12:59 PM, Andrey Petushkov wrote:
>     >>>>> So then if you prefer to leave the different logic in shared
>     code
>     >>>>> between quick and slow paths I believe the fix for cpu/arm
>     >>>>> implementation (and removal of unnecessary workaround for
>     >>> cpu/aarch64) should look like this:
>     >>>> Something awful happened to the formatting of your mail.  Can
>     you put
>     >>>> it up somewhere we can see the patch?
>     >>>>
>     >>>> --
>     >>>> Andrew Haley
>     >>>> Java Platform Lead Engineer
>     >>>> Red Hat UK Ltd. <https://www.redhat.com>
>     >>>> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332
>     >>>> <https://maps.google.com/?q=6035+332&entry=gmail&source=g>F A671
>     >>>>
>     >
>     >
>



More information about the hotspot-runtime-dev mailing list