RFR(S) Contended Locking fast exit bucket (8073165)

David Holmes david.holmes at oracle.com
Fri Mar 13 04:50:09 UTC 2015


Hi Dan,

Well that was fun ... :)

On 10/03/2015 3:11 AM, Daniel D. Daugherty wrote:
> Greetings,
>
> I have the Contended Locking fast exit bucket ready for review.
>
> The code changes in this bucket are primarily assembly language
> tweaks that come into play when an inflated ObjectMonitor is in
> use. There are tweaks that apply to the default code path and
> there are tweaks that only apply when an experimental flag is
> set. This code code review is focused on the default code path
> or the "out of the box" configuration.
>
> This work is being tracked by the following bug ID:
>
>      JDK-8073165 Contended Locking fast exit bucket
>      https://bugs.openjdk.java.net/browse/JDK-8073165
>
> Here is the webrev URL:
>
> http://cr.openjdk.java.net/~dcubed/8073165-webrev/0-jdk9-hs-rt/
>
> Here is the JEP link:
>
>      https://bugs.openjdk.java.net/browse/JDK-8046133
>
> Testing:
>
> - Aurora Adhoc RT/SVC baseline batch
> - JPRT test jobs
> - MonitorExitStresser micro-benchmark (in process)
> - MonitorEnterExitStresser micro-benchmark (in process)
> - CallTimerGrid stress testing (in process)
> - Aurora performance testing:
>    - out of the box for the "promotion" and 32-bit server configs
>    - heavy weight monitors for the "promotion" and 32-bit server configs
>      (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>      (in process)
>
>
> 8073165 summary of changes:
>
> macroAssembler_sparc.cpp: MacroAssembler::compiler_unlock_object()
>
> - default (OOTB) EmitSync value == 0
> - an _owner field optimization is in the non-default path under
>    a (EmitSync & 1024) check
>
> - classic lock release without 1-0 support is in the non-default
>    path under a (EmitSync & 512) check
>
> - an EntryList, cxq and _succ field optimization is in the
>    non-default path under a (EmitSync & 16384) check
>
> - the default path is now the 1-0 form that avoids CAS and membar
>    in the common case; see the "1-0 exit" section in objectMonitor.cpp

I don't pretend that I could follow all these changes (I don't read 
sparc very well) - the main thing of interest is the EmitSync==0 path - 
everything else is just experimental (did you actually test to any great 
extent with all these different possible EmitSync values?).

So looking at the default path ... I could not figure out what logic 
this (pre-existing) code is effecting:

3071      ld_ptr(Address(Rmark,
OM_OFFSET_NO_MONITOR_VALUE_TAG(recursions)), Rbox);
3072      orcc(Rbox, G0, G0);
3073      brx(Assembler::notZero, false, Assembler::pn, done);

It seems to be a check for:

if (recursions != 0) return;

which would be fine except where is recursions decremented ??

But that aside the description seems reasonable.

> sharedRuntime_sparc.cpp:
>
> - add JavaThread* param for
>    SharedRuntime::complete_monitor_unlocking_C call

Ok.

> macroAssembler_x86.cpp: MacroAssembler::fast_unlock()
>
> - with all the new/modified comments and all the optional
>    code path updates, it is hard to see that the only real
>    changes to the default code path are 1) the _owner==Self
>    check has been made optional and 2) the optional succ
>    field check for the slow path is now on the default path
>
> - default (OOTB) EmitSync value == 0
> - remove optional (EmitSync & 8) code
> - refactor code in MacroAssembler::fast_unlock
> - remove old mfence() support

So I had expected to see this change conditional on the runtime 
processor type. But then I checked Dave's blog and saw this was 
determined 5+ years ago (!!!) so those older processors are no longer a 
consideration. I do hope the performance tests have been re-run since 
then though ;-)

> - an _owner field optimization is in the non-default path under
>    a (EmitSync & 1024) check

Doesn't the comment:

2163     if (EmitSync & 1024) {
2164       // Don't bother to ratify that m_Owner==Self

apply to else part?

> - an _owner field optimization is in the non-default path under
>    a (EmitSync & 16) check
> - a barrier optimization is in the non-default path under a
>    (EmitSync & 32) check
> - remove old AMD optional work around

2229       // box is really RAX -- the following CMPXCHG depends on that 
binding
2230       // cmpxchg R,[M] is equivalent to rax = CAS(M,rax,R)
2231
2232       movptr(boxReg, (int32_t)NULL_WORD);     // box is really RAX

You've repeated the comment about box.

> sharedRuntime_x86_32.cpp:
> sharedRuntime_x86_64.cpp:
>
> - add thread param for SharedRuntime::complete_monitor_unlocking_C
>    call

Ok.

> macro.[ch]pp: PhaseMacroExpand::make_slow_call()
>
> - add param2 and update callers

Ok.

> runtime.[ch]pp: OptoRuntime::complete_monitor_exit_Type()
>
> - add JavaThread* parameter

Ok

> sharedRuntime.[ch]pp: SharedRuntime::complete_monitor_unlocking_C()
>
> - add JavaThread* parameter

Ok. So all that passing through of the Thread was just to avoid a call 
to JavaThread::current()? I thought that was already optimized to just 
be a register load (at least on some platforms) ?

> synchronizer.cpp: ObjectSynchronizer::omRelease()
>
> - cleanup in use list traversal
> - add assert that ObjectMonitor being released was
>    actually found and extracted
>
> synchronizer.[ch]pp:
>
> - add some comments in various places
> - rename walk_monitor_list() -> deflate_monitor_list()
>    and update callers
> - minor logic refactoring

Generally okay.

Minor nit: you changed inuse to "in use" but for the "inuse list" it 
should really be "in-use list" (and not in-use-list as appears in a 
couple of existing places)

1507       // if deflate_monitor succeeded,/moribund

Not sure what that comment is supposed to mean :)

> thread.hpp:
>
> - add _TSelf field for mfence() replacement

Hmmmm, so we add a new pointer field to the Thread structure purely for 
use on x86 if running with a particular experimental option set ? That 
doesn't seem like it fits with our normal considerations for space/time 
trade-offs. Maybe the xchg T,T->TSelf stuff should be under a 
compile-flag instead?

>
> Thanks, in advance, for any comments, questions or suggestions.

Cheers,
David

> Dan


More information about the hotspot-runtime-dev mailing list