RFR(S) Contended Locking fast enter bucket (8061553)
Daniel D. Daugherty
daniel.daugherty at oracle.com
Tue Nov 11 03:46:18 UTC 2014
The webrev is now available! Sorry for any confusion.
Dan
On 11/6/14 7:17 PM, Daniel D. Daugherty wrote:
> The fix for JDK-8062851 has been reviewed, tested and pushed to
> RT_Baseline.
>
> Time to get back to this review thread so here's an updated webrev:
>
> http://cr.openjdk.java.net/~dcubed/8061553-webrev/1-jdk9-hs-rt/
>
> David H., I believe I've addressed all of your comments. Please
> let me know if I missed something...
>
> Thanks, in advance, for any comments, questions or suggestions.
>
> Dan
>
>
> On 11/4/14 11:26 AM, Daniel D. Daugherty wrote:
>> The cleanup is turning into a bigger change than the fast enter
>> bucket itself so I'm spinning the cleanup into a new bug:
>>
>> JDK-8062851 cleanup ObjectMonitor offset adjustments
>> https://bugs.openjdk.java.net/browse/JDK-8062851
>>
>> Yes, this means that the Contended Locking cleanup bucket has reopened
>> for yet another change...
>>
>> We'll get back to "fast enter" after the dust has settled...
>>
>> Dan
>>
>>
>> On 11/3/14 6:59 PM, Daniel D. Daugherty wrote:
>>> David,
>>>
>>> Thanks for the review! As usual, replies are embedded below...
>>>
>>>
>>> On 11/2/14 9:44 PM, David Holmes wrote:
>>>> Hi Dan,
>>>>
>>>> Looks good.
>>>
>>> Thanks!
>>>
>>>
>>>> Couple of nits and one semantic query below ...
>>>>
>>>> src/cpu/sparc/vm/macroAssembler_sparc.cpp
>>>>
>>>> Formatting changes were a bit of a distraction.
>>>
>>> Yes, I have no idea what got into me. Normally I do formatting
>>> changes separately so the noise does not distract...
>>>
>>> It turns out there is a constant defined that should be used
>>> instead of all these literal '2's:
>>>
>>> src/share/vm/oops/markOop.hpp: monitor_value = 2
>>>
>>> Typically used as follows:
>>>
>>> src/cpu/x86/vm/macroAssembler_x86.cpp: int owner_offset =
>>> ObjectMonitor::owner_offset_in_bytes() - markOopDesc::monitor_value;
>>>
>>> I will clean this up just for the files that I'm touching as
>>> part of this fix.
>>>
>>>
>>>>
>>>> ---
>>>>
>>>> src/cpu/x86/vm/macroAssembler_x86.cpp
>>>>
>>>> Formatting changes were a bit of a distraction.
>>>
>>> Same reply as for macroAssembler_sparc.cpp.
>>>
>>>
>>>> 1929 // unconditionally set stackBox->_displaced_header = 3
>>>> 1930 movptr(Address(boxReg, 0),
>>>> (int32_t)intptr_t(markOopDesc::unused_mark()));
>>>>
>>>> At 1870 we refer to box rather than stackBox. Also it takes some
>>>> sleuthing to realize that "3" here is somehow a pseudonym for
>>>> unused_mark(). Back up at 1808 we have a to-do:
>>>>
>>>> 1808 // use markOop::unused_mark() instead of "3".
>>>>
>>>> so the current change seems to be implementing that, even though
>>>> other uses of "3" are left untouched.
>>>
>>> I'll take a look at cleaning those up also...
>>>
>>> In some cases markOopDesc::marked_value will work for the literal '3',
>>> but in other cases we'll use markOop::unused_mark():
>>>
>>> static markOop unused_mark() {
>>> return (markOop) marked_value;
>>> }
>>>
>>> to save us the noise of the (markOop) cast.
>>>
>>>
>>>> ---
>>>>
>>>> src/share/vm/runtime/sharedRuntime.cpp
>>>>
>>>> 1794 JRT_BLOCK_ENTRY(void,
>>>> SharedRuntime::complete_monitor_locking_C(oopDesc* _obj, BasicLock*
>>>> lock, JavaThread* thread))
>>>> 1795 if (!SafepointSynchronize::is_synchronizing()) {
>>>> 1796 if (ObjectSynchronizer::quick_enter(_obj, thread, lock))
>>>> return;
>>>>
>>>> Is it necessary to check is_synchronizing? If we are executing this
>>>> code we are not at a safepoint and the quick_enter wont change
>>>> that, so I'm not sure what we are guarding against.
>>>
>>> So this first state checker:
>>>
>>> src/share/vm/runtime/safepoint.hpp:
>>> inline static bool is_synchronizing() { return _state ==
>>> _synchronizing; }
>>>
>>> means that we want to go to a safepoint and:
>>>
>>> inline static bool is_at_safepoint() { return _state ==
>>> _synchronized; }
>>>
>>> means that we are at a safepoint. Dice's optimization bails out if
>>> we want to go to a safepoint and ObjectSynchronizer::quick_enter()
>>> has a "No_Safepoint_Verifier nsv" in it so we're expecting that
>>> code to be quick (and not go to a safepoint). I'm not seeing
>>> anything obvious....
>>>
>>> Sometimes we have to be careful with JavaThread suspend requests and
>>> monitor acquisition, but I don't think that's a problem here... In
>>> order for the "suspend requesting" thread to be surprised, the suspend
>>> API, e.g., JVM/TI SuspendThread() has to return to the caller and then
>>> the suspend target has do something unexpected like acquire a monitor
>>> that it was previously blocked upon when it was suspended. We've had
>>> bugs like that in the past... In this optimization case, our target
>>> thread is not blocked on a contended monitor...
>>>
>>> In this particular case, the "suspend requesting" thread will set the
>>> suspend request state on the target thread, but the target thread is
>>> busy trying to enter this uncontended monitor (quickly). So the
>>> "suspend requesting" thread, will request a no-op safepoint, but it
>>> won't return from the suspend API until that safepoint completes.
>>> The safepoint won't complete until the target thread is done acquiring
>>> the previously uncontended monitor... so the target thread will be
>>> suspended while holding the previous uncontended monitor and the
>>> "suspend requesting" thread will return from the suspend API all
>>> happy...
>>>
>>> Well, I don't see the reason either so I'll have to ping Dave Dice
>>> and Karen Kinnear to see if either of them can fill in the history
>>> here. This could be an abundance of caution case.
>>>
>>>
>>>> ---
>>>>
>>>> src/share/vm/runtime/synchronizer.cpp
>>>>
>>>> Minor nit: line 153 the usual acronym is NPE (for
>>>> NullPointerException) not NPX
>>>
>>> I'll do a search for uses of NPX and other uses of 'X' in exception
>>> acronyms...
>>>
>>>
>>>>
>>>> Nit: 159 Thread * const ox
>>>>
>>>> Please change ox to owner.
>>>
>>> Will do.
>>>
>>> Thanks again for the review!
>>>
>>> Dan
>>>
>>>
>>>>
>>>> ---
>>>>
>>>> Thanks,
>>>> David
>>>>
>>>>
>>>>
>>>> On 31/10/2014 8:50 AM, Daniel D. Daugherty wrote:
>>>>> Greetings,
>>>>>
>>>>> I have the Contended Locking fast enter bucket ready for review.
>>>>>
>>>>> The code changes in this bucket are primarily a quick_enter()
>>>>> function that works on inflated but uncontended Java monitors.
>>>>> This quick_enter() function is used on the "slow path" for Java
>>>>> Monitor enter operations when the built-in "fast path" (read
>>>>> assembly code) doesn't work.
>>>>>
>>>>> This work is being tracked by the following bug ID:
>>>>>
>>>>> JDK-8061553 Contended Locking fast enter bucket
>>>>> https://bugs.openjdk.java.net/browse/JDK-8061553
>>>>>
>>>>> Here is the webrev URL:
>>>>>
>>>>> http://cr.openjdk.java.net/~dcubed/8061553-webrev/0-jdk9-hs-rt/
>>>>>
>>>>> Here is the JEP link:
>>>>>
>>>>> https://bugs.openjdk.java.net/browse/JDK-8046133
>>>>>
>>>>> 8061553 summary of changes:
>>>>>
>>>>> macroAssembler_sparc.cpp: MacroAssembler::compiler_lock_object()
>>>>>
>>>>> - clean up spacing around some
>>>>> 'ObjectMonitor::owner_offset_in_bytes() - 2' uses
>>>>> - remove optional (EmitSync & 64) code
>>>>> - change from cmp() to andcc() so icc.zf flag is set
>>>>>
>>>>> macroAssembler_x86.cpp: MacroAssembler::fast_lock()
>>>>>
>>>>> - remove optional (EmitSync & 2) code
>>>>> - rewrite LP64 inflated lock code that tries to CAS in
>>>>> the new owner value to be more efficient
>>>>>
>>>>> interfaceSupport.hpp:
>>>>>
>>>>> - add JRT_BLOCK_NO_ASYNC to permit splitting a
>>>>> JRT_BLOCK_ENTRY into two pieces.
>>>>>
>>>>> sharedRuntime.cpp: SharedRuntime::complete_monitor_locking_C()
>>>>>
>>>>> - change entry type from JRT_ENTRY_NO_ASYNC to JRT_BLOCK_ENTRY
>>>>> to permit ObjectSynchronizer::quick_enter() call
>>>>> - add JRT_BLOCK_NO_ASYNC use if the quick_enter() doesn't work
>>>>> to revert to JRT_ENTRY_NO_ASYNC-like semantics
>>>>>
>>>>> synchronizer.[ch]pp:
>>>>>
>>>>> - add ObjectSynchronizer::quick_enter() for entering an
>>>>> inflated but unowned Java monitor without thread state
>>>>> changes
>>>>>
>>>>> Testing:
>>>>>
>>>>> - Aurora Adhoc RT/SVC baseline batch
>>>>> - JPRT test jobs
>>>>> - MonitorEnterStresser micro-benchmark (in process)
>>>>> - CallTimerGrid stress testing (in process)
>>>>> - Aurora performance testing:
>>>>> - out of the box for the "promotion" and 32-bit server configs
>>>>> - heavy weight monitors for the "promotion" and 32-bit server
>>>>> configs
>>>>> (-XX:-UseBiasedLocking -XX:+UseHeavyMonitors)
>>>>> (in process)
>>>>>
>>>>>
>>>>> Thanks, in advance, for any comments, questions or suggestions.
>>>>>
>>>>> Dan
>>>
>>>
>>
>>
>>
>
>
>
More information about the hotspot-runtime-dev
mailing list