Request for reviews (L): 7125896: Eliminate nested locks

Vladimir Kozlov vladimir.kozlov at oracle.com
Thu Jan 5 10:25:14 PST 2012


Thank you, Tom

Vladimir

Tom Rodriguez wrote:
> On Jan 4, 2012, at 3:54 PM, Vladimir Kozlov wrote:
>>
>> I updated webrev:
>>
>> http://cr.openjdk.java.net/~kvn/7125896/webrev.01
> 
> I think that looks good.  Thanks for refactoring it.
> 
> tom
> 
>> New changes:
>>
>> - Added new enum field AbstractLockNode::_kind (I tried to avoid _type name) and corresponding methods instead of boolean fields.
>> - Added new method LockNode::is_nested_lock_region() to find if it is nested.
>> - Added new method BoxLockNode::is_simple_lock_region() to check if Box node is used to lock only one object and to avoid replacing this Box with clone in macro.cpp.
>> - Added new method PhaseMacroExpand::mark_eliminated_box() to move EA eliminated marking code from mark_eliminated_locking_nodes().
>>
>> Thanks,
>> Vladimir
>>
>> Tom Rodriguez wrote:
>>> On Jan 3, 2012, at 6:07 PM, Vladimir Kozlov wrote:
>>>> Tom Rodriguez wrote:
>>>>>>>> I ran full CTW and compiler regression tests and did not hit BoxLockNode merge case so it could be very rare case. But I want to be careful.
>>>>>>> I'm ok with careful but I'm having trouble understanding how it can even happen in normal ideal.  The existing locking code in macro.cpp doesn't appear to handle it.  Phis of BoxLocks shouldn't be possible unless the Phis were just never reprocessed.  All locking in compiled code is required to be block structured so the slot of the BoxLockNodes must match if they feed into a Phi and GVN should common them up to be equivalent, so the Phi should just collapse.
>>>>>> You forgot that BoxLockNode commoning is switched off for EliminateNestedLocks, it is main reason for Phis. Phis are generated for monitorexit (unlocks) when monitorenter (locks) code is merged (as was in loop head cloning case and could be in other cases).
>>>>> How would that result in Phis?  The unlock pulls a box from the JVMState, which I guess could have Phis but again they should all have collapsed since they must refer to the same lock operation and BoxLock.
>>>> No, Box and Lock nodes are different since they are generated on different paths and are not commoning (I had to change code in Parse::ensure_phi() for BoxLockNode). To be clear we are talking about this case:
>>>>
>>>> if (a) {
>>>> monitorenter(obj)
>>>> } else {
>>>> monitorenter(obj)
>>>> }
>>>> monitorexit(obj)
>>> I assumed we wouldn't compile things like that but I guess that's technically block structured.  The monitor matching logic really accepts that as block structured?  I think it should reject it to simplify the compiler.  There's no benefit to accepting code like that since it could always be restructured to properly pair.
>>>>>> Yes, in current code BoxLockNode are commoned with the same slot (but could be for different objects) so it could bei used by eliminated and not eliminated locks. That is why elimination code clones BoxLockNode only for eliminated locks.
>>>>>>
>>>>>>> BoxLocks are used with mach nodes too so during code generation your might get Phis for BoxLocks.  Maybe that's why the logic is there.  I think the macro.cpp code should assert that it never sees a Phi.  This would also simplify all that logic quite a bit since you could just modify the BoxLock in place.  Why isn't the eliminated flag taken from the BoxLock instead of being stored on the AbstractLock?
>>>>>> Because in current code the same BoxLockNode could be referenced by eliminated and not eliminated locks.
>>>>> That's true with commoning of BoxLocks enabled but with it turned off they should follow the block structure and could be eliminated and referenced as a group.
>>>> In above case you have several Box and Phi nodes associated with one locking region. So you can't mark only one Box node.
>>>>
>>>>> I actually think not commoning the BoxLocks should be the only behaviour instead of making it conditional on EliminateNestedLocks, though I have a vague memory of some code that relies on commoning of them for some purpose.  I can't seem to find it though.
>>>> Commoning simplified all compiler optimizations since you need only compare Box nodes. Now I have to look through Phi to find Box nodes and compare their slots. For example, look on lock coarsening code changes.
>>> Why do you have to compare their slots?  As far as I can see, BoxLock is created for monitorenter and the unlock pulls the box from the JVMState, guaranteeing that a single BoxLock corresponds to a single lock region, which seems like a nice property.  Am I missing something?  I guess it's just the example above?
>>> Anyway, I'm ok with your original code with a little refactoring.  It just seemed like the whole thing could more simple once commoning was eliminated.  I guess that's not true.
>>> tom
>>>> Vladimir
>>>>
>>>>> tom
>>>>>> Vladimir
>>>>>>
>>>>>>> tom
>>>>>>>>> Does lock elimination require that the Phis be collapsed?
>>>>>>>> No, that is why it clones BoxLockNode.
>>>>>>>>
>>>>>>>>>> I do modify original BoxLockNode for nested case where merged cases are excluded. For eliminated by EA case I could do cloning only for merged cases but it needs additional checks so I decided to do it always. But if merged cases are very rare then cloning should be avoided for not merged BoxLock, I will check.
>>>>>>>>> At a minimum, can't the explicit iteration over all the users can be replaced with igvn.replace_node(oldbox, newbox)?  The explicit iteration seems like overkill.
>>>>>>>> I want to play it safe and replace BoxLock only for related users (same box and object and not Phi). It is matching old elimination code so I will move it to separate method and will use in both cases.
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>>>>> eliminating the cloning in ciTyepFlow could have performance implications.  I assume it's rare?
>>>>>>>>>> I ran refworkload on x86 and SPARC, there was no change in scores. For normal loop's head cloning cases monitorenter can not be in first loop's block (there will be condition there). The case I hit was "while(true) { synchronize(o);" where there is no condition. So it is not common case. Also LockNode is Call node and we don't do much loop optimizations for loops with calls inside.
>>>>>>>>> Ok.
>>>>>>>>> tom
>>>>>>>>>> Thanks,
>>>>>>>>>> Vladimir
>>>>>>>>>>
>>>>>>>>>>> tom
>>>>>>>>>>>> Thanks,
>>>>>>>>>>>> Vladimir
>>>>>>>>>>>>
>>>>>>>>>>>>> -- Chris
>>>>>>>>>>>>> On Dec 29, 2011, at 10:20 PM, Vladimir Kozlov wrote:
>>>>>>>>>>>>>> http://cr.openjdk.java.net/~kvn/7125896/webrev
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> 7125896: Eliminate nested locks
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Nested locks elimination done before lock nodes expansion by looking for outer locks of the same object. Commoning (GVN) of BoxLock nodes is switched off because nested locks elimination requires separate BoxLock node for each locked region to generated correct debug info for deoptimization. As result there could be merges (and Phi nodes) of BoxLock nodes. One such merge generated by ciTypeFlow (cloning loop head) is avoided but there could be other cases so new code is added to handle it.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> New code is under new product flag EliminateNestedLocks.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Also added missed KILL effect for box register in fastlock and fastunlock mach nodes definitions.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Tested with full CTW, nsk, jtreg tests, refworkload.
>>>>>>>>>>>>>>
>>>>>>>>>>>>>> Thanks,
>>>>>>>>>>>>>> Vladimir
> 


More information about the hotspot-compiler-dev mailing list