Problem with testN_mem_reg0 in register allocation.

Doerr, Martin martin.doerr at sap.com
Thu Sep 5 04:01:01 PDT 2013


Hi Vladimir,

thanks for pointing us to that piece of code.

We found a Phi node with bottom type memory on the path from Block 146 to Block 140.
PhaseCFG::insert_anti_dependences forces the testN_mem_reg0 to get scheduled before the Phi's input.

The Phi node was introduced by our code (and we can change the type), so it's not an open jdk problem.

I guess the recompilation without subsuming loads is the right thing to do when such a problem occurs
(unless someone spends more intelligence for the matcher).

Best regards,
Martin


-----Original Message-----
From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com] 
Sent: Freitag, 30. August 2013 20:23
To: Doerr, Martin
Cc: Lindenmaier, Goetz; hotspot-compiler-dev at openjdk.java.net
Subject: Re: Problem with testN_mem_reg0 in register allocation.



On 8/30/13 9:03 AM, Doerr, Martin wrote:
> Hi Vladimir,
>
> the problem has occurred with a modified version of C2 which performs a significantly reduced set of optimizations.
>
> Here's a more detailed picture of block 35:
>
>                                 163 Call
>                           /                      \
> MachProj (control)     MachProj (mem)
>                     |                               |
> 161 Catch                        209 testN_mem_reg0
>
> Node 209 has a required edge to 14 MachProj (Param 0) (which is connected to 11 Start).
> And it has only one out edge to 208 jmpCon.
>
> I can't see a good reason why 209 should be placed in block 35. Maybe it's just to schedule loads early.
> We have to investigate a little more. We also have to check if we have a change in GCM which could be responsible.

If it is only one user then it should not happen. Look on 
PhaseCFG::schedule_late(). mast_clone[] is true for nodes producing 
flags so it should be placed into LCA which is block of its user (jmpCon):

1237    if (mach != NULL && must_clone[mach->ideal_Opcode()])
1238      try_to_hoist = false;

The only case it may not true is anti-dependentcy on memory:

1206    // Check if 'self' could be anti-dependent on memory
1207    if (self->needs_anti_dependence_check()) {
1208      // Hoist LCA above possible-defs and insert anti-dependences to
1209      // defs in new LCA block.
1210      LCA = insert_anti_dependences(LCA, self);
1211    }

>
>
> Anyway, the graph looks like this problem could be prevented by GCM.
> However, I guess this is not true in general.
> I mean the matcher does not care about if such a node can be placed without spilling.
> And testN_mem_reg0 is a node which subsumes a load and returns a condition flag which can't be spilled.
> If such a node is matched, we need luck. Do you agree?
>
> So even if we fix this problem in GCM, it might be safer to have the circumvention from the webrev below.

It is always good to have "plan B" :)

Regards,
Vladimir

>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: hotspot-compiler-dev-bounces at openjdk.java.net [mailto:hotspot-compiler-dev-bounces at openjdk.java.net] On Behalf Of Vladimir Kozlov
> Sent: Freitag, 23. August 2013 21:17
> To: Lindenmaier, Goetz
> Cc: hotspot-compiler-dev at openjdk.java.net
> Subject: Re: Problem with testN_mem_reg0 in register allocation.
>
> On 8/21/13 6:27 AM, Lindenmaier, Goetz wrote:
>> Hi Vladimir,
>>
>> Thanks a lot for your help!
>> We implemented what you proposed, see webrev.
>> We also initialize idealreg2XXXmask completely, as only this allowed us
>> to cleanly reproduce the problem.
>> http://cr.openjdk.java.net/~goetz/webrevs/testN_regalloc_bug/
>
> This looks good.
>
>>
>> The fix helped to resolve our current problem.  But it somehow feels
>> like a quick hack.
>
> I agree with that but it was the fastest way to help you :)
>
>
>> Shouldn't already call_catch_cleanup bail out when it makes a Phi for an
>
> It may be not good place for that.
>
>> operation that defines RegFlags?  We looked at the memory operations
>> between the testN and the branch, there is not much besides klass
>
> Does testN_mem_reg0 have other uses in addition to 208 jmpCon in block
> 140? I did tried to create test with several uses but was not able to
> recreate your situation: LoadN become shared and generated as separate
> instruction in my test.
>
>> accesses.  So it could be that gcm thinks block 35 is a good placement, and
>> assumes RegAlloc does rematerialization if necessary.  But RegAlloc
>
> Do you have additional code in GCM to what we have? If you can reproduce
> the problem can you add instrumentation (prints) to see why GCM does it?
>
> Thanks,
> Vladimir
>
>> fails, as call_catch_cleanup constructed a graph that can not be
>> fixed by rematerializations?
>>
>> In Matcher::find_shared() we can not find code considering anti
>> dependencies.  We think that code is in LabelRoot.
>>
>> Best regards,
>>     Martin & Goetz.
>>
>>
>> -----Original Message-----
>> From: Vladimir Kozlov [mailto:vladimir.kozlov at oracle.com]
>> Sent: Montag, 19. August 2013 19:58
>> To: Lindenmaier, Goetz
>> Cc: hotspot-compiler-dev at openjdk.java.net
>> Subject: Re: Problem with testN_mem_reg0 in register allocation.
>>
>> Goetz, Martin
>>
>> I would speculate it is scheduled very early by GCM because of memory
>> access anti-dependency. Somewhere there is a store or other changes to
>> this memory slice. Yes, matcher should not allow such matching. I am not
>> sure we have an anti-dependency check when we match (need to look). The
>> method which should do that is Matcher::find_shared().
>>
>> The easiest fix would be in place where we try to spill flag do bailout
>> with recompilation without subsuming loads:
>>
>>        if (C->subsume_loads() == true && !C->failing()) {
>>          // Retry with subsume_loads == false
>>          // If this is the first failure, the sentinel string will "stick"
>>          // to the Compile object, and the C2Compiler will see it and retry.
>>          C->record_failure(C2Compiler::retry_no_subsuming_loads());
>>
>> Regards,
>> Vladimir
>>
>> On 8/19/13 8:04 AM, Lindenmaier, Goetz wrote:
>>> Hi,
>>>
>>> we are currently analyzing a bug in our jdk7 VM, which is based on
>>> HS23.  It happens on linux x86_64.
>>>
>>> Register allocation tries to spill a RefFlags register, and, as to
>>> expect, fails on this leading to a SIGSEGV.
>>>
>>> I suspect this is a strange interdependency between gcm and
>>> register allocation
>>>
>>> Gcm produces something as the following pattern (I simplified
>>> to get the essentials).
>>>
>>>              Block 35
>>>                Call
>>>                209 testN_mem_reg0
>>>                Catch
>>>           /              \
>>>          /                \
>>>        |/_                _\|
>>> Block 36             Block 146
>>>      ...                 ...
>>>                          159 testP_reg
>>>                          158 jmpCon(212)
>>>         \                   /
>>>          \                 /
>>>          _\|             |/_
>>>            Block 39
>>>              ...
>>>           /              \
>>>          /                \
>>>        |/_                _\|
>>> Block 140              ...
>>>      208 jmpCon(209)
>>>
>>> Then call_catch_cleanup() repairs the position of 209.
>>> It adds 1088, 1089 and 1090.
>>>
>>>              Block 35
>>>                Call
>>>                Catch
>>>           /              \
>>>          /                \
>>>        |/_                _\|
>>> Block 36               Block 146
>>>      1089 testN_mem_reg0    1088 testN_mem_reg0
>>>      ...                    ...
>>>                             159 testP_reg
>>>                             158 jmpCon(212)
>>>         \                   /
>>>          \                 /
>>>          _\|             |/_
>>>            Block 39
>>>              1090 Phi(1089, 1088)
>>>              ...
>>>           /              \
>>>          /                \
>>>        |/_                _\|
>>> Block 140              ...
>>>      208 jmpCon(1090)
>>>
>>> Register allocation now wants to spill the register of 1088.
>>> This is necessary as 159 needs the flag register.
>>> Have you seen such a problem before?
>>>
>>> We could envision that one the following 3 behaviors is expected and
>>> should get fixed:
>>>
>>> 1. gcm should place node 209 in Block 140.
>>> 2. Register allocation should be able to properly rematerialize the
>>> testN_mem_reg0 node (which does allow rematerialization).
>>>       This would require that it recognizes that the Phi merges twice the
>>> same value.
>>> 3. The matcher should not match this node (which matches on a Load and
>>> Cmp)?
>>>
>>> Best reagrds,
>>> Martin & Goetz.
>>>


More information about the hotspot-compiler-dev mailing list