Request for reviews (M): 6709742: find_base_for_derived's use of Ideal NULL is unsafe causing crashes during register allocation

Vladimir Kozlov Vladimir.Kozlov at Sun.COM
Tue Apr 21 11:30:35 PDT 2009



Tom Rodriguez wrote:
>>> You call new_lrg for every use of base, which seems wrong.  It should 
>>> only be done once shouldn't it and it shouldn't be done at all if 
>>> there's already an LRG for it, right?
>>
>> I don't think it is matter since the live ranges will be recomputed
>> since new lrg is added:
> 
> Yeah, though you uselessly grow some tables every time you call it.
> 
> void PhaseChaitin::new_lrg( const Node *x, uint lrg ) {
>   // Make the Node->LRG mapping
>   _names.extend(x->_idx,lrg);
>   // Make the Union-Find mapping an identity function
>   _uf_map.extend(lrg,lrg);
> }

I did next change and it passed CTW:

       _cfg._bbs.map( base->_idx, startb );
+     assert (n2lidx(base) == 0, "should not have LRG yet");
     }
+   if (n2lidx(base) == 0) {
+     new_lrg(base, maxlrg++);
+   }
-   new_lrg(base, maxlrg++);

> 
>> The difference is the shared mach NULL has control set to _root
>> and not shared node don't have it so it could be placed into
>> use's block so we can't use it for derived base. And we don't know if
>> we will need it for derived base so we can't always set NULL's control
>> to _root since it will affect GCM and RA.
> 
> So you're saying that during GCM we might have placed it into the use 
> block instead of in the first block so it can't be shared?  I don't see 
> how your logic ensures that we never try to use a node like that.  Is 
> there some hidden connection between the fact that has_new_node returns 
> true and where the node was scheduled?  Otherwise how do we know that 
> the base has been scheduled in a shareable location?  Shouldn't the code 
> in chaitin be asserting that base->in(0) == _cfg._root if base->in(0) != 
> NULL and maybe also checking that it's in the start block?

Yes, you are right, has_new_node() == true is not enough.
I added the assert you suggested and immediately hit it.

So I am thinking may be instead of using generated mach_null in
find_base_for_derived() we should look for live in values which is NULL
for this block and use it. And if there is no live in then use
mach_null's clone and place it in this block with local control.
It will prevent having a long live range for such NULL.

     Node *base = NULL;
     while ((neighbor = elements.next()) != 0) {
       Node *mach_null = lrgs(neighbor)._def;
       if (mach_null->is_Mach() &&
           mach_null->as_Mach()->ideal_Opcode() == Op_ConP &&
           mach_null->bottom_type() == TypePtr::NULL_PTR) {
         base == mach_null;
         break;
       }
     }
     if (base == NULL) { // No livein NULL
       base = _matcher.mach_null()->clone();
       base->init_req(0, _cfg._root);
       new_lrg(base, maxlrg++);
       b->_nodes.insert(b->find_node(derived), base);
       _cfg._bbs.map(base->_idx, b);
     }
     assert(base != NULL, "sanity");

Vladimir

> 
> tom
> 
>>> Have you compared the generated code for this change?  We've 
>>> potentially got a bunch of new values floating around and we want to 
>>> make sure it doesn't really change the code we generate.  I played 
>>> with a variant of this fix and saw a lot of new NULL values floating 
>>> around.  Might we need to clone these derived mach nulls to uncommon 
>>> traps?
>>
>> No, I did not look on generated code. I will look.
>>
>>> I have a vague memory of some old bug related NULL+con derived oops 
>>> appearing in oopmaps and thought there was some special handling of 
>>> the ideal ConP so that we didn't bother putting these in oopmaps.  I 
>>> can't find the code I'm thinking of anymore but maybe Chuck has some 
>>> thoughts on this?
>>> Once this is fixed we can restore the logic for undoing Phi of AddP, 
>>> though maybe we should consider revisiting when we do this 
>>> transformation.  I think we did a lot of ping ponging with the old 
>>> code and maybe we should clean it up at the end of the compile instead.
>>
>> Yes, I have bug for it: 6747632.
>>
>> Thanks,
>> Vladimir
>>
>>> tom
>>> On Apr 20, 2009, at 2:50 PM, Vladimir Kozlov wrote:
>>>>
>>>> http://cr.openjdk.java.net/~kvn/6709742/webrev.00
>>>>
>>>> Fixed 6709742: find_base_for_derived's use of Ideal NULL is unsafe 
>>>> causing crashes during register allocation
>>>>
>>>> Problem:
>>>> PhaseChaitin::stretch_base_pointer_live_ranges() stretches
>>>> the base pointers for live ranges and in some cases may
>>>> have to construct a NULL base in find_base_for_derived.
>>>> It constructs an Ideal NULL instead of a mach one and
>>>> if the Ideal NULL is ever used in Phi with real machine
>>>> values we will die during register allocation.
>>>>
>>>> Solution:
>>>> Create a mach node corresponding to ideal node ConP #NULL
>>>> specifically for derived pointers.
>>>> Use an existing mach node (matched for ConP #NULL) only
>>>> if it is shared to avoid false sharing if the mach node
>>>> for derived pointers is not used.
>>>>
>>>> Add the assert to catch the bug case.
>>>> Add asserts to verify that narrow pointers can't be derived.
>>>>
>>>> Reviewed by:
>>>>
>>>> Fix verified (y/n): y
>>>>
>>>> Other testing:
>>>> JPRT, CTW (32- and 64-bit), JPRT and CTW with compressed oops
> 



More information about the hotspot-compiler-dev mailing list