8066103: C2's range check smearing allows out of bound array accesses

Roland Westrelin roland.westrelin at oracle.com
Thu Dec 4 09:20:02 UTC 2014


Hi John,

Thanks for the discussion on this.

>>>>> Isn't this statement always a no-op?
>>> +  adjust_check(rc0.ctl, range1, index1, flip1, off_lo, igvn);
>>> I see that it reproduces the previous behavior, so it's OK.  As far as I can see, the main point of this path in the code is that the IfNode can idealize to a dominating duplicate test, just as if it were not a special range check.  Is that true?
>> 
>> It’s true it’s a no-op so I removed it.
>> 
>>> If that's true, perhaps the search loop should just record (outside of the prev_doms array) the shallowest exact match for the current node.  Then, if the range check smearing logic fails to work, the Ideal method can simply return the exact match.
>>> 
>>> In any case, the associated comment "Valid only if there's a single dominating range check" makes me nervous, since it's possible that there are other dominating range checks which we simply failed to find (due to the search cutoff of 999).  If there is a real change to make (if it's not a no-op), we should disable the change if the loop hit 999.
>> 
>> The comment was misleading, so I reworded it. What we want is that there's no other range check between the range check we're calling Ideal on and the identical top range check. Otherwise something like 8048170 (which you reviewed yesterday as well) could happen.
> 
> That's the part I was missing, and the comment explains the problem much better.
> 
> Let me suggest a further tweak to your comment:
> 
> +           // If we match the test exactly, then the top test covers
> +           // both our lower and upper bounds. Valid only if there's no
> +           // other range check between us and the top test.
> +           // Given Bottom -dom-> Middle… -dom-> Top, with us at the
> +           // bottom, we can't float up past any middle checks, because
> +           // for all we know our range check was widened and accesses
> +           // that depend on it also depend on the middle range checks
> +           // to be correct.

Looks better indeed. Thanks for the improved comment.

> 
> Also, now that I understand it better, I think that logic sometimes applies to the nb_checks==2 case
> where you currently return NULL, and I think the "replace immediate" logic should precede
> all nb_checks tests except nb_checks==0:
> 
> +     if (index1) {
> !       RangeCheck rc0 = prev_checks[chk0];
> +       if (nb_checks == 1) {
> +         if (rc0.ctl->in(0)->in(1) == in(1)) {
> +           // If we match the test exactly … be correct.
> +           prev_dom = rc0.ctl;
> +         } else {
> +           return NULL;
>          }
> -        // Test is now covered by prior checks, dominate it out
> -        prev_dom = prev_chk2;
>        } else {
> 
> ==>
> 
> +     if (index1) {
> !       RangeCheck rc0 = prev_checks[chk0];
> +       if (rc0.ctl->in(0)->in(1) == in(1)) {
> +         // If we match the test exactly … be correct.
> +         prev_dom = rc0.ctl;
> +       } else if (nb_checks == 1) {
> +         return NULL;
>         } else {
> 
> That is, do the dance with hi/lo and three range checks, but before that happens prefer to merge back-to-back identical tests.  That will catch cases like "a[i+8] += (expr);" where the bytecodes include two range checks that might be separated.  It would also make the logic of nb_checks==1 less mysterious, and less obviously buggy with respect to search failures (the 999 problem).  (…Or am I missing another point here?)
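For reference, the back-to-back case you mention would come from a source shape like this (a sketch; the class and method names are mine):

```java
// Sketch of John's "a[i+8] += (expr);" case: javac emits one array load
// and one array store for the compound assignment, each with its own
// (identical) range check on i+8, and later optimizations may separate
// the two checks.
class BackToBack {
    static int bump(int[] a, int i, int expr) {
        a[i + 8] += expr;   // load a[i+8] (check #1), add, store a[i+8] (check #2)
        return a[i + 8];    // a third, identical check
    }
}
```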

With your change above, wouldn't we end up with an out of bound array access in my example, because the exact-match merge is now applied even when nb_checks >= 2?

Consider the following example:

if !(i <u array.length) then uncommon trap
x = array[i]

// some stuff that blocks range check smearing

if !(i-3 <u array.length) then uncommon trap
x = array[i-3]
if !(i <u array.length) then uncommon trap
x = array[i]
if !(i-2 <u array.length) then uncommon trap
x = array[i-2]

optimized to:

if !(i <u array.length) then uncommon trap
x = array[i]

// some stuff that blocks range check smearing

if !(i-3 <u array.length) then uncommon trap
x = array[i-3]
if !(i <u array.length) then uncommon trap
x = array[i]
x = array[i-2]

If the stuff that blocks smearing is then optimized out, we can end up with an out of bound access even though nb_checks == 2:

if !(i <u array.length) then uncommon trap
x = array[i]
x = array[i-2]
if !(i-3 <u array.length) then uncommon trap
x = array[i-3]
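At the source level, the pattern above could come from something like this (an illustrative sketch; the names are mine, and whether C2 actually blocks smearing here depends on what sits in the branch):

```java
// Illustrative shape: four array reads, each compiling to its own range
// check, with intervening control flow standing in for the "stuff that
// blocks range check smearing".
class Smear {
    static int test(int[] array, int i, boolean flag) {
        int x = array[i];       // check: i in bounds
        if (flag) {             // stands in for code that blocks smearing
            x ^= 42;
        }
        x += array[i - 3];      // check: i-3 in bounds
        x += array[i];          // identical to the first check
        x += array[i - 2];      // the check that smearing may try to elide
        return x;
    }
}
```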


> As we saw with the other bug, interchanging 'if'-order is a tricky business, and RC smearing makes it harder.  I think our IR doesn't serve us as well as it could here, because there is no way to pin the control of a node to two or more control inputs:  You can pin only to zero or one.  But after performing RC smearing, what you really want is for the elided RC's dependents to depend on both rc1 and rc2, simultaneously.  If that were expressible, then the 'if'-reordering optimizations later on would be able to correctly place the dependent code.  Not that we can do it in C2, but if we were to reverse the order of control edges (as Graal does) it would be more natural to express multi-control placement (at the expense of other things, naturally).

8054478 introduced CastII nodes that cannot be removed because they carry a dependence (to fix an unrelated problem).
Wouldn't that be a conceptually cleaner way of expressing the dependence between a range check and an array access?
A Load node wouldn't need a control input: the index computation would depend on a CastII whose control depends on the range check.
Dependence on multiple range checks could then be expressed with a chain of CastII nodes.
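The dependence chain this describes might be pictured roughly like this (an informal sketch, not actual C2 node printing):

```
RangeCheckA --ctl--> CastII#1(i)             // i proven in range by check A
RangeCheckB --ctl--> CastII#2(CastII#1)      // ...and by check B
AddP(array_base, CastII#2) --> LoadI         // the Load needs no control edge:
                                             // its address depends on both checks
```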


>> if !(i <u array.length) then uncommon trap
>> x = array[i]
>> 
>> // some stuff that blocks range check smearing
>> 
>> if !(i-1 <u array.length) then uncommon trap
>> x = array[i-1]
>> if !(i-3 <u array.length) then uncommon trap
>> x = array[i-3]
>> if !(i <u array.length) then uncommon trap
>> x = array[i]
>> if !(i-2 <u array.length) then uncommon trap
>> x = array[i-2]
>> 
>> Assuming the second series is optimized:
>> 
>> if !(i <u array.length) then uncommon trap
>> x = array[i]
>> 
>> // some stuff that blocks range check smearing
>> 
>> if !(i-1 <u array.length) then uncommon trap
>> x = array[i-1]
>> if !(i-3 <u array.length) then uncommon trap
>> x = array[i-3]
>> if !(i <u array.length) then uncommon trap
>> x = array[i]
>> x = array[i-2]
>> 
>> and now, if the stuff that blocks range check smearing is optimized out and the i <u array.length check is processed, we would end up with:
>> 
>> if !(i <u array.length) then uncommon trap
>> x = array[i]
>> x = array[i-2]
>> 
>> if !(i-1 <u array.length) then uncommon trap
>> x = array[i-1]
>> if !(i-3 <u array.length) then uncommon trap
>> x = array[i-3]
>> 
>> And we can do an out of bound array access.
> 
> That is the best explanation I've seen yet of how these bugs work.  Would you mind attaching it (verbatim, if you wish) to JDK-8066103?

Done.

Roland.

> 
>> So the cut off at 999 is not a problem here. The problem is that we can’t substitute a range check for another identical one if there are range checks in between.
> 
> Excellent.
> 
> — John


