8066103: C2's range check smearing allows out of bound array accesses
John Rose
john.r.rose at oracle.com
Wed Dec 3 22:55:52 UTC 2014
On Dec 3, 2014, at 1:13 PM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
> …
>> Isn't this statement always a no-op?
>> + adjust_check(rc0.ctl, range1, index1, flip1, off_lo, igvn);
>> I see that it reproduces the previous behavior, so it's OK. As far as I can see, the main point of this path in the code is that the IfNode can idealize to a dominating duplicate test, just as if it were not a special range check. Is that true?
>
> It’s true it’s a no-op so I removed it.
>
>> If that's true, perhaps the search loop should just record (outside of the prev_doms array) the shallowest exact match for the current node. Then, if the range check smearing logic fails to work, the Ideal method can simply return the exact match.
>>
>> In any case, the associated comment "Valid only if there's a single dominating range check" makes me nervous, since it's possible that there are other dominating range checks which we simply failed to find (due to the search cutoff of 999). If there is a real change to make (if it's not a no-op), we should disable the change if the loop hit 999.
>
> The comment must be misleading. I reworded it. What we want is that there’s no other range check between the range check that we’re calling Ideal on and the top range check that is identical. Otherwise something like 8048170 (that you reviewed yesterday as well) could happen.
That's the part I was missing, and the comment explains the problem much better.
Let me suggest a further tweak to your comment:
+ // If we match the test exactly, then the top test covers
+ // both our lower and upper bounds. Valid only if there's no
+ // other range check between us and the top test.
+ // Given Bottom -dom-> Middle… -dom-> Top, with us at the
+ // bottom, we can't float up past any middle checks, because
+ // for all we know our range check was widened and accesses
+ // that depend on it also depend on the middle range checks
+ // to be correct.
Also, now that I understand it better, I think that logic sometimes applies to the nb_checks==2 case
where you currently return NULL, and I think the "replace immediate" logic should precede
all nb_checks tests except nb_checks==0:
+ if (index1) {
! RangeCheck rc0 = prev_checks[chk0];
+ if (nb_checks == 1) {
+ if (rc0.ctl->in(0)->in(1) == in(1)) {
+ // If we match the test exactly … be correct.
+ prev_dom = rc0.ctl;
+ } else {
+ return NULL;
}
- // Test is now covered by prior checks, dominate it out
- prev_dom = prev_chk2;
} else {
==>
+ if (index1) {
! RangeCheck rc0 = prev_checks[chk0];
+ if (rc0.ctl->in(0)->in(1) == in(1)) {
+ // If we match the test exactly … be correct.
+ prev_dom = rc0.ctl;
+ } else if (nb_checks == 1) {
+ return NULL;
} else {
That is, do the dance with hi/lo and three range checks, but before that happens prefer to merge back-to-back identical tests. That will catch cases like "a[i+8] += (expr);" where the bytecodes include two range checks that might be separated. It would also make the logic of nb_checks==1 less mysterious, and less obviously buggy with respect to search failures (the 999 problem). (…Or am I missing another point here?)
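To make the "back-to-back identical tests" case concrete, here is a hypothetical Java-level sketch (class and method names invented). The compound assignment a[i+8] += v reads and then writes a[i+8]; javac emits an iaload and an iastore against the same index, and before optimization each access carries its own, identical range check. Preferring the exact-match merge would dominate the second check out regardless of nb_checks:

```java
// Hypothetical reduction, for illustration only.
public class BackToBackChecks {
    static void bump(int[] a, int i, int v) {
        // One bounds check for the iaload, one for the iastore,
        // both testing i+8 <u a.length -- exact duplicates.
        a[i + 8] += v;
    }

    public static void main(String[] args) {
        int[] a = new int[16];
        bump(a, 0, 7);
        if (a[8] != 7) throw new AssertionError("expected a[8] == 7");
        boolean threw = false;
        try {
            bump(a, 8, 1);  // i+8 == 16 is out of bounds for length 16
        } catch (ArrayIndexOutOfBoundsException e) {
            threw = true;
        }
        if (!threw) throw new AssertionError("expected AIOOBE");
        System.out.println("ok");
    }
}
```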
As we saw with the other bug, interchanging 'if'-order is a tricky business, and RC smearing makes it harder. I think our IR doesn't serve us as well as it could here, because there is no way to pin the control of a node to two or more control inputs: You can pin only to zero or one. But after performing RC smearing, what you really want is for the elided RC's dependents to depend on both rc1 and rc2, simultaneously. If that were expressible, then the 'if'-reordering optimizations later on would be able to correctly place the dependent code. Not that we can do it in C2, but if we were to reverse the order of control edges (as Graal does) it would be more natural to express multi-control placement (at the expense of other things, naturally).
> if i <u array.length then uncommon trap
> x = array[i]
>
> // some stuff that blocks range check smearing
>
> if i-1 <u array.length then uncommon trap
> x = array[i-1]
> if i-3 <u array.length then uncommon trap
> x = array[i-3]
> if i <u array.length then uncommon trap
> x = array[i]
> if i-2 <u array.length then uncommon trap
> x = array[i-2]
>
> Assuming the second series is optimized:
>
> if i <u array.length then uncommon trap
> x = array[i]
>
> // some stuff that blocks range check smearing
>
> if i-1 <u array.length then uncommon trap
> x = array[i-1]
> if i-3 <u array.length then uncommon trap
> x = array[i-3]
> if i <u array.length then uncommon trap
> x = array[i]
> x = array[i-2]
>
> and now if the stuff that blocks range check smearing is optimized out and if i <u array.length is processed we would end up with:
>
> if i <u array.length then uncommon trap
> x = array[i]
> x = array[i-2]
>
> if i-1 <u array.length then uncommon trap
> x = array[i-1]
> if i-3 <u array.length then uncommon trap
> x = array[i-3]
>
> And we can do an out of bound array access.
That is the best explanation I've seen yet of how these bugs work. Would you mind attaching it (verbatim, if you wish) to JDK-8066103?
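For the record, the scenario above can be written down as a plain Java sketch (method and names invented, for illustration). On a correctly behaving JVM the a[i-3] check must trap before the a[i-2] load ever runs; the bug would let the a[i-2] load float up under the duplicated "i <u a.length" check, past the i-1 and i-3 checks, allowing the out-of-bounds access:

```java
// Hypothetical reduction of the quoted scenario, for illustration only.
public class SmearScenario {
    static int sink;  // keep the loads alive

    static void pattern(int[] a, int i) {
        sink = a[i];      // if i   <u a.length then uncommon trap
        // (some stuff that blocks range check smearing)
        sink = a[i - 1];  // if i-1 <u a.length then uncommon trap
        sink = a[i - 3];  // if i-3 <u a.length then uncommon trap
        sink = a[i];      // exact duplicate of the first check
        sink = a[i - 2];  // must stay below the i-1 and i-3 checks
    }

    public static void main(String[] args) {
        int[] a = new int[4];
        pattern(a, 3);  // indices 3,2,0,3,1 all in bounds: no trap
        boolean threw = false;
        try {
            pattern(a, 1);  // a[i-3] == a[-2]: must throw here,
                            // before a[i-2] is ever reached
        } catch (ArrayIndexOutOfBoundsException e) {
            threw = true;
        }
        if (!threw) throw new AssertionError("expected AIOOBE at a[i-3]");
        System.out.println("ok");
    }
}
```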
> So the cut off at 999 is not a problem here. The problem is that we can’t substitute a range check for another identical one if there are range checks in between.
Excellent.
— John