8066103: C2's range check smearing allows out of bound array accesses

Fri Dec 5 19:04:11 UTC 2014

> On Dec 5, 2014, at 5:30 AM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
> 
> Hi John,
> 
>>> With your change above, wouldn’t we end up with an out of bound array access in my example because it’s now applied even if nb_checks>2?
>> 
>> You are right.  That's because I suggested the wrong code.  I meant to suggest merging the last two tests, not the first and last:
>> 
>>> +       if (nb_checks <= NRC && prev_checks[(nb_checks - NRC) % NRC].ctl->in(0)->in(1) == in(1)) {
>> 
>> Or:
>> 
>>> +       if (most_recent_prev_check.ctl->in(0)->in(1) == in(1)) {
>> 
>> (Where most_recent_prev_check is collected in the previous loop only when nb_checks==0.)
>> 
>> That logic should cleanly subsume the tricky existing check at nb_checks==1 and generalize well.
>> 
>> The adjustment I'm suggesting *only* applies to the last two checks (on the same index1, in the encounter order of the search loop).  Since they are always adjacent, and we take care never to reorder range check operations, this would be immune to the failure case you describe.
>> 
>> My point is to separate out the adjacent-check merging from the smearing transformation.  As originally coded, the two tactics are imperfectly combined in the nb_checks==1 case.
> 
> Is below what you’re suggesting?

Yes, pretty much, but see the following.

> Indeed it makes sense.

The first encountered matching test will be stored in prev_checks[0] if and only if 1 <= nb_checks <= NRC.
So I was thinking:
  not:  nb_checks < NRC && prev_checks[nb_checks].ctl->in(0)->in(1) == in(1)
  but:  nb_checks <= NRC && prev_checks[0].ctl->in(0)->in(1) == in(1)

Or better yet a purely local cutout.  (This sketch has a bad goto; the value of 'prev_dom' carried by the goto is what I called 'most_recent_prev_check' above.)

diff --git a/src/share/vm/opto/ifnode.cpp b/src/share/vm/opto/ifnode.cpp
--- a/src/share/vm/opto/ifnode.cpp
+++ b/src/share/vm/opto/ifnode.cpp
@@ -876,6 +876,7 @@
     // Low and high offsets seen so far
     jint off_lo = offset1;
     jint off_hi = offset1;
+    int nb_checks = 0;
 
     // Scan for the top 2 checks and collect range of offsets
     for( int dist = 0; dist < 999; dist++ ) { // Range-Check scan limit
@@ -890,6 +891,15 @@
         // the same array bounds.
         if( flip2 == flip1 && range2 == range1 && index2 == index1 &&
             dom->outcnt() == 2 ) {
+          if (nb_checks == 0 && dom->in(1) == in(1)) {
+            // Found an immediately dominating test at the same offset.
+            // This kind of back-to-back test can be eliminated locally,
+            // and there is no need to search further for dominating tests.
+            assert(offset2 == offset1, "");
+            prev_dom = dom;
+            goto got_prev_dom;  // FIXME: unstructured control flow
+          }
+          nb_checks++;
           // Gather expanded bounds
           off_lo = MIN2(off_lo,offset2);
           off_hi = MAX2(off_hi,offset2);
@@ -985,6 +995,7 @@
 
   } // End of Else scan for an equivalent test
 
+ got_prev_dom:  // FIXME: unstructured control flow
   // Hit!  Remove this IF
 #ifndef PRODUCT
   if( TraceIterativeGVN ) {


> It should be nb_checks < NRC, because in case we have 3 checks they could be optimized to 2 range checks which is better than replacing the current range check with the 3rd range check, right?

That's a fair argument.  Except I would say "the same as" rather than "better", because I think we will usually get the same code either way, but in a very different IGVN ordering.

The thing I'm getting at is I like to think about the back-to-back case with distinct logic.  Local transformations are easier to reason about in IGVN and may be more robust.  But it's also a matter of taste here.

I'm paying attention to this code because I have visited it before and failed to spot the bug, but remember my feelings of unease.
I think this logic (with other RCE) will continue to be important as we invest in "Arrays 2.0".
Like you, I want to leave it clearer than before so we can build on it confidently.

> diff --git a/src/share/vm/opto/ifnode.cpp b/src/share/vm/opto/ifnode.cpp
> ...
> 
> 
>> BTW, it seems to me that the degenerate case of off_hi==off_lo is interesting.  (I.e., a long run of tests at the same index, perhaps with interfering tests disappearing due to optimization.)  I think that your code would fold those into two tests of the same index, wouldn't it?
> 
> Wouldn’t we adjust the top 2 range checks and keep 2 of them but the range checks that are adjusted are reprocessed by the IGVN and it would fold the top 2 range checks in a single one because they perform an identical check?

That's what I hope for; glad you agree.

Two final thoughts, neither of which block this review:

We can consider having a CheckNode as a subclass of IfNode, kind of like LoopNode is a subclass of RegionNode.  This could make RCE transformations more robust.  The important distinction is that the CheckNode can optimistically perform over-strong checks; this makes it different from a normal IfNode and suitable for smeared RCE and other speculative stuff.

We need a way for Java library code (our JVM users) to code range checks which can be optimized with the same power as built-in range checks (from "aaload" and similar bytecodes).  Supporting RCE for user-written RC's is key for Arrays 2.0.  Optimizing "x >= 0 && x < len" to "x <u len" is part of it, but we also need the user-written idiom to access the optimistic RCE, along with the full machinery of deopt on speculation failure and profile-based suppression of subsequent speculation.

— John