JDK-8221605 fix backport candidate (dup of JDK-8146792)

Thu Nov 21 07:29:12 UTC 2019

Hi, Nikola and Felix, 

I marked it dup. It is kind of duplication. The attached file of JDK-8221605 is closer to your program. It crashes for the same reason as yours. 
The description of JDK-8146792 is about partial peel. Actually, peeling could cause broken graph. Roland's testcase prevents  general peeling on purpose. 
I think It's a good idea to file a new JBS issue to amend a new testcase.  We backport it as well. 

Thanks,
--lx

On 11/15/19, 6:36 AM, "jdk8u-dev on behalf of Nikola Grcevski" <jdk8u-dev-bounces at openjdk.java.net on behalf of Nikola.Grcevski at microsoft.com> wrote:

    Hello jdk8u-dev,

    I'm a VM engineer at Microsoft and new to this mailing list. After reproducing the error in "JDK-8221605: C2 crashed at PhaseIdealLoop::build_loop_late_post" (https://bugs.openjdk.java.net/browse/JDK-8221605) I was able to identify the root cause as a missing backport of a fix made in JDK9. The fix for the problem can be found in the following JDK9 revision: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/2748d975045f and it's covered by the bug report https://bugs.openjdk.java.net/browse/JDK-8146792 . I found that this backport has already been discussed under the following thread, but it was never finally committed: https://mail.openjdk.java.net/pipermail/jdk8u-dev/2019-August/009947.html .

    From the original bug report in JDK-8221605, I was able to reduce the failing testcase to the following very simple program (which is very similar to the failing testcase as described by Felix Yang):

    /**
    * @summary Predicate moved after partial peel may lead to broken graph
     * @run main/othervm -XX:-TieredCompilation -XX:-BackgroundCompilation -XX:-UseOnStackReplacement -XX:CompileOnly=BadGraphAfterPeelAndPredicateTransform::m1 -XX:CompileCommand=quiet BadGraphAfterPeelAndPredicateTransform
    */
    public class BadGraphAfterPeelAndPredicateTransform {
        public static final int SIZE = 40;
        public static final int RUN_TO_COMPILE = 200;

        private void m1() {
            /**
             * The arrays need to be allocated here to reproduce the problem
             * because we can prove the lengths for bound checks
             */
            int arr[] = new int[SIZE], arr2[][] = new int[SIZE][SIZE];

            /**
              * The loop induction variable needs to be float, double or long, so that we fail detection as a counted loop
            */
            for (float f = 0; f < SIZE; ++f) {
                arr[(int)(f)] = 1;
                for (int j = 0; j < 1; ++j) {
                    arr = arr2[j];
                }
            }
        }

        public static void main(String[] args) {
            BadGraphAfterPeelAndPredicateTransform instance = new BadGraphAfterPeelAndPredicateTransform();

            for (int i = 0; i < RUN_TO_COMPILE; i++ ) {
                instance.m1();
            }
        }
    }

    After investigation into the root cause of the issue and I can confirm that the webrev (http://cr.openjdk.java.net/~fyang/8146792-backport/webrev.00/) as submitted by Felix Yang in the email thread above is the correct patch for JDK8, as the second change in src/share/vm/opto/loopPredicate.cpp (@@ -641,6 +665,7 @@) from the original JDK9 fix is already there in JDK8. I did some debugging to understand the issue better and I have made the following conclusions:

    - The problem only happens if we perform loop peeling. In the testcase above (or in the one described by Felix in https://mail.openjdk.java.net/pipermail/jdk8u-dev/2019-August/009947.html) if one changes the loop variable to be of type int, the problem never happens because the loop is identified as a counted loop and peeling is never attempted.
    - The original testcase supplied with the JDK9 fix in revision http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/2748d975045f never fails on JDK8 because peeling is never attempted on that testcase in JDK8.
    - The root of the problem is related to computing invariance of certain nodes. Namely, when the test fails, a conditional's invariance is already computed on a given node (coming from the IdealLoopTree to Invariance class constructor) and then the loop is peeled. After peeling the computed invariance remains set on the node, however the set computed invariance may not hold true after loop peeling is performed. The fix in JDK9 prevents the loop predicate optimization on all nodes that might be pinned between the peeled predicates and the loop entry which prevents the invalid movement of a predicate. As an experiment, I forcibly re-computed the invariance on the predicate (in my case it was a null check) ahead of performing loop predication and this resolved the problem too, as the node was deemed as non loop invariant after freshly recomputing the invariance. The experiment is not a feasible fix, it was done only to prove the hypothesis.

    Can we please backport the fix to JDK8 and perhaps make bug report JDK-8221605 a duplicate of JDK-8146792? While it's somewhat uncommon for a programmer to use an induction variable other than an int in a loop, it seems that if they choose to do so the problem occurs anytime the loop is peeled. Based on the original JDK9 bug report, this can occur with loops that use an int as an induction variable if the loop isn't deemed as counted. 

    I also think it would be good to add these additional tests that use non-int loop induction variables as tests in all releases. I would really like to help with making these additional tests on top of the backport.

    I would also like to perform some further performance testing on the change in JDK8 to ensure the fix doesn't cause performance regressions. Does anyone have any suggested benchmarks that might be a good choice for verifying the impact of this change in loop optimizations?

    Cheers,
    -Nikola