RFR: 8328528: C2 should optimize long-typed parallel iv in an int counted loop [v21]
Tobias Hartmann
thartmann at openjdk.org
Wed Oct 16 05:15:14 UTC 2024
On Wed, 9 Oct 2024 18:18:21 GMT, Kangcheng Xu <kxu at openjdk.org> wrote:
>> `compiler/loopopts/parallel_iv/TestParallelIvInIntCountedLoop.java` times out in our testing both with `-XX:StressLongCountedLoop=200000000` and with `-XX:+UnlockExperimentalVMOptions -XX:PerMethodSpecTrapLimit=0 -XX:PerMethodTrapLimit=0`:
>>
>>
>> "main" #1 [2771172] prio=5 os_prio=0 cpu=500187.70ms elapsed=503.08s allocated=6554K defined_classes=227 tid=0x0000ffff9002d550 nid=2771172 runnable [0x0000ffff972bf000]
>> java.lang.Thread.State: RUNNABLE
>> Thread: 0x0000ffff9002d550 [0x2a48e4] State: _at_safepoint _at_poll_safepoint 1
>> JavaThread state: _thread_blocked
>> at compiler.loopopts.parallel_iv.TestParallelIvInIntCountedLoop.testIntCountedLoopWithIntIVLeq(TestParallelIvInIntCountedLoop.java:93)
>> at compiler.loopopts.parallel_iv.TestParallelIvInIntCountedLoop.runTestIntCountedLoopWithIntIVLeq(TestParallelIvInIntCountedLoop.java:103)
>> at java.lang.invoke.DirectMethodHandle$Holder.invokeStatic(java.base at 24-internal/DirectMethodHandle$Holder)
>> at java.lang.invoke.LambdaForm$MH/0x0000ffff58460870.invoke(java.base at 24-internal/LambdaForm$MH)
>> at java.lang.invoke.Invokers$Holder.invokeExact_MT(java.base at 24-internal/Invokers$Holder)
>> at jdk.internal.reflect.DirectMethodHandleAccessor.invokeImpl(java.base at 24-internal/DirectMethodHandleAccessor.java:154)
>> at jdk.internal.reflect.DirectMethodHandleAccessor.invoke(java.base at 24-internal/DirectMethodHandleAccessor.java:104)
>> at java.lang.reflect.Method.invoke(java.base at 24-internal/Method.java:573)
>> at compiler.lib.ir_framework.test.CustomRunTest.invokeTest(CustomRunTest.java:159)
>> at compiler.lib.ir_framework.test.AbstractTest.run(AbstractTest.java:98)
>> at compiler.lib.ir_framework.test.CustomRunTest.run(CustomRunTest.java:89)
>> at compiler.lib.ir_framework.test.TestVM.runTests(TestVM.java:861)
>> at compiler.lib.ir_framework.test.TestVM.start(TestVM.java:252)
>> at compiler.lib.ir_framework.test.TestVM.main(TestVM.java:165)
>
> @TobiHartmann. Thanks for the feedback! I did some investigation; the timeouts have three causes:
>
> 1. Loops with `i <= stop` are not counted loops in the first place, so those tests should be removed:
>
> Now I remember why I originally didn't test for it. Consider `for (int i = 0; i <= stop; i++);` when `stop = Integer.MAX_VALUE`. Overflow in Java is well-defined, which means the code must loop indefinitely, and no optimization of any kind may break this. Therefore, loops with `<=` are not counted loops to begin with. `@IR(failOn = {IRNode.COUNTED_LOOP})` doesn't fail either. I removed these test cases.
>
> 2. It is normal to time out with `-XX:StressLongCountedLoop=200000000` for all test cases:
>
> A value other than `0` for this flag forcefully converts int counted loops to long counted loops, for which C2 doesn't do the parallel IV optimization at this point. This is the same issue as [JDK-8294839](https://bugs.openjdk.org/browse/JDK-8294838). Loops are still loops: for a large random `stop` value, this will take a long time to loop through.
>
> 3. It is normal to time out with `-XX:PerMethodTrapLimit=0` for test cases with a stride other than `1`:
>
> Take `for (int i = 0; i < stop; i += 2)` as an example. Since there is a chance for the increment of `i` to go beyond `stop` (and eventually overflow), there must be some sort of runtime check for `stop`. Normally, a `loop_limit_check` trap is compiled to take the slow path (deoptimization). However, the zero trap limit forces C2 to loop and check `i < stop` on every iteration. For a large random `stop` value, this will take a long time.
>
> For the latter two reasons, I added `runWithFlags()` to essentially disable the flags in question.
>
> https://github.com/openjdk/jdk/blob/845e34cc7a82ef5cb69620a12f487adaca9d2613/test/hotspot/jtreg/compiler/loopopts/parallel_iv/TestParallelIvInIntCountedLoop.java#L47-L51
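
For reference, the overflow arguments behind points 1 and 3 can be reproduced outside the test. This is only a minimal sketch and is not taken from the test file: the class and helper names (`LoopOverflowSketch`, `leqLoopExits`, `strideTwoWraps`) are hypothetical, and the iteration cap `maxIters` is an assumption added so that the demo terminates.

```java
public class LoopOverflowSketch {

    // Bounded stand-in for `for (int i = start; i <= stop; i++);`.
    // Runs at most maxIters iterations and reports whether the loop
    // would have exited on its own within that budget.
    static boolean leqLoopExits(int start, int stop, int maxIters) {
        int i = start;
        for (int n = 0; n < maxIters; n++) {
            if (!(i <= stop)) {
                return true; // exit condition became false
            }
            i++; // wraps from Integer.MAX_VALUE to Integer.MIN_VALUE
        }
        return false; // still looping after maxIters iterations
    }

    // Bounded stand-in for `for (int i = start; i < stop; i += 2);`.
    // Reports whether the increment wrapped around before `i < stop`
    // had a chance to become false.
    static boolean strideTwoWraps(int start, int stop, int maxIters) {
        int i = start;
        for (int n = 0; n < maxIters && i < stop; n++) {
            int next = i + 2;
            if (next < i) {
                return true; // the int addition overflowed
            }
            i = next;
        }
        return false;
    }

    public static void main(String[] args) {
        // `i <= Integer.MAX_VALUE` holds for every int, so the loop never exits.
        System.out.println(leqLoopExits(Integer.MAX_VALUE - 3, Integer.MAX_VALUE, 10));
        // With a small stop value the same loop shape exits normally.
        System.out.println(leqLoopExits(0, 5, 10));
        // Stride 2 from an even start steps over an odd stop and overflows.
        System.out.println(strideTwoWraps(Integer.MAX_VALUE - 3, Integer.MAX_VALUE, 10));
        // With stop one lower, the loop exits before any overflow.
        System.out.println(strideTwoWraps(Integer.MAX_VALUE - 5, Integer.MAX_VALUE - 1, 10));
    }
}
```

The first and third calls show the cases a compiler has to guard against. The sketch says nothing about how C2 implements its limit check.
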
@tabjy We are still seeing timeouts with `-XX:+UnlockDiagnosticVMOptions -XX:TieredStopAtLevel=3 -XX:+StressLoopInvariantCodeMotion -XX:+StressRangeCheckElimination -XX:+StressLinearScan`. Maybe the test should be enabled only if C2 is available.
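
One possible way to express that restriction is a jtreg `@requires` tag in the test header. This is a sketch only: the rest of the header is elided here and would stay as it is in the actual test file.

```java
/*
 * @test
 * @requires vm.compiler2.enabled
 */
```

Whether this property alone covers the flag combination above would need to be confirmed in testing.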
-------------
PR Comment: https://git.openjdk.org/jdk/pull/18489#issuecomment-2415748974