RFR: 8346989: C2: deoptimization and re-compilation cycle with Math.*Exact in case of frequent overflow [v2]
Marc Chevalier
mchevalier at openjdk.org
Wed Mar 26 08:39:09 UTC 2025
On Fri, 21 Mar 2025 22:34:43 GMT, Vladimir Ivanov <vlivanov at openjdk.org> wrote:
>> I think adapting and re-using `builtin_throw` like you described is reasonable but I let @iwanowww confirm :slightly_smiling_face:
>
> Yes, that's basically what I had in mind.
>
> Currently, the focus of the intrinsic is on the well-behaved case (overflows are **very** rare). `builtin_throw()` covers more ground and optimizes for scenarios where exceptions are thrown. But it depends on `ciMethod::can_omit_stack_trace()`, so the `-XX:-OmitStackTraceInFastThrow` mode will still suffer from the original problem (continuous deoptimizations), plus a round of recompilations before giving up.
>
> I suggest improving and reusing `builtin_throw` here, and adding checks in the intrinsic to guard against the problematic scenario with continuous deoptimizations. IMO it improves the performance model for a wide range of use cases while addressing pathological scenarios.
So, I have done something like that: `builtin_throw` now gets the exception object to throw from a parameter, and I factored out the logic that decides whether `builtin_throw` is possible, so that we can bail out of the intrinsic instead of going through the deoptimization cycle again. Tests seem to pass in the various cases I wrote.

As for the benchmark, it's quite a change. I am posting only the new part; the rest is pretty much the same. `C2_no_builtin_throw` does what the original `C2` configuration did (no `builtin_throw`, just bailing out of the intrinsic to cut our losses), and the new `C2` uses `builtin_throw`. TL;DR: `builtin_throw` brings the overflow case to the same order of magnitude as the in-bounds case (1-4 ms) instead of being more than 100 times slower (600-700 ms with C1, C2 without intrinsics, or C2 with bailing out).

Benchmark (param) Mode Cnt Score Error Units
MathExact.C2.loopAddIInBounds 1000000 avgt 3 1.657 ± 11.994 ms/op
MathExact.C2.loopAddIOverflow 1000000 avgt 3 1.313 ± 4.188 ms/op
MathExact.C2.loopAddLInBounds 1000000 avgt 3 0.980 ± 0.396 ms/op
MathExact.C2.loopAddLOverflow 1000000 avgt 3 2.474 ± 3.473 ms/op
MathExact.C2.loopDecrementIInBounds 1000000 avgt 3 3.733 ± 13.709 ms/op
MathExact.C2.loopDecrementIOverflow 1000000 avgt 3 2.792 ± 23.724 ms/op
MathExact.C2.loopDecrementLInBounds 1000000 avgt 3 2.761 ± 24.744 ms/op
MathExact.C2.loopDecrementLOverflow 1000000 avgt 3 2.730 ± 23.065 ms/op
MathExact.C2.loopIncrementIInBounds 1000000 avgt 3 3.134 ± 20.980 ms/op
MathExact.C2.loopIncrementIOverflow 1000000 avgt 3 3.271 ± 8.876 ms/op
MathExact.C2.loopIncrementLInBounds 1000000 avgt 3 2.756 ± 22.912 ms/op
MathExact.C2.loopIncrementLOverflow 1000000 avgt 3 4.549 ± 9.543 ms/op
MathExact.C2.loopMultiplyIInBounds 1000000 avgt 3 1.268 ± 0.574 ms/op
MathExact.C2.loopMultiplyIOverflow 1000000 avgt 3 1.572 ± 11.171 ms/op
MathExact.C2.loopMultiplyLInBounds 1000000 avgt 3 1.021 ± 1.054 ms/op
MathExact.C2.loopMultiplyLOverflow 1000000 avgt 3 3.167 ± 20.666 ms/op
MathExact.C2.loopNegateIInBounds 1000000 avgt 3 3.575 ± 29.997 ms/op
MathExact.C2.loopNegateIOverflow 1000000 avgt 3 4.222 ± 9.041 ms/op
MathExact.C2.loopNegateLInBounds 1000000 avgt 3 4.452 ± 6.680 ms/op
MathExact.C2.loopNegateLOverflow 1000000 avgt 3 4.739 ± 34.662 ms/op
MathExact.C2.loopSubtractIInBounds 1000000 avgt 3 1.087 ± 0.539 ms/op
MathExact.C2.loopSubtractIOverflow 1000000 avgt 3 3.027 ± 9.709 ms/op
MathExact.C2.loopSubtractLInBounds 1000000 avgt 3 1.197 ± 5.763 ms/op
MathExact.C2.loopSubtractLOverflow 1000000 avgt 3 1.765 ± 10.037 ms/op
MathExact.C2_no_builtin_throw.loopAddIInBounds 1000000 avgt 3 2.310 ± 2.990 ms/op
MathExact.C2_no_builtin_throw.loopAddIOverflow 1000000 avgt 3 594.036 ± 500.000 ms/op
MathExact.C2_no_builtin_throw.loopAddLInBounds 1000000 avgt 3 1.577 ± 14.053 ms/op
MathExact.C2_no_builtin_throw.loopAddLOverflow 1000000 avgt 3 631.345 ± 75.836 ms/op
MathExact.C2_no_builtin_throw.loopDecrementIInBounds 1000000 avgt 3 2.090 ± 0.937 ms/op
MathExact.C2_no_builtin_throw.loopDecrementIOverflow 1000000 avgt 3 618.080 ± 38.047 ms/op
MathExact.C2_no_builtin_throw.loopDecrementLInBounds 1000000 avgt 3 4.164 ± 6.184 ms/op
MathExact.C2_no_builtin_throw.loopDecrementLOverflow 1000000 avgt 3 596.031 ± 584.159 ms/op
MathExact.C2_no_builtin_throw.loopIncrementIInBounds 1000000 avgt 3 2.383 ± 11.729 ms/op
MathExact.C2_no_builtin_throw.loopIncrementIOverflow 1000000 avgt 3 626.425 ± 134.612 ms/op
MathExact.C2_no_builtin_throw.loopIncrementLInBounds 1000000 avgt 3 2.345 ± 13.927 ms/op
MathExact.C2_no_builtin_throw.loopIncrementLOverflow 1000000 avgt 3 630.535 ± 99.348 ms/op
MathExact.C2_no_builtin_throw.loopMultiplyIInBounds 1000000 avgt 3 1.419 ± 4.289 ms/op
MathExact.C2_no_builtin_throw.loopMultiplyIOverflow 1000000 avgt 3 587.796 ± 52.215 ms/op
MathExact.C2_no_builtin_throw.loopMultiplyLInBounds 1000000 avgt 3 0.934 ± 0.272 ms/op
MathExact.C2_no_builtin_throw.loopMultiplyLOverflow 1000000 avgt 3 589.736 ± 347.848 ms/op
MathExact.C2_no_builtin_throw.loopNegateIInBounds 1000000 avgt 3 2.236 ± 5.749 ms/op
MathExact.C2_no_builtin_throw.loopNegateIOverflow 1000000 avgt 3 618.711 ± 725.158 ms/op
MathExact.C2_no_builtin_throw.loopNegateLInBounds 1000000 avgt 3 2.605 ± 17.373 ms/op
MathExact.C2_no_builtin_throw.loopNegateLOverflow 1000000 avgt 3 627.055 ± 184.767 ms/op
MathExact.C2_no_builtin_throw.loopSubtractIInBounds 1000000 avgt 3 1.006 ± 0.584 ms/op
MathExact.C2_no_builtin_throw.loopSubtractIOverflow 1000000 avgt 3 588.062 ± 403.116 ms/op
MathExact.C2_no_builtin_throw.loopSubtractLInBounds 1000000 avgt 3 0.978 ± 0.193 ms/op
MathExact.C2_no_builtin_throw.loopSubtractLOverflow 1000000 avgt 3 611.004 ± 456.779 ms/op
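
For readers who have not looked at the benchmark, here is a minimal sketch of the two loop shapes being compared. It is not the actual `MathExact` benchmark source: the class name, method names, and loop bodies are assumptions made up for illustration. Only `Math.addExact` and `ArithmeticException` are real. The "overflow" loop throws on every iteration, which is the pattern that used to trigger the deoptimization and recompilation cycle; the "in-bounds" loop never throws.

```java
public class ExactOverflowSketch {
    // Hypothetical stand-in for an "Overflow" loop: every call overflows,
    // so the exception path is taken on each iteration.
    static int countAddOverflows(int iterations) {
        int overflows = 0;
        for (int i = 1; i <= iterations; i++) {
            try {
                Math.addExact(Integer.MAX_VALUE, i); // overflows for every i >= 1
            } catch (ArithmeticException e) {
                overflows++;
            }
        }
        return overflows;
    }

    // Hypothetical stand-in for an "InBounds" loop: 2 * i fits in an int
    // for the iteration counts used here, so nothing is ever thrown.
    static int countAddInBounds(int iterations) {
        int overflows = 0;
        int sink = 0; // keeps the results observable within the method
        for (int i = 1; i <= iterations; i++) {
            try {
                sink ^= Math.addExact(i, i);
            } catch (ArithmeticException e) {
                overflows++;
            }
        }
        return sink == Integer.MIN_VALUE ? -1 : overflows; // sink is never MIN_VALUE here
    }

    public static void main(String[] args) {
        int n = 1_000_000;
        System.out.println("overflow loop:  " + countAddOverflows(n) + " exceptions");
        System.out.println("in-bounds loop: " + countAddInBounds(n) + " exceptions");
    }
}
```

Timing the two methods by hand like this is not a substitute for the JMH numbers above; the sketch only shows why the overflow variants are sensitive to how cheaply C2 can throw the exception.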
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/23916#discussion_r2013625437