RFR: 8346989: C2: deoptimization and re-compilation cycle with Math.*Exact in case of frequent overflow [v6]
Marc Chevalier
mchevalier at openjdk.org
Thu Apr 3 13:01:16 UTC 2025
On Wed, 2 Apr 2025 17:23:03 GMT, Marc Chevalier <mchevalier at openjdk.org> wrote:
>> `Math.*Exact` intrinsics can cause many deoptimizations when used repeatedly with problematic (overflowing) arguments.
>> This fix proposes not to rely on intrinsics once `too_many_traps()` reports that the trap limit has been reached.
>>
>> Benchmarks show that this issue affects every `Math.*Exact` function, and that this fix improves them all.
>>
>> tl;dr:
>> - C1: no problem, no change
>> - C2:
>>   - with intrinsics:
>>     - with overflow: clear improvement. Was way worse than C1, now is similar (~4s => ~600ms)
>>     - without overflow: no problem, no change
>>   - without intrinsics: no problem, no change
>>
>> Before the fix:
>>
>> Benchmark (SIZE) Mode Cnt Score Error Units
>> MathExact.C1_1.loopAddIInBounds 1000000 avgt 3 1.272 ± 0.048 ms/op
>> MathExact.C1_1.loopAddIOverflow 1000000 avgt 3 641.917 ± 58.238 ms/op
>> MathExact.C1_1.loopAddLInBounds 1000000 avgt 3 1.402 ± 0.842 ms/op
>> MathExact.C1_1.loopAddLOverflow 1000000 avgt 3 671.013 ± 229.425 ms/op
>> MathExact.C1_1.loopDecrementIInBounds 1000000 avgt 3 3.722 ± 22.244 ms/op
>> MathExact.C1_1.loopDecrementIOverflow 1000000 avgt 3 653.341 ± 279.003 ms/op
>> MathExact.C1_1.loopDecrementLInBounds 1000000 avgt 3 2.525 ± 0.810 ms/op
>> MathExact.C1_1.loopDecrementLOverflow 1000000 avgt 3 656.750 ± 141.792 ms/op
>> MathExact.C1_1.loopIncrementIInBounds 1000000 avgt 3 4.621 ± 12.822 ms/op
>> MathExact.C1_1.loopIncrementIOverflow 1000000 avgt 3 651.608 ± 274.396 ms/op
>> MathExact.C1_1.loopIncrementLInBounds 1000000 avgt 3 2.576 ± 3.316 ms/op
>> MathExact.C1_1.loopIncrementLOverflow 1000000 avgt 3 662.216 ± 71.879 ms/op
>> MathExact.C1_1.loopMultiplyIInBounds 1000000 avgt 3 1.402 ± 0.587 ms/op
>> MathExact.C1_1.loopMultiplyIOverflow 1000000 avgt 3 615.836 ± 252.137 ms/op
>> MathExact.C1_1.loopMultiplyLInBounds 1000000 avgt 3 2.906 ± 5.718 ms/op
>> MathExact.C1_1.loopMultiplyLOverflow 1000000 avgt 3 655.576 ± 147.432 ms/op
>> MathExact.C1_1.loopNegateIInBounds 1000000 avgt 3 2.023 ± 0.027 ms/op
>> MathExact.C1_1.loopNegateIOverflow 1000000 avgt 3 639.136 ± 30.841 ms/op
>> MathExact.C1_1.loop...
>
> Marc Chevalier has updated the pull request incrementally with one additional commit since the last revision:
>
> fix typo in comment
I've tightened the test flags, as discussed offline, so I'll need a fresh approval.
And for completeness, here are the benchmark results for this latest state. Things behave as we expect: in the plain C2 runs, builtin_throw is taken and makes the situation a lot better (the overflow cases are now in the same few-ms/op range as the in-bounds cases). When intrinsics or builtin_throw are disabled, we see C1-like performance on overflow (roughly 600-800 ms/op).
Benchmark (SIZE) Mode Cnt Score Error Units
MathExact.C1_1.loopAddIInBounds 1000000 avgt 3 1.616 ± 7.813 ms/op
MathExact.C1_1.loopAddIOverflow 1000000 avgt 3 654.971 ± 573.250 ms/op
MathExact.C1_1.loopAddLInBounds 1000000 avgt 3 1.398 ± 0.274 ms/op
MathExact.C1_1.loopAddLOverflow 1000000 avgt 3 629.620 ± 41.181 ms/op
MathExact.C1_1.loopDecrementIInBounds 1000000 avgt 3 2.048 ± 0.340 ms/op
MathExact.C1_1.loopDecrementIOverflow 1000000 avgt 3 681.702 ± 63.721 ms/op
MathExact.C1_1.loopDecrementLInBounds 1000000 avgt 3 3.057 ± 13.688 ms/op
MathExact.C1_1.loopDecrementLOverflow 1000000 avgt 3 660.457 ± 295.393 ms/op
MathExact.C1_1.loopIncrementIInBounds 1000000 avgt 3 2.531 ± 13.692 ms/op
MathExact.C1_1.loopIncrementIOverflow 1000000 avgt 3 647.970 ± 65.451 ms/op
MathExact.C1_1.loopIncrementLInBounds 1000000 avgt 3 5.350 ± 25.080 ms/op
MathExact.C1_1.loopIncrementLOverflow 1000000 avgt 3 681.097 ± 72.604 ms/op
MathExact.C1_1.loopMultiplyIInBounds 1000000 avgt 3 1.552 ± 3.145 ms/op
MathExact.C1_1.loopMultiplyIOverflow 1000000 avgt 3 648.402 ± 62.995 ms/op
MathExact.C1_1.loopMultiplyLInBounds 1000000 avgt 3 2.501 ± 0.720 ms/op
MathExact.C1_1.loopMultiplyLOverflow 1000000 avgt 3 701.498 ± 47.948 ms/op
MathExact.C1_1.loopNegateIInBounds 1000000 avgt 3 2.074 ± 0.949 ms/op
MathExact.C1_1.loopNegateIOverflow 1000000 avgt 3 665.143 ± 537.941 ms/op
MathExact.C1_1.loopNegateLInBounds 1000000 avgt 3 5.487 ± 7.165 ms/op
MathExact.C1_1.loopNegateLOverflow 1000000 avgt 3 687.085 ± 20.738 ms/op
MathExact.C1_1.loopSubtractIInBounds 1000000 avgt 3 1.329 ± 0.769 ms/op
MathExact.C1_1.loopSubtractIOverflow 1000000 avgt 3 683.922 ± 70.434 ms/op
MathExact.C1_1.loopSubtractLInBounds 1000000 avgt 3 1.384 ± 0.386 ms/op
MathExact.C1_1.loopSubtractLOverflow 1000000 avgt 3 664.380 ± 480.847 ms/op
MathExact.C1_2.loopAddIInBounds 1000000 avgt 3 1.862 ± 0.815 ms/op
MathExact.C1_2.loopAddIOverflow 1000000 avgt 3 660.421 ± 506.723 ms/op
MathExact.C1_2.loopAddLInBounds 1000000 avgt 3 1.829 ± 0.221 ms/op
MathExact.C1_2.loopAddLOverflow 1000000 avgt 3 681.209 ± 78.976 ms/op
MathExact.C1_2.loopDecrementIInBounds 1000000 avgt 3 3.533 ± 11.302 ms/op
MathExact.C1_2.loopDecrementIOverflow 1000000 avgt 3 682.639 ± 225.392 ms/op
MathExact.C1_2.loopDecrementLInBounds 1000000 avgt 3 3.402 ± 1.031 ms/op
MathExact.C1_2.loopDecrementLOverflow 1000000 avgt 3 697.283 ± 306.867 ms/op
MathExact.C1_2.loopIncrementIInBounds 1000000 avgt 3 3.326 ± 5.072 ms/op
MathExact.C1_2.loopIncrementIOverflow 1000000 avgt 3 658.514 ± 636.731 ms/op
MathExact.C1_2.loopIncrementLInBounds 1000000 avgt 3 3.718 ± 0.422 ms/op
MathExact.C1_2.loopIncrementLOverflow 1000000 avgt 3 693.863 ± 49.201 ms/op
MathExact.C1_2.loopMultiplyIInBounds 1000000 avgt 3 1.924 ± 2.800 ms/op
MathExact.C1_2.loopMultiplyIOverflow 1000000 avgt 3 609.308 ± 94.814 ms/op
MathExact.C1_2.loopMultiplyLInBounds 1000000 avgt 3 3.459 ± 0.625 ms/op
MathExact.C1_2.loopMultiplyLOverflow 1000000 avgt 3 713.503 ± 556.995 ms/op
MathExact.C1_2.loopNegateIInBounds 1000000 avgt 3 3.195 ± 0.726 ms/op
MathExact.C1_2.loopNegateIOverflow 1000000 avgt 3 684.176 ± 27.164 ms/op
MathExact.C1_2.loopNegateLInBounds 1000000 avgt 3 3.483 ± 0.947 ms/op
MathExact.C1_2.loopNegateLOverflow 1000000 avgt 3 656.284 ± 582.286 ms/op
MathExact.C1_2.loopSubtractIInBounds 1000000 avgt 3 1.728 ± 0.315 ms/op
MathExact.C1_2.loopSubtractIOverflow 1000000 avgt 3 688.029 ± 25.201 ms/op
MathExact.C1_2.loopSubtractLInBounds 1000000 avgt 3 1.941 ± 0.169 ms/op
MathExact.C1_2.loopSubtractLOverflow 1000000 avgt 3 694.341 ± 339.431 ms/op
MathExact.C1_3.loopAddIInBounds 1000000 avgt 3 3.122 ± 0.910 ms/op
MathExact.C1_3.loopAddIOverflow 1000000 avgt 3 688.731 ± 308.210 ms/op
MathExact.C1_3.loopAddLInBounds 1000000 avgt 3 5.492 ± 36.236 ms/op
MathExact.C1_3.loopAddLOverflow 1000000 avgt 3 697.053 ± 229.958 ms/op
MathExact.C1_3.loopDecrementIInBounds 1000000 avgt 3 9.155 ± 72.182 ms/op
MathExact.C1_3.loopDecrementIOverflow 1000000 avgt 3 708.458 ± 788.701 ms/op
MathExact.C1_3.loopDecrementLInBounds 1000000 avgt 3 6.402 ± 3.658 ms/op
MathExact.C1_3.loopDecrementLOverflow 1000000 avgt 3 705.992 ± 213.542 ms/op
MathExact.C1_3.loopIncrementIInBounds 1000000 avgt 3 7.699 ± 61.434 ms/op
MathExact.C1_3.loopIncrementIOverflow 1000000 avgt 3 697.353 ± 105.457 ms/op
MathExact.C1_3.loopIncrementLInBounds 1000000 avgt 3 6.380 ± 0.839 ms/op
MathExact.C1_3.loopIncrementLOverflow 1000000 avgt 3 669.240 ± 522.870 ms/op
MathExact.C1_3.loopMultiplyIInBounds 1000000 avgt 3 3.225 ± 0.140 ms/op
MathExact.C1_3.loopMultiplyIOverflow 1000000 avgt 3 624.811 ± 457.059 ms/op
MathExact.C1_3.loopMultiplyLInBounds 1000000 avgt 3 6.110 ± 1.265 ms/op
MathExact.C1_3.loopMultiplyLOverflow 1000000 avgt 3 718.460 ± 68.166 ms/op
MathExact.C1_3.loopNegateIInBounds 1000000 avgt 3 6.085 ± 1.430 ms/op
MathExact.C1_3.loopNegateIOverflow 1000000 avgt 3 675.036 ± 341.177 ms/op
MathExact.C1_3.loopNegateLInBounds 1000000 avgt 3 9.410 ± 93.522 ms/op
MathExact.C1_3.loopNegateLOverflow 1000000 avgt 3 652.042 ± 166.119 ms/op
MathExact.C1_3.loopSubtractIInBounds 1000000 avgt 3 3.432 ± 11.899 ms/op
MathExact.C1_3.loopSubtractIOverflow 1000000 avgt 3 654.208 ± 120.258 ms/op
MathExact.C1_3.loopSubtractLInBounds 1000000 avgt 3 5.166 ± 38.529 ms/op
MathExact.C1_3.loopSubtractLOverflow 1000000 avgt 3 691.094 ± 80.676 ms/op
MathExact.C2.loopAddIInBounds 1000000 avgt 3 2.276 ± 1.750 ms/op
MathExact.C2.loopAddIOverflow 1000000 avgt 3 1.173 ± 1.392 ms/op
MathExact.C2.loopAddLInBounds 1000000 avgt 3 0.985 ± 0.167 ms/op
MathExact.C2.loopAddLOverflow 1000000 avgt 3 1.990 ± 5.310 ms/op
MathExact.C2.loopDecrementIInBounds 1000000 avgt 3 2.072 ± 0.173 ms/op
MathExact.C2.loopDecrementIOverflow 1000000 avgt 3 1.911 ± 0.288 ms/op
MathExact.C2.loopDecrementLInBounds 1000000 avgt 3 1.845 ± 0.424 ms/op
MathExact.C2.loopDecrementLOverflow 1000000 avgt 3 2.757 ± 27.268 ms/op
MathExact.C2.loopIncrementIInBounds 1000000 avgt 3 2.136 ± 0.517 ms/op
MathExact.C2.loopIncrementIOverflow 1000000 avgt 3 2.199 ± 4.024 ms/op
MathExact.C2.loopIncrementLInBounds 1000000 avgt 3 1.957 ± 0.365 ms/op
MathExact.C2.loopIncrementLOverflow 1000000 avgt 3 2.053 ± 0.779 ms/op
MathExact.C2.loopMultiplyIInBounds 1000000 avgt 3 1.174 ± 0.941 ms/op
MathExact.C2.loopMultiplyIOverflow 1000000 avgt 3 1.971 ± 10.040 ms/op
MathExact.C2.loopMultiplyLInBounds 1000000 avgt 3 0.997 ± 0.318 ms/op
MathExact.C2.loopMultiplyLOverflow 1000000 avgt 3 2.847 ± 4.548 ms/op
MathExact.C2.loopNegateIInBounds 1000000 avgt 3 4.783 ± 2.454 ms/op
MathExact.C2.loopNegateIOverflow 1000000 avgt 3 1.915 ± 0.009 ms/op
MathExact.C2.loopNegateLInBounds 1000000 avgt 3 2.824 ± 28.297 ms/op
MathExact.C2.loopNegateLOverflow 1000000 avgt 3 4.766 ± 32.627 ms/op
MathExact.C2.loopSubtractIInBounds 1000000 avgt 3 0.990 ± 0.264 ms/op
MathExact.C2.loopSubtractIOverflow 1000000 avgt 3 1.181 ± 2.120 ms/op
MathExact.C2.loopSubtractLInBounds 1000000 avgt 3 2.363 ± 1.575 ms/op
MathExact.C2.loopSubtractLOverflow 1000000 avgt 3 2.429 ± 7.120 ms/op
MathExact.C2_no_builtin_throw.loopAddIInBounds 1000000 avgt 3 1.040 ± 0.181 ms/op
MathExact.C2_no_builtin_throw.loopAddIOverflow 1000000 avgt 3 580.950 ± 112.050 ms/op
MathExact.C2_no_builtin_throw.loopAddLInBounds 1000000 avgt 3 1.223 ± 5.700 ms/op
MathExact.C2_no_builtin_throw.loopAddLOverflow 1000000 avgt 3 585.712 ± 61.699 ms/op
MathExact.C2_no_builtin_throw.loopDecrementIInBounds 1000000 avgt 3 2.114 ± 0.663 ms/op
MathExact.C2_no_builtin_throw.loopDecrementIOverflow 1000000 avgt 3 604.866 ± 578.502 ms/op
MathExact.C2_no_builtin_throw.loopDecrementLInBounds 1000000 avgt 3 2.167 ± 9.268 ms/op
MathExact.C2_no_builtin_throw.loopDecrementLOverflow 1000000 avgt 3 621.175 ± 225.858 ms/op
MathExact.C2_no_builtin_throw.loopIncrementIInBounds 1000000 avgt 3 1.950 ± 0.326 ms/op
MathExact.C2_no_builtin_throw.loopIncrementIOverflow 1000000 avgt 3 633.735 ± 830.255 ms/op
MathExact.C2_no_builtin_throw.loopIncrementLInBounds 1000000 avgt 3 2.397 ± 11.911 ms/op
MathExact.C2_no_builtin_throw.loopIncrementLOverflow 1000000 avgt 3 627.599 ± 141.709 ms/op
MathExact.C2_no_builtin_throw.loopMultiplyIInBounds 1000000 avgt 3 1.167 ± 1.187 ms/op
MathExact.C2_no_builtin_throw.loopMultiplyIOverflow 1000000 avgt 3 623.224 ± 298.374 ms/op
MathExact.C2_no_builtin_throw.loopMultiplyLInBounds 1000000 avgt 3 0.944 ± 0.743 ms/op
MathExact.C2_no_builtin_throw.loopMultiplyLOverflow 1000000 avgt 3 658.380 ± 137.021 ms/op
MathExact.C2_no_builtin_throw.loopNegateIInBounds 1000000 avgt 3 2.119 ± 0.642 ms/op
MathExact.C2_no_builtin_throw.loopNegateIOverflow 1000000 avgt 3 643.102 ± 452.213 ms/op
MathExact.C2_no_builtin_throw.loopNegateLInBounds 1000000 avgt 3 2.036 ± 0.862 ms/op
MathExact.C2_no_builtin_throw.loopNegateLOverflow 1000000 avgt 3 586.103 ± 26.173 ms/op
MathExact.C2_no_builtin_throw.loopSubtractIInBounds 1000000 avgt 3 2.552 ± 3.677 ms/op
MathExact.C2_no_builtin_throw.loopSubtractIOverflow 1000000 avgt 3 635.294 ± 217.034 ms/op
MathExact.C2_no_builtin_throw.loopSubtractLInBounds 1000000 avgt 3 1.093 ± 1.685 ms/op
MathExact.C2_no_builtin_throw.loopSubtractLOverflow 1000000 avgt 3 661.541 ± 1358.199 ms/op
MathExact.C2_no_intrinsics.loopAddIInBounds 1000000 avgt 3 2.185 ± 15.103 ms/op
MathExact.C2_no_intrinsics.loopAddIOverflow 1000000 avgt 3 831.812 ± 1260.546 ms/op
MathExact.C2_no_intrinsics.loopAddLInBounds 1000000 avgt 3 2.145 ± 0.088 ms/op
MathExact.C2_no_intrinsics.loopAddLOverflow 1000000 avgt 3 709.930 ± 658.722 ms/op
MathExact.C2_no_intrinsics.loopDecrementIInBounds 1000000 avgt 3 2.288 ± 0.950 ms/op
MathExact.C2_no_intrinsics.loopDecrementIOverflow 1000000 avgt 3 646.879 ± 186.231 ms/op
MathExact.C2_no_intrinsics.loopDecrementLInBounds 1000000 avgt 3 1.894 ± 0.421 ms/op
MathExact.C2_no_intrinsics.loopDecrementLOverflow 1000000 avgt 3 641.577 ± 323.040 ms/op
MathExact.C2_no_intrinsics.loopIncrementIInBounds 1000000 avgt 3 2.027 ± 0.249 ms/op
MathExact.C2_no_intrinsics.loopIncrementIOverflow 1000000 avgt 3 657.092 ± 229.818 ms/op
MathExact.C2_no_intrinsics.loopIncrementLInBounds 1000000 avgt 3 3.220 ± 16.992 ms/op
MathExact.C2_no_intrinsics.loopIncrementLOverflow 1000000 avgt 3 603.468 ± 73.240 ms/op
MathExact.C2_no_intrinsics.loopMultiplyIInBounds 1000000 avgt 3 1.295 ± 0.413 ms/op
MathExact.C2_no_intrinsics.loopMultiplyIOverflow 1000000 avgt 3 593.005 ± 576.291 ms/op
MathExact.C2_no_intrinsics.loopMultiplyLInBounds 1000000 avgt 3 1.093 ± 0.916 ms/op
MathExact.C2_no_intrinsics.loopMultiplyLOverflow 1000000 avgt 3 618.956 ± 554.204 ms/op
MathExact.C2_no_intrinsics.loopNegateIInBounds 1000000 avgt 3 2.035 ± 0.047 ms/op
MathExact.C2_no_intrinsics.loopNegateIOverflow 1000000 avgt 3 650.591 ± 1248.923 ms/op
MathExact.C2_no_intrinsics.loopNegateLInBounds 1000000 avgt 3 3.505 ± 20.475 ms/op
MathExact.C2_no_intrinsics.loopNegateLOverflow 1000000 avgt 3 660.686 ± 201.612 ms/op
MathExact.C2_no_intrinsics.loopSubtractIInBounds 1000000 avgt 3 1.109 ± 0.726 ms/op
MathExact.C2_no_intrinsics.loopSubtractIOverflow 1000000 avgt 3 670.468 ± 475.269 ms/op
MathExact.C2_no_intrinsics.loopSubtractLInBounds 1000000 avgt 3 1.208 ± 0.806 ms/op
MathExact.C2_no_intrinsics.loopSubtractLOverflow 1000000 avgt 3 597.522 ± 32.465 ms/op
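
For readers who don't have the benchmark source at hand, here is a minimal sketch of the kind of loop the `*Overflow` and `*InBounds` rows exercise. It is not the actual `MathExact` JMH benchmark from the PR: the class name `MathExactSketch` and the helper `countAddOverflows` are hypothetical, and I'm assuming the overflow case simply catches the `ArithmeticException` on every iteration.

```java
public class MathExactSketch {
    // Hypothetical helper (not from the PR): calls Math.addExact `size` times
    // and counts how many calls threw ArithmeticException.
    static int countAddOverflows(int size, int base) {
        int overflows = 0;
        for (int i = 0; i < size; i++) {
            try {
                // With base == Integer.MAX_VALUE, every call overflows,
                // since i + 1 is always at least 1.
                Math.addExact(base, i + 1);
            } catch (ArithmeticException e) {
                overflows++;
            }
        }
        return overflows;
    }

    public static void main(String[] args) {
        int size = 1_000_000; // same SIZE as in the tables above
        // "Overflow" shape: every iteration throws.
        System.out.println(countAddOverflows(size, Integer.MAX_VALUE)); // prints 1000000
        // "InBounds" shape: no iteration throws.
        System.out.println(countAddOverflows(size, 0)); // prints 0
    }
}
```

The point of the sketch is only that the overflow variant throws on every iteration, which is the pattern that triggered the repeated deoptimization and recompilation described in the PR; the timings in the tables come from the real JMH benchmark, not from this code.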
-------------
PR Comment: https://git.openjdk.org/jdk/pull/23916#issuecomment-2775707480