RFR: 8285390: PPC64: Handle integral division overflow during parsing
Lutz Schmidt
lucy at openjdk.java.net
Mon Apr 25 15:32:29 UTC 2022
On Thu, 21 Apr 2022 16:39:14 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:
> Move check for possible overflow from backend into ideal graph (like on x86). Makes the .ad file smaller. `parse_ppc.cpp` is an exact copy from x86.
>
> Before this change on Power9:
>
> Benchmark                                  (BUFFER_SIZE)  (divisorType)  Mode  Cnt     Score      Error  Units
> IntegerDivMod.testDivide                            1024          mixed  avgt    5  1627.781 ±    1.197  ns/op
> IntegerDivMod.testDivide                            1024       positive  avgt    5  1628.640 ±    3.058  ns/op
> IntegerDivMod.testDivide                            1024       negative  avgt    5  1628.506 ±    1.030  ns/op
> IntegerDivMod.testDivideHoistedDivisor              1024          mixed  avgt    5  1620.669 ±    2.077  ns/op
> IntegerDivMod.testDivideHoistedDivisor              1024       positive  avgt    5  1619.910 ±    2.384  ns/op
> IntegerDivMod.testDivideHoistedDivisor              1024       negative  avgt    5  1619.444 ±    1.282  ns/op
> IntegerDivMod.testDivideKnownPositive               1024          mixed  avgt    5  1631.709 ±    1.992  ns/op
> IntegerDivMod.testDivideKnownPositive               1024       positive  avgt    5  1630.719 ±    0.731  ns/op
> IntegerDivMod.testDivideKnownPositive               1024       negative  avgt    5  1631.650 ±    5.654  ns/op
> IntegerDivMod.testDivideRemainderUnsigned           1024          mixed  avgt    5  1834.094 ±    2.812  ns/op
> IntegerDivMod.testDivideRemainderUnsigned           1024       positive  avgt    5  1833.026 ±    3.489  ns/op
> IntegerDivMod.testDivideRemainderUnsigned           1024       negative  avgt    5  1831.663 ±    0.612  ns/op
> IntegerDivMod.testDivideUnsigned                    1024          mixed  avgt    5  1620.842 ±    0.711  ns/op
> IntegerDivMod.testDivideUnsigned                    1024       positive  avgt    5  1621.297 ±    1.197  ns/op
> IntegerDivMod.testDivideUnsigned                    1024       negative  avgt    5  1621.373 ±    1.192  ns/op
> IntegerDivMod.testRemainderUnsigned                 1024          mixed  avgt    5  1753.691 ±   19.836  ns/op
> IntegerDivMod.testRemainderUnsigned                 1024       positive  avgt    5  1753.304 ±   17.150  ns/op
> IntegerDivMod.testRemainderUnsigned                 1024       negative  avgt    5  1753.961 ±   16.264  ns/op
>
>
> New:
>
> Benchmark                                  (BUFFER_SIZE)  (divisorType)  Mode  Cnt     Score      Error  Units
> IntegerDivMod.testDivide                            1024          mixed  avgt    5  1627.701 ±    0.737  ns/op
> IntegerDivMod.testDivide                            1024       positive  avgt    5  1627.247 ±    1.831  ns/op
> IntegerDivMod.testDivide                            1024       negative  avgt    5  1626.695 ±    1.081  ns/op
> IntegerDivMod.testDivideHoistedDivisor              1024          mixed  avgt    5  1617.744 ±    0.471  ns/op
> IntegerDivMod.testDivideHoistedDivisor              1024       positive  avgt    5  1617.825 ±    0.992  ns/op
> IntegerDivMod.testDivideHoistedDivisor              1024       negative  avgt    5  1617.968 ±    0.771  ns/op
> IntegerDivMod.testDivideKnownPositive               1024          mixed  avgt    5  1623.766 ±    2.621  ns/op
> IntegerDivMod.testDivideKnownPositive               1024       positive  avgt    5  1626.698 ±    7.012  ns/op
> IntegerDivMod.testDivideKnownPositive               1024       negative  avgt    5  1623.288 ±    3.133  ns/op
> IntegerDivMod.testDivideRemainderUnsigned           1024          mixed  avgt    5  1832.516 ±    2.889  ns/op
> IntegerDivMod.testDivideRemainderUnsigned           1024       positive  avgt    5  1833.952 ±    4.185  ns/op
> IntegerDivMod.testDivideRemainderUnsigned           1024       negative  avgt    5  1833.491 ±    1.200  ns/op
> IntegerDivMod.testDivideUnsigned                    1024          mixed  avgt    5  1620.972 ±    0.878  ns/op
> IntegerDivMod.testDivideUnsigned                    1024       positive  avgt    5  1620.915 ±    1.106  ns/op
> IntegerDivMod.testDivideUnsigned                    1024       negative  avgt    5  1621.276 ±    0.756  ns/op
> IntegerDivMod.testRemainderUnsigned                 1024          mixed  avgt    5  1754.744 ±   18.203  ns/op
> IntegerDivMod.testRemainderUnsigned                 1024       positive  avgt    5  1753.559 ±   19.693  ns/op
> IntegerDivMod.testRemainderUnsigned                 1024       negative  avgt    5  1752.696 ±   16.449  ns/op
>
>
> Performance is not impacted. The new code would allow better optimization if C2 used information about the inputs (dividend != min or divisor != -1). Maybe in the future.
>
> Before this change on Power9:
>
> Benchmark                               (BUFFER_SIZE)  (divisorType)  Mode  Cnt     Score      Error  Units
> LongDivMod.testDivide                            1024          mixed  avgt    5  1760.504 ±   29.350  ns/op
> LongDivMod.testDivide                            1024       positive  avgt    5  1762.440 ±   32.993  ns/op
> LongDivMod.testDivide                            1024       negative  avgt    5  1765.134 ±   27.121  ns/op
> LongDivMod.testDivideHoistedDivisor              1024          mixed  avgt    5  1693.123 ±  159.356  ns/op
> LongDivMod.testDivideHoistedDivisor              1024       positive  avgt    5  1696.499 ±  168.287  ns/op
> LongDivMod.testDivideHoistedDivisor              1024       negative  avgt    5  1696.060 ±  167.528  ns/op
> LongDivMod.testDivideKnownPositive               1024          mixed  avgt    5  6674.115 ± 1700.436  ns/op
> LongDivMod.testDivideKnownPositive               1024       positive  avgt    5  2026.646 ±  234.461  ns/op
> LongDivMod.testDivideKnownPositive               1024       negative  avgt    5   938.109 ± 2480.535  ns/op
> LongDivMod.testDivideRemainderUnsigned           1024          mixed  avgt    5  1817.386 ±    5.344  ns/op
> LongDivMod.testDivideRemainderUnsigned           1024       positive  avgt    5  1822.236 ±    6.462  ns/op
> LongDivMod.testDivideRemainderUnsigned           1024       negative  avgt    5  1822.272 ±    2.657  ns/op
> LongDivMod.testDivideUnsigned                    1024          mixed  avgt    5  1615.490 ±    0.885  ns/op
> LongDivMod.testDivideUnsigned                    1024       positive  avgt    5  1611.956 ±    3.900  ns/op
> LongDivMod.testDivideUnsigned                    1024       negative  avgt    5  1614.098 ±   10.490  ns/op
> LongDivMod.testRemainderUnsigned                 1024          mixed  avgt    5  1736.859 ±    9.652  ns/op
> LongDivMod.testRemainderUnsigned                 1024       positive  avgt    5  1740.197 ±    9.719  ns/op
> LongDivMod.testRemainderUnsigned                 1024       negative  avgt    5  1738.892 ±   18.520  ns/op
>
>
> New:
>
> Benchmark                               (BUFFER_SIZE)  (divisorType)  Mode  Cnt     Score      Error  Units
> LongDivMod.testDivide                            1024          mixed  avgt    5  1627.228 ±    3.282  ns/op
> LongDivMod.testDivide                            1024       positive  avgt    5  1627.452 ±    1.874  ns/op
> LongDivMod.testDivide                            1024       negative  avgt    5  1626.685 ±    1.059  ns/op
> LongDivMod.testDivideHoistedDivisor              1024          mixed  avgt    5  1618.192 ±    0.369  ns/op
> LongDivMod.testDivideHoistedDivisor              1024       positive  avgt    5  1618.181 ±    0.500  ns/op
> LongDivMod.testDivideHoistedDivisor              1024       negative  avgt    5  1617.882 ±    0.410  ns/op
> LongDivMod.testDivideKnownPositive               1024          mixed  avgt    5  2367.842 ±  228.570  ns/op
> LongDivMod.testDivideKnownPositive               1024       positive  avgt    5  1702.237 ±   15.417  ns/op
> LongDivMod.testDivideKnownPositive               1024       negative  avgt    5   844.757 ± 1687.221  ns/op
> LongDivMod.testDivideRemainderUnsigned           1024          mixed  avgt    5  1825.526 ±    2.607  ns/op
> LongDivMod.testDivideRemainderUnsigned           1024       positive  avgt    5  1825.752 ±    4.904  ns/op
> LongDivMod.testDivideRemainderUnsigned           1024       negative  avgt    5  1826.059 ±    3.236  ns/op
> LongDivMod.testDivideUnsigned                    1024          mixed  avgt    5  1621.620 ±    1.818  ns/op
> LongDivMod.testDivideUnsigned                    1024       positive  avgt    5  1622.589 ±    4.129  ns/op
> LongDivMod.testDivideUnsigned                    1024       negative  avgt    5  1616.119 ±   16.095  ns/op
> LongDivMod.testRemainderUnsigned                 1024          mixed  avgt    5  1740.670 ±   13.196  ns/op
> LongDivMod.testRemainderUnsigned                 1024       positive  avgt    5  1745.188 ±    9.884  ns/op
> LongDivMod.testRemainderUnsigned                 1024       negative  avgt    5  1742.949 ±    7.007  ns/op
>
>
> Long division performance is a bit better overall, but only `testDivideKnownPositive` benefits significantly.
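
For reference, the overflow case in the quoted description is `MIN_VALUE / -1`, the only signed division whose mathematical result does not fit the type. Java requires it to yield `MIN_VALUE` for the quotient and `0` for the remainder without trapping. Below is a minimal Java-level sketch of such a guard, assuming it tests for a divisor of -1 and takes a separate path in that case. The class and method names are hypothetical, and the sketch mirrors language semantics only; it is not the code in `parse_ppc.cpp`.

```java
public class DivOverflowSketch {
    // Hypothetical helper: division with an explicit guard for divisor == -1.
    // Assumption: the guarded path computes the negated dividend, which wraps
    // to Integer.MIN_VALUE when the dividend is Integer.MIN_VALUE.
    static int guardedDiv(int dividend, int divisor) {
        if (divisor == -1) {
            return -dividend;          // two's-complement negation, no trap
        }
        return dividend / divisor;     // cannot overflow on this path
    }

    // Hypothetical helper: the remainder for divisor == -1 is always 0.
    static int guardedRem(int dividend, int divisor) {
        if (divisor == -1) {
            return 0;
        }
        return dividend % divisor;     // cannot overflow on this path
    }

    public static void main(String[] args) {
        System.out.println(guardedDiv(Integer.MIN_VALUE, -1)); // prints -2147483648
        System.out.println(guardedRem(Integer.MIN_VALUE, -1)); // prints 0
        System.out.println(guardedDiv(7, -1));                 // prints -7
        System.out.println(guardedDiv(7, 2));                  // prints 3
    }
}
```

If the compiler can prove that the dividend is never `MIN_VALUE` or that the divisor is never -1, the guard becomes dead code and only the plain division remains. That is the optimization opportunity the description mentions.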
Changes look good to me.
Thanks for relentlessly seeking performance.
-------------
Marked as reviewed by lucy (Reviewer).
PR: https://git.openjdk.java.net/jdk/pull/8343