RFR: 8285040: PPC64 intrinsics for divideUnsigned and remainderUnsigned methods in java.lang.Integer and java.lang.Long [v3]
Martin Doerr
mdoerr at openjdk.java.net
Wed Apr 20 14:05:34 UTC 2022
> Add match rules for UDivI, UModI, UDivL, UModL as on x86 (JDK-8282221). PPC64 has no DivMod instruction that delivers both results at once.
> Note: The x86 tests cannot currently be extended to this platform because https://bugs.openjdk.java.net/browse/JDK-8280120 is not yet implemented.
>
> Removed UDivI and UModI again in the second commit because performance was worse. C2 can optimize better without intrinsification.
>
> LongDivMod without UDivL, UModL on Power9:
>
> Iteration 1: 5453.092 ns/op
> Iteration 2: 5480.991 ns/op
> Iteration 3: 5465.746 ns/op
> Iteration 4: 5496.196 ns/op
> Iteration 5: 5500.508 ns/op
>
>
> With UDivL, UModL:
>
> Iteration 1: 3253.293 ns/op
> Iteration 2: 3253.079 ns/op
> Iteration 3: 3252.806 ns/op
> Iteration 4: 3252.636 ns/op
> Iteration 5: 3252.717 ns/op
>
>
> Complete results:
>
> Without UDivL, UModL:
>
> Benchmark                               (BUFFER_SIZE)  (divisorType)  Mode  Cnt     Score     Error  Units
> LongDivMod.testDivideRemainderUnsigned           1024          mixed  avgt   25  5482.364 ±  18.448  ns/op
> LongDivMod.testDivideRemainderUnsigned           1024       positive  avgt   25  4722.370 ±   2.314  ns/op
> LongDivMod.testDivideRemainderUnsigned           1024       negative  avgt   25  2024.052 ±   0.604  ns/op
> LongDivMod.testDivideUnsigned                    1024          mixed  avgt   25  4772.528 ±  63.147  ns/op
> LongDivMod.testDivideUnsigned                    1024       positive  avgt   25  3711.178 ±   1.178  ns/op
> LongDivMod.testDivideUnsigned                    1024       negative  avgt   25  1195.149 ±   0.822  ns/op
> LongDivMod.testRemainderUnsigned                 1024          mixed  avgt   25  4753.722 ± 115.171  ns/op
> LongDivMod.testRemainderUnsigned                 1024       positive  avgt   25  3749.799 ±   5.935  ns/op
> LongDivMod.testRemainderUnsigned                 1024       negative  avgt   25  1488.802 ±   0.628  ns/op
>
>
> With UDivL, UModL:
>
> Benchmark                               (BUFFER_SIZE)  (divisorType)  Mode  Cnt     Score     Error  Units
> LongDivMod.testDivideRemainderUnsigned           1024          mixed  avgt   25  3253.162 ±   1.019  ns/op
> LongDivMod.testDivideRemainderUnsigned           1024       positive  avgt   25  3252.280 ±   1.608  ns/op
> LongDivMod.testDivideRemainderUnsigned           1024       negative  avgt   25  3252.933 ±   1.850  ns/op
> LongDivMod.testDivideUnsigned                    1024          mixed  avgt   25  1648.233 ±   1.830  ns/op
> LongDivMod.testDivideUnsigned                    1024       positive  avgt   25  1648.639 ±   0.816  ns/op
> LongDivMod.testDivideUnsigned                    1024       negative  avgt   25  1646.247 ±   3.835  ns/op
> LongDivMod.testRemainderUnsigned                 1024          mixed  avgt   25  1766.701 ±   1.897  ns/op
> LongDivMod.testRemainderUnsigned                 1024       positive  avgt   25  1767.413 ±   1.450  ns/op
> LongDivMod.testRemainderUnsigned                 1024       negative  avgt   25  1767.216 ±   1.800  ns/op
Martin Doerr has updated the pull request incrementally with one additional commit since the last revision:
Enable UseDivMod optimization for unsigned cases.
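
For illustration only: on a platform without a combined DivMod instruction, a quotient and remainder of the same operands can share a single division, with the remainder derived as dividend - quotient * divisor. The sketch below shows that identity at the Java level for the unsigned long case. The class and the helper name divRemUnsigned are hypothetical and not part of the patch; whether C2 emits exactly this shape for UDivL/UModL on PPC64 is an assumption here, not something taken from the change itself.

```java
// Illustrative sketch, not code from the pull request.
final class UnsignedDivModSketch {

    // Hypothetical helper: returns {quotient, remainder} using one division.
    // The subtraction and multiplication wrap modulo 2^64, which still
    // yields the exact unsigned remainder because 0 <= r < divisor.
    static long[] divRemUnsigned(long dividend, long divisor) {
        long q = Long.divideUnsigned(dividend, divisor);
        long r = dividend - q * divisor;
        return new long[] { q, r };
    }

    public static void main(String[] args) {
        // Operands cover "positive", "negative" and mixed divisor cases,
        // loosely mirroring the divisorType parameter of the benchmark.
        long[][] cases = {
            { -1L, 10L },
            { 100L, 7L },
            { Long.MIN_VALUE, -3L },
            { 5L, -1L },
        };
        for (long[] c : cases) {
            long[] qr = divRemUnsigned(c[0], c[1]);
            boolean matches =
                qr[0] == Long.divideUnsigned(c[0], c[1])
                && qr[1] == Long.remainderUnsigned(c[0], c[1]);
            System.out.println(
                Long.toUnsignedString(c[0]) + " / " + Long.toUnsignedString(c[1])
                + " -> q=" + Long.toUnsignedString(qr[0])
                + " r=" + Long.toUnsignedString(qr[1])
                + " matches library: " + matches);
        }
    }
}
```

The helper agrees with Long.remainderUnsigned for every nonzero divisor; a zero divisor throws ArithmeticException from Long.divideUnsigned, as with the library methods.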
-------------
Changes:
- all: https://git.openjdk.java.net/jdk/pull/8304/files
- new: https://git.openjdk.java.net/jdk/pull/8304/files/d19eef58..0037f453
Webrevs:
- full: https://webrevs.openjdk.java.net/?repo=jdk&pr=8304&range=02
- incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=8304&range=01-02
Stats: 10 lines in 1 file changed: 10 ins; 0 del; 0 mod
Patch: https://git.openjdk.java.net/jdk/pull/8304.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/8304/head:pull/8304
PR: https://git.openjdk.java.net/jdk/pull/8304