[riscv-port] RFR: 8279292: riscv: Intrinsify multiplyToLen and squareToLen
On behalf of Taiping Guo (guotaiping1@huawei.com) BigInteger intrinsic: MultiplyToLen and SquareToLen intrinsic are missed in current vm. They should be implemented. The JMH tests show that the MultiplyToLen intrinsic improve the performance by up to 2x ~ 3x and the SquareToLen intrinsic improve the performance by up to 1.8x ~ 2x when the length of BigInteger changed from 1 to 5000, compared with that of C2. JMH tests and results on D1 and Unmatched list as follows: [MyBenchmark.txt](https://github.com/openjdk/riscv-port/files/7779255/MyBenchmark.txt) [squareToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779247/squareToLen_unmatched.tx...) [squareToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779248/squareToLen_d1.txt) [multiplyToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779249/multiplyToLen_unmatched....) [multiplyToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779250/multiplyToLen_d1.txt) ------------- Commit messages: - 8279292: riscv: Intrinsify multiplyToLen and squareToLen Changes: https://git.openjdk.java.net/riscv-port/pull/38/files Webrev: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=38&range=00 Issue: https://bugs.openjdk.java.net/browse/JDK-8279292 Stats: 589 lines in 4 files changed: 589 ins; 0 del; 0 mod Patch: https://git.openjdk.java.net/riscv-port/pull/38.diff Fetch: git fetch https://git.openjdk.java.net/riscv-port pull/38/head:pull/38 PR: https://git.openjdk.java.net/riscv-port/pull/38
On Mon, 27 Dec 2021 08:25:22 GMT, Feilong Jiang <fjiang@openjdk.org> wrote:
BigInteger intrinsic: MultiplyToLen and SquareToLen intrinsic are missed in current vm. They should be implemented. The JMH tests show that the MultiplyToLen intrinsic improve the performance by up to 2x ~ 3x and the SquareToLen intrinsic improve the performance by up to 1.8x ~ 2x when the length of BigInteger changed from 1 to 5000, compared with that of C2.
Full jtreg tests on qemu and hotspot/jdk tier1 test on Unmathced are passed without new failures.
JMH tests and results on D1 and Unmatched list as follows: [MyBenchmark.txt](https://github.com/openjdk/riscv-port/files/7779255/MyBenchmark.txt)
[squareToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779247/squareToLen_unmatched.tx...) [squareToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779248/squareToLen_d1.txt) [multiplyToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779249/multiplyToLen_unmatched....) [multiplyToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779250/multiplyToLen_d1.txt)
Changes requested by fyang (Lead). src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3249:
3247: 3248: /** 3249: * Multiply 128 bit by 128. Unrolled inner loop.
Should be: "Multiply 128 bit by 128 bit. Unrolled inner loop." src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3381:
3379: 3380: /** 3381: * Code for BigInteger::multiplyToLen() instrinsic.
There is a typo here. Should be: ""Code for BigInteger::multiplyToLen() intrinsic." src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3417:
3415: mv(carry, zr); // carry = 0; 3416: 3417: Label L_multiply_64_or_128, L_done;
Suggestion: rename L_multiply_64_or_128 into L_multiply_64_x_64_loop src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3424:
3422: const Register jdx = tmp1; 3423: 3424: // if x and y are both 8 bytes aligend.
There is a typo here. Maybe "// Check if x and y are both 8-byte aligned." ------------- PR: https://git.openjdk.java.net/riscv-port/pull/38
On Mon, 27 Dec 2021 11:26:48 GMT, Fei Yang <fyang@openjdk.org> wrote:
Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision:
fix type and code renaming
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3249:
3247: 3248: /** 3249: * Multiply 128 bit by 128. Unrolled inner loop.
Should be: "Multiply 128 bit by 128 bit. Unrolled inner loop."
Fixed.
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3381:
3379: 3380: /** 3381: * Code for BigInteger::multiplyToLen() instrinsic.
There is a typo here. Should be: ""Code for BigInteger::multiplyToLen() intrinsic."
Fixed.
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3417:
3415: mv(carry, zr); // carry = 0; 3416: 3417: Label L_multiply_64_or_128, L_done;
Suggestion: rename L_multiply_64_or_128 into L_multiply_64_x_64_loop
Thanks for the suggestion, renamed.
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3424:
3422: const Register jdx = tmp1; 3423: 3424: // if x and y are both 8 bytes aligend.
There is a typo here. Maybe "// Check if x and y are both 8-byte aligned."
Oops! Fixed. ------------- PR: https://git.openjdk.java.net/riscv-port/pull/38
On Mon, 27 Dec 2021 11:36:07 GMT, Fei Yang <fyang@openjdk.org> wrote:
Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision:
fix type and code renaming
Changes requested by fyang (Lead).
@RealFYang Thank you for the review. I have addressed all your comments above. :-) ------------- PR: https://git.openjdk.java.net/riscv-port/pull/38
BigInteger intrinsic: MultiplyToLen and SquareToLen intrinsic are missed in current vm. They should be implemented. The JMH tests show that the MultiplyToLen intrinsic improve the performance by up to 2x ~ 3x and the SquareToLen intrinsic improve the performance by up to 1.8x ~ 2x when the length of BigInteger changed from 1 to 5000, compared with that of C2.
Full jtreg tests on qemu and hotspot/jdk tier1 test on Unmathced are passed without new failures.
JMH tests and results on D1 and Unmatched list as follows: [MyBenchmark.txt](https://github.com/openjdk/riscv-port/files/7779255/MyBenchmark.txt)
[squareToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779247/squareToLen_unmatched.tx...) [squareToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779248/squareToLen_d1.txt) [multiplyToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779249/multiplyToLen_unmatched....) [multiplyToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779250/multiplyToLen_d1.txt)
Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: fix type and code renaming ------------- Changes: - all: https://git.openjdk.java.net/riscv-port/pull/38/files - new: https://git.openjdk.java.net/riscv-port/pull/38/files/5d745571..872b0b96 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=38&range=01 - incr: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=38&range=00-01 Stats: 16 lines in 2 files changed: 0 ins; 3 del; 13 mod Patch: https://git.openjdk.java.net/riscv-port/pull/38.diff Fetch: git fetch https://git.openjdk.java.net/riscv-port pull/38/head:pull/38 PR: https://git.openjdk.java.net/riscv-port/pull/38
BigInteger intrinsic: MultiplyToLen and SquareToLen intrinsic are missed in current vm. They should be implemented. The JMH tests show that the MultiplyToLen intrinsic improve the performance by up to 2x ~ 3x and the SquareToLen intrinsic improve the performance by up to 1.8x ~ 2x when the length of BigInteger changed from 1 to 5000, compared with that of C2.
Full jtreg tests on qemu and hotspot/jdk tier1 test on Unmathced are passed without new failures.
JMH tests and results on D1 and Unmatched list as follows: [MyBenchmark.txt](https://github.com/openjdk/riscv-port/files/7779255/MyBenchmark.txt)
[squareToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779247/squareToLen_unmatched.tx...) [squareToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779248/squareToLen_d1.txt) [multiplyToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779249/multiplyToLen_unmatched....) [multiplyToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779250/multiplyToLen_d1.txt)
Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: fix typo ------------- Changes: - all: https://git.openjdk.java.net/riscv-port/pull/38/files - new: https://git.openjdk.java.net/riscv-port/pull/38/files/872b0b96..1d6c0cb3 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=38&range=02 - incr: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=38&range=01-02 Stats: 4 lines in 1 file changed: 0 ins; 0 del; 4 mod Patch: https://git.openjdk.java.net/riscv-port/pull/38.diff Fetch: git fetch https://git.openjdk.java.net/riscv-port pull/38/head:pull/38 PR: https://git.openjdk.java.net/riscv-port/pull/38
BigInteger intrinsic: MultiplyToLen and SquareToLen intrinsic are missed in current vm. They should be implemented. The JMH tests show that the MultiplyToLen intrinsic improve the performance by up to 2x ~ 3x and the SquareToLen intrinsic improve the performance by up to 1.8x ~ 2x when the length of BigInteger changed from 1 to 5000, compared with that of C2.
Full jtreg tests on qemu and hotspot/jdk tier1 test on Unmathced are passed without new failures.
JMH tests and results on D1 and Unmatched list as follows: [MyBenchmark.txt](https://github.com/openjdk/riscv-port/files/7779255/MyBenchmark.txt)
[squareToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779247/squareToLen_unmatched.tx...) [squareToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779248/squareToLen_d1.txt) [multiplyToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779249/multiplyToLen_unmatched....) [multiplyToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779250/multiplyToLen_d1.txt)
Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: recover deleted comment ------------- Changes: - all: https://git.openjdk.java.net/riscv-port/pull/38/files - new: https://git.openjdk.java.net/riscv-port/pull/38/files/1d6c0cb3..d5d6a5a1 Webrevs: - full: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=38&range=03 - incr: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=38&range=02-03 Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod Patch: https://git.openjdk.java.net/riscv-port/pull/38.diff Fetch: git fetch https://git.openjdk.java.net/riscv-port pull/38/head:pull/38 PR: https://git.openjdk.java.net/riscv-port/pull/38
On Mon, 27 Dec 2021 12:49:13 GMT, Feilong Jiang <fjiang@openjdk.org> wrote:
BigInteger intrinsic: MultiplyToLen and SquareToLen intrinsic are missed in current vm. They should be implemented. The JMH tests show that the MultiplyToLen intrinsic improve the performance by up to 2x ~ 3x and the SquareToLen intrinsic improve the performance by up to 1.8x ~ 2x when the length of BigInteger changed from 1 to 5000, compared with that of C2.
Full jtreg tests on qemu and hotspot/jdk tier1 test on Unmathced are passed without new failures.
JMH tests and results on D1 and Unmatched list as follows: [MyBenchmark.txt](https://github.com/openjdk/riscv-port/files/7779255/MyBenchmark.txt)
[squareToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779247/squareToLen_unmatched.tx...) [squareToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779248/squareToLen_d1.txt) [multiplyToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779249/multiplyToLen_unmatched....) [multiplyToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779250/multiplyToLen_d1.txt)
Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision:
recover deleted comment
Changes requested by fyang (Lead). src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3424:
3422: const Register jdx = tmp1; 3423: 3424: // Check if x and y are both 8-byte aligned.
I think the code logic for avoiding unaligned access should be under control of option AvoidUnalignedAccesses. ------------- PR: https://git.openjdk.java.net/riscv-port/pull/38
On Tue, 28 Dec 2021 01:32:08 GMT, Fei Yang <fyang@openjdk.org> wrote:
Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision:
recover deleted comment
src/hotspot/cpu/riscv/macroAssembler_riscv.cpp line 3424:
3422: const Register jdx = tmp1; 3423: 3424: // Check if x and y are both 8-byte aligned.
I think the code logic for avoiding unaligned access should be under control of option AvoidUnalignedAccesses.
The cod logic for avoiding unaligned access has been placed under the control of the AvoidUnalignedAccesses option, and the test cases related to BIgInteger are all passed when `-XX:-AvoidUnalignedAccesses` and `-XX:+AvoidUnalignedAccesses`. Thanks for your review! ------------- PR: https://git.openjdk.java.net/riscv-port/pull/38
BigInteger intrinsic: MultiplyToLen and SquareToLen intrinsic are missed in current vm. They should be implemented. The JMH tests show that the MultiplyToLen intrinsic improve the performance by up to 2x ~ 3x and the SquareToLen intrinsic improve the performance by up to 1.8x ~ 2x when the length of BigInteger changed from 1 to 5000, compared with that of C2.
Full jtreg tests on qemu and hotspot/jdk tier1 test on Unmathced are passed without new failures.
JMH tests and results on D1 and Unmatched list as follows: [MyBenchmark.txt](https://github.com/openjdk/riscv-port/files/7779255/MyBenchmark.txt)
[squareToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779247/squareToLen_unmatched.tx...) [squareToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779248/squareToLen_d1.txt) [multiplyToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779249/multiplyToLen_unmatched....) [multiplyToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779250/multiplyToLen_d1.txt)
Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: and avoid unaligned accesses option ------------- Changes: - all: https://git.openjdk.java.net/riscv-port/pull/38/files - new: https://git.openjdk.java.net/riscv-port/pull/38/files/d5d6a5a1..ae5180aa Webrevs: - full: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=38&range=04 - incr: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=38&range=03-04 Stats: 66 lines in 1 file changed: 16 ins; 14 del; 36 mod Patch: https://git.openjdk.java.net/riscv-port/pull/38.diff Fetch: git fetch https://git.openjdk.java.net/riscv-port pull/38/head:pull/38 PR: https://git.openjdk.java.net/riscv-port/pull/38
BigInteger intrinsic: MultiplyToLen and SquareToLen intrinsic are missed in current vm. They should be implemented. The JMH tests show that the MultiplyToLen intrinsic improve the performance by up to 2x ~ 3x and the SquareToLen intrinsic improve the performance by up to 1.8x ~ 2x when the length of BigInteger changed from 1 to 5000, compared with that of C2.
Full jtreg tests on qemu and hotspot/jdk tier1 test on Unmathced are passed without new failures.
JMH tests and results on D1 and Unmatched list as follows: [MyBenchmark.txt](https://github.com/openjdk/riscv-port/files/7779255/MyBenchmark.txt)
[squareToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779247/squareToLen_unmatched.tx...) [squareToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779248/squareToLen_d1.txt) [multiplyToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779249/multiplyToLen_unmatched....) [multiplyToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779250/multiplyToLen_d1.txt)
Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision: use 32-bit instructions to deal with length and index operations ------------- Changes: - all: https://git.openjdk.java.net/riscv-port/pull/38/files - new: https://git.openjdk.java.net/riscv-port/pull/38/files/ae5180aa..aca73fbe Webrevs: - full: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=38&range=05 - incr: https://webrevs.openjdk.java.net/?repo=riscv-port&pr=38&range=04-05 Stats: 31 lines in 1 file changed: 1 ins; 0 del; 30 mod Patch: https://git.openjdk.java.net/riscv-port/pull/38.diff Fetch: git fetch https://git.openjdk.java.net/riscv-port pull/38/head:pull/38 PR: https://git.openjdk.java.net/riscv-port/pull/38
On Wed, 29 Dec 2021 02:18:16 GMT, Feilong Jiang <fjiang@openjdk.org> wrote:
BigInteger intrinsic: MultiplyToLen and SquareToLen intrinsic are missed in current vm. They should be implemented. The JMH tests show that the MultiplyToLen intrinsic improve the performance by up to 2x ~ 3x and the SquareToLen intrinsic improve the performance by up to 1.8x ~ 2x when the length of BigInteger changed from 1 to 5000, compared with that of C2.
Full jtreg tests on qemu and hotspot/jdk tier1 test on Unmathced are passed without new failures.
JMH tests and results on D1 and Unmatched list as follows: [MyBenchmark.txt](https://github.com/openjdk/riscv-port/files/7779255/MyBenchmark.txt)
[squareToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779247/squareToLen_unmatched.tx...) [squareToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779248/squareToLen_d1.txt) [multiplyToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779249/multiplyToLen_unmatched....) [multiplyToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779250/multiplyToLen_d1.txt)
Feilong Jiang has updated the pull request incrementally with one additional commit since the last revision:
use 32-bit instructions to deal with length and index operations
Looks good. ------------- Marked as reviewed by fyang (Lead). PR: https://git.openjdk.java.net/riscv-port/pull/38
On Mon, 27 Dec 2021 08:25:22 GMT, Feilong Jiang <fjiang@openjdk.org> wrote:
BigInteger intrinsic: MultiplyToLen and SquareToLen intrinsic are missed in current vm. They should be implemented. The JMH tests show that the MultiplyToLen intrinsic improve the performance by up to 2x ~ 3x and the SquareToLen intrinsic improve the performance by up to 1.8x ~ 2x when the length of BigInteger changed from 1 to 5000, compared with that of C2.
Full jtreg tests on qemu and hotspot/jdk tier1 test on Unmathced are passed without new failures.
JMH tests and results on D1 and Unmatched list as follows: [MyBenchmark.txt](https://github.com/openjdk/riscv-port/files/7779255/MyBenchmark.txt)
[squareToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779247/squareToLen_unmatched.tx...) [squareToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779248/squareToLen_d1.txt) [multiplyToLen_unmatched.txt](https://github.com/openjdk/riscv-port/files/7779249/multiplyToLen_unmatched....) [multiplyToLen_d1.txt](https://github.com/openjdk/riscv-port/files/7779250/multiplyToLen_d1.txt)
This pull request has now been integrated. Changeset: b396cdda Author: Feilong Jiang <fjiang@openjdk.org> Committer: Fei Yang <fyang@openjdk.org> URL: https://git.openjdk.java.net/riscv-port/commit/b396cdda88b703918fe89d80261b0... Stats: 595 lines in 4 files changed: 589 ins; 0 del; 6 mod 8279292: riscv: Intrinsify multiplyToLen and squareToLen Co-authored-by: Taiping Guo <guotaiping1@huawei.com> Reviewed-by: fyang ------------- PR: https://git.openjdk.java.net/riscv-port/pull/38
participants (3)
-
Fei Yang
-
Feilong Jiang
-
guotaiping1