RFR: JDK-8270340: Base64 decodeBlock intrinsic for Power64 needs cleanup
Corey Ashford
cashford at openjdk.java.net
Tue Jul 13 04:55:06 UTC 2021
This series of commits was written to accomplish several cleanup and Power10 optimization tasks for the Base64 decodeBlock intrinsic for Power64:
* Remove the ISA 3.1+ (Power10+) pextd instruction optimization in decodeBlock. This "optimization" turned out to actually cause a performance hit. Removing it gains back about 3% in performance.
* Introduce a constant block, similar to that in use by encodeBlock() to speed up constant loading.
* Add the ISA 3.1+ xxpermx instruction and align_prefix() method for use in a Power10 optimization for decodeBlock. Please see the commit log for my concerns about this change.
* Implement the xxpermx-based decodeBlock algorithm for Power10+, which gives about a 5% performance boost.
More details can be found in the commit logs.
I want to note here that I looked into changing the loop_unrolls constant, and found that at large buffer sizes, the values of 2 and 4 give some extra performance gain. For example, on a 20001-byte destination buffer, I see an increase from 4.7X over intrinsic disabled (loop_unrolls=1), to 5.3X over intrinsic disabled (loop_unrolls=4), but on smaller buffer sizes, up to about 512, it causes performance degradation over loop_unrolls=1, so I have decided to stick with the original value of 1, since I don't know where to focus the performance versus buffer length tradeoff.
-------------
Commit messages:
- macroAssembler_ppc.cpp: fix whitespace error
- stubGenerator_ppc.cpp: decodeBlock(): Use xxpermx to improve performance of decodeBlock on Power10+
- Add xxpermx instruction, and align_prefix() method
- stubGenerator_cpp.cpp: decodeBlock(): use constant block for loading constants into vector registers for cleaner and faster code
- stubGenerator_ppc.cpp: Remove p10 pextd optimization
Changes: https://git.openjdk.java.net/jdk/pull/4762/files
Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4762&range=00
Issue: https://bugs.openjdk.java.net/browse/JDK-8270340
Stats: 437 lines in 5 files changed: 206 ins; 126 del; 105 mod
Patch: https://git.openjdk.java.net/jdk/pull/4762.diff
Fetch: git fetch https://git.openjdk.java.net/jdk pull/4762/head:pull/4762
PR: https://git.openjdk.java.net/jdk/pull/4762
More information about the hotspot-runtime-dev
mailing list