RFR: JDK-8270340: Base64 decodeBlock intrinsic for Power64 needs cleanup

Corey Ashford cashford at openjdk.java.net
Tue Jul 13 04:55:06 UTC 2021


This series of commits was written to accomplish several cleanup and Power10 optimization tasks for the Base64 decodeBlock intrinsic for Power64:
* Remove the ISA 3.1+ (Power10+) pextd instruction optimization in decodeBlock.  This "optimization" turned out to actually cause a performance hit.  Removing it gains back about 3% in performance.
* Introduce a constant block, similar to the one used by encodeBlock(), to speed up constant loading.
* Add the ISA 3.1+ xxpermx instruction and the align_prefix() method for use in a Power10 optimization for decodeBlock.  Please see the commit log for my concerns about this change.
* Implement the xxpermx-based decodeBlock algorithm for Power10+, which gives about a 5% performance boost.
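
For readers unfamiliar with the bit packing involved, the following is a minimal scalar sketch in Java, not the PPC stub code. It assumes that pextd (the ISA 3.1 parallel bits extract instruction) was used to gather the 6-bit values that remain after Base64 character translation; the message above does not spell that out, so treat it as an assumption. The class and method names (DecodeBlockSketch, parallelBitExtract, sextet, decode8) are hypothetical and exist only for this illustration.

```java
import java.nio.charset.StandardCharsets;

public class DecodeBlockSketch {
    // Software emulation of a 64-bit parallel bit extract: the bits of src
    // selected by mask are gathered, in order, into the low bits of the result.
    static long parallelBitExtract(long src, long mask) {
        long result = 0;
        int outPos = 0;
        for (int bit = 0; bit < 64; bit++) {
            if (((mask >>> bit) & 1L) != 0) {
                result |= ((src >>> bit) & 1L) << outPos;
                outPos++;
            }
        }
        return result;
    }

    // Translate one character of the standard Base64 alphabet to its 6-bit
    // value, or return -1 if the character is not in the alphabet.
    static int sextet(byte c) {
        if (c >= 'A' && c <= 'Z') return c - 'A';
        if (c >= 'a' && c <= 'z') return c - 'a' + 26;
        if (c >= '0' && c <= '9') return c - '0' + 52;
        if (c == '+') return 62;
        if (c == '/') return 63;
        return -1;
    }

    // Decode 8 Base64 characters starting at off into 6 bytes.
    static byte[] decode8(byte[] src, int off) {
        long lanes = 0;
        for (int i = 0; i < 8; i++) {
            int v = sextet(src[off + i]);
            if (v < 0) {
                throw new IllegalArgumentException("not a Base64 character");
            }
            // One 6-bit value per byte lane, first character most significant.
            lanes = (lanes << 8) | v;
        }
        // Drop the two unused high bits of every lane, leaving 48 packed bits.
        long packed = parallelBitExtract(lanes, 0x3f3f3f3f3f3f3f3fL);
        byte[] out = new byte[6];
        for (int i = 0; i < 6; i++) {
            out[i] = (byte) (packed >>> (40 - 8 * i));
        }
        return out;
    }
}
```

Calling decode8 on the ASCII bytes of "SGVsbG8h" yields the six bytes of "Hello!". The vectorized intrinsic performs the same translate-then-pack steps on many characters at once; this sketch only shows the data flow for a single 8-character group.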

More details can be found in the commit logs.

I want to note here that I looked into changing the loop_unrolls constant and found that, at large buffer sizes, values of 2 and 4 give some extra performance gain. For example, on a 20001-byte destination buffer, the speedup over intrinsic disabled rises from 4.7X (loop_unrolls=1) to 5.3X (loop_unrolls=4). On smaller buffer sizes, however, up to about 512 bytes, the larger values degrade performance relative to loop_unrolls=1. I have therefore decided to stick with the original value of 1, since I don't know where to focus the performance versus buffer length tradeoff.
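
A rough way to probe this kind of size dependence is sketched below in Java. It is only an illustration: it is a simple timing loop rather than a proper benchmark harness, it is not the method used to obtain the numbers above, and the class name, method name and chosen sizes are hypothetical. It uses only java.util.Base64 and would need to be run once with the intrinsic enabled and once with it disabled to compare.

```java
import java.util.Base64;
import java.util.Random;

public class DecodeSizeSweep {
    // Average nanoseconds per decode call for a destination buffer of
    // decodedLen bytes, measured after a warm-up pass of the same length.
    static double nanosPerDecode(int decodedLen, int iterations) {
        byte[] raw = new byte[decodedLen];
        new Random(42).nextBytes(raw);           // fixed seed for repeatable input
        byte[] encoded = Base64.getEncoder().encode(raw);
        byte[] dst = new byte[decodedLen];
        Base64.Decoder decoder = Base64.getDecoder();

        long written = 0;
        for (int i = 0; i < iterations; i++) {   // warm-up so the JIT can compile
            written += decoder.decode(encoded, dst);
        }
        long start = System.nanoTime();
        for (int i = 0; i < iterations; i++) {   // timed pass
            written += decoder.decode(encoded, dst);
        }
        long elapsed = System.nanoTime() - start;

        // Sanity check: every call must have produced decodedLen bytes.
        if (written != 2L * decodedLen * iterations) {
            throw new IllegalStateException("unexpected decoded length");
        }
        return (double) elapsed / iterations;
    }

    public static void main(String[] args) {
        int[] sizes = {48, 512, 4096, 20001};    // illustrative sizes only
        for (int size : sizes) {
            System.out.printf("%6d bytes: %.1f ns/decode%n",
                    size, nanosPerDecode(size, 20_000));
        }
    }
}
```

Comparing the per-size ratios between two runs shows where a change helps and where it hurts, which is the tradeoff described above.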

-------------

Commit messages:
 - macroAssembler_ppc.cpp: fix whitespace error
 - stubGenerator_ppc.cpp: decodeBlock(): Use xxpermx to improve performance of decodeBlock on Power10+
 - Add xxpermx instruction, and align_prefix() method
 - stubGenerator_cpp.cpp: decodeBlock(): use constant block for loading constants into vector registers for cleaner and faster code
 - stubGenerator_ppc.cpp: Remove p10 pextd optimization

Changes: https://git.openjdk.java.net/jdk/pull/4762/files
 Webrev: https://webrevs.openjdk.java.net/?repo=jdk&pr=4762&range=00
  Issue: https://bugs.openjdk.java.net/browse/JDK-8270340
  Stats: 437 lines in 5 files changed: 206 ins; 126 del; 105 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4762.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4762/head:pull/4762

PR: https://git.openjdk.java.net/jdk/pull/4762

