RFR: 8270340: Base64 decodeBlock intrinsic for Power64 needs cleanup [v2]

Corey Ashford cashford at openjdk.java.net
Sat Jul 17 02:01:23 UTC 2021


> This series of commits was written to accomplish several cleanup and Power10 optimization tasks for the Base64 decodeBlock intrinsic for Power64:
> * Remove the ISA 3.1+ (Power10+) pextd instruction optimization in decodeBlock.  This "optimization" turned out to actually cause a performance hit.  Removing it gains back about 3% in performance.
> * Introduce a constant block, similar to that in use by encodeBlock() to speed up constant loading.
> * Add the ISA 3.1+  xxpermx instruction and align_prefix() method for use in a Power10 optimization for decodeBlock.  Please see the commit log for my concerns about this change.
> * Implement the xxpermx-based decodeBlock algorithm for Power10+, which gives about a 5% performance boost.
> 
> More details can be found in the commit logs.
> 
> I want to note here that I looked into changing the loop_unrolls constant, and found that at large buffer sizes, the values of 2 and 4 give some extra performance gain. For example, on a 20001-byte destination buffer, I see an increase from 4.7X over intrinsic disabled (loop_unrolls=1), to 5.3X over intrinsic disabled (loop_unrolls=4), but on smaller buffer sizes, up to about 512, it causes performance degradation over loop_unrolls=1, so I have decided to stick with the original value of 1, since I don't know where to focus the performance versus buffer length tradeoff.

Corey Ashford has updated the pull request incrementally with one additional commit since the last revision:

  stubGenerator_ppc.cpp: fixes for feedback from Kazunori Ogata and Martin Doerr
  
  decodeBlock changes:
  
  * Remove unroll loop and associated comments
  * Change comments referring to "P10+" to "P10 (or later)" to remove ambiguity
  * Make clear the lack of need for "aligh_prefix()" calls when using xxpermx due to the align(32) call
  
  The following change isn't based on feedback:
  
  encodeBlock changes:
  * Fix a comment that still referred to "unrolled loop".  Unrolling was removed in an earlier commit.

-------------

Changes:
  - all: https://git.openjdk.java.net/jdk/pull/4762/files
  - new: https://git.openjdk.java.net/jdk/pull/4762/files/bfa1d3be..1400e26d

Webrevs:
 - full: https://webrevs.openjdk.java.net/?repo=jdk&pr=4762&range=01
 - incr: https://webrevs.openjdk.java.net/?repo=jdk&pr=4762&range=00-01

  Stats: 275 lines in 1 file changed: 103 ins; 127 del; 45 mod
  Patch: https://git.openjdk.java.net/jdk/pull/4762.diff
  Fetch: git fetch https://git.openjdk.java.net/jdk pull/4762/head:pull/4762

PR: https://git.openjdk.java.net/jdk/pull/4762


More information about the hotspot-dev mailing list