RFR: 8342095: Add autovectorizer support for subword vector casts [v12]

Emanuel Peter epeter at openjdk.org
Mon May 5 13:53:53 UTC 2025


On Sat, 3 May 2025 17:29:39 GMT, Jasmine Karthikeyan <jkarthikeyan at openjdk.org> wrote:

>> Jasmine Karthikeyan has updated the pull request incrementally with one additional commit since the last revision:
>> 
>>   Whitespace and benchmark tweak
>
> Thanks a lot for running the benchmark on your AVX512 machine! The results are very interesting. In the char cases it looks like we over-unroll the loop with SuperWord enabled even though we don't end up vectorizing it; fixing that could solve the slowdown. Since you mentioned the unroll amount was 32x, it might be unrolling to fill a vector (512 bits / 16 bits per char = 32 lanes).
> 
>> Wait, but you seem to say that you want to support `casting to T_CHAR`. But is the issue not casting FROM char?
> 
> You are correct, I think that is my mistake. It looks like casting to char is supported because stores to both short and char become `StoreC`, but casting from char isn't supported because we have no `VectorCastC2X` node. I'll update the bug to make it more accurate.
> 
> I've also pushed a small commit to remove some extra whitespace and to make the benchmark run faster.
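
For readers following along, the two char cases discussed in the quoted reply can be sketched as plain loops. This is only an illustration: the class and method names are hypothetical and the loop shapes are an assumption about which cases are meant, not code from the PR or the benchmark.

```java
// Hypothetical illustration of the two char cast directions; not taken from the PR.
class CharCastSketch {
    // Cast TO char: the store of a char value. Per the quoted reply, stores to
    // short and to char both become StoreC, which is why this direction works.
    static void shortToChar(short[] src, char[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = (char) src[i];
        }
    }

    // Cast FROM char: the char is zero-extended to int. Per the quoted reply,
    // this direction is not supported because there is no VectorCastC2X node.
    static void charToInt(char[] src, int[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    public static void main(String[] args) {
        short[] s = { -1, 65 };
        char[] c = new char[s.length];
        int[] n = new int[s.length];
        shortToChar(s, c);
        charToInt(c, n);
        System.out.println(java.util.Arrays.toString(n));
    }
}
```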

@jaskarth I just checked the internal testing and saw this failure with `-XX:UseAVX=1`:


Failed IR Rules (2) of Methods (2)
----------------------------------
1) Method "public java.lang.Object[] compiler.loopopts.superword.TestCompatibleUseDefTypeSize.testByteToLong(byte[],long[])" - [Failed IR rules: 1]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(phase={DEFAULT}, applyIfPlatformAnd={}, applyIfCPUFeatureOr={}, counts={"_#V#VECTOR_CAST_B2L#_", "_ at min(max_byte, max_long)", ">0"}, applyIfPlatform={}, applyIfPlatformOr={}, failOn={}, applyIfOr={"AlignVector", "false", "UseCompactObjectHeaders", "false"}, applyIfCPUFeatureAnd={}, applyIf={}, applyIfCPUFeature={"avx", "true"}, applyIfAnd={}, applyIfNot={})"
     > Phase "PrintIdeal":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(VectorCastB2X.*)+(\\s){2}===.*vector[A-Za-z]<J,2>)"
           - Failed comparison: [found] 0 > 0 [given]
           - No nodes matched!

2) Method "public java.lang.Object[] compiler.loopopts.superword.TestCompatibleUseDefTypeSize.testLongToByte(long[],byte[])" - [Failed IR rules: 1]:
   * @IR rule 1: "@compiler.lib.ir_framework.IR(phase={DEFAULT}, applyIfPlatformAnd={}, applyIfCPUFeatureOr={}, counts={"_#V#VECTOR_CAST_L2B#_", "_ at min(max_long, max_byte)", ">0"}, applyIfPlatform={}, applyIfPlatformOr={}, failOn={}, applyIfOr={"AlignVector", "false", "UseCompactObjectHeaders", "false"}, applyIfCPUFeatureAnd={}, applyIf={}, applyIfCPUFeature={"avx", "true"}, applyIfAnd={}, applyIfNot={})"
     > Phase "PrintIdeal":
       - counts: Graph contains wrong number of nodes:
         * Constraint 1: "(\\d+(\\s){2}(VectorCastL2X.*)+(\\s){2}===.*vector[A-Za-z]<B,2>)"
           - Failed comparison: [found] 0 > 0 [given]
           - No nodes matched!
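
For context, here is a rough sketch of the loop shapes the two failing methods presumably exercise. The method names and signatures follow the log above, but the bodies, the class name, and the `expectedLanes` helper are assumptions for illustration, and the IR framework annotations are omitted so the snippet stands alone. The helper mirrors the `min(max_byte, max_long)` size in the IR rule; assuming integer vectors are limited to 16 bytes under `-XX:UseAVX=1`, it gives 2 lanes, which matches the `<J,2>` and `<B,2>` in the failed constraints.

```java
// Hypothetical stand-in for the failing test methods; the real bodies are not shown in the log.
class SubwordCastSketch {
    // Widening cast: each byte is sign-extended to a long.
    static Object[] testByteToLong(byte[] src, long[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
        return new Object[] { src, dst };
    }

    // Narrowing cast: each long is truncated to its low 8 bits.
    static Object[] testLongToByte(long[] src, byte[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = (byte) src[i];
        }
        return new Object[] { src, dst };
    }

    // Lane count the IR rule asks for, min(max_byte, max_long),
    // for an assumed maximum vector width given in bytes.
    static int expectedLanes(int vectorBytes) {
        return Math.min(vectorBytes / Byte.BYTES, vectorBytes / Long.BYTES);
    }

    public static void main(String[] args) {
        byte[] b = { -1, 0, 1, 127 };
        long[] l = new long[b.length];
        testByteToLong(b, l);
        System.out.println(java.util.Arrays.toString(l));
        // Assumption: 16-byte integer vectors with UseAVX=1.
        System.out.println(expectedLanes(16));
    }
}
```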

-------------

PR Comment: https://git.openjdk.org/jdk/pull/23413#issuecomment-2851082595


More information about the hotspot-compiler-dev mailing list