RFR: 8348868: AArch64: Add backend support for SelectFromTwoVector [v4]

Bhavana Kilambi bkilambi at openjdk.org
Wed Jun 25 08:09:35 UTC 2025


On Wed, 25 Jun 2025 06:28:55 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:

>> src/hotspot/cpu/aarch64/aarch64_vector.ad line 262:
>> 
>>> 260:             (UseSVE == 2 && length_in_bytes > 8 && length_in_bytes < MaxVectorSize )) {
>>> 261:           return false;
>>> 262:         }
>> 
>> How about:
>> 
>> case Op_SelectFromTwoVector:
>>   // The "tbl" instruction for a two-vector table is supported only in Neon and SVE2. Return
>>   // false if vector length > 16B but supported SVE version < 2.
>>   //
>>   // Additionally, this operation is disabled for doubles and longs on machines with SVE < 2.
>>   // Instead, the default VectorRearrange + VectorBlend is generated, as the performance of
>>   // the default pattern is slightly better.
>>   if (UseSVE < 2 && (type2aelembytes(bt) == 8 || length_in_bytes > 16)) {
>>     return false;
>>   }
>> 
>>   // As the SVE2 "tbl" instruction is unpredicated and partial operations cannot be generated
>>   // using masks, we currently disable this operation on machines where length_in_bytes <
>>   // MaxVectorSize with the only exception of 8B vector length.
>>   if (UseSVE == 2 && length_in_bytes > 8 && length_in_bytes < MaxVectorSize) {
>>     return false;
>>   }
>> 
>>   break;
>
> Maybe the NEON `tbl` can also be generated for SVE2 when `length_in_bytes == 16 && length_in_bytes < MaxVectorSize`. This is a special partial version for SVE2. In summary, the match rules' predicates would be:
> 1) NEON: UseSVE < 2 || (length_in_bytes < 16 || length_in_bytes < MaxVectorSize)
> 2) SVE: UseSVE == 2 && (length_in_bytes >= 16 && length_in_bytes == MaxVectorSize)
> 
> This seems to make the predicate or the code here more complex. The advantage is that this op with a 128-bit vector shape on an SVE2 machine of 256 bits or larger would also be intrinsified. It's not a blocker, and whether to change it is up to you. We can also revisit this part once a 256-bit SVE2 machine exists in the future.

Thanks @XiaohongGong. The case you mention needs an SVE2 machine with MaxVectorSize >= 32B, which is currently not available. I think it's better to revisit these cases once working hardware is available. Shall I add a comment here as a reminder that we need to revisit this when such hardware is available?
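
For reference, the two variants discussed above can be modelled outside HotSpot as a small standalone sketch. This is only an illustration, not the code in aarch64_vector.ad: the `Config` struct and the function names are hypothetical stand-ins for the `UseSVE` and `MaxVectorSize` flags and for the real match-rule machinery, and `elem_bytes` stands in for `type2aelembytes(bt)`.

```cpp
#include <cassert>

// Hypothetical stand-in for the VM flags used in the review discussion.
struct Config {
  int use_sve;          // models UseSVE: 0, 1 or 2
  int max_vector_size;  // models MaxVectorSize, in bytes
};

// Models the support check suggested in the review for Op_SelectFromTwoVector.
bool select_from_two_vector_supported(const Config& c, int elem_bytes,
                                      int length_in_bytes) {
  // Without SVE2: reject 8-byte elements (long/double) and vectors wider
  // than the 16B that the Neon "tbl" can handle.
  if (c.use_sve < 2 && (elem_bytes == 8 || length_in_bytes > 16)) {
    return false;
  }
  // With SVE2: "tbl" is unpredicated, so partial vectors are rejected,
  // with the only exception of the 8B vector length.
  if (c.use_sve == 2 && length_in_bytes > 8 &&
      length_in_bytes < c.max_vector_size) {
    return false;
  }
  return true;
}

// Models the alternative NEON predicate from the follow-up comment.
bool neon_rule(const Config& c, int length_in_bytes) {
  return c.use_sve < 2 ||
         (length_in_bytes < 16 || length_in_bytes < c.max_vector_size);
}

// Models the alternative SVE predicate from the follow-up comment.
bool sve_rule(const Config& c, int length_in_bytes) {
  return c.use_sve == 2 &&
         (length_in_bytes >= 16 && length_in_bytes == c.max_vector_size);
}
```

In this model, a 16B vector on an SVE2 machine with MaxVectorSize == 32 is rejected by the support check, while the alternative predicates would route it to the NEON rule. Assuming length_in_bytes never exceeds MaxVectorSize and UseSVE is 0, 1 or 2, exactly one of the two alternative predicates holds for any input.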

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/23570#discussion_r2166075939

