RFR: 8290249: Vectorize signum on AArch64 [v2]

Bhavana-Kilambi duke at openjdk.org
Wed Aug 17 08:12:57 UTC 2022


On Tue, 16 Aug 2022 13:24:15 GMT, Andrew Haley <aph at openjdk.org> wrote:

>> Bhavana-Kilambi has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains three commits:
>> 
>>  - Merge sve_facgt with int/fp compare and few optimizations
>>  - Merge master
>>  - 8290249: Vectorize signum on AArch64
>>    
>>    This patch auto-vectorizes the Math.signum intrinsic for float and
>>    double types on AArch64 (Neon and SVE). On machines that support SVE,
>>    Neon code is emitted if MaxVectorSize <= 16, and SVE code is emitted
>>    if MaxVectorSize > 16.
>>    
>>    Following is the performance data for the micro test here -
>>    test/micro/org/openjdk/bench/vm/compiler/VectorSignum.java
>>    
>>    Benchmark                  Size   A      B      C
>>    VectorSignum.doubleSignum  256    1.79   1.70   3.18
>>    VectorSignum.doubleSignum  512    1.86   1.73   3.69
>>    VectorSignum.doubleSignum  1024   1.89   1.74   2.98
>>    VectorSignum.doubleSignum  2048   1.92   1.75   3.04
>>    VectorSignum.floatSignum   256    3.34   3.06   3.92
>>    VectorSignum.floatSignum   512    3.63   3.22   5.27
>>    VectorSignum.floatSignum   1024   3.76   3.35   4.77
>>    VectorSignum.floatSignum   2048   3.85   3.47   5.59
>>    
>>    Machines A, B and C are described below:
>>    A : 128-bit Neon machine
>>    B : 256-bit SVE machine
>>    C : 512-bit SVE machine
>>    
>>    The numbers in the table are the gain ratios between the runtime (ns/op)
>>    of the scalar, non-vectorized intrinsic code and the vectorized version
>>    of the intrinsic (this patch).
>
> src/hotspot/cpu/aarch64/assembler_aarch64.hpp line 3516:
> 
>> 3514:             FloatRegister Zn, FloatRegister Zm) {                                      \
>> 3515:     starti;                                                                            \
>> 3516:     assert(T != Q, "invalid size");                                                    \
> 
> Please wrap all of this in `#ifdef ASSERT`

Thank you for reviewing. Could you please clarify what exactly you mean by "Please wrap all of this in `#ifdef ASSERT`"? Do you mean squashing the if conditions with the asserts? The assert macro calls are already inside a `#define`.
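
To make the question concrete, here is a minimal standalone sketch of one way the suggestion could be read. It is not the code in the patch: `SIMD_RegVariant` here is a stand-in enum, and `CHECK_CMP_SIZE`, `demo_cmp` and `demo_fcmp` are hypothetical names used only for illustration. Since a preprocessor directive cannot appear inside a `#define` body, the checks are moved into a helper macro that is itself defined under `#ifdef ASSERT`.

```cpp
#include <cassert>

// Hypothetical stand-in for the element-size enum; not the real HotSpot type.
enum SIMD_RegVariant { B, H, S, D, Q };

// The checks live in a helper macro that is defined conditionally OUTSIDE the
// instruction-defining macro, because #ifdef cannot be used inside a #define.
#ifdef ASSERT
#define CHECK_CMP_SIZE(T, is_float)                                   \
  do {                                                                \
    assert((T) != Q && "invalid size");                               \
    if (is_float) assert((T) != B && "invalid size for FP compare");  \
  } while (0)
#else
// Without ASSERT the helper expands to an empty statement.
#define CHECK_CMP_SIZE(T, is_float) do { } while (0)
#endif

// Hypothetical instruction-defining macro; the real one emits an encoding,
// this one just returns the size variant so the sketch stays self-contained.
#define INSN(NAME, is_float)                    \
  inline int NAME(SIMD_RegVariant T) {          \
    CHECK_CMP_SIZE(T, is_float);                \
    return static_cast<int>(T);                 \
  }

INSN(demo_cmp, false)
INSN(demo_fcmp, true)
#undef INSN
```

If this is the intended shape, the if conditions that exist only to guard asserts would disappear from builds without `ASSERT`, which may be the point of the request.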

-------------

PR: https://git.openjdk.org/jdk/pull/9807