[vectorIntrinsics+mask] RFR: 8271313: AArch64: SVE backend support for masking operations with predicate feature
Jie Fu
jiefu at openjdk.java.net
Fri Jul 30 04:11:45 UTC 2021
On Fri, 30 Jul 2021 03:31:14 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:
> This is the initial SVE backend implementation for all masking operations with the SVE predicate feature. It contains:
> - SVE codegen for vector operations under mask control
> - SVE codegen for vector mask operations with predicate instructions
>
> The size of libjvm.so increases by about 1.86% after adding all the backend changes. The performance gain ranges from about 1.04x (3.7%) to 7.88x for some masking operations of IntMaxVector with 512-bit SVE:
>
> Benchmark                         Gain
> IntMaxVector.ABSMasked            1.198
> IntMaxVector.ADDMasked            1.040
> IntMaxVector.ADDMaskedLanes       1.068
> IntMaxVector.ANDMasked            1.117
> IntMaxVector.ANDMaskedLanes       1.101
> IntMaxVector.AND_NOTMasked        1.037
> IntMaxVector.ASHRMasked           1.286
> IntMaxVector.ASHRMaskedShift      1.096
> IntMaxVector.BITWISE_BLENDMasked  1.085
> IntMaxVector.LSHRMasked           1.405
> IntMaxVector.LSHRMaskedShift      1.092
> IntMaxVector.MAXMaskedLanes       1.079
> IntMaxVector.MINMaskedLanes       1.079
> IntMaxVector.MULMasked            1.370
> IntMaxVector.ORMasked             1.038
> IntMaxVector.ORMaskedLanes        1.103
> IntMaxVector.SUBMasked            1.043
> IntMaxVector.XORMasked            1.151
> IntMaxVector.XORMaskedLanes       1.103
> IntMaxVector.allTrue              1.157
> IntMaxVector.anyTrue              1.158
> IntMaxVector.gatherMasked         7.880
> IntMaxVector.scatterMasked        4.732
So is the performance gain reasonable, especially for something like IntMaxVector.AND_NOTMasked (only about 3.7%)?
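For reference, here is a minimal scalar sketch of what I understand a masked AND_NOT to compute per lane. It assumes the usual Vector API masked lanewise semantics, where active lanes get a & ~b and inactive lanes keep the value from the first operand. The class and method names are hypothetical; this is not the benchmark code and does not use jdk.incubator.vector.

```java
import java.util.Arrays;

public class MaskedAndNotSketch {
    // Scalar reference for a masked AND_NOT (hypothetical helper):
    // lanes with mask[i] == true get a[i] & ~b[i],
    // lanes with mask[i] == false keep a[i] unchanged.
    public static int[] andNotMasked(int[] a, int[] b, boolean[] mask) {
        int[] r = new int[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = mask[i] ? (a[i] & ~b[i]) : a[i];
        }
        return r;
    }

    public static void main(String[] args) {
        int[] a = {0b1111, 0b1111, 0b1010, 0b1010};
        int[] b = {0b0101, 0b0101, 0b0010, 0b0010};
        boolean[] m = {true, false, true, false};
        // prints [10, 15, 8, 10]
        System.out.println(Arrays.toString(andNotMasked(a, b, m)));
    }
}
```

Without predicate support, I would expect the backend to compute the unmasked result and then blend it with the first operand, so for a cheap bitwise operation like this the saving would be roughly that one blend. That is my assumption, not something stated in the PR.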
Thanks.
-------------
PR: https://git.openjdk.java.net/panama-vector/pull/105