RFR: 8320347: Emulate vblendvp[sd] on ECore [v5]
Jatin Bhateja
jbhateja at openjdk.org
Sat Nov 25 00:40:09 UTC 2023
On Fri, 24 Nov 2023 17:23:28 GMT, Volodymyr Paprotski <duke at openjdk.org> wrote:
>> src/hotspot/cpu/x86/macroAssembler_x86.cpp line 3601:
>>
>>> 3599: if (compute_mask) {
>>> 3600: vpxor(scratch, scratch, scratch, vector_len);
>>> 3601: vpcmpgtq(scratch, scratch, mask, vector_len);
>>
>> I see assertion failures in the following tests with JAVA_OPTIONS= -XX:UseAVX=1 -XX:+UnlockDiagnosticVMOptions -XX:+EnableX86ECoreOpts -Xbatch
>>
>> compiler/c2/cr6340864/TestDoubleVect.java
>> compiler/loopopts/superword/ReductionPerf.java
>> compiler/vectorization/TestSignumVector.java
>> compiler/vectorization/runner/BasicDoubleOpTest.java
>>
>> AVX1 does not support integral vector instructions wider than 16 bytes; please use a floating-point compare instruction instead.
>
> Hmm. Good catch!
>
> Thinking about the AVX1 case some more: platforms where this `vblendvp*` emulation is needed have at least AVX2; elsewhere the native `vblendvp*` is faster. I think it's better to disable this optimization entirely if AVX1 is required.
>
> I would go even further and disable `EnableX86ECoreOpts` if `UseAVX==1`. Preference?
vblendvps/pd are supported on AVX1 targets. Since the patch is about emulating floating-point variable blends with an alternate sequence, I think we should remove any impediment that prohibits its use on E-cores at the AVX1 level.
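
For reference, here is a rough scalar sketch of what the quoted mask computation does to a single 64-bit lane, followed by a bitwise select. It is only a model: the helper names (sign_to_lane_mask, blend_lane, bits_of, double_of) are hypothetical and do not exist in HotSpot, and the and/andn/or select step is an assumption about how the rest of the emulation sequence could look, not a copy of the patch.

```cpp
#include <cstdint>
#include <cstring>

// Hypothetical helpers: reinterpret a double as raw bits and back.
static inline uint64_t bits_of(double d) {
  uint64_t b;
  std::memcpy(&b, &d, sizeof b);
  return b;
}

static inline double double_of(uint64_t b) {
  double d;
  std::memcpy(&d, &b, sizeof d);
  return d;
}

// Models vpxor(scratch, scratch, scratch) + vpcmpgtq(scratch, scratch, mask)
// for one lane: the signed compare 0 > mask is true exactly when the sign bit
// of mask is set, and a true compare yields an all-ones lane.
static inline uint64_t sign_to_lane_mask(uint64_t mask_bits) {
  return (mask_bits >> 63) ? ~uint64_t{0} : uint64_t{0};
}

// Assumed select step: take src2 where the lane mask is all ones, src1
// otherwise. This matches the vblendvpd rule of choosing the second source
// when the sign bit of the mask element is set.
static inline double blend_lane(double src1, double src2, double mask) {
  const uint64_t m = sign_to_lane_mask(bits_of(mask));
  return double_of((bits_of(src1) & ~m) | (bits_of(src2) & m));
}
```

For example, blend_lane(1.0, 2.0, -5.0) selects 2.0 and blend_lane(1.0, 2.0, 5.0) selects 1.0. The integer compare is the step that needs AVX2 for 32-byte vectors, which is why the AVX1 configuration trips the assertion.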
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/16716#discussion_r1404694438