[jdk16] RFR: 8259775: [Vector API] Incorrect code-gen for VectorReinterpret operation [v5]

Vladimir Kozlov kvn at openjdk.java.net
Wed Jan 20 17:45:48 UTC 2021


On Tue, 19 Jan 2021 03:57:16 GMT, Jie Fu <jiefu at openjdk.org> wrote:

>> Hi all,
>> 
>> The code-gen for VectorReinterpret may be wrong on x86.
>> 
>> Let's look at the opto-assembly for the reproducer in the JBS, which is based on @XiaohongGong's example in JDK-8259353 (many thanks to her):
>> 
>> 066     B7: #   out( N1 ) <- in( B6 )  Freq: 0.999994
>> 066     vector_reinterpret_expand XMM0,XMM0     !
>> 066     store_vector [R12 + R11 << 3 + #16] (compressed oop addressing),XMM0
>>  
>> Please note that dst and src [1] share the same XMM0 register, and movdqu [2] should be generated for this case.
>> But when dst == src, movdqu actually generates nothing [3], which leads to an incorrect result.
>> 
>> In this case, movdqu should not be empty, since the upper bits of dst should be zeroed.
>> A similar error also exists for vmovdqu [4].
>> 
>> I think we should also change movflt [5] to movss, but I don't understand why we have 4-byte vectors.
>> Aren't the shortest vectors on x86 8 bytes?
>> 
>> Thanks.
>> Best regards,
>> Jie
>> 
>> [1] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86.ad#L3354
>> [2] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86.ad#L3364
>> [3] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L2490
>> [4] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/macroAssembler_x86.cpp#L2515
>> [5] https://github.com/openjdk/jdk/blob/master/src/hotspot/cpu/x86/x86.ad#L3379
>
> Jie Fu has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Refine the test to always reproduce the bug

Don't forget to request approval for JDK 16 fix integration:
http://openjdk.java.net/jeps/3#Fix-Request-Process

src/hotspot/cpu/x86/macroAssembler_x86.hpp line 172:

> 170: 
> 171:   // Move with zero extension
> 172:   void movfltz(XMMRegister dst, XMMRegister src) { movss(dst, src); }

It seems `movdbl(XMMRegister dst, XMMRegister src)` has the same issue.
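
To make the failure mode in the quoted description concrete, here is a minimal toy model in plain C++. It is only a sketch: `ToyReg`, `move128_elided`, `move128_zero_upper`, `make_full` and `upper_is_zero` are hypothetical names and are not part of HotSpot's MacroAssembler. It assumes, as the description states, that the reinterpret should leave the upper bits of dst zeroed and that the move is skipped when dst == src.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy 256-bit vector register (hypothetical type, not HotSpot's XMMRegister).
struct ToyReg {
    int id;                               // register number, e.g. 0 for XMM0
    std::array<std::uint8_t, 32> bytes;   // low 16 bytes = 128-bit lane
};

// Assumed semantics of a 128-bit move that is always emitted:
// copy the low 16 bytes and clear everything above them.
void move128_zero_upper(ToyReg& dst, const ToyReg& src) {
    std::array<std::uint8_t, 32> out{};   // value-initialized, all zero
    for (std::size_t i = 0; i < 16; ++i) {
        out[i] = src.bytes[i];
    }
    dst.bytes = out;                      // safe even if dst and src alias
}

// Models the behavior described above: when dst and src are the same
// register, nothing is generated, so stale upper bytes survive.
void move128_elided(ToyReg& dst, const ToyReg& src) {
    if (dst.id == src.id) {
        return;                           // no instruction emitted
    }
    move128_zero_upper(dst, src);
}

// Helper: a register with every byte set, standing in for a wider vector.
ToyReg make_full(int id) {
    ToyReg r{id, {}};
    r.bytes.fill(0xFF);
    return r;
}

// Helper: true if bytes 16..31 are all zero.
bool upper_is_zero(const ToyReg& r) {
    for (std::size_t i = 16; i < r.bytes.size(); ++i) {
        if (r.bytes[i] != 0) return false;
    }
    return true;
}
```

In this model, calling `move128_elided(r, r)` on a register produced by `make_full(0)` leaves the upper 16 bytes set, while `move128_zero_upper(r, r)` clears them and keeps the low 16 bytes intact. That is the difference between the skipped move and the move the description asks for.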

-------------

PR: https://git.openjdk.java.net/jdk16/pull/122

