RFR: 8261142: AArch64: Incorrect instruction encoding when right-shifting vectors with shift amount equals to the element width [v2]

Andrew Haley aph at openjdk.java.net
Tue Feb 9 09:32:37 UTC 2021


On Tue, 9 Feb 2021 09:13:47 GMT, Dong Bo <dongbo at openjdk.org> wrote:

>> In the Vector API, when a vector is right-shifted by an amount equal to the element width, the shift amount is transformed to zero,
>> see `src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorOperators.java`:
>>     /** Produce {@code a>>>(n&(ESIZE*8-1))}. Integral only. */
>>     public static final /*bitwise*/ Binary LSHR = binary("LSHR", ">>>", VectorSupport.VECTOR_OP_URSHIFT, VO_SHIFT);
>> 
>> The aarch64 assembler generates wrong or illegal instructions in this case. For example, for the Java code below on aarch64,
>> the assembler call `__ ushr(dst, __ T8B, src, 0)` generates `ushr dst.4H, src.4H, 16` instead of `ushr dst.8B, src.8B, 0`.
>> According to local tests, the JVM gives wrong results for byte/short and crashes with SIGILL for integer/long.
>> ByteVector vba = ByteVector.fromArray(byte64SPECIES, bytesA, 8 * i);
>> vba.lanewise(VectorOperators.ASHR, 8).intoArray(arrBytes, 8 * i);
>> 
>> On aarch64, the legal right-shift amount is in the range 1 to the element width in bits:
>> https://developer.arm.com/documentation/dui0801/f/A64-SIMD-Vector-Instructions/USHR--vector-?lang=en
>> 
>> This fix handles a zero shift separately: it generates `orr` for a plain right shift and `addv` for a right shift and accumulate.
>> Verified with linux-aarch64-server-fastdebug, tier1. Also added a jtreg test that reproduces the issue and serves as a regression test.
>
> Dong Bo has updated the pull request incrementally with one additional commit since the last revision:
> 
>   add assertion in the assembler
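
For readers following along, the mis-encoding described above can be reproduced with a little arithmetic. The sketch below is not HotSpot code; `encode_right_shift`, `decoded_esize` and `decoded_shift` are hypothetical names. It assumes the A64 rule that a vector right shift stores `2 * esize - shift` in the 7-bit `immh:immb` field and that the highest set bit of `immh` selects the element size.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical helper (not the HotSpot assembler): value placed in the
// 7-bit immh:immb field for a vector right shift, i.e. 2 * esize - shift.
inline uint32_t encode_right_shift(uint32_t esize_bits, uint32_t shift) {
  return (2 * esize_bits - shift) & 0x7F;
}

// Element size as the hardware would decode it: the highest set bit of
// immh (bits 6..3 of the field) selects 8, 16, 32 or 64 bits.
inline uint32_t decoded_esize(uint32_t immh_immb) {
  uint32_t immh = (immh_immb >> 3) & 0xF;
  if (immh & 8) return 64;
  if (immh & 4) return 32;
  if (immh & 2) return 16;
  if (immh & 1) return 8;
  return 0;  // immh == 0 is not a shift-by-immediate encoding
}

// Shift amount as the hardware would decode it: 2 * esize - field.
inline uint32_t decoded_shift(uint32_t immh_immb) {
  uint32_t esize = decoded_esize(immh_immb);
  assert(esize != 0 && "immh == 0 is not a shift encoding");
  return 2 * esize - (immh_immb & 0x7F);
}
```

Under these assumptions, `encode_right_shift(8, 0)` yields 16, which decodes as 16-bit elements shifted by 16, matching the `ushr dst.4H, src.4H, 16` reported above. For 32-bit elements a zero shift yields 64, which decodes as 64-bit elements; that is a plausible source of the illegal encodings, though the sketch does not model the `Q` bit.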

src/hotspot/cpu/aarch64/aarch64_neon_ad.m4 line 2057:

> 2055:              as_FloatRegister($src$$reg), as_FloatRegister($src$$reg));
> 2056:     } else {ifelse($4, B,`
> 2057:       if (sh >= 8) sh = 7;

I think it would be possible to move some of this logic from the AD file into MacroAssembler, with macros to generate the appropriate instruction based on their arguments. This might be cleaner: the logic here is very hard to follow.
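
As a rough illustration of the kind of helper meant here, the selection logic could live in one function that takes the element width, the shift amount and whether the shift accumulates, and returns which instruction to emit. Everything in this sketch is hypothetical: `Op`, `Selected` and `select_logical_right_shift` are made-up names rather than existing MacroAssembler API, and the masking step simply mirrors the `n&(ESIZE*8-1)` Javadoc quoted above.

```cpp
#include <cassert>

// Hypothetical instruction tags; a real helper would emit code instead.
enum class Op { ORR, ADDV, USHR, USRA };

struct Selected {
  Op  op;     // instruction to emit
  int shift;  // immediate for USHR/USRA, 0 otherwise
};

// Sketch only: choose the instruction for a logical right shift of a
// vector with esize_bits-wide lanes, optionally accumulating.
inline Selected select_logical_right_shift(int esize_bits, int shift,
                                           bool accumulate) {
  assert(esize_bits == 8 || esize_bits == 16 ||
         esize_bits == 32 || esize_bits == 64);
  // Assumption: the shift count is masked to the lane width first.
  int sh = shift & (esize_bits - 1);
  if (sh == 0) {
    // Zero shift has no legal USHR/USRA encoding: use a move (orr)
    // or a plain vector add (addv) for the accumulating form.
    return { accumulate ? Op::ADDV : Op::ORR, 0 };
  }
  // 1 <= sh < esize_bits is always encodable.
  return { accumulate ? Op::USRA : Op::USHR, sh };
}
```

With something like this in MacroAssembler, the AD file would only pass the operands through, and the per-type special cases (such as the `sh >= 8` check above) would be written once.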

-------------

PR: https://git.openjdk.java.net/jdk/pull/2472

