RFR: 8261142: AArch64: Incorrect instruction encoding when right-shifting vectors with shift amount equals to the element width [v7]
Andrew Haley
aph at openjdk.java.net
Fri Feb 19 14:46:41 UTC 2021
On Fri, 19 Feb 2021 03:13:12 GMT, Dong Bo <dongbo at openjdk.org> wrote:
>> In the Vector API, when right-shifting a vector by an amount equal to the element width, the shift amount is masked to zero,
>> see `src/jdk.incubator.vector/share/classes/jdk/incubator/vector/VectorOperators.java`:
>> /** Produce {@code a>>>(n&(ESIZE*8-1))}. Integral only. */
>> public static final /*bitwise*/ Binary LSHR = binary("LSHR", ">>>", VectorSupport.VECTOR_OP_URSHIFT, VO_SHIFT);
>>
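A minimal sketch of the masking rule quoted in the Javadoc above, `n&(ESIZE*8-1)`. It is plain scalar Java, not the Vector API itself; the class and method names are hypothetical and exist only for illustration. It shows why a shift count equal to the lane width becomes zero for every lane size.

```java
public class ShiftCountMask {
    // Hypothetical helper: models the documented masking n & (ESIZE*8 - 1),
    // where esizeBytes is the lane size in bytes (1, 2, 4 or 8).
    static int effectiveShift(int n, int esizeBytes) {
        return n & (esizeBytes * 8 - 1);
    }

    public static void main(String[] args) {
        int[] laneBytes = {1, 2, 4, 8}; // byte, short, int, long lanes
        for (int esize : laneBytes) {
            int width = esize * 8;
            // A shift count equal to the lane width always masks to 0.
            System.out.println(width + "-bit lane: shift " + width
                    + " -> " + effectiveShift(width, esize));
        }
    }
}
```

So a request such as `lanewise(VectorOperators.ASHR, 8)` on byte lanes reaches the backend as a shift by 0, which is outside the 1..width range the AArch64 right-shift encodings accept.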
>> The aarch64 assembler generates wrong or illegal instructions in this case. For example, for the Java code below on aarch64,
>> the assembler call `__ ushr(dst, __ T8B, src, 0)` does not generate `ushr dst.8B, src.8B, 0`, but `ushr dst.4H, src.4H, 16` instead.
>> According to local tests, the JVM gives wrong results for byte/short and crashes with SIGILL for integer/long.
>> ByteVector vba = ByteVector.fromArray(byte64SPECIES, bytesA, 8 * i);
>> vba.lanewise(VectorOperators.ASHR, 8).intoArray(arrBytes, 8 * i);
>>
>> The legal right shift amount should be in the range 1 to the element width in bits on aarch64:
>> https://developer.arm.com/documentation/dui0801/f/A64-SIMD-Vector-Instructions/USHR--vector-?lang=en
>>
>> This fix handles a zero shift separately. If the shift is zero, it generates `orr` for a right shift and `addv` for a right shift and accumulate.
>> Verified with linux-aarch64-server-fastdebug, tier1. Also added a jtreg test to reproduce the issue and to serve as a regression test.
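The reasoning behind the zero-shift substitution can be sketched with a scalar model of 8-bit lanes. This is an illustration under stated assumptions, not the HotSpot code: the class and method names are hypothetical, and it assumes `orr` of a register with itself acts as a vector move and that `addv` here denotes a lane-wise vector add.

```java
import java.util.Arrays;

public class ZeroShiftModel {
    // Scalar model of an unsigned right shift on 8-bit lanes.
    // With shift == 0 every lane is copied unchanged, which is what a
    // register move would produce.
    static byte[] ushrLanes(byte[] src, int shift) {
        byte[] dst = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            dst[i] = (byte) ((src[i] & 0xFF) >>> shift);
        }
        return dst;
    }

    // Scalar model of unsigned shift-right-and-accumulate on 8-bit lanes:
    // out[i] = acc[i] + (src[i] >>> shift), wrapping to 8 bits.
    // With shift == 0 this reduces to a lane-wise add.
    static byte[] usraLanes(byte[] acc, byte[] src, int shift) {
        byte[] out = new byte[src.length];
        for (int i = 0; i < src.length; i++) {
            out[i] = (byte) (acc[i] + ((src[i] & 0xFF) >>> shift));
        }
        return out;
    }

    public static void main(String[] args) {
        byte[] src = {-128, -1, 0, 127};
        byte[] acc = {1, 2, 3, 4};
        System.out.println(Arrays.toString(ushrLanes(src, 0)));
        System.out.println(Arrays.toString(usraLanes(acc, src, 0)));
    }
}
```

Under this model, a zero shift never needs a shift instruction at all, so the illegal immediate encoding can be avoided entirely.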
>
> Dong Bo has refreshed the contents of this pull request, and previous commits have been removed. The incremental views will show differences compared to the previous content of the PR. The pull request contains one new commit since the last revision:
>
> handle zero shift in macro assembler
src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 554:
> 552:
> 553: WRAP(usra) WRAP(ssra)
> 554: #undef WRAP
Are ssra and usra tested by anything? I don't see them accessed in the test case.
src/hotspot/cpu/aarch64/macroAssembler_aarch64.hpp line 531:
> 529:
> 530: // NEON shift instructions
> 531: #define WRAP(INSN) \
This comment should be
// AdvSIMD shift by immediate.
// These are "user friendly" variants which allow a shift count of 0.
-------------
PR: https://git.openjdk.java.net/jdk/pull/2472