RFR: 8284932: [Vector API] Incorrect implementation of LSHR operator for negative byte/short elements
Quan Anh Mai
duke at openjdk.java.net
Mon Apr 18 03:51:50 UTC 2022
On Sun, 17 Apr 2022 14:35:14 GMT, Jie Fu <jiefu at openjdk.org> wrote:
> Hi all,
>
> According to the Vector API doc, the `LSHR` operator computes `a>>>(n&(ESIZE*8-1))`.
> However, current implementation is incorrect for negative bytes/shorts.
>
> The background is that one of our customers try to vectorize `urshift` with `urshiftVector` like the following.
>
> 13 public static void urshift(byte[] src, byte[] dst) {
> 14 for (int i = 0; i < src.length; i++) {
> 15 dst[i] = (byte)(src[i] >>> 3);
> 16 }
> 17 }
> 18
> 19 public static void urshiftVector(byte[] src, byte[] dst) {
> 20 int i = 0;
> 21 for (; i < spec.loopBound(src.length); i +=spec.length()) {
> 22 var va = ByteVector.fromArray(spec, src, i);
> 23 var vb = va.lanewise(VectorOperators.LSHR, 3);
> 24 vb.intoArray(dst, i);
> 25 }
> 26
> 27 for (; i < src.length; i++) {
> 28 dst[i] = (byte)(src[i] >>> 3);
> 29 }
> 30 }
>
>
> Unfortunately and to our surprise, code at line28 computes different results with code at line23.
> It took quite a long time to figure out this bug.
>
> The root cause is that current implemenation of Vector API can't compute the unsigned right shift results as what is done for scalar `>>>` for negative byte/short elements.
> Actually, current implementation will do `(a & 0xFF) >>> (n & 7)` [1] for all bytes, which is unable to compute the vectorized `>>>` for negative bytes.
> So this seems unreasonable and unfriendly to Java developers.
> It would be better to fix it.
>
> The key idea to support unsigned right shift of negative bytes/shorts is just to replace the unsigned right shift operation with the signed right shift operation.
> This logic is:
> - For byte elements, unsigned right shift is equal to signed right shift if the shift_cnt <= 24.
> - For short elements, unsigned right shift is equal to signed right shift if the shift_cnt <= 16.
> - For Vector API, the shift_cnt will be masked to shift_cnt <= 7 for bytes and shift_cnt <= 15 for shorts.
>
> I just learned this idea from https://github.com/openjdk/jdk/pull/7979 .
> And many thanks to @fg1417 .
>
>
> Thanks.
> Best regards,
> Jie
>
> [1] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java#L935
Hi,
Because unsigned cast should operate on unsigned types, the more appropriate usage is `(src[i] & 0xFF) >>> 3`, with the `&` operation is the cast from unsigned byte to int. Actually, I fail to understand the intention of your example, why not use signed shift instead, what does unsigned shift provide here apart from extra cognitive load in reasoning the operation.
May you provide a more concrete example to the utilisation of unsigned shift on signed subword types, please. The example provided by @fg1417 in #7979 seems to indicate the real intention is to right shifting unsigned bytes, with the unsigned cast sometimes omitted (changed to a signed cast) because the shift results are masked by a stricter mask immediately.
Thank you very much.
-------------
PR: https://git.openjdk.java.net/jdk/pull/8276
More information about the core-libs-dev
mailing list