Withdrawn: 8284932: [Vector API] Incorrect implementation of LSHR operator for negative byte/short elements
Jie Fu
jiefu at openjdk.java.net
Mon May 16 12:21:59 UTC 2022
On Sun, 17 Apr 2022 14:35:14 GMT, Jie Fu <jiefu at openjdk.org> wrote:
> Hi all,
>
> According to the Vector API doc, the `LSHR` operator computes `a>>>(n&(ESIZE*8-1))`.
> However, current implementation is incorrect for negative bytes/shorts.
>
> The background is that one of our customers try to vectorize `urshift` with `urshiftVector` like the following.
>
> 13 public static void urshift(byte[] src, byte[] dst) {
> 14 for (int i = 0; i < src.length; i++) {
> 15 dst[i] = (byte)(src[i] >>> 3);
> 16 }
> 17 }
> 18
> 19 public static void urshiftVector(byte[] src, byte[] dst) {
> 20 int i = 0;
> 21 for (; i < spec.loopBound(src.length); i +=spec.length()) {
> 22 var va = ByteVector.fromArray(spec, src, i);
> 23 var vb = va.lanewise(VectorOperators.LSHR, 3);
> 24 vb.intoArray(dst, i);
> 25 }
> 26
> 27 for (; i < src.length; i++) {
> 28 dst[i] = (byte)(src[i] >>> 3);
> 29 }
> 30 }
>
>
> Unfortunately and to our surprise, code at line28 computes different results with code at line23.
> It took quite a long time to figure out this bug.
>
> The root cause is that current implemenation of Vector API can't compute the unsigned right shift results as what is done for scalar `>>>` for negative byte/short elements.
> Actually, current implementation will do `(a & 0xFF) >>> (n & 7)` [1] for all bytes, which is unable to compute the vectorized `>>>` for negative bytes.
> So this seems unreasonable and unfriendly to Java developers.
> It would be better to fix it.
>
> The key idea to support unsigned right shift of negative bytes/shorts is just to replace the unsigned right shift operation with the signed right shift operation.
> This logic is:
> - For byte elements, unsigned right shift is equal to signed right shift if the shift_cnt <= 24.
> - For short elements, unsigned right shift is equal to signed right shift if the shift_cnt <= 16.
> - For Vector API, the shift_cnt will be masked to shift_cnt <= 7 for bytes and shift_cnt <= 15 for shorts.
>
> I just learned this idea from https://github.com/openjdk/jdk/pull/7979 .
> And many thanks to @fg1417 .
>
>
> Thanks.
> Best regards,
> Jie
>
> [1] https://github.com/openjdk/jdk/blob/master/src/jdk.incubator.vector/share/classes/jdk/incubator/vector/ByteVector.java#L935
This pull request has been closed without being integrated.
-------------
PR: https://git.openjdk.java.net/jdk/pull/8276
More information about the hotspot-compiler-dev
mailing list