RFR: 8284932: [Vector API] Incorrect implementation of LSHR operator for negative byte/short elements

Jie Fu jiefu at openjdk.java.net
Mon Apr 18 11:35:34 UTC 2022


On Mon, 18 Apr 2022 08:29:52 GMT, Jie Fu <jiefu at openjdk.org> wrote:

>>> @DamonFool
>>> 
>>> I think the issue is that these two cases of yours are not equal semantically.
>> 
>> Why?
>> According to the vector api doc, they should compute the same value when the shift_cnt is 3, right?
>> 
>>> 
>>> ```
>>>  13     public static void urshift(byte[] src, byte[] dst) {
>>>  14         for (int i = 0; i < src.length; i++) {
>>>  15             dst[i] = (byte)(src[i] >>> 3);
>>>  16         }
>>>  17     }
>>>  18 
>>>  19     public static void urshiftVector(byte[] src, byte[] dst) {
>>>  20         int i = 0;
>>>  21         for (; i < spec.loopBound(src.length); i +=spec.length()) {
>>>  22             var va = ByteVector.fromArray(spec, src, i);
>>>  23             var vb = va.lanewise(VectorOperators.LSHR, 3);
>>>  24             vb.intoArray(dst, i);
>>>  25         }
>>>  26 
>>>  27         for (; i < src.length; i++) {
>>>  28             dst[i] = (byte)(src[i] >>> 3);
>>>  29         }
>>>  30     }
>>> ```
>>> 
>>> Besides the unsigned shift, line15 also has a type conversion which is missing in the vector api case. To get the equivalent result, one need to cast the result explicitly at line24, e.g, `((IntVector)vb.castShape(SPECISE_XXX, 0)).intoArray(idst, i);`
>> 
>> Since all the vector operations are already based on byte lane type, I don't think we need a `cast` operation here.
>> Can we solve this problem by inserting a cast operation?
>
>> @DamonFool
>> 
>> `>>>` can not apply to sub-word type in Java. `(byte)(src[i] >>> 3)` is unsigned right shift in type of INT and transformed the result to BYTE. In vector api, it extends the `>>>` to sub-word type with the same semantic meaning like `iushr`[1], that is zero extending.
>> 
>> > The vector api docs says it would compute a>>>(n&(ESIZE*8-1)).
>> 
>> I think `>>>` has some extending meanings here. As I said above, no sub-word type for `>>>` in Java.
>> 
>> [1] https://docs.oracle.com/javase/specs/jvms/se18/html/jvms-6.html#jvms-6.5.iushr
> 
> As discussed above https://github.com/openjdk/jdk/pull/8276#issuecomment-1101016904 , there isn't any problem to apply `>>>` upon shorts/bytes.
> 
> What do you think of https://github.com/openjdk/jdk/pull/7979 , which tries to vectorize unsigned shift right on signed subword types ?
> And what do you think of the benchmarks mentioned in that PR?
> 
> The vector api doc clearly states `LSHR` operator would compute `a>>>(n&(ESIZE*8-1))`.
> But it fails to do so when `a` is negative byte/short element.
> 
> So if the doc description is correct, the current implementation would be wrong, right?
> 
> However, if you think the current implementation is correct, the vector api doc would be wrong.
> Then, we would lack an operator working like the scalar `>>>` since current implementation fails to do so for negative bytes/shorts.

> Hi @DamonFool, the doc does obviously not mean what you think, and actually seems to indicate the Eric's interpretation instead. Simply because `a >>> (n & (ESIZE - 1))` does not output the type of `a` for subword-type inputs, which is clearly wrong. This suggests that the doc here should be interpreted that `>>>` is the extended shift operation, which is defined on subword types the same as for words and double-words. Though, I agree that the doc must be modified to reflect the intention more clearly.
> 


My intention is to make Vector API to be more friendly to Java developers.
The so called extended unsigned right shift operation for bytes/shorts actually behave differently with the well-known scalar `>>>`, which may become the source of bugs.


> > Then, we would lack an operator working like the scalar >>> since current implementation fails to do so for negative bytes/shorts.
> 
> As you have noted, we have `ASHR` for bytes, shorts and `LSHR` for ints, longs. Thanks a lot.


Then people have to be very careful about when to use `AHSR` and when to use `LSHR`, which is really inconvenient and easy to make a mistake.
And not all the people are smart enough to know this skill for bytes/shorts.
So simply modifying the vector api doc can't solve these problems.

Maybe, we can add one more operator to distinguish the semantics of scalar `>>>` with the so called extended unsigned right shift operation for bytes/shorts.

-------------

PR: https://git.openjdk.java.net/jdk/pull/8276


More information about the core-libs-dev mailing list