RFR: 8284932: [Vector API] Incorrect implementation of LSHR operator for negative byte/short elements
Jie Fu
jiefu at openjdk.java.net
Mon Apr 18 08:33:30 UTC 2022
On Mon, 18 Apr 2022 05:14:25 GMT, Jie Fu <jiefu at openjdk.org> wrote:
>>> > However, just image that someone would like to optimize some code segments of bytes/shorts `>>>`
>>>
>>> Then that person can just use signed shift (`VectorOperators.ASHR`), right? Shifting on masked shift counts means that the shift count cannot be greater than 8 for bytes and 16 for shorts, which means that `(byte)(src[i] >>> 3)` is exactly the same as `(byte)(src[i] >> 3)`. Please correct me if I misunderstood something here.
>>
>> Yes, you're right that's why I said `LSHR` can be replaced with `ASHR`.
>>
>> However, not all the people are clever enough to do this source code level replacement.
>> To be honest, I would never think of that `>>>` can be auto-vectorized by this idea proposed by https://github.com/openjdk/jdk/pull/7979 .
>>
>>>
>>> Your proposed changes make unsigned shifts for subwords behave exactly the same as signed shifts, which is both redundant (we have 2 operations doing exactly the same thing) and inadequate (we lack the operation to do the proper unsigned shifts)
>>
>> `LSHR` following the behavior of scalar `>>>` is very important for Java developers to rewrite the code with vector api.
>> Maybe, we should add one more operator to support what you called `the proper unsigned shifts`, right?
>> But that's another topic which can be done in a separate issue.
>
>> @DamonFool
>>
>> I think the issue is that these two cases of yours are not equal semantically.
>
> Why?
> According to the vector api doc, they should compute the same value when the shift_cnt is 3, right?
>
>>
>> ```
>> 13 public static void urshift(byte[] src, byte[] dst) {
>> 14 for (int i = 0; i < src.length; i++) {
>> 15 dst[i] = (byte)(src[i] >>> 3);
>> 16 }
>> 17 }
>> 18
>> 19 public static void urshiftVector(byte[] src, byte[] dst) {
>> 20 int i = 0;
>> 21 for (; i < spec.loopBound(src.length); i +=spec.length()) {
>> 22 var va = ByteVector.fromArray(spec, src, i);
>> 23 var vb = va.lanewise(VectorOperators.LSHR, 3);
>> 24 vb.intoArray(dst, i);
>> 25 }
>> 26
>> 27 for (; i < src.length; i++) {
>> 28 dst[i] = (byte)(src[i] >>> 3);
>> 29 }
>> 30 }
>> ```
>>
>> Besides the unsigned shift, line15 also has a type conversion which is missing in the vector api case. To get the equivalent result, one need to cast the result explicitly at line24, e.g, `((IntVector)vb.castShape(SPECISE_XXX, 0)).intoArray(idst, i);`
>
> Since all the vector operations are already based on byte lane type, I don't think we need a `cast` operation here.
> Can we solve this problem by inserting a cast operation?
> @DamonFool
>
> `>>>` can not apply to sub-word type in Java. `(byte)(src[i] >>> 3)` is unsigned right shift in type of INT and transformed the result to BYTE. In vector api, it extends the `>>>` to sub-word type with the same semantic meaning like `iushr`[1], that is zero extending.
>
> > The vector api docs says it would compute a>>>(n&(ESIZE*8-1)).
>
> I think `>>>` has some extending meanings here. As I said above, no sub-word type for `>>>` in Java.
>
> [1] https://docs.oracle.com/javase/specs/jvms/se18/html/jvms-6.html#jvms-6.5.iushr
As discussed above https://github.com/openjdk/jdk/pull/8276#issuecomment-1101016904 , there isn't any problem to apply `>>>` upon shorts/bytes.
What do you think of https://github.com/openjdk/jdk/pull/7979 , which tries to vectorize unsigned shift right on signed subword types ?
And what do you think of the benchmarks mentioned in that PR?
The vector api doc clearly states `LSHR` operator would compute `a>>>(n&(ESIZE*8-1))`.
But it fails to do so when `a` is negative byte/short element.
So if the doc description is correct, the current implementation would be wrong, right?
However, if you think the current implementation is correct, the vector api doc would be wrong.
Then, we would lack an operator working like the scalar `>>>` since current implementation fails to do so for negative bytes/shorts.
-------------
PR: https://git.openjdk.java.net/jdk/pull/8276
More information about the core-libs-dev
mailing list