RFR: 8284932: [Vector API] Incorrect implementation of LSHR operator for negative byte/short elements

Eric Liu eliu at openjdk.java.net
Mon Apr 18 07:34:33 UTC 2022


On Mon, 18 Apr 2022 05:14:25 GMT, Jie Fu <jiefu at openjdk.org> wrote:

>>> > However, just image that someone would like to optimize some code segments of bytes/shorts `>>>`
>>> 
>>> Then that person can just use signed shift (`VectorOperators.ASHR`), right? Shifting on masked shift counts means that the shift count cannot be greater than 8 for bytes and 16 for shorts, which means that `(byte)(src[i] >>> 3)` is exactly the same as `(byte)(src[i] >> 3)`. Please correct me if I misunderstood something here.
>> 
>> Yes, you're right that's why I said `LSHR` can be replaced with `ASHR`.
>> 
>> However, not all the people are clever enough to do this source code level replacement.
>> To be honest, I would never think of that `>>>` can be auto-vectorized by this idea proposed by https://github.com/openjdk/jdk/pull/7979 .
>> 
>>> 
>>> Your proposed changes make unsigned shifts for subwords behave exactly the same as signed shifts, which is both redundant (we have 2 operations doing exactly the same thing) and inadequate (we lack the operation to do the proper unsigned shifts)
>> 
>> `LSHR` following the behavior of scalar `>>>` is very important for Java developers to rewrite the code with vector api.
>> Maybe, we should add one more operator to support what you called `the proper unsigned shifts`, right?
>> But that's another topic which can be done in a separate issue.
>
>> @DamonFool
>> 
>> I think the issue is that these two cases of yours are not equal semantically.
> 
> Why?
> According to the vector api doc, they should compute the same value when the shift_cnt is 3, right?
> 
>> 
>> ```
>>  13     public static void urshift(byte[] src, byte[] dst) {
>>  14         for (int i = 0; i < src.length; i++) {
>>  15             dst[i] = (byte)(src[i] >>> 3);
>>  16         }
>>  17     }
>>  18 
>>  19     public static void urshiftVector(byte[] src, byte[] dst) {
>>  20         int i = 0;
>>  21         for (; i < spec.loopBound(src.length); i +=spec.length()) {
>>  22             var va = ByteVector.fromArray(spec, src, i);
>>  23             var vb = va.lanewise(VectorOperators.LSHR, 3);
>>  24             vb.intoArray(dst, i);
>>  25         }
>>  26 
>>  27         for (; i < src.length; i++) {
>>  28             dst[i] = (byte)(src[i] >>> 3);
>>  29         }
>>  30     }
>> ```
>> 
>> Besides the unsigned shift, line15 also has a type conversion which is missing in the vector api case. To get the equivalent result, one need to cast the result explicitly at line24, e.g, `((IntVector)vb.castShape(SPECISE_XXX, 0)).intoArray(idst, i);`
> 
> Since all the vector operations are already based on byte lane type, I don't think we need a `cast` operation here.
> Can we solve this problem by inserting a cast operation?

@DamonFool 

`>>>` can not apply to sub-word type in Java. `(byte)(src[i] >>> 3)` is unsigned right shift in type of INT and transformed the result to BYTE. In vector api, it extends the `>>>` to sub-word type with the same semantic meaning like `iushr`[1], that is zero extending.

> The vector api docs says it would compute a>>>(n&(ESIZE*8-1)).

I think `>>>` has some extending meanings here. As I said above, no sub-word type for `>>>` in Java.


[1] https://docs.oracle.com/javase/specs/jvms/se18/html/jvms-6.html#jvms-6.5.iushr

-------------

PR: https://git.openjdk.java.net/jdk/pull/8276


More information about the core-libs-dev mailing list