[jdk16] RFR: 8259353: VectorReinterpretNode is incorrectly optimized out [v2]

Ningsheng Jian njian at openjdk.java.net
Wed Jan 13 02:37:04 UTC 2021


On Tue, 12 Jan 2021 02:52:13 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:

>> Vector reinterpretation just reinterprets the bytes of the vector without performing any value conversions. So normally, optimization like:
>>   (VectorReinterpret (VectorReinterpret src))  ->  src
>> can be applied correctly if the logical result and the input `"src"` have the same vector type. However, the results might not be correct if truncation happens after the first reinterpretation: `"(VectorReinterpret src)"`.
>> 
>> Consider the following case:
>>   byte[] a = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
>>   byte[] b = new byte[16]
>>   byte[] c = new byte[16]
>> 
>>   // load vector from "a" with 128-bits
>>   ByteVector av = ByteVector.fromArray(SPECIES_128, a, 0);
>> 
>>   // reinterpret to 64-bits
>>   ByteVector bv = (ByteVector)av.reinterpretShape(SPECIES_64, 0);
>> 
>>   // reinterpret back to 128-bits with the above reinterpreting results
>>   ByteVector cv = (ByteVector)bv.reinterpretShape(SPECIES_128, 0);
>>   cv.intoArray(c, 0)
>> This case:
>>   1. Reinterprets vector `"av"` from 128-bits to 64-bits. It should only copy the first 8 elements to vector `"bv"` and discard the higher half parts.
>>   2. Reinterprets vector `"bv"` back to 128-bits. It copies the 64-bits from` "bv"` to` "cv"`, and paddings the rest part of `"cv"` with zero.
>>   The final values in array `"c"` are expected to be:
>>   c = [ 0, 1, 2, 3, 4, 5, 6, 7, 0,  0, 0, 0, 0, 0, 0, 0]
>> However, with the optimization mentioned at the beginning, the second reinterpretation is optimized out. The values in array` "c" `are incorrectly copied from array `"a"`. The values are:
>>    c = [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
>> To fix it, this patch adds the vector size constraint for the optimization, that is the first reinterpretation must not be conducted to a shorter vector.
>> 
>> It also fixes a potential issue for the implementation of match rule `"reinterpretX2D (from 16 bytes to 8 bytes)" `on Arm NEON. Specifically, the `"mov"` is always needed even if the` "dst"` and `"src"` are the same register since truncation should be conducted in order to be consistent with the semantics.
>
> Xiaohong Gong has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Revert Arm NEON match rule fixes

LGTM

-------------

Marked as reviewed by njian (Committer).

PR: https://git.openjdk.java.net/jdk16/pull/100


More information about the hotspot-compiler-dev mailing list