[jdk16] Integrated: 8259353: VectorReinterpretNode is incorrectly optimized out

Xiaohong Gong xgong at openjdk.java.net
Wed Jan 13 05:52:59 UTC 2021


On Mon, 11 Jan 2021 09:23:24 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:

> Vector reinterpretation just reinterprets the bytes of the vector without performing any value conversions. So normally, optimization like:
>   (VectorReinterpret (VectorReinterpret src))  ->  src
> can be applied correctly if the logical result and the input `"src"` have the same vector type. However, the results might not be correct if truncation happens after the first reinterpretation: `"(VectorReinterpret src)"`.
> 
> Consider the following case:
>   byte[] a = {0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15}
>   byte[] b = new byte[16]
>   byte[] c = new byte[16]
> 
>   // load vector from "a" with 128-bits
>   ByteVector av = ByteVector.fromArray(SPECIES_128, a, 0);
> 
>   // reinterpret to 64-bits
>   ByteVector bv = (ByteVector)av.reinterpretShape(SPECIES_64, 0);
> 
>   // reinterpret back to 128-bits with the above reinterpreting results
>   ByteVector cv = (ByteVector)bv.reinterpretShape(SPECIES_128, 0);
>   cv.intoArray(c, 0)
> This case:
>   1. Reinterprets vector `"av"` from 128-bits to 64-bits. It should only copy the first 8 elements to vector `"bv"` and discard the higher half parts.
>   2. Reinterprets vector `"bv"` back to 128-bits. It copies the 64-bits from` "bv"` to` "cv"`, and paddings the rest part of `"cv"` with zero.
>   The final values in array `"c"` are expected to be:
>   c = [ 0, 1, 2, 3, 4, 5, 6, 7, 0,  0, 0, 0, 0, 0, 0, 0]
> However, with the optimization mentioned at the beginning, the second reinterpretation is optimized out. The values in array` "c" `are incorrectly copied from array `"a"`. The values are:
>    c = [ 0, 1, 2, 3, 4, 5, 6, 7, 8, 9, 10, 11, 12, 13, 14, 15]
> To fix it, this patch adds the vector size constraint for the optimization, that is the first reinterpretation must not be conducted to a shorter vector.
> 
> It also fixes a potential issue for the implementation of match rule `"reinterpretX2D (from 16 bytes to 8 bytes)" `on Arm NEON. Specifically, the `"mov"` is always needed even if the` "dst"` and `"src"` are the same register since truncation should be conducted in order to be consistent with the semantics.

This pull request has now been integrated.

Changeset: 1cf2378b
Author:    Xiaohong Gong <xgong at openjdk.org>
Committer: Ningsheng Jian <njian at openjdk.org>
URL:       https://git.openjdk.java.net/jdk16/commit/1cf2378b
Stats:     6 lines in 1 file changed: 4 ins; 0 del; 2 mod

8259353: VectorReinterpretNode is incorrectly optimized out

Reviewed-by: vlivanov, njian

-------------

PR: https://git.openjdk.java.net/jdk16/pull/100


More information about the hotspot-compiler-dev mailing list