[jdk11u-dev] RFR: 8267652: c2 loop unrolling by 8 results in reading memory past array
Martin Doerr
mdoerr at openjdk.java.net
Thu Oct 14 08:10:07 UTC 2021
On Wed, 6 Oct 2021 18:05:52 GMT, Vladimir Kempik <vkempik at openjdk.org> wrote:
> The backport can't be applied cleanly to jdk11 (because 8223347: Integration of Vector API (Incubator) is missing there) and needed a big rework.
>
> The main idea behind the backport is to add a new check to the functions with a VecS or VecD argument (those that can read less than 16 bytes from memory) and to leave the VecX/VecY/VecZ types untouched.
>
> With LoopMaxUnroll=8, the problematic place before the patch:
>
> 0x000000011dd55603: vmovq 0x10(%r10,%rsi,1),%xmm0 <---loading 8 bytes from memory
> 0x000000011dd5560a: vpxor 0x10(%r11,%rsi,1),%xmm0,%xmm0 <--- loading 16 bytes from memory, it's not right
> 0x000000011dd55611: vmovq %xmm0,0x10(%r13,%rsi,1) ;*bastore {reexecute=0 rethrow=0 return_oop=0}
> ; - repro::xor_array@18 (line 12)
>
> and after the patch:
>
> 0x000000011f9a4d95: vmovq 0x10(%r11,%rsi,1),%xmm0
> 0x000000011f9a4d9c: vmovq 0x10(%r10,%rsi,1),%xmm1
> 0x000000011f9a4da3: vpxor %xmm1,%xmm0,%xmm0
> 0x000000011f9a4da7: vmovq %xmm0,0x10(%r13,%rsi,1) ;*bastore {reexecute=0 rethrow=0 return_oop=0}
> ; - repro::xor_array@18 (line 12)
>
>
> With LoopMaxUnroll=16 it works the old way (as expected), loading 16 bytes from memory with vmovdqu and vpxor:
>
> 0x000000011e2b8903: vmovdqu 0x10(%r10,%rsi,1),%xmm0
> 0x000000011e2b890a: vpxor 0x10(%r11,%rsi,1),%xmm0,%xmm0
> 0x000000011e2b8911: vmovdqu %xmm0,0x10(%r13,%rsi,1) ;*bastore {reexecute=0 rethrow=0 return_oop=0}
>
> Testing is pending.
>
> Also fixed a typo at https://github.com/openjdk/jdk11u-dev/pull/488/files#diff-d6a3624f0f0af65a98a47378a5c146eed5016ca09b4de1acd0a3acc823242e82L9069
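The read-width problem in the quoted disassembly can be illustrated with simple bounds arithmetic. The sketch below is only an illustration: the class and method names are hypothetical, and the 24-byte array length is an assumed example, not a value taken from the PR. It shows that an 8-byte read starting at element 16 stays inside the array, while a 16-byte read from the same position (as a vpxor with a memory operand would do) reaches 8 bytes past the end.

```java
// Hypothetical helper, not part of the patch: models whether a read of
// readBytes bytes starting at element index stays inside a byte array.
class ReadWidthCheck {
    static boolean staysInBounds(int arrayLength, int index, int readBytes) {
        return index + readBytes <= arrayLength;
    }

    public static void main(String[] args) {
        int length = 24; // assumed example length: 16 bytes plus an 8-byte tail
        int index = 16;  // start of the last 8-byte chunk

        // 8-byte load (vmovq-sized): covers elements 16..23, inside the array.
        System.out.println("8-byte read in bounds:  " + staysInBounds(length, index, 8));

        // 16-byte load (xmm memory operand): would cover elements 16..31.
        System.out.println("16-byte read in bounds: " + staysInBounds(length, index, 16));
    }
}
```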
Marked as reviewed by mdoerr (Reviewer).
Test results are good. Hopefully, nobody will backport other nodes which require the additional check.
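The reproducer itself is not included in this message; only the name repro::xor_array appears in the disassembly comments. A loop of roughly the following shape is the kind of code C2 turns into the vmovq/vpxor sequence quoted above. This is a minimal sketch under assumptions: the class name, array length, and warm-up count are made up for illustration, and it is not claimed to trigger the out-of-bounds read on any particular build.

```java
import java.util.Arrays;

// Hypothetical stand-in for the reproducer; the real repro class is not shown in this thread.
class XorRepro {
    // Element-wise XOR of two byte arrays. The store into out[] corresponds
    // to the bastore bytecode mentioned in the disassembly comments.
    static void xorArray(byte[] a, byte[] b, byte[] out) {
        for (int i = 0; i < out.length; i++) {
            out[i] = (byte) (a[i] ^ b[i]);
        }
    }

    public static void main(String[] args) {
        int length = 24; // assumed length, not taken from the PR
        byte[] a = new byte[length];
        byte[] b = new byte[length];
        byte[] out = new byte[length];
        Arrays.fill(a, (byte) 0x5A);
        Arrays.fill(b, (byte) 0x0F);

        // Call the method many times so the JIT has a chance to compile it with C2.
        for (int iter = 0; iter < 20_000; iter++) {
            xorArray(a, b, out);
        }
        System.out.println(Arrays.toString(out));
    }
}
```

To compare the generated code for the two settings discussed above, the sketch can be run with -XX:LoopMaxUnroll=8 and with -XX:LoopMaxUnroll=16.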
-------------
PR: https://git.openjdk.java.net/jdk11u-dev/pull/488