RFR: 8310190: C2 SuperWord: AlignVector is broken, generates misaligned packs [v58]
Vladimir Kozlov
kvn at openjdk.org
Thu Jan 4 16:54:37 UTC 2024
On Thu, 4 Jan 2024 16:45:11 GMT, Vladimir Kozlov <kvn at openjdk.org> wrote:
>> And without `-XX:-VerifyAlignVector`
>>
>>
>> ;; B20: # out( B20 B21 ) <- in( B19 B20 ) Loop( B20-B20 inner post of N743) Freq: 4.49988
>> 0x00007ff22cbb2924: vpaddd 0x10(%rbx,%r13,4),%zmm0,%zmm1
>> 0x00007ff22cbb292f: vmovdqu32 %zmm1,0x10(%rbx,%r13,4) ;*iastore {reexecute=0 rethrow=0 return_oop=0}
>> ; - Test::test0 at 15 (line 11)
>> 0x00007ff22cbb293a: add $0x10,%r13d ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>> ; - Test::test0 at 16 (line 10)
>> 0x00007ff22cbb293e: cmp %r10d,%r13d
>> 0x00007ff22cbb2941: jl 0x00007ff22cbb2924 ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>> ; - Test::test0 at 6 (line 10)
>
> Can you show assembler code for simple load and store instructions (move data from one array to another)?
> My concern is that LoadV and StoreV are defined only with `memory` input:
>
> instruct loadV(vec dst, memory mem) %{
> match(Set dst (LoadVector mem));
>
> I would assume it will be embedded memory only. But C2 may be smart enough to generate `lea` if it sees a node that is not an AddP.
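To be concrete about the "simple load and store" case, something along these lines would do. This is only a sketch: the class name `CopyTest`, the method name `copy`, the array size, and the warm-up count are all placeholders I made up, not taken from the PR's tests.

```java
// Hypothetical reproducer: names and sizes are illustrative only.
public class CopyTest {
    // Plain element-wise copy; the hope is that C2 SuperWord turns the
    // body into a vector load from src followed by a vector store to dst.
    // Assumes dst.length >= src.length.
    static void copy(int[] src, int[] dst) {
        for (int i = 0; i < src.length; i++) {
            dst[i] = src[i];
        }
    }

    public static void main(String[] args) {
        int[] src = new int[1024];
        int[] dst = new int[1024];
        for (int i = 0; i < src.length; i++) {
            src[i] = i;
        }
        // Call the method often enough that it is likely to be compiled by C2.
        for (int iter = 0; iter < 20_000; iter++) {
            copy(src, dst);
        }
        System.out.println(dst[dst.length - 1]);
    }
}
```

Because the load and the store use different arrays here, the generated code should show whether each address gets its own alignment check.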
Also, why does your assembler example test alignment twice for the same address? Maybe because the same array element is used for both the load and the store?
0x00007f83c8bb2f6d: mov %r10,%r8
0x00007f83c8bb2f70: test $0x7,%r8b
0x00007f83c8bb2f74: je 0x00007f83c8bb2f8a
...
0x00007f83c8bb2f8a: test $0x7,%r10b
0x00007f83c8bb2f8e: je 0x00007f83c8bb2fa4
No need to optimize this, I think, since it is only for debugging.
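For reference, each `test $0x7,reg` / `je` pair above checks that the low three bits of the address are zero, i.e. that it is 8-byte aligned. A rough Java sketch of the same predicate follows. The class and method names are hypothetical and not anything in HotSpot, and it assumes the alignment is a power of two.

```java
// Hypothetical helper mirroring the `test $0x7,reg; je ...` pattern.
public class AlignCheck {
    // Returns true when `address` is a multiple of `alignment`.
    // Assumes `alignment` is a power of two, so (alignment - 1) is a
    // mask of the low bits that must all be zero.
    static boolean isAligned(long address, int alignment) {
        return (address & (alignment - 1)) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isAligned(16L, 8)); // low three bits are zero
        System.out.println(isAligned(20L, 8)); // low three bits are 0b100
    }
}
```

With a mask of `0x7`, the two checks in the listing compute the same thing when `%r8` and `%r10` hold the same address, which is why the second one looks redundant.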
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1442025069