RFR: 8310190: C2 SuperWord: AlignVector is broken, generates misaligned packs [v58]

Vladimir Kozlov kvn at openjdk.org
Thu Jan 4 16:47:35 UTC 2024


On Thu, 4 Jan 2024 08:25:25 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

>> I don't understand this comment.
>> The `LoadVector` and `StoreVector` have both a `MemNode::Address` input, which I think it the memory address.
>> The address itself usually consists of `AddP` nodes, which do the (base, index, offset) computation. These nodes later can be folded into the load/store itself, or be computed with a `lea`.
>> I simply take the address value, check it for alignment and pass it on to the load/store.
>> 
>> Take this example:
>> 
>> public class Test {
>>     static int RANGE = 1024*64;
>> 
>>     public static void main(String[] strArr) {
>>         int a[] = new int[RANGE];
>>         test0(a);
>>     }
>> 
>>     static void test0(int[] a) {
>>         for (int i = 0; i < RANGE; i++) {
>>             a[i]++;
>>         }
>>     }
>> }
>> 
>> 
>> `./java -XX:CompileCommand=compileonly,Test::test* -XX:+TraceSuperWord -Xcomp -XX:+PrintIdeal -XX:+AlignVector  -XX:+VerifyAlignVector -XX:CompileCommand=print,Test::test* Test.java`
>> 
>> This looks like the main loop:
>> 
>>  ;; B22: #	out( B22 B23 ) <- in( B21 B22 ) Loop( B22-B22 inner post of N743) Freq: 4.49988
>>   0x00007f83c8bb2f68:   lea    0x10(%rbp,%rbx,4),%r10       ;*iaload {reexecute=0 rethrow=0 return_oop=0}
>>                                                             ; - Test::test0 at 12 (line 11)
>>   0x00007f83c8bb2f6d:   mov    %r10,%r8
>>   0x00007f83c8bb2f70:   test   $0x7,%r8b
>>   0x00007f83c8bb2f74:   je     0x00007f83c8bb2f8a
>>   0x00007f83c8bb2f76:   movabs $0x7f83d77e2fc8,%rdi         ;   {external_word}
>>   0x00007f83c8bb2f80:   and    $0xfffffffffffffff0,%rsp
>>   0x00007f83c8bb2f84:   callq  0x00007f83d71a0162           ;   {runtime_call MacroAssembler::debug64(char*, long, long*)}
>>   0x00007f83c8bb2f89:   hlt    
>>   0x00007f83c8bb2f8a:   test   $0x7,%r10b
>>   0x00007f83c8bb2f8e:   je     0x00007f83c8bb2fa4
>>   0x00007f83c8bb2f90:   movabs $0x7f83d77e2fc8,%rdi         ;   {external_word}
>>   0x00007f83c8bb2f9a:   and    $0xfffffffffffffff0,%rsp
>>   0x00007f83c8bb2f9e:   callq  0x00007f83d71a0162           ;   {runtime_call MacroAssembler::debug64(char*, long, long*)}
>>   0x00007f83c8bb2fa3:   hlt    
>>   0x00007f83c8bb2fa4:   vpaddd (%r10),%zmm5,%zmm0
>>   0x00007f83c8bb2faa:   vmovdqu32 %zmm0,(%r8)               ;*iastore {reexecute=0 rethrow=0 return_oop=0}
>>                                                             ; - Test::test0 at 15 (line 11)
>>   0x00007f83c8bb2fb0:   add    $0x10,%ebx                   ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>>                              ...
>
> And without `-XX:-VerifyAlignVector`
> 
> 
>  ;; B20: #	out( B20 B21 ) <- in( B19 B20 ) Loop( B20-B20 inner post of N743) Freq: 4.49988
>   0x00007ff22cbb2924:   vpaddd 0x10(%rbx,%r13,4),%zmm0,%zmm1
>   0x00007ff22cbb292f:   vmovdqu32 %zmm1,0x10(%rbx,%r13,4)   ;*iastore {reexecute=0 rethrow=0 return_oop=0}
>                                                             ; - Test::test0 at 15 (line 11)
>   0x00007ff22cbb293a:   add    $0x10,%r13d                  ;*iinc {reexecute=0 rethrow=0 return_oop=0}
>                                                             ; - Test::test0 at 16 (line 10)
>   0x00007ff22cbb293e:   cmp    %r10d,%r13d
>   0x00007ff22cbb2941:   jl     0x00007ff22cbb2924           ;*if_icmpge {reexecute=0 rethrow=0 return_oop=0}
>                                                             ; - Test::test0 at 6 (line 10)

Can you show assembler code for simple load and store instructions (move data from one array to another)?
My concern is that LoadV and StoreV are defined only with `memory` input:

instruct loadV(vec dst, memory mem) %{
  match(Set dst (LoadVector mem));

I would assume it will be embedded memory only. But C2 may be smart enough to generate `lea` if it sees not AddP node.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1442017423


More information about the hotspot-compiler-dev mailing list