RFR: 8310190: C2 SuperWord: AlignVector is broken, generates misaligned packs

Vladimir Kozlov kvn at openjdk.org
Sat Oct 7 01:01:18 UTC 2023


On Thu, 6 Jul 2023 14:13:01 GMT, Emanuel Peter <epeter at openjdk.org> wrote:

> To calm your nerves: most of the changes are in auto-generated tests, and tests in general.
> 
> **Context**
> 
> `-XX:+AlignVector` ensures that SuperWord only creates LoadVector and StoreVector that can be memory aligned. This is achieved by iterating in the pre-loop until we reach the alignment boundary, then we can start the main loop properly aligned. However, this is not possible in all cases, sometimes some memory accesses cannot be guaranteed to be aligned, and we need to reject vectorization (at least partially, for some of the packs).
> 
> Alignment is split into two tasks:
>  - Alignment Correctness Checks: only relevant if `-XX:+AlignVector`. Need to reject vectorization if alignment is not possible. We must check if the address of the vector load/store is aligned with (divisible by) `ObjectAlignmentInBytes`.
>  - Alignment by adjusting pre-loop limit: alignment is desirable even if `-XX:-AlignVector`. We would like to align the vectors with their vector width.
> 
> **Problem**
> 
> I have recently found a bug with our AlignVector [JDK-8310190](https://bugs.openjdk.org/browse/JDK-8310190).
> In that bug, we perform a misaligned memory vector access, which results in a `SIGBUS` on an ARM32 machine.
> Thanks @fg1417 for confirming this!
> Hence, we need to fix the alignment correctness checks.
> 
> While working on this task, I also found some bugs in the "alignment by adjusting pre-loop limit": there were cases where it did not align the vectors correctly.
> 
> **Problem Details**
> 
> Reproducer:
> 
> 
>     static void test3(byte[] a, byte[] b, byte mask) {
>         for (int i = 0; i < RANGE; i+=8) {
>             // Problematic for AlignVector
>             b[i+0] = (byte)(a[i+0] & mask); // best_memref, align 0
> 
>             b[i+3] = (byte)(a[i+3] & mask); // pack at offset 6 bytes
>             b[i+4] = (byte)(a[i+4] & mask);
>             b[i+5] = (byte)(a[i+5] & mask);
>             b[i+6] = (byte)(a[i+6] & mask);
>         }
>     }
> 
> 
> During `SuperWord::find_adjacent_refs` we used to check if the references are expected to be aligned. For that, we look at each "group" of references (eg all `LoadS`) and take the reference with the lowest offset. For that chosen reference, we check if it is alignable. If yes, we accept all references of that group, if no we reject all.
> 
> This is problematic as shown in this example. We have references at index offset `0, 3, 4, 5, 6`, and byte offset `0, 6, 8, 10, 12`. While the reference at position `0` is alignable, the pack that is eventually generated has offsets `6,...

src/hotspot/cpu/x86/x86.ad line 8974:

> 8972: 
> 8973: #ifdef _LP64
> 8974: instruct verify_alignment(rRegP r, rRegL mask, rRegP tmp) %{

this should be also under #ifdef ASSERT

src/hotspot/cpu/x86/x86.ad line 8985:

> 8983:     __ movq(Rtmp, Rr);    // copy r to tmp
> 8984:     __ andq(Rtmp, Rmask); // mask off tmp
> 8985:     __ cmpq(Rtmp, 0x0);   // check if zero

Since mask is constant the argument should be immL32 and `testq` instruction could be used instead these 3 instructions. See testL_reg_imm().

src/hotspot/share/opto/classes.hpp line 452:

> 450: macro(LoadVectorMasked)
> 451: macro(StoreVectorMasked)
> 452: macro(VerifyAlignment)

May be more vector specific name: VerifyVectorAlignment

src/hotspot/share/opto/compile.cpp line 3668:

> 3666: 
> 3667:   case Op_LoadVector:
> 3668:   case Op_StoreVector:

What about Masked Load and Store vectors?

src/hotspot/share/opto/compile.cpp line 3670:

> 3668:   case Op_StoreVector:
> 3669: #ifdef ASSERT
> 3670: #ifdef AMD64

Instead of #ifdef AMD64 you can check `Matcher::has_match_rule(Op_VerifyAlignmen)`

src/hotspot/share/opto/compile.cpp line 3672:

> 3670: #ifdef AMD64
> 3671:     // Add VerifyAlignment node between adr and load / store.
> 3672:     if (AlignVector && VerifyAlignVector) {

You should check only VerifyAlignVector here. In `CompilerConfig::check_args_consistency()` you need to check that VerifyAlignVector could be set to true only if AlignVector is true. See PostLoopMultiversioning as example there. Don't forget #ifdef ASSERT there.

src/hotspot/share/opto/compile.cpp line 3684:

> 3682:         jlong alignment = MIN2(memory_size, (jlong)ObjectAlignmentInBytes);
> 3683:         assert(2 <= alignment && alignment <= 64, "alignment must be in range");
> 3684:         assert(is_power_of_2(alignment), "alignment must be power of 2");

Do we have somewhere check that vector size is always power of 2 or multiply of object alignment (for case when it is bigger than object alignment)?

src/hotspot/share/opto/superword.cpp line 2924:

> 2922: #ifdef ASSERT
> 2923:       // Mark Load/Store Vector for alignment verification
> 2924:       if (AlignVector && VerifyAlignVector && cl->is_main_loop()) {

`AlignVector &&` no need if you do check when VerifyAlignVector is set.

src/hotspot/share/opto/superword.cpp line 2927:

> 2925:         if (vn->Opcode() == Op_LoadVector) {
> 2926:           vn->as_LoadVector()->set_must_verify_alignment();
> 2927:         } else if (vn->Opcode() == Op_StoreVector) {

Do you mark all Load/StoreVector nodes for verification? Then why do you even need to set such field?
I you verify only few you need comment explaining which one you are verifying.

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1349391130
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1349397555
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1349390891
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1349394968
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1349392628
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1349393982
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1349415762
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1349418418
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1349419169


More information about the hotspot-compiler-dev mailing list