RFR: 8310190: C2 SuperWord: AlignVector is broken, generates misaligned packs [v19]
Christian Hagedorn
chagedorn at openjdk.org
Wed Dec 6 12:40:05 UTC 2023
On Tue, 28 Nov 2023 06:10:42 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> I want to push this in JDK23.
>> After this fix here, I'm doing [this refactoring](https://github.com/openjdk/jdk/pull/16620).
>>
>> To calm your nerves: most of the changes are in auto-generated tests, and tests in general.
>>
>> **Context**
>>
>> `-XX:+AlignVector` ensures that SuperWord only creates LoadVector and StoreVector that can be memory aligned. This is achieved by iterating in the pre-loop until we reach the alignment boundary, then we can start the main loop properly aligned. However, this is not possible in all cases, sometimes some memory accesses cannot be guaranteed to be aligned, and we need to reject vectorization (at least partially, for some of the packs).
>>
>> Alignment is split into two tasks:
>> - Alignment Correctness Checks: only relevant if `-XX:+AlignVector`. Need to reject vectorization if alignment is not possible. We must check if the address of the vector load/store is aligned with (divisible by) `ObjectAlignmentInBytes`.
>> - Alignment by adjusting pre-loop limit: alignment is desirable even if `-XX:-AlignVector`. We would like to align the vectors with their vector width.
>>
>> **Problem**
>>
>> I have recently found a bug with our AlignVector [JDK-8310190](https://bugs.openjdk.org/browse/JDK-8310190).
>> In that bug, we perform a misaligned memory vector access, which results in a `SIGBUS` on an ARM32 machine.
>> Thanks @fg1417 for confirming this!
>> Hence, we need to fix the alignment correctness checks.
>>
>> While working on this task, I also found some bugs in the "alignment by adjusting pre-loop limit": there were cases where it did not align the vectors correctly.
>>
>> **Problem Details**
>>
>> Reproducer:
>>
>>
>> static void test(short[] a, short[] b, short mask) {
>> for (int i = 0; i < RANGE; i+=8) {
>> // Problematic for AlignVector
>> b[i+0] = (short)(a[i+0] & mask); // best_memref, align 0
>>
>> b[i+3] = (short)(a[i+3] & mask); // pack at offset 6 bytes
>> b[i+4] = (short)(a[i+4] & mask);
>> b[i+5] = (short)(a[i+5] & mask);
>> b[i+6] = (short)(a[i+6] & mask);
>> }
>> }
>>
>>
>> During `SuperWord::find_adjacent_refs` we used to check if the references are expected to be aligned. For that, we look at each "group" of references (eg all `LoadS`) and take the reference with the lowest offset. For that chosen reference, we check if it is alignable. If yes, we accept all references of that group, if no we reject all.
>>
>> This is problemati...
>
> Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
>
> For Faye: remove 64-bit platform requirement in test
src/hotspot/cpu/aarch64/aarch64.ad line 8243:
> 8241: // VerifyVectorAlignment Instruction
> 8242:
> 8243: instruct verify_vector_alignment(iRegP r, immL_positive_bitmaskI mask, rFlagsReg cr) %{
I suggest to use `addr` instead of `r` to make it simpler to read. Same for `x86.ad`.
src/hotspot/cpu/aarch64/aarch64.ad line 8255:
> 8253: __ bind(Lskip);
> 8254: %}
> 8255: ins_pipe( pipe_slow );
Suggestion:
ins_pipe(pipe_slow);
src/hotspot/cpu/x86/x86.ad line 8979:
> 8977: __ bind(Lskip);
> 8978: %}
> 8979: ins_pipe( pipe_slow );
Suggestion:
ins_pipe(pipe_slow);
src/hotspot/share/opto/c2_globals.hpp line 97:
> 95: \
> 96: develop(bool, VerifyAlignVector, false, \
> 97: "Check that vector store/load are aligned if AlignVector is on.") \
Suggestion:
"Check that vector stores/loads are aligned if AlignVector is on.") \
src/hotspot/share/opto/c2compiler.cpp line 70:
> 68: #ifdef ASSERT
> 69: if (!AlignVector && VerifyAlignVector) {
> 70: warning("VerifyAlignVector disabled because AlignVector not enabled.");
Suggestion:
warning("VerifyAlignVector disabled because AlignVector is not enabled.");
src/hotspot/share/opto/compile.cpp line 3695:
> 3693: n->as_StoreVector()->must_verify_alignment();
> 3694: if (must_verify_alignment) {
> 3695: jlong memory_size = n->is_LoadVector() ? n->as_LoadVector()->memory_size() :
I suggest to name this `vector_length` as you are mentioning it below in the comment.
Suggestion:
jlong vector_length = n->is_LoadVector() ? n->as_LoadVector()->memory_size() :
src/hotspot/share/opto/compile.cpp line 3701:
> 3699: // to ObjectAlignmentInBytes. Hence, even if multiple arrays are accessed in
> 3700: // a loop we can expect at least the following alignment:
> 3701: jlong alignment = MIN2(memory_size, (jlong)ObjectAlignmentInBytes);
To make it more explicit, I suggest to name it:
Suggestion:
jlong guaranteed_alignment = MIN2(memory_size, (jlong)ObjectAlignmentInBytes);
src/hotspot/share/opto/superword.cpp line 590:
> 588: // Find the adjacent memory references and create pack pairs for them.
> 589: // This is the initial set of packs that will then be extended by
> 590: // following use->def and def->use links.
Maybe add a comment/hint here that alignment will be checked later
src/hotspot/share/opto/superword.cpp line 612:
> 610: // Take the first mem_ref as the reference to align to. The pre-loop trip count is
> 611: // modified to align this reference to a vector-aligned address. If strict alignment
> 612: // is required, we may change the reference later (see filter_packs_for_alignment).
Suggestion:
// is required, we may change the reference later (see filter_packs_for_alignment()).
src/hotspot/share/opto/superword.cpp line 669:
> 667: } // while (memops.size() != 0)
> 668:
> 669: set_align_to_ref(align_to_mem_ref);
You could directly set it inside the loop now that we always pick the first one.
src/hotspot/share/opto/superword.cpp line 1594:
> 1592:
> 1593: // Remove all nullptr from packset
> 1594: compress_packset();
IIUC, there should not be any nullptr in the packs before `combine_packs()`. Can we assert that at the entry of `combine_packs()`?
src/hotspot/share/opto/superword.cpp line 1616:
> 1614:
> 1615: // Find the set of alignment solutions for load/store pack p.
> 1616: AlignmentSolution SuperWord::pack_alignment_solution(Node_List* p) {
As there are already a lot of variables in this methods, I suggest to name the parameter `pack` instead of `p`:
Suggestion:
AlignmentSolution SuperWord::pack_alignment_solution(Node_List* pack) {
src/hotspot/share/opto/superword.hpp line 250:
> 248: // Where scale is 0 if no scale dependency,
> 249: // and invar is nullptr if no invar dependency.
> 250: class AlignmentSolution {
Please add some new lines between non-one-liner methods for better readability.
src/hotspot/share/opto/superword.hpp line 251:
> 249: // and invar is nullptr if no invar dependency.
> 250: class AlignmentSolution {
> 251: private:
`private` is implied and can be removed.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407506080
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407494609
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407492226
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407497784
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407498480
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407513072
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407519276
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407522168
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407524783
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407545008
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407551296
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407689762
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407559295
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1407558728
More information about the hotspot-compiler-dev
mailing list