RFR: 8310190: C2 SuperWord: AlignVector is broken, generates misaligned packs [v58]
Vladimir Kozlov
kvn at openjdk.org
Wed Jan 3 19:45:44 UTC 2024
On Wed, 3 Jan 2024 09:01:31 GMT, Emanuel Peter <epeter at openjdk.org> wrote:
>> I want to push this in JDK23.
>> After this fix here, I'm doing [this refactoring](https://github.com/openjdk/jdk/pull/16620).
>>
>> To calm your nerves: most of the changes are in auto-generated tests, and tests in general.
>>
>> **Context**
>>
>> `-XX:+AlignVector` ensures that SuperWord only creates LoadVector and StoreVector that can be memory aligned. This is achieved by iterating in the pre-loop until we reach the alignment boundary, then we can start the main loop properly aligned. However, this is not possible in all cases, sometimes some memory accesses cannot be guaranteed to be aligned, and we need to reject vectorization (at least partially, for some of the packs).
>>
>> Alignment is split into two tasks:
>> - Alignment Correctness Checks: only relevant if `-XX:+AlignVector`. Need to reject vectorization if alignment is not possible. We must check if the address of the vector load/store is aligned with (divisible by) `ObjectAlignmentInBytes`.
>> - Alignment by adjusting pre-loop limit: alignment is desirable even if `-XX:-AlignVector`. We would like to align the vectors with their vector width.
>>
>> **Problem**
>>
>> I have recently found a bug with our AlignVector [JDK-8310190](https://bugs.openjdk.org/browse/JDK-8310190).
>> In that bug, we perform a misaligned memory vector access, which results in a `SIGBUS` on an ARM32 machine.
>> Thanks @fg1417 for confirming this!
>> Hence, we need to fix the alignment correctness checks.
>>
>> While working on this task, I also found some bugs in the "alignment by adjusting pre-loop limit": there were cases where it did not align the vectors correctly.
>>
>> **Problem Details**
>>
>> Reproducer:
>>
>>
>> static void test(short[] a, short[] b, short mask) {
>> for (int i = 0; i < RANGE; i+=8) {
>> // Problematic for AlignVector
>> b[i+0] = (short)(a[i+0] & mask); // best_memref, align 0
>>
>> b[i+3] = (short)(a[i+3] & mask); // pack at offset 6 bytes
>> b[i+4] = (short)(a[i+4] & mask);
>> b[i+5] = (short)(a[i+5] & mask);
>> b[i+6] = (short)(a[i+6] & mask);
>> }
>> }
>>
>>
>> During `SuperWord::find_adjacent_refs` we used to check if the references are expected to be aligned. For that, we look at each "group" of references (eg all `LoadS`) and take the reference with the lowest offset. For that chosen reference, we check if it is alignable. If yes, we accept all references of that group, if no we reject all.
>>
>> This is problemati...
>
> Emanuel Peter has updated the pull request with a new target base due to a merge or a rebase. The pull request now contains 124 commits:
>
> - Merge branch 'JDK-8311586' of https://github.com/eme64/jdk into JDK-8311586
> - Apply suggestions from code review by Christian
>
> Co-authored-by: Christian Hagedorn <christian.hagedorn at oracle.com>
> - fix copyright year 2024
> - Merge branch 'master' into JDK-8311586
> - more comments in SuperWord::adjust_pre_loop_limit_to_align_main_loop_vectors
> - comments about modulo positive / negative values
> - Apply suggestions from code review from Christian
>
> Co-authored-by: Christian Hagedorn <christian.hagedorn at oracle.com>
> - more small fixes by Christian
> - fix for yesterday's reviews by Christian
> - improve case analysis empty / constrained / trivial
> - ... and 114 more: https://git.openjdk.org/jdk/compare/06dd7353...d01a0cd9
Few comments.
src/hotspot/share/opto/chaitin.cpp line 1795:
> 1793: // See if already computed; if so return it
> 1794: if( derived_base_map[derived->_idx] )
> 1795: return derived_base_map[derived->_idx];
Please fix code style for these lines since you are touching this code. Spacing and missing {}.
src/hotspot/share/opto/chaitin.cpp line 1797:
> 1795: return derived_base_map[derived->_idx];
> 1796:
> 1797: if (derived->is_Mach() && derived->as_Mach()->ideal_Opcode() == Op_VerifyVectorAlignment) {
Missing #ifdef ASSERT
src/hotspot/share/opto/compile.cpp line 1059:
> 1057:
> 1058: if (AllowVectorizeOnDemand) {
> 1059: if (has_method() && _directive->VectorizeOption) {
This seems no related. Please explain it.
src/hotspot/share/opto/compile.cpp line 3713:
> 3711: // to ObjectAlignmentInBytes. Hence, even if multiple arrays are accessed in
> 3712: // a loop we can expect at least the following alignment:
> 3713: jlong guaranteed_alignment = MIN2(vector_width, (jlong)ObjectAlignmentInBytes);
This is more relaxed check than the actual alignment required. As I understand it is because it checks only base address of array and not actually memory address to which vector instruction is accessed (which is (base,index,offset)).
It is useful but does not guarantee correct alignment of vector access instructions.
Consider using `lea` instruction on x86 to load memory address into register and check it.
src/hotspot/share/opto/machnode.cpp line 360:
> 358: }
> 359:
> 360: if (base != nullptr && base->is_Mach() && base->as_Mach()->ideal_Opcode() == Op_VerifyVectorAlignment) {
Missing #ifdef ASSERT
src/hotspot/share/opto/superword.cpp line 674:
> 672: "packset empty or we find the alignment reference");
> 673:
> 674: if (TraceSuperWord) {
Missing #ifndef PRODUCT
src/hotspot/share/opto/superword.cpp line 1605:
> 1603: compress_packset();
> 1604:
> 1605: if (TraceSuperWord) {
Missing #ifndef PRODUCT
-------------
PR Review: https://git.openjdk.org/jdk/pull/14785#pullrequestreview-1802885677
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1440813017
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1440791887
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1440792545
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1440864022
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1440796064
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1440826791
PR Review Comment: https://git.openjdk.org/jdk/pull/14785#discussion_r1440828355
More information about the hotspot-compiler-dev
mailing list