RFR: 8323582: C2 SuperWord AlignVector: misaligned vector memory access with unaligned native memory [v3]
Emanuel Peter
epeter at openjdk.org
Thu Feb 20 07:21:45 UTC 2025
> Note: the approach with Predicates and Multiversioning prepares us well for Runtime Checks for Aliasing Analysis, see more below.
>
> **Background**
>
> With `-XX:+AlignVector`, all vector loads/stores must be aligned. We try to statically determine if we can always align the vectors. One condition is that the address `base` is already aligned. For arrays, we know that this always holds, because they are `ObjectAlignmentInBytes` aligned. But with native memory, the `base` is just some arbitrarily aligned pointer.
>
> **Problem**
>
> So far, we have naively assumed that the `base` is always `ObjectAlignmentInBytes` aligned. But that does not hold for `native` memory segments: the `base` can also be unaligned. I constructed such an example; with `-XX:+AlignVector -XX:+VerifyAlignVector` it hits the verification code.
>
>
> MemorySegment nativeAligned = Arena.ofAuto().allocate(RANGE * 4 + 1);
> MemorySegment nativeUnaligned = nativeAligned.asSlice(1);
> test3(nativeUnaligned);
>
>
> When compiling the test method, we assume that the `nativeUnaligned.address()` is aligned - but it is not!
>
> static void test3(MemorySegment ms) {
>     for (int i = 0; i < RANGE; i++) {
>         long adr = i * 4L;
>         int v = ms.get(ELEMENT_LAYOUT, adr);
>         ms.set(ELEMENT_LAYOUT, adr, (int)(v + 1));
>     }
> }
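
The misalignment of the sliced segment can be observed directly from Java. The following is a minimal sketch, not part of the patch: the class `BaseAlignmentDemo` and the helper `isBaseAligned` are hypothetical names, `range` stands in for `RANGE`, and 8 bytes is assumed as the alignment of interest (the default value of `ObjectAlignmentInBytes`). The allocation requests 8-byte alignment explicitly so that the result does not depend on allocator behavior.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;

public class BaseAlignmentDemo {
    // Hypothetical helper: true if the segment's base address is a multiple of alignBytes.
    static boolean isBaseAligned(MemorySegment ms, long alignBytes) {
        return ms.address() % alignBytes == 0;
    }

    public static void main(String[] args) {
        int range = 1024; // stand-in for RANGE from the test
        // Explicitly request 8-byte alignment for the native allocation.
        MemorySegment nativeAligned = Arena.ofAuto().allocate(range * 4L + 1, 8);
        // Slicing at offset 1 shifts the base address by one byte.
        MemorySegment nativeUnaligned = nativeAligned.asSlice(1);

        System.out.println(isBaseAligned(nativeAligned, 8));   // prints "true"
        System.out.println(isBaseAligned(nativeUnaligned, 8)); // prints "false"
    }
}
```

This is the same property that the runtime check described below has to test before a loop with speculative alignment assumptions may be entered.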
>
>
> **Solution: Runtime Checks - Predicate and Multiversioning**
>
> Of course we could simply forbid vectorization whenever the `base` is `native`. But that would cause regressions: in most cases we do get aligned `base`s, and we currently vectorize those. Since we cannot statically determine whether the `base` is aligned, we need a runtime check.
>
> I came up with two options for where to place the runtime checks:
> - A new "auto vectorization" Parse Predicate:
>   - This only works when predicates are available.
>   - If we fail the predicate, then we recompile without the predicate. That means we cannot add a check to the predicate any more, and we would have to do multiversioning at that point if we still want to have a vectorized loop.
> - Multiversion the loop:
>   - Create 2 copies of the loop (fast and slow loops).
>   - The `fast_loop` can make speculative alignment assumptions, and add the corresponding check to the `multiversion_if`, which decides which loop we take.
>   - In the `slow_loop`, we make no assumption, which means we cannot vectorize, but we still compile, so even unaligned `base`s would end up with reasonably fast code.
>   - We "stall" the `...
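
The multiversioning idea can be pictured at the Java source level. This is only an illustrative sketch under stated assumptions, not the C2 implementation: C2 performs this transformation on its IR, and none of the names below (`MultiversionSketch`, `incrementAll`, `fastLoop`, `slowLoop`, `ASSUMED_ALIGNMENT`) come from the patch. `ELEMENT_LAYOUT` is assumed here to be `ValueLayout.JAVA_INT_UNALIGNED`, and 8 bytes is assumed as the required base alignment.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.ValueLayout;

public class MultiversionSketch {
    // Assumption: an unaligned int layout, so both loop versions are legal for any base.
    static final ValueLayout.OfInt ELEMENT_LAYOUT = ValueLayout.JAVA_INT_UNALIGNED;
    // Assumption: alignment that the fast version speculates on.
    static final long ASSUMED_ALIGNMENT = 8;

    // Plays the role of the multiversion_if: one runtime check picks the loop version.
    static String incrementAll(MemorySegment ms, int range) {
        if (ms.address() % ASSUMED_ALIGNMENT == 0) {
            fastLoop(ms, range);
            return "fast_loop";
        } else {
            slowLoop(ms, range);
            return "slow_loop";
        }
    }

    // Version that may rely on the checked alignment (candidate for aligned vectorization).
    static void fastLoop(MemorySegment ms, int range) {
        for (int i = 0; i < range; i++) {
            long adr = i * 4L;
            ms.set(ELEMENT_LAYOUT, adr, ms.get(ELEMENT_LAYOUT, adr) + 1);
        }
    }

    // Version that makes no alignment assumption; same semantics, scalar code.
    static void slowLoop(MemorySegment ms, int range) {
        for (int i = 0; i < range; i++) {
            long adr = i * 4L;
            ms.set(ELEMENT_LAYOUT, adr, ms.get(ELEMENT_LAYOUT, adr) + 1);
        }
    }

    public static void main(String[] args) {
        MemorySegment aligned = Arena.ofAuto().allocate(16 * 4L + 1, 8);
        MemorySegment unaligned = Arena.ofAuto().allocate(16 * 4L + 1, 8).asSlice(1);
        System.out.println(incrementAll(aligned, 16));   // prints "fast_loop"
        System.out.println(incrementAll(unaligned, 16)); // prints "slow_loop"
    }
}
```

Both loop bodies are identical in the sketch on purpose: the two versions compute the same result, and the only difference is which assumptions the compiler may exploit in each copy.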
Emanuel Peter has updated the pull request incrementally with one additional commit since the last revision:
adjust selector if probability
-------------
Changes:
- all: https://git.openjdk.org/jdk/pull/22016/files
- new: https://git.openjdk.org/jdk/pull/22016/files/a98ffabf..b3044bc5
Webrevs:
- full: https://webrevs.openjdk.org/?repo=jdk&pr=22016&range=02
- incr: https://webrevs.openjdk.org/?repo=jdk&pr=22016&range=01-02
Stats: 1 line in 1 file changed: 0 ins; 0 del; 1 mod
Patch: https://git.openjdk.org/jdk/pull/22016.diff
Fetch: git fetch https://git.openjdk.org/jdk.git pull/22016/head:pull/22016
PR: https://git.openjdk.org/jdk/pull/22016