RFR: 8310190: C2 SuperWord: AlignVector is broken, generates misaligned packs

Emanuel Peter epeter at openjdk.org
Wed Dec 6 09:52:43 UTC 2023


On Wed, 15 Nov 2023 03:14:22 GMT, Fei Gao <fgao at openjdk.org> wrote:

>>> @fg1417 thanks for the help! I ran tier1-6 with `-XX:+AlignVector -XX:+IgnoreUnrecognizedVMOptions -XX:+VerifyAlignVector`. Though `VerifyAlignVector` only has an effect if there is a matching rule in the `ad` files. And I ran quite a few repetitions of `compiler/loopopts/superword/TestAlignVectorFuzzer.java`.
>> 
>> Hi @eme64, we have [ `tst`](https://developer.arm.com/documentation/ddi0602/2023-09/Base-Instructions/TST--immediate---Test-bits--immediate---an-alias-of-ANDS--immediate--?lang=en) on aarch64.
>> 
>> instruct verify_vector_alignment(iRegP r, immL_positive_bitmaskI mask, rFlagsReg cr) %{
>>   match(Set r (VerifyVectorAlignment r mask));
>>   effect(KILL cr);
>>   format %{ "verify_vector_alignment $r $mask \t! verify alignment" %}
>>   ins_encode %{
>>     Label Lskip;
>>     // check if masked bits of r are zero
>>     __ tst($r$$Register, $mask$$constant);
>>     __ br(Assembler::EQ, Lskip);
>>     __ stop("verify_vector_alignment found a misaligned vector memory access");
>>     __ bind(Lskip);
>>   %}
>>   ins_pipe( pipe_slow );
>> %}
>> 
>> 
>> I tested tier1-tier3 with `-XX:+AlignVector -XX:+IgnoreUnrecognizedVMOptions -XX:+VerifyAlignVector` on aarch64(neon and 128-bit sve) platforms, and several repetitions of `compiler/loopopts/superword/TestAlignVectorFuzzer.java`. No new failures found. Hope it works for you. Thanks.
>
>> @fg1417 thanks for the snippet! Where do I add it? I don't know how the `ad` and `m4` work. Is there some script that converts them?
> 
> Hi @eme64 , you can add it [here](https://github.com/openjdk/jdk/blob/d9a89c59daa40fdc8da620940d5c518a9f18bc7b/src/hotspot/cpu/aarch64/aarch64.ad#L16394). Generally, we add **vector** operations to `src/hotspot/cpu/aarch64/aarch64_vector_ad.m4`, and redirect the output of the command: `m4 src/hotspot/cpu/aarch64/aarch64_vector_ad.m4` to `src/hotspot/cpu/aarch64/aarch64_vector.ad`. And other instructions are often added to `src/hotspot/cpu/aarch64/aarch64.ad`. Thanks!

@fg1417 I rewrote the proof a bit, after some of your suggestions. I wanted to make the decomposition `pre_iter = pre_iter_C_const + pre_iter_C_invar + pre_iter_C_init` more explicit and bring it up earlier.

-------------

PR Comment: https://git.openjdk.org/jdk/pull/14785#issuecomment-1842538942


More information about the hotspot-compiler-dev mailing list