RFR: 8343689: AArch64: Optimize MulReduction implementation [v3]
Mikhail Ablakatov
mablakatov at openjdk.org
Mon Jun 30 13:25:10 UTC 2025
On Thu, 27 Feb 2025 03:28:51 GMT, Xiaohong Gong <xgong at openjdk.org> wrote:
>> Mikhail Ablakatov has updated the pull request incrementally with two additional commits since the last revision:
>>
>> - fixup: don't modify the value in vsrc
>>
>> Fix reduce_mul_integral_gt128b() so it doesn't modify vsrc. With this
>> change, the result of recursive folding is held in vtmp1. To be able to
>> pass this intermediate result to reduce_mul_integral_le128b(), we would
>> have to use another temporary FloatRegister, as vtmp1 would essentially
>> act as vsrc. It's possible to get around this however:
>> reduce_mul_integral_le128b() is modified so it's possible to pass
>> matching vsrc and vtmp2 arguments. By doing this, we save ourselves a
>> temporary register in rules that match to reduce_mul_integral_gt128b().
>> - cleanup: revert an unnecessary change to reduce_mul_fp_le128b() formating
>
> src/hotspot/cpu/aarch64/aarch64_vector.ad line 3012:
>
>> 3010: vReg tmp1, vReg tmp2) %{
>> 3011: predicate(Matcher::vector_length_in_bytes(n->in(2)) == 8 ||
>> 3012: Matcher::vector_length_in_bytes(n->in(2)) == 16);
>
> Suggestion:
>
> predicate(Matcher::vector_length_in_bytes(n->in(2)) <= 16);
The patch doesn't add or modifies these lines. I'd prefer to leave the footprint as small as possible and leave it as is..
> ` sve_ext(vtmp1, vtmp2, vector_length_in_bytes / 2);`
This doesn't look right. `vtmp1` is both an src and dest register operand for `EXT` but it contains an undefined value at the first iteration of the loop.
I agree that the implementation of the loop is confusing. I'll rework and (hopefully) simplify it, thank you for pointing this out.
Hi @XiaohongGong , I hope I've been able to simplify the implementation. It doesn't utilize a lambda anymore, but I had to peel the first iteration out of the loop anyway.
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/23181#discussion_r2154422774
PR Review Comment: https://git.openjdk.org/jdk/pull/23181#discussion_r2154415974
PR Review Comment: https://git.openjdk.org/jdk/pull/23181#discussion_r2174988523
More information about the hotspot-compiler-dev
mailing list