RFR: 8282541: AArch64: Auto-vectorize Math.round API [v3]
Andrew Haley
aph at openjdk.java.net
Tue Apr 19 16:45:27 UTC 2022
On Fri, 15 Apr 2022 08:14:37 GMT, Andrew Haley <aph at openjdk.org> wrote:
>> OK, thanks! Looks reasonable to me. We are going to convert all vecX/vecD registers to vReg, which I think should make the SIMD code cleaner.
>>
>> Currently all rules for vReg are in aarch64_sve.ad. Since the codegen is actually for an SVE target, even though it generates ASIMD instructions, perhaps it would be better to move these rules to aarch64_sve.ad? Also, I think the 2F/4F rules could be merged into one, like:
>>
>>
>> instruct vroundvRegF(vReg dst, vReg src, vReg tmp1, vReg tmp2, vReg tmp3)
>> %{
>>   predicate(n->as_Vector()->length_in_bytes() <= 16);
>>   match(Set dst (RoundVF src));
>>   effect(TEMP_DEF dst, TEMP tmp1, TEMP tmp2, TEMP tmp3);
>>   format %{ "vround $dst, $src\t# round vReg F to I vector" %}
>>   ins_encode %{
>>     uint size = Matcher::vector_length_in_bytes(this);
>>     __ vector_round_neon(as_FloatRegister($dst$$reg), as_FloatRegister($src$$reg),
>>                          as_FloatRegister($tmp1$$reg), as_FloatRegister($tmp2$$reg),
>>                          as_FloatRegister($tmp3$$reg), (size == 16) ? __ T4S : __ T2S);
>>   %}
>>   ins_pipe(pipe_slow);
>> %}
>
> Seems reasonable. Maybe we could push the logic down into MacroAssembler. That way there'd be one point at which the SVE/Neon decision was made, in MacroAssembler. The disadvantage would be that Neon and SVE versions require different register clobbers, but that might not matter.
Hi, do you like it better now? Thanks.
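
For readers following the thread, the "one decision point in MacroAssembler" idea quoted above might look roughly like the toy sketch below. It is not the code in the PR: ToyMacroAssembler, Arrangement, vector_round_sve, vector_round_float and last_emitted are hypothetical stand-ins, and only the name vector_round_neon, the 16-byte threshold and the T2S/T4S choice are taken from the quoted rule. The different register clobbers of the two paths are not modelled.

```cpp
#include <cassert>
#include <string>

// Hypothetical stand-in for the ASIMD arrangement specifiers (not HotSpot's type).
enum class Arrangement { T2S, T4S };

// Toy model only: records which path was chosen instead of emitting instructions.
struct ToyMacroAssembler {
  std::string last_emitted;

  // NEON path: pick the arrangement from the vector length, mirroring
  // "(size == 16) ? T4S : T2S" in the quoted rule.
  void vector_round_neon(unsigned length_in_bytes) {
    assert(length_in_bytes == 8 || length_in_bytes == 16);
    Arrangement a = (length_in_bytes == 16) ? Arrangement::T4S : Arrangement::T2S;
    last_emitted = (a == Arrangement::T4S) ? "neon T4S" : "neon T2S";
  }

  // SVE path (assumed name): used for vectors wider than 16 bytes.
  void vector_round_sve(unsigned length_in_bytes) {
    last_emitted = "sve " + std::to_string(length_in_bytes) + "B";
  }

  // The single point at which the SVE/Neon decision is made (assumed name).
  void vector_round_float(unsigned length_in_bytes) {
    if (length_in_bytes <= 16) {
      vector_round_neon(length_in_bytes);
    } else {
      vector_round_sve(length_in_bytes);
    }
  }
};
```

With this shape the .ad rule would only call the single entry point, and the choice of instruction set would live in one place.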
-------------
PR: https://git.openjdk.java.net/jdk/pull/8204