RFR: 8281375: Accelerate bitCount operation for AVX2 and AVX512 target. [v2]
Eric Liu
eliu at openjdk.java.net
Thu Feb 10 02:54:03 UTC 2022
On Wed, 9 Feb 2022 18:38:15 GMT, Jatin Bhateja <jbhateja at openjdk.org> wrote:
>> src/hotspot/cpu/x86/matcher_x86.hpp line 194:
>>
>>> 192: }
>>> 193: }
>>> 194:
>>
>> Could you explain more about the meaning of "cost" here? Should it be the actual instruction count or the latency?
>
> Currently it's a rough approximation of the generated instruction size on x86, since it's mainly used to influence the SLP unrolling factor.
Thanks! My concern is that the name "vector_op_cost" looks very general, but at the moment it only accounts for size overhead. In other places we may need to consider time overhead. I would prefer to give the function a more specific name.
>> src/hotspot/share/opto/loopTransform.cpp line 976:
>>
>>> 974: case Op_PopCountVL: {
>>> 975: const TypeVect * vt = n->bottom_type()->is_vect();
>>> 976: body_size += Matcher::vector_op_cost(n->Opcode(), vt->element_basic_type(), vt->length());
>>
>> For other platforms, I think it would be better for `vector_op_cost` to return 0 for now.
>
> If a target supports PopCountVI/VL, the minimum cost should be 1; zero would mean the instruction has no cost at all.
I thought the total cost had already been counted by `uint body_size = _body.size()`. So on other platforms, the total cost is now one higher for each PopCountVI/VL node counted.
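To make the accounting concern concrete, here is a minimal standalone sketch. It is not the HotSpot code: the `Opcode` enum, `estimate_body_size`, and the `popcount_cost` parameter are hypothetical stand-ins, with `popcount_cost` playing the role of whatever `Matcher::vector_op_cost` returns on a given platform.

```cpp
#include <vector>

// Hypothetical stand-ins for C2 node opcodes; not the real HotSpot enum.
enum class Opcode { AddVI, LoadVector, StoreVector, PopCountVI, PopCountVL };

// Simplified model of the loop body size estimate.
// `popcount_cost` is an assumed placeholder for the platform's
// Matcher::vector_op_cost result for PopCountVI/VL.
inline unsigned estimate_body_size(const std::vector<Opcode>& body,
                                   unsigned popcount_cost) {
  // Every node is already counted once here, mirroring
  // `uint body_size = _body.size()`.
  unsigned body_size = static_cast<unsigned>(body.size());
  for (Opcode op : body) {
    switch (op) {
      case Opcode::PopCountVI:
      case Opcode::PopCountVL:
        // Added on top of the one unit already counted for this node.
        body_size += popcount_cost;
        break;
      default:
        break;
    }
  }
  return body_size;
}
```

In this model, a three-node body containing one PopCountVI gives 3 when the extra cost is 0 and 4 when it is 1, so a default of 1 grows the estimate by one per PopCount node compared with the plain node count.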
-------------
PR: https://git.openjdk.java.net/jdk/pull/7373