[15] RFR (L) 8235825: C2: Merge AD instructions for Replicate nodes

Vladimir Ivanov vladimir.x.ivanov at oracle.com
Fri Dec 13 10:27:32 UTC 2019


Thanks for the feedback, Vladimir.

> replicateB
> 
> Can you fold it differently?
> 
> ReplB_reg_leg

Are you talking about ReplB_reg?

>    predicate(!VM_Version::supports_avx512vlbw());

3151 instruct ReplB_reg_leg(legVec dst, rRegI src) %{
3152   predicate(n->as_Vector()->length() == 64 && 
!VM_Version::supports_avx512bw());

For ReplB_reg_leg the predicate can't be simplified: it is applicable 
only to 512bit vector when AVX512BW is absent. Otherwise, legVec 
constraint will be unnecessarily applied to other configurations.


3119 instruct ReplB_reg(vec dst, rRegI src) %{
3120   predicate((n->as_Vector()->length() <= 32) ||
3121             (n->as_Vector()->length() == 64 && 
VM_Version::supports_avx512bw()));

For ReplB_reg there's a shorter version:

predicate(n->as_Vector()->length() <= 32 || 
VM_Version::supports_avx512bw());


But do you find it easier to read? For example, when you are checking 
that all configurations are covered:

predicate(n->as_Vector()->length() <= 32 ||
           VM_Version::supports_avx512bw());

instruct ReplB_reg_leg(legVec dst, rRegI src) %{
   predicate(n->as_Vector()->length() == 64 &&
            !VM_Version::supports_avx512bw());

vs

instruct ReplB_reg_leg(legVec dst, rRegI src) %{
   predicate(n->as_Vector()->length() == 64 && 
!VM_Version::supports_avx512bw());

instruct ReplB_reg(vec dst, rRegI src) %{
   predicate((n->as_Vector()->length() <= 32) ||
             (n->as_Vector()->length() == 64 && 
VM_Version::supports_avx512bw()));


>    ins_encode %{
>      uint vlen = vector_length(this);
>      __ movdl($dst$$XMMRegister, $src$$Register);
>      __ punpcklbw($dst$$XMMRegister, $dst$$XMMRegister);
>      __ pshuflw($dst$$XMMRegister, $dst$$XMMRegister, 0x00);
>      if (vlen > 8) {
>        __ punpcklqdq($dst$$XMMRegister, $dst$$XMMRegister);
>        if (vlen > 16) {
>          __ vinserti128_high($dst$$XMMRegister, $dst$$XMMRegister);
>          if (vlen > 32) {
>            assert(vlen == 64, "sanity");
>            __ vinserti64x4($dst$$XMMRegister, $dst$$XMMRegister, 
> $dst$$XMMRegister, 0x1);

Yes, it should work as well. Do you find it easier to read though?

> Similar ReplB_imm_leg for which I don't see new implementation.

Good catch. Added it back.

(FTR completeness for reg2reg variants (_reg*) is mandatory. But for 
_mem and _imm it is optional: if they some configuration isn't covered, 
_reg is used. But it was in original code, so I added it back.)

Updated version:
   http://cr.openjdk.java.net/~vlivanov/jbhateja/8235825/webrev.00

Another thing I noticed is that for ReplI/.../ReplD cases avx512vl 
checks are not necessary:
 
http://cr.openjdk.java.net/~vlivanov/jbhateja/8235825/webrev.01/broadcast_vl/

The code assumes that evpbroadcastd/evpbroadcastq, 
vpbroadcastdvpbroadcastq, vpbroadcastss/vpbroadcastsd need AVX512VL for 
512bit case, but Intel manual says AVX512F is enough.

I plan to handle it as a separate change, but let me know if you want to 
incorporate it into 8235825.

> It should also simplify code for avx512 which one or 2 instructions.

Can you elaborate, please? Are you talking about the case when version 
for different vector sizes differ in 1-2 instructions (like ReplB_reg)?

> Other types changes can be done same way.

Best regards,
Vladimir Ivanov

> On 12/12/19 3:19 AM, Vladimir Ivanov wrote:
>> http://cr.openjdk.java.net/~vlivanov/jbhateja/8235825/webrev.00/all/
>> https://bugs.openjdk.java.net/browse/JDK-8235825
>>
>> Merge AD instructions for the following vector nodes:
>>    - ReplicateB, ..., ReplicateD
>>
>> Individual patches:
>>
>> http://cr.openjdk.java.net/~vlivanov/jbhateja/8235825/webrev.00/individual 
>>
>>
>> Testing: tier1-4, test run on different CPU flavors (KNL, CLX)
>>
>> Contributed-by: Jatin Bhateja <jatin.bhateja at intel.com>
>> Reviewed-by: vlivanov, sviswanathan, ?
>>
>> Best regards,
>> Vladimir Ivanov


More information about the hotspot-compiler-dev mailing list