[15] RFR (L) 8235825: C2: Merge AD instructions for Replicate nodes

Vladimir Kozlov vladimir.kozlov at oracle.com
Fri Dec 13 01:16:29 UTC 2019


Vladimir

replicateB

Can you fold it differently?

ReplB_reg_leg
   predicate(!VM_Version::supports_avx512vlbw());
   ins_encode %{
     uint vlen = vector_length(this);
     __ movdl($dst$$XMMRegister, $src$$Register);
     __ punpcklbw($dst$$XMMRegister, $dst$$XMMRegister);
     __ pshuflw($dst$$XMMRegister, $dst$$XMMRegister, 0x00);
     if (vlen > 8) {
       __ punpcklqdq($dst$$XMMRegister, $dst$$XMMRegister);
       if (vlen > 16) {
         __ vinserti128_high($dst$$XMMRegister, $dst$$XMMRegister);
         if (vlen > 32) {
           assert(vlen == 64, "sanity");
           __ vinserti64x4($dst$$XMMRegister, $dst$$XMMRegister, $dst$$XMMRegister, 0x1);
         }
       }
     }
   %}

The same applies to ReplB_imm_leg, for which I don't see a new implementation.

It should also simplify the code for AVX-512, which needs only one or two instructions.

The changes for the other types can be done the same way.
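
To make the folding easier to check, here is a rough, portable C++ model of the ReplB_reg_leg sequence above. It is only a sketch: the Reg type and the helper functions are hypothetical stand-ins that mimic what each instruction does to the register bytes, not HotSpot or assembler APIs, and the register is assumed to be modeled as 64 bytes with lane 0 first.

```cpp
#include <array>
#include <cassert>
#include <cstdint>

// Hypothetical model of a 512-bit vector register: 64 bytes, lane 0 first.
using Reg = std::array<uint8_t, 64>;

// movdl: low dword takes the GPR value, all other bytes are zeroed.
static void movdl(Reg& dst, uint32_t src) {
  dst.fill(0);
  for (int i = 0; i < 4; i++) dst[i] = static_cast<uint8_t>(src >> (8 * i));
}

// punpcklbw dst, dst: interleave the low 8 bytes with themselves.
static void punpcklbw_self(Reg& dst) {
  Reg t = dst;
  for (int i = 0; i < 8; i++) { dst[2 * i] = t[i]; dst[2 * i + 1] = t[i]; }
}

// pshuflw dst, dst, 0x00: broadcast word 0 into the low four words.
static void pshuflw_0(Reg& dst) {
  for (int w = 1; w < 4; w++) { dst[2 * w] = dst[0]; dst[2 * w + 1] = dst[1]; }
}

// punpcklqdq dst, dst: copy the low qword into the second qword.
static void punpcklqdq_self(Reg& dst) {
  for (int i = 0; i < 8; i++) dst[8 + i] = dst[i];
}

// vinserti128_high dst, dst: copy the low 128 bits into bits 128..255.
static void vinserti128_high_self(Reg& dst) {
  for (int i = 0; i < 16; i++) dst[16 + i] = dst[i];
}

// vinserti64x4 dst, dst, dst, 0x1: copy the low 256 bits into bits 256..511.
static void vinserti64x4_high_self(Reg& dst) {
  for (int i = 0; i < 32; i++) dst[32 + i] = dst[i];
}

// Mirrors the nested vlen checks of the folded encode block.
// vlen is the vector length in bytes (4, 8, 16, 32 or 64).
static Reg replicate_byte(uint32_t src, unsigned vlen) {
  Reg dst;
  movdl(dst, src);
  punpcklbw_self(dst);
  pshuflw_0(dst);
  if (vlen > 8) {
    punpcklqdq_self(dst);
    if (vlen > 16) {
      vinserti128_high_self(dst);
      if (vlen > 32) {
        assert(vlen == 64 && "sanity");
        vinserti64x4_high_self(dst);
      }
    }
  }
  return dst;
}
```

Calling replicate_byte(0x1234AB, 64) in this model fills all 64 lanes with the low byte of the source, and the smaller vector lengths simply stop earlier in the same chain, which is why a single instruct can cover all sizes.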

Thanks,
Vladimir

On 12/12/19 3:19 AM, Vladimir Ivanov wrote:
> http://cr.openjdk.java.net/~vlivanov/jbhateja/8235825/webrev.00/all/
> https://bugs.openjdk.java.net/browse/JDK-8235825
> 
> Merge AD instructions for the following vector nodes:
>    - ReplicateB, ..., ReplicateD
> 
> Individual patches:
> 
> http://cr.openjdk.java.net/~vlivanov/jbhateja/8235825/webrev.00/individual
> 
> Testing: tier1-4, test run on different CPU flavors (KNL, CLX)
> 
> Contributed-by: Jatin Bhateja <jatin.bhateja at intel.com>
> Reviewed-by: vlivanov, sviswanathan, ?
> 
> Best regards,
> Vladimir Ivanov
