Hi Gustavo,

Thank you very much for your helpful comments!

I updated webrev:
http://cr.openjdk.java.net/~mhorie/8208171/webrev.01/


Best regards,
--
Michihiro,
IBM Research - Tokyo

Inactive hide details for Gustavo Romero ---2018/07/25 23:05:32---Hi Michi, On 07/25/2018 02:43 AM, Michihiro Horie wrote:Gustavo Romero ---2018/07/25 23:05:32---Hi Michi, On 07/25/2018 02:43 AM, Michihiro Horie wrote:

From: Gustavo Romero <gromero@linux.vnet.ibm.com>
To: Michihiro Horie/Japan/IBM@IBMJP, ppc-aix-port-dev@openjdk.java.net, hotspot-dev@openjdk.java.net
Cc: goetz.lindenmaier@sap.com, volker.simonis@sap.com, "Doerr, Martin" <martin.doerr@sap.com>
Date: 2018/07/25 23:05
Subject: Re: RFR: 8208171: PPC64: Enrich SLP support





Hi Michi,

On 07/25/2018 02:43 AM, Michihiro Horie wrote:
> Dear all,
>
> Would you review the following change?
> Bug:
https://bugs.openjdk.java.net/browse/JDK-8208171
> Webrev:
http://cr.openjdk.java.net/~mhorie/8208171/webrev.00
>
> This change adds support for vectorized arithmetic calculation with SLP.
>
> The to_vr function is added to convert VSR to VR. Currently, vecX is associated with a VSR class vs_reg that only defines VSR32-51 in ppc.ad, which are exactly overlapped with VRs. Instruction APIs receiving VRs use the to_vr via vecX. Another thing is the change in sqrtF_reg to enable the matching with SqrtVF. I think the change in sqrtF_reg would be fine due to the ConvD2FNode::Value in convertnode.cpp.

Looks good. Just a few comments:

- In vmul4F_reg() would it be reasonable to use xvmulsp instead of vmaddfp in
  order to avoid the splat?

- Although all instructions added by your change where introduced in ISA 2.06,
  so POWER7 and above are OK, as I see probes for PowerArchictecturePPC64=6|5 in
  vm_version_ppc.cpp (line 64),  I'm wondering if there is any control point to
  guarantee that these instructions won't be emitted on a CPU that does not
  support them.

- I think that in general string in format %{} are in upper case. For instance,
  this the current output on optoassembly for vmul4F:

2941835 5b4     ADDI    R24, R24, #64
2941836 5b8     vmaddfp  VSR32,VSR32,VSR36      ! mul packed4F
2941837 5c0     STXVD2X     [R17], VSR32        // store 16-byte Vector

  I think it would be better to be in upper case instead. I also think that if
  the node match emits more than one instruction all instructions must be listed
  in format %{}, since it's meant for detailed debugging. Finally I think it
  would be better to replace \t! by \t// in that string (unless I'm missing any
  special meaning for that char). So for vmul4F it would be something like:

2941835 5b4     ADDI      R24, R24, #64
                VSPLTISW  VSR34, 0                 // Splat 0 imm in VSR34
2941836 5b8     VMADDFP   VSR32,VSR32,VSR36,VSR34  // Mul packed4F
2941837 5c0     STXVD2X   [R17], VSR32             // store 16-byte Vector


But feel free to change anything just after you get additional reviews :)


> I confirmed this change with JTREG. In addition, I used attached micro benchmarks.
> /(See attached file: slp_microbench.zip)/

Thanks for sharing it.
Btw, another option to host it would be in the CR
server, in
http://cr.openjdk.java.net/~mhorie/8208171


Best regards,
Gustavo
 
>
> Best regards,
> --
> Michihiro,
> IBM Research - Tokyo
>