8179527: Implement intrinsic code for reverseBytes with load/store

Michihiro Horie HORIE at jp.ibm.com
Fri Jun 2 09:28:01 UTC 2017


Volker,

Thank you very much for the 2nd review.
Attached includes micro benchmarks I used for the verification.
(See attached file: reverseBytesBench.tar.gz)

Best regards,
--
Michihiro,
IBM Research - Tokyo



From:	Volker Simonis <volker.simonis at gmail.com>
To:	"Doerr, Martin" <martin.doerr at sap.com>
Cc:	Michihiro Horie <HORIE at jp.ibm.com>, Gustavo Bueno Romero
            <gromero at br.ibm.com>, Hiroshi H Horii <HORII at jp.ibm.com>,
            "hotspot-dev at openjdk.java.net" <hotspot-dev at openjdk.java.net>
Date:	2017/06/02 17:04
Subject:	Re: 8179527: Implement intrinsic code for reverseBytes with
            load/store



Hi Michihiro, Martin,

thanks a lot for the change Michihiro and thanks for the thorough review
Martin :)

I'm just wondering if you had some test-cases to verify if the new nodes
really match?

Change looks good now.

Regards,
Volker


On Thu, Jun 1, 2017 at 6:00 PM, Doerr, Martin <martin.doerr at sap.com> wrote:
  Hi Michihiro,





  thank you very much for providing the new webrev (with support for older
  Power processors, which was missing in webrev.02):


  http://cr.openjdk.java.net/~horii/8179527/webrev.03/





  I have built and executed the VM on a Power6 machine. Thanks for
  supporting it.


  I would have spent only one flag for has_ldbrx/has_stdbrx, but I’m ok
  with it. The change looks good to me and I will sponsor it after 2nd
  review.





  Best regards,


  Martin








  From: Doerr, Martin
  Sent: Mittwoch, 31. Mai 2017 15:54
  To: 'Michihiro Horie' <HORIE at jp.ibm.com>; Volker Simonis (
  volker.simonis at gmail.com) <volker.simonis at gmail.com>
  Cc: Gustavo Bueno Romero <gromero at br.ibm.com>; Hiroshi H Horii <
  HORII at jp.ibm.com>; hotspot-dev at openjdk.java.net


  Subject: RE: 8179527: Implement intrinsic code for reverseBytes with
  load/store





  Hi Michihiro,





  thank you for the new webrev with all of my suggestions. Looks good to
  me.


  There’s only a trailing whitespace which I had to remove to satisfy “hg
  jcheck” (otherwise push attempt would get rejected).





  We’ll run tests. Maybe we can get a 2nd review in the meantime?





  Best regards,


  Martin








  From: Michihiro Horie [mailto:HORIE at jp.ibm.com]
  Sent: Mittwoch, 31. Mai 2017 14:36
  To: Doerr, Martin <martin.doerr at sap.com>
  Cc: Gustavo Bueno Romero <gromero at br.ibm.com>; Hiroshi H Horii <
  HORII at jp.ibm.com>; hotspot-dev at openjdk.java.net; Simonis, Volker <
  volker.simonis at sap.com>; ppc-aix-port-dev at openjdk.java.net
  Subject: RE: 8179527: Implement intrinsic code for reverseBytes with
  load/store





  Martin,

  Thank you very much for your helpful comments and sponsoring this change.

  Would you review the latest change?
  http://cr.openjdk.java.net/~horii/8179527/webrev.02/

  Best regards,
  --
  Michihiro,
  IBM Research - Tokyo

  Inactive hide details for "Doerr, Martin" ---2017/05/30 01:26:01---Hi
  Michihiro, thanks for the improved webrev. This looks bet"Doerr, Martin"
  ---2017/05/30 01:26:01---Hi Michihiro, thanks for the improved webrev.
  This looks better, but I still have a couple of sugges

  From: "Doerr, Martin" <martin.doerr at sap.com>
  To: Michihiro Horie <HORIE at jp.ibm.com>
  Cc: "hotspot-dev at openjdk.java.net" <hotspot-dev at openjdk.java.net>,
  "Simonis, Volker" <volker.simonis at sap.com>, Hiroshi H Horii <
  HORII at jp.ibm.com>, Gustavo Bueno Romero <gromero at br.ibm.com>
  Date: 2017/05/30 01:26
  Subject: RE: 8179527: Implement intrinsic code for reverseBytes with
  load/store








  Hi Michihiro,

  thanks for the improved webrev. This looks better, but I still have a
  couple of suggestions.

  1.
  I still don’t like match rules which contain nodes which do something
  else (even though direct matching is prohibited by predicate).
  I think it would be better to remove “match(…)”, “predicate(false)” and
  “ins_const(…)” and just describe the “effect()”. At least, I’m not aware
  of why a match rule should be needed for rldicl and extsh.

  2.
  I’d appreciate if you could remove “predicate
  (UseCountLeadingZerosInstructionsPPC64)” from all byte_reverse_... rules.
  They don’t make any sense (not your fault).

  3.
  The costs seem not to be set appropriately in the byte_reverse_... rules.
  E.g. instruction count * DEFAULT_COST would be better.

  4.
  The load/store byte reversed instructions should use the 2 operand form
  (no explicit 0 for R0 to support assertions).

  Maybe we can find a 2nd reviewer if you provide a new webrev. I can
  sponsor the change.

  Thanks and best regards,
  Martin


  From: Michihiro Horie [mailto:HORIE at jp.ibm.com]
  Sent: Montag, 8. Mai 2017 06:58
  To: Doerr, Martin <martin.doerr at sap.com>; Lindenmaier, Goetz <
  goetz.lindenmaier at sap.com>
  Cc: Gustavo Serra Scalet <gustavo.scalet at eldorado.org.br>;
  hotspot-dev at openjdk.java.net; ppc-aix-port-dev at openjdk.java.net; Simonis,
  Volker <volker.simonis at sap.com>; Hiroshi H Horii <HORII at jp.ibm.com>;
  Gustavo Bueno Romero <gromero at br.ibm.com>
  Subject: RE: 8179527: Implement intrinsic code for reverseBytes with
  load/store


  Dear Martin, Gustavo,

  Thank you very much for your helpful comments.

  Fixed code is
  http://cr.openjdk.java.net/~horii/8179527/webrev.01/

  Dear Goetz,
  Would you kindly review and sponsor this change?
  I heard you are a C2 compiler expert and Martin is out for a while.


  Best regards,
  --
  Michihiro,
  IBM Research - Tokyo

  Inactive hide details for "Doerr, Martin" ---2017/05/03 02:24:18---Hi
  Michihiro and Gustavo, thank you very much for implementi"Doerr, Martin"
  ---2017/05/03 02:24:18---Hi Michihiro and Gustavo, thank you very much
  for implementing this change.

  From: "Doerr, Martin" <martin.doerr at sap.com>
  To: Gustavo Serra Scalet <gustavo.scalet at eldorado.org.br>, Michihiro
  Horie/Japan/IBM at IBMJP
  Cc: "ppc-aix-port-dev at openjdk.java.net" <
  ppc-aix-port-dev at openjdk.java.net>, "hotspot-dev at openjdk.java.net" <
  hotspot-dev at openjdk.java.net>, "Simonis, Volker" <volker.simonis at sap.com>
  Date: 2017/05/03 02:24
  Subject: RE: 8179527: Implement intrinsic code for reverseBytes with
  load/store









  Hi Michihiro and Gustavo,

  thank you very much for implementing this change.

  @Gustavo: Thanks for taking a look.
  I think that the direct match rules are just there to satisfy
  match_rule_supported. They don't need to be fast, they are just a fall
  back solution.
  The goal is to exploit the byte reverse load and store instructions which
  should match in more performance critical cases.

  Now my review:

  assembler_ppc.hpp:
  Looks good except a minor formatting request:
  LDBRX_OPCODE = (31u << OPCODE_SHIFT | 532 << 1),
  should be
  LDBRX_OPCODE = (31u << OPCODE_SHIFT | 532u << 1),
  to be consistent.
  The comments // X-FORM should be aligned with the other ones.

  assembler_ppc.inline.hpp:
  Good.

  ppc.ad:
  I'm concerned about the additional match rules which are only used for
  the expand step. They could match directly leading to incorrect code.
  What they match is not what they do.
  I suggest to implement the code directly in the ins_encode. This would
  make the new code significantly shorter and less error prone.
  I think we don't need to optimize for Power6 anymore and newer processors
  shouldn't really suffer under a little less optimized instruction
  scheduling. Would you agree?

  Displacements may be too large for "li" so I suggest to use the
  "indirect" memory operand and let the compiler handle it. I know that it
  may increase latency because the compiler will need to insert an addition
  which could better be matched into the memory operand of the load which
  is harder to implement (it is possible to match an addition in an
  operand).


  Best regards,
  Martin


  -----Original Message-----
  From: Gustavo Serra Scalet [mailto:gustavo.scalet at eldorado.org.br]
  Sent: Dienstag, 2. Mai 2017 17:05
  To: Michihiro Horie <HORIE at jp.ibm.com>
  Cc: ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net;
  Simonis, Volker <volker.simonis at sap.com>; Doerr, Martin <
  martin.doerr at sap.com>
  Subject: RE: 8179527: Implement intrinsic code for reverseBytes with
  load/store

  Hi Michihiro,

  I wonder if there is no vectorized approach for implementing your
  "bytes_reverse_long_Ex" instruct on ppc.ad. Or did you avoid doing it so
  intentionally?

  > -----Original Message-----
  > From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-
  > bounces at openjdk.java.net] On Behalf Of Michihiro Horie
  > Sent: terça-feira, 2 de maio de 2017 11:47
  > To: ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net;
  > volker.simonis at sap.com; martin.doerr at sap.com
  > Subject: 8179527: Implement intrinsic code for reverseBytes with
  > load/store
  >
  > Dear all,
  >
  > Would you please review following change?
  >
  > Bug: https://bugs.openjdk.java.net/browse/JDK-8179527
  > Webrev: http://cr.openjdk.java.net/~horii/8179527/webrev.00/
  >
  > I added new intrinsic code for reverseBytes() in ppc.ad with
  > * match(Set dst (ReverseBytesI/L/US/S (LoadI src)));
  > * match(Set dst (StoreI dst (ReverseBytesI/L/US/S src)));
  >
  >
  > Best regards,
  > --
  > Michihiro,
  > IBM Research - Tokyo











-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/attachments/20170602/266fcc73/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/attachments/20170602/266fcc73/graycol-0001.gif>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: reverseBytesBench.tar.gz
Type: application/octet-stream
Size: 1714 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/attachments/20170602/266fcc73/reverseBytesBench.tar-0001.gz>


More information about the ppc-aix-port-dev mailing list