8179527: Implement intrinsic code for reverseBytes with load/store

Volker Simonis volker.simonis at gmail.com
Fri Jun 2 12:11:17 UTC 2017


Hi Michihiro,

thanks a lot for providing your micro-benchmarks.
So, did you use PrintAssemby/PrintOtpoAssembly to verify that the new nodes
are matched?

Regards,
Volker


On Fri, Jun 2, 2017 at 11:28 AM, Michihiro Horie <HORIE at jp.ibm.com> wrote:

> Volker,
>
> Thank you very much for the 2nd review.
> Attached includes micro benchmarks I used for the verification.
> *(See attached file: reverseBytesBench.tar.gz)*
>
> Best regards,
> --
> Michihiro,
> IBM Research - Tokyo
>
> [image: Inactive hide details for Volker Simonis ---2017/06/02
> 17:04:16---Hi Michihiro, Martin, thanks a lot for the change Michihiro a]Volker
> Simonis ---2017/06/02 17:04:16---Hi Michihiro, Martin, thanks a lot for the
> change Michihiro and thanks for the thorough review
>
> From: Volker Simonis <volker.simonis at gmail.com>
> To: "Doerr, Martin" <martin.doerr at sap.com>
> Cc: Michihiro Horie <HORIE at jp.ibm.com>, Gustavo Bueno Romero <
> gromero at br.ibm.com>, Hiroshi H Horii <HORII at jp.ibm.com>, "
> hotspot-dev at openjdk.java.net" <hotspot-dev at openjdk.java.net>
> Date: 2017/06/02 17:04
> Subject: Re: 8179527: Implement intrinsic code for reverseBytes with
> load/store
>
> ------------------------------
>
>
>
> Hi Michihiro, Martin,
>
> thanks a lot for the change Michihiro and thanks for the thorough review
> Martin :)
>
> I'm just wondering if you had some test-cases to verify if the new nodes
> really match?
>
> Change looks good now.
>
> Regards,
> Volker
>
>
> On Thu, Jun 1, 2017 at 6:00 PM, Doerr, Martin <*martin.doerr at sap.com*
> <martin.doerr at sap.com>> wrote:
>
>    Hi Michihiro,
>
>
>
>    thank you very much for providing the new webrev (with support for
>    older Power processors, which was missing in webrev.02):
>
>    *http://cr.openjdk.java.net/~horii/8179527/webrev.03/*
>    <http://cr.openjdk.java.net/~horii/8179527/webrev.03/>
>
>
>
>    I have built and executed the VM on a Power6 machine. Thanks for
>    supporting it.
>
>    I would have spent only one flag for has_ldbrx/has_stdbrx, but I’m ok
>    with it. The change looks good to me and I will sponsor it after 2nd
>    review.
>
>
>
>    Best regards,
>
>    Martin
>
>
>
>
>
>    *From:* Doerr, Martin
> * Sent:* Mittwoch, 31. Mai 2017 15:54
> * To:* 'Michihiro Horie' <*HORIE at jp.ibm.com* <HORIE at jp.ibm.com>>; Volker
>    Simonis (*volker.simonis at gmail.com* <volker.simonis at gmail.com>) <
>    *volker.simonis at gmail.com* <volker.simonis at gmail.com>>
> * Cc:* Gustavo Bueno Romero <*gromero at br.ibm.com* <gromero at br.ibm.com>>;
>    Hiroshi H Horii <*HORII at jp.ibm.com* <HORII at jp.ibm.com>>;
>    *hotspot-dev at openjdk.java.net* <hotspot-dev at openjdk.java.net>
>
>    *Subject:* RE: 8179527: Implement intrinsic code for reverseBytes with
>    load/store
>
>
>
>    Hi Michihiro,
>
>
>
>    thank you for the new webrev with all of my suggestions. Looks good to
>    me.
>
>    There’s only a trailing whitespace which I had to remove to satisfy
>    “hg jcheck” (otherwise push attempt would get rejected).
>
>
>
>    We’ll run tests. Maybe we can get a 2nd review in the meantime?
>
>
>
>    Best regards,
>
>    Martin
>
>
>
>
>
>    *From:* Michihiro Horie [*mailto:HORIE at jp.ibm.com* <HORIE at jp.ibm.com>]
> * Sent:* Mittwoch, 31. Mai 2017 14:36
> * To:* Doerr, Martin <*martin.doerr at sap.com* <martin.doerr at sap.com>>
> * Cc:* Gustavo Bueno Romero <*gromero at br.ibm.com* <gromero at br.ibm.com>>;
>    Hiroshi H Horii <*HORII at jp.ibm.com* <HORII at jp.ibm.com>>;
>    *hotspot-dev at openjdk.java.net* <hotspot-dev at openjdk.java.net>;
>    Simonis, Volker <*volker.simonis at sap.com* <volker.simonis at sap.com>>;
>    *ppc-aix-port-dev at openjdk.java.net* <ppc-aix-port-dev at openjdk.java.net>
> * Subject:* RE: 8179527: Implement intrinsic code for reverseBytes with
>    load/store
>
>
>
>    Martin,
>
>    Thank you very much for your helpful comments and sponsoring this
>    change.
>
>    Would you review the latest change?
> *http://cr.openjdk.java.net/~horii/8179527/webrev.02/*
>    <http://cr.openjdk.java.net/~horii/8179527/webrev.02/>
>
>    Best regards,
>    --
>    Michihiro,
>    IBM Research - Tokyo
>
>    [image: Inactive hide details for "Doerr, Martin" ---2017/05/30
>    01:26:01---Hi Michihiro, thanks for the improved webrev. This looks bet]"Doerr,
>    Martin" ---2017/05/30 01:26:01---Hi Michihiro, thanks for the improved
>    webrev. This looks better, but I still have a couple of sugges
>
>    From: "Doerr, Martin" <*martin.doerr at sap.com* <martin.doerr at sap.com>>
>    To: Michihiro Horie <*HORIE at jp.ibm.com* <HORIE at jp.ibm.com>>
>    Cc: "*hotspot-dev at openjdk.java.net* <hotspot-dev at openjdk.java.net>" <
>    *hotspot-dev at openjdk.java.net* <hotspot-dev at openjdk.java.net>>,
>    "Simonis, Volker" <*volker.simonis at sap.com* <volker.simonis at sap.com>>,
>    Hiroshi H Horii <*HORII at jp.ibm.com* <HORII at jp.ibm.com>>, Gustavo Bueno
>    Romero <*gromero at br.ibm.com* <gromero at br.ibm.com>>
>    Date: 2017/05/30 01:26
>    Subject: RE: 8179527: Implement intrinsic code for reverseBytes with
>    load/store
>
>    ------------------------------
>
>
>
>
>    Hi Michihiro,
>
>    thanks for the improved webrev. This looks better, but I still have a
>    couple of suggestions.
>
>    1.
>    I still don’t like match rules which contain nodes which do something
>    else (even though direct matching is prohibited by predicate).
>    I think it would be better to remove “match(…)”, “predicate(false)”
>    and “ins_const(…)” and just describe the “effect()”. At least, I’m not
>    aware of why a match rule should be needed for rldicl and extsh.
>
>    2.
>    I’d appreciate if you could remove “predicate(
>    UseCountLeadingZerosInstructionsPPC64)” from all byte_reverse_...
>    rules. They don’t make any sense (not your fault).
>
>    3.
>    The costs seem not to be set appropriately in the byte_reverse_...
>    rules. E.g. instruction count * DEFAULT_COST would be better.
>
>    4.
>    The load/store byte reversed instructions should use the 2 operand
>    form (no explicit 0 for R0 to support assertions).
>
>    Maybe we can find a 2nd reviewer if you provide a new webrev. I can
>    sponsor the change.
>
>    Thanks and best regards,
>    Martin
>
>
> * From:* Michihiro Horie [*mailto:HORIE at jp.ibm.com* <HORIE at jp.ibm.com>]
> * Sent:* Montag, 8. Mai 2017 06:58
> * To:* Doerr, Martin <*martin.doerr at sap.com* <martin.doerr at sap.com>>;
>    Lindenmaier, Goetz <*goetz.lindenmaier at sap.com*
>    <goetz.lindenmaier at sap.com>>
> * Cc:* Gustavo Serra Scalet <*gustavo.scalet at eldorado.org.br*
>    <gustavo.scalet at eldorado.org.br>>; *hotspot-dev at openjdk.java.net*
>    <hotspot-dev at openjdk.java.net>; *ppc-aix-port-dev at openjdk.java.net*
>    <ppc-aix-port-dev at openjdk.java.net>; Simonis, Volker <
>    *volker.simonis at sap.com* <volker.simonis at sap.com>>; Hiroshi H Horii <
>    *HORII at jp.ibm.com* <HORII at jp.ibm.com>>; Gustavo Bueno Romero <
>    *gromero at br.ibm.com* <gromero at br.ibm.com>>
> * Subject:* RE: 8179527: Implement intrinsic code for reverseBytes with
>    load/store
>
>    Dear Martin, Gustavo,
>
>    Thank you very much for your helpful comments.
>
>    Fixed code is
> *http://cr.openjdk.java.net/~horii/8179527/webrev.01/*
>    <http://cr.openjdk.java.net/~horii/8179527/webrev.01/>
>
>    Dear Goetz,
>    Would you kindly review and sponsor this change?
>    I heard you are a C2 compiler expert and Martin is out for a while.
>
>
>    Best regards,
>    --
>    Michihiro,
>    IBM Research - Tokyo
>
>    [image: Inactive hide details for "Doerr, Martin" ---2017/05/03
>    02:24:18---Hi Michihiro and Gustavo, thank you very much for implementi]"Doerr,
>    Martin" ---2017/05/03 02:24:18---Hi Michihiro and Gustavo, thank you very
>    much for implementing this change.
>
>    From: "Doerr, Martin" <*martin.doerr at sap.com* <martin.doerr at sap.com>>
>    To: Gustavo Serra Scalet <*gustavo.scalet at eldorado.org.br*
>    <gustavo.scalet at eldorado.org.br>>, Michihiro Horie/Japan/IBM at IBMJP
>    Cc: "*ppc-aix-port-dev at openjdk.java.net*
>    <ppc-aix-port-dev at openjdk.java.net>" <
>    *ppc-aix-port-dev at openjdk.java.net* <ppc-aix-port-dev at openjdk.java.net>>,
>    "*hotspot-dev at openjdk.java.net* <hotspot-dev at openjdk.java.net>" <
>    *hotspot-dev at openjdk.java.net* <hotspot-dev at openjdk.java.net>>,
>    "Simonis, Volker" <*volker.simonis at sap.com* <volker.simonis at sap.com>>
>    Date: 2017/05/03 02:24
>    Subject: RE: 8179527: Implement intrinsic code for reverseBytes with
>    load/store
>
>    ------------------------------
>
>
>
>
>
>    Hi Michihiro and Gustavo,
>
>    thank you very much for implementing this change.
>
>    @Gustavo: Thanks for taking a look.
>    I think that the direct match rules are just there to satisfy
>    match_rule_supported. They don't need to be fast, they are just a fall back
>    solution.
>    The goal is to exploit the byte reverse load and store instructions
>    which should match in more performance critical cases.
>
>    Now my review:
>
>    assembler_ppc.hpp:
>    Looks good except a minor formatting request:
>    LDBRX_OPCODE = (31u << OPCODE_SHIFT | 532 << 1),
>    should be
>    LDBRX_OPCODE = (31u << OPCODE_SHIFT | 532u << 1),
>    to be consistent.
>    The comments // X-FORM should be aligned with the other ones.
>
>    assembler_ppc.inline.hpp:
>    Good.
>
> *ppc.ad* <http://ppc.ad/>:
>    I'm concerned about the additional match rules which are only used for
>    the expand step. They could match directly leading to incorrect code. What
>    they match is not what they do.
>    I suggest to implement the code directly in the ins_encode. This would
>    make the new code significantly shorter and less error prone.
>    I think we don't need to optimize for Power6 anymore and newer
>    processors shouldn't really suffer under a little less optimized
>    instruction scheduling. Would you agree?
>
>    Displacements may be too large for "li" so I suggest to use the
>    "indirect" memory operand and let the compiler handle it. I know that it
>    may increase latency because the compiler will need to insert an addition
>    which could better be matched into the memory operand of the load which is
>    harder to implement (it is possible to match an addition in an operand).
>
>
>    Best regards,
>    Martin
>
>
>    -----Original Message-----
>    From: Gustavo Serra Scalet [*mailto:gustavo.scalet at eldorado.org.br*
>    <gustavo.scalet at eldorado.org.br>]
>    Sent: Dienstag, 2. Mai 2017 17:05
>    To: Michihiro Horie <*HORIE at jp.ibm.com* <HORIE at jp.ibm.com>>
>    Cc: *ppc-aix-port-dev at openjdk.java.net*
>    <ppc-aix-port-dev at openjdk.java.net>; *hotspot-dev at openjdk.java.net*
>    <hotspot-dev at openjdk.java.net>; Simonis, Volker <
>    *volker.simonis at sap.com* <volker.simonis at sap.com>>; Doerr, Martin <
>    *martin.doerr at sap.com* <martin.doerr at sap.com>>
>    Subject: RE: 8179527: Implement intrinsic code for reverseBytes with
>    load/store
>
>    Hi Michihiro,
>
>    I wonder if there is no vectorized approach for implementing your
>    "bytes_reverse_long_Ex" instruct on *ppc.ad* <http://ppc.ad/>. Or did
>    you avoid doing it so intentionally?
>
>    > -----Original Message-----
>    > From: ppc-aix-port-dev [*mailto:ppc-aix-port-dev-*
>    <ppc-aix-port-dev->
>    > *bounces at openjdk.java.net* <bounces at openjdk.java.net>] On Behalf Of
>    Michihiro Horie
>    > Sent: terça-feira, 2 de maio de 2017 11:47
>    > To: *ppc-aix-port-dev at openjdk.java.net*
>    <ppc-aix-port-dev at openjdk.java.net>; *hotspot-dev at openjdk.java.net*
>    <hotspot-dev at openjdk.java.net>;
>    > *volker.simonis at sap.com* <volker.simonis at sap.com>;
>    *martin.doerr at sap.com* <martin.doerr at sap.com>
>    > Subject: 8179527: Implement intrinsic code for reverseBytes with
>    > load/store
>    >
>    > Dear all,
>    >
>    > Would you please review following change?
>    >
>    > Bug: *https://bugs.openjdk.java.net/browse/JDK-8179527*
>    <https://bugs.openjdk.java.net/browse/JDK-8179527>
>    > Webrev: *http://cr.openjdk.java.net/~horii/8179527/webrev.00/*
>    <http://cr.openjdk.java.net/~horii/8179527/webrev.00/>
>    >
>    > I added new intrinsic code for reverseBytes() in *ppc.ad*
>    <http://ppc.ad/> with
>    > * match(Set dst (ReverseBytesI/L/US/S (LoadI src)));
>    > * match(Set dst (StoreI dst (ReverseBytesI/L/US/S src)));
>    >
>    >
>    > Best regards,
>    > --
>    > Michihiro,
>    > IBM Research - Tokyo
>
>
>
>
>
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/attachments/20170602/cac3bf18/attachment-0001.html>
-------------- next part --------------
A non-text attachment was scrubbed...
Name: graycol.gif
Type: image/gif
Size: 105 bytes
Desc: not available
URL: <http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/attachments/20170602/cac3bf18/graycol-0001.gif>


More information about the ppc-aix-port-dev mailing list