RFR(M): 8158232: PPC64: improve byte, int and long array copy stubs by using VSX instructions

Michihiro Horie HORIE at jp.ibm.com
Thu Jun 2 14:52:45 UTC 2016


Hi Martin,

Thank you for your thorough comments and for uploading the new webrev.

>Please check for trailing whitespaces in the future. hg jcheck complains
about them.
I will check with "hg jcheck" from now on. Thank you for the information.

Best regards,
--
Michi-hiro
IBM Research - Tokyo



From:	"Doerr, Martin" <martin.doerr at sap.com>
To:	Michihiro Horie/Japan/IBM at IBMJP, Gustavo Romero
            <gromero at linux.vnet.ibm.com>, "Lindenmaier, Goetz"
            <goetz.lindenmaier at sap.com>
Cc:	"ppc-aix-port-dev at openjdk.java.net"
            <ppc-aix-port-dev at openjdk.java.net>,
            "hotspot-dev at openjdk.java.net" <hotspot-dev at openjdk.java.net>
Date:	2016/06/02 18:35
Subject:	RE: RFR(M): 8158232: PPC64: improve byte,	int and long array
            copy stubs by using VSX instructions



Hi,

thanks for the update. New webrev is here:
http://cr.openjdk.java.net/~mdoerr/8158232_PPC_vsx_copy/webrev.01/

I will test and sponsor it if everybody is fine with it.

Please check for trailing whitespaces in the future. hg jcheck complains
about them.

I have an additional comment on the loop alignment:
32 byte alignment is also important for older processors (before Power 8).
In addition, 16 byte alignment may cause an instruction fetch penalty when
the 32 byte block crosses a cache line.
So I think just using 32 byte alignment is the right choice.
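
The alignment argument above can be checked with a little address
arithmetic. The following is only an illustrative sketch and not part of
the webrev: the 128-byte cache line size, the 32-byte block size, the
example address, and the helper names align_up and crosses_cache_line are
all assumptions chosen for the example.

```cpp
#include <cassert>
#include <cstdint>

// Assumed for illustration only: 128-byte cache lines, 32-byte loop block.
constexpr std::uint64_t kCacheLine = 128;
constexpr std::uint64_t kBlock     = 32;

// Round addr up to the next multiple of alignment (must be a power of two).
constexpr std::uint64_t align_up(std::uint64_t addr, std::uint64_t alignment) {
  return (addr + alignment - 1) & ~(alignment - 1);
}

// True if the block [start, start + kBlock) spans two cache lines.
constexpr bool crosses_cache_line(std::uint64_t start) {
  return (start / kCacheLine) != ((start + kBlock - 1) / kCacheLine);
}

// Hypothetical code address just before the loop head is padded.
constexpr std::uint64_t kPc = 0x1064;

// With 16-byte alignment the loop head lands at 0x1070, 112 bytes into a
// cache line, so a 32-byte block starting there spills into the next line.
static_assert(align_up(kPc, 16) == 0x1070, "16-byte aligned loop head");
static_assert(crosses_cache_line(align_up(kPc, 16)), "block crosses a line");

// With 32-byte alignment the loop head lands at 0x1080, and because 128 is
// a multiple of 32 the block can never cross a cache line.
static_assert(align_up(kPc, 32) == 0x1080, "32-byte aligned loop head");
static_assert(!crosses_cache_line(align_up(kPc, 32)), "block stays in line");
```

Under these assumptions, any 32-byte aligned start address keeps the block
inside a single cache line, while a 16-byte aligned one may not.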

Best regards,
Martin


From: ppc-aix-port-dev [mailto:ppc-aix-port-dev-bounces at openjdk.java.net]
On Behalf Of Michihiro Horie
Sent: Thursday, June 2, 2016 04:30
To: Gustavo Romero <gromero at linux.vnet.ibm.com>
Cc: ppc-aix-port-dev at openjdk.java.net; hotspot-dev at openjdk.java.net
Subject: Re: RFR(M): 8158232: PPC64: improve byte, int and long array copy
stubs by using VSX instructions



Hi Gustavo,

Thanks a lot for your detailed comments and reference. I followed your
short arraycopy implementation, which was very helpful for my
understanding.

Patch for Java 9:
(See attached file: hotspot_jdk9_hscomp.diff)

Best regards,
--
Michihiro Horie,
IBM Research - Tokyo


From: Gustavo Romero <gromero at linux.vnet.ibm.com>
To: Michihiro Horie/Japan/IBM at IBMJP, hotspot-dev at openjdk.java.net,
ppc-aix-port-dev at openjdk.java.net
Date: 2016/06/02 07:13
Subject: Re: RFR(M): 8158232: PPC64: improve byte, int and long array copy
stubs by using VSX instructions




Hi Michihiro

A few things that come to my mind that could help address the questions
raised by Goetz:

* I could not see, when implementing the short case, any gain by
unrolling the tight loop;
* I could see that setting an aggressive prefetch did help a lot;
* I think that aligning the backbranch target at 16-byte at least is
the right thing to do, since according to [1]:

"Instructions read out of the I-cache are forwarded to the IBuffer as a
staging area for group formation. The IBuffer is arranged as a register
file where each row can hold up to four instructions (16-byte aligned
from the I-cache)"

And a nit: add space, add upper case 'C', fix typo in "byte", and add an
ending dot on:

//copy 16 elements (total 128 byte) a time

Regards,
Gustavo

[1] POWER8 Processor User’s Manual for the Single-Chip Module,
10 March 2015, Version 1.11, p. 207, section 10.1.6.


On 31-05-2016 12:36, Michihiro Horie wrote:
>
> Dear all,
>
> Could you please review the following webrev?
>
> http://cr.openjdk.java.net/~mdoerr/8158232_PPC_vsx_copy/webrev.00/
>
> This change improves performance of disjoint arraycopy of byte, int, and
> long by using VSX load/store instructions.
>
> Discussion started from:
>
> http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/2016-May/002483.html
>
> Performance improvement with micro benchmarks is shown in:
>
> http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/2016-May/002531.html
>
> Thank you very much,
>
> Best regards,
> --
> Michihiro Horie,
> IBM Research - Tokyo
>

