From HORIE at jp.ibm.com Thu Apr 2 14:27:10 2020
From: HORIE at jp.ibm.com (Michihiro Horie)
Date: Thu, 2 Apr 2020 23:27:10 +0900
Subject: RFR[S]:8241874 [PPC64] Improve performance of Long.reverseBytes() and Integer.reverseBytes() on Power9
In-Reply-To: <0079874e-7bc2-5ff4-f004-337c718ec6df@linux.ibm.com>
References: <0079874e-7bc2-5ff4-f004-337c718ec6df@linux.ibm.com>
Message-ID:

Hi Corey,

I'm not a reviewer, but I can run your benchmark on my local P9 node if you share it.

Best regards,
Michihiro

----- Original message -----
From: Corey Ashford
Sent by: "hotspot-compiler-dev"
To: hotspot-compiler-dev at openjdk.java.net
Cc:
Subject: [EXTERNAL] RFR[S]:8241874 [PPC64] Improve performance of Long.reverseBytes() and Integer.reverseBytes() on Power9
Date: Tue, Mar 31, 2020 7:52 AM

Hello,

This is my first OpenJDK patch for review. It increases the performance of byte reversal for Integer.reverseBytes() and Long.reverseBytes() on Power9 via its VSX xxbrw and xxbrd vector instructions.

https://bugs.openjdk.java.net/browse/JDK-8241874
http://cr.openjdk.java.net/~gromero/8241874/v1/

I have tested on Power9 and see a 38%+ performance improvement on Long.reverseBytes() and 15%+ on Integer.reverseBytes(). (I add the + because the benchmark code has a fair amount of fixed overhead.) Testing on Power8 reveals no regressions.

I believe the patch itself is pretty self-explanatory. It adds definitions for four instructions that are needed to get the data in and out of the vector registers, and to perform the reversal operation, and it adds the instructs to use them.
Also VM_Version::initialize() autodetects whether the instructions are available, and warns on attempts to set the UseVectorByteReverseInstructionsPPC64 flag on earlier Power processors that don't possess these PowerISA 3.0 instructions.

Thanks to Michihiro Horie, Jose Ricardo Ziviani, and Gustavo Romero for their help!

Please review this patch.

Thanks for your consideration,

Corey Ashford

From cjashfor at linux.ibm.com Thu Apr 2 23:07:31 2020
From: cjashfor at linux.ibm.com (Corey Ashford)
Date: Thu, 2 Apr 2020 16:07:31 -0700
Subject: RFR[S]:8241874 [PPC64] Improve performance of Long.reverseBytes() and Integer.reverseBytes() on Power9
In-Reply-To:
References: <0079874e-7bc2-5ff4-f004-337c718ec6df@linux.ibm.com>
Message-ID: <67fa8056-a8ed-cdfc-1e5a-d36b49c4af18@linux.ibm.com>

On 4/2/20 7:27 AM, Michihiro Horie wrote:
> Hi Corey,
>
> I'm not a reviewer, but I can run your benchmark in my local P9 node if
> you share it.
>
> Best regards,
> Michihiro

The tests are somewhat hokey; I added the shifts to keep the compiler from hoisting code whose result it could predetermine.
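For reference, the operation these tests exercise can be written out in plain Java. What follows is only an illustrative sketch, not part of the patch or of the benchmarks: the class name ReverseBytesSketch and the helpers swap32 and swap64 are made up for this example, and the shift-and-mask code simply mirrors what Integer.reverseBytes() and Long.reverseBytes() are documented to return.

```java
public class ReverseBytesSketch {
    // Hypothetical helper: reverse the four bytes of an int with shifts and masks.
    static int swap32(int x) {
        return (x << 24)
             | ((x & 0x0000ff00) << 8)
             | ((x >>> 8) & 0x0000ff00)
             | (x >>> 24);
    }

    // Hypothetical helper: reverse the eight bytes of a long by swapping
    // adjacent bytes, then 16-bit groups, then the two 32-bit halves.
    static long swap64(long x) {
        x = ((x & 0x00ff00ff00ff00ffL) << 8) | ((x >>> 8) & 0x00ff00ff00ff00ffL);
        x = ((x & 0x0000ffff0000ffffL) << 16) | ((x >>> 16) & 0x0000ffff0000ffffL);
        return (x << 32) | (x >>> 32);
    }

    public static void main(String[] args) {
        long l = 0x1122334455667788L;
        int i = 0x11223344;
        // Library results for the same constants the benchmarks start from.
        System.out.println(String.format("%016x", Long.reverseBytes(l)));   // 8877665544332211
        System.out.println(String.format("%08x", Integer.reverseBytes(i))); // 44332211
        // The hand-written helpers agree with the library methods.
        System.out.println(swap64(l) == Long.reverseBytes(l));    // true
        System.out.println(swap32(i) == Integer.reverseBytes(i)); // true
    }
}
```

The helpers are there only to show the semantics; the benchmarks call the library methods directly, since those are the calls the patch targets with xxbrw and xxbrd on Power9.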
Here's the one for Long.reverseBytes():

import java.lang.*;

class ReverseLong
{
    public static void main(String args[])
    {
        long reversed, re_reversed;
        long accum = 0;
        long orig = 0x1122334455667788L;
        long start = System.currentTimeMillis();
        for (int i = 0; i < 1_000_000_000; i++) {
            // Try to keep java from figuring out stuff in advance
            reversed = Long.reverseBytes(orig);
            re_reversed = Long.reverseBytes(reversed);
            if (re_reversed != orig) {
                System.out.println("Orig: " + String.format("%16x", orig) +
                    " Re-reversed: " + String.format("%16x", re_reversed));
            }
            accum += orig;
            orig = Long.rotateRight(orig, 3);
        }
        System.out.println("Elapsed time: " +
            Long.toString(System.currentTimeMillis() - start));
        System.out.println("accum: " + Long.toString(accum));
    }
}

And the one for Integer.reverseBytes():

import java.lang.*;

class ReverseInt
{
    public static void main(String args[])
    {
        int reversed, re_reversed;
        int orig = 0x11223344;
        int accum = 0;
        long start = System.currentTimeMillis();
        for (int i = 0; i < 1_000_000_000; i++) {
            // Try to keep java from figuring out stuff in advance
            reversed = Integer.reverseBytes(orig);
            re_reversed = Integer.reverseBytes(reversed);
            if (re_reversed != orig) {
                System.out.println("Orig: " + String.format("%08x", orig) +
                    " Re-reversed: " + String.format("%08x", re_reversed));
            }
            accum += orig;
            orig = Integer.rotateRight(orig, 3);
        }
        System.out.println("Elapsed time: " +
            Long.toString(System.currentTimeMillis() - start));
        System.out.println("accum: " + Integer.toString(accum));
    }
}

From HORIE at jp.ibm.com Fri Apr 10 08:47:42 2020
From: HORIE at jp.ibm.com (Michihiro Horie)
Date: Fri, 10 Apr 2020 17:47:42 +0900
Subject: RFR[S]:8241874 [PPC64] Improve performance of Long.reverseBytes() and Integer.reverseBytes() on Power9
In-Reply-To: <67fa8056-a8ed-cdfc-1e5a-d36b49c4af18@linux.ibm.com>
References: <67fa8056-a8ed-cdfc-1e5a-d36b49c4af18@linux.ibm.com>, <0079874e-7bc2-5ff4-f004-337c718ec6df@linux.ibm.com>
Message-ID:

Hi Corey,

Thank you for sharing your
benchmarks. I confirmed your change reduced the elapsed time of the benchmarks by more than 30% on my P9 node. Also, I checked the JTREG results, which show no problems.

BTW, I cannot find further points of improvement in your change.

Best regards,
Michihiro

From martin.doerr at sap.com Tue Apr 14 13:26:08 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 14 Apr 2020 13:26:08 +0000
Subject: RFR[S]:8241874 [PPC64] Improve performance of Long.reverseBytes() and Integer.reverseBytes() on Power9
In-Reply-To:
References: <67fa8056-a8ed-cdfc-1e5a-d36b49c4af18@linux.ibm.com>, <0079874e-7bc2-5ff4-f004-337c718ec6df@linux.ibm.com>
Message-ID:

Hi Corey,

thanks for contributing it. Looks good to me. I'll run it through our testing and let you know about the results.

Best regards,
Martin

From martin.doerr at sap.com Tue Apr 14 14:07:06 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 14 Apr 2020 14:07:06 +0000
Subject: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range
Message-ID:

Hi,

I'd like to resolve a very old PPC64 issue:
https://bugs.openjdk.java.net/browse/JDK-8151030

There's code for AllocatePrefetchStyle=4, which is not an accepted option. It was used for a special experimental prefetch mode using dcbz instructions to combine prefetching and zeroing in the TLABs. However, this code was never contributed and there are no plans to work on it. So I'd like to simply remove this small part of it.

In addition to that, AllocatePrefetchLines is currently set to 3 by default, which doesn't make sense to me. PPC64 has an automatic prefetch engine, and executing several prefetch instructions for succeeding cache lines doesn't seem to be beneficial at all. So I'm setting it to 1 by default. I couldn't observe regressions on Power7, Power8 and Power9.

Webrev:
http://cr.openjdk.java.net/~mdoerr/8151030_ppc_prefetch/webrev.00/

Please review.

If somebody from IBM would like to check the performance impact of changing the AllocatePrefetchLines + Distance, I'll be glad to receive feedback.

Best regards,
Martin

From martin.doerr at sap.com Wed Apr 15 12:33:16 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 15 Apr 2020 12:33:16 +0000
Subject: RFR[S]:8241874 [PPC64] Improve performance of Long.reverseBytes() and Integer.reverseBytes() on Power9
In-Reply-To:
References: <67fa8056-a8ed-cdfc-1e5a-d36b49c4af18@linux.ibm.com>, <0079874e-7bc2-5ff4-f004-337c718ec6df@linux.ibm.com>
Message-ID:

Hi again,

testing didn't show any new issues. Only the copyright years should get updated before pushing.

Is there already a sponsor or do you want me to push it?

Best regards,
Martin

From cjashfor at linux.ibm.com Thu Apr 16 01:34:46 2020
From: cjashfor at linux.ibm.com (Corey Ashford)
Date: Wed, 15 Apr 2020 18:34:46 -0700
Subject: RFR[S]:8241874 [PPC64] Improve performance of Long.reverseBytes() and Integer.reverseBytes() on Power9
In-Reply-To:
References: <67fa8056-a8ed-cdfc-1e5a-d36b49c4af18@linux.ibm.com> <0079874e-7bc2-5ff4-f004-337c718ec6df@linux.ibm.com>
Message-ID: <1964be00-8926-7a70-d23a-2f7e85eb4ef3@linux.ibm.com>

Hello Martin,

I'm having some trouble with my email server, so I'm having to reply to your earlier post, but I saw your most recent post on the mailing list archive.

Thanks for reviewing and testing this patch. I went to look at the copyright dates, and see two date ranges: one for Oracle and its affiliates, and another for SAP. In the files I looked at, the end date wasn't the same between the two. Which one (or both) should I modify?

Thanks,

- Corey

From martin.doerr at sap.com Thu Apr 16 08:08:24 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Thu, 16 Apr 2020 08:08:24 +0000
Subject: RFR[S]:8241874 [PPC64] Improve performance of Long.reverseBytes() and Integer.reverseBytes() on Power9
In-Reply-To: <1964be00-8926-7a70-d23a-2f7e85eb4ef3@linux.ibm.com>
References: <67fa8056-a8ed-cdfc-1e5a-d36b49c4af18@linux.ibm.com> <0079874e-7bc2-5ff4-f004-337c718ec6df@linux.ibm.com> <1964be00-8926-7a70-d23a-2f7e85eb4ef3@linux.ibm.com>
Message-ID:

Hi Corey,

please use 2020 for both the Oracle and the SAP copyright.
Usually, both should be the same, but some people forget to update one of them.

Best regards,
Martin

From cjashfor at linux.ibm.com Mon Apr 20 20:39:33 2020
From: cjashfor at linux.ibm.com (Corey Ashford)
Date: Mon, 20 Apr 2020 13:39:33 -0700
Subject: RFR[S]:8241874 [PPC64] Improve performance of Long.reverseBytes() and Integer.reverseBytes() on Power9
In-Reply-To:
References: <67fa8056-a8ed-cdfc-1e5a-d36b49c4af18@linux.ibm.com> <0079874e-7bc2-5ff4-f004-337c718ec6df@linux.ibm.com> <1964be00-8926-7a70-d23a-2f7e85eb4ef3@linux.ibm.com>
Message-ID: <13786032-d4e9-9682-5cd7-698ceb4f8c00@linux.ibm.com>

Hi Martin,

Sorry for the delay in getting the copyright changes in (I work half time). Here's the revised patch, with all copyright dates set to 2020:

https://bugs.openjdk.java.net/browse/JDK-8241874
http://cr.openjdk.java.net/~gromero/8241874/v2/

Thanks for your consideration,

- Corey

From HORIE at jp.ibm.com Tue Apr 21 13:21:32 2020
From: HORIE at jp.ibm.com (Michihiro Horie)
Date: Tue, 21 Apr 2020 22:21:32 +0900
Subject: RFR[S]:8241874 [PPC64] Improve performance of Long.reverseBytes() and Integer.reverseBytes() on Power9
In-Reply-To: <13786032-d4e9-9682-5cd7-698ceb4f8c00@linux.ibm.com>
References: <13786032-d4e9-9682-5cd7-698ceb4f8c00@linux.ibm.com>, <67fa8056-a8ed-cdfc-1e5a-d36b49c4af18@linux.ibm.com> <0079874e-7bc2-5ff4-f004-337c718ec6df@linux.ibm.com> <1964be00-8926-7a70-d23a-2f7e85eb4ef3@linux.ibm.com>
Message-ID:

Hi Corey, Martin,

I confirmed the latest webrev fixes the copyright years properly, so the change looks ready to be pushed. I will push the change by tomorrow.

Best regards,
Michihiro

----- Original message -----
From: "Corey Ashford"
To: "Doerr, Martin"
Cc: Michihiro Horie/Japan/IBM at IBMJP, "hotspot-compiler-dev at openjdk.java.net", "ppc-aix-port-dev at openjdk.java.net"
Subject: Re: RFR[S]:8241874 [PPC64] Improve performance of Long.reverseBytes() and Integer.reverseBytes() on Power9
Date: Tue, Apr 21, 2020 5:39 AM

Hi Martin,

Sorry for the delay on getting the copyright changes in (I work half time). Here's the revised patch, with all copyright dates set to 2020:

https://bugs.openjdk.java.net/browse/JDK-8241874
http://cr.openjdk.java.net/~gromero/8241874/v2/

Thanks for your consideration,

- Corey

On 4/16/20 1:08 AM, Doerr, Martin wrote:
> Hi Corey,
>
> please use 2020 for both, the Oracle and the SAP copyright.
> Usually, both should be the same, but some people forget to update one of them.
>
> Best regards,
> Martin
>
>
>> -----Original Message-----
>> From: Corey Ashford
>> Sent: Thursday, 16.
April 2020 03:35
>> To: Doerr, Martin
>> Cc: Michihiro Horie; hotspot-compiler-dev at openjdk.java.net;
>> ppc-aix-port-dev at openjdk.java.net
>> Subject: Re: RFR[S]:8241874 [PPC64] Improve performance of
>> Long.reverseBytes() and Integer.reverseBytes() on Power9
>>
>> Hello Martin,
>>
>> I'm having some trouble with my email server, so I'm having to reply to
>> your earlier post, but I saw your most recent post on the mailing list
>> archive.
>>
>> Thanks for reviewing and testing this patch. I went to look at the
>> copyright dates, and see two date ranges: one for Oracle and its
>> affiliates, and another for SAP. In the files I looked at, the end date
>> wasn't the same between the two. Which one (or both) should I modify?
>>
>> Thanks,
>>
>> - Corey
>>
>> On 4/14/20 6:26 AM, Doerr, Martin wrote:
>>> Hi Corey,
>>>
>>> thanks for contributing it. Looks good to me. I'll run it through our
>>> testing and let you know about the results.
>>>
>>> Best regards,
>>>
>>> Martin
>>>
>>> *From:* ppc-aix-port-dev *On Behalf Of* Michihiro Horie
>>> *Sent:* Friday, 10 April 2020 10:48
>>> *To:* cjashfor at linux.ibm.com
>>> *Cc:* hotspot-compiler-dev at openjdk.java.net;
>>> ppc-aix-port-dev at openjdk.java.net
>>> *Subject:* Re: RFR[S]:8241874 [PPC64] Improve performance of
>>> Long.reverseBytes() and Integer.reverseBytes() on Power9
>>>
>>> Hi Corey,
>>>
>>> Thank you for sharing your benchmarks. I confirmed your change reduced
>>> the elapsed time of the benchmarks by more than 30% on my P9 node. Also,
>>> I checked the JTREG results, which show no problems.
>>>
>>> BTW, I cannot find further points of improvement in your change.
>>>
>>> Best regards,
>>> Michihiro
>>>
>>>
>>> ----- Original message -----
>>> From: "Corey Ashford"
>>> To: Michihiro Horie/Japan/IBM at IBMJP
>>> Cc: hotspot-compiler-dev at openjdk.java.net,
>>> ppc-aix-port-dev at openjdk.java.net, "Gustavo Romero"
>>> Subject: Re: RFR[S]:8241874 [PPC64] Improve performance of
>>> Long.reverseBytes() and Integer.reverseBytes() on Power9
>>> Date: Fri, Apr 3, 2020 8:07 AM
>>>
>>> On 4/2/20 7:27 AM, Michihiro Horie wrote:
>>>> Hi Corey,
>>>>
>>>> I'm not a reviewer, but I can run your benchmark in my local P9 node if
>>>> you share it.
>>>>
>>>> Best regards,
>>>> Michihiro
>>>
>>> The tests are somewhat hokey; I added the shifts to keep the compiler
>>> from hoisting the code that it could predetermine the result.
>>>
>>> Here's the one for Long.reverseBytes():
>>>
>>> import java.lang.*;
>>>
>>> class ReverseLong
>>> {
>>>     public static void main(String args[])
>>>     {
>>>         long reversed, re_reversed;
>>>         long accum = 0;
>>>         long orig = 0x1122334455667788L;
>>>         long start = System.currentTimeMillis();
>>>         for (int i = 0; i < 1_000_000_000; i++) {
>>>             // Try to keep java from figuring out stuff in advance
>>>             reversed = Long.reverseBytes(orig);
>>>             re_reversed = Long.reverseBytes(reversed);
>>>             if (re_reversed != orig) {
>>>                 System.out.println("Orig: " + String.format("%16x", orig) +
>>>                     " Re-reversed: " + String.format("%16x", re_reversed));
>>>             }
>>>             accum += orig;
>>>             orig = Long.rotateRight(orig, 3);
>>>         }
>>>         System.out.println("Elapsed time: " +
>>>             Long.toString(System.currentTimeMillis() - start));
>>>         System.out.println("accum: " + Long.toString(accum));
>>>     }
>>> }
>>>
>>>
>>> And the one for Integer.reverseBytes():
>>>
>>> import java.lang.*;
>>>
>>> class ReverseInt
>>> {
>>>     public static void main(String args[])
>>>     {
>>>         int reversed, re_reversed;
>>>         int orig = 0x11223344;
>>>         int accum = 0;
>>>         long start = System.currentTimeMillis();
>>>         for (int i = 0; i < 1_000_000_000; i++) {
>>>             // Try to keep java from figuring out stuff in advance
>>>             reversed = Integer.reverseBytes(orig);
>>>             re_reversed = Integer.reverseBytes(reversed);
>>>             if (re_reversed != orig) {
>>>                 System.out.println("Orig: " + String.format("%08x", orig) +
>>>                     " Re-reversed: " + String.format("%08x", re_reversed));
>>>             }
>>>             accum += orig;
>>>             orig = Integer.rotateRight(orig, 3);
>>>         }
>>>         System.out.println("Elapsed time: " +
>>>             Long.toString(System.currentTimeMillis() - start));
>>>         System.out.println("accum: " + Integer.toString(accum));
>>>     }
>>> }
>>>
>

From HORIE at jp.ibm.com Tue Apr 21 14:57:30 2020
From: HORIE at jp.ibm.com (Michihiro Horie)
Date: Tue, 21 Apr 2020 23:57:30 +0900
Subject: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range
In-Reply-To:
References:
Message-ID:

Hi Martin,

I started measuring SPECjbb2015 to see the performance impact on P9. Also, I'm preparing the same measurement on P8.

Best regards,
Michihiro

----- Original message -----
From: "Doerr, Martin"
To: "'hotspot-compiler-dev at openjdk.java.net'"
Cc: Michihiro Horie, "cjashfor at linux.ibm.com", "ppc-aix-port-dev at openjdk.java.net", Gustavo Romero, "joserz at linux.ibm.com"
Subject: [EXTERNAL] RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range
Date: Tue, Apr 14, 2020 11:07 PM

Hi,

I'd like to resolve a very old PPC64 issue:
https://bugs.openjdk.java.net/browse/JDK-8151030

There's code for AllocatePrefetchStyle=4 which is not an accepted option.
It was used for a special experimental prefetch mode using dcbz
instructions to combine prefetching and zeroing in the TLABs.
However, this code was never contributed and there are no plans to work on
it. So I'd like to simply remove this small part of it.

In addition to that, AllocatePrefetchLines is currently set to 3 by default which doesn't make sense to me.
PPC64 has an automatic prefetch engine and executing several prefetch
instructions for succeeding cache lines doesn't seem to be beneficial at all.
So I'm setting it to 1 by default. I couldn't observe regressions on
Power7, Power8 and Power9.

Webrev:
http://cr.openjdk.java.net/~mdoerr/8151030_ppc_prefetch/webrev.00/

Please review.

If somebody from IBM would like to check performance impact of changing
the AllocatePrefetchLines + Distance, I'll be glad to receive feedback.

Best regards,
Martin

From lutz.schmidt at sap.com Wed Apr 22 18:01:44 2020
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Wed, 22 Apr 2020 18:01:44 +0000
Subject: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range
In-Reply-To:
References:
Message-ID: <0737AF50-4DED-4680-8629-47140DD2A7A6@sap.com>

Hi Martin,

your change looks good to me.

I noticed you didn't find a chance to put it in the patch queue for our internal testing. I did that now, but it's too late for tonight. We'll have to wait until Friday morning (GMT+2) to really see what I expect: no issues.

Thanks for cleaning up this old stuff.

Regards,
Lutz

On 21.04.20, 16:57, "hotspot-compiler-dev on behalf of Michihiro Horie" wrote:

    Hi Martin,

    I started measuring SPECjbb2015 to see the performance impact on P9. Also,
    I'm preparing the same measurement on P8.

    Best regards,
    Michihiro

     ----- Original message -----
     From: "Doerr, Martin"
     To: "'hotspot-compiler-dev at openjdk.java.net'"
     Cc: Michihiro Horie, "cjashfor at linux.ibm.com",
     "ppc-aix-port-dev at openjdk.java.net", Gustavo Romero,
     "joserz at linux.ibm.com"
     Subject: [EXTERNAL] RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is
     out of range
     Date: Tue, Apr 14, 2020 11:07 PM

     Hi,

     I'd like to resolve a very old PPC64 issue:
     https://bugs.openjdk.java.net/browse/JDK-8151030

     There's code for AllocatePrefetchStyle=4 which is not an accepted option.
     It was used for a special experimental prefetch mode using dcbz
     instructions to combine prefetching and zeroing in the TLABs.
     However, this code was never contributed and there are no plans to work on
     it. So I'd like to simply remove this small part of it.

     In addition to that, AllocatePrefetchLines is currently set to 3 by
     default which doesn't make sense to me. PPC64 has an automatic prefetch
     engine and executing several prefetch instructions for succeeding cache
     lines doesn't seem to be beneficial at all.
     So I'm setting it to 1 by default. I couldn't observe regressions on
     Power7, Power8 and Power9.

     Webrev:
     http://cr.openjdk.java.net/~mdoerr/8151030_ppc_prefetch/webrev.00/

     Please review.

     If somebody from IBM would like to check performance impact of changing
     the AllocatePrefetchLines + Distance, I'll be glad to receive feedback.

     Best regards,
     Martin

From HORIE at jp.ibm.com Fri Apr 24 05:40:00 2020
From: HORIE at jp.ibm.com (Michihiro Horie)
Date: Fri, 24 Apr 2020 14:40:00 +0900
Subject: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range
In-Reply-To: <0737AF50-4DED-4680-8629-47140DD2A7A6@sap.com>
References: <0737AF50-4DED-4680-8629-47140DD2A7A6@sap.com>,
Message-ID:

Hi Martin, Lutz,

I have not seen big differences in SPECjbb2015 scores on either P8 or P9.

Best regards,
Michihiro

----- Original message -----
From: "Schmidt, Lutz"
To: Michihiro Horie, "Doerr, Martin"
Cc: "ppc-aix-port-dev at openjdk.java.net", "hotspot-compiler-dev at openjdk.java.net"
Subject: [EXTERNAL] Re: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range
Date: Thu, Apr 23, 2020 3:01 AM

Hi Martin,

your change looks good to me.

I noticed you didn't find a chance to put it in the patch queue for our internal testing. I did that now, but it's too late for tonight. We'll have to wait until Friday morning (GMT+2) to really see what I expect: no issues.

Thanks for cleaning up this old stuff.

Regards,
Lutz

On 21.04.20, 16:57, "hotspot-compiler-dev on behalf of Michihiro Horie" wrote:

    Hi Martin,

    I started measuring SPECjbb2015 to see the performance impact on P9. Also,
    I'm preparing the same measurement on P8.

    Best regards,
    Michihiro

     ----- Original message -----
     From: "Doerr, Martin"
     To: "'hotspot-compiler-dev at openjdk.java.net'"
     Cc: Michihiro Horie, "cjashfor at linux.ibm.com",
     "ppc-aix-port-dev at openjdk.java.net", Gustavo Romero,
     "joserz at linux.ibm.com"
     Subject: [EXTERNAL] RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is
     out of range
     Date: Tue, Apr 14, 2020 11:07 PM

     Hi,

     I'd like to resolve a very old PPC64 issue:
     https://bugs.openjdk.java.net/browse/JDK-8151030

     There's code for AllocatePrefetchStyle=4 which is not an accepted option.
     It was used for a special experimental prefetch mode using dcbz
     instructions to combine prefetching and zeroing in the TLABs.
     However, this code was never contributed and there are no plans to work on
     it. So I'd like to simply remove this small part of it.

     In addition to that, AllocatePrefetchLines is currently set to 3 by
     default which doesn't make sense to me. PPC64 has an automatic prefetch
     engine and executing several prefetch instructions for succeeding cache
     lines doesn't seem to be beneficial at all.
     So I'm setting it to 1 by default. I couldn't observe regressions on
     Power7, Power8 and Power9.

     Webrev:
     http://cr.openjdk.java.net/~mdoerr/8151030_ppc_prefetch/webrev.00/

     Please review.

     If somebody from IBM would like to check performance impact of changing
     the AllocatePrefetchLines + Distance, I'll be glad to receive feedback.

     Best regards,
     Martin

From lutz.schmidt at sap.com Fri Apr 24 14:51:01 2020
From: lutz.schmidt at sap.com (Schmidt, Lutz)
Date: Fri, 24 Apr 2020 14:51:01 +0000
Subject: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range
In-Reply-To:
References: <0737AF50-4DED-4680-8629-47140DD2A7A6@sap.com>
Message-ID:

Hi Martin,

SAP-internal testing revealed no problems related to this patch.

As Michihiro did not find performance issues, the patch is good to go from my perspective.

Regards,
Lutz

From: Michihiro Horie on behalf of Michihiro Horie
Date: Friday, 24. April 2020 at 07:40
To: Lutz Schmidt
Cc: "hotspot-compiler-dev at openjdk.java.net", "Doerr, Martin (martin.doerr at sap.com)", "ppc-aix-port-dev at openjdk.java.net"
Subject: Re: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range

Hi Martin, Lutz,

I have not seen big differences in SPECjbb2015 scores on either P8 or P9.

Best regards,
Michihiro

----- Original message -----
From: "Schmidt, Lutz"
To: Michihiro Horie, "Doerr, Martin"
Cc: "ppc-aix-port-dev at openjdk.java.net", "hotspot-compiler-dev at openjdk.java.net"
Subject: [EXTERNAL] Re: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range
Date: Thu, Apr 23, 2020 3:01 AM

Hi Martin,

your change looks good to me.

I noticed you didn't find a chance to put it in the patch queue for our internal testing. I did that now, but it's too late for tonight. We'll have to wait until Friday morning (GMT+2) to really see what I expect: no issues.

Thanks for cleaning up this old stuff.

Regards,
Lutz

On 21.04.20, 16:57, "hotspot-compiler-dev on behalf of Michihiro Horie" wrote:

    Hi Martin,

    I started measuring SPECjbb2015 to see the performance impact on P9. Also,
    I'm preparing the same measurement on P8.

    Best regards,
    Michihiro

     ----- Original message -----
     From: "Doerr, Martin"
     To: "'hotspot-compiler-dev at openjdk.java.net'"
     Cc: Michihiro Horie, "cjashfor at linux.ibm.com",
     "ppc-aix-port-dev at openjdk.java.net", Gustavo Romero,
     "joserz at linux.ibm.com"
     Subject: [EXTERNAL] RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is
     out of range
     Date: Tue, Apr 14, 2020 11:07 PM

     Hi,

     I'd like to resolve a very old PPC64 issue:
     https://bugs.openjdk.java.net/browse/JDK-8151030

     There's code for AllocatePrefetchStyle=4 which is not an accepted option.
     It was used for a special experimental prefetch mode using dcbz
     instructions to combine prefetching and zeroing in the TLABs.
     However, this code was never contributed and there are no plans to work on
     it. So I'd like to simply remove this small part of it.

     In addition to that, AllocatePrefetchLines is currently set to 3 by
     default which doesn't make sense to me. PPC64 has an automatic prefetch
     engine and executing several prefetch instructions for succeeding cache
     lines doesn't seem to be beneficial at all.
     So I'm setting it to 1 by default. I couldn't observe regressions on
     Power7, Power8 and Power9.

     Webrev:
     http://cr.openjdk.java.net/~mdoerr/8151030_ppc_prefetch/webrev.00/

     Please review.

     If somebody from IBM would like to check performance impact of changing
     the AllocatePrefetchLines + Distance, I'll be glad to receive feedback.

     Best regards,
     Martin

From martin.doerr at sap.com Mon Apr 27 08:06:44 2020
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 27 Apr 2020 08:06:44 +0000
Subject: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range
In-Reply-To:
References: <0737AF50-4DED-4680-8629-47140DD2A7A6@sap.com>
Message-ID:

Hi Götz, Michihiro and Lutz,

thanks for the reviews. Pushed.

Best regards,
Martin

> -----Original Message-----
> From: Schmidt, Lutz
> Sent: Friday, 24 April 2020 16:51
> To: Michihiro Horie
> Cc: hotspot-compiler-dev at openjdk.java.net; Doerr, Martin;
> ppc-aix-port-dev at openjdk.java.net
> Subject: Re: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range
>
> Hi Martin,
>
> SAP-internal testing revealed no problems related to this patch.
>
> As Michihiro did not find performance issues, the patch is good to go from
> my perspective.
>
> Regards,
> Lutz
>
> From: Michihiro Horie on behalf of Michihiro Horie
> Date: Friday, 24. April 2020 at 07:40
> To: Lutz Schmidt
> Cc: "hotspot-compiler-dev at openjdk.java.net", "Doerr, Martin
> (martin.doerr at sap.com)", "ppc-aix-port-dev at openjdk.java.net"
> Subject: Re: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is out of range
>
> Hi Martin, Lutz,
>
> I have not seen big differences in SPECjbb2015 scores on either P8 or P9.
>
> Best regards,
> Michihiro
>
>
> ----- Original message -----
> From: "Schmidt, Lutz"
> To: Michihiro Horie, "Doerr, Martin"
> Cc: "ppc-aix-port-dev at openjdk.java.net",
> "hotspot-compiler-dev at openjdk.java.net"
> Subject: [EXTERNAL] Re: RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is
> out of range
> Date: Thu, Apr 23, 2020 3:01 AM
>
> Hi Martin,
>
> your change looks good to me.
>
> I noticed you didn't find a chance to put it in the patch queue for our internal
> testing. I did that now, but it's too late for tonight. We'll have to wait until
> Friday morning (GMT+2) to really see what I expect: no issues.
>
> Thanks for cleaning up this old stuff.
>
> Regards,
> Lutz
>
>
> On 21.04.20, 16:57, "hotspot-compiler-dev on behalf of Michihiro Horie" wrote:
>
>
>     Hi Martin,
>
>     I started measuring SPECjbb2015 to see the performance impact on P9. Also,
>     I'm preparing the same measurement on P8.
>
>     Best regards,
>     Michihiro
>
>
>      ----- Original message -----
>      From: "Doerr, Martin"
>      To: "'hotspot-compiler-dev at openjdk.java.net'"
>      Cc: Michihiro Horie, "cjashfor at linux.ibm.com",
>      "ppc-aix-port-dev at openjdk.java.net", Gustavo Romero,
>      "joserz at linux.ibm.com"
>      Subject: [EXTERNAL] RFR(XS): 8151030: PPC64: AllocatePrefetchStyle=4 is
>      out of range
>      Date: Tue, Apr 14, 2020 11:07 PM
>
>      Hi,
>
>      I'd like to resolve a very old PPC64 issue:
>      https://bugs.openjdk.java.net/browse/JDK-8151030
>
>      There's code for AllocatePrefetchStyle=4 which is not an accepted option.
>      It was used for a special experimental prefetch mode using dcbz
>      instructions to combine prefetching and zeroing in the TLABs.
>      However, this code was never contributed and there are no plans to work on
>      it. So I'd like to simply remove this small part of it.
>
>      In addition to that, AllocatePrefetchLines is currently set to 3 by
>      default which doesn't make sense to me. PPC64 has an automatic prefetch
>      engine and executing several prefetch instructions for succeeding cache
>      lines doesn't seem to be beneficial at all.
>      So I'm setting it to 1 by default. I couldn't observe regressions on
>      Power7, Power8 and Power9.
>
>      Webrev:
>      http://cr.openjdk.java.net/~mdoerr/8151030_ppc_prefetch/webrev.00/
>
>      Please review.
>
>      If somebody from IBM would like to check performance impact of changing
>      the AllocatePrefetchLines + Distance, I'll be glad to receive feedback.
>
>      Best regards,
>      Martin
>
>
>
>

From volker.simonis at gmail.com Thu Apr 30 14:45:03 2020
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 30 Apr 2020 16:45:03 +0200
Subject: RFR(XS): 8230552: Provide information when hitting a HaltNode for architectures other than x86
In-Reply-To: <19BC4D2D-56F3-45BE-898C-1389469A7B36@amazon.com>
References: <19BC4D2D-56F3-45BE-898C-1389469A7B36@amazon.com>
Message-ID:

Forwarding to ppc-aix and s390 port mailing lists with the kind request for testing this simple fix on the corresponding platforms.

Thank you and best regards,
Volker

Liu, Xin wrote on Thu, 30 Apr 2020, 08:39:

>
>
> On 4/29/20, 11:06 PM, "Pengfei Li" wrote:
>
>     Hi Xin,
>
>     > I tested on aarch64. It generates the same crash report as x86_64 when it
>     > does hit HaltNode. Halt reason is displayed. I pasted the report on JBS.
>     > I ran hotspot:tier1 on an aarch64 fastdebug build. It passed except for 3
>     > relevant failures[1].
>
>     (NOT a reviewer) The original instruction used should be dcps1 instead
>     of dpcs1 - there's a misspelling in the AArch64 assembler. Could you add a
>     trivial fix to change dpcs1/2/3 to dcps1/2/3?
>
> Oh, I didn't know that. I did search for dpcs and found nothing.
> I've filed a new issue about the typo: JDK-8244170. Let's resolve
> it in a separate issue.
>
>     BTW, how did you test to hit the HaltNode?
>     --
>     Thanks,
>     Pengfei
>
> I followed Christian's and Volker's recipes on JDK-8230552. Both of them can
> generate a HaltNode.
> Volker's approach is very interesting. You have to give the program a couple
> of "-XX:SuppressErrorAt=" options to increase tolerance.
>
> Thanks,
> --lx
>
>