From gnu.andrew at redhat.com Wed Apr 1 01:22:17 2020 From: gnu.andrew at redhat.com (Andrew Hughes) Date: Wed, 1 Apr 2020 02:22:17 +0100 Subject: [aarch64-port-dev ] [RFR] [8u] 8u252-b08 Upstream Sync Message-ID: <68d1f2ac-c6e6-0a4a-3fd1-620a84e9f7aa@redhat.com> Webrevs: https://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/ Merge changesets: http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/corba/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jaxp/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jaxws/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jdk/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/hotspot/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/langtools/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/nashorn/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/root/merge.changeset Changes in aarch64-shenandoah-jdk8u252-b08: - S8241296: Segfault in JNIHandleBlock::oops_do() - S8241307: Marlin renderer should not be the default in 8u252 Main issues of note: One HotSpot change applied cleanly, no merge work. 
diffstat for root b/.hgtags | 1 + 1 file changed, 1 insertion(+) diffstat for corba b/.hgtags | 1 + 1 file changed, 1 insertion(+) diffstat for jaxp b/.hgtags | 1 + 1 file changed, 1 insertion(+) diffstat for jaxws b/.hgtags | 1 + 1 file changed, 1 insertion(+) diffstat for langtools b/.hgtags | 1 + 1 file changed, 1 insertion(+) diffstat for nashorn b/.hgtags | 1 + 1 file changed, 1 insertion(+) diffstat for jdk b/.hgtags | 1 b/src/share/classes/sun/java2d/pisces/META-INF/services/sun.java2d.pipe.RenderingEngine | 7 + b/src/solaris/classes/sun/java2d/pisces/META-INF/services/sun.java2d.pipe.RenderingEngine | 9 +- b/test/sun/java2d/marlin/DefaultRenderingEngine.java | 42 ++++++++++ 4 files changed, 54 insertions(+), 5 deletions(-) diffstat for hotspot b/.hgtags | 1 + b/src/share/vm/runtime/thread.cpp | 4 +++- 2 files changed, 4 insertions(+), 1 deletion(-) Successfully built on x86, x86_64, s390, s390x, ppc, ppc64, ppc64le & aarch64. Ok to push? Thanks, -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 From Pengfei.Li at arm.com Wed Apr 1 02:05:04 2020 From: Pengfei.Li at arm.com (Pengfei Li) Date: Wed, 1 Apr 2020 02:05:04 +0000 Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing support for PopCountVI node In-Reply-To: <2ce24736-9b5c-5c23-bfde-14067d6d6b0d@redhat.com> References: <2ce24736-9b5c-5c23-bfde-14067d6d6b0d@redhat.com> Message-ID: Hi Andrew, Thanks for review. > INSN(absr, 0, 0b100000101110, 1); // accepted arrangements: T8B, T16B, > T4H, T8H, T4S > - INSN(negr, 1, 0b100000101110, 2); // accepted arrangements: T8B, T16B, > T4H, T8H, T2S, T4S, T2D > > is actually related to some other work you are doing? 
This change is related to

- if (accepted < 2) guarantee(T != T2S && T != T2D, "incorrect arrangement"); \
- if (accepted == 0) guarantee(T == T8B || T == T16B, "incorrect arrangement"); \
+ if (accepted < 3) guarantee(T != T2D, "incorrect arrangement"); \
+ if (accepted < 2) guarantee(T != T2S, "incorrect arrangement"); \
+ if (accepted < 1) guarantee(T == T8B || T == T16B, "incorrect arrangement"); \

Before my patch, the candidate values of "accepted" are 0, 1 and 2 meaning different accepted arrangements as below:
0 - Only T8B and T16B are accepted
1 - All arrangements but T2S and T2D are accepted
2 - All arrangements are accepted

In my patch, the newly added instruction UADDLP supports T2S but doesn't support T2D. So I changed the value range to 0 - 3, where 3 means all arrangements are accepted now. That's why the value for parameter "accepted" of NEGR is promoted from 2 to 3 now.

--
Thanks,
Pengfei

From aph at redhat.com Wed Apr 1 08:54:52 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 1 Apr 2020 09:54:52 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing support for PopCountVI node
In-Reply-To:
References: <2ce24736-9b5c-5c23-bfde-14067d6d6b0d@redhat.com>
Message-ID:

On 4/1/20 3:05 AM, Pengfei Li wrote:
> In my patch, the newly added instruction UADDLP supports T2S but doesn't support T2D. So I changed the value range to 0 - 3, where 3 means all arrangements are accepted now. That's why the value for parameter "accepted" of NEGR is promoted from 2 to 3 now.

I see. OK, thanks.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From stumon01 at arm.com Wed Apr 1 09:29:02 2020 From: stumon01 at arm.com (Stuart Monteith) Date: Wed, 1 Apr 2020 10:29:02 +0100 Subject: [aarch64-port-dev ] RFR(S): 8241587: Aarch64: remove x86 specifics from os_linux.cpp/hpp/inline.hpp Message-ID: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com> Hello, This patch removes a couple of x86 specifics from aarch64 code. Tested with hotspot tier1. Webrev: http://cr.openjdk.java.net/~smonteith/8241587/webrev.0/ Bug: https://bugs.openjdk.java.net/browse/JDK-8241587 Thanks, Stuart IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. From david.holmes at oracle.com Wed Apr 1 10:03:32 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 1 Apr 2020 20:03:32 +1000 Subject: [aarch64-port-dev ] RFR(S): 8241587: Aarch64: remove x86 specifics from os_linux.cpp/hpp/inline.hpp In-Reply-To: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com> References: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com> Message-ID: <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com> Hi Stuart, On 1/04/2020 7:29 pm, Stuart Monteith wrote: > Hello, > This patch removes a couple of x86 specifics from aarch64 code. Tested with hotspot tier1. > > Webrev: > http://cr.openjdk.java.net/~smonteith/8241587/webrev.0/ > Bug: > https://bugs.openjdk.java.net/browse/JDK-8241587 That clean up seems good to me. > Thanks, > Stuart > > IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. 
If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. That footer seems inappropriate for OpenJDK emails. Cheers, David From stumon01 at arm.com Wed Apr 1 10:09:40 2020 From: stumon01 at arm.com (Stuart Monteith) Date: Wed, 1 Apr 2020 11:09:40 +0100 Subject: [aarch64-port-dev ] RFR(S): 8241587: Aarch64: remove x86 specifics from os_linux.cpp/hpp/inline.hpp In-Reply-To: <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com> References: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com> <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com> Message-ID: On 01/04/2020 11:03, David Holmes wrote: > Hi Stuart, > > On 1/04/2020 7:29 pm, Stuart Monteith wrote: >> Hello, >> This patch removes a couple of x86 specifics from aarch64 code. Tested with hotspot tier1. >> >> Webrev: >> http://cr.openjdk.java.net/~smonteith/8241587/webrev.0/ >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8241587 > > That clean up seems good to me. > Thanks. >> Thanks, >> Stuart >> >> IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you >> are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other >> person, use it for any purpose, or store or copy the information in any medium. Thank you. > > That footer seems inappropriate for OpenJDK emails. > Apologies - dismiss that. I'd ordinarily send the email from my other machine. > Cheers, > David >
From shade at redhat.com Wed Apr 1 11:55:03 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 1 Apr 2020 13:55:03 +0200 Subject: [aarch64-port-dev ] [RFR] [8u] 8u252-b08 Upstream Sync In-Reply-To: <68d1f2ac-c6e6-0a4a-3fd1-620a84e9f7aa@redhat.com> References: <68d1f2ac-c6e6-0a4a-3fd1-620a84e9f7aa@redhat.com> Message-ID: <82a66f85-76d4-c5b1-a11f-136a0a949095@redhat.com> On 4/1/20 3:22 AM, Andrew Hughes wrote: > Merge changesets: > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/corba/merge.changeset > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jaxp/merge.changeset > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jaxws/merge.changeset > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jdk/merge.changeset > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/hotspot/merge.changeset > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/langtools/merge.changeset > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/nashorn/merge.changeset > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/root/merge.changeset All look good. > Ok to push? Yes, please. -- Thanks, -Aleksey From gnu.andrew at redhat.com Wed Apr 1 16:50:53 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Wed, 01 Apr 2020 16:50:53 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah: 3 new changesets Message-ID: <202004011650.031Gorev018534@aojmv0008.oracle.com> Changeset: e8b56e0eaa7b Author: andrew Date: 2020-03-27 05:14 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/rev/e8b56e0eaa7b Added tag jdk8u252-b08 for changeset 72a6d93679e5 ! .hgtags Changeset: 259807b2eafc Author: andrew Date: 2020-03-27 06:04 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/rev/259807b2eafc Merge jdk8u252-b08 ! 
.hgtags Changeset: 83b10c54af07 Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/rev/83b10c54af07 Added tag aarch64-shenandoah-jdk8u252-b08 for changeset 259807b2eafc ! .hgtags From gnu.andrew at redhat.com Wed Apr 1 16:51:03 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Wed, 01 Apr 2020 16:51:03 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/corba: 3 new changesets Message-ID: <202004011651.031Gp35P018686@aojmv0008.oracle.com> Changeset: 9340b3be1b47 Author: andrew Date: 2020-03-27 05:14 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/corba/rev/9340b3be1b47 Added tag jdk8u252-b08 for changeset 63738d15bb7f ! .hgtags Changeset: 81baca88f8b3 Author: andrew Date: 2020-03-27 06:04 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/corba/rev/81baca88f8b3 Merge jdk8u252-b08 ! .hgtags Changeset: 8fad3e09ebcf Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/corba/rev/8fad3e09ebcf Added tag aarch64-shenandoah-jdk8u252-b08 for changeset 81baca88f8b3 ! .hgtags From gnu.andrew at redhat.com Wed Apr 1 16:51:12 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Wed, 01 Apr 2020 16:51:12 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/jaxp: 3 new changesets Message-ID: <202004011651.031GpCbw018812@aojmv0008.oracle.com> Changeset: 8476d78dc695 Author: andrew Date: 2020-03-27 05:14 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/8476d78dc695 Added tag jdk8u252-b08 for changeset d1a8fb9aafdd ! .hgtags Changeset: 0e8735595b62 Author: andrew Date: 2020-03-27 06:04 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/0e8735595b62 Merge jdk8u252-b08 ! 
.hgtags Changeset: 878d3aa22258 Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/878d3aa22258 Added tag aarch64-shenandoah-jdk8u252-b08 for changeset 0e8735595b62 ! .hgtags From gnu.andrew at redhat.com Wed Apr 1 16:51:21 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Wed, 01 Apr 2020 16:51:21 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/jaxws: 3 new changesets Message-ID: <202004011651.031GpLhD019359@aojmv0008.oracle.com> Changeset: b012193ff452 Author: andrew Date: 2020-03-27 05:14 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxws/rev/b012193ff452 Added tag jdk8u252-b08 for changeset 7e334946a044 ! .hgtags Changeset: e2725620cdbd Author: andrew Date: 2020-03-27 06:04 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxws/rev/e2725620cdbd Merge jdk8u252-b08 ! .hgtags Changeset: 5f4c415b6acc Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxws/rev/5f4c415b6acc Added tag aarch64-shenandoah-jdk8u252-b08 for changeset e2725620cdbd ! .hgtags From gnu.andrew at redhat.com Wed Apr 1 16:51:30 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Wed, 01 Apr 2020 16:51:30 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/langtools: 3 new changesets Message-ID: <202004011651.031GpUwR019430@aojmv0008.oracle.com> Changeset: 01036da3155c Author: andrew Date: 2020-03-27 05:14 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/01036da3155c Added tag jdk8u252-b08 for changeset c56eceecec71 ! .hgtags Changeset: 4cb8441f6bf5 Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/4cb8441f6bf5 Merge jdk8u252-b08 ! 
.hgtags Changeset: a6ed6d713d38 Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/a6ed6d713d38 Added tag aarch64-shenandoah-jdk8u252-b08 for changeset 4cb8441f6bf5 ! .hgtags From gnu.andrew at redhat.com Wed Apr 1 16:51:38 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Wed, 01 Apr 2020 16:51:38 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/hotspot: 4 new changesets Message-ID: <202004011651.031GpcRe019568@aojmv0008.oracle.com> Changeset: 8f2780b3e4fa Author: aph Date: 2020-03-25 03:20 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/8f2780b3e4fa 8241296: Segfault in JNIHandleBlock::oops_do() Reviewed-by: andrew ! src/share/vm/runtime/thread.cpp Changeset: 095e60e7fc8c Author: andrew Date: 2020-03-27 05:14 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/095e60e7fc8c Added tag jdk8u252-b08 for changeset 8f2780b3e4fa ! .hgtags Changeset: 2668bab1293c Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/2668bab1293c Merge jdk8u252-b08 ! .hgtags ! src/share/vm/runtime/thread.cpp Changeset: ac1d2acb1e7d Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/ac1d2acb1e7d Added tag aarch64-shenandoah-jdk8u252-b08 for changeset 2668bab1293c ! 
.hgtags From gnu.andrew at redhat.com Wed Apr 1 16:51:47 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Wed, 01 Apr 2020 16:51:47 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/jdk: 4 new changesets Message-ID: <202004011651.031Gpm8v019764@aojmv0008.oracle.com> Changeset: e17fe591a374 Author: lbourges Date: 2020-03-25 03:53 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/e17fe591a374 8241307: Marlin renderer should not be the default in 8u252 Reviewed-by: phh, alexsch, andrew, sgehwolf ! src/share/classes/sun/java2d/pisces/META-INF/services/sun.java2d.pipe.RenderingEngine ! src/solaris/classes/sun/java2d/pisces/META-INF/services/sun.java2d.pipe.RenderingEngine + test/sun/java2d/marlin/DefaultRenderingEngine.java Changeset: da301ecaa81d Author: andrew Date: 2020-03-27 05:14 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/da301ecaa81d Added tag jdk8u252-b08 for changeset e17fe591a374 ! .hgtags Changeset: c38803f8a50b Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/c38803f8a50b Merge jdk8u252-b08 ! .hgtags Changeset: 2f1b1489f97f Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/2f1b1489f97f Added tag aarch64-shenandoah-jdk8u252-b08 for changeset c38803f8a50b ! .hgtags From gnu.andrew at redhat.com Wed Apr 1 16:51:56 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Wed, 01 Apr 2020 16:51:56 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/nashorn: 3 new changesets Message-ID: <202004011651.031Gpube020370@aojmv0008.oracle.com> Changeset: 5fc91c4182b0 Author: andrew Date: 2020-03-27 05:14 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/nashorn/rev/5fc91c4182b0 Added tag jdk8u252-b08 for changeset 95d61d0f326b ! 
.hgtags Changeset: d9fdfa71788f Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/nashorn/rev/d9fdfa71788f Merge jdk8u252-b08 ! .hgtags Changeset: ebb6de4f5fb3 Author: andrew Date: 2020-03-27 06:05 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/nashorn/rev/ebb6de4f5fb3 Added tag aarch64-shenandoah-jdk8u252-b08 for changeset d9fdfa71788f ! .hgtags From nick.gasson at arm.com Thu Apr 2 01:48:40 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Thu, 02 Apr 2020 09:48:40 +0800 Subject: [aarch64-port-dev ] Question about JVM option "-XX:+UseBarriersForVolatile" usage in aarch64. In-Reply-To: References: <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com> Message-ID: <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com> On 03/30/20 18:43 pm, Andrew Haley wrote: > > I remember an early stepping where STLR/LDAR weren't sequentially > consistent, so it was necessary to generate explicit DMBs. I doubt > that parts with this bug ever reached the market. > > Having said that, maybe someone is still using one. It might be worth > correcting UseBarriersForVolatile and making the flag diagnostic only. > Having said that, the entire C library uses these instructions. > Opinions? I checked glibc and the Linux kernel and couldn't find any workaround like this. Presumably they'd both be affected. I suggest if Derek can confirm that bug never made it into a production part then we should completely remove UseBarriersForVolatile. Maybe with a warning at startup if we detect that CPU variant. It adds a lot of complexity to the volatile implementation for no clear benefit. Thanks, Nick From adinn at redhat.com Thu Apr 2 13:22:29 2020 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 2 Apr 2020 14:22:29 +0100 Subject: [aarch64-port-dev ] Question about JVM option "-XX:+UseBarriersForVolatile" usage in aarch64. 
In-Reply-To: <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com> References: <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com> Message-ID: <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com> On 02/04/2020 02:48, Nick Gasson wrote: > I suggest if Derek can confirm that bug never made it into a production > part then we should completely remove UseBarriersForVolatile. Maybe with > a warning at startup if we detect that CPU variant. It adds a lot of > complexity to the volatile implementation for no clear benefit. One reason for having this switch was to provide a comparator for our scheme to implement the Java volatile accesses using ldar/stlr. That translation scheme avoids a dmb after the stlr allowing the value being written to be committed lazily while still providing the critical guarantee that prior writes are committed before it gets committed. The switch ensures we can fall back to a 'reference' implementation based on dmbs that, amongst other things, enforces immediate commit of the volatile write after commit of its predecessors. By removing support we lose the ability to test cases where synchronization errors occur with our scheme by switching to the 'standard' model. That may still be useful for finding bugs (current or newly injected) in our translation and, indeed, in new HW. Andrew Haley was not suggesting removing this option. He simply talked about making it a diagnostic option. I think that might be a wiser choice. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill From adinn at redhat.com Thu Apr 2 13:43:20 2020 From: adinn at redhat.com (Andrew Dinn) Date: Thu, 2 Apr 2020 14:43:20 +0100 Subject: [aarch64-port-dev ] Question about JVM option "-XX:+UseBarriersForVolatile" usage in aarch64. 
In-Reply-To: <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com> References: <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com> Message-ID: On 02/04/2020 14:22, Andrew Dinn wrote: > On 02/04/2020 02:48, Nick Gasson wrote: >> I suggest if Derek can confirm that bug never made it into a production >> part then we should completely remove UseBarriersForVolatile. Maybe with >> a warning at startup if we detect that CPU variant. It adds a lot of >> complexity to the volatile implementation for no clear benefit. > One reason for having this switch was to provide a comparator for our > scheme to implement the Java volatile accesses using ldar/stlr. That > translation scheme avoids a dmb after the stlr allowing the value being > written to be committed lazily while still providing the critical > guarantee that prior writes are committed before it gets committed. The > switch ensures we can fall back to a 'reference' implementation based on > dmbs that, amongst other things, enforces immediate commit of the > volatile write after commit of its predecessors. > > By removing support we lose the ability to test cases where > synchronization errors occur with our scheme by switching to the > 'standard' model. That may still be useful for finding bugs (current or > newly injected) in our translation and, indeed, in new HW. Andrew Haley > was not suggesting removing this option. He simply talked about making > it a diagnostic option. I think that might be a wiser choice. Correction: he did actually raise the question of whether to remove it, after recommending making it diagnostic. My vote for making it a diagnostic option still stands. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 
03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill From aph at redhat.com Thu Apr 2 13:46:23 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 2 Apr 2020 14:46:23 +0100 Subject: [aarch64-port-dev ] Question about JVM option "-XX:+UseBarriersForVolatile" usage in aarch64. In-Reply-To: <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com> References: <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com> Message-ID: <3a12226e-1762-ea3c-ff99-2ebce2ebb69c@redhat.com> On 4/2/20 2:22 PM, Andrew Dinn wrote: > On 02/04/2020 02:48, Nick Gasson wrote: >> I suggest if Derek can confirm that bug never made it into a production >> part then we should completely remove UseBarriersForVolatile. Maybe with >> a warning at startup if we detect that CPU variant. It adds a lot of >> complexity to the volatile implementation for no clear benefit. > > One reason for having this switch was to provide a comparator for our > scheme to implement the Java volatile accesses using ldar/stlr. That > translation scheme avoids a dmb after the stlr allowing the value being > written to be committed lazily while still providing the critical > guarantee that prior writes are committed before it gets committed. The > switch ensures we can fall back to a 'reference' implementation based on > dmbs that, amongst other things, enforces immediate commit of the > volatile write after commit of its predecessors. > > By removing support we lose the ability to test cases where > synchronization errors occur with our scheme by switching to the > 'standard' model. Right, so I guess you're saying that we should keep it because there may be bugs in our ldar/stlr code. I can think of no other reason. > That may still be useful for finding bugs (current or newly > injected) in our translation and, indeed, in new HW. Andrew Haley > was not suggesting removing this option. 
He simply talked about > making it a diagnostic option. I think that might be a wiser choice. I'm fairly sure I said perhaps we could nuke it. I am strongly of the opinion that rarely-used code behind runtime switches tends to rot, and that any rotten part of the ship tends to spread. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From nick.gasson at arm.com Fri Apr 3 02:03:30 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Fri, 03 Apr 2020 10:03:30 +0800 Subject: [aarch64-port-dev ] Question about JVM option "-XX:+UseBarriersForVolatile" usage in aarch64. In-Reply-To: <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com> References: <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com> Message-ID: <85imih1g25.fsf@nicgas01-03-arm-vm.shanghai.arm.com> On 04/02/20 21:22 pm, Andrew Dinn wrote: > One reason for having this switch was to provide a comparator for our > scheme to implement the Java volatile accesses using ldar/stlr. That > translation scheme avoids a dmb after the stlr allowing the value being > written to be committed lazily while still providing the critical > guarantee that prior writes are committed before it gets committed. The > switch ensures we can fall back to a 'reference' implementation based on > dmbs that, amongst other things, enforces immediate commit of the > volatile write after commit of its predecessors. > > By removing support we lose the ability to test cases where > synchronization errors occur with our scheme by switching to the > 'standard' model. That may still be useful for finding bugs (current or > newly injected) in our translation and, indeed, in new HW. OK, but keeping it is not without cost. 
If UseBarriersForVolatile is to have value as a reference implementation we need to expend effort to test it and fix any bugs that arise from changes to other parts of the code (see Xiaohong's original mail). Thanks, Nick From ningsheng.jian at arm.com Fri Apr 3 02:30:18 2020 From: ningsheng.jian at arm.com (Ningsheng Jian) Date: Fri, 3 Apr 2020 10:30:18 +0800 Subject: [aarch64-port-dev ] Question about JVM option "-XX:+UseBarriersForVolatile" usage in aarch64. In-Reply-To: <85imih1g25.fsf@nicgas01-03-arm-vm.shanghai.arm.com> References: <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com> <85imih1g25.fsf@nicgas01-03-arm-vm.shanghai.arm.com> Message-ID: On 4/3/20 10:03 AM, Nick Gasson wrote: > On 04/02/20 21:22 pm, Andrew Dinn wrote: >> One reason for having this switch was to provide a comparator for our >> scheme to implement the Java volatile accesses using ldar/stlr. That >> translation scheme avoids a dmb after the stlr allowing the value being >> written to be committed lazily while still providing the critical >> guarantee that prior writes are committed before it gets committed. The >> switch ensures we can fall back to a 'reference' implementation based on >> dmbs that, amongst other things, enforces immediate commit of the >> volatile write after commit of its predecessors. >> >> By removing support we lose the ability to test cases where >> synchronization errors occur with our scheme by switching to the >> 'standard' model. That may still be useful for finding bugs (current or >> newly injected) in our translation and, indeed, in new HW. > > OK, but keeping it is not without cost. If UseBarriersForVolatile is to > have value as a reference implementation we need to expend effort to > test it and fix any bugs that arise from changes to other parts of the > code (see Xiaohong's original mail). 
> > Yes, if the "reference" implementation is not widely used and tested, it might be buggy and misleading. I know that when Xiaohong was working on a similar part of Graal [1], she spent a lot of time tracing the UseBarriersForVolatile issue in the HotSpot VM. So I agree with Nick and tend towards not maintaining this implementation. [1] https://github.com/oracle/graal/pull/2181 Thanks, Ningsheng From ningsheng.jian at arm.com Fri Apr 3 02:41:04 2020 From: ningsheng.jian at arm.com (Ningsheng Jian) Date: Fri, 3 Apr 2020 10:41:04 +0800 Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing support for PopCountVI node In-Reply-To: References: Message-ID: <110347ce-0629-c5ff-d072-080094570f09@arm.com> Hi Pengfei, On 3/31/20 5:32 PM, Pengfei Li wrote: > Hi, > > Please help review this another missing node support for AArch64. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8241475 > Webrev: http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.01/ > Just took a close look before pushing your code, and I think this line can be removed? + effect(TEMP_DEF dst); Thanks, Ningsheng From Pengfei.Li at arm.com Fri Apr 3 05:48:05 2020 From: Pengfei.Li at arm.com (Pengfei Li) Date: Fri, 3 Apr 2020 05:48:05 +0000 Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing support for PopCountVI node In-Reply-To: <110347ce-0629-c5ff-d072-080094570f09@arm.com> References: <110347ce-0629-c5ff-d072-080094570f09@arm.com> Message-ID: Hi, > Just took a close look before pushing your code, and I think this line can be > removed? > > + effect(TEMP_DEF dst); Yes, thanks for pointing out. It is redundant since I don't use temps this time. I've updated and rebased the patch. 
See http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.02/ -- Thanks, Pengfei From aph at redhat.com Fri Apr 3 08:56:34 2020 From: aph at redhat.com (Andrew Haley) Date: Fri, 3 Apr 2020 09:56:34 +0100 Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing support for PopCountVI node In-Reply-To: References: <110347ce-0629-c5ff-d072-080094570f09@arm.com> Message-ID: On 4/3/20 6:48 AM, Pengfei Li wrote: > Yes, thanks for pointing out. It is redundant since I don't use temps this time. > > I've updated and rebased the patch. See http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.02/ Please push. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From ningsheng.jian at arm.com Fri Apr 3 09:11:15 2020 From: ningsheng.jian at arm.com (Ningsheng Jian) Date: Fri, 3 Apr 2020 17:11:15 +0800 Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing support for PopCountVI node In-Reply-To: References: <110347ce-0629-c5ff-d072-080094570f09@arm.com> Message-ID: <6c0bcfbd-118c-3fa7-96f7-7e832314a05c@arm.com> On 4/3/20 4:56 PM, Andrew Haley wrote: > On 4/3/20 6:48 AM, Pengfei Li wrote: >> Yes, thanks for pointing out. It is redundant since I don't use temps this time. >> >> I've updated and rebased the patch. See http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.02/ > > Please push. > Pushed. Thanks, Ningsheng From adinn at redhat.com Fri Apr 3 09:13:40 2020 From: adinn at redhat.com (Andrew Dinn) Date: Fri, 3 Apr 2020 10:13:40 +0100 Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing support for PopCountVI node In-Reply-To: <110347ce-0629-c5ff-d072-080094570f09@arm.com> References: <110347ce-0629-c5ff-d072-080094570f09@arm.com> Message-ID: On 03/04/2020 03:41, Ningsheng Jian wrote: > Hi Pengfei, > > On 3/31/20 5:32 PM, Pengfei Li wrote: >> Hi, >> >> Please help review this another missing node support for AArch64. 
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8241475
>> Webrev: http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.01/
>>
>
> Just took a close look before pushing your code, and I think this line
> can be removed?
>
> + effect(TEMP_DEF dst);

Strictly, I think this is correct but I don't think it matters.

I believe this usage is meant to identify a case where a generated multi-instruction sequence uses the output register (i.e. dst = target of Set) both as an output in the final instruction and as an intermediate scratch register in intervening instructions. That is the case for both these rules.

The only way that might make a difference is if the back end were able to interleave instructions in other generated sequences with the instructions generated by this rule during instruction scheduling (or, say, via peephole rules). However, I don't believe that can happen given the current adlc code and AArch64 rules.

n.b. there are several other examples of TEMP_DEF use in aarch64.ad. I am not sure that they are the only ones where a dst register is used as both output and intermediary (we will only find out by carefully eyeballing every rule).

regards,

Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill

From aph at redhat.com Fri Apr 3 09:22:30 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 3 Apr 2020 10:22:30 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing support for PopCountVI node
In-Reply-To:
References: <110347ce-0629-c5ff-d072-080094570f09@arm.com>
Message-ID: <9b007363-0380-3d6a-8df6-f0afca4c50d5@redhat.com>

On 4/3/20 10:13 AM, Andrew Dinn wrote:
> On 03/04/2020 03:41, Ningsheng Jian wrote:
>> Hi Pengfei,
>>
>> On 3/31/20 5:32 PM, Pengfei Li wrote:
>>> Hi,
>>>
>>> Please help review this another missing node support for AArch64.
>>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8241475
>>> Webrev: http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.01/
>>>
>>
>> Just took a close look before pushing your code, and I think this line
>> can be removed?
>>
>> + effect(TEMP_DEF dst);
> Strictly, I think this is correct but I don't think it matters.
>
> I believe this usage is meant to identify a case where a generated
> multi-instruction sequence uses the output register (i.e. dst = target
> of Set) both as an output in the final instruction and as an
> intermediate scratch register in intervening instructions. That is the
> case for both these rules.

More simply, it prevents the situation where the same register is used as both an output and an input. With these patterns that doesn't matter.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From ningsheng.jian at arm.com Fri Apr 3 10:00:38 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Fri, 3 Apr 2020 18:00:38 +0800
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing support for PopCountVI node
In-Reply-To: <9b007363-0380-3d6a-8df6-f0afca4c50d5@redhat.com>
References: <110347ce-0629-c5ff-d072-080094570f09@arm.com> <9b007363-0380-3d6a-8df6-f0afca4c50d5@redhat.com>
Message-ID: <34dcff53-5afc-29c2-6086-e0d66882026c@arm.com>

On 4/3/20 5:22 PM, Andrew Haley wrote:
> On 4/3/20 10:13 AM, Andrew Dinn wrote:
>> On 03/04/2020 03:41, Ningsheng Jian wrote:
>>> Hi Pengfei,
>>>
>>> On 3/31/20 5:32 PM, Pengfei Li wrote:
>>>> Hi,
>>>>
>>>> Please help review this another missing node support for AArch64.
>>>>
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8241475
>>>> Webrev: http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.01/
>>>>
>>>
>>> Just took a close look before pushing your code, and I think this line
>>> can be removed?
>>>
>>> + effect(TEMP_DEF dst);
>> Strictly, I think this is correct but I don't think it matters.
>>
>> I believe this usage is meant to identify a case where a generated
>> multi-instruction sequence uses the output register (i.e. dst = target
>> of Set) both as an output in the final instruction and as an
>> intermediate scratch register in intervening instructions. That is the
>> case for both these rules.
>
> More simply, it prevents the situation where the same register is used as both
> an output and an input. With these patterns that doesn't matter.
>

Yeah, in this code block dst and src do not need to be different regs.

Thanks,
Ningsheng

From Yang.Zhang at arm.com Fri Apr 3 10:49:06 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Fri, 3 Apr 2020 10:49:06 +0000
Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential register clash issue in reduce_add2I
Message-ID:

Hi,

Could you please help to review this patch?

In the original reduce_add2I, dst may be the same as tmp2, which may give an incorrect result. Some reduction operation instruct code formats are also cleaned up.
JBS: https://bugs.openjdk.java.net/browse/JDK-8241911 Webrev: http://cr.openjdk.java.net/~yzhang/8241911/webrev.00/ Regards Yang From ci_notify at linaro.org Fri Apr 3 14:22:43 2020 From: ci_notify at linaro.org (ci_notify at linaro.org) Date: Fri, 3 Apr 2020 14:22:43 +0000 (UTC) Subject: [aarch64-port-dev ] Linaro OpenJDK AArch64 jdk/jdk build 3262 Failure Message-ID: <116463607.14192.1585923764564.JavaMail.javamailuser@localhost> OpenJDK AArch64 jdk/jdk build status is Failure Build details - https://ci.linaro.org/job/jdkX-ci-build/3262/ Changes - clanger: e940fc8b419408cb00fa5ceb6e598ea9bc40e233 - src/jdk.internal.le/share/classes/jdk/internal/org/jline/reader/ConfigurationPath.java - src/jdk.internal.le/share/classes/jdk/internal/org/jline/reader/ScriptEngine.java --"8242030: Wrong package declarations in jline classes after JDK-8241598 Reviewed-by: jlahoda " dfuchs: 70175514ffa1343ec75a9cb610e1adf4fd35adda - src/java.base/macosx/classes/java/net/DefaultInterface.java - test/jdk/java/net/MulticastSocket/SetLoopbackMode.java - test/jdk/java/net/MulticastSocket/SetLoopbackModeIPv4.java - test/jdk/java/net/MulticastSocket/SetOutgoingIf.java - test/jdk/java/nio/channels/DatagramChannel/AdaptorMulticasting.java - test/jdk/java/nio/channels/DatagramChannel/MulticastSendReceiveTests.java - test/jdk/java/nio/channels/DatagramChannel/Promiscuous.java - test/lib/jdk/test/lib/NetworkConfiguration.java --"8241786: Improve heuristic to determine default network interface on macOS Summary: DefaultInetrface.getDefault is updated to prefer interfaces that have non link-local addresses. NetworkConfiguration is updated to skip interface that have only link-local addresses, whether IPv4 or IPv6, for multicasting. 
Reviewed-by: chegar, alanb " rkennke: d8d2145c205ca1bda888ebb0834fc39693bca2b7 - src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp - src/hotspot/share/gc/shared/gcCause.cpp - src/hotspot/share/gc/shared/gcCause.hpp - src/hotspot/share/gc/shenandoah/c1/shenandoahBarrierSetC1.cpp - src/hotspot/share/gc/shenandoah/c2/shenandoahBarrierSetC2.cpp - src/hotspot/share/gc/shenandoah/shenandoahAsserts.cpp - src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.cpp - src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp - src/hotspot/share/gc/shenandoah/shenandoahBarrierSetClone.inline.hpp - src/hotspot/share/gc/shenandoah/shenandoahClosures.inline.hpp - src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.hpp - src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.inline.hpp - src/hotspot/share/gc/shenandoah/shenandoahConcurrentRoots.cpp - src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp - src/hotspot/share/gc/shenandoah/shenandoahControlThread.hpp - src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp - src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp - src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp - src/hotspot/share/gc/shenandoah/shenandoahHeap.inline.hpp - src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.cpp - src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.hpp - src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.inline.hpp - src/hotspot/share/gc/shenandoah/shenandoahHeapRegionCounters.cpp - src/hotspot/share/gc/shenandoah/shenandoahHeapRegionCounters.hpp - src/hotspot/share/gc/shenandoah/shenandoahMarkCompact.cpp - src/hotspot/share/gc/shenandoah/shenandoahOopClosures.hpp - src/hotspot/share/gc/shenandoah/shenandoahOopClosures.inline.hpp - src/hotspot/share/gc/shenandoah/shenandoahPacer.cpp - src/hotspot/share/gc/shenandoah/shenandoahPacer.hpp - src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.cpp - src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.hpp - 
src/hotspot/share/gc/shenandoah/shenandoahRootProcessor.inline.hpp - src/hotspot/share/gc/shenandoah/shenandoahUtils.cpp - src/hotspot/share/gc/shenandoah/shenandoahUtils.hpp - src/hotspot/share/gc/shenandoah/shenandoahVMOperations.cpp - src/hotspot/share/gc/shenandoah/shenandoahVMOperations.hpp - src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp - src/hotspot/share/gc/shenandoah/shenandoahVerifier.hpp - src/hotspot/share/gc/shenandoah/shenandoahWorkerPolicy.cpp - src/hotspot/share/gc/shenandoah/shenandoahWorkerPolicy.hpp - src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp - src/hotspot/share/runtime/vmOperations.hpp - test/hotspot/jtreg/compiler/c2/aarch64/TestVolatiles.java - test/hotspot/jtreg/gc/CriticalNativeArgs.java - test/hotspot/jtreg/gc/shenandoah/TestAllocHumongousFragment.java - test/hotspot/jtreg/gc/shenandoah/TestAllocIntArrays.java - test/hotspot/jtreg/gc/shenandoah/TestAllocObjectArrays.java - test/hotspot/jtreg/gc/shenandoah/TestAllocObjects.java - test/hotspot/jtreg/gc/shenandoah/TestGCThreadGroups.java - test/hotspot/jtreg/gc/shenandoah/TestHeapUncommit.java - test/hotspot/jtreg/gc/shenandoah/TestLotsOfCycles.java - test/hotspot/jtreg/gc/shenandoah/TestObjItrWithHeapDump.java - test/hotspot/jtreg/gc/shenandoah/TestPeriodicGC.java - test/hotspot/jtreg/gc/shenandoah/TestRefprocSanity.java - test/hotspot/jtreg/gc/shenandoah/TestRegionSampling.java - test/hotspot/jtreg/gc/shenandoah/TestRetainObjects.java - test/hotspot/jtreg/gc/shenandoah/TestSieveObjects.java - test/hotspot/jtreg/gc/shenandoah/TestStringDedup.java - test/hotspot/jtreg/gc/shenandoah/TestStringDedupStress.java - test/hotspot/jtreg/gc/shenandoah/TestStringInternCleanup.java - test/hotspot/jtreg/gc/shenandoah/TestVerifyJCStress.java - test/hotspot/jtreg/gc/shenandoah/TestWrongArrayMember.java - test/hotspot/jtreg/gc/shenandoah/mxbeans/TestChurnNotifications.java - test/hotspot/jtreg/gc/shenandoah/mxbeans/TestPauseNotifications.java - 
test/hotspot/jtreg/gc/shenandoah/oom/TestClassLoaderLeak.java - test/hotspot/jtreg/gc/shenandoah/options/TestExplicitGC.java - test/hotspot/jtreg/gc/shenandoah/options/TestHeuristicsUnlock.java - test/hotspot/jtreg/gc/shenandoah/options/TestSelectiveBarrierFlags.java - test/hotspot/jtreg/gc/shenandoah/options/TestWrongBarrierDisable.java - test/hotspot/jtreg/gc/stress/CriticalNativeStress.java - test/hotspot/jtreg/gc/stress/gcbasher/TestGCBasherWithShenandoah.java - test/hotspot/jtreg/gc/stress/gcold/TestGCOldWithShenandoah.java - test/hotspot/jtreg/gc/stress/systemgc/TestSystemGCWithShenandoah.java - src/hotspot/share/gc/shenandoah/heuristics/shenandoahTraversalAggressiveHeuristics.cpp - src/hotspot/share/gc/shenandoah/heuristics/shenandoahTraversalAggressiveHeuristics.hpp - src/hotspot/share/gc/shenandoah/heuristics/shenandoahTraversalHeuristics.cpp - src/hotspot/share/gc/shenandoah/heuristics/shenandoahTraversalHeuristics.hpp - src/hotspot/share/gc/shenandoah/shenandoahTraversalGC.cpp - src/hotspot/share/gc/shenandoah/shenandoahTraversalGC.hpp - src/hotspot/share/gc/shenandoah/shenandoahTraversalGC.inline.hpp - src/hotspot/share/gc/shenandoah/shenandoahTraversalMode.cpp - src/hotspot/share/gc/shenandoah/shenandoahTraversalMode.hpp --"8242082: Shenandoah: Purge Traversal mode Reviewed-by: shade " Build output - Compiling 94 files for jdk.xml.dom Compiling 14 files for jdk.zipfs Compiling 15 files for java.prefs Compiling 30 files for java.security.sasl Compiling 131 files for java.rmi Compiling 77 files for java.sql Note: Some input files use or override a deprecated API that is marked for removal. Note: Recompile with -Xlint:removal for details. 
Compiling 138 files for BUILD_NASGEN Compiling 15 files for jdk.attach Compiling 74 files for jdk.crypto.cryptoki Running nasgen Compiling 134 files for jdk.jdeps Compiling 221 files for jdk.javadoc Compiling 40 files for jdk.jcmd Compiling 251 files for jdk.jdi Compiling 11 files for jdk.jstatd Compiling 14 files for jdk.management.jfr Compiling 188 files for jdk.rmic Compiling 11 files for jdk.scripting.nashorn.shell Note: Some input files use or override a deprecated API that is marked for removal. Note: Recompile with -Xlint:removal for details. Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. Compiling 197 files for java.naming Compiling 83 files for jdk.jlink Compiling 2781 files for java.desktop Compiling 94 files for jdk.jshell Compiling 16 files for jdk.naming.dns Compiling 8 files for jdk.naming.rmi Compiling 16 files for java.management.rmi Compiling 219 files for java.security.jgss Compiling 56 files for java.sql.rowset Compiling 31 files for jdk.management.agent Compiling 30 files for jdk.security.auth Compiling 16 files for jdk.security.jgss Compiling 1686 files for jdk.internal.vm.compiler Compiling 108 files for jdk.aot Compiling 68 files for COMPILE_CREATE_SYMBOLS Creating ct.sym classes Compiling 3 files for jdk.internal.vm.compiler.management Compiling 64 files for jdk.jconsole Compiling 8 files for jdk.unsupported.desktop Updating support/src.zip Creating support/symbols/ct.sym Compiling 1 files for java.se Compiling 18 files for jdk.accessibility Compiling 3 files for jdk.editpad Compiling 1004 files for jdk.hotspot.agent Note: Some input files use or override a deprecated API. Note: Recompile with -Xlint:deprecation for details. Note: Some input files use unchecked or unsafe operations. Note: Recompile with -Xlint:unchecked for details. 
Compiling 47 files for jdk.incubator.jpackage /home/buildslave/workspace/jdkX-ci-build/jdkX/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp: In member function 'void ShenandoahBarrierSetAssembler::generate_c1_pre_barrier_runtime_stub(StubAssembler*)': /home/buildslave/workspace/jdkX-ci-build/jdkX/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp:619:47: error: 'TRAVERSAL' is not a member of 'ShenandoahHeap' __ mov(rscratch2, ShenandoahHeap::MARKING | ShenandoahHeap::TRAVERSAL); ^ At global scope: cc1plus: error: unrecognized command line option '-Wno-cast-function-type' [-Werror] cc1plus: error: unrecognized command line option '-Wno-misleading-indentation' [-Werror] cc1plus: error: unrecognized command line option '-Wno-implicit-fallthrough' [-Werror] cc1plus: error: unrecognized command line option '-Wno-int-in-bool-context' [-Werror] cc1plus: all warnings being treated as errors lib/CompileJvm.gmk:181: recipe for target '/home/buildslave/workspace/jdkX-ci-build/build/hotspot/variant-server/libjvm/objs/shenandoahBarrierSetAssembler_aarch64.o' failed make[3]: *** [/home/buildslave/workspace/jdkX-ci-build/build/hotspot/variant-server/libjvm/objs/shenandoahBarrierSetAssembler_aarch64.o] Error 1 make[3]: *** Waiting for unfinished jobs.... 
make/Main.gmk:252: recipe for target 'hotspot-server-libs' failed make[2]: *** [hotspot-server-libs] Error 1 ERROR: Build failed for target 'images' in configuration '/home/buildslave/workspace/jdkX-ci-build/build' (exit code 2) === Output from failing command(s) repeated here === * For target hotspot_variant-server_libjvm_objs_shenandoahBarrierSetAssembler_aarch64.o: /home/buildslave/workspace/jdkX-ci-build/jdkX/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp: In member function 'void ShenandoahBarrierSetAssembler::generate_c1_pre_barrier_runtime_stub(StubAssembler*)': /home/buildslave/workspace/jdkX-ci-build/jdkX/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp:619:47: error: 'TRAVERSAL' is not a member of 'ShenandoahHeap' __ mov(rscratch2, ShenandoahHeap::MARKING | ShenandoahHeap::TRAVERSAL); ^ At global scope: cc1plus: error: unrecognized command line option '-Wno-cast-function-type' [-Werror] cc1plus: error: unrecognized command line option '-Wno-misleading-indentation' [-Werror] cc1plus: error: unrecognized command line option '-Wno-implicit-fallthrough' [-Werror] cc1plus: error: unrecognized command line option '-Wno-int-in-bool-context' [-Werror] cc1plus: all warnings being treated as errors * All command lines available in /home/buildslave/workspace/jdkX-ci-build/build/make-support/failure-logs. === End of repeated output === === Make failed targets repeated here === lib/CompileJvm.gmk:181: recipe for target '/home/buildslave/workspace/jdkX-ci-build/build/hotspot/variant-server/libjvm/objs/shenandoahBarrierSetAssembler_aarch64.o' failed make/Main.gmk:252: recipe for target 'hotspot-server-libs' failed === End of repeated output === Hint: Try searching the build log for the name of the first failed target. Hint: See doc/building.html#troubleshooting for assistance. 
/home/buildslave/workspace/jdkX-ci-build/jdkX/make/Init.gmk:307: recipe for target 'main' failed make[1]: *** [main] Error 1 /home/buildslave/workspace/jdkX-ci-build/jdkX/make/Init.gmk:186: recipe for target 'images' failed make: *** [images] Error 2 From ci_notify at linaro.org Fri Apr 3 18:31:38 2020 From: ci_notify at linaro.org (ci_notify at linaro.org) Date: Fri, 3 Apr 2020 18:31:38 +0000 (UTC) Subject: [aarch64-port-dev ] Linaro OpenJDK AArch64 jdk/jdk build 3266 Fixed Message-ID: <1893770877.14203.1585938700152.JavaMail.javamailuser@localhost> OpenJDK AArch64 jdk/jdk build status is Fixed Build details - https://ci.linaro.org/job/jdkX-ci-build/3266/ Changes - joehw: a2126bc7fab76661cc503743e9b11fd243765b2f - test/jaxp/javax/xml/jaxp/unittest/transform/ResultTest.java - src/java.xml/share/classes/com/sun/org/apache/xalan/internal/xsltc/trax/SAX2StAXBaseWriter.java - src/java.xml/share/classes/com/sun/org/apache/xalan/internal/xsltc/trax/SAX2StAXEventWriter.java - src/java.xml/share/classes/com/sun/org/apache/xalan/internal/xsltc/trax/SAX2StAXStreamWriter.java --"8238183: SAX2StAXStreamWriter cannot deal with comments prior to the root element Reviewed-by: naoto, lancea " rkennke: 4c277b7a598a2051efa3332ee9406f61206f2381 - src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp --"8242107: Shenandoah: Fix aarch64 build after JDK-8242082 Reviewed-by: shade " Build output - Creating java.security.jgss.jmod Creating java.security.sasl.jmod Creating java.smartcardio.jmod Creating java.sql.jmod Creating java.sql.rowset.jmod Creating java.transaction.xa.jmod Creating java.xml.jmod Creating java.xml.crypto.jmod Creating jdk.accessibility.jmod Creating jdk.aot.jmod Creating jdk.attach.jmod Creating jdk.charsets.jmod Creating jdk.compiler.jmod Creating jdk.crypto.cryptoki.jmod Creating jdk.dynalink.jmod Creating jdk.editpad.jmod Creating jdk.crypto.ec.jmod Creating jdk.httpserver.jmod Creating jdk.hotspot.agent.jmod Creating 
jdk.incubator.foreign.jmod Creating jdk.incubator.jpackage.jmod Creating jdk.internal.ed.jmod Creating jdk.internal.jvmstat.jmod Creating jdk.internal.le.jmod Creating jdk.internal.opt.jmod Creating jdk.internal.vm.ci.jmod Creating jdk.internal.vm.compiler.jmod Creating jdk.internal.vm.compiler.management.jmod Creating jdk.jartool.jmod Creating jdk.javadoc.jmod Creating jdk.jcmd.jmod Creating jdk.jconsole.jmod Creating jdk.jdeps.jmod Creating jdk.jdi.jmod Creating jdk.jdwp.agent.jmod Creating jdk.jfr.jmod Creating jdk.jshell.jmod Creating jdk.jsobject.jmod Creating jdk.jstatd.jmod Creating jdk.localedata.jmod Creating jdk.management.jmod Creating jdk.management.agent.jmod Creating jdk.management.jfr.jmod Creating jdk.naming.dns.jmod Creating jdk.naming.rmi.jmod Creating jdk.nio.mapmode.jmod Creating jdk.net.jmod Creating jdk.rmic.jmod Creating jdk.scripting.nashorn.jmod Creating jdk.scripting.nashorn.shell.jmod Creating jdk.security.auth.jmod Creating jdk.sctp.jmod Creating jdk.security.jgss.jmod Creating jdk.unsupported.jmod Creating jdk.unsupported.desktop.jmod Creating jdk.xml.dom.jmod Creating jdk.zipfs.jmod Creating interim jimage Compiling 3 files for BUILD_DEMO_CodePointIM Updating support/demos/image/jfc/CodePointIM/src.zip Compiling 3 files for BUILD_DEMO_FileChooserDemo Updating support/demos/image/jfc/FileChooserDemo/src.zip Compiling 29 files for BUILD_DEMO_SwingSet2 Updating support/demos/image/jfc/SwingSet2/src.zip Compiling 3 files for BUILD_DEMO_Font2DTest Updating support/demos/image/jfc/Font2DTest/src.zip Compiling 64 files for BUILD_DEMO_J2Ddemo Updating support/demos/image/jfc/J2Ddemo/src.zip Compiling 15 files for BUILD_DEMO_Metalworks Updating support/demos/image/jfc/Metalworks/src.zip Compiling 2 files for BUILD_DEMO_Notepad Updating support/demos/image/jfc/Notepad/src.zip Compiling 5 files for BUILD_DEMO_Stylepad Updating support/demos/image/jfc/Stylepad/src.zip Compiling 5 files for BUILD_DEMO_SampleTree Updating 
support/demos/image/jfc/SampleTree/src.zip Compiling 8 files for BUILD_DEMO_TableExample Updating support/demos/image/jfc/TableExample/src.zip Compiling 1 files for BUILD_DEMO_TransparentRuler Updating support/demos/image/jfc/TransparentRuler/src.zip Creating support/demos/image/jfc/FileChooserDemo/FileChooserDemo.jar Creating support/demos/image/jfc/CodePointIM/CodePointIM.jar Creating support/demos/image/jfc/Font2DTest/Font2DTest.jar Creating support/demos/image/jfc/Metalworks/Metalworks.jar Creating support/demos/image/jfc/Notepad/Notepad.jar Creating support/demos/image/jfc/Stylepad/Stylepad.jar Creating support/demos/image/jfc/SampleTree/SampleTree.jar Creating support/demos/image/jfc/TableExample/TableExample.jar Creating support/demos/image/jfc/TransparentRuler/TransparentRuler.jar Creating support/demos/image/jfc/SwingSet2/SwingSet2.jar Compiling 1 files for CLASSLIST_JAR Creating support/demos/image/jfc/J2Ddemo/J2Ddemo.jar Creating support/classlist.jar Creating jdk.jlink.jmod Creating java.base.jmod Creating jdk image WARNING: Using incubator modules: jdk.incubator.foreign, jdk.incubator.jpackage Creating CDS archive for jdk image Stopping sjavac server Finished building target 'images' in configuration '/home/buildslave/workspace/jdkX-ci-build/build' From nick.gasson at arm.com Tue Apr 7 07:19:10 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Tue, 07 Apr 2020 15:19:10 +0800 Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy pre-barrier if marking not active Message-ID: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com> Hi, Bug: https://bugs.openjdk.java.net/browse/JDK-8242029 Webrev: http://cr.openjdk.java.net/~ngasson/8242029/webrev.0/ Currently on AArch64 the G1GC array copy pre-barrier unconditionally performs a VM call into G1BarrierSetRuntime, but this is a no-op if marking is not in progress. This patch adds a check to skip the call unless marking is in progress. 
X86 already has this optimisation and I can't see a reason not to do it on AArch64 as well.

Tested jtreg hotspot_all_no_apps, jdk_core.

Results of ArrayCopy.arrayCopyObject* JMH benchmarks:

Before:

Benchmark                                    Mode  Cnt    Score    Error  Units
ArrayCopy.arrayCopyObject                    avgt   15  117.307 ± 11.607  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt   15  107.786 ±  2.692  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15   76.381 ±  0.761  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15   79.519 ±  3.433  ns/op

After:

Benchmark                                    Mode  Cnt    Score    Error  Units
ArrayCopy.arrayCopyObject                    avgt   15   86.161 ±  6.150  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt   15   83.539 ±  0.682  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15   52.388 ±  0.732  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15   54.619 ±  1.278  ns/op

The VM call overhead can be quite high on AArch64 as we insert a serialising ISB instruction on every native->Java return.

Thanks,
Nick

From erik.osterlund at oracle.com Tue Apr 7 07:33:23 2020
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 7 Apr 2020 09:33:23 +0200
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy pre-barrier if marking not active
In-Reply-To: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
Message-ID: <4F81F31F-CE0A-4514-A2F2-86566BB18007@oracle.com>

Hi Nick,

Note that you only need ISB when returning from call_VM, not call_VM_leaf. Leaf calls can't safepoint, and hence the ISB is redundant. Arraycopy uses leaf calls. So while this optimization is great for this case, maybe removing ISB from leaf calls has a wider effect.

It also appears to me that with Stuart's new nmethod entry barriers enabled, ISB is never required on returns, as oops are no longer embedded in the instruction stream then (which is what the ISB protects against).
Thanks,
/Erik

> On 7 Apr 2020, at 09:19, Nick Gasson wrote:
>
> Hi,
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8242029
> Webrev: http://cr.openjdk.java.net/~ngasson/8242029/webrev.0/
>
> Currently on AArch64 the G1GC array copy pre-barrier unconditionally
> performs a VM call into G1BarrierSetRuntime, but this is a no-op if
> marking is not in progress.
>
> This patch adds a check to skip the call unless marking is in
> progress. X86 already has this optimisation and I can't see a reason not
> to do it on AArch64 as well.
>
> Tested jtreg hotspot_all_no_apps, jdk_core.
>
> Results of ArrayCopy.arrayCopyObject* JMH benchmarks:
>
> Before:
>
> Benchmark                                    Mode  Cnt    Score    Error  Units
> ArrayCopy.arrayCopyObject                    avgt   15  117.307 ± 11.607  ns/op
> ArrayCopy.arrayCopyObjectNonConst            avgt   15  107.786 ±  2.692  ns/op
> ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15   76.381 ±  0.761  ns/op
> ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15   79.519 ±  3.433  ns/op
>
> After:
>
> Benchmark                                    Mode  Cnt    Score    Error  Units
> ArrayCopy.arrayCopyObject                    avgt   15   86.161 ±  6.150  ns/op
> ArrayCopy.arrayCopyObjectNonConst            avgt   15   83.539 ±  0.682  ns/op
> ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15   52.388 ±  0.732  ns/op
> ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15   54.619 ±  1.278  ns/op
>
> The VM call overhead can be quite high on AArch64 as we insert a
> serialising ISB instruction on every native->Java return.
>
>
> Thanks,
> Nick

From aph at redhat.com Tue Apr 7 09:53:53 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 7 Apr 2020 10:53:53 +0100
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy pre-barrier if marking not active
In-Reply-To: <4F81F31F-CE0A-4514-A2F2-86566BB18007@oracle.com>
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <4F81F31F-CE0A-4514-A2F2-86566BB18007@oracle.com>
Message-ID: <1e05ef60-bcad-8ef4-75d9-26d56952e955@redhat.com>

On 4/7/20 8:33 AM, Erik Österlund wrote:

> Note that you only need ISB when returning from call_VM, not
> call_VM_leaf. Leaf calls can't safepoint, and hence the ISB is
> redundant. Arraycopy uses leaf calls. So while this optimization is
> great for this case, maybe removing ISB from leaf calls has a wider
> effect.
>
> It also appears to me that with Stuart's new nmethod entry barriers
> enabled, ISB is never required on returns, as oops are no longer
> embedded in the instruction stream then (which is what the ISB
> protects against).

That's probably true. In order to get it right for sure we'd need to insert a bunch of assertions in debug mode. I'll have a look.

A (fairly) recent change to the ARM ARM (DDI 0487E, B2.3.5, Ordering of instruction fetches, first para.) means that we no longer have to be quite so paranoid about issuing ISBs. When call_VM was written the specification of what might happen was so loose that it was almost impossible to comply with.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at redhat.com Tue Apr 7 09:55:53 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 7 Apr 2020 10:55:53 +0100
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy pre-barrier if marking not active
In-Reply-To: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
Message-ID:

On 4/7/20 8:19 AM, Nick Gasson wrote:

The patch looks good, thanks.

> This patch adds a check to skip the call unless marking is in
> progress. X86 already has this optimisation and I can't see a reason not
> to do it on AArch64 as well.

Indeed not. I don't quite remember the history of this, but I guess that this optimization was added to x86 after we did AArch64.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From erik.osterlund at oracle.com Tue Apr 7 10:58:39 2020
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 7 Apr 2020 12:58:39 +0200
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy pre-barrier if marking not active
In-Reply-To: <1e05ef60-bcad-8ef4-75d9-26d56952e955@redhat.com>
References: <1e05ef60-bcad-8ef4-75d9-26d56952e955@redhat.com>
Message-ID:

Hi,

I think the best place to trigger the ISB is where you clear the last java frame. That happens on the exit path of all calls that may safepoint. The worry is code that explicitly saves and restores the last java frame, and uses leaf calls, instead of using call_VM. This solves that.

Thanks,
/Erik

> On 7 Apr 2020, at 11:56, Andrew Haley wrote:
>
> On 4/7/20 8:33 AM, Erik Österlund wrote:
>
>> Note that you only need ISB when returning from call_VM, not
>> call_VM_leaf. Leaf calls can't safepoint, and hence the ISB is
>> redundant. Arraycopy uses leaf calls. So while this optimization is
>> great for this case, maybe removing ISB from leaf calls has a wider
>> effect.
>>
>> It also appears to me that with Stuart's new nmethod entry barriers
>> enabled, ISB is never required on returns, as oops are no longer
>> embedded in the instruction stream then (which is what the ISB
>> protects against).
>
> That's probably true. In order to get it right for sure we'd need to
> insert a bunch of assertions in debug mode. I'll have a look.
>
> A (fairly) recent change to the ARM ARM (DDI 0487E, B2.3.5, Ordering
> of instruction fetches, first para.) means that we no longer have to
> be quite so paranoid about issuing ISBs. When call_VM was written the
> specification of what might happen was so loose that it was almost
> impossible to comply with.
>
> --
> Andrew Haley (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd.
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>

From aph at redhat.com Tue Apr 7 11:52:38 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 7 Apr 2020 12:52:38 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading
In-Reply-To: <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com> <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com>
Message-ID: <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>

I notice that even after applying your patch we are still using embedded OOPs in two places.

Here in aarch64.ad:

    if (rtype == relocInfo::oop_type) {
      __ movoop(dst_reg, (jobject)con, /*immediate*/true);
    }

and here in sharedRuntime_aarch64.cpp:

    // load oop into a register
    __ movoop(c_rarg1,
              JNIHandles::make_local(method->method_holder()->java_mirror()),
              /*immediate*/true);

Why is this?

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at redhat.com Tue Apr 7 12:25:14 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 7 Apr 2020 13:25:14 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading
In-Reply-To: <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com> <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com> <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>
Message-ID:

On 4/7/20 12:52 PM, Andrew Haley wrote:
> I notice that even after applying your patch we are still using embedded
> OOPs in two places.
>
> Here in aarch64.ad:
>
>     if (rtype == relocInfo::oop_type) {
>       __ movoop(dst_reg, (jobject)con, /*immediate*/true);
>     }
>
> and here in sharedRuntime_aarch64.cpp:
>
>     // load oop into a register
>     __ movoop(c_rarg1,
>               JNIHandles::make_local(method->method_holder()->java_mirror()),
>               /*immediate*/true);
>
> Why is this?

Ah, the second one is a handle, of course, and AFAIK handles don't move. Having said that, the use of movoop on something that is the address of an oop rather than an oop is odd, but it's done on other targets.

The C2 one is still suspect.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From derekw at marvell.com Tue Apr 7 16:05:55 2020
From: derekw at marvell.com (Derek White)
Date: Tue, 7 Apr 2020 16:05:55 +0000
Subject: [aarch64-port-dev ] Question about JVM option "-XX:+UseBarriersForVolatile" usage in aarch64.
Message-ID:

[Sorry it took a long time to get some answers on this]

I think we should no longer enable UseBarriersForVolatile for the prototype ThunderX processors (model A1, variant 0). We believe that these should all have been replaced or decommissioned. We can add an error at startup if we detect that CPU.

Processor support is independent of whether UseBarriersForVolatile should be kept for debugging & development.

On that issue, in addition to the support being broken and not regularly tested, I think that this adds a veneer of complexity to already subtle code. Especially since about half of the uses of UseBarriersForVolatile are of the form "if not using extra barriers, add a barrier". I'd be fine with seeing it go.

- Derek White, Marvell

-----Original Message-----
From: aarch64-port-dev On Behalf Of Ningsheng Jian
Sent: Thursday, April 2, 2020 10:30 PM
To: Nick Gasson ; Andrew Dinn
Cc: aarch64-port-dev at openjdk.java.net
Subject: [EXT] Re: [aarch64-port-dev ] Question about JVM option "-XX:+UseBarriersForVolatile" usage in aarch64.

External Email

----------------------------------------------------------------------
On 4/3/20 10:03 AM, Nick Gasson wrote:
> On 04/02/20 21:22 pm, Andrew Dinn wrote:
>> One reason for having this switch was to provide a comparator for our
>> scheme to implement the Java volatile accesses using ldar/stlr. That
>> translation scheme avoids a dmb after the stlr allowing the value
>> being written to be committed lazily while still providing the
>> critical guarantee that prior writes are committed before it gets
>> committed. The switch ensures we can fall back to a 'reference'
>> implementation based on dmbs that, amongst other things, enforces
>> immediate commit of the volatile write after commit of its predecessors.
>>
>> By removing support we lose the ability to test cases where
>> synchronization errors occur with our scheme by switching to the
>> 'standard' model. That may still be useful for finding bugs (current
>> or newly injected) in our translation and, indeed, in new HW.
>
> OK, but keeping it is not without cost.
If UseBarriersForVolatile is
> to have value as a reference implementation we need to expend effort
> to test it and fix any bugs that arise from changes to other parts of
> the code (see Xiaohong's original mail).
>
>

Yes, if the "reference" implementation is not widely used and tested, it might be buggy and misleading. I know that when Xiaohong was working on a similar part of Graal [1], she spent a lot of time tracing the UseBarriersForVolatile issue in the HotSpot VM. So I agree with Nick and tend towards not maintaining this implementation.

[1] https://github.com/oracle/graal/pull/2181

Thanks,
Ningsheng

From nick.gasson at arm.com Wed Apr 8 06:22:49 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Wed, 08 Apr 2020 14:22:49 +0800
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy pre-barrier if marking not active
In-Reply-To:
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
Message-ID: <85ftde1op2.fsf@nicgas01-03-arm-vm.shanghai.arm.com>

On 04/07/20 17:55 pm, Andrew Haley wrote:
>
> The patch looks good, thanks.
>

Thanks, pushed it.

>> This patch adds a check to skip the call unless marking is in
>> progress. X86 already has this optimisation and I can't see a reason not
>> to do it on AArch64 as well.
>
> Indeed not. I don't quite remember the history of this, but I guess
> that this optimization was added to x86 after we did AArch64.

I'm wondering if it's possible to optimise the array copy post-barrier in some cases as well. That VM call ends up in G1BarrierSet::invalidate which has a loop:

    // skip initial young cards
    for (; byte <= last_byte && *byte == G1CardTable::g1_young_card_val(); byte++);

If the card table entries for the whole array are G1CardTable::g1_young_card_val() then it's a no-op.
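
For readers following along, the check described above can be sketched in isolation roughly as follows. This is only an illustration under stated assumptions, not HotSpot code: the card table is modelled as a plain byte vector, `kYoungCard` is a placeholder standing in for G1CardTable::g1_young_card_val(), and `all_cards_young` is a hypothetical helper name.

```cpp
#include <cstddef>
#include <cstdint>
#include <vector>

// Placeholder for G1CardTable::g1_young_card_val(); the real value is
// defined inside HotSpot and is only assumed here for illustration.
constexpr std::uint8_t kYoungCard = 0x08;

// Hypothetical helper: returns true when every card in the inclusive
// range [first, last] is marked young, which is the case in which the
// post-barrier runtime call would have nothing to do.
// Precondition: first <= last < cards.size().
bool all_cards_young(const std::vector<std::uint8_t>& cards,
                     std::size_t first, std::size_t last) {
  std::size_t i = first;
  // Skip initial young cards, mirroring the loop quoted above.
  for (; i <= last && cards[i] == kYoungCard; ++i) {
  }
  // If the scan ran off the end of the range, no non-young card was seen.
  return i > last;
}
```

A caller would take the slow path (the runtime call) only when this returns false.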

I tried replicating this loop in the barrier set assembly as a filter and only calling into the runtime when we know it has work to do. This seems to work, and gives a similar magnitude speed-up on the ArrayCopy microbenchmarks to the above patch. Removing the ISB on leaf calls helps too but skipping the call entirely is a bigger win.

Do you think this is safe and worth doing?

Thanks,
Nick

From aph at redhat.com Wed Apr 8 10:26:51 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 8 Apr 2020 11:26:51 +0100
Subject: [aarch64-port-dev ] Question about JVM option "-XX:+UseBarriersForVolatile" usage in aarch64.
In-Reply-To:
References:
Message-ID:

On 4/7/20 5:05 PM, Derek White wrote:
> I think we should no longer enable UseBarriersForVolatile for the prototype ThunderX processors (model A1, variant 0). We believe that these should all have been replaced or decommissioned. We can add an error at startup if we detect that CPU.
>
> Processor support is independent of whether UseBarriersForVolatile should be kept for debugging & development.
>
> On that issue, in addition to the support being broken and not regularly tested, I think that this adds a veneer of complexity to already subtle code. Especially since about half of the uses of UseBarriersForVolatile are of the form "if not using extra barriers, add a barrier". I'd be fine with seeing it go.

OK, thanks.

One other use of UseBarriersForVolatile was as a fallback when HotSpot changes broke the use of ldar/stlr for volatile. Andrew Dinn, do you think we still need it as a fallback in case ldar/stlr handling breaks again?

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at redhat.com Wed Apr 8 12:38:05 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 8 Apr 2020 13:38:05 +0100
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy pre-barrier if marking not active
In-Reply-To: <85ftde1op2.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <85ftde1op2.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
Message-ID:

On 4/8/20 7:22 AM, Nick Gasson wrote:
> Do you think this is safe and worth doing?

Please forgive me for turning this into a rather extreme thought experiment: if we hand-translate all GC runtime methods into all targets, we have an NxM problem, #collectors * #targets. So it's hard to justify without some heavy usage. And also, it means that if any of these runtime methods change, we'd risk falling behind on AArch64.

Can you show us the assembly instructions that we'd save?

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at redhat.com Wed Apr 8 13:08:56 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 8 Apr 2020 14:08:56 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading
In-Reply-To:
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com> <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com> <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>
Message-ID:

On 4/7/20 1:25 PM, Andrew Haley wrote:
> On 4/7/20 12:52 PM, Andrew Haley wrote:
>> I notice that even after applying your patch we are still using embedded
>> OOPs in two places.
>>
>> Here in aarch64.ad:
>>
>>     if (rtype == relocInfo::oop_type) {
>>       __ movoop(dst_reg, (jobject)con, /*immediate*/true);
>>     }
>>
>> and here in sharedRuntime_aarch64.cpp:
>>
>>     // load oop into a register
>>     __ movoop(c_rarg1,
>>               JNIHandles::make_local(method->method_holder()->java_mirror()),
>>               /*immediate*/true);
>>
>> Why is this?
>
> Ah, the second one is a handle, of course, and AFAIK handles don't move.
> Having said that, the use of movoop on something that is the address of an
> oop rather than an oop is odd, but it's done on other targets.
>
> The C2 one is still suspect.

I made the following changes, bootstrap still works:

diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/aarch64.ad
--- a/src/hotspot/cpu/aarch64/aarch64.ad	Wed Apr 08 08:57:07 2020 -0400
+++ b/src/hotspot/cpu/aarch64/aarch64.ad	Wed Apr 08 09:03:18 2020 -0400
@@ -3160,7 +3160,7 @@
       } else {
         relocInfo::relocType rtype = $src->constant_reloc();
         if (rtype == relocInfo::oop_type) {
-          __ movoop(dst_reg, (jobject)con, /*immediate*/true);
+          __ movoop(dst_reg, (jobject)con, /*immediate*/false);
         } else if (rtype == relocInfo::metadata_type) {
           __ mov_metadata(dst_reg, (Metadata*)con);
         } else {
diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
--- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp	Wed Apr 08 08:57:07 2020 -0400
+++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp	Wed Apr 08 09:03:18 2020 -0400
@@ -4145,7 +4145,7 @@
   if (! immediate) {
     // nmethod barriers need to be ordered with respected to oop accesses, so
     // we can't use immediate literals as that would necessitate ISBs.
-    if (BarrierSet::barrier_set()->barrier_set_nmethod() != NULL) {
+    if (0 && BarrierSet::barrier_set()->barrier_set_nmethod() != NULL) {
       adr(dst, InternalAddress(address_constant((address)obj, rspec)));
       ldr(dst, Address(dst));
     } else {
diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
--- a/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp	Wed Apr 08 08:57:07 2020 -0400
+++ b/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp	Wed Apr 08 09:03:18 2020 -0400
@@ -1676,11 +1676,10 @@
 
   // Pre-load a static method's oop into c_rarg1.
   if (method->is_static() && !is_critical_native) {
-
     // load oop into a register
     __ movoop(c_rarg1,
               JNIHandles::make_local(method->method_holder()->java_mirror()),
-              /*immediate*/true);
+              /*immediate*/false);
 
     // Now handlize the static class mirror it's known not-null.
     __ str(c_rarg1, Address(sp, klass_offset));
diff -r cd06d732d5f0 src/hotspot/share/runtime/sharedRuntime.cpp
--- a/src/hotspot/share/runtime/sharedRuntime.cpp	Wed Apr 08 08:57:07 2020 -0400
+++ b/src/hotspot/share/runtime/sharedRuntime.cpp	Wed Apr 08 09:03:18 2020 -0400
@@ -2873,6 +2873,7 @@
       CodeBuffer buffer(buf);
       double locs_buf[20];
       buffer.insts()->initialize_shared_locs((relocInfo*)locs_buf, sizeof(locs_buf) / sizeof(relocInfo));
+      buffer.initialize_consts_size(8);
       MacroAssembler _masm(&buffer);
 
       // Fill in the signature array, for the calling-convention call.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From adinn at redhat.com Wed Apr 8 13:18:11 2020
From: adinn at redhat.com (Andrew Dinn)
Date: Wed, 8 Apr 2020 14:18:11 +0100
Subject: [aarch64-port-dev ] Question about JVM option "-XX:+UseBarriersForVolatile" usage in aarch64.
In-Reply-To: References: Message-ID: <0aed395c-b485-c846-bbb0-f4b64c32d087@redhat.com> On 08/04/2020 11:26, Andrew Haley wrote: > One other use of UseBarriersForVolatile was as a fallback when > HotSpot changes broke the use of ldar/stlr for volatile. Andrew Dinn, > so you think we still need it as a fallback in case ldar/stlr > handling breaks again? Well, I'm probably not the person to ask as my thought was that maintaining the two sets of paths that this flag implies was not really much of a burden. That's probably just me though (it /was/ mostly my code). A-and yet I can see that it's not just me who is not maintaining this code. So, if others find this complexity a burden then I'm happy for us to simplify things by removing the flag and the alternative paths. I think the code has baked fairly well so perhaps we don't need this fallback. regards, Andrew Dinn ----------- Senior Principal Software Engineer Red Hat UK Ltd Registered in England and Wales under Company Registration No. 03798903 Directors: Michael Cunningham, Michael ("Mike") O'Neill From ci_notify at linaro.org Wed Apr 8 15:28:39 2020 From: ci_notify at linaro.org (ci_notify at linaro.org) Date: Wed, 8 Apr 2020 15:28:39 +0000 (UTC) Subject: [aarch64-port-dev ] JTREG, JCStress, SPECjbb2015 and Hadoop/Terasort results for OpenJDK 14 on AArch64 Message-ID: <1887422348.15550.1586359720620.JavaMail.javamailuser@localhost> This is a summary of the JTREG test results =========================================== The build and test results are cycled every 15 days. 
For detailed information on the test output please refer to: http://openjdk.linaro.org/jdk14/openjdk-jtreg-nightly-tests/summary/2020/098/summary.html ------------------------------------------------------------------------------- release/hotspot ------------------------------------------------------------------------------- Build 0: aarch64/2020/jan/23 pass: 5,773; fail: 45 Build 1: aarch64/2020/jan/30 pass: 5,773; fail: 45 Build 2: aarch64/2020/feb/06 pass: 5,773; fail: 46 Build 3: aarch64/2020/feb/09 pass: 5,775; fail: 44 Build 4: aarch64/2020/apr/07 pass: 5,781; fail: 45 1 fatal errors were detected; please follow the link above for more detail. ------------------------------------------------------------------------------- release/jdk ------------------------------------------------------------------------------- Build 0: aarch64/2020/jan/23 pass: 8,831; fail: 524; error: 18 Build 1: aarch64/2020/jan/30 pass: 8,839; fail: 518; error: 17 Build 2: aarch64/2020/feb/06 pass: 8,838; fail: 517; error: 18 Build 3: aarch64/2020/feb/09 pass: 8,832; fail: 523; error: 18 Build 4: aarch64/2020/apr/07 pass: 8,844; fail: 505; error: 20 1 fatal errors were detected; please follow the link above for more detail. 
------------------------------------------------------------------------------- release/langtools ------------------------------------------------------------------------------- Build 0: aarch64/2020/jan/23 pass: 4,031 Build 1: aarch64/2020/jan/30 pass: 4,031 Build 2: aarch64/2020/feb/03 pass: 4,031 Build 3: aarch64/2020/feb/06 pass: 4,031 Build 4: aarch64/2020/feb/09 pass: 4,031 Build 5: aarch64/2020/apr/07 pass: 4,031 Previous results can be found here: http://openjdk.linaro.org/jdk14/openjdk-jtreg-nightly-tests/index.html SPECjbb2015 composite regression test completed =============================================== This test measures the relative performance of the server compiler running the SPECjbb2015 composite tests and compares the performance against the baseline performance of the server compiler taken on 2016-11-21. In accordance with [1], the SPECjbb2015 tests are run on a system which is not production ready and does not meet all the requirements for publishing compliant results. The numbers below shall be treated as non-compliant (nc) and are for experimental purposes only. Relative performance: Server max-jOPS (nc): 8.24x Relative performance: Server critical-jOPS (nc): 9.80x Details of the test setup and historical results may be found here: http://openjdk.linaro.org/jdk14/SPECjbb2015-results/ [1] http://www.spec.org/fairuse.html#Academic Regression test Hadoop-Terasort completed ========================================= This test measures the performance of the server and client compilers running Hadoop sorting a 1GB file using Terasort and compares the performance against the baseline performance of the Zero interpreter and against the baseline performance of the server compiler on 2014-04-01. 

Relative performance: Zero: 1.0, Server: 210.67

Server 210.67 / Server 2014-04-01 (71.00): 2.97x

Details of the test setup and historical results may be found here:
http://openjdk.linaro.org/jdk14/hadoop-terasort-benchmark-results/

This is a summary of the jcstress test results
==============================================

The build and test results are cycled every 15 days.

2020-01-24 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/2020/023/results/
2020-02-01 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/2020/030/results/
2020-02-08 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/2020/037/results/
2020-02-10 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/2020/040/results/
2020-04-08 pass rate: 9702/9702, results: http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/2020/098/results/

For detailed information on the test output please refer to:
http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/

From stumon01 at arm.com Wed Apr 8 15:33:20 2020
From: stumon01 at arm.com (Stuart Monteith)
Date: Wed, 8 Apr 2020 16:33:20 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading
In-Reply-To:
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com> <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com> <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>
Message-ID:

I see what you did there. This comes back to our previous discussion
about the value of having immediate oops at all. Isn't that what you are
effectively suggesting? That would simplify the code somewhat.

On 08/04/2020 14:08, Andrew Haley wrote:
> On 4/7/20 1:25 PM, Andrew Haley wrote:
>> On 4/7/20 12:52 PM, Andrew Haley wrote:
>>> I notice that even after applying your patch we are still using embedded
>>> OOPs in two places.
>>> >>> Here in aarch64.ad: >>> >>> if (rtype == relocInfo::oop_type) { >>> __ movoop(dst_reg, (jobject)con, /*immediate*/true); >>> } >>> >>> and here in sharedRuntime_aarch64.cpp: >>> >>> // load oop into a register >>> __ movoop(c_rarg1, >>> JNIHandles::make_local(method->method_holder()->java_mirror()), >>> /*immediate*/true); >>> >>> Why is this? >> >> Ah, the second one is a handle, of course, and AFAIK handles don't move. >> Having said that, the use of movoop on something that is the address of an >> oop rather than an oop is odd,but it's done on other targets. >> >> The C2 one is still suspect. > > I made the following changes, bootstrap still works: > > diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/aarch64.ad > --- a/src/hotspot/cpu/aarch64/aarch64.ad Wed Apr 08 08:57:07 2020 -0400 > +++ b/src/hotspot/cpu/aarch64/aarch64.ad Wed Apr 08 09:03:18 2020 -0400 > @@ -3160,7 +3160,7 @@ > } else { > relocInfo::relocType rtype = $src->constant_reloc(); > if (rtype == relocInfo::oop_type) { > - __ movoop(dst_reg, (jobject)con, /*immediate*/true); > + __ movoop(dst_reg, (jobject)con, /*immediate*/false); > } else if (rtype == relocInfo::metadata_type) { > __ mov_metadata(dst_reg, (Metadata*)con); > } else { > diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp > --- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp Wed Apr 08 08:57:07 2020 -0400 > +++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp Wed Apr 08 09:03:18 2020 -0400 > @@ -4145,7 +4145,7 @@ > if (! immediate) { > // nmethod barriers need to be ordered with respected to oop accesses, so > // we can't use immediate literals as that would necessitate ISBs. 
> - if (BarrierSet::barrier_set()->barrier_set_nmethod() != NULL) { > + if (0 && BarrierSet::barrier_set()->barrier_set_nmethod() != NULL) { > adr(dst, InternalAddress(address_constant((address)obj, rspec))); > ldr(dst, Address(dst)); > } else { > diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp > --- a/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp Wed Apr 08 08:57:07 2020 -0400 > +++ b/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp Wed Apr 08 09:03:18 2020 -0400 > @@ -1676,11 +1676,10 @@ > > // Pre-load a static method's oop into c_rarg1. > if (method->is_static() && !is_critical_native) { > - > // load oop into a register > __ movoop(c_rarg1, > JNIHandles::make_local(method->method_holder()->java_mirror()), > - /*immediate*/true); > + /*immediate*/false); > > // Now handlize the static class mirror it's known not-null. > __ str(c_rarg1, Address(sp, klass_offset)); > diff -r cd06d732d5f0 src/hotspot/share/runtime/sharedRuntime.cpp > --- a/src/hotspot/share/runtime/sharedRuntime.cpp Wed Apr 08 08:57:07 2020 -0400 > +++ b/src/hotspot/share/runtime/sharedRuntime.cpp Wed Apr 08 09:03:18 2020 -0400 > @@ -2873,6 +2873,7 @@ > CodeBuffer buffer(buf); > double locs_buf[20]; > buffer.insts()->initialize_shared_locs((relocInfo*)locs_buf, sizeof(locs_buf) / sizeof(relocInfo)); > + buffer.initialize_consts_size(8); > MacroAssembler _masm(&buffer); > > // Fill in the signature array, for the calling-convention call. > IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you. 

From aph at redhat.com Wed Apr 8 16:15:58 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 8 Apr 2020 17:15:58 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading
In-Reply-To:
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com> <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com> <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>
Message-ID: <5018c8e8-73f9-ad71-1e0b-7874e98dea3c@redhat.com>

On 4/8/20 4:33 PM, Stuart Monteith wrote:
> I see what you did there. This comes back to our previous discussion
> about the value of having immediate oops at all. Isn't that what you are
> effectively suggesting? That would simplify the code somewhat.

Not entirely. Immediate oops are good for most GCs. But according to what Erik said, immediate oops are verboten when we're using ZGC with concurrent method unloading, and it seems to be very easy to do.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From Yang.Zhang at arm.com Thu Apr 9 06:43:12 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Thu, 9 Apr 2020 06:43:12 +0000
Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential register clash issue in reduce_add2I
In-Reply-To:
References:
Message-ID:

Hi

Update the patch a little. Could you please help to review it?
http://cr.openjdk.java.net/~yzhang/8241911/webrev.01/

Test: tier1.

-----Original Message-----
From: aarch64-port-dev On Behalf Of Yang Zhang
Sent: Friday, April 3, 2020 6:49 PM
To: hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Cc: nd
Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential register clash issue in reduce_add2I

Hi,

Could you please help to review this patch?

In the original reduce_add2I, dst may be the same as tmp2, which may give an incorrect result.
Some reduction operation instruct code formats are also cleaned up. JBS: https://bugs.openjdk.java.net/browse/JDK-8241911 Webrev: http://cr.openjdk.java.net/~yzhang/8241911/webrev.00/ Regards Yang From nick.gasson at arm.com Thu Apr 9 08:59:44 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Thu, 09 Apr 2020 16:59:44 +0800 Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy pre-barrier if marking not active In-Reply-To: References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <85ftde1op2.fsf@nicgas01-03-arm-vm.shanghai.arm.com> Message-ID: <85h7xtuj9b.fsf@nicgas01-03-arm-vm.shanghai.arm.com> On 04/08/20 20:38 pm, Andrew Haley wrote: > On 4/8/20 7:22 AM, Nick Gasson wrote: >> Do you think this is safe and worth doing? > > Please forgive me for turning this into a rather extreme thought > experiment: if we hand-translate all GC runtime methods into all > targets, we have an NxM problem, #collectors * #targets. So it's hard > to justify without some heavy usage. And also, it means that if any of > these runtime methods change, we'd risk falling behind on AArch64. > > Can you show us the assembly instructions that we'd save? 

So I'm suggesting doing the following:

--- a/src/hotspot/cpu/aarch64/gc/g1/g1BarrierSetAssembler_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/gc/g1/g1BarrierSetAssembler_aarch64.cpp
@@ -87,13 +87,43 @@ void G1BarrierSetAssembler::gen_write_ref_array_pre_barrier(MacroAssembler* masm
 
 void G1BarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators,
                                                              Register start, Register count, Register scratch, RegSet saved_regs) {
-  __ push(saved_regs, sp);
-  assert_different_registers(start, count, scratch);
+
+  assert_different_registers(start, count, scratch, rscratch1, rscratch2);
   assert_different_registers(c_rarg0, count);
+
+  const Register card_addr = scratch;
+  const Register end_card_addr = rscratch1;
+
+  Label skip, slowpath, next;
+
+  __ cbz(count, skip);
+
+  __ lsr(card_addr, start, CardTable::card_shift);
+
+  __ lea(end_card_addr, Address(start, count, Address::lsl(LogBytesPerHeapOop)));
+  __ lsr(end_card_addr, end_card_addr, CardTable::card_shift);
+
+  __ load_byte_map_base(rscratch2);
+  __ add(card_addr, card_addr, rscratch2);
+  __ add(end_card_addr, end_card_addr, rscratch2);
+
+  __ bind(next);
+  __ ldrb(rscratch2, Address(card_addr));
+  __ cmpw(rscratch2, (int)G1CardTable::g1_young_card_val());
+  __ br(Assembler::NE, slowpath);
+  __ cmp(card_addr, end_card_addr);
+  __ br(Assembler::EQ, skip);
+  __ add(card_addr, card_addr, 1);
+  __ b(next);
+
+  __ bind(slowpath);
+  __ push(saved_regs, sp);
   __ mov(c_rarg0, start);
   __ mov(c_rarg1, count);
   __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_array_post_entry), 2);
   __ pop(saved_regs, sp);
+
+  __ bind(skip);
 }

(And change the call sites to not pass rscratch1 as scratch.)

It has a nice speedup on the ArrayCopy microbenchmarks, but I agree this sort of thing is a maintenance burden if it doesn't affect real workloads.

With JDK-8242029:

Benchmark                                    Mode  Cnt   Score   Error  Units
ArrayCopy.arrayCopyObject                    avgt   15  82.314 ± 0.641  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt   15  87.351 ± 6.820  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15  54.272 ± 1.445  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15  54.596 ± 1.329  ns/op

With the above modification:

Benchmark                                    Mode  Cnt   Score   Error  Units
ArrayCopy.arrayCopyObject                    avgt   15  58.913 ± 1.265  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt   15  64.682 ± 8.147  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15  36.866 ± 1.319  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15  30.445 ± 3.719  ns/op

Thanks,
Nick

From aph at redhat.com Thu Apr 9 09:41:59 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 9 Apr 2020 10:41:59 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential register clash issue in reduce_add2I
In-Reply-To:
References:
Message-ID: <1a9ed6d0-40bb-1dc4-4eff-b55c86627a47@redhat.com>

On 4/9/20 7:43 AM, Yang Zhang wrote:
> Hi
>
> Update the patch a little. Could you please help to review it?
> http://cr.openjdk.java.net/~yzhang/8241911/webrev.01/

I've been trying to figure out why this code is so difficult to understand. I think it's because names like tmp1 and src1 are used regardless of what kind of thing tmp1 is.

I suggest something like

instruct reduce_add4I(iRegINoSp dst, iRegIorL2I i_src, vecX v_src, vecX v_tmp, iRegINoSp i_tmp) %{
  match(Set dst (AddReductionVI i_src v_src));
  ins_cost(INSN_COST);
  effect(TEMP v_tmp, TEMP i_tmp);
  format %{ "addv $v_tmp, T4S, $v_src\n\t"
            "umov $i_tmp, $v_tmp, S, 0\n\t"
            "addw $dst, $i_tmp, $i_src\t# add reduction4I"
  %}
  ins_encode %{
    __ addv(as_FloatRegister($v_tmp$$reg), __ T4S,
            as_FloatRegister($v_src$$reg));
    __ umov($i_tmp$$Register, as_FloatRegister($v_tmp$$reg), __ S, 0);
    __ addw($dst$$Register, $i_tmp$$Register, $i_src$$Register);
  %}
  ins_pipe(pipe_class_default);
%}

I think this makes the intent much clearer.

Thanks.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From Yang.Zhang at arm.com Thu Apr 9 11:21:42 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Thu, 9 Apr 2020 11:21:42 +0000 Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential register clash issue in reduce_add2I In-Reply-To: <1a9ed6d0-40bb-1dc4-4eff-b55c86627a47@redhat.com> References: <1a9ed6d0-40bb-1dc4-4eff-b55c86627a47@redhat.com> Message-ID: Hi Andrew >instruct reduce_add4I(iRegINoSp dst, iRegIorL2I i_src, vecX v_src, vecX v_tmp, iRegINoSp i_tmp) %{ Besides reduce_add4I, other reduction operations (reduce_mul4I, reduce_max4F, etc) also have such issues. How about creating another JBS and patch to fix this issue? -----Original Message----- From: Andrew Haley Sent: Thursday, April 9, 2020 5:42 PM To: Yang Zhang ; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net Cc: nd Subject: Re: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential register clash issue in reduce_add2I On 4/9/20 7:43 AM, Yang Zhang wrote: > Hi > > Update the patch a little. Could you please help to review it? > http://cr.openjdk.java.net/~yzhang/8241911/webrev.01/ I've been trying to figure out why this code is so difficult to understand. I think it's because names like tmp1 and src1 are used regardless of what kind of thing tmp1 is. 
I suggest something like instruct reduce_add4I(iRegINoSp dst, iRegIorL2I i_src, vecX v_src, vecX v_tmp, iRegINoSp i_tmp) %{ match(Set dst (AddReductionVI i_src v_src)); ins_cost(INSN_COST); effect(TEMP v_tmp, TEMP i_tmp); format %{ "addv $v_tmp, T4S, $v_src\n\t" "umov $i_tmp, $v_tmp, S, 0\n\t" "addw $dst, $i_tmp, $i_src\t# add reduction4I" %} ins_encode %{ __ addv(as_FloatRegister($v_tmp$$reg), __ T4S, as_FloatRegister($v_src$$reg)); __ umov($i_tmp$$Register, as_FloatRegister($v_tmp$$reg), __ S, 0); __ addw($dst$$Register, $i_tmp$$Register, $i_src$$Register); %} ins_pipe(pipe_class_default); %} I think this makes the intent much clearer. Thanks. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Thu Apr 9 12:21:22 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 9 Apr 2020 13:21:22 +0100 Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential register clash issue in reduce_add2I In-Reply-To: References: <1a9ed6d0-40bb-1dc4-4eff-b55c86627a47@redhat.com> Message-ID: On 4/9/20 12:21 PM, Yang Zhang wrote: > Hi Andrew > >> instruct reduce_add4I(iRegINoSp dst, iRegIorL2I i_src, vecX v_src, vecX v_tmp, iRegINoSp i_tmp) %{ > > Besides reduce_add4I, other reduction operations (reduce_mul4I, reduce_max4F, etc) also have such issues. How about creating another JBS and patch to fix this issue? That's a good point. I'll accept http://cr.openjdk.java.net/~yzhang/8241911/webrev.01/ as it is, with a separate patch to clarify those reduction operations. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Thu Apr 9 16:31:38 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 9 Apr 2020 17:31:38 +0100 Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy pre-barrier if marking not active In-Reply-To: <85h7xtuj9b.fsf@nicgas01-03-arm-vm.shanghai.arm.com> References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <85ftde1op2.fsf@nicgas01-03-arm-vm.shanghai.arm.com> <85h7xtuj9b.fsf@nicgas01-03-arm-vm.shanghai.arm.com> Message-ID: On 4/9/20 9:59 AM, Nick Gasson wrote: > It has a nice speedup on the ArrayCopy microbenchmarks, but I agree this > sort of thing is a maintenance burden if it doesn't affect real > workloads. Now you've got me interested. :-) I'm looking at the code we we execute when we call the runtime. The call_VM_leaf() we generate is 0x0000ffffa913a6ec: mov x0, x1 0x0000ffffa913a6f0: mov x1, x2 0x0000ffffa913a6f4: stp x8, x12, [sp, #-16]! ;; 0xFFFFBCE50CD4 0x0000ffffa913a6f8: mov x8, #0xcd4 // #3284 0x0000ffffa913a6fc: movk x8, #0xbce5, lsl #16 0x0000ffffa913a700: movk x8, #0xffff, lsl #32 0x0000ffffa913a704: blr x8 0x0000ffffa913a708: ldp x8, x12, [sp], #16 0x0000ffffa913a70c: isb As discussed, we can lose the ISB here. If we're not called from the interpreter we can also lose the saving of r12 and rscratch1. 
This calls G1BarrierSetRuntime::write_ref_array_post_entry() => 0x0000ffffbd89d750 <+0>: adrp x2, 0xffffbe2ae000 0x0000ffffbd89d754 <+4>: adrp x4, 0xffffbe2aa000 0x0000ffffbd89d758 <+8>: and x3, x0, #0xfffffffffffffff8 0x0000ffffbd89d75c <+12>: ldr x2, [x2, #264] 0x0000ffffbd89d760 <+16>: ldr x4, [x4, #2024] 0x0000ffffbd89d764 <+20>: ldrsw x2, [x2] 0x0000ffffbd89d768 <+24>: madd x2, x2, x1, x0 0x0000ffffbd89d76c <+28>: ldr x0, [x4] 0x0000ffffbd89d770 <+32>: add x2, x2, #0x7 0x0000ffffbd89d774 <+36>: and x2, x2, #0xfffffffffffffff8 0x0000ffffbd89d778 <+40>: adrp x4, 0xffffbd895000 0x0000ffffbd89d77c <+44>: sub x2, x2, x3 0x0000ffffbd89d780 <+48>: add x4, x4, #0x640 0x0000ffffbd89d784 <+52>: ldr x5, [x0] 0x0000ffffbd89d788 <+56>: lsr x2, x2, #3 0x0000ffffbd89d78c <+60>: ldr x7, [x5, #88] 0x0000ffffbd89d790 <+64>: cmp x7, x4 0x0000ffffbd89d794 <+68>: b.ne 0xffffbd89d7a8 0x0000ffffbd89d798 <+72>: ldr x4, [x5, #56] 0x0000ffffbd89d79c <+76>: mov x1, x3 0x0000ffffbd89d7a0 <+80>: mov x16, x4 0x0000ffffbd89d7a4 <+84>: br x16 which seems to be a bunch of stuff to discover the adresses to scan, aligning them properly, followed by a virtual dispatch to G1BarrierSet::invalidate(), which contains the loop which scans the card table: 0x0000ffffbda250a0 <+0>: cbz x2, 0xffffbda25170 0x0000ffffbda250a4 <+4>: stp x29, x30, [sp, #-48]! 
  0x0000ffffbda250a8 <+8>: add x2, x1, x2, lsl #3
  0x0000ffffbda250ac <+12>: mov x29, sp
  0x0000ffffbda250b0 <+16>: str x21, [sp, #32]
  0x0000ffffbda250b4 <+20>: sub x21, x2, #0x8
  0x0000ffffbda250b8 <+24>: ldr x0, [x0, #64]
  0x0000ffffbda250bc <+28>: ldr x0, [x0, #72]
  0x0000ffffbda250c0 <+32>: add x1, x0, x1, lsr #9
  0x0000ffffbda250c4 <+36>: add x21, x0, x21, lsr #9
  0x0000ffffbda250c8 <+40>: cmp x21, x1
  0x0000ffffbda250cc <+44>: b.cc 0xffffbda25164 // b.lo, b.ul, b.last
  0x0000ffffbda250d0 <+48>: stp x19, x20, [sp, #16]
  0x0000ffffbda250d4 <+52>: b 0xffffbda250e0
  0x0000ffffbda250d8 <+56>: cmp x21, x1
  0x0000ffffbda250dc <+60>: b.cc 0xffffbda25160 // b.lo, b.ul, b.last
  0x0000ffffbda250e0 <+64>: ldrb w0, [x1]
  0x0000ffffbda250e4 <+68>: mov x19, x1
  0x0000ffffbda250e8 <+72>: add x1, x1, #0x1
  0x0000ffffbda250ec <+76>: and w0, w0, #0xff
  0x0000ffffbda250f0 <+80>: cmp w0, #0x8
  0x0000ffffbda250f4 <+84>: b.eq 0xffffbda250d8 // b.none
  ...
  0x0000ffffbda25160 <+192>: ldp x19, x20, [sp, #16]
  0x0000ffffbda25164 <+196>: ldr x21, [sp, #32]
  0x0000ffffbda25168 <+200>: ldp x29, x30, [sp], #48
  0x0000ffffbda2516c <+204>: ret

This clearly is a fair bit more than what we'd do by hand. The thing
that baffles me, I guess, is why the runtime does all this extra
stuff.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From stuart.monteith at arm.com Thu Apr 9 19:18:01 2020
From: stuart.monteith at arm.com (Stuart Monteith)
Date: Thu, 9 Apr 2020 20:18:01 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241587: Aarch64: remove x86 specifics from os_linux.cpp/hpp/inline.hpp
In-Reply-To: <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com>
References: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com>
 <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com>
Message-ID:

Thanks David. I'll need someone to push this for me. Ningsheng - would
you be able to?

Thanks,
Stuart

On 01/04/2020 11:03, David Holmes wrote:
> Hi Stuart,
>
> On 1/04/2020 7:29 pm, Stuart Monteith wrote:
>> Hello,
>>     This patch removes a couple of x86 specifics from aarch64
>> code. Tested with hotspot tier1.
>>
>> Webrev:
>>     http://cr.openjdk.java.net/~smonteith/8241587/webrev.0/
>> Bug:
>>     https://bugs.openjdk.java.net/browse/JDK-8241587
>
> That clean up seems good to me.
>
>> Thanks,
>>     Stuart
>>
>> IMPORTANT NOTICE: The contents of this email and any attachments are
>> confidential and may also be privileged. If you are not the intended
>> recipient, please notify the sender immediately and do not disclose
>> the contents to any other person, use it for any purpose, or store or
>> copy the information in any medium. Thank you.
>
> That footer seems inappropriate for OpenJDK emails.
>
> Cheers,
> David
>

From ci_notify at linaro.org Thu Apr 9 23:41:53 2020
From: ci_notify at linaro.org (ci_notify at linaro.org)
Date: Thu, 9 Apr 2020 23:41:53 +0000 (UTC)
Subject: [aarch64-port-dev ] JTREG, JCStress, SPECjbb2015 and Hadoop/Terasort results for OpenJDK JDK on AArch64
Message-ID: <35869368.16925.1586475714461.JavaMail.javamailuser@localhost>

This is a summary of the JTREG test results
===========================================

The build and test results are cycled every 15 days.
For detailed information on the test output please refer to: http://openjdk.linaro.org/jdkX/openjdk-jtreg-nightly-tests/summary/2020/099/summary.html ------------------------------------------------------------------------------- client-release/hotspot ------------------------------------------------------------------------------- Build 0: aarch64/2018/oct/15 pass: 5,780; fail: 19; not run: 90 ------------------------------------------------------------------------------- client-release/jdk ------------------------------------------------------------------------------- Build 0: aarch64/2018/oct/15 pass: 8,495; fail: 670; error: 23 ------------------------------------------------------------------------------- client-release/langtools ------------------------------------------------------------------------------- Build 0: aarch64/2018/oct/15 pass: 3,970; fail: 5 ------------------------------------------------------------------------------- release/hotspot ------------------------------------------------------------------------------- Build 0: aarch64/2020/jan/13 pass: 5,770; fail: 44 Build 1: aarch64/2020/jan/15 pass: 5,770; fail: 46 Build 2: aarch64/2020/jan/20 pass: 5,776; fail: 44 Build 3: aarch64/2020/jan/22 pass: 5,776; fail: 44 Build 4: aarch64/2020/jan/24 pass: 5,775; fail: 45 Build 5: aarch64/2020/jan/27 pass: 5,776; fail: 44 Build 6: aarch64/2020/jan/29 pass: 5,776; fail: 44 Build 7: aarch64/2020/feb/01 pass: 5,777; fail: 46 Build 8: aarch64/2020/feb/03 pass: 5,777; fail: 46 Build 9: aarch64/2020/feb/05 pass: 5,778; fail: 46 Build 10: aarch64/2020/feb/10 pass: 5,781; fail: 46 Build 11: aarch64/2020/feb/12 pass: 5,786; fail: 46 Build 12: aarch64/2020/mar/06 pass: 5,797; fail: 46 Build 13: aarch64/2020/mar/16 pass: 5,796; fail: 47 Build 14: aarch64/2020/apr/08 pass: 5,816; fail: 46; error: 2 ------------------------------------------------------------------------------- release/jdk 
------------------------------------------------------------------------------- Build 0: aarch64/2020/jan/13 pass: 8,825; fail: 524; error: 20 Build 1: aarch64/2020/jan/15 pass: 8,827; fail: 524; error: 19 Build 2: aarch64/2020/jan/20 pass: 8,830; fail: 529; error: 16 Build 3: aarch64/2020/jan/22 pass: 8,829; fail: 528; error: 19 Build 4: aarch64/2020/jan/24 pass: 8,832; fail: 537; error: 16 Build 5: aarch64/2020/jan/27 pass: 8,846; fail: 523; error: 17 Build 6: aarch64/2020/jan/29 pass: 8,844; fail: 522; error: 19 Build 7: aarch64/2020/feb/01 pass: 8,848; fail: 523; error: 18 Build 8: aarch64/2020/feb/03 pass: 8,851; fail: 525; error: 15 Build 9: aarch64/2020/feb/05 pass: 8,851; fail: 526; error: 15 Build 10: aarch64/2020/feb/10 pass: 8,858; fail: 518; error: 20 Build 11: aarch64/2020/feb/12 pass: 8,849; fail: 525; error: 17 Build 12: aarch64/2020/mar/06 pass: 8,870; fail: 526; error: 17 Build 13: aarch64/2020/mar/16 pass: 8,872; fail: 525; error: 16 Build 14: aarch64/2020/apr/08 pass: 8,891; fail: 532; error: 13 4 fatal errors were detected; please follow the link above for more detail. 
------------------------------------------------------------------------------- release/langtools ------------------------------------------------------------------------------- Build 0: aarch64/2020/jan/10 pass: 4,030 Build 1: aarch64/2020/jan/13 pass: 4,030 Build 2: aarch64/2020/jan/15 pass: 4,031 Build 3: aarch64/2020/jan/20 pass: 4,033 Build 4: aarch64/2020/jan/22 pass: 4,033 Build 5: aarch64/2020/jan/24 pass: 4,033 Build 6: aarch64/2020/jan/27 pass: 4,033 Build 7: aarch64/2020/feb/01 pass: 4,036 Build 8: aarch64/2020/feb/03 pass: 4,036 Build 9: aarch64/2020/feb/05 pass: 4,036 Build 10: aarch64/2020/feb/10 pass: 4,037 Build 11: aarch64/2020/feb/12 pass: 4,037 Build 12: aarch64/2020/mar/06 pass: 4,039 Build 13: aarch64/2020/mar/16 pass: 4,039 Build 14: aarch64/2020/apr/08 pass: 4,042 ------------------------------------------------------------------------------- server-release/hotspot ------------------------------------------------------------------------------- Build 0: aarch64/2018/oct/15 pass: 5,787; fail: 18; not run: 90 ------------------------------------------------------------------------------- server-release/jdk ------------------------------------------------------------------------------- Build 0: aarch64/2018/oct/15 pass: 8,476; fail: 686; error: 27 ------------------------------------------------------------------------------- server-release/langtools ------------------------------------------------------------------------------- Build 0: aarch64/2018/oct/15 pass: 3,970; fail: 5 Previous results can be found here: http://openjdk.linaro.org/jdkX/openjdk-jtreg-nightly-tests/index.html SPECjbb2015 composite regression test completed =============================================== This test measures the relative performance of the server compiler running the SPECjbb2015 composite tests and compares the performance against the baseline performance of the server compiler taken on 2016-11-21. 
In accordance with [1], the SPECjbb2015 tests are run on a system which is not production ready and does not meet all the requirements for publishing compliant results. The numbers below shall be treated as non-compliant (nc) and are for experimental purposes only. Relative performance: Server max-jOPS (nc): 8.14x Relative performance: Server critical-jOPS (nc): 12.38x Details of the test setup and historical results may be found here: http://openjdk.linaro.org/jdkX/SPECjbb2015-results/ [1] http://www.spec.org/fairuse.html#Academic Regression test Hadoop-Terasort completed ========================================= This test measures the performance of the server and client compilers running Hadoop sorting a 1GB file using Terasort and compares the performance against the baseline performance of the Zero interpreter and against the baseline performance of the server compiler on 2014-04-01. Relative performance: Zero: 1.0, Server: 210.67 Server 210.67 / Server 2014-04-01 (71.00): 2.97x Details of the test setup and historical results may be found here: http://openjdk.linaro.org/jdkX/hadoop-terasort-benchmark-results/ This is a summary of the jcstress test results ============================================== The build and test results are cycled every 15 days. 
2020-01-11 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/010/results/ 2020-01-14 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/013/results/ 2020-01-16 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/015/results/ 2020-01-21 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/020/results/ 2020-01-23 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/022/results/ 2020-01-25 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/024/results/ 2020-01-28 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/027/results/ 2020-02-02 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/032/results/ 2020-02-04 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/034/results/ 2020-02-06 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/036/results/ 2020-02-11 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/041/results/ 2020-02-13 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/043/results/ 2020-03-17 pass rate: 9702/9702, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/066/results/ 2020-03-19 pass rate: 9702/9702, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/076/results/ 2020-04-09 pass rate: 9702/9702, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/099/results/ For detailed information on the test output please refer to: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/ From ningsheng.jian at arm.com Fri Apr 10 02:15:48 2020 From: ningsheng.jian at arm.com (Ningsheng Jian) Date: Fri, 10 Apr 2020 10:15:48 +0800 Subject: 
[aarch64-port-dev ] RFR(S): 8241587: Aarch64: remove x86 specifics from os_linux.cpp/hpp/inline.hpp In-Reply-To: References: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com> <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com> Message-ID: <7641c396-fde8-dc35-c355-18faba5c5a39@arm.com> On 4/10/20 3:18 AM, Stuart Monteith wrote: > Thanks David. I'll need someone to push this for me. Ningsheng - would > you be able to? > Pushed. Thanks, Ningsheng From Yang.Zhang at arm.com Fri Apr 10 02:45:45 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Fri, 10 Apr 2020 02:45:45 +0000 Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential register clash issue in reduce_add2I In-Reply-To: References: <1a9ed6d0-40bb-1dc4-4eff-b55c86627a47@redhat.com> Message-ID: Okay. When the patch is ready, I will send it for review. Regards Yang -----Original Message----- From: Andrew Haley Sent: Thursday, April 9, 2020 8:21 PM To: Yang Zhang ; hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net Cc: nd Subject: Re: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential register clash issue in reduce_add2I On 4/9/20 12:21 PM, Yang Zhang wrote: > Hi Andrew > >> instruct reduce_add4I(iRegINoSp dst, iRegIorL2I i_src, vecX v_src, >> vecX v_tmp, iRegINoSp i_tmp) %{ > > Besides reduce_add4I, other reduction operations (reduce_mul4I, reduce_max4F, etc) also have such issues. How about creating another JBS and patch to fix this issue? That's a good point. I'll accept http://cr.openjdk.java.net/~yzhang/8241911/webrev.01/ as it is, with a separate patch to clarify those reduction operations. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From Yang.Zhang at arm.com Fri Apr 10 02:52:45 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Fri, 10 Apr 2020 02:52:45 +0000
Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo introduced by JDK-8238690
Message-ID:

Hi,

Could you please help to review this patch?

JBS: https://bugs.openjdk.java.net/browse/JDK-8242070
Webrev: http://cr.openjdk.java.net/~yzhang/8242070/webrev.00/

JDK-8238690 unified the IR shape for vector shifts by a scalar so that
it always uses
    ShiftV src (ShiftCntV shift)
When the shift is a scalar, the following IR nodes are generated.

        scalar_shift
             |
   src   ShiftCntV
    |      /
    |     /
    ShiftV

But when implementing this on AArch64, there is an issue in the match
rule for vector shift right with an immediate shift for the short type.

    match(Set dst (RShiftVS src (LShiftCntV shift)));

LShiftCntV should be RShiftCntV here.

Test case:

    public static void shiftR(short[] a, short[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = (short)(a[i] >> 2);
        }
    }

IR nodes:

              imm:2
                |
   LoadVector  RShiftCntV
       |        /
       |       /
       RShiftVS

C2 assembly generated:

Before:
  0x0000ffffac563764: orr w11, wzr, #0x2
  0x0000ffffac563768: dup v16.16b, w11 -------- vshiftcnt16B

  0x0000ffffac5637a8: ldr q24, [x18, #16]
  0x0000ffffac5637ac: neg v25.16b, v16.16b ------
  0x0000ffffac5637b0: sshl v24.8h, v24.8h, v25.8h ------vsra8S
  0x0000ffffac5637b8: str q24, [x14, #16]

"match(Set dst (RShiftVS src (LShiftCntV shift)));" matching fails.
RShiftCntV and RShiftVS are matched separately by vshiftcnt16B and
vsra8S.

After:
  0x0000ffffac563808: ldr q16, [x15, #16]
  0x0000ffffac56380c: sshr v16.8h, v16.8h, #2
  0x0000ffffac563814: str q16, [x14, #16]

"match(Set dst (RShiftVS src (RShiftCntV shift)));" matching succeeds.

Performance:
The JMH test case is attached in JBS.

Before:
Benchmark               Mode  Cnt   Score   Error  Units
TestVect.testVectShift  avgt   10  66.964 ± 0.052  us/op

After:
Benchmark               Mode  Cnt   Score   Error  Units
TestVect.testVectShift  avgt   10  56.156 ±
0.053 us/op Testing: tier1 Pass and no new failure. Regards Yang From gnu.andrew at redhat.com Tue Apr 14 20:25:35 2020 From: gnu.andrew at redhat.com (Andrew John Hughes) Date: Tue, 14 Apr 2020 21:25:35 +0100 Subject: [aarch64-port-dev ] [RFR] [8u] 8u252-b09 Upstream Sync Message-ID: Webrevs: https://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/ Merge changesets: http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/corba/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxp/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxws/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jdk/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/hotspot/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/langtools/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/nashorn/merge.changeset http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/root/merge.changeset Changes in aarch64-shenandoah-jdk8u252-b09: - S8204152: SignedObject throws NullPointerException for null keys with an initialized Signature object - S8219597: (bf) Heap buffer state changes could provoke unexpected exceptions - S8223898: Forward references to Nashorn - S8223904: Improve Nashorn matching - S8224541: Better mapping of serial ENUMs - S8224549: Less Blocking Array Queues - S8225603: Enhancement for big integers - S8227542: Manifest improved jar headers - S8231415: Better signatures in XML - S8233250: Better X11 rendering - S8233410: Better Build Scripting - S8234027: Better JCEKS key support - S8234408: Improve TLS session handling - S8234825: Better Headings for HTTP Servers - S8234841: Enhance buffering of byte buffers - S8235274: Enhance typing of methods - S8236201: Better Scanner conversions - S8238960: linux-i586 builds are inconsistent as the newly build jdk is not able to reserve enough space for object heap Main issues of note: Simple merge, no HotSpot 
changes. diffstat for root b/.hgtags | 1 + b/common/autoconf/flags.m4 | 15 +++++++++++++-- b/common/autoconf/generated-configure.sh | 17 ++++++++++++++--- 3 files changed, 28 insertions(+), 5 deletions(-) diffstat for corba b/.hgtags | 1 + 1 file changed, 1 insertion(+) diffstat for jaxp b/.hgtags | 1 + 1 file changed, 1 insertion(+) diffstat for jaxws b/.hgtags | 1 + 1 file changed, 1 insertion(+) diffstat for langtools b/.hgtags | 1 + 1 file changed, 1 insertion(+) diffstat for nashorn b/.hgtags | 1 b/src/jdk/nashorn/internal/runtime/regexp/RegExpScanner.java | 6 +-- b/src/jdk/nashorn/internal/runtime/regexp/joni/Parser.java | 6 +-- b/src/jdk/nashorn/internal/runtime/regexp/joni/ast/StringNode.java | 16 ++++++++-- 4 files changed, 21 insertions(+), 8 deletions(-) diffstat for jdk b/.hgtags | 1 b/make/CompileLaunchers.gmk | 1 b/src/share/classes/com/sun/crypto/provider/JceKeyStore.java | 28 +++ b/src/share/classes/com/sun/crypto/provider/KeyProtector.java | 9 - b/src/share/classes/com/sun/crypto/provider/SealedObjectForKeyProtector.java | 26 ++- b/src/share/classes/com/sun/net/httpserver/Headers.java | 34 ++++ b/src/share/classes/java/io/ObjectInputStream.java | 4 b/src/share/classes/java/io/ObjectStreamClass.java | 16 +- b/src/share/classes/java/lang/instrument/package.html | 7 b/src/share/classes/java/lang/invoke/MethodType.java | 38 +--- b/src/share/classes/java/math/MutableBigInteger.java | 24 ++- b/src/share/classes/java/nio/ByteBufferAs-X-Buffer.java.template | 1 b/src/share/classes/java/nio/Direct-X-Buffer.java.template | 1 b/src/share/classes/java/nio/Heap-X-Buffer.java.template | 80 ++++++---- b/src/share/classes/java/nio/StringCharBuffer.java | 9 - b/src/share/classes/java/util/Scanner.java | 22 +- b/src/share/classes/org/jcp/xml/dsig/internal/dom/DOMKeyInfoFactory.java | 10 + b/src/share/classes/org/jcp/xml/dsig/internal/dom/DOMXMLSignatureFactory.java | 10 + b/src/share/classes/sun/security/rsa/RSAKeyFactory.java | 3 
b/src/share/classes/sun/security/ssl/ClientHandshaker.java | 2 b/src/share/classes/sun/security/ssl/SSLEngineImpl.java | 2 b/src/share/classes/sun/security/ssl/SSLSessionImpl.java | 15 - b/src/share/classes/sun/security/ssl/SSLSocketImpl.java | 2 b/src/share/instrument/InvocationAdapter.c | 22 ++ b/src/share/native/sun/awt/splashscreen/splashscreen_gfx_impl.c | 2 b/src/share/native/sun/security/ec/impl/mpi.c | 9 - b/src/solaris/native/sun/awt/multiVis.c | 2 b/src/solaris/native/sun/java2d/x11/X11PMBlitLoops.c | 2 b/src/solaris/native/sun/java2d/x11/X11TextRenderer_md.c | 2 b/src/solaris/native/sun/java2d/x11/XRBackendNative.c | 6 b/test/java/math/BigInteger/ModInvTime.java | 57 +++++++ 31 files changed, 316 insertions(+), 131 deletions(-) diffstat for hotspot b/.hgtags | 1 + 1 file changed, 1 insertion(+) Successfully built on x86, x86_64, s390, s390x, ppc, ppc64, ppc64le & aarch64. Ok to push? Thanks, -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. (http://www.redhat.com) PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 https://keybase.io/gnu_andrew From shade at redhat.com Tue Apr 14 20:28:11 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 14 Apr 2020 22:28:11 +0200 Subject: [aarch64-port-dev ] [RFR] [8u] 8u252-b09 Upstream Sync In-Reply-To: References: Message-ID: On 4/14/20 10:25 PM, Andrew John Hughes wrote: > Webrevs: https://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/ > > Merge changesets: > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/corba/merge.changeset > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxp/merge.changeset > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxws/merge.changeset Look trivially good. > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jdk/merge.changeset Looks good. 
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/hotspot/merge.changeset > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/langtools/merge.changeset Look trivially good. > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/nashorn/merge.changeset > http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/root/merge.changeset Look good. > Ok to push? Yes. -- Thanks, -Aleksey From gnu.andrew at redhat.com Tue Apr 14 20:32:13 2020 From: gnu.andrew at redhat.com (Andrew John Hughes) Date: Tue, 14 Apr 2020 21:32:13 +0100 Subject: [aarch64-port-dev ] [RFR] [8u] 8u252-b09 Upstream Sync In-Reply-To: References: Message-ID: <724d1017-83cb-3cbe-30eb-b74da0710b12@redhat.com> On 14/04/2020 21:28, Aleksey Shipilev wrote: > On 4/14/20 10:25 PM, Andrew John Hughes wrote: >> Webrevs: https://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/ >> >> Merge changesets: >> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/corba/merge.changeset >> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxp/merge.changeset >> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxws/merge.changeset > > Look trivially good. > >> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jdk/merge.changeset > > Looks good. > >> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/hotspot/merge.changeset >> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/langtools/merge.changeset > > Look trivially good. > >> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/nashorn/merge.changeset >> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/root/merge.changeset > > Look good. > >> Ok to push? > > Yes. > Thanks, pushed. -- Andrew :) Senior Free Java Software Engineer Red Hat, Inc. 
(http://www.redhat.com) PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net) Fingerprint = 5132 579D D154 0ED2 3E04 C5A0 CFDA 0F9B 3596 4222 https://keybase.io/gnu_andrew From gnu.andrew at redhat.com Tue Apr 14 20:30:56 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Tue, 14 Apr 2020 20:30:56 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/jaxp: 3 new changesets Message-ID: <202004142030.03EKUubc019918@aojmv0008.oracle.com> Changeset: 70da96196e76 Author: andrew Date: 2020-04-06 04:05 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/70da96196e76 Added tag jdk8u252-b09 for changeset 8476d78dc695 ! .hgtags Changeset: d40b54be2536 Author: andrew Date: 2020-04-06 05:09 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/d40b54be2536 Merge jdk8u252-b09 ! .hgtags Changeset: 085e5483df61 Author: andrew Date: 2020-04-06 05:11 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/085e5483df61 Added tag aarch64-shenandoah-jdk8u252-b09 for changeset d40b54be2536 ! .hgtags From gnu.andrew at redhat.com Tue Apr 14 20:31:09 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Tue, 14 Apr 2020 20:31:09 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/langtools: 3 new changesets Message-ID: <202004142031.03EKV9KD020088@aojmv0008.oracle.com> Changeset: 5177983e7b78 Author: andrew Date: 2020-04-06 04:06 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/5177983e7b78 Added tag jdk8u252-b09 for changeset 01036da3155c ! .hgtags Changeset: 805c9d0d623f Author: andrew Date: 2020-04-06 05:09 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/805c9d0d623f Merge jdk8u252-b09 ! 
.hgtags Changeset: 22d2bfae6afe Author: andrew Date: 2020-04-06 05:11 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/22d2bfae6afe Added tag aarch64-shenandoah-jdk8u252-b09 for changeset 805c9d0d623f ! .hgtags From gnu.andrew at redhat.com Tue Apr 14 20:31:31 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Tue, 14 Apr 2020 20:31:31 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/hotspot: 3 new changesets Message-ID: <202004142031.03EKVVtW020314@aojmv0008.oracle.com> Changeset: 8915b1e17904 Author: andrew Date: 2020-04-06 04:06 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/8915b1e17904 Added tag jdk8u252-b09 for changeset 095e60e7fc8c ! .hgtags Changeset: e4e81ae21643 Author: andrew Date: 2020-04-06 05:09 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/e4e81ae21643 Merge jdk8u252-b09 ! .hgtags Changeset: 6d1cfa6cdbab Author: andrew Date: 2020-04-06 05:11 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/6d1cfa6cdbab Added tag aarch64-shenandoah-jdk8u252-b09 for changeset e4e81ae21643 ! .hgtags From gnu.andrew at redhat.com Tue Apr 14 20:31:24 2020 From: gnu.andrew at redhat.com (gnu.andrew at redhat.com) Date: Tue, 14 Apr 2020 20:31:24 +0000 Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/jdk: 18 new changesets Message-ID: <202004142031.03EKVOic020237@aojmv0008.oracle.com> Changeset: db82be4e049c Author: bchristi Date: 2020-01-21 10:56 -0800 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/db82be4e049c 8224541: Better mapping of serial ENUMs Reviewed-by: mschoene, rhalade, robm, rriggs, smarks, andrew ! src/share/classes/java/io/ObjectInputStream.java ! 
src/share/classes/java/io/ObjectStreamClass.java Changeset: a75922cb4096 Author: andrew Date: 2020-04-05 19:18 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/a75922cb4096 8224549: Less Blocking Array Queues Reviewed-by: mbalao ! src/share/classes/java/io/ObjectStreamClass.java Changeset: a5f5d7fd9be6 Author: bpb Date: 2019-10-29 14:07 -0700 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/a5f5d7fd9be6 8225603: Enhancement for big integers Reviewed-by: darcy, ahgross, rhalade ! src/share/classes/java/math/MutableBigInteger.java ! src/share/native/sun/security/ec/impl/mpi.c + test/java/math/BigInteger/ModInvTime.java Changeset: ae9b738bfb93 Author: mbalao Date: 2019-11-14 15:06 -0800 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/ae9b738bfb93 8227542: Manifest improved jar headers Reviewed-by: andrew ! src/share/classes/java/lang/instrument/package.html ! src/share/instrument/InvocationAdapter.c Changeset: 36afd1d59467 Author: alvdavi Date: 2019-10-15 08:18 -0400 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/36afd1d59467 8231415: Better signatures in XML Reviewed-by: andrew ! src/share/classes/org/jcp/xml/dsig/internal/dom/DOMKeyInfoFactory.java ! src/share/classes/org/jcp/xml/dsig/internal/dom/DOMXMLSignatureFactory.java Changeset: 914f1b61fcff Author: bae Date: 2020-01-16 18:15 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/914f1b61fcff 8233250: Better X11 rendering Reviewed-by: andrew ! src/share/native/sun/awt/splashscreen/splashscreen_gfx_impl.c ! src/solaris/native/sun/awt/multiVis.c ! src/solaris/native/sun/java2d/x11/X11PMBlitLoops.c ! src/solaris/native/sun/java2d/x11/X11TextRenderer_md.c ! 
src/solaris/native/sun/java2d/x11/XRBackendNative.c Changeset: 7749c1865b40 Author: andrew Date: 2020-04-06 01:59 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/7749c1865b40 8233410: Better Build Scripting Reviewed-by: mbalao ! make/CompileLaunchers.gmk Changeset: 27498adf3cbb Author: yan Date: 2020-04-06 02:10 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/27498adf3cbb 8234027: Better JCEKS key support Reviewed-by: andrew ! src/share/classes/com/sun/crypto/provider/JceKeyStore.java ! src/share/classes/com/sun/crypto/provider/KeyProtector.java ! src/share/classes/com/sun/crypto/provider/SealedObjectForKeyProtector.java Changeset: b6be024c35ca Author: abakhtin Date: 2019-11-25 09:50 -0800 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/b6be024c35ca 8234408: Improve TLS session handling Reviewed-by: andrew ! src/share/classes/sun/security/ssl/ClientHandshaker.java ! src/share/classes/sun/security/ssl/SSLEngineImpl.java ! src/share/classes/sun/security/ssl/SSLSessionImpl.java ! src/share/classes/sun/security/ssl/SSLSocketImpl.java Changeset: 6592c0288089 Author: michaelm Date: 2020-01-29 21:46 +0300 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/6592c0288089 8234825: Better Headings for HTTP Servers Reviewed-by: chegar, dfuchs, igerasim ! src/share/classes/com/sun/net/httpserver/Headers.java Changeset: f5fa8182f5af Author: andrew Date: 2020-04-06 03:06 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/f5fa8182f5af 8219597: (bf) Heap buffer state changes could provoke unexpected exceptions Reviewed-by: mbalao ! src/share/classes/java/nio/Heap-X-Buffer.java.template Changeset: a6dcbf49526c Author: robm Date: 2020-03-30 05:13 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/a6dcbf49526c 8234841: Enhance buffering of byte buffers Reviewed-by: alanb, ahgross, rhalade, psandoz ! 
src/share/classes/java/nio/ByteBufferAs-X-Buffer.java.template ! src/share/classes/java/nio/Direct-X-Buffer.java.template ! src/share/classes/java/nio/Heap-X-Buffer.java.template ! src/share/classes/java/nio/StringCharBuffer.java Changeset: 34bb0aa775b2 Author: avoitylov Date: 2020-02-20 19:35 +0300 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/34bb0aa775b2 8235274: Enhance typing of methods Reviewed-by: andrew ! src/share/classes/java/lang/invoke/MethodType.java Changeset: a8f0a9ef1797 Author: igerasim Date: 2020-01-30 01:15 -0800 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/a8f0a9ef1797 8236201: Better Scanner conversions Reviewed-by: ahgross, rhalade, rriggs, skoivu, smarks, andrew ! src/share/classes/java/util/Scanner.java Changeset: 3ad9fa6a5a13 Author: valeriep Date: 2018-06-19 23:33 +0000 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/3ad9fa6a5a13 8204152: SignedObject throws NullPointerException for null keys with an initialized Signature object Summary: Check for null and throw InvalidKeyException to maintain same behavior Reviewed-by: xuelei ! src/share/classes/sun/security/rsa/RSAKeyFactory.java Changeset: b3db2cd0d9c4 Author: andrew Date: 2020-04-06 04:23 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/b3db2cd0d9c4 Added tag jdk8u252-b09 for changeset 3ad9fa6a5a13 ! .hgtags Changeset: 812f64a9a671 Author: andrew Date: 2020-04-06 05:09 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/812f64a9a671 Merge jdk8u252-b09 ! .hgtags ! make/CompileLaunchers.gmk ! src/share/classes/com/sun/net/httpserver/Headers.java ! src/share/classes/java/io/ObjectInputStream.java ! src/share/classes/java/io/ObjectStreamClass.java ! src/share/classes/java/lang/invoke/MethodType.java ! src/share/classes/java/math/MutableBigInteger.java ! src/share/classes/java/util/Scanner.java ! src/share/classes/sun/security/ssl/ClientHandshaker.java ! 
src/share/classes/sun/security/ssl/SSLEngineImpl.java ! src/share/classes/sun/security/ssl/SSLSocketImpl.java ! src/solaris/native/sun/awt/multiVis.c ! src/solaris/native/sun/java2d/x11/XRBackendNative.c Changeset: 0d7976fa1bc7 Author: andrew Date: 2020-04-06 05:11 +0100 URL: https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/0d7976fa1bc7 Added tag aarch64-shenandoah-jdk8u252-b09 for changeset 812f64a9a671 ! .hgtags From stumon01 at arm.com Thu Apr 16 11:29:19 2020 From: stumon01 at arm.com (Stuart Monteith) Date: Thu, 16 Apr 2020 12:29:19 +0100 Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading In-Reply-To: <5018c8e8-73f9-ad71-1e0b-7874e98dea3c@redhat.com> References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com> <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com> <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com> <5018c8e8-73f9-ad71-1e0b-7874e98dea3c@redhat.com> Message-ID: <85312260-c65f-cb86-5a44-ee77e8d04b4d@arm.com> On 08/04/2020 17:15, Andrew Haley wrote: > On 4/8/20 4:33 PM, Stuart Monteith wrote: >> I see what you did there. This comes back to our previous discussion >> about the value of having immediate oops at all. isn't that what you are >> effectively suggesting? That would simply the code somewhat. > > No entirely. Immediate oops are good for most GCs. But according to what > Erik said, immediate oops are verboten when we're using ZGC with concurrent > method unloading, and it seems to be very easy to do. > I've incorporated everyone's comments into the latest: http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/ I've cleaned up the logic somewhat - instead movoop will only make an oop an immediate if there aren't nmethod entry guards - hopefully the comments make that clear. This tested OK with a full run of JTREG. I've made adding constants to wrapper only for AARCH64 with a comment explaining why.
I presume we don't want architectures that don't need it to take the (albeit small) overhead. I've not made it conditional on aarch64 with ZGC or class unloading.

BR,
Stuart

From thomas.stuefe at gmail.com Thu Apr 16 15:18:12 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 16 Apr 2020 17:18:12 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics
Message-ID:

Hi all,

I am currently trying to wrap my head around the various ways the CompressedClassSpace is reserved. The code has grown a bit in complexity with the advent of CDS/AppCDS and, more recently, some aarch64 changes on top of that, which changed behavior for aarch64 and ppc64. Maybe someone can enlighten me a bit.

Specifically, I am looking at

Metaspace::reserve_space

and its aarch64-specific offshoot

Metaspace::reserve_preferred_space

Despite their generic-sounding names, these functions can only be used to allocate ccs. They lack any interface description, so I read through the code to understand their behavior.

Here is how I think Metaspace::reserve_space works for the various combinations of input parameters:

A) requested_addr == NULL && use_requested_addr == false:
[aarch64, ppc64]: Attempt to reserve at one of the preferred OS-dependent allocation points. Failing that, return an unreserved space.
[others]: Reserve a space anywhere.

B) requested_addr == NULL && use_requested_addr == true:
[aarch64, ppc64]: Does nothing, returns an unreserved space immediately. I assume this would be an invalid combination, but since it is not asserted I am not sure.
[others]: Reserve a space anywhere (use_requested_addr is ignored).

C) requested_addr != NULL && use_requested_addr == false:
[aarch64, ppc64]: First attempt to reserve at the requested address, but only if that would cause the space to fall into the lower 4G. Failing that, allocate at one of the preferred OS-dependent allocation points. Failing that, return an unreserved space.
[others]: Attempt to reserve at requested_addr. Failing that, return an unreserved space.

D) requested_addr != NULL && use_requested_addr == true:
[aarch64, ppc64]: First attempt to reserve at the requested address, but only if that would cause the space to fall into the lower 4G. Failing that, return an unreserved space.
[others]: Attempt to reserve at requested_addr. Failing that, return an unreserved space (use_requested_addr is ignored).

Note the many subtle platform differences. E.g. on aarch64 we honor the requested address only if ccs would fall below 4G; on all other platforms we always honor it. Or note how on most platforms the parameter "use_requested_addr" is simply ignored.

Or that on aarch64 we never seem to "try anywhere": we just try a fixed set of attachment points, and if these are all occupied we fail. Is this a bug or by design? Can we always rely on at least one of the attachment points being unoccupied? Looking at the options for MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using x + base >> shift seems not to be wanted? There seems to be no fallback mode which would work with any value of base/shift?

About reserve_preferred_space(), I was confused why a separate "use_requested_addr" was even needed - requested_addr != NULL would be a perfectly valid way to communicate that the requested address should be used. I wish we could simplify the code to just two cases:
- hand down a requested address, which is to be taken-or-fail (somewhat like case D)
- hand down NULL, which means "try whatever": which for most OSes would be really anywhere, for aarch64 could be the fixed set of attachment points. This would be case (A).
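The four cases (A)-(D) described in this message can be condensed into a small decision table. The following C++ sketch is only a model of the behaviour as described here, not HotSpot code: `ReservationPlan`, `plan_reservation`, `fits_below_4g` and the `uses_attachment_points` flag (true for aarch64/ppc64) are hypothetical names, and a `requested_addr` of 0 stands in for NULL.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kFourG = uint64_t{4} << 30;

// Which reservation attempts are made, in order: requested address first,
// then the platform's preferred attachment points, then "anywhere".
struct ReservationPlan {
  bool try_requested_addr;
  bool try_preferred_points;
  bool try_anywhere;
};

// True if [addr, addr + size) lies entirely within the lower 4G.
inline bool fits_below_4g(uint64_t addr, uint64_t size) {
  return size <= kFourG && addr <= kFourG - size;
}

// Hypothetical model of cases (A)-(D); requested_addr == 0 means NULL.
inline ReservationPlan plan_reservation(bool uses_attachment_points,
                                        uint64_t requested_addr,
                                        uint64_t size,
                                        bool use_requested_addr) {
  assert(size > 0);
  const bool have_request = (requested_addr != 0);
  if (!uses_attachment_points) {
    // [others]: use_requested_addr is ignored. A request is take-or-fail,
    // no request means "reserve anywhere".
    return ReservationPlan{have_request, false, !have_request};
  }
  // [aarch64, ppc64]: the request is honoured only below 4G, and the
  // attachment points are tried only when use_requested_addr is false.
  const bool try_requested = have_request && fits_below_4g(requested_addr, size);
  return ReservationPlan{try_requested, !use_requested_addr, false};
}
```

For example, `plan_reservation(true, 0, size, true)` yields a plan with every step disabled, which is case (B) on aarch64/ppc64: nothing is attempted and the caller gets an unreserved space.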
About case (C): under which circumstances does it happen that caller code hands down a requested address below 4G which happens to be free? Does that make sense? In other words, if the whole point of Metaspace::reserve_preferred_space() is "OS knows better, let it try to find a good address", would it not make sense to just try a low address as part of the try-addresses-loop? Hope these questions make sense, and thanks a lot! ..Thomas From aph at redhat.com Thu Apr 16 16:24:06 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 16 Apr 2020 17:24:06 +0100 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: References: Message-ID: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> Hi, On 4/16/20 4:18 PM, Thomas Stüfe wrote: > > I am currently trying to wrap my head around the various ways the > CompressedClassSpace is reserved. Coding has grown a bit in complexity with > the advent of CDS/AppCDS and recently some aarch64 changes atop of that, > changing behavior for aarch64 and ppc64. Maybe someone can enlighten me a > bit. > > Specifically, I am looking at > > Metaspace::reserve_space > > and its aarch64-specific outgrow > > Metaspace::reserve_preferred_space Yowza. This one is mine, I think. > Despite its generic-sounding name, these functions can only be used to > allocate ccs. They lack any interface description, so I parsed the code to > understand their behavior. > > So I tried and here is how I think Metaspace::reserve_space works for the > various combinations of input parameters: Bear in mind that this was changed recently. It was (even more) complicated before. Please read the discussion at https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html > A) requested_addr == NULL && use_requested_addr == false: > [aarch64, ppc64]: Attempt to reserve at one of the preferred OS dependent > allocation points. Failing that, return an unreserved space. > [others]: Reserve a space anywhere.
> > B) requested_addr == NULL && use_requested_addr == true: > [aarch64, ppc64]: Does nothing, returns an unreserved space immediately. I > assume this would be an invalid combination, but since it is not asserted I > am not sure. > [others]: Reserve a space anywhere (use_requested_addr is ignored). > > C) requested_addr != NULL && use_requested_addr == false: > [aarch64, ppc64]: First attempt to reserve at the requested address, but > only if that would cause the space to falls into the lower 4G. Failing > that, allocate at one of the preferred OS dependent allocation points. > Failing that, return an unreserved space. Yes. We have to do that, because we can't cope with the heap base being anything other than a multiple of 4*G. We've got rid of rheapbase, in other words, for all compiled code. > [others]: Attempt to reserve at requested_addr. . Failing that, return an > unreserved space. > > D) requested_addr != NULL && use_requested_addr == true: > [aarch64, ppc64]: First attempt to reserve at the requested address, but > only if that would cause the space to falls into the lower 4G. Failing > that, return an unreserved space. > [others]: Attempt to reserve at requested_addr. . Failing that, return an > unreserved space. (use_requested_addr is ignored). > > Note the many subtle platform differences. E.g. on aarch64 we honor the > requested address only if ccs would fall below 4G, for all other platforms > we always honor them. Or how for most platforms the parameter > "use_requested_addr" is just ignored. > > Or that on aarch64 we never seem to "try anywhere", we just try a > fixed set of attachment points and if these are all occupied we > fail. Is this a bug or by design? It's by design. We looked at it and decided that we would always be able to allocate one of our "nice" points: they are spaced 4G apart, and it's very unlikely that any Linux system (which is all we support) would fail to map any of the possibilities. 
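A minimal sketch of the "try a fixed set of attachment points, never anywhere" policy described here. It assumes the candidates are consecutive multiples of 4G, which the thread does not state (it only says they are spaced 4G apart); `reserve_at_attachment_point`, `try_reserve` and `max_points` are hypothetical names, and `try_reserve` stands in for whatever OS call actually maps the range.

```cpp
#include <cassert>
#include <cstdint>
#include <functional>

constexpr uint64_t kFourG = uint64_t{4} << 30;

// A base is acceptable only if it is a multiple of 4G.
inline bool is_4g_aligned(uint64_t addr) {
  return (addr & (kFourG - 1)) == 0;
}

// Walk candidate bases 4G, 8G, 12G, ... (an assumed candidate list) and
// return the first one the callback manages to reserve. Returns 0 if all
// candidates are occupied: there is deliberately no "anywhere" fallback.
inline uint64_t reserve_at_attachment_point(
    uint64_t size, unsigned max_points,
    const std::function<bool(uint64_t, uint64_t)>& try_reserve) {
  assert(size > 0);
  for (unsigned i = 1; i <= max_points; ++i) {
    const uint64_t candidate = static_cast<uint64_t>(i) * kFourG;
    if (try_reserve(candidate, size)) {
      return candidate;
    }
  }
  return 0;
}
```

Every address this function can return is a multiple of 4G by construction, which is the property the compiled code relies on according to the reply above.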
> Can we always rely at least one of the attachment points being > unoccupied? Looking at the options for > MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using > x + base >> shift seems not to be wanted? There seems to be no fall > back mode which would work with any value of base/shift? That is correct. > About reserve_preferred_space(), I was confused why a separate > "use_requested_addr" was even needed - requested_addr!=NULL would be a > perfectly valid way to communicate that the requested address should be > used. I wish we could simplify the coding to just two cases: > - hand down a requested address, which is to be taken-or-fail (somewhat > like case D) > - hand down NULL, which means "try whatever": which for most OSes would be > really anywhere, for aarch64 could be the fixed set of attachment points. > This would be case (A). > > About case (C): under which circumstances does it happen that caller code > hands down a requested address below 4G which happens to be free? I don't know. > Does that make sense? In other words, if the whole point of > Metaspace::reserve_preferred_space() is "OS knows better, let it try > to find a good address", would it not make sense to just try a low > address as part of the try-addresses-loop? We certainly don't want to have to use a dedicated heapbase register or a shift. Just give us a multiple of 4*G and we're happy. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From thomas.stuefe at gmail.com Thu Apr 16 17:51:04 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Thu, 16 Apr 2020 19:51:04 +0200 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> Message-ID: Hi Andrew, Thanks for the prompt answer. See my answers inline. 
On Thu, Apr 16, 2020 at 6:24 PM Andrew Haley wrote: > Hi, > > On 4/16/20 4:18 PM, Thomas Stüfe wrote: > > > > I am currently trying to wrap my head around the various ways the > > CompressedClassSpace is reserved. Coding has grown a bit in complexity > with > > the advent of CDS/AppCDS and recently some aarch64 changes atop of that, > > changing behavior for aarch64 and ppc64. Maybe someone can enlighten me a > > bit. > > > > Specifically, I am looking at > > > > Metaspace::reserve_space > > > > and its aarch64-specific outgrow > > > > Metaspace::reserve_preferred_space > > Yowza. This one is mine, I think. > > > Despite its generic-sounding name, these functions can only be used to > > allocate ccs. They lack any interface description, so I parsed the code > to > > understand their behavior. > > > > So I tried and here is how I think Metaspace::reserve_space works for the > > various combinations of input parameters: > > Bear in mind that this was changed recently. It was (even more) > complicated before. > > Yes, ccs reservation is complex, and the aarch64 parts are only a small part of it. I would love to simplify it a bit but it's not that easy. > Please read the discussion at > > https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html > > Thank you for pointing me to the discussion. We missed that review. I looked at this code because of the new Metaspace implementation. In the new allocator it is possible to allocate a Klass at ccs offset zero (currently this never happens, but only by accident). If CompressedKlassPointers::base() points to the start of ccs, the resulting narrow Klass pointer would be 0. But the VM cannot tell that apart from a real NULL reference. So far my cheap fix has been to move CompressedKlassPointers::base() a bit below the start of the ccs. But as I saw yesterday, that breaks the 4G-alignment assumption on aarch64.
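To make the zero-offset problem concrete, here is a small sketch of narrow Klass pointer arithmetic. It assumes the usual scheme narrow = (addr - base) >> shift; the function names, the 8G class space start and the 4096-byte offset are made up for illustration and are not taken from HotSpot.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kFourG = uint64_t{4} << 30;

inline bool is_4g_aligned(uint64_t addr) {
  return (addr & (kFourG - 1)) == 0;
}

// Encode a Klass address relative to the encoding base.
inline uint32_t encode_klass(uint64_t klass_addr, uint64_t base, unsigned shift) {
  assert(klass_addr >= base);
  return static_cast<uint32_t>((klass_addr - base) >> shift);
}

// Inverse of encode_klass for addresses that were representable.
inline uint64_t decode_klass(uint32_t narrow, uint64_t base, unsigned shift) {
  return base + (static_cast<uint64_t>(narrow) << shift);
}

// A Klass placed at the very start of the class space encodes to 0 when the
// base equals the start, so it cannot be told apart from a NULL reference.
inline bool collides_with_null(uint64_t klass_addr, uint64_t base, unsigned shift) {
  return encode_klass(klass_addr, base, shift) == 0;
}
```

With a class space starting at 8G, `collides_with_null(8G, 8G, 0)` is true. Lowering the base to 8G - 4096 removes the collision, but `is_4g_aligned(8G - 4096)` is false, which is exactly the conflict with the aarch64 requirement.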
Nevermind, there are different ways to solve that, but I wondered why aarch64 could not handle this crooked base address. > > A) requested_addr == NULL && use_requested_addr == false: > > [aarch64, ppc64]: Attempt to reserve at one of the preferred OS dependent > > allocation points. Failing that, return an unreserved space. > > [others]: Reserve a space anywhere. > > > > B) requested_addr == NULL && use_requested_addr == true: > > [aarch64, ppc64]: Does nothing, returns an unreserved space immediately. > I > > assume this would be an invalid combination, but since it is not > asserted I > > am not sure. > > [others]: Reserve a space anywhere (use_requested_addr is ignored). > > > > C) requested_addr != NULL && use_requested_addr == false: > > [aarch64, ppc64]: First attempt to reserve at the requested address, but > > only if that would cause the space to falls into the lower 4G. Failing > > that, allocate at one of the preferred OS dependent allocation points. > > Failing that, return an unreserved space. > > Yes. We have to do that, because we can't cope with the heap base > being anything other than a multiple of 4*G. We've got rid of > rheapbase, in other words, for all compiled code. > > > [others]: Attempt to reserve at requested_addr. . Failing that, return an > > unreserved space. > > > > D) requested_addr != NULL && use_requested_addr == true: > > [aarch64, ppc64]: First attempt to reserve at the requested address, but > > only if that would cause the space to falls into the lower 4G. Failing > > that, return an unreserved space. > > [others]: Attempt to reserve at requested_addr. . Failing that, return an > > unreserved space. (use_requested_addr is ignored). > > > > Note the many subtle platform differences. E.g. on aarch64 we honor the > > requested address only if ccs would fall below 4G, for all other > platforms > > we always honor them. Or how for most platforms the parameter > > "use_requested_addr" is just ignored. 
> > > > Or that on aarch64 we never seem to "try anywhere", we just try a > > fixed set of attachment points and if these are all occupied we > > fail. Is this a bug or by design? > > It's by design. We looked at it and decided that we would always be > able to allocate one of our "nice" points: they are spaced 4G apart, > and it's very unlikely that any Linux system (which is all we support) > would fail to map any of the possibilities. > > Thank you for clarifying. This clearly distinguishes aarch64 from at least AIX and possibly Linux ppc, not sure - there we clearly want a fallback "try anywhere". We have to look at the code again. I believe either Goetz or me wrote the original AIX version but my memory is dim. I am at a loss why we restricted this to AIX only. I have to talk this over with Goetz. > > Can we always rely at least one of the attachment points being > > unoccupied? Looking at the options for > > MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using > > x + base >> shift seems not to be wanted? There seems to be no fall > > back mode which would work with any value of base/shift? > > That is correct. > > > About reserve_preferred_space(), I was confused why a separate > > "use_requested_addr" was even needed - requested_addr!=NULL would be a > > perfectly valid way to communicate that the requested address should be > > used. I wish we could simplify the coding to just two cases: > > - hand down a requested address, which is to be taken-or-fail (somewhat > > like case D) > > - hand down NULL, which means "try whatever": which for most OSes would > be > > really anywhere, for aarch64 could be the fixed set of attachment points. > > This would be case (A). > > > > About case (C): under which circumstances does it happen that caller code > > hands down a requested address below 4G which happens to be free? > > I don't know. > > > Does that make sense? 
In other words, if the whole point of > > Metaspace::reserve_preferred_space() is "OS knows better, let it try > > to find a good address", would it not make sense to just try a low > > address as part of the try-addresses-loop? > > We certainly don't want to have to use a dedicated heapbase register > or a shift. Just give us a multiple of 4*G and we're happy. > > Good to know. So, zero-based encoding does not have any special place in your heart? A 4G-aligned base works just as well? Thanks, Thomas -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > > From zgu at redhat.com Thu Apr 16 18:11:26 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 16 Apr 2020 14:11:26 -0400 Subject: [aarch64-port-dev ] [15] RFR(T) 8243008: Shenandoah: TestVolatilesShenandoah test failed on aarch64 Message-ID: <3a701e5e-d6f9-0be8-94f8-a0110a26322b@redhat.com> The compiler/c2/aarch64/TestVolatilesShenandoah.java test failed on aarch64, because Shenandoah no longer has a traversal mode, but has a new incremental-update mode. Bug: https://bugs.openjdk.java.net/browse/JDK-8243008 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8243008/webrev.00/ Thanks, -Zhengyu From thomas.stuefe at gmail.com Thu Apr 16 18:14:08 2020 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Thu, 16 Apr 2020 20:14:08 +0200 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> Message-ID: Hi Ioi, On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam wrote: > (I suppose you mean "compressed class space" by "ccs" :-) > > Yes, I think I stole this from Stefan Karlsson :) > > > I am not even sure if case (C) can happen at all. > > I admit that I've been guilty of making the interface even more complicated > with JDK-8231610 > (Relocate the CDS archive if it cannot be mapped to the > requested address).
Looks now is a good time to clean up. > > The coding has been complicated to begin with, and then it usually only gets worse since no-one has time for a revamp :( A clean up would be very helpful. One reason I look at this coding now, beside the aarch64 problem, was that I try to disentangle CDS from Metaspace, especially the alignment policy. Remember, I tried to tackle this last summer? but it keeps biting me. For such a small problem this is weirdly complicated. > One thing that can be cleaned up is the call to > Metaspace::allocate_metaspace_compressed_klass_ptrs: > > (a) when CDS is enabled: > > Metaspace::global_initialize() > -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces() > -> ... MetaspaceShared::map_archives() > -> ... reserve the space, eventually calling > Metaspace::reserve_space > -> call Metaspace::allocate_metaspace_compressed_klass_ptrs() > > (b) when CDS is disabled > > Metaspace::global_initialize() > -> allocate_metaspace_compressed_klass_ptrs > -> (if cds is not enabled) Metaspace::reserve_space() > > > In case (b), we should first reserve the space, and then call into > allocate_metaspace_compressed_klass_ptrs. This will simplify the arguments > of allocate_metaspace_compressed_klass_ptrs, and will also limit the > variations > of calls to Metaspace::reserve_space(). I think this will make it possible > to > drop the use_requested_addr argument and rely simply on (requested_addr != > NULL) > > So, in all cases we'd pre-reserve the ReservedSpace and hand it down to Metaspace::allocate_metaspace_compressed_klass_ptrs()? This would melt down Metaspace::allocate_metaspace_compressed_klass_ptrs() to just "initialize compressed class space from a pre-arranged ReservedSpace, and set up base + shift". 
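The "reserve first, then initialise from the reservation" ordering could look roughly like this. `Reservation` and `ClassSpace` are simplified stand-ins (the former loosely for a ReservedSpace), and `init_class_space_from` is a hypothetical name, not an existing Metaspace function; the sketch also assumes the narrow Klass base is simply the start of the reservation.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a reserved address range; size 0 means "unreserved".
struct Reservation {
  uint64_t start = 0;
  uint64_t size = 0;
  bool is_reserved() const { return size != 0; }
};

// State that is set up once a reservation has been handed down.
struct ClassSpace {
  Reservation range;
  uint64_t narrow_base = 0;
  unsigned narrow_shift = 0;
  bool initialized = false;
};

// Takes a range that the caller has already reserved (by whatever means,
// with or without CDS) and only records it and derives base and shift.
// It never reserves memory itself, so it needs no requested-address flags.
inline ClassSpace init_class_space_from(const Reservation& rs, unsigned shift) {
  assert(shift < 32);
  ClassSpace cs;
  if (!rs.is_reserved()) {
    return cs;  // caller failed to reserve; leave the space uninitialized
  }
  cs.range = rs;
  cs.narrow_base = rs.start;  // assumption: base is the start of the range
  cs.narrow_shift = shift;
  cs.initialized = true;
  return cs;
}
```

The point of the split is that both the CDS and the non-CDS start-up paths would end in the same call, differing only in how they obtained the `Reservation`.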
We could probably rename that thing to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base); We even could move set_narrow_klass_base_and_shift() out of Metaspace::set_up_compressed_klass_space, then it becomes a series of three simple operations: 1) obtain a ReservedSpace however you see fit 2) register it with Metaspace as address space for ccs, 3) set_narrow_klass_base_and_shift. We would not have to hand down cds_base to Metaspace, only for it to be used as base address in set_narrow_klass_base_and_shift. One question which came to me today was: In AppCDS, DynamicArchiveBuilder::do_it() calls Metaspace::reserve_space(). Is that really needed, does a DumpRegion have anything to do with ccs? Don't they just need some space to dump into? Hope that question is not dumb. Thanks, Thomas > Thanks > - Ioi > > > Does that make sense? In other words, if the whole point of > Metaspace::reserve_preferred_space() is "OS knows better, let it try > to find a good address", would it not make sense to just try a low > address as part of the try-addresses-loop? > > We certainly don't want to have to use a dedicated heapbase register > or a shift. Just give us a multiple of 4*G and we're happy. > > > > From shade at redhat.com Thu Apr 16 18:16:11 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 16 Apr 2020 20:16:11 +0200 Subject: [aarch64-port-dev ] [15] RFR(T) 8243008: Shenandoah: TestVolatilesShenandoah test failed on aarch64 In-Reply-To: <3a701e5e-d6f9-0be8-94f8-a0110a26322b@redhat.com> References: <3a701e5e-d6f9-0be8-94f8-a0110a26322b@redhat.com> Message-ID: <386be87e-d971-6bd4-aa36-5cef4aeb5814@redhat.com> On 4/16/20 8:11 PM, Zhengyu Gu wrote: > compiler/c2/aarch64/TestVolatilesShenandoah.java test failed on aarch64, > because Shenandoah no long has traversal mode, but new > incremental-update mode. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8243008 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8243008/webrev.00/ Right. Looks good. 
-- Thanks, -Aleksey From Yang.Zhang at arm.com Fri Apr 17 06:34:20 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Fri, 17 Apr 2020 06:34:20 +0000 Subject: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter names of reduction operations to make code clear Message-ID: Hi, Could you please help to review this patch? JBS: https://bugs.openjdk.java.net/browse/JDK-8242482 Webrev: http://cr.openjdk.java.net/~yzhang/8242482/webrev.00/ This patch is a followup patch of previous discussion. https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008740.html To make the intent clear, the scalar parameter name is changed to isrc, fsrc or dsrc based on its data type. The vector parameter name is changed to vsrc. And so does temp register. Testing: tier1 Regards Yang From thomas.stuefe at gmail.com Fri Apr 17 07:08:24 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Fri, 17 Apr 2020 09:08:24 +0200 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com> References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com> Message-ID: On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam wrote: > > > On 4/16/20 11:14 AM, Thomas St?fe wrote: > > Hi Ioi, > > On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam wrote: > >> (I suppose you mean "compressed class space" by "ccs" :-) >> >> > Yes, I think I stole this from Stefan Karlsson :) > > >> >> > > >> I am not even sure if case (C) can happen at all. >> >> I admit that I've been guilty of making the interface even more >> complicated >> with JDK-8231610 >> (Relocate the CDS archive if it cannot be mapped to the >> requested address). Looks now is a good time to clean up. >> >> > The coding has been complicated to begin with, and then it usually only > gets worse since no-one has time for a revamp :( A clean up would be very > helpful. 
> > One reason I look at this coding now, beside the aarch64 problem, was that > I try to disentangle CDS from Metaspace, especially the alignment policy. > Remember, I tried to tackle this last summer? but it keeps biting me. For > such a small problem this is weirdly complicated. > > >> One thing that can be cleaned up is the call to >> Metaspace::allocate_metaspace_compressed_klass_ptrs: >> >> (a) when CDS is enabled: >> >> Metaspace::global_initialize() >> -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces() >> -> ... MetaspaceShared::map_archives() >> -> ... reserve the space, eventually calling >> Metaspace::reserve_space >> -> call Metaspace::allocate_metaspace_compressed_klass_ptrs() >> >> (b) when CDS is disabled >> >> Metaspace::global_initialize() >> -> allocate_metaspace_compressed_klass_ptrs >> -> (if cds is not enabled) Metaspace::reserve_space() >> >> >> In case (b), we should first reserve the space, and then call into >> allocate_metaspace_compressed_klass_ptrs. This will simplify the arguments >> of allocate_metaspace_compressed_klass_ptrs, and will also limit the >> variations >> of calls to Metaspace::reserve_space(). I think this will make it >> possible to >> drop the use_requested_addr argument and rely simply on (requested_addr >> != NULL) >> >> > So, in all cases we'd pre-reserve the ReservedSpace and hand it down to > Metaspace::allocate_metaspace_compressed_klass_ptrs()? > > This would melt down Metaspace::allocate_metaspace_compressed_klass_ptrs() > to just "initialize compressed class space from a pre-arranged > ReservedSpace, and set up base + shift". 
> > We could probably rename that thing > to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base); > > We even could move set_narrow_klass_base_and_shift() out of > Metaspace::set_up_compressed_klass_space, then it becomes a series of three > simple operations: > 1) obtain a ReservedSpace however you see fit > 2) register it with Metaspace as address space for ccs, > 3) set_narrow_klass_base_and_shift. We would not have to hand down > cds_base to Metaspace, only for it to be used as base address > in set_narrow_klass_base_and_shift. > > > Yes, that seems the right thing to do. That will hopefully make the > aarch64 initialization code a little simpler as well. > > It would. One question which came to me today was: > > In AppCDS, DynamicArchiveBuilder::do_it() calls > Metaspace::reserve_space(). Is that really needed, does a DumpRegion have > anything to do with ccs? Don't they just need some space to dump into? Hope > that question is not dumb. > > Do you mean: > > DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta() > -> MetaspaceShared::reserve_shared_space > -> Metaspace::reserve_space > > That's not necessary. When I wrote the code I thought > Metaspace::reserve_space was a general function for reserving spaces :-) > but as you said, this function is probably intended only for initializing > the CCS. > > Oh thank god :) That is good, this really tripped me off when reading the code. Thanks, Thomas > Thanks > - Ioi > > Thanks, Thomas > > >> Thanks >> - Ioi >> >> >> Does that make sense? In other words, if the whole point of >> Metaspace::reserve_preferred_space() is "OS knows better, let it try >> to find a good address", would it not make sense to just try a low >> address as part of the try-addresses-loop? >> >> We certainly don't want to have to use a dedicated heapbase register >> or a shift. Just give us a multiple of 4*G and we're happy. 
>> >> >> >> > From ioi.lam at oracle.com Thu Apr 16 17:46:37 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Thu, 16 Apr 2020 10:46:37 -0700 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> Message-ID: (I suppose you mean "compressed class space" by "ccs" :-) On 4/16/20 9:24 AM, Andrew Haley wrote: > Hi, > > On 4/16/20 4:18 PM, Thomas St?fe wrote: >> I am currently trying to wrap my head around the various ways the >> CompressedClassSpace is reserved. Coding has grown a bit in complexity with >> the advent of CDS/AppCDS and recently some aarch64 changes atop of that, >> changing behavior for aarch64 and ppc64. Maybe someone can enlighten me a >> bit. >> >> Specifically, I am looking at >> >> Metaspace::reserve_space >> >> and its aarch64-specific outgrow >> >> Metaspace::reserve_preferred_space > Yowza. This one is mine, I think. > >> Despite its generic-sounding name, these functions can only be used to >> allocate ccs. They lack any interface description, so I parsed the code to >> understand their behavior. >> >> So I tried and here is how I think Metaspace::reserve_space works for the >> various combinations of input parameters: > Bear in mind that this was changed recently. It was (even more) > complicated before. > > Please read the discussion at > https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html > >> A) requested_addr == NULL && use_requested_addr == false: >> [aarch64, ppc64]: Attempt to reserve at one of the preferred OS dependent >> allocation points. Failing that, return an unreserved space. >> [others]: Reserve a space anywhere. >> >> B) requested_addr == NULL && use_requested_addr == true: >> [aarch64, ppc64]: Does nothing, returns an unreserved space immediately. 
I >> assume this would be an invalid combination, but since it is not asserted I >> am not sure. >> [others]: Reserve a space anywhere (use_requested_addr is ignored). >> >> C) requested_addr != NULL && use_requested_addr == false: >> [aarch64, ppc64]: First attempt to reserve at the requested address, but >> only if that would cause the space to falls into the lower 4G. Failing >> that, allocate at one of the preferred OS dependent allocation points. >> Failing that, return an unreserved space. > Yes. We have to do that, because we can't cope with the heap base > being anything other than a multiple of 4*G. We've got rid of > rheapbase, in other words, for all compiled code. > >> [others]: Attempt to reserve at requested_addr. . Failing that, return an >> unreserved space. >> >> D) requested_addr != NULL && use_requested_addr == true: >> [aarch64, ppc64]: First attempt to reserve at the requested address, but >> only if that would cause the space to falls into the lower 4G. Failing >> that, return an unreserved space. >> [others]: Attempt to reserve at requested_addr. . Failing that, return an >> unreserved space. (use_requested_addr is ignored). >> >> Note the many subtle platform differences. E.g. on aarch64 we honor the >> requested address only if ccs would fall below 4G, for all other platforms >> we always honor them. Or how for most platforms the parameter >> "use_requested_addr" is just ignored. >> >> Or that on aarch64 we never seem to "try anywhere", we just try a >> fixed set of attachment points and if these are all occupied we >> fail. Is this a bug or by design? > It's by design. We looked at it and decided that we would always be > able to allocate one of our "nice" points: they are spaced 4G apart, > and it's very unlikely that any Linux system (which is all we support) > would fail to map any of the possibilities. > >> Can we always rely at least one of the attachment points being >> unoccupied? 
Looking at the options for
>> MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using
>> x + base >> shift seems not to be wanted? There seems to be no fall
>> back mode which would work with any value of base/shift?
> That is correct.
>
>> About reserve_preferred_space(), I was confused why a separate
>> "use_requested_addr" was even needed - requested_addr!=NULL would be a
>> perfectly valid way to communicate that the requested address should be
>> used. I wish we could simplify the coding to just two cases:
>> - hand down a requested address, which is to be taken-or-fail (somewhat
>> like case D)
>> - hand down NULL, which means "try whatever": which for most OSes would be
>> really anywhere, for aarch64 could be the fixed set of attachment points.
>> This would be case (A).
>>
>> About case (C): under which circumstances does it happen that caller code
>> hands down a requested address below 4G which happens to be free?
> I don't know.

I am not even sure if case (C) can happen at all.

I admit that I've been guilty of making the interface even more complicated
with JDK-8231610 (Relocate the CDS archive if it cannot be mapped to the
requested address). Looks now is a good time to clean up.

One thing that can be cleaned up is the call to
Metaspace::allocate_metaspace_compressed_klass_ptrs:

(a) when CDS is enabled:

    Metaspace::global_initialize()
    -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
       -> ... MetaspaceShared::map_archives()
         -> ... reserve the space, eventually calling Metaspace::reserve_space
         -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()

(b) when CDS is disabled

    Metaspace::global_initialize()
    -> allocate_metaspace_compressed_klass_ptrs
       -> (if cds is not enabled) Metaspace::reserve_space()

In case (b), we should first reserve the space, and then call into
allocate_metaspace_compressed_klass_ptrs.
This will simplify the arguments
of allocate_metaspace_compressed_klass_ptrs, and will also limit the variations
of calls to Metaspace::reserve_space(). I think this will make it possible to
drop the use_requested_addr argument and rely simply on (requested_addr != NULL)

Thanks
- Ioi

>> Does that make sense? In other words, if the whole point of
>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>> to find a good address", would it not make sense to just try a low
>> address as part of the try-addresses-loop?
> We certainly don't want to have to use a dedicated heapbase register
> or a shift. Just give us a multiple of 4*G and we're happy.
>

From ioi.lam at oracle.com Thu Apr 16 18:28:50 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 16 Apr 2020 11:28:50 -0700
Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics
In-Reply-To:
References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
Message-ID: <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>

On 4/16/20 11:14 AM, Thomas Stüfe wrote:
> Hi Ioi,
>
> On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam
> wrote:
>
> (I suppose you mean "compressed class space" by "ccs" :-)
>
>
> Yes, I think I stole this from Stefan Karlsson :)
>
>
>
> I am not even sure if case (C) can happen at all.
>
> I admit that I've been guilty of making the interface even more complicated
> with JDK-8231610 (Relocate the CDS archive if it cannot be mapped to the
> requested address). Looks now is a good time to clean up.
>
>
> The coding has been complicated to begin with, and then it usually
> only gets worse since no-one has time for a revamp :( A clean up would
> be very helpful.
>
> One reason I look at this coding now, beside the aarch64 problem, was
> that I try to disentangle CDS from Metaspace, especially the alignment
> policy. Remember, I tried to tackle this last summer? but it keeps
> biting me. For such a small problem this is weirdly complicated.
>
> One thing that can be cleaned up is the call to
> Metaspace::allocate_metaspace_compressed_klass_ptrs:
>
> (a) when CDS is enabled:
>
>     Metaspace::global_initialize()
>     -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>        -> ... MetaspaceShared::map_archives()
>          -> ... reserve the space, eventually calling Metaspace::reserve_space
>          -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>
> (b) when CDS is disabled
>
>     Metaspace::global_initialize()
>     -> allocate_metaspace_compressed_klass_ptrs
>        -> (if cds is not enabled) Metaspace::reserve_space()
>
>
> In case (b), we should first reserve the space, and then call into
> allocate_metaspace_compressed_klass_ptrs. This will simplify the arguments
> of allocate_metaspace_compressed_klass_ptrs, and will also limit the variations
> of calls to Metaspace::reserve_space(). I think this will make it possible to
> drop the use_requested_addr argument and rely simply on (requested_addr != NULL)
>
>
> So, in all cases we'd pre-reserve the ReservedSpace and hand it down
> to Metaspace::allocate_metaspace_compressed_klass_ptrs()?
>
> This would melt down
> Metaspace::allocate_metaspace_compressed_klass_ptrs() to just
> "initialize compressed class space from a pre-arranged ReservedSpace,
> and set up base + shift".
>
> We could probably rename that thing
> to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base);
>
> We even could move set_narrow_klass_base_and_shift() out of
> Metaspace::set_up_compressed_klass_space, then it becomes a series of
> three simple operations:
> 1) obtain a ReservedSpace however you see fit
> 2) register it with Metaspace as address space for ccs,
> 3) set_narrow_klass_base_and_shift. We would not have to hand down
> cds_base to Metaspace, only for it to be used as base address
> in set_narrow_klass_base_and_shift.
>

Yes, that seems the right thing to do.
That will hopefully make the aarch64
initialization code a little simpler as well.

> One question which came to me today was:
>
> In AppCDS, DynamicArchiveBuilder::do_it() calls
> Metaspace::reserve_space(). Is that really needed, does a DumpRegion
> have anything to do with ccs? Don't they just need some space to dump
> into? Hope that question is not dumb.
>

Do you mean:

DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
-> MetaspaceShared::reserve_shared_space
    -> Metaspace::reserve_space

That's not necessary. When I wrote the code I thought
Metaspace::reserve_space was a general function for reserving spaces :-)
but as you said, this function is probably intended only for initializing
the CCS.

Thanks
- Ioi

> Thanks, Thomas
>
> Thanks
> - Ioi
>
>
>>> Does that make sense? In other words, if the whole point of
>>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>>> to find a good address", would it not make sense to just try a low
>>> address as part of the try-addresses-loop?
>> We certainly don't want to have to use a dedicated heapbase register
>> or a shift. Just give us a multiple of 4*G and we're happy.
>>
>

From aph at redhat.com Fri Apr 17 08:42:10 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 17 Apr 2020 09:42:10 +0100
Subject: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter names of reduction operations to make code clear
In-Reply-To:
References:
Message-ID:

On 4/17/20 7:34 AM, Yang Zhang wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8242482
> Webrev: http://cr.openjdk.java.net/~yzhang/8242482/webrev.00/
>
> This patch is a followup patch of previous discussion.
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008740.html
>
> To make the intent clear, the scalar parameter name is changed to isrc, fsrc or dsrc based on
> its data type. The vector parameter name is changed to vsrc. And so does temp register.

Thanks, that's much nicer.
I haven't been able to check every substitution,
though. I'm not quite sure about how to do that. Is all this stuff covered
by our test cases?

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From aph at redhat.com Fri Apr 17 08:47:55 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 17 Apr 2020 09:47:55 +0100
Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics
In-Reply-To:
References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
Message-ID:

On 4/16/20 6:51 PM, Thomas Stüfe wrote:
> Good to know. So, zero based encoding does not have any special place in
> your heart? 4G aligned base works just as well?

Absolutely, yes.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From Yang.Zhang at arm.com Fri Apr 17 09:13:11 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Fri, 17 Apr 2020 09:13:11 +0000
Subject: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter names of reduction operations to make code clear
In-Reply-To:
References:
Message-ID:

Hi Andrew

Besides tier1, I also tested these operations with the Vector API tests, which
cover all the reduction operations.
In the following directory, there are also some test cases for reduction
operations, which were added in [1].
https://hg.openjdk.java.net/jdk/jdk/file/55c4283a7606/test/hotspot/jtreg/compiler/loopopts/superword [1] https://bugs.openjdk.java.net/browse/JDK-8240248 Regards Yang -----Original Message----- From: Andrew Haley Sent: Friday, April 17, 2020 4:42 PM To: Yang Zhang ; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: Re: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter names of reduction operations to make code clear On 4/17/20 7:34 AM, Yang Zhang wrote: > JBS: https://bugs.openjdk.java.net/browse/JDK-8242482 > Webrev: http://cr.openjdk.java.net/~yzhang/8242482/webrev.00/ > > This patch is a followup patch of previous discussion. > https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/00 > 8740.html > > To make the intent clear, the scalar parameter name is changed to > isrc, fsrc or dsrc based on its data type. The vector parameter name is changed to vsrc. And so does temp register. Thanks, that's much nicer. I haven't been able to check every substitution, though. I'm not quite sure about how to do that. Is all this stuff covered by our test cases? -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From Yang.Zhang at arm.com Fri Apr 17 09:14:24 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Fri, 17 Apr 2020 09:14:24 +0000 Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo introduced by JDK-8238690 In-Reply-To: References: Message-ID: Hi Andrew Ping it again. Could you please help to review this? Regards Yang -----Original Message----- From: aarch64-port-dev On Behalf Of Yang Zhang Sent: Friday, April 10, 2020 10:53 AM To: aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo introduced by JDK-8238690 Hi, Could you please help to review this patch? 
JBS: https://bugs.openjdk.java.net/browse/JDK-8242070
Webrev: http://cr.openjdk.java.net/~yzhang/8242070/webrev.00/

JDK-8238690 unified the IR shape for vector shifts by scalar so that it
always uses
  ShiftV src (ShiftCntV shift)
When the shift is scalar, the following IR nodes are generated.

      scalar_shift
           |
src    ShiftCntV
 |       /
 |      /
 ShiftV

But when implementing this on AArch64, there is an issue in the match rule
of vector shift right with an immediate shift for the short type.
  match(Set dst (RShiftVS src (LShiftCntV shift)));
LShiftCntV should be RShiftCntV here.

Test case:
  public static void shiftR(short[] a, short[] c) {
    for (int i = 0; i < a.length; i++) {
      c[i] = (short)(a[i] >> 2);
    }
  }

IR nodes:

                imm:2
                  |
LoadVector    RShiftCntV
    |           /
    |          /
    RShiftVS

C2 assembly generated:
Before:
  0x0000ffffac563764: orr  w11, wzr, #0x2
  0x0000ffffac563768: dup  v16.16b, w11            -------- vshiftcnt16B

  0x0000ffffac5637a8: ldr  q24, [x18, #16]
  0x0000ffffac5637ac: neg  v25.16b, v16.16b        ------
  0x0000ffffac5637b0: sshl v24.8h, v24.8h, v25.8h  ------ vsra8S
  0x0000ffffac5637b8: str  q24, [x14, #16]

"match(Set dst (RShiftVS src (LShiftCntV shift)));" matching fails.
RShiftCntV and RShiftVS are matched separately by vshiftcnt16B and vsra8S.

After:
  0x0000ffffac563808: ldr  q16, [x15, #16]
  0x0000ffffac56380c: sshr v16.8h, v16.8h, #2
  0x0000ffffac563814: str  q16, [x14, #16]

"match(Set dst (RShiftVS src (RShiftCntV shift)));" matching succeeds.

Performance:
The JMH test case is attached in JBS.
Before:
Benchmark               Mode  Cnt   Score   Error  Units
TestVect.testVectShift  avgt   10  66.964 ± 0.052  us/op
After:
Benchmark               Mode  Cnt   Score   Error  Units
TestVect.testVectShift  avgt   10  56.156 ± 0.053  us/op

Testing: tier1 passes with no new failures.
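
As a sanity check on the semantics (not part of the patch), the loop from the
test case can be compared against a scalar reference. This is only a minimal
sketch: the class name ShiftRCheck, the input values and the iteration count
are assumptions chosen for illustration, and the reference relies on the fact
that an arithmetic shift right by 2 equals floor division by 4. The result
should be the same before and after the fix, since only the matched
instruction sequence changes.

```java
import java.util.Arrays;

public class ShiftRCheck {
    // Loop from the test case: arithmetic right shift of each short by an immediate.
    public static void shiftR(short[] a, short[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = (short) (a[i] >> 2);
        }
    }

    // Scalar reference: floor division by 4 matches an arithmetic shift right by 2.
    public static short[] reference(short[] a) {
        short[] r = new short[a.length];
        for (int i = 0; i < a.length; i++) {
            r[i] = (short) Math.floorDiv((int) a[i], 4);
        }
        return r;
    }

    public static void main(String[] args) {
        short[] a = new short[1024];
        for (int i = 0; i < a.length; i++) {
            a[i] = (short) (i * 37 - 20000); // mix of negative and positive values
        }
        short[] c = new short[a.length];
        // Repeat the call so the JIT has a chance to compile the loop (count is arbitrary).
        for (int iter = 0; iter < 20_000; iter++) {
            shiftR(a, c);
        }
        System.out.println(Arrays.equals(c, reference(a)) ? "ok" : "mismatch");
    }
}
```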
Regards Yang From thomas.stuefe at gmail.com Sat Apr 18 06:26:24 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Sat, 18 Apr 2020 08:26:24 +0200 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> Message-ID: On Fri, Apr 17, 2020 at 10:48 AM Andrew Haley wrote: > On 4/16/20 6:51 PM, Thomas St?fe wrote: > > Good to know. So, zero based encoding does not have any special place in > > your heart? 4G aligned base works just as well? > > Absolutely, yes. > > Just occurred to me that aarch64 also relies on SharedBaseAddress being 4G aligned. The default is 32G so it works out. If you modify it with -XX:SharedBaseAddress, looks like the setting is ignored when the value is not usable. I guess that is all okay, maybe we just need to make it more explicit in coding. ..Thomas > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > > From thomas.stuefe at gmail.com Sat Apr 18 07:15:21 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Sat, 18 Apr 2020 09:15:21 +0200 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com> References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com> Message-ID: Hi Ioi, I am working on a small patch and have some more questions. - First, a simple one, in DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta(), the space does not have anything to do with metaspace, as you wrote, so the alignment could be anything, right? - Out of curiousity, when you pack the different regions (DumpRegion::pack) you align the end to page size. Why? Why could the next region not simply follow immediately? 
I looked if any code needs a region to be page aligned, but may have missed it. - void MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() : I assume this code has to work for all three cases right 1) lp32. 2) lp64 with and without UseCompressedClassPointers? 3) lp64 without UseCompressedClassPointers? If yes, does the setting for UseCompressedClassPointers have to be the same at run time? In this layout: // On 64-bit VM, the heap and class space layout will be the same as if // you're running in -Xshare:on mode: // // +-- SharedBaseAddress (default = 0x800000000) // v // +-..---------+---------+ ... +----+----+----+--------------------+ // | Heap | Archive | | MC | RW | RO | class space | // +-..---------+---------+ ... +----+----+----+--------------------+ // |<-- MaxHeapSize -->| |<-- UnscaledClassSpaceMax = 4GB -->| // Why does the class space has to follow mc+rw+ro? Could it come before? Actually, does it have to be in the same space at all, or could it live somewhere completely different? Thanks! On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam wrote: > > > On 4/16/20 11:14 AM, Thomas St?fe wrote: > > Hi Ioi, > > On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam wrote: > >> (I suppose you mean "compressed class space" by "ccs" :-) >> >> > Yes, I think I stole this from Stefan Karlsson :) > > >> >> > > >> I am not even sure if case (C) can happen at all. >> >> I admit that I've been guilty of making the interface even more >> complicated >> with JDK-8231610 >> (Relocate the CDS archive if it cannot be mapped to the >> requested address). Looks now is a good time to clean up. >> >> > The coding has been complicated to begin with, and then it usually only > gets worse since no-one has time for a revamp :( A clean up would be very > helpful. > > One reason I look at this coding now, beside the aarch64 problem, was that > I try to disentangle CDS from Metaspace, especially the alignment policy. > Remember, I tried to tackle this last summer? but it keeps biting me. 
For > such a small problem this is weirdly complicated. > > >> One thing that can be cleaned up is the call to >> Metaspace::allocate_metaspace_compressed_klass_ptrs: >> >> (a) when CDS is enabled: >> >> Metaspace::global_initialize() >> -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces() >> -> ... MetaspaceShared::map_archives() >> -> ... reserve the space, eventually calling >> Metaspace::reserve_space >> -> call Metaspace::allocate_metaspace_compressed_klass_ptrs() >> >> (b) when CDS is disabled >> >> Metaspace::global_initialize() >> -> allocate_metaspace_compressed_klass_ptrs >> -> (if cds is not enabled) Metaspace::reserve_space() >> >> >> In case (b), we should first reserve the space, and then call into >> allocate_metaspace_compressed_klass_ptrs. This will simplify the arguments >> of allocate_metaspace_compressed_klass_ptrs, and will also limit the >> variations >> of calls to Metaspace::reserve_space(). I think this will make it >> possible to >> drop the use_requested_addr argument and rely simply on (requested_addr >> != NULL) >> >> > So, in all cases we'd pre-reserve the ReservedSpace and hand it down to > Metaspace::allocate_metaspace_compressed_klass_ptrs()? > > This would melt down Metaspace::allocate_metaspace_compressed_klass_ptrs() > to just "initialize compressed class space from a pre-arranged > ReservedSpace, and set up base + shift". > > We could probably rename that thing > to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base); > > We even could move set_narrow_klass_base_and_shift() out of > Metaspace::set_up_compressed_klass_space, then it becomes a series of three > simple operations: > 1) obtain a ReservedSpace however you see fit > 2) register it with Metaspace as address space for ccs, > 3) set_narrow_klass_base_and_shift. We would not have to hand down > cds_base to Metaspace, only for it to be used as base address > in set_narrow_klass_base_and_shift. > > > Yes, that seems the right thing to do. 
That will hopefully make the > aarch64 initialization code a little simpler as well. > > One question which came to me today was: > > In AppCDS, DynamicArchiveBuilder::do_it() calls > Metaspace::reserve_space(). Is that really needed, does a DumpRegion have > anything to do with ccs? Don't they just need some space to dump into? Hope > that question is not dumb. > > Do you mean: > > DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta() > -> MetaspaceShared::reserve_shared_space > -> Metaspace::reserve_space > > That's not necessary. When I wrote the code I thought > Metaspace::reserve_space was a general function for reserving spaces :-) > but as you said, this function is probably intended only for initializing > the CCS. > > Thanks > - Ioi > > Thanks, Thomas > > >> Thanks >> - Ioi >> >> >> Does that make sense? In other words, if the whole point of >> Metaspace::reserve_preferred_space() is "OS knows better, let it try >> to find a good address", would it not make sense to just try a low >> address as part of the try-addresses-loop? >> >> We certainly don't want to have to use a dedicated heapbase register >> or a shift. Just give us a multiple of 4*G and we're happy. >> >> >> >> > From nick.gasson at arm.com Mon Apr 20 02:35:48 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Mon, 20 Apr 2020 10:35:48 +0800 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> Message-ID: <851roiopdn.fsf@arm.com> On 04/17/20 16:47 pm, Andrew Haley wrote: > On 4/16/20 6:51 PM, Thomas St?fe wrote: >> Good to know. So, zero based encoding does not have any special place in >> your heart? 4G aligned base works just as well? > > Absolutely, yes. There's an extra constraint: above 32G the alignment needs to be (4 << LogKlassAlignmentInBytes)*G if the compressed klass shift is non-zero. 
There's some explanation here: https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html Nick From nick.gasson at arm.com Mon Apr 20 02:55:29 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Mon, 20 Apr 2020 10:55:29 +0800 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> Message-ID: <85zhb6n9we.fsf@arm.com> On 04/18/20 14:26 pm, Thomas St?fe wrote: > Just occurred to me that aarch64 also relies on SharedBaseAddress being 4G > aligned. The default is 32G so it works out. If you modify it with > -XX:SharedBaseAddress, looks like the setting is ignored when the value is > not usable. > Yes that's correct, it's treated as a hint on AArch64 since 8234794. Because MacroAssembler::{decode,encode}_klass cannot implement arbitrary base + (src << shift) without an additional temporary register that isn't always available when it's called. It seems better to constrain the possible base addresses than reserve a dedicated compressed class base register. All the different compressed class decoding modes should now be covered by the jtreg tests. So if you have access to an AArch64 machine, running these should be sufficient to prevent regressions. I'm also happy to help with testing. 
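
To make "usable" a bit more concrete, here is a rough sketch of the base
address constraint as it has been described in this thread: a multiple of 4G,
and a multiple of (4 << LogKlassAlignmentInBytes)*G above 32G when the shift is
non-zero. This is not the HotSpot code. The class and method names are
hypothetical, LogKlassAlignmentInBytes = 3 is an assumed value, and reading
"above 32G" as "at or above 32G" is also an assumption.

```java
public class KlassBaseCheck {
    public static final long G = 1L << 30;

    // Assumed value, for illustration only; HotSpot defines its own constant.
    public static final int LOG_KLASS_ALIGNMENT_IN_BYTES = 3;

    // Alignment a candidate base would need under the constraint described above.
    public static long requiredAlignment(long base, int shift) {
        if (base >= 32 * G && shift != 0) {
            return (4L << LOG_KLASS_ALIGNMENT_IN_BYTES) * G;
        }
        return 4 * G;
    }

    // True if the candidate base is a multiple of the required alignment.
    public static boolean isUsableBase(long base, int shift) {
        return base % requiredAlignment(base, shift) == 0;
    }

    public static void main(String[] args) {
        System.out.println(isUsableBase(32 * G, 0));  // default SharedBaseAddress, zero shift
        System.out.println(isUsableBase(36 * G, 3));  // 4G aligned but not 32G aligned
        System.out.println(isUsableBase(64 * G, 3));
    }
}
```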
Thanks, Nick From Pengfei.Li at arm.com Mon Apr 20 04:32:00 2020 From: Pengfei.Li at arm.com (Pengfei Li) Date: Mon, 20 Apr 2020 04:32:00 +0000 Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in compressed mode In-Reply-To: <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com> References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com> <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com> <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com> <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com> <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com> <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com> <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com>, <781CB090-0386-4D32-8465-8238E516789B@amazon.com> <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com> Message-ID: Hi Wei, > Thanks for all feedback. I think this patch has enough review and can be merged. > > Hi Pengfei, > > I need help to push it. Could you help to merge it? I'm not a reviewer, and not sure whether your updated webrev.01 [1] still requires an official reviewer to confirm. Maybe Andrew Haley or other AArch64 reviewers can help? [1] http://cr.openjdk.java.net/~wzhuo/8242449/webrev.01/ -- Thanks, Pengfei From thomas.stuefe at gmail.com Mon Apr 20 06:35:04 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 20 Apr 2020 08:35:04 +0200 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: <85zhb6n9we.fsf@arm.com> References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> <85zhb6n9we.fsf@arm.com> Message-ID: Hi Nick, thanks for your explanations! This may be a stupid question, but you rheapbase was used for both en/decoding compressed oops and compressed class pointers, right? Just was confused by the name. I would like to abstract the logic of allocating "good" memory for ccs away from the shared code into platform dependent files (e.g. 
metaspace_aarch64.cpp) so that each platform can cleanly implement whatever they feel is right. Ages ago we had a similar thing for allocating heap memory on AIX, that worked quite well. So, if I were to give you these prototypes: + // Given a size, reserve a space anywhere, suitable to be used as backing storage for ccs. The return + // address will be the base address for encoding/decoding compressed Klass pointers. + // Depending on the platform, this function may allocate anywhere or attempt platform specific optimized + // placement. + // If size is not aligned to Metaspace::reserve_alignment it will be corrected, so the returned space may be larger. + // On failure an unreserved space is returned. + static ReservedSpace reserve_compressed_class_space_anywhere(size_t size); + + // Given a size and an address p, reserve a space at address p, suitable to be used as backing storage for ccs. + // If p is not a suitable base address for encoding/decoding compressed Klass pointers, function will fail. + // Attach point has to be aligned to metaspace reserve alignment. + // If size is not aligned to Metaspace::reserve_alignment it will be corrected, so the returned space may be larger. + // On failure an unreserved space is returned. + static ReservedSpace reserve_compressed_class_space_at(address p, size_t size); this should be enough to implement platform dependent ccs allocation, right? Cheers, Thomas On Mon, Apr 20, 2020 at 4:56 AM Nick Gasson wrote: > On 04/18/20 14:26 pm, Thomas St?fe wrote: > > Just occurred to me that aarch64 also relies on SharedBaseAddress being > 4G > > aligned. The default is 32G so it works out. If you modify it with > > -XX:SharedBaseAddress, looks like the setting is ignored when the value > is > > not usable. > > > > Yes that's correct, it's treated as a hint on AArch64 since > 8234794. 
Because MacroAssembler::{decode,encode}_klass cannot implement > arbitrary base + (src << shift) without an additional temporary register > that isn't always available when it's called. It seems better to > constrain the possible base addresses than reserve a dedicated > compressed class base register. > > All the different compressed class decoding modes should now be covered > by the jtreg tests. So if you have access to an AArch64 machine, running > these should be sufficient to prevent regressions. I'm also happy to > help with testing. > > > Thanks, > Nick > From thomas.stuefe at gmail.com Mon Apr 20 06:37:48 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Mon, 20 Apr 2020 08:37:48 +0200 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> <85zhb6n9we.fsf@arm.com> Message-ID: On Mon, Apr 20, 2020 at 8:35 AM Thomas St?fe wrote: > Hi Nick, > > thanks for your explanations! > > This may be a stupid question, but you rheapbase was used for both > en/decoding compressed oops and compressed class pointers, right? Just was > confused by the name. > > I would like to abstract the logic of allocating "good" memory for ccs > away from the shared code into platform dependent files (e.g. > metaspace_aarch64.cpp) so that each platform can cleanly implement whatever > they feel is right. Ages ago we had a similar thing for allocating heap > memory on AIX, that worked quite well. > > So, if I were to give you these prototypes: > > + // Given a size, reserve a space anywhere, suitable to be used as > backing storage for ccs. The return > + // address will be the base address for encoding/decoding compressed > Klass pointers. > + // Depending on the platform, this function may allocate anywhere or > attempt platform specific optimized > + // placement. 
> + // If size is not aligned to Metaspace::reserve_alignment it will be > corrected, so the returned space may be larger. > + // On failure an unreserved space is returned. > + static ReservedSpace reserve_compressed_class_space_anywhere(size_t > size); > + > + // Given a size and an address p, reserve a space at address p, > suitable to be used as backing storage for ccs. > + // If p is not a suitable base address for encoding/decoding compressed > Klass pointers, function will fail. > + // Attach point has to be aligned to metaspace reserve alignment. > + // If size is not aligned to Metaspace::reserve_alignment it will be > corrected, so the returned space may be larger. > + // On failure an unreserved space is returned. > + static ReservedSpace reserve_compressed_class_space_at(address p, > size_t size); > > this should be enough to implement platform dependent ccs allocation, > right? > > (Specifically, leaving the part of "checking request address if it fits zero based encoding" out, since the odds that the caller knows a low address which just happens to be free on my platforms are low; if zero based is important, platform probably knows better and can come up with a low address on its own.) > Cheers, Thomas > > On Mon, Apr 20, 2020 at 4:56 AM Nick Gasson wrote: > >> On 04/18/20 14:26 pm, Thomas St?fe wrote: >> > Just occurred to me that aarch64 also relies on SharedBaseAddress being >> 4G >> > aligned. The default is 32G so it works out. If you modify it with >> > -XX:SharedBaseAddress, looks like the setting is ignored when the value >> is >> > not usable. >> > >> >> Yes that's correct, it's treated as a hint on AArch64 since >> 8234794. Because MacroAssembler::{decode,encode}_klass cannot implement >> arbitrary base + (src << shift) without an additional temporary register >> that isn't always available when it's called. It seems better to >> constrain the possible base addresses than reserve a dedicated >> compressed class base register. 
>> >> All the different compressed class decoding modes should now be covered >> by the jtreg tests. So if you have access to an AArch64 machine, running >> these should be sufficient to prevent regressions. I'm also happy to >> help with testing. >> >> >> Thanks, >> Nick >> > From nick.gasson at arm.com Mon Apr 20 07:00:51 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Mon, 20 Apr 2020 15:00:51 +0800 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> <85zhb6n9we.fsf@arm.com> Message-ID: <85y2qqmyjg.fsf@arm.com> Hi Thomas, > > This may be a stupid question, but you rheapbase was used for both en/decoding compressed > oops and compressed class pointers, right? Just was confused by the name. > Yes. Previously if the alignment of the compressed class base was such that it couldn't use EOR or MOVK to decode, it would fall back to a generic shift+add using the compressed oop base register, rheapbase, as temporary. This works because you can call reinit_heapbase to restore the original value of rheapbase. The problem with this is that it generates ~2x the instructions of the other decoding modes, which breaks some size assertions elsewhere in addition to just being inefficient. Additionally if compressed oops are not enabled we want to be able to reuse rheapbase as a general allocatable register, but in that case we can't use it as temporary for decoding class pointers. > I would like to abstract the logic of allocating "good" memory for ccs away from the shared > code into platform dependent files (e.g. metaspace_aarch64.cpp) so that each platform can > cleanly implement whatever they feel is right. Ages ago we had a similar thing for > allocating heap memory on AIX, that worked quite well. > > So, if I were to give you these prototypes: > > + // Given a size, reserve a space anywhere, suitable to be used as backing storage for > ccs. 
The return > + // address will be the base address for encoding/decoding compressed Klass pointers. > + // Depending on the platform, this function may allocate anywhere or attempt platform > specific optimized > + // placement. > + // If size is not aligned to Metaspace::reserve_alignment it will be corrected, so the > returned space may be larger. > + // On failure an unreserved space is returned. > + static ReservedSpace reserve_compressed_class_space_anywhere(size_t size); > + > + // Given a size and an address p, reserve a space at address p, suitable to be used as > backing storage for ccs. > + // If p is not a suitable base address for encoding/decoding compressed Klass pointers, > function will fail. > + // Attach point has to be aligned to metaspace reserve alignment. > + // If size is not aligned to Metaspace::reserve_alignment it will be corrected, so the > returned space may be larger. > + // On failure an unreserved space is returned. > + static ReservedSpace reserve_compressed_class_space_at(address p, size_t size); > > this should be enough to implement platform dependent ccs allocation, right? > Yes I think this works. 
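
The decoding constraint described earlier in this message can be illustrated
with plain arithmetic. The following is a sketch under assumptions, not the
MacroAssembler code: when the base shares no set bits with the (shifted)
narrow value, an exclusive-or gives the same result as the generic add, so no
temporary register is needed; when they overlap, the two results differ. The
class and method names are hypothetical.

```java
public class KlassDecodeSketch {
    public static final long G = 1L << 30;

    // Generic decode: base + (narrow << shift).
    public static long decodeAdd(long base, int narrowKlass, int shift) {
        return base + (Integer.toUnsignedLong(narrowKlass) << shift);
    }

    // Exclusive-or decode: only equivalent to the add when base and the
    // shifted narrow value have no set bits in common.
    public static long decodeXor(long base, int narrowKlass, int shift) {
        return base ^ (Integer.toUnsignedLong(narrowKlass) << shift);
    }

    public static void main(String[] args) {
        int narrow = 4096;
        long aligned = 8 * G;           // low 32 bits are zero
        long unaligned = 8 * G + 4096;  // overlaps with the narrow value
        System.out.println(decodeAdd(aligned, narrow, 0) == decodeXor(aligned, narrow, 0));     // true
        System.out.println(decodeAdd(unaligned, narrow, 0) == decodeXor(unaligned, narrow, 0)); // false
    }
}
```

With a shift of 3 the shifted value can occupy up to 35 bits, which is one way
to see why a larger base alignment is needed in that case.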

Thanks,
Nick

From aph at redhat.com Mon Apr 20 08:48:50 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 20 Apr 2020 09:48:50 +0100
Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in compressed mode
In-Reply-To:
References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com> <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com> <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com> <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com> <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com> <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com> <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com> <781CB090-0386-4D32-8465-8238E516789B@amazon.com> <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com>
Message-ID:

On 4/20/20 5:32 AM, Pengfei Li wrote:
> Maybe Andrew Haley or other AArch64 reviewers can help?
>
> [1] http://cr.openjdk.java.net/~wzhuo/8242449/webrev.01/

It's fine. At some point in the future maybe we can get round to taking
out all references to rheapbase, but it'll require careful thinking about
JVMCI and Graal-precompiled code.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From ioi.lam at oracle.com Mon Apr 20 08:47:00 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 20 Apr 2020 01:47:00 -0700
Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics
In-Reply-To:
References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>
Message-ID:

On 4/18/20 12:15 AM, Thomas Stüfe wrote:
> Hi Ioi,
>
> I am working on a small patch and have some more questions.
>
> - First, a simple one, in
> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta(),
> the space does not have anything to do with metaspace, as you wrote,
> so the alignment could be anything, right?
>

I think so.

> - Out of curiosity, when you pack the different regions
> (DumpRegion::pack) you align the end to page size. Why? Why could the
> next region not simply follow immediately? I looked if any code needs
> a region to be page aligned, but may have missed it.

We map RO read-only and MC/RW in read-write. If the regions are not
aligned, you will have a page that wants half to be read-only and half
to be read-write.

I guess we can adjust the mapping to be more lenient (if a page wants
half read-write, we map it read-write), but that's not done today.

>
> - void MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() :
>
> I assume this code has to work for all three cases right
> 1) lp32.
> 2) lp64 with and without UseCompressedClassPointers?
> 3) lp64 without UseCompressedClassPointers?
>
> If yes, does the setting for UseCompressedClassPointers have to be the
> same at run time?

Yes. The value of UseCompressedOops and UseCompressedClassPointers must
be the same between dump time and run time.

>
>
> In this layout:
>   // On 64-bit VM, the heap and class space layout will be the same as if
>   // you're running in -Xshare:on mode:
>   //
>   //                              +-- SharedBaseAddress (default = 0x800000000)
>   //                              v
>   // +-..---------+---------+ ... +----+----+----+--------------------+
>   // |    Heap    | Archive |     | MC | RW | RO |    class space     |
>   // +-..---------+---------+ ... +----+----+----+--------------------+
>   // |<--   MaxHeapSize  -->|     |<-- UnscaledClassSpaceMax = 4GB -->|
>   //
>
> Why does the class space have to follow mc+rw+ro? Could it come before?
>
>

Compressed klass pointers are stored in archived objects. If the class
space is now lower than SharedBaseAddress, you will need to rebase all
of the compressed klass pointers. This is not efficient and will slow
down start-up.

>
> Actually, does it have to be in the same space at all, or could it
> live somewhere completely different?

It can be higher. You just need to ensure that the distance from
SharedBaseAddress to the end of the class space is within max compressed
klass space size.

But, I am wondering why you're asking this :-)

> To ask in a more precise way: I understand that both the mc+rw+ro
> archives and the ccs have to live in an area encompassed by the
> compressed class pointers encoding scheme. I wonder whether there are
> any restrictions beyond that.
>
> Could there be a gap between archives and ccs?

Yes

> Can the order be reversed?

No.

> Do the relative positions between archives and ccs have to be the same
> between dump time and runtime?

No. All the pointers stored inside CDS point to inside of the MC/RW/RO
regions, so it doesn't retain any knowledge of where the CCS was at dump
time.

Thanks
- Ioi

>
> Thanks!
>
> On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam wrote:
>
>
>
> On 4/16/20 11:14 AM, Thomas Stüfe wrote:
>> Hi Ioi,
>>
>> On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam wrote:
>>
>> (I suppose you mean "compressed class space" by "ccs" :-)
>>
>>
>> Yes, I think I stole this from Stefan Karlsson :)
>>
>>
>>
>> I am not even sure if case (C) can happen at all.
>>
>> I admit that I've been guilty of making the interface even
>> more complicated with JDK-8231610 (Relocate the CDS archive
>> if it cannot be mapped to the requested address). Looks like
>> now is a good time to clean up.
>>
>>
>> The coding has been complicated to begin with, and then it
>> usually only gets worse since no-one has time for a revamp :( A
>> clean up would be very helpful.
>>
>> One reason I look at this coding now, beside the aarch64 problem,
>> was that I try to disentangle CDS from Metaspace, especially the
>> alignment policy. Remember, I tried to tackle this last summer?
>> but it keeps biting me. For such a small problem this is weirdly
>> complicated.
>>
>> One thing that can be cleaned up is the call to
>> Metaspace::allocate_metaspace_compressed_klass_ptrs:
>>
>> (a) when CDS is enabled:
>>
>>     Metaspace::global_initialize()
>>     -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>>        -> ... MetaspaceShared::map_archives()
>>          -> ... reserve the space, eventually calling Metaspace::reserve_space
>>          -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>>
>> (b) when CDS is disabled
>>
>>     Metaspace::global_initialize()
>>     -> allocate_metaspace_compressed_klass_ptrs
>>        -> (if cds is not enabled) Metaspace::reserve_space()
>>
>>
>> In case (b), we should first reserve the space, and then call into
>> allocate_metaspace_compressed_klass_ptrs. This will simplify the
>> arguments of allocate_metaspace_compressed_klass_ptrs, and will also
>> limit the variations of calls to Metaspace::reserve_space(). I think
>> this will make it possible to drop the use_requested_addr argument
>> and rely simply on (requested_addr != NULL)
>>
>>
>> So, in all cases we'd pre-reserve the ReservedSpace and hand it
>> down to Metaspace::allocate_metaspace_compressed_klass_ptrs()?
>>
>> This would melt down
>> Metaspace::allocate_metaspace_compressed_klass_ptrs() to just
>> "initialize compressed class space from a pre-arranged
>> ReservedSpace, and set up base + shift".
>>
>> We could probably rename that thing
>> to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs,
>> cds_base);
>>
>> We even could move set_narrow_klass_base_and_shift() out of
>> Metaspace::set_up_compressed_klass_space, then it becomes a
>> series of three simple operations:
>> 1) obtain a ReservedSpace however you see fit
>> 2) register it with Metaspace as address space for ccs,
>> 3) set_narrow_klass_base_and_shift. We would not have to hand
>> down cds_base to Metaspace, only for it to be used as base
>> address in set_narrow_klass_base_and_shift.
>>
>
> Yes, that seems the right thing to do. That will hopefully make
> the aarch64 initialization code a little simpler as well.
>
>> One question which came to me today was:
>>
>> In AppCDS, DynamicArchiveBuilder::do_it() calls
>> Metaspace::reserve_space(). Is that really needed, does a
>> DumpRegion have anything to do with ccs? Don't they just need
>> some space to dump into? Hope that question is not dumb.
>>
> Do you mean:
>
> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
> -> MetaspaceShared::reserve_shared_space
>    -> Metaspace::reserve_space
>
> That's not necessary. When I wrote the code I thought
> Metaspace::reserve_space was a general function for reserving
> spaces :-) but as you said, this function is probably intended
> only for initializing the CCS.
>
> Thanks
> - Ioi
>
>> Thanks, Thomas
>>
>> Thanks
>> - Ioi
>>
>>
>>>> Does that make sense? In other words, if the whole point of
>>>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>>>> to find a good address", would it not make sense to just try a low
>>>> address as part of the try-addresses-loop?
>>> We certainly don't want to have to use a dedicated heapbase register
>>> or a shift. Just give us a multiple of 4*G and we're happy.
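
The placement rule stated earlier in this message (the class space may
sit higher than the archive, with or without a gap, as long as the
distance from SharedBaseAddress to the end of the class space stays
within the maximum compressed klass space size) can be sketched as a
small C++ check. This is an illustration only: the function names are
hypothetical, the encoding base is assumed to equal the archive base
address, and narrow klass pointers are assumed to be 32 bits wide.

```cpp
#include <cassert>
#include <cstdint>

// Size of the address range reachable by a 32-bit narrow klass pointer
// with the given shift (4 GB when the shift is zero).
inline uint64_t encoding_range(unsigned shift) {
    assert(shift < 32);
    return (uint64_t{1} << 32) << shift;
}

// Hypothetical check: can every address in the class space be encoded
// relative to shared_base?
inline bool layout_is_encodable(uint64_t shared_base, uint64_t ccs_start,
                                uint64_t ccs_size, unsigned shift) {
    if (ccs_start < shared_base) {
        return false;  // class space below the base: offsets would be negative
    }
    const uint64_t ccs_end = ccs_start + ccs_size;
    return ccs_end - shared_base <= encoding_range(shift);
}
```

For example, with a base of 0x800000000 and no shift, a 1 GB class space
that starts 3 GB above the base still fits exactly. Moving it up by one
more page does not fit unless a shift is used.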
>>> >> > From Pengfei.Li at arm.com Mon Apr 20 09:54:40 2020 From: Pengfei.Li at arm.com (Pengfei Li) Date: Mon, 20 Apr 2020 09:54:40 +0000 Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in compressed mode In-Reply-To: References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com> <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com> <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com> <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com> <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com> <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com> <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com> <781CB090-0386-4D32-8465-8238E516789B@amazon.com> <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com> Message-ID: > It's fine. At some point in the future maybe we can get round to taking out all > references to rheapbase, but it'll require careful thinking about JVMCI and > Graal-precompiled code. Thanks Andrew. Pushed here http://hg.openjdk.java.net/jdk/jdk/rev/aedc9bf21743 -- Thanks, Pengfei From aph at redhat.com Mon Apr 20 10:01:10 2020 From: aph at redhat.com (Andrew Haley) Date: Mon, 20 Apr 2020 11:01:10 +0100 Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in compressed mode In-Reply-To: References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com> <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com> <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com> <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com> <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com> <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com> <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com> <781CB090-0386-4D32-8465-8238E516789B@amazon.com> <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com> Message-ID: <84c21683-eaba-5598-6a1d-c58abdb39014@redhat.com> On 4/20/20 9:48 AM, Andrew Haley wrote: > On 4/20/20 5:32 AM, Pengfei Li wrote: >> Maybe Andrew Haley or other AArch64 
reviewers can help? >> >> [1] http://cr.openjdk.java.net/~wzhuo/8242449/webrev.01/ > It's fine. Sorry, no it isn't fine. Please get rid of this hunk: --- old/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp 2020-04-14 21:18:52.009758661 +0800 +++ new/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp 2020-04-14 21:18:51.785764043 +0800 @@ -2185,6 +2185,10 @@ #if 0 assert (UseCompressedOops || UseCompressedClassPointers, "should be compressed"); assert (Universe::heap() != NULL, "java heap should be initialized"); + if (!UseCompressedOops || Universe::ptr_base() == NULL) { + // rheapbase is allocated as general register + return; + } if (CheckCompressedOops) { Label ok; push(1 << rscratch1->encoding(), sp); // cmpptr trashes rscratch1 -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From Pengfei.Li at arm.com Mon Apr 20 10:10:05 2020 From: Pengfei.Li at arm.com (Pengfei Li) Date: Mon, 20 Apr 2020 10:10:05 +0000 Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in compressed mode In-Reply-To: <84c21683-eaba-5598-6a1d-c58abdb39014@redhat.com> References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com> <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com> <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com> <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com> <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com> <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com> <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com> <781CB090-0386-4D32-8465-8238E516789B@amazon.com> <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com> <84c21683-eaba-5598-6a1d-c58abdb39014@redhat.com> Message-ID: Hi Andrew, > Sorry, no it isn't fine. 
Please get rid of this hunk: > > --- old/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp 2020- > 04-14 21:18:52.009758661 +0800 > +++ new/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp 2020- > 04-14 21:18:51.785764043 +0800 > @@ -2185,6 +2185,10 @@ > #if 0 > assert (UseCompressedOops || UseCompressedClassPointers, "should be > compressed"); > assert (Universe::heap() != NULL, "java heap should be initialized"); > + if (!UseCompressedOops || Universe::ptr_base() == NULL) { > + // rheapbase is allocated as general register > + return; > + } > if (CheckCompressedOops) { > Label ok; > push(1 << rscratch1->encoding(), sp); // cmpptr trashes rscratch1 Oh. It's already pushed just now. According to the process, we may need Wei to create another JBS to backout that part? -- Thanks, Pengfei From aph at redhat.com Mon Apr 20 10:23:41 2020 From: aph at redhat.com (Andrew Haley) Date: Mon, 20 Apr 2020 11:23:41 +0100 Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo introduced by JDK-8238690 In-Reply-To: References: Message-ID: <3b47599e-6b2f-06a9-6ea4-057795850065@redhat.com> On 4/17/20 10:14 AM, Yang Zhang wrote: > Ping it again. Could you please help to review this? I'm running it, and I get no vector code generated. How did you test it? -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Mon Apr 20 10:36:19 2020 From: aph at redhat.com (Andrew Haley) Date: Mon, 20 Apr 2020 11:36:19 +0100 Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo introduced by JDK-8238690 In-Reply-To: <3b47599e-6b2f-06a9-6ea4-057795850065@redhat.com> References: <3b47599e-6b2f-06a9-6ea4-057795850065@redhat.com> Message-ID: On 4/20/20 11:23 AM, Andrew Haley wrote: > On 4/17/20 10:14 AM, Yang Zhang wrote: >> Ping it again. Could you please help to review this? > > I'm running it, and I get no vector code generated. How did you test it? 

Sorry, my mistake. I'm testing it now.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From thomas.stuefe at gmail.com Mon Apr 20 11:10:42 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Mon, 20 Apr 2020 13:10:42 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics
In-Reply-To:
References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>
Message-ID:

On Mon, Apr 20, 2020 at 10:47 AM Ioi Lam wrote:

>
>
> On 4/18/20 12:15 AM, Thomas Stüfe wrote:
>
> Hi Ioi,
>
> I am working on a small patch and have some more questions.
>
> - First, a simple one, in
> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta(), the
> space does not have anything to do with metaspace, as you wrote, so the
> alignment could be anything, right?
>
> I think so.
>
> - Out of curiosity, when you pack the different regions
> (DumpRegion::pack) you align the end to page size. Why? Why could the next
> region not simply follow immediately? I looked if any code needs a region
> to be page aligned, but may have missed it.
>
>
> We map RO read-only and MC/RW in read-write. If the regions are not
> aligned, you will have a page that wants half to be read-only and half to
> be read-write.
>
>
Okay. I wondered why page align here and not allocation granularity. Now
I understand. I guess this is also the reason why we could not use large
pages for the archive?

I think this is fine, I did not want to change it. On some platforms we
have 64K (non-large) pages, but even there I think the waste would be
acceptable.

> I guess we can adjust the mapping to be more lenient (if a page wants half
> read-write, we map it read-write), but that's not done today.
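
The page-alignment point above can be made concrete with a short C++
sketch. The helper names are hypothetical and this is not the
DumpRegion::pack implementation. It shows that a region end which is not
page aligned leaves a page shared between two regions with different
protections, and how much padding the alignment costs for 4K and 64K
pages.

```cpp
#include <cassert>
#include <cstdint>

inline bool is_power_of_two(uint64_t v) {
    return v != 0 && (v & (v - 1)) == 0;
}

// Round value up to the next multiple of alignment (a power of two).
inline uint64_t align_up(uint64_t value, uint64_t alignment) {
    assert(is_power_of_two(alignment));
    return (value + alignment - 1) & ~(alignment - 1);
}

// True when a region boundary falls inside a page, so that one page
// would need two different protections (e.g. read-only and read-write).
inline bool boundary_splits_page(uint64_t boundary, uint64_t page_size) {
    assert(is_power_of_two(page_size));
    return (boundary & (page_size - 1)) != 0;
}

// Bytes wasted by padding a region end up to the next page boundary.
inline uint64_t padding_for(uint64_t region_end, uint64_t page_size) {
    return align_up(region_end, page_size) - region_end;
}
```

A region ending at offset 0x1800 splits a 4K page. After alignment the
next region starts at 0x2000, and the padding is 0x800 bytes (0xE800
bytes with 64K pages).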
> > > - void MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() : > > I assume this code has to work for all three cases right > 1) lp32. > 2) lp64 with and without UseCompressedClassPointers? > 3) lp64 without UseCompressedClassPointers? > > If yes, does the setting for UseCompressedClassPointers have to be the > same at run time? > > > Yes. The value of UseCompressedOops and UseCompressedClassPointers must be > the same between dump time and run time. > > > > In this layout: > // On 64-bit VM, the heap and class space layout will be the same as if > // you're running in -Xshare:on mode: > // > // +-- SharedBaseAddress (default = > 0x800000000) > // v > // +-..---------+---------+ ... +----+----+----+--------------------+ > // | Heap | Archive | | MC | RW | RO | class space | > // +-..---------+---------+ ... +----+----+----+--------------------+ > // |<-- MaxHeapSize -->| |<-- UnscaledClassSpaceMax = 4GB -->| > // > > Why does the class space has to follow mc+rw+ro? Could it come before? > > > Compressed klass pointers are stored in archived objects. If the class > space is now lower than SharedBaseAddress, you will need to rebase all of > the compressed klass pointers. This is not efficient and will slow down > start-up. > > Well, could SharedBaseAddress not point to start of the ccs: // +-- SharedBaseAddress (default = 0x800000000) // v // +----+----+----+-----------------------------------+ // | class space | ..gap maybe.. | MC | RW | RO // +----+----+----+-----------------------------------+ you'd then need to make sure that the relative offset of MC to SharedBaseAddress is the same at dump time and at runtime. Is my understanding correct? I am not saying I want to do this, I just try to understand the way ccs archive allocation works. > > > Actually, does it have to be in the same space at all, or could it live > somewhere completely different? > > > It can be higher. 
You just need to ensure that the distance between > SharedBaseAddress to the end of the class space is within max compressed > klass space size. > > But, I am wondering why you're asking this :-) > > I try to understand the allocation and where apply what restrictions. We have at least three parties, cds, metaspace and the underlying platform, all with their own subtleties of how the memory should be allocated: - metaspace will in the near future want a larger alignment than what cds uses for reservation. - platforms like aarch64 and maybe ppc want the compressed class base to look in a certain way Part of my confusion was that I always thought of CompressClassPointers::base() to be basically the same as the start of the ccs (maybe modulo being zero on zero-based mode) but that is obviously not true since CDS exists. So what I wrote first: "Metaspace::reserve_preferred_space.. Despite its generic-sounding name, these functions can only be used to allocate ccs." is actually not fully correct. In reality this space is to be used to allocate memory to house Klass structures so that their pointers are compressable, so the reserved start address has to be compatible with that. But, e.g., that start address does not have to be aligned to Metaspace::reserve_alignment(). In both cds dump and runtime case, the ccs is carved from the end part of the reserved space. Only that split point, and the size of that second part, have to be aligned to Metaspace::reserve_alignment(). Were we to allocate ccs first and put the archives behind it this would simplify some matters, but only minor points. I think the way it works now is okay. I will try to disentangle it a bit in a way you proposed. > To ask in a more precise way: I understand that both the mc+rw+ro archives > and the ccs have to live in an area encompassed by the compressed class > pointers encoding scheme. I wonder whether there are any restrictions > beyond that. > > Could there be a gap between archives and ccs? 
> > Yes > > Can the order be reversed? > > No. > > Do the relative positions between archives and ccs have to be the same > between dump time and runtime? > > No. All the pointers stored inside CDS point to inside of the MC/MW/RO > regions, so it doesn't retain any knowledge of where the CCS was at dump > time. > > Clear answers, thank you! ..Thomas > > Thanks > - Ioi > > > > Thanks! > > On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam wrote: > >> >> >> On 4/16/20 11:14 AM, Thomas St?fe wrote: >> >> Hi Ioi, >> >> On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam wrote: >> >>> (I suppose you mean "compressed class space" by "ccs" :-) >>> >>> >> Yes, I think I stole this from Stefan Karlsson :) >> >> >>> >>> >> >> >>> I am not even sure if case (C) can happen at all. >>> >>> I admit that I've been guilty of making the interface even more >>> complicated >>> with JDK-8231610 >>> (Relocate the CDS archive if it cannot be mapped to the >>> requested address). Looks now is a good time to clean up. >>> >>> >> The coding has been complicated to begin with, and then it usually only >> gets worse since no-one has time for a revamp :( A clean up would be very >> helpful. >> >> One reason I look at this coding now, beside the aarch64 problem, was >> that I try to disentangle CDS from Metaspace, especially the alignment >> policy. Remember, I tried to tackle this last summer? but it keeps biting >> me. For such a small problem this is weirdly complicated. >> >> >>> One thing that can be cleaned up is the call to >>> Metaspace::allocate_metaspace_compressed_klass_ptrs: >>> >>> (a) when CDS is enabled: >>> >>> Metaspace::global_initialize() >>> -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces() >>> -> ... MetaspaceShared::map_archives() >>> -> ... 
reserve the space, eventually calling >>> Metaspace::reserve_space >>> -> call Metaspace::allocate_metaspace_compressed_klass_ptrs() >>> >>> (b) when CDS is disabled >>> >>> Metaspace::global_initialize() >>> -> allocate_metaspace_compressed_klass_ptrs >>> -> (if cds is not enabled) Metaspace::reserve_space() >>> >>> >>> In case (b), we should first reserve the space, and then call into >>> allocate_metaspace_compressed_klass_ptrs. This will simplify the >>> arguments >>> of allocate_metaspace_compressed_klass_ptrs, and will also limit the >>> variations >>> of calls to Metaspace::reserve_space(). I think this will make it >>> possible to >>> drop the use_requested_addr argument and rely simply on (requested_addr >>> != NULL) >>> >>> >> So, in all cases we'd pre-reserve the ReservedSpace and hand it down to >> Metaspace::allocate_metaspace_compressed_klass_ptrs()? >> >> This would melt down >> Metaspace::allocate_metaspace_compressed_klass_ptrs() to just "initialize >> compressed class space from a pre-arranged ReservedSpace, and set up base + >> shift". >> >> We could probably rename that thing >> to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base); >> >> We even could move set_narrow_klass_base_and_shift() out of >> Metaspace::set_up_compressed_klass_space, then it becomes a series of three >> simple operations: >> 1) obtain a ReservedSpace however you see fit >> 2) register it with Metaspace as address space for ccs, >> 3) set_narrow_klass_base_and_shift. We would not have to hand down >> cds_base to Metaspace, only for it to be used as base address >> in set_narrow_klass_base_and_shift. >> >> >> Yes, that seems the right thing to do. That will hopefully make the >> aarch64 initialization code a little simpler as well. >> >> One question which came to me today was: >> >> In AppCDS, DynamicArchiveBuilder::do_it() calls >> Metaspace::reserve_space(). Is that really needed, does a DumpRegion have >> anything to do with ccs? 
Don't they just need some space to dump into? Hope
>> that question is not dumb.
>>
>> Do you mean:
>>
>> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
>> -> MetaspaceShared::reserve_shared_space
>>    -> Metaspace::reserve_space
>>
>> That's not necessary. When I wrote the code I thought
>> Metaspace::reserve_space was a general function for reserving spaces :-)
>> but as you said, this function is probably intended only for initializing
>> the CCS.
>>
>> Thanks
>> - Ioi
>>
>> Thanks, Thomas
>>
>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>> Does that make sense? In other words, if the whole point of
>>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>>> to find a good address", would it not make sense to just try a low
>>> address as part of the try-addresses-loop?
>>>
>>> We certainly don't want to have to use a dedicated heapbase register
>>> or a shift. Just give us a multiple of 4*G and we're happy.
>>>
>>>
>>>
>>>
>>
>

From aph at redhat.com Mon Apr 20 11:50:33 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 20 Apr 2020 12:50:33 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo introduced by JDK-8238690
In-Reply-To:
References:
Message-ID:

On 4/17/20 10:14 AM, Yang Zhang wrote:
>
> Ping it again. Could you please help to review this?

Before:

Benchmark               Mode  Cnt    Score   Error  Units
TestVect.testVectShift  avgt    5  141.027 ± 0.117  us/op

  0.41%  0x0000ffffa8c5fc40: sbfiz x15, x11, #1, #32
         0x0000ffffa8c5fc44: add x16, x18, x15   ;*saload {reexecute=0 rethrow=0 return_oop=0}
                                                 ; - org.sample.TestVect::testVectShift@16 (line 31)
         0x0000ffffa8c5fc48: ldr q16, [x16, #16]
  0.51%  0x0000ffffa8c5fc4c: neg v17.16b, v18.16b
         0x0000ffffa8c5fc50: sshl v16.8h, v16.8h, v17.8h
         0x0000ffffa8c5fc54: add x15, x17, x15

After:

Benchmark               Mode  Cnt    Score   Error  Units
TestVect.testVectShift  avgt    5  143.021 ± 0.506  us/op

  0.46%  0x0000ffff78c61f00: sbfiz x13, x15, #1, #32
         0x0000ffff78c61f04: add x14, x17, x13   ;*saload {reexecute=0 rethrow=0 return_oop=0}
                                                 ; - org.sample.TestVect::testVectShift@16 (line 31)
         0x0000ffff78c61f08: ldr q16, [x14, #16]
  0.36%  0x0000ffff78c61f0c: sshr v16.8h, v16.8h, #2
         0x0000ffff78c61f10: add x13, x16, x13

So, at least on this thing it makes no difference. I'll grant you it's
less code, so OK.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From stuart.monteith at arm.com Mon Apr 20 14:19:28 2020
From: stuart.monteith at arm.com (Stuart Monteith)
Date: Mon, 20 Apr 2020 15:19:28 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading
In-Reply-To: <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
Message-ID: <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>

Hi,
If anyone has bandwidth, would they be able to review this patch? It
addresses Andrew, Per and Erik's comments:
http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/

Thanks,
Stuart

On 27/03/2020 09:47, Erik Österlund wrote:
> Hi Stuart,
>
> Thanks for sorting this out on AArch64. It is nice to see that you can
> implement these barriers on platforms that do not have instruction
> cache coherency.
>
> One small change request:
> It looks like in C1 you inject the entry barrier right after build_frame
> is done:
>
>       build_frame();
>       {
>         // Insert nmethod entry barrier into frame.
>         BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
>         bs->nmethod_entry_barrier(_masm);
>       }
>
> Unfortunately, this is in the platform independent part of the LIR
> assembler. In the x86 version we inject it at the very end of
> build_frame() instead, which is a platform-specific function.
> The platform-specific function is in the C1 macro assembler file for
> that platform.
>
> We intentionally put it in the platform-specific path as it is a
> platform-specific feature.
> Now on x86, the barrier code will be emitted once in build_frame() and
> once after returning from build_frame, resulting in two nmethod entry
> barriers, and only the first one will get patched, causing the second
> one to mostly take slow paths, which isn't necessarily wrong, but will
> cause regressions.
>
> I would propose you just move those lines into the very end of the
> AArch64-specific part of build_frame().
>
> I don't need to see another webrev for that trivial code motion. This
> looks good to me.
> Again, thanks a lot for fixing this! It will allow me to go forward with
> concurrent stack scanning on AArch64 as well.
>
> Thanks,
> /Erik
>
>
> On 2020-03-26 23:42, Stuart Monteith wrote:
>> Hello,
>>         Please review this change to implement nmethod entry barriers on
>> aarch64, and hence concurrent class unloading with ZGC. Shenandoah will
>> need to be separately tested and enabled - there are problems with this
>> on Shenandoah.
>>
>> It has been tested with JTreg, runs with SPECjbb, gcbench, and Lucene as
>> well as Netbeans.
>>
>> In terms of interesting features:
>>         With nmethod entry barriers, immediate oops are removed by:
>>                 LIR_Assembler::jobject2reg and MacroAssembler::movoop
>>         This is to ensure consistency with the entry barrier, as
>> otherwise with an immediate we'd need an ISB.
>>
>>         I've added "-XX:DeoptNMethodBarrierALot". I found this
>> functionality useful in testing as deoptimisation is very infrequent.
>> I've written it as an atomic to avoid it happening too frequently. As
>> it is a new option, I'm not sure whether any more is needed than this
>> review. A new test has been added
>> "test/hotspot/jtreg/gc/stress/gcbasher/TestGCBasherDeoptWithZ.java" to
>> test GC with that option enabled.
>>
>>         BarrierSetAssembler::nmethod_entry_barrier
>>         This method emits the barrier code. In internal review it was
>> suggested the "dmb( ISHLD )" should be replaced by "membar(LoadLoad)".
>> I've not done this as the BarrierSetNMethod code checks the exact
>> instruction sequence, and I prefer to be explicit.
>>
>>         Benchmarking method entry shows an increase of around 6ns
>> with the nmethod entry barrier.
>>
>>
>> The deoptimisation code was contributed by Andrew Haley.
>>
>> The bug:
>>         https://bugs.openjdk.java.net/browse/JDK-8216557
>>
>> The webrev:
>>         http://cr.openjdk.java.net/~smonteith/8216557/webrev.0/
>>
>>
>> BR,
>>         Stuart

From aph at redhat.com Mon Apr 20 16:35:35 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 20 Apr 2020 17:35:35 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading
In-Reply-To: <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com> <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
Message-ID: <7cd269af-c621-8d33-c7d6-1baa6729fc31@redhat.com>

On 4/20/20 3:19 PM, Stuart Monteith wrote:
> If anyone has bandwidth, would they be able to review this patch? It
> addresses Andrew, Per and Erik's comments:
> http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/

Yes, yes. I'm pedalling as quickly as I can.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Mon Apr 20 17:22:12 2020 From: aph at redhat.com (Andrew Haley) Date: Mon, 20 Apr 2020 18:22:12 +0100 Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading In-Reply-To: <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com> References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com> <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com> Message-ID: On 4/20/20 3:19 PM, Stuart Monteith wrote: > If anyone has bandwidth, would they be able to review this patch? It > addresses Andrew, Per and Erik's comments: > http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/ It looks right. For clarity I wonder if perhaps we should have a method bool BarrierSet::use_nmethod_barriers() or somesuch. It would be much easier to read. In future we should perhaps not inline the guard value, and move the infrequently-executed code out of line. Then this: 0x0000ffffa97fff14: ldr w8, 0x0000ffffa97fff3c 0x0000ffffa97fff18: dmb ishld 0x0000ffffa97fff1c: ldr w9, [x28, #36] 0x0000ffffa97fff20: cmp w8, w9 0x0000ffffa97fff24: b.eq 0x0000ffffa97fff40 // b.none ;; 0xFFFFA9118B00 0x0000ffffa97fff28: mov x8, #0x8b00 // #35584 0x0000ffffa97fff2c: movk x8, #0xa911, lsl #16 0x0000ffffa97fff30: movk x8, #0xffff, lsl #32 0x0000ffffa97fff34: blr x8 0x0000ffffa97fff38: b 0x0000ffffa97fff40 0x0000ffffa97fff3c would turn into this: : ldr w8, 0x0000ffffa97fff3c : dmb ishld : ldr w9, [x28, #36] : cmp w8, w9 : b.ne 0x0000ffffa97fff40 // b.none but we don't have to do that right now. OK. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From ci_notify at linaro.org Mon Apr 20 18:06:39 2020 From: ci_notify at linaro.org (ci_notify at linaro.org) Date: Mon, 20 Apr 2020 18:06:39 +0000 (UTC) Subject: [aarch64-port-dev ] Linaro OpenJDK AArch64 jdk/jdk build 3385 Failure Message-ID: <265510322.18108.1587406000291.JavaMail.javamailuser@localhost> OpenJDK AArch64 jdk/jdk build status is Failure Build details - https://ci.linaro.org/job/jdkX-ci-build/3385/ Changes - erikj: f499459eda7ae5bc843e072e0cfdd6201636456f - make/autoconf/boot-jdk.m4 - make/autoconf/version-numbers - make/conf/jib-profiles.js --"8242863: Bump minimum boot jdk to JDK 14 Reviewed-by: ihse, jlahoda, dholmes " Build output - Creating java.rmi.jmod Creating java.scripting.jmod Creating java.se.jmod Creating java.security.jgss.jmod Creating java.security.sasl.jmod Creating java.smartcardio.jmod Creating java.sql.jmod Creating java.sql.rowset.jmod Creating java.transaction.xa.jmod Creating java.xml.jmod Creating jdk.accessibility.jmod Creating java.xml.crypto.jmod Creating jdk.aot.jmod Creating jdk.charsets.jmod Creating jdk.attach.jmod Creating jdk.compiler.jmod Creating jdk.crypto.cryptoki.jmod Creating jdk.crypto.ec.jmod Creating jdk.dynalink.jmod Creating jdk.editpad.jmod Creating jdk.hotspot.agent.jmod Creating jdk.httpserver.jmod Creating jdk.incubator.foreign.jmod Creating jdk.incubator.jpackage.jmod Creating jdk.internal.ed.jmod Creating jdk.internal.jvmstat.jmod Creating jdk.internal.le.jmod Creating jdk.internal.opt.jmod Creating jdk.internal.vm.ci.jmod Creating jdk.internal.vm.compiler.jmod Creating jdk.internal.vm.compiler.management.jmod Creating jdk.jartool.jmod Creating jdk.javadoc.jmod Creating jdk.jcmd.jmod Creating jdk.jconsole.jmod Creating jdk.jdeps.jmod Creating jdk.jdwp.agent.jmod Creating jdk.jdi.jmod Creating jdk.jfr.jmod Creating jdk.jshell.jmod Creating jdk.jsobject.jmod Creating jdk.jstatd.jmod Creating jdk.localedata.jmod Creating 
jdk.management.jmod Creating jdk.management.agent.jmod Creating jdk.management.jfr.jmod Creating jdk.naming.dns.jmod Creating jdk.naming.rmi.jmod Creating jdk.net.jmod Creating jdk.nio.mapmode.jmod Creating jdk.sctp.jmod Creating jdk.security.auth.jmod Creating jdk.security.jgss.jmod Creating jdk.unsupported.jmod Creating jdk.unsupported.desktop.jmod Creating jdk.xml.dom.jmod Creating jdk.zipfs.jmod Creating interim jimage Compiling 3 files for BUILD_DEMO_CodePointIM Updating support/demos/image/jfc/CodePointIM/src.zip Compiling 3 files for BUILD_DEMO_FileChooserDemo Updating support/demos/image/jfc/FileChooserDemo/src.zip Compiling 29 files for BUILD_DEMO_SwingSet2 Updating support/demos/image/jfc/SwingSet2/src.zip Compiling 3 files for BUILD_DEMO_Font2DTest Updating support/demos/image/jfc/Font2DTest/src.zip Compiling 64 files for BUILD_DEMO_J2Ddemo Updating support/demos/image/jfc/J2Ddemo/src.zip Compiling 15 files for BUILD_DEMO_Metalworks Updating support/demos/image/jfc/Metalworks/src.zip Compiling 2 files for BUILD_DEMO_Notepad Updating support/demos/image/jfc/Notepad/src.zip Compiling 5 files for BUILD_DEMO_Stylepad Updating support/demos/image/jfc/Stylepad/src.zip Compiling 5 files for BUILD_DEMO_SampleTree Updating support/demos/image/jfc/SampleTree/src.zip Compiling 8 files for BUILD_DEMO_TableExample Updating support/demos/image/jfc/TableExample/src.zip Compiling 1 files for BUILD_DEMO_TransparentRuler Updating support/demos/image/jfc/TransparentRuler/src.zip Creating support/demos/image/jfc/FileChooserDemo/FileChooserDemo.jar Creating support/demos/image/jfc/CodePointIM/CodePointIM.jar Creating support/demos/image/jfc/Font2DTest/Font2DTest.jar Creating support/demos/image/jfc/Metalworks/Metalworks.jar Creating support/demos/image/jfc/Notepad/Notepad.jar Creating support/demos/image/jfc/Stylepad/Stylepad.jar Creating support/demos/image/jfc/SampleTree/SampleTree.jar Creating support/demos/image/jfc/TableExample/TableExample.jar Creating 
support/demos/image/jfc/TransparentRuler/TransparentRuler.jar Creating support/demos/image/jfc/SwingSet2/SwingSet2.jar Compiling 1 files for CLASSLIST_JAR Creating support/demos/image/jfc/J2Ddemo/J2Ddemo.jar Creating support/classlist.jar Creating jdk.jlink.jmod Creating java.base.jmod Creating jdk image WARNING: Using incubator modules: jdk.incubator.jpackage, jdk.incubator.foreign Creating CDS archive for jdk image Stopping sjavac server Finished building target 'images' in configuration '/home/buildslave/workspace/jdkX-ci-build/build' From stuart.monteith at arm.com Mon Apr 20 19:35:19 2020 From: stuart.monteith at arm.com (Stuart Monteith) Date: Mon, 20 Apr 2020 20:35:19 +0100 Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading In-Reply-To: References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com> <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com> Message-ID: On 20/04/2020 18:22, Andrew Haley wrote: > On 4/20/20 3:19 PM, Stuart Monteith wrote: >> If anyone has bandwidth, would they be able to review this patch? It >> addresses Andrew, Per and Erik's comments: >> http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/ > > It looks right. For clarity I wonder if perhaps we should have a method > bool BarrierSet::use_nmethod_barriers() or somesuch. It would be much > easier to read. > That would be good. How about I apply that as a separate patch, as it would necessarily be cross-platform? > In future we should perhaps not inline the guard value, and move the > infrequently-executed code out of line. 
> > Then this: > > 0x0000ffffa97fff14: ldr w8, 0x0000ffffa97fff3c > 0x0000ffffa97fff18: dmb ishld > 0x0000ffffa97fff1c: ldr w9, [x28, #36] > 0x0000ffffa97fff20: cmp w8, w9 > 0x0000ffffa97fff24: b.eq 0x0000ffffa97fff40 // b.none > ;; 0xFFFFA9118B00 > 0x0000ffffa97fff28: mov x8, #0x8b00 // #35584 > 0x0000ffffa97fff2c: movk x8, #0xa911, lsl #16 > 0x0000ffffa97fff30: movk x8, #0xffff, lsl #32 > 0x0000ffffa97fff34: blr x8 > 0x0000ffffa97fff38: b 0x0000ffffa97fff40 > 0x0000ffffa97fff3c > > would turn into this: > > : ldr w8, 0x0000ffffa97fff3c > : dmb ishld > : ldr w9, [x28, #36] > : cmp w8, w9 > : b.ne 0x0000ffffa97fff40 // b.none > > but we don't have to do that right now. OK. > That would be ideal - a CodeStub to handle the slow path would then take the responsibility for recording the relative location of the guard value - currently that happens to be fixed. I think I'd prefer to do that as a separate patch while I work out the details. Thanks for the review, Stuart From david.holmes at oracle.com Tue Apr 21 04:42:29 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 21 Apr 2020 14:42:29 +1000 Subject: [aarch64-port-dev ] Procedural issue regarding code reviews for mainline changes Message-ID: Hi everyone, aarch64-port-dev at openjdk.java.net is the mailing list for the aarch64-port project: http://openjdk.java.net/projects/aarch64-port/ which was setup to get the Aarch64 port into OpenJDK 8. While the mailing list still serves as a good place for people to specifically discuss issues around the Aarch64 port, all code reviews for changes to be pushed to mainline openjdk (not the aarch64-port project repo) should be occurring on the appropriate mainline hotspot*-dev mailing list (which correspond to the 'appropriate development list' as per http://openjdk.java.net/contribute/). 
Thanks, David From thomas.stuefe at gmail.com Tue Apr 21 05:18:30 2020 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 21 Apr 2020 07:18:30 +0200 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: <6b38ee93-6961-0a34-4b91-0d8cedce9ddd@oracle.com> References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com> <6b38ee93-6961-0a34-4b91-0d8cedce9ddd@oracle.com> Message-ID: On Tue, Apr 21, 2020 at 6:43 AM Ioi Lam wrote: > > > On 4/20/20 4:10 AM, Thomas Stüfe wrote: > > On Mon, Apr 20, 2020 at 10:47 AM Ioi Lam wrote: > >> >> >> On 4/18/20 12:15 AM, Thomas Stüfe wrote: >> >> Hi Ioi, >> >> I am working on a small patch and have some more questions. >> >> - First, a simple one, in >> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta(), the >> space does not have anything to do with metaspace, as you wrote, so the >> alignment could be anything, right? >> >> I think so. >> >> - Out of curiosity, when you pack the different regions >> (DumpRegion::pack) you align the end to page size. Why? Why could the next >> region not simply follow immediately? I looked if any code needs a region >> to be page aligned, but may have missed it. >> >> >> We map RO read-only and MC/RW in read-write. If the regions are not >> aligned, you will have a page that wants half to be read-only and half to >> be read-write. >> >> > Okay. I wondered why page align here and not allocation granularity. Now I > understand. I guess this is also the reason why we could not use large > pages for the archive? > > I think this is fine, I did not want to change it. On some platforms we > have 64K (non-large) pages, but even there I think the waste would be > acceptable. > > >> I guess we can adjust the mapping to be more lenient (if a page wants >> half read-write, we map it read-write), but that's not done today. 
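[Editorial sketch, not part of the thread] The page-alignment point quoted above can be illustrated with a few lines of C++. This is only a sketch under stated assumptions: the helper names (align_up, shares_page_with_next, next_region_start) are made up for illustration and are not the HotSpot functions, and the page sizes used in the example are simply the 4K and 64K cases mentioned in the discussion. The idea shown is that memory protection applies to whole pages, so the end of a read-only region is rounded up before a read-write region may start.

```cpp
#include <cassert>
#include <cstddef>

// Illustrative helper (an assumption, not HotSpot's own align_up): round
// `value` up to the next multiple of `alignment`, which must be a power of two.
inline std::size_t align_up(std::size_t value, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}

// True if `offset` lies strictly inside a page, i.e. a region ending there
// would share its last page with whatever region follows it.
inline bool shares_page_with_next(std::size_t offset, std::size_t page_size) {
  return (offset & (page_size - 1)) != 0;
}

// Offset at which the next region may start so that no single page holds
// data wanting two different protections (read-only vs. read-write).
inline std::size_t next_region_start(std::size_t region_end,
                                     std::size_t page_size) {
  return align_up(region_end, page_size);
}
```

With a hypothetical region ending at offset 5000, next_region_start(5000, 4096) yields 8192, and with 64K pages it yields 65536; the padding in between is the "waste" the thread judges acceptable.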
>> >> >> - void MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() : >> >> I assume this code has to work for all three cases right >> 1) lp32. >> 2) lp64 with and without UseCompressedClassPointers? >> 3) lp64 without UseCompressedClassPointers? >> >> If yes, does the setting for UseCompressedClassPointers have to be the >> same at run time? >> >> >> Yes. The value of UseCompressedOops and UseCompressedClassPointers must >> be the same between dump time and run time. >> >> >> >> In this layout: >> // On 64-bit VM, the heap and class space layout will be the same as if >> // you're running in -Xshare:on mode: >> // >> // +-- SharedBaseAddress (default = >> 0x800000000) >> // v >> // +-..---------+---------+ ... +----+----+----+--------------------+ >> // | Heap | Archive | | MC | RW | RO | class space | >> // +-..---------+---------+ ... +----+----+----+--------------------+ >> // |<-- MaxHeapSize -->| |<-- UnscaledClassSpaceMax = 4GB -->| >> // >> >> Why does the class space has to follow mc+rw+ro? Could it come before? >> >> >> Compressed klass pointers are stored in archived objects. If the class >> space is now lower than SharedBaseAddress, you will need to rebase all of >> the compressed klass pointers. This is not efficient and will slow down >> start-up. >> >> > Well, could SharedBaseAddress not point to start of the ccs: > > // +-- SharedBaseAddress (default = 0x800000000) > // v > // +----+----+----+-----------------------------------+ > // | class space | ..gap maybe.. | MC | RW | RO > // +----+----+----+-----------------------------------+ > > you'd then need to make sure that the relative offset of MC to > SharedBaseAddress is the same at dump time and at runtime. Is my > understanding correct? I am not saying I want to do this, I just try to > understand the way ccs archive allocation works. > > > That should work. But you are still using a fixed offset from the bottom > of MC to SharedBaseAddress (instead of a fixed address of 0). 
I am not sure > if that will buy you any flexibility. > > > >> >> >> Actually, does it have to be in the same space at all, or could it live >> somewhere completely different? >> >> >> It can be higher. You just need to ensure that the distance between >> SharedBaseAddress to the end of the class space is within max compressed >> klass space size. >> >> But, I am wondering why you're asking this :-) >> >> > I try to understand the allocation and where apply what restrictions. We > have at least three parties, cds, metaspace and the underlying platform, > all with their own subtleties of how the memory should be allocated: > - metaspace will in the near future want a larger alignment than what cds > uses for reservation. > - platforms like aarch64 and maybe ppc want the compressed class base to > look in a certain way > > Part of my confusion was that I always thought of > CompressClassPointers::base() to be basically the same as the start of the > ccs (maybe modulo being zero on zero-based mode) but that is obviously not > true since CDS exists. So what I wrote first: > > "Metaspace::reserve_preferred_space.. Despite its generic-sounding name, > these functions can only be used to allocate ccs." > > is actually not fully correct. In reality this space is to be used to > allocate memory to house Klass structures so that their pointers are > compressable, so the reserved start address has to be compatible with that. > But, e.g., that start address does not have to be aligned to > Metaspace::reserve_alignment(). > > That's very true. I have some logic in CDS to pick the greater of > Metaspace::reserve_alignment() and os alignment. I think this is probably > unnecessary. > > Maybe we should simply untangle CDS/CCS from Metaspace altogether? We > really just need a 1GB reservation to be anchored at 32GB (or some address > that aarch64 likes). That way, you can do whatever you want with Metaspace > and not worry about CDS/CCS. > > Yes, this is what I am trying to do. 
CDS and Metaspace do not have much in common beyond the ccs reservation. I still like your first proposal of moving the creation of ReservedSpace for ccs out of Metaspace altogether. I also plan to change the code so that Metaspace::reserve_alignment is not used in cds anymore, only at the one or two places where it is really needed. ..Thomas > Thanks > - Ioi > > In both cds dump and runtime case, the ccs is carved from the end part of > the reserved space. Only that split point, and the size of that second > part, have to be aligned to Metaspace::reserve_alignment(). > > Were we to allocate ccs first and put the archives behind it this would > simplify some matters, but only minor points. I think the way it works now > is okay. I will try to disentangle it a bit in a way you proposed. > > >> To ask in a more precise way: I understand that both the mc+rw+ro >> archives and the ccs have to live in an area encompassed by the compressed >> class pointers encoding scheme. I wonder whether there are any restrictions >> beyond that. >> >> Could there be a gap between archives and ccs? >> >> Yes >> >> Can the order be reversed? >> >> No. >> >> Do the relative positions between archives and ccs have to be the same >> between dump time and runtime? >> >> No. All the pointers stored inside CDS point to inside of the MC/RW/RO >> regions, so it doesn't retain any knowledge of where the CCS was at dump >> time. >> >> > Clear answers, thank you! > > ..Thomas > > >> >> Thanks >> - Ioi >> >> >> >> Thanks! >> >> On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam wrote: >> >>> >>> >>> On 4/16/20 11:14 AM, Thomas Stüfe wrote: >>> >>> Hi Ioi, >>> >>> On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam wrote: >>> >>>> (I suppose you mean "compressed class space" by "ccs" :-) >>>> >>>> >>> Yes, I think I stole this from Stefan Karlsson :) >>> >>> >>>> >>>> >>> >>> >>>> I am not even sure if case (C) can happen at all. 
>>>> >>>> I admit that I've been guilty of making the interface even more >>>> complicated >>>> with JDK-8231610 >>>> (Relocate the CDS archive if it cannot be mapped to the >>>> requested address). Looks now is a good time to clean up. >>>> >>>> >>> The coding has been complicated to begin with, and then it usually only >>> gets worse since no-one has time for a revamp :( A clean up would be very >>> helpful. >>> >>> One reason I look at this coding now, beside the aarch64 problem, was >>> that I try to disentangle CDS from Metaspace, especially the alignment >>> policy. Remember, I tried to tackle this last summer? but it keeps biting >>> me. For such a small problem this is weirdly complicated. >>> >>> >>>> One thing that can be cleaned up is the call to >>>> Metaspace::allocate_metaspace_compressed_klass_ptrs: >>>> >>>> (a) when CDS is enabled: >>>> >>>> Metaspace::global_initialize() >>>> -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces() >>>> -> ... MetaspaceShared::map_archives() >>>> -> ... reserve the space, eventually calling >>>> Metaspace::reserve_space >>>> -> call Metaspace::allocate_metaspace_compressed_klass_ptrs() >>>> >>>> (b) when CDS is disabled >>>> >>>> Metaspace::global_initialize() >>>> -> allocate_metaspace_compressed_klass_ptrs >>>> -> (if cds is not enabled) Metaspace::reserve_space() >>>> >>>> >>>> In case (b), we should first reserve the space, and then call into >>>> allocate_metaspace_compressed_klass_ptrs. This will simplify the >>>> arguments >>>> of allocate_metaspace_compressed_klass_ptrs, and will also limit the >>>> variations >>>> of calls to Metaspace::reserve_space(). I think this will make it >>>> possible to >>>> drop the use_requested_addr argument and rely simply on (requested_addr >>>> != NULL) >>>> >>>> >>> So, in all cases we'd pre-reserve the ReservedSpace and hand it down to >>> Metaspace::allocate_metaspace_compressed_klass_ptrs()? 
>>> >>> This would melt down >>> Metaspace::allocate_metaspace_compressed_klass_ptrs() to just "initialize >>> compressed class space from a pre-arranged ReservedSpace, and set up base + >>> shift". >>> >>> We could probably rename that thing >>> to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base); >>> >>> We even could move set_narrow_klass_base_and_shift() out of >>> Metaspace::set_up_compressed_klass_space, then it becomes a series of three >>> simple operations: >>> 1) obtain a ReservedSpace however you see fit >>> 2) register it with Metaspace as address space for ccs, >>> 3) set_narrow_klass_base_and_shift. We would not have to hand down >>> cds_base to Metaspace, only for it to be used as base address >>> in set_narrow_klass_base_and_shift. >>> >>> >>> Yes, that seems the right thing to do. That will hopefully make the >>> aarch64 initialization code a little simpler as well. >>> >>> One question which came to me today was: >>> >>> In AppCDS, DynamicArchiveBuilder::do_it() calls >>> Metaspace::reserve_space(). Is that really needed, does a DumpRegion have >>> anything to do with ccs? Don't they just need some space to dump into? Hope >>> that question is not dumb. >>> >>> Do you mean: >>> >>> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta() >>> -> MetaspaceShared::reserve_shared_space >>> -> Metaspace::reserve_space >>> >>> That's not necessary. When I wrote the code I thought >>> Metaspace::reserve_space was a general function for reserving spaces :-) >>> but as you said, this function is probably intended only for initializing >>> the CCS. >>> >>> Thanks >>> - Ioi >>> >>> Thanks, Thomas >>> >>> >>>> Thanks >>>> - Ioi >>>> >>>> >>>> Does that make sense? In other words, if the whole point of >>>> Metaspace::reserve_preferred_space() is "OS knows better, let it try >>>> to find a good address", would it not make sense to just try a low >>>> address as part of the try-addresses-loop? 
>>>> >>>> We certainly don't want to have to use a dedicated heapbase register >>>> or a shift. Just give us a multiple of 4*G and we're happy. >>>> >>>> >>>> >>>> >>> >> > From nick.gasson at arm.com Tue Apr 21 06:03:05 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Tue, 21 Apr 2020 14:03:05 +0800 Subject: [aarch64-port-dev ] Procedural issue regarding code reviews for mainline changes In-Reply-To: References: Message-ID: <85wo69ml46.fsf@arm.com> Hi David, > > aarch64-port-dev at openjdk.java.net is the mailing list for the > aarch64-port project: > > http://openjdk.java.net/projects/aarch64-port/ > > which was setup to get the Aarch64 port into OpenJDK 8. While the > mailing list still serves as a good place for people to specifically > discuss issues around the Aarch64 port, all code reviews for changes to > be pushed to mainline openjdk (not the aarch64-port project repo) should > be occurring on the appropriate mainline hotspot*-dev mailing list > (which correspond to the 'appropriate development list' as per > http://openjdk.java.net/contribute/). > Is there a specific change you're referring to? I checked the RFRs on this list in the last few months and I can't find any that aren't also To/CC the appropriate hotspot-* mailing list. For patches from Arm the policy we've been following is to send to the relevant hotspot-* list and also copy aarch64-port-dev if it touches AArch64-specific code. I think the way Mailman is configured, if a message is sent to multiple lists you are subscribed to then you'll only receive it from one of those. 
Thanks, Nick From david.holmes at oracle.com Tue Apr 21 06:25:01 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 21 Apr 2020 16:25:01 +1000 Subject: [aarch64-port-dev ] Procedural issue regarding code reviews for mainline changes In-Reply-To: <85wo69ml46.fsf@arm.com> References: <85wo69ml46.fsf@arm.com> Message-ID: <074188cf-fe78-2b80-ed2d-158eb19c1adf@oracle.com> Hi Nick, On 21/04/2020 4:03 pm, Nick Gasson wrote: > Hi David, > >> >> aarch64-port-dev at openjdk.java.net is the mailing list for the >> aarch64-port project: >> >> http://openjdk.java.net/projects/aarch64-port/ >> >> which was setup to get the Aarch64 port into OpenJDK 8. While the >> mailing list still serves as a good place for people to specifically >> discuss issues around the Aarch64 port, all code reviews for changes to >> be pushed to mainline openjdk (not the aarch64-port project repo) should >> be occurring on the appropriate mainline hotspot*-dev mailing list >> (which correspond to the 'appropriate development list' as per >> http://openjdk.java.net/contribute/). >> > > Is there a specific change you're referring to? I checked the RFRs on > this list in the last few months and I can't find any that aren't also > To/CC the appropriate hotspot-* mailing list. For patches from Arm the > policy we've been following is to send to the relevant hotspot-* list > and also copy aarch64-port-dev if it touches AArch64-specific code. I > think the way Mailman is configured, if a message is sent to multiple > lists you are subscribed to then you'll only receive it from one of > those. My apologies to all. I saw a JBS update with a link to a RFR thread on aarch64-dev and then failed to find the corresponding mail on hotspot-compiler-dev. Sorry for the noise. 
David ----- > > Thanks, > Nick > From Pengfei.Li at arm.com Tue Apr 21 06:35:32 2020 From: Pengfei.Li at arm.com (Pengfei Li) Date: Tue, 21 Apr 2020 06:35:32 +0000 Subject: [aarch64-port-dev ] Procedural issue regarding code reviews for mainline changes In-Reply-To: <074188cf-fe78-2b80-ed2d-158eb19c1adf@oracle.com> References: <85wo69ml46.fsf@arm.com> <074188cf-fe78-2b80-ed2d-158eb19c1adf@oracle.com> Message-ID: Hi David, > My apologies to all. I saw a JBS update with a link to a RFR thread on > aarch64-dev and then failed to find the corresponding mail on hotspot- > compiler-dev. > > Sorry for the noise. Perhaps you saw this one https://bugs.openjdk.java.net/browse/JDK-8242070 I have updated the JBS comment and will use the hotspot-*-dev URLs as official review thread links in the future. Anyway, thanks for reminding. -- Thanks, Pengfei From david.holmes at oracle.com Tue Apr 21 06:47:07 2020 From: david.holmes at oracle.com (David Holmes) Date: Tue, 21 Apr 2020 16:47:07 +1000 Subject: [aarch64-port-dev ] Procedural issue regarding code reviews for mainline changes In-Reply-To: References: <85wo69ml46.fsf@arm.com> <074188cf-fe78-2b80-ed2d-158eb19c1adf@oracle.com> Message-ID: <6cd09601-5904-2c07-1e9b-4f8975c5ccf4@oracle.com> Hi Pengfei, On 21/04/2020 4:35 pm, Pengfei Li wrote: > Hi David, > >> My apologies to all. I saw a JBS update with a link to a RFR thread on >> aarch64-dev and then failed to find the corresponding mail on hotspot- >> compiler-dev. >> >> Sorry for the noise. > > Perhaps you saw this one https://bugs.openjdk.java.net/browse/JDK-8242070 Yes that was the one. > I have updated the JBS comment and will use the hotspot-*-dev URLs as official review thread links in the future. Anyway, thanks for reminding. Thank you for doing that, but it was my mistake. 
Cheers, David > -- > Thanks, > Pengfei > From per.liden at oracle.com Tue Apr 21 08:20:02 2020 From: per.liden at oracle.com (Per Liden) Date: Tue, 21 Apr 2020 10:20:02 +0200 Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading In-Reply-To: <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com> References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com> <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com> Message-ID: <90c015a8-3db9-b7bd-f8d3-f05f5e6458d3@oracle.com> Looks good to me. One minor thing, you no longer need -XX:+UnlockExperimentalVMOptions in test/hotspot/jtreg/gc/stress/gcbasher/TestGCBasherWithZ.java. I don't need to see another webrev for that. cheers, Per On 4/20/20 4:19 PM, Stuart Monteith wrote: > Hi, > If anyone has bandwidth, would they be able to review this patch? It > addresses Andrew, Per and Erik's comments: > http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/ > > Thanks, > Stuart > > > On 27/03/2020 09:47, Erik Österlund wrote: >> Hi Stuart, >> >> Thanks for sorting this out on AArch64. It is nice to see that you can >> implement these >> barriers on platforms that do not have instruction cache coherency. >> >> One small change request: >> It looks like in C1 you inject the entry barrier right after build_frame >> is done: >> >> 629 build_frame(); >> 630 { >> 631 // Insert nmethod entry barrier into frame. >> 632 BarrierSetAssembler* bs = >> BarrierSet::barrier_set()->barrier_set_assembler(); >> 633 bs->nmethod_entry_barrier(_masm); >> 634 } >> >> Unfortunately, this is in the platform independent part of the LIR >> assembler. In the x86 version >> we inject it at the very end of build_frame() instead, which is a >> platform-specific function. >> The platform-specific function is in the C1 macro assembler file for >> that platform. 
>> >> We intentionally put it in the platform-specific path as it is a >> platform-specific feature. >> Now on x86, the barrier code will be emitted once in build_frame() and >> once after returning >> from build_frame, resulting in two nmethod entry barriers, and only the >> first one will get >> patched, causing the second one to mostly take slow paths, which isn't >> necessarily wrong, >> but will cause regressions. >> >> I would propose you just move those lines into the very end of the >> AArch64-specific part of >> build_frame(). >> >> I don't need to see another webrev for that trivial code motion. This >> looks good to me. >> Again, thanks a lot for fixing this! It will allow me to go forward with >> concurrent stack >> scanning on AArch64 as well. >> >> Thanks, >> /Erik >> >> >> On 2020-03-26 23:42, Stuart Monteith wrote: >>> Hello, >>> Please review this change to implement nmethod entry barriers on >>> aarch64, and hence concurrent class unloading with ZGC. Shenandoah will >>> need to be separately tested and enabled - there are problems with this >>> on Shenandoah. >>> >>> It has been tested with JTreg, runs with SPECjbb, gcbench, and Lucene as >>> well as Netbeans. >>> >>> In terms of interesting features: >>> With nmethod entry barriers, immediate oops are removed by: >>> LIR_Assembler::jobject2reg and MacroAssembler::movoop >>> This is to ensure consistency with the entry barrier, as >>> with >>> an immediate we'd otherwise need an ISB. >>> >>> I've added "-XX:DeoptNMethodBarrierALot". I found this >>> functionality >>> useful in testing as deoptimisation is very infrequent. I've written it >>> as an atomic to avoid it happening too frequently. As it is a new >>> option, I'm not sure whether any more is needed than this review. A new >>> test has been added >>> "test/hotspot/jtreg/gc/stress/gcbasher/TestGCBasherDeoptWithZ.java" to >>> test GC with that option enabled. >>> >>> 
BarrierSetAssembler::nmethod_entry_barrier >>> This method emits the barrier code. In internal review it was >>> suggested >>> the "dmb( ISHLD )" should be replaced by "membar(LoadLoad)". I've not >>> done this as the BarrierSetNMethod code checks the exact instruction >>> sequence, and I prefer to be explicit. >>> >>> Benchmarking method entry shows an increase of around 6ns >>> with the >>> nmethod entry barrier. >>> >>> >>> The deoptimisation code was contributed by Andrew Haley. >>> >>> The bug: >>> https://bugs.openjdk.java.net/browse/JDK-8216557 >>> >>> The webrev: >>> http://cr.openjdk.java.net/~smonteith/8216557/webrev.0/ >>> >>> >>> BR, >>> Stuart > From kuaiwei.kw at alibaba-inc.com Mon Apr 20 11:12:55 2020 From: kuaiwei.kw at alibaba-inc.com (Kuai Wei) Date: Mon, 20 Apr 2020 19:12:55 +0800 Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in compressed mode In-Reply-To: <84c21683-eaba-5598-6a1d-c58abdb39014@redhat.com> References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com> <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com> <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com> <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com> <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com> <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com> <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com> <781CB090-0386-4D32-8465-8238E516789B@amazon.com> <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com> , <84c21683-eaba-5598-6a1d-c58abdb39014@redhat.com> Message-ID: <74ad538f-3247-4b31-832f-b3cb1bd9f41a.kuaiwei.kw@alibaba-inc.com> Hi Andrew, Could you give more detail about it? I can start a new patch for it if it breaks anything. Kuai Wei ------------------------------------------------------------------ From:Andrew Haley Send Time:2020-04-20 (Mon) 18:01 To:Pengfei Li ; ??(??) 
; "Liu, Xin" ; hotspot compiler Cc:nd ; aarch64-port-dev at openjdk.java.net Subject:Re: RFR: heapbase register can be allocated in compressed mode On 4/20/20 9:48 AM, Andrew Haley wrote: > On 4/20/20 5:32 AM, Pengfei Li wrote: >> Maybe Andrew Haley or other AArch64 reviewers can help? >> >> [1] http://cr.openjdk.java.net/~wzhuo/8242449/webrev.01/ > It's fine. Sorry, no it isn't fine. Please get rid of this hunk: --- old/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp 2020-04-14 21:18:52.009758661 +0800 +++ new/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp 2020-04-14 21:18:51.785764043 +0800 @@ -2185,6 +2185,10 @@ #if 0 assert (UseCompressedOops || UseCompressedClassPointers, "should be compressed"); assert (Universe::heap() != NULL, "java heap should be initialized"); + if (!UseCompressedOops || Universe::ptr_base() == NULL) { + // rheapbase is allocated as general register + return; + } if (CheckCompressedOops) { Label ok; push(1 << rscratch1->encoding(), sp); // cmpptr trashes rscratch1 -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From ioi.lam at oracle.com Tue Apr 21 04:42:56 2020 From: ioi.lam at oracle.com (Ioi Lam) Date: Mon, 20 Apr 2020 21:42:56 -0700 Subject: [aarch64-port-dev ] Question about ccs reservation, CDS and aarch64 specifics In-Reply-To: References: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com> <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com> Message-ID: <6b38ee93-6961-0a34-4b91-0d8cedce9ddd@oracle.com> On 4/20/20 4:10 AM, Thomas Stüfe wrote: > On Mon, Apr 20, 2020 at 10:47 AM Ioi Lam > wrote: > > > > On 4/18/20 12:15 AM, Thomas Stüfe wrote: >> Hi Ioi, >> >> I am working on a small patch and have some more questions. 
>> >> - First, a simple one, in >> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta(), >> the space does not have anything to do with metaspace, as you >> wrote, so the alignment could be anything, right? >> > I think so. > >> - Out of curiosity, when you pack the different regions >> (DumpRegion::pack) you align the end to page size. Why? Why could >> the next region not simply follow immediately? I looked if any >> code needs a region to be page aligned, but may have missed it. > > We map RO read-only and MC/RW in read-write. If the regions are > not aligned, you will have a page that wants half to be read-only > and half to be read-write. > > > Okay. I wondered why page align here and not allocation granularity. > Now I understand. I guess this is also the reason why we could not use > large pages for the archive? > > I think this is fine, I did not want to change it. On some platforms > we have 64K (non-large) pages, but even there I think the waste would > be acceptable. > > I guess we can adjust the mapping to be more lenient (if a page > wants half read-write, we map it read-write), but that's not done > today. > >> >> - void >> MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() : >> >> I assume this code has to work for all three cases right >> 1) lp32. >> 2) lp64 with and without UseCompressedClassPointers? >> 3) lp64 without UseCompressedClassPointers? >> >> If yes, does the setting for UseCompressedClassPointers have to >> be the same at run time? > > Yes. The value of UseCompressedOops and UseCompressedClassPointers > must be the same between dump time and run time. > >> >> >> In this layout:
>>   // On 64-bit VM, the heap and class space layout will be the same as if
>>   // you're running in -Xshare:on mode:
>>   //
>>   //                              +-- SharedBaseAddress (default = 0x800000000)
>>   //                              v
>>   // +-..---------+---------+ ... +----+----+----+--------------------+
>>   // |    Heap    | Archive |     | MC | RW | RO |    class space     |
>>   // +-..---------+---------+ ... +----+----+----+--------------------+
>>   // |<--   MaxHeapSize  -->|     |<-- UnscaledClassSpaceMax = 4GB -->|
>>   //
>> >> Why does the class space have to follow mc+rw+ro? Could it come >> before? >> >> > Compressed klass pointers are stored in archived objects. If the > class space is now lower than SharedBaseAddress, you will need to > rebase all of the compressed klass pointers. This is not efficient > and will slow down start-up. > > > Well, could SharedBaseAddress not point to start of the ccs: >
>   // +-- SharedBaseAddress (default = 0x800000000)
>   // v
>   // +----+----+----+-----------------------------------+
>   // |    class space     | ..gap maybe.. | MC | RW | RO
>   // +----+----+----+-----------------------------------+
> > you'd then need to make sure that the relative offset of MC to > SharedBaseAddress is the same at dump time and at runtime. Is my > understanding correct? I am not saying I want to do this, I just try > to understand the way ccs archive allocation works. That should work. But you are still using a fixed offset from the bottom of MC to SharedBaseAddress (instead of a fixed address of 0). I am not sure if that will buy you any flexibility. > >> >> Actually, does it have to be in the same space at all, or could >> it live somewhere completely different? > > It can be higher. You just need to ensure that the distance > between SharedBaseAddress to the end of the class space is within > max compressed klass space size. > > But, I am wondering why you're asking this :-) > > > I try to understand the allocation and where which restrictions apply. 
> We have at least three parties, cds, metaspace and the underlying > platform, all with their own subtleties of how the memory should be > allocated: > - metaspace will in the near future want a larger alignment than what > cds uses for reservation. > - platforms like aarch64 and maybe ppc want the compressed class base > to look in a certain way > > Part of my confusion was that I always thought of > CompressedKlassPointers::base() to be basically the same as the start of > the ccs (maybe modulo being zero on zero-based mode) but that is > obviously not true since CDS exists. So what I wrote first: > > "Metaspace::reserve_preferred_space.. Despite its generic-sounding > name, these functions can only be used to allocate ccs." > > is actually not fully correct. In reality this space is to be used to > allocate memory to house Klass structures so that their pointers are > compressible, so the reserved start address has to be compatible with > that. But, e.g., that start address does not have to be aligned to > Metaspace::reserve_alignment(). > That's very true. I have some logic in CDS to pick the greater of Metaspace::reserve_alignment() and os alignment. I think this is probably unnecessary. Maybe we should simply untangle CDS/CCS from Metaspace altogether? We really just need a 1GB reservation to be anchored at 32GB (or some address that aarch64 likes). That way, you can do whatever you want with Metaspace and not worry about CDS/CCS. Thanks - Ioi > In both cds dump and runtime case, the ccs is carved from the end part > of the reserved space. Only that split point, and the size of that > second part, have to be aligned to Metaspace::reserve_alignment(). > > Were we to allocate ccs first and put the archives behind it this > would simplify some matters, but only minor points. I think the way it > works now is okay. I will try to disentangle it a bit in a way you > proposed. 
> >> To ask in a more precise way: I understand that both the mc+rw+ro >> archives and the ccs have to live in an area encompassed by the >> compressed class pointers encoding scheme. I wonder whether there >> are any restrictions beyond that. >> >> Could there be a gap between archives and ccs? > Yes > >> Can the order be reversed? > No. > >> Do the relative positions between archives and ccs have to be the >> same between dump time and runtime? > No. All the pointers stored inside CDS point to inside of the > MC/RW/RO regions, so it doesn't retain any knowledge of where the > CCS was at dump time. > > > Clear answers, thank you! > > ..Thomas > > > Thanks > - Ioi > > >> >> Thanks! >> >> On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam wrote: >> >> >> >> On 4/16/20 11:14 AM, Thomas Stüfe wrote: >>> Hi Ioi, >>> >>> On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam wrote: >>> >>> (I suppose you mean "compressed class space" by "ccs" :-) >>> >>> >>> Yes, I think I stole this from Stefan Karlsson :) >>> >>> >>> >>> I am not even sure if case (C) can happen at all. >>> >>> I admit that I've been guilty of making the interface >>> even more complicated >>> with JDK-8231610 >>> (Relocate >>> the CDS archive if it cannot be mapped to the >>> requested address). Looks like now is a good time to clean up. >>> >>> >>> The coding has been complicated to begin with, and then it >>> usually only gets worse since no-one has time for a revamp >>> :( A clean up would be very helpful. >>> >>> One reason I look at this coding now, beside the aarch64 >>> problem, was that I try to disentangle CDS from Metaspace, >>> especially the alignment policy. Remember, I tried to tackle >>> this last summer but it keeps biting me. For such a small >>> problem this is weirdly complicated. >>> >>> One thing that can be cleaned up is the call to >>> Metaspace::allocate_metaspace_compressed_klass_ptrs: >>>
>>> (a) when CDS is enabled:
>>>
>>> Metaspace::global_initialize()
>>>   -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>>>     -> ... MetaspaceShared::map_archives()
>>>       -> ... reserve the space, eventually calling Metaspace::reserve_space
>>>       -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>>>
>>> (b) when CDS is disabled
>>>
>>> Metaspace::global_initialize()
>>>   -> allocate_metaspace_compressed_klass_ptrs
>>>     -> (if cds is not enabled) Metaspace::reserve_space()
>>> >>> >>> In case (b), we should first reserve the space, and then >>> call into >>> allocate_metaspace_compressed_klass_ptrs. This will >>> simplify the arguments >>> of allocate_metaspace_compressed_klass_ptrs, and will >>> also limit the variations >>> of calls to Metaspace::reserve_space(). I think this >>> will make it possible to >>> drop the use_requested_addr argument and rely simply on >>> (requested_addr != NULL) >>> >>> >>> So, in all cases we'd pre-reserve the ReservedSpace and hand >>> it down to >>> Metaspace::allocate_metaspace_compressed_klass_ptrs()? >>> >>> This would melt down >>> Metaspace::allocate_metaspace_compressed_klass_ptrs() to >>> just "initialize compressed class space from a pre-arranged >>> ReservedSpace, and set up base + shift". >>> >>> We could probably rename that thing >>> to Metaspace::set_up_compressed_klass_space(ReservedSpace* >>> rs, cds_base); >>> >>> We even could move set_narrow_klass_base_and_shift() out of >>> Metaspace::set_up_compressed_klass_space, then it becomes a >>> series of three simple operations: >>> 1) obtain a ReservedSpace however you see fit >>> 2) register it with Metaspace as address space for ccs, >>> 3) set_narrow_klass_base_and_shift. We would not have to >>> hand down cds_base to Metaspace, only for it to be used as >>> base address in set_narrow_klass_base_and_shift. >>> >> >> Yes, that seems the right thing to do. That will hopefully >> make the aarch64 initialization code a little simpler as well. 
>> >>> One question which came to me today was: >>> >>> In AppCDS, DynamicArchiveBuilder::do_it() calls >>> Metaspace::reserve_space(). Is that really needed, does a >>> DumpRegion have anything to do with ccs? Don't they just >>> need some space to dump into? Hope that question is not dumb. >>> >> Do you mean: >>
>> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
>>   -> MetaspaceShared::reserve_shared_space
>>     -> Metaspace::reserve_space
>> >> That's not necessary. When I wrote the code I thought >> Metaspace::reserve_space was a general function for reserving >> spaces :-) but as you said, this function is probably >> intended only for initializing the CCS. >> >> Thanks >> - Ioi >> >>> Thanks, Thomas >>> >>> Thanks >>> - Ioi >>> >>> >>>>> Does that make sense? In other words, if the whole point of >>>>> Metaspace::reserve_preferred_space() is "OS knows better, let it try >>>>> to find a good address", would it not make sense to just try a low >>>>> address as part of the try-addresses-loop? >>>> We certainly don't want to have to use a dedicated heapbase register >>>> or a shift. Just give us a multiple of 4*G and we're happy. >>>> >>> >> > From aph at redhat.com Tue Apr 21 09:23:33 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 21 Apr 2020 10:23:33 +0100 Subject: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter names of reduction operations to make code clear In-Reply-To: References: Message-ID: On 4/17/20 10:13 AM, Yang Zhang wrote: > Besides tier1, I also tested these operations in the Vector API tests, which cover all the reduction operations. > > In this directory, there are also some test cases for reduction operations, which were added in [1]. > https://hg.openjdk.java.net/jdk/jdk/file/55c4283a7606/test/hotspot/jtreg/compiler/loopopts/superword > > [1] https://bugs.openjdk.java.net/browse/JDK-8240248 Sounds good. Thanks! -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From thomas.stuefe at gmail.com Tue Apr 21 14:31:08 2020 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 21 Apr 2020 16:31:08 +0200 Subject: [aarch64-port-dev ] Question about CompressedKlassPointers::range Message-ID: Hi, this is a followup question, mainly for aarch64, to https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html . CompressedKlassPointers has a range field, only used by aarch64 afaics, introduced with "8193266: AArch64: TestOptionsWithRanges.java SIGSEGV". I read its bug description and the patch. If I understand the problem, before CDS the assumption was that CompressedClassSpaceSize is synonymous with the range of values narrow Klass pointers could have; which seems logical, but that assumption has been broken since CDS and now the encoding range must span both the ccs and the cds archives. The range is used inside MacroAssembler::klass_decode_mode() to decide whether to use the OR mode. I see this being set in three places:
1) at cds dumptime, to 4G
2) at cds runtime, to CompressedClassSpaceSize, and
3) if cds is disabled it keeps its default value of 4G.
I may be missing something here. Would (2) not be too small? Should that size not include the size of the archives? We first map the archives, let's say they are 300MB, and after that the ccs, let's say 1G by default; would that not mean any Klass residing toward the end of the ccs - if it were to fill up, which it almost never does - would have an offset larger than the initially assumed range and hence not correctly OR-able with the base anymore? And I'm not sure (3) is correct either since the range we could encode in theory is 32G with shift=3. In practice this is no problem today. Today CompressedClassSpaceSize is artificially capped at 3G. If that were ever to change, and someone would set it to >4G, this should cause problems too, no? 
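The arithmetic behind cases (2) and (3) can be sketched as follows. This is only an illustration under stated assumptions, not HotSpot code: the class and method names are hypothetical, a narrow Klass pointer is assumed to be 32 bits wide, and the 300MB archive and 1G class space are the example figures from the paragraph above.

```java
// Illustrative sketch only, not HotSpot code. All names here are hypothetical.
public class NarrowKlassRangeSketch {
    static final long M = 1L << 20;
    static final long G = 1L << 30;

    /** Bytes reachable from the encoding base by a 32-bit narrow pointer scaled by 'shift'. */
    public static long encodableRange(int shift) {
        return (1L << 32) << shift;
    }

    /** True if an offset from the encoding base lies inside the assumed range. */
    public static boolean fits(long offset, long assumedRange) {
        return offset >= 0 && offset < assumedRange;
    }

    public static void main(String[] args) {
        // Case (3): what could be encoded in theory.
        System.out.println(encodableRange(0) / G + "G with shift=0");
        System.out.println(encodableRange(3) / G + "G with shift=3");

        // Case (2): archives mapped first, class space behind them,
        // but the range only accounts for the class space size.
        long archiveSize = 300 * M;
        long classSpaceSize = G;
        long lastKlassOffset = archiveSize + classSpaceSize - 8;
        System.out.println("fits range = ccs size only:   "
                + fits(lastKlassOffset, classSpaceSize));
        System.out.println("fits range = archives + ccs:  "
                + fits(lastKlassOffset, archiveSize + classSpaceSize));
    }
}
```

With these example figures, a Klass near the end of the class space lies outside a range that only counts CompressedClassSpaceSize, and inside one that also counts the archives.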
If my assumption about (2) is correct, it could be that the error is just well hidden, either because MacroAssembler::_klass_decode_mode is already initialized, using the default value (3), or because it is difficult to allocate enough classes to trigger this error. -- As a more general question: CompressedKlassPointers::range(), as in "the expected range of narrow Klass pointer values", I guess it makes sense to keep it as small as possible, right? Instead of, say, hard-coding it to 32G? Since the smaller the expected range of narrow pointers is, the more probable we could choose the OR mode? Oh, and on aarch64, how "good" is that OR mode compared with the "movk" mode? Since it seems to be preferred? Thanks a lot, again, Thomas From aph at redhat.com Tue Apr 21 15:59:22 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 21 Apr 2020 16:59:22 +0100 Subject: [aarch64-port-dev ] Question about CompressedKlassPointers::range In-Reply-To: References: Message-ID: <49a0fe0f-709c-9e6a-51e1-5898962430fc@redhat.com> Hi, On 4/21/20 3:31 PM, Thomas Stüfe wrote: > this is a followup question, mainly for aarch64, to > https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html > . > > CompressedKlassPointers has a range field, only used by aarch64 afaics, > introduced with "8193266: AArch64: TestOptionsWithRanges.java SIGSEGV". > > I read its bug description and the patch. If I understand the problem, > before CDS the assumption was that CompressedClassSpaceSize is synonymous > with the range of values narrow Klass pointers could have; which seems > logical, but that assumption was broken since CDS and now the encoding > range must span both the ccs and the cds archives. > > The range is used inside MacroAssembler::klass_decode_mode() to decide > whether to use the OR mode. 
> > I see this being set in three places: > 1) at cds dumptime, to 4G > 2) at cds runtime, to CompressedClassSpaceSize, and > 3) if cds is disabled it keeps its default value of 4G. > > I may miss something here. Would (2) not be too small? Should that size not > include the size of the archives? I believe so. > We map first the archives, lets say they > are 300MB, after that ccs, lets say 1G default, would that not mean any > Klass residing toward the end of the ccs - if it were to fill up, which it > almost never does - would have an offset larger than the initially assumed > range and hence not correctly OR-able with the base anymore? How would that happen? If someone maps CDS space miles from CCS, you mean? OK, but that'd be a pointless thing to do. > And I'm not sure (3) is correct either since the range we could encode in > theory is 32G with shift=3. In practice this is today no problem. Today > CompressedClassSpaceSize is artificially capped at 3G. If that were ever to > change, and someone would set it to >4G, this should cause problems too, no? Yes, it would. It'd be a fool thing to do, but that doesn't mean it won't happen. We really don't need more than 3G, after all. > If my assumption about (2) is correct, it could be the error is just well > hidden either because MacroAssembler::_klass_decode_mode is already > initialized, using the default value (3). Or because it is difficult to > allocate so many classes to trigger this error. (2) looks wrong. > As a more general question: CompressedKlassPointers::range(), as in > "the expected range of narrow Klass pointer values", I guess it > makes sense to keep it as small as possible, right? Instead of, say > hard-coding it to 32G? Yes, it does. > Since the smaller the expected range of narrow pointers is, the more > probable we could choose the OR mode? At the moment the probability of being able to do that is so high that if it fails I'd expect it'd be a bug. 
> Oh, and on aarch64, how "good" is that OR mode compared with the "movk" > mode on aarch64? Since it seems to be preferred? A shift is sometimes slower than a simple XOR, so a shift is never preferred. Beyond that it's impossible to say for sure because there are many independent implementations, some of which I have never seen, but I doubt that there's a huge difference. Any 4G range is probably OK. Bear in mind, though, that people designing AArch64 hardware today are benchmarking OpenJDK and making decisions based on what HotSpot does. For that reason, changing what we do without a really good reason isn't the best idea. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From thomas.stuefe at gmail.com Tue Apr 21 17:24:33 2020 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 21 Apr 2020 19:24:33 +0200 Subject: [aarch64-port-dev ] Question about CompressedKlassPointers::range In-Reply-To: <49a0fe0f-709c-9e6a-51e1-5898962430fc@redhat.com> References: <49a0fe0f-709c-9e6a-51e1-5898962430fc@redhat.com> Message-ID: On Tue, Apr 21, 2020 at 5:59 PM Andrew Haley wrote: > Hi, > > On 4/21/20 3:31 PM, Thomas Stüfe wrote: > > this is a followup question, mainly for aarch64, to > > > https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html > > . > > > > CompressedKlassPointers has a range field, only used by aarch64 afaics, > > introduced with "8193266: AArch64: TestOptionsWithRanges.java SIGSEGV". > > > > I read its bug description and the patch. If I understand the problem, > > before CDS the assumption was that CompressedClassSpaceSize is synonymous > > with the range of values narrow Klass pointers could have; which seems > > logical, but that assumption was broken since CDS and now the encoding > > range must span both the ccs and the cds archives. 
> > > > The range is used inside MacroAssembler::klass_decode_mode() to decide > > whether to use the OR mode. > > > > I see this being set in three places: > > 1) at cds dumptime, to 4G > > 2) at cds runtime, to CompressedClassSpaceSize, and > > 3) if cds is disabled it keeps its default value of 4G. > > > > I may miss something here. Would (2) not be too small? Should that size > not > > include the size of the archives? > > I believe so. > > > We map first the archives, lets say they > > are 300MB, after that ccs, lets say 1G default, would that not mean any > > Klass residing toward the end of the ccs - if it were to fill up, which > it > > almost never does - would have an offset larger than the initially > assumed > > range and hence not correctly OR-able with the base anymore? > > How would that happen? If someone maps CDS space miles from CCS, > you mean? OK, but that'd be a pointless thing to do. > I thought this could happen by filling up ccs. At CDS runtime (-Xshare:on) we map the cds archive, followed by the ccs:
Encoding base
|
v
+------+----------------------------+
| cds  |            ccs             |
+------+----------------------------+
+----------------------------+
                             A
The size of the ccs is CompressedClassSpaceSize. Address A is Encoding base + CompressedClassSpaceSize, as in case (2), without archive size taken into account. ccs fills up at runtime, starting at the bottom, if more non-shared classes are loaded. E.g. lots of lambdas or reflection glue classes, or just application classes. When ccs fills up beyond point A, the assumption that no Klass ever has an offset larger than CompressedKlassPointers::range is broken and the OR mode may not work anymore. However I see now that we would only have a problem if the encoding base had a non-zero bit set right above the end of the offset mask. But if the encoding base on aarch64 is always 4G aligned, and a narrow Klass pointer cannot be larger than 4G, the OR would still work. So, this is only a theoretical problem. 
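The reasoning in the last paragraph can be checked with a small sketch. It is an illustration only, not the MacroAssembler logic: the names are hypothetical and the base addresses are arbitrary example values. The underlying fact is that base | offset equals base + offset exactly when the two values share no set bits, which always holds for a 4G-aligned base and an offset below 4G.

```java
// Illustrative sketch only, not HotSpot code. All names here are hypothetical.
public class KlassOrDecodeSketch {
    static final long M = 1L << 20;
    static final long G = 1L << 30;

    public static long decodeAdd(long base, long offset) { return base + offset; }

    public static long decodeOr(long base, long offset) { return base | offset; }

    /** OR gives the same result as ADD exactly when base and offset share no set bits. */
    public static boolean orIsSafe(long base, long offset) {
        return (base & offset) == 0;
    }

    public static void main(String[] args) {
        // Offset past point A: 300MB archive plus a bit more than 1G of class space.
        // It exceeds a 1G "range" but is still below 4G.
        long offset = 300 * M + G + 4096;

        // 4G-aligned base: the low 32 bits are zero, so any offset below 4G is OR-able.
        long alignedBase = 32 * G;
        System.out.println(orIsSafe(alignedBase, offset));
        System.out.println(decodeOr(alignedBase, offset) == decodeAdd(alignedBase, offset));

        // Base that is only 1G aligned: bit 30 overlaps with the offset, OR breaks.
        long oddBase = 33 * G;
        System.out.println(orIsSafe(oddBase, offset));
        System.out.println(decodeOr(oddBase, offset) == decodeAdd(oddBase, offset));
    }
}
```

In this sketch the first base decodes identically with OR and ADD even though the offset lies past point A, while the second base does not, which matches the "non-zero bit right above the end of the offset mask" condition described above.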
> > And I'm not sure (3) is correct either since the range we could encode in > > theory is 32G with shift=3. In practice this is today no problem. Today > > CompressedClassSpaceSize is artificially capped at 3G. If that were ever > to > > change, and someone would set it to >4G, this should cause problems too, > no? > > Yes, it would. It'd be a fool thing to do, but that doesn't mean it > won't happen. We really don't need more than 3G, after all. > > > If my assumption about (2) is correct, it could be the error is just well > > hidden either because MacroAssembler::_klass_decode_mode is already > > initialized, using the default value (3). Or because it is difficult to > > allocate so many classes to trigger this error. > > (2) looks wrong. > > > As a more general question: CompressedKlassPointers::range(), as in > > "the expected range of narrow Klass pointer values", I guess it > > makes sense to keep it as small as possible, right? Instead of, say > > hard-coding it to 32G? > > Yes, it does. > > > Since the smaller the expected range of narrow pointers is, the more > > probable we could choose the OR mode? > > At the moment the probability of being able to do that is so high that > if it fails I'd expect it'd be a bug. > > > Oh, and on aarch64, how "good" is that OR mode compared with the "movk" > > mode on aarch64? Since it seems to be preferred? > > A shift is sometimes slower than a simple XOR, so a shift is never > preferred. Beyond that it's impossible to say for sure because there > are many independent implementations, some of which I have never seen, > but I doubt that there's a huge difference. Any 4G range is probably > OK. > > Bear in mind, though, that people designing AArch64 hardware today are > benchmarking OpenJDK and making decisions based on what HotSpot > does. For that reason, changing what we do without a really good > reason isn't the best idea. > I had no idea. Thank you. I will be very careful. 
..Thomas > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > > From Yang.Zhang at arm.com Wed Apr 22 04:23:51 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Wed, 22 Apr 2020 04:23:51 +0000 Subject: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter names of reduction operations to make code clear In-Reply-To: References: Message-ID: Hi Andrew Thanks for your review. I will ask Pengfei to help push it. Regards Yang -----Original Message----- From: Andrew Haley Sent: Tuesday, April 21, 2020 5:24 PM To: Yang Zhang ; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net Cc: nd Subject: Re: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter names of reduction operations to make code clear On 4/17/20 10:13 AM, Yang Zhang wrote: > Besides tier1, I also tested these operations in the Vector API tests, which cover all the reduction operations. > > In this directory, there are also some test cases for reduction operations, which were added in [1]. > https://hg.openjdk.java.net/jdk/jdk/file/55c4283a7606/test/hotspot/jtreg/compiler/loopopts/superword > > [1] https://bugs.openjdk.java.net/browse/JDK-8240248 Sounds good. Thanks! -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From Yang.Zhang at arm.com Thu Apr 23 02:39:26 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Thu, 23 Apr 2020 02:39:26 +0000 Subject: [aarch64-port-dev ] RFR(XS): 8242905: AArch64: Client build failed Message-ID: Hi, Could you please help to review this patch? JBS: https://bugs.openjdk.java.net/browse/JDK-8242905 Webrev: http://cr.openjdk.java.net/~yzhang/8242905/webrev.00/ This issue was introduced by [1]. 
In this commit, pop_CPU_state(restore_vectors) and leave() are included under the COMPILER2_OR_JVMCI check in AArch64 restore_live_registers [2]. But restore_live_registers is used in generate_resolve_blob [3], which might be called from C1. In x86 restore_live_registers, pop_CPU_state() and pop(rbp) are always done [4]. To fix this issue, pop_CPU_state(restore_vectors) and leave() are moved outside of the COMPILER2_OR_JVMCI check in AArch64 restore_live_registers. Testing on AArch64 platform: tier1 test with server build; server build configured with --with-jvm-features=-compiler2; client build, ran HelloWorld. [1] https://bugs.openjdk.java.net/browse/JDK-8241665 [2] https://hg.openjdk.java.net/jdk/jdk/rev/53568400fec3#l1.23 [3] http://hg.openjdk.java.net/jdk/jdk/file/55c4283a7606/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp#l2850 [4] http://hg.openjdk.java.net/jdk/jdk/file/55c4283a7606/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#l378 From aph at redhat.com Thu Apr 23 08:44:15 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 23 Apr 2020 09:44:15 +0100 Subject: [aarch64-port-dev ] RFR(XS): 8242905: AArch64: Client build failed In-Reply-To: References: Message-ID: On 4/23/20 3:39 AM, Yang Zhang wrote: > Could you please help to review this patch? > > JBS: https://bugs.openjdk.java.net/browse/JDK-8242905 > Webrev: http://cr.openjdk.java.net/~yzhang/8242905/webrev.00/ Ok, thanks. Does anyone in the real world use AArch64 client builds? I'm wondering if we'd be better off without that option. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From erik.osterlund at oracle.com Thu Apr 23 10:51:46 2020 From: erik.osterlund at oracle.com (Erik Österlund) Date: Thu, 23 Apr 2020 12:51:46 +0200 Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading In-Reply-To: <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com> References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com> <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com> Message-ID: Hi Stuart, This looks good to me. Thanks, /Erik On 2020-04-20 16:19, Stuart Monteith wrote: > Hi, > If anyone has bandwidth, would they be able to review this patch? It > addresses Andrew, Per and Erik's comments: > http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/ > > Thanks, > Stuart > > > On 27/03/2020 09:47, Erik Österlund wrote: >> Hi Stuart, >> >> Thanks for sorting this out on AArch64. It is nice to see that you can >> implement these >> barriers on platforms that do not have instruction cache coherency. >> >> One small change request: >> It looks like in C1 you inject the entry barrier right after build_frame >> is done: >>
>>       build_frame();
>>       {
>>         // Insert nmethod entry barrier into frame.
>>         BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
>>         bs->nmethod_entry_barrier(_masm);
>>       }
>> >> Unfortunately, this is in the platform independent part of the LIR >> assembler. In the x86 version >> we inject it at the very end of build_frame() instead, which is a >> platform-specific function. >> The platform-specific function is in the C1 macro assembler file for >> that platform. >> >> We intentionally put it in the platform-specific path as it is a >> platform-specific feature. 
>> Now on x86, the barrier code will be emitted once in build_frame() and >> once after returning >> from build_frame, resulting in two nmethod entry barriers, and only the >> first one will get >> patched, causing the second one to mostly take slow paths, which isn't >> necessarily wrong, >> but will cause regressions. >> >> I would propose you just move those lines into the very end of the >> AArch64-specific part of >> build_frame(). >> >> I don't need to see another webrev for that trivial code motion. This >> looks good to me. >> Again, thanks a lot for fixing this! It will allow me to go forward with >> concurrent stack >> scanning on AArch64 as well. >> >> Thanks, >> /Erik >> >> >> On 2020-03-26 23:42, Stuart Monteith wrote: >>> Hello, >>> Please review this change to implement nmethod entry barriers on >>> aarch64, and hence concurrent class unloading with ZGC. Shenandoah will >>> need to be separately tested and enabled - there are problems with this >>> on Shenandoah. >>> >>> It has been tested with JTreg, runs with SPECjbb, gcbench, and Lucene as >>> well as Netbeans. >>> >>> In terms of interesting features: >>> With nmethod entry barriers, immediate oops are removed by: >>> LIR_Assembler::jobject2reg and MacroAssembler::movoop >>> This is to ensure consistency with the entry barrier, as >>> with >>> an immediate we'd otherwise need an ISB. >>> >>> I've added "-XX:DeoptNMethodBarrierALot". I found this >>> functionality >>> useful in testing as deoptimisation is very infrequent. I've written it >>> as an atomic to avoid it happening too frequently. As it is a new >>> option, I'm not sure whether any more is needed than this review. A new >>> test has been added >>> "test/hotspot/jtreg/gc/stress/gcbasher/TestGCBasherDeoptWithZ.java" to >>> test GC with that option enabled. >>> >>> BarrierSetAssembler::nmethod_entry_barrier >>> This method emits the barrier code. 
In internal review it was >>> suggested >>> the "dmb( ISHLD )" should be replaced by "membar(LoadLoad)". I've not >>> done this as the BarrierSetNMethod code checks the exact instruction >>> sequence, and I prefer to be explicit. >>> >>> Benchmarking method entry shows an increase of around 6ns >>> with the >>> nmethod entry barrier. >>> >>> >>> The deoptimisation code was contributed by Andrew Haley. >>> >>> The bug: >>> https://bugs.openjdk.java.net/browse/JDK-8216557 >>> >>> The webrev: >>> http://cr.openjdk.java.net/~smonteith/8216557/webrev.0/ >>> >>> >>> BR, >>> Stuart From aleksei.voitylov at bell-sw.com Thu Apr 23 13:12:16 2020 From: aleksei.voitylov at bell-sw.com (Aleksei Voitylov) Date: Thu, 23 Apr 2020 16:12:16 +0300 Subject: [aarch64-port-dev ] RFR(XS): 8242905: AArch64: Client build failed In-Reply-To: References: Message-ID: <7b98219a-e45b-f0e8-9008-0c7a712c06f4@bell-sw.com> Yes, in the embedded space. On 23/04/2020 11:44, Andrew Haley wrote: > On 4/23/20 3:39 AM, Yang Zhang wrote: >> Could you please help to review this patch? >> >> JBS: https://bugs.openjdk.java.net/browse/JDK-8242905 >> Webrev: http://cr.openjdk.java.net/~yzhang/8242905/webrev.00/ > Ok, thanks. > > Does anyone in the real world use AArch64 client builds? I'm wondering if > we'd be better off without that option. > From Yang.Zhang at arm.com Fri Apr 24 06:01:28 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Fri, 24 Apr 2020 06:01:28 +0000 Subject: [aarch64-port-dev ] RFR(S): 8243240: AArch64: Add support for MulVB Message-ID: Hi, Could you please help to review this patch? JBS: https://bugs.openjdk.java.net/browse/JDK-8243240 Webrev: http://cr.openjdk.java.net/~yzhang/8243240/webrev.00/ In this patch, the missing MulVB support for AArch64 is added. 
Testing: tier1 Test case: public static void mulvb(byte[] a, byte[] b, byte[] c) { for (int i = 0; i < a.length; i++) { c[i] = (byte)(a[i] * b[i]); } } Assembly generated by C2: 0x0000ffffacafdbac: ldr q17, [x15, #16] 0x0000ffffacafdbb0: ldr q16, [x14, #16] 0x0000ffffacafdbb4: mul v16.16b, v16.16b, v17.16b 0x0000ffffacafdbbc: str q16, [x11, #16] Performance: JMH test case is attached in JBS. Before: Benchmark (size) Mode Cnt Score Error Units TestVect.testVectMulVB 1024 avgt 5 0.952 ± 0.005 us/op After: Benchmark (size) Mode Cnt Score Error Units TestVect.testVectMulVB 1024 avgt 5 0.110 ± 0.001 us/op Regards Yang From aph at redhat.com Fri Apr 24 09:31:59 2020 From: aph at redhat.com (Andrew Haley) Date: Fri, 24 Apr 2020 10:31:59 +0100 Subject: [aarch64-port-dev ] RFR(S): 8243240: AArch64: Add support for MulVB In-Reply-To: References: Message-ID: <893f6983-7e3c-adc0-ecf4-48e57312c456@redhat.com> On 4/24/20 7:01 AM, Yang Zhang wrote: > JBS: https://bugs.openjdk.java.net/browse/JDK-8243240 > Webrev: http://cr.openjdk.java.net/~yzhang/8243240/webrev.00/ OK, thanks. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From stuart.monteith at arm.com Mon Apr 27 16:34:49 2020 From: stuart.monteith at arm.com (Stuart Monteith) Date: Mon, 27 Apr 2020 17:34:49 +0100 Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading In-Reply-To: References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com> <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com> Message-ID: <4a01252e-e7b2-d2bc-2858-ca1785b8b2a2@arm.com> Thanks Erik, Per, Andrew, I've fixed up the testcase and retested. Uploaded here: http://cr.openjdk.java.net/~smonteith/8216557/webrev.2/ Would someone be able to submit this for me? Thanks, Stuart On 23/04/2020 11:51, Erik Österlund wrote: > Hi Stuart, > > This looks good to me. 
> > Thanks, > /Erik > > On 2020-04-20 16:19, Stuart Monteith wrote: >> Hi, >> If anyone has bandwidth, would they be able to review this patch? It >> addresses Andrew, Per and Erik's comments: >> http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/ >> >> Thanks, >> Stuart >> >> >> On 27/03/2020 09:47, Erik Österlund wrote: >>> Hi Stuart, >>> >>> Thanks for sorting this out on AArch64. It is nice to see that you can >>> implement these >>> barriers on platforms that do not have instruction cache coherency. >>> >>> One small change request: >>> It looks like in C1 you inject the entry barrier right after build_frame >>> is done: >>> >>> 629 build_frame(); >>> 630 { >>> 631 // Insert nmethod entry barrier into frame. >>> 632 BarrierSetAssembler* bs = >>> BarrierSet::barrier_set()->barrier_set_assembler(); >>> 633 bs->nmethod_entry_barrier(_masm); >>> 634 } >>> >>> Unfortunately, this is in the platform independent part of the LIR >>> assembler. In the x86 version >>> we inject it at the very end of build_frame() instead, which is a >>> platform-specific function. >>> The platform-specific function is in the C1 macro assembler file for >>> that platform. >>> >>> We intentionally put it in the platform-specific path as it is a >>> platform-specific feature. >>> Now on x86, the barrier code will be emitted once in build_frame() and >>> once after returning >>> from build_frame, resulting in two nmethod entry barriers, and only the >>> first one will get >>> patched, causing the second one to mostly take slow paths, which isn't >>> necessarily wrong, >>> but will cause regressions. >>> >>> I would propose you just move those lines into the very end of the >>> AArch64-specific part of >>> build_frame(). >>> >>> I don't need to see another webrev for that trivial code motion. This >>> looks good to me. >>> Again, thanks a lot for fixing this! 
It will allow me to go forward with >>> concurrent stack >>> scanning on AArch64 as well. >>> >>> Thanks, >>> /Erik >>> >>> >>> On 2020-03-26 23:42, Stuart Monteith wrote: >>>> Hello, >>>> Please review this change to implement nmethod entry barriers on >>>> aarch64, and hence concurrent class unloading with ZGC. Shenandoah will >>>> need to be separately tested and enabled - there are problems with this >>>> on Shenandoah. >>>> >>>> It has been tested with JTreg, runs with SPECjbb, gcbench, and Lucene as >>>> well as Netbeans. >>>> >>>> In terms of interesting features: >>>> With nmethod entry barriers, immediate oops are removed by: >>>> LIR_Assembler::jobject2reg and MacroAssembler::movoop >>>> This is to ensure consistency with the entry barrier, as >>>> otherwise with >>>> an immediate we'd otherwise need an ISB. >>>> >>>> I've added "-XX:DeoptNMethodBarrierALot". I found this >>>> functionality >>>> useful in testing as deoptimisation is very infrequent. I've written it >>>> as an atomic to avoid it happening too frequently. As it is a new >>>> option, I'm not sure whether any more is needed than this review. A new >>>> test has been added >>>> "test/hotspot/jtreg/gc/stress/gcbasher/TestGCBasherDeoptWithZ.java" to >>>> test GC with that option enabled. >>>> >>>> BarrierSetAssembler::nmethod_entry_barrier >>>> This method emits the barrier code. In internal review it was >>>> suggested >>>> the "dmb( ISHLD )" should be replaced by "membar(LoadLoad)". I've not >>>> done this as the BarrierSetNMethod code checks the exact instruction >>>> sequence, and I prefer to be explicit. >>>> >>>> Benchmarking method entry shows an increase of around 6ns >>>> with the >>>> nmethod entry barrier. >>>> >>>> >>>> The deoptimisation code was contributed by Andrew Haley. >>>> >>>> The bug: >>>> 
https://bugs.openjdk.java.net/browse/JDK-8216557 >>>> >>>> The webrev: >>>> http://cr.openjdk.java.net/~smonteith/8216557/webrev.0/ >>>> >>>> >>>> BR, >>>> Stuart > From ningsheng.jian at arm.com Tue Apr 28 05:26:43 2020 From: ningsheng.jian at arm.com (Ningsheng Jian) Date: Tue, 28 Apr 2020 13:26:43 +0800 Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading In-Reply-To: <4a01252e-e7b2-d2bc-2858-ca1785b8b2a2@arm.com> References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com> <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com> <4a01252e-e7b2-d2bc-2858-ca1785b8b2a2@arm.com> Message-ID: <3f193fdc-b1fb-9f0a-4635-acdb7de29bca@arm.com> Hi Stuart, On 4/28/20 12:34 AM, Stuart Monteith wrote: > Thanks Erik, Per, Andrew, > I've fixed up the testcase and retested. > > Uploaded here: > > http://cr.openjdk.java.net/~smonteith/8216557/webrev.2/ > > Would someone be able to submit this for me? > I submitted a build job before pushing your code, but it failed to build with minimal variant configure. Here's error message: ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp: In static member function 'static AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler*, int, int, const BasicType*, const VMRegPair*, AdapterFingerPrint*)': ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp:736:5: error: invalid use of incomplete type 'class BarrierSetAssembler' bs->c2i_entry_barrier(masm); I think you need to include barrierSetAssembler.hpp in sharedRuntime_aarch64.cpp? Thanks, Ningsheng From Yang.Zhang at arm.com Tue Apr 28 06:57:15 2020 From: Yang.Zhang at arm.com (Yang Zhang) Date: Tue, 28 Apr 2020 06:57:15 +0000 Subject: [aarch64-port-dev ] RFR(S): 8243155: AArch64: Add support for SqrtVF Message-ID: Hi, Could you please help to review this patch? 
JBS: https://bugs.openjdk.java.net/browse/JDK-8243155 Webrev: http://cr.openjdk.java.net/~yzhang/8243155/webrev.00/ In Java, Math.sqrt() supports double data only. To support Math.sqrt() for float, the following conversion must be done. float a, b; a = (float)Math.sqrt((double)b) Both AArch64 and x86 support such single-precision sqrt by hardware instructions. AArch64 FSQRT instruction matches Java (float)Math.sqrt((double)b) exactly. And X86 has supported vectorization of Math.sqrt() on floats in [1]. In this patch, vectorized sqrt for float (SqrtVF) is supported in AArch64 backend. Jtreg test cases for SqrtVF and SqrtVD are also added. Special cases such as min/max, +/-Inf, +0.0/-0.0 and NaN are covered. Testing: Full jtreg Newly added sqrt jtreg tests Panama/Vector API tests which cover vector sqrt Test case for sqrtvf: public static void sqrtvf(float[] a, float[] b, float[] c) { float tmp; for (int i = 0; i < a.length; i++) { tmp = (float)(a[i] + b[i]); c[i] = (float)Math.sqrt((double)tmp); } } With this patch, the following code snippet is generated. 0x0000ffffacaf872c: ldr q17, [x18, #16] 0x0000ffffacaf8730: ldr q16, [x16, #16] 0x0000ffffacaf8734: fadd v16.4s, v16.4s, v17.4s 0x0000ffffacaf8738: fsqrt v16.4s, v16.4s 0x0000ffffacaf8740: str q16, [x14, #16] Performance: JMH test is attached in JBS. Before: Benchmark (size) Mode Cnt Score Error Units TestVect.testVectSqrtVF 1024 avgt 5 4.372 ± 0.016 us/op After: Benchmark (size) Mode Cnt Score Error Units TestVect.testVectSqrtVF 1024 avgt 5 1.115 ± 0.013 us/op [1] https://bugs.openjdk.java.net/browse/JDK-8190800 Regards Yang From aph at redhat.com Tue Apr 28 09:46:43 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 28 Apr 2020 10:46:43 +0100 Subject: [aarch64-port-dev ] RFR(S): 8243155: AArch64: Add support for SqrtVF In-Reply-To: References: Message-ID: <33e0d71a-0b82-9112-fe81-a8e9a34d6d57@redhat.com> On 4/28/20 7:57 AM, Yang Zhang wrote: > Could you please help to review this patch? 
> > JBS: https://bugs.openjdk.java.net/browse/JDK-8243155 > Webrev: http://cr.openjdk.java.net/~yzhang/8243155/webrev.00/ This was a bit of a head scratcher. To begin with I thought that this must be wrong, because Math.sqrt() is supposed to be correctly rounded, and (float)Math.sqrt(float) is double rounded, leading to an inaccurate result. Looking round the web, Figueroa [1] proved double rounding to be innocuous for the square root if it is performed with a precision larger than twice the original precision, plus two. [2] But it's not hard to write a program to do an exhaustive search from x = FLT_MIN; x <= FLT_MAX, like so: float roundedSqrt(float x) { return (float)ieee754_sqrt((double)x); } int main() { for (float x = FLT_MIN; x <= FLT_MAX; x = nextFloat(x)) { if (ieee754_sqrtf(x) != roundedSqrt(x)) { fprintf(stdout, "%12.6f\n", x); } } } ... and it returns no differences. The patch is OK, thanks. [1] Samuel A. Figueroa. When is Double Rounding Innocuous? SIGNUM Newsl., 30(3):21–26, July 1995. [2] Pierre Roux. Innocuous Double Rounding of Basic Arithmetic Operations. Journal of Formalized Reasoning, ASDD-AlmaDL, 2014, 7 (1), pp.131-142. 10.6092/issn.1972-5787/4359. hal-01091186 -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From xxinliu at amazon.com Tue Apr 28 09:57:48 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Tue, 28 Apr 2020 09:57:48 +0000 Subject: [aarch64-port-dev ] FW: [Mach5] mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED In-Reply-To: References: Message-ID: Hello, I recently received some build failures from submit repo. May I know more details? e.g. linux-x64-linux-x64-build-5, linux-x64-debug-linux-x64-build-6 I have successfully built it on the latest linux on ubuntu 18.04 with gcc 7.5. My nightly buildbot started failing recently on aarch64. One issue is the error message of [cds] prevents configure from bootjdk determining. 
https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77 ./bin/java --version [0.006s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096 openjdk 14.0.1 2020-04-14 OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7) OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode) Are you aware of this issue? I can't see [cds] line on my linux/x86_64 host. Maybe it's aarch64-only. CC aarch-port-dev. Thanks, --lx From: on behalf of "do-not-reply at oracle.com" Reply-To: "mach5_admin_ww_grp at oracle.com" Date: Monday, April 27, 2020 at 3:46 PM To: "Hohensee, Paul" Subject: [EXTERNAL] [Mach5] mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED Job: mach5-one-phh-JDK-8151779-20200427-2151-10554367 BuildId: 2020-04-27-2150109.hohensee.source No failed tests Tasks Summary * NOTHING_TO_RUN: 0 * UNABLE_TO_RUN: 24 * KILLED: 0 * NA: 0 * HARNESS_ERROR: 0 * FAILED: 0 * EXECUTED_WITH_FAILURE: 9 * PASSED: 51 Build 2 Unable to run * linux-aarch64-install-linux-aarch64-build-signing-20 Dependency task failed: mach5...-10554367-linux-aarch64-linux-aarch64-build-1 * linux-x64-install-linux-x64-build-signing-21 Dependency task failed: mach5...427-2151-10554367-linux-x64-linux-x64-build-5 9 Executed with failure * linux-aarch64-linux-aarch64-build-1 error while building, return value: 2 * linux-aarch64-debug-linux-aarch64-build-2 error while building, return value: 2 * linux-aarch64-open-linux-aarch64-build-3 error while building, return value: 2 * linux-aarch64-open-debug-linux-aarch64-build-4 error while building, return value: 2 * linux-x64-linux-x64-build-5 error while building, return value: 2 * linux-x64-debug-linux-x64-build-6 error while building, return value: 2 * linux-x64-debug-nopch-linux-x64-build-9 error while building, return value: 2 * linux-x64-open-linux-x64-build-7 error while building, return value: 2 * linux-x64-open-debug-linux-x64-build-8 error while building, return 
value: 2 Test 22 Unable to run * tier1-product-open_test_hotspot_jtreg_tier1_common-linux-x64-24 Dependency task failed: mach5...427-2151-10554367-linux-x64-linux-x64-build-5 * tier1-debug-open_test_hotspot_jtreg_tier1_common-linux-x64-debug-30 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6 * tier1-debug-open_test_hotspot_jtreg_tier1_compiler_1-linux-x64-debug-33 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6 * tier1-debug-open_test_hotspot_jtreg_tier1_compiler_2-linux-x64-debug-36 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6 * tier1-debug-open_test_hotspot_jtreg_tier1_compiler_3-linux-x64-debug-39 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6 * tier1-debug-open_test_hotspot_jtreg_tier1_compiler_graal-linux-x64-debug-45 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6 * tier1-debug-open_test_hotspot_jtreg_tier1_compiler_not_xcomp-linux-x64-debug-42 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6 * tier1-debug-open_test_hotspot_jtreg_tier1_gc_1-linux-x64-debug-48 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6 * tier1-debug-open_test_hotspot_jtreg_tier1_gc_2-linux-x64-debug-51 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6 * tier1-product-open_test_hotspot_jtreg_tier1_gc_gcbasher-linux-x64-27 Dependency task failed: mach5...427-2151-10554367-linux-x64-linux-x64-build-5 * See all 22... From nick.gasson at arm.com Tue Apr 28 10:11:04 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Tue, 28 Apr 2020 18:11:04 +0800 Subject: [aarch64-port-dev ] FW: [Mach5] mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED In-Reply-To: References: Message-ID: <858sifgbt3.fsf@arm.com> > > My nightly buildbot started failing recently on aarch64. > One issue is the error message of [cds] prevents configure from bootjdk determining. 
> https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77 > > ./bin/java --version > [0.006s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096 > openjdk 14.0.1 2020-04-14 > OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7) > OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode) > > Are you aware of this issue? I can't see [cds] line on my linux/x86_64 host. > Maybe it's aarch64-only. CC aarch-port-dev. > Was your boot JDK built on a machine configured with a different page size to your current machine? Looks like the CDS archive was dumped on a machine with 64k pages but you're running with 4k pages. There's a JBS issue for this: https://bugs.openjdk.java.net/browse/JDK-8236847 Thanks, Nick From aph at redhat.com Tue Apr 28 10:47:02 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 28 Apr 2020 11:47:02 +0100 Subject: [aarch64-port-dev ] FW: [Mach5] mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED In-Reply-To: <858sifgbt3.fsf@arm.com> References: <858sifgbt3.fsf@arm.com> Message-ID: On 4/28/20 11:11 AM, Nick Gasson wrote: >> >> My nightly buildbot started failing recently on aarch64. >> One issue is the error message of [cds] prevents configure from bootjdk determining. >> https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77 >> >> ./bin/java --version >> [0.006s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096 >> openjdk 14.0.1 2020-04-14 >> OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7) >> OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode) >> >> Are you aware of this issue? I can't see [cds] line on my linux/x86_64 host. >> Maybe it's aarch64-only. CC aarch-port-dev. > > Was your boot JDK built on a machine configured with a different page > size to your current machine? 
Looks like the CDS archive was dumped on a > machine with 64k pages but you're running with 4k pages. There's a JBS > issue for this: > > https://bugs.openjdk.java.net/browse/JDK-8236847 The thread seems to have died here: http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-February/038207.html We really need to get this fixed. If anyone reading this has machines with both 4k and 64k pages, please do the experiment and we'll make a suitable patch. Everything here has 64k pages. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From stuart.monteith at arm.com Tue Apr 28 11:28:52 2020 From: stuart.monteith at arm.com (Stuart Monteith) Date: Tue, 28 Apr 2020 12:28:52 +0100 Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading In-Reply-To: <3f193fdc-b1fb-9f0a-4635-acdb7de29bca@arm.com> References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com> <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com> <4a01252e-e7b2-d2bc-2858-ca1785b8b2a2@arm.com> <3f193fdc-b1fb-9f0a-4635-acdb7de29bca@arm.com> Message-ID: On 28/04/2020 06:26, Ningsheng Jian wrote: > Hi Stuart, > > On 4/28/20 12:34 AM, Stuart Monteith wrote: >> Thanks Erik, Per, Andrew, >> I've fixed up the testcase and retested. >> >> Uploaded here: >> >> http://cr.openjdk.java.net/~smonteith/8216557/webrev.2/ >> >> Would someone be able to submit this for me? >> > > I submitted a build job before pushing your code, but it failed to build with minimal variant configure. 
Here's error > message: > > ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp: In static member function 'static AdapterHandlerEntry* > SharedRuntime::generate_i2c2i_adapters(MacroAssembler*, int, int, const BasicType*, const VMRegPair*, > AdapterFingerPrint*)': > > ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp:736:5: error: invalid use of incomplete type 'class > BarrierSetAssembler' > > bs->c2i_entry_barrier(masm); > > I think you need to include barrierSetAssembler.hpp in sharedRuntime_aarch64.cpp? > > Thanks, > Ningsheng Thanks for that Ningsheng - I've made some changes, and built with minimal. The revised patch: http://cr.openjdk.java.net/~smonteith/8216557/webrev.3/ There were contributions from aph at redhat.com Thanks, Stuart From thomas.stuefe at gmail.com Tue Apr 28 14:54:57 2020 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 28 Apr 2020 16:54:57 +0200 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation Message-ID: Hi all, Could I have reviews for the following proposal of reworking cds/class space reservation? Bug: https://bugs.openjdk.java.net/browse/JDK-8243392 Webrev: http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.00/webrev/ (Many thanks to Ioi Lam for so patiently explaining CDS internals to me, and to Andrew Haley and Nick Gasson for help with aarch64!) Reservation of the compressed class space is needlessly complicated and has some minor issues. It can be simplified and made clearer. The complexity stems from the fact that this area lives at the intersection of two to three sub systems, depending on how one counts. Metaspace, CDS, and the platform which may or may not have its own view of how to reserve class space. And this code has been growing organically over time. 
One small example: ReservedSpace Metaspace::reserve_preferred_space(size_t size, size_t alignment, bool large_pages, char *requested_addr, bool use_requested_addr) which I spent hours decoding, resulting in a very confused mail to hs-runtime and aarch64-port-dev [2]. This patch attempts to simplify cds and metaspace setup a bit; to comment implicit knowledge which is not immediately clear; to cleanly abstract platform concerns like optimized class space placement; and to disentangle cds from metaspace to solve issues which may bite us later with Elastic Metaspace [4]. --- The main change is the reworked reservation mechanism. This is based on Ioi's proposal [5]. When reserving class space, three things must happen: 1) reservation of the space obviously. If cds is active that space must be in the vicinity of cds archives to be covered by compressed class pointer encoding. 2) setting up the internal Metaspace structures atop of that space 3) setting up compressed class pointer encoding. In its current form, Metaspace may or may not do some or all of that in one function (Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace metaspace_rs, char* requested_addr, address cds_base);) - if cds is active, it will reserve the space for Metaspace and hand it in, otherwise it will create it itself. When discussing this in [2], Ioi proposed to move the reservation of the class space completely out of Metaspace and make it a responsibility of the caller always. This would reduce some complexity, and this patch follows the proposal. I removed Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace metaspace_rs, char* requested_addr, address cds_base); and all its sub functions. (1) now has to always be done outside - a ReservedSpace for class space has to be provided by the caller. 
However, Metaspace now offers a utility function for reserving space at a "nice" location, and explicitly doing nothing else: ReservedSpace Metaspace::reserve_address_space_for_compressed_classes(size_t size); this function can be redefined on a platform level for platform optimized reservation, see below for details. (2) is taken care of by a new function, Metaspace::initialize_class_space(ReservedSpace rs) (3) is taken care of by a new function CompressedKlassPointers::initialize(), see below for details. So, class space now is set up in three explicit steps: - First, reserve a suitable space by whatever means you want. For convenience you may use Metaspace::reserve_address_space_for_compressed_classes(), or you may roll your own reservation. - Next, tell Metaspace to use that range as backing storage for class space: Metaspace::initialize_class_space(ReservedSpace rs) - Finally, set up encoding. Encoding is independent from the concept of a ReservedSpace, it just gets an address range, see below for details. Separating these steps and moving them out of the responsibility of Metaspace makes this whole thing more flexible; it also removes unnecessary knowledge (e.g. Metaspace does not need to know anything about either ccp encoding or cds). --- How it comes together: If CDS is off, we just reserve a space using Metaspace::reserve_address_space_for_compressed_classes(), initialize it with Metaspace::initialize_class_space(ReservedSpace rs), then set up compressed class pointer encoding covering the range of this class space. If CDS is on (dump time), we reserve a large 4G space, either at SharedBaseAddress or using Metaspace::reserve_address_space_for_compressed_classes(); we then split that into 3G archive space and 1G class space; we set up that space with Metaspace as class space; then we set up compressed class pointer encoding covering both archive space and class space. 
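To make the dump-time split concrete, here is a minimal sketch of the address arithmetic only. It is a toy model under stated assumptions, not HotSpot code: `Range`, `DumpTimeLayout`, `split_dump_time`, `encodable`, `encode` and `decode` are hypothetical names invented for illustration, and the unscaled (shift 0) encoding is an assumption chosen to keep the example small.

```cpp
#include <cassert>
#include <cstdint>

using std::uint32_t;
using std::uint64_t;

// Toy model only: hypothetical stand-ins, not the HotSpot ReservedSpace,
// Metaspace or CompressedKlassPointers classes.
struct Range {
    uint64_t base;
    uint64_t size;
    uint64_t end() const { return base + size; }  // one past the last byte
};

struct DumpTimeLayout {
    Range archive;      // lower 3G: space for the CDS archive
    Range class_space;  // upper 1G: handed over as class space
    Range encoding;     // whole 4G: range the narrow Klass encoding must cover
};

constexpr uint64_t G = 1024ULL * 1024 * 1024;

// Split one 4G reservation starting at reserved_base into
// 3G archive space followed directly by 1G class space.
inline DumpTimeLayout split_dump_time(uint64_t reserved_base) {
    assert(reserved_base + 4 * G > reserved_base);  // no address wrap-around
    Range archive{reserved_base, 3 * G};
    Range class_space{archive.end(), 1 * G};
    Range encoding{reserved_base, 4 * G};
    return DumpTimeLayout{archive, class_space, encoding};
}

// Assumed unscaled encoding: a narrow value is the 32-bit offset from the
// start of the encoding range, so only addresses inside the range qualify.
inline bool encodable(const Range& enc, uint64_t addr) {
    return addr >= enc.base && addr < enc.end() &&
           (addr - enc.base) <= UINT32_MAX;
}

inline uint32_t encode(const Range& enc, uint64_t addr) {
    return static_cast<uint32_t>(addr - enc.base);
}

inline uint64_t decode(const Range& enc, uint32_t narrow) {
    return enc.base + narrow;
}
```

In this model, calling `split_dump_time` with some example base gives an archive range that ends exactly where the class space begins, and every address up to the last byte of the 4G range round-trips through `encode` and `decode`, while the first address past the range is rejected by `encodable`.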
If CDS is on (run time), we reserve a large space, split it into archive space (large enough to hold both archives) and class space, then basically proceed as above. Note that this is almost exactly how things worked before (modulo some minor fixes, e.g. alignment issues), only the code is reformed and made more explicit. --- I moved compressed class pointer setup over to CompressedKlassPointers and changed the interface: -void Metaspace::set_narrow_klass_base_and_shift(ReservedSpace metaspace_rs, address cds_base) +void CompressedKlassPointers::initialize(address addr, size_t len); Instead of feeding it a single ReservedSpace, which is supposed to represent class space, and an optional alternate base if cds is on, now we give it just a numeric address range. That range marks the limits to where Klass structures are to be expected, and is the implicit promise that outside that range no Klass structures will exist, so encoding has to cover only this range. This range may contain just the class space; or class space+cds; or whatever allocation scheme we come up with in the future. Encoding does not really care how the memory is organized as long as the input range covers all possible Klass locations. That way we remove knowledge about class space/cds from compressed class pointer encoding. Moving it away from metaspace.cpp into the CompressedKlassPointers class also mirrors CompressedOops::initialize(). --- I renamed _narrow_klass_range to just _range, because strictly speaking this is the range un-narrow Klass pointers can have. As for the implementation of CompressedKlassPointers::initialize(address addr, size_t len), I mimicked very closely what happened before, so there should be almost no differences. Since "almost no differences" sounds scary :) here are the differences: - When CDS is active (dump or run time) we now always, unconditionally, set the encoding range to 4G. This fixes a theoretical bug discussed on aarch64-port-dev [1]. 
- When CDS is not active, we set the encoding range to the minimum required length. Before, it was left at its default value of 4G. Both differences only affect aarch64, since they are currently the only one using the range field in CompressedKlassPointers. I wanted to add an assert somewhere to test encoding of the very last address of the CompressedKlassPointers range, again to prevent errors like [3]. But I did not come up with a good place for this assert which would cover also the encoding done by C1/C2. For the same reason I thought about introducing a mode where Klass structures would be allocated in reverse order, starting at the end of the ccs, but again left it out as too big a change. --- OS abstraction: platforms may have restrictions of what constitutes a valid compressed class pointer encoding base. Or if not, they may have at least preferences. There was logic like this in metaspace.cpp, which I removed and cleanly factored out into platform dependent files, giving each platform the option to add special logic. These are two new methods: - bool CompressedKlassPointers::is_valid_base(address p) to let the platform tell you whether it considers p to be a valid encoding base. The only platform having these restrictions currently is aarch64. - ReservedSpace Metaspace::reserve_address_space_for_compressed_classes(size_t size); this hands over the process of allocating a range suitable for compressed class pointer encoding to the platform. Most platforms will allocate just anywhere, but some platforms may have a better strategy (e.g. trying low memory first, trying only correctly aligned addresses and so on). Beforehand, this coding existed in a similar form in metaspace.cpp for aarch64 and AIX. For now, I left the AIX part out - it seems only half done, and I want to check further if we even need it, if yes why not on Linux ppc, and C1 does not seem to support anything other than base+offset with shift either, but I may be mistaken. 
These two methods should give the platform enough control to implement their own scheme for optimized class space placement without bothering any shared code about it. Note about the form, I introduced two new platform dependent files, "metaspace_.cpp" and "compressedOops_.cpp". I am not happy about this but this seems to be what we generally do in hotspot, right? --- Metaspace reserve alignment vs cds alignment CDS was using Metaspace reserve alignment for CDS internal purposes. I guess this was just a copy paste issue. It never caused problems since Metaspace reserve alignment == page size, but that is not true anymore in the upcoming Elastic Metaspace where reserve alignment will be larger. This causes a number of issues. I separated those two cleanly. CDS now uses os::vm_allocation_granularity. Metaspace::reserve_alignment is only used in those two places where it is needed, when CDS creates the address space for class space on behalf of the Metaspace. --- Windows special handling in CDS To simplify coding I removed the windows specific handling which left out reservation of the archive. This was needed because windows cannot mmap files into reserved regions. But fallback code exists in filemap.cpp for this case which just reads in the region instead of mapping it. Should that turn out to be a performance problem, I will reinstate the feature. But a simpler way would be reserve the archive and later just before mmapping the archive file to release the archive space. That would not only be simpler but give us the best guarantee that that address space is actually available. But I'd be happy to leave that part out completely if we do not see any performance problems on windows x64. --- NMT cannot deal with spaces which are split. This problem manifests in that bookkeeping for class space is done under "Shared Classes", not "Classes" as it should. This problem exists today too at dump time and randomly at run time. 
But since I simplified the reservation, this problem now shows up always, whether or not we map at the SharedBaseAddress. While I could work around this problem, I'd prefer this problem to be solved at the core, and NMT to have an option to recognize reservation splits. So I'd rather not put a workaround for this into the patch but leave it for fixing as a separate issue. I opened this issue to track it [6]. --- Jtreg tests: I expanded the CompressedOops/CompressedClassPointers.java. I also extended them to Windows. The tests now optionally omit strict class space placement tests, since these tests heavily depend on ASLR and were the reason they were excluded on Windows. However I think even without checking for class space placement they make sense, just to see that the VM comes up and lives with the many different settings we can run in. --- Tests: - I ran the patch through Oracles submit repo - I ran tests manually for aarch64, zero, linux 32bit and windows x64 - The whole battery of nightly tests at SAP, including ppc, ppcle and aarch64, unfortunately excluding windows because of unrelated errors. Windows x64 tests will be redone tonight. 
Thank you, Thomas [1] https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008804.html [2] https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html [3] https://bugs.openjdk.java.net/browse/JDK-8193266 [4] https://bugs.openjdk.java.net/browse/JDK-8221173 [5] https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008765.html [6] https://bugs.openjdk.java.net/browse/JDK-8243535 From thomas.stuefe at gmail.com Tue Apr 28 16:29:39 2020 From: thomas.stuefe at gmail.com (Thomas Stüfe) Date: Tue, 28 Apr 2020 18:29:39 +0200 Subject: [aarch64-port-dev ] FW: [Mach5] mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED In-Reply-To: References: <858sifgbt3.fsf@arm.com> Message-ID: On a related note, I wonder whether it would be possible to fake a larger page size than the system uses. Well, I don't wonder since we do this on AIX, for complicated reasons. It requires a bit of work. So, not only spoofing os::vm_page_size(), but also fixing the places where the native page size shines thru, e.g. making sure os::reserve_memory() returns always os::vm_page_size() aligned memory. I wonder whether it would be worth the work, that way one could simulate a larger page size with a switch and test for errors like this. ..Thomas On Tue, Apr 28, 2020 at 12:50 PM Andrew Haley wrote: > On 4/28/20 11:11 AM, Nick Gasson wrote: > >> > >> My nightly buildbot started failing recently on aarch64. > >> One issue is the error message of [cds] prevents configure from bootjdk > determining. 
> >> > https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77 > >> > >> ./bin/java --version > >> [0.006s][error][cds] Unable to map CDS archive -- > os::vm_allocation_granularity() expected: 65536 actual: 4096 > >> openjdk 14.0.1 2020-04-14 > >> OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7) > >> OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode) > >> > >> Are you aware of this issue? 
> >> I can't see [cds] line on my linux/x86_64 host.
> >> Maybe it's aarch64-only. CC aarch-port-dev.
> >
> > Was your boot JDK built on a machine configured with a different page
> > size to your current machine? Looks like the CDS archive was dumped on a
> > machine with 64k pages but you're running with 4k pages. There's a JBS
> > issue for this:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8236847
>
> The thread seems to have died here:
>
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-February/038207.html
>
> We really need to get this fixed. If anyone reading this has machines with
> both 4k and 64k pages, please do the experiment and we'll make a suitable
> patch. Everything here has 64k pages.
>
> --
> Andrew Haley (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd.
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From dms at samersoff.net Tue Apr 28 16:31:03 2020
From: dms at samersoff.net (Dmitry Samersoff)
Date: Tue, 28 Apr 2020 19:31:03 +0300
Subject: [aarch64-port-dev ] FW: [Mach5] mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To:
References: <858sifgbt3.fsf@arm.com>
Message-ID: <2fc1d25f-e854-31d9-6d14-fab24b5d24c5@samersoff.net>

Hello Andrew,

I'm working on it, based on the fix proposed by Ioi.

-Dmitry

On 28.04.2020 13:47, Andrew Haley wrote:
> On 4/28/20 11:11 AM, Nick Gasson wrote:
>>>
>>> My nightly buildbot started failing recently on aarch64.
>>> One issue is the error message of [cds] prevents configure from bootjdk determining.
>>> https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77
>>>
>>> ./bin/java --version
>>> [0.006s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096
>>> openjdk 14.0.1 2020-04-14
>>> OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
>>> OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode)
>>>
>>> Are you aware of this issue?
>>> I can't see [cds] line on my linux/x86_64 host.
>>> Maybe it's aarch64-only. CC aarch-port-dev.
>>
>> Was your boot JDK built on a machine configured with a different page
>> size to your current machine? Looks like the CDS archive was dumped on a
>> machine with 64k pages but you're running with 4k pages. There's a JBS
>> issue for this:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8236847
>
> The thread seems to have died here:
>
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-February/038207.html
>
> We really need to get this fixed. If anyone reading this has machines with
> both 4k and 64k pages, please do the experiment and we'll make a suitable
> patch. Everything here has 64k pages.
>

From thomas.stuefe at gmail.com Wed Apr 29 06:18:58 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Wed, 29 Apr 2020 08:18:58 +0200
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation
In-Reply-To: <6802c3af-77a5-8a91-b94c-d590dd41765f@oracle.com>
References: <6802c3af-77a5-8a91-b94c-d590dd41765f@oracle.com>
Message-ID:

Hi Ioi,

thanks for looking at this. Of course I'm happy if you run it through your CI too. The patch is based on

changeset:   59009:2b3b41fff837
tag:         qparent
user:        egahlin
date:        Mon Apr 27 15:01:22 2020 +0200
summary:     8242034: Remove JRE_HOME references

On Wed, Apr 29, 2020 at 7:14 AM Ioi Lam wrote:

> Hi Thomas,
>
> There are a lot of changes so it will take me a while to go through
> everything. Just some initial comments:
>
>   // User may have specified an invalid base address. Should we ignore it or assert?
>   guarantee(CompressedKlassPointers::is_valid_base((address)shared_base),
>             "SharedBaseAddress: " PTR_FORMAT " is not a valid base.", p2i(shared_base));
>
> This will cause the VM to crash. I think it's better (1) exit the VM
> properly with an error code, or (2) override the user's input.
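
Editorial illustration of the two alternatives suggested above. This is a standalone sketch, not HotSpot code: `is_valid_base`, `resolve_shared_base`, `BasePolicy`, `kDefaultBase` and the 4G alignment rule are all hypothetical stand-ins chosen for the example. It only shows the shape of "report an error and let the caller exit" versus "override the user's input", as opposed to asserting.

```cpp
#include <cassert>
#include <cinttypes>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-in for a platform check: a base counts as valid here
// when it is aligned to 4 GB. The real rule is platform specific.
static bool is_valid_base(uint64_t base) {
  const uint64_t four_g = UINT64_C(4) << 30;
  return (base % four_g) == 0;
}

// The two ways of handling an invalid user-supplied base.
enum class BasePolicy { ExitWithError, OverrideInput };

// Hypothetical fallback address (32 GB, which is 4 GB aligned).
static const uint64_t kDefaultBase = UINT64_C(0x800000000);

// Returns true and stores the base to use in *result, or returns false
// to tell the caller to shut down cleanly with an error code.
static bool resolve_shared_base(uint64_t requested, BasePolicy policy,
                                uint64_t* result) {
  assert(result != nullptr);
  if (is_valid_base(requested)) {
    *result = requested;
    return true;
  }
  if (policy == BasePolicy::OverrideInput) {
    std::fprintf(stderr,
                 "warning: base 0x%" PRIx64 " is not valid, using 0x%" PRIx64 "\n",
                 requested, kDefaultBase);
    *result = kDefaultBase;
    return true;
  }
  std::fprintf(stderr, "error: base 0x%" PRIx64 " is not a valid base\n",
               requested);
  return false;
}
```

Either policy keeps the failure under the caller's control; which one is preferable for SharedBaseAddress is exactly the open question in the review.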
> > ====== > > Since this is a potentially disruptive change, I want to run it in our > CI as well. Could you tell me the tip of your repo? > > ======== > > For testing the CDS relocation code, I would suggest running: > > cd test/hotspot/jtreg > jtreg -javaoption:-XX:+UnlockDiagnosticVMOptions \ > -javaoption:-XX:ArchiveRelocationMode=1 \ > -javaoption:-XX:NativeMemoryTracking=detail > :hotspot_cds_relocation > > This will place the CCS at random locations picked by the OS. > > ======== > > metaspace.cpp: > > If your intention is to "shake things up a little", it's not a good idea > to include it in a complex change set. If things indeed go wrong, we > don't know who caused it (your CCS changes, or old bugs triggered by > this debug code), and we will end up backing out the entire changeset. > > I would suggest putting this in a different RFE, and even push it now. > > // The upcoming Elastic Metaspace will have stricter alignment > requirements. > // For debug builds, increase reserve alignment to shake loose errors > resulting > // from misusing this alignment. > // Note: do not increase too much (e.g. not on platforms with 64K > pages), we do not > // want to disturb tests requiring precise numbers for metaspace size > or ccs size. > #ifdef ASSERT > if (_reserve_alignment == 4 * K) { > _reserve_alignment *= 4; > } > #endif > > > More to come .... > > Thanks > - Ioi > > > All good points, I'll wait for your final review. Cheers, Thomas > On 4/28/20 7:54 AM, Thomas St?fe wrote: > > Hi all, > > > > Could I have reviews for the following proposal of reworking cds/class > > space reservation? > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8243392 > > > > Webrev: > > > http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.00/webrev/ > > > > (Many thanks to Ioi Lam for so patiently explaining CDS internals to > > me, and to Andrew Haley and Nick Gasson for help with aarch64!) 
> > > > Reservation of the compressed class space is needlessly complicated > > and has some minor issues. It can be simplified and made clearer. > > > > The complexity stems from the fact that this area lives at the > > intersection of two to three sub systems, depending on how one counts. > > Metaspace, CDS, and the platform which may or may not its own view of > > how to reserve class space. And this code has been growing organically > > over time. > > > > One small example: > > > > ReservedSpace Metaspace::reserve_preferred_space(size_t size, size_t > > alignment, > > bool large_pages, > > char *requested_addr, > > bool use_requested_addr) > > > > which I spent hours decoding, resulting in a very confused mail to > > hs-runtime and aarch64-port-dev [2]. > > > > This patch attempts to simplify cds and metaspace setup a bit; to > > comment implicit knowledge which is not immediately clear; to cleanly > > abstract platform concerns like optimized class space placement; and > > to disentangle cds from metaspace to solve issues which may bite us > > later with Elastic Metaspace [4]. > > > > --- > > > > The main change is the reworked reservation mechanism. This is based > > on Ioi's proposal [5]. > > > > When reserving class space, three things must happen: > > > > 1) reservation of the space obviously. If cds is active that space > > must be in the vicinity of cds archives to be covered by compressed > > class pointer encoding. > > 2) setting up the internal Metaspace structures atop of that space > > 3) setting up compressed class pointer encoding. > > > > In its current form, Metaspace may or may not do some or all of that > > in one function > > (Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace > > metaspace_rs, char* requested_addr, address cds_base);) - if cds is > > active, it will reserve the space for Metaspace and hand it in, > > otherwise it will create it itself. 
> > > > When discussing this in [2], Ioi proposed to move the reservation of > > the class space completely out of Metaspace and make it a > > responsibility of the caller always. This would reduce some > > complexity, and this patch follows the proposal. > > > > I removed > > Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace > > metaspace_rs, char* requested_addr, address cds_base); and all its sub > > functions. > > > > (1) now has to always be done outside - a ReservedSpace for class > > space has to be provided by the caller. However, Metaspace now offers > > a utility function for reserving space at a "nice" location, and > > explicitly doing nothing else: > > > > ReservedSpace > > Metaspace::reserve_address_space_for_compressed_classes(size_t size); > > > > this function can be redefined on a platform level for platform > > optimized reservation, see below for details. > > > > (2) is taken care of by a new function, > > Metaspace::initialize_class_space(ReservedSpace rs) > > > > (3) is taken care of a new function > > CompressedKlassPointers::initialize(), see below for details. > > > > > > So, class space now is set up three explicit steps: > > > > - First, reserve a suitable space by however means you want. For > > convenience you may use > > Metaspace::reserve_address_space_for_compressed_classes(), or you may > > roll your own reservation. > > - Next, tell Metaspace to use that range as backing storage for class > > space: Metaspace::initialize_class_space(ReservedSpace rs) > > - Finally, set up encoding. Encoding is independent from the concept > > of a ReservedSpace, it just gets an address range, see below for details. > > > > Separating these steps and moving them out of the responsibility of > > Metaspace makes this whole thing more flexible; it also removes > > unnecessary knowledge (e.g. Metaspace does not need to know anything > > about either ccp encoding or cds). 
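
Editorial illustration of the three explicit steps described above, for the simplest case where only a class space exists. This is a toy model, not the patch: `Range`, `ClassSpace`, `KlassEncoding`, `reserve_for_compressed_classes` and `setup_class_space` are hypothetical names, and the "reservation" just returns a fixed address instead of calling into the OS.

```cpp
#include <cassert>
#include <cstdint>

// Simplified stand-in for a reserved address range (not HotSpot's ReservedSpace).
struct Range {
  uint64_t base;
  uint64_t size;
};

// Step 1 (hypothetical): reserve a range. A real VM would ask the OS;
// this sketch hands back a fixed, 4 GB aligned address for illustration.
static Range reserve_for_compressed_classes(uint64_t size) {
  return Range{UINT64_C(0x800000000), size};
}

// Step 2 (hypothetical): class space is told which range backs it and
// knows nothing about encoding or archives.
struct ClassSpace {
  Range backing{0, 0};
  bool initialized = false;
  void initialize(Range rs) {
    backing = rs;
    initialized = true;
  }
};

// Step 3 (hypothetical): encoding only sees a plain address range.
struct KlassEncoding {
  uint64_t base = 0;
  uint64_t range = 0;
  void initialize(uint64_t addr, uint64_t len) {
    base = addr;
    range = len;
  }
  bool covers(uint64_t p) const { return p >= base && p - base < range; }
};

struct Setup {
  ClassSpace class_space;
  KlassEncoding encoding;
};

// The caller, not the class space, drives all three steps.
static Setup setup_class_space(uint64_t class_space_size) {
  Setup s;
  Range rs = reserve_for_compressed_classes(class_space_size);  // step 1
  s.class_space.initialize(rs);                                 // step 2
  s.encoding.initialize(rs.base, rs.size);                      // step 3
  return s;
}
```

The point of the shape is that each step takes only what it needs: the class space gets a range, and the encoding gets an address and a length.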
> > > > --- > > > > How it comes together: > > > > If CDS is off, we just reserve a space using > > Metaspace::reserve_address_space_for_compressed_classes(), initialize > > it with Metaspace::initialize_class_space(ReservedSpace rs), then set > > up compressed class pointer encoding covering the range of this class > > space. > > > > If CDS is on (dump time), we reserve large 4G space, either at > > SharedBaseAddress or using > > Metaspace::reserve_address_space_for_compressed_classes(); we then > > split that into 3G archive space and 1G class space; we set up that > > space with Metaspace as class space; then we set up compressed class > > pointer encoding covering both archive space and cds. > > > > If CDS is on (run time), we reserve a large space, split it into > > archive space (large enough to hold both archives) and class space, > > then basically proceed as above. > > > > Note that this is almost exactly how things worked before (modulo some > > minor fixes, e.g. alignment issues), only the code is reformed and > > made more explicit. > > > > --- > > > > I moved compressed class pointer setup over to CompressedKlassPointers > > and changed the interface: > > > > -void Metaspace::set_narrow_klass_base_and_shift(ReservedSpace > > metaspace_rs, address cds_base) > > +void CompressedKlassPointers::initialize(address addr, size_t len); > > > > Instead of feeding it a single ReservedSpace, which is supposed to > > represent class space, and an optional alternate base if cds is on, > > now we give it just an numeric address range. That range marks the > > limits to where Klass structures are to be expected, and is the > > implicit promise that outside that range no Klass structures will > > exist, so encoding has to cover only this range. > > > > This range may contain just the class space; or class space+cds; or > > whatever allocation scheme we come up with in the future. 
Encoding > > does not really care how the memory is organized as long as the input > > range covers all possible Klass locations. That way we remove > > knowledge about class space/cds from compressed class pointer encoding. > > > > Moving it away from metaspace.cpp into the CompressedKlassPointers > > class also mirrors CompressedOops::initialize(). > > > > --- > > > > I renamed _narrow_klass_range to just _range, because strictly > > speaking this is the range un-narrow Klass pointers can have. > > > > As for the implementation of > > CompressedKlassPointers::initialize(address addr, size_t len), I > > mimicked very closely what happened before, so there should be almost > > no differences. Since "almost no differences" sounds scary :) here are > > the differences: > > > > - When CDS is active (dump or run time) we now always, > > unconditionally, set the encoding range to 4G. This fixes a > > theoretical bug discussed on aarch64-port-dev [1]. > > > > - When CDS is not active, we set the encoding range to the minimum > > required length. Before, it was left at its default value of 4G. > > > > Both differences only affect aarch64, since they are currently the > > only one using the range field in CompressedKlassPointers. > > > > I wanted to add an assert somewhere to test encoding of the very last > > address of the CompressedKlassPointers range, again to prevent errors > > like [3]. But I did not come up with a good place for this assert > > which would cover also the encoding done by C1/C2. > > > > For the same reason I thought about introducing a mode where Klass > > structures would be allocated in reverse order, starting at the end of > > the ccs, but again left it out as too big a change. > > > > --- > > > > OS abstraction: platforms may have restrictions of what constitutes a > > valid compressed class pointer encoding base. Or if not, they may have > > at least preferences. 
There was logic like this in metaspace.cpp, > > which I removed and cleanly factored out into platform dependent > > files, giving each platform the option to add special logic. > > > > These are two new methods: > > > > - bool CompressedKlassPointers::is_valid_base(address p) > > > > to let the platform tell you whether it considers p to be a valid > > encoding base. The only platform having these restrictions currently > > is aarch64. > > > > - ReservedSpace > > Metaspace::reserve_address_space_for_compressed_classes(size_t size); > > > > this hands over the process of allocating a range suitable for > > compressed class pointer encoding to the platform. Most platforms will > > allocate just anywhere, but some platforms may have a better strategy > > (e.g. trying low memory first, trying only correctly aligned addresses > > and so on). > > > > Beforehand, this coding existed in a similar form in metaspace.cpp for > > aarch64 and AIX. For now, I left the AIX part out - it seems only half > > done, and I want to check further if we even need it, if yes why not > > on Linux ppc, and C1 does not seem to support anything other than > > base+offset with shift either, but I may be mistaken. > > > > These two methods should give the platform enough control to implement > > their own scheme for optimized class space placement without bothering > > any shared code about it. > > > > Note about the form, I introduced two new platform dependent files, > > "metaspace_.cpp" and "compressedOops_.cpp". I am not happy > > about this but this seems to be what we generally do in hotspot, right? > > > > --- > > > > Metaspace reserve alignment vs cds alignment > > > > CDS was using Metaspace reserve alignment for CDS internal purposes. I > > guess this was just a copy paste issue. It never caused problems since > > Metaspace reserve alignment == page size, but that is not true anymore > > in the upcoming Elastic Metaspace where reserve alignment will be > > larger. 
This causes a number of issues. > > > > I separated those two cleanly. CDS now uses > > os::vm_allocation_granularity. Metaspace::reserve_alignment is only > > used in those two places where it is needed, when CDS creates the > > address space for class space on behalf of the Metaspace. > > > > --- > > > > Windows special handling in CDS > > > > To simplify coding I removed the windows specific handling which left > > out reservation of the archive. This was needed because windows cannot > > mmap files into reserved regions. But fallback code exists in > > filemap.cpp for this case which just reads in the region instead of > > mapping it. > > > > Should that turn out to be a performance problem, I will reinstate the > > feature. But a simpler way would be reserve the archive and later just > > before mmapping the archive file to release the archive space. That > > would not only be simpler but give us the best guarantee that that > > address space is actually available. But I'd be happy to leave that > > part out completely if we do not see any performance problems on > > windows x64. > > > > --- > > > > NMT cannot deal with spaces which are split. This problem manifests in > > that bookkeeping for class space is done under "Shared Classes", not > > "Classes" as it should. This problem exists today too at dump time and > > randomly at run time. But since I simplified the reservation, this > > problem now shows up always, whether or not we map at the > > SharedBaseAddress. > > While I could work around this problem, I'd prefer this problem to be > > solved at the core, and NMT to have an option to recognize reservation > > splits. So I'd rather not put a workaround for this into the patch but > > leave it for fixing as a separate issue. I opened this issue to track > > it [6]. > > > > --- > > > > Jtreg tests: > > > > I expanded the CompressedOops/CompressedClassPointers.java. I also > > extended them to Windows. 
> > The tests now optionally omit strict class
> > space placement tests, since these tests heavily depend on ASLR and
> > were the reason they were excluded on Windows. However I think even
> > without checking for class space placement they make sense, just to
> > see that the VM comes up and lives with the many different settings we
> > can run in.
> >
> > ---
> >
> > Tests:
> >
> > - I ran the patch through Oracles submit repo
> > - I ran tests manually for aarch64, zero, linux 32bit and windows x64
> > - The whole battery of nightly tests at SAP, including ppc, ppcle and
> > aarch64, unfortunately excluding windows because of unrelated errors.
> > Windows x64 tests will be redone tonight.
> >
> >
> > Thank you,
> >
> > Thomas
> >
> > [1] https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008804.html
> > [2] https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html
> > [3] https://bugs.openjdk.java.net/browse/JDK-8193266
> > [4] https://bugs.openjdk.java.net/browse/JDK-8221173
> > [5] https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008765.html
> > [6] https://bugs.openjdk.java.net/browse/JDK-8243535
> >
> >

From ningsheng.jian at arm.com Wed Apr 29 06:59:18 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Wed, 29 Apr 2020 14:59:18 +0800
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for Concurrent Class Unloading
In-Reply-To:
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com> <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com> <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com> <4a01252e-e7b2-d2bc-2858-ca1785b8b2a2@arm.com> <3f193fdc-b1fb-9f0a-4635-acdb7de29bca@arm.com>
Message-ID: <0f602574-a4ea-da3e-46de-d35862e276d6@arm.com>

On 4/28/20 7:28 PM, Stuart Monteith wrote:
> On 28/04/2020 06:26, Ningsheng Jian wrote:
>> Hi Stuart,
>>
>> On 4/28/20 12:34 AM, Stuart Monteith wrote:
>>> Thanks Erik, Per, Andrew,
>>>     I've fixed up the testcase and retested.
>>>
>>> Uploaded here:
>>>
>>>     http://cr.openjdk.java.net/~smonteith/8216557/webrev.2/
>>>
>>> Would someone be able to submit this for me?
>>>
>>
>> I submitted a build job before pushing your code, but it failed to build with minimal variant configure. Here's error message:
>>
>> ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp: In static member function 'static AdapterHandlerEntry* SharedRuntime::generate_i2c2i_adapters(MacroAssembler*, int, int, const BasicType*, const VMRegPair*, AdapterFingerPrint*)':
>>
>> ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp:736:5: error: invalid use of incomplete type 'class BarrierSetAssembler'
>>
>>    bs->c2i_entry_barrier(masm);
>>
>> I think you need to include barrierSetAssembler.hpp in sharedRuntime_aarch64.cpp?
>>
>> Thanks,
>> Ningsheng
>
> Thanks for that Ningsheng - I've made some changes, and built with minimal.
>
> The revised patch:
>
> http://cr.openjdk.java.net/~smonteith/8216557/webrev.3/
>

Looks good and pushed.

Thanks,
Ningsheng

From ioi.lam at oracle.com Wed Apr 29 05:14:31 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 28 Apr 2020 22:14:31 -0700
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation
In-Reply-To:
References:
Message-ID: <6802c3af-77a5-8a91-b94c-d590dd41765f@oracle.com>

Hi Thomas,

There are a lot of changes so it will take me a while to go through everything. Just some initial comments:

  // User may have specified an invalid base address. Should we ignore it or assert?
  guarantee(CompressedKlassPointers::is_valid_base((address)shared_base),
            "SharedBaseAddress: " PTR_FORMAT " is not a valid base.", p2i(shared_base));

This will cause the VM to crash. I think it's better (1) exit the VM properly with an error code, or (2) override the user's input.

======

Since this is a potentially disruptive change, I want to run it in our CI as well. Could you tell me the tip of your repo?

========

For testing the CDS relocation code, I would suggest running:

cd test/hotspot/jtreg
jtreg -javaoption:-XX:+UnlockDiagnosticVMOptions \
      -javaoption:-XX:ArchiveRelocationMode=1 \
      -javaoption:-XX:NativeMemoryTracking=detail \
      :hotspot_cds_relocation

This will place the CCS at random locations picked by the OS.

========

metaspace.cpp:

If your intention is to "shake things up a little", it's not a good idea to include it in a complex change set. If things indeed go wrong, we don't know who caused it (your CCS changes, or old bugs triggered by this debug code), and we will end up backing out the entire changeset.

I would suggest putting this in a different RFE, and even push it now.

  // The upcoming Elastic Metaspace will have stricter alignment requirements.
  // For debug builds, increase reserve alignment to shake loose errors resulting
  // from misusing this alignment.
  // Note: do not increase too much (e.g. not on platforms with 64K pages), we do not
  // want to disturb tests requiring precise numbers for metaspace size or ccs size.
#ifdef ASSERT
  if (_reserve_alignment == 4 * K) {
    _reserve_alignment *= 4;
  }
#endif

More to come ....

Thanks
- Ioi

On 4/28/20 7:54 AM, Thomas Stüfe wrote:
> Hi all,
>
> Could I have reviews for the following proposal of reworking cds/class
> space reservation?
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8243392
>
> Webrev:
> http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.00/webrev/
>
> (Many thanks to Ioi Lam for so patiently explaining CDS internals to
> me, and to Andrew Haley and Nick Gasson for help with aarch64!)
>
> Reservation of the compressed class space is needlessly complicated
> and has some minor issues. It can be simplified and made clearer.
>
> The complexity stems from the fact that this area lives at the
> intersection of two to three sub systems, depending on how one counts.
> Metaspace, CDS, and the platform which may or may not have its own view of
> how to reserve class space. And this code has been growing organically
> over time.
>
> One small example:
>
> ReservedSpace Metaspace::reserve_preferred_space(size_t size, size_t alignment,
>                                                  bool large_pages, char *requested_addr,
>                                                  bool use_requested_addr)
>
> which I spent hours decoding, resulting in a very confused mail to
> hs-runtime and aarch64-port-dev [2].
>
> This patch attempts to simplify cds and metaspace setup a bit; to
> comment implicit knowledge which is not immediately clear; to cleanly
> abstract platform concerns like optimized class space placement; and
> to disentangle cds from metaspace to solve issues which may bite us
> later with Elastic Metaspace [4].
>
> ---
>
> The main change is the reworked reservation mechanism. This is based
> on Ioi's proposal [5].
>
> When reserving class space, three things must happen:
>
> 1) reservation of the space obviously. If cds is active that space
> must be in the vicinity of cds archives to be covered by compressed
> class pointer encoding.
> 2) setting up the internal Metaspace structures atop of that space
> 3) setting up compressed class pointer encoding.
>
> In its current form, Metaspace may or may not do some or all of that
> in one function
> (Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace
> metaspace_rs, char* requested_addr, address cds_base);) - if cds is
> active, it will reserve the space for Metaspace and hand it in,
> otherwise it will create it itself.
>
> When discussing this in [2], Ioi proposed to move the reservation of
> the class space completely out of Metaspace and make it a
> responsibility of the caller always. This would reduce some
> complexity, and this patch follows the proposal.
> > I removed > Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace > metaspace_rs, char* requested_addr, address cds_base); and all its sub > functions. > > (1) now has to always be done outside - a ReservedSpace for class > space has to be provided by the caller. However, Metaspace now offers > a utility function for reserving space at a "nice" location, and > explicitly doing nothing else: > > ReservedSpace > Metaspace::reserve_address_space_for_compressed_classes(size_t size); > > this function can be redefined on a platform level for platform > optimized reservation, see below for details. > > (2) is taken care of by a new function, > Metaspace::initialize_class_space(ReservedSpace rs) > > (3) is taken care of a new function > CompressedKlassPointers::initialize(), see below for details. > > > So, class space now is set up three explicit steps: > > - First, reserve a suitable space by however means you want. For > convenience you may use > Metaspace::reserve_address_space_for_compressed_classes(), or you may > roll your own reservation. > - Next, tell Metaspace to use that range as backing storage for class > space: Metaspace::initialize_class_space(ReservedSpace rs) > - Finally, set up encoding. Encoding is independent from the concept > of a ReservedSpace, it just gets an address range, see below for details. > > Separating these steps and moving them out of the responsibility of > Metaspace makes this whole thing more flexible; it also removes > unnecessary knowledge (e.g. Metaspace does not need to know anything > about either ccp encoding or cds). > > --- > > How it comes together: > > If CDS is off, we just reserve a space using > Metaspace::reserve_address_space_for_compressed_classes(), initialize > it with Metaspace::initialize_class_space(ReservedSpace rs), then set > up compressed class pointer encoding covering the range of this class > space. 
> > If CDS is on (dump time), we reserve large 4G space, either at > SharedBaseAddress or using > Metaspace::reserve_address_space_for_compressed_classes(); we then > split that into 3G archive space and 1G class space; we set up that > space with Metaspace as class space; then we set up?compressed class > pointer encoding covering both archive space and cds. > > If CDS is on (run time), we reserve a large space, split it into > archive space (large enough to hold both archives) and class space, > then basically proceed as above. > > Note that this is almost exactly how things worked before (modulo some > minor fixes, e.g. alignment issues), only the code is reformed and > made more explicit. > > --- > > I moved compressed class pointer setup over to CompressedKlassPointers > and changed the interface: > > -void Metaspace::set_narrow_klass_base_and_shift(ReservedSpace > metaspace_rs, address cds_base) > +void CompressedKlassPointers::initialize(address addr, size_t len); > > Instead of feeding it a single ReservedSpace, which is supposed to > represent class space, and an optional alternate base if cds is on, > now we give it just an numeric address range. That range marks the > limits to where Klass structures are to be expected, and is the > implicit promise that outside that range no Klass structures will > exist, so encoding has to cover only this range. > > This range may contain just the class space; or class space+cds; or > whatever allocation scheme we come up with in the future. Encoding > does not really care how the memory is organized as long as the input > range covers all possible Klass locations. That way we remove > knowledge about class space/cds from compressed class pointer encoding. > > Moving it away from metaspace.cpp into the CompressedKlassPointers > class also mirrors CompressedOops::initialize(). > > --- > > I renamed _narrow_klass_range to just _range, because strictly > speaking this is the range un-narrow Klass pointers can have. 
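
Editorial illustration of the range-based encoding interface described above. This is a toy model with hypothetical names (`NarrowKlassEncoding`, `in_range`, `encode`, `decode`); the base address and the shift of 0 are assumptions made for the example, and the real `CompressedKlassPointers::initialize(address addr, size_t len)` is not reproduced here. It uses the 4G dump-time layout from the text (3G archive space followed by 1G class space) and shows that the encoder only needs the outer range, not how that range is divided.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a 32-bit narrow-Klass encoding over one address range.
struct NarrowKlassEncoding {
  uint64_t base = 0;   // start of the range that may contain Klass structures
  uint64_t range = 0;  // length of that range in bytes
  int shift = 0;       // assumed 0 in the usage below

  void initialize(uint64_t addr, uint64_t len, int shift_bits) {
    // A 32-bit narrow value can span at most 4 GB << shift.
    assert(len <= (UINT64_C(1) << (32 + shift_bits)));
    base = addr;
    range = len;
    shift = shift_bits;
  }

  // The promise made by the caller: no Klass lives outside [base, base + range).
  bool in_range(uint64_t p) const { return p >= base && p - base < range; }

  uint32_t encode(uint64_t p) const {
    assert(in_range(p));
    return static_cast<uint32_t>((p - base) >> shift);
  }

  uint64_t decode(uint32_t narrow) const {
    return base + (static_cast<uint64_t>(narrow) << shift);
  }
};
```

With a 4G range and shift 0, both an address in the archive part and one in the class space part round-trip through the same encoder, as does the very last address of the range, which is the boundary case the original mail wanted an assert for.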
> > As for the implementation of > CompressedKlassPointers::initialize(address addr, size_t len), I > mimicked very closely what happened before, so there should be almost > no differences. Since "almost no differences" sounds scary :) here are > the differences: > > - When CDS is active (dump or run time) we now always, > unconditionally, set the encoding range to 4G. This fixes a > theoretical bug discussed on aarch64-port-dev [1]. > > - When CDS is not active, we set the encoding range to the minimum > required length. Before, it was left at its default value of 4G. > > Both differences only affect aarch64, since they are currently the > only one using the range field in CompressedKlassPointers. > > I wanted to add an assert somewhere to test encoding of the very last > address of the CompressedKlassPointers range, again to prevent errors > like [3]. But I did not come up with a good place for this assert > which would cover also the encoding done by C1/C2. > > For the same reason I thought about introducing a mode where Klass > structures would be allocated in reverse order, starting at the end of > the ccs, but again left it out as too big a change. > > --- > > OS abstraction: platforms may have restrictions of what constitutes a > valid compressed class pointer encoding base. Or if not, they may have > at least preferences. There was logic like this in metaspace.cpp, > which I removed and cleanly factored out into platform dependent > files, giving each platform the option to add special logic. > > These are two new methods: > > - bool CompressedKlassPointers::is_valid_base(address p) > > to let the platform tell you whether it considers p to be a valid > encoding base. The only platform having these restrictions currently > is aarch64. > > - ReservedSpace > Metaspace::reserve_address_space_for_compressed_classes(size_t size); > > this hands over the process of allocating a range suitable for > compressed class pointer encoding to the platform. 
Most platforms will > allocate just anywhere, but some platforms may have a better strategy > (e.g. trying low memory first, trying only correctly aligned addresses > and so on). > > Beforehand, this coding existed in a similar form in metaspace.cpp for > aarch64 and AIX. For now, I left the AIX part out - it seems only half > done, and I want to check further if we even need it, if yes why not > on Linux ppc, and C1 does not seem to support anything other than > base+offset with shift either, but I may be mistaken. > > These two methods should give the platform enough control to implement > their own scheme for optimized class space placement without bothering > any shared code about it. > > Note about the form, I introduced two new platform dependent files, > "metaspace_.cpp" and "compressedOops_.cpp". I am not happy > about this but this seems to be what we generally do in hotspot, right? > > --- > > Metaspace reserve alignment vs cds alignment > > CDS was using Metaspace reserve alignment for CDS internal purposes. I > guess this was just a copy paste issue. It never caused problems since > Metaspace reserve alignment == page size, but that is not true anymore > in the upcoming Elastic Metaspace where reserve alignment will be > larger. This causes a number of issues. > > I separated those two cleanly. CDS now uses > os::vm_allocation_granularity. Metaspace::reserve_alignment is only > used in those two places where it is needed, when CDS creates the > address space for class space on behalf of the Metaspace. > > --- > > Windows special handling in CDS > > To simplify coding I removed the windows specific handling which left > out reservation of the archive. This was needed because windows cannot > mmap files into reserved regions. But fallback code exists in > filemap.cpp for this case which just reads in the region instead of > mapping?it. > > Should that turn out to be a performance problem, I will reinstate the > feature. 
But a simpler way would be to reserve the archive and later, just
> before mmapping the archive file, to release the archive space. That
> would not only be simpler but give us the best guarantee that that
> address space is actually available. But I'd be happy to leave that
> part out completely if we do not see any performance problems on
> windows x64.
>
> ---
>
> NMT cannot deal with spaces which are split. This problem manifests in
> that bookkeeping for class space is done under "Shared Classes", not
> "Classes" as it should. This problem exists today too at dump time and
> randomly at run time. But since I simplified the reservation, this
> problem now shows up always, whether or not we map at the
> SharedBaseAddress.
> While I could work around this problem, I'd prefer this problem to be
> solved at the core, and NMT to have an option to recognize reservation
> splits. So I'd rather not put a workaround for this into the patch but
> leave it for fixing as a separate issue. I opened this issue to track
> it [6].
>
> ---
>
> Jtreg tests:
>
> I expanded the CompressedOops/CompressedClassPointers.java. I also
> extended them to Windows. The tests now optionally omit strict class
> space placement tests, since these tests heavily depend on ASLR and
> were the reason they were excluded on Windows. However I think even
> without checking for class space placement they make sense, just to
> see that the VM comes up and lives with the many different settings we
> can run in.
>
> ---
>
> Tests:
>
> - I ran the patch through Oracle's submit repo
> - I ran tests manually for aarch64, zero, linux 32bit and windows x64
> - The whole battery of nightly tests at SAP, including ppc, ppcle and
> aarch64, unfortunately excluding windows because of unrelated errors.
> Windows x64 tests will be redone tonight.
> > > Thank you, > > Thomas > > [1] > https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008804.html > [2] > https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html > [3] https://bugs.openjdk.java.net/browse/JDK-8193266 > [4] https://bugs.openjdk.java.net/browse/JDK-8221173 > [5] > https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008765.html > [6] https://bugs.openjdk.java.net/browse/JDK-8243535 > From nick.gasson at arm.com Wed Apr 29 07:47:02 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Wed, 29 Apr 2020 15:47:02 +0800 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation In-Reply-To: References: Message-ID: <857dxyg2dl.fsf@arm.com> Hi Thomas, On 04/28/20 22:54 pm, Thomas St?fe wrote: > > These are two new methods: > > - bool CompressedKlassPointers::is_valid_base(address p) > > to let the platform tell you whether it considers p to be a valid encoding > base. The only platform having these restrictions currently is aarch64. > > - ReservedSpace > Metaspace::reserve_address_space_for_compressed_classes(size_t size); > > this hands over the process of allocating a range suitable for compressed > class pointer encoding to the platform. Most platforms will allocate just > anywhere, but some platforms may have a better strategy (e.g. trying low > memory first, trying only correctly aligned addresses and so on). > > Beforehand, this coding existed in a similar form in metaspace.cpp for > aarch64 and AIX. For now, I left the AIX part out - it seems only half > done, and I want to check further if we even need it, if yes why not on > Linux ppc, and C1 does not seem to support anything other than base+offset > with shift either, but I may be mistaken. Just a small comment: 33 bool CompressedKlassPointers::is_valid_base(address p) { 34 35 // Below 32G, base must be aligned to 4G. 
36 // Above that point, base must be aligned to 32G 37 38 if (p < (address)(32 * G)) { 39 return is_aligned(p, 4 * G); 40 } 41 42 return is_aligned(p, 32 * G); 43 44 } On line 42 I'd prefer to use (4 << LogKlassAlignmentInBytes)*G as it currently is in metaspace.cpp instead of the literal 32. This makes the relationship with the compressed class decode logic a bit clearer as the restriction comes from the MOV and MOVK instructions we use to decompress the pointer: we have to ensure the bits of the base and bits of the offset after shifting do not overlap. Similarly for the `increment` field in metaspace_aarch64.cpp line 51. Thanks, Nick From aph at redhat.com Wed Apr 29 08:36:03 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 29 Apr 2020 09:36:03 +0100 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation In-Reply-To: <857dxyg2dl.fsf@arm.com> References: <857dxyg2dl.fsf@arm.com> Message-ID: <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com> On 4/29/20 8:47 AM, Nick Gasson wrote: > This makes the > relationship with the compressed class decode logic a bit clearer as the > restriction comes from the MOV and MOVK instructions we use to > decompress the pointer: we have to ensure the bits of the base and bits > of the offset after shifting do not overlap. This seems a bit crazy. Whyever would anyone want shifted CompressedKlassPointers with an offset? I guess I'm going to have to look very closely at this patch. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From nick.gasson at arm.com Wed Apr 29 08:51:23 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Wed, 29 Apr 2020 16:51:23 +0800 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation In-Reply-To: <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com> References: <857dxyg2dl.fsf@arm.com> <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com> Message-ID: <854kt2fzec.fsf@arm.com> On 04/29/20 16:36 pm, Andrew Haley wrote: > On 4/29/20 8:47 AM, Nick Gasson wrote: >> This makes the >> relationship with the compressed class decode logic a bit clearer as the >> restriction comes from the MOV and MOVK instructions we use to >> decompress the pointer: we have to ensure the bits of the base and bits >> of the offset after shifting do not overlap. > > This seems a bit crazy. Whyever would anyone want shifted > CompressedKlassPointers with an offset? I guess I'm going to have to > look very closely at this patch. The compressed class shift is always set to LogKlassAlignmentInBytes when CDS is enabled. It's for compatibility with AOT. See this comment in Metaspace::set_narrow_klass_base_and_shift(): // CDS uses LogKlassAlignmentInBytes for narrow_klass_shift. See // MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() for // how dump time narrow_klass_shift is set. Although, CDS can work // with zero-shift mode also, to be consistent with AOT it uses // LogKlassAlignmentInBytes for klass shift so archived java heap objects // can be used at same time as AOT code. For AOT this is set up in AOTGraalHotSpotVMConfig.java: // AOT captures VM settings during compilation. For compressed oops this // presents a problem for the case when the VM selects a zero-shift mode // (i.e., when the heap is less than 4G). Compiling an AOT binary with // zero-shift limits its usability. 
As such we force the shift to be // always equal to alignment to avoid emitting zero-shift AOT code. CompressEncoding vmOopEncoding = super.getOopEncoding(); aotOopEncoding = new CompressEncoding(vmOopEncoding.getBase(), logMinObjAlignment()); CompressEncoding vmKlassEncoding = super.getKlassEncoding(); aotKlassEncoding = new CompressEncoding(vmKlassEncoding.getBase(), logKlassAlignment); For compressed OOPs it makes sense because it allows a larger heap without changing the encoding mode. But for compressed class pointers we never need to address more than 4G so maybe it's better to use 0 shift instead of logKlassAlignment above? With CDS the default shared base address is 0x80000000 which doesn't allow a zero base anyway. Thanks, Nick From aph at redhat.com Wed Apr 29 12:56:10 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 29 Apr 2020 13:56:10 +0100 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation In-Reply-To: <854kt2fzec.fsf@arm.com> References: <857dxyg2dl.fsf@arm.com> <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com> <854kt2fzec.fsf@arm.com> Message-ID: <25829763-7485-5208-58e8-1d51f7068816@redhat.com> On 4/29/20 9:51 AM, Nick Gasson wrote: > For compressed OOPs it makes sense because it allows a larger heap > without changing the encoding mode. That's right: I'm looking at AOT-compiled code (after applying your patch) and by default it uses a shift of 3, no offset. 
If I then run the AOT-compiled code with -Xmx31G I get:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000ffffa142bd3c, pid=9965, tid=10174
#
# JRE version:  (15.0) (slowdebug build )
# Java VM: OpenJDK 64-Bit Server VM (slowdebug 15-internal+0-adhoc.aph.jdk-tmp, mixed mode, aot, tiered, compressed oops, g1 gc, linux-aarch64)
# Problematic frame:
# A 388  java.lang.Thread.setPriority(I)V java.base (56 bytes) @ 0x0000ffffa142bd3c [0x0000ffffa142bac0+0x000000000000027c]

   0x0000ffffa142bd30 <+624>:   ldr     w1, [x4, #56]
   0x0000ffffa142bd34 <+628>:   cbz     w1, 0xffffa142bd84
   0x0000ffffa142bd38 <+632>:   lsl     x1, x1, #3
   0x0000ffffa142bd3c <+636>:   ldr     w0, [x1, #12]

... so the AOT-compiled code is still trying to use the shift of 3,
but it is not adding in the base, which is 0x1000000000. I guess this
is pilot error, but I'm trying to understand what gets checked and
when.

> But for compressed class pointers we never need to address more than
> 4G so maybe it's better to use 0 shift instead of logKlassAlignment
> above? With CDS the default shared base address is 0x80000000 which
> doesn't allow a zero base anyway.

Maybe. What actually happens when we decode compressed class pointers
in AOT-compiled code is:

Load the klass pointer from an Object:

  532440:       b940082a        ldr     w10, [x1,#8]

Load the compressed class base:

  532444:       90055e68        adrp    x8, b0fe000
  532448:       9104c108        add     x8, x8, #0x130
  53244c:       f9400108        ldr     x8, [x8]

Shift and add:

  532450:       8b2a6d0a        add     x10, x8, x10, uxtx #3

... none of which is very nice, but the expensive part is loading the
compressed class base and doing the add, so I guess we don't care
that there is a shift as well.

--
Andrew Haley (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Wed Apr 29 13:32:56 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 29 Apr 2020 14:32:56 +0100 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation In-Reply-To: <25829763-7485-5208-58e8-1d51f7068816@redhat.com> References: <857dxyg2dl.fsf@arm.com> <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com> <854kt2fzec.fsf@arm.com> <25829763-7485-5208-58e8-1d51f7068816@redhat.com> Message-ID: <040930ce-2008-455f-4427-82c2795492c7@redhat.com> On 4/29/20 1:56 PM, Andrew Haley wrote: > Maybe. What actually happens when we decode compressed class pointers > in AOT-compiled code is: > > Load the klass pointer from an Object: > > 532440: b940082a ldr w10, [x1,#8] > > Load the compressed class base: > > 532444: 90055e68 adrp x8, b0fe000 > 532448: 9104c108 add x8, x8, #0x130 > 53244c: f9400108 ldr x8, [x8] > > Shift and add: > > 532450: 8b2a6d0a add x10, x8, x10, uxtx #3 > > ... none of which is very nice, but the expensive part is loading the > compressed class base and doing the add, so I guess we don't care > that there is a shift as well. Argh. s/compressed class/compressed oop/g -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Wed Apr 29 14:21:44 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 29 Apr 2020 15:21:44 +0100 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation In-Reply-To: References: Message-ID: On 4/28/20 3:54 PM, Thomas St?fe wrote: > These two methods should give the platform enough control to implement > their own scheme for optimized class space placement without bothering any > shared code about it. There's still something I don't like. 
If we have a compressed class space in the lower 32G but above 4G, we do this: // Otherwise we attempt to use a zero base if the range fits in lower 32G. if (end <= (address)ClassEncodingMetaspaceMax) { base = 0; } else { base = addr; } // Highest offset a Class* can ever have in relation to base. range = end - base; // We may not even need a shift if the range fits into 32bit: const uint64_t UnscaledClassSpaceMax = (uint64_t(max_juint) + 1); if (range < UnscaledClassSpaceMax) { shift = 0; } else { shift = LogClassAlignmentInBytes; } ... which means that we end up with zero base, shifted compressed class pointers, *despite the fact* that we carefully chose a nicely-aligned compressed class base we could encode efficiently. I guess this code above is really optimized for x86; it certainly seems to prefer shifts to offsets, which makes sense on that part. It doesn't make much difference (if any) for AArch64, I admit, but it is odd. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From thomas.stuefe at gmail.com Wed Apr 29 16:14:30 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 29 Apr 2020 18:14:30 +0200 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation In-Reply-To: References: Message-ID: Hi Andrew, On Wed, Apr 29, 2020 at 4:21 PM Andrew Haley wrote: > On 4/28/20 3:54 PM, Thomas St?fe wrote: > > These two methods should give the platform enough control to implement > > their own scheme for optimized class space placement without bothering > any > > shared code about it. > > There's still something I don't like. If we have a compressed class space > in the lower 32G but above 4G, we do this: > > // Otherwise we attempt to use a zero base if the range fits in lower > 32G. 
> if (end <= (address)ClassEncodingMetaspaceMax) { > base = 0; > } else { > base = addr; > } > > // Highest offset a Class* can ever have in relation to base. > range = end - base; > > // We may not even need a shift if the range fits into 32bit: > const uint64_t UnscaledClassSpaceMax = (uint64_t(max_juint) + 1); > if (range < UnscaledClassSpaceMax) { > shift = 0; > } else { > shift = LogClassAlignmentInBytes; > } > > ... which means that we end up with zero base, shifted compressed > class pointers, *despite the fact* that we carefully chose a > nicely-aligned compressed class base we could encode efficiently. > > I guess this code above is really optimized for x86; it certainly > seems to prefer shifts to offsets, which makes sense on that part. It > doesn't make much difference (if any) for AArch64, I admit, but it is > odd. > > I understand. First off, this patch is supposed to be an almost clean code reshuffle. It was not the intent of this patch to change functionality, at least not by much, since the patch is complicated enough as it is. It just wants to improve maintainability, so that we can improve the code in the future easier. So, compressed class pointer encoding should work as it did before, modulo those little details about CompressedKlassPointers::range. That said, I agree with you, this is not optimal. There are other possibilities to improve matters, e.g. we miss opportunities to go zero based if the heap is very large since the class space gets always allocated behind the heap. Since the intent of this code is to give platforms greater leeway to do their thing without disturbing shared code, maybe we should make CompressedKlassPointers::initialize() platform dependent too? Or, add a hook at the end of it allowing platforms to overwrite the default behavior. If aarch64 prefers shift=0 and base= it could do so. ..Thomas -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. 
> https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > > From thomas.stuefe at gmail.com Wed Apr 29 16:16:24 2020 From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=) Date: Wed, 29 Apr 2020 18:16:24 +0200 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation In-Reply-To: <857dxyg2dl.fsf@arm.com> References: <857dxyg2dl.fsf@arm.com> Message-ID: Thank you Nick. You are of course right, I should use speaking constants instead. I will change that. ..Thomas On Wed, Apr 29, 2020 at 10:14 AM Nick Gasson wrote: > Hi Thomas, > > On 04/28/20 22:54 pm, Thomas St?fe wrote: > > > > These are two new methods: > > > > - bool CompressedKlassPointers::is_valid_base(address p) > > > > to let the platform tell you whether it considers p to be a valid > encoding > > base. The only platform having these restrictions currently is aarch64. > > > > - ReservedSpace > > Metaspace::reserve_address_space_for_compressed_classes(size_t size); > > > > this hands over the process of allocating a range suitable for compressed > > class pointer encoding to the platform. Most platforms will allocate just > > anywhere, but some platforms may have a better strategy (e.g. trying low > > memory first, trying only correctly aligned addresses and so on). > > > > Beforehand, this coding existed in a similar form in metaspace.cpp for > > aarch64 and AIX. For now, I left the AIX part out - it seems only half > > done, and I want to check further if we even need it, if yes why not on > > Linux ppc, and C1 does not seem to support anything other than > base+offset > > with shift either, but I may be mistaken. > > Just a small comment: > > 33 bool CompressedKlassPointers::is_valid_base(address p) { > 34 > 35 // Below 32G, base must be aligned to 4G. 
> 36 // Above that point, base must be aligned to 32G
> 37
> 38 if (p < (address)(32 * G)) {
> 39   return is_aligned(p, 4 * G);
> 40 }
> 41
> 42 return is_aligned(p, 32 * G);
> 43
> 44 }
>
> On line 42 I'd prefer to use (4 << LogKlassAlignmentInBytes)*G as it
> currently is in metaspace.cpp instead of the literal 32. This makes the
> relationship with the compressed class decode logic a bit clearer as the
> restriction comes from the MOV and MOVK instructions we use to
> decompress the pointer: we have to ensure the bits of the base and bits
> of the offset after shifting do not overlap. Similarly for the
> `increment` field in metaspace_aarch64.cpp line 51.
>
>
> Thanks,
> Nick
>

From zgu at redhat.com  Wed Apr 29 19:09:11 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 29 Apr 2020 15:09:11 -0400
Subject: [aarch64-port-dev ] [15] RFR 8241793: Shenandoah: Enable concurrent class unloading for aarch64
Message-ID: <4f453020-1f29-30a4-9e45-33854451bd3d@redhat.com>

Concurrent class unloading support for aarch64 [1] has been pushed,
let's enable it for Shenandoah GC.

Bug: https://bugs.openjdk.java.net/browse/JDK-8241793
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8241793/webrev.00/

Test:
  hotspot_gc_shenandoah
  tier1 with Shenandoah GC

Thanks,

-Zhengyu

[1] https://bugs.openjdk.java.net/browse/JDK-8216557

From xxinliu at amazon.com  Wed Apr 29 23:43:13 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Wed, 29 Apr 2020 23:43:13 +0000
Subject: [aarch64-port-dev ] FW: [Mach5] mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <858sifgbt3.fsf@arm.com>
References: <858sifgbt3.fsf@arm.com>
Message-ID: <57BB18E5-1F35-45AC-88BF-8D8C6A0767E3@amazon.com>

Hi, Nick,

Thanks for taking a look at this issue. I figured out why. There're two separate issues.

1. My toolchain was too old. That's why I can't reproduce the build failures reported by the submit repo.
OpenJDK wiki says jdk13 Linux x86_64 supports gcc 8.2, so I believe it means the build hosts must use gcc 8.2+. My WIP patch does have a C++ issue captured by g++-8+.

2. I can't use adoptOpenJDK's aarch64 jdk14 as boot-jdk.
Thanks for helping me to understand the problem. The pagesize of my aarch64 host is 4k. I don't have access to adoptOpenJDK's build hosts.
I have filed an issue with AdoptOpenJDK:
https://github.com/AdoptOpenJDK/openjdk14-binaries/issues/1

One trick here. It's very easy to cheat configure by hacking the boot-jdk.m4 to "$HEAD -n 2". Everything looks fine then.

Thanks,
--lx

On 4/28/20, 3:14 AM, "hotspot-dev on behalf of Nick Gasson" wrote:

    >
    > My nightly buildbot started failing recently on aarch64.
    > One issue is the error message of [cds] prevents configure from bootjdk determining.
    > https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77
    >
    > ./bin/java --version
    > [0.006s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096
    > openjdk 14.0.1 2020-04-14
    > OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
    > OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode)
    >
    > Are you aware of this issue? I can't see [cds] line on my linux/x86_64 host.
    > Maybe it's aarch64-only. CC aarch-port-dev.
    >

    Was your boot JDK built on a machine configured with a different page
    size to your current machine? Looks like the CDS archive was dumped on a
    machine with 64k pages but you're running with 4k pages.
There's a JBS issue for this: https://bugs.openjdk.java.net/browse/JDK-8236847 Thanks, Nick From nick.gasson at arm.com Thu Apr 30 05:55:16 2020 From: nick.gasson at arm.com (Nick Gasson) Date: Thu, 30 Apr 2020 13:55:16 +0800 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation In-Reply-To: <25829763-7485-5208-58e8-1d51f7068816@redhat.com> References: <857dxyg2dl.fsf@arm.com> <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com> <854kt2fzec.fsf@arm.com> <25829763-7485-5208-58e8-1d51f7068816@redhat.com> Message-ID: <85368lfrgb.fsf@arm.com> On 04/29/20 20:56 pm, Andrew Haley wrote: > > That's right: I'm looking at AOT-compiled code (after applying your > patch) and by default it uses a shift of 3, no offset. If I then run > the AOT-compiled code with -Xmx31G I get: > > # > # A fatal error has been detected by the Java Runtime Environment: > # > # SIGSEGV (0xb) at pc=0x0000ffffa142bd3c, pid=9965, tid=10174 > # > # JRE version: (15.0) (slowdebug build ) > # Java VM: OpenJDK 64-Bit Server VM (slowdebug 15-internal+0-adhoc.aph.jdk-tmp, mixed mode, aot, tiered, compressed oops, g1 gc, linux-aarch64) > # Problematic frame: > # A 388 java.lang.Thread.setPriority(I)V java.base (56 bytes) @ 0x0000ffffa142bd3c [0x0000ffffa142bac0+0x000000000000027c] > > 0x0000ffffa142bd30 <+624>: ldr w1, [x4, #56] > 0x0000ffffa142bd34 <+628>: cbz w1, 0xffffa142bd84 > 0x0000ffffa142bd38 <+632>: lsl x1, x1, #3 > 0x0000ffffa142bd3c <+636>: ldr w0, [x1, #12] > > ... so the AOT-compiled code is still trying to use the shift of 3, > but it is not adding in the base, which is 0x1000000000. I guess this > is pilot error, but I'm trying to understand what gets checked and > when. No this looks like a real bug: jaotc is using the value of the heap base in the VM where jaotc is run to decide whether to emit the add or not. If you run `jaotc -J-Xmx31g` it works. 
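To illustrate the failure mode (a minimal sketch under stated assumptions, not the Graal or HotSpot code): decoding a compressed oop is base + (narrow << shift). The names Encoding, decode and decode_without_base are made up for this example; the shift of 3 and the base 0x1000000000 are the values from the crash report quoted above, and the narrow value is arbitrary. If the compiler drops the add because the base was zero in the VM that ran jaotc, the decoded address is off by exactly the run-time base.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of a compressed-oop encoding (illustration only).
struct Encoding {
  uint64_t base;   // heap base added after shifting
  unsigned shift;  // log2 of the object alignment
};

// Full decode: address = base + (narrow << shift).
constexpr uint64_t decode(Encoding e, uint32_t narrow) {
  return e.base + (static_cast<uint64_t>(narrow) << e.shift);
}

// Decode as emitted when the base looked like zero at compile time,
// so the add was omitted and only the shift remains.
constexpr uint64_t decode_without_base(Encoding e, uint32_t narrow) {
  return static_cast<uint64_t>(narrow) << e.shift;
}

// Checks the example values: shift 3 and base 0x1000000000 come from the
// report above; the narrow value 0x1234 is arbitrary.
inline void check_decode_example() {
  const Encoding run_time{0x1000000000ULL, 3};
  const uint32_t narrow = 0x1234;
  assert(decode(run_time, narrow) == 0x10000091A0ULL);
  assert(decode_without_base(run_time, narrow) == 0x91A0ULL);
  // The two results differ by exactly the run-time base.
  assert(decode(run_time, narrow) - decode_without_base(run_time, narrow)
         == run_time.base);
}
```

Calling check_decode_example() from any test driver exercises the three assertions. With a zero base the two decode functions agree, which is presumably why the problem only shows up once the run-time heap is large enough to need a non-zero base.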
I'm not very familiar with Graal but I believe this fixes it: --- a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.aarch64/src/org/graalvm/compiler/hotspot/aarch64/AArch64HotSpotMove.java +++ b/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.aarch64/src/org/graalvm/compiler/hotspot/aarch64/AArch64HotSpotMove.java @@ -139,8 +139,9 @@ public class AArch64HotSpotMove { Register resultRegister = asRegister(result); Register ptr = asRegister(input); Register base = (isRegister(baseRegister) ? asRegister(baseRegister) : zr); + boolean pic = GeneratePIC.getValue(crb.getOptions()); // result = (ptr - base) >> shift - if (!encoding.hasBase()) { + if (!pic && !encoding.hasBase()) { if (encoding.hasShift()) { masm.lshr(64, resultRegister, ptr, encoding.getShift()); } else { @@ -189,7 +190,8 @@ public class AArch64HotSpotMove { public void emitCode(CompilationResultBuilder crb, AArch64MacroAssembler masm) { Register inputRegister = asRegister(input); Register resultRegister = asRegister(result); - Register base = encoding.hasBase() ? asRegister(baseRegister) : null; + boolean pic = GeneratePIC.getValue(crb.getOptions()); + Register base = pic || encoding.hasBase() ? asRegister(baseRegister) : null; emitUncompressCode(masm, inputRegister, resultRegister, base, encoding.getShift(), nonNull); } I've made a JBS issue to track this: https://bugs.openjdk.java.net/browse/JDK-8244164 > >> But for compressed class pointers we never need to address more than >> 4G so maybe it's better to use 0 shift instead of logKlassAlignment >> above? With CDS the default shared base address is 0x80000000 which >> doesn't allow a zero base anyway. > > Maybe. 
What actually happens when we decode compressed class pointers > in AOT-compiled code is: > > Load the klass pointer from an Object: > > 532440: b940082a ldr w10, [x1,#8] > > Load the compressed class base: > > 532444: 90055e68 adrp x8, b0fe000 > 532448: 9104c108 add x8, x8, #0x130 > 53244c: f9400108 ldr x8, [x8] > > Shift and add: > > 532450: 8b2a6d0a add x10, x8, x10, uxtx #3 > > ... none of which is very nice, but the expensive part is loading the > compressed classw base and doing the add, so I guess we don't care > that there is a shift as well. Yes but if we can avoid the shift here then CDS can also use zero shift by default. Which avoids the problem of having compressed class pointers with both shift and base non-zero in the Hotspot-generated code. Thanks, Nick From Pengfei.Li at arm.com Thu Apr 30 06:05:30 2020 From: Pengfei.Li at arm.com (Pengfei Li) Date: Thu, 30 Apr 2020 06:05:30 +0000 Subject: [aarch64-port-dev ] RFR(XS): Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com> Message-ID: Hi Xin, > I tested on aarch64. It generates the same crash report as x86_64 when it > does hit HaltNode. Halt reason is displayed. I paste report on the JBS. > I ran hotspot:tier1 on aarch64 fastdebug build. It passed except for 3 > relevant failures[1]. (NOT a reviewer) The original instruction used should be dcps1 instead of dpcs1 - there's a misspelling in AArch64 assembler. Could you add a trivial fix to change dpcs1/2/3 to dcps1/2/3? BTW, how did you test to hit the HaltNode? 
-- Thanks, Pengfei From xxinliu at amazon.com Thu Apr 30 06:35:54 2020 From: xxinliu at amazon.com (Liu, Xin) Date: Thu, 30 Apr 2020 06:35:54 +0000 Subject: [aarch64-port-dev ] RFR(XS): 8230552: Provide information when hitting a HaltNode for architectures other than x86 Message-ID: <19BC4D2D-56F3-45BE-898C-1389469A7B36@amazon.com> ?On 4/29/20, 11:06 PM, "Pengfei Li" wrote: Hi Xin, > I tested on aarch64. It generates the same crash report as x86_64 when it > does hit HaltNode. Halt reason is displayed. I paste report on the JBS. > I ran hotspot:tier1 on aarch64 fastdebug build. It passed except for 3 > relevant failures[1]. (NOT a reviewer) The original instruction used should be dcps1 instead of dpcs1 - there's a misspelling in AArch64 assembler. Could you add a trivial fix to change dpcs1/2/3 to dcps1/2/3? Oh, I don't know that. I did search dpcs and found nothing. I've filed a new issue about the typo thing: JDK-8244170. Let's resolve it in separated issue. BTW, how did you test to hit the HaltNode? -- Thanks, Pengfei I followed Christian and Volkers' recipe on JDK-8230552. Both of them can generate HaltNode. Volker's approach is very interesting. You have to give program a couple of "-XX:SuppressErrorAt=" to increase tolerance. Thanks, --lx From aph at redhat.com Thu Apr 30 07:57:31 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 30 Apr 2020 08:57:31 +0100 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation In-Reply-To: References: Message-ID: <617e5a37-3282-fb9e-6904-a03bd1f71411@redhat.com> On 4/29/20 5:14 PM, Thomas St?fe wrote: > Since the intent of this code is to give platforms greater leeway to do > their thing without disturbing shared code, maybe we should make > CompressedKlassPointers::initialize() platform dependent too? That would be very nice. If we can fix things so that we never shift, then a lot of things become easier. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Thu Apr 30 07:57:43 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 30 Apr 2020 08:57:43 +0100 Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage reservation In-Reply-To: <85368lfrgb.fsf@arm.com> References: <857dxyg2dl.fsf@arm.com> <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com> <854kt2fzec.fsf@arm.com> <25829763-7485-5208-58e8-1d51f7068816@redhat.com> <85368lfrgb.fsf@arm.com> Message-ID: <45cf5ef0-3939-38df-7959-0e19d79d9e39@redhat.com> On 4/30/20 6:55 AM, Nick Gasson wrote: > Yes but if we can avoid the shift here then CDS can also use zero shift > by default. Which avoids the problem of having compressed class pointers > with both shift and base non-zero in the Hotspot-generated code. That sounds good. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Thu Apr 30 08:18:51 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 30 Apr 2020 09:18:51 +0100 Subject: [aarch64-port-dev ] FW: [Mach5] mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED In-Reply-To: <57BB18E5-1F35-45AC-88BF-8D8C6A0767E3@amazon.com> References: <858sifgbt3.fsf@arm.com> <57BB18E5-1F35-45AC-88BF-8D8C6A0767E3@amazon.com> Message-ID: <56d85fb9-6ce8-7c00-ff4f-bf3fe40718a4@redhat.com> On 4/30/20 12:43 AM, Liu, Xin wrote: > One trick here. It's very easy to cheat configure by hacking the boot-jdk.m4 to "$HEAD -n 2". Everything looks fine then. The fix should be submitted to build-dev. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From volker.simonis at gmail.com Thu Apr 30 14:45:03 2020 From: volker.simonis at gmail.com (Volker Simonis) Date: Thu, 30 Apr 2020 16:45:03 +0200 Subject: [aarch64-port-dev ] RFR(XS): 8230552: Provide information when hitting a HaltNode for architectures other than x86 In-Reply-To: <19BC4D2D-56F3-45BE-898C-1389469A7B36@amazon.com> References: <19BC4D2D-56F3-45BE-898C-1389469A7B36@amazon.com> Message-ID: Forwarding to ppc-aix and s390 port mailing lists with the kind request for testing this simple fix on the corresponding platforms. Thank you and best regards, Volker Liu, Xin schrieb am Do., 30. Apr. 2020, 08:39: > > > ?On 4/29/20, 11:06 PM, "Pengfei Li" wrote: > > > > Hi Xin, > > > I tested on aarch64. It generates the same crash report as x86_64 > when it > > does hit HaltNode. Halt reason is displayed. I paste report on the > JBS. > > I ran hotspot:tier1 on aarch64 fastdebug build. It passed except > for 3 > > relevant failures[1]. > > (NOT a reviewer) The original instruction used should be dcps1 instead > of dpcs1 - there's a misspelling in AArch64 assembler. Could you add a > trivial fix to change dpcs1/2/3 to dcps1/2/3? > > Oh, I don't know that. I did search dpcs and found nothing. > I've filed a new issue about the typo thing: JDK-8244170. Let's resolve > it in separated issue. > > BTW, how did you test to hit the HaltNode? > -- > Thanks, > Pengfei > > I followed Christian and Volkers' recipe on JDK-8230552. Both of them can > generate HaltNode. > Volker's approach is very interesting. You have to give program a couple > of "-XX:SuppressErrorAt=" to increase tolerance. 
>
> Thanks,
> --lx
>
>
>

From xxinliu at amazon.com Thu Apr 30 21:48:07 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Thu, 30 Apr 2020 21:48:07 +0000
Subject: [aarch64-port-dev ] FW: [Mach5] mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <56d85fb9-6ce8-7c00-ff4f-bf3fe40718a4@redhat.com>
References: <858sifgbt3.fsf@arm.com> <57BB18E5-1F35-45AC-88BF-8D8C6A0767E3@amazon.com> <56d85fb9-6ce8-7c00-ff4f-bf3fe40718a4@redhat.com>
Message-ID: <28A344AC-8FF4-49FB-96B9-6AC886C05930@amazon.com>

Hi, Andrew,

That's a hack. A more general approach would use grep or sed to capture the needed line instead of hardcoding the first or second line.
Okay, let me try that.

Thanks,
--lx

On 4/30/20, 1:19 AM, "aph at redhat.com" wrote:

    On 4/30/20 12:43 AM, Liu, Xin wrote:
    > One trick here. It's very easy to cheat configure by hacking the boot-jdk.m4 to "$HEAD -n 2". Everything looks fine then.

    The fix should be submitted to build-dev.

    --
    Andrew Haley (he/him)
    Java Platform Lead Engineer
    Red Hat UK Ltd.
    https://keybase.io/andrewhaley
    EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From xxinliu at amazon.com Thu Apr 30 23:48:05 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Thu, 30 Apr 2020 23:48:05 +0000
Subject: [aarch64-port-dev ] FW: [Mach5] mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <28A344AC-8FF4-49FB-96B9-6AC886C05930@amazon.com>
References: <858sifgbt3.fsf@arm.com> <57BB18E5-1F35-45AC-88BF-8D8C6A0767E3@amazon.com> <56d85fb9-6ce8-7c00-ff4f-bf3fe40718a4@redhat.com> <28A344AC-8FF4-49FB-96B9-6AC886C05930@amazon.com>
Message-ID: <9725176C-5E5A-499B-8093-A5865C4AC443@amazon.com>

Hi, Andrew,

How about this? I can use awk to capture the version line of java -version. There are 2 cases.
1) openjdk
openjdk version "14.0.1" 2020-04-14
2) oraclejdk
java 14.0.1 2020-04-14

If java somehow displays error or warning messages, awk can filter them out and capture the version line. E.g.:

$ ~/builds/jdk-14.0.1+7/bin/java -version
[0.009s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096
openjdk version "14.0.1" 2020-04-14
OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode)

$ ~/builds/jdk-14.0.1+7/bin/java -version 2>&1 | awk '/^(openjdk version|java)/ {print $0}'
openjdk version "14.0.1" 2020-04-14

I think this awk statement is portable, but it's always good to ask experts to review it, so I cc build-dev.

Here is the change.

diff --git a/make/autoconf/boot-jdk.m4 b/make/autoconf/boot-jdk.m4
--- a/make/autoconf/boot-jdk.m4
+++ b/make/autoconf/boot-jdk.m4
@@ -74,7 +74,7 @@
 BOOT_JDK_FOUND=no
 else
 # Oh, this is looking good! We probably have found a proper JDK. Is it the correct version?
-BOOT_JDK_VERSION=`"$BOOT_JDK/bin/java$EXE_SUFFIX" $USER_BOOT_JDK_OPTIONS -version 2>&1 | $HEAD -n 1`
+BOOT_JDK_VERSION=`"$BOOT_JDK/bin/java$EXE_SUFFIX" $USER_BOOT_JDK_OPTIONS -version 2>&1 | $AWK '/^(openjdk version|java)/ {print [$]0}'`
 if [ [[ "$BOOT_JDK_VERSION" =~ "Picked up" ]] ]; then
 AC_MSG_NOTICE([You have _JAVA_OPTIONS or JAVA_TOOL_OPTIONS set. This can mess up the build. Please use --with-boot-jdk-jvmargs instead.])
 AC_MSG_NOTICE([Java reports: "$BOOT_JDK_VERSION".])
@@ -529,7 +529,7 @@
 BUILD_JDK_FOUND=no
 else
 # Oh, this is looking good! We probably have found a proper JDK. Is it the correct version?
-BUILD_JDK_VERSION=`"$BUILD_JDK/bin/java" -version 2>&1 | $HEAD -n 1`
+BUILD_JDK_VERSION=`"$BUILD_JDK/bin/java" -version 2>&1 | $AWK '/^(openjdk version|java)/ {print [$]0}'`
 
 # Extra M4 quote needed to protect [] in grep expression.
 [FOUND_CORRECT_VERSION=`echo $BUILD_JDK_VERSION | $EGREP "\"$VERSION_FEATURE([\.+-].*)?\""`]

On 4/30/20, 2:52 PM, "aarch64-port-dev on behalf of Liu, Xin" wrote:

    Hi, Andrew,

    That's a hack. A more general approach would use grep or sed to capture the needed line instead of hardcoding the first or second line.
    Okay, let me try that.

    Thanks,
    --lx

    On 4/30/20, 1:19 AM, "aph at redhat.com" wrote:

        On 4/30/20 12:43 AM, Liu, Xin wrote:
        > One trick here. It's very easy to cheat configure by hacking the boot-jdk.m4 to "$HEAD -n 2". Everything looks fine then.

        The fix should be submitted to build-dev.

        --
        Andrew Haley (he/him)
        Java Platform Lead Engineer
        Red Hat UK Ltd.
        https://keybase.io/andrewhaley
        EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
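
The awk filter proposed in the patch can be tried outside configure with a small shell sketch. This is only an illustration under stated assumptions: the function name extract_version_line and the variable sample are hypothetical and do not appear in boot-jdk.m4, and the "exit" after the first match is an addition that the patch itself does not contain. In plain shell the awk field is written $0; the [$]0 spelling in the patch is only needed to protect it from M4.

```shell
#!/bin/sh
# Hypothetical helper (not part of boot-jdk.m4): read `java -version`
# output on stdin and print the line that looks like a version banner.
extract_version_line() {
  # Same pattern as in the proposed patch. The "exit" is an addition made
  # here so that at most one line is returned even if several lines match.
  awk '/^(openjdk version|java)/ { print $0; exit }'
}

# Sample output copied from the transcript in the message above.
sample='[0.009s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096
openjdk version "14.0.1" 2020-04-14
OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode)'

# The CDS error line is skipped and only the version banner is printed.
printf '%s\n' "$sample" | extract_version_line
```

One point that may be worth raising in review: a line such as "Picked up _JAVA_OPTIONS: ..." does not match this pattern either, so with the filter in place the existing `=~ "Picked up"` check on BOOT_JDK_VERSION would apparently never see that text. The bare "java" alternative is also fairly loose, since it matches any line that starts with "java".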