From gnu.andrew at redhat.com  Wed Apr  1 01:22:17 2020
From: gnu.andrew at redhat.com (Andrew Hughes)
Date: Wed, 1 Apr 2020 02:22:17 +0100
Subject: [aarch64-port-dev ] [RFR] [8u] 8u252-b08 Upstream Sync
Message-ID: <68d1f2ac-c6e6-0a4a-3fd1-620a84e9f7aa@redhat.com>

Webrevs: https://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/

Merge changesets:
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/corba/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jaxp/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jaxws/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jdk/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/hotspot/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/langtools/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/nashorn/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/root/merge.changeset

Changes in aarch64-shenandoah-jdk8u252-b08:
  - S8241296: Segfault in JNIHandleBlock::oops_do()
  - S8241307: Marlin renderer should not be the default in 8u252

Main issues of note:
One HotSpot change applied cleanly, no merge work.

diffstat for root
 b/.hgtags |    1 +
 1 file changed, 1 insertion(+)

diffstat for corba
 b/.hgtags |    1 +
 1 file changed, 1 insertion(+)

diffstat for jaxp
 b/.hgtags |    1 +
 1 file changed, 1 insertion(+)

diffstat for jaxws
 b/.hgtags |    1 +
 1 file changed, 1 insertion(+)

diffstat for langtools
 b/.hgtags |    1 +
 1 file changed, 1 insertion(+)

diffstat for nashorn
 b/.hgtags |    1 +
 1 file changed, 1 insertion(+)

diffstat for jdk
 b/.hgtags                                                                                 |    1
 b/src/share/classes/sun/java2d/pisces/META-INF/services/sun.java2d.pipe.RenderingEngine   |    7 +
 b/src/solaris/classes/sun/java2d/pisces/META-INF/services/sun.java2d.pipe.RenderingEngine |    9 +-
 b/test/sun/java2d/marlin/DefaultRenderingEngine.java                                      |   42 ++++++++++
 4 files changed, 54 insertions(+), 5 deletions(-)

diffstat for hotspot
 b/.hgtags                         |    1 +
 b/src/share/vm/runtime/thread.cpp |    4 +++-
 2 files changed, 4 insertions(+), 1 deletion(-)

Successfully built on x86, x86_64, s390, s390x, ppc, ppc64,
ppc64le & aarch64.

Ok to push?

Thanks,
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222


From Pengfei.Li at arm.com  Wed Apr  1 02:05:04 2020
From: Pengfei.Li at arm.com (Pengfei Li)
Date: Wed, 1 Apr 2020 02:05:04 +0000
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing
 support for PopCountVI node
In-Reply-To: <2ce24736-9b5c-5c23-bfde-14067d6d6b0d@redhat.com>
References: <DB8PR08MB49699764FA887647D8495C5396C80@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <2ce24736-9b5c-5c23-bfde-14067d6d6b0d@redhat.com>
Message-ID: <DB8PR08MB496930232C57100B12D55E9896C90@DB8PR08MB4969.eurprd08.prod.outlook.com>

Hi Andrew,

Thanks for the review.

>    INSN(absr,   0, 0b100000101110, 1); // accepted arrangements: T8B, T16B, T4H, T8H,      T4S
> -  INSN(negr,   1, 0b100000101110, 2); // accepted arrangements: T8B, T16B, T4H, T8H, T2S, T4S, T2D
> 
> is actually related to some other work you are doing?

This change is related to
-    if (accepted < 2) guarantee(T != T2S && T != T2D, "incorrect arrangement");         \
-    if (accepted == 0) guarantee(T == T8B || T == T16B, "incorrect arrangement");       \
+    if (accepted < 3) guarantee(T != T2D, "incorrect arrangement");                     \
+    if (accepted < 2) guarantee(T != T2S, "incorrect arrangement");                     \
+    if (accepted < 1) guarantee(T == T8B || T == T16B, "incorrect arrangement");        \

Before my patch, the candidate values of "accepted" are 0, 1 and 2 meaning different accepted arrangements as below:
0 - Only T8B and T16B are accepted
1 - All arrangements but T2S and T2D are accepted
2 - All arrangements are accepted

In my patch, the newly added instruction UADDLP supports T2S but doesn't support T2D. So I changed the value range to 0 - 3, where 3 means all arrangements are accepted now. That's why the value for parameter "accepted" of NEGR is promoted from 2 to 3 now.
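
As an illustration, the new thresholds can be sketched as a standalone C++ function. This is only a simplified sketch of the three guarantee() checks in the diff above, not the HotSpot macro itself: the enum below is a reduced stand-in for the assembler's arrangement type, and the function name arrangement_accepted is made up for this example.

```cpp
#include <cassert>

// Reduced, hypothetical stand-in for the assembler's SIMD arrangement enum.
enum Arrangement { T8B, T16B, T4H, T8H, T2S, T4S, T2D };

// Returns true if arrangement T is allowed for an instruction declared with
// the given "accepted" level (0..3), following the patched checks:
//   0 - only T8B and T16B
//   1 - everything except T2S and T2D
//   2 - everything except T2D (e.g. UADDLP)
//   3 - all arrangements (e.g. NEGR)
bool arrangement_accepted(Arrangement T, int accepted) {
  assert(accepted >= 0 && accepted <= 3);
  if (accepted < 3 && T == T2D) return false;
  if (accepted < 2 && T == T2S) return false;
  if (accepted < 1 && !(T == T8B || T == T16B)) return false;
  return true;
}
```

With this encoding, arrangement_accepted(T2S, 2) is true while arrangement_accepted(T2D, 2) is false, which is the case the old 0..2 range could not express.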

--
Thanks,
Pengfei


From aph at redhat.com  Wed Apr  1 08:54:52 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 1 Apr 2020 09:54:52 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing
 support for PopCountVI node
In-Reply-To: <DB8PR08MB496930232C57100B12D55E9896C90@DB8PR08MB4969.eurprd08.prod.outlook.com>
References: <DB8PR08MB49699764FA887647D8495C5396C80@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <2ce24736-9b5c-5c23-bfde-14067d6d6b0d@redhat.com>
 <DB8PR08MB496930232C57100B12D55E9896C90@DB8PR08MB4969.eurprd08.prod.outlook.com>
Message-ID: <b0272c8c-2f0a-9a7c-bca8-1c33a4aa691d@redhat.com>

On 4/1/20 3:05 AM, Pengfei Li wrote:
> In my patch, the newly added instruction UADDLP supports T2S but doesn't support T2D. So I changed the value range to 0 - 3, where 3 means all arrangements are accepted now. That's why the value for parameter "accepted" of NEGR is promoted from 2 to 3 now.

I see. OK, thanks.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From stumon01 at arm.com  Wed Apr  1 09:29:02 2020
From: stumon01 at arm.com (Stuart Monteith)
Date: Wed, 1 Apr 2020 10:29:02 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241587: Aarch64: remove x86 specifics
 from os_linux.cpp/hpp/inline.hpp
Message-ID: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com>

Hello,
        This patch removes a couple of x86 specifics from aarch64 code. Tested with hotspot tier1.

Webrev:
        http://cr.openjdk.java.net/~smonteith/8241587/webrev.0/
Bug:
        https://bugs.openjdk.java.net/browse/JDK-8241587

Thanks,
        Stuart

IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

From david.holmes at oracle.com  Wed Apr  1 10:03:32 2020
From: david.holmes at oracle.com (David Holmes)
Date: Wed, 1 Apr 2020 20:03:32 +1000
Subject: [aarch64-port-dev ] RFR(S): 8241587: Aarch64: remove x86
 specifics from os_linux.cpp/hpp/inline.hpp
In-Reply-To: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com>
References: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com>
Message-ID: <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com>

Hi Stuart,

On 1/04/2020 7:29 pm, Stuart Monteith wrote:
> Hello,
>          This patch removes a couple of x86 specifics from aarch64 code. Tested with hotspot tier1.
> 
> Webrev:
>          http://cr.openjdk.java.net/~smonteith/8241587/webrev.0/
> Bug:
>          https://bugs.openjdk.java.net/browse/JDK-8241587

That clean up seems good to me.

> Thanks,
>          Stuart
> 
> IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other person, use it for any purpose, or store or copy the information in any medium. Thank you.

That footer seems inappropriate for OpenJDK emails.

Cheers,
David


From stumon01 at arm.com  Wed Apr  1 10:09:40 2020
From: stumon01 at arm.com (Stuart Monteith)
Date: Wed, 1 Apr 2020 11:09:40 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241587: Aarch64: remove x86
 specifics from os_linux.cpp/hpp/inline.hpp
In-Reply-To: <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com>
References: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com>
 <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com>
Message-ID: <a2b0e31b-ad4b-3bf4-91fa-4b9d285d32f9@arm.com>

On 01/04/2020 11:03, David Holmes wrote:
> Hi Stuart,
>
> On 1/04/2020 7:29 pm, Stuart Monteith wrote:
>> Hello,
>>          This patch removes a couple of x86 specifics from aarch64 code. Tested with hotspot tier1.
>>
>> Webrev:
>>          http://cr.openjdk.java.net/~smonteith/8241587/webrev.0/
>> Bug:
>>          https://bugs.openjdk.java.net/browse/JDK-8241587
>
> That clean up seems good to me.
>

Thanks.

>> Thanks,
>>          Stuart
>>
>> IMPORTANT NOTICE: The contents of this email and any attachments are confidential and may also be privileged. If you
>> are not the intended recipient, please notify the sender immediately and do not disclose the contents to any other
>> person, use it for any purpose, or store or copy the information in any medium. Thank you.
>
> That footer seems inappropriate for OpenJDK emails.
>

Apologies - dismiss that. I'd ordinarily send the email from my other machine.

> Cheers,
> David
>

From shade at redhat.com  Wed Apr  1 11:55:03 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 1 Apr 2020 13:55:03 +0200
Subject: [aarch64-port-dev ] [RFR] [8u] 8u252-b08 Upstream Sync
In-Reply-To: <68d1f2ac-c6e6-0a4a-3fd1-620a84e9f7aa@redhat.com>
References: <68d1f2ac-c6e6-0a4a-3fd1-620a84e9f7aa@redhat.com>
Message-ID: <82a66f85-76d4-c5b1-a11f-136a0a949095@redhat.com>

On 4/1/20 3:22 AM, Andrew Hughes wrote:
> Merge changesets:
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/corba/merge.changeset
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jaxp/merge.changeset
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jaxws/merge.changeset
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/jdk/merge.changeset
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/hotspot/merge.changeset
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/langtools/merge.changeset
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/nashorn/merge.changeset
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b08/root/merge.changeset

All look good.

> Ok to push?

Yes, please.

-- 
Thanks,
-Aleksey


From gnu.andrew at redhat.com  Wed Apr  1 16:50:53 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Wed, 01 Apr 2020 16:50:53 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah: 3 new
	changesets
Message-ID: <202004011650.031Gorev018534@aojmv0008.oracle.com>

Changeset: e8b56e0eaa7b
Author:    andrew
Date:      2020-03-27 05:14 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/rev/e8b56e0eaa7b

Added tag jdk8u252-b08 for changeset 72a6d93679e5

! .hgtags

Changeset: 259807b2eafc
Author:    andrew
Date:      2020-03-27 06:04 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/rev/259807b2eafc

Merge jdk8u252-b08

! .hgtags

Changeset: 83b10c54af07
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/rev/83b10c54af07

Added tag aarch64-shenandoah-jdk8u252-b08 for changeset 259807b2eafc

! .hgtags


From gnu.andrew at redhat.com  Wed Apr  1 16:51:03 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Wed, 01 Apr 2020 16:51:03 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/corba: 3 new
	changesets
Message-ID: <202004011651.031Gp35P018686@aojmv0008.oracle.com>

Changeset: 9340b3be1b47
Author:    andrew
Date:      2020-03-27 05:14 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/corba/rev/9340b3be1b47

Added tag jdk8u252-b08 for changeset 63738d15bb7f

! .hgtags

Changeset: 81baca88f8b3
Author:    andrew
Date:      2020-03-27 06:04 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/corba/rev/81baca88f8b3

Merge jdk8u252-b08

! .hgtags

Changeset: 8fad3e09ebcf
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/corba/rev/8fad3e09ebcf

Added tag aarch64-shenandoah-jdk8u252-b08 for changeset 81baca88f8b3

! .hgtags


From gnu.andrew at redhat.com  Wed Apr  1 16:51:12 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Wed, 01 Apr 2020 16:51:12 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/jaxp: 3 new
	changesets
Message-ID: <202004011651.031GpCbw018812@aojmv0008.oracle.com>

Changeset: 8476d78dc695
Author:    andrew
Date:      2020-03-27 05:14 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/8476d78dc695

Added tag jdk8u252-b08 for changeset d1a8fb9aafdd

! .hgtags

Changeset: 0e8735595b62
Author:    andrew
Date:      2020-03-27 06:04 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/0e8735595b62

Merge jdk8u252-b08

! .hgtags

Changeset: 878d3aa22258
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/878d3aa22258

Added tag aarch64-shenandoah-jdk8u252-b08 for changeset 0e8735595b62

! .hgtags


From gnu.andrew at redhat.com  Wed Apr  1 16:51:21 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Wed, 01 Apr 2020 16:51:21 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/jaxws: 3 new
	changesets
Message-ID: <202004011651.031GpLhD019359@aojmv0008.oracle.com>

Changeset: b012193ff452
Author:    andrew
Date:      2020-03-27 05:14 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxws/rev/b012193ff452

Added tag jdk8u252-b08 for changeset 7e334946a044

! .hgtags

Changeset: e2725620cdbd
Author:    andrew
Date:      2020-03-27 06:04 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxws/rev/e2725620cdbd

Merge jdk8u252-b08

! .hgtags

Changeset: 5f4c415b6acc
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxws/rev/5f4c415b6acc

Added tag aarch64-shenandoah-jdk8u252-b08 for changeset e2725620cdbd

! .hgtags


From gnu.andrew at redhat.com  Wed Apr  1 16:51:30 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Wed, 01 Apr 2020 16:51:30 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/langtools: 3
	new changesets
Message-ID: <202004011651.031GpUwR019430@aojmv0008.oracle.com>

Changeset: 01036da3155c
Author:    andrew
Date:      2020-03-27 05:14 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/01036da3155c

Added tag jdk8u252-b08 for changeset c56eceecec71

! .hgtags

Changeset: 4cb8441f6bf5
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/4cb8441f6bf5

Merge jdk8u252-b08

! .hgtags

Changeset: a6ed6d713d38
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/a6ed6d713d38

Added tag aarch64-shenandoah-jdk8u252-b08 for changeset 4cb8441f6bf5

! .hgtags


From gnu.andrew at redhat.com  Wed Apr  1 16:51:38 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Wed, 01 Apr 2020 16:51:38 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/hotspot: 4
	new changesets
Message-ID: <202004011651.031GpcRe019568@aojmv0008.oracle.com>

Changeset: 8f2780b3e4fa
Author:    aph
Date:      2020-03-25 03:20 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/8f2780b3e4fa

8241296: Segfault in JNIHandleBlock::oops_do()
Reviewed-by: andrew

! src/share/vm/runtime/thread.cpp

Changeset: 095e60e7fc8c
Author:    andrew
Date:      2020-03-27 05:14 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/095e60e7fc8c

Added tag jdk8u252-b08 for changeset 8f2780b3e4fa

! .hgtags

Changeset: 2668bab1293c
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/2668bab1293c

Merge jdk8u252-b08

! .hgtags
! src/share/vm/runtime/thread.cpp

Changeset: ac1d2acb1e7d
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/ac1d2acb1e7d

Added tag aarch64-shenandoah-jdk8u252-b08 for changeset 2668bab1293c

! .hgtags


From gnu.andrew at redhat.com  Wed Apr  1 16:51:47 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Wed, 01 Apr 2020 16:51:47 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/jdk: 4 new
	changesets
Message-ID: <202004011651.031Gpm8v019764@aojmv0008.oracle.com>

Changeset: e17fe591a374
Author:    lbourges
Date:      2020-03-25 03:53 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/e17fe591a374

8241307: Marlin renderer should not be the default in 8u252
Reviewed-by: phh, alexsch, andrew, sgehwolf

! src/share/classes/sun/java2d/pisces/META-INF/services/sun.java2d.pipe.RenderingEngine
! src/solaris/classes/sun/java2d/pisces/META-INF/services/sun.java2d.pipe.RenderingEngine
+ test/sun/java2d/marlin/DefaultRenderingEngine.java

Changeset: da301ecaa81d
Author:    andrew
Date:      2020-03-27 05:14 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/da301ecaa81d

Added tag jdk8u252-b08 for changeset e17fe591a374

! .hgtags

Changeset: c38803f8a50b
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/c38803f8a50b

Merge jdk8u252-b08

! .hgtags

Changeset: 2f1b1489f97f
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/2f1b1489f97f

Added tag aarch64-shenandoah-jdk8u252-b08 for changeset c38803f8a50b

! .hgtags


From gnu.andrew at redhat.com  Wed Apr  1 16:51:56 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Wed, 01 Apr 2020 16:51:56 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/nashorn: 3
	new changesets
Message-ID: <202004011651.031Gpube020370@aojmv0008.oracle.com>

Changeset: 5fc91c4182b0
Author:    andrew
Date:      2020-03-27 05:14 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/nashorn/rev/5fc91c4182b0

Added tag jdk8u252-b08 for changeset 95d61d0f326b

! .hgtags

Changeset: d9fdfa71788f
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/nashorn/rev/d9fdfa71788f

Merge jdk8u252-b08

! .hgtags

Changeset: ebb6de4f5fb3
Author:    andrew
Date:      2020-03-27 06:05 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/nashorn/rev/ebb6de4f5fb3

Added tag aarch64-shenandoah-jdk8u252-b08 for changeset d9fdfa71788f

! .hgtags


From nick.gasson at arm.com  Thu Apr  2 01:48:40 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Thu, 02 Apr 2020 09:48:40 +0800
Subject: [aarch64-port-dev ] Question about JVM option
	"-XX:+UseBarriersForVolatile" usage in aarch64.
In-Reply-To: <f8eab48b-2228-ee35-8fba-1160a6e8044e@redhat.com>
References: <VI1PR08MB53284E28F472E868A7DFF61BF5CB0@VI1PR08MB5328.eurprd08.prod.outlook.com>
 <b01765c8-4f77-a380-4f80-56f83fabccb1@redhat.com>
 <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <f8eab48b-2228-ee35-8fba-1160a6e8044e@redhat.com>
Message-ID: <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com>

On 03/30/20 18:43 pm, Andrew Haley wrote:
>
> I remember an early stepping where STLR/LDAR weren't sequentially
> consistent, so it was necessary to generate explicit DMBs. I doubt
> that parts with this bug ever reached the market.
>
> Having said that, maybe someone is still using one. It might be worth
> correcting UseBarriersForVolatile and making the flag diagnostic only.
> Having said that, the entire C library uses these instructions.
> Opinions?

I checked glibc and the Linux kernel and couldn't find any workaround
like this. Presumably they'd both be affected.

I suggest if Derek can confirm that bug never made it into a production
part then we should completely remove UseBarriersForVolatile. Maybe with
a warning at startup if we detect that CPU variant. It adds a lot of
complexity to the volatile implementation for no clear benefit.


Thanks,
Nick

From adinn at redhat.com  Thu Apr  2 13:22:29 2020
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 2 Apr 2020 14:22:29 +0100
Subject: [aarch64-port-dev ] Question about JVM option
 "-XX:+UseBarriersForVolatile" usage in aarch64.
In-Reply-To: <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
References: <VI1PR08MB53284E28F472E868A7DFF61BF5CB0@VI1PR08MB5328.eurprd08.prod.outlook.com>
 <b01765c8-4f77-a380-4f80-56f83fabccb1@redhat.com>
 <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <f8eab48b-2228-ee35-8fba-1160a6e8044e@redhat.com>
 <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
Message-ID: <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com>

On 02/04/2020 02:48, Nick Gasson wrote:
> I suggest if Derek can confirm that bug never made it into a production
> part then we should completely remove UseBarriersForVolatile. Maybe with
> a warning at startup if we detect that CPU variant. It adds a lot of
> complexity to the volatile implementation for no clear benefit.
One reason for having this switch was to provide a comparator for our
scheme to implement the Java volatile accesses using ldar/stlr. That
translation scheme avoids a dmb after the stlr allowing the value being
written to be committed lazily while still providing the critical
guarantee that prior writes are committed before it gets committed. The
switch ensures we can fall back to a 'reference' implementation based on
dmbs that, amongst other things, enforces immediate commit of the
volatile write after commit of its predecessors.

By removing support we lose the ability to test cases where
synchronization errors occur with our scheme by switching to the
'standard' model. That may still be useful for finding bugs (current or
newly injected) in our translation and, indeed, in new HW. Andrew Haley
was not suggesting removing this option. He simply talked about making
it a diagnostic option. I think that might be a wiser choice.
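
For readers less familiar with the two schemes, a rough C++ analogy may help. It is only a sketch under stated assumptions, not the code HotSpot generates: the Mailbox type and the function names are invented for illustration, and the instruction mapping in the comments (seq_cst atomics to ldar/stlr, seq_cst fences to dmb) is what GCC and Clang typically emit on AArch64 rather than something the language guarantees.

```cpp
#include <atomic>
#include <cassert>

// Hypothetical example type: 'ready' plays the role of a Java volatile field,
// 'payload' is ordinary data written before the flag is set.
struct Mailbox {
  int payload = 0;
  std::atomic<bool> ready{false};
};

// Scheme 1: sequentially consistent atomic accesses.
// Typically lowered to stlr (store) and ldar (load), with no separate dmb.
void publish_seq_cst(Mailbox& m, int value) {
  m.payload = value;
  m.ready.store(true, std::memory_order_seq_cst);
}

bool try_consume_seq_cst(const Mailbox& m, int& out) {
  if (!m.ready.load(std::memory_order_seq_cst)) return false;
  out = m.payload;
  return true;
}

// Scheme 2: plain (relaxed) accesses bracketed by full fences.
// Typically lowered to str/ldr plus dmb instructions, which is closer to the
// barrier-based 'reference' implementation.
void publish_with_fences(Mailbox& m, int value) {
  m.payload = value;
  std::atomic_thread_fence(std::memory_order_seq_cst);  // order prior writes before the flag
  m.ready.store(true, std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);  // order the flag before later accesses
}

bool try_consume_with_fences(const Mailbox& m, int& out) {
  if (!m.ready.load(std::memory_order_relaxed)) return false;
  std::atomic_thread_fence(std::memory_order_seq_cst);  // order the flag read before the payload read
  out = m.payload;
  return true;
}
```

Both variants give a reader that observes the flag a consistent view of the payload; they differ in how early the flag store itself has to become visible, which is the trade-off described above.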

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill


From adinn at redhat.com  Thu Apr  2 13:43:20 2020
From: adinn at redhat.com (Andrew Dinn)
Date: Thu, 2 Apr 2020 14:43:20 +0100
Subject: [aarch64-port-dev ] Question about JVM option
 "-XX:+UseBarriersForVolatile" usage in aarch64.
In-Reply-To: <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com>
References: <VI1PR08MB53284E28F472E868A7DFF61BF5CB0@VI1PR08MB5328.eurprd08.prod.outlook.com>
 <b01765c8-4f77-a380-4f80-56f83fabccb1@redhat.com>
 <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <f8eab48b-2228-ee35-8fba-1160a6e8044e@redhat.com>
 <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com>
Message-ID: <aa953f1a-4915-0a24-ee86-43250d31b435@redhat.com>

On 02/04/2020 14:22, Andrew Dinn wrote:
> On 02/04/2020 02:48, Nick Gasson wrote:
>> I suggest if Derek can confirm that bug never made it into a production
>> part then we should completely remove UseBarriersForVolatile. Maybe with
>> a warning at startup if we detect that CPU variant. It adds a lot of
>> complexity to the volatile implementation for no clear benefit.
> One reason for having this switch was to provide a comparator for our
> scheme to implement the Java volatile accesses using ldar/stlr. That
> translation scheme avoids a dmb after the stlr allowing the value being
> written to be committed lazily while still providing the critical
> guarantee that prior writes are committed before it gets committed. The
> switch ensures we can fall back to a 'reference' implementation based on
> dmbs that, amongst other things, enforces immediate commit of the
> volatile write after commit of its predecessors.
> 
> By removing support we lose the ability to test cases where
> synchronization errors occur with our scheme by switching to the
> 'standard' model. That may still be useful for finding bugs (current or
> newly injected) in our translation and, indeed, in new HW. Andrew Haley
> was not suggesting removing this option. He simply talked about making
> it a diagnostic option. I think that might be a wiser choice.
Correction: he did actually raise the question of whether to remove it,
after recommending making it diagnostic. My vote for making it
diagnostic still stands.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill


From aph at redhat.com  Thu Apr  2 13:46:23 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 2 Apr 2020 14:46:23 +0100
Subject: [aarch64-port-dev ] Question about JVM option
 "-XX:+UseBarriersForVolatile" usage in aarch64.
In-Reply-To: <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com>
References: <VI1PR08MB53284E28F472E868A7DFF61BF5CB0@VI1PR08MB5328.eurprd08.prod.outlook.com>
 <b01765c8-4f77-a380-4f80-56f83fabccb1@redhat.com>
 <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <f8eab48b-2228-ee35-8fba-1160a6e8044e@redhat.com>
 <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com>
Message-ID: <3a12226e-1762-ea3c-ff99-2ebce2ebb69c@redhat.com>

On 4/2/20 2:22 PM, Andrew Dinn wrote:
> On 02/04/2020 02:48, Nick Gasson wrote:
>> I suggest if Derek can confirm that bug never made it into a production
>> part then we should completely remove UseBarriersForVolatile. Maybe with
>> a warning at startup if we detect that CPU variant. It adds a lot of
>> complexity to the volatile implementation for no clear benefit.
>
> One reason for having this switch was to provide a comparator for our
> scheme to implement the Java volatile accesses using ldar/stlr. That
> translation scheme avoids a dmb after the stlr allowing the value being
> written to be committed lazily while still providing the critical
> guarantee that prior writes are committed before it gets committed. The
> switch ensures we can fall back to a 'reference' implementation based on
> dmbs that, amongst other things, enforces immediate commit of the
> volatile write after commit of its predecessors.
>
> By removing support we lose the ability to test cases where
> synchronization errors occur with our scheme by switching to the
> 'standard' model.

Right, so I guess you're saying that we should keep it because there
may be bugs in our ldar/stlr code. I can think of no other reason.

> That may still be useful for finding bugs (current or newly
> injected) in our translation and, indeed, in new HW. Andrew Haley
> was not suggesting removing this option. He simply talked about
> making it a diagnostic option. I think that might be a wiser choice.

I'm fairly sure I said perhaps we could nuke it. I am strongly of the
opinion that rarely-used code behind runtime switches tends to rot, and
that any rotten part of the ship tends to spread.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From nick.gasson at arm.com  Fri Apr  3 02:03:30 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Fri, 03 Apr 2020 10:03:30 +0800
Subject: [aarch64-port-dev ] Question about JVM option
	"-XX:+UseBarriersForVolatile" usage in aarch64.
In-Reply-To: <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com>
References: <VI1PR08MB53284E28F472E868A7DFF61BF5CB0@VI1PR08MB5328.eurprd08.prod.outlook.com>
 <b01765c8-4f77-a380-4f80-56f83fabccb1@redhat.com>
 <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <f8eab48b-2228-ee35-8fba-1160a6e8044e@redhat.com>
 <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com>
Message-ID: <85imih1g25.fsf@nicgas01-03-arm-vm.shanghai.arm.com>

On 04/02/20 21:22 pm, Andrew Dinn wrote:
> One reason for having this switch was to provide a comparator for our
> scheme to implement the Java volatile accesses using ldar/stlr. That
> translation scheme avoids a dmb after the stlr allowing the value being
> written to be committed lazily while still providing the critical
> guarantee that prior writes are committed before it gets committed. The
> switch ensures we can fall back to a 'reference' implementation based on
> dmbs that, amongst other things, enforces immediate commit of the
> volatile write after commit of its predecessors.
>
> By removing support we lose the ability to test cases where
> synchronization errors occur with our scheme by switching to the
> 'standard' model. That may still be useful for finding bugs (current or
> newly injected) in our translation and, indeed, in new HW.

OK, but keeping it is not without cost. If UseBarriersForVolatile is to
have value as a reference implementation we need to expend effort to
test it and fix any bugs that arise from changes to other parts of the
code (see Xiaohong's original mail).


Thanks,
Nick

From ningsheng.jian at arm.com  Fri Apr  3 02:30:18 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Fri, 3 Apr 2020 10:30:18 +0800
Subject: [aarch64-port-dev ] Question about JVM option
 "-XX:+UseBarriersForVolatile" usage in aarch64.
In-Reply-To: <85imih1g25.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
References: <VI1PR08MB53284E28F472E868A7DFF61BF5CB0@VI1PR08MB5328.eurprd08.prod.outlook.com>
 <b01765c8-4f77-a380-4f80-56f83fabccb1@redhat.com>
 <85r1xap2u8.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <f8eab48b-2228-ee35-8fba-1160a6e8044e@redhat.com>
 <85k12y1wuf.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <6bb21c20-aea8-4ba5-d0f1-e71439b16592@redhat.com>
 <85imih1g25.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
Message-ID: <af9b18da-3357-346d-84c1-8e94a497afa9@arm.com>

On 4/3/20 10:03 AM, Nick Gasson wrote:
> On 04/02/20 21:22 pm, Andrew Dinn wrote:
>> One reason for having this switch was to provide a comparator for our
>> scheme to implement the Java volatile accesses using ldar/stlr. That
>> translation scheme avoids a dmb after the stlr allowing the value being
>> written to be committed lazily while still providing the critical
>> guarantee that prior writes are committed before it gets committed. The
>> switch ensures we can fall back to a 'reference' implementation based on
>> dmbs that, amongst other things, enforces immediate commit of the
>> volatile write after commit of its predecessors.
>>
>> By removing support we lose the ability to test cases where
>> synchronization errors occur with our scheme by switching to the
>> 'standard' model. That may still be useful for finding bugs (current or
>> newly injected) in our translation and, indeed, in new HW.
> 
> OK, but keeping it is not without cost. If UseBarriersForVolatile is to
> have value as a reference implementation we need to expend effort to
> test it and fix any bugs that arise from changes to other parts of the
> code (see Xiaohong's original mail).
> 
> 

Yes, if the "reference" implementation is not widely used and tested, it
might be buggy and misleading. I know that when Xiaohong was working on
a similar part of Graal [1], she spent a lot of time tracing the
UseBarriersForVolatile issue in the HotSpot VM. So I agree with Nick and
tend towards not maintaining this implementation.


[1] https://github.com/oracle/graal/pull/2181

Thanks,
Ningsheng

From ningsheng.jian at arm.com  Fri Apr  3 02:41:04 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Fri, 3 Apr 2020 10:41:04 +0800
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing
 support for PopCountVI node
In-Reply-To: <DB8PR08MB49699764FA887647D8495C5396C80@DB8PR08MB4969.eurprd08.prod.outlook.com>
References: <DB8PR08MB49699764FA887647D8495C5396C80@DB8PR08MB4969.eurprd08.prod.outlook.com>
Message-ID: <110347ce-0629-c5ff-d072-080094570f09@arm.com>

Hi Pengfei,

On 3/31/20 5:32 PM, Pengfei Li wrote:
> Hi,
> 
> Please help review this another missing node support for AArch64.
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8241475
> Webrev: http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.01/
> 

Just took a close look before pushing your code, and I think this line 
can be removed?

+  effect(TEMP_DEF dst);

Thanks,
Ningsheng

From Pengfei.Li at arm.com  Fri Apr  3 05:48:05 2020
From: Pengfei.Li at arm.com (Pengfei Li)
Date: Fri, 3 Apr 2020 05:48:05 +0000
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing
 support for PopCountVI node
In-Reply-To: <110347ce-0629-c5ff-d072-080094570f09@arm.com>
References: <DB8PR08MB49699764FA887647D8495C5396C80@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <110347ce-0629-c5ff-d072-080094570f09@arm.com>
Message-ID: <DB8PR08MB4969A351FA2AE7ACD9DD698E96C70@DB8PR08MB4969.eurprd08.prod.outlook.com>

Hi,

> Just took a close look before pushing your code, and I think this line can be
> removed?
> 
> +  effect(TEMP_DEF dst);

Yes, thanks for pointing that out. It is redundant since I don't use temps this time.

I've updated and rebased the patch. See http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.02/

--
Thanks,
Pengfei


From aph at redhat.com  Fri Apr  3 08:56:34 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 3 Apr 2020 09:56:34 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing
 support for PopCountVI node
In-Reply-To: <DB8PR08MB4969A351FA2AE7ACD9DD698E96C70@DB8PR08MB4969.eurprd08.prod.outlook.com>
References: <DB8PR08MB49699764FA887647D8495C5396C80@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <110347ce-0629-c5ff-d072-080094570f09@arm.com>
 <DB8PR08MB4969A351FA2AE7ACD9DD698E96C70@DB8PR08MB4969.eurprd08.prod.outlook.com>
Message-ID: <d4fca5fe-b5c1-c448-5e4d-40541f5d5f46@redhat.com>

On 4/3/20 6:48 AM, Pengfei Li wrote:
> Yes, thanks for pointing out. It is redundant since I don't use temps this time.
> 
> I've updated and rebased the patch. See http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.02/

Please push.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From ningsheng.jian at arm.com  Fri Apr  3 09:11:15 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Fri, 3 Apr 2020 17:11:15 +0800
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing
 support for PopCountVI node
In-Reply-To: <d4fca5fe-b5c1-c448-5e4d-40541f5d5f46@redhat.com>
References: <DB8PR08MB49699764FA887647D8495C5396C80@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <110347ce-0629-c5ff-d072-080094570f09@arm.com>
 <DB8PR08MB4969A351FA2AE7ACD9DD698E96C70@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <d4fca5fe-b5c1-c448-5e4d-40541f5d5f46@redhat.com>
Message-ID: <6c0bcfbd-118c-3fa7-96f7-7e832314a05c@arm.com>

On 4/3/20 4:56 PM, Andrew Haley wrote:
> On 4/3/20 6:48 AM, Pengfei Li wrote:
>> Yes, thanks for pointing out. It is redundant since I don't use temps this time.
>>
>> I've updated and rebased the patch. See http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.02/
> 
> Please push.
> 

Pushed.

Thanks,
Ningsheng

From adinn at redhat.com  Fri Apr  3 09:13:40 2020
From: adinn at redhat.com (Andrew Dinn)
Date: Fri, 3 Apr 2020 10:13:40 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing
 support for PopCountVI node
In-Reply-To: <110347ce-0629-c5ff-d072-080094570f09@arm.com>
References: <DB8PR08MB49699764FA887647D8495C5396C80@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <110347ce-0629-c5ff-d072-080094570f09@arm.com>
Message-ID: <b2e80cd9-49ea-3f9e-f2ca-9b36cdc5166b@redhat.com>

On 03/04/2020 03:41, Ningsheng Jian wrote:
> Hi Pengfei,
> 
> On 3/31/20 5:32 PM, Pengfei Li wrote:
>> Hi,
>>
>> Please help review this another missing node support for AArch64.
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8241475
>> Webrev: http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.01/
>>
> 
> Just took a close look before pushing your code, and I think this line
> can be removed?
> 
> +  effect(TEMP_DEF dst);
Strictly, I think this is correct but I don't think it matters.

I believe this usage is meant to identify a case where a generated
multi-instruction sequence uses the output register (i.e. dst = target
of Set) both as an output in the final instruction and as an
intermediate scratch register in intervening instructions. That is the
case for both these rules.

The only way that might make a difference is if the back end were able
to interleave instructions in other generated sequences with the
instructions generated by this rule during instruction scheduling (or,
say, via peephole rules). However, I don't believe that can happen given
the current adlc code and AArch64 rules.

n.b. there are several other examples of TEMP_DEF use in aarch64.ad. I
am not sure that they are the only ones where a dst register is used as
both output and intermediary (we will only find out by carefully
eyeballing every rule).

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill


From aph at redhat.com  Fri Apr  3 09:22:30 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 3 Apr 2020 10:22:30 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing
 support for PopCountVI node
In-Reply-To: <b2e80cd9-49ea-3f9e-f2ca-9b36cdc5166b@redhat.com>
References: <DB8PR08MB49699764FA887647D8495C5396C80@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <110347ce-0629-c5ff-d072-080094570f09@arm.com>
 <b2e80cd9-49ea-3f9e-f2ca-9b36cdc5166b@redhat.com>
Message-ID: <9b007363-0380-3d6a-8df6-f0afca4c50d5@redhat.com>

On 4/3/20 10:13 AM, Andrew Dinn wrote:
> On 03/04/2020 03:41, Ningsheng Jian wrote:
>> Hi Pengfei,
>>
>> On 3/31/20 5:32 PM, Pengfei Li wrote:
>>> Hi,
>>>
>>> Please help review this another missing node support for AArch64.
>>>
>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8241475
>>> Webrev: http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.01/
>>>
>>
>> Just took a close look before pushing your code, and I think this line
>> can be removed?
>>
>> +  effect(TEMP_DEF dst);
> Strictly, I think this is correct but I don't think it matters.
> 
> I believe this usage is meant to identify a case where a generated
> multi-instruction sequence uses the output register (i.e. dst = target
> of Set) both as an output in the final instruction and as an
> intermediate scratch register in intervening instructions. That is the
> case for both these rules.

More simply, it prevents the situation where the same register is used as both
an output and an input. With these patterns that doesn't matter.
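
A toy model may make the aliasing concern concrete. This is a sketch under
stated assumptions: RegFile and both sequences are hypothetical and are not
adlc output or rules from aarch64.ad. The first sequence reads its input
only in the first instruction, so it does not care whether dst and src
share a register; the second reads src again after dst has been written,
which is the situation TEMP_DEF is there to rule out.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

// Toy register file, passed by value so each call starts from the same state.
using RegFile = std::array<std::uint32_t, 8>;

// Sequence A: src is read only by the first instruction.
//   dst = ~src; dst = dst + 1        (two's-complement negate)
std::uint32_t negate_seq(RegFile r, std::size_t dst, std::size_t src) {
    assert(dst < r.size() && src < r.size());
    r[dst] = ~r[src];
    r[dst] = r[dst] + 1u;
    return r[dst];
}

// Sequence B: src is read again after dst has been written.
//   dst = src << 1; dst = dst + src  (intended result: 3 * src)
std::uint32_t triple_seq(RegFile r, std::size_t dst, std::size_t src) {
    assert(dst < r.size() && src < r.size());
    r[dst] = r[src] << 1;
    r[dst] = r[dst] + r[src];         // reads the clobbered value if dst == src
    return r[dst];
}
```

With register 1 holding 5, negate_seq gives the same answer whether dst is
register 0 or register 1, while triple_seq returns 15 with distinct
registers and 20 when dst and src are the same register.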

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From ningsheng.jian at arm.com  Fri Apr  3 10:00:38 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Fri, 3 Apr 2020 18:00:38 +0800
Subject: [aarch64-port-dev ] RFR(S): 8241475: AArch64: Add missing
 support for PopCountVI node
In-Reply-To: <9b007363-0380-3d6a-8df6-f0afca4c50d5@redhat.com>
References: <DB8PR08MB49699764FA887647D8495C5396C80@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <110347ce-0629-c5ff-d072-080094570f09@arm.com>
 <b2e80cd9-49ea-3f9e-f2ca-9b36cdc5166b@redhat.com>
 <9b007363-0380-3d6a-8df6-f0afca4c50d5@redhat.com>
Message-ID: <34dcff53-5afc-29c2-6086-e0d66882026c@arm.com>

On 4/3/20 5:22 PM, Andrew Haley wrote:
> On 4/3/20 10:13 AM, Andrew Dinn wrote:
>> On 03/04/2020 03:41, Ningsheng Jian wrote:
>>> Hi Pengfei,
>>>
>>> On 3/31/20 5:32 PM, Pengfei Li wrote:
>>>> Hi,
>>>>
>>>> Please help review this another missing node support for AArch64.
>>>>
>>>> JBS: https://bugs.openjdk.java.net/browse/JDK-8241475
>>>> Webrev: http://cr.openjdk.java.net/~pli/rfr/8241475/webrev.01/
>>>>
>>>
>>> Just took a close look before pushing your code, and I think this line
>>> can be removed?
>>>
>>> +  effect(TEMP_DEF dst);
>> Strictly, I think this is correct but I don't think it matters.
>>
>> I believe this usage is meant to identify a case where a generated
>> multi-instruction sequence uses the output register (i.e. dst = target
>> of Set) both as an output in the final instruction and as an
>> intermediate scratch register in intervening instructions. That is the
>> case for both these rules.
> 
> More simply, it prevents the situation where the same register is used as both
> an output and an input. With these patterns that doesn't matter.
> 

Yeah, in this code block dst and src do not need to be different regs.

Thanks,
Ningsheng

From Yang.Zhang at arm.com  Fri Apr  3 10:49:06 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Fri, 3 Apr 2020 10:49:06 +0000
Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential
 register clash issue in reduce_add2I
Message-ID: <VI1PR0802MB2558190831829ACF49F38C4A8EC70@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi,

Could you please help to review this patch?

In the original reduce_add2I, dst may be the same register as tmp2, which can produce an incorrect result.
The code formats of some reduction operation instructs are also cleaned up.
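
The clash can be pictured with a small model. This is a sketch only: the
instruction shape below (two lane extracts followed by two adds) is an
assumption made for illustration, and the names Regs, Vec2,
reduce_add2_clobbering and reduce_add2_safe are hypothetical. The webrev is
the authoritative description of the rule and of the fix.

```cpp
#include <array>
#include <cassert>
#include <cstddef>
#include <cstdint>

using Regs = std::array<std::uint32_t, 8>;  // toy general-purpose registers
using Vec2 = std::array<std::uint32_t, 2>;  // toy 2-lane int vector

// Assumed problematic shape: the first add writes dst while tmp2 is still live.
std::uint32_t reduce_add2_clobbering(Regs r, std::size_t dst, std::size_t src1,
                                     Vec2 v, std::size_t tmp, std::size_t tmp2) {
    assert(dst < r.size() && src1 < r.size() && tmp < r.size() && tmp2 < r.size());
    r[tmp]  = v[0];                  // extract lane 0 into tmp
    r[tmp2] = v[1];                  // extract lane 1 into tmp2
    r[dst]  = r[src1] + r[tmp];      // overwrites tmp2 if dst == tmp2
    r[dst]  = r[dst] + r[tmp2];      // then adds the clobbered value
    return r[dst];
}

// One possible ordering that tolerates dst == tmp2: accumulate in tmp and
// write dst only in the final instruction.
std::uint32_t reduce_add2_safe(Regs r, std::size_t dst, std::size_t src1,
                               Vec2 v, std::size_t tmp, std::size_t tmp2) {
    assert(dst < r.size() && src1 < r.size() && tmp < r.size() && tmp2 < r.size());
    r[tmp]  = v[0];
    r[tmp2] = v[1];
    r[tmp]  = r[src1] + r[tmp];
    r[dst]  = r[tmp] + r[tmp2];
    return r[dst];
}
```

With src1 holding 10 and lanes {3, 4}, the expected sum is 17. In this
model the clobbering order returns 26 when dst and tmp2 are the same
register, while the second ordering still returns 17.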

JBS: https://bugs.openjdk.java.net/browse/JDK-8241911
Webrev: http://cr.openjdk.java.net/~yzhang/8241911/webrev.00/


Regards
Yang


From ci_notify at linaro.org  Fri Apr  3 14:22:43 2020
From: ci_notify at linaro.org (ci_notify at linaro.org)
Date: Fri, 3 Apr 2020 14:22:43 +0000 (UTC)
Subject: [aarch64-port-dev ] Linaro OpenJDK AArch64 jdk/jdk build 3262
	Failure
Message-ID: <116463607.14192.1585923764564.JavaMail.javamailuser@localhost>

OpenJDK AArch64 jdk/jdk build status is Failure
Build details -  https://ci.linaro.org/job/jdkX-ci-build/3262/

Changes -
  clanger: e940fc8b419408cb00fa5ceb6e598ea9bc40e233 
	- src/jdk.internal.le/share/classes/jdk/internal/org/jline/reader/ConfigurationPath.java
	- src/jdk.internal.le/share/classes/jdk/internal/org/jline/reader/ScriptEngine.java 
--"8242030: Wrong package declarations in jline classes after JDK-8241598
Reviewed-by: jlahoda
"  dfuchs: 70175514ffa1343ec75a9cb610e1adf4fd35adda 
	- src/java.base/macosx/classes/java/net/DefaultInterface.java
	- test/jdk/java/net/MulticastSocket/SetLoopbackMode.java
	- test/jdk/java/net/MulticastSocket/SetLoopbackModeIPv4.java
	- test/jdk/java/net/MulticastSocket/SetOutgoingIf.java
	- test/jdk/java/nio/channels/DatagramChannel/AdaptorMulticasting.java
	- test/jdk/java/nio/channels/DatagramChannel/MulticastSendReceiveTests.java
	- test/jdk/java/nio/channels/DatagramChannel/Promiscuous.java
	- test/lib/jdk/test/lib/NetworkConfiguration.java 
--"8241786: Improve heuristic to determine default network interface on macOS
Summary: DefaultInetrface.getDefault is updated to prefer interfaces that have non link-local addresses. NetworkConfiguration is updated to skip interface that have only link-local addresses, whether IPv4 or IPv6, for multicasting.
Reviewed-by: chegar, alanb
"  rkennke: d8d2145c205ca1bda888ebb0834fc39693bca2b7 
	- src/hotspot/cpu/x86/gc/shenandoah/shenandoahBarrierSetAssembler_x86.cpp
	- src/hotspot/share/gc/shared/gcCause.cpp
	- src/hotspot/share/gc/shared/gcCause.hpp
	- src/hotspot/share/gc/shenandoah/c1/shenandoahBarrierSetC1.cpp
	- src/hotspot/share/gc/shenandoah/c2/shenandoahBarrierSetC2.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahAsserts.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahBarrierSet.inline.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahBarrierSetClone.inline.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahClosures.inline.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahConcurrentMark.inline.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahConcurrentRoots.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahControlThread.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahFreeSet.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahHeap.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahHeap.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahHeap.inline.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahHeapRegion.inline.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahHeapRegionCounters.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahHeapRegionCounters.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahMarkCompact.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahOopClosures.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahOopClosures.inline.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahPacer.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahPacer.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahPhaseTimings.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahRootProcessor.inline.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahUtils.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahUtils.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahVMOperations.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahVMOperations.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahVerifier.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahVerifier.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahWorkerPolicy.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahWorkerPolicy.hpp
	- src/hotspot/share/gc/shenandoah/shenandoah_globals.hpp
	- src/hotspot/share/runtime/vmOperations.hpp
	- test/hotspot/jtreg/compiler/c2/aarch64/TestVolatiles.java
	- test/hotspot/jtreg/gc/CriticalNativeArgs.java
	- test/hotspot/jtreg/gc/shenandoah/TestAllocHumongousFragment.java
	- test/hotspot/jtreg/gc/shenandoah/TestAllocIntArrays.java
	- test/hotspot/jtreg/gc/shenandoah/TestAllocObjectArrays.java
	- test/hotspot/jtreg/gc/shenandoah/TestAllocObjects.java
	- test/hotspot/jtreg/gc/shenandoah/TestGCThreadGroups.java
	- test/hotspot/jtreg/gc/shenandoah/TestHeapUncommit.java
	- test/hotspot/jtreg/gc/shenandoah/TestLotsOfCycles.java
	- test/hotspot/jtreg/gc/shenandoah/TestObjItrWithHeapDump.java
	- test/hotspot/jtreg/gc/shenandoah/TestPeriodicGC.java
	- test/hotspot/jtreg/gc/shenandoah/TestRefprocSanity.java
	- test/hotspot/jtreg/gc/shenandoah/TestRegionSampling.java
	- test/hotspot/jtreg/gc/shenandoah/TestRetainObjects.java
	- test/hotspot/jtreg/gc/shenandoah/TestSieveObjects.java
	- test/hotspot/jtreg/gc/shenandoah/TestStringDedup.java
	- test/hotspot/jtreg/gc/shenandoah/TestStringDedupStress.java
	- test/hotspot/jtreg/gc/shenandoah/TestStringInternCleanup.java
	- test/hotspot/jtreg/gc/shenandoah/TestVerifyJCStress.java
	- test/hotspot/jtreg/gc/shenandoah/TestWrongArrayMember.java
	- test/hotspot/jtreg/gc/shenandoah/mxbeans/TestChurnNotifications.java
	- test/hotspot/jtreg/gc/shenandoah/mxbeans/TestPauseNotifications.java
	- test/hotspot/jtreg/gc/shenandoah/oom/TestClassLoaderLeak.java
	- test/hotspot/jtreg/gc/shenandoah/options/TestExplicitGC.java
	- test/hotspot/jtreg/gc/shenandoah/options/TestHeuristicsUnlock.java
	- test/hotspot/jtreg/gc/shenandoah/options/TestSelectiveBarrierFlags.java
	- test/hotspot/jtreg/gc/shenandoah/options/TestWrongBarrierDisable.java
	- test/hotspot/jtreg/gc/stress/CriticalNativeStress.java
	- test/hotspot/jtreg/gc/stress/gcbasher/TestGCBasherWithShenandoah.java
	- test/hotspot/jtreg/gc/stress/gcold/TestGCOldWithShenandoah.java
	- test/hotspot/jtreg/gc/stress/systemgc/TestSystemGCWithShenandoah.java
	- src/hotspot/share/gc/shenandoah/heuristics/shenandoahTraversalAggressiveHeuristics.cpp
	- src/hotspot/share/gc/shenandoah/heuristics/shenandoahTraversalAggressiveHeuristics.hpp
	- src/hotspot/share/gc/shenandoah/heuristics/shenandoahTraversalHeuristics.cpp
	- src/hotspot/share/gc/shenandoah/heuristics/shenandoahTraversalHeuristics.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahTraversalGC.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahTraversalGC.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahTraversalGC.inline.hpp
	- src/hotspot/share/gc/shenandoah/shenandoahTraversalMode.cpp
	- src/hotspot/share/gc/shenandoah/shenandoahTraversalMode.hpp 
--"8242082: Shenandoah: Purge Traversal mode
Reviewed-by: shade
"

Build output -
   Compiling 94 files for jdk.xml.dom
   Compiling 14 files for jdk.zipfs
   Compiling 15 files for java.prefs
   Compiling 30 files for java.security.sasl
   Compiling 131 files for java.rmi
   Compiling 77 files for java.sql
   Note: Some input files use or override a deprecated API that is marked for removal.
   Note: Recompile with -Xlint:removal for details.
   Compiling 138 files for BUILD_NASGEN
   Compiling 15 files for jdk.attach
   Compiling 74 files for jdk.crypto.cryptoki
   Running nasgen
   Compiling 134 files for jdk.jdeps
   Compiling 221 files for jdk.javadoc
   Compiling 40 files for jdk.jcmd
   Compiling 251 files for jdk.jdi
   Compiling 11 files for jdk.jstatd
   Compiling 14 files for jdk.management.jfr
   Compiling 188 files for jdk.rmic
   Compiling 11 files for jdk.scripting.nashorn.shell
   Note: Some input files use or override a deprecated API that is marked for removal.
   Note: Recompile with -Xlint:removal for details.
   Note: Some input files use or override a deprecated API.
   Note: Recompile with -Xlint:deprecation for details.
   Compiling 197 files for java.naming
   Compiling 83 files for jdk.jlink
   Compiling 2781 files for java.desktop
   Compiling 94 files for jdk.jshell
   Compiling 16 files for jdk.naming.dns
   Compiling 8 files for jdk.naming.rmi
   Compiling 16 files for java.management.rmi
   Compiling 219 files for java.security.jgss
   Compiling 56 files for java.sql.rowset
   Compiling 31 files for jdk.management.agent
   Compiling 30 files for jdk.security.auth
   Compiling 16 files for jdk.security.jgss
   Compiling 1686 files for jdk.internal.vm.compiler
   Compiling 108 files for jdk.aot
   Compiling 68 files for COMPILE_CREATE_SYMBOLS
   Creating ct.sym classes
   Compiling 3 files for jdk.internal.vm.compiler.management
   Compiling 64 files for jdk.jconsole
   Compiling 8 files for jdk.unsupported.desktop
   Updating support/src.zip
   Creating support/symbols/ct.sym
   Compiling 1 files for java.se
   Compiling 18 files for jdk.accessibility
   Compiling 3 files for jdk.editpad
   Compiling 1004 files for jdk.hotspot.agent
   Note: Some input files use or override a deprecated API.
   Note: Recompile with -Xlint:deprecation for details.
   Note: Some input files use unchecked or unsafe operations.
   Note: Recompile with -Xlint:unchecked for details.
   Compiling 47 files for jdk.incubator.jpackage
   /home/buildslave/workspace/jdkX-ci-build/jdkX/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp: In member function 'void ShenandoahBarrierSetAssembler::generate_c1_pre_barrier_runtime_stub(StubAssembler*)':
   /home/buildslave/workspace/jdkX-ci-build/jdkX/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp:619:47: error: 'TRAVERSAL' is not a member of 'ShenandoahHeap'
      __ mov(rscratch2, ShenandoahHeap::MARKING | ShenandoahHeap::TRAVERSAL);
                                                  ^
   At global scope:
   cc1plus: error: unrecognized command line option '-Wno-cast-function-type' [-Werror]
   cc1plus: error: unrecognized command line option '-Wno-misleading-indentation' [-Werror]
   cc1plus: error: unrecognized command line option '-Wno-implicit-fallthrough' [-Werror]
   cc1plus: error: unrecognized command line option '-Wno-int-in-bool-context' [-Werror]
   cc1plus: all warnings being treated as errors
   lib/CompileJvm.gmk:181: recipe for target '/home/buildslave/workspace/jdkX-ci-build/build/hotspot/variant-server/libjvm/objs/shenandoahBarrierSetAssembler_aarch64.o' failed
   make[3]: *** [/home/buildslave/workspace/jdkX-ci-build/build/hotspot/variant-server/libjvm/objs/shenandoahBarrierSetAssembler_aarch64.o] Error 1
   make[3]: *** Waiting for unfinished jobs....
   make/Main.gmk:252: recipe for target 'hotspot-server-libs' failed
   make[2]: *** [hotspot-server-libs] Error 1
   
   ERROR: Build failed for target 'images' in configuration '/home/buildslave/workspace/jdkX-ci-build/build' (exit code 2) 
   
   === Output from failing command(s) repeated here ===
   * For target hotspot_variant-server_libjvm_objs_shenandoahBarrierSetAssembler_aarch64.o:
   /home/buildslave/workspace/jdkX-ci-build/jdkX/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp: In member function 'void ShenandoahBarrierSetAssembler::generate_c1_pre_barrier_runtime_stub(StubAssembler*)':
   /home/buildslave/workspace/jdkX-ci-build/jdkX/src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp:619:47: error: 'TRAVERSAL' is not a member of 'ShenandoahHeap'
      __ mov(rscratch2, ShenandoahHeap::MARKING | ShenandoahHeap::TRAVERSAL);
                                                  ^
   At global scope:
   cc1plus: error: unrecognized command line option '-Wno-cast-function-type' [-Werror]
   cc1plus: error: unrecognized command line option '-Wno-misleading-indentation' [-Werror]
   cc1plus: error: unrecognized command line option '-Wno-implicit-fallthrough' [-Werror]
   cc1plus: error: unrecognized command line option '-Wno-int-in-bool-context' [-Werror]
   cc1plus: all warnings being treated as errors
   
   * All command lines available in /home/buildslave/workspace/jdkX-ci-build/build/make-support/failure-logs.
   === End of repeated output ===
   
   === Make failed targets repeated here ===
   lib/CompileJvm.gmk:181: recipe for target '/home/buildslave/workspace/jdkX-ci-build/build/hotspot/variant-server/libjvm/objs/shenandoahBarrierSetAssembler_aarch64.o' failed
   make/Main.gmk:252: recipe for target 'hotspot-server-libs' failed
   === End of repeated output ===
   
   Hint: Try searching the build log for the name of the first failed target.
   Hint: See doc/building.html#troubleshooting for assistance.
   
   /home/buildslave/workspace/jdkX-ci-build/jdkX/make/Init.gmk:307: recipe for target 'main' failed
   make[1]: *** [main] Error 1
   /home/buildslave/workspace/jdkX-ci-build/jdkX/make/Init.gmk:186: recipe for target 'images' failed
   make: *** [images] Error 2

From ci_notify at linaro.org  Fri Apr  3 18:31:38 2020
From: ci_notify at linaro.org (ci_notify at linaro.org)
Date: Fri, 3 Apr 2020 18:31:38 +0000 (UTC)
Subject: [aarch64-port-dev ] Linaro OpenJDK AArch64 jdk/jdk build 3266 Fixed
Message-ID: <1893770877.14203.1585938700152.JavaMail.javamailuser@localhost>

OpenJDK AArch64 jdk/jdk build status is Fixed
Build details -  https://ci.linaro.org/job/jdkX-ci-build/3266/

Changes -
  joehw: a2126bc7fab76661cc503743e9b11fd243765b2f 
	- test/jaxp/javax/xml/jaxp/unittest/transform/ResultTest.java
	- src/java.xml/share/classes/com/sun/org/apache/xalan/internal/xsltc/trax/SAX2StAXBaseWriter.java
	- src/java.xml/share/classes/com/sun/org/apache/xalan/internal/xsltc/trax/SAX2StAXEventWriter.java
	- src/java.xml/share/classes/com/sun/org/apache/xalan/internal/xsltc/trax/SAX2StAXStreamWriter.java 
--"8238183: SAX2StAXStreamWriter cannot deal with comments prior to the root element
Reviewed-by: naoto, lancea
"  rkennke: 4c277b7a598a2051efa3332ee9406f61206f2381 
	- src/hotspot/cpu/aarch64/gc/shenandoah/shenandoahBarrierSetAssembler_aarch64.cpp 
--"8242107: Shenandoah: Fix aarch64 build after JDK-8242082
Reviewed-by: shade
"

Build output -
   Creating java.security.jgss.jmod
   Creating java.security.sasl.jmod
   Creating java.smartcardio.jmod
   Creating java.sql.jmod
   Creating java.sql.rowset.jmod
   Creating java.transaction.xa.jmod
   Creating java.xml.jmod
   Creating java.xml.crypto.jmod
   Creating jdk.accessibility.jmod
   Creating jdk.aot.jmod
   Creating jdk.attach.jmod
   Creating jdk.charsets.jmod
   Creating jdk.compiler.jmod
   Creating jdk.crypto.cryptoki.jmod
   Creating jdk.dynalink.jmod
   Creating jdk.editpad.jmod
   Creating jdk.crypto.ec.jmod
   Creating jdk.httpserver.jmod
   Creating jdk.hotspot.agent.jmod
   Creating jdk.incubator.foreign.jmod
   Creating jdk.incubator.jpackage.jmod
   Creating jdk.internal.ed.jmod
   Creating jdk.internal.jvmstat.jmod
   Creating jdk.internal.le.jmod
   Creating jdk.internal.opt.jmod
   Creating jdk.internal.vm.ci.jmod
   Creating jdk.internal.vm.compiler.jmod
   Creating jdk.internal.vm.compiler.management.jmod
   Creating jdk.jartool.jmod
   Creating jdk.javadoc.jmod
   Creating jdk.jcmd.jmod
   Creating jdk.jconsole.jmod
   Creating jdk.jdeps.jmod
   Creating jdk.jdi.jmod
   Creating jdk.jdwp.agent.jmod
   Creating jdk.jfr.jmod
   Creating jdk.jshell.jmod
   Creating jdk.jsobject.jmod
   Creating jdk.jstatd.jmod
   Creating jdk.localedata.jmod
   Creating jdk.management.jmod
   Creating jdk.management.agent.jmod
   Creating jdk.management.jfr.jmod
   Creating jdk.naming.dns.jmod
   Creating jdk.naming.rmi.jmod
   Creating jdk.nio.mapmode.jmod
   Creating jdk.net.jmod
   Creating jdk.rmic.jmod
   Creating jdk.scripting.nashorn.jmod
   Creating jdk.scripting.nashorn.shell.jmod
   Creating jdk.security.auth.jmod
   Creating jdk.sctp.jmod
   Creating jdk.security.jgss.jmod
   Creating jdk.unsupported.jmod
   Creating jdk.unsupported.desktop.jmod
   Creating jdk.xml.dom.jmod
   Creating jdk.zipfs.jmod
   Creating interim jimage
   Compiling 3 files for BUILD_DEMO_CodePointIM
   Updating support/demos/image/jfc/CodePointIM/src.zip
   Compiling 3 files for BUILD_DEMO_FileChooserDemo
   Updating support/demos/image/jfc/FileChooserDemo/src.zip
   Compiling 29 files for BUILD_DEMO_SwingSet2
   Updating support/demos/image/jfc/SwingSet2/src.zip
   Compiling 3 files for BUILD_DEMO_Font2DTest
   Updating support/demos/image/jfc/Font2DTest/src.zip
   Compiling 64 files for BUILD_DEMO_J2Ddemo
   Updating support/demos/image/jfc/J2Ddemo/src.zip
   Compiling 15 files for BUILD_DEMO_Metalworks
   Updating support/demos/image/jfc/Metalworks/src.zip
   Compiling 2 files for BUILD_DEMO_Notepad
   Updating support/demos/image/jfc/Notepad/src.zip
   Compiling 5 files for BUILD_DEMO_Stylepad
   Updating support/demos/image/jfc/Stylepad/src.zip
   Compiling 5 files for BUILD_DEMO_SampleTree
   Updating support/demos/image/jfc/SampleTree/src.zip
   Compiling 8 files for BUILD_DEMO_TableExample
   Updating support/demos/image/jfc/TableExample/src.zip
   Compiling 1 files for BUILD_DEMO_TransparentRuler
   Updating support/demos/image/jfc/TransparentRuler/src.zip
   Creating support/demos/image/jfc/FileChooserDemo/FileChooserDemo.jar
   Creating support/demos/image/jfc/CodePointIM/CodePointIM.jar
   Creating support/demos/image/jfc/Font2DTest/Font2DTest.jar
   Creating support/demos/image/jfc/Metalworks/Metalworks.jar
   Creating support/demos/image/jfc/Notepad/Notepad.jar
   Creating support/demos/image/jfc/Stylepad/Stylepad.jar
   Creating support/demos/image/jfc/SampleTree/SampleTree.jar
   Creating support/demos/image/jfc/TableExample/TableExample.jar
   Creating support/demos/image/jfc/TransparentRuler/TransparentRuler.jar
   Creating support/demos/image/jfc/SwingSet2/SwingSet2.jar
   Compiling 1 files for CLASSLIST_JAR
   Creating support/demos/image/jfc/J2Ddemo/J2Ddemo.jar
   Creating support/classlist.jar
   Creating jdk.jlink.jmod
   Creating java.base.jmod
   Creating jdk image
   WARNING: Using incubator modules: jdk.incubator.foreign, jdk.incubator.jpackage
   Creating CDS archive for jdk image
   Stopping sjavac server
   Finished building target 'images' in configuration '/home/buildslave/workspace/jdkX-ci-build/build'

From nick.gasson at arm.com  Tue Apr  7 07:19:10 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Tue, 07 Apr 2020 15:19:10 +0800
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy
	pre-barrier if marking not active
Message-ID: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>

Hi,

Bug: https://bugs.openjdk.java.net/browse/JDK-8242029
Webrev: http://cr.openjdk.java.net/~ngasson/8242029/webrev.0/

Currently on AArch64 the G1GC array copy pre-barrier unconditionally
performs a VM call into G1BarrierSetRuntime, but this is a no-op if
marking is not in progress.

This patch adds a check to skip the call unless marking is in
progress. X86 already has this optimisation and I can't see a reason not
to do it on AArch64 as well.
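
In outline, the change has the following shape. This is a toy C++ model
rather than the patch itself: ToyThread, toy_pre_barrier_runtime and both
barrier functions are hypothetical names, and the real code is in the
webrev above.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for per-thread GC state; not a HotSpot type.
struct ToyThread {
    bool marking_active = false;  // models "concurrent marking in progress"
    int  runtime_calls  = 0;      // counts simulated calls into the runtime
};

// Models the runtime entry point that processes the old array contents.
// The only observable effect in this model is the call counter.
void toy_pre_barrier_runtime(ToyThread& t, std::size_t count) {
    (void)count;
    ++t.runtime_calls;
}

// Before: always pay for the call, even when it has nothing to do.
void pre_barrier_unconditional(ToyThread& t, std::size_t count) {
    toy_pre_barrier_runtime(t, count);
}

// After: test the flag first and skip the call when marking is inactive.
void pre_barrier_checked(ToyThread& t, std::size_t count) {
    if (!t.marking_active) {
        return;                   // fast path: no runtime call
    }
    toy_pre_barrier_runtime(t, count);
}
```

The checked variant behaves identically while marking is active and only
removes the call, along with its transition cost, in the common case where
it is not.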

Tested jtreg hotspot_all_no_apps, jdk_core.

Results of ArrayCopy.arrayCopyObject* JMH benchmarks:

Before:

  Benchmark                                    Mode  Cnt    Score    Error  Units
  ArrayCopy.arrayCopyObject                    avgt   15  117.307 ± 11.607  ns/op
  ArrayCopy.arrayCopyObjectNonConst            avgt   15  107.786 ±  2.692  ns/op
  ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15   76.381 ±  0.761  ns/op
  ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15   79.519 ±  3.433  ns/op

After:

  Benchmark                                    Mode  Cnt   Score   Error  Units
  ArrayCopy.arrayCopyObject                    avgt   15  86.161 ± 6.150  ns/op
  ArrayCopy.arrayCopyObjectNonConst            avgt   15  83.539 ± 0.682  ns/op
  ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15  52.388 ± 0.732  ns/op
  ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15  54.619 ± 1.278  ns/op
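
As a rough reading of the two tables, the relative improvement per
benchmark is (before - after) / before. The helper and array names below
are hypothetical, and the calculation ignores the reported error margins.

```cpp
#include <cassert>

// Relative reduction, in percent, between two average times in ns/op.
double percent_reduction(double before_ns, double after_ns) {
    assert(before_ns > 0.0);
    return (before_ns - after_ns) / before_ns * 100.0;
}

// Scores copied from the tables above, in row order:
// arrayCopyObject, NonConst, SameArraysBackward, SameArraysForward.
const double kBefore[4] = {117.307, 107.786, 76.381, 79.519};
const double kAfter[4]  = { 86.161,  83.539, 52.388, 54.619};
```

Evaluating percent_reduction(kBefore[i], kAfter[i]) for the four rows
gives reductions of roughly 22% to 31% in time per operation.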

The VM call overhead can be quite high on AArch64 as we insert a
serialising ISB instruction on every native->Java return.


Thanks,
Nick

From erik.osterlund at oracle.com  Tue Apr  7 07:33:23 2020
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 7 Apr 2020 09:33:23 +0200
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy
	pre-barrier if marking not active
In-Reply-To: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
Message-ID: <4F81F31F-CE0A-4514-A2F2-86566BB18007@oracle.com>

Hi Nick,

Note that you only need ISB when returning from call_VM, not call_VM_leaf. Leaf calls can't safepoint, and hence the ISB is redundant. Arraycopy uses leaf calls. So while this optimization is great for this case, maybe removing ISB from leaf calls has a wider effect.

It also appears to me that with Stuart's new nmethod entry barriers enabled, ISB is never required on returns, as oops are no longer embedded in the instruction stream then (which is what the ISB protects against).

Thanks,
/Erik

> On 7 Apr 2020, at 09:19, Nick Gasson <nick.gasson at arm.com> wrote:
> 
> Hi,
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8242029
> Webrev: http://cr.openjdk.java.net/~ngasson/8242029/webrev.0/
> 
> Currently on AArch64 the G1GC array copy pre-barrier unconditionally
> performs a VM call into G1BarrierSetRuntime, but this is a no-op if
> marking is not in progress.
> 
> This patch adds a check to skip the call unless marking is in
> progress. X86 already has this optimisation and I can't see a reason not
> to do it on AArch64 as well.
> 
> Tested jtreg hotspot_all_no_apps, jdk_core.
> 
> Results of ArrayCopy.arrayCopyObject* JMH benchmarks:
> 
> Before:
> 
>  Benchmark                                    Mode  Cnt    Score    Error  Units
>  ArrayCopy.arrayCopyObject                    avgt   15  117.307 ± 11.607  ns/op
>  ArrayCopy.arrayCopyObjectNonConst            avgt   15  107.786 ±  2.692  ns/op
>  ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15   76.381 ±  0.761  ns/op
>  ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15   79.519 ±  3.433  ns/op
> 
> After:
> 
>  Benchmark                                    Mode  Cnt   Score   Error  Units
>  ArrayCopy.arrayCopyObject                    avgt   15  86.161 ± 6.150  ns/op
>  ArrayCopy.arrayCopyObjectNonConst            avgt   15  83.539 ± 0.682  ns/op
>  ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15  52.388 ± 0.732  ns/op
>  ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15  54.619 ± 1.278  ns/op
> 
> The VM call overhead can be quite high on AArch64 as we insert a
> serialising ISB instruction on every native->Java return.
> 
> 
> Thanks,
> Nick
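
The filter described in the quoted patch can be sketched as a small portable C++ model. This is only an illustration of the idea, not the AArch64 stub code from the webrev: `ToyG1State`, `runtime_pre_barrier` and the two `arraycopy_pre_barrier_*` functions are hypothetical names, and the `marking_active` field merely stands in for whatever "marking in progress" state the real stub tests.

```cpp
#include <cassert>
#include <cstddef>

// Toy model, not HotSpot code: every name below is hypothetical.
struct ToyG1State {
  bool marking_active = false;  // stands in for "concurrent marking in progress"
  int  runtime_calls  = 0;      // counts simulated calls into the GC runtime
};

// Stands in for the comparatively expensive leaf call into the runtime.
inline void runtime_pre_barrier(ToyG1State& s, const void* /*dst*/, std::size_t /*count*/) {
  s.runtime_calls++;
}

// Behaviour before the patch: the call is made unconditionally.
inline void arraycopy_pre_barrier_unconditional(ToyG1State& s, const void* dst, std::size_t count) {
  runtime_pre_barrier(s, dst, count);
}

// Behaviour after the patch: skip the call unless marking is in progress.
inline void arraycopy_pre_barrier_filtered(ToyG1State& s, const void* dst, std::size_t count) {
  if (!s.marking_active) {
    return;  // the runtime call would have been a no-op anyway
  }
  runtime_pre_barrier(s, dst, count);
}

// Runs `copies` simulated array copies and reports how many runtime calls were made.
inline int count_runtime_calls(bool marking_active, bool filtered, int copies) {
  ToyG1State s;
  s.marking_active = marking_active;
  int dummy_array[4] = {0, 0, 0, 0};
  for (int i = 0; i < copies; i++) {
    if (filtered) {
      arraycopy_pre_barrier_filtered(s, dummy_array, 4);
    } else {
      arraycopy_pre_barrier_unconditional(s, dummy_array, 4);
    }
  }
  return s.runtime_calls;
}
```

In this model the filtered variant makes no runtime calls while marking is inactive and behaves exactly like the unconditional variant while marking is active, which is the property the patch relies on.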


From aph at redhat.com  Tue Apr  7 09:53:53 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 7 Apr 2020 10:53:53 +0100
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy
 pre-barrier if marking not active
In-Reply-To: <4F81F31F-CE0A-4514-A2F2-86566BB18007@oracle.com>
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <4F81F31F-CE0A-4514-A2F2-86566BB18007@oracle.com>
Message-ID: <1e05ef60-bcad-8ef4-75d9-26d56952e955@redhat.com>

On 4/7/20 8:33 AM, Erik Österlund wrote:

> Note that you only need ISB when returning from call_VM, not
> call_VM_leaf. Leaf calls can't safepoint, and hence the ISB is
> redundant. Arraycopy uses leaf calls. So while this optimization is
> great for this case, maybe removing ISB from leaf calls has a wider
> effect.
>
> It also appears to me that with Stuart's new nmethod entry barriers
> enabled, ISB is never required on returns, as oops are no longer
> embedded in the instruction stream then (which is what the ISB
> protects against).

That's probably true. In order to get it right for sure we'd need to
insert a bunch of assertions in debug mode. I'll have a look.

A (fairly) recent change to the ARM ARM (DDI 0487E, B2.3.5, Ordering
of instruction fetches, first para.) means that we no longer have to
be quite so paranoid about issuing ISBs. When call_VM was written the
specification of what might happen was so loose that it was almost
impossible to comply with.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Tue Apr  7 09:55:53 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 7 Apr 2020 10:55:53 +0100
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy
 pre-barrier if marking not active
In-Reply-To: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
Message-ID: <a97ee093-0490-59b5-f8a8-45ac1c6159ef@redhat.com>

On 4/7/20 8:19 AM, Nick Gasson wrote:

The patch looks good, thanks.

> This patch adds a check to skip the call unless marking is in
> progress. X86 already has this optimisation and I can't see a reason not
> to do it on AArch64 as well.

Indeed not. I don't quite remember the history of this, but I guess
that this optimization was added to x86 after we did AArch64.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From erik.osterlund at oracle.com  Tue Apr  7 10:58:39 2020
From: erik.osterlund at oracle.com (=?utf-8?Q?Erik_=C3=96sterlund?=)
Date: Tue, 7 Apr 2020 12:58:39 +0200
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy
	pre-barrier if marking not active
In-Reply-To: <1e05ef60-bcad-8ef4-75d9-26d56952e955@redhat.com>
References: <1e05ef60-bcad-8ef4-75d9-26d56952e955@redhat.com>
Message-ID: <BB53EF50-5A24-4DA6-B7F1-A5320BEBA404@oracle.com>

Hi,

I think the best place to trigger the ISB is where you clear the last java frame. That happens on the exit path of all calls that may safepoint.

The worry is code that explicitly saves and restores the last java frame, and uses leaf calls, instead of using call_VM. This solves that.
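
A minimal sketch of that placement, written as a toy C++ model rather than real MacroAssembler code. All names here (`ToyThread`, `set_last_java_frame`, `reset_last_java_frame`, `call_vm`, `call_vm_leaf`, `leaf_with_explicit_frame`) are hypothetical stand-ins; the only point is that tying the ISB to clearing the last Java frame covers both call_VM and code that manages the frame by hand around a leaf call, while plain leaf calls pay nothing.

```cpp
#include <cassert>

// Toy model, not HotSpot code: every name below is hypothetical.
struct ToyThread {
  bool has_last_java_frame = false;
  int  isb_count = 0;  // how many ISBs the model has "executed"
};

inline void set_last_java_frame(ToyThread& t) {
  t.has_last_java_frame = true;
}

// The single place where the model issues an ISB.
inline void reset_last_java_frame(ToyThread& t) {
  t.has_last_java_frame = false;
  t.isb_count++;
}

// Leaf call: cannot safepoint, never touches the last Java frame, no ISB.
template <typename F>
void call_vm_leaf(ToyThread&, F body) {
  body();
}

// Full VM call: may safepoint, so it records and later clears the frame.
template <typename F>
void call_vm(ToyThread& t, F body) {
  set_last_java_frame(t);
  body();
  reset_last_java_frame(t);
}

// The worrying case: code that saves and restores the frame explicitly
// and then uses a leaf call instead of call_vm.
template <typename F>
void leaf_with_explicit_frame(ToyThread& t, F body) {
  set_last_java_frame(t);
  call_vm_leaf(t, body);
  reset_last_java_frame(t);
}

// One call of each kind: only the two that clear the frame issue an ISB.
inline int isbs_for_one_call_of_each_kind() {
  ToyThread t;
  auto work = [] {};
  call_vm_leaf(t, work);
  call_vm(t, work);
  leaf_with_explicit_frame(t, work);
  return t.isb_count;
}
```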

Thanks,
/Erik

> On 7 Apr 2020, at 11:56, Andrew Haley <aph at redhat.com> wrote:
> 
> On 4/7/20 8:33 AM, Erik Österlund wrote:
> 
>> Note that you only need ISB when returning from call_VM, not
>> call_VM_leaf. Leaf calls can't safepoint, and hence the ISB is
>> redundant. Arraycopy uses leaf calls. So while this optimization is
>> great for this case, maybe removing ISB from leaf calls has a wider
>> effect.
>> 
>> It also appears to me that with Stuart's new nmethod entry barriers
>> enabled, ISB is never required on returns, as oops are no longer
>> embedded in the instruction stream then (which is what the ISB
>> protects against).
> 
> That's probably true. In order to get it right for sure we'd need to
> insert a bunch of assertions in debug mode. I'll have a look.
> 
> A (fairly) recent change to the ARM ARM (DDI 0487E, B2.3.5, Ordering
> of instruction fetches, first para.) means that we no longer have to
> be quite so paranoid about issuing ISBs. When call_VM was written the
> specification of what might happen was so loose that it was almost
> impossible to comply with.
> 
> -- 
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
> 


From aph at redhat.com  Tue Apr  7 11:52:38 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 7 Apr 2020 12:52:38 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
 Concurrent Class Unloading
In-Reply-To: <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com>
 <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com>
Message-ID: <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>

I notice that even after applying your patch we are still using embedded
OOPs in two places.

Here in aarch64.ad:

      if (rtype == relocInfo::oop_type) {
        __ movoop(dst_reg, (jobject)con, /*immediate*/true);
      }

and here in sharedRuntime_aarch64.cpp:

    //  load oop into a register
    __ movoop(c_rarg1,
              JNIHandles::make_local(method->method_holder()->java_mirror()),
              /*immediate*/true);

Why is this?

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Tue Apr  7 12:25:14 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 7 Apr 2020 13:25:14 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
 Concurrent Class Unloading
In-Reply-To: <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com>
 <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com>
 <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>
Message-ID: <bddf2410-760e-1f21-2ed3-ec5a43f29c60@redhat.com>

On 4/7/20 12:52 PM, Andrew Haley wrote:
> I notice that even after applying your patch we are still using embedded
> OOPs in two places.
> 
> Here in aarch64.ad:
> 
>       if (rtype == relocInfo::oop_type) {
>         __ movoop(dst_reg, (jobject)con, /*immediate*/true);
>       }
> 
> and here in sharedRuntime_aarch64.cpp:
> 
>     //  load oop into a register
>     __ movoop(c_rarg1,
>               JNIHandles::make_local(method->method_holder()->java_mirror()),
>               /*immediate*/true);
> 
> Why is this?

Ah, the second one is a handle, of course, and AFAIK handles don't move.
Having said that, the use of movoop on something that is the address of an
oop rather than an oop is odd, but it's done on other targets.

The C2 one is still suspect.
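
To spell out the handle-versus-oop distinction, here is a toy C++ model; it is not JNIHandles code, and `ToyObject`, `ToyOop`, `ToyHandle` and `ToyHeap` are hypothetical names. A handle is modelled as the address of a slot that holds the oop: when the collector moves the object it updates the slot, so the handle itself stays valid, whereas a raw oop copied out earlier (the analogue of an immediate baked into code) goes stale.

```cpp
#include <cassert>

// Toy model, not HotSpot code: every name below is hypothetical.
struct ToyObject { int value; };

using ToyOop    = ToyObject*;  // direct pointer to the object
using ToyHandle = ToyOop*;     // address of a slot that holds an oop

struct ToyHeap {
  ToyObject from_space{7};           // where the object lives initially
  ToyObject to_space{0};             // where the "collector" moves it
  ToyOop handle_slot = &from_space;  // root slot that the handle points at

  ToyHandle make_handle() { return &handle_slot; }

  // Simulated moving collection: copy the object, poison the old copy,
  // and update the root slot. The slot's own address does not change.
  void move_object() {
    to_space = from_space;
    from_space.value = -1;
    handle_slot = &to_space;
  }
};

// Returns true if the handle still reaches the live object after a move.
inline bool handle_survives_move() {
  ToyHeap heap;
  ToyHandle handle = heap.make_handle();
  heap.move_object();
  return handle == heap.make_handle() && (*handle)->value == 7;
}

// Returns true if a raw oop captured before the move now sees stale data.
inline bool raw_oop_goes_stale() {
  ToyHeap heap;
  ToyOop raw = *heap.make_handle();
  heap.move_object();
  return raw->value == -1;
}
```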

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From derekw at marvell.com  Tue Apr  7 16:05:55 2020
From: derekw at marvell.com (Derek White)
Date: Tue, 7 Apr 2020 16:05:55 +0000
Subject: [aarch64-port-dev ] Question about JVM option
 "-XX:+UseBarriersForVolatile" usage in aarch64.
Message-ID: <MW2PR18MB21232B3562E0307162FDD720D2C30@MW2PR18MB2123.namprd18.prod.outlook.com>

[Sorry it took a long time to get some answers on this]

I think we should no longer enable UseBarriersForVolatile for the prototype ThunderX processors (model A1, variant 0). We believe that these should all have been replaced or decommissioned. We can add an error at startup if we detect that CPU.

Processor support is independent of whether UseBarriersForVolatile should be kept for debugging & development.

On that issue, in addition to the support being broken and not regularly tested, I think that this adds a veneer of complexity to already subtle code, especially since about half of the uses of UseBarriersForVolatile are of the form "if not using extra barriers, add a barrier". I'd be fine with seeing it go.

 - Derek White, Marvell


-----Original Message-----
From: aarch64-port-dev <aarch64-port-dev-bounces at openjdk.java.net> On Behalf Of Ningsheng Jian
Sent: Thursday, April 2, 2020 10:30 PM
To: Nick Gasson <nick.gasson at arm.com>; Andrew Dinn <adinn at redhat.com>
Cc: aarch64-port-dev at openjdk.java.net
Subject: [EXT] Re: [aarch64-port-dev ] Question about JVM option "-XX:+UseBarriersForVolatile" usage in aarch64.

On 4/3/20 10:03 AM, Nick Gasson wrote:
> On 04/02/20 21:22 pm, Andrew Dinn wrote:
>> One reason for having this switch was to provide a comparator for our 
>> scheme to implement the Java volatile accesses using ldar/stlr. That 
>> translation scheme avoids a dmb after the stlr allowing the value 
>> being written to be committed lazily while still providing the 
>> critical guarantee that prior writes are committed before it gets 
>> committed. The switch ensures we can fall back to a 'reference' 
>> implementation based on dmbs that, amongst other things, enforces 
>> immediate commit of the volatile write after commit of its predecessors.
>>
>> By removing support we lose the ability to test cases where 
>> synchronization errors occur with our scheme by switching to the 
>> 'standard' model. That may still be useful for finding bugs (current 
>> or newly injected) in our translation and, indeed, in new HW.
> 
> OK, but keeping it is not without cost. If UseBarriersForVolatile is 
> to have value as a reference implementation we need to expend effort 
> to test it and fix any bugs that arise from changes to other parts of 
> the code (see Xiaohong's original mail).
> 
> 

Yes, if the "reference" implementation is not widely used and tested, it might be buggy and misleading. I know that when Xiaohong was working on a similar part of Graal [1], she spent a lot of time tracing the UseBarriersForVolatile issue in the HotSpot VM. So I agree with Nick and lean towards not maintaining this implementation.


[1] https://github.com/oracle/graal/pull/2181

Thanks,
Ningsheng

From nick.gasson at arm.com  Wed Apr  8 06:22:49 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Wed, 08 Apr 2020 14:22:49 +0800
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy
	pre-barrier if marking not active
In-Reply-To: <a97ee093-0490-59b5-f8a8-45ac1c6159ef@redhat.com>
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <a97ee093-0490-59b5-f8a8-45ac1c6159ef@redhat.com>
Message-ID: <85ftde1op2.fsf@nicgas01-03-arm-vm.shanghai.arm.com>

On 04/07/20 17:55 pm, Andrew Haley wrote:
>
> The patch looks good, thanks.
>

Thanks, pushed it.

>> This patch adds a check to skip the call unless marking is in
>> progress. X86 already has this optimisation and I can't see a reason not
>> to do it on AArch64 as well.
>
> Indeed not. I don't quite remember the history of this, but I guess
> that this optimization was added to x86 after we did AArch64.

I'm wondering if it's possible to optimise the array copy post-barrier
in some cases as well. That VM call ends up in G1BarrierSet::invalidate
which has a loop:

  // skip initial young cards
  for (; byte <= last_byte && *byte == G1CardTable::g1_young_card_val(); byte++);

If the card table entries for the whole array are
G1CardTable::g1_young_card_val() then it's a no-op. I tried replicating
this loop in the barrier set assembly as a filter and only calling into
the runtime when we know it has work to do. This seems to work, and
gives a similar magnitude speed-up on the ArrayCopy microbenchmarks to
the above patch. Removing the ISB on leaf calls helps too but skipping
the call entirely is a bigger win.
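
As a rough illustration of that filter, here is a toy C++ model of the idea, not the barrier set assembly I actually tried. `ToyCardTable`, `range_is_all_young`, `runtime_invalidate`, `post_barrier_filtered` and the card values are all hypothetical; the real young-card value comes from G1CardTable::g1_young_card_val().

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <vector>

// Toy model, not HotSpot code: every name and value below is hypothetical.
using CardValue = std::uint8_t;
const CardValue kToyYoungCard = 2;     // arbitrary stand-in for the young card value
const CardValue kToyCleanCard = 0xff;  // arbitrary stand-in for a clean card
const CardValue kToyDirtyCard = 0;     // arbitrary stand-in for a dirty card

struct ToyCardTable {
  std::vector<CardValue> cards;
  int runtime_calls = 0;  // counts simulated calls into the runtime
};

// Same shape as the "skip initial young cards" loop: walk the range and
// report whether every card in [first, last] is young.
// Precondition: first <= last < ct.cards.size().
inline bool range_is_all_young(const ToyCardTable& ct, std::size_t first, std::size_t last) {
  std::size_t i = first;
  for (; i <= last && ct.cards[i] == kToyYoungCard; i++) {}
  return i > last;
}

// Stands in for the runtime slow path: dirty every non-young card in the range.
inline void runtime_invalidate(ToyCardTable& ct, std::size_t first, std::size_t last) {
  ct.runtime_calls++;
  for (std::size_t i = first; i <= last; i++) {
    if (ct.cards[i] != kToyYoungCard) {
      ct.cards[i] = kToyDirtyCard;
    }
  }
}

// Filtered post-barrier: only call into the runtime when it has work to do.
inline void post_barrier_filtered(ToyCardTable& ct, std::size_t first, std::size_t last) {
  if (range_is_all_young(ct, first, last)) {
    return;
  }
  runtime_invalidate(ct, first, last);
}
```

The model says nothing about whether the inline check is safe against concurrent card table updates; it only shows which calls the filter would remove.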

Do you think this is safe and worth doing?


Thanks,
Nick

From aph at redhat.com  Wed Apr  8 10:26:51 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 8 Apr 2020 11:26:51 +0100
Subject: [aarch64-port-dev ] Question about JVM option
 "-XX:+UseBarriersForVolatile" usage in aarch64.
In-Reply-To: <MW2PR18MB21232B3562E0307162FDD720D2C30@MW2PR18MB2123.namprd18.prod.outlook.com>
References: <MW2PR18MB21232B3562E0307162FDD720D2C30@MW2PR18MB2123.namprd18.prod.outlook.com>
Message-ID: <e8d61a76-f2b9-0835-b18c-0e4e36c615f4@redhat.com>

On 4/7/20 5:05 PM, Derek White wrote:
> I think we should no longer enable UseBarriersForVolatile for the prototype ThunderX processors (model A1, variant 0). We believe that these should all have been replaced or decommissioned. We can add an error at startup if we detect that CPU.
> 
> Processor support is independent of whether UseBarriersForVolatile should be kept for debugging & development.
> 
> On that issue, in addition to the support being broken and not regularly tested, I think that this adds a veneer of complexity to already subtle code, especially since about half of the uses of UseBarriersForVolatile are of the form "if not using extra barriers, add a barrier". I'd be fine with seeing it go.

OK, thanks.

One other use of UseBarriersForVolatile was as a fallback when HotSpot
changes broke the use of ldar/stlr for volatile. Andrew Dinn, do you think
we still need it as a fallback in case ldar/stlr handling breaks again?
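
For anyone following along, the two translation schemes can be approximated in portable C++ atomics. This is a sketch under assumptions, not what C2 emits: `AcqRelVolatile`, `FencedVolatile` and `message_passing` are hypothetical names, the first scheme relies on compilers typically lowering seq_cst loads and stores to ldar/stlr on AArch64, and the fences in the second only approximate the dmb-based reference implementation.

```cpp
#include <atomic>
#include <cassert>
#include <thread>

// Sketch only: hypothetical names, approximating the two schemes.

// Scheme 1: acquire/release style accesses (typically ldar / stlr on AArch64).
struct AcqRelVolatile {
  static int load(const std::atomic<int>& x) {
    return x.load(std::memory_order_seq_cst);
  }
  static void store(std::atomic<int>& x, int v) {
    x.store(v, std::memory_order_seq_cst);
  }
};

// Scheme 2: plain accesses bracketed by explicit fences, in the spirit of
// the barrier-based fallback.
struct FencedVolatile {
  static int load(const std::atomic<int>& x) {
    int v = x.load(std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_acquire);  // order later accesses after the load
    return v;
  }
  static void store(std::atomic<int>& x, int v) {
    std::atomic_thread_fence(std::memory_order_release);  // order earlier accesses before the store
    x.store(v, std::memory_order_relaxed);
    std::atomic_thread_fence(std::memory_order_seq_cst);  // trailing full fence after the store
  }
};

// Message passing: the writer publishes a plain payload through a
// "volatile" flag; the reader must observe the payload once it sees the flag.
template <typename Scheme>
int message_passing() {
  int payload = 0;
  std::atomic<int> ready{0};
  std::thread writer([&] {
    payload = 42;
    Scheme::store(ready, 1);
  });
  while (Scheme::load(ready) == 0) {
    // spin until the flag is published
  }
  int seen = payload;
  writer.join();
  return seen;
}
```

Calling `message_passing<AcqRelVolatile>()` and `message_passing<FencedVolatile>()` exercises the same publication pattern under either scheme, which is roughly the comparison the flag makes possible.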

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Wed Apr  8 12:38:05 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 8 Apr 2020 13:38:05 +0100
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy
 pre-barrier if marking not active
In-Reply-To: <85ftde1op2.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <a97ee093-0490-59b5-f8a8-45ac1c6159ef@redhat.com>
 <85ftde1op2.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
Message-ID: <b5926ebf-4274-d4b6-da73-ac9870c29992@redhat.com>

On 4/8/20 7:22 AM, Nick Gasson wrote:
> Do you think this is safe and worth doing?

Please forgive me for turning this into a rather extreme thought
experiment: if we hand-translate all GC runtime methods into all
targets, we have an NxM problem, #collectors * #targets. So it's hard
to justify without some heavy usage. And also, it means that if any of
these runtime methods change, we'd risk falling behind on AArch64.

Can you show us the assembly instructions that we'd save?

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Wed Apr  8 13:08:56 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 8 Apr 2020 14:08:56 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
 Concurrent Class Unloading
In-Reply-To: <bddf2410-760e-1f21-2ed3-ec5a43f29c60@redhat.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com>
 <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com>
 <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>
 <bddf2410-760e-1f21-2ed3-ec5a43f29c60@redhat.com>
Message-ID: <baa4c776-8e85-2832-43d3-650377efde4b@redhat.com>

On 4/7/20 1:25 PM, Andrew Haley wrote:
> On 4/7/20 12:52 PM, Andrew Haley wrote:
>> I notice that even after applying your patch we are still using embedded
>> OOPs in two places.
>>
>> Here in aarch64.ad:
>>
>>       if (rtype == relocInfo::oop_type) {
>>         __ movoop(dst_reg, (jobject)con, /*immediate*/true);
>>       }
>>
>> and here in sharedRuntime_aarch64.cpp:
>>
>>     //  load oop into a register
>>     __ movoop(c_rarg1,
>>               JNIHandles::make_local(method->method_holder()->java_mirror()),
>>               /*immediate*/true);
>>
>> Why is this?
> 
> Ah, the second one is a handle, of course, and AFAIK handles don't move.
> Having said that, the use of movoop on something that is the address of an
> oop rather than an oop is odd, but it's done on other targets.
> 
> The C2 one is still suspect.

I made the following changes, bootstrap still works:

diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/aarch64.ad
--- a/src/hotspot/cpu/aarch64/aarch64.ad	Wed Apr 08 08:57:07 2020 -0400
+++ b/src/hotspot/cpu/aarch64/aarch64.ad	Wed Apr 08 09:03:18 2020 -0400
@@ -3160,7 +3160,7 @@
     } else {
       relocInfo::relocType rtype = $src->constant_reloc();
       if (rtype == relocInfo::oop_type) {
-        __ movoop(dst_reg, (jobject)con, /*immediate*/true);
+        __ movoop(dst_reg, (jobject)con, /*immediate*/false);
       } else if (rtype == relocInfo::metadata_type) {
         __ mov_metadata(dst_reg, (Metadata*)con);
       } else {
diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
--- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp	Wed Apr 08 08:57:07 2020 -0400
+++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp	Wed Apr 08 09:03:18 2020 -0400
@@ -4145,7 +4145,7 @@
   if (! immediate) {
     // nmethod barriers need to be ordered with respected to oop accesses, so
     // we can't use immediate literals as that would necessitate ISBs.
-    if (BarrierSet::barrier_set()->barrier_set_nmethod() != NULL) {
+    if (0 && BarrierSet::barrier_set()->barrier_set_nmethod() != NULL) {
       adr(dst, InternalAddress(address_constant((address)obj, rspec)));
       ldr(dst, Address(dst));
     } else {
diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
--- a/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp	Wed Apr 08 08:57:07 2020 -0400
+++ b/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp	Wed Apr 08 09:03:18 2020 -0400
@@ -1676,11 +1676,10 @@

   // Pre-load a static method's oop into c_rarg1.
   if (method->is_static() && !is_critical_native) {
-
     //  load oop into a register
     __ movoop(c_rarg1,
               JNIHandles::make_local(method->method_holder()->java_mirror()),
-              /*immediate*/true);
+              /*immediate*/false);

     // Now handlize the static class mirror it's known not-null.
     __ str(c_rarg1, Address(sp, klass_offset));
diff -r cd06d732d5f0 src/hotspot/share/runtime/sharedRuntime.cpp
--- a/src/hotspot/share/runtime/sharedRuntime.cpp	Wed Apr 08 08:57:07 2020 -0400
+++ b/src/hotspot/share/runtime/sharedRuntime.cpp	Wed Apr 08 09:03:18 2020 -0400
@@ -2873,6 +2873,7 @@
       CodeBuffer buffer(buf);
       double locs_buf[20];
       buffer.insts()->initialize_shared_locs((relocInfo*)locs_buf, sizeof(locs_buf) / sizeof(relocInfo));
+      buffer.initialize_consts_size(8);
       MacroAssembler _masm(&buffer);

       // Fill in the signature array, for the calling-convention call.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From adinn at redhat.com  Wed Apr  8 13:18:11 2020
From: adinn at redhat.com (Andrew Dinn)
Date: Wed, 8 Apr 2020 14:18:11 +0100
Subject: [aarch64-port-dev ] Question about JVM option
 "-XX:+UseBarriersForVolatile" usage in aarch64.
In-Reply-To: <e8d61a76-f2b9-0835-b18c-0e4e36c615f4@redhat.com>
References: <MW2PR18MB21232B3562E0307162FDD720D2C30@MW2PR18MB2123.namprd18.prod.outlook.com>
 <e8d61a76-f2b9-0835-b18c-0e4e36c615f4@redhat.com>
Message-ID: <0aed395c-b485-c846-bbb0-f4b64c32d087@redhat.com>

On 08/04/2020 11:26, Andrew Haley wrote:
> One other use of UseBarriersForVolatile was as a fallback when
> HotSpot changes broke the use of ldar/stlr for volatile. Andrew Dinn,
> do you think we still need it as a fallback in case ldar/stlr
> handling breaks again?

Well, I'm probably not the person to ask as my thought was that
maintaining the two sets of paths that this flag implies was not really
much of a burden. That's probably just me though (it /was/ mostly my code).

A-and yet I can see that it's not just me who is not maintaining this
code. So, if others find this complexity a burden then I'm happy for us
to simplify things by removing the flag and the alternative paths. I
think the code has baked fairly well so perhaps we don't need this fallback.

regards,


Andrew Dinn
-----------
Senior Principal Software Engineer
Red Hat UK Ltd
Registered in England and Wales under Company Registration No. 03798903
Directors: Michael Cunningham, Michael ("Mike") O'Neill


From ci_notify at linaro.org  Wed Apr  8 15:28:39 2020
From: ci_notify at linaro.org (ci_notify at linaro.org)
Date: Wed, 8 Apr 2020 15:28:39 +0000 (UTC)
Subject: [aarch64-port-dev ] JTREG, JCStress,
 SPECjbb2015 and Hadoop/Terasort results for OpenJDK 14 on AArch64
Message-ID: <1887422348.15550.1586359720620.JavaMail.javamailuser@localhost>

This is a summary of the JTREG test results
===========================================
 
The build and test results are cycled every 15 days.
 
For detailed information on the test output please refer to: 
 
  http://openjdk.linaro.org/jdk14/openjdk-jtreg-nightly-tests/summary/2020/098/summary.html
 
-------------------------------------------------------------------------------
release/hotspot
-------------------------------------------------------------------------------
Build 0: aarch64/2020/jan/23 pass: 5,773; fail: 45
Build 1: aarch64/2020/jan/30 pass: 5,773; fail: 45
Build 2: aarch64/2020/feb/06 pass: 5,773; fail: 46
Build 3: aarch64/2020/feb/09 pass: 5,775; fail: 44
Build 4: aarch64/2020/apr/07 pass: 5,781; fail: 45

1 fatal error was detected; please follow the link above for more detail.

-------------------------------------------------------------------------------
release/jdk
-------------------------------------------------------------------------------
Build 0: aarch64/2020/jan/23 pass: 8,831; fail: 524; error: 18
Build 1: aarch64/2020/jan/30 pass: 8,839; fail: 518; error: 17
Build 2: aarch64/2020/feb/06 pass: 8,838; fail: 517; error: 18
Build 3: aarch64/2020/feb/09 pass: 8,832; fail: 523; error: 18
Build 4: aarch64/2020/apr/07 pass: 8,844; fail: 505; error: 20

1 fatal error was detected; please follow the link above for more detail.

-------------------------------------------------------------------------------
release/langtools
-------------------------------------------------------------------------------
Build 0: aarch64/2020/jan/23 pass: 4,031
Build 1: aarch64/2020/jan/30 pass: 4,031
Build 2: aarch64/2020/feb/03 pass: 4,031
Build 3: aarch64/2020/feb/06 pass: 4,031
Build 4: aarch64/2020/feb/09 pass: 4,031
Build 5: aarch64/2020/apr/07 pass: 4,031

Previous results can be found here: 
 
  http://openjdk.linaro.org/jdk14/openjdk-jtreg-nightly-tests/index.html
 

SPECjbb2015 composite regression test completed
===============================================

This test measures the relative performance of the server
compiler running the SPECjbb2015 composite tests and compares
the performance against the baseline performance of the server
compiler taken on 2016-11-21.

In accordance with [1], the SPECjbb2015 tests are run on a system
which is not production ready and does not meet all the
requirements for publishing compliant results. The numbers below
shall be treated as non-compliant (nc) and are for experimental
purposes only.

Relative performance: Server max-jOPS (nc): 8.24x
Relative performance: Server critical-jOPS (nc): 9.80x

Details of the test setup and historical results may be found here:

    http://openjdk.linaro.org/jdk14/SPECjbb2015-results/

[1] http://www.spec.org/fairuse.html#Academic

Regression test Hadoop-Terasort completed
=========================================

This test measures the performance of the server and client compilers
running Hadoop sorting a 1GB file using Terasort and compares
the performance against the baseline performance of the Zero interpreter
and against the baseline performance of the server compiler
on 2014-04-01.

Relative performance: Zero: 1.0, Server: 210.67

Server 210.67 / Server 2014-04-01 (71.00): 2.97x

Details of the test setup and historical results may be found here:

    http://openjdk.linaro.org/jdk14/hadoop-terasort-benchmark-results/

This is a summary of the jcstress test results
==============================================
 
The build and test results are cycled every 15 days.
 
2020-01-24 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/2020/023/results/
2020-02-01 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/2020/030/results/
2020-02-08 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/2020/037/results/
2020-02-10 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/2020/040/results/
2020-04-08 pass rate: 9702/9702, results: http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/2020/098/results/
 
For detailed information on the test output please refer to: 
 
  http://openjdk.linaro.org/jdk14/jcstress-nightly-runs/

From stumon01 at arm.com  Wed Apr  8 15:33:20 2020
From: stumon01 at arm.com (Stuart Monteith)
Date: Wed, 8 Apr 2020 16:33:20 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
 Concurrent Class Unloading
In-Reply-To: <baa4c776-8e85-2832-43d3-650377efde4b@redhat.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com>
 <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com>
 <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>
 <bddf2410-760e-1f21-2ed3-ec5a43f29c60@redhat.com>
 <baa4c776-8e85-2832-43d3-650377efde4b@redhat.com>
Message-ID: <ce61dedf-739c-d618-a28e-8940a2ba4ba4@arm.com>

I see what you did there. This comes back to our previous discussion
about the value of having immediate oops at all. Isn't that what you are
effectively suggesting? That would simplify the code somewhat.


On 08/04/2020 14:08, Andrew Haley wrote:
> On 4/7/20 1:25 PM, Andrew Haley wrote:
>> On 4/7/20 12:52 PM, Andrew Haley wrote:
>>> I notice that even after applying your patch we are still using embedded
>>> OOPs in two places.
>>>
>>> Here in aarch64.ad:
>>>
>>>       if (rtype == relocInfo::oop_type) {
>>>         __ movoop(dst_reg, (jobject)con, /*immediate*/true);
>>>       }
>>>
>>> and here in sharedRuntime_aarch64.cpp:
>>>
>>>     //  load oop into a register
>>>     __ movoop(c_rarg1,
>>>               JNIHandles::make_local(method->method_holder()->java_mirror()),
>>>               /*immediate*/true);
>>>
>>> Why is this?
>>
>> Ah, the second one is a handle, of course, and AFAIK handles don't move.
>> Having said that, the use of movoop on something that is the address of an
>> oop rather than an oop is odd, but it's done on other targets.
>>
>> The C2 one is still suspect.
>
> I made the following changes, bootstrap still works:
>
> diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/aarch64.ad
> --- a/src/hotspot/cpu/aarch64/aarch64.ad      Wed Apr 08 08:57:07 2020 -0400
> +++ b/src/hotspot/cpu/aarch64/aarch64.ad      Wed Apr 08 09:03:18 2020 -0400
> @@ -3160,7 +3160,7 @@
>      } else {
>        relocInfo::relocType rtype = $src->constant_reloc();
>        if (rtype == relocInfo::oop_type) {
> -        __ movoop(dst_reg, (jobject)con, /*immediate*/true);
> +        __ movoop(dst_reg, (jobject)con, /*immediate*/false);
>        } else if (rtype == relocInfo::metadata_type) {
>          __ mov_metadata(dst_reg, (Metadata*)con);
>        } else {
> diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp
> --- a/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp      Wed Apr 08 08:57:07 2020 -0400
> +++ b/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp      Wed Apr 08 09:03:18 2020 -0400
> @@ -4145,7 +4145,7 @@
>    if (! immediate) {
>      // nmethod barriers need to be ordered with respected to oop accesses, so
>      // we can't use immediate literals as that would necessitate ISBs.
> -    if (BarrierSet::barrier_set()->barrier_set_nmethod() != NULL) {
> +    if (0 && BarrierSet::barrier_set()->barrier_set_nmethod() != NULL) {
>        adr(dst, InternalAddress(address_constant((address)obj, rspec)));
>        ldr(dst, Address(dst));
>      } else {
> diff -r cd06d732d5f0 src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp
> --- a/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp       Wed Apr 08 08:57:07 2020 -0400
> +++ b/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp       Wed Apr 08 09:03:18 2020 -0400
> @@ -1676,11 +1676,10 @@
>
>    // Pre-load a static method's oop into c_rarg1.
>    if (method->is_static() && !is_critical_native) {
> -
>      //  load oop into a register
>      __ movoop(c_rarg1,
>                JNIHandles::make_local(method->method_holder()->java_mirror()),
> -              /*immediate*/true);
> +              /*immediate*/false);
>
>      // Now handlize the static class mirror it's known not-null.
>      __ str(c_rarg1, Address(sp, klass_offset));
> diff -r cd06d732d5f0 src/hotspot/share/runtime/sharedRuntime.cpp
> --- a/src/hotspot/share/runtime/sharedRuntime.cpp     Wed Apr 08 08:57:07 2020 -0400
> +++ b/src/hotspot/share/runtime/sharedRuntime.cpp     Wed Apr 08 09:03:18 2020 -0400
> @@ -2873,6 +2873,7 @@
>        CodeBuffer buffer(buf);
>        double locs_buf[20];
>        buffer.insts()->initialize_shared_locs((relocInfo*)locs_buf, sizeof(locs_buf) / sizeof(relocInfo));
> +      buffer.initialize_consts_size(8);
>        MacroAssembler _masm(&buffer);
>
>        // Fill in the signature array, for the calling-convention call.
>


From aph at redhat.com  Wed Apr  8 16:15:58 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 8 Apr 2020 17:15:58 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
 Concurrent Class Unloading
In-Reply-To: <ce61dedf-739c-d618-a28e-8940a2ba4ba4@arm.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com>
 <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com>
 <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>
 <bddf2410-760e-1f21-2ed3-ec5a43f29c60@redhat.com>
 <baa4c776-8e85-2832-43d3-650377efde4b@redhat.com>
 <ce61dedf-739c-d618-a28e-8940a2ba4ba4@arm.com>
Message-ID: <5018c8e8-73f9-ad71-1e0b-7874e98dea3c@redhat.com>

On 4/8/20 4:33 PM, Stuart Monteith wrote:
> I see what you did there. This comes back to our previous discussion
> about the value of having immediate oops at all. Isn't that what you are
> effectively suggesting? That would simplify the code somewhat.

Not entirely. Immediate oops are good for most GCs. But according to what
Erik said, immediate oops are verboten when we're using ZGC with concurrent
method unloading, and it seems to be very easy to do.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From Yang.Zhang at arm.com  Thu Apr  9 06:43:12 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Thu, 9 Apr 2020 06:43:12 +0000
Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential
 register clash issue in reduce_add2I
In-Reply-To: <VI1PR0802MB2558190831829ACF49F38C4A8EC70@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB2558190831829ACF49F38C4A8EC70@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <VI1PR0802MB2558A330564CE46630DFED748EC10@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi,

I've updated the patch a little. Could you please help to review it?
http://cr.openjdk.java.net/~yzhang/8241911/webrev.01/

Test: tier1.

-----Original Message-----
From: aarch64-port-dev <aarch64-port-dev-bounces at openjdk.java.net> On Behalf Of Yang Zhang
Sent: Friday, April 3, 2020 6:49 PM
To: hotspot-compiler-dev at openjdk.java.net; aarch64-port-dev at openjdk.java.net
Cc: nd <nd at arm.com>
Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential register clash issue in reduce_add2I

Hi,

Could you please help to review this patch?

In the original reduce_add2I, dst may be the same register as tmp2, which may produce an incorrect result.
The format strings of some reduction instructs are also cleaned up.

JBS: https://bugs.openjdk.java.net/browse/JDK-8241911
Webrev: http://cr.openjdk.java.net/~yzhang/8241911/webrev.00/


Regards
Yang


From nick.gasson at arm.com  Thu Apr  9 08:59:44 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Thu, 09 Apr 2020 16:59:44 +0800
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy
	pre-barrier if marking not active
In-Reply-To: <b5926ebf-4274-d4b6-da73-ac9870c29992@redhat.com>
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <a97ee093-0490-59b5-f8a8-45ac1c6159ef@redhat.com>
 <85ftde1op2.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <b5926ebf-4274-d4b6-da73-ac9870c29992@redhat.com>
Message-ID: <85h7xtuj9b.fsf@nicgas01-03-arm-vm.shanghai.arm.com>

On 04/08/20 20:38 pm, Andrew Haley wrote:
> On 4/8/20 7:22 AM, Nick Gasson wrote:
>> Do you think this is safe and worth doing?
>
> Please forgive me for turning this into a rather extreme thought
> experiment: if we hand-translate all GC runtime methods into all
> targets, we have an NxM problem, #collectors * #targets. So it's hard
> to justify without some heavy usage. And also, it means that if any of
> these runtime methods change, we'd risk falling behind on AArch64.
>
> Can you show us the assembly instructions that we'd save?

So I'm suggesting doing the following:

--- a/src/hotspot/cpu/aarch64/gc/g1/g1BarrierSetAssembler_aarch64.cpp
+++ b/src/hotspot/cpu/aarch64/gc/g1/g1BarrierSetAssembler_aarch64.cpp
@@ -87,13 +87,43 @@ void G1BarrierSetAssembler::gen_write_ref_array_pre_barrier(MacroAssembler* masm

 void G1BarrierSetAssembler::gen_write_ref_array_post_barrier(MacroAssembler* masm, DecoratorSet decorators,
                                                              Register start, Register count, Register scratch, RegSet saved_regs) {
-  __ push(saved_regs, sp);
-  assert_different_registers(start, count, scratch);
+
+  assert_different_registers(start, count, scratch, rscratch1, rscratch2);
   assert_different_registers(c_rarg0, count);
+
+  const Register card_addr = scratch;
+  const Register end_card_addr = rscratch1;
+
+  Label skip, slowpath, next;
+
+  __ cbz(count, skip);
+
+  __ lsr(card_addr, start, CardTable::card_shift);
+
+  __ lea(end_card_addr, Address(start, count, Address::lsl(LogBytesPerHeapOop)));
+  __ lsr(end_card_addr, end_card_addr, CardTable::card_shift);
+
+  __ load_byte_map_base(rscratch2);
+  __ add(card_addr, card_addr, rscratch2);
+  __ add(end_card_addr, end_card_addr, rscratch2);
+
+  __ bind(next);
+  __ ldrb(rscratch2, Address(card_addr));
+  __ cmpw(rscratch2, (int)G1CardTable::g1_young_card_val());
+  __ br(Assembler::NE, slowpath);
+  __ cmp(card_addr, end_card_addr);
+  __ br(Assembler::EQ, skip);
+  __ add(card_addr, card_addr, 1);
+  __ b(next);
+
+  __ bind(slowpath);
+  __ push(saved_regs, sp);
   __ mov(c_rarg0, start);
   __ mov(c_rarg1, count);
   __ call_VM_leaf(CAST_FROM_FN_PTR(address, G1BarrierSetRuntime::write_ref_array_post_entry), 2);
   __ pop(saved_regs, sp);
+
+  __ bind(skip);
 }


(And change the call sites to not pass rscratch1 as scratch.)

It has a nice speedup on the ArrayCopy microbenchmarks, but I agree this
sort of thing is a maintenance burden if it doesn't affect real
workloads.

With JDK-8242029:

Benchmark                                    Mode  Cnt   Score   Error  Units
ArrayCopy.arrayCopyObject                    avgt   15  82.314 ± 0.641  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt   15  87.351 ± 6.820  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15  54.272 ± 1.445  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15  54.596 ± 1.329  ns/op

With the above modification:

Benchmark                                    Mode  Cnt   Score   Error  Units
ArrayCopy.arrayCopyObject                    avgt   15  58.913 ± 1.265  ns/op
ArrayCopy.arrayCopyObjectNonConst            avgt   15  64.682 ± 8.147  ns/op
ArrayCopy.arrayCopyObjectSameArraysBackward  avgt   15  36.866 ± 1.319  ns/op
ArrayCopy.arrayCopyObjectSameArraysForward   avgt   15  30.445 ± 3.719  ns/op
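
For anyone who finds the assembler stub hard to follow, the fast path in the
patch above can be modelled in plain Java. This is only an illustrative
sketch, not HotSpot code: the class and method names, the byte[] standing in
for the card table, the card shift of 9 and the YOUNG_CARD value are all
assumptions made for the example. Only the loop structure follows the patch.

```java
// Illustrative model of the proposed fast path; all names here are hypothetical.
final class CardScanSketch {
    static final int CARD_SHIFT = 9;   // assumed: 512-byte cards
    static final byte YOUNG_CARD = 8;  // assumed stand-in for G1CardTable::g1_young_card_val()

    /**
     * Returns true if every card from the card of 'start' up to the card of
     * 'start + count * oopSize' is marked young, i.e. the call into the
     * runtime could be skipped.
     */
    static boolean canSkipPostBarrier(byte[] cardTable, long start, long count, int oopSize) {
        if (count == 0) {
            return true;                                   // cbz count, skip
        }
        long card = start >>> CARD_SHIFT;                  // lsr card_addr, start, card_shift
        // As in the patch, the end card comes from the first address past the copied range.
        long endCard = (start + count * oopSize) >>> CARD_SHIFT;
        while (true) {
            if (cardTable[(int) card] != YOUNG_CARD) {
                return false;                              // br NE, slowpath
            }
            if (card == endCard) {
                return true;                               // br EQ, skip
            }
            card++;                                        // add card_addr, card_addr, 1
        }
    }

    public static void main(String[] args) {
        byte[] cards = {YOUNG_CARD, YOUNG_CARD, 0, YOUNG_CARD};
        // Range that stays within young cards: the runtime call can be skipped.
        System.out.println(canSkipPostBarrier(cards, 0, 64, 8));
        // Range that reaches a non-young card: fall through to the slow path.
        System.out.println(canSkipPostBarrier(cards, 0, 200, 8));
    }
}
```

The model returns as soon as it sees the first non-young card, which
corresponds to the single branch to the slow path in the stub.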


Thanks,
Nick

From aph at redhat.com  Thu Apr  9 09:41:59 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 9 Apr 2020 10:41:59 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential
 register clash issue in reduce_add2I
In-Reply-To: <VI1PR0802MB2558A330564CE46630DFED748EC10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB2558190831829ACF49F38C4A8EC70@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <VI1PR0802MB2558A330564CE46630DFED748EC10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <1a9ed6d0-40bb-1dc4-4eff-b55c86627a47@redhat.com>

On 4/9/20 7:43 AM, Yang Zhang wrote:
> Hi
>
> Update the patch a little. Could you please help to review it?
> http://cr.openjdk.java.net/~yzhang/8241911/webrev.01/

I've been trying to figure out why this code is so difficult to
understand. I think it's because names like tmp1 and src1 are used
regardless of what kind of thing tmp1 is.

I suggest something like

instruct reduce_add4I(iRegINoSp dst, iRegIorL2I i_src, vecX v_src, vecX v_tmp, iRegINoSp i_tmp)
%{
  match(Set dst (AddReductionVI i_src v_src));
  ins_cost(INSN_COST);
  effect(TEMP v_tmp, TEMP i_tmp);
  format %{ "addv  $v_tmp, T4S, $v_src\n\t"
            "umov  $i_tmp, $v_tmp, S, 0\n\t"
            "addw  $dst, $i_tmp, $i_src\t# add reduction4I"
  %}
  ins_encode %{
    __ addv(as_FloatRegister($v_tmp$$reg), __ T4S,
            as_FloatRegister($v_src$$reg));
    __ umov($i_tmp$$Register, as_FloatRegister($v_tmp$$reg), __ S, 0);
    __ addw($dst$$Register, $i_tmp$$Register, $i_src$$Register);
  %}
  ins_pipe(pipe_class_default);
%}

I think this makes the intent much clearer. Thanks.
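
To spell out what the three instructions in that encoding compute, here is a
minimal Java model. It is a sketch for illustration only: the class name and
the int[] standing in for a 4-lane vector register are assumptions, while the
variable names mirror the operand names suggested above.

```java
// Illustrative model of the suggested reduce_add4I encoding; names are hypothetical.
final class ReduceAdd4ISketch {
    /** Adds the four int lanes of vSrc to the scalar iSrc, with int wrap-around. */
    static int reduceAdd4I(int iSrc, int[] vSrc) {
        int vTmp = vSrc[0] + vSrc[1] + vSrc[2] + vSrc[3]; // addv  v_tmp, T4S, v_src
        int iTmp = vTmp;                                  // umov  i_tmp, v_tmp, S, 0
        return iTmp + iSrc;                               // addw  dst, i_tmp, i_src
    }

    public static void main(String[] args) {
        System.out.println(reduceAdd4I(10, new int[] {1, 2, 3, 4}));
    }
}
```

With the v_ and i_ prefixes, each line shows at a glance whether a value
lives in a vector or an integer register.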

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From Yang.Zhang at arm.com  Thu Apr  9 11:21:42 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Thu, 9 Apr 2020 11:21:42 +0000
Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential
 register clash issue in reduce_add2I
In-Reply-To: <1a9ed6d0-40bb-1dc4-4eff-b55c86627a47@redhat.com>
References: <VI1PR0802MB2558190831829ACF49F38C4A8EC70@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <VI1PR0802MB2558A330564CE46630DFED748EC10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1a9ed6d0-40bb-1dc4-4eff-b55c86627a47@redhat.com>
Message-ID: <VI1PR0802MB255836F34678A22D9E0BE8868EC10@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi Andrew

>instruct reduce_add4I(iRegINoSp dst, iRegIorL2I i_src, vecX v_src, vecX v_tmp, iRegINoSp i_tmp) %{

Besides reduce_add4I, other reduction operations (reduce_mul4I, reduce_max4F, etc) also have such issues. How about creating another JBS and patch to fix this issue? 



From aph at redhat.com  Thu Apr  9 12:21:22 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 9 Apr 2020 13:21:22 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential
 register clash issue in reduce_add2I
In-Reply-To: <VI1PR0802MB255836F34678A22D9E0BE8868EC10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB2558190831829ACF49F38C4A8EC70@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <VI1PR0802MB2558A330564CE46630DFED748EC10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1a9ed6d0-40bb-1dc4-4eff-b55c86627a47@redhat.com>
 <VI1PR0802MB255836F34678A22D9E0BE8868EC10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <d9e67f4b-3038-b48d-ca41-4d0541e0e0a0@redhat.com>

On 4/9/20 12:21 PM, Yang Zhang wrote:
> Hi Andrew
> 
>> instruct reduce_add4I(iRegINoSp dst, iRegIorL2I i_src, vecX v_src, vecX v_tmp, iRegINoSp i_tmp) %{
> 
> Besides reduce_add4I, other reduction operations (reduce_mul4I, reduce_max4F, etc) also have such issues. How about creating another JBS and patch to fix this issue? 

That's a good point. I'll accept http://cr.openjdk.java.net/~yzhang/8241911/webrev.01/
as it is, with a separate patch to clarify those reduction operations.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Thu Apr  9 16:31:38 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 9 Apr 2020 17:31:38 +0100
Subject: [aarch64-port-dev ] RFR: 8242029: AArch64: skip G1 array copy
 pre-barrier if marking not active
In-Reply-To: <85h7xtuj9b.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
References: <85h7xv226p.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <a97ee093-0490-59b5-f8a8-45ac1c6159ef@redhat.com>
 <85ftde1op2.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
 <b5926ebf-4274-d4b6-da73-ac9870c29992@redhat.com>
 <85h7xtuj9b.fsf@nicgas01-03-arm-vm.shanghai.arm.com>
Message-ID: <e5181d25-f70b-1dad-50db-b6a337e08bf8@redhat.com>

On 4/9/20 9:59 AM, Nick Gasson wrote:
> It has a nice speedup on the ArrayCopy microbenchmarks, but I agree this
> sort of thing is a maintenance burden if it doesn't affect real
> workloads.

Now you've got me interested.  :-)

I'm looking at the code we execute when we call the runtime. The
call_VM_leaf() we generate is

  0x0000ffffa913a6ec:   mov	x0, x1
  0x0000ffffa913a6f0:   mov	x1, x2
  0x0000ffffa913a6f4:   stp	x8, x12, [sp, #-16]!
 ;; 0xFFFFBCE50CD4
  0x0000ffffa913a6f8:   mov	x8, #0xcd4                 	// #3284
  0x0000ffffa913a6fc:   movk	x8, #0xbce5, lsl #16
  0x0000ffffa913a700:   movk	x8, #0xffff, lsl #32
  0x0000ffffa913a704:   blr	x8
  0x0000ffffa913a708:   ldp	x8, x12, [sp], #16
  0x0000ffffa913a70c:   isb

As discussed, we can lose the ISB here. If we're not called from the
interpreter we can also lose the saving of r12 and rscratch1.

This calls G1BarrierSetRuntime::write_ref_array_post_entry()

=> 0x0000ffffbd89d750 <+0>:	adrp	x2, 0xffffbe2ae000
   0x0000ffffbd89d754 <+4>:	adrp	x4, 0xffffbe2aa000
   0x0000ffffbd89d758 <+8>:	and	x3, x0, #0xfffffffffffffff8
   0x0000ffffbd89d75c <+12>:	ldr	x2, [x2, #264]
   0x0000ffffbd89d760 <+16>:	ldr	x4, [x4, #2024]
   0x0000ffffbd89d764 <+20>:	ldrsw	x2, [x2]
   0x0000ffffbd89d768 <+24>:	madd	x2, x2, x1, x0
   0x0000ffffbd89d76c <+28>:	ldr	x0, [x4]
   0x0000ffffbd89d770 <+32>:	add	x2, x2, #0x7
   0x0000ffffbd89d774 <+36>:	and	x2, x2, #0xfffffffffffffff8
   0x0000ffffbd89d778 <+40>:	adrp	x4, 0xffffbd895000
   0x0000ffffbd89d77c <+44>:	sub	x2, x2, x3
   0x0000ffffbd89d780 <+48>:	add	x4, x4, #0x640
   0x0000ffffbd89d784 <+52>:	ldr	x5, [x0]
   0x0000ffffbd89d788 <+56>:	lsr	x2, x2, #3
   0x0000ffffbd89d78c <+60>:	ldr	x7, [x5, #88]
   0x0000ffffbd89d790 <+64>:	cmp	x7, x4
   0x0000ffffbd89d794 <+68>:	b.ne	0xffffbd89d7a8
   0x0000ffffbd89d798 <+72>:	ldr	x4, [x5, #56]
   0x0000ffffbd89d79c <+76>:	mov	x1, x3
   0x0000ffffbd89d7a0 <+80>:	mov	x16, x4
   0x0000ffffbd89d7a4 <+84>:	br	x16

which seems to be a bunch of stuff to discover the addresses to scan,
aligning them properly, followed by a virtual dispatch to
G1BarrierSet::invalidate(), which contains the loop which scans the
card table:

   0x0000ffffbda250a0 <+0>:	cbz	x2, 0xffffbda25170 <G1BarrierSet::invalidate(MemRegion)+208>
   0x0000ffffbda250a4 <+4>:	stp	x29, x30, [sp, #-48]!
   0x0000ffffbda250a8 <+8>:	add	x2, x1, x2, lsl #3
   0x0000ffffbda250ac <+12>:	mov	x29, sp
   0x0000ffffbda250b0 <+16>:	str	x21, [sp, #32]
   0x0000ffffbda250b4 <+20>:	sub	x21, x2, #0x8
   0x0000ffffbda250b8 <+24>:	ldr	x0, [x0, #64]
   0x0000ffffbda250bc <+28>:	ldr	x0, [x0, #72]
   0x0000ffffbda250c0 <+32>:	add	x1, x0, x1, lsr #9
   0x0000ffffbda250c4 <+36>:	add	x21, x0, x21, lsr #9
   0x0000ffffbda250c8 <+40>:	cmp	x21, x1
   0x0000ffffbda250cc <+44>:	b.cc	0xffffbda25164 <G1BarrierSet::invalidate(MemRegion)+196>  // b.lo, b.ul, b.last
   0x0000ffffbda250d0 <+48>:	stp	x19, x20, [sp, #16]
   0x0000ffffbda250d4 <+52>:	b	0xffffbda250e0 <G1BarrierSet::invalidate(MemRegion)+64>


   0x0000ffffbda250d8 <+56>:	cmp	x21, x1
   0x0000ffffbda250dc <+60>:	b.cc	0xffffbda25160 <G1BarrierSet::invalidate(MemRegion)+192>  // b.lo, b.ul, b.last
   0x0000ffffbda250e0 <+64>:	ldrb	w0, [x1]
   0x0000ffffbda250e4 <+68>:	mov	x19, x1
   0x0000ffffbda250e8 <+72>:	add	x1, x1, #0x1
   0x0000ffffbda250ec <+76>:	and	w0, w0, #0xff
   0x0000ffffbda250f0 <+80>:	cmp	w0, #0x8
   0x0000ffffbda250f4 <+84>:	b.eq	0xffffbda250d8 <G1BarrierSet::invalidate(MemRegion)+56>  // b.none

...

   0x0000ffffbda25160 <+192>:	ldp	x19, x20, [sp, #16]
   0x0000ffffbda25164 <+196>:	ldr	x21, [sp, #32]
   0x0000ffffbda25168 <+200>:	ldp	x29, x30, [sp], #48
   0x0000ffffbda2516c <+204>:	ret

This clearly is a fair bit more than what we'd do by hand. The thing
that baffles me, I guess, is why the runtime does all this extra
stuff.
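
For reference, the address arithmetic at the top of
write_ref_array_post_entry(), as read off the disassembly above, can be
modelled roughly as follows. This is a sketch under assumptions: the class
and method names are made up for illustration, and heapOopSize stands for the
value loaded with ldrsw (4 or 8 bytes per reference).

```java
// Illustrative model of the alignment arithmetic; names are hypothetical.
final class PostEntrySketch {
    /** Returns {alignedStart, wordCount} for 'count' references starting at 'start'. */
    static long[] alignedRegion(long start, long count, int heapOopSize) {
        long alignedStart = start & ~7L;                 // and  x3, x0, #0xfffffffffffffff8
        long end = start + (long) heapOopSize * count;   // madd x2, x2, x1, x0
        long alignedEnd = (end + 7) & ~7L;               // add  x2, x2, #0x7 ; and x2, x2, #...fff8
        long words = (alignedEnd - alignedStart) >>> 3;  // sub  x2, x2, x3 ; lsr x2, x2, #3
        return new long[] {alignedStart, words};
    }

    public static void main(String[] args) {
        long[] r = alignedRegion(0x1004L, 3, 4);
        System.out.println(Long.toHexString(r[0]) + " " + r[1]);
    }
}
```

The rounding only has an effect when references are 4 bytes wide; with
8-byte references start and end are already word aligned.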

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From stuart.monteith at arm.com  Thu Apr  9 19:18:01 2020
From: stuart.monteith at arm.com (Stuart Monteith)
Date: Thu, 9 Apr 2020 20:18:01 +0100
Subject: [aarch64-port-dev ] RFR(S): 8241587: Aarch64: remove x86
 specifics from os_linux.cpp/hpp/inline.hpp
In-Reply-To: <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com>
References: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com>
 <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com>
Message-ID: <ec4efd73-4d23-3d9b-5542-8cd269f18cdd@arm.com>

Thanks David. I'll need someone to push this for me. Ningsheng - would
you be able to?

Thanks,
	Stuart

On 01/04/2020 11:03, David Holmes wrote:
> Hi Stuart,
> 
> On 1/04/2020 7:29 pm, Stuart Monteith wrote:
>> Hello,
>>     This patch removes a couple of x86 specifics from aarch64
>> code. Tested with hotspot tier1.
>>
>> Webrev:
>>     http://cr.openjdk.java.net/~smonteith/8241587/webrev.0/
>> Bug:
>>     https://bugs.openjdk.java.net/browse/JDK-8241587
> 
> That clean up seems good to me.
> 
>> Thanks,
>>     Stuart
>>
>> IMPORTANT NOTICE: The contents of this email and any attachments are
>> confidential and may also be privileged. If you are not the intended
>> recipient, please notify the sender immediately and do not disclose
>> the contents to any other person, use it for any purpose, or store or
>> copy the information in any medium. Thank you.
> 
> That footer seems inappropriate for OpenJDK emails.
> 
> Cheers,
> David
> 


From ci_notify at linaro.org  Thu Apr  9 23:41:53 2020
From: ci_notify at linaro.org (ci_notify at linaro.org)
Date: Thu, 9 Apr 2020 23:41:53 +0000 (UTC)
Subject: [aarch64-port-dev ] JTREG, JCStress,
 SPECjbb2015 and Hadoop/Terasort results for OpenJDK JDK on AArch64
Message-ID: <35869368.16925.1586475714461.JavaMail.javamailuser@localhost>

This is a summary of the JTREG test results
===========================================
 
The build and test results are cycled every 15 days.
 
For detailed information on the test output please refer to: 
 
  http://openjdk.linaro.org/jdkX/openjdk-jtreg-nightly-tests/summary/2020/099/summary.html
 
-------------------------------------------------------------------------------
client-release/hotspot
-------------------------------------------------------------------------------
Build 0: aarch64/2018/oct/15 pass: 5,780; fail: 19; not run: 90

-------------------------------------------------------------------------------
client-release/jdk
-------------------------------------------------------------------------------
Build 0: aarch64/2018/oct/15 pass: 8,495; fail: 670; error: 23

-------------------------------------------------------------------------------
client-release/langtools
-------------------------------------------------------------------------------
Build 0: aarch64/2018/oct/15 pass: 3,970; fail: 5

-------------------------------------------------------------------------------
release/hotspot
-------------------------------------------------------------------------------
Build 0: aarch64/2020/jan/13 pass: 5,770; fail: 44
Build 1: aarch64/2020/jan/15 pass: 5,770; fail: 46
Build 2: aarch64/2020/jan/20 pass: 5,776; fail: 44
Build 3: aarch64/2020/jan/22 pass: 5,776; fail: 44
Build 4: aarch64/2020/jan/24 pass: 5,775; fail: 45
Build 5: aarch64/2020/jan/27 pass: 5,776; fail: 44
Build 6: aarch64/2020/jan/29 pass: 5,776; fail: 44
Build 7: aarch64/2020/feb/01 pass: 5,777; fail: 46
Build 8: aarch64/2020/feb/03 pass: 5,777; fail: 46
Build 9: aarch64/2020/feb/05 pass: 5,778; fail: 46
Build 10: aarch64/2020/feb/10 pass: 5,781; fail: 46
Build 11: aarch64/2020/feb/12 pass: 5,786; fail: 46
Build 12: aarch64/2020/mar/06 pass: 5,797; fail: 46
Build 13: aarch64/2020/mar/16 pass: 5,796; fail: 47
Build 14: aarch64/2020/apr/08 pass: 5,816; fail: 46; error: 2

-------------------------------------------------------------------------------
release/jdk
-------------------------------------------------------------------------------
Build 0: aarch64/2020/jan/13 pass: 8,825; fail: 524; error: 20
Build 1: aarch64/2020/jan/15 pass: 8,827; fail: 524; error: 19
Build 2: aarch64/2020/jan/20 pass: 8,830; fail: 529; error: 16
Build 3: aarch64/2020/jan/22 pass: 8,829; fail: 528; error: 19
Build 4: aarch64/2020/jan/24 pass: 8,832; fail: 537; error: 16
Build 5: aarch64/2020/jan/27 pass: 8,846; fail: 523; error: 17
Build 6: aarch64/2020/jan/29 pass: 8,844; fail: 522; error: 19
Build 7: aarch64/2020/feb/01 pass: 8,848; fail: 523; error: 18
Build 8: aarch64/2020/feb/03 pass: 8,851; fail: 525; error: 15
Build 9: aarch64/2020/feb/05 pass: 8,851; fail: 526; error: 15
Build 10: aarch64/2020/feb/10 pass: 8,858; fail: 518; error: 20
Build 11: aarch64/2020/feb/12 pass: 8,849; fail: 525; error: 17
Build 12: aarch64/2020/mar/06 pass: 8,870; fail: 526; error: 17
Build 13: aarch64/2020/mar/16 pass: 8,872; fail: 525; error: 16
Build 14: aarch64/2020/apr/08 pass: 8,891; fail: 532; error: 13

4 fatal errors were detected; please follow the link above for more detail.

-------------------------------------------------------------------------------
release/langtools
-------------------------------------------------------------------------------
Build 0: aarch64/2020/jan/10 pass: 4,030
Build 1: aarch64/2020/jan/13 pass: 4,030
Build 2: aarch64/2020/jan/15 pass: 4,031
Build 3: aarch64/2020/jan/20 pass: 4,033
Build 4: aarch64/2020/jan/22 pass: 4,033
Build 5: aarch64/2020/jan/24 pass: 4,033
Build 6: aarch64/2020/jan/27 pass: 4,033
Build 7: aarch64/2020/feb/01 pass: 4,036
Build 8: aarch64/2020/feb/03 pass: 4,036
Build 9: aarch64/2020/feb/05 pass: 4,036
Build 10: aarch64/2020/feb/10 pass: 4,037
Build 11: aarch64/2020/feb/12 pass: 4,037
Build 12: aarch64/2020/mar/06 pass: 4,039
Build 13: aarch64/2020/mar/16 pass: 4,039
Build 14: aarch64/2020/apr/08 pass: 4,042

-------------------------------------------------------------------------------
server-release/hotspot
-------------------------------------------------------------------------------
Build 0: aarch64/2018/oct/15 pass: 5,787; fail: 18; not run: 90

-------------------------------------------------------------------------------
server-release/jdk
-------------------------------------------------------------------------------
Build 0: aarch64/2018/oct/15 pass: 8,476; fail: 686; error: 27

-------------------------------------------------------------------------------
server-release/langtools
-------------------------------------------------------------------------------
Build 0: aarch64/2018/oct/15 pass: 3,970; fail: 5

Previous results can be found here: 
 
  http://openjdk.linaro.org/jdkX/openjdk-jtreg-nightly-tests/index.html
 

SPECjbb2015 composite regression test completed
===============================================

This test measures the relative performance of the server
compiler running the SPECjbb2015 composite tests and compares
the performance against the baseline performance of the server
compiler taken on 2016-11-21.

In accordance with [1], the SPECjbb2015 tests are run on a system
which is not production ready and does not meet all the
requirements for publishing compliant results. The numbers below
shall be treated as non-compliant (nc) and are for experimental
purposes only.

Relative performance: Server max-jOPS (nc): 8.14x
Relative performance: Server critical-jOPS (nc): 12.38x

Details of the test setup and historical results may be found here:

    http://openjdk.linaro.org/jdkX/SPECjbb2015-results/

[1] http://www.spec.org/fairuse.html#Academic

Regression test Hadoop-Terasort completed
=========================================

This test measures the performance of the server and client compilers
running Hadoop sorting a 1GB file using Terasort and compares
the performance against the baseline performance of the Zero interpreter
and against the baseline performance of the server compiler
on 2014-04-01.

Relative performance: Zero: 1.0, Server: 210.67

Server 210.67 / Server 2014-04-01 (71.00): 2.97x

Details of the test setup and historical results may be found here:

    http://openjdk.linaro.org/jdkX/hadoop-terasort-benchmark-results/

This is a summary of the jcstress test results
==============================================
 
The build and test results are cycled every 15 days.
 
2020-01-11 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/010/results/
2020-01-14 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/013/results/
2020-01-16 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/015/results/
2020-01-21 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/020/results/
2020-01-23 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/022/results/
2020-01-25 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/024/results/
2020-01-28 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/027/results/
2020-02-02 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/032/results/
2020-02-04 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/034/results/
2020-02-06 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/036/results/
2020-02-11 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/041/results/
2020-02-13 pass rate: 10490/10490, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/043/results/
2020-03-17 pass rate: 9702/9702, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/066/results/
2020-03-19 pass rate: 9702/9702, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/076/results/
2020-04-09 pass rate: 9702/9702, results: http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/2020/099/results/
 
For detailed information on the test output please refer to: 
 
  http://openjdk.linaro.org/jdkX/jcstress-nightly-runs/

From ningsheng.jian at arm.com  Fri Apr 10 02:15:48 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Fri, 10 Apr 2020 10:15:48 +0800
Subject: [aarch64-port-dev ] RFR(S): 8241587: Aarch64: remove x86
 specifics from os_linux.cpp/hpp/inline.hpp
In-Reply-To: <ec4efd73-4d23-3d9b-5542-8cd269f18cdd@arm.com>
References: <9be2bcfe-faa4-1dc0-53fe-989c962f0ad7@arm.com>
 <4c10a542-f629-3a37-3d11-8809d70ebeea@oracle.com>
 <ec4efd73-4d23-3d9b-5542-8cd269f18cdd@arm.com>
Message-ID: <7641c396-fde8-dc35-c355-18faba5c5a39@arm.com>

On 4/10/20 3:18 AM, Stuart Monteith wrote:
> Thanks David. I'll need someone to push this for me. Ningsheng - would
> you be able to?
> 

Pushed.

Thanks,
Ningsheng

From Yang.Zhang at arm.com  Fri Apr 10 02:45:45 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Fri, 10 Apr 2020 02:45:45 +0000
Subject: [aarch64-port-dev ] RFR(S): 8241911: AArch64: Fix a potential
 register clash issue in reduce_add2I
In-Reply-To: <d9e67f4b-3038-b48d-ca41-4d0541e0e0a0@redhat.com>
References: <VI1PR0802MB2558190831829ACF49F38C4A8EC70@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <VI1PR0802MB2558A330564CE46630DFED748EC10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <1a9ed6d0-40bb-1dc4-4eff-b55c86627a47@redhat.com>
 <VI1PR0802MB255836F34678A22D9E0BE8868EC10@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <d9e67f4b-3038-b48d-ca41-4d0541e0e0a0@redhat.com>
Message-ID: <VI1PR0802MB2558C6BF0B64E7FD27CFFD168EDE0@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Okay. When the patch is ready, I will send it for review.

Regards
Yang



From Yang.Zhang at arm.com  Fri Apr 10 02:52:45 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Fri, 10 Apr 2020 02:52:45 +0000
Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo
 introduced by JDK-8238690
Message-ID: <VI1PR0802MB25580275D036617C1713AF158EDE0@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi,

Could you please help to review this patch?

JBS: https://bugs.openjdk.java.net/browse/JDK-8242070
Webrev: http://cr.openjdk.java.net/~yzhang/8242070/webrev.00/

JDK-8238690 unified the IR shape for vector shifts by a scalar so that it always uses

ShiftV src (ShiftCntV shift)

When the shift is a scalar, the following IR nodes are generated.

         scalar_shift
               |
     src  ShiftCntV
      |     /
      |    /
      ShiftV

But the AArch64 implementation has an issue in the match rule for
vector shift right by an immediate for the short type.

match(Set dst (RShiftVS src (LShiftCntV shift)));

LShiftCntV should be RShiftCntV here.

Test case:
  public static void shiftR(short[] a, short[] c) {
      for (int i = 0; i < a.length; i++) {
          c[i] = (short)(a[i] >> 2);
      }
  }

IR nodes:
                   imm:2
                     |
      LoadVector  RShiftCntV
           |        /
           |       /
           RShiftVS

C2 assembly generated:

Before:
  0x0000ffffac563764:   orr	w11, wzr, #0x2
  0x0000ffffac563768:   dup	v16.16b, w11  -------- vshiftcnt16B

  0x0000ffffac5637a8:   ldr	q24, [x18, #16]
  0x0000ffffac5637ac:   neg	v25.16b, v16.16b       ------
  0x0000ffffac5637b0:   sshl	v24.8h, v24.8h, v25.8h ------vsra8S
  0x0000ffffac5637b8:   str	q24, [x14, #16]

"match(Set dst (RShiftVS src (LShiftCntV shift)));" matching fails.
RShiftCntV and RShiftVS are matched separately by vshiftcnt16B and vsra8S.

After:
  0x0000ffffac563808:   ldr	q16, [x15, #16]
  0x0000ffffac56380c:   sshr	v16.8h, v16.8h, #2
  0x0000ffffac563814:   str	q16, [x14, #16]

"match(Set dst (RShiftVS src (RShiftCntV shift)));" matching succeeds.
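
Both sequences compute the same values, because SSHL with a negative count
performs a right shift. A minimal Java model of a single 16-bit lane shows
this. It is only a sketch: the class and method names are hypothetical, and
the lane behaviour is written from the architectural description of SSHL
(register) and SSHR (immediate).

```java
// Illustrative single-lane model of SSHL (register) and SSHR (immediate) on 16-bit elements.
final class ShiftLaneSketch {
    /** SSHL on one 16-bit lane: a negative count shifts right arithmetically. */
    static short sshlLane(short value, byte count) {
        if (count >= 0) {
            return count >= 16 ? 0 : (short) (value << count);
        }
        int right = Math.min(-count, 15);  // shifting by 15 already leaves only sign bits
        return (short) (value >> right);
    }

    /** SSHR by an immediate in the range 1..16 on one 16-bit lane. */
    static short sshrLane(short value, int imm) {
        return (short) (value >> Math.min(imm, 15));
    }

    public static void main(String[] args) {
        short[] samples = {-7, 7, Short.MIN_VALUE, Short.MAX_VALUE, 0, -1};
        for (short s : samples) {
            short before = sshlLane(s, (byte) -2);  // neg + sshl, as in the "Before" code
            short after = sshrLane(s, 2);           // sshr #2, as in the "After" code
            System.out.println(s + ": " + before + " " + after);
        }
    }
}
```

The two columns printed by main agree for every sample, so the fix changes
the instruction selection but not the result.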

Performance:
JMH test case is attached in JBS.

Before:
Benchmark               Mode  Cnt   Score   Error  Units
TestVect.testVectShift  avgt   10  66.964 ± 0.052  us/op

After:
Benchmark               Mode  Cnt   Score   Error  Units
TestVect.testVectShift  avgt   10  56.156 ± 0.053  us/op

Testing: tier1 passes with no new failures.

Regards
Yang


From gnu.andrew at redhat.com  Tue Apr 14 20:25:35 2020
From: gnu.andrew at redhat.com (Andrew John Hughes)
Date: Tue, 14 Apr 2020 21:25:35 +0100
Subject: [aarch64-port-dev ] [RFR] [8u] 8u252-b09 Upstream Sync
Message-ID: <f26ee623-55fe-8593-2944-bf30b5d918a9@redhat.com>

Webrevs: https://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/

Merge changesets:
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/corba/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxp/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxws/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jdk/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/hotspot/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/langtools/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/nashorn/merge.changeset
http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/root/merge.changeset

Changes in aarch64-shenandoah-jdk8u252-b09:
  - S8204152: SignedObject throws NullPointerException for null keys
with an initialized Signature object
  - S8219597: (bf) Heap buffer state changes could provoke unexpected
exceptions
  - S8223898: Forward references to Nashorn
  - S8223904: Improve Nashorn matching
  - S8224541: Better mapping of serial ENUMs
  - S8224549: Less Blocking Array Queues
  - S8225603: Enhancement for big integers
  - S8227542: Manifest improved jar headers
  - S8231415: Better signatures in XML
  - S8233250: Better X11 rendering
  - S8233410: Better Build Scripting
  - S8234027: Better JCEKS key support
  - S8234408: Improve TLS session handling
  - S8234825: Better Headings for HTTP Servers
  - S8234841: Enhance buffering of byte buffers
  - S8235274: Enhance typing of methods
  - S8236201: Better Scanner conversions
  - S8238960: linux-i586 builds are inconsistent as the newly build jdk
is not able to reserve enough space for object heap

Main issues of note:
Simple merge, no HotSpot changes.

diffstat for root
 b/.hgtags                                |    1 +
 b/common/autoconf/flags.m4               |   15 +++++++++++++--
 b/common/autoconf/generated-configure.sh |   17 ++++++++++++++---
 3 files changed, 28 insertions(+), 5 deletions(-)

diffstat for corba
 b/.hgtags |    1 +
 1 file changed, 1 insertion(+)

diffstat for jaxp
 b/.hgtags |    1 +
 1 file changed, 1 insertion(+)

diffstat for jaxws
 b/.hgtags |    1 +
 1 file changed, 1 insertion(+)

diffstat for langtools
 b/.hgtags |    1 +
 1 file changed, 1 insertion(+)

diffstat for nashorn
 b/.hgtags                                                          |    1
 b/src/jdk/nashorn/internal/runtime/regexp/RegExpScanner.java       |    6 +--
 b/src/jdk/nashorn/internal/runtime/regexp/joni/Parser.java         |    6 +--
 b/src/jdk/nashorn/internal/runtime/regexp/joni/ast/StringNode.java |   16 ++++++++--
 4 files changed, 21 insertions(+), 8 deletions(-)

diffstat for jdk
 b/.hgtags                                                                     |    1
 b/make/CompileLaunchers.gmk                                                   |    1
 b/src/share/classes/com/sun/crypto/provider/JceKeyStore.java                  |   28 +++
 b/src/share/classes/com/sun/crypto/provider/KeyProtector.java                 |    9 -
 b/src/share/classes/com/sun/crypto/provider/SealedObjectForKeyProtector.java  |   26 ++-
 b/src/share/classes/com/sun/net/httpserver/Headers.java                       |   34 ++++
 b/src/share/classes/java/io/ObjectInputStream.java                            |    4
 b/src/share/classes/java/io/ObjectStreamClass.java                            |   16 +-
 b/src/share/classes/java/lang/instrument/package.html                         |    7
 b/src/share/classes/java/lang/invoke/MethodType.java                          |   38 +---
 b/src/share/classes/java/math/MutableBigInteger.java                          |   24 ++-
 b/src/share/classes/java/nio/ByteBufferAs-X-Buffer.java.template              |    1
 b/src/share/classes/java/nio/Direct-X-Buffer.java.template                    |    1
 b/src/share/classes/java/nio/Heap-X-Buffer.java.template                      |   80 ++++++----
 b/src/share/classes/java/nio/StringCharBuffer.java                            |    9 -
 b/src/share/classes/java/util/Scanner.java                                    |   22 +-
 b/src/share/classes/org/jcp/xml/dsig/internal/dom/DOMKeyInfoFactory.java      |   10 +
 b/src/share/classes/org/jcp/xml/dsig/internal/dom/DOMXMLSignatureFactory.java |   10 +
 b/src/share/classes/sun/security/rsa/RSAKeyFactory.java                       |    3
 b/src/share/classes/sun/security/ssl/ClientHandshaker.java                    |    2
 b/src/share/classes/sun/security/ssl/SSLEngineImpl.java                       |    2
 b/src/share/classes/sun/security/ssl/SSLSessionImpl.java                      |   15 -
 b/src/share/classes/sun/security/ssl/SSLSocketImpl.java                       |    2
 b/src/share/instrument/InvocationAdapter.c                                    |   22 ++
 b/src/share/native/sun/awt/splashscreen/splashscreen_gfx_impl.c               |    2
 b/src/share/native/sun/security/ec/impl/mpi.c                                 |    9 -
 b/src/solaris/native/sun/awt/multiVis.c                                       |    2
 b/src/solaris/native/sun/java2d/x11/X11PMBlitLoops.c                          |    2
 b/src/solaris/native/sun/java2d/x11/X11TextRenderer_md.c                      |    2
 b/src/solaris/native/sun/java2d/x11/XRBackendNative.c                         |    6
 b/test/java/math/BigInteger/ModInvTime.java                                   |   57 +++++++
 31 files changed, 316 insertions(+), 131 deletions(-)

diffstat for hotspot
 b/.hgtags |    1 +
 1 file changed, 1 insertion(+)

Successfully built on x86, x86_64, s390, s390x, ppc, ppc64,
ppc64le & aarch64.

Ok to push?

Thanks,
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222
https://keybase.io/gnu_andrew


From shade at redhat.com  Tue Apr 14 20:28:11 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 14 Apr 2020 22:28:11 +0200
Subject: [aarch64-port-dev ] [RFR] [8u] 8u252-b09 Upstream Sync
In-Reply-To: <f26ee623-55fe-8593-2944-bf30b5d918a9@redhat.com>
References: <f26ee623-55fe-8593-2944-bf30b5d918a9@redhat.com>
Message-ID: <a7aaab1f-103f-ed21-7eb1-abec676961c0@redhat.com>

On 4/14/20 10:25 PM, Andrew John Hughes wrote:
> Webrevs: https://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/
> 
> Merge changesets:
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/corba/merge.changeset
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxp/merge.changeset
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxws/merge.changeset

Look trivially good.

> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jdk/merge.changeset

Looks good.

> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/hotspot/merge.changeset
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/langtools/merge.changeset

Look trivially good.

> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/nashorn/merge.changeset
> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/root/merge.changeset

Look good.

> Ok to push?

Yes.

-- 
Thanks,
-Aleksey


From gnu.andrew at redhat.com  Tue Apr 14 20:32:13 2020
From: gnu.andrew at redhat.com (Andrew John Hughes)
Date: Tue, 14 Apr 2020 21:32:13 +0100
Subject: [aarch64-port-dev ] [RFR] [8u] 8u252-b09 Upstream Sync
In-Reply-To: <a7aaab1f-103f-ed21-7eb1-abec676961c0@redhat.com>
References: <f26ee623-55fe-8593-2944-bf30b5d918a9@redhat.com>
 <a7aaab1f-103f-ed21-7eb1-abec676961c0@redhat.com>
Message-ID: <724d1017-83cb-3cbe-30eb-b74da0710b12@redhat.com>


On 14/04/2020 21:28, Aleksey Shipilev wrote:
> On 4/14/20 10:25 PM, Andrew John Hughes wrote:
>> Webrevs: https://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/
>>
>> Merge changesets:
>> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/corba/merge.changeset
>> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxp/merge.changeset
>> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jaxws/merge.changeset
> 
> Look trivially good.
> 
>> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/jdk/merge.changeset
> 
> Looks good.
> 
>> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/hotspot/merge.changeset
>> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/langtools/merge.changeset
> 
> Look trivially good.
> 
>> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/nashorn/merge.changeset
>> http://cr.openjdk.java.net/~andrew/shenandoah-8/u252-b09/root/merge.changeset
> 
> Look good.
> 
>> Ok to push?
> 
> Yes.
> 

Thanks, pushed.
-- 
Andrew :)

Senior Free Java Software Engineer
Red Hat, Inc. (http://www.redhat.com)

PGP Key: ed25519/0xCFDA0F9B35964222 (hkp://keys.gnupg.net)
Fingerprint = 5132 579D D154 0ED2 3E04  C5A0 CFDA 0F9B 3596 4222
https://keybase.io/gnu_andrew


From gnu.andrew at redhat.com  Tue Apr 14 20:30:56 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Tue, 14 Apr 2020 20:30:56 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/jaxp: 3 new
	changesets
Message-ID: <202004142030.03EKUubc019918@aojmv0008.oracle.com>

Changeset: 70da96196e76
Author:    andrew
Date:      2020-04-06 04:05 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/70da96196e76

Added tag jdk8u252-b09 for changeset 8476d78dc695

! .hgtags

Changeset: d40b54be2536
Author:    andrew
Date:      2020-04-06 05:09 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/d40b54be2536

Merge jdk8u252-b09

! .hgtags

Changeset: 085e5483df61
Author:    andrew
Date:      2020-04-06 05:11 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jaxp/rev/085e5483df61

Added tag aarch64-shenandoah-jdk8u252-b09 for changeset d40b54be2536

! .hgtags


From gnu.andrew at redhat.com  Tue Apr 14 20:31:09 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Tue, 14 Apr 2020 20:31:09 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/langtools: 3
	new changesets
Message-ID: <202004142031.03EKV9KD020088@aojmv0008.oracle.com>

Changeset: 5177983e7b78
Author:    andrew
Date:      2020-04-06 04:06 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/5177983e7b78

Added tag jdk8u252-b09 for changeset 01036da3155c

! .hgtags

Changeset: 805c9d0d623f
Author:    andrew
Date:      2020-04-06 05:09 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/805c9d0d623f

Merge jdk8u252-b09

! .hgtags

Changeset: 22d2bfae6afe
Author:    andrew
Date:      2020-04-06 05:11 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/langtools/rev/22d2bfae6afe

Added tag aarch64-shenandoah-jdk8u252-b09 for changeset 805c9d0d623f

! .hgtags


From gnu.andrew at redhat.com  Tue Apr 14 20:31:31 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Tue, 14 Apr 2020 20:31:31 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/hotspot: 3
	new changesets
Message-ID: <202004142031.03EKVVtW020314@aojmv0008.oracle.com>

Changeset: 8915b1e17904
Author:    andrew
Date:      2020-04-06 04:06 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/8915b1e17904

Added tag jdk8u252-b09 for changeset 095e60e7fc8c

! .hgtags

Changeset: e4e81ae21643
Author:    andrew
Date:      2020-04-06 05:09 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/e4e81ae21643

Merge jdk8u252-b09

! .hgtags

Changeset: 6d1cfa6cdbab
Author:    andrew
Date:      2020-04-06 05:11 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/hotspot/rev/6d1cfa6cdbab

Added tag aarch64-shenandoah-jdk8u252-b09 for changeset e4e81ae21643

! .hgtags


From gnu.andrew at redhat.com  Tue Apr 14 20:31:24 2020
From: gnu.andrew at redhat.com (gnu.andrew at redhat.com)
Date: Tue, 14 Apr 2020 20:31:24 +0000
Subject: [aarch64-port-dev ] hg: aarch64-port/jdk8u-shenandoah/jdk: 18 new
	changesets
Message-ID: <202004142031.03EKVOic020237@aojmv0008.oracle.com>

Changeset: db82be4e049c
Author:    bchristi
Date:      2020-01-21 10:56 -0800
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/db82be4e049c

8224541: Better mapping of serial ENUMs
Reviewed-by: mschoene, rhalade, robm, rriggs, smarks, andrew

! src/share/classes/java/io/ObjectInputStream.java
! src/share/classes/java/io/ObjectStreamClass.java

Changeset: a75922cb4096
Author:    andrew
Date:      2020-04-05 19:18 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/a75922cb4096

8224549: Less Blocking Array Queues
Reviewed-by: mbalao

! src/share/classes/java/io/ObjectStreamClass.java

Changeset: a5f5d7fd9be6
Author:    bpb
Date:      2019-10-29 14:07 -0700
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/a5f5d7fd9be6

8225603: Enhancement for big integers
Reviewed-by: darcy, ahgross, rhalade

! src/share/classes/java/math/MutableBigInteger.java
! src/share/native/sun/security/ec/impl/mpi.c
+ test/java/math/BigInteger/ModInvTime.java

Changeset: ae9b738bfb93
Author:    mbalao
Date:      2019-11-14 15:06 -0800
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/ae9b738bfb93

8227542: Manifest improved jar headers
Reviewed-by: andrew

! src/share/classes/java/lang/instrument/package.html
! src/share/instrument/InvocationAdapter.c

Changeset: 36afd1d59467
Author:    alvdavi
Date:      2019-10-15 08:18 -0400
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/36afd1d59467

8231415: Better signatures in XML
Reviewed-by: andrew

! src/share/classes/org/jcp/xml/dsig/internal/dom/DOMKeyInfoFactory.java
! src/share/classes/org/jcp/xml/dsig/internal/dom/DOMXMLSignatureFactory.java

Changeset: 914f1b61fcff
Author:    bae
Date:      2020-01-16 18:15 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/914f1b61fcff

8233250: Better X11 rendering
Reviewed-by: andrew

! src/share/native/sun/awt/splashscreen/splashscreen_gfx_impl.c
! src/solaris/native/sun/awt/multiVis.c
! src/solaris/native/sun/java2d/x11/X11PMBlitLoops.c
! src/solaris/native/sun/java2d/x11/X11TextRenderer_md.c
! src/solaris/native/sun/java2d/x11/XRBackendNative.c

Changeset: 7749c1865b40
Author:    andrew
Date:      2020-04-06 01:59 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/7749c1865b40

8233410: Better Build Scripting
Reviewed-by: mbalao

! make/CompileLaunchers.gmk

Changeset: 27498adf3cbb
Author:    yan
Date:      2020-04-06 02:10 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/27498adf3cbb

8234027: Better JCEKS key support
Reviewed-by: andrew

! src/share/classes/com/sun/crypto/provider/JceKeyStore.java
! src/share/classes/com/sun/crypto/provider/KeyProtector.java
! src/share/classes/com/sun/crypto/provider/SealedObjectForKeyProtector.java

Changeset: b6be024c35ca
Author:    abakhtin
Date:      2019-11-25 09:50 -0800
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/b6be024c35ca

8234408: Improve TLS session handling
Reviewed-by: andrew

! src/share/classes/sun/security/ssl/ClientHandshaker.java
! src/share/classes/sun/security/ssl/SSLEngineImpl.java
! src/share/classes/sun/security/ssl/SSLSessionImpl.java
! src/share/classes/sun/security/ssl/SSLSocketImpl.java

Changeset: 6592c0288089
Author:    michaelm
Date:      2020-01-29 21:46 +0300
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/6592c0288089

8234825: Better Headings for HTTP Servers
Reviewed-by: chegar, dfuchs, igerasim

! src/share/classes/com/sun/net/httpserver/Headers.java

Changeset: f5fa8182f5af
Author:    andrew
Date:      2020-04-06 03:06 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/f5fa8182f5af

8219597: (bf) Heap buffer state changes could provoke unexpected exceptions
Reviewed-by: mbalao

! src/share/classes/java/nio/Heap-X-Buffer.java.template

Changeset: a6dcbf49526c
Author:    robm
Date:      2020-03-30 05:13 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/a6dcbf49526c

8234841: Enhance buffering of byte buffers
Reviewed-by: alanb, ahgross, rhalade, psandoz

! src/share/classes/java/nio/ByteBufferAs-X-Buffer.java.template
! src/share/classes/java/nio/Direct-X-Buffer.java.template
! src/share/classes/java/nio/Heap-X-Buffer.java.template
! src/share/classes/java/nio/StringCharBuffer.java

Changeset: 34bb0aa775b2
Author:    avoitylov
Date:      2020-02-20 19:35 +0300
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/34bb0aa775b2

8235274: Enhance typing of methods
Reviewed-by: andrew

! src/share/classes/java/lang/invoke/MethodType.java

Changeset: a8f0a9ef1797
Author:    igerasim
Date:      2020-01-30 01:15 -0800
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/a8f0a9ef1797

8236201: Better Scanner conversions
Reviewed-by: ahgross, rhalade, rriggs, skoivu, smarks, andrew

! src/share/classes/java/util/Scanner.java

Changeset: 3ad9fa6a5a13
Author:    valeriep
Date:      2018-06-19 23:33 +0000
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/3ad9fa6a5a13

8204152: SignedObject throws NullPointerException for null keys with an initialized Signature object
Summary: Check for null and throw InvalidKeyException to maintain same behavior
Reviewed-by: xuelei

! src/share/classes/sun/security/rsa/RSAKeyFactory.java

Changeset: b3db2cd0d9c4
Author:    andrew
Date:      2020-04-06 04:23 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/b3db2cd0d9c4

Added tag jdk8u252-b09 for changeset 3ad9fa6a5a13

! .hgtags

Changeset: 812f64a9a671
Author:    andrew
Date:      2020-04-06 05:09 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/812f64a9a671

Merge jdk8u252-b09

! .hgtags
! make/CompileLaunchers.gmk
! src/share/classes/com/sun/net/httpserver/Headers.java
! src/share/classes/java/io/ObjectInputStream.java
! src/share/classes/java/io/ObjectStreamClass.java
! src/share/classes/java/lang/invoke/MethodType.java
! src/share/classes/java/math/MutableBigInteger.java
! src/share/classes/java/util/Scanner.java
! src/share/classes/sun/security/ssl/ClientHandshaker.java
! src/share/classes/sun/security/ssl/SSLEngineImpl.java
! src/share/classes/sun/security/ssl/SSLSocketImpl.java
! src/solaris/native/sun/awt/multiVis.c
! src/solaris/native/sun/java2d/x11/XRBackendNative.c

Changeset: 0d7976fa1bc7
Author:    andrew
Date:      2020-04-06 05:11 +0100
URL:       https://hg.openjdk.java.net/aarch64-port/jdk8u-shenandoah/jdk/rev/0d7976fa1bc7

Added tag aarch64-shenandoah-jdk8u252-b09 for changeset 812f64a9a671

! .hgtags


From stumon01 at arm.com  Thu Apr 16 11:29:19 2020
From: stumon01 at arm.com (Stuart Monteith)
Date: Thu, 16 Apr 2020 12:29:19 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
 Concurrent Class Unloading
In-Reply-To: <5018c8e8-73f9-ad71-1e0b-7874e98dea3c@redhat.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <105c4a4a-59c9-8095-6d45-642595f65539@redhat.com>
 <10e5adb8-3170-253b-17c4-ed70a708e404@redhat.com>
 <8af3b484-e9d6-5571-42d8-a42a66ebdd42@redhat.com>
 <bddf2410-760e-1f21-2ed3-ec5a43f29c60@redhat.com>
 <baa4c776-8e85-2832-43d3-650377efde4b@redhat.com>
 <ce61dedf-739c-d618-a28e-8940a2ba4ba4@arm.com>
 <5018c8e8-73f9-ad71-1e0b-7874e98dea3c@redhat.com>
Message-ID: <85312260-c65f-cb86-5a44-ee77e8d04b4d@arm.com>

On 08/04/2020 17:15, Andrew Haley wrote:
> On 4/8/20 4:33 PM, Stuart Monteith wrote:
>> I see what you did there. This comes back to our previous discussion
>> about the value of having immediate oops at all. Isn't that what you are
>> effectively suggesting? That would simplify the code somewhat.
>
> Not entirely. Immediate oops are good for most GCs. But according to what
> Erik said, immediate oops are verboten when we're using ZGC with concurrent
> method unloading, and it seems to be very easy to do.
>

I've incorporated everyone's comments into the latest:
        http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/

I've cleaned up the logic somewhat: instead, movoop will only make an
oop an immediate if there aren't nmethod entry guards. Hopefully the
comments make that clear. This tested OK with a full run of JTREG.

I've restricted adding constants to the wrapper to AARCH64, with a comment
explaining why. I presume we don't want architectures that don't need it
to take the (albeit small) overhead. I've not made it conditional on
aarch64 with ZGC or class unloading.

BR,
        Stuart



From thomas.stuefe at gmail.com  Thu Apr 16 15:18:12 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 16 Apr 2020 17:18:12 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
Message-ID: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>

Hi all,

I am currently trying to wrap my head around the various ways the
CompressedClassSpace is reserved. Coding has grown a bit in complexity with
the advent of CDS/AppCDS and recently some aarch64 changes atop of that,
changing behavior for aarch64 and ppc64. Maybe someone can enlighten me a
bit.

Specifically, I am looking at

Metaspace::reserve_space

and its aarch64-specific offshoot

Metaspace::reserve_preferred_space

Despite their generic-sounding names, these functions can only be used to
allocate ccs. They lack any interface description, so I read through the
code to understand their behavior.

So I tried and here is how I think Metaspace::reserve_space works for the
various combinations of input parameters:

A) requested_addr == NULL && use_requested_addr == false:
[aarch64, ppc64]: Attempt to reserve at one of the preferred OS dependent
allocation points. Failing that, return an unreserved space.
[others]: Reserve a space anywhere.

B) requested_addr == NULL && use_requested_addr == true:
[aarch64, ppc64]: Does nothing, returns an unreserved space immediately. I
assume this would be an invalid combination, but since it is not asserted I
am not sure.
[others]: Reserve a space anywhere (use_requested_addr is ignored).

C) requested_addr != NULL && use_requested_addr == false:
[aarch64, ppc64]: First attempt to reserve at the requested address, but
only if that would cause the space to fall into the lower 4G. Failing
that, allocate at one of the preferred OS dependent allocation points.
Failing that, return an unreserved space.
[others]: Attempt to reserve at requested_addr. Failing that, return an
unreserved space.

D) requested_addr != NULL && use_requested_addr == true:
[aarch64, ppc64]: First attempt to reserve at the requested address, but
only if that would cause the space to fall into the lower 4G. Failing
that, return an unreserved space.
[others]: Attempt to reserve at requested_addr. Failing that, return an
unreserved space. (use_requested_addr is ignored).
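
The four cases above can be condensed into a small decision function. This
is only a model of my reading of the code, not HotSpot code: the names
Platform and reservation_plan are hypothetical, a requested_addr of 0
stands for NULL, and "falls into the lower 4G" is assumed to mean that the
whole space ends at or below 4G. Every plan implicitly ends with "return an
unreserved space" if all attempts fail.

```cpp
#include <cassert>
#include <cstdint>
#include <string>

// Hypothetical model of cases A-D; not taken from the HotSpot sources.
enum class Platform { Aarch64OrPpc64, Other };

// Returns the ordered list of reservation attempts as a readable string.
inline std::string reservation_plan(Platform p, uint64_t requested_addr,
                                    bool use_requested_addr, uint64_t size) {
  assert(size > 0);
  const uint64_t four_g = uint64_t{1} << 32;
  const bool has_addr = requested_addr != 0;

  if (p == Platform::Other) {
    // use_requested_addr is ignored on these platforms.
    return has_addr ? "requested" : "anywhere";
  }

  // The requested address is only honored if the space stays below 4G.
  const bool fits_below_4g =
      has_addr && size <= four_g && requested_addr <= four_g - size;

  std::string plan;
  if (fits_below_4g) plan = "requested";
  if (!use_requested_addr) {
    if (!plan.empty()) plan += ",";
    plan += "preferred-points";
  }
  return plan.empty() ? "none" : plan;
}
```

For example, case B on aarch64 (NULL address, use_requested_addr == true)
yields "none", while the same inputs on other platforms yield "anywhere".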

Note the many subtle platform differences. E.g. on aarch64 we honor the
requested address only if ccs would fall below 4G, whereas on all other
platforms we always honor it. Or how for most platforms the parameter
"use_requested_addr" is just ignored.

Or that on aarch64 we never seem to "try anywhere", we just try a fixed set
of attachment points and if these are all occupied we fail. Is this a bug
or by design? Can we always rely on at least one of the attachment points
being unoccupied? Looking at the options for
MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using
base + (x << shift) seems not to be wanted? There seems to be no fallback
mode which would work with any value of base/shift?

About reserve_preferred_space(), I was confused why a separate
"use_requested_addr" was even needed - requested_addr!=NULL would be a
perfectly valid way to communicate that the requested address should be
used. I wish we could simplify the coding to just two cases:
- hand down a requested address, which is to be taken-or-fail (somewhat
like case D)
- hand down NULL, which means "try whatever": which for most OSes would be
really anywhere, for aarch64 could be the fixed set of attachment points.
This would be case (A).

About case (C): under which circumstances does it happen that caller code
hands down a requested address below 4G which happens to be free? Does that
make sense? In other words, if the whole point of
Metaspace::reserve_preferred_space() is "OS knows better, let it try to
find a good address", would it not make sense to just try a low address as
part of the try-addresses-loop?

Hope these questions make sense, and thanks a lot!

..Thomas

From aph at redhat.com  Thu Apr 16 16:24:06 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 16 Apr 2020 17:24:06 +0100
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
Message-ID: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>

Hi,

On 4/16/20 4:18 PM, Thomas Stüfe wrote:
>
> I am currently trying to wrap my head around the various ways the
> CompressedClassSpace is reserved. Coding has grown a bit in complexity with
> the advent of CDS/AppCDS and recently some aarch64 changes atop of that,
> changing behavior for aarch64 and ppc64. Maybe someone can enlighten me a
> bit.
>
> Specifically, I am looking at
>
> Metaspace::reserve_space
>
> and its aarch64-specific outgrow
>
> Metaspace::reserve_preferred_space

Yowza. This one is mine, I think.

> Despite its generic-sounding name, these functions can only be used to
> allocate ccs. They lack any interface description, so I parsed the code to
> understand their behavior.
>
> So I tried and here is how I think Metaspace::reserve_space works for the
> various combinations of input parameters:

Bear in mind that this was changed recently. It was (even more)
complicated before.

Please read the discussion at
https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html

> A) requested_addr == NULL && use_requested_addr == false:
> [aarch64, ppc64]: Attempt to reserve at one of the preferred OS dependent
> allocation points. Failing that, return an unreserved space.
> [others]: Reserve a space anywhere.
>
> B) requested_addr == NULL && use_requested_addr == true:
> [aarch64, ppc64]: Does nothing, returns an unreserved space immediately. I
> assume this would be an invalid combination, but since it is not asserted I
> am not sure.
> [others]: Reserve a space anywhere (use_requested_addr is ignored).
>
> C) requested_addr != NULL && use_requested_addr == false:
> [aarch64, ppc64]: First attempt to reserve at the requested address, but
> only if that would cause the space to falls into the lower 4G. Failing
> that, allocate at one of the preferred OS dependent allocation points.
> Failing that, return an unreserved space.

Yes. We have to do that, because we can't cope with the heap base
being anything other than a multiple of 4*G. We've got rid of
rheapbase, in other words, for all compiled code.

> [others]: Attempt to reserve at requested_addr. . Failing that, return an
> unreserved space.
>
> D) requested_addr != NULL && use_requested_addr == true:
> [aarch64, ppc64]: First attempt to reserve at the requested address, but
> only if that would cause the space to falls into the lower 4G. Failing
> that, return an unreserved space.
> [others]: Attempt to reserve at requested_addr. . Failing that, return an
> unreserved space. (use_requested_addr is ignored).
>
> Note the many subtle platform differences. E.g. on aarch64 we honor the
> requested address only if ccs would fall below 4G, for all other platforms
> we always honor them. Or how for most platforms the parameter
> "use_requested_addr" is just ignored.
>
> Or that on aarch64 we never seem to "try anywhere", we just try a
> fixed set of attachment points and if these are all occupied we
> fail. Is this a bug or by design?

It's by design. We looked at it and decided that we would always be
able to allocate one of our "nice" points: they are spaced 4G apart,
and it's very unlikely that any Linux system (which is all we support)
would fail to map any of the possibilities.

> Can we always rely at least one of the attachment points being
> unoccupied? Looking at the options for
> MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using
> x + base >> shift seems not to be wanted? There seems to be no fall
> back mode which would work with any value of base/shift?

That is correct.

> About reserve_preferred_space(), I was confused why a separate
> "use_requested_addr" was even needed - requested_addr!=NULL would be a
> perfectly valid way to communicate that the requested address should be
> used. I wish we could simplify the coding to just two cases:
> - hand down a requested address, which is to be taken-or-fail (somewhat
> like case D)
> - hand down NULL, which means "try whatever": which for most OSes would be
> really anywhere, for aarch64 could be the fixed set of attachment points.
> This would be case (A).
>
> About case (C): under which circumstances does it happen that caller code
> hands down a requested address below 4G which happens to be free?

I don't know.

> Does that make sense? In other words, if the whole point of
> Metaspace::reserve_preferred_space() is "OS knows better, let it try
> to find a good address", would it not make sense to just try a low
> address as part of the try-addresses-loop?

We certainly don't want to have to use a dedicated heapbase register
or a shift. Just give us a multiple of 4*G and we're happy.
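
As a rough illustration of why a 4G-aligned base is convenient: with a
zero shift and a base that is a multiple of 4G, the base and the 32-bit
narrow value occupy disjoint bits, so decoding needs no addition and no
shift. The sketch below is plain arithmetic under those assumptions; the
names kFourG, decode_generic and decode_aligned are invented for this
example and do not correspond to the code HotSpot emits.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t kFourG = uint64_t{1} << 32;

// General form: needs the base at hand plus an add and a shift.
inline uint64_t decode_generic(uint64_t base, unsigned shift, uint32_t narrow) {
  return base + (static_cast<uint64_t>(narrow) << shift);
}

// Special form for shift == 0 and base % 4G == 0: the base supplies the
// upper 32 bits and the narrow value the lower 32 bits, so OR is enough.
inline uint64_t decode_aligned(uint64_t base, uint32_t narrow) {
  assert(base % kFourG == 0);
  return base | narrow;
}
```

For any base that is a multiple of 4G, decode_aligned(base, n) equals
decode_generic(base, 0, n); for a base that is not 4G-aligned the OR form
is no longer valid, which is the situation the fixed attachment points
avoid.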

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From thomas.stuefe at gmail.com  Thu Apr 16 17:51:04 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Thu, 16 Apr 2020 19:51:04 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
Message-ID: <CAA-vtUyW50Q0Ar4zBd2XmB+SC_CEubQ-MqG+tqbKWW=jA7VJoQ@mail.gmail.com>

Hi Andrew,

Thanks for the prompt answer. See my answers inline.

On Thu, Apr 16, 2020 at 6:24 PM Andrew Haley <aph at redhat.com> wrote:

> Hi,
>
> On 4/16/20 4:18 PM, Thomas Stüfe wrote:
> >
> > I am currently trying to wrap my head around the various ways the
> > CompressedClassSpace is reserved. Coding has grown a bit in complexity
> with
> > the advent of CDS/AppCDS and recently some aarch64 changes atop of that,
> > changing behavior for aarch64 and ppc64. Maybe someone can enlighten me a
> > bit.
> >
> > Specifically, I am looking at
> >
> > Metaspace::reserve_space
> >
> > and its aarch64-specific outgrow
> >
> > Metaspace::reserve_preferred_space
>
> Yowza. This one is mine, I think.
>
> > Despite its generic-sounding name, these functions can only be used to
> > allocate ccs. They lack any interface description, so I parsed the code
> to
> > understand their behavior.
> >
> > So I tried and here is how I think Metaspace::reserve_space works for the
> > various combinations of input parameters:
>
> Bear in mind that this was changed recently. It was (even more)
> complicated before.
>
>
Yes, ccs reservation is complex, and the aarch64 parts are only a small
part of it. I would love to simplify it a bit but it's not that easy.


> Please read the discussion at
>
> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html
>
>
Thank you for pointing me to the discussion. We have missed that review.

The reason I looked at this coding was due to the new Metaspace
implementation. In the new allocator it is possible to allocate a Klass at
ccs offset zero (currently this never happens, but only by accident). If
CompressedKlassPointers::base() points to the start of ccs, the resulting
narrow Klass pointer would be 0. But the VM cannot tell that apart from a
real NULL reference. So far my cheap fix has been to
move CompressedKlassPointers::base() a bit below the start of the ccs. But
as I saw yesterday, that breaks the 4G-alignment assumption on aarch64.
Never mind, there are different ways to solve that, but I wondered why
aarch64 could not handle this crooked base address.
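
To make the problem concrete, here is a small arithmetic sketch. It is not
HotSpot code: encode_narrow and is_4g_aligned are hypothetical helpers, and
the addresses in the usage note are made-up values chosen only to show the
effect.

```cpp
#include <cassert>
#include <cstdint>

// Encode an address inside the class space relative to 'base'.
inline uint32_t encode_narrow(uint64_t base, unsigned shift, uint64_t addr) {
  assert(addr >= base);
  return static_cast<uint32_t>((addr - base) >> shift);
}

// True if 'addr' is a multiple of 4G.
inline bool is_4g_aligned(uint64_t addr) {
  return (addr & ((uint64_t{1} << 32) - 1)) == 0;
}
```

With a ccs starting at 0x800000000 and base equal to that start, a Klass at
offset zero encodes to 0, the same value as a NULL reference. Lowering the
base by, say, 4096 bytes makes the same Klass encode to 4096, but the base
0x7FFFFF000 is then no longer 4G-aligned.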


> > A) requested_addr == NULL && use_requested_addr == false:
> > [aarch64, ppc64]: Attempt to reserve at one of the preferred OS dependent
> > allocation points. Failing that, return an unreserved space.
> > [others]: Reserve a space anywhere.
> >
> > B) requested_addr == NULL && use_requested_addr == true:
> > [aarch64, ppc64]: Does nothing, returns an unreserved space immediately. I
> > assume this would be an invalid combination, but since it is not asserted I
> > am not sure.
> > [others]: Reserve a space anywhere (use_requested_addr is ignored).
> >
> > C) requested_addr != NULL && use_requested_addr == false:
> > [aarch64, ppc64]: First attempt to reserve at the requested address, but
> > only if that would cause the space to fall into the lower 4G. Failing
> > that, allocate at one of the preferred OS dependent allocation points.
> > Failing that, return an unreserved space.
>
> Yes. We have to do that, because we can't cope with the heap base
> being anything other than a multiple of 4*G. We've got rid of
> rheapbase, in other words, for all compiled code.
>
> > [others]: Attempt to reserve at requested_addr. Failing that, return an
> > unreserved space.
> >
> > D) requested_addr != NULL && use_requested_addr == true:
> > [aarch64, ppc64]: First attempt to reserve at the requested address, but
> > only if that would cause the space to fall into the lower 4G. Failing
> > that, return an unreserved space.
> > [others]: Attempt to reserve at requested_addr. Failing that, return an
> > unreserved space. (use_requested_addr is ignored).
> >
> > Note the many subtle platform differences. E.g. on aarch64 we honor the
> > requested address only if ccs would fall below 4G, for all other platforms
> > we always honor it. Or how for most platforms the parameter
> > "use_requested_addr" is just ignored.
> >
> > Or that on aarch64 we never seem to "try anywhere", we just try a
> > fixed set of attachment points and if these are all occupied we
> > fail. Is this a bug or by design?
>
> It's by design. We looked at it and decided that we would always be
> able to allocate one of our "nice" points: they are spaced 4G apart,
> and it's very unlikely that any Linux system (which is all we support)
> would fail to map any of the possibilities.
>
>
Thank you for clarifying. This clearly distinguishes aarch64 from at least
AIX and possibly Linux ppc, not sure - there we clearly want a fallback
"try anywhere". We have to look at the code again.

I believe either Goetz or I wrote the original AIX version but my memory
is dim. I am at a loss why we restricted this to AIX only. I have to talk
this over with Goetz.


> > Can we always rely on at least one of the attachment points being
> > unoccupied? Looking at the options for
> > MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using
> > base + (x << shift) seems not to be wanted? There seems to be no fall
> > back mode which would work with any value of base/shift?
>
> That is correct.
>
> > About reserve_preferred_space(), I was confused why a separate
> > "use_requested_addr" was even needed - requested_addr!=NULL would be a
> > perfectly valid way to communicate that the requested address should be
> > used. I wish we could simplify the coding to just two cases:
> > - hand down a requested address, which is to be taken-or-fail (somewhat
> > like case D)
> > - hand down NULL, which means "try whatever": which for most OSes would be
> > really anywhere, for aarch64 could be the fixed set of attachment points.
> > This would be case (A).
> >
> > About case (C): under which circumstances does it happen that caller code
> > hands down a requested address below 4G which happens to be free?
>
> I don't know.
>
> > Does that make sense? In other words, if the whole point of
> > Metaspace::reserve_preferred_space() is "OS knows better, let it try
> > to find a good address", would it not make sense to just try a low
> > address as part of the try-addresses-loop?
>
> We certainly don't want to have to use a dedicated heapbase register
> or a shift. Just give us a multiple of 4*G and we're happy.
>
>
Good to know. So, zero based encoding does not have any special place in
your heart? 4G aligned base works just as well?

Thanks, Thomas


-- 
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>
>

From zgu at redhat.com  Thu Apr 16 18:11:26 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 16 Apr 2020 14:11:26 -0400
Subject: [aarch64-port-dev ] [15] RFR(T) 8243008: Shenandoah:
 TestVolatilesShenandoah test failed on aarch64
Message-ID: <3a701e5e-d6f9-0be8-94f8-a0110a26322b@redhat.com>

compiler/c2/aarch64/TestVolatilesShenandoah.java test failed on aarch64, 
because Shenandoah no longer has a traversal mode, but has a new
incremental-update mode.

Bug: https://bugs.openjdk.java.net/browse/JDK-8243008
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8243008/webrev.00/

Thanks,

-Zhengyu


From thomas.stuefe at gmail.com  Thu Apr 16 18:14:08 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Thu, 16 Apr 2020 20:14:08 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <be1c7e88-7c32-c566-5b10-ce189bfe9652@oracle.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <be1c7e88-7c32-c566-5b10-ce189bfe9652@oracle.com>
Message-ID: <CAA-vtUz1nhi_q_0QbBk+BG3cY5L73YoF1G3vsxMTtDgD=S832Q@mail.gmail.com>

Hi Ioi,

On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam <ioi.lam at oracle.com> wrote:

> (I suppose you mean "compressed class space" by "ccs" :-)
>
>
Yes, I think I stole this from Stefan Karlsson :)


> <snip>
>


> I am not even sure if case (C) can happen at all.
>
> I admit that I've been guilty of making the interface even more complicated
> with JDK-8231610 <https://bugs.openjdk.java.net/browse/JDK-8231610>
> (Relocate the CDS archive if it cannot be mapped to the
> requested address). Looks like now is a good time to clean up.
>
>
The coding has been complicated to begin with, and then it usually only
gets worse since no-one has time for a revamp :( A clean up would be very
helpful.

One reason I am looking at this coding now, besides the aarch64 problem, is
that I am trying to disentangle CDS from Metaspace, especially the alignment
policy. Remember, I tried to tackle this last summer, but it keeps biting me.
For such a small problem this is weirdly complicated.


> One thing that can be cleaned up is the call to
> Metaspace::allocate_metaspace_compressed_klass_ptrs:
>
> (a) when CDS is enabled:
>
>     Metaspace::global_initialize()
>     -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>        -> ... MetaspaceShared::map_archives()
>          -> ... reserve the space, eventually calling Metaspace::reserve_space
>          -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>
> (b) when CDS is disabled
>
>     Metaspace::global_initialize()
>     -> allocate_metaspace_compressed_klass_ptrs
>        -> (if cds is not enabled) Metaspace::reserve_space()
>
>
> In case (b), we should first reserve the space, and then call into
> allocate_metaspace_compressed_klass_ptrs. This will simplify the arguments
> of allocate_metaspace_compressed_klass_ptrs, and will also limit the
> variations of calls to Metaspace::reserve_space(). I think this will make
> it possible to drop the use_requested_addr argument and rely simply on
> (requested_addr != NULL).
>
>
So, in all cases we'd pre-reserve the ReservedSpace and hand it down to
Metaspace::allocate_metaspace_compressed_klass_ptrs()?

This would melt down Metaspace::allocate_metaspace_compressed_klass_ptrs()
to just "initialize compressed class space from a pre-arranged
ReservedSpace, and set up base + shift".

We could probably rename that thing
to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base);

We even could move set_narrow_klass_base_and_shift() out of
Metaspace::set_up_compressed_klass_space, then it becomes a series of three
simple operations:
1) obtain a ReservedSpace however you see fit
2) register it with Metaspace as address space for ccs,
3) set_narrow_klass_base_and_shift. We would not have to hand down cds_base
to Metaspace, only for it to be used as base address
in set_narrow_klass_base_and_shift.

One question which came to me today was:

In AppCDS, DynamicArchiveBuilder::do_it() calls Metaspace::reserve_space().
Is that really needed, does a DumpRegion have anything to do with ccs?
Don't they just need some space to dump into? Hope that question is not
dumb.

Thanks, Thomas


> Thanks
> - Ioi
>
>
> Does that make sense? In other words, if the whole point of
> Metaspace::reserve_preferred_space() is "OS knows better, let it try
> to find a good address", would it not make sense to just try a low
> address as part of the try-addresses-loop?
>
> We certainly don't want to have to use a dedicated heapbase register
> or a shift. Just give us a multiple of 4*G and we're happy.
>
>
>
>

From shade at redhat.com  Thu Apr 16 18:16:11 2020
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 16 Apr 2020 20:16:11 +0200
Subject: [aarch64-port-dev ] [15] RFR(T) 8243008: Shenandoah:
 TestVolatilesShenandoah test failed on aarch64
In-Reply-To: <3a701e5e-d6f9-0be8-94f8-a0110a26322b@redhat.com>
References: <3a701e5e-d6f9-0be8-94f8-a0110a26322b@redhat.com>
Message-ID: <386be87e-d971-6bd4-aa36-5cef4aeb5814@redhat.com>

On 4/16/20 8:11 PM, Zhengyu Gu wrote:
> compiler/c2/aarch64/TestVolatilesShenandoah.java test failed on aarch64, 
> because Shenandoah no longer has a traversal mode, but has a new
> incremental-update mode.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8243008
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8243008/webrev.00/

Right. Looks good.

-- 
Thanks,
-Aleksey


From Yang.Zhang at arm.com  Fri Apr 17 06:34:20 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Fri, 17 Apr 2020 06:34:20 +0000
Subject: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter
 names of reduction operations to make code clear
Message-ID: <VI1PR0802MB2558670821C29DC6202817258ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi,

Could you please help to review this patch?

JBS: https://bugs.openjdk.java.net/browse/JDK-8242482
Webrev: http://cr.openjdk.java.net/~yzhang/8242482/webrev.00/

This patch is a follow-up to the previous discussion:
https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008740.html

To make the intent clear, the scalar parameter name is changed to isrc, fsrc or dsrc based on
its data type. The vector parameter name is changed to vsrc. The temp register is renamed in the same way.

Testing: tier1

Regards
Yang


From thomas.stuefe at gmail.com  Fri Apr 17 07:08:24 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Fri, 17 Apr 2020 09:08:24 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <be1c7e88-7c32-c566-5b10-ce189bfe9652@oracle.com>
 <CAA-vtUz1nhi_q_0QbBk+BG3cY5L73YoF1G3vsxMTtDgD=S832Q@mail.gmail.com>
 <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>
Message-ID: <CAA-vtUwU123Gr2CCJrDjU3D6695H_vYZ8ErngtdUWETZJE3c9g@mail.gmail.com>

On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam <ioi.lam at oracle.com> wrote:

>
>
> On 4/16/20 11:14 AM, Thomas Stüfe wrote:
>
> Hi Ioi,
>
> On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>
>> (I suppose you mean "compressed class space" by "ccs" :-)
>>
>>
> Yes, I think I stole this from Stefan Karlsson :)
>
>
>> <snip>
>>
>
>
>> I am not even sure if case (C) can happen at all.
>>
>> I admit that I've been guilty of making the interface even more complicated
>> with JDK-8231610 <https://bugs.openjdk.java.net/browse/JDK-8231610>
>> (Relocate the CDS archive if it cannot be mapped to the
>> requested address). Looks like now is a good time to clean up.
>>
>>
> The coding has been complicated to begin with, and then it usually only
> gets worse since no-one has time for a revamp :( A clean up would be very
> helpful.
>
> One reason I am looking at this coding now, besides the aarch64 problem, is
> that I am trying to disentangle CDS from Metaspace, especially the alignment
> policy. Remember, I tried to tackle this last summer, but it keeps biting me.
> For such a small problem this is weirdly complicated.
>
>
>> One thing that can be cleaned up is the call to
>> Metaspace::allocate_metaspace_compressed_klass_ptrs:
>>
>> (a) when CDS is enabled:
>>
>>     Metaspace::global_initialize()
>>     -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>>        -> ... MetaspaceShared::map_archives()
>>          -> ... reserve the space, eventually calling Metaspace::reserve_space
>>          -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>>
>> (b) when CDS is disabled
>>
>>     Metaspace::global_initialize()
>>     -> allocate_metaspace_compressed_klass_ptrs
>>        -> (if cds is not enabled) Metaspace::reserve_space()
>>
>>
>> In case (b), we should first reserve the space, and then call into
>> allocate_metaspace_compressed_klass_ptrs. This will simplify the arguments
>> of allocate_metaspace_compressed_klass_ptrs, and will also limit the
>> variations of calls to Metaspace::reserve_space(). I think this will make
>> it possible to drop the use_requested_addr argument and rely simply on
>> (requested_addr != NULL).
>>
>>
> So, in all cases we'd pre-reserve the ReservedSpace and hand it down to
> Metaspace::allocate_metaspace_compressed_klass_ptrs()?
>
> This would melt down Metaspace::allocate_metaspace_compressed_klass_ptrs()
> to just "initialize compressed class space from a pre-arranged
> ReservedSpace, and set up base + shift".
>
> We could probably rename that thing
> to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base);
>
> We even could move set_narrow_klass_base_and_shift() out of
> Metaspace::set_up_compressed_klass_space, then it becomes a series of three
> simple operations:
> 1) obtain a ReservedSpace however you see fit
> 2) register it with Metaspace as address space for ccs,
> 3) set_narrow_klass_base_and_shift. We would not have to hand down
> cds_base to Metaspace, only for it to be used as base address
> in set_narrow_klass_base_and_shift.
>
>
> Yes, that seems the right thing to do. That will hopefully make the
> aarch64 initialization code a little simpler as well.
>
>
It would.

> One question which came to me today was:
>
> In AppCDS, DynamicArchiveBuilder::do_it() calls
> Metaspace::reserve_space(). Is that really needed, does a DumpRegion have
> anything to do with ccs? Don't they just need some space to dump into? Hope
> that question is not dumb.
>
> Do you mean:
>
> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
> -> MetaspaceShared::reserve_shared_space
>     -> Metaspace::reserve_space
>
> That's not necessary. When I wrote the code I thought
> Metaspace::reserve_space was a general function for reserving spaces :-)
> but as you said, this function is probably intended only for initializing
> the CCS.
>
>
Oh thank god :) That is good, this really tripped me up when reading the
code.

Thanks, Thomas


> Thanks
> - Ioi
>
> Thanks, Thomas
>
>
>> Thanks
>> - Ioi
>>
>>
>> Does that make sense? In other words, if the whole point of
>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>> to find a good address", would it not make sense to just try a low
>> address as part of the try-addresses-loop?
>>
>> We certainly don't want to have to use a dedicated heapbase register
>> or a shift. Just give us a multiple of 4*G and we're happy.
>>
>>
>>
>>
>

From ioi.lam at oracle.com  Thu Apr 16 17:46:37 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 16 Apr 2020 10:46:37 -0700
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
Message-ID: <be1c7e88-7c32-c566-5b10-ce189bfe9652@oracle.com>

(I suppose you mean "compressed class space" by "ccs" :-)

On 4/16/20 9:24 AM, Andrew Haley wrote:
> Hi,
>
> On 4/16/20 4:18 PM, Thomas Stüfe wrote:
>> I am currently trying to wrap my head around the various ways the
>> CompressedClassSpace is reserved. Coding has grown a bit in complexity with
>> the advent of CDS/AppCDS and recently some aarch64 changes atop of that,
>> changing behavior for aarch64 and ppc64. Maybe someone can enlighten me a
>> bit.
>>
>> Specifically, I am looking at
>>
>> Metaspace::reserve_space
>>
>> and its aarch64-specific outgrowth
>>
>> Metaspace::reserve_preferred_space
> Yowza. This one is mine, I think.
>
>> Despite their generic-sounding names, these functions can only be used to
>> allocate ccs. They lack any interface description, so I parsed the code to
>> understand their behavior.
>>
>> So I tried and here is how I think Metaspace::reserve_space works for the
>> various combinations of input parameters:
> Bear in mind that this was changed recently. It was (even more)
> complicated before.
>
> Please read the discussion at
> https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html
>
>> A) requested_addr == NULL && use_requested_addr == false:
>> [aarch64, ppc64]: Attempt to reserve at one of the preferred OS dependent
>> allocation points. Failing that, return an unreserved space.
>> [others]: Reserve a space anywhere.
>>
>> B) requested_addr == NULL && use_requested_addr == true:
>> [aarch64, ppc64]: Does nothing, returns an unreserved space immediately. I
>> assume this would be an invalid combination, but since it is not asserted I
>> am not sure.
>> [others]: Reserve a space anywhere (use_requested_addr is ignored).
>>
>> C) requested_addr != NULL && use_requested_addr == false:
>> [aarch64, ppc64]: First attempt to reserve at the requested address, but
>> only if that would cause the space to fall into the lower 4G. Failing
>> that, allocate at one of the preferred OS dependent allocation points.
>> Failing that, return an unreserved space.
> Yes. We have to do that, because we can't cope with the heap base
> being anything other than a multiple of 4*G. We've got rid of
> rheapbase, in other words, for all compiled code.
>
>> [others]: Attempt to reserve at requested_addr. Failing that, return an
>> unreserved space.
>>
>> D) requested_addr != NULL && use_requested_addr == true:
>> [aarch64, ppc64]: First attempt to reserve at the requested address, but
>> only if that would cause the space to fall into the lower 4G. Failing
>> that, return an unreserved space.
>> [others]: Attempt to reserve at requested_addr. Failing that, return an
>> unreserved space. (use_requested_addr is ignored).
>>
>> Note the many subtle platform differences. E.g. on aarch64 we honor the
>> requested address only if ccs would fall below 4G, for all other platforms
>> we always honor it. Or how for most platforms the parameter
>> "use_requested_addr" is just ignored.
>>
>> Or that on aarch64 we never seem to "try anywhere", we just try a
>> fixed set of attachment points and if these are all occupied we
>> fail. Is this a bug or by design?
> It's by design. We looked at it and decided that we would always be
> able to allocate one of our "nice" points: they are spaced 4G apart,
> and it's very unlikely that any Linux system (which is all we support)
> would fail to map any of the possibilities.
>
>> Can we always rely on at least one of the attachment points being
>> unoccupied? Looking at the options for
>> MacroAssembler::KlassDecodeMode on aarch64, a simple decoding using
>> base + (x << shift) seems not to be wanted? There seems to be no fall
>> back mode which would work with any value of base/shift?
> That is correct.
>
>> About reserve_preferred_space(), I was confused why a separate
>> "use_requested_addr" was even needed - requested_addr!=NULL would be a
>> perfectly valid way to communicate that the requested address should be
>> used. I wish we could simplify the coding to just two cases:
>> - hand down a requested address, which is to be taken-or-fail (somewhat
>> like case D)
>> - hand down NULL, which means "try whatever": which for most OSes would be
>> really anywhere, for aarch64 could be the fixed set of attachment points.
>> This would be case (A).
>>
>> About case (C): under which circumstances does it happen that caller code
>> hands down a requested address below 4G which happens to be free?
> I don't know.
I am not even sure if case (C) can happen at all.

I admit that I've been guilty of making the interface even more complicated
with JDK-8231610 <https://bugs.openjdk.java.net/browse/JDK-8231610>
(Relocate the CDS archive if it cannot be mapped to the
requested address). Looks like now is a good time to clean up.

One thing that can be cleaned up is the call to
Metaspace::allocate_metaspace_compressed_klass_ptrs:

(a) when CDS is enabled:

    Metaspace::global_initialize()
    -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
       -> ... MetaspaceShared::map_archives()
         -> ... reserve the space, eventually calling Metaspace::reserve_space
         -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()

(b) when CDS is disabled

    Metaspace::global_initialize()
    -> allocate_metaspace_compressed_klass_ptrs
       -> (if cds is not enabled) Metaspace::reserve_space()


In case (b), we should first reserve the space, and then call into
allocate_metaspace_compressed_klass_ptrs. This will simplify the arguments
of allocate_metaspace_compressed_klass_ptrs, and will also limit the
variations of calls to Metaspace::reserve_space(). I think this will make
it possible to drop the use_requested_addr argument and rely simply on
(requested_addr != NULL).
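
The two-case interface discussed in this thread (a non-NULL requested
address means take-or-fail, NULL means "try whatever the platform prefers")
could be modelled roughly as below. This is only a sketch in Java with
made-up names (ReserveSketch, reserve, ATTACH_POINTS); the set of occupied
addresses stands in for the OS refusing a mapping, and the three attachment
points are arbitrary example values, not the ones HotSpot uses.

```java
import java.util.HashSet;
import java.util.OptionalLong;
import java.util.Set;

// Illustrative model only; names and values are hypothetical, not HotSpot code.
class ReserveSketch {
    static final long FOUR_G = 1L << 32;
    // Assumed fixed attachment points, spaced 4G apart.
    static final long[] ATTACH_POINTS = { 1 * FOUR_G, 2 * FOUR_G, 3 * FOUR_G };

    // requestedAddr != null: take that address or fail.
    // requestedAddr == null: try each attachment point in turn.
    static OptionalLong reserve(Long requestedAddr, Set<Long> occupied) {
        if (requestedAddr != null) {
            return occupied.contains(requestedAddr)
                    ? OptionalLong.empty()
                    : OptionalLong.of(requestedAddr);
        }
        for (long p : ATTACH_POINTS) {
            if (!occupied.contains(p)) {
                return OptionalLong.of(p);
            }
        }
        return OptionalLong.empty(); // all attachment points taken: unreserved
    }

    public static void main(String[] args) {
        Set<Long> occupied = new HashSet<>();
        occupied.add(FOUR_G);
        System.out.println(reserve(null, occupied));   // falls through to 2 * 4G
        System.out.println(reserve(FOUR_G, occupied)); // take-or-fail: empty
    }
}
```

With this shape there is no separate use_requested_addr flag; the presence
of an address is the only signal.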

Thanks
- Ioi


>> Does that make sense? In other words, if the whole point of
>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>> to find a good address", would it not make sense to just try a low
>> address as part of the try-addresses-loop?
> We certainly don't want to have to use a dedicated heapbase register
> or a shift. Just give us a multiple of 4*G and we're happy.
>


From ioi.lam at oracle.com  Thu Apr 16 18:28:50 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Thu, 16 Apr 2020 11:28:50 -0700
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <CAA-vtUz1nhi_q_0QbBk+BG3cY5L73YoF1G3vsxMTtDgD=S832Q@mail.gmail.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <be1c7e88-7c32-c566-5b10-ce189bfe9652@oracle.com>
 <CAA-vtUz1nhi_q_0QbBk+BG3cY5L73YoF1G3vsxMTtDgD=S832Q@mail.gmail.com>
Message-ID: <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>


On 4/16/20 11:14 AM, Thomas Stüfe wrote:
> Hi Ioi,
>
> On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>
>     (I suppose you mean "compressed class space" by "ccs" :-)
>
>
> Yes, I think I stole this from Stefan Karlsson :)
>
>     <snip>
>
>     I am not even sure if case (C) can happen at all.
>
>     I admit that I've been guilty of making the interface even more
>     complicated with JDK-8231610
>     <https://bugs.openjdk.java.net/browse/JDK-8231610> (Relocate the
>     CDS archive if it cannot be mapped to the
>     requested address). Looks like now is a good time to clean up.
>
>
> The coding has been complicated to begin with, and then it usually 
> only gets worse since no-one has time for a revamp :( A clean up would 
> be very helpful.
>
> One reason I am looking at this coding now, besides the aarch64 problem,
> is that I am trying to disentangle CDS from Metaspace, especially the
> alignment policy. Remember, I tried to tackle this last summer, but it
> keeps biting me. For such a small problem this is weirdly complicated.
>
>     One thing that can be cleaned up is the call to
>     Metaspace::allocate_metaspace_compressed_klass_ptrs:
>
>     (a) when CDS is enabled:
>
>         Metaspace::global_initialize()
>         -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>            -> ... MetaspaceShared::map_archives()
>              -> ... reserve the space, eventually calling
>                 Metaspace::reserve_space
>              -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>
>     (b) when CDS is disabled
>
>         Metaspace::global_initialize()
>         -> allocate_metaspace_compressed_klass_ptrs
>            -> (if cds is not enabled) Metaspace::reserve_space()
>
>
>     In case (b), we should first reserve the space, and then call into
>     allocate_metaspace_compressed_klass_ptrs. This will simplify the
>     arguments of allocate_metaspace_compressed_klass_ptrs, and will also
>     limit the variations of calls to Metaspace::reserve_space(). I think
>     this will make it possible to drop the use_requested_addr argument
>     and rely simply on (requested_addr != NULL).
>
>
> So, in all cases we'd pre-reserve the ReservedSpace and hand it down 
> to Metaspace::allocate_metaspace_compressed_klass_ptrs()?
>
> This would melt down
> Metaspace::allocate_metaspace_compressed_klass_ptrs() to just
> "initialize compressed class space from a pre-arranged ReservedSpace,
> and set up base + shift".
>
> We could probably rename that thing
> to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base);
>
> We even could move set_narrow_klass_base_and_shift() out of 
> Metaspace::set_up_compressed_klass_space, then it becomes a series of 
> three simple operations:
> 1) obtain a ReservedSpace however you see fit
> 2) register it with Metaspace as address space for ccs,
> 3) set_narrow_klass_base_and_shift. We would not have to hand down
> cds_base to Metaspace, only for it to be used as base address
> in set_narrow_klass_base_and_shift.
>

Yes, that seems the right thing to do. That will hopefully make the 
aarch64 initialization code a little simpler as well.

> One question which came to me today was:
>
> In AppCDS, DynamicArchiveBuilder::do_it() calls
> Metaspace::reserve_space(). Is that really needed, does a DumpRegion
> have anything to do with ccs? Don't they just need some space to dump
> into? Hope that question is not dumb.
>
Do you mean:

DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
-> MetaspaceShared::reserve_shared_space
    -> Metaspace::reserve_space

That's not necessary. When I wrote the code I thought 
Metaspace::reserve_space was a general function for reserving spaces :-) 
but as you said, this function is probably intended only for 
initializing the CCS.

Thanks
- Ioi

> Thanks, Thomas
>
>     Thanks
>     - Ioi
>
>
>>>     Does that make sense? In other words, if the whole point of
>>>     Metaspace::reserve_preferred_space() is "OS knows better, let it try
>>>     to find a good address", would it not make sense to just try a low
>>>     address as part of the try-addresses-loop?
>>     We certainly don't want to have to use a dedicated heapbase register
>>     or a shift. Just give us a multiple of 4*G and we're happy.
>>
>


From aph at redhat.com  Fri Apr 17 08:42:10 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 17 Apr 2020 09:42:10 +0100
Subject: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter
 names of reduction operations to make code clear
In-Reply-To: <VI1PR0802MB2558670821C29DC6202817258ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB2558670821C29DC6202817258ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <f998a1cc-2ff8-d3b0-e3db-6c9ef4ffecd8@redhat.com>

On 4/17/20 7:34 AM, Yang Zhang wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8242482
> Webrev: http://cr.openjdk.java.net/~yzhang/8242482/webrev.00/
>
> This patch is a followup patch of previous discussion.
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008740.html
>
> To make the intent clear, the scalar parameter name is changed to isrc, fsrc or dsrc based on
> its data type. The vector parameter name is changed to vsrc. The temp register is renamed in the same way.

Thanks, that's much nicer. I haven't been able to check every
substitution, though. I'm not quite sure about how to do that.
Is all this stuff covered by our test cases?

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Fri Apr 17 08:47:55 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 17 Apr 2020 09:47:55 +0100
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <CAA-vtUyW50Q0Ar4zBd2XmB+SC_CEubQ-MqG+tqbKWW=jA7VJoQ@mail.gmail.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <CAA-vtUyW50Q0Ar4zBd2XmB+SC_CEubQ-MqG+tqbKWW=jA7VJoQ@mail.gmail.com>
Message-ID: <cc3967ad-f9c8-32e3-b5de-767f264a5b79@redhat.com>

On 4/16/20 6:51 PM, Thomas Stüfe wrote:
> Good to know. So, zero based encoding does not have any special place in
> your heart? 4G aligned base works just as well?

Absolutely, yes.
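
As an illustration of why a 4G-aligned base is convenient, here is a small
sketch in Java with made-up names (AlignedBaseSketch, decodeByAdd,
decodeByOr). It is not the aarch64 MacroAssembler code, only the arithmetic:
with a shift of zero and a base that is a multiple of 4G, the low 32 bits of
the base are all zero, so adding a 32-bit narrow value gives the same result
as OR-ing it into the base. With a "crooked" base that no longer holds.

```java
// Illustrative arithmetic only; names are hypothetical, not HotSpot code.
class AlignedBaseSketch {
    static final long FOUR_G = 1L << 32;

    static boolean isFourGAligned(long base) {
        return (base & (FOUR_G - 1)) == 0;
    }

    // General decoding: needs a full 64-bit add with the base.
    static long decodeByAdd(long narrow, long base) {
        return base + narrow;
    }

    // Decoding by merging bits: only valid when base is 4G-aligned
    // and narrow fits in 32 bits.
    static long decodeByOr(long narrow, long base) {
        return base | narrow;
    }

    public static void main(String[] args) {
        long narrow = 0xFFFF_FFF8L;
        long aligned = 3 * FOUR_G;
        System.out.println(decodeByAdd(narrow, aligned) == decodeByOr(narrow, aligned)); // prints true

        long crooked = aligned - 4096; // base moved one page down
        System.out.println(decodeByAdd(4096, crooked) == decodeByOr(4096, crooked));     // prints false
    }
}
```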

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From Yang.Zhang at arm.com  Fri Apr 17 09:13:11 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Fri, 17 Apr 2020 09:13:11 +0000
Subject: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter
 names of reduction operations to make code clear
In-Reply-To: <f998a1cc-2ff8-d3b0-e3db-6c9ef4ffecd8@redhat.com>
References: <VI1PR0802MB2558670821C29DC6202817258ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <f998a1cc-2ff8-d3b0-e3db-6c9ef4ffecd8@redhat.com>
Message-ID: <VI1PR0802MB2558027F96432B28B3C25EFF8ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi Andrew
Besides tier1, I also tested these operations with the Vector API tests, which cover all the reduction operations.

This directory also contains some test cases for reduction operations, which were added in [1].
https://hg.openjdk.java.net/jdk/jdk/file/55c4283a7606/test/hotspot/jtreg/compiler/loopopts/superword

[1] https://bugs.openjdk.java.net/browse/JDK-8240248

Regards
Yang

-----Original Message-----
From: Andrew Haley <aph at redhat.com> 
Sent: Friday, April 17, 2020 4:42 PM
To: Yang Zhang <Yang.Zhang at arm.com>; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net
Cc: nd <nd at arm.com>
Subject: Re: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter names of reduction operations to make code clear

On 4/17/20 7:34 AM, Yang Zhang wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8242482
> Webrev: http://cr.openjdk.java.net/~yzhang/8242482/webrev.00/
>
> This patch is a followup patch of previous discussion.
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008740.html
>
> To make the intent clear, the scalar parameter name is changed to isrc, fsrc or dsrc based on
> its data type. The vector parameter name is changed to vsrc. The temp register is renamed in the same way.

Thanks, that's much nicer. I haven't been able to check every substitution, though. I'm not quite sure about how to do that.
Is all this stuff covered by our test cases?

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From Yang.Zhang at arm.com  Fri Apr 17 09:14:24 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Fri, 17 Apr 2020 09:14:24 +0000
Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo
 introduced by JDK-8238690
In-Reply-To: <VI1PR0802MB25580275D036617C1713AF158EDE0@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB25580275D036617C1713AF158EDE0@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <VI1PR0802MB255835F5D4BD55CDAFF4B1578ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi Andrew

Pinging this again. Could you please help review it?

Regards
Yang

-----Original Message-----
From: aarch64-port-dev <aarch64-port-dev-bounces at openjdk.java.net> On Behalf Of Yang Zhang
Sent: Friday, April 10, 2020 10:53 AM
To: aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net
Cc: nd <nd at arm.com>
Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo introduced by JDK-8238690

Hi,

Could you please help to review this patch?

JBS: https://bugs.openjdk.java.net/browse/JDK-8242070
Webrev: http://cr.openjdk.java.net/~yzhang/8242070/webrev.00/

JDK-8238690 unified the IR shape for vector shifts by scalar so that it always uses

ShiftV src (ShiftCntV shift)

When shift is scalar, the following IR nodes are generated.

         scalar_shift
               |
     src  ShiftCntV
      |     /
      |    /
      ShiftV

But when implementing this on AArch64, an error was introduced in the match rule for vector shift right with an immediate shift count for the short type.

match(Set dst (RShiftVS src (LShiftCntV shift)));

LShiftCntV should be RShiftCntV here.

Test case:
  public static void shiftR(short[] a, short[] c) {
      for (int i = 0; i < a.length; i++) {
          c[i] = (short)(a[i] >> 2);
      }
  }

IR nodes:
               imm:2
                 |
 LoadVector  RShiftCntV
      |        /
      |       /
      RShiftVS

C2 assembly generated:

Before:
  0x0000ffffac563764:   orr	w11, wzr, #0x2
  0x0000ffffac563768:   dup	v16.16b, w11  -------- vshiftcnt16B

  0x0000ffffac5637a8:   ldr	q24, [x18, #16]
  0x0000ffffac5637ac:   neg	v25.16b, v16.16b       ------
  0x0000ffffac5637b0:   sshl	v24.8h, v24.8h, v25.8h ------vsra8S
  0x0000ffffac5637b8:   str	q24, [x14, #16]

"match(Set dst (RShiftVS src (LShiftCntV shift)));" matching fails.
RShiftCntV and RShiftVS are matched separately by vshiftcnt16B and vsra8S.
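
The neg + sshl pair above still computes an arithmetic right shift, because
SSHL treats a negative per-lane count as a right shift. As a rough scalar
model of one 16-bit lane (the class and method names below are illustrative
only and are not HotSpot code; counts are assumed to stay within -15..15):

```java
class ShiftLaneModel {
    // Model of one 16-bit SSHL lane: a non-negative count shifts left,
    // a negative count shifts right arithmetically.
    static short sshlLane(short value, byte count) {
        if (count >= 0) {
            return (short) (value << count);
        }
        return (short) (value >> -count);
    }

    // Model of one 16-bit SSHR lane with an immediate shift count.
    static short sshrLane(short value, int imm) {
        return (short) (value >> imm);
    }

    public static void main(String[] args) {
        short[] samples = {0, 1, -1, 1234, -1234, Short.MAX_VALUE, Short.MIN_VALUE};
        for (short s : samples) {
            short viaSshl = sshlLane(s, (byte) -2); // dup + neg + sshl sequence
            short viaSshr = sshrLane(s, 2);         // single sshr #2
            System.out.println(s + " >> 2: sshl=" + viaSshl + " sshr=" + viaSshr);
        }
    }
}
```

Both paths give the same value for every lane, so the difference between the
two sequences is only in instruction count, not in the result.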

After:
  0x0000ffffac563808:   ldr	q16, [x15, #16]
  0x0000ffffac56380c:   sshr	v16.8h, v16.8h, #2
  0x0000ffffac563814:   str	q16, [x14, #16]

"match(Set dst (RShiftVS src (RShiftCntV shift)));" matching succeeds.

Performance:
JMH test case is attached in JBS.

Before:
Benchmark               Mode  Cnt   Score   Error  Units
TestVect.testVectShift  avgt   10  66.964 ± 0.052  us/op

After:
Benchmark               Mode  Cnt   Score   Error  Units
TestVect.testVectShift  avgt   10  56.156 ± 0.053  us/op

Testing: tier1
Passed with no new failures.

Regards
Yang


From thomas.stuefe at gmail.com  Sat Apr 18 06:26:24 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Sat, 18 Apr 2020 08:26:24 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <cc3967ad-f9c8-32e3-b5de-767f264a5b79@redhat.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <CAA-vtUyW50Q0Ar4zBd2XmB+SC_CEubQ-MqG+tqbKWW=jA7VJoQ@mail.gmail.com>
 <cc3967ad-f9c8-32e3-b5de-767f264a5b79@redhat.com>
Message-ID: <CAA-vtUyaQUqMVrHub0=5kr0Ut05A=v3z8SM9f+tsJQ+OcgMTjw@mail.gmail.com>

On Fri, Apr 17, 2020 at 10:48 AM Andrew Haley <aph at redhat.com> wrote:

> On 4/16/20 6:51 PM, Thomas Stüfe wrote:
> > Good to know. So, zero based encoding does not have any special place in
> > your heart? 4G aligned base works just as well?
>
> Absolutely, yes.
>
>
Just occurred to me that aarch64 also relies on SharedBaseAddress being 4G
aligned. The default is 32G so it works out. If you modify it with
-XX:SharedBaseAddress, looks like the setting is ignored when the value is
not usable.

I guess that is all okay, maybe we just need to make it more explicit in
the code.

..Thomas


> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>
>

From thomas.stuefe at gmail.com  Sat Apr 18 07:15:21 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Sat, 18 Apr 2020 09:15:21 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <be1c7e88-7c32-c566-5b10-ce189bfe9652@oracle.com>
 <CAA-vtUz1nhi_q_0QbBk+BG3cY5L73YoF1G3vsxMTtDgD=S832Q@mail.gmail.com>
 <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>
Message-ID: <CAA-vtUw_07m7oVB2ycyhceDFAcRx3Q0si4g0jcwc2W=VvUs5pQ@mail.gmail.com>

Hi Ioi,

I am working on a small patch and have some more questions.

- First, a simple one, in
DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta(), the
space does not have anything to do with metaspace, as you wrote, so the
alignment could be anything, right?

- Out of curiosity, when you pack the different regions (DumpRegion::pack)
you align the end to page size. Why? Why could the next region not simply
follow immediately? I looked if any code needs a region to be page aligned,
but may have missed it.

- void MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() :

I assume this code has to work for all three cases, right?
1) lp32
2) lp64 with UseCompressedClassPointers
3) lp64 without UseCompressedClassPointers

If yes, does the setting for UseCompressedClassPointers have to be the same
at run time?


In this layout:
  // On 64-bit VM, the heap and class space layout will be the same as if
  // you're running in -Xshare:on mode:
  //
  //                              +-- SharedBaseAddress (default = 0x800000000)
  //                              v
  // +-..---------+---------+ ... +----+----+----+--------------------+
  // |    Heap    | Archive |     | MC | RW | RO |    class space     |
  // +-..---------+---------+ ... +----+----+----+--------------------+
  // |<--   MaxHeapSize  -->|     |<-- UnscaledClassSpaceMax = 4GB -->|
  //

Why does the class space have to follow mc+rw+ro? Could it come before?

Actually, does it have to be in the same space at all, or could it live
somewhere completely different?

Thanks!

On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam <ioi.lam at oracle.com> wrote:

>
>
> On 4/16/20 11:14 AM, Thomas Stüfe wrote:
>
> Hi Ioi,
>
> On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>
>> (I suppose you mean "compressed class space" by "ccs" :-)
>>
>>
> Yes, I think I stole this from Stefan Karlsson :)
>
>
>> <snip>
>>
>
>
>> I am not even sure if case (C) can happen at all.
>>
>> I admit that I've been guilty of making the interface even more
>> complicated
>> with JDK-8231610 <https://bugs.openjdk.java.net/browse/JDK-8231610>
>> (Relocate the CDS archive if it cannot be mapped to the
>> requested address). Looks now is a good time to clean up.
>>
>>
> The coding has been complicated to begin with, and then it usually only
> gets worse since no-one has time for a revamp :( A clean up would be very
> helpful.
>
> One reason I look at this coding now, beside the aarch64 problem, was that
> I try to disentangle CDS from Metaspace, especially the alignment policy.
> Remember, I tried to tackle this last summer? but it keeps biting me. For
> such a small problem this is weirdly complicated.
>
>
>> One thing that can be cleaned up is the call to
>> Metaspace::allocate_metaspace_compressed_klass_ptrs:
>>
>> (a) when CDS is enabled:
>>
>>     Metaspace::global_initialize()
>>     -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>>        -> ... MetaspaceShared::map_archives()
>>          -> ... reserve the space, eventually calling Metaspace::reserve_space
>>          -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>>
>> (b) when CDS is disabled
>>
>>     Metaspace::global_initialize()
>>     -> allocate_metaspace_compressed_klass_ptrs
>>        -> (if cds is not enabled) Metaspace::reserve_space()
>>
>>
>> In case (b), we should first reserve the space, and then call into
>> allocate_metaspace_compressed_klass_ptrs. This will simplify the arguments
>> of allocate_metaspace_compressed_klass_ptrs, and will also limit the
>> variations
>> of calls to Metaspace::reserve_space(). I think this will make it
>> possible to
>> drop the use_requested_addr argument and rely simply on (requested_addr
>> != NULL)
>>
>>
> So, in all cases we'd pre-reserve the ReservedSpace and hand it down to
> Metaspace::allocate_metaspace_compressed_klass_ptrs()?
>
> This would melt down Metaspace::allocate_metaspace_compressed_klass_ptrs()
> to just "initialize compressed class space from a pre-arranged
> ReservedSpace, and set up base + shift".
>
> We could probably rename that thing
> to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base);
>
> We even could move set_narrow_klass_base_and_shift() out of
> Metaspace::set_up_compressed_klass_space, then it becomes a series of three
> simple operations:
> 1) obtain a ReservedSpace however you see fit
> 2) register it with Metaspace as address space for ccs,
> 3) set_narrow_klass_base_and_shift. We would not have to hand down
> cds_base to Metaspace, only for it to be used as base address
> in set_narrow_klass_base_and_shift.
>
>
> Yes, that seems the right thing to do. That will hopefully make the
> aarch64 initialization code a little simpler as well.
>
> One question which came to me today was:
>
> In AppCDS, DynamicArchiveBuilder::do_it() calls
> Metaspace::reserve_space(). Is that really needed, does a DumpRegion have
> anything to do with ccs? Don't they just need some space to dump into? Hope
> that question is not dumb.
>
> Do you mean:
>
> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
> -> MetaspaceShared::reserve_shared_space
>     -> Metaspace::reserve_space
>
> That's not necessary. When I wrote the code I thought
> Metaspace::reserve_space was a general function for reserving spaces :-)
> but as you said, this function is probably intended only for initializing
> the CCS.
>
> Thanks
> - Ioi
>
> Thanks, Thomas
>
>
>> Thanks
>> - Ioi
>>
>>
>> Does that make sense? In other words, if the whole point of
>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>> to find a good address", would it not make sense to just try a low
>> address as part of the try-addresses-loop?
>>
>> We certainly don't want to have to use a dedicated heapbase register
>> or a shift. Just give us a multiple of 4*G and we're happy.
>>
>>
>>
>>
>

From nick.gasson at arm.com  Mon Apr 20 02:35:48 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Mon, 20 Apr 2020 10:35:48 +0800
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <cc3967ad-f9c8-32e3-b5de-767f264a5b79@redhat.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <CAA-vtUyW50Q0Ar4zBd2XmB+SC_CEubQ-MqG+tqbKWW=jA7VJoQ@mail.gmail.com>
 <cc3967ad-f9c8-32e3-b5de-767f264a5b79@redhat.com>
Message-ID: <851roiopdn.fsf@arm.com>


On 04/17/20 16:47 pm, Andrew Haley wrote:
> On 4/16/20 6:51 PM, Thomas Stüfe wrote:
>> Good to know. So, zero based encoding does not have any special place in
>> your heart? 4G aligned base works just as well?
>
> Absolutely, yes.

There's an extra constraint: above 32G the alignment needs to be
(4 << LogKlassAlignmentInBytes)*G if the compressed klass shift is
non-zero. There's some explanation here:

https://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2019-December/037472.html
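
A minimal sketch of that rule as an alignment check. The helper names are
hypothetical, and the 4G requirement for the remaining cases is taken from
the "multiple of 4*G" remark earlier in this thread rather than from the
HotSpot sources; LogKlassAlignmentInBytes is passed in as a parameter, with 3
used below only as an assumed value:

```java
class KlassBaseAlignment {
    static final long G = 1L << 30;

    // Alignment a candidate encoding base must satisfy under the rule above.
    static long requiredAlignment(long base, int shift, int logKlassAlignmentInBytes) {
        if (shift != 0 && base > 32 * G) {
            return (4L << logKlassAlignmentInBytes) * G;
        }
        return 4 * G; // assumption: plain 4G alignment otherwise
    }

    // True if base is a multiple of the required alignment.
    static boolean isUsableBase(long base, int shift, int logKlassAlignmentInBytes) {
        return base % requiredAlignment(base, shift, logKlassAlignmentInBytes) == 0;
    }

    public static void main(String[] args) {
        int logAlign = 3; // assumed value, for illustration only
        long[] bases = {32 * G, 33 * G, 36 * G, 64 * G};
        for (long base : bases) {
            System.out.println((base / G) + "G: shift=0 -> " + isUsableBase(base, 0, logAlign)
                    + ", shift=3 -> " + isUsableBase(base, 3, logAlign));
        }
    }
}
```

With these assumptions a base of 36G is acceptable without a shift but not
with one, while 64G is acceptable in both cases.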


Nick


From nick.gasson at arm.com  Mon Apr 20 02:55:29 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Mon, 20 Apr 2020 10:55:29 +0800
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <CAA-vtUyaQUqMVrHub0=5kr0Ut05A=v3z8SM9f+tsJQ+OcgMTjw@mail.gmail.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <CAA-vtUyW50Q0Ar4zBd2XmB+SC_CEubQ-MqG+tqbKWW=jA7VJoQ@mail.gmail.com>
 <cc3967ad-f9c8-32e3-b5de-767f264a5b79@redhat.com>
 <CAA-vtUyaQUqMVrHub0=5kr0Ut05A=v3z8SM9f+tsJQ+OcgMTjw@mail.gmail.com>
Message-ID: <85zhb6n9we.fsf@arm.com>

On 04/18/20 14:26 pm, Thomas Stüfe wrote:
> Just occurred to me that aarch64 also relies on SharedBaseAddress being 4G
> aligned. The default is 32G so it works out. If you modify it with
> -XX:SharedBaseAddress, looks like the setting is ignored when the value is
> not usable.
>

Yes that's correct, it's treated as a hint on AArch64 since
8234794, because MacroAssembler::{decode,encode}_klass cannot implement
arbitrary base + (src << shift) without an additional temporary register
that isn't always available when it's called. It seems better to
constrain the possible base addresses than reserve a dedicated
compressed class base register.

All the different compressed class decoding modes should now be covered
by the jtreg tests. So if you have access to an AArch64 machine, running
these should be sufficient to prevent regressions. I'm also happy to
help with testing.


Thanks,
Nick

From Pengfei.Li at arm.com  Mon Apr 20 04:32:00 2020
From: Pengfei.Li at arm.com (Pengfei Li)
Date: Mon, 20 Apr 2020 04:32:00 +0000
Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in
	compressed mode
In-Reply-To: <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com>
References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com>
 <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com>
 <ef37fac9-364b-442c-88ef-eb0cc9855cb5.kuaiwei.kw@alibaba-inc.com>
 <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com>
 <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB49694C72021174B76D032E5C96DD0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <a69c8cd9-b14e-4e5a-95f1-604197534d98.kuaiwei.kw@alibaba-inc.com>
 <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com>
 <DB8PR08MB4969EB393F9ACF9724A6D4C796DA0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com>
 <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com>,
 <781CB090-0386-4D32-8465-8238E516789B@amazon.com>
 <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com>
Message-ID: <DB8PR08MB4969167A32C497396AF1693396D40@DB8PR08MB4969.eurprd08.prod.outlook.com>

Hi Wei,

> Thanks for all feedback. I think this patch has enough review and can be merged.
> 
> Hi Pengfei,
>
>  I need help to push it. Could you help to merge it?

I'm not a reviewer, and not sure whether your updated webrev.01 [1] still requires an official reviewer to confirm.

Maybe Andrew Haley or other AArch64 reviewers can help?

[1] http://cr.openjdk.java.net/~wzhuo/8242449/webrev.01/

--
Thanks,
Pengfei


From thomas.stuefe at gmail.com  Mon Apr 20 06:35:04 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Mon, 20 Apr 2020 08:35:04 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <85zhb6n9we.fsf@arm.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <CAA-vtUyW50Q0Ar4zBd2XmB+SC_CEubQ-MqG+tqbKWW=jA7VJoQ@mail.gmail.com>
 <cc3967ad-f9c8-32e3-b5de-767f264a5b79@redhat.com>
 <CAA-vtUyaQUqMVrHub0=5kr0Ut05A=v3z8SM9f+tsJQ+OcgMTjw@mail.gmail.com>
 <85zhb6n9we.fsf@arm.com>
Message-ID: <CAA-vtUysw8+eQPYdTv_aHqn2y93VEifVPFTuLAZWh3eTcZz+3A@mail.gmail.com>

Hi Nick,

thanks for your explanations!

This may be a stupid question, but rheapbase was used for both
en/decoding compressed oops and compressed class pointers, right? I was
just confused by the name.

I would like to abstract the logic of allocating "good" memory for ccs away
from the shared code into platform dependent files (e.g.
metaspace_aarch64.cpp) so that each platform can cleanly implement whatever
they feel is right. Ages ago we had a similar thing for allocating heap
memory on AIX, that worked quite well.

So, if I were to give you these prototypes:

+  // Given a size, reserve a space anywhere, suitable to be used as backing
+  // storage for ccs. The return address will be the base address for
+  // encoding/decoding compressed Klass pointers.
+  // Depending on the platform, this function may allocate anywhere or attempt
+  // platform specific optimized placement.
+  // If size is not aligned to Metaspace::reserve_alignment it will be
+  // corrected, so the returned space may be larger.
+  // On failure an unreserved space is returned.
+  static ReservedSpace reserve_compressed_class_space_anywhere(size_t size);
+
+  // Given a size and an address p, reserve a space at address p, suitable to
+  // be used as backing storage for ccs.
+  // If p is not a suitable base address for encoding/decoding compressed
+  // Klass pointers, function will fail.
+  // Attach point has to be aligned to metaspace reserve alignment.
+  // If size is not aligned to Metaspace::reserve_alignment it will be
+  // corrected, so the returned space may be larger.
+  // On failure an unreserved space is returned.
+  static ReservedSpace reserve_compressed_class_space_at(address p, size_t size);

this should be enough to implement platform dependent ccs allocation, right?

Cheers, Thomas

On Mon, Apr 20, 2020 at 4:56 AM Nick Gasson <nick.gasson at arm.com> wrote:

> On 04/18/20 14:26 pm, Thomas Stüfe wrote:
> > Just occurred to me that aarch64 also relies on SharedBaseAddress being
> 4G
> > aligned. The default is 32G so it works out. If you modify it with
> > -XX:SharedBaseAddress, looks like the setting is ignored when the value
> is
> > not usable.
> >
>
> Yes that's correct, it's treated as a hint on AArch64 since
> 8234794. Because MacroAssembler::{decode,encode}_klass cannot implement
> arbitrary base + (src << shift) without an additional temporary register
> that isn't always available when it's called. It seems better to
> constrain the possible base addresses than reserve a dedicated
> compressed class base register.
>
> All the different compressed class decoding modes should now be covered
> by the jtreg tests. So if you have access to an AArch64 machine, running
> these should be sufficient to prevent regressions. I'm also happy to
> help with testing.
>
>
> Thanks,
> Nick
>

From thomas.stuefe at gmail.com  Mon Apr 20 06:37:48 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Mon, 20 Apr 2020 08:37:48 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <CAA-vtUysw8+eQPYdTv_aHqn2y93VEifVPFTuLAZWh3eTcZz+3A@mail.gmail.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <CAA-vtUyW50Q0Ar4zBd2XmB+SC_CEubQ-MqG+tqbKWW=jA7VJoQ@mail.gmail.com>
 <cc3967ad-f9c8-32e3-b5de-767f264a5b79@redhat.com>
 <CAA-vtUyaQUqMVrHub0=5kr0Ut05A=v3z8SM9f+tsJQ+OcgMTjw@mail.gmail.com>
 <85zhb6n9we.fsf@arm.com>
 <CAA-vtUysw8+eQPYdTv_aHqn2y93VEifVPFTuLAZWh3eTcZz+3A@mail.gmail.com>
Message-ID: <CAA-vtUz2_AFY2NFSiBNJgE7JMxSAE5O5r27B97mEaC4fgk-F8g@mail.gmail.com>

On Mon, Apr 20, 2020 at 8:35 AM Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:

> Hi Nick,
>
> thanks for your explanations!
>
> This may be a stupid question, but you rheapbase was used for both
> en/decoding compressed oops and compressed class pointers, right? Just was
> confused by the name.
>
> I would like to abstract the logic of allocating "good" memory for ccs
> away from the shared code into platform dependent files (e.g.
> metaspace_aarch64.cpp) so that each platform can cleanly implement whatever
> they feel is right. Ages ago we had a similar thing for allocating heap
> memory on AIX, that worked quite well.
>
> So, if I were to give you these prototypes:
>
> +  // Given a size, reserve a space anywhere, suitable to be used as backing
> +  // storage for ccs. The return address will be the base address for
> +  // encoding/decoding compressed Klass pointers.
> +  // Depending on the platform, this function may allocate anywhere or attempt
> +  // platform specific optimized placement.
> +  // If size is not aligned to Metaspace::reserve_alignment it will be
> +  // corrected, so the returned space may be larger.
> +  // On failure an unreserved space is returned.
> +  static ReservedSpace reserve_compressed_class_space_anywhere(size_t size);
> +
> +  // Given a size and an address p, reserve a space at address p, suitable to
> +  // be used as backing storage for ccs.
> +  // If p is not a suitable base address for encoding/decoding compressed
> +  // Klass pointers, function will fail.
> +  // Attach point has to be aligned to metaspace reserve alignment.
> +  // If size is not aligned to Metaspace::reserve_alignment it will be
> +  // corrected, so the returned space may be larger.
> +  // On failure an unreserved space is returned.
> +  static ReservedSpace reserve_compressed_class_space_at(address p, size_t size);
>
> this should be enough to implement platform dependent ccs allocation,
> right?
>
>
(Specifically, leaving the part of "checking request address if it fits
zero based encoding" out, since the odds that the caller knows a low
address which just happens to be free on my platforms are low; if zero
based is important, platform probably knows better and can come up with a
low address on its own.)


> Cheers, Thomas
>
> On Mon, Apr 20, 2020 at 4:56 AM Nick Gasson <nick.gasson at arm.com> wrote:
>
>> On 04/18/20 14:26 pm, Thomas Stüfe wrote:
>> > Just occurred to me that aarch64 also relies on SharedBaseAddress being
>> 4G
>> > aligned. The default is 32G so it works out. If you modify it with
>> > -XX:SharedBaseAddress, looks like the setting is ignored when the value
>> is
>> > not usable.
>> >
>>
>> Yes that's correct, it's treated as a hint on AArch64 since
>> 8234794. Because MacroAssembler::{decode,encode}_klass cannot implement
>> arbitrary base + (src << shift) without an additional temporary register
>> that isn't always available when it's called. It seems better to
>> constrain the possible base addresses than reserve a dedicated
>> compressed class base register.
>>
>> All the different compressed class decoding modes should now be covered
>> by the jtreg tests. So if you have access to an AArch64 machine, running
>> these should be sufficient to prevent regressions. I'm also happy to
>> help with testing.
>>
>>
>> Thanks,
>> Nick
>>
>

From nick.gasson at arm.com  Mon Apr 20 07:00:51 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Mon, 20 Apr 2020 15:00:51 +0800
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <CAA-vtUysw8+eQPYdTv_aHqn2y93VEifVPFTuLAZWh3eTcZz+3A@mail.gmail.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <CAA-vtUyW50Q0Ar4zBd2XmB+SC_CEubQ-MqG+tqbKWW=jA7VJoQ@mail.gmail.com>
 <cc3967ad-f9c8-32e3-b5de-767f264a5b79@redhat.com>
 <CAA-vtUyaQUqMVrHub0=5kr0Ut05A=v3z8SM9f+tsJQ+OcgMTjw@mail.gmail.com>
 <85zhb6n9we.fsf@arm.com>
 <CAA-vtUysw8+eQPYdTv_aHqn2y93VEifVPFTuLAZWh3eTcZz+3A@mail.gmail.com>
Message-ID: <85y2qqmyjg.fsf@arm.com>

Hi Thomas,

>
> This may be a stupid question, but you rheapbase was used for both en/decoding compressed
> oops and compressed class pointers, right? Just was confused by the name.
>

Yes. Previously if the alignment of the compressed class base was such
that it couldn't use EOR or MOVK to decode, it would fall back to a
generic shift+add using the compressed oop base register, rheapbase, as
temporary. This works because you can call reinit_heapbase to restore
the original value of rheapbase.

The problem with this is that it generates ~2x the instructions of the
other decoding modes, which breaks some size assertions elsewhere in
addition to just being inefficient. Additionally if compressed oops are
not enabled we want to be able to reuse rheapbase as a general
allocatable register, but in that case we can't use it as temporary for
decoding class pointers.
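
To illustrate why a suitably aligned base avoids the generic shift+add: when
the base has no set bits in common with the shifted narrow value, adding and
exclusive-or give the same result, so no temporary register is needed for the
sum. The following is a purely arithmetic sketch; the names are hypothetical,
it is not the MacroAssembler code, and it does not model the MOVK variant:

```java
class KlassDecodeModel {
    // Generic decode: base + (narrow << shift).
    static long decodeAdd(long base, int narrowKlass, int shift) {
        return base + (Integer.toUnsignedLong(narrowKlass) << shift);
    }

    // Decode with exclusive-or; only equal to decodeAdd when base and the
    // shifted narrow value share no set bits.
    static long decodeXor(long base, int narrowKlass, int shift) {
        return base ^ (Integer.toUnsignedLong(narrowKlass) << shift);
    }

    public static void main(String[] args) {
        int narrow = 0xFFFFFFFF; // largest 32-bit narrow value, worst case
        long alignedBase = 1L << 35;   // 32G: above every bit of (narrow << 3)
        long unalignedBase = 1L << 32; // 4G: overlaps bit 32 of (narrow << 3)
        System.out.println("32G base, shift 3: add == xor? "
                + (decodeAdd(alignedBase, narrow, 3) == decodeXor(alignedBase, narrow, 3)));
        System.out.println("4G base, shift 3:  add == xor? "
                + (decodeAdd(unalignedBase, narrow, 3) == decodeXor(unalignedBase, narrow, 3)));
    }
}
```

With a shift of 3 the shifted value can occupy bits 3 to 34, which is why a
4G-aligned base is no longer enough in that mode while a 32G-aligned one is.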

> I would like to abstract the logic of allocating "good" memory for ccs away from the shared
> code into platform dependent files (e.g. metaspace_aarch64.cpp) so that each platform can
> cleanly implement whatever they feel is right. Ages ago we had a similar thing for
> allocating heap memory on AIX, that worked quite well.
>
> So, if I were to give you these prototypes:
>
> +  // Given a size, reserve a space anywhere, suitable to be used as backing
> +  // storage for ccs. The return address will be the base address for
> +  // encoding/decoding compressed Klass pointers.
> +  // Depending on the platform, this function may allocate anywhere or attempt
> +  // platform specific optimized placement.
> +  // If size is not aligned to Metaspace::reserve_alignment it will be
> +  // corrected, so the returned space may be larger.
> +  // On failure an unreserved space is returned.
> +  static ReservedSpace reserve_compressed_class_space_anywhere(size_t size);
> +
> +  // Given a size and an address p, reserve a space at address p, suitable to
> +  // be used as backing storage for ccs.
> +  // If p is not a suitable base address for encoding/decoding compressed
> +  // Klass pointers, function will fail.
> +  // Attach point has to be aligned to metaspace reserve alignment.
> +  // If size is not aligned to Metaspace::reserve_alignment it will be
> +  // corrected, so the returned space may be larger.
> +  // On failure an unreserved space is returned.
> +  static ReservedSpace reserve_compressed_class_space_at(address p, size_t size);
>
> this should be enough to implement platform dependent ccs allocation, right?
>

Yes I think this works.


Thanks,
Nick

From aph at redhat.com  Mon Apr 20 08:48:50 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 20 Apr 2020 09:48:50 +0100
Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in
	compressed mode
In-Reply-To: <DB8PR08MB4969167A32C497396AF1693396D40@DB8PR08MB4969.eurprd08.prod.outlook.com>
References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com>
 <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com>
 <ef37fac9-364b-442c-88ef-eb0cc9855cb5.kuaiwei.kw@alibaba-inc.com>
 <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com>
 <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB49694C72021174B76D032E5C96DD0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <a69c8cd9-b14e-4e5a-95f1-604197534d98.kuaiwei.kw@alibaba-inc.com>
 <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com>
 <DB8PR08MB4969EB393F9ACF9724A6D4C796DA0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com>
 <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com>
 <781CB090-0386-4D32-8465-8238E516789B@amazon.com>
 <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB4969167A32C497396AF1693396D40@DB8PR08MB4969.eurprd08.prod.outlook.com>
Message-ID: <d8bd968e-b376-d9ae-dcc7-9d79e2c382ac@redhat.com>

On 4/20/20 5:32 AM, Pengfei Li wrote:
> Maybe Andrew Haley or other AArch64 reviewers can help?
> 
> [1] http://cr.openjdk.java.net/~wzhuo/8242449/webrev.01/

It's fine. At some point in the future maybe we can get round to taking
out all references to rheapbase, but it'll require careful thinking about
JVMCI and Graal-precompiled code.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From ioi.lam at oracle.com  Mon Apr 20 08:47:00 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 20 Apr 2020 01:47:00 -0700
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <CAA-vtUw_07m7oVB2ycyhceDFAcRx3Q0si4g0jcwc2W=VvUs5pQ@mail.gmail.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <be1c7e88-7c32-c566-5b10-ce189bfe9652@oracle.com>
 <CAA-vtUz1nhi_q_0QbBk+BG3cY5L73YoF1G3vsxMTtDgD=S832Q@mail.gmail.com>
 <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>
 <CAA-vtUw_07m7oVB2ycyhceDFAcRx3Q0si4g0jcwc2W=VvUs5pQ@mail.gmail.com>
Message-ID: <eb148861-1adc-705a-60c0-c2c81fc760e4@oracle.com>


On 4/18/20 12:15 AM, Thomas Stüfe wrote:
> Hi Ioi,
>
> I am working on a small patch and have some more questions.
>
> - First, a simple one, in 
> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta(), 
> the space does not have anything to do with metaspace, as you wrote, 
> so the alignment could be anything, right?
>
I think so.

> - Out of curiosity, when you pack the different regions 
> (DumpRegion::pack) you align the end to page size. Why? Why could the 
> next region not simply follow immediately? I looked if any code needs 
> a region to be page aligned, but may have missed it.

We map RO read-only and MC/RW in read-write. If the regions are not 
aligned, you will have a page that wants half to be read-only and half 
to be read-write.

I guess we can adjust the mapping to be more lenient (if a page wants 
half read-write, we map it read-write), but that's not done today.
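
As a small sketch of that packing rule (the names here are hypothetical and
this is not the DumpRegion code): each region's end is rounded up to the page
size, so the next region always starts on a fresh page and no page needs two
different protections.

```java
class RegionPacking {
    // Round value up to the next multiple of alignment (a power of two).
    static long alignUp(long value, long alignment) {
        return (value + alignment - 1) & -alignment;
    }

    // Pack regions back to back, aligning each region end to pageSize.
    // Returns the start offset of every region.
    static long[] packRegions(long[] usedBytes, long pageSize) {
        long[] starts = new long[usedBytes.length];
        long top = 0;
        for (int i = 0; i < usedBytes.length; i++) {
            starts[i] = top;
            top = alignUp(top + usedBytes[i], pageSize);
        }
        return starts;
    }

    public static void main(String[] args) {
        long[] used = {5000, 100, 9000}; // made-up region sizes in bytes
        long[] starts = packRegions(used, 4096);
        for (int i = 0; i < starts.length; i++) {
            System.out.println("region " + i + " starts at offset " + starts[i]);
        }
    }
}
```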

>
> - void MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() :
>
> I assume this code has to work for all three cases, right?
> 1) lp32
> 2) lp64 with UseCompressedClassPointers
> 3) lp64 without UseCompressedClassPointers
>
> If yes, does the setting for UseCompressedClassPointers have to be the 
> same at run time?

Yes. The value of UseCompressedOops and UseCompressedClassPointers must 
be the same between dump time and run time.

>
>
> In this layout:
>   // On 64-bit VM, the heap and class space layout will be the same as if
>   // you're running in -Xshare:on mode:
>   //
>   //                              +-- SharedBaseAddress (default = 0x800000000)
>   //                              v
>   // +-..---------+---------+ ... +----+----+----+--------------------+
>   // |    Heap    | Archive |     | MC | RW | RO |    class space     |
>   // +-..---------+---------+ ... +----+----+----+--------------------+
>   // |<--   MaxHeapSize  -->|     |<-- UnscaledClassSpaceMax = 4GB -->|
>   //
>
> Why does the class space have to follow mc+rw+ro? Could it come before?
>
>
Compressed klass pointers are stored in archived objects. If the class 
space is now lower than SharedBaseAddress, you will need to rebase all 
of the compressed klass pointers. This is not efficient and will slow 
down start-up.


>
> Actually, does it have to be in the same space at all, or could it 
> live somewhere completely different?

It can be higher. You just need to ensure that the distance from 
SharedBaseAddress to the end of the class space is within the max 
compressed klass space size.
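
A minimal sketch of that condition, with hypothetical names and the 4GB
UnscaledClassSpaceMax figure from the layout comment used as the example
limit:

```java
class EncodingRangeCheck {
    static final long G = 1L << 30;

    // True if everything from sharedBase up to classSpaceEnd lies within
    // maxRange bytes of sharedBase.
    static boolean fitsEncodingRange(long sharedBase, long classSpaceEnd, long maxRange) {
        return classSpaceEnd >= sharedBase && classSpaceEnd - sharedBase <= maxRange;
    }

    public static void main(String[] args) {
        long sharedBase = 0x800000000L; // default SharedBaseAddress (32G)
        long maxRange = 4 * G;          // UnscaledClassSpaceMax from the layout comment
        // Class space ending 3G above the base: fits.
        System.out.println(fitsEncodingRange(sharedBase, sharedBase + 3 * G, maxRange));
        // A gap pushes the class space end to 5G above the base: does not fit.
        System.out.println(fitsEncodingRange(sharedBase, sharedBase + 5 * G, maxRange));
    }
}
```

A gap between the archive regions and the class space is fine as long as the
end of the class space still passes this check.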

But, I am wondering why you're asking this :-)

> To ask in a more precise way: I understand that both the mc+rw+ro 
> archives and the ccs have to live in an area encompassed by the 
> compressed class pointers encoding scheme. I wonder whether there are 
> any restrictions beyond that.
>
> Could there be a gap between archives and ccs? 
Yes

> Can the order be reversed? 
No.

> Do the relative positions between archives and ccs have to be the same 
> between dump time and runtime?
No. All the pointers stored inside the CDS archive point inside the 
MC/RW/RO regions, so the archive doesn't retain any knowledge of where 
the CCS was at dump time.


Thanks
- Ioi


>
> Thanks!
>
> On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam <ioi.lam at oracle.com 
> <mailto:ioi.lam at oracle.com>> wrote:
>
>
>
>     On 4/16/20 11:14 AM, Thomas Stüfe wrote:
>>     Hi Ioi,
>>
>>     On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam <ioi.lam at oracle.com
>>     <mailto:ioi.lam at oracle.com>> wrote:
>>
>>         (I suppose you mean "compressed class space" by "ccs" :-)
>>
>>
>>     Yes, I think I stole this from Stefan Karlsson :)
>>
>>         <snip>
>>
>>         I am not even sure if case (C) can happen at all.
>>
>>         I admit that I've been guilty of making the interface even
>>         more complicated
>>         with JDK-8231610
>>         <https://bugs.openjdk.java.net/browse/JDK-8231610> (Relocate
>>         the CDS archive if it cannot be mapped to the
>>         requested address). Looks like now is a good time to clean up.
>>
>>
>>     The coding has been complicated to begin with, and then it
>>     usually only gets worse since no-one has time for a revamp :( A
>>     clean up would be very helpful.
>>
>>     One reason I look at this coding now, beside the aarch64 problem,
>>     was that I try to disentangle CDS from Metaspace, especially the
>>     alignment policy. Remember, I tried to tackle this last summer?
>>     but it keeps biting me. For such a small problem this is weirdly
>>     complicated.
>>
>>         One thing that can be cleaned up is the call to
>>         Metaspace::allocate_metaspace_compressed_klass_ptrs:
>>
>>         (a) when CDS is enabled:
>>
>>             Metaspace::global_initialize()
>>             -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>>                -> ... MetaspaceShared::map_archives()
>>                  -> ... reserve the space, eventually calling Metaspace::reserve_space
>>                  -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>>
>>         (b) when CDS is disabled
>>
>>             Metaspace::global_initialize()
>>             -> allocate_metaspace_compressed_klass_ptrs
>>                -> (if cds is not enabled) Metaspace::reserve_space()
>>
>>
>>         In case (b), we should first reserve the space, and then call
>>         into
>>         allocate_metaspace_compressed_klass_ptrs. This will simplify
>>         the arguments
>>         of allocate_metaspace_compressed_klass_ptrs, and will also
>>         limit the variations
>>         of calls to Metaspace::reserve_space(). I think this will
>>         make it possible to
>>         drop the use_requested_addr argument and rely simply on
>>         (requested_addr != NULL)
>>
>>
>>     So, in all cases we'd pre-reserve the ReservedSpace and hand it
>>     down to Metaspace::allocate_metaspace_compressed_klass_ptrs()?
>>
>>     This would melt down
>>     Metaspace::allocate_metaspace_compressed_klass_ptrs() to just
>>     "initialize compressed class space from a pre-arranged
>>     ReservedSpace, and set up base + shift".
>>
>>     We could probably rename that thing
>>     to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs,
>>     cds_base);
>>
>>     We even could move set_narrow_klass_base_and_shift() out of
>>     Metaspace::set_up_compressed_klass_space, then it becomes a
>>     series of three simple operations:
>>     1) obtain a ReservedSpace however you see fit
>>     2) register it with Metaspace as address space for ccs,
>>     3) set_narrow_klass_base_and_shift. We would not have to hand
>>     down cds_base to Metaspace, only for it to be used as base
>>     address in set_narrow_klass_base_and_shift.
>>
>
>     Yes, that seems the right thing to do. That will hopefully make
>     the aarch64 initialization code a little simpler as well.
>
>>     One question which came to me today was:
>>
>>     In AppCDS, DynamicArchiveBuilder::do_it() calls
>>     Metaspace::reserve_space(). Is that really needed, does a
>>     DumpRegion have anything to do with ccs? Don't they just need
>>     some space to dump into? Hope that question is not dumb.
>>
>     Do you mean:
>
>     DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
>
>     -> MetaspaceShared::reserve_shared_space
>         -> Metaspace::reserve_space
>
>     That's not necessary. When I wrote the code I thought
>     Metaspace::reserve_space was a general function for reserving
>     spaces :-) but as you said, this function is probably intended
>     only for initializing the CCS.
>
>     Thanks
>     - Ioi
>
>>     Thanks, Thomas
>>
>>         Thanks
>>         - Ioi
>>
>>
>>>>         Does that make sense? In other words, if the whole point of
>>>>         Metaspace::reserve_preferred_space() is "OS knows better, let it try
>>>>         to find a good address", would it not make sense to just try a low
>>>>         address as part of the try-addresses-loop?
>>>         We certainly don't want to have to use a dedicated heapbase register
>>>         or a shift. Just give us a multiple of 4*G and we're happy.
>>>
>>
>


From Pengfei.Li at arm.com  Mon Apr 20 09:54:40 2020
From: Pengfei.Li at arm.com (Pengfei Li)
Date: Mon, 20 Apr 2020 09:54:40 +0000
Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in
	compressed mode
In-Reply-To: <d8bd968e-b376-d9ae-dcc7-9d79e2c382ac@redhat.com>
References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com>
 <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com>
 <ef37fac9-364b-442c-88ef-eb0cc9855cb5.kuaiwei.kw@alibaba-inc.com>
 <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com>
 <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB49694C72021174B76D032E5C96DD0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <a69c8cd9-b14e-4e5a-95f1-604197534d98.kuaiwei.kw@alibaba-inc.com>
 <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com>
 <DB8PR08MB4969EB393F9ACF9724A6D4C796DA0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com>
 <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com>
 <781CB090-0386-4D32-8465-8238E516789B@amazon.com>
 <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB4969167A32C497396AF1693396D40@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <d8bd968e-b376-d9ae-dcc7-9d79e2c382ac@redhat.com>
Message-ID: <DB8PR08MB4969181925F2C63B60AF178596D40@DB8PR08MB4969.eurprd08.prod.outlook.com>


> It's fine. At some point in the future maybe we can get round to taking out all
> references to rheapbase, but it'll require careful thinking about JVMCI and
> Graal-precompiled code.

Thanks Andrew.
Pushed here http://hg.openjdk.java.net/jdk/jdk/rev/aedc9bf21743

--
Thanks,
Pengfei


From aph at redhat.com  Mon Apr 20 10:01:10 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 20 Apr 2020 11:01:10 +0100
Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in
	compressed mode
In-Reply-To: <d8bd968e-b376-d9ae-dcc7-9d79e2c382ac@redhat.com>
References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com>
 <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com>
 <ef37fac9-364b-442c-88ef-eb0cc9855cb5.kuaiwei.kw@alibaba-inc.com>
 <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com>
 <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB49694C72021174B76D032E5C96DD0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <a69c8cd9-b14e-4e5a-95f1-604197534d98.kuaiwei.kw@alibaba-inc.com>
 <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com>
 <DB8PR08MB4969EB393F9ACF9724A6D4C796DA0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com>
 <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com>
 <781CB090-0386-4D32-8465-8238E516789B@amazon.com>
 <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB4969167A32C497396AF1693396D40@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <d8bd968e-b376-d9ae-dcc7-9d79e2c382ac@redhat.com>
Message-ID: <84c21683-eaba-5598-6a1d-c58abdb39014@redhat.com>

On 4/20/20 9:48 AM, Andrew Haley wrote:
> On 4/20/20 5:32 AM, Pengfei Li wrote:
>> Maybe Andrew Haley or other AArch64 reviewers can help?
>>
>> [1] http://cr.openjdk.java.net/~wzhuo/8242449/webrev.01/
> It's fine.

Sorry, no it isn't fine. Please get rid of this hunk:

--- old/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp	2020-04-14 21:18:52.009758661 +0800
+++ new/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp	2020-04-14 21:18:51.785764043 +0800
@@ -2185,6 +2185,10 @@
 #if 0
   assert (UseCompressedOops || UseCompressedClassPointers, "should be compressed");
   assert (Universe::heap() != NULL, "java heap should be initialized");
+  if (!UseCompressedOops || Universe::ptr_base() == NULL) {
+    // rheapbase is allocated as general register
+    return;
+  }
   if (CheckCompressedOops) {
     Label ok;
     push(1 << rscratch1->encoding(), sp); // cmpptr trashes rscratch1

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From Pengfei.Li at arm.com  Mon Apr 20 10:10:05 2020
From: Pengfei.Li at arm.com (Pengfei Li)
Date: Mon, 20 Apr 2020 10:10:05 +0000
Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in
	compressed mode
In-Reply-To: <84c21683-eaba-5598-6a1d-c58abdb39014@redhat.com>
References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com>
 <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com>
 <ef37fac9-364b-442c-88ef-eb0cc9855cb5.kuaiwei.kw@alibaba-inc.com>
 <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com>
 <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB49694C72021174B76D032E5C96DD0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <a69c8cd9-b14e-4e5a-95f1-604197534d98.kuaiwei.kw@alibaba-inc.com>
 <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com>
 <DB8PR08MB4969EB393F9ACF9724A6D4C796DA0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com>
 <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com>
 <781CB090-0386-4D32-8465-8238E516789B@amazon.com>
 <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB4969167A32C497396AF1693396D40@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <d8bd968e-b376-d9ae-dcc7-9d79e2c382ac@redhat.com>
 <84c21683-eaba-5598-6a1d-c58abdb39014@redhat.com>
Message-ID: <DB8PR08MB4969AE6B4F49E3CF882C96F596D40@DB8PR08MB4969.eurprd08.prod.outlook.com>

Hi Andrew,

> Sorry, no it isn't fine. Please get rid of this hunk:
> 
> --- old/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp	2020-
> 04-14 21:18:52.009758661 +0800
> +++ new/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp	2020-
> 04-14 21:18:51.785764043 +0800
> @@ -2185,6 +2185,10 @@
>  #if 0
>    assert (UseCompressedOops || UseCompressedClassPointers, "should be
> compressed");
>    assert (Universe::heap() != NULL, "java heap should be initialized");
> +  if (!UseCompressedOops || Universe::ptr_base() == NULL) {
> +    // rheapbase is allocated as general register
> +    return;
> +  }
>    if (CheckCompressedOops) {
>      Label ok;
>      push(1 << rscratch1->encoding(), sp); // cmpptr trashes rscratch1

Oh. It's already pushed just now. According to the process, we may need Wei to create another JBS issue to back out that part?

--
Thanks,
Pengfei


From aph at redhat.com  Mon Apr 20 10:23:41 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 20 Apr 2020 11:23:41 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo
 introduced by JDK-8238690
In-Reply-To: <VI1PR0802MB255835F5D4BD55CDAFF4B1578ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB25580275D036617C1713AF158EDE0@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <VI1PR0802MB255835F5D4BD55CDAFF4B1578ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <3b47599e-6b2f-06a9-6ea4-057795850065@redhat.com>

On 4/17/20 10:14 AM, Yang Zhang wrote:
> Ping it again. Could you please help to review this?

I'm running it, and I get no vector code generated. How did you test it?

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Mon Apr 20 10:36:19 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 20 Apr 2020 11:36:19 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo
 introduced by JDK-8238690
In-Reply-To: <3b47599e-6b2f-06a9-6ea4-057795850065@redhat.com>
References: <VI1PR0802MB25580275D036617C1713AF158EDE0@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <VI1PR0802MB255835F5D4BD55CDAFF4B1578ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <3b47599e-6b2f-06a9-6ea4-057795850065@redhat.com>
Message-ID: <b7eb43e4-72e3-7c3d-81cb-111ead6acac4@redhat.com>

On 4/20/20 11:23 AM, Andrew Haley wrote:
> On 4/17/20 10:14 AM, Yang Zhang wrote:
>> Ping it again. Could you please help to review this?
> 
> I'm running it, and I get no vector code generated. How did you test it?

Sorry, my mistake. I'm testing it now.
-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From thomas.stuefe at gmail.com  Mon Apr 20 11:10:42 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Mon, 20 Apr 2020 13:10:42 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <eb148861-1adc-705a-60c0-c2c81fc760e4@oracle.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <be1c7e88-7c32-c566-5b10-ce189bfe9652@oracle.com>
 <CAA-vtUz1nhi_q_0QbBk+BG3cY5L73YoF1G3vsxMTtDgD=S832Q@mail.gmail.com>
 <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>
 <CAA-vtUw_07m7oVB2ycyhceDFAcRx3Q0si4g0jcwc2W=VvUs5pQ@mail.gmail.com>
 <eb148861-1adc-705a-60c0-c2c81fc760e4@oracle.com>
Message-ID: <CAA-vtUydJ9BMAEU8=3HCGb3UG3NiE_tRFD6hORqCB1ba8KXVrA@mail.gmail.com>

On Mon, Apr 20, 2020 at 10:47 AM Ioi Lam <ioi.lam at oracle.com> wrote:

>
>
> On 4/18/20 12:15 AM, Thomas Stüfe wrote:
>
> Hi Ioi,
>
> I am working on a small patch and have some more questions.
>
> - First, a simple one, in
> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta(), the
> space does not have anything to do with metaspace, as you wrote, so the
> alignment could be anything, right?
>
> I think so.
>
> - Out of curiosity, when you pack the different regions
> (DumpRegion::pack) you align the end to page size. Why? Why could the next
> region not simply follow immediately? I looked if any code needs a region
> to be page aligned, but may have missed it.
>
>
> We map RO read-only and MC/RW in read-write. If the regions are not
> aligned, you will have a page that wants half to be read-only and half to
> be read-write.
>
>
Okay. I wondered why page align here and not allocation granularity. Now I
understand. I guess this is also the reason why we could not use large
pages for the archive?

I think this is fine, I did not want to change it. On some platforms we
have 64K (non-large) pages, but even there I think the waste would be
acceptable.
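
To put a rough number on that waste, here is a small sketch of the
padding that page-aligning a region end introduces. align_up and padding
are illustrative helpers, not the DumpRegion::pack code, and the region
size in the usage note is made up:

```cpp
#include <cassert>
#include <cstddef>

// Round size up to the next multiple of alignment (a power of two).
constexpr std::size_t align_up(std::size_t size, std::size_t alignment) {
  return (size + alignment - 1) & ~(alignment - 1);
}

// Bytes of padding added when a region holding `used` bytes is packed
// so that the next region starts on a page boundary.
constexpr std::size_t padding(std::size_t used, std::size_t page_size) {
  return align_up(used, page_size) - used;
}
```

For a region with 5000 used bytes, padding(5000, 4096) is 3192 and
padding(5000, 65536) is 60536. The worst case is one page minus one byte
per region, so three regions on 64K pages lose less than 192K in total.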


> I guess we can adjust the mapping to be more lenient (if a page wants half
> read-write, we map it read-write), but that's not done today.
>
>
> - void MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() :
>
> I assume this code has to work for all three cases, right?
> 1) lp32
> 2) lp64 with UseCompressedClassPointers
> 3) lp64 without UseCompressedClassPointers
>
> If yes, does the setting for UseCompressedClassPointers have to be the
> same at run time?
>
>
> Yes. The value of UseCompressedOops and UseCompressedClassPointers must be
> the same between dump time and run time.
>
>
>
> In this layout:
>   // On 64-bit VM, the heap and class space layout will be the same as if
>   // you're running in -Xshare:on mode:
>   //
>   //                              +-- SharedBaseAddress (default =
> 0x800000000)
>   //                              v
>   // +-..---------+---------+ ... +----+----+----+--------------------+
>   // |    Heap    | Archive |     | MC | RW | RO |    class space     |
>   // +-..---------+---------+ ... +----+----+----+--------------------+
>   // |<--   MaxHeapSize  -->|     |<-- UnscaledClassSpaceMax = 4GB -->|
>   //
>
> Why does the class space have to follow mc+rw+ro? Could it come before?
>
>
> Compressed klass pointers are stored in archived objects. If the class
> space is now lower than SharedBaseAddress, you will need to rebase all of
> the compressed klass pointers. This is not efficient and will slow down
> start-up.
>
>
Well, could SharedBaseAddress not point to the start of the ccs:

  // +-- SharedBaseAddress (default = 0x800000000)
  // v
  // +--------------------+---------------+----+----+----+
  // |    class space     | ..gap maybe.. | MC | RW | RO |
  // +--------------------+---------------+----+----+----+
you'd then need to make sure that the relative offset of MC to
SharedBaseAddress is the same at dump time and at runtime. Is my
understanding correct? I am not saying I want to do this, I just try to
understand the way ccs archive allocation works.


>
>
> Actually, does it have to be in the same space at all, or could it live
> somewhere completely different?
>
>
> It can be higher. You just need to ensure that the distance between
> SharedBaseAddress to the end of the class space is within max compressed
> klass space size.
>
> But, I am wondering why you're asking this :-)
>
>
I try to understand the allocation and which restrictions apply where. We
have at least three parties, cds, metaspace and the underlying platform,
all with their own subtleties of how the memory should be allocated:
- metaspace will in the near future want a larger alignment than what cds
uses for reservation.
- platforms like aarch64 and maybe ppc want the compressed class base to
look in a certain way

Part of my confusion was that I always thought of
CompressedKlassPointers::base() to be basically the same as the start of the
ccs (maybe modulo being zero on zero-based mode) but that is obviously not
true since CDS exists. So what I wrote first:

"Metaspace::reserve_preferred_space.. Despite its generic-sounding name,
these functions can only be used to allocate ccs."

is actually not fully correct. In reality this space is to be used to
allocate memory to house Klass structures so that their pointers are
compressible, so the reserved start address has to be compatible with that.
But, e.g., that start address does not have to be aligned to
Metaspace::reserve_alignment().

In both cds dump and runtime case, the ccs is carved from the end part of
the reserved space. Only that split point, and the size of that second
part, have to be aligned to Metaspace::reserve_alignment().
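
As a sketch of that carving, under the assumption that the archive
regions sit at the front of the reservation and the ccs takes the tail.
Range, Split and carve_ccs are hypothetical names, and the 4M alignment
in the usage note is only an example value:

```cpp
#include <cassert>
#include <cstdint>

struct Range { uint64_t start; uint64_t size; };
struct Split { Range archive; Range ccs; };

constexpr uint64_t align_up(uint64_t v, uint64_t a)   { return (v + a - 1) & ~(a - 1); }
constexpr uint64_t align_down(uint64_t v, uint64_t a) { return v & ~(a - 1); }

// Carve the ccs from the end of a reserved range. Only the split point
// and the ccs size are aligned to reserve_alignment; the start of the
// reservation itself is left as it is.
inline Split carve_ccs(Range reserved, uint64_t archive_bytes,
                       uint64_t reserve_alignment) {
  const uint64_t end   = reserved.start + reserved.size;
  const uint64_t split = align_up(reserved.start + archive_bytes, reserve_alignment);
  assert(split <= end);
  const uint64_t ccs_size = align_down(end - split, reserve_alignment);
  return Split{ Range{reserved.start, split - reserved.start},
                Range{split, ccs_size} };
}
```

For a 1G reservation starting at 0x800001000 (deliberately not 4M
aligned) with 0x1234000 archive bytes, the ccs starts at 0x801400000 and
is 0x3EC00000 bytes long.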

Were we to allocate ccs first and put the archives behind it this would
simplify some matters, but only minor points. I think the way it works now
is okay. I will try to disentangle it a bit in the way you proposed.


> To ask in a more precise way: I understand that both the mc+rw+ro archives
> and the ccs have to live in an area encompassed by the compressed class
> pointers encoding scheme. I wonder whether there are any restrictions
> beyond that.
>
> Could there be a gap between archives and ccs?
>
> Yes
>
> Can the order be reversed?
>
> No.
>
> Do the relative positions between archives and ccs have to be the same
> between dump time and runtime?
>
> No. All the pointers stored inside CDS point inside the MC/RW/RO
> regions, so the archive doesn't retain any knowledge of where the CCS
> was at dump time.
>
>
Clear answers, thank you!

..Thomas


>
> Thanks
> - Ioi
>
>
>
> Thanks!
>
> On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>
>>
>>
>> On 4/16/20 11:14 AM, Thomas Stüfe wrote:
>>
>> Hi Ioi,
>>
>> On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>
>>> (I suppose you mean "compressed class space" by "ccs" :-)
>>>
>>>
>> Yes, I think I stole this from Stefan Karlsson :)
>>
>>
>>> <snip>
>>>
>>
>>
>>> I am not even sure if case (C) can happen at all.
>>>
>>> I admit that I've been guilty of making the interface even more
>>> complicated
>>> with JDK-8231610 <https://bugs.openjdk.java.net/browse/JDK-8231610>
>>> (Relocate the CDS archive if it cannot be mapped to the
>>> requested address). Looks like now is a good time to clean up.
>>>
>>>
>> The coding has been complicated to begin with, and then it usually only
>> gets worse since no-one has time for a revamp :( A clean up would be very
>> helpful.
>>
>> One reason I look at this coding now, beside the aarch64 problem, was
>> that I try to disentangle CDS from Metaspace, especially the alignment
>> policy. Remember, I tried to tackle this last summer? but it keeps biting
>> me. For such a small problem this is weirdly complicated.
>>
>>
>>> One thing that can be cleaned up is the call to
>>> Metaspace::allocate_metaspace_compressed_klass_ptrs:
>>>
>>> (a) when CDS is enabled:
>>>
>>>     Metaspace::global_initialize()
>>>     -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>>>        -> ... MetaspaceShared::map_archives()
>>>          -> ... reserve the space, eventually calling
>>> Metaspace::reserve_space
>>>          -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>>>
>>> (b) when CDS is disabled
>>>
>>>     Metaspace::global_initialize()
>>>     -> allocate_metaspace_compressed_klass_ptrs
>>>        -> (if cds is not enabled) Metaspace::reserve_space()
>>>
>>>
>>> In case (b), we should first reserve the space, and then call into
>>> allocate_metaspace_compressed_klass_ptrs. This will simplify the
>>> arguments
>>> of allocate_metaspace_compressed_klass_ptrs, and will also limit the
>>> variations
>>> of calls to Metaspace::reserve_space(). I think this will make it
>>> possible to
>>> drop the use_requested_addr argument and rely simply on (requested_addr
>>> != NULL)
>>>
>>>
>> So, in all cases we'd pre-reserve the ReservedSpace and hand it down to
>> Metaspace::allocate_metaspace_compressed_klass_ptrs()?
>>
>> This would melt down
>> Metaspace::allocate_metaspace_compressed_klass_ptrs() to just "initialize
>> compressed class space from a pre-arranged ReservedSpace, and set up base +
>> shift".
>>
>> We could probably rename that thing
>> to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base);
>>
>> We even could move set_narrow_klass_base_and_shift() out of
>> Metaspace::set_up_compressed_klass_space, then it becomes a series of three
>> simple operations:
>> 1) obtain a ReservedSpace however you see fit
>> 2) register it with Metaspace as address space for ccs,
>> 3) set_narrow_klass_base_and_shift. We would not have to hand down
>> cds_base to Metaspace, only for it to be used as base address
>> in set_narrow_klass_base_and_shift.
>>
>>
>> Yes, that seems the right thing to do. That will hopefully make the
>> aarch64 initialization code a little simpler as well.
>>
>> One question which came to me today was:
>>
>> In AppCDS, DynamicArchiveBuilder::do_it() calls
>> Metaspace::reserve_space(). Is that really needed, does a DumpRegion have
>> anything to do with ccs? Don't they just need some space to dump into? Hope
>> that question is not dumb.
>>
>> Do you mean:
>>
>> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
>> -> MetaspaceShared::reserve_shared_space
>>     -> Metaspace::reserve_space
>>
>> That's not necessary. When I wrote the code I thought
>> Metaspace::reserve_space was a general function for reserving spaces :-)
>> but as you said, this function is probably intended only for initializing
>> the CCS.
>>
>> Thanks
>> - Ioi
>>
>> Thanks, Thomas
>>
>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>> Does that make sense? In other words, if the whole point of
>>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>>> to find a good address", would it not make sense to just try a low
>>> address as part of the try-addresses-loop?
>>>
>>> We certainly don't want to have to use a dedicated heapbase register
>>> or a shift. Just give us a multiple of 4*G and we're happy.
>>>
>>>
>>>
>>>
>>
>

From aph at redhat.com  Mon Apr 20 11:50:33 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 20 Apr 2020 12:50:33 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8242070: AArch64: Fix a typo
 introduced by JDK-8238690
In-Reply-To: <VI1PR0802MB255835F5D4BD55CDAFF4B1578ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB25580275D036617C1713AF158EDE0@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <VI1PR0802MB255835F5D4BD55CDAFF4B1578ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <ec98e42e-23de-9d64-5e3b-7f4589e07674@redhat.com>

On 4/17/20 10:14 AM, Yang Zhang wrote:
> 
> Ping it again. Could you please help to review this?

Before:

Benchmark                    Mode  Cnt    Score   Error  Units
TestVect.testVectShift       avgt    5  141.027 ± 0.117  us/op

  0.41%    0x0000ffffa8c5fc40:   sbfiz	x15, x11, #1, #32
           0x0000ffffa8c5fc44:   add	x16, x18, x15               ;*saload {reexecute=0 rethrow=0 return_oop=0}
                                                                     ; - org.sample.TestVect::testVectShift at 16 (line 31)
           0x0000ffffa8c5fc48:   ldr	q16, [x16, #16]
  0.51%    0x0000ffffa8c5fc4c:   neg	v17.16b, v18.16b
           0x0000ffffa8c5fc50:   sshl	v16.8h, v16.8h, v17.8h
           0x0000ffffa8c5fc54:   add	x15, x17, x15


After:

Benchmark                    Mode  Cnt    Score   Error  Units
TestVect.testVectShift       avgt    5  143.021 ± 0.506  us/op

  0.46%    0x0000ffff78c61f00:   sbfiz	x13, x15, #1, #32
           0x0000ffff78c61f04:   add	x14, x17, x13               ;*saload {reexecute=0 rethrow=0 return_oop=0}
                                                                     ; - org.sample.TestVect::testVectShift at 16 (line 31)
           0x0000ffff78c61f08:   ldr	q16, [x14, #16]
  0.36%    0x0000ffff78c61f0c:   sshr	v16.8h, v16.8h, #2
           0x0000ffff78c61f10:   add	x13, x16, x13

So, at least on this thing it makes no difference. I'll grant you it's
less code, so OK.
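
For readers comparing the two listings: the old code negates the shift
count and uses SSHL, which shifts right when the per-lane count is
negative, while the new code uses SSHR with an immediate. A scalar model
of one 16-bit lane, as a sketch only (the helper names are made up and
counts are assumed to have magnitude below 16):

```cpp
#include <cassert>
#include <cstdint>

// One signed 16-bit lane of SSHL: a non-negative count shifts left,
// a negative count shifts right arithmetically. Relies on arithmetic
// right shift of negative values, which mainstream compilers provide
// and C++20 guarantees.
inline int16_t sshl_lane(int16_t value, int8_t count) {
  if (count >= 0) {
    return int16_t(uint16_t(value) << count);
  }
  return int16_t(value >> -count);
}

// One signed 16-bit lane of SSHR with an immediate shift amount.
inline int16_t sshr_lane(int16_t value, int imm) {
  return int16_t(value >> imm);
}
```

In this model sshl_lane(v, -2) and sshr_lane(v, 2) give the same result
for every lane value, so the patch drops the NEG without changing what
is computed.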

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Mon Apr 20 12:14:29 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 20 Apr 2020 13:14:29 +0100
Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in
	compressed mode
In-Reply-To: <74ad538f-3247-4b31-832f-b3cb1bd9f41a.kuaiwei.kw@alibaba-inc.com>
References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com>
 <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com>
 <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB49694C72021174B76D032E5C96DD0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <a69c8cd9-b14e-4e5a-95f1-604197534d98.kuaiwei.kw@alibaba-inc.com>
 <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com>
 <DB8PR08MB4969EB393F9ACF9724A6D4C796DA0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com>
 <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com>
 <781CB090-0386-4D32-8465-8238E516789B@amazon.com>
 <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB4969167A32C497396AF1693396D40@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <d8bd968e-b376-d9ae-dcc7-9d79e2c382ac@redhat.com>
 <84c21683-eaba-5598-6a1d-c58abdb39014@redhat.com>
 <74ad538f-3247-4b31-832f-b3cb1bd9f41a.kuaiwei.kw@alibaba-inc.com>
Message-ID: <f523bd5c-27a7-e024-0822-0afb5fee0b79@redhat.com>

On 4/20/20 12:12 PM, Kuai Wei wrote:

>  Could you give more detail about it? I can start a new patch for it
>  if it breaks anything.

Well, it's ifdef'd out at the moment, so by definition it can't break anything.
But there may be issues with Graal whereby we really do need to check rheapbase,
but it's OK for now.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From stuart.monteith at arm.com  Mon Apr 20 14:19:28 2020
From: stuart.monteith at arm.com (Stuart Monteith)
Date: Mon, 20 Apr 2020 15:19:28 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
	Concurrent Class Unloading
In-Reply-To: <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
Message-ID: <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>

Hi,
	If anyone has bandwidth, would they be able to review this patch? It
addresses Andrew, Per and Erik's comments:
        http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/

Thanks,
	Stuart


On 27/03/2020 09:47, Erik Österlund wrote:
> Hi Stuart,
> 
> Thanks for sorting this out on AArch64. It is nice to see that you can
> implement these
> barriers on platforms that do not have instruction cache coherency.
> 
> One small change request:
> It looks like in C1 you inject the entry barrier right after build_frame
> is done:
> 
>       build_frame();
>       {
>         // Insert nmethod entry barrier into frame.
>         BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
>         bs->nmethod_entry_barrier(_masm);
>       }
> 
> Unfortunately, this is in the platform independent part of the LIR
> assembler. In the x86 version
> we inject it at the very end of build_frame() instead, which is a
> platform-specific function.
> The platform-specific function is in the C1 macro assembler file for
> that platform.
> 
> We intentionally put it in the platform-specific path as it is a
> platform-specific feature.
> Now on x86, the barrier code will be emitted once in build_frame() and
> once after returning
> from build_frame, resulting in two nmethod entry barriers, and only the
> first one will get
> patched, causing the second one to mostly take slow paths, which isn't
> necessarily wrong,
> but will cause regressions.
> 
> I would propose you just move those lines into the very end of the
> AArch64-specific part of
> build_frame().
> 
> I don't need to see another webrev for that trivial code motion. This
> looks good to me.
> Again, thanks a lot for fixing this! It will allow me to go forward with
> concurrent stack
> scanning on AArch64 as well.
> 
> Thanks,
> /Erik
> 
> 
> On 2020-03-26 23:42, Stuart Monteith wrote:
>> Hello,
>>         Please review this change to implement nmethod entry barriers on
>> aarch64, and hence concurrent class unloading with ZGC. Shenandoah will
>> need to be separately tested and enabled - there are problems with this
>> on Shenandoah.
>>
>> It has been tested with JTreg, runs with SPECjbb, gcbench, and Lucene as
>> well as Netbeans.
>>
>> In terms of interesting features:
>>         With nmethod entry barriers, immediate oops are removed by:
>>                 LIR_Assembler::jobject2reg  and  MacroAssembler::movoop
>>         This is to ensure consistency with the entry barrier, as with
>> an immediate we'd otherwise need an ISB.
>>
>>         I've added "-XX:DeoptNMethodBarrierALot". I found this functionality
>> useful in testing as deoptimisation is very infrequent. I've written it
>> as an atomic to avoid it happening too frequently. As it is a new
>> option, I'm not sure whether any more is needed than this review. A new
>> test has been added
>> "test/hotspot/jtreg/gc/stress/gcbasher/TestGCBasherDeoptWithZ.java" to
>> test GC with that option enabled.
>>
>>         BarrierSetAssembler::nmethod_entry_barrier
>>         This method emits the barrier code. In internal review it was suggested
>> the "dmb( ISHLD )" should be replaced by "membar(LoadLoad)". I've not
>> done this as the BarrierSetNMethod code checks the exact instruction
>> sequence, and I prefer to be explicit.
>>
>>         Benchmarking method entry shows an increase of around 6ns with the
>> nmethod entry barrier.
>>
>>
>> The deoptimisation code was contributed by Andrew Haley.
>>
>> The bug:
>>         https://bugs.openjdk.java.net/browse/JDK-8216557
>>
>> The webrev:
>>         http://cr.openjdk.java.net/~smonteith/8216557/webrev.0/
>>
>>
>> BR,
>>         Stuart


From aph at redhat.com  Mon Apr 20 16:35:35 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 20 Apr 2020 17:35:35 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
	Concurrent Class Unloading
In-Reply-To: <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
 <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
Message-ID: <7cd269af-c621-8d33-c7d6-1baa6729fc31@redhat.com>

On 4/20/20 3:19 PM, Stuart Monteith wrote:
> 	If anyone has bandwidth, would they be able to review this patch? It
> addresses Andrew, Per and Erik's comments:
>         http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/

Yes, yes. I'm pedalling as quickly as I can.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Mon Apr 20 17:22:12 2020
From: aph at redhat.com (Andrew Haley)
Date: Mon, 20 Apr 2020 18:22:12 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
	Concurrent Class Unloading
In-Reply-To: <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
 <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
Message-ID: <f113ec1f-540e-4ece-1442-f6064e627797@redhat.com>

On 4/20/20 3:19 PM, Stuart Monteith wrote:
> 	If anyone has bandwidth, would they be able to review this patch? It
> addresses Andrew, Per and Erik's comments:
>         http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/

It looks right. For clarity I wonder if perhaps we should have a method
bool BarrierSet::use_nmethod_barriers() or somesuch. It would be much
easier to read.
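
A minimal sketch of what such a helper could look like, assuming the
barrier set exposes an optional BarrierSetNMethod pointer that is null
when the GC does not use nmethod entry barriers. The classes below are
hypothetical stand-ins for illustration, not the real HotSpot
declarations:

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-in for the per-GC nmethod barrier support object.
class BarrierSetNMethod {};

// Hypothetical stand-in for BarrierSet; only the parts needed here.
class BarrierSet {
  BarrierSetNMethod* _barrier_set_nmethod;  // null when the GC has no nmethod barriers

public:
  explicit BarrierSet(BarrierSetNMethod* bs_nm) : _barrier_set_nmethod(bs_nm) {}

  BarrierSetNMethod* barrier_set_nmethod() const { return _barrier_set_nmethod; }

  // The proposed helper: callers ask the question directly instead of
  // open-coding a null check at every use site.
  bool use_nmethod_barriers() const { return _barrier_set_nmethod != nullptr; }
};
```

Call sites would then read "if (bs->use_nmethod_barriers())" rather than
comparing a pointer against null.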

In future we should perhaps not inline the guard value, and move the
infrequently-executed code out of line.

Then this:

  0x0000ffffa97fff14:   ldr	w8, 0x0000ffffa97fff3c
  0x0000ffffa97fff18:   dmb	ishld
  0x0000ffffa97fff1c:   ldr	w9, [x28, #36]
  0x0000ffffa97fff20:   cmp	w8, w9
  0x0000ffffa97fff24:   b.eq	0x0000ffffa97fff40  // b.none
 ;; 0xFFFFA9118B00
  0x0000ffffa97fff28:   mov	x8, #0x8b00                	// #35584
  0x0000ffffa97fff2c:   movk	x8, #0xa911, lsl #16
  0x0000ffffa97fff30:   movk	x8, #0xffff, lsl #32
  0x0000ffffa97fff34:   blr	x8
  0x0000ffffa97fff38:   b	0x0000ffffa97fff40
  0x0000ffffa97fff3c

would turn into this:

  :   ldr	w8, 0x0000ffffa97fff3c
  :   dmb	ishld
  :   ldr	w9, [x28, #36]
  :   cmp	w8, w9
  :   b.ne	0x0000ffffa97fff40  // b.none

but we don't have to do that right now. OK.
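
For readers less used to the assembly, the logic of the barrier can be
modelled roughly as below. This is only an illustrative sketch: the
struct and function names are hypothetical, the memory ordering that
the dmb provides is not modelled, and the assumption that the slow path
re-arms the guard to the thread's value is mine.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of the value loaded from the thread register (x28).
struct ThreadModel {
  int32_t disarmed_value;
};

// Hypothetical model of an nmethod carrying its guard word.
struct NMethodModel {
  int32_t guard_value;      // the word loaded by the first ldr
  int     slow_path_calls;  // counts how often the barrier stub was entered
};

// Returns true when the fast path was taken, i.e. the guard already
// matches the thread's value and the stub call is skipped.
inline bool nmethod_entry(NMethodModel& nm, const ThreadModel& thread) {
  if (nm.guard_value == thread.disarmed_value) {
    return true;                           // cmp + b.eq: nothing to do
  }
  ++nm.slow_path_calls;                    // blr x8: call the barrier stub
  nm.guard_value = thread.disarmed_value;  // assumption: the stub disarms the nmethod
  return false;
}
```

Moving the slow path out of line leaves only the two loads, the barrier,
the compare and a single conditional branch on the hot path.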

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From ci_notify at linaro.org  Mon Apr 20 18:06:39 2020
From: ci_notify at linaro.org (ci_notify at linaro.org)
Date: Mon, 20 Apr 2020 18:06:39 +0000 (UTC)
Subject: [aarch64-port-dev ] Linaro OpenJDK AArch64 jdk/jdk build 3385
	Failure
Message-ID: <265510322.18108.1587406000291.JavaMail.javamailuser@localhost>

OpenJDK AArch64 jdk/jdk build status is Failure
Build details -  https://ci.linaro.org/job/jdkX-ci-build/3385/

Changes -
  erikj: f499459eda7ae5bc843e072e0cfdd6201636456f 
	- make/autoconf/boot-jdk.m4
	- make/autoconf/version-numbers
	- make/conf/jib-profiles.js 
--"8242863: Bump minimum boot jdk to JDK 14
Reviewed-by: ihse, jlahoda, dholmes
"

Build output -
   Creating java.rmi.jmod
   Creating java.scripting.jmod
   Creating java.se.jmod
   Creating java.security.jgss.jmod
   Creating java.security.sasl.jmod
   Creating java.smartcardio.jmod
   Creating java.sql.jmod
   Creating java.sql.rowset.jmod
   Creating java.transaction.xa.jmod
   Creating java.xml.jmod
   Creating jdk.accessibility.jmod
   Creating java.xml.crypto.jmod
   Creating jdk.aot.jmod
   Creating jdk.charsets.jmod
   Creating jdk.attach.jmod
   Creating jdk.compiler.jmod
   Creating jdk.crypto.cryptoki.jmod
   Creating jdk.crypto.ec.jmod
   Creating jdk.dynalink.jmod
   Creating jdk.editpad.jmod
   Creating jdk.hotspot.agent.jmod
   Creating jdk.httpserver.jmod
   Creating jdk.incubator.foreign.jmod
   Creating jdk.incubator.jpackage.jmod
   Creating jdk.internal.ed.jmod
   Creating jdk.internal.jvmstat.jmod
   Creating jdk.internal.le.jmod
   Creating jdk.internal.opt.jmod
   Creating jdk.internal.vm.ci.jmod
   Creating jdk.internal.vm.compiler.jmod
   Creating jdk.internal.vm.compiler.management.jmod
   Creating jdk.jartool.jmod
   Creating jdk.javadoc.jmod
   Creating jdk.jcmd.jmod
   Creating jdk.jconsole.jmod
   Creating jdk.jdeps.jmod
   Creating jdk.jdwp.agent.jmod
   Creating jdk.jdi.jmod
   Creating jdk.jfr.jmod
   Creating jdk.jshell.jmod
   Creating jdk.jsobject.jmod
   Creating jdk.jstatd.jmod
   Creating jdk.localedata.jmod
   Creating jdk.management.jmod
   Creating jdk.management.agent.jmod
   Creating jdk.management.jfr.jmod
   Creating jdk.naming.dns.jmod
   Creating jdk.naming.rmi.jmod
   Creating jdk.net.jmod
   Creating jdk.nio.mapmode.jmod
   Creating jdk.sctp.jmod
   Creating jdk.security.auth.jmod
   Creating jdk.security.jgss.jmod
   Creating jdk.unsupported.jmod
   Creating jdk.unsupported.desktop.jmod
   Creating jdk.xml.dom.jmod
   Creating jdk.zipfs.jmod
   Creating interim jimage
   Compiling 3 files for BUILD_DEMO_CodePointIM
   Updating support/demos/image/jfc/CodePointIM/src.zip
   Compiling 3 files for BUILD_DEMO_FileChooserDemo
   Updating support/demos/image/jfc/FileChooserDemo/src.zip
   Compiling 29 files for BUILD_DEMO_SwingSet2
   Updating support/demos/image/jfc/SwingSet2/src.zip
   Compiling 3 files for BUILD_DEMO_Font2DTest
   Updating support/demos/image/jfc/Font2DTest/src.zip
   Compiling 64 files for BUILD_DEMO_J2Ddemo
   Updating support/demos/image/jfc/J2Ddemo/src.zip
   Compiling 15 files for BUILD_DEMO_Metalworks
   Updating support/demos/image/jfc/Metalworks/src.zip
   Compiling 2 files for BUILD_DEMO_Notepad
   Updating support/demos/image/jfc/Notepad/src.zip
   Compiling 5 files for BUILD_DEMO_Stylepad
   Updating support/demos/image/jfc/Stylepad/src.zip
   Compiling 5 files for BUILD_DEMO_SampleTree
   Updating support/demos/image/jfc/SampleTree/src.zip
   Compiling 8 files for BUILD_DEMO_TableExample
   Updating support/demos/image/jfc/TableExample/src.zip
   Compiling 1 files for BUILD_DEMO_TransparentRuler
   Updating support/demos/image/jfc/TransparentRuler/src.zip
   Creating support/demos/image/jfc/FileChooserDemo/FileChooserDemo.jar
   Creating support/demos/image/jfc/CodePointIM/CodePointIM.jar
   Creating support/demos/image/jfc/Font2DTest/Font2DTest.jar
   Creating support/demos/image/jfc/Metalworks/Metalworks.jar
   Creating support/demos/image/jfc/Notepad/Notepad.jar
   Creating support/demos/image/jfc/Stylepad/Stylepad.jar
   Creating support/demos/image/jfc/SampleTree/SampleTree.jar
   Creating support/demos/image/jfc/TableExample/TableExample.jar
   Creating support/demos/image/jfc/TransparentRuler/TransparentRuler.jar
   Creating support/demos/image/jfc/SwingSet2/SwingSet2.jar
   Compiling 1 files for CLASSLIST_JAR
   Creating support/demos/image/jfc/J2Ddemo/J2Ddemo.jar
   Creating support/classlist.jar
   Creating jdk.jlink.jmod
   Creating java.base.jmod
   Creating jdk image
   WARNING: Using incubator modules: jdk.incubator.jpackage, jdk.incubator.foreign
   Creating CDS archive for jdk image
   Stopping sjavac server
   Finished building target 'images' in configuration '/home/buildslave/workspace/jdkX-ci-build/build'

From stuart.monteith at arm.com  Mon Apr 20 19:35:19 2020
From: stuart.monteith at arm.com (Stuart Monteith)
Date: Mon, 20 Apr 2020 20:35:19 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
	Concurrent Class Unloading
In-Reply-To: <f113ec1f-540e-4ece-1442-f6064e627797@redhat.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
 <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
 <f113ec1f-540e-4ece-1442-f6064e627797@redhat.com>
Message-ID: <b24faaf4-c1e5-3002-a1b3-cf823f6e21b5@arm.com>

On 20/04/2020 18:22, Andrew Haley wrote:
> On 4/20/20 3:19 PM, Stuart Monteith wrote:
>> 	If anyone has bandwidth, would they be able to review this patch? It
>> addresses Andrew, Per and Erik's comments:
>>         http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/
> 
> It looks right. For clarity I wonder if perhaps we should have a method
> bool BarrierSet::use_nmethod_barriers() or somesuch. It would be much
> easier to read.
> 
That would be good. How about I apply that as a separate patch, as it
would necessarily be cross-platform.

> In future we should perhaps not inline the guard value, and move the
> infrequently-executed code out of line.
> 
> Then this:
> 
>   0x0000ffffa97fff14:   ldr	w8, 0x0000ffffa97fff3c
>   0x0000ffffa97fff18:   dmb	ishld
>   0x0000ffffa97fff1c:   ldr	w9, [x28, #36]
>   0x0000ffffa97fff20:   cmp	w8, w9
>   0x0000ffffa97fff24:   b.eq	0x0000ffffa97fff40  // b.none
>  ;; 0xFFFFA9118B00
>   0x0000ffffa97fff28:   mov	x8, #0x8b00                	// #35584
>   0x0000ffffa97fff2c:   movk	x8, #0xa911, lsl #16
>   0x0000ffffa97fff30:   movk	x8, #0xffff, lsl #32
>   0x0000ffffa97fff34:   blr	x8
>   0x0000ffffa97fff38:   b	0x0000ffffa97fff40
>   0x0000ffffa97fff3c
> 
> would turn into this:
> 
>   :   ldr	w8, 0x0000ffffa97fff3c
>   :   dmb	ishld
>   :   ldr	w9, [x28, #36]
>   :   cmp	w8, w9
>   :   b.ne	0x0000ffffa97fff40  // b.none
> 
> but we don't have to do that right now. OK.
> 

That would be ideal - a CodeStub to handle the slow path would then take
the responsibility for recording the relative location of the guard
value - currently that happens to be fixed.

I think I'd prefer to do that as a separate patch while I work out the
details.

Thanks for the review,
	Stuart

From david.holmes at oracle.com  Tue Apr 21 04:42:29 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 21 Apr 2020 14:42:29 +1000
Subject: [aarch64-port-dev ] Procedural issue regarding code reviews for
	mainline changes
Message-ID: <e8f9e6f1-6837-f1c4-ad1a-a911f163518c@oracle.com>

Hi everyone,

aarch64-port-dev at openjdk.java.net is the mailing list for the 
aarch64-port project:

http://openjdk.java.net/projects/aarch64-port/

which was set up to get the Aarch64 port into OpenJDK 8. While the 
mailing list still serves as a good place for people to specifically 
discuss issues around the Aarch64 port, all code reviews for changes to 
be pushed to mainline openjdk (not the aarch64-port project repo) should 
be occurring on the appropriate mainline hotspot*-dev mailing list 
(which corresponds to the 'appropriate development list' as per 
http://openjdk.java.net/contribute/).

Thanks,
David

From thomas.stuefe at gmail.com  Tue Apr 21 05:18:30 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 21 Apr 2020 07:18:30 +0200
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <6b38ee93-6961-0a34-4b91-0d8cedce9ddd@oracle.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <be1c7e88-7c32-c566-5b10-ce189bfe9652@oracle.com>
 <CAA-vtUz1nhi_q_0QbBk+BG3cY5L73YoF1G3vsxMTtDgD=S832Q@mail.gmail.com>
 <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>
 <CAA-vtUw_07m7oVB2ycyhceDFAcRx3Q0si4g0jcwc2W=VvUs5pQ@mail.gmail.com>
 <eb148861-1adc-705a-60c0-c2c81fc760e4@oracle.com>
 <CAA-vtUydJ9BMAEU8=3HCGb3UG3NiE_tRFD6hORqCB1ba8KXVrA@mail.gmail.com>
 <6b38ee93-6961-0a34-4b91-0d8cedce9ddd@oracle.com>
Message-ID: <CAA-vtUw5LtrXCaMmgGUduX0GCRPxmemd2w_1PkkpcP6at9OFgg@mail.gmail.com>

On Tue, Apr 21, 2020 at 6:43 AM Ioi Lam <ioi.lam at oracle.com> wrote:

>
>
> On 4/20/20 4:10 AM, Thomas Stüfe wrote:
>
> On Mon, Apr 20, 2020 at 10:47 AM Ioi Lam <ioi.lam at oracle.com> wrote:
>
>>
>>
>> On 4/18/20 12:15 AM, Thomas Stüfe wrote:
>>
>> Hi Ioi,
>>
>> I am working on a small patch and have some more questions.
>>
>> - First, a simple one, in
>> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta(), the
>> space does not have anything to do with metaspace, as you wrote, so the
>> alignment could be anything, right?
>>
>> I think so.
>>
>> - Out of curiosity, when you pack the different regions
>> (DumpRegion::pack) you align the end to page size. Why? Why could the next
>> region not simply follow immediately? I looked if any code needs a region
>> to be page aligned, but may have missed it.
>>
>>
>> We map RO read-only and MC/RW in read-write. If the regions are not
>> aligned, you will have a page that wants half to be read-only and half to
>> be read-write.
>>
>>
> Okay. I wondered why page align here and not allocation granularity. Now I
> understand. I guess this is also the reason why we could not use large
> pages for the archive?
>
> I think this is fine, I did not want to change it. On some platforms we
> have 64K (non-large) pages, but even there I think the waste would be
> acceptable.
>
>
>> I guess we can adjust the mapping to be more lenient (if a page wants
>> half read-write, we map it read-write), but that's not done today.
>>
>>
>> - void MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() :
>>
>> I assume this code has to work for all three cases, right?
>> 1) lp32
>> 2) lp64 with UseCompressedClassPointers
>> 3) lp64 without UseCompressedClassPointers
>>
>> If yes, does the setting for UseCompressedClassPointers have to be the
>> same at run time?
>>
>>
>> Yes. The value of UseCompressedOops and UseCompressedClassPointers must
>> be the same between dump time and run time.
>>
>>
>>
>> In this layout:
>>   // On 64-bit VM, the heap and class space layout will be the same as if
>>   // you're running in -Xshare:on mode:
>>   //
>>   //                              +-- SharedBaseAddress (default =
>> 0x800000000)
>>   //                              v
>>   // +-..---------+---------+ ... +----+----+----+--------------------+
>>   // |    Heap    | Archive |     | MC | RW | RO |    class space     |
>>   // +-..---------+---------+ ... +----+----+----+--------------------+
>>   // |<--   MaxHeapSize  -->|     |<-- UnscaledClassSpaceMax = 4GB -->|
>>   //
>>
>> Why does the class space have to follow mc+rw+ro? Could it come before?
>>
>>
>> Compressed klass pointers are stored in archived objects. If the class
>> space is now lower than SharedBaseAddress, you will need to rebase all of
>> the compressed klass pointers. This is not efficient and will slow down
>> start-up.
>>
>>
> Well, could SharedBaseAddress not point to start of the ccs:
>
>   // +-- SharedBaseAddress (default = 0x800000000)
>   // v
>   // +----+----+----+-----------------------------------+
>   // |    class space     | ..gap maybe.. | MC | RW | RO
>   // +----+----+----+-----------------------------------+
>
> you'd then need to make sure that the relative offset of MC to
> SharedBaseAddress is the same at dump time and at runtime. Is my
> understanding correct? I am not saying I want to do this, I just try to
> understand the way ccs archive allocation works.
>
>
> That should work. But you are still using a fixed offset from the bottom
> of MC to SharedBaseAddress (instead of a fixed address of 0). I am not sure
> if that will buy you any flexibility.
>
>
>
>>
>>
>> Actually, does it have to be in the same space at all, or could it live
>> somewhere completely different?
>>
>>
>> It can be higher. You just need to ensure that the distance from
>> SharedBaseAddress to the end of the class space is within max compressed
>> klass space size.
>>
>> But, I am wondering why you're asking this :-)
>>
>>
> I try to understand the allocation and where apply what restrictions. We
> have at least three parties, cds, metaspace and the underlying platform,
> all with their own subtleties of how the memory should be allocated:
> - metaspace will in the near future want a larger alignment than what cds
> uses for reservation.
> - platforms like aarch64 and maybe ppc want the compressed class base to
> look in a certain way
>
> Part of my confusion was that I always thought of
> CompressedKlassPointers::base() to be basically the same as the start of the
> ccs (maybe modulo being zero on zero-based mode) but that is obviously not
> true since CDS exists. So what I wrote first:
>
> "Metaspace::reserve_preferred_space.. Despite its generic-sounding name,
> these functions can only be used to allocate ccs."
>
> is actually not fully correct. In reality this space is to be used to
> allocate memory to house Klass structures so that their pointers are
> compressible, so the reserved start address has to be compatible with that.
> But, e.g., that start address does not have to be aligned to
> Metaspace::reserve_alignment().
>
> That's very true. I have some logic in CDS to pick the greater of
> Metaspace::reserve_alignment() and os alignment. I think this is probably
> unnecessary.
>
> Maybe we should simply untangle CDS/CCS from Metaspace altogether? We
> really just need a 1GB reservation to be anchored at 32GB (or some address
> that aarch64 likes). That way, you can do whatever you want with Metaspace
> and not worry about CDS/CCS.
>
>
Yes, this is what I am trying to do. CDS and Metaspace do not have much in
common beyond the ccs reservation. I still like your first proposal of
moving the creation of ReservedSpace for ccs out of Metaspace altogether. I
also plan to change the code so that Metaspace::reserve_alignment is not
used in cds anymore, only at that one or two places where it is really
needed.
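
To make the two constraints discussed in this thread concrete, here is a
small sketch: page-aligning the end of a region, and checking that the
class space still ends within the range reachable from the encoding
base. The function names and the use of 4GB as the limit (taken from
the UnscaledClassSpaceMax label in the layout diagram quoted above) are
illustrative assumptions, not the actual CDS or Metaspace code:

```cpp
#include <cassert>
#include <cstdint>

// Round value up to the next multiple of alignment.
// alignment must be a power of two (e.g. a page size).
inline uint64_t align_up(uint64_t value, uint64_t alignment) {
  return (value + alignment - 1) & ~(alignment - 1);
}

// True when the distance from the encoding base to the end of the class
// space fits into what an unscaled 32-bit narrow klass pointer can encode.
inline bool fits_unscaled_encoding(uint64_t base, uint64_t ccs_end) {
  const uint64_t kUnscaledClassSpaceMax = uint64_t(1) << 32;  // 4GB, assumed limit
  return ccs_end >= base && (ccs_end - base) <= kUnscaledClassSpaceMax;
}
```

With checks like these, where the ReservedSpace comes from no longer
matters to Metaspace, as long as the caller hands over a range that
satisfies them.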

..Thomas


> Thanks
> - Ioi
>
> In both cds dump and runtime case, the ccs is carved from the end part of
> the reserved space. Only that split point, and the size of that second
> part, have to be aligned to Metaspace::reserve_alignment().
>
> Were we to allocate ccs first and put the archives behind it this would
> simplify some matters, but only minor points. I think the way it works now
> is okay. I will try to disentangle it a bit in a way you proposed.
>
>
>> To ask in a more precise way: I understand that both the mc+rw+ro
>> archives and the ccs have to live in an area encompassed by the compressed
>> class pointers encoding scheme. I wonder whether there are any restrictions
>> beyond that.
>>
>> Could there be a gap between archives and ccs?
>>
>> Yes
>>
>> Can the order be reversed?
>>
>> No.
>>
>> Do the relative positions between archives and ccs have to be the same
>> between dump time and runtime?
>>
>> No. All the pointers stored inside CDS point to inside of the MC/RW/RO
>> regions, so it doesn't retain any knowledge of where the CCS was at dump
>> time.
>>
>>
> Clear answers, thank you!
>
> ..Thomas
>
>
>>
>> Thanks
>> - Ioi
>>
>>
>>
>> Thanks!
>>
>> On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>
>>>
>>>
>>> On 4/16/20 11:14 AM, Thomas Stüfe wrote:
>>>
>>> Hi Ioi,
>>>
>>> On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>>
>>>> (I suppose you mean "compressed class space" by "ccs" :-)
>>>>
>>>>
>>> Yes, I think I stole this from Stefan Karlsson :)
>>>
>>>
>>>> <snip>
>>>>
>>>
>>>
>>>> I am not even sure if case (C) can happen at all.
>>>>
>>>> I admit that I've been guilty of making the interface even more
>>>> complicated
>>>> with JDK-8231610 <https://bugs.openjdk.java.net/browse/JDK-8231610>
>>>> (Relocate the CDS archive if it cannot be mapped to the
>>>> requested address). Looks like now is a good time to clean up.
>>>>
>>>>
>>> The coding has been complicated to begin with, and then it usually only
>>> gets worse since no-one has time for a revamp :( A clean up would be very
>>> helpful.
>>>
>>> One reason I look at this coding now, beside the aarch64 problem, was
>>> that I try to disentangle CDS from Metaspace, especially the alignment
>>> policy. Remember, I tried to tackle this last summer, but it keeps biting
>>> me. For such a small problem this is weirdly complicated.
>>>
>>>
>>>> One thing that can be cleaned up is the call to
>>>> Metaspace::allocate_metaspace_compressed_klass_ptrs:
>>>>
>>>> (a) when CDS is enabled:
>>>>
>>>>     Metaspace::global_initialize()
>>>>     -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>>>>        -> ... MetaspaceShared::map_archives()
>>>>          -> ... reserve the space, eventually calling
>>>> Metaspace::reserve_space
>>>>          -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>>>>
>>>> (b) when CDS is disabled
>>>>
>>>>     Metaspace::global_initialize()
>>>>     -> allocate_metaspace_compressed_klass_ptrs
>>>>        -> (if cds is not enabled) Metaspace::reserve_space()
>>>>
>>>>
>>>> In case (b), we should first reserve the space, and then call into
>>>> allocate_metaspace_compressed_klass_ptrs. This will simplify the
>>>> arguments
>>>> of allocate_metaspace_compressed_klass_ptrs, and will also limit the
>>>> variations
>>>> of calls to Metaspace::reserve_space(). I think this will make it
>>>> possible to
>>>> drop the use_requested_addr argument and rely simply on (requested_addr
>>>> != NULL)
>>>>
>>>>
>>> So, in all cases we'd pre-reserve the ReservedSpace and hand it down to
>>> Metaspace::allocate_metaspace_compressed_klass_ptrs()?
>>>
>>> This would melt down
>>> Metaspace::allocate_metaspace_compressed_klass_ptrs() to just "initialize
>>> compressed class space from a pre-arranged ReservedSpace, and set up base +
>>> shift".
>>>
>>> We could probably rename that thing
>>> to Metaspace::set_up_compressed_klass_space(ReservedSpace* rs, cds_base);
>>>
>>> We even could move set_narrow_klass_base_and_shift() out of
>>> Metaspace::set_up_compressed_klass_space, then it becomes a series of three
>>> simple operations:
>>> 1) obtain a ReservedSpace however you see fit
>>> 2) register it with Metaspace as address space for ccs,
>>> 3) set_narrow_klass_base_and_shift. We would not have to hand down
>>> cds_base to Metaspace, only for it to be used as base address
>>> in set_narrow_klass_base_and_shift.
>>>
>>>
>>> Yes, that seems the right thing to do. That will hopefully make the
>>> aarch64 initialization code a little simpler as well.
>>>
>>> One question which came to me today was:
>>>
>>> In AppCDS, DynamicArchiveBuilder::do_it() calls
>>> Metaspace::reserve_space(). Is that really needed, does a DumpRegion have
>>> anything to do with ccs? Don't they just need some space to dump into? Hope
>>> that question is not dumb.
>>>
>>> Do you mean:
>>>
>>> DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
>>> -> MetaspaceShared::reserve_shared_space
>>>     -> Metaspace::reserve_space
>>>
>>> That's not necessary. When I wrote the code I thought
>>> Metaspace::reserve_space was a general function for reserving spaces :-)
>>> but as you said, this function is probably intended only for initializing
>>> the CCS.
>>>
>>> Thanks
>>> - Ioi
>>>
>>> Thanks, Thomas
>>>
>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>>
>>>> Does that make sense? In other words, if the whole point of
>>>> Metaspace::reserve_preferred_space() is "OS knows better, let it try
>>>> to find a good address", would it not make sense to just try a low
>>>> address as part of the try-addresses-loop?
>>>>
>>>> We certainly don't want to have to use a dedicated heapbase register
>>>> or a shift. Just give us a multiple of 4*G and we're happy.
>>>>
>>>>
>>>>
>>>>
>>>
>>
>

From nick.gasson at arm.com  Tue Apr 21 06:03:05 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Tue, 21 Apr 2020 14:03:05 +0800
Subject: [aarch64-port-dev ] Procedural issue regarding code reviews for
 mainline changes
In-Reply-To: <e8f9e6f1-6837-f1c4-ad1a-a911f163518c@oracle.com>
References: <e8f9e6f1-6837-f1c4-ad1a-a911f163518c@oracle.com>
Message-ID: <85wo69ml46.fsf@arm.com>

Hi David,

>
> aarch64-port-dev at openjdk.java.net is the mailing list for the
> aarch64-port project:
>
> http://openjdk.java.net/projects/aarch64-port/
>
> which was set up to get the Aarch64 port into OpenJDK 8. While the
> mailing list still serves as a good place for people to specifically
> discuss issues around the Aarch64 port, all code reviews for changes to
> be pushed to mainline openjdk (not the aarch64-port project repo) should
> be occurring on the appropriate mainline hotspot*-dev mailing list
> (which corresponds to the 'appropriate development list' as per
> http://openjdk.java.net/contribute/).
>

Is there a specific change you're referring to? I checked the RFRs on
this list in the last few months and I can't find any that aren't also
To/CC the appropriate hotspot-* mailing list. For patches from Arm the
policy we've been following is to send to the relevant hotspot-* list
and also copy aarch64-port-dev if it touches AArch64-specific code. I
think the way Mailman is configured, if a message is sent to multiple
lists you are subscribed to then you'll only receive it from one of
those.


Thanks,
Nick

From david.holmes at oracle.com  Tue Apr 21 06:25:01 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 21 Apr 2020 16:25:01 +1000
Subject: [aarch64-port-dev ] Procedural issue regarding code reviews for
 mainline changes
In-Reply-To: <85wo69ml46.fsf@arm.com>
References: <e8f9e6f1-6837-f1c4-ad1a-a911f163518c@oracle.com>
 <85wo69ml46.fsf@arm.com>
Message-ID: <074188cf-fe78-2b80-ed2d-158eb19c1adf@oracle.com>

Hi Nick,

On 21/04/2020 4:03 pm, Nick Gasson wrote:
> Hi David,
> 
>>
>> aarch64-port-dev at openjdk.java.net is the mailing list for the
>> aarch64-port project:
>>
>> http://openjdk.java.net/projects/aarch64-port/
>>
>> which was set up to get the Aarch64 port into OpenJDK 8. While the
>> mailing list still serves as a good place for people to specifically
>> discuss issues around the Aarch64 port, all code reviews for changes to
>> be pushed to mainline openjdk (not the aarch64-port project repo) should
>> be occurring on the appropriate mainline hotspot*-dev mailing list
>> (which corresponds to the 'appropriate development list' as per
>> http://openjdk.java.net/contribute/).
>>
> 
> Is there a specific change you're referring to? I checked the RFRs on
> this list in the last few months and I can't find any that aren't also
> To/CC the appropriate hotspot-* mailing list. For patches from Arm the
> policy we've been following is to send to the relevant hotspot-* list
> and also copy aarch64-port-dev if it touches AArch64-specific code. I
> think the way Mailman is configured, if a message is sent to multiple
> lists you are subscribed to then you'll only receive it from one of
> those.

My apologies to all. I saw a JBS update with a link to a RFR thread on 
aarch64-dev and then failed to find the corresponding mail on 
hotspot-compiler-dev.

Sorry for the noise.

David
-----

> 
> Thanks,
> Nick
> 

From Pengfei.Li at arm.com  Tue Apr 21 06:35:32 2020
From: Pengfei.Li at arm.com (Pengfei Li)
Date: Tue, 21 Apr 2020 06:35:32 +0000
Subject: [aarch64-port-dev ] Procedural issue regarding code reviews for
 mainline changes
In-Reply-To: <074188cf-fe78-2b80-ed2d-158eb19c1adf@oracle.com>
References: <e8f9e6f1-6837-f1c4-ad1a-a911f163518c@oracle.com>
 <85wo69ml46.fsf@arm.com> <074188cf-fe78-2b80-ed2d-158eb19c1adf@oracle.com>
Message-ID: <DB8PR08MB49696BBBB4F746B5D01E33D696D50@DB8PR08MB4969.eurprd08.prod.outlook.com>

Hi David,

> My apologies to all. I saw a JBS update with a link to a RFR thread on
> aarch64-dev and then failed to find the corresponding mail on hotspot-
> compiler-dev.
> 
> Sorry for the noise.

Perhaps you saw this one https://bugs.openjdk.java.net/browse/JDK-8242070

I have updated the JBS comment and will use the hotspot-*-dev URLs as official review thread links in the future. Anyway, thanks for the reminder.

--
Thanks,
Pengfei


From david.holmes at oracle.com  Tue Apr 21 06:47:07 2020
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 21 Apr 2020 16:47:07 +1000
Subject: [aarch64-port-dev ] Procedural issue regarding code reviews for
 mainline changes
In-Reply-To: <DB8PR08MB49696BBBB4F746B5D01E33D696D50@DB8PR08MB4969.eurprd08.prod.outlook.com>
References: <e8f9e6f1-6837-f1c4-ad1a-a911f163518c@oracle.com>
 <85wo69ml46.fsf@arm.com> <074188cf-fe78-2b80-ed2d-158eb19c1adf@oracle.com>
 <DB8PR08MB49696BBBB4F746B5D01E33D696D50@DB8PR08MB4969.eurprd08.prod.outlook.com>
Message-ID: <6cd09601-5904-2c07-1e9b-4f8975c5ccf4@oracle.com>

Hi Pengfei,

On 21/04/2020 4:35 pm, Pengfei Li wrote:
> Hi David,
> 
>> My apologies to all. I saw a JBS update with a link to a RFR thread on
>> aarch64-dev and then failed to find the corresponding mail on hotspot-
>> compiler-dev.
>>
>> Sorry for the noise.
> 
> Perhaps you saw this one https://bugs.openjdk.java.net/browse/JDK-8242070

Yes that was the one.

> I have updated the JBS comment and will use the hotspot-*-dev URLs as official review thread links in the future. Anyway, thanks for the reminder.

Thank you for doing that, but it was my mistake.

Cheers,
David

> --
> Thanks,
> Pengfei
> 

From per.liden at oracle.com  Tue Apr 21 08:20:02 2020
From: per.liden at oracle.com (Per Liden)
Date: Tue, 21 Apr 2020 10:20:02 +0200
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
	Concurrent Class Unloading
In-Reply-To: <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
 <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
Message-ID: <90c015a8-3db9-b7bd-f8d3-f05f5e6458d3@oracle.com>

Looks good to me.

One minor thing, you no longer need -XX:+UnlockExperimentalVMOptions in 
test/hotspot/jtreg/gc/stress/gcbasher/TestGCBasherWithZ.java. I don't 
need to see another webrev for that.

cheers,
Per

On 4/20/20 4:19 PM, Stuart Monteith wrote:
> Hi,
> 	If anyone has bandwidth, would they be able to review this patch? It
> addresses Andrew, Per and Erik's comments:
>          http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/
> 
> Thanks,
> 	Stuart
> 
> 
> On 27/03/2020 09:47, Erik Österlund wrote:
>> Hi Stuart,
>>
>> Thanks for sorting this out on AArch64. It is nice to see that you can
>> implement these
>> barriers on platforms that do not have instruction cache coherency.
>>
>> One small change request:
>> It looks like in C1 you inject the entry barrier right after build_frame
>> is done:
>>
>>        build_frame();
>>        {
>>          // Insert nmethod entry barrier into frame.
>>          BarrierSetAssembler* bs =
>> BarrierSet::barrier_set()->barrier_set_assembler();
>>          bs->nmethod_entry_barrier(_masm);
>>        }
>>
>> Unfortunately, this is in the platform independent part of the LIR
>> assembler. In the x86 version
>> we inject it at the very end of build_frame() instead, which is a
>> platform-specific function.
>> The platform-specific function is in the C1 macro assembler file for
>> that platform.
>>
>> We intentionally put it in the platform-specific path as it is a
>> platform-specific feature.
>> Now on x86, the barrier code will be emitted once in build_frame() and
>> once after returning
>> from build_frame, resulting in two nmethod entry barriers, and only the
>> first one will get
>> patched, causing the second one to mostly take slow paths, which isn't
>> necessarily wrong,
>> but will cause regressions.
>>
>> I would propose you just move those lines into the very end of the
>> AArch64-specific part of
>> build_frame().
>>
>> I don't need to see another webrev for that trivial code motion. This
>> looks good to me.
>> Again, thanks a lot for fixing this! It will allow me to go forward with
>> concurrent stack
>> scanning on AArch64 as well.
>>
>> Thanks,
>> /Erik
>>
>>
>> On 2020-03-26 23:42, Stuart Monteith wrote:
>>> Hello,
>>>     Please review this change to implement nmethod entry barriers on
>>> aarch64, and hence concurrent class unloading with ZGC. Shenandoah will
>>> need to be separately tested and enabled - there are problems with this
>>> on Shenandoah.
>>>
>>> It has been tested with JTreg, runs with SPECjbb, gcbench, and Lucene as
>>> well as Netbeans.
>>>
>>> In terms of interesting features:
>>>     With nmethod entry barriers, immediate oops are removed by:
>>>         LIR_Assembler::jobject2reg and MacroAssembler::movoop
>>>     This is to ensure consistency with the entry barrier, as with an
>>> immediate we'd otherwise need an ISB.
>>>
>>>     I've added "-XX:DeoptNMethodBarrierALot". I found this
>>> functionality
>>> useful in testing as deoptimisation is very infrequent. I've written it
>>> as an atomic to avoid it happening too frequently. As it is a new
>>> option, I'm not sure whether any more is needed than this review. A new
>>> test has been added
>>> "test/hotspot/jtreg/gc/stress/gcbasher/TestGCBasherDeoptWithZ.java" to
>>> test GC with that option enabled.
>>>
>>>     BarrierSetAssembler::nmethod_entry_barrier
>>>     This method emits the barrier code. In internal review it was
>>> suggested
>>> the "dmb( ISHLD )" should be replaced by "membar(LoadLoad)". I've not
>>> done this as the BarrierSetNMethod code checks the exact instruction
>>> sequence, and I prefer to be explicit.
>>>
>>>     Benchmarking method entry shows an increase of around 6ns
>>> with the
>>> nmethod entry barrier.
>>>
>>>
>>> The deoptimisation code was contributed by Andrew Haley.
>>>
>>> The bug:
>>>     https://bugs.openjdk.java.net/browse/JDK-8216557
>>>
>>> The webrev:
>>>     http://cr.openjdk.java.net/~smonteith/8216557/webrev.0/
>>>
>>>
>>> BR,
>>>     Stuart
> 

From kuaiwei.kw at alibaba-inc.com  Mon Apr 20 11:12:55 2020
From: kuaiwei.kw at alibaba-inc.com (Kuai Wei)
Date: Mon, 20 Apr 2020 19:12:55 +0800
Subject: [aarch64-port-dev ] RFR: heapbase register can be allocated in
 compressed mode
In-Reply-To: <84c21683-eaba-5598-6a1d-c58abdb39014@redhat.com>
References: <613724a7-1dd1-448d-aaaa-dbbe0d0beca4.kuaiwei.kw@alibaba-inc.com>
 <9f991f61-2d59-ca87-d68e-7b8c257d9be4@redhat.com>
 <ef37fac9-364b-442c-88ef-eb0cc9855cb5.kuaiwei.kw@alibaba-inc.com>
 <7bd76285-b58d-5359-85ed-4430288a675e@redhat.com>
 <0c6fdf72-3c83-4563-8d13-45e83ee70310.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB49694C72021174B76D032E5C96DD0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <a69c8cd9-b14e-4e5a-95f1-604197534d98.kuaiwei.kw@alibaba-inc.com>
 <8E4A835E-3853-40BA-B44F-DD0A4ECC0308@amazon.com>
 <DB8PR08MB4969EB393F9ACF9724A6D4C796DA0@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <78D18021-2129-485A-8407-A37D385D0DE6@amazon.com>
 <229d2a57-8fd0-4826-889d-cca833ca19f3.kuaiwei.kw@alibaba-inc.com>
 <781CB090-0386-4D32-8465-8238E516789B@amazon.com>
 <77fd9246-b951-47b9-9743-11aa3fd851bd.kuaiwei.kw@alibaba-inc.com>
 <DB8PR08MB4969167A32C497396AF1693396D40@DB8PR08MB4969.eurprd08.prod.outlook.com>
 <d8bd968e-b376-d9ae-dcc7-9d79e2c382ac@redhat.com>,
 <84c21683-eaba-5598-6a1d-c58abdb39014@redhat.com>
Message-ID: <74ad538f-3247-4b31-832f-b3cb1bd9f41a.kuaiwei.kw@alibaba-inc.com>

Hi Andrew,

  Could you give more detail about it? I can start a new patch for it if it breaks anything.

Kuai Wei


------------------------------------------------------------------
From:Andrew Haley <aph at redhat.com>
Send Time: 2020-04-20 (Monday) 18:01
To:Pengfei Li <Pengfei.Li at arm.com>; Kuai Wei <kuaiwei.kw at alibaba-inc.com>; "Liu, Xin" <xxinliu at amazon.com>; hotspot compiler <hotspot-compiler-dev at openjdk.java.net>
Cc:nd <nd at arm.com>; aarch64-port-dev at openjdk.java.net <aarch64-port-dev at openjdk.java.net>
Subject:Re: RFR: heapbase register can be allocated in compressed mode

On 4/20/20 9:48 AM, Andrew Haley wrote:
> On 4/20/20 5:32 AM, Pengfei Li wrote:
>> Maybe Andrew Haley or other AArch64 reviewers can help?
>>
>> [1] http://cr.openjdk.java.net/~wzhuo/8242449/webrev.01/
> It's fine.

Sorry, no it isn't fine. Please get rid of this hunk:

--- old/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp 2020-04-14 21:18:52.009758661 +0800
+++ new/src/hotspot/cpu/aarch64/macroAssembler_aarch64.cpp 2020-04-14 21:18:51.785764043 +0800
@@ -2185,6 +2185,10 @@
 #if 0
   assert (UseCompressedOops || UseCompressedClassPointers, "should be compressed");
   assert (Universe::heap() != NULL, "java heap should be initialized");
+  if (!UseCompressedOops || Universe::ptr_base() == NULL) {
+    // rheapbase is allocated as general register
+    return;
+  }
   if (CheckCompressedOops) {
     Label ok;
     push(1 << rscratch1->encoding(), sp); // cmpptr trashes rscratch1

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671

From ioi.lam at oracle.com  Tue Apr 21 04:42:56 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Mon, 20 Apr 2020 21:42:56 -0700
Subject: [aarch64-port-dev ] Question about ccs reservation,
	CDS and aarch64 specifics
In-Reply-To: <CAA-vtUydJ9BMAEU8=3HCGb3UG3NiE_tRFD6hORqCB1ba8KXVrA@mail.gmail.com>
References: <CAA-vtUw3TNez8=_bBnjaNqcgRpUoERWfx-q4VzS5xNPwghh_9Q@mail.gmail.com>
 <9cf0d56e-77bb-7f1e-3a01-bc62c4d39486@redhat.com>
 <be1c7e88-7c32-c566-5b10-ce189bfe9652@oracle.com>
 <CAA-vtUz1nhi_q_0QbBk+BG3cY5L73YoF1G3vsxMTtDgD=S832Q@mail.gmail.com>
 <4589f23a-96ce-781a-8d78-3c4abcabc902@oracle.com>
 <CAA-vtUw_07m7oVB2ycyhceDFAcRx3Q0si4g0jcwc2W=VvUs5pQ@mail.gmail.com>
 <eb148861-1adc-705a-60c0-c2c81fc760e4@oracle.com>
 <CAA-vtUydJ9BMAEU8=3HCGb3UG3NiE_tRFD6hORqCB1ba8KXVrA@mail.gmail.com>
Message-ID: <6b38ee93-6961-0a34-4b91-0d8cedce9ddd@oracle.com>


On 4/20/20 4:10 AM, Thomas Stüfe wrote:
> On Mon, Apr 20, 2020 at 10:47 AM Ioi Lam <ioi.lam at oracle.com 
> <mailto:ioi.lam at oracle.com>> wrote:
>
>
>
>     On 4/18/20 12:15 AM, Thomas Stüfe wrote:
>>     Hi Ioi,
>>
>>     I am working on a small patch and have some more questions.
>>
>>     - First, a simple one, in
>>     DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta(),
>>     the space does not have anything to do with metaspace, as you
>>     wrote, so the alignment could be anything, right?
>>
>     I think so.
>
>>     - Out of curiosity, when you pack the different regions
>>     (DumpRegion::pack) you align the end to page size. Why? Why could
>>     the next region not simply follow immediately? I looked if any
>>     code needs a region to be page aligned, but may have missed it.
>
>     We map RO read-only and MC/RW in read-write. If the regions are
>     not aligned, you will have a page that wants half to be read-only
>     and half to be read-write.
>
>
> Okay. I wondered why page align here and not allocation granularity. 
> Now I understand. I guess this is also the reason why we could not use 
> large pages for the archive?
>
> I think this is fine, I did not want to change it. On some platforms 
> we have 64K (non-large) pages, but even there I think the waste would 
> be acceptable.
>
>     I guess we can adjust the mapping to be more lenient (if a page
>     wants half read-write, we map it read-write), but that's not done
>     today.
>
>>
>>     - void
>>     MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() :
>>
>>     I assume this code has to work for all three cases, right?
>>     1) lp32
>>     2) lp64 with UseCompressedClassPointers
>>     3) lp64 without UseCompressedClassPointers
>>
>>     If yes, does the setting for UseCompressedClassPointers have to
>>     be the same at run time?
>
>     Yes. The value of UseCompressedOops and UseCompressedClassPointers
>     must be the same between dump time and run time.
>
>>
>>
>>     In this layout:
>>       // On 64-bit VM, the heap and class space layout will be the same as if
>>       // you're running in -Xshare:on mode:
>>       //
>>       //                              +-- SharedBaseAddress (default = 0x800000000)
>>       //                              v
>>       // +-..---------+---------+ ... +----+----+----+--------------------+
>>       // |    Heap    | Archive |     | MC | RW | RO |    class space     |
>>       // +-..---------+---------+ ... +----+----+----+--------------------+
>>       // |<--   MaxHeapSize  -->|     |<-- UnscaledClassSpaceMax = 4GB -->|
>>       //
>>
>>     Why does the class space have to follow mc+rw+ro? Could it come
>>     before?
>>
>>
>     Compressed klass pointers are stored in archived objects. If the
>     class space is now lower than SharedBaseAddress, you will need to
>     rebase all of the compressed klass pointers. This is not efficient
>     and will slow down start-up.
>
>
> Well, could SharedBaseAddress not point to start of the ccs:
>
>   // +-- SharedBaseAddress (default = 0x800000000)
>   // v
>   // +----+----+----+-----------------------------------+
>   // |    class space     | ..gap maybe.. | MC | RW | RO
>   // +----+----+----+-----------------------------------+
>
> you'd then need to make sure that the relative offset of MC to 
> SharedBaseAddress is the same at dump time and at runtime. Is my 
> understanding correct? I am not saying I want to do this, I just try 
> to understand the way ccs archive allocation works.

That should work. But you are still using a fixed offset from the bottom 
of MC to SharedBaseAddress (instead of a fixed address of 0). I am not 
sure if that will buy you any flexibility.

>
>>
>>     Actually, does it have to be in the same space at all, or could
>>     it live somewhere completely different?
>
>     It can be higher. You just need to ensure that the distance
>     between SharedBaseAddress and the end of the class space is within
>     max compressed klass space size.
>
>     But, I am wondering why you're asking this :-)
>
>
> I am trying to understand the allocation and which restrictions apply where. 
> We have at least three parties, cds, metaspace and the underlying 
> platform, all with their own subtleties of how the memory should be 
> allocated:
> - metaspace will in the near future want a larger alignment than what 
> cds uses for reservation.
> - platforms like aarch64 and maybe ppc want the compressed class base 
> to look in a certain way
>
> Part of my confusion was that I always thought of 
> CompressedKlassPointers::base() to be basically the same as the start of 
> the ccs (maybe modulo being zero on zero-based mode) but that is 
> obviously not true since CDS exists. So what I wrote first:
>
> "Metaspace::reserve_preferred_space.. Despite its generic-sounding 
> name, these functions can only be used to allocate ccs."
>
> is actually not fully correct. In reality this space is to be used to 
> allocate memory to house Klass structures so that their pointers are 
> compressible, so the reserved start address has to be compatible with 
> that. But, e.g., that start address does not have to be aligned to 
> Metaspace::reserve_alignment().
>
That's very true. I have some logic in CDS to pick the greater of 
Metaspace::reserve_alignment() and os alignment. I think this is 
probably unnecessary.

Maybe we should simply untangle CDS/CCS from Metaspace altogether? We 
really just need a 1GB reservation to be anchored at 32GB (or some 
address that aarch64 likes). That way, you can do whatever you want with 
Metaspace and not worry about CDS/CCS.

Thanks
- Ioi
> In both cds dump and runtime case, the ccs is carved from the end part 
> of the reserved space. Only that split point, and the size of that 
> second part, have to be aligned to Metaspace::reserve_alignment().
>
> Were we to allocate ccs first and put the archives behind it this 
> would simplify some matters, but only minor points. I think the way it 
> works now is okay. I will try to disentangle it a bit in a way you 
> proposed.
>
>>     To ask in a more precise way: I understand that both the mc+rw+ro
>>     archives and the ccs have to live in an area encompassed by the
>>     compressed class pointers encoding scheme. I wonder whether there
>>     are any restrictions beyond that.
>>
>>     Could there be a gap between archives and ccs? 
>     Yes
>
>>     Can the order be reversed? 
>     No.
>
>>     Do the relative positions between archives and ccs have to be the
>>     same between dump time and runtime?
>     No. All the pointers stored inside CDS point to inside of the
>     MC/RW/RO regions, so it doesn't retain any knowledge of where the
>     CCS was at dump time.
>
>
> Clear answers, thank you!
>
> ..Thomas
>
>
>     Thanks
>     - Ioi
>
>
>>
>>     Thanks!
>>
>>     On Thu, Apr 16, 2020 at 8:31 PM Ioi Lam <ioi.lam at oracle.com
>>     <mailto:ioi.lam at oracle.com>> wrote:
>>
>>
>>
>>         On 4/16/20 11:14 AM, Thomas Stüfe wrote:
>>>         Hi Ioi,
>>>
>>>         On Thu, Apr 16, 2020 at 7:49 PM Ioi Lam <ioi.lam at oracle.com
>>>         <mailto:ioi.lam at oracle.com>> wrote:
>>>
>>>             (I suppose you mean "compressed class space" by "ccs" :-)
>>>
>>>
>>>         Yes, I think I stole this from Stefan Karlsson :)
>>>
>>>             <snip>
>>>
>>>             I am not even sure if case (C) can happen at all.
>>>
>>>             I admit that I've been guilty of making the interface
>>>             even more complicated
>>>             with JDK-8231610
>>>             <https://bugs.openjdk.java.net/browse/JDK-8231610> (Relocate
>>>             the CDS archive if it cannot be mapped to the
>>>             requested address). Looks like now is a good time to clean up.
>>>
>>>
>>>         The coding has been complicated to begin with, and then it
>>>         usually only gets worse since no-one has time for a revamp
>>>         :( A clean up would be very helpful.
>>>
>>>         One reason I look at this coding now, beside the aarch64
>>>         problem, was that I try to disentangle CDS from Metaspace,
>>>         especially the alignment policy. Remember, I tried to tackle
>>>         this last summer, but it keeps biting me. For such a small
>>>         problem this is weirdly complicated.
>>>
>>>             One thing that can be cleaned up is the call to
>>>             Metaspace::allocate_metaspace_compressed_klass_ptrs:
>>>
>>>             (a) when CDS is enabled:
>>>
>>>             Metaspace::global_initialize()
>>>               -> MetaspaceShared::initialize_runtime_shared_and_meta_spaces()
>>>                 -> ... MetaspaceShared::map_archives()
>>>                   -> ... reserve the space, eventually calling Metaspace::reserve_space
>>>                   -> call Metaspace::allocate_metaspace_compressed_klass_ptrs()
>>>
>>>             (b) when CDS is disabled
>>>
>>>             Metaspace::global_initialize()
>>>               -> allocate_metaspace_compressed_klass_ptrs
>>>                 -> (if cds is not enabled) Metaspace::reserve_space()
>>>
>>>
>>>             In case (b), we should first reserve the space, and then
>>>             call into
>>>             allocate_metaspace_compressed_klass_ptrs. This will
>>>             simplify the arguments
>>>             of allocate_metaspace_compressed_klass_ptrs, and will
>>>             also limit the variations
>>>             of calls to Metaspace::reserve_space(). I think this
>>>             will make it possible to
>>>             drop the use_requested_addr argument and rely simply on
>>>             (requested_addr != NULL)
>>>
>>>
>>>         So, in all cases we'd pre-reserve the ReservedSpace and hand
>>>         it down to
>>>         Metaspace::allocate_metaspace_compressed_klass_ptrs()?
>>>
>>>         This would melt down
>>>         Metaspace::allocate_metaspace_compressed_klass_ptrs() to
>>>         just "initialize compressed class space from a pre-arranged
>>>         ReservedSpace, and set up base + shift".
>>>
>>>         We could probably rename that thing
>>>         to Metaspace::set_up_compressed_klass_space(ReservedSpace*
>>>         rs, cds_base);
>>>
>>>         We even could move set_narrow_klass_base_and_shift() out of
>>>         Metaspace::set_up_compressed_klass_space, then it becomes a
>>>         series of three simple operations:
>>>         1) obtain a ReservedSpace however you see fit
>>>         2) register it with Metaspace as address space for ccs,
>>>         3) set_narrow_klass_base_and_shift. We would not have to
>>>         hand down cds_base to Metaspace, only for it to be used as
>>>         base address in set_narrow_klass_base_and_shift.
>>>
>>
>>         Yes, that seems the right thing to do. That will hopefully
>>         make the aarch64 initialization code a little simpler as well.
>>
>>>         One question which came to me today was:
>>>
>>>         In AppCDS, DynamicArchiveBuilder::do_it() calls
>>>         Metaspace::reserve_space(). Is that really needed, does a
>>>         DumpRegion have anything to do with ccs? Don't they just
>>>         need some space to dump into? Hope that question is not dumb.
>>>
>>         Do you mean:
>>
>>         DynamicArchiveBuilder::reserve_space_and_init_buffer_to_target_delta()
>>
>>         -> MetaspaceShared::reserve_shared_space
>>             -> Metaspace::reserve_space
>>
>>         That's not necessary. When I wrote the code I thought
>>         Metaspace::reserve_space was a general function for reserving
>>         spaces :-) but as you said, this function is probably
>>         intended only for initializing the CCS.
>>
>>         Thanks
>>         - Ioi
>>
>>>         Thanks, Thomas
>>>
>>>             Thanks
>>>             - Ioi
>>>
>>>
>>>>>             Does that make sense? In other words, if the whole point of
>>>>>             Metaspace::reserve_preferred_space() is "OS knows better, let it try
>>>>>             to find a good address", would it not make sense to just try a low
>>>>>             address as part of the try-addresses-loop?
>>>>             We certainly don't want to have to use a dedicated heapbase register
>>>>             or a shift. Just give us a multiple of 4*G and we're happy.
>>>>
>>>
>>
>


From aph at redhat.com  Tue Apr 21 09:23:33 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 21 Apr 2020 10:23:33 +0100
Subject: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter
 names of reduction operations to make code clear
In-Reply-To: <VI1PR0802MB2558027F96432B28B3C25EFF8ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB2558670821C29DC6202817258ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <f998a1cc-2ff8-d3b0-e3db-6c9ef4ffecd8@redhat.com>
 <VI1PR0802MB2558027F96432B28B3C25EFF8ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <b2b8c00f-0e07-c84e-c566-fcb72bb4f3ff@redhat.com>

On 4/17/20 10:13 AM, Yang Zhang wrote:
> Besides tier1, I also test these operations in Vector API test, which can cover all the reduction operations.  
> 
> In this directory, there are also some test cases about reduction operations,  which is added in [1].
> https://hg.openjdk.java.net/jdk/jdk/file/55c4283a7606/test/hotspot/jtreg/compiler/loopopts/superword
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8240248

Sounds good. Thanks!

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From thomas.stuefe at gmail.com  Tue Apr 21 14:31:08 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 21 Apr 2020 16:31:08 +0200
Subject: [aarch64-port-dev ] Question about CompressedKlassPointers::range
Message-ID: <CAA-vtUyGhP38Sz13Nz_E2+PSGK21Z0GfqGq24Axfz1i7A42_eg@mail.gmail.com>

Hi,

this is a followup question, mainly for aarch64, to
https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html
 .

CompressedKlassPointers has a range field, only used by aarch64 afaics,
introduced with "8193266: AArch64: TestOptionsWithRanges.java SIGSEGV".

I read its bug description and the patch. If I understand the problem,
before CDS the assumption was that CompressedClassSpaceSize is synonymous
with the range of values narrow Klass pointers could have; which seems
logical, but that assumption was broken since CDS and now the encoding
range must span both the ccs and the cds archives.

The range is used inside MacroAssembler::klass_decode_mode() to decide
whether to use the OR mode.

I see this being set in three places:
1) at cds dumptime, to 4G
2) at cds runtime, to CompressedClassSpaceSize, and
3) if cds is disabled it keeps its default value of 4G.

I may miss something here. Would (2) not be too small? Should that size not
include the size of the archives? We map first the archives, lets say they
are 300MB, after that ccs, lets say 1G default, would that not mean any
Klass residing toward the end of the ccs - if it were to fill up, which it
almost never does - would have an offset larger than the initially assumed
range and hence not correctly OR-able with the base anymore?
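
A minimal sketch of the arithmetic in (2), assuming the archives are mapped at the encoding base and the ccs follows them directly. The helper names are hypothetical and are not HotSpot code; the sizes are the ones used above.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t M = uint64_t(1) << 20;
constexpr uint64_t G = uint64_t(1) << 30;

// Hypothetical helper: the largest byte offset from the encoding base that a
// Klass can reach when the archives sit at the base and the ccs follows them.
constexpr uint64_t highest_klass_offset(uint64_t archive_size, uint64_t ccs_size) {
  return archive_size + ccs_size;
}

// Hypothetical helper: does an assumed range cover every possible Klass offset?
constexpr bool range_covers(uint64_t range, uint64_t archive_size, uint64_t ccs_size) {
  return highest_klass_offset(archive_size, ccs_size) <= range;
}
```

With 300MB of archives and a 1G ccs, a range equal to CompressedClassSpaceSize (1G) does not cover the top of the ccs, while the 4G default does.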

And I'm not sure (3) is correct either since the range we could encode in
theory is 32G with shift=3. In practice this is today no problem. Today
CompressedClassSpaceSize is artificially capped at 3G. If that were ever to
change, and someone would set it to >4G, this should cause problems too, no?

If my assumption about (2) is correct, it could be the error is just well
hidden either because MacroAssembler::_klass_decode_mode is already
initialized, using the default value (3). Or because it is difficult to
allocate so many classes to trigger this error.

--

As a more general question: CompressedKlassPointers::range(), as in "the
expected range of narrow Klass pointer values", I guess it makes sense to
keep it as small as possible, right? Instead of, say hard-coding it to 32G?
Since the smaller the expected range of narrow pointers is, the more
probable we could choose the OR mode?

Oh, and on aarch64, how "good" is that OR mode compared with the "movk"
mode on aarch64? Since it seems to be preferred?

Thanks a lot, again,

Thomas

From aph at redhat.com  Tue Apr 21 15:59:22 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 21 Apr 2020 16:59:22 +0100
Subject: [aarch64-port-dev ] Question about
	CompressedKlassPointers::range
In-Reply-To: <CAA-vtUyGhP38Sz13Nz_E2+PSGK21Z0GfqGq24Axfz1i7A42_eg@mail.gmail.com>
References: <CAA-vtUyGhP38Sz13Nz_E2+PSGK21Z0GfqGq24Axfz1i7A42_eg@mail.gmail.com>
Message-ID: <49a0fe0f-709c-9e6a-51e1-5898962430fc@redhat.com>

Hi,

On 4/21/20 3:31 PM, Thomas Stüfe wrote:
> this is a followup question, mainly for aarch64, to
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html
>  .
>
> CompressedKlassPointers has a range field, only used by aarch64 afaics,
> introduced with "8193266: AArch64: TestOptionsWithRanges.java SIGSEGV".
>
> I read its bug description and the patch. If I understand the problem,
> before CDS the assumption was that CompressedClassSpaceSize is synonymous
> with the range of values narrow Klass pointers could have; which seems
> logical, but that assumption was broken since CDS and now the encoding
> range must span both the ccs and the cds archives.
>
> The range is used inside MacroAssembler::klass_decode_mode() to decide
> whether to use the OR mode.
>
> I see this being set in three places:
> 1) at cds dumptime, to 4G
> 2) at cds runtime, to CompressedClassSpaceSize, and
> 3) if cds is disabled it keeps its default value of 4G.
>
> I may miss something here. Would (2) not be too small? Should that size not
> include the size of the archives?

I believe so.

> We map first the archives, lets say they
> are 300MB, after that ccs, lets say 1G default, would that not mean any
> Klass residing toward the end of the ccs - if it were to fill up, which it
> almost never does - would have an offset larger than the initially assumed
> range and hence not correctly OR-able with the base anymore?

How would that happen? If someone maps CDS space miles from CCS,
you mean? OK, but that'd be a pointless thing to do.

> And I'm not sure (3) is correct either since the range we could encode in
> theory is 32G with shift=3. In practice this is today no problem. Today
> CompressedClassSpaceSize is artificially capped at 3G. If that were ever to
> change, and someone would set it to >4G, this should cause problems too, no?

Yes, it would. It'd be a fool thing to do, but that doesn't mean it
won't happen. We really don't need more than 3G, after all.

> If my assumption about (2) is correct, it could be the error is just well
> hidden either because MacroAssembler::_klass_decode_mode is already
> initialized, using the default value (3). Or because it is difficult to
> allocate so many classes to trigger this error.

(2) looks wrong.

> As a more general question: CompressedKlassPointers::range(), as in
> "the expected range of narrow Klass pointer values", I guess it
> makes sense to keep it as small as possible, right? Instead of, say
> hard-coding it to 32G?

Yes, it does.

> Since the smaller the expected range of narrow pointers is, the more
> probable we could choose the OR mode?

At the moment the probability of being able to do that is so high that
if it fails I'd expect it'd be a bug.

> Oh, and on aarch64, how "good" is that OR mode compared with the "movk"
> mode on aarch64? Since it seems to be preferred?

A shift is sometimes slower than a simple XOR, so a shift is never
preferred. Beyond that it's impossible to say for sure because there
are many independent implementations, some of which I have never seen,
but I doubt that there's a huge difference. Any 4G range is probably
OK.

Bear in mind, though, that people designing AArch64 hardware today are
benchmarking OpenJDK and making decisions based on what HotSpot
does. For that reason, changing what we do without a really good
reason isn't the best idea.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From thomas.stuefe at gmail.com  Tue Apr 21 17:24:33 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 21 Apr 2020 19:24:33 +0200
Subject: [aarch64-port-dev ] Question about
	CompressedKlassPointers::range
In-Reply-To: <49a0fe0f-709c-9e6a-51e1-5898962430fc@redhat.com>
References: <CAA-vtUyGhP38Sz13Nz_E2+PSGK21Z0GfqGq24Axfz1i7A42_eg@mail.gmail.com>
 <49a0fe0f-709c-9e6a-51e1-5898962430fc@redhat.com>
Message-ID: <CAA-vtUwiZ=xV0cf5OEhGVtozz_RGdzxvRcdv+wO4aqY8SqMNBQ@mail.gmail.com>

On Tue, Apr 21, 2020 at 5:59 PM Andrew Haley <aph at redhat.com> wrote:

> Hi,
>
> On 4/21/20 3:31 PM, Thomas Stüfe wrote:
> > this is a followup question, mainly for aarch64, to
> >
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html
> >  .
> >
> > CompressedKlassPointers has a range field, only used by aarch64 afaics,
> > introduced with "8193266: AArch64: TestOptionsWithRanges.java SIGSEGV".
> >
> > I read its bug description and the patch. If I understand the problem,
> > before CDS the assumption was that CompressedClassSpaceSize is synonymous
> > with the range of values narrow Klass pointers could have; which seems
> > logical, but that assumption was broken since CDS and now the encoding
> > range must span both the ccs and the cds archives.
> >
> > The range is used inside MacroAssembler::klass_decode_mode() to decide
> > whether to use the OR mode.
> >
> > I see this being set in three places:
> > 1) at cds dumptime, to 4G
> > 2) at cds runtime, to CompressedClassSpaceSize, and
> > 3) if cds is disabled it keeps its default value of 4G.
> >
> > I may miss something here. Would (2) not be too small? Should that size
> not
> > include the size of the archives?
>
> I believe so.
>
> > We map first the archives, lets say they
> > are 300MB, after that ccs, lets say 1G default, would that not mean any
> > Klass residing toward the end of the ccs - if it were to fill up, which
> it
> > almost never does - would have an offset larger than the initially
> assumed
> > range and hence not correctly OR-able with the base anymore?
>
> How would that happen? If someone maps CDS space miles from CCS,
> you mean? OK, but that'd be a pointless thing to do.
>

I thought this could happen by filling up ccs.

At CDS runtime (-Xshare:on) we map the cds archive, followed by the ccs:

Encoding base
|
v
+------+----------------------------+
|  cds |   ccs                      |
+------+----------------------------+
+----------------------------+
                             A

The size of the ccs is CompressedClassSpaceSize. Address A is Encoding
base + CompressedClassSpaceSize, as in case (2), without archive size taken
into account.

ccs fills up at runtime, starting at the bottom, if more non-shared classes
are loaded. E.g. lots of lambdas or reflection glue classes, or just
application classes. When ccs fills up beyond point A, the assumption that
no Klass ever has an offset larger than CompressedKlassPointers::range is
broken and the OR mode may not work anymore.

However I see now that we would only have a problem if the encoding base
had a non-zero bit set right above the end of the offset mask. But if the
encoding base on aarch64 is always 4G aligned, and a narrow Klass pointer
cannot be larger than 4G, the OR would still work. So, this is only a
theoretical problem.
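
A small sketch of why the OR still works under those two conditions. It assumes the offset has already been shifted to a byte offset; the function name is hypothetical and this is not the MacroAssembler code.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t G = uint64_t(1) << 30;

// Hypothetical check: OR-ing the offset into the base equals adding it
// exactly when base and offset have no set bits in common.
constexpr bool or_equals_add(uint64_t base, uint64_t offset) {
  return (base | offset) == (base + offset);
}

// A 4G-aligned base has its low 32 bits clear, so any offset below 4G
// cannot overlap it.
static_assert(or_equals_add(32 * G, 4 * G - 8), "aligned base, offset < 4G");
```

A base that is only 1G-aligned, for example 33G, shares bit 30 with an offset of 1G, so the OR would yield 33G instead of 34G.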


> > And I'm not sure (3) is correct either since the range we could encode in
> > theory is 32G with shift=3. In practice this is today no problem. Today
> > CompressedClassSpaceSize is artificially capped at 3G. If that were ever
> to
> > change, and someone would set it to >4G, this should cause problems too,
> no?
>
> Yes, it would. It'd be a fool thing to do, but that doesn't mean it
> won't happen. We really don't need more than 3G, after all.
>

> > If my assumption about (2) is correct, it could be the error is just well
> > hidden either because MacroAssembler::_klass_decode_mode is already
> > initialized, using the default value (3). Or because it is difficult to
> > allocate so many classes to trigger this error.
>
> (2) looks wrong.
>
> > As a more general question: CompressedKlassPointers::range(), as in
> > "the expected range of narrow Klass pointer values", I guess it
> > makes sense to keep it as small as possible, right? Instead of, say
> > hard-coding it to 32G?
>
> Yes, it does.
>
> > Since the smaller the expected range of narrow pointers is, the more
> > probable we could choose the OR mode?
>
> At the moment the probability of being able to do that is so high that
> if it fails I'd expect it'd be a bug.
>
> > Oh, and on aarch64, how "good" is that OR mode compared with the "movk"
> > mode on aarch64? Since it seems to be preferred?
>
> A shift is sometimes slower than a simple XOR, so a shift is never
> preferred. Beyond that it's impossible to say for sure because there
> are many independent implementations, some of which I have never seen,
> but I doubt that there's a huge difference. Any 4G range is probably
> OK.
>
> Bear in mind, though, that people designing AArch64 hardware today are
> benchmarking OpenJDK and making decisions based on what HotSpot
> does. For that reason, changing what we do without a really good
> reason isn't the best idea.
>

I had no idea. Thank you. I will be very careful.

..Thomas


>
> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>
>

From Yang.Zhang at arm.com  Wed Apr 22 04:23:51 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Wed, 22 Apr 2020 04:23:51 +0000
Subject: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter
 names of reduction operations to make code clear
In-Reply-To: <b2b8c00f-0e07-c84e-c566-fcb72bb4f3ff@redhat.com>
References: <VI1PR0802MB2558670821C29DC6202817258ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <f998a1cc-2ff8-d3b0-e3db-6c9ef4ffecd8@redhat.com>
 <VI1PR0802MB2558027F96432B28B3C25EFF8ED90@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <b2b8c00f-0e07-c84e-c566-fcb72bb4f3ff@redhat.com>
Message-ID: <VI1PR0802MB2558C0940AB4BA224A3AC6C68ED20@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi Andrew

Thanks for your review. I will ask Pengfei to help push it.

Regards
Yang

-----Original Message-----
From: Andrew Haley <aph at redhat.com> 
Sent: Tuesday, April 21, 2020 5:24 PM
To: Yang Zhang <Yang.Zhang at arm.com>; aarch64-port-dev at openjdk.java.net; hotspot-compiler-dev at openjdk.java.net
Cc: nd <nd at arm.com>
Subject: Re: [aarch64-port-dev ] RFR(M): 8242482: AArch64: Change parameter names of reduction operations to make code clear

On 4/17/20 10:13 AM, Yang Zhang wrote:
> Besides tier1, I also tested these operations with the Vector API tests, which cover all the reduction operations.
> 
> In this directory, there are also some test cases for reduction operations, which were added in [1].
> https://hg.openjdk.java.net/jdk/jdk/file/55c4283a7606/test/hotspot/jtreg/compiler/loopopts/superword
> 
> [1] https://bugs.openjdk.java.net/browse/JDK-8240248

Sounds good. Thanks!

--
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com> https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From Yang.Zhang at arm.com  Thu Apr 23 02:39:26 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Thu, 23 Apr 2020 02:39:26 +0000
Subject: [aarch64-port-dev ] RFR(XS): 8242905: AArch64: Client build failed
Message-ID: <VI1PR0802MB2558A97BE952C61A2F7BF9428ED30@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi,

Could you please help to review this patch?

JBS: https://bugs.openjdk.java.net/browse/JDK-8242905
Webrev: http://cr.openjdk.java.net/~yzhang/8242905/webrev.00/

This issue was introduced by [1]. In that commit, pop_CPU_state(restore_vectors)
and leave() were placed under the COMPILER2_OR_JVMCI check in AArch64
restore_live_registers [2].

But restore_live_registers is used in generate_resolve_blob [3], which
might be called from C1. In the x86 restore_live_registers, pop_CPU_state()
and pop(rbp) are always done [4].

To fix this issue, pop_CPU_state(restore_vectors) and leave()
are moved outside of the COMPILER2_OR_JVMCI check in AArch64
restore_live_registers.

Testing on AArch64 platform:
tier1 test with server build
server build with configuring --with-jvm-features=-compiler2
client build and ran HelloWorld

[1] https://bugs.openjdk.java.net/browse/JDK-8241665
[2] https://hg.openjdk.java.net/jdk/jdk/rev/53568400fec3#l1.23
[3] http://hg.openjdk.java.net/jdk/jdk/file/55c4283a7606/src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp#l2850
[4] http://hg.openjdk.java.net/jdk/jdk/file/55c4283a7606/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#l378

From aph at redhat.com  Thu Apr 23 08:44:15 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 23 Apr 2020 09:44:15 +0100
Subject: [aarch64-port-dev ] RFR(XS): 8242905: AArch64: Client build
 failed
In-Reply-To: <VI1PR0802MB2558A97BE952C61A2F7BF9428ED30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB2558A97BE952C61A2F7BF9428ED30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <f5882c3b-6223-f89d-ec7a-3a6eaa3198e1@redhat.com>

On 4/23/20 3:39 AM, Yang Zhang wrote:
> Could you please help to review this patch?
> 
> JBS: https://bugs.openjdk.java.net/browse/JDK-8242905
> Webrev: http://cr.openjdk.java.net/~yzhang/8242905/webrev.00/

Ok, thanks.

Does anyone in the real world use AArch64 client builds? I'm wondering if
we'd be better off without that option.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From erik.osterlund at oracle.com  Thu Apr 23 10:51:46 2020
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Thu, 23 Apr 2020 12:51:46 +0200
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
	Concurrent Class Unloading
In-Reply-To: <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
 <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
Message-ID: <fc53ddcd-1464-a3f1-1843-8d7ac7f17905@oracle.com>

Hi Stuart,

This looks good to me.

Thanks,
/Erik

On 2020-04-20 16:19, Stuart Monteith wrote:
> Hi,
> 	If anyone has bandwidth, would they be able to review this patch? It
> addresses Andrew, Per and Erik's comments:
>          http://cr.openjdk.java.net/~smonteith/8216557/webrev.1/
>
> Thanks,
> 	Stuart
>
>
> On 27/03/2020 09:47, Erik Österlund wrote:
>> Hi Stuart,
>>
>> Thanks for sorting this out on AArch64. It is nice to see that you can
>> implement these
>> barriers on platforms that do not have instruction cache coherency.
>>
>> One small change request:
>> It looks like in C1 you inject the entry barrier right after build_frame
>> is done:
>>
>>      build_frame();
>>      {
>>        // Insert nmethod entry barrier into frame.
>>        BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
>>        bs->nmethod_entry_barrier(_masm);
>>      }
>>
>> Unfortunately, this is in the platform independent part of the LIR
>> assembler. In the x86 version
>> we inject it at the very end of build_frame() instead, which is a
>> platform-specific function.
>> The platform-specific function is in the C1 macro assembler file for
>> that platform.
>>
>> We intentionally put it in the platform-specific path as it is a
>> platform-specific feature.
>> Now on x86, the barrier code will be emitted once in build_frame() and
>> once after returning
>> from build_frame, resulting in two nmethod entry barriers, and only the
>> first one will get
>> patched, causing the second one to mostly take slow paths, which isn't
>> necessarily wrong,
>> but will cause regressions.
>>
>> I would propose you just move those lines into the very end of the
>> AArch64-specific part of
>> build_frame().
>>
>> I don't need to see another webrev for that trivial code motion. This
>> looks good to me.
>> Again, thanks a lot for fixing this! It will allow me to go forward with
>> concurrent stack
>> scanning on AArch64 as well.
>>
>> Thanks,
>> /Erik
>>
>>
>> On 2020-03-26 23:42, Stuart Monteith wrote:
>>> Hello,
>>>     Please review this change to implement nmethod entry barriers on
>>> aarch64, and hence concurrent class unloading with ZGC. Shenandoah will
>>> need to be separately tested and enabled - there are problems with this
>>> on Shenandoah.
>>>
>>> It has been tested with JTreg, runs with SPECjbb, gcbench, and Lucene as
>>> well as Netbeans.
>>>
>>> In terms of interesting features:
>>>     With nmethod entry barriers, immediate oops are removed by:
>>>         LIR_Assembler::jobject2reg and MacroAssembler::movoop
>>>     This is to ensure consistency with the entry barrier, as otherwise,
>>> with an immediate, we'd need an ISB.
>>>
>>>     I've added "-XX:DeoptNMethodBarrierALot". I found this functionality
>>> useful in testing as deoptimisation is very infrequent. I've written it
>>> as an atomic to avoid it happening too frequently. As it is a new
>>> option, I'm not sure whether any more is needed than this review. A new
>>> test has been added
>>> "test/hotspot/jtreg/gc/stress/gcbasher/TestGCBasherDeoptWithZ.java" to
>>> test GC with that option enabled.
>>>
>>>     BarrierSetAssembler::nmethod_entry_barrier
>>>     This method emits the barrier code. In internal review it was suggested
>>> the "dmb( ISHLD )" should be replaced by "membar(LoadLoad)". I've not
>>> done this as the BarrierSetNMethod code checks the exact instruction
>>> sequence, and I prefer to be explicit.
>>>
>>>     Benchmarking method entry shows an increase of around 6ns with the
>>> nmethod entry barrier.
>>>
>>>
>>> The deoptimisation code was contributed by Andrew Haley.
>>>
>>> The bug:
>>>     https://bugs.openjdk.java.net/browse/JDK-8216557
>>>
>>> The webrev:
>>>     http://cr.openjdk.java.net/~smonteith/8216557/webrev.0/
>>>
>>>
>>> BR,
>>>     Stuart


From aleksei.voitylov at bell-sw.com  Thu Apr 23 13:12:16 2020
From: aleksei.voitylov at bell-sw.com (Aleksei Voitylov)
Date: Thu, 23 Apr 2020 16:12:16 +0300
Subject: [aarch64-port-dev ] RFR(XS): 8242905: AArch64: Client build
 failed
In-Reply-To: <f5882c3b-6223-f89d-ec7a-3a6eaa3198e1@redhat.com>
References: <VI1PR0802MB2558A97BE952C61A2F7BF9428ED30@VI1PR0802MB2558.eurprd08.prod.outlook.com>
 <f5882c3b-6223-f89d-ec7a-3a6eaa3198e1@redhat.com>
Message-ID: <7b98219a-e45b-f0e8-9008-0c7a712c06f4@bell-sw.com>

Yes, in the embedded space.

On 23/04/2020 11:44, Andrew Haley wrote:
> On 4/23/20 3:39 AM, Yang Zhang wrote:
>> Could you please help to review this patch?
>>
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8242905
>> Webrev: http://cr.openjdk.java.net/~yzhang/8242905/webrev.00/
> Ok, thanks.
>
> Does anyone in the real world use AArch64 client builds? I'm wondering if
> we'd be better off without that option.
>

From Yang.Zhang at arm.com  Fri Apr 24 06:01:28 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Fri, 24 Apr 2020 06:01:28 +0000
Subject: [aarch64-port-dev ] RFR(S): 8243240: AArch64: Add support for MulVB
Message-ID: <VI1PR0802MB25582396E839DC55AB2586E18ED00@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi,

Could you please help to review this patch?

JBS: https://bugs.openjdk.java.net/browse/JDK-8243240
Webrev: http://cr.openjdk.java.net/~yzhang/8243240/webrev.00/

In this patch, the missing MulVB support for AArch64 is added.

Testing: tier1

Test case:
public static void mulvb(byte[] a, byte[] b, byte[] c) {
    for (int i = 0; i < a.length; i++) {
        c[i] = (byte)(a[i] * b[i]);
    }
}

Assembly generated by C2:
0x0000ffffacafdbac:   ldr q17, [x15, #16]
0x0000ffffacafdbb0:   ldr q16, [x14, #16]
0x0000ffffacafdbb4:   mul v16.16b, v16.16b, v17.16b
0x0000ffffacafdbbc:   str q16, [x11, #16]
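
A lane-wise 8-bit multiply is enough here because the (byte) cast keeps only the low 8 bits of the int product, and those bits depend only on the low 8 bits of each operand. The following is a minimal sketch of that argument, not part of the patch; the class and helper names are hypothetical. It compares the Java expression from the test case with a multiplication that only ever looks at 8 bits per operand, for every pair of byte values.

```java
final class MulVBSketch {
    // Scalar loop from the test case above.
    static void mulvb(byte[] a, byte[] b, byte[] c) {
        for (int i = 0; i < a.length; i++) {
            c[i] = (byte) (a[i] * b[i]);
        }
    }

    // Multiply using only the low 8 bits of each operand, as an 8-bit lane would.
    static byte mulLane8(byte x, byte y) {
        int lx = x & 0xFF;
        int ly = y & 0xFF;
        return (byte) ((lx * ly) & 0xFF);
    }

    // Counts byte pairs for which the two formulations disagree.
    static int countMismatches() {
        int mismatches = 0;
        for (int x = -128; x <= 127; x++) {
            for (int y = -128; y <= 127; y++) {
                if ((byte) (x * y) != mulLane8((byte) x, (byte) y)) {
                    mismatches++;
                }
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        System.out.println("mismatching pairs: " + countMismatches());
    }
}
```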

Performance:
JMH test case is attached in JBS.

Before:
Benchmark               (size)  Mode  Cnt  Score   Error  Units
TestVect.testVectMulVB    1024  avgt    5  0.952 ± 0.005  us/op

After:
Benchmark               (size)  Mode  Cnt  Score   Error  Units
TestVect.testVectMulVB    1024  avgt    5  0.110 ± 0.001  us/op

Regards
Yang

From aph at redhat.com  Fri Apr 24 09:31:59 2020
From: aph at redhat.com (Andrew Haley)
Date: Fri, 24 Apr 2020 10:31:59 +0100
Subject: [aarch64-port-dev ] RFR(S): 8243240: AArch64: Add support for
 MulVB
In-Reply-To: <VI1PR0802MB25582396E839DC55AB2586E18ED00@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB25582396E839DC55AB2586E18ED00@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <893f6983-7e3c-adc0-ecf4-48e57312c456@redhat.com>

On 4/24/20 7:01 AM, Yang Zhang wrote:
> JBS: https://bugs.openjdk.java.net/browse/JDK-8243240
> Webrev: http://cr.openjdk.java.net/~yzhang/8243240/webrev.00/

OK, thanks.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From stuart.monteith at arm.com  Mon Apr 27 16:34:49 2020
From: stuart.monteith at arm.com (Stuart Monteith)
Date: Mon, 27 Apr 2020 17:34:49 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
	Concurrent Class Unloading
In-Reply-To: <fc53ddcd-1464-a3f1-1843-8d7ac7f17905@oracle.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
 <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
 <fc53ddcd-1464-a3f1-1843-8d7ac7f17905@oracle.com>
Message-ID: <4a01252e-e7b2-d2bc-2858-ca1785b8b2a2@arm.com>

Thanks Erik, Per, Andrew,
	I've fixed up the testcase and retested.

Uploaded here:

	http://cr.openjdk.java.net/~smonteith/8216557/webrev.2/

Would someone be able to submit this for me?


Thanks,
	Stuart

On 23/04/2020 11:51, Erik Österlund wrote:
> Hi Stuart,
> 
> This looks good to me.
> 
> Thanks,
> /Erik
> 
> 


From ningsheng.jian at arm.com  Tue Apr 28 05:26:43 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Tue, 28 Apr 2020 13:26:43 +0800
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
	Concurrent Class Unloading
In-Reply-To: <4a01252e-e7b2-d2bc-2858-ca1785b8b2a2@arm.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
 <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
 <fc53ddcd-1464-a3f1-1843-8d7ac7f17905@oracle.com>
 <4a01252e-e7b2-d2bc-2858-ca1785b8b2a2@arm.com>
Message-ID: <3f193fdc-b1fb-9f0a-4635-acdb7de29bca@arm.com>

Hi Stuart,

On 4/28/20 12:34 AM, Stuart Monteith wrote:
> Thanks Erik, Per, Andrew,
> 	I've fixed up the testcase and retested.
> 
> Uploaded here:
> 
> 	http://cr.openjdk.java.net/~smonteith/8216557/webrev.2/
> 
> Would someone be able to submit this for me?
> 

I submitted a build job before pushing your code, but it failed to build 
when configured with the minimal variant. Here's the error message:

./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp: In static member 
function 'static AdapterHandlerEntry* 
SharedRuntime::generate_i2c2i_adapters(MacroAssembler*, int, int, const 
BasicType*, const VMRegPair*, AdapterFingerPrint*)':

./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp:736:5: error: 
invalid use of incomplete type 'class BarrierSetAssembler'

    bs->c2i_entry_barrier(masm);

I think you need to include barrierSetAssembler.hpp in 
sharedRuntime_aarch64.cpp?

Thanks,
Ningsheng

From Yang.Zhang at arm.com  Tue Apr 28 06:57:15 2020
From: Yang.Zhang at arm.com (Yang Zhang)
Date: Tue, 28 Apr 2020 06:57:15 +0000
Subject: [aarch64-port-dev ] RFR(S): 8243155: AArch64: Add support for SqrtVF
Message-ID: <VI1PR0802MB2558D446B5D05488708624148EAC0@VI1PR0802MB2558.eurprd08.prod.outlook.com>

Hi,

Could you please help to review this patch?

JBS: https://bugs.openjdk.java.net/browse/JDK-8243155
Webrev: http://cr.openjdk.java.net/~yzhang/8243155/webrev.00/

In Java, Math.sqrt() supports double data only. To compute a square
root for float, the following conversion must be done.

    float a, b;
    a = (float)Math.sqrt((double)b);

Both AArch64 and x86 support such single-precision sqrt by hardware
instructions. The AArch64 FSQRT instruction matches Java's
(float)Math.sqrt((double)b) exactly. And x86 has supported
vectorization of Math.sqrt() on floats since [1].

In this patch, vectorized sqrt for float (SqrtVF) is supported in
AArch64 backend. Jtreg test cases for SqrtVF and SqrtVD are also
added. Special cases such as min/max, +/-Inf, +0.0/-0.0 and NaN are
covered.
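
As an illustration of the special cases listed above, here is a minimal scalar reference sketch. It is not the jtreg test from the webrev; the class and method names are hypothetical. It applies the float-via-double square root to the boundary inputs and prints the raw bits, so that -0.0 and NaN results can be told apart from +0.0 and ordinary values.

```java
final class SqrtSpecialCases {
    // Scalar reference: single-precision square root computed via double.
    static float sqrtf(float x) {
        return (float) Math.sqrt((double) x);
    }

    static void sqrtvf(float[] in, float[] out) {
        for (int i = 0; i < in.length; i++) {
            out[i] = sqrtf(in[i]);
        }
    }

    public static void main(String[] args) {
        float[] in = {
            Float.MIN_VALUE, Float.MAX_VALUE,
            Float.POSITIVE_INFINITY, Float.NEGATIVE_INFINITY,
            0.0f, -0.0f, Float.NaN, -1.0f
        };
        float[] out = new float[in.length];
        sqrtvf(in, out);
        for (int i = 0; i < in.length; i++) {
            System.out.printf("sqrt(%s) = %s (bits 0x%08x)%n",
                              in[i], out[i], Float.floatToRawIntBits(out[i]));
        }
    }
}
```

A vectorized implementation would be expected to agree with this scalar reference bit for bit, except possibly in the NaN payload.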

Testing:
Full jtreg
Newly added sqrt jtreg tests
Panama/Vector API tests which cover vector sqrt

Test case for sqrtvf:

public static void sqrtvf(float[] a, float[] b, float[] c) {
    float tmp;
    for (int i = 0; i < a.length; i++) {
        tmp = (float)(a[i] + b[i]);
        c[i] = (float)Math.sqrt((double)tmp);
    }
}

With this patch, the following code snippet is generated.

  0x0000ffffacaf872c:   ldr	q17, [x18, #16]
  0x0000ffffacaf8730:   ldr	q16, [x16, #16]
  0x0000ffffacaf8734:   fadd	v16.4s, v16.4s, v17.4s
  0x0000ffffacaf8738:   fsqrt	v16.4s, v16.4s
  0x0000ffffacaf8740:   str	q16, [x14, #16]

Performance:
JMH test is attached in JBS.

Before:
Benchmark                (size)  Mode  Cnt  Score   Error  Units
TestVect.testVectSqrtVF    1024  avgt    5  4.372 ± 0.016  us/op

After:
Benchmark                (size)  Mode  Cnt  Score   Error  Units
TestVect.testVectSqrtVF    1024  avgt    5  1.115 ± 0.013  us/op

[1] https://bugs.openjdk.java.net/browse/JDK-8190800

Regards
Yang

From aph at redhat.com  Tue Apr 28 09:46:43 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 28 Apr 2020 10:46:43 +0100
Subject: [aarch64-port-dev ] RFR(S): 8243155: AArch64: Add support for
 SqrtVF
In-Reply-To: <VI1PR0802MB2558D446B5D05488708624148EAC0@VI1PR0802MB2558.eurprd08.prod.outlook.com>
References: <VI1PR0802MB2558D446B5D05488708624148EAC0@VI1PR0802MB2558.eurprd08.prod.outlook.com>
Message-ID: <33e0d71a-0b82-9112-fe81-a8e9a34d6d57@redhat.com>

On 4/28/20 7:57 AM, Yang Zhang wrote:
> Could you please help to review this patch?
>
> JBS: https://bugs.openjdk.java.net/browse/JDK-8243155
> Webrev: http://cr.openjdk.java.net/~yzhang/8243155/webrev.00/

This was a bit of a head scratcher. To begin with I thought that this
must be wrong, because Math.sqrt() is supposed to be correctly
rounded, and (float)Math.sqrt(float) is double rounded, leading to an
inaccurate result.

Looking round the web, Figueroa [1] proved double rounding to be
innocuous for the square root if it is performed with a precision
larger than twice the original precision, plus two. [2]

But it's not hard to write a program to do an exhaustive search from
x = FLT_MIN; x <= FLT_MAX, like so:

float roundedSqrt(float x) {
  return (float)ieee754_sqrt((double)x);
}

int main() {
  for (float x = FLT_MIN; x <= FLT_MAX; x = nextFloat(x)) {
    if (ieee754_sqrtf(x) != roundedSqrt(x)) {
      fprintf(stdout, "%12.6f\n", x);
    }
  }
}

... and it returns no differences.
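
For readers who want to repeat a similar experiment in Java, here is a minimal sketch under stated assumptions; it is not the program used above, and the class and method names are hypothetical. Java has no single-precision sqrt to compare against, so this version checks correct rounding directly: the result r is the nearest float to the exact root if x lies strictly between the squares of the two midpoints around r. Those midpoints have at most 26 significant bits, so their squares are exact in double.

```java
final class SqrtDoubleRoundingCheck {
    // Double-rounded single-precision square root.
    static float roundedSqrt(float x) {
        return (float) Math.sqrt((double) x);
    }

    // True if r is the float nearest to the exact square root of x,
    // for finite x > 0. Both midpoint squares are exact in double.
    static boolean isCorrectlyRounded(float x, float r) {
        if (!(r > 0.0f) || Float.isInfinite(r)) {
            return false;
        }
        double lo = ((double) Math.nextDown(r) + (double) r) / 2.0;
        double hi = ((double) r + (double) Math.nextUp(r)) / 2.0;
        return lo * lo < (double) x && (double) x < hi * hi;
    }

    // Visits every stride-th float in [from, to] (both positive and finite)
    // and counts results that are not correctly rounded.
    static long countMismatches(float from, float to, int stride) {
        long mismatches = 0;
        int end = Float.floatToIntBits(to);
        for (long bits = Float.floatToIntBits(from); bits <= end; bits += stride) {
            float x = Float.intBitsToFloat((int) bits);
            if (!isCorrectlyRounded(x, roundedSqrt(x))) {
                mismatches++;
            }
        }
        return mismatches;
    }

    public static void main(String[] args) {
        // Pass 1 as the first argument to visit every float; the default samples.
        int stride = args.length > 0 ? Integer.parseInt(args[0]) : 1021;
        long bad = countMismatches(Float.MIN_NORMAL, Float.MAX_VALUE, stride);
        System.out.println("stride " + stride + ": " + bad + " mismatches");
    }
}
```

Float.MIN_NORMAL is used as the lower bound because it corresponds to C's FLT_MIN; Java's Float.MIN_VALUE is the smallest subnormal instead.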

The patch is OK, thanks.

[1] Samuel A. Figueroa. When is Double Rounding Innocuous? SIGNUM
Newsl., 30(3):21-26, July 1995.

[2] Pierre Roux. Innocuous Double Rounding of Basic Arithmetic
Operations. Journal of Formalized Reasoning, ASDD-AlmaDL, 2014, 7 (1),
pp.131-142. 10.6092/issn.1972-5787/4359. hal-01091186

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From xxinliu at amazon.com  Tue Apr 28 09:57:48 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Tue, 28 Apr 2020 09:57:48 +0000
Subject: [aarch64-port-dev ] FW: [Mach5]
	mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <C769AA2F-55A8-4986-9EEE-0AD93FB93E00@amazon.com>
References: <C769AA2F-55A8-4986-9EEE-0AD93FB93E00@amazon.com>
Message-ID: <E6D13CFA-A9E6-4262-8D33-2FB4412413B6@amazon.com>

Hello,

I recently received some build failures from the submit repo.
May I know more details? E.g. linux-x64-linux-x64-build-5, linux-x64-debug-linux-x64-build-6
I have successfully built it on the latest Linux on Ubuntu 18.04 with gcc 7.5.

My nightly buildbot started failing recently on aarch64.
One issue is that the [cds] error message prevents configure from determining the boot JDK.
https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77

./bin/java --version
[0.006s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096
openjdk 14.0.1 2020-04-14
OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode)

Are you aware of this issue? I can't see [cds] line on my linux/x86_64 host.
Maybe it's aarch64-only.  CC aarch-port-dev.

Thanks,
--lx

From: <do-not-reply at oracle.com> on behalf of "do-not-reply at oracle.com" <do-not-reply at oracle.com>
Reply-To: "mach5_admin_ww_grp at oracle.com" <mach5_admin_ww_grp at oracle.com>
Date: Monday, April 27, 2020 at 3:46 PM
To: "Hohensee, Paul" <hohensee at amazon.com>
Subject: [EXTERNAL] [Mach5] mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
Job: mach5-one-phh-JDK-8151779-20200427-2151-10554367
BuildId: 2020-04-27-2150109.hohensee.source
No failed tests
Tasks Summary

  *   NOTHING_TO_RUN: 0
  *   UNABLE_TO_RUN: 24
  *   KILLED: 0
  *   NA: 0
  *   HARNESS_ERROR: 0
  *   FAILED: 0
  *   EXECUTED_WITH_FAILURE: 9
  *   PASSED: 51

Build
2 Unable to run

     *   linux-aarch64-install-linux-aarch64-build-signing-20 Dependency task failed: mach5...-10554367-linux-aarch64-linux-aarch64-build-1
     *   linux-x64-install-linux-x64-build-signing-21 Dependency task failed: mach5...427-2151-10554367-linux-x64-linux-x64-build-5

9 Executed with failure

     *   linux-aarch64-linux-aarch64-build-1 error while building, return value: 2
     *   linux-aarch64-debug-linux-aarch64-build-2 error while building, return value: 2
     *   linux-aarch64-open-linux-aarch64-build-3 error while building, return value: 2
     *   linux-aarch64-open-debug-linux-aarch64-build-4 error while building, return value: 2
     *   linux-x64-linux-x64-build-5 error while building, return value: 2
     *   linux-x64-debug-linux-x64-build-6 error while building, return value: 2
     *   linux-x64-debug-nopch-linux-x64-build-9 error while building, return value: 2
     *   linux-x64-open-linux-x64-build-7 error while building, return value: 2
     *   linux-x64-open-debug-linux-x64-build-8 error while building, return value: 2

Test
22 Unable to run

     *   tier1-product-open_test_hotspot_jtreg_tier1_common-linux-x64-24 Dependency task failed: mach5...427-2151-10554367-linux-x64-linux-x64-build-5
     *   tier1-debug-open_test_hotspot_jtreg_tier1_common-linux-x64-debug-30 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6
     *   tier1-debug-open_test_hotspot_jtreg_tier1_compiler_1-linux-x64-debug-33 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6
     *   tier1-debug-open_test_hotspot_jtreg_tier1_compiler_2-linux-x64-debug-36 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6
     *   tier1-debug-open_test_hotspot_jtreg_tier1_compiler_3-linux-x64-debug-39 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6
     *   tier1-debug-open_test_hotspot_jtreg_tier1_compiler_graal-linux-x64-debug-45 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6
     *   tier1-debug-open_test_hotspot_jtreg_tier1_compiler_not_xcomp-linux-x64-debug-42 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6
     *   tier1-debug-open_test_hotspot_jtreg_tier1_gc_1-linux-x64-debug-48 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6
     *   tier1-debug-open_test_hotspot_jtreg_tier1_gc_2-linux-x64-debug-51 Dependency task failed: mach5...51-10554367-linux-x64-debug-linux-x64-build-6
     *   tier1-product-open_test_hotspot_jtreg_tier1_gc_gcbasher-linux-x64-27 Dependency task failed: mach5...427-2151-10554367-linux-x64-linux-x64-build-5
     *   See all 22...

From nick.gasson at arm.com  Tue Apr 28 10:11:04 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Tue, 28 Apr 2020 18:11:04 +0800
Subject: [aarch64-port-dev ] FW: [Mach5]
 mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <E6D13CFA-A9E6-4262-8D33-2FB4412413B6@amazon.com>
References: <C769AA2F-55A8-4986-9EEE-0AD93FB93E00@amazon.com>
 <E6D13CFA-A9E6-4262-8D33-2FB4412413B6@amazon.com>
Message-ID: <858sifgbt3.fsf@arm.com>

>
> My nightly buildbot started failing recently on aarch64.
> One issue is the error message of [cds] prevents configure from bootjdk determining.
> https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77
>
> ./bin/java --version
> [0.006s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096
> openjdk 14.0.1 2020-04-14
> OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
> OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode)
>
> Are you aware of this issue? I can't see [cds] line on my linux/x86_64 host.
> Maybe it's aarch64-only.  CC aarch-port-dev.
>

Was your boot JDK built on a machine configured with a different page
size to your current machine? Looks like the CDS archive was dumped on a
machine with 64k pages but you're running with 4k pages. There's a JBS
issue for this:

https://bugs.openjdk.java.net/browse/JDK-8236847
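
To make the diagnosis concrete, here is a minimal sketch that reads the two granularities out of the log line quoted above and picks the version line out of the mixed output. The class and helper names are hypothetical, and the filtering shown is only an illustration; it is not what configure or boot-jdk.m4 does.

```java
import java.util.Arrays;
import java.util.List;
import java.util.regex.Matcher;
import java.util.regex.Pattern;

final class CdsGranularityCheck {
    private static final Pattern MISMATCH = Pattern.compile(
        "os::vm_allocation_granularity\\(\\) expected: (\\d+) actual: (\\d+)");

    // Returns {expected, actual} in bytes, or null if the line is not the mismatch message.
    static long[] parse(String line) {
        Matcher m = MISMATCH.matcher(line);
        if (!m.find()) {
            return null;
        }
        return new long[] {Long.parseLong(m.group(1)), Long.parseLong(m.group(2))};
    }

    // Hypothetical helper: skips log lines such as "[0.006s][error][cds] ..."
    // and returns the first remaining line, or null if there is none.
    static String firstNonLogLine(List<String> output) {
        for (String line : output) {
            if (!line.startsWith("[")) {
                return line;
            }
        }
        return null;
    }

    public static void main(String[] args) {
        List<String> output = Arrays.asList(
            "[0.006s][error][cds] Unable to map CDS archive -- "
                + "os::vm_allocation_granularity() expected: 65536 actual: 4096",
            "openjdk 14.0.1 2020-04-14");
        long[] g = parse(output.get(0));
        System.out.printf("archive dumped with %dK granularity, running with %dK%n",
                          g[0] / 1024, g[1] / 1024);
        System.out.println("version line: " + firstNonLogLine(output));
    }
}
```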


Thanks,
Nick

From aph at redhat.com  Tue Apr 28 10:47:02 2020
From: aph at redhat.com (Andrew Haley)
Date: Tue, 28 Apr 2020 11:47:02 +0100
Subject: [aarch64-port-dev ] FW: [Mach5]
 mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <858sifgbt3.fsf@arm.com>
References: <C769AA2F-55A8-4986-9EEE-0AD93FB93E00@amazon.com>
 <E6D13CFA-A9E6-4262-8D33-2FB4412413B6@amazon.com> <858sifgbt3.fsf@arm.com>
Message-ID: <c13feff5-31e5-10fa-4915-df74c170a2e8@redhat.com>

On 4/28/20 11:11 AM, Nick Gasson wrote:
>>
>> My nightly buildbot started failing recently on aarch64.
>> One issue is the error message of [cds] prevents configure from bootjdk determining.
>> https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77
>>
>> ./bin/java --version
>> [0.006s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096
>> openjdk 14.0.1 2020-04-14
>> OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
>> OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode)
>>
>> Are you aware of this issue? I can't see [cds] line on my linux/x86_64 host.
>> Maybe it's aarch64-only.  CC aarch-port-dev.
> 
> Was your boot JDK built on a machine configured with a different page
> size to your current machine? Looks like the CDS archive was dumped on a
> machine with 64k pages but you're running with 4k pages. There's a JBS
> issue for this:
> 
> https://bugs.openjdk.java.net/browse/JDK-8236847

The thread seems to have died here:

http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-February/038207.html

We really need to get this fixed. If anyone reading this has machines with
both 4k and 64k pages, please do the experiment and we'll make a suitable
patch. Everything here has 64k pages.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From stuart.monteith at arm.com  Tue Apr 28 11:28:52 2020
From: stuart.monteith at arm.com (Stuart Monteith)
Date: Tue, 28 Apr 2020 12:28:52 +0100
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
	Concurrent Class Unloading
In-Reply-To: <3f193fdc-b1fb-9f0a-4635-acdb7de29bca@arm.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
 <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
 <fc53ddcd-1464-a3f1-1843-8d7ac7f17905@oracle.com>
 <4a01252e-e7b2-d2bc-2858-ca1785b8b2a2@arm.com>
 <3f193fdc-b1fb-9f0a-4635-acdb7de29bca@arm.com>
Message-ID: <bfd121cd-f0eb-25ea-1b56-d42b3467bb6a@arm.com>

On 28/04/2020 06:26, Ningsheng Jian wrote:
> Hi Stuart,
> 
> On 4/28/20 12:34 AM, Stuart Monteith wrote:
>> Thanks Erik, Per, Andrew,
>>     I've fixed up the testcase and retested.
>>
>> Uploaded here:
>>
>>     http://cr.openjdk.java.net/~smonteith/8216557/webrev.2/
>>
>> Would someone be able to submit this for me?
>>
> 
> I submitted a build job before pushing your code, but it failed to build with minimal variant configure. Here's error
> message:
> 
> ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp: In static member function 'static AdapterHandlerEntry*
> SharedRuntime::generate_i2c2i_adapters(MacroAssembler*, int, int, const BasicType*, const VMRegPair*,
> AdapterFingerPrint*)':
> 
> ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp:736:5: error: invalid use of incomplete type 'class
> BarrierSetAssembler'
> 
>    bs->c2i_entry_barrier(masm);
> 
> I think you need to include barrierSetAssembler.hpp in sharedRuntime_aarch64.cpp?
> 
> Thanks,
> Ningsheng

Thanks for that Ningsheng - I've made some changes, and built with minimal.

The revised patch:

 http://cr.openjdk.java.net/~smonteith/8216557/webrev.3/

There were contributions from aph at redhat.com

Thanks,
	Stuart

From thomas.stuefe at gmail.com  Tue Apr 28 14:54:57 2020
From: thomas.stuefe at gmail.com (Thomas Stüfe)
Date: Tue, 28 Apr 2020 16:54:57 +0200
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace storage
	reservation
Message-ID: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>

Hi all,

Could I have reviews for the following proposal of reworking cds/class
space reservation?

Bug: https://bugs.openjdk.java.net/browse/JDK-8243392

Webrev:
http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.00/webrev/

(Many thanks to Ioi Lam for so patiently explaining CDS internals to me,
and to Andrew Haley and Nick Gasson for help with aarch64!)

Reservation of the compressed class space is needlessly complicated and has
some minor issues. It can be simplified and made clearer.

The complexity stems from the fact that this area lives at the intersection
of two to three subsystems, depending on how one counts: Metaspace, CDS,
and the platform, which may or may not have its own view of how to reserve
class space. And this code has been growing organically over time.

One small example:

ReservedSpace Metaspace::reserve_preferred_space(size_t size, size_t
alignment,
                                                 bool large_pages, char
*requested_addr,
                                                 bool use_requested_addr)

which I spent hours decoding, resulting in a very confused mail to
hs-runtime and aarch64-port-dev [2].

This patch attempts to simplify cds and metaspace setup a bit; to comment
implicit knowledge which is not immediately clear; to cleanly abstract
platform concerns like optimized class space placement; and to disentangle
cds from metaspace to solve issues which may bite us later with Elastic
Metaspace [4].

---

The main change is the reworked reservation mechanism. This is based on
Ioi's proposal [5].

When reserving class space, three things must happen:

1) reservation of the space, obviously. If cds is active, that space must be
in the vicinity of the cds archives to be covered by compressed class pointer
encoding.
2) setting up the internal Metaspace structures atop that space
3) setting up compressed class pointer encoding.

In its current form, Metaspace may or may not do some or all of that in one
function (Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace
metaspace_rs, char* requested_addr, address cds_base);) - if cds is active,
it will reserve the space for Metaspace and hand it in, otherwise it will
create it itself.

When discussing this in [2], Ioi proposed to move the reservation of the
class space completely out of Metaspace and make it a responsibility of the
caller always. This would reduce some complexity, and this patch follows
the proposal.

I removed Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace
metaspace_rs, char* requested_addr, address cds_base); and all its sub
functions.

(1) now has to always be done outside - a ReservedSpace for class space has
to be provided by the caller. However, Metaspace now offers a utility
function for reserving space at a "nice" location, and explicitly doing
nothing else:

ReservedSpace
Metaspace::reserve_address_space_for_compressed_classes(size_t size);

this function can be redefined on a platform level for platform optimized
reservation, see below for details.

(2) is taken care of by a new function,
Metaspace::initialize_class_space(ReservedSpace rs)

(3) is taken care of by a new function, CompressedKlassPointers::initialize(),
see below for details.


So, class space now is set up in three explicit steps:

- First, reserve a suitable space by however means you want. For
convenience you may use
Metaspace::reserve_address_space_for_compressed_classes(), or you may roll
your own reservation.
- Next, tell Metaspace to use that range as backing storage for class
space: Metaspace::initialize_class_space(ReservedSpace rs)
- Finally, set up encoding. Encoding is independent from the concept of a
ReservedSpace, it just gets an address range, see below for details.

Separating these steps and moving them out of the responsibility of
Metaspace makes this whole thing more flexible; it also removes unnecessary
knowledge (e.g. Metaspace does not need to know anything about either ccp
encoding or cds).
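
As a rough illustration of the three steps, here is a minimal sketch. All
types and function names in it (Range, ClassSpace, KlassEncoding and the
three free functions) are hypothetical stand-ins invented for this example;
they are not the HotSpot classes or the code in the webrev, and the
"reservation" just returns a made-up address instead of asking the OS.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins; none of these are the real HotSpot types.
struct Range {
  std::uint64_t base;
  std::uint64_t size;
  std::uint64_t end() const { return base + size; }
};

struct ClassSpace {            // stands in for Metaspace's class space
  Range backing{0, 0};
  bool initialized = false;
};

struct KlassEncoding {         // stands in for CompressedKlassPointers
  std::uint64_t base = 0;
  std::uint64_t range = 0;
};

// Step 1: reserve address space and do nothing else.
// Assumption: we pretend the OS handed out this made-up address.
inline Range reserve_for_compressed_classes(std::uint64_t size) {
  return Range{0x800000000ULL, size};
}

// Step 2: hand the range to the class space; no encoding knowledge here.
inline void initialize_class_space(ClassSpace& cs, Range rs) {
  cs.backing = rs;
  cs.initialized = true;
}

// Step 3: encoding only sees a plain address range, not a reservation.
inline void initialize_encoding(KlassEncoding& enc,
                                std::uint64_t addr, std::uint64_t len) {
  enc.base = addr;
  enc.range = len;
}
```

The point of the shape is that each step takes only what it needs: the
caller owns the reservation, the class space never sees the encoding, and
the encoding never sees a reservation object.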

---

How it comes together:

If CDS is off, we just reserve a space using
Metaspace::reserve_address_space_for_compressed_classes(), initialize it
with Metaspace::initialize_class_space(ReservedSpace rs), then set up
compressed class pointer encoding covering the range of this class space.

If CDS is on (dump time), we reserve a large 4G space, either at
SharedBaseAddress or using
Metaspace::reserve_address_space_for_compressed_classes(); we then split
that into 3G archive space and 1G class space; we set up the latter with
Metaspace as class space; then we set up compressed class pointer encoding
covering both archive space and class space.

If CDS is on (run time), we reserve a large space, split it into archive
space (large enough to hold both archives) and class space, then basically
proceed as above.

Note that this is almost exactly how things worked before (modulo some
minor fixes, e.g. alignment issues), only the code is reformed and made
more explicit.
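
The dump-time split described above can be sketched as plain range
arithmetic. This is only an illustration: AddressRange and split_range are
hypothetical names, and the base address used in any example call is made
up.

```cpp
#include <cassert>
#include <cstdint>
#include <utility>

// Hypothetical value type for a contiguous piece of address space.
struct AddressRange {
  std::uint64_t base;
  std::uint64_t size;
};

// Split `total` into a leading part of `first_size` bytes and the remainder.
// The two parts are adjacent and together cover exactly `total`.
inline std::pair<AddressRange, AddressRange>
split_range(AddressRange total, std::uint64_t first_size) {
  assert(first_size <= total.size);
  AddressRange first{total.base, first_size};
  AddressRange second{total.base + first_size, total.size - first_size};
  return {first, second};
}
```

For a 4G reservation, split_range(total, 3G) yields the archive part first
and a 1G class space part directly behind it, so a single encoding range
can cover both.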

---

I moved compressed class pointer setup over to CompressedKlassPointers and
changed the interface:

-void Metaspace::set_narrow_klass_base_and_shift(ReservedSpace
metaspace_rs, address cds_base)
+void CompressedKlassPointers::initialize(address addr, size_t len);

Instead of feeding it a single ReservedSpace, which is supposed to
represent class space, and an optional alternate base if cds is on, now we
give it just a numeric address range. That range marks the limits to where
Klass structures are to be expected, and is the implicit promise that
outside that range no Klass structures will exist, so encoding has to cover
only this range.

This range may contain just the class space; or class space+cds; or
whatever allocation scheme we come up with in the future. Encoding does not
really care how the memory is organized as long as the input range covers
all possible Klass locations. That way we remove knowledge about class
space/cds from compressed class pointer encoding.

Moving it away from metaspace.cpp into the CompressedKlassPointers class
also mirrors CompressedOops::initialize().
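
To make the "encoding only needs an address range" idea concrete, here is a
small model of base+shift encoding. It is a sketch under assumptions:
NarrowKlassEncoding and its members are hypothetical and do not reproduce
the actual CompressedKlassPointers implementation.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical model of base+shift narrow-pointer encoding over a range.
struct NarrowKlassEncoding {
  std::uint64_t base;   // lowest address that can be encoded
  std::uint64_t range;  // number of bytes covered, starting at base
  unsigned shift;       // right shift applied to the offset from base

  // True if addr lies inside [base, base + range).
  bool covers(std::uint64_t addr) const {
    return addr >= base && addr - base < range;
  }

  // Narrow a full address to 32 bits; only valid inside the range.
  std::uint32_t encode(std::uint64_t addr) const {
    assert(covers(addr));
    return static_cast<std::uint32_t>((addr - base) >> shift);
  }

  // Widen a narrow value back to a full address.
  std::uint64_t decode(std::uint32_t narrow) const {
    return base + (static_cast<std::uint64_t>(narrow) << shift);
  }
};
```

With shift 0 and a 4G range, a round trip through encode() and decode() of
the very last address in the range is the kind of boundary check that is
worth exercising in a model like this.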

---

I renamed _narrow_klass_range to just _range, because strictly speaking
this is the range un-narrow Klass pointers can have.

As for the implementation of CompressedKlassPointers::initialize(address
addr, size_t len), I mimicked very closely what happened before, so there
should be almost no differences. Since "almost no differences" sounds scary
:) here are the differences:

- When CDS is active (dump or run time) we now always, unconditionally, set
the encoding range to 4G. This fixes a theoretical bug discussed on
aarch64-port-dev [1].

- When CDS is not active, we set the encoding range to the minimum required
length. Before, it was left at its default value of 4G.

Both differences only affect aarch64, since it is currently the only
platform using the range field in CompressedKlassPointers.

I wanted to add an assert somewhere to test encoding of the very last
address of the CompressedKlassPointers range, again to prevent errors like
[3]. But I did not come up with a good place for this assert which would
cover also the encoding done by C1/C2.

For the same reason I thought about introducing a mode where Klass
structures would be allocated in reverse order, starting at the end of the
ccs, but again left it out as too big a change.

---

OS abstraction: platforms may have restrictions on what constitutes a valid
compressed class pointer encoding base. Or if not, they may have at least
preferences. There was logic like this in metaspace.cpp, which I removed
and cleanly factored out into platform dependent files, giving each
platform the option to add special logic.

These are two new methods:

- bool CompressedKlassPointers::is_valid_base(address p)

to let the platform tell you whether it considers p to be a valid encoding
base. The only platform having these restrictions currently is aarch64.

- ReservedSpace
Metaspace::reserve_address_space_for_compressed_classes(size_t size);

this hands over the process of allocating a range suitable for compressed
class pointer encoding to the platform. Most platforms will allocate just
anywhere, but some platforms may have a better strategy (e.g. trying low
memory first, trying only correctly aligned addresses and so on).

Beforehand, this coding existed in a similar form in metaspace.cpp for
aarch64 and AIX. For now, I left the AIX part out: it seems only half
done, and I want to check further whether we even need it and, if so, why
not on Linux ppc as well. Also, C1 does not seem to support anything other
than base+offset with shift, but I may be mistaken.

These two methods should give the platform enough control to implement
their own scheme for optimized class space placement without bothering any
shared code about it.
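
The division of labour between shared code and a platform predicate could
look roughly like the sketch below. Note the assumptions: the 4G-alignment
rule is purely illustrative and is not claimed to be what aarch64 requires,
and is_valid_base_4g_aligned, is_valid_base_any and pick_base are
hypothetical names that do not exist in HotSpot.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Illustrative platform rule only: accept bases aligned to 4 GiB.
inline bool is_valid_base_4g_aligned(std::uint64_t p) {
  const std::uint64_t four_g = 1ULL << 32;
  return (p & (four_g - 1)) == 0;
}

// A platform without restrictions accepts any base.
inline bool is_valid_base_any(std::uint64_t) { return true; }

// Shared code: return the first candidate the platform accepts,
// or 0 when no candidate qualifies.
template <typename Pred>
std::uint64_t pick_base(const std::uint64_t* candidates, std::size_t n,
                        Pred is_valid) {
  for (std::size_t i = 0; i < n; ++i) {
    if (is_valid(candidates[i])) return candidates[i];
  }
  return 0;
}
```

The shared loop never needs to know why a base is rejected; it only asks
the predicate.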

A note about the form: I introduced two new platform dependent files,
"metaspace_<cpu>.cpp" and "compressedOops_<cpu>.cpp". I am not happy about
this but this seems to be what we generally do in hotspot, right?

---

Metaspace reserve alignment vs cds alignment

CDS was using Metaspace reserve alignment for CDS internal purposes. I
guess this was just a copy paste issue. It never caused problems since
Metaspace reserve alignment == page size, but that is not true anymore in
the upcoming Elastic Metaspace where reserve alignment will be larger. This
causes a number of issues.

I separated those two cleanly. CDS now uses os::vm_allocation_granularity.
Metaspace::reserve_alignment is only used in those two places where it is
needed, when CDS creates the address space for class space on behalf of the
Metaspace.
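
A small arithmetic sketch of why mixing up the two alignments only bites
once they differ. The 4K and 16K values are example numbers chosen for
illustration, and align_up / is_aligned are written out here rather than
taken from HotSpot.

```cpp
#include <cassert>
#include <cstdint>

// True if value is a multiple of alignment (alignment must be a power of two).
inline bool is_aligned(std::uint64_t value, std::uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value & (alignment - 1)) == 0;
}

// Round value up to the next multiple of alignment (a power of two).
inline std::uint64_t align_up(std::uint64_t value, std::uint64_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (value + alignment - 1) & ~(alignment - 1);
}
```

A 20K size is fine for a 4K granularity but not for a 16K reserve
alignment, where it has to be rounded up to 32K. Code that silently assumed
both values were the page size would only notice once they diverge.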

---

Windows special handling in CDS

To simplify coding I removed the windows specific handling which left out
reservation of the archive. This was needed because windows cannot mmap
files into reserved regions. But fallback code exists in filemap.cpp for
this case which just reads in the region instead of mapping it.

Should that turn out to be a performance problem, I will reinstate the
feature. But a simpler way would be to reserve the archive space and then
release it just before mmapping the archive file. That would not only be
simpler but would also give us the best guarantee that the address space
is actually available. But I'd be happy to leave that part out completely
if we do not see any performance problems on Windows x64.

---

NMT cannot deal with spaces which are split. This problem manifests in that
bookkeeping for class space is done under "Shared Classes", not "Classes"
as it should. This problem exists today too at dump time and randomly at
run time. But since I simplified the reservation, this problem now shows up
always, whether or not we map at the SharedBaseAddress.
While I could work around this problem, I'd prefer this problem to be
solved at the core, and NMT to have an option to recognize reservation
splits. So I'd rather not put a workaround for this into the patch but
leave it for fixing as a separate issue. I opened this issue to track it
[6].

---

Jtreg tests:

I expanded the CompressedOops/CompressedClassPointers.java tests and also
extended them to Windows. The tests now optionally omit the strict class
space placement checks, since those checks heavily depend on ASLR and were
the reason the tests were excluded on Windows. However, I think even
without checking class space placement they make sense, just to see that
the VM comes up and lives with the many different settings we can run in.

---

Tests:

- I ran the patch through Oracle's submit repo
- I ran tests manually for aarch64, zero, linux 32bit and windows x64
- The whole battery of nightly tests at SAP, including ppc, ppcle and
aarch64, unfortunately excluding windows because of unrelated errors.
Windows x64 tests will be redone tonight.


Thank you,

Thomas

[1]
https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008804.html
[2]
https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html
[3] https://bugs.openjdk.java.net/browse/JDK-8193266
[4] https://bugs.openjdk.java.net/browse/JDK-8221173
[5]
https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008765.html
[6] https://bugs.openjdk.java.net/browse/JDK-8243535

From thomas.stuefe at gmail.com  Tue Apr 28 16:29:39 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Tue, 28 Apr 2020 18:29:39 +0200
Subject: [aarch64-port-dev ] FW: [Mach5]
 mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <c13feff5-31e5-10fa-4915-df74c170a2e8@redhat.com>
References: <C769AA2F-55A8-4986-9EEE-0AD93FB93E00@amazon.com>
 <E6D13CFA-A9E6-4262-8D33-2FB4412413B6@amazon.com> <858sifgbt3.fsf@arm.com>
 <c13feff5-31e5-10fa-4915-df74c170a2e8@redhat.com>
Message-ID: <CAA-vtUw8eB5+VsvZ+W+WLcFUQ4FpZETFLF++5PCDbZkb=NB4yQ@mail.gmail.com>

On a related note, I wonder whether it would be possible to fake a larger
page size than the system uses. Well, I don't wonder since we do this on
AIX, for complicated reasons.

It requires a bit of work. So, not only spoofing os::vm_page_size(), but
also fixing the places where the native page size shines through, e.g. making
sure os::reserve_memory() always returns os::vm_page_size() aligned memory.
I wonder whether it would be worth the work, that way one could simulate a
larger page size with a switch and test for errors like this.
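
One common way to get memory aligned to more than the native page size is
to over-reserve and trim. The sketch below shows that technique with plain
POSIX mmap/munmap; it is not how os::reserve_memory() is implemented, and
reserve_aligned is a hypothetical helper. It assumes size and alignment are
both multiples of the native page size and that alignment is a power of
two.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <sys/mman.h>

// Reserve `size` bytes of inaccessible address space whose start is aligned
// to `alignment`, by over-reserving and unmapping the unaligned head and the
// leftover tail. Returns nullptr on failure. POSIX only.
inline void* reserve_aligned(std::size_t size, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  const std::size_t padded = size + alignment;
  void* raw = mmap(nullptr, padded, PROT_NONE,
                   MAP_PRIVATE | MAP_ANONYMOUS, -1, 0);
  if (raw == MAP_FAILED) return nullptr;

  const std::uintptr_t start = reinterpret_cast<std::uintptr_t>(raw);
  const std::uintptr_t mask = static_cast<std::uintptr_t>(alignment) - 1;
  const std::uintptr_t aligned = (start + mask) & ~mask;

  const std::size_t head = aligned - start;       // bytes before aligned start
  const std::size_t tail = padded - head - size;  // bytes after the kept part
  if (head > 0) munmap(raw, head);
  if (tail > 0) munmap(reinterpret_cast<void*>(aligned + size), tail);
  return reinterpret_cast<void*>(aligned);
}
```

Calling reserve_aligned(1M, 64K) on a 4K-page system returns a 64K-aligned
region, which is the property a spoofed larger page size would have to
guarantee everywhere.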

..Thomas


On Tue, Apr 28, 2020 at 12:50 PM Andrew Haley <aph at redhat.com> wrote:

> On 4/28/20 11:11 AM, Nick Gasson wrote:
> >>
> >> My nightly buildbot started failing recently on aarch64.
> >> One issue is that the [cds] error message prevents configure from
> >> determining the boot JDK.
> >>
> https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77
> >>
> >> ./bin/java --version
> >> [0.006s][error][cds] Unable to map CDS archive --
> os::vm_allocation_granularity() expected: 65536 actual: 4096
> >> openjdk 14.0.1 2020-04-14
> >> OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
> >> OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode)
> >>
> >> Are you aware of this issue? I can't see [cds] line on my linux/x86_64
> host.
> >> Maybe it's aarch64-only.  CC aarch-port-dev.
> >
> > Was your boot JDK built on a machine configured with a different page
> > size to your current machine? Looks like the CDS archive was dumped on a
> > machine with 64k pages but you're running with 4k pages. There's a JBS
> > issue for this:
> >
> > https://bugs.openjdk.java.net/browse/JDK-8236847
>
> The thread seems to have died here:
>
>
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-February/038207.html
>
> We really need to get this fixed. If anyone reading this has machines with
> both 4k and 64k pages, please do the experiment and we'll make a suitable
> patch. Everything here has 64k pages.
>
> --
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>
>

From dms at samersoff.net  Tue Apr 28 16:31:03 2020
From: dms at samersoff.net (Dmitry Samersoff)
Date: Tue, 28 Apr 2020 19:31:03 +0300
Subject: [aarch64-port-dev ] FW: [Mach5]
 mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <c13feff5-31e5-10fa-4915-df74c170a2e8@redhat.com>
References: <C769AA2F-55A8-4986-9EEE-0AD93FB93E00@amazon.com>
 <E6D13CFA-A9E6-4262-8D33-2FB4412413B6@amazon.com> <858sifgbt3.fsf@arm.com>
 <c13feff5-31e5-10fa-4915-df74c170a2e8@redhat.com>
Message-ID: <2fc1d25f-e854-31d9-6d14-fab24b5d24c5@samersoff.net>

Hello Andrew,

I'm working on it, based on the fix proposed by Ioi.

-Dmitry


On 28.04.2020 13:47, Andrew Haley wrote:
> On 4/28/20 11:11 AM, Nick Gasson wrote:
>>>
>>> My nightly buildbot started failing recently on aarch64.
>>> One issue is that the [cds] error message prevents configure from determining the boot JDK.
>>> https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77
>>>
>>> ./bin/java --version
>>> [0.006s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096
>>> openjdk 14.0.1 2020-04-14
>>> OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
>>> OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode)
>>>
>>> Are you aware of this issue? I can't see [cds] line on my linux/x86_64 host.
>>> Maybe it's aarch64-only.  CC aarch-port-dev.
>>
>> Was your boot JDK built on a machine configured with a different page
>> size to your current machine? Looks like the CDS archive was dumped on a
>> machine with 64k pages but you're running with 4k pages. There's a JBS
>> issue for this:
>>
>> https://bugs.openjdk.java.net/browse/JDK-8236847
> 
> The thread seems to have died here:
> 
> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2020-February/038207.html
> 
> We really need to get this fixed. If anyone reading this has machines with
> both 4k and 64k pages, please do the experiment and we'll make a suitable
> patch. Everything here has 64k pages.
> 


From thomas.stuefe at gmail.com  Wed Apr 29 06:18:58 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Wed, 29 Apr 2020 08:18:58 +0200
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
	storage reservation
In-Reply-To: <6802c3af-77a5-8a91-b94c-d590dd41765f@oracle.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
 <6802c3af-77a5-8a91-b94c-d590dd41765f@oracle.com>
Message-ID: <CAA-vtUwvm_29aMpffCLrSaPp9TFL6ngXAaLQe5Atjyj4qd9DSg@mail.gmail.com>

Hi Ioi,

thanks for looking at this. Of course I'm happy if you run it through your
CI too.

The patch is based on

changeset:   59009:2b3b41fff837
tag:         qparent
user:        egahlin
date:        Mon Apr 27 15:01:22 2020 +0200
summary:     8242034: Remove JRE_HOME references


On Wed, Apr 29, 2020 at 7:14 AM Ioi Lam <ioi.lam at oracle.com> wrote:

> Hi Thomas,
>
> There are a lot of changes so it will take me a while to go through
> everything. Just some initial comments:
>
>    // User may have specified an invalid base address. Should we ignore
> it or assert?
> guarantee(CompressedKlassPointers::is_valid_base((address)shared_base),
>              "SharedBaseAddress: " PTR_FORMAT " is not a valid base.",
> p2i(shared_base));
>
> This will cause the VM to crash. I think it's better to (1) exit the VM
> properly with an error code, or (2) override the user's input.
>
> ======
>
> Since this is a potentially disruptive change, I want to run it in our
> CI as well. Could you tell me the tip of your repo?
>
> ========
>
> For testing the CDS relocation code, I would suggest running:
>
> cd test/hotspot/jtreg
> jtreg -javaoption:-XX:+UnlockDiagnosticVMOptions \
>        -javaoption:-XX:ArchiveRelocationMode=1 \
>        -javaoption:-XX:NativeMemoryTracking=detail \
>        :hotspot_cds_relocation
>
> This will place the CCS at random locations picked by the OS.
>
> ========
>
> metaspace.cpp:
>
> If your intention is to "shake things up a little", it's not a good idea
> to include it in a complex change set. If things indeed go wrong, we
> don't know who caused it (your CCS changes, or old bugs triggered by
> this debug code), and we will end up backing out the entire changeset.
>
> I would suggest putting this in a different RFE, and even push it now.
>
>    // The upcoming Elastic Metaspace will have stricter alignment
> requirements.
>    // For debug builds, increase reserve alignment to shake loose errors
> resulting
>    // from misusing this alignment.
>    // Note: do not increase too much (e.g. not on platforms with 64K
> pages), we do not
>    // want to disturb tests requiring precise numbers for metaspace size
> or ccs size.
> #ifdef ASSERT
>    if (_reserve_alignment == 4 * K) {
>      _reserve_alignment *= 4;
>    }
> #endif
>
>
> More to come ....
>
> Thanks
> - Ioi
>
>
>
All good points, I'll wait for your final review.

Cheers, Thomas


> On 4/28/20 7:54 AM, Thomas St?fe wrote:
> > Hi all,
> >
> > Could I have reviews for the following proposal of reworking cds/class
> > space reservation?
> >
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8243392
> >
> > Webrev:
> >
> http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.00/webrev/
> >
> > (Many thanks to Ioi Lam for so patiently explaining CDS internals to
> > me, and to Andrew Haley and Nick Gasson for help with aarch64!)
> >
> > Reservation of the compressed class space is needlessly complicated
> > and has some minor issues. It can be simplified and made clearer.
> >
> > The complexity stems from the fact that this area lives at the
> > intersection of two to three sub systems, depending on how one counts.
> > Metaspace, CDS, and the platform which may or may not its own view of
> > how to reserve class space. And this code has been growing organically
> > over time.
> >
> > One small example:
> >
> > ReservedSpace Metaspace::reserve_preferred_space(size_t size, size_t
> > alignment,
> >                                                  bool large_pages,
> > char *requested_addr,
> >                                                  bool use_requested_addr)
> >
> > which I spent hours decoding, resulting in a very confused mail to
> > hs-runtime and aarch64-port-dev [2].
> >
> > This patch attempts to simplify cds and metaspace setup a bit; to
> > comment implicit knowledge which is not immediately clear; to cleanly
> > abstract platform concerns like optimized class space placement; and
> > to disentangle cds from metaspace to solve issues which may bite us
> > later with Elastic Metaspace [4].
> >
> > ---
> >
> > The main change is the reworked reservation mechanism. This is based
> > on Ioi's proposal [5].
> >
> > When reserving class space, three things must happen:
> >
> > 1) reservation of the space obviously. If cds is active that space
> > must be in the vicinity of cds archives to be covered by compressed
> > class pointer encoding.
> > 2) setting up the internal Metaspace structures atop of that space
> > 3) setting up compressed class pointer encoding.
> >
> > In its current form, Metaspace may or may not do some or all of that
> > in one function
> > (Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace
> > metaspace_rs, char* requested_addr, address cds_base);) - if cds is
> > active, it will reserve the space for Metaspace and hand it in,
> > otherwise it will create it itself.
> >
> > When discussing this in [2], Ioi proposed to move the reservation of
> > the class space completely out of Metaspace and make it a
> > responsibility of the caller always. This would reduce some
> > complexity, and this patch follows the proposal.
> >
> > I removed
> > Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace
> > metaspace_rs, char* requested_addr, address cds_base); and all its sub
> > functions.
> >
> > (1) now has to always be done outside - a ReservedSpace for class
> > space has to be provided by the caller. However, Metaspace now offers
> > a utility function for reserving space at a "nice" location, and
> > explicitly doing nothing else:
> >
> > ReservedSpace
> > Metaspace::reserve_address_space_for_compressed_classes(size_t size);
> >
> > this function can be redefined on a platform level for platform
> > optimized reservation, see below for details.
> >
> > (2) is taken care of by a new function,
> > Metaspace::initialize_class_space(ReservedSpace rs)
> >
> > (3) is taken care of a new function
> > CompressedKlassPointers::initialize(), see below for details.
> >
> >
> > So, class space now is set up three explicit steps:
> >
> > - First, reserve a suitable space by however means you want. For
> > convenience you may use
> > Metaspace::reserve_address_space_for_compressed_classes(), or you may
> > roll your own reservation.
> > - Next, tell Metaspace to use that range as backing storage for class
> > space: Metaspace::initialize_class_space(ReservedSpace rs)
> > - Finally, set up encoding. Encoding is independent from the concept
> > of a ReservedSpace, it just gets an address range, see below for details.
> >
> > Separating these steps and moving them out of the responsibility of
> > Metaspace makes this whole thing more flexible; it also removes
> > unnecessary knowledge (e.g. Metaspace does not need to know anything
> > about either ccp encoding or cds).
> >
> > ---
> >
> > How it comes together:
> >
> > If CDS is off, we just reserve a space using
> > Metaspace::reserve_address_space_for_compressed_classes(), initialize
> > it with Metaspace::initialize_class_space(ReservedSpace rs), then set
> > up compressed class pointer encoding covering the range of this class
> > space.
> >
> > If CDS is on (dump time), we reserve large 4G space, either at
> > SharedBaseAddress or using
> > Metaspace::reserve_address_space_for_compressed_classes(); we then
> > split that into 3G archive space and 1G class space; we set up that
> > space with Metaspace as class space; then we set up compressed class
> > pointer encoding covering both archive space and cds.
> >
> > If CDS is on (run time), we reserve a large space, split it into
> > archive space (large enough to hold both archives) and class space,
> > then basically proceed as above.
> >
> > Note that this is almost exactly how things worked before (modulo some
> > minor fixes, e.g. alignment issues), only the code is reformed and
> > made more explicit.
> >
> > ---
> >
> > I moved compressed class pointer setup over to CompressedKlassPointers
> > and changed the interface:
> >
> > -void Metaspace::set_narrow_klass_base_and_shift(ReservedSpace
> > metaspace_rs, address cds_base)
> > +void CompressedKlassPointers::initialize(address addr, size_t len);
> >
> > Instead of feeding it a single ReservedSpace, which is supposed to
> > represent class space, and an optional alternate base if cds is on,
> > now we give it just an numeric address range. That range marks the
> > limits to where Klass structures are to be expected, and is the
> > implicit promise that outside that range no Klass structures will
> > exist, so encoding has to cover only this range.
> >
> > This range may contain just the class space; or class space+cds; or
> > whatever allocation scheme we come up with in the future. Encoding
> > does not really care how the memory is organized as long as the input
> > range covers all possible Klass locations. That way we remove
> > knowledge about class space/cds from compressed class pointer encoding.
> >
> > Moving it away from metaspace.cpp into the CompressedKlassPointers
> > class also mirrors CompressedOops::initialize().
> >
> > ---
> >
> > I renamed _narrow_klass_range to just _range, because strictly
> > speaking this is the range un-narrow Klass pointers can have.
> >
> > As for the implementation of
> > CompressedKlassPointers::initialize(address addr, size_t len), I
> > mimicked very closely what happened before, so there should be almost
> > no differences. Since "almost no differences" sounds scary :) here are
> > the differences:
> >
> > - When CDS is active (dump or run time) we now always,
> > unconditionally, set the encoding range to 4G. This fixes a
> > theoretical bug discussed on aarch64-port-dev [1].
> >
> > - When CDS is not active, we set the encoding range to the minimum
> > required length. Before, it was left at its default value of 4G.
> >
> > Both differences only affect aarch64, since they are currently the
> > only one using the range field in CompressedKlassPointers.
> >
> > I wanted to add an assert somewhere to test encoding of the very last
> > address of the CompressedKlassPointers range, again to prevent errors
> > like [3]. But I did not come up with a good place for this assert
> > which would cover also the encoding done by C1/C2.
> >
> > For the same reason I thought about introducing a mode where Klass
> > structures would be allocated in reverse order, starting at the end of
> > the ccs, but again left it out as too big a change.
> >
> > ---
> >
> > OS abstraction: platforms may have restrictions of what constitutes a
> > valid compressed class pointer encoding base. Or if not, they may have
> > at least preferences. There was logic like this in metaspace.cpp,
> > which I removed and cleanly factored out into platform dependent
> > files, giving each platform the option to add special logic.
> >
> > These are two new methods:
> >
> > - bool CompressedKlassPointers::is_valid_base(address p)
> >
> > to let the platform tell you whether it considers p to be a valid
> > encoding base. The only platform having these restrictions currently
> > is aarch64.
> >
> > - ReservedSpace
> > Metaspace::reserve_address_space_for_compressed_classes(size_t size);
> >
> > this hands over the process of allocating a range suitable for
> > compressed class pointer encoding to the platform. Most platforms will
> > allocate just anywhere, but some platforms may have a better strategy
> > (e.g. trying low memory first, trying only correctly aligned addresses
> > and so on).
> >
> > Beforehand, this coding existed in a similar form in metaspace.cpp for
> > aarch64 and AIX. For now, I left the AIX part out - it seems only half
> > done, and I want to check further if we even need it, if yes why not
> > on Linux ppc, and C1 does not seem to support anything other than
> > base+offset with shift either, but I may be mistaken.
> >
> > These two methods should give the platform enough control to implement
> > their own scheme for optimized class space placement without bothering
> > any shared code about it.
> >
> > Note about the form, I introduced two new platform dependent files,
> > "metaspace_<cpu>.cpp" and "compressedOops_<cpu>.cpp". I am not happy
> > about this but this seems to be what we generally do in hotspot, right?
> >
> > ---
> >
> > Metaspace reserve alignment vs cds alignment
> >
> > CDS was using Metaspace reserve alignment for CDS internal purposes. I
> > guess this was just a copy paste issue. It never caused problems since
> > Metaspace reserve alignment == page size, but that is not true anymore
> > in the upcoming Elastic Metaspace where reserve alignment will be
> > larger. This causes a number of issues.
> >
> > I separated those two cleanly. CDS now uses
> > os::vm_allocation_granularity. Metaspace::reserve_alignment is only
> > used in those two places where it is needed, when CDS creates the
> > address space for class space on behalf of the Metaspace.
> >
> > ---
> >
> > Windows special handling in CDS
> >
> > To simplify coding I removed the windows specific handling which left
> > out reservation of the archive. This was needed because windows cannot
> > mmap files into reserved regions. But fallback code exists in
> > filemap.cpp for this case which just reads in the region instead of
> > mapping it.
> >
> > Should that turn out to be a performance problem, I will reinstate the
> > feature. But a simpler way would be reserve the archive and later just
> > before mmapping the archive file to release the archive space. That
> > would not only be simpler but give us the best guarantee that that
> > address space is actually available. But I'd be happy to leave that
> > part out completely if we do not see any performance problems on
> > windows x64.
> >
> > ---
> >
> > NMT cannot deal with spaces which are split. This problem manifests in
> > that bookkeeping for class space is done under "Shared Classes", not
> > "Classes" as it should. This problem exists today too at dump time and
> > randomly at run time. But since I simplified the reservation, this
> > problem now shows up always, whether or not we map at the
> > SharedBaseAddress.
> > While I could work around this problem, I'd prefer this problem to be
> > solved at the core, and NMT to have an option to recognize reservation
> > splits. So I'd rather not put a workaround for this into the patch but
> > leave it for fixing as a separate issue. I opened this issue to track
> > it [6].
> >
> > ---
> >
> > Jtreg tests:
> >
> > I expanded the CompressedOops/CompressedClassPointers.java. I also
> > extended them to Windows. The tests now optionally omit strict class
> > space placement tests, since these tests heavily depend on ASLR and
> > were the reason they were excluded on Windows. However I think even
> > without checking for class space placement they make sense, just to
> > see that the VM comes up and lives with the many different settings we
> > can run in.
> >
> > ---
> >
> > Tests:
> >
> > - I ran the patch through Oracle's submit repo
> > - I ran tests manually for aarch64, zero, linux 32bit and windows x64
> > - The whole battery of nightly tests at SAP, including ppc, ppcle and
> > aarch64, unfortunately excluding windows because of unrelated errors.
> > Windows x64 tests will be redone tonight.
> >
> >
> > Thank you,
> >
> > Thomas
> >
> > [1]
> >
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008804.html
> > [2]
> >
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html
> > [3] https://bugs.openjdk.java.net/browse/JDK-8193266
> > [4] https://bugs.openjdk.java.net/browse/JDK-8221173
> > [5]
> >
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008765.html
> > [6] https://bugs.openjdk.java.net/browse/JDK-8243535
> >
>
>

From ningsheng.jian at arm.com  Wed Apr 29 06:59:18 2020
From: ningsheng.jian at arm.com (Ningsheng Jian)
Date: Wed, 29 Apr 2020 14:59:18 +0800
Subject: [aarch64-port-dev ] RFR: 8216557 Aarch64: Add support for
	Concurrent Class Unloading
In-Reply-To: <bfd121cd-f0eb-25ea-1b56-d42b3467bb6a@arm.com>
References: <520f8085-eaa0-46bc-9eb9-c1244fca2531@arm.com>
 <8f317840-a2b2-3ccb-fbb2-a38b2ebcbf4b@oracle.com>
 <7e49dc25-da51-50d3-eb3f-4840dab7db47@arm.com>
 <fc53ddcd-1464-a3f1-1843-8d7ac7f17905@oracle.com>
 <4a01252e-e7b2-d2bc-2858-ca1785b8b2a2@arm.com>
 <3f193fdc-b1fb-9f0a-4635-acdb7de29bca@arm.com>
 <bfd121cd-f0eb-25ea-1b56-d42b3467bb6a@arm.com>
Message-ID: <0f602574-a4ea-da3e-46de-d35862e276d6@arm.com>

On 4/28/20 7:28 PM, Stuart Monteith wrote:
> On 28/04/2020 06:26, Ningsheng Jian wrote:
>> Hi Stuart,
>>
>> On 4/28/20 12:34 AM, Stuart Monteith wrote:
>>> Thanks Erik, Per, Andrew,
>>>     I've fixed up the testcase and retested.
>>>
>>> Uploaded here:
>>>
>>>     http://cr.openjdk.java.net/~smonteith/8216557/webrev.2/
>>>
>>> Would someone be able to submit this for me?
>>>
>>
>> I submitted a build job before pushing your code, but it failed to build with the minimal variant configuration. Here's the error
>> message:
>>
>> ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp: In static member function 'static AdapterHandlerEntry*
>> SharedRuntime::generate_i2c2i_adapters(MacroAssembler*, int, int, const BasicType*, const VMRegPair*,
>> AdapterFingerPrint*)':
>>
>> ./src/hotspot/cpu/aarch64/sharedRuntime_aarch64.cpp:736:5: error: invalid use of incomplete type 'class
>> BarrierSetAssembler'
>>
>>     bs->c2i_entry_barrier(masm);
>>
>> I think you need to include barrierSetAssembler.hpp in sharedRuntime_aarch64.cpp?
>>
>> Thanks,
>> Ningsheng
> 
> Thanks for that Ningsheng - I've made some changes, and built with minimal.
> 
> The revised patch:
> 
>   http://cr.openjdk.java.net/~smonteith/8216557/webrev.3/
> 

Looks good and pushed.

Thanks,
Ningsheng


From ioi.lam at oracle.com  Wed Apr 29 05:14:31 2020
From: ioi.lam at oracle.com (Ioi Lam)
Date: Tue, 28 Apr 2020 22:14:31 -0700
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
	storage reservation
In-Reply-To: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
Message-ID: <6802c3af-77a5-8a91-b94c-d590dd41765f@oracle.com>

Hi Thomas,

There are a lot of changes so it will take me a while to go through 
everything. Just some initial comments:

  // User may have specified an invalid base address. Should we ignore it or assert?
  guarantee(CompressedKlassPointers::is_valid_base((address)shared_base),
            "SharedBaseAddress: " PTR_FORMAT " is not a valid base.", p2i(shared_base));

This will cause the VM to crash. I think it's better to either (1) exit the
VM properly with an error code, or (2) override the user's input.
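
A minimal sketch of both options follows. It is an illustration only:
check_shared_base and adjust_shared_base are hypothetical names, addresses
are plain integers rather than HotSpot's address type, and the alignment
rule is the aarch64 one from the webrev as quoted later in this thread.

```cpp
#include <cassert>
#include <cstdint>
#include <cstdio>

// Hypothetical stand-ins for illustration; not HotSpot's declarations.
constexpr uint64_t G = 1024ULL * 1024 * 1024;

// aarch64 rule as posted in this thread: 4G-aligned below 32G,
// 32G-aligned at or above 32G.
bool is_valid_base(uint64_t p) {
  return (p < 32 * G) ? (p % (4 * G) == 0) : (p % (32 * G) == 0);
}

// Option (1): report the problem and hand back an exit code to the caller.
int check_shared_base(uint64_t shared_base) {
  if (!is_valid_base(shared_base)) {
    std::fprintf(stderr, "SharedBaseAddress: 0x%llx is not a valid base.\n",
                 static_cast<unsigned long long>(shared_base));
    return 1;
  }
  return 0;
}

// Option (2): override the input by rounding up to the next valid base.
uint64_t adjust_shared_base(uint64_t shared_base) {
  const uint64_t alignment = (shared_base < 32 * G) ? 4 * G : 32 * G;
  const uint64_t adjusted = (shared_base + alignment - 1) / alignment * alignment;
  assert(is_valid_base(adjusted));
  return adjusted;
}
```

Either way the user gets a diagnosable outcome instead of a guarantee()
failure; which of the two is preferable is a policy question for the patch.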

======

Since this is a potentially disruptive change, I want to run it in our 
CI as well. Could you tell me the tip of your repo?

========

For testing the CDS relocation code, I would suggest running:

cd test/hotspot/jtreg
jtreg -javaoption:-XX:+UnlockDiagnosticVMOptions \
      -javaoption:-XX:ArchiveRelocationMode=1 \
      -javaoption:-XX:NativeMemoryTracking=detail \
      :hotspot_cds_relocation

This will place the CCS at random locations picked by the OS.

========

metaspace.cpp:

If your intention is to "shake things up a little", it's not a good idea 
to include it in a complex change set. If things indeed go wrong, we 
don't know who caused it (your CCS changes, or old bugs triggered by 
this debug code), and we will end up backing out the entire changeset.

I would suggest putting this in a different RFE, and even pushing it now.

  // The upcoming Elastic Metaspace will have stricter alignment requirements.
  // For debug builds, increase reserve alignment to shake loose errors resulting
  // from misusing this alignment.
  // Note: do not increase too much (e.g. not on platforms with 64K pages), we do not
  // want to disturb tests requiring precise numbers for metaspace size or ccs size.
#ifdef ASSERT
  if (_reserve_alignment == 4 * K) {
    _reserve_alignment *= 4;
  }
#endif


More to come ....

Thanks
- Ioi


On 4/28/20 7:54 AM, Thomas Stüfe wrote:
> Hi all,
>
> Could I have reviews for the following proposal of reworking cds/class 
> space reservation?
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8243392
>
> Webrev: 
> http://cr.openjdk.java.net/~stuefe/webrevs/rework-cds-ccs-reservation/webrev.00/webrev/
>
> (Many thanks to Ioi Lam for so patiently explaining CDS internals to 
> me, and to Andrew Haley and Nick Gasson for help with aarch64!)
>
> Reservation of the compressed class space is needlessly complicated 
> and has some minor issues. It can be simplified and made clearer.
>
> The complexity stems from the fact that this area lives at the 
> intersection of two to three sub systems, depending on how one counts. 
> Metaspace, CDS, and the platform, which may or may not have its own view of
> how to reserve class space. And this code has been growing organically 
> over time.
>
> One small example:
>
> ReservedSpace Metaspace::reserve_preferred_space(size_t size, size_t alignment,
>                                                  bool large_pages, char *requested_addr,
>                                                  bool use_requested_addr)
>
> which I spent hours decoding, resulting in a very confused mail to 
> hs-runtime and aarch64-port-dev [2].
>
> This patch attempts to simplify cds and metaspace setup a bit; to 
> comment implicit knowledge which is not immediately clear; to cleanly 
> abstract platform concerns like optimized class space placement; and 
> to disentangle cds from metaspace to solve issues which may bite us 
> later with Elastic Metaspace [4].
>
> ---
>
> The main change is the reworked reservation mechanism. This is based 
> on Ioi's proposal [5].
>
> When reserving class space, three things must happen:
>
> 1) reservation of the space obviously. If cds is active that space 
> must be in the vicinity of cds archives to be covered by compressed 
> class pointer encoding.
> 2) setting up the internal Metaspace structures atop of that space
> 3) setting up compressed class pointer encoding.
>
> In its current form, Metaspace may or may not do some or all of that 
> in one function 
> (Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace 
> metaspace_rs, char* requested_addr, address cds_base);) - if cds is 
> active, it will reserve the space for Metaspace and hand it in, 
> otherwise it will create it itself.
>
> When discussing this in [2], Ioi proposed to move the reservation of 
> the class space completely out of Metaspace and make it a 
> responsibility of the caller always. This would reduce some 
> complexity, and this patch follows the proposal.
>
> I removed 
> Metaspace::allocate_metaspace_compressed_klass_ptrs(ReservedSpace 
> metaspace_rs, char* requested_addr, address cds_base); and all its sub 
> functions.
>
> (1) now has to always be done outside - a ReservedSpace for class 
> space has to be provided by the caller. However, Metaspace now offers 
> a utility function for reserving space at a "nice" location, and 
> explicitly doing nothing else:
>
> ReservedSpace 
> Metaspace::reserve_address_space_for_compressed_classes(size_t size);
>
> this function can be redefined on a platform level for platform 
> optimized reservation, see below for details.
>
> (2) is taken care of by a new function, 
> Metaspace::initialize_class_space(ReservedSpace rs)
>
> (3) is taken care of by a new function,
> CompressedKlassPointers::initialize(), see below for details.
>
>
> So, class space is now set up in three explicit steps:
>
> - First, reserve a suitable space by however means you want. For 
> convenience you may use 
> Metaspace::reserve_address_space_for_compressed_classes(), or you may 
> roll your own reservation.
> - Next, tell Metaspace to use that range as backing storage for class 
> space: Metaspace::initialize_class_space(ReservedSpace rs)
> - Finally, set up encoding. Encoding is independent from the concept 
> of a ReservedSpace, it just gets an address range, see below for details.
>
> Separating these steps and moving them out of the responsibility of 
> Metaspace makes this whole thing more flexible; it also removes 
> unnecessary knowledge (e.g. Metaspace does not need to know anything 
> about either ccp encoding or cds).
>
> ---
>
> How it comes together:
>
> If CDS is off, we just reserve a space using 
> Metaspace::reserve_address_space_for_compressed_classes(), initialize 
> it with Metaspace::initialize_class_space(ReservedSpace rs), then set 
> up compressed class pointer encoding covering the range of this class 
> space.
>
> If CDS is on (dump time), we reserve a large 4G space, either at
> SharedBaseAddress or using
> Metaspace::reserve_address_space_for_compressed_classes(); we then
> split that into 3G archive space and 1G class space; we set up that
> space with Metaspace as class space; then we set up compressed class
> pointer encoding covering both archive space and class space.
>
> If CDS is on (run time), we reserve a large space, split it into 
> archive space (large enough to hold both archives) and class space, 
> then basically proceed as above.
>
> Note that this is almost exactly how things worked before (modulo some 
> minor fixes, e.g. alignment issues), only the code is reformed and 
> made more explicit.
>
> ---
>
> I moved compressed class pointer setup over to CompressedKlassPointers 
> and changed the interface:
>
> -void Metaspace::set_narrow_klass_base_and_shift(ReservedSpace 
> metaspace_rs, address cds_base)
> +void CompressedKlassPointers::initialize(address addr, size_t len);
>
> Instead of feeding it a single ReservedSpace, which is supposed to 
> represent class space, and an optional alternate base if cds is on, 
> now we give it just a numeric address range. That range marks the
> limits to where Klass structures are to be expected, and is the 
> implicit promise that outside that range no Klass structures will 
> exist, so encoding has to cover only this range.
>
> This range may contain just the class space; or class space+cds; or 
> whatever allocation scheme we come up with in the future. Encoding 
> does not really care how the memory is organized as long as the input 
> range covers all possible Klass locations. That way we remove 
> knowledge about class space/cds from compressed class pointer encoding.
>
> Moving it away from metaspace.cpp into the CompressedKlassPointers 
> class also mirrors CompressedOops::initialize().
>
> ---
>
> I renamed _narrow_klass_range to just _range, because strictly 
> speaking this is the range un-narrow Klass pointers can have.
>
> As for the implementation of 
> CompressedKlassPointers::initialize(address addr, size_t len), I 
> mimicked very closely what happened before, so there should be almost 
> no differences. Since "almost no differences" sounds scary :) here are 
> the differences:
>
> - When CDS is active (dump or run time) we now always, 
> unconditionally, set the encoding range to 4G. This fixes a 
> theoretical bug discussed on aarch64-port-dev [1].
>
> - When CDS is not active, we set the encoding range to the minimum 
> required length. Before, it was left at its default value of 4G.
>
> Both differences only affect aarch64, since it is currently the
> only platform using the range field in CompressedKlassPointers.
>
> I wanted to add an assert somewhere to test encoding of the very last 
> address of the CompressedKlassPointers range, again to prevent errors 
> like [3]. But I did not come up with a good place for this assert 
> which would cover also the encoding done by C1/C2.
>
> For the same reason I thought about introducing a mode where Klass 
> structures would be allocated in reverse order, starting at the end of 
> the ccs, but again left it out as too big a change.
>
> ---
>
> OS abstraction: platforms may have restrictions of what constitutes a 
> valid compressed class pointer encoding base. Or if not, they may have 
> at least preferences. There was logic like this in metaspace.cpp, 
> which I removed and cleanly factored out into platform dependent 
> files, giving each platform the option to add special logic.
>
> These are two new methods:
>
> - bool CompressedKlassPointers::is_valid_base(address p)
>
> to let the platform tell you whether it considers p to be a valid 
> encoding base. The only platform having these restrictions currently 
> is aarch64.
>
> - ReservedSpace 
> Metaspace::reserve_address_space_for_compressed_classes(size_t size);
>
> this hands over the process of allocating a range suitable for 
> compressed class pointer encoding to the platform. Most platforms will 
> allocate just anywhere, but some platforms may have a better strategy 
> (e.g. trying low memory first, trying only correctly aligned addresses 
> and so on).
>
> Beforehand, this coding existed in a similar form in metaspace.cpp for 
> aarch64 and AIX. For now, I left the AIX part out - it seems only half 
> done, and I want to check further if we even need it, if yes why not 
> on Linux ppc, and C1 does not seem to support anything other than 
> base+offset with shift either, but I may be mistaken.
>
> These two methods should give the platform enough control to implement 
> their own scheme for optimized class space placement without bothering 
> any shared code about it.
>
> Note about the form, I introduced two new platform dependent files, 
> "metaspace_<cpu>.cpp" and "compressedOops_<cpu>.cpp". I am not happy 
> about this but this seems to be what we generally do in hotspot, right?
>
> ---
>
> Metaspace reserve alignment vs cds alignment
>
> CDS was using Metaspace reserve alignment for CDS internal purposes. I 
> guess this was just a copy paste issue. It never caused problems since 
> Metaspace reserve alignment == page size, but that is not true anymore 
> in the upcoming Elastic Metaspace where reserve alignment will be 
> larger. This causes a number of issues.
>
> I separated those two cleanly. CDS now uses 
> os::vm_allocation_granularity. Metaspace::reserve_alignment is only 
> used in those two places where it is needed, when CDS creates the 
> address space for class space on behalf of the Metaspace.
>
> ---
>
> Windows special handling in CDS
>
> To simplify coding I removed the windows specific handling which left 
> out reservation of the archive. This was needed because windows cannot 
> mmap files into reserved regions. But fallback code exists in 
> filemap.cpp for this case which just reads in the region instead of 
> mapping it.
>
> Should that turn out to be a performance problem, I will reinstate the 
> feature. But a simpler way would be to reserve the archive and later,
> just before mmapping the archive file, release the archive space. That
> would not only be simpler but give us the best guarantee that that 
> address space is actually available. But I'd be happy to leave that 
> part out completely if we do not see any performance problems on 
> windows x64.
>
> ---
>
> NMT cannot deal with spaces which are split. This problem manifests in 
> that bookkeeping for class space is done under "Shared Classes", not
> "Classes" as it should. This problem exists today too at dump time and
> randomly at run time. But since I simplified the reservation, this 
> problem now shows up always, whether or not we map at the 
> SharedBaseAddress.
> While I could work around this problem, I'd prefer this problem to be 
> solved at the core, and NMT to have an option to recognize reservation 
> splits. So I'd rather not put a workaround for this into the patch but 
> leave it for fixing as a separate issue. I opened this issue to track 
> it [6].
>
> ---
>
> Jtreg tests:
>
> I expanded the CompressedOops/CompressedClassPointers.java. I also 
> extended them to Windows. The tests now optionally omit strict class 
> space placement tests, since these tests heavily depend on ASLR and 
> were the reason they were excluded on Windows. However I think even 
> without checking for class space placement they make sense, just to 
> see that the VM comes up and lives with the many different settings we 
> can run in.
>
> ---
>
> Tests:
>
> - I ran the patch through Oracle's submit repo
> - I ran tests manually for aarch64, zero, linux 32bit and windows x64
> - The whole battery of nightly tests at SAP, including ppc, ppcle and 
> aarch64, unfortunately excluding windows because of unrelated errors. 
> Windows x64 tests will be redone tonight.
>
>
> Thank you,
>
> Thomas
>
> [1] 
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008804.html
> [2] 
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008757.html
> [3] https://bugs.openjdk.java.net/browse/JDK-8193266
> [4] https://bugs.openjdk.java.net/browse/JDK-8221173
> [5] 
> https://mail.openjdk.java.net/pipermail/aarch64-port-dev/2020-April/008765.html
> [6] https://bugs.openjdk.java.net/browse/JDK-8243535
>


From nick.gasson at arm.com  Wed Apr 29 07:47:02 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Wed, 29 Apr 2020 15:47:02 +0800
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
 storage reservation
In-Reply-To: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
Message-ID: <857dxyg2dl.fsf@arm.com>

Hi Thomas,

On 04/28/20 22:54 pm, Thomas Stüfe wrote:
>
> These are two new methods:
>
> - bool CompressedKlassPointers::is_valid_base(address p)
>
> to let the platform tell you whether it considers p to be a valid encoding
> base. The only platform having these restrictions currently is aarch64.
>
> - ReservedSpace
> Metaspace::reserve_address_space_for_compressed_classes(size_t size);
>
> this hands over the process of allocating a range suitable for compressed
> class pointer encoding to the platform. Most platforms will allocate just
> anywhere, but some platforms may have a better strategy (e.g. trying low
> memory first, trying only correctly aligned addresses and so on).
>
> Beforehand, this coding existed in a similar form in metaspace.cpp for
> aarch64 and AIX. For now, I left the AIX part out - it seems only half
> done, and I want to check further if we even need it, if yes why not on
> Linux ppc, and C1 does not seem to support anything other than base+offset
> with shift either, but I may be mistaken.

Just a small comment:

  33 bool CompressedKlassPointers::is_valid_base(address p) {
  34
  35   // Below 32G, base must be aligned to 4G.
  36   // Above that point, base must be aligned to 32G
  37
  38   if (p < (address)(32 * G)) {
  39     return is_aligned(p, 4 * G);
  40   }
  41
  42   return is_aligned(p, 32 * G);
  43
  44 }

On line 42 I'd prefer to use (4 << LogKlassAlignmentInBytes)*G as it
currently is in metaspace.cpp instead of the literal 32. This makes the
relationship with the compressed class decode logic a bit clearer as the
restriction comes from the MOV and MOVK instructions we use to
decompress the pointer: we have to ensure the bits of the base and bits
of the offset after shifting do not overlap. Similarly for the
`increment` field in metaspace_aarch64.cpp line 51.
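
To illustrate the non-overlap requirement, here is a small sketch. It
models the bit insertion as a plain OR, which is a simplification of what
the MOV/MOVK sequence does, and it assumes LogKlassAlignmentInBytes is 3.
The helper names are hypothetical and not taken from the patch.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t G = 1024ULL * 1024 * 1024;
// Assumed value for illustration.
constexpr int LogKlassAlignmentInBytes = 3;

// A 32-bit narrow value shifted left by `shift` stays below
// (4 << shift) * G, so that is the span the offset bits can occupy.
constexpr uint64_t offset_span = (4ULL << LogKlassAlignmentInBytes) * G;

// True if no bit of the base can collide with a shifted offset bit.
bool base_bits_disjoint(uint64_t base) {
  return (base & (offset_span - 1)) == 0;
}

// Reference decode: base plus shifted narrow value.
uint64_t decode_with_add(uint64_t base, uint32_t narrow) {
  return base + (static_cast<uint64_t>(narrow) << LogKlassAlignmentInBytes);
}

// Simplified model of decode by inserting bits instead of adding.
uint64_t decode_with_or(uint64_t base, uint32_t narrow) {
  assert(base_bits_disjoint(base));
  return base | (static_cast<uint64_t>(narrow) << LogKlassAlignmentInBytes);
}
```

With a 64G base the two decodes agree for every narrow value. A 36G base
is 4G-aligned but shares bit 32 with the shifted offset, so the OR-style
decode would be wrong, which is why the alignment above 32G depends on the
shift rather than on a fixed constant.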


Thanks,
Nick

From aph at redhat.com  Wed Apr 29 08:36:03 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 29 Apr 2020 09:36:03 +0100
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
 storage reservation
In-Reply-To: <857dxyg2dl.fsf@arm.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
 <857dxyg2dl.fsf@arm.com>
Message-ID: <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com>

On 4/29/20 8:47 AM, Nick Gasson wrote:
> This makes the
> relationship with the compressed class decode logic a bit clearer as the
> restriction comes from the MOV and MOVK instructions we use to
> decompress the pointer: we have to ensure the bits of the base and bits
> of the offset after shifting do not overlap.

This seems a bit crazy. Whyever would anyone want shifted
CompressedKlassPointers with an offset? I guess I'm going to have to
look very closely at this patch.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From nick.gasson at arm.com  Wed Apr 29 08:51:23 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Wed, 29 Apr 2020 16:51:23 +0800
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
 storage reservation
In-Reply-To: <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
 <857dxyg2dl.fsf@arm.com> <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com>
Message-ID: <854kt2fzec.fsf@arm.com>


On 04/29/20 16:36 pm, Andrew Haley wrote:
> On 4/29/20 8:47 AM, Nick Gasson wrote:
>> This makes the
>> relationship with the compressed class decode logic a bit clearer as the
>> restriction comes from the MOV and MOVK instructions we use to
>> decompress the pointer: we have to ensure the bits of the base and bits
>> of the offset after shifting do not overlap.
>
> This seems a bit crazy. Whyever would anyone want shifted
> CompressedKlassPointers with an offset? I guess I'm going to have to
> look very closely at this patch.

The compressed class shift is always set to LogKlassAlignmentInBytes
when CDS is enabled. It's for compatibility with AOT. See this comment
in Metaspace::set_narrow_klass_base_and_shift():

  // CDS uses LogKlassAlignmentInBytes for narrow_klass_shift. See
  // MetaspaceShared::initialize_dumptime_shared_and_meta_spaces() for
  // how dump time narrow_klass_shift is set. Although, CDS can work
  // with zero-shift mode also, to be consistent with AOT it uses
  // LogKlassAlignmentInBytes for klass shift so archived java heap objects
  // can be used at same time as AOT code.

For AOT this is set up in AOTGraalHotSpotVMConfig.java:

  // AOT captures VM settings during compilation. For compressed oops this
  // presents a problem for the case when the VM selects a zero-shift mode
  // (i.e., when the heap is less than 4G). Compiling an AOT binary with
  // zero-shift limits its usability. As such we force the shift to be
  // always equal to alignment to avoid emitting zero-shift AOT code.
  CompressEncoding vmOopEncoding = super.getOopEncoding();
  aotOopEncoding = new CompressEncoding(vmOopEncoding.getBase(), logMinObjAlignment());
  CompressEncoding vmKlassEncoding = super.getKlassEncoding();
  aotKlassEncoding = new CompressEncoding(vmKlassEncoding.getBase(), logKlassAlignment);

For compressed OOPs it makes sense because it allows a larger heap
without changing the encoding mode. But for compressed class pointers we
never need to address more than 4G so maybe it's better to use 0 shift
instead of logKlassAlignment above? With CDS the default shared base
address is 0x80000000 which doesn't allow a zero base anyway.


Thanks,
Nick

From aph at redhat.com  Wed Apr 29 12:56:10 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 29 Apr 2020 13:56:10 +0100
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
 storage reservation
In-Reply-To: <854kt2fzec.fsf@arm.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
 <857dxyg2dl.fsf@arm.com> <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com>
 <854kt2fzec.fsf@arm.com>
Message-ID: <25829763-7485-5208-58e8-1d51f7068816@redhat.com>

On 4/29/20 9:51 AM, Nick Gasson wrote:
> For compressed OOPs it makes sense because it allows a larger heap
> without changing the encoding mode.

That's right: I'm looking at AOT-compiled code (after applying your
patch) and by default it uses a shift of 3, no offset. If I then run
the AOT-compiled code with -Xmx31G I get:

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  SIGSEGV (0xb) at pc=0x0000ffffa142bd3c, pid=9965, tid=10174
#
# JRE version:  (15.0) (slowdebug build )
# Java VM: OpenJDK 64-Bit Server VM (slowdebug 15-internal+0-adhoc.aph.jdk-tmp, mixed mode, aot, tiered, compressed oops, g1 gc, linux-aarch64)
# Problematic frame:
# A 388  java.lang.Thread.setPriority(I)V java.base (56 bytes) @ 0x0000ffffa142bd3c [0x0000ffffa142bac0+0x000000000000027c]

   0x0000ffffa142bd30 <+624>:	ldr	w1, [x4, #56]
   0x0000ffffa142bd34 <+628>:	cbz	w1, 0xffffa142bd84 <java.lang.Thread.setPriority(I)V+708>
   0x0000ffffa142bd38 <+632>:	lsl	x1, x1, #3
   0x0000ffffa142bd3c <+636>:	ldr	w0, [x1, #12]

... so the AOT-compiled code is still trying to use the shift of 3,
but it is not adding in the base, which is 0x1000000000. I guess this
is pilot error, but I'm trying to understand what gets checked and
when.
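
As a sketch of that mismatch (an illustration, not the VM's code): the VM
encodes with a base of 0x1000000000 and a shift of 3, while the compiled
code decodes with the same shift but a zero base. The Encoding struct and
the encode/decode helpers are hypothetical names.

```cpp
#include <cassert>
#include <cstdint>

// Settings a decoder works with: address = base + (narrow << shift).
struct Encoding {
  uint64_t base;
  int shift;
};

uint32_t encode(const Encoding& e, uint64_t addr) {
  assert(addr >= e.base);
  return static_cast<uint32_t>((addr - e.base) >> e.shift);
}

uint64_t decode(const Encoding& e, uint32_t narrow) {
  return e.base + (static_cast<uint64_t>(narrow) << e.shift);
}
```

Decoding with the captured zero-base settings lands exactly one base below
the real object, which is consistent with the faulting ldr right after the
lsl in the disassembly above.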

> But for compressed class pointers we never need to address more than
> 4G so maybe it's better to use 0 shift instead of logKlassAlignment
> above? With CDS the default shared base address is 0x80000000 which
> doesn't allow a zero base anyway.

Maybe. What actually happens when we decode compressed class pointers
in AOT-compiled code is:

Load the klass pointer from an Object:

  532440:       b940082a        ldr     w10, [x1,#8]

Load the compressed class base:

  532444:       90055e68        adrp    x8, b0fe000 <A.meta.got+0x16e000>
  532448:       9104c108        add     x8, x8, #0x130
  53244c:       f9400108        ldr     x8, [x8]

Shift and add:

  532450:       8b2a6d0a        add     x10, x8, x10, uxtx #3

... none of which is very nice, but the expensive part is loading the
compressed class base and doing the add, so I guess we don't care
that there is a shift as well.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Wed Apr 29 13:32:56 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 29 Apr 2020 14:32:56 +0100
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
 storage reservation
In-Reply-To: <25829763-7485-5208-58e8-1d51f7068816@redhat.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
 <857dxyg2dl.fsf@arm.com> <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com>
 <854kt2fzec.fsf@arm.com> <25829763-7485-5208-58e8-1d51f7068816@redhat.com>
Message-ID: <040930ce-2008-455f-4427-82c2795492c7@redhat.com>

On 4/29/20 1:56 PM, Andrew Haley wrote:
> Maybe. What actually happens when we decode compressed class pointers
> in AOT-compiled code is:
> 
> Load the klass pointer from an Object:
> 
>   532440:       b940082a        ldr     w10, [x1,#8]
> 
> Load the compressed class base:
> 
>   532444:       90055e68        adrp    x8, b0fe000 <A.meta.got+0x16e000>
>   532448:       9104c108        add     x8, x8, #0x130
>   53244c:       f9400108        ldr     x8, [x8]
> 
> Shift and add:
> 
>   532450:       8b2a6d0a        add     x10, x8, x10, uxtx #3
> 
> ... none of which is very nice, but the expensive part is loading the
> compressed class base and doing the add, so I guess we don't care
> that there is a shift as well.

Argh. s/compressed class/compressed oop/g

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Wed Apr 29 14:21:44 2020
From: aph at redhat.com (Andrew Haley)
Date: Wed, 29 Apr 2020 15:21:44 +0100
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
 storage reservation
In-Reply-To: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
Message-ID: <d6a94952-4d45-9fc3-2b45-f0ba46fda0c5@redhat.com>

On 4/28/20 3:54 PM, Thomas Stüfe wrote:
> These two methods should give the platform enough control to implement
> their own scheme for optimized class space placement without bothering any
> shared code about it.

There's still something I don't like. If we have a compressed class space
in the lower 32G but above 4G, we do this:

    // Otherwise we attempt to use a zero base if the range fits in lower 32G.
    if (end <= (address)ClassEncodingMetaspaceMax) {
      base = 0;
    } else {
      base = addr;
    }

    // Highest offset a Class* can ever have in relation to base.
    range = end - base;

    // We may not even need a shift if the range fits into 32bit:
    const uint64_t UnscaledClassSpaceMax = (uint64_t(max_juint) + 1);
    if (range < UnscaledClassSpaceMax) {
      shift = 0;
    } else {
      shift = LogClassAlignmentInBytes;
    }

... which means that we end up with zero base, shifted compressed
class pointers, *despite the fact* that we carefully chose a
nicely-aligned compressed class base we could encode efficiently.

I guess this code above is really optimized for x86; it certainly
seems to prefer shifts to offsets, which makes sense on that part. It
doesn't make much difference (if any) for AArch64, I admit, but it is
odd.
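
A standalone sketch of that selection logic makes the case concrete. The
constants are assumptions for illustration (a shift of 3, and a 32G limit
for zero-based encoding), and choose_encoding is a hypothetical name that
only mirrors the snippet above.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t G = 1024ULL * 1024 * 1024;
// Assumed values for illustration.
constexpr int LogClassAlignmentInBytes = 3;
constexpr uint64_t UnscaledClassSpaceMax = 4 * G;  // max_juint + 1
constexpr uint64_t ClassEncodingMetaspaceMax =
    UnscaledClassSpaceMax << LogClassAlignmentInBytes;  // 32G

struct Encoding {
  uint64_t base;
  int shift;
};

// Mirrors the quoted logic: prefer a zero base when the range ends below
// 32G, then pick a shift only if the range does not fit into 32 bits.
Encoding choose_encoding(uint64_t addr, uint64_t len) {
  assert(len > 0);
  const uint64_t end = addr + len;
  const uint64_t base = (end <= ClassEncodingMetaspaceMax) ? 0 : addr;
  const uint64_t range = end - base;
  const int shift = (range < UnscaledClassSpaceMax) ? 0 : LogClassAlignmentInBytes;
  return Encoding{base, shift};
}
```

For a 1G class space at 8G this yields base 0 with shift 3, although
base 8G with shift 0 would also cover the range. Only above 32G does the
nicely-aligned base actually get used.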

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From thomas.stuefe at gmail.com  Wed Apr 29 16:14:30 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Wed, 29 Apr 2020 18:14:30 +0200
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
 storage reservation
In-Reply-To: <d6a94952-4d45-9fc3-2b45-f0ba46fda0c5@redhat.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
 <d6a94952-4d45-9fc3-2b45-f0ba46fda0c5@redhat.com>
Message-ID: <CAA-vtUwKc57J=xinDrES0tES+8eYD+nTzjqaqFyxz3E3fzMtVg@mail.gmail.com>

Hi Andrew,

On Wed, Apr 29, 2020 at 4:21 PM Andrew Haley <aph at redhat.com> wrote:

> On 4/28/20 3:54 PM, Thomas Stüfe wrote:
> > These two methods should give the platform enough control to implement
> > their own scheme for optimized class space placement without bothering
> any
> > shared code about it.
>
> There's still something I don't like. If we have a compressed class space
> in the lower 32G but above 4G, we do this:
>
>     // Otherwise we attempt to use a zero base if the range fits in lower
> 32G.
>     if (end <= (address)ClassEncodingMetaspaceMax) {
>       base = 0;
>     } else {
>       base = addr;
>     }
>
>     // Highest offset a Class* can ever have in relation to base.
>     range = end - base;
>
>     // We may not even need a shift if the range fits into 32bit:
>     const uint64_t UnscaledClassSpaceMax = (uint64_t(max_juint) + 1);
>     if (range < UnscaledClassSpaceMax) {
>       shift = 0;
>     } else {
>       shift = LogClassAlignmentInBytes;
>     }
>
> ... which means that we end up with zero base, shifted compressed
> class pointers, *despite the fact* that we carefully chose a
> nicely-aligned compressed class base we could encode efficiently.
>
> I guess this code above is really optimized for x86; it certainly
> seems to prefer shifts to offsets, which makes sense on that part. It
> doesn't make much difference (if any) for AArch64, I admit, but it is
> odd.
>
>
I understand.

First off, this patch is supposed to be an almost clean code reshuffle. It
was not the intent of this patch to change functionality, at least not by
much, since the patch is complicated enough as it is. Its only aim is to
improve maintainability, so that we can improve the code more easily in
the future. So, compressed class pointer encoding should work as it did
before, modulo those little details about CompressedKlassPointers::range.

That said, I agree with you, this is not optimal. There are other ways to
improve matters too: for example, we miss opportunities to go zero-based
if the heap is very large, since the class space is always allocated
behind the heap.

Since the intent of this code is to give platforms greater leeway to do
their thing without disturbing shared code, maybe we should make
CompressedKlassPointers::initialize() platform dependent too? Or add a
hook at the end of it, allowing platforms to override the default behavior.
If aarch64 prefers shift=0 and base=<nice base address>, it could do so.
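
As a rough sketch of what such a hook could look like (purely
hypothetical: platform_adjust_encoding, the KlassEncoding struct and the
"4G-aligned means nice" rule are assumptions for illustration, nothing
like this exists in the patch):

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t G = 1024ULL * 1024 * 1024;

struct KlassEncoding {
  uint64_t base;
  int      shift;
};

// Hypothetical hook, not an existing HotSpot function. It would be called
// at the end of initialization with the reserved range [addr, addr + len)
// and the encoding the shared code picked; a platform may return
// something different.
KlassEncoding platform_adjust_encoding(uint64_t addr, uint64_t len,
                                       KlassEncoding shared_default) {
  assert(len > 0);
  const bool fits_unscaled = len <= 4 * G;           // offsets fit in 32 bits
  const bool nice_base     = (addr % (4 * G)) == 0;  // assumed: 4G-aligned
  if (fits_unscaled && nice_base) {
    return KlassEncoding{addr, 0};  // base=<nice base address>, shift=0
  }
  return shared_default;            // otherwise keep the shared decision
}
```

With a 1G class space at 8G and a shared default of base=0, shift=3, this
sketch would hand back base=8G, shift=0; for a range that does not start
on a 4G boundary it leaves the shared default untouched.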

..Thomas


-- 
> Andrew Haley  (he/him)
> Java Platform Lead Engineer
> Red Hat UK Ltd. <https://www.redhat.com>
> https://keybase.io/andrewhaley
> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671
>
>

From thomas.stuefe at gmail.com  Wed Apr 29 16:16:24 2020
From: thomas.stuefe at gmail.com (=?UTF-8?Q?Thomas_St=C3=BCfe?=)
Date: Wed, 29 Apr 2020 18:16:24 +0200
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
 storage reservation
In-Reply-To: <857dxyg2dl.fsf@arm.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
 <857dxyg2dl.fsf@arm.com>
Message-ID: <CAA-vtUxguP9_1QF7wEn7qAhQF6UZ9srP0krY0_mFHq=yjrJ-og@mail.gmail.com>

Thank you Nick. You are of course right, I should use named,
self-describing constants instead. I will change that.

..Thomas

On Wed, Apr 29, 2020 at 10:14 AM Nick Gasson <nick.gasson at arm.com> wrote:

> Hi Thomas,
>
> On 04/28/20 22:54 pm, Thomas Stüfe wrote:
> >
> > These are two new methods:
> >
> > - bool CompressedKlassPointers::is_valid_base(address p)
> >
> > to let the platform tell you whether it considers p to be a valid
> > encoding base. The only platform having these restrictions currently is
> > aarch64.
> >
> > - ReservedSpace
> > Metaspace::reserve_address_space_for_compressed_classes(size_t size);
> >
> > this hands over the process of allocating a range suitable for compressed
> > class pointer encoding to the platform. Most platforms will allocate just
> > anywhere, but some platforms may have a better strategy (e.g. trying low
> > memory first, trying only correctly aligned addresses and so on).
> >
> > Beforehand, this coding existed in a similar form in metaspace.cpp for
> > aarch64 and AIX. For now, I left the AIX part out - it seems only half
> > done, and I want to check further if we even need it, if yes why not on
> > Linux ppc, and C1 does not seem to support anything other than
> > base+offset with shift either, but I may be mistaken.
>
> Just a small comment:
>
>   33 bool CompressedKlassPointers::is_valid_base(address p) {
>   34
>   35   // Below 32G, base must be aligned to 4G.
>   36   // Above that point, base must be aligned to 32G
>   37
>   38   if (p < (address)(32 * G)) {
>   39     return is_aligned(p, 4 * G);
>   40   }
>   41
>   42   return is_aligned(p, 32 * G);
>   43
>   44 }
>
> On line 42 I'd prefer to use (4 << LogKlassAlignmentInBytes)*G as it
> currently is in metaspace.cpp instead of the literal 32. This makes the
> relationship with the compressed class decode logic a bit clearer as the
> restriction comes from the MOV and MOVK instructions we use to
> decompress the pointer: we have to ensure the bits of the base and bits
> of the offset after shifting do not overlap. Similarly for the
> `increment` field in metaspace_aarch64.cpp line 51.
>
>
> Thanks,
> Nick
>
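
A minimal sketch of the named-constant variant suggested in the review
comment above. It is not the patch itself: address is modeled as a plain
uint64_t, is_aligned is a local stand-in, the constant names
kLowBaseAlignment and kHighBaseAlignment are made up, and
LogKlassAlignmentInBytes = 3 is an assumption.

```cpp
#include <cstdint>

constexpr uint64_t G = 1024ULL * 1024 * 1024;

// Assumed value: Klass structures aligned to 8 bytes.
constexpr int LogKlassAlignmentInBytes = 3;

// Below the shifted encoding range the base only has to clear the low 32
// bits; above it, it must also clear the bits the shifted offset can reach.
constexpr uint64_t kLowBaseAlignment  = 4 * G;
constexpr uint64_t kHighBaseAlignment = (4ULL << LogKlassAlignmentInBytes) * G;

static_assert(kHighBaseAlignment == 32 * G, "32G with a shift of 3");

// Local stand-in for the VM's alignment check.
constexpr bool is_aligned(uint64_t p, uint64_t alignment) {
  return (p % alignment) == 0;
}

bool is_valid_base(uint64_t p) {
  if (p < kHighBaseAlignment) {
    return is_aligned(p, kLowBaseAlignment);
  }
  return is_aligned(p, kHighBaseAlignment);
}
```

Written this way, the 32G figure is visibly derived from the shift rather
than appearing as a bare literal.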

From zgu at redhat.com  Wed Apr 29 19:09:11 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Wed, 29 Apr 2020 15:09:11 -0400
Subject: [aarch64-port-dev ] [15] RFR 8241793: Shenandoah: Enable concurrent
 class unloading for aarch64
Message-ID: <4f453020-1f29-30a4-9e45-33854451bd3d@redhat.com>

Concurrent class unloading support for aarch64 [1] has been pushed, 
let's enable it for Shenandoah GC.

Bug: https://bugs.openjdk.java.net/browse/JDK-8241793
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8241793/webrev.00/

Test:
   hotspot_gc_shenandoah
   tier1 with Shenandoah GC

Thanks,

-Zhengyu

[1] https://bugs.openjdk.java.net/browse/JDK-8216557


From xxinliu at amazon.com  Wed Apr 29 23:43:13 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Wed, 29 Apr 2020 23:43:13 +0000
Subject: [aarch64-port-dev ] FW: [Mach5]
 mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <858sifgbt3.fsf@arm.com>
References: <C769AA2F-55A8-4986-9EEE-0AD93FB93E00@amazon.com>
 <E6D13CFA-A9E6-4262-8D33-2FB4412413B6@amazon.com> <858sifgbt3.fsf@arm.com>
Message-ID: <57BB18E5-1F35-45AC-88BF-8D8C6A0767E3@amazon.com>

Hi, Nick,

Thanks for taking a look at this issue.  I figured out why.

There are two separate issues.
1. My toolchain was too old. That's why I couldn't reproduce the build failures reported by the submit repo.

The OpenJDK wiki says jdk13 on Linux x86_64 supports gcc 8.2, so I believe it means the build hosts must use gcc 8.2+.
My WIP patch does have a C++ issue that g++ 8 and later catch.

2.  I can't use AdoptOpenJDK's aarch64 jdk14 as the boot JDK.  Thanks for helping me to understand the problem.
The page size of my aarch64 host is 4k.  I don't have access to AdoptOpenJDK's build hosts. I have filed an issue with AdoptOpenJDK: https://github.com/AdoptOpenJDK/openjdk14-binaries/issues/1
 
One trick here. It's very easy to cheat configure by hacking the boot-jdk.m4 to "$HEAD -n 2". Everything looks fine then. 
 
Thanks,
--lx

On 4/28/20, 3:14 AM, "hotspot-dev on behalf of Nick Gasson" <hotspot-dev-bounces at openjdk.java.net on behalf of nick.gasson at arm.com> wrote:


    >
    > My nightly buildbot started failing recently on aarch64.
    > One issue is the error message of [cds] prevents configure from bootjdk determining.
    > https://hg.openjdk.java.net/jdk/jdk/file/1b8f9e72b22b/make/autoconf/boot-jdk.m4#l77
    >
    > ./bin/java --version
    > [0.006s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096
    > openjdk 14.0.1 2020-04-14
    > OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
    > OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode)
    >
    > Are you aware of this issue? I can't see [cds] line on my linux/x86_64 host.
    > Maybe it's aarch64-only.  CC aarch-port-dev.
    >

    Was your boot JDK built on a machine configured with a different page
    size to your current machine? Looks like the CDS archive was dumped on a
    machine with 64k pages but you're running with 4k pages. There's a JBS
    issue for this:

    https://bugs.openjdk.java.net/browse/JDK-8236847


    Thanks,
    Nick


From nick.gasson at arm.com  Thu Apr 30 05:55:16 2020
From: nick.gasson at arm.com (Nick Gasson)
Date: Thu, 30 Apr 2020 13:55:16 +0800
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
 storage reservation
In-Reply-To: <25829763-7485-5208-58e8-1d51f7068816@redhat.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
 <857dxyg2dl.fsf@arm.com> <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com>
 <854kt2fzec.fsf@arm.com> <25829763-7485-5208-58e8-1d51f7068816@redhat.com>
Message-ID: <85368lfrgb.fsf@arm.com>

On 04/29/20 20:56 pm, Andrew Haley wrote:
>
> That's right: I'm looking at AOT-compiled code (after applying your
> patch) and by default it uses a shift of 3, no offset. If I then run
> the AOT-compiled code with -Xmx31G I get:
>
> #
> # A fatal error has been detected by the Java Runtime Environment:
> #
> #  SIGSEGV (0xb) at pc=0x0000ffffa142bd3c, pid=9965, tid=10174
> #
> # JRE version:  (15.0) (slowdebug build )
> # Java VM: OpenJDK 64-Bit Server VM (slowdebug 15-internal+0-adhoc.aph.jdk-tmp, mixed mode, aot, tiered, compressed oops, g1 gc, linux-aarch64)
> # Problematic frame:
> # A 388  java.lang.Thread.setPriority(I)V java.base (56 bytes) @ 0x0000ffffa142bd3c [0x0000ffffa142bac0+0x000000000000027c]
>
>    0x0000ffffa142bd30 <+624>:	ldr	w1, [x4, #56]
>    0x0000ffffa142bd34 <+628>:	cbz	w1, 0xffffa142bd84 <java.lang.Thread.setPriority(I)V+708>
>    0x0000ffffa142bd38 <+632>:	lsl	x1, x1, #3
>    0x0000ffffa142bd3c <+636>:	ldr	w0, [x1, #12]
>
> ... so the AOT-compiled code is still trying to use the shift of 3,
> but it is not adding in the base, which is 0x1000000000. I guess this
> is pilot error, but I'm trying to understand what gets checked and
> when.

No, this looks like a real bug: jaotc is using the value of the heap base
in the VM where jaotc is run to decide whether to emit the add or
not. If you run `jaotc -J-Xmx31g` it works.

I'm not very familiar with Graal but I believe this fixes it:

--- a/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.aarch64/src/org/graalvm/compiler/hotspot/aarch64/AArch64HotSpotMove.java
+++ b/src/jdk.internal.vm.compiler/share/classes/org.graalvm.compiler.hotspot.aarch64/src/org/graalvm/compiler/hotspot/aarch64/AArch64HotSpotMove.java
@@ -139,8 +139,9 @@ public class AArch64HotSpotMove {
             Register resultRegister = asRegister(result);
             Register ptr = asRegister(input);
             Register base = (isRegister(baseRegister) ? asRegister(baseRegister) : zr);
+            boolean pic = GeneratePIC.getValue(crb.getOptions());
             // result = (ptr - base) >> shift
-            if (!encoding.hasBase()) {
+            if (!pic && !encoding.hasBase()) {
                 if (encoding.hasShift()) {
                     masm.lshr(64, resultRegister, ptr, encoding.getShift());
                 } else {
@@ -189,7 +190,8 @@ public class AArch64HotSpotMove {
         public void emitCode(CompilationResultBuilder crb, AArch64MacroAssembler masm) {
             Register inputRegister = asRegister(input);
             Register resultRegister = asRegister(result);
-            Register base = encoding.hasBase() ? asRegister(baseRegister) : null;
+            boolean pic = GeneratePIC.getValue(crb.getOptions());
+            Register base = pic || encoding.hasBase() ? asRegister(baseRegister) : null;
             emitUncompressCode(masm, inputRegister, resultRegister, base, encoding.getShift(), nonNull);
         }


I've made a JBS issue to track this:
https://bugs.openjdk.java.net/browse/JDK-8244164
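
The failure mode can be modeled in a few lines. This is only a sketch of
the reasoning, not Graal code: OopEncoding, decode_baked and decode_pic
are hypothetical names, and the base value is the one from the crash
report above.

```cpp
#include <cstdint>

// Encoding parameters as seen by one VM instance.
struct OopEncoding {
  uint64_t base;
  int      shift;
};

// Models code from a compiler that looked at the encoding of the VM it was
// running in (compile_time) and dropped the add because that base was zero.
// The decision is baked in, so the run-time base is ignored in that case.
uint64_t decode_baked(uint32_t narrow, OopEncoding compile_time,
                      OopEncoding run_time) {
  const bool has_base = compile_time.base != 0;
  const uint64_t shifted = static_cast<uint64_t>(narrow) << run_time.shift;
  return has_base ? run_time.base + shifted : shifted;
}

// Models position-independent code: the base is always loaded at run time
// and added, whatever the compiling VM happened to use.
uint64_t decode_pic(uint32_t narrow, OopEncoding run_time) {
  return run_time.base + (static_cast<uint64_t>(narrow) << run_time.shift);
}
```

Compiling with base 0 and running with base 0x1000000000 makes
decode_baked return an address that is short by exactly the base, which
is consistent with the faulting load in the report. When both VMs agree
on the encoding the two functions return the same address.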

>
>> But for compressed class pointers we never need to address more than
>> 4G so maybe it's better to use 0 shift instead of logKlassAlignment
>> above? With CDS the default shared base address is 0x80000000 which
>> doesn't allow a zero base anyway.
>
> Maybe. What actually happens when we decode compressed class pointers
> in AOT-compiled code is:
>
> Load the klass pointer from an Object:
>
>   532440:       b940082a        ldr     w10, [x1,#8]
>
> Load the compressed class base:
>
>   532444:       90055e68        adrp    x8, b0fe000 <A.meta.got+0x16e000>
>   532448:       9104c108        add     x8, x8, #0x130
>   53244c:       f9400108        ldr     x8, [x8]
>
> Shift and add:
>
>   532450:       8b2a6d0a        add     x10, x8, x10, uxtx #3
>
> ... none of which is very nice, but the expensive part is loading the
> compressed class base and doing the add, so I guess we don't care
> that there is a shift as well.

Yes but if we can avoid the shift here then CDS can also use zero shift
by default. Which avoids the problem of having compressed class pointers
with both shift and base non-zero in the Hotspot-generated code.
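
A small sketch of why zero shift is attractive here, under the assumption
that the base is 4G-aligned (decode_add and decode_or are illustrative
names only): with shift 0 the narrow value is a plain 32-bit offset, so
the base bits and the offset bits cannot overlap and the add degenerates
into inserting the high bits, which is the MOVK-style decode mentioned
earlier in the thread.

```cpp
#include <cassert>
#include <cstdint>

constexpr uint64_t G = 1024ULL * 1024 * 1024;

// Decode with shift = 0: the narrow value is a plain 32-bit offset.
uint64_t decode_add(uint64_t base, uint32_t narrow) {
  return base + narrow;
}

// The same decode expressed as a bit insertion. Only valid if the base has
// no bits set in the low 32 positions, i.e. it is 4G-aligned.
uint64_t decode_or(uint64_t base, uint32_t narrow) {
  assert((base & 0xFFFFFFFFULL) == 0);
  return base | narrow;
}
```

A 32-bit offset with zero shift still spans 4G, which covers the class
space sizes discussed above.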


Thanks,
Nick

From Pengfei.Li at arm.com  Thu Apr 30 06:05:30 2020
From: Pengfei.Li at arm.com (Pengfei Li)
Date: Thu, 30 Apr 2020 06:05:30 +0000
Subject: [aarch64-port-dev ] RFR(XS): Provide information when hitting a
 HaltNode for architectures other than x86
In-Reply-To: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com>
References: <92E14A43-E260-49D5-BF74-CB6331A2EB33@amazon.com>
Message-ID: <DB8PR08MB4969A585B39B6E50BD4272B096AA0@DB8PR08MB4969.eurprd08.prod.outlook.com>

Hi Xin,

> I tested on aarch64.  It generates the same crash report as x86_64 when it
> does hit HaltNode.  Halt reason is displayed. I paste report on the JBS.
> I ran hotspot:tier1 on aarch64 fastdebug build.  It passed except for 3
> relevant failures[1].

(NOT a reviewer) The original instruction used should be dcps1 instead of dpcs1 - there's a misspelling in AArch64 assembler. Could you add a trivial fix to change dpcs1/2/3 to dcps1/2/3?

BTW, how did you test to hit the HaltNode?

--
Thanks,
Pengfei


From xxinliu at amazon.com  Thu Apr 30 06:35:54 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Thu, 30 Apr 2020 06:35:54 +0000
Subject: [aarch64-port-dev ] RFR(XS): 8230552: Provide information when
 hitting a HaltNode for architectures other than x86
Message-ID: <19BC4D2D-56F3-45BE-898C-1389469A7B36@amazon.com>


On 4/29/20, 11:06 PM, "Pengfei Li" <Pengfei.Li at arm.com> wrote:


    Hi Xin,

    > I tested on aarch64.  It generates the same crash report as x86_64 when it
    > does hit HaltNode.  Halt reason is displayed. I paste report on the JBS.
    > I ran hotspot:tier1 on aarch64 fastdebug build.  It passed except for 3
    > relevant failures[1].

    (NOT a reviewer) The original instruction used should be dcps1 instead of dpcs1 - there's a misspelling in AArch64 assembler. Could you add a trivial fix to change dpcs1/2/3 to dcps1/2/3?

Oh, I didn't know that. I did search for dpcs and found nothing.
I've filed a new issue for the typo: JDK-8244170. Let's resolve it in a separate issue.

    BTW, how did you test to hit the HaltNode?
    --
    Thanks,
    Pengfei

I followed Christian's and Volker's recipes on JDK-8230552. Both of them can generate a HaltNode.
Volker's approach is very interesting. You have to give the program a couple of "-XX:SuppressErrorAt=" options to increase its tolerance.

Thanks,
--lx


From aph at redhat.com  Thu Apr 30 07:57:31 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 30 Apr 2020 08:57:31 +0100
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
 storage reservation
In-Reply-To: <CAA-vtUwKc57J=xinDrES0tES+8eYD+nTzjqaqFyxz3E3fzMtVg@mail.gmail.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
 <d6a94952-4d45-9fc3-2b45-f0ba46fda0c5@redhat.com>
 <CAA-vtUwKc57J=xinDrES0tES+8eYD+nTzjqaqFyxz3E3fzMtVg@mail.gmail.com>
Message-ID: <617e5a37-3282-fb9e-6904-a03bd1f71411@redhat.com>

On 4/29/20 5:14 PM, Thomas Stüfe wrote:
> Since the intent of this code is to give platforms greater leeway to do
> their thing without disturbing shared code, maybe we should make
> CompressedKlassPointers::initialize() platform dependent too? 

That would be very nice. If we can fix things so that we never shift,
then a lot of things become easier.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Thu Apr 30 07:57:43 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 30 Apr 2020 08:57:43 +0100
Subject: [aarch64-port-dev ] RFR(M): 8243392: Remodel CDS/Metaspace
 storage reservation
In-Reply-To: <85368lfrgb.fsf@arm.com>
References: <CAA-vtUyVki-DHepyAzORoTK_MwnrezFUB+QsxbWszztsx1x7NA@mail.gmail.com>
 <857dxyg2dl.fsf@arm.com> <463e11c5-8c0d-3dad-060b-2a3d9e80fd40@redhat.com>
 <854kt2fzec.fsf@arm.com> <25829763-7485-5208-58e8-1d51f7068816@redhat.com>
 <85368lfrgb.fsf@arm.com>
Message-ID: <45cf5ef0-3939-38df-7959-0e19d79d9e39@redhat.com>

On 4/30/20 6:55 AM, Nick Gasson wrote:
> Yes but if we can avoid the shift here then CDS can also use zero shift
> by default. Which avoids the problem of having compressed class pointers
> with both shift and base non-zero in the Hotspot-generated code.

That sounds good.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From aph at redhat.com  Thu Apr 30 08:18:51 2020
From: aph at redhat.com (Andrew Haley)
Date: Thu, 30 Apr 2020 09:18:51 +0100
Subject: [aarch64-port-dev ] FW: [Mach5]
 mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <57BB18E5-1F35-45AC-88BF-8D8C6A0767E3@amazon.com>
References: <C769AA2F-55A8-4986-9EEE-0AD93FB93E00@amazon.com>
 <E6D13CFA-A9E6-4262-8D33-2FB4412413B6@amazon.com> <858sifgbt3.fsf@arm.com>
 <57BB18E5-1F35-45AC-88BF-8D8C6A0767E3@amazon.com>
Message-ID: <56d85fb9-6ce8-7c00-ff4f-bf3fe40718a4@redhat.com>

On 4/30/20 12:43 AM, Liu, Xin wrote:
> One trick here. It's very easy to cheat configure by hacking the boot-jdk.m4 to "$HEAD -n 2". Everything looks fine then. 

The fix should be submitted to build-dev.

-- 
Andrew Haley  (he/him)
Java Platform Lead Engineer
Red Hat UK Ltd. <https://www.redhat.com>
https://keybase.io/andrewhaley
EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From volker.simonis at gmail.com  Thu Apr 30 14:45:03 2020
From: volker.simonis at gmail.com (Volker Simonis)
Date: Thu, 30 Apr 2020 16:45:03 +0200
Subject: [aarch64-port-dev ] RFR(XS): 8230552: Provide information when
 hitting a HaltNode for architectures other than x86
In-Reply-To: <19BC4D2D-56F3-45BE-898C-1389469A7B36@amazon.com>
References: <19BC4D2D-56F3-45BE-898C-1389469A7B36@amazon.com>
Message-ID: <CA+3eh13-vr8u=yDrYvSh-F8taHxFehe13DnrEPcVFFhhKUFc+A@mail.gmail.com>

Forwarding to ppc-aix and s390 port mailing lists with the kind request for
testing this simple fix on the corresponding platforms.

Thank you and best regards,
Volker


Liu, Xin <xxinliu at amazon.com> schrieb am Do., 30. Apr. 2020, 08:39:

>
>
> On 4/29/20, 11:06 PM, "Pengfei Li" <Pengfei.Li at arm.com> wrote:
>
>
>
>     Hi Xin,
>
>     > I tested on aarch64.  It generates the same crash report as x86_64 when it
>     > does hit HaltNode.  Halt reason is displayed. I paste report on the JBS.
>     > I ran hotspot:tier1 on aarch64 fastdebug build.  It passed except for 3
>     > relevant failures[1].
>
>     (NOT a reviewer) The original instruction used should be dcps1 instead of dpcs1 - there's a misspelling in AArch64 assembler. Could you add a trivial fix to change dpcs1/2/3 to dcps1/2/3?
>
> Oh, I don't know that. I did search dpcs and found nothing.
> I've filed a new issue about the typo thing:  JDK-8244170. Let's resolve
> it in separated issue.
>
>     BTW, how did you test to hit the HaltNode?
>     --
>     Thanks,
>     Pengfei
>
> I followed Christian and Volkers' recipe on JDK-8230552. Both of them can
> generate HaltNode.
> Volker's approach is very interesting. You have to give program a couple
> of "-XX:SuppressErrorAt=" to increase tolerance.
>
> Thanks,
> --lx
>
>
>

From xxinliu at amazon.com  Thu Apr 30 21:48:07 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Thu, 30 Apr 2020 21:48:07 +0000
Subject: [aarch64-port-dev ] FW: [Mach5]
 mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <56d85fb9-6ce8-7c00-ff4f-bf3fe40718a4@redhat.com>
References: <C769AA2F-55A8-4986-9EEE-0AD93FB93E00@amazon.com>
 <E6D13CFA-A9E6-4262-8D33-2FB4412413B6@amazon.com> <858sifgbt3.fsf@arm.com>
 <57BB18E5-1F35-45AC-88BF-8D8C6A0767E3@amazon.com>
 <56d85fb9-6ce8-7c00-ff4f-bf3fe40718a4@redhat.com>
Message-ID: <28A344AC-8FF4-49FB-96B9-6AC886C05930@amazon.com>

Hi, Andrew, 

That's a hack. A general solution should use grep or sed to capture the needed line instead of hardcoding the first or second line.
Okay, let me try to do that.

Thanks, 
--lx


On 4/30/20, 1:19 AM, "aph at redhat.com" <aph at redhat.com> wrote:


    On 4/30/20 12:43 AM, Liu, Xin wrote:
    > One trick here. It's very easy to cheat configure by hacking the boot-jdk.m4 to "$HEAD -n 2". Everything looks fine then.

    The fix should be submitted to build-dev.

    --
    Andrew Haley  (he/him)
    Java Platform Lead Engineer
    Red Hat UK Ltd. <https://www.redhat.com>
    https://keybase.io/andrewhaley
    EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671


From xxinliu at amazon.com  Thu Apr 30 23:48:05 2020
From: xxinliu at amazon.com (Liu, Xin)
Date: Thu, 30 Apr 2020 23:48:05 +0000
Subject: [aarch64-port-dev ] FW: [Mach5]
 mach5-one-phh-JDK-8151779-20200427-2151-10554367: FAILED
In-Reply-To: <28A344AC-8FF4-49FB-96B9-6AC886C05930@amazon.com>
References: <C769AA2F-55A8-4986-9EEE-0AD93FB93E00@amazon.com>
 <E6D13CFA-A9E6-4262-8D33-2FB4412413B6@amazon.com> <858sifgbt3.fsf@arm.com>
 <57BB18E5-1F35-45AC-88BF-8D8C6A0767E3@amazon.com>
 <56d85fb9-6ce8-7c00-ff4f-bf3fe40718a4@redhat.com>
 <28A344AC-8FF4-49FB-96B9-6AC886C05930@amazon.com>
Message-ID: <9725176C-5E5A-499B-8093-A5865C4AC443@amazon.com>

Hi, Andrew, 

How about this?  I can use awk to capture the version line from java -version.  There are 2 cases.

1) openjdk
openjdk version "14.0.1" 2020-04-14
2) oraclejdk
java 14.0.1 2020-04-14

If java somehow displays error/warning messages, awk can filter them out and capture the version line.
E.g.
$ ~/builds/jdk-14.0.1+7/bin/java -version
[0.009s][error][cds] Unable to map CDS archive -- os::vm_allocation_granularity() expected: 65536 actual: 4096
openjdk version "14.0.1" 2020-04-14
OpenJDK Runtime Environment AdoptOpenJDK (build 14.0.1+7)
OpenJDK 64-Bit Server VM AdoptOpenJDK (build 14.0.1+7, mixed mode)
$ ~/builds/jdk-14.0.1+7/bin/java -version 2>&1 | awk '/^(openjdk version|java)/ {print $0}'
openjdk version "14.0.1" 2020-04-14

I think this awk statement is portable, but it's always good to ask experts to review it, so I have cc'd build-dev.
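
To spell out what the pattern keeps and drops, here is a small C++ model
of the same filter. It is only an illustration of the matching rule, not
part of the build change; filter_version_lines and starts_with are
made-up names.

```cpp
#include <string>
#include <vector>

// True if line starts with prefix.
static bool starts_with(const std::string& line, const std::string& prefix) {
  return line.compare(0, prefix.size(), prefix) == 0;
}

// Keeps the lines that the awk pattern /^(openjdk version|java)/ prints.
std::vector<std::string> filter_version_lines(
    const std::vector<std::string>& lines) {
  std::vector<std::string> kept;
  for (const std::string& line : lines) {
    if (starts_with(line, "openjdk version") || starts_with(line, "java")) {
      kept.push_back(line);
    }
  }
  return kept;
}
```

Fed the four output lines above, the model keeps only the openjdk version
line. Two things may be worth checking in review: any other line that
begins with "java" is kept as well, and a "Picked up ..." line is dropped,
so the later test for "Picked up" in BOOT_JDK_VERSION would presumably no
longer see it.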
 
Here is the change.

diff --git a/make/autoconf/boot-jdk.m4 b/make/autoconf/boot-jdk.m4
--- a/make/autoconf/boot-jdk.m4
+++ b/make/autoconf/boot-jdk.m4
@@ -74,7 +74,7 @@
           BOOT_JDK_FOUND=no
         else
           # Oh, this is looking good! We probably have found a proper JDK. Is it the correct version?
-          BOOT_JDK_VERSION=`"$BOOT_JDK/bin/java$EXE_SUFFIX" $USER_BOOT_JDK_OPTIONS -version 2>&1 | $HEAD -n 1`
+          BOOT_JDK_VERSION=`"$BOOT_JDK/bin/java$EXE_SUFFIX" $USER_BOOT_JDK_OPTIONS -version 2>&1 | $AWK '/^(openjdk version|java)/ {print [$]0}'`
           if [ [[ "$BOOT_JDK_VERSION" =~ "Picked up" ]] ]; then
             AC_MSG_NOTICE([You have _JAVA_OPTIONS or JAVA_TOOL_OPTIONS set. This can mess up the build. Please use --with-boot-jdk-jvmargs instead.])
             AC_MSG_NOTICE([Java reports: "$BOOT_JDK_VERSION".])
@@ -529,7 +529,7 @@
         BUILD_JDK_FOUND=no
       else
         # Oh, this is looking good! We probably have found a proper JDK. Is it the correct version?
-        BUILD_JDK_VERSION=`"$BUILD_JDK/bin/java" -version 2>&1 | $HEAD -n 1`
+        BUILD_JDK_VERSION=`"$BUILD_JDK/bin/java" -version 2>&1 | $AWK '/^(openjdk version|java)/ {print [$]0}'`

         # Extra M4 quote needed to protect [] in grep expression.
         [FOUND_CORRECT_VERSION=`echo $BUILD_JDK_VERSION | $EGREP "\"$VERSION_FEATURE([\.+-].*)?\""`]


On 4/30/20, 2:52 PM, "aarch64-port-dev on behalf of Liu, Xin" <aarch64-port-dev-bounces at openjdk.java.net on behalf of xxinliu at amazon.com> wrote:

    Hi, Andrew, 

    That's a hack. A general way should use grep or sed to capture the needed line instead of hardcoding first or second line. 
    Okay, Let me try to do that. 

    Thanks, 
    --lx


    On 4/30/20, 1:19 AM, "aph at redhat.com" <aph at redhat.com> wrote:


        On 4/30/20 12:43 AM, Liu, Xin wrote:
        > One trick here. It's very easy to cheat configure by hacking the boot-jdk.m4 to "$HEAD -n 2". Everything looks fine then.

        The fix should be submitted to build-dev.

        --
        Andrew Haley  (he/him)
        Java Platform Lead Engineer
        Red Hat UK Ltd. <https://www.redhat.com>
        https://keybase.io/andrewhaley
        EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671