From zgu at redhat.com Mon Jan 6 19:14:05 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 6 Jan 2020 14:14:05 -0500 Subject: [15] RFR(XS) 8236681: Shenandoah: Disable concurrent class unloading flag if no class unloading for the GC cycle Message-ID: Please review this small patch that disables concurrent class unloading if there is no class unloading for the particular GC cycle. This is not a fatal error, but can confuse the verifier. Bug: https://bugs.openjdk.java.net/browse/JDK-8236681 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236681/webrev.00/ Test: hotspot_gc_shenandoah (fastdebug and release) Thanks, -Zhengyu From rkennke at redhat.com Tue Jan 7 07:34:37 2020 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 7 Jan 2020 08:34:37 +0100 Subject: [15] RFR(XS) 8236681: Shenandoah: Disable concurrent class unloading flag if no class unloading for the GC cycle In-Reply-To: References: Message-ID: <14e0f3c6-6e54-67c3-7229-0c7b475badf0@redhat.com> Ok. Thanks! Roman > Please review this small patch that disables concurrent class unloading > if there is no class unloading for the particular GC cycle. > > This is not a fatal error, but can confuse the verifier. > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8236681 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236681/webrev.00/ > Test: > hotspot_gc_shenandoah (fastdebug and release) > > > Thanks, > > -Zhengyu > From stefan.johansson at oracle.com Tue Jan 7 09:10:44 2020 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Tue, 7 Jan 2020 10:10:44 +0100 Subject: RFR[14]: 8235751: Assertion when triggering concurrent cycle during shutdown In-Reply-To: <695EF54D-675F-4162-8518-115CC9F63F8D@oracle.com> References: <695EF54D-675F-4162-8518-115CC9F63F8D@oracle.com> Message-ID: <23ec1be7-9619-d105-f05c-29603840839a@oracle.com> Hi Kim, On 2019-12-31 04:01, Kim Barrett wrote: > Please review this change to G1's handling of requests to initiate > concurrent marking.
> > When such a request is made during shutdown processing, after the cm > thread has been stopped, the request to initiate concurrent marking is > ignored. This could lead to an assertion failure for user requested > GCs (System.gc and via agent) by a thread that has not yet been > brought to a halt, because the possibility of such a request being > ignored was missed when the assertion was recently added by JDK-8232588. > > We now report to the GC-invoking thread when initiation of concurrent > marking has been suppressed because termination of the cm thread has > been requested. In that case the GC invocation is considered finished. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8235751 > > Webrev: > https://cr.openjdk.java.net/~kbarrett/8235751/open.00/ > Looks good, StefanJ > Testing: > mach5 tier1-5 > > Locally (linux-x64) reproduced fairly quickly the failure using the > approach described in the CR; after applying the proposed chage, > failed to reproduce. > From thomas.schatzl at oracle.com Tue Jan 7 10:41:38 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 7 Jan 2020 11:41:38 +0100 Subject: [14] RFR (S): 8235934: gc/g1/TestGCLogMessages.java fails with 'DerivedPointerTable Update' found In-Reply-To: References: <41e652e0-b843-e2da-c196-37b9b327d4aa@oracle.com> Message-ID: <0635ecc6-a4e9-0d34-d320-002ff148ca1a@oracle.com> Hi Kim, thanks for your review. On 31.12.19 08:09, Kim Barrett wrote: >> On Dec 17, 2019, at 4:27 AM, Thomas Schatzl wrote: >> >> Hi all, >> >> can I have reviews for this testbug where there is a mismatch between "C2 compiler is enabled" and "C2 compiler is compiled in" in verifying output messages. >> >> I.e. G1 prints some additional log messages if the C2 compiler is compiled in, but the test checks this message for (non-)existence if the C2 compiler is enabled. >> >> Since there are a few flags that can toggle compiler use even when compiled in (UseCompiler, TieredStopAtLevel<=3, ...) 
the GC prints that message but the test does not expect it. >> >> The fix is to add a whitebox method that specifically returns whether the C2 compiler is compiled in or not, to be used by the test. >> >> I would like to push this to 14 even if it is P4 because of the test bug exemption, returning unnecessary reproducible errors. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8235934 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8235934/webrev/ >> Testing: >> hs-tier1-3, local runs of TestGCLogMessages.java >> >> Thanks, >> Thomas > > ------------------------------------------------------------------------------ > src/hotspot/share/prims/whitebox.cpp > 1990 #if COMPILER2_OR_JVMCI > 1991 return true; > 1992 #else > 1993 return false; > 1994 #endif > > This could perhaps be just > > return bool(COMPILER2_OR_JVMCI); > > That will fail to compile if COMPILER2_OR_JVMCI is not defined at all; > not sure whether that's a pro or con for this alternative form. I do not have an opinion, so kept it as is. > ------------------------------------------------------------------------------ > src/hotspot/share/prims/whitebox.cpp > 1989 WB_ENTRY(jboolean, WB_isC2OrGraalIncludedInVmBuild(JNIEnv* env)) > > I think the name ought to use "Jvmci" rather than "Graal". > > ------------------------------------------------------------------------------ > Fixed. http://cr.openjdk.java.net/~tschatzl/8235934/webrev.0_to_1 (diff) http://cr.openjdk.java.net/~tschatzl/8235934/webrev.1 (full) Tested locally.
Thanks, Thomas From thomas.schatzl at oracle.com Tue Jan 7 10:55:52 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 7 Jan 2020 11:55:52 +0100 Subject: RFR (M): 8235860: Obsolete the UseParallelOldGC option In-Reply-To: <56A9296B-089E-4A00-9C43-5E3CBDF4A29B@oracle.com> References: <292ab94f-f2c8-b373-d5a5-46a45470540e@oracle.com> <2A4B1955-26D5-4544-B476-6E9E5E8009D4@oracle.com> <5e21e50d-a026-98ba-d03d-3f7aa1c31e21@oracle.com> <56A9296B-089E-4A00-9C43-5E3CBDF4A29B@oracle.com> Message-ID: <48af99ac-9e7a-c112-800e-db13e3b3bbcb@oracle.com> Hi Kim, On 18.12.19 16:45, Kim Barrett wrote: > > >> On Dec 18, 2019, at 4:52 AM, Thomas Schatzl wrote: >> >> Fixed in >> http://cr.openjdk.java.net/~tschatzl/8235860/webrev.0_to_1 (diff) >> http://cr.openjdk.java.net/~tschatzl/8235860/webrev.1 (full) > > Looks good. > Thanks for your review. >> >>> ------------------------------------------------------------------------------ >>> src/hotspot/share/gc/parallel/psParallelCompact.hpp >>> Pre-existing: It seems like the big block comment before SplitInfo >>> should have received some updates as part of the recent shadow-region >>> patch, but it wasn't touched. >>> ------------------------------------------------------------------------------ >> >> I am filing a CR for that. > > The comment before PSParallelCompact in the same file might also need some updating. > > (I was a bit confused in my earlier review about where the relevant comments were.) > I filed JDK-8141637 before the holidays. I added your recent comment. 
Thanks, Thomas From thomas.schatzl at oracle.com Tue Jan 7 11:47:30 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 7 Jan 2020 12:47:30 +0100 Subject: RFR[14]: 8235751: Assertion when triggering concurrent cycle during shutdown In-Reply-To: <695EF54D-675F-4162-8518-115CC9F63F8D@oracle.com> References: <695EF54D-675F-4162-8518-115CC9F63F8D@oracle.com> Message-ID: Hi, On 31.12.19 04:01, Kim Barrett wrote: > Please review this change to G1's handling of requests to initiate > concurrent marking. > > When such a request is made during shutdown processing, after the cm > thread has been stopped, the request to initiate concurrent marking is > ignored. This could lead to an assertion failure for user requested > GCs (System.gc and via agent) by a thread that has not yet been > brought to a halt, because the possibility of such a request being > ignored was missed when the assertion was recently added by JDK-8232588. > > We now report to the GC-invoking thread when initiation of concurrent > marking has been suppressed because termination of the cm thread has > been requested. In that case the GC invocation is considered finished. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8235751 > > Webrev: > https://cr.openjdk.java.net/~kbarrett/8235751/open.00/ > > Testing: > mach5 tier1-5 > > Locally (linux-x64) reproduced fairly quickly the failure using the > approach described in the CR; after applying the proposed change, > failed to reproduce. > looks good. Thomas From maoliang.ml at alibaba-inc.com Tue Jan 7 16:33:25 2020 From: maoliang.ml at alibaba-inc.com (Liang Mao) Date: Wed, 08 Jan 2020 00:33:25 +0800 Subject: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics Message-ID: Hi Thomas, As we previously discussed, I use the concurrent heap uncommit/commit mechanism to implement the SoftMaxHeapSize for G1.
It is also for the further implementation of G1ElasticHeap for ergonomic change of heap size. In the previous 8u implementation, we had some limitations which are all removed now in this patch. The concurrent uncommit/commit can also work with some scenarios for immediate heap expansion. Here is the webrev link: http://cr.openjdk.java.net/~luchsh/8236073.webrev/ We still have some questions. 1. Does the SoftMaxHeapSize limitation need to consider the GC time ratio as in expand_heap_after_young_collection? Now we haven't put the logic in yet. 2. The concurrent uncommit/commit can only work for G1RegionsLargerThanCommitSizeMapper but not G1RegionsSmallerThanCommitSizeMapper, which might need some locks to ensure the multi-thread synchronization issue (heap may expand immediately). I think bringing in the lock synchronization may not be worth it for the little gain. Another idea is: can we just not uncommit the pages of auxiliary data if in G1RegionsSmallerThanCommitSizeMapper? Heap regions should not be G1RegionsSmallerThanCommitSizeMapper most of the time I guess... Looking forward to your advice:) Thanks, Liang ------------------------------------------------------------------ From:MAO, Liang Send Time:2019 Oct. 14 (Mon.) 11:52 To:Thomas Schatzl ; hotspot-gc-dev Subject:Re: G1 patch of elastic Java heap Hi Thomas, Thank you for the recognition:) Since we both agree on some clear specific points, I will try to extract them from current implementation and create a patch in OpenJDK upstream branch so we can continue discussion on the code level. Thanks, Liang ------------------------------------------------------------------ From:Thomas Schatzl Send Time:2019 Oct. 12 (Sat.) 23:00 To:"MAO, Liang" ; hotspot-gc-dev Subject:Re: G1 patch of elastic Java heap Hi, On Sat, 2019-10-12 at 19:51 +0800, Liang Mao wrote: > Hi Thomas, > > The manual generation limit can be put aside currently since we know > it might not be so general for a GC.
We can focus on how to change > heap size and return memory in runtime first. > > GCTimeRatio is a good metric to measure the health of a Java > application and I have considered to use that. But finally I chose > a simple way just like the periodic old GC. Guarantee a long > enough young GC interval is an alternative way to make sure the > GCTimeRatio at a healthy state. > I'm absolutely ok to use GCTimeRatio instead of the fixed young GC > interval. This part is same to ZGC or Shenandoah for how to balance > the desired memory size and GC frequency. I'm open to any good > solution and we are already in the same page for this issue > I think:) +1 > A big difference of our implementation is evaluating heap resizing in > any young GC instead of a concurrent gc cycle which I think is > swifter and more immediate. The concurrent map/unmap > mechanism gets rid of the additional pause time. My thought is the > heap shrink/expand can be all determined in young GC pause and > performed in concurrent thread which could exclude the > considerable time cost by OS interface. Most of our Java users are > intolerant to those pause spikes caused by page faults which can be up > to seconds. And we also found the issue of time cost by map/unmap in > ZGC. > > A direct advantage of the young GC resizing and concurrent memory > free mechanism is for implementing SoftMaxHeapSize. The heap size can > be changed after last mixed GC. The young GC won't have longer > pause and the memory can be freed concurrently without side effect. Agree and agree.
Both evaluating and giving back memory at any gc sounds nice, and doing that without incurring the costs in the pause is even better :) Thanks, Thomas From rkennke at redhat.com Tue Jan 7 20:26:14 2020 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 7 Jan 2020 21:26:14 +0100 Subject: RFR: 8236732: Shenandoah: Stricter placement for oom-evac scopes Message-ID: <2c07b4bc-70ce-5107-6c77-76a59c912ac6@redhat.com> I'm currently looking at a deadlock with the derby benchmark which involves oom-scopes and new concurrent-class-unloading. Currently, we have sprinkled OOM-evac scopes all over the place: - In the main evac-loop (of course) - In the LRB (of course) - In various places The latter is very questionable and has repeatedly led to problems in the past. The trouble was usually that some weird path would dive into evacuation with a GC worker, although the oom-scope was already held at an outer scope. It becomes really bad when locks are involved, e.g. the heap-lock, code-cache-lock and recently the per-nmethod locks. This is very deadlock-prone. The way out is to be very strict about where we place the oom-scopes. They should *only* be very close to SH::evacuate_object(), and they should *always* be the innermost scopes, inside any possible locks. Placement must be such that both conditions are rather obviously met. The biggest trouble here is Traversal GC: since it does *both* evacs and other stuff during traversal, it dives into LRB through various paths while GC threads hold the evac-scope. The solution is to only enter evac-scope very close to SH::evacuate_object() at the expense of doing it quite often during traversal. I prefer to have a clear way to do it though, instead of the mess that we currently have. Bug: https://bugs.openjdk.java.net/browse/JDK-8236732 Webrev: http://cr.openjdk.java.net/~rkennke/JDK-8236732/webrev.00/ Testing: hotspot_gc_shenandoah, the specjvm/derby benchmark that troubled me with deadlock is now looking clean Ok?
Roman From zgu at redhat.com Tue Jan 7 20:43:32 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 7 Jan 2020 15:43:32 -0500 Subject: RFR: 8236732: Shenandoah: Stricter placement for oom-evac scopes In-Reply-To: <2c07b4bc-70ce-5107-6c77-76a59c912ac6@redhat.com> References: <2c07b4bc-70ce-5107-6c77-76a59c912ac6@redhat.com> Message-ID: Okay. -Zhengyu On 1/7/20 3:26 PM, Roman Kennke wrote: > I'm currently looking at a deadlock with the derby benchmark which > involves oom-scopes and new concurrent-class-unloading. > > Currently, we have sprinkled OOM-evac scopes all over the place: > - In the main evac-loop (of course) > - In the LRB (of course) > - In various places > > The latter is very questionable and has repeatedly lead to problems in > the past. The trouble was usually that some weird path would dive into > evacuation with a GC worker, although the oom-scope was already held at > an outer scope. It becomes really bad when locks are involved, e.g. the > heap-lock, code-cache-lock and recently the per-nmethod locks. This is > very deadlock-prone. > > The way out is to be very strict about where we place the oom-scopes. > They should *only* be very close to SH::evacuate_object(), and they > should *always* be the innermost scopes, inside any possible locks. > Placement must be such that both conditions are rather obviously met. > > The biggest trouble here is Traversal GC: since it does *both* evacs and > other stuff during traversal, it dives into LRB through various paths > while GC threads holding the evac-scope. The solution is to only enter > evac-scope very closely to SH::evacuate_object() at the expense of doing > it quite often during traversal. I prefer to have a clear way to do it > though, instead of the mess that we currently have. 
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8236732 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8236732/webrev.00/ > > Testing: hotspot_gc_shenandoah, the specjvm/derby benchmark that > troubled me with deadlock is now looking clean > > Ok? > > Roman > From zgu at redhat.com Tue Jan 7 20:44:48 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 7 Jan 2020 15:44:48 -0500 Subject: RFR: 8236732: Shenandoah: Stricter placement for oom-evac scopes In-Reply-To: References: <2c07b4bc-70ce-5107-6c77-76a59c912ac6@redhat.com> Message-ID: <4e3cd3f1-6784-eec4-8584-82b400d40909@redhat.com> Need to update copyright years :-) -Zhengyu On 1/7/20 3:43 PM, Zhengyu Gu wrote: > Okay. > > -Zhengyu > > On 1/7/20 3:26 PM, Roman Kennke wrote: >> I'm currently looking at a deadlock with the derby benchmark which >> involves oom-scopes and new concurrent-class-unloading. >> >> Currently, we have sprinkled OOM-evac scopes all over the place: >> - In the main evac-loop (of course) >> - In the LRB (of course) >> - In various places >> >> The latter is very questionable and has repeatedly lead to problems in >> the past. The trouble was usually that some weird path would dive into >> evacuation with a GC worker, although the oom-scope was already held at >> an outer scope. It becomes really bad when locks are involved, e.g. the >> heap-lock, code-cache-lock and recently the per-nmethod locks. This is >> very deadlock-prone. >> >> The way out is to be very strict about where we place the oom-scopes. >> They should *only* be very close to SH::evacuate_object(), and they >> should *always* be the innermost scopes, inside any possible locks. >> Placement must be such that both conditions are rather obviously met. >> >> The biggest trouble here is Traversal GC: since it does *both* evacs and >> other stuff during traversal, it dives into LRB through various paths >> while GC threads holding the evac-scope. 
The solution is to only enter >> evac-scope very closely to SH::evacuate_object() at the expense of doing >> it quite often during traversal. I prefer to have a clear way to do it >> though, instead of the mess that we currently have. >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8236732 >> Webrev: >> http://cr.openjdk.java.net/~rkennke/JDK-8236732/webrev.00/ >> >> Testing: hotspot_gc_shenandoah, the specjvm/derby benchmark that >> troubled me with deadlock is now looking clean >> >> Ok? >> >> Roman >> From kim.barrett at oracle.com Tue Jan 7 22:40:38 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 7 Jan 2020 17:40:38 -0500 Subject: [14] RFR (S): 8235934: gc/g1/TestGCLogMessages.java fails with 'DerivedPointerTable Update' found In-Reply-To: <0635ecc6-a4e9-0d34-d320-002ff148ca1a@oracle.com> References: <41e652e0-b843-e2da-c196-37b9b327d4aa@oracle.com> <0635ecc6-a4e9-0d34-d320-002ff148ca1a@oracle.com> Message-ID: <87F643E8-16AF-4E1C-8272-3D8CC0813938@oracle.com> > On Jan 7, 2020, at 5:41 AM, Thomas Schatzl wrote: > > http://cr.openjdk.java.net/~tschatzl/8235934/webrev.0_to_1 (diff) > http://cr.openjdk.java.net/~tschatzl/8235934/webrev.1 (full) > > Tested locally. > > Thanks, > Thomas Looks good. From kim.barrett at oracle.com Tue Jan 7 23:22:08 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 7 Jan 2020 18:22:08 -0500 Subject: RFR[14]: 8235751: Assertion when triggering concurrent cycle during shutdown In-Reply-To: References: <695EF54D-675F-4162-8518-115CC9F63F8D@oracle.com> Message-ID: <55E3567C-1E40-4182-8243-80285583572D@oracle.com> > On Jan 7, 2020, at 6:47 AM, Thomas Schatzl wrote: > > Hi, > > On 31.12.19 04:01, Kim Barrett wrote: >> Please review this change to G1's handling of requests to initiate >> concurrent marking. >> When such a request is made during shutdown processing, after the cm >> thread has been stopped, the request to initiate concurrent marking is >> ignored. 
This could lead to an assertion failure for user requested >> GCs (System.gc and via agent) by a thread that has not yet been >> brought to a halt, because the possibility of such a request being >> ignored was missed when the assertion was recently added by JDK-8232588. >> We now report to the GC-invoking thread when initiation of concurrent >> marking has been suppressed because termination of the cm thread has >> been requested. In that case the GC invocation is considered finished. >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8235751 >> Webrev: >> https://cr.openjdk.java.net/~kbarrett/8235751/open.00/ >> Testing: >> mach5 tier1-5 >> Locally (linux-x64) reproduced fairly quickly the failure using the >> approach described in the CR; after applying the proposed chage, >> failed to reproduce. > > looks good. > > Thomas Thanks. From kim.barrett at oracle.com Tue Jan 7 23:22:18 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 7 Jan 2020 18:22:18 -0500 Subject: RFR[14]: 8235751: Assertion when triggering concurrent cycle during shutdown In-Reply-To: <23ec1be7-9619-d105-f05c-29603840839a@oracle.com> References: <695EF54D-675F-4162-8518-115CC9F63F8D@oracle.com> <23ec1be7-9619-d105-f05c-29603840839a@oracle.com> Message-ID: <7EFDC050-D672-4BF5-8009-EF95736829E2@oracle.com> > On Jan 7, 2020, at 4:10 AM, Stefan Johansson wrote: > > Hi Kim, > > On 2019-12-31 04:01, Kim Barrett wrote: >> Please review this change to G1's handling of requests to initiate >> concurrent marking. >> When such a request is made during shutdown processing, after the cm >> thread has been stopped, the request to initiate concurrent marking is >> ignored. This could lead to an assertion failure for user requested >> GCs (System.gc and via agent) by a thread that has not yet been >> brought to a halt, because the possibility of such a request being >> ignored was missed when the assertion was recently added by JDK-8232588. 
>> We now report to the GC-invoking thread when initiation of concurrent >> marking has been suppressed because termination of the cm thread has >> been requested. In that case the GC invocation is considered finished. >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8235751 >> Webrev: >> https://cr.openjdk.java.net/~kbarrett/8235751/open.00/ >> > > Looks good, > StefanJ > >> Testing: >> mach5 tier1-5 >> Locally (linux-x64) reproduced fairly quickly the failure using the >> approach described in the CR; after applying the proposed chage, >> failed to reproduce. Thanks. From stuart.monteith at linaro.org Tue Jan 7 23:34:37 2020 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Tue, 7 Jan 2020 23:34:37 +0000 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC Message-ID: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> Hello Zhengyu, et al, This is the current state of the nmethod barrier code I have for ZGC on aarch64. As I understand it, Zhengyu may have been working on this, and so this is my sharing it: http://cr.openjdk.java.net/~smonteith/nmethod/webrev.0/ The code has various bits for debugging, prototype level code, with comments and some notes interspersed throughout - it is not ready for merging. The approach I've taken for the nmethod barrier is to have the nmethod barrier that is emitted be implemented like so: __ adr(rscratch1, __ pc()); __ ldarw(rscratch2, rscratch1); __ ldrw(rscratch1, thread_disarmed_addr); __ cmpw(rscratch2, rscratch1); __ br(Assembler::EQ, continuation); __ mov(rscratch1, StubRoutines::aarch64::method_entry_barrier()); __ blr(rscratch1); __ bind(continuation); This code is patched up such that the ldarw is loading from a field I've added to nmethod "_nmethod_guard". There don't appear to be existing ways to emit a relocation (there aren't spare bits to do a small change) from an address in nmethod emitted code into the nmethod data structure. 
It is initialized to the instruction's current address and BarrierSetNMethod::disarm will detect this known value and fix it up, which occurs on initialization. Currently the deoptimise path is broken. By setting the environment variable "SRDM_forcedeopt", the deoptimisation can be provoked even when not needed - the x86 implementation is good with this change. The aarch64 code isn't working yet - I suspect I've followed the x86 code too closely, and my offsets are perhaps miscalculated - I may be pointing at the wrong frame, or I've neglected FP too much. BR, Stuart From zgu at redhat.com Wed Jan 8 01:22:52 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 7 Jan 2020 20:22:52 -0500 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC In-Reply-To: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> References: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> Message-ID: Hi Stuart, Thanks for sharing. Actually, Andrew Haley (cc'd) is helping us on implementing the nmethod entry barrier. Your patch is largely in line with what we have right now, but Andrew seems to have second thoughts :-) -Zhengyu On 1/7/20 6:34 PM, Stuart Monteith wrote: > Hello Zhengyu, et al, > This is the current state of the nmethod barrier code I have for ZGC on > aarch64. As I understand it, Zhengyu may have been working on this, and > so this is my sharing it: > > http://cr.openjdk.java.net/~smonteith/nmethod/webrev.0/ > > The code has various bits for debugging, prototype level code, with > comments and some notes interspersed throughout - it is not ready for > merging.
> > The approach I've taken for the nmethod barrier is to have the nmethod > barrier that is emitted be implemented like so: > > __ adr(rscratch1, __ pc()); > __ ldarw(rscratch2, rscratch1); > __ ldrw(rscratch1, thread_disarmed_addr); > __ cmpw(rscratch2, rscratch1); > __ br(Assembler::EQ, continuation); > > __ mov(rscratch1, StubRoutines::aarch64::method_entry_barrier()); > __ blr(rscratch1); > > __ bind(continuation); > > > This code is patched up such that the ldarw is loading from a field I've > added to nmethod "_nmethod_guard". There don't appear to be existing > ways to emit a relocation (there aren't spare bits to do a small change) > from an address in nmethod emitted code into the nmethod data structure. > It is initialized to the instruction's current address and > BarrierSetNMethod::disarm will detect this known value and fix it up, > which occurs on initialization. > > Currently the deoptmise path is broken. By setting the environment > variable "SRDM_forcedeopt", the deoptimisation can be provoked even when > not needed - the x86 implementation is good with this change. The > aarch64 code isn't working yet - I suspect I've followed the x86 code > too closely, and my offsets are perhaps miscalculated - I may be > pointing at the wrong frame, or I've neglected FP too much. 
> > BR, > Stuart > From thomas.schatzl at oracle.com Wed Jan 8 09:47:32 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 8 Jan 2020 10:47:32 +0100 Subject: [14] RFR (S): 8235934: gc/g1/TestGCLogMessages.java fails with 'DerivedPointerTable Update' found In-Reply-To: <87F643E8-16AF-4E1C-8272-3D8CC0813938@oracle.com> References: <41e652e0-b843-e2da-c196-37b9b327d4aa@oracle.com> <0635ecc6-a4e9-0d34-d320-002ff148ca1a@oracle.com> <87F643E8-16AF-4E1C-8272-3D8CC0813938@oracle.com> Message-ID: <6795a285-6d99-9471-1c1d-30115ac57305@oracle.com> Hi Kim, On 07.01.20 23:40, Kim Barrett wrote: >> On Jan 7, 2020, at 5:41 AM, Thomas Schatzl wrote: >> >> http://cr.openjdk.java.net/~tschatzl/8235934/webrev.0_to_1 (diff) >> http://cr.openjdk.java.net/~tschatzl/8235934/webrev.1 (full) >> >> Tested locally. >> >> Thanks, >> Thomas > > Looks good. > thanks for your review. Thomas From aph at redhat.com Wed Jan 8 10:10:32 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 8 Jan 2020 10:10:32 +0000 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC In-Reply-To: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> References: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> Message-ID: On 1/7/20 11:34 PM, Stuart Monteith wrote: > There don't appear to be existing > ways to emit a relocation (there aren't spare bits to do a small change) > from an address in nmethod emitted code into the nmethod data structure. Yeah, there is. See MacroAssembler::int_constant. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Wed Jan 8 10:21:02 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 8 Jan 2020 10:21:02 +0000 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC In-Reply-To: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> References: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> Message-ID: <8c8aa64b-a306-2801-fdb8-2942ea362b48@redhat.com> On 1/7/20 11:34 PM, Stuart Monteith wrote: > This code is patched up such that the ldarw is loading from a field I've > added to nmethod "_nmethod_guard". There don't appear to be existing > ways to emit a relocation (there aren't spare bits to do a small change) > from an address in nmethod emitted code into the nmethod data structure. > It is initialized to the instruction's current address and > BarrierSetNMethod::disarm will detect this known value and fix it up, > which occurs on initialization. > > Currently the deoptmise path is broken. By setting the environment > variable "SRDM_forcedeopt", the deoptimisation can be provoked even when > not needed - the x86 implementation is good with this change. The > aarch64 code isn't working yet - I suspect I've followed the x86 code > too closely, and my offsets are perhaps miscalculated - I may be > pointing at the wrong frame, or I've neglected FP too much. I'll integrate this with what I've got and return it to you. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From zgu at redhat.com Wed Jan 8 13:18:19 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 8 Jan 2020 08:18:19 -0500 Subject: [15] RFR 8228818: Shenandoah: Processing weak roots in concurrent phase when possible Message-ID: Please review this enhancement that moves some of weak root processing into concurrent phase whenever possible. 
When concurrent class unloading is enabled, the weak roots that are backed by OopStorage can be processed in concurrent phase, as Shenandoah native LRB can properly resolve the object and hide dead oops from mutators. Bug: https://bugs.openjdk.java.net/browse/JDK-8228818 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8228818/webrev.00/ Test: hotspot_gc_shenandoah (fastdebug and release) on x86_64 and x86_32 Linux specjvm on x86_64 Linux Thanks, -Zhengyu From thomas.schatzl at oracle.com Wed Jan 8 13:38:44 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 8 Jan 2020 14:38:44 +0100 Subject: RFR (S): 8214277: Use merged G1ArchiveRegionMap for open and closed archive heap regions Message-ID: <180618f6-8fe4-22f0-dbc0-5f275c5b1d90@oracle.com> Hi all, could I have reviews for this small cleanup/simplification that merges the open and closed archive region map into a single one? CR: https://bugs.openjdk.java.net/browse/JDK-8214277 Webrev: http://cr.openjdk.java.net/~tschatzl/8214277/webrev/ Testing: hs-tier1-5 (almost done, no issues), local gc/g1 jtreg Thanks, Thomas From stuart.monteith at linaro.org Wed Jan 8 14:23:27 2020 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Wed, 8 Jan 2020 14:23:27 +0000 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC In-Reply-To: References: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> Message-ID: Hi, I see there is LIR_Assembler::int_constant, which is only for C1; the equivalent is MacroAssembler::ldr_constant, which uses an InternalAddress. I had looked at doing that first; however, I had an issue getting the address of the constants. The _consts CodeSection information exists in the CodeBuffer when the code is being generated, but I hadn't worked out after that point where the constant would be after the relocation, which is what we'd need when arming/disarming the guard, to which we have to navigate from the nmethod class.
My assumption was that the nmethod would consist of the header, followed by the relocations, followed by the code section which would include the consts, and that my constant might be one of the first ones. I don't believe I could make that a consistent address. I chose to put the guard into the nmethod data structure itself, but that of course meant we had to check whether or not the guard needs to be initialized - although I was thinking we could add an additional initialization stage. It occurs to me now that we could pull out the address of the guard from the ADR instruction, as that will be relocated and we know its location, and could do that unconditionally. I'll revisit using the constants, as maybe their address is more deterministic than I first thought. Thanks, Stuart On Wed, 8 Jan 2020 at 10:10, Andrew Haley wrote: > > On 1/7/20 11:34 PM, Stuart Monteith wrote: > > There don't appear to be existing > > ways to emit a relocation (there aren't spare bits to do a small change) > > from an address in nmethod emitted code into the nmethod data structure. > > Yeah, there is. See MacroAssembler::int_constant. > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From aph at redhat.com Wed Jan 8 15:37:22 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 8 Jan 2020 15:37:22 +0000 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC In-Reply-To: References: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> Message-ID: <69b38972-8061-b706-befe-28d49af42fe7@redhat.com> On 1/8/20 2:23 PM, Stuart Monteith wrote: > I see there is LIR_Assembler::int_constant, which is only for C1, the > equivalent is MacroAssembler::ldr_constant, which uses an > InternalAddress. There is MacroAssembler::int_constant(n). It is there, and it returns an address that you can use with ADR and/or LDR. 
It won't work with a native method because they have no constant pool (int_constant() will return NULL) but I don't think you need barriers for native methods. (Um, perhaps you do, for synchronized ones? They have a reference to a class.) Anyway, this is your patch with a working (probably) deoptimize handler: http://cr.openjdk.java.net/~aph/aarch64-jdk-nmethod-barriers-3.patch -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From rkennke at redhat.com Wed Jan 8 16:39:25 2020 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 8 Jan 2020 17:39:25 +0100 Subject: [15] RFR 8228818: Shenandoah: Processing weak roots in concurrent phase when possible In-Reply-To: References: Message-ID: Hi Zhengyu, src/hotspot/share/gc/shenandoah/shenandoahClosures.hpp: +class ShenandoahEvacUpdateCleanupOopStorageRootsClosure : public BasicOopIterateClosure { Why can't this go in shenandoahHeap.cpp (only place where it's used)? src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp: + // Cleanup/Evacuate weak roots + if (heap->marking_context()->is_complete() && ShenandoahConcurrentRoots::should_do_concurrent_class_unloading()) { + heap->entry_weak_roots(); Are you sure that we only want to do cleanup when we do conc-class-unloading? Originally, we hoooked this up in the entry_roots(), why's that not good enough? src/hotspot/share/gc/shenandoah/shenandoahRootVerifier.cpp: What's that change? - return (_types & type) != 0; + return (_types & type) == type; Thanks, Roman > Please review this enhancement that moves some of weak root processing > into concurrent phase whenever possible. > > When concurrent class unloading is enabled, the weak roots that backed > by OopStorage can be processed in concurrent phase, as Shenandoah native > LRB can properly resolve the object and hide dead oops from mutators. 
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8228818 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8228818/webrev.00/ > > Test: > ? hotspot_gc_shenandoah (fastdebug and release) > ? on x86_64 and x86_32 Linux > ? specjvm on x86_64 Linux > > Thanks, > > -Zhengyu > From zgu at redhat.com Wed Jan 8 17:21:08 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 8 Jan 2020 12:21:08 -0500 Subject: [15] RFR 8228818: Shenandoah: Processing weak roots in concurrent phase when possible In-Reply-To: References: Message-ID: <29be2bac-6192-237e-7234-23e23a2966ad@redhat.com> On 1/8/20 11:39 AM, Roman Kennke wrote: > Hi Zhengyu, > > src/hotspot/share/gc/shenandoah/shenandoahClosures.hpp: > +class ShenandoahEvacUpdateCleanupOopStorageRootsClosure : public > BasicOopIterateClosure { > > Why can't this go in shenandoahHeap.cpp (only place where it's used)? Sure, updated: http://cr.openjdk.java.net/~zgu/JDK-8228818/webrev.01/ > > src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp: > + // Cleanup/Evacuate weak roots > + if (heap->marking_context()->is_complete() && > ShenandoahConcurrentRoots::should_do_concurrent_class_unloading()) { > + heap->entry_weak_roots(); > > Are you sure that we only want to do cleanup when we do > conc-class-unloading? Originally, we hoooked this up in the > entry_roots(), why's that not good enough? Yes, otherwise, weak roots are still processed at final mark/init evac pause. I split into 2 phases, because I think it is logically simpler: when weak roots processing failed, degenerated GC simply re-executes related logic (parallel cleaning) and disarm nmethods, which is equivalent to STW version. > > src/hotspot/share/gc/shenandoah/shenandoahRootVerifier.cpp: > > What's that change? > - return (_types & type) != 0; > + return (_types & type) == type; > Because WeakRoots is the combination of SerialWeakRoots and ConcurrentWeakRoots now, when we test WeakRoots, expect both bits are set. 
Thanks, -Zhengyu > Thanks, > Roman > >> Please review this enhancement that moves some of weak root processing >> into concurrent phase whenever possible. >> >> When concurrent class unloading is enabled, the weak roots that backed >> by OopStorage can be processed in concurrent phase, as Shenandoah native >> LRB can properly resolve the object and hide dead oops from mutators. >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8228818 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8228818/webrev.00/ >> >> Test: >> ? hotspot_gc_shenandoah (fastdebug and release) >> ? on x86_64 and x86_32 Linux >> ? specjvm on x86_64 Linux >> >> Thanks, >> >> -Zhengyu >> > From rkennke at redhat.com Wed Jan 8 21:25:01 2020 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 8 Jan 2020 22:25:01 +0100 Subject: RFR: 8236815: Shenandoah: Fix weak roots in final Traversal GC phase Message-ID: <3dc6cf7d-bdb0-becd-4335-787ec418c001@redhat.com> We're not fixing up all weak roots in final-traversal. But we have to, because weak roots are not scanned+evacuated at init-traversal, and may thus keep dangling pointers that would leak out to the next cycle. This can lead to heap corruption, crashes, etc. Bug: https://bugs.openjdk.java.net/browse/JDK-8236815 Webrev: http://cr.openjdk.java.net/~rkennke/JDK-8236815/webrev.00/ Testing: several runs of hotspot_gc_shenandoah, which *sometimes* exposed the bug. I couldn't reproduce it. I suggest to give it more spins in CI. Can I please get a review? Thanks, Roman From zgu at redhat.com Wed Jan 8 21:31:02 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 8 Jan 2020 16:31:02 -0500 Subject: RFR: 8236815: Shenandoah: Fix weak roots in final Traversal GC phase In-Reply-To: <3dc6cf7d-bdb0-becd-4335-787ec418c001@redhat.com> References: <3dc6cf7d-bdb0-becd-4335-787ec418c001@redhat.com> Message-ID: <4531912b-4af2-3dcc-484e-bf13c422a0a6@redhat.com> Looks good. 
-Zhengyu On 1/8/20 4:25 PM, Roman Kennke wrote: > We're not fixing up all weak roots in final-traversal. But we have to, > because weak roots are not scanned+evacuated at init-traversal, and may > thus keep dangling pointers that would leak out to the next cycle. This > can lead to heap corruption, crashes, etc. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8236815 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8236815/webrev.00/ > > Testing: several runs of hotspot_gc_shenandoah, which *sometimes* > exposed the bug. I couldn't reproduce it. I suggest to give it more > spins in CI. > > Can I please get a review? > > Thanks, > Roman > From jianglizhou at google.com Wed Jan 8 23:41:32 2020 From: jianglizhou at google.com (Jiangli Zhou) Date: Wed, 8 Jan 2020 15:41:32 -0800 Subject: RFR (S): 8214277: Use merged G1ArchiveRegionMap for open and closed archive heap regions In-Reply-To: <180618f6-8fe4-22f0-dbc0-5f275c5b1d90@oracle.com> References: <180618f6-8fe4-22f0-dbc0-5f275c5b1d90@oracle.com> Message-ID: Hi Thomas, Looks good! Can we also remove '_open_archive_region_map' from g1Allocator.* as it's no longer needed? Best, Jiangli On Wed, Jan 8, 2020 at 5:39 AM Thomas Schatzl wrote: > > Hi all, > > could I have reviews for this small cleanup/simplification that > merges the open and closed archive region map into a single one? 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8214277 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8214277/webrev/ > Testing: > hs-tier1-5 (almost done, no issues), local gc/g1 jtreg > > Thanks, > Thomas From kim.barrett at oracle.com Thu Jan 9 00:25:53 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 8 Jan 2020 19:25:53 -0500 Subject: RFR (S): 8214277: Use merged G1ArchiveRegionMap for open and closed archive heap regions In-Reply-To: <180618f6-8fe4-22f0-dbc0-5f275c5b1d90@oracle.com> References: <180618f6-8fe4-22f0-dbc0-5f275c5b1d90@oracle.com> Message-ID: <1AB85763-9BEA-4A7A-A37E-DCB70F6C9D0E@oracle.com> > On Jan 8, 2020, at 8:38 AM, Thomas Schatzl wrote: > > Hi all, > > could I have reviews for this small cleanup/simplification that merges the open and closed archive region map into a single one? > > CR: > https://bugs.openjdk.java.net/browse/JDK-8214277 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8214277/webrev/ > Testing: > hs-tier1-5 (almost done, no issues), local gc/g1 jtreg > > Thanks, > Thomas ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1Allocator.inline.hpp 151 inline void G1ArchiveAllocator::clear_range_archive(MemRegion range, bool open) { clear_range_archive no longer uses the open argument for anything other than logging. Is it worth keeping? ------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1Allocator.inline.hpp 186 return (archive_check_enabled() && 187 (_archive_region_map.get_by_address((HeapWord*)object) != G1ArchiveRegionMap::NoArchive)); The indentation of line 187 is confusing; the indentation suggests the open paren there is at the same nesting level as the one on the preceeding line directly above. The first open paren on line 186 and the associated close could just be removed. 
------------------------------------------------------------------------------ src/hotspot/share/gc/g1/g1Allocator.inline.hpp 162 // This is the out-of-line part of is_closed_archive_object test, done separately 163 // to avoid additional performance impact when the check is not enabled. Pre-existing: Given that it is in an inline definition, the accuracy of this comment seems questionable. ------------------------------------------------------------------------------ Jiangli already pointed out that _open_archive_region_map is no longer used. From zgu at redhat.com Thu Jan 9 01:41:48 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 8 Jan 2020 20:41:48 -0500 Subject: [15] RFR 8228818: Shenandoah: Processing weak roots in concurrent phase when possible In-Reply-To: <29be2bac-6192-237e-7234-23e23a2966ad@redhat.com> References: <29be2bac-6192-237e-7234-23e23a2966ad@redhat.com> Message-ID: <42a9f680-1a0d-eebe-e253-198aeb3a0a5e@redhat.com> On 1/8/20 12:21 PM, Zhengyu Gu wrote: > > > On 1/8/20 11:39 AM, Roman Kennke wrote: >> Hi Zhengyu, >> >> src/hotspot/share/gc/shenandoah/shenandoahClosures.hpp: >> +class ShenandoahEvacUpdateCleanupOopStorageRootsClosure : public >> BasicOopIterateClosure { >> >> Why can't this go in shenandoahHeap.cpp (only place where it's used)? > > Sure,? updated: http://cr.openjdk.java.net/~zgu/JDK-8228818/webrev.01/ > >> >> src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp: >> +? // Cleanup/Evacuate weak roots >> +? if (heap->marking_context()->is_complete() && >> ShenandoahConcurrentRoots::should_do_concurrent_class_unloading()) { >> +??? heap->entry_weak_roots(); >> >> Are you sure that we only want to do cleanup when we do >> conc-class-unloading? Originally, we hoooked this up in the >> entry_roots(), why's that not good enough? > > Yes, otherwise, weak roots are still processed at final mark/init evac > pause. 
> > I split into 2 phases, because I think it is logically simpler: when > weak roots processing failed, degenerated GC simply re-executes related > logic (parallel cleaning) and disarm nmethods, which is equivalent to > STW version. Took another look, it does not seem to be a good idea, as it duplicates the work of update_roots(), that is called at the end of degenerated GC cycle to fix the roots. Merged weak_roots phase into concurrent roots phase, and removed weak_roots degenerated point, also simplified the patch. Updated webrev: http://cr.openjdk.java.net/~zgu/JDK-8228818/webrev.02/ Test: Reran hotspot_gc_shenandoah (fastdebug and release) on x86_64 and x86_32 Linux. Thanks, -Zhengyu > >> >> src/hotspot/share/gc/shenandoah/shenandoahRootVerifier.cpp: >> >> What's that change? >> -? return (_types & type) != 0; >> +? return (_types & type) == type; >> > > Because WeakRoots is the combination of SerialWeakRoots and > ConcurrentWeakRoots now, when we test WeakRoots, expect both bits are set. > > > Thanks, > > -Zhengyu > > >> Thanks, >> Roman >> >>> Please review this enhancement that moves some of weak root processing >>> into concurrent phase whenever possible. >>> >>> When concurrent class unloading is enabled, the weak roots that backed >>> by OopStorage can be processed in concurrent phase, as Shenandoah native >>> LRB can properly resolve the object and hide dead oops from mutators. >>> >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8228818 >>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8228818/webrev.00/ >>> >>> Test: >>> ?? hotspot_gc_shenandoah (fastdebug and release) >>> ?? on x86_64 and x86_32 Linux >>> ?? 
specjvm on x86_64 Linux >>> >>> Thanks, >>> >>> -Zhengyu >>> >> From rkennke at redhat.com Thu Jan 9 08:44:19 2020 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 9 Jan 2020 09:44:19 +0100 Subject: [15] RFR 8228818: Shenandoah: Processing weak roots in concurrent phase when possible In-Reply-To: <42a9f680-1a0d-eebe-e253-198aeb3a0a5e@redhat.com> References: <29be2bac-6192-237e-7234-23e23a2966ad@redhat.com> <42a9f680-1a0d-eebe-e253-198aeb3a0a5e@redhat.com> Message-ID: Hi Zhengyu, the latest patch looks good to me. Thanks, Roman > On 1/8/20 12:21 PM, Zhengyu Gu wrote: >> >> >> On 1/8/20 11:39 AM, Roman Kennke wrote: >>> Hi Zhengyu, >>> >>> src/hotspot/share/gc/shenandoah/shenandoahClosures.hpp: >>> +class ShenandoahEvacUpdateCleanupOopStorageRootsClosure : public >>> BasicOopIterateClosure { >>> >>> Why can't this go in shenandoahHeap.cpp (only place where it's used)? >> >> Sure,? updated: http://cr.openjdk.java.net/~zgu/JDK-8228818/webrev.01/ >> >>> >>> src/hotspot/share/gc/shenandoah/shenandoahControlThread.cpp: >>> +? // Cleanup/Evacuate weak roots >>> +? if (heap->marking_context()->is_complete() && >>> ShenandoahConcurrentRoots::should_do_concurrent_class_unloading()) { >>> +??? heap->entry_weak_roots(); >>> >>> Are you sure that we only want to do cleanup when we do >>> conc-class-unloading? Originally, we hoooked this up in the >>> entry_roots(), why's that not good enough? >> >> Yes, otherwise, weak roots are still processed at final mark/init evac >> pause. >> >> I split into 2 phases, because I think it is logically simpler: when >> weak roots processing failed, degenerated GC simply re-executes >> related logic (parallel cleaning) and disarm nmethods, which is >> equivalent to STW version. > > Took another look, it does not seem to be a good idea, as it duplicates > the work of update_roots(), that is called at the end of degenerated GC > cycle to fix the roots. 
> > Merged weak_roots phase into concurrent roots phase, and removed > weak_roots degenerated point, also simplified the patch. > > Updated webrev: http://cr.openjdk.java.net/~zgu/JDK-8228818/webrev.02/ > > Test: > ? Reran hotspot_gc_shenandoah (fastdebug and release) > ? on x86_64 and x86_32 Linux. > > Thanks, > > -Zhengyu > > >> >>> >>> src/hotspot/share/gc/shenandoah/shenandoahRootVerifier.cpp: >>> >>> What's that change? >>> -? return (_types & type) != 0; >>> +? return (_types & type) == type; >>> >> >> Because WeakRoots is the combination of SerialWeakRoots and >> ConcurrentWeakRoots now, when we test WeakRoots, expect both bits are >> set. >> >> >> Thanks, >> >> -Zhengyu >> >> >>> Thanks, >>> Roman >>> >>>> Please review this enhancement that moves some of weak root processing >>>> into concurrent phase whenever possible. >>>> >>>> When concurrent class unloading is enabled, the weak roots that backed >>>> by OopStorage can be processed in concurrent phase, as Shenandoah >>>> native >>>> LRB can properly resolve the object and hide dead oops from mutators. >>>> >>>> >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8228818 >>>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8228818/webrev.00/ >>>> >>>> Test: >>>> ?? hotspot_gc_shenandoah (fastdebug and release) >>>> ?? on x86_64 and x86_32 Linux >>>> ?? specjvm on x86_64 Linux >>>> >>>> Thanks, >>>> >>>> -Zhengyu >>>> >>> > From thomas.schatzl at oracle.com Thu Jan 9 11:37:54 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 9 Jan 2020 12:37:54 +0100 Subject: RFR (S): 8214277: Use merged G1ArchiveRegionMap for open and closed archive heap regions In-Reply-To: <1AB85763-9BEA-4A7A-A37E-DCB70F6C9D0E@oracle.com> References: <180618f6-8fe4-22f0-dbc0-5f275c5b1d90@oracle.com> <1AB85763-9BEA-4A7A-A37E-DCB70F6C9D0E@oracle.com> Message-ID: Hi Kim, Jiangli, thanks for your reviews. 
On 09.01.20 01:25, Kim Barrett wrote: >> On Jan 8, 2020, at 8:38 AM, Thomas Schatzl wrote: >> >> Hi all, >> >> could I have reviews for this small cleanup/simplification that merges the open and closed archive region map into a single one? >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8214277 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8214277/webrev/ >> Testing: >> hs-tier1-5 (almost done, no issues), local gc/g1 jtreg >> >> Thanks, >> Thomas > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1Allocator.inline.hpp > 151 inline void G1ArchiveAllocator::clear_range_archive(MemRegion range, bool open) { > > clear_range_archive no longer uses the open argument for anything > other than logging. Is it worth keeping? Removed. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1Allocator.inline.hpp > 186 return (archive_check_enabled() && > 187 (_archive_region_map.get_by_address((HeapWord*)object) != G1ArchiveRegionMap::NoArchive)); > > The indentation of line 187 is confusing; the indentation suggests the > open paren there is at the same nesting level as the one on the > preceeding line directly above. The first open paren on line 186 and > the associated close could just be removed. > > ------------------------------------------------------------------------------ > src/hotspot/share/gc/g1/g1Allocator.inline.hpp > 162 // This is the out-of-line part of is_closed_archive_object test, done separately > 163 // to avoid additional performance impact when the check is not enabled. > > Pre-existing: Given that it is in an inline definition, the accuracy > of this comment seems questionable. Removed. > > ------------------------------------------------------------------------------ > > Jiangli already pointed out that _open_archive_region_map is no longer > used. 
> All fixed in http://cr.openjdk.java.net/~tschatzl/8214277/webrev.0_to_1 (diff) http://cr.openjdk.java.net/~tschatzl/8214277/webrev.1/ (full) Thanks, Thomas From rkennke at redhat.com Thu Jan 9 12:46:38 2020 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 9 Jan 2020 13:46:38 +0100 Subject: RFR: 8236851: Shenandoah: More details in Traversal GC event messages Message-ID: We currently only print e.g. "Pause Init Traversal" in event messages for Traversal GC. We should also include information whether or not the cycle also does unload classes and/or process references, like we do for the normal mode. Bug: https://bugs.openjdk.java.net/browse/JDK-8236851 Webrev: http://cr.openjdk.java.net/~rkennke/JDK-8236851/webrev.00/ Testing: hotspot_gc_shenandoah, manual inspection of hs_err files Can I please get a review? Thanks, Roman From stuart.monteith at linaro.org Thu Jan 9 16:02:35 2020 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Thu, 9 Jan 2020 16:02:35 +0000 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC In-Reply-To: <69b38972-8061-b706-befe-28d49af42fe7@redhat.com> References: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> <69b38972-8061-b706-befe-28d49af42fe7@redhat.com> Message-ID: Thank you Andrew, that compiles and runs without error - the deoptimize method is definitely being provoked, and continues without apparent problems. I've been trying to insert constants, and the issue you mention is tripped when we enter a native method wrapper. Erik can perhaps correct me, but I presume we might have to deoptimise a native method if it was overriding a JIT-compiled method and has subsequently been unloaded. In x86 it is inserted here: http://hg.openjdk.java.net/jdk/jdk/file/6d23020e3da0/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#l2204 on aarch64 I added the change in our generate_native_wrapper. 
BR, Stuart On Wed, 8 Jan 2020 at 15:37, Andrew Haley wrote: > > On 1/8/20 2:23 PM, Stuart Monteith wrote: > > I see there is LIR_Assembler::int_constant, which is only for C1, the > > equivalent is MacroAssembler::ldr_constant, which uses an > > InternalAddress. > > There is MacroAssembler::int_constant(n). It is there, and it returns an > address that you can use with ADR and/or LDR . It won't work with a native > method because they have no constant pool (int_constant() will return NULL) > but I don't think you need barriers for native methods. > > (Um, perhaps you do, for synchronized ones? They have a reference to a class.) > > Anyway, this is your patch with a working (probably) deoptimize handler: > > http://cr.openjdk.java.net/~aph/aarch64-jdk-nmethod-barriers-3.patch > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From aph at redhat.com Thu Jan 9 16:14:31 2020 From: aph at redhat.com (Andrew Haley) Date: Thu, 9 Jan 2020 16:14:31 +0000 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC In-Reply-To: References: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> <69b38972-8061-b706-befe-28d49af42fe7@redhat.com> Message-ID: <45a22fc8-af0f-6632-3505-fe333a67145d@redhat.com> On 1/9/20 4:02 PM, Stuart Monteith wrote: > Thank you Andrew, that compiles and runs without error - the > deoptimize method is definitely being provoked. and continues without > apparent problems. > > I've been trying to insert constants, and the issue you mention is > tripped when we enter a native method wrapper. Eric can perhaps > correct me, but I presume we might have to deoptimise a native method > if it was overriding a JIT-compiled method and it is subsequently been > unloaded. OK, so we can stick with a field in nmethod. It is probably more economical on space anyway. 
You should be able to use adr(rscratch1, InternalAddress(nmethod field)); -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From kim.barrett at oracle.com Thu Jan 9 17:43:08 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 9 Jan 2020 12:43:08 -0500 Subject: RFR (S): 8214277: Use merged G1ArchiveRegionMap for open and closed archive heap regions In-Reply-To: References: <180618f6-8fe4-22f0-dbc0-5f275c5b1d90@oracle.com> <1AB85763-9BEA-4A7A-A37E-DCB70F6C9D0E@oracle.com> Message-ID: <680B79D8-6E3E-4BB1-96FD-6317718783FA@oracle.com> > On Jan 9, 2020, at 6:37 AM, Thomas Schatzl wrote: > All fixed in > http://cr.openjdk.java.net/~tschatzl/8214277/webrev.0_to_1 (diff) > http://cr.openjdk.java.net/~tschatzl/8214277/webrev.1/ (full) > > Thanks, > Thomas Looks good. From thomas.schatzl at oracle.com Thu Jan 9 19:01:53 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 09 Jan 2020 20:01:53 +0100 Subject: RFR (S): 8214277: Use merged G1ArchiveRegionMap for open and closed archive heap regions In-Reply-To: <680B79D8-6E3E-4BB1-96FD-6317718783FA@oracle.com> References: <180618f6-8fe4-22f0-dbc0-5f275c5b1d90@oracle.com> <1AB85763-9BEA-4A7A-A37E-DCB70F6C9D0E@oracle.com> <680B79D8-6E3E-4BB1-96FD-6317718783FA@oracle.com> Message-ID: Hi, On Thu, 2020-01-09 at 12:43 -0500, Kim Barrett wrote: > > On Jan 9, 2020, at 6:37 AM, Thomas Schatzl < > > thomas.schatzl at oracle.com> wrote: > > All fixed in > > http://cr.openjdk.java.net/~tschatzl/8214277/webrev.0_to_1 (diff) > > http://cr.openjdk.java.net/~tschatzl/8214277/webrev.1/ (full) > > > > Thanks, > > Thomas > > Looks good. > thanks for your review. 
Thomas From jianglizhou at google.com Thu Jan 9 20:19:45 2020 From: jianglizhou at google.com (Jiangli Zhou) Date: Thu, 9 Jan 2020 12:19:45 -0800 Subject: RFR (S): 8214277: Use merged G1ArchiveRegionMap for open and closed archive heap regions In-Reply-To: References: <180618f6-8fe4-22f0-dbc0-5f275c5b1d90@oracle.com> <1AB85763-9BEA-4A7A-A37E-DCB70F6C9D0E@oracle.com> Message-ID: Hi Thomas, The update looks good. Best regards, Jiangli On Thu, Jan 9, 2020 at 3:39 AM Thomas Schatzl wrote: > > Hi Kim, Jiangli, > > thanks for your reviews. > > On 09.01.20 01:25, Kim Barrett wrote: > >> On Jan 8, 2020, at 8:38 AM, Thomas Schatzl wrote: > >> > >> Hi all, > >> > >> could I have reviews for this small cleanup/simplification that merges the open and closed archive region map into a single one? > >> > >> CR: > >> https://bugs.openjdk.java.net/browse/JDK-8214277 > >> Webrev: > >> http://cr.openjdk.java.net/~tschatzl/8214277/webrev/ > >> Testing: > >> hs-tier1-5 (almost done, no issues), local gc/g1 jtreg > >> > >> Thanks, > >> Thomas > > > > ------------------------------------------------------------------------------ > > src/hotspot/share/gc/g1/g1Allocator.inline.hpp > > 151 inline void G1ArchiveAllocator::clear_range_archive(MemRegion range, bool open) { > > > > clear_range_archive no longer uses the open argument for anything > > other than logging. Is it worth keeping? > > Removed. > > > > > ------------------------------------------------------------------------------ > > src/hotspot/share/gc/g1/g1Allocator.inline.hpp > > 186 return (archive_check_enabled() && > > 187 (_archive_region_map.get_by_address((HeapWord*)object) != G1ArchiveRegionMap::NoArchive)); > > > > The indentation of line 187 is confusing; the indentation suggests the > > open paren there is at the same nesting level as the one on the > > preceeding line directly above. The first open paren on line 186 and > > the associated close could just be removed. 
> > > > ------------------------------------------------------------------------------ > > src/hotspot/share/gc/g1/g1Allocator.inline.hpp > > 162 // This is the out-of-line part of is_closed_archive_object test, done separately > > 163 // to avoid additional performance impact when the check is not enabled. > > > > Pre-existing: Given that it is in an inline definition, the accuracy > > of this comment seems questionable. > > Removed. > > > > > ------------------------------------------------------------------------------ > > > > Jiangli already pointed out that _open_archive_region_map is no longer > > used. > > > > All fixed in > http://cr.openjdk.java.net/~tschatzl/8214277/webrev.0_to_1 (diff) > http://cr.openjdk.java.net/~tschatzl/8214277/webrev.1/ (full) > > Thanks, > Thomas From thomas.schatzl at oracle.com Thu Jan 9 20:59:02 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 09 Jan 2020 21:59:02 +0100 Subject: RFR (S): 8214277: Use merged G1ArchiveRegionMap for open and closed archive heap regions In-Reply-To: References: <180618f6-8fe4-22f0-dbc0-5f275c5b1d90@oracle.com> <1AB85763-9BEA-4A7A-A37E-DCB70F6C9D0E@oracle.com> Message-ID: <13588b7a3d2af571caeffcfe9425282a92215e8d.camel@oracle.com> Hi, On Thu, 2020-01-09 at 12:19 -0800, Jiangli Zhou wrote: > Hi Thomas, > > The update looks good. > thanks for your review. Passed hs-tier1-5 in the meantime. Pushed :) > Best regards, > Jiangli > Thanks, Thomas From shade at redhat.com Thu Jan 9 21:15:12 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 9 Jan 2020 22:15:12 +0100 Subject: RFR: 8236851: Shenandoah: More details in Traversal GC event messages In-Reply-To: References: Message-ID: On 1/9/20 1:46 PM, Roman Kennke wrote: > We currently only print e.g. "Pause Init Traversal" in event messages > for Traversal GC. We should also include information whether or not the > cycle also does unload classes and/or process references, like we do for > the normal mode. 
> > Bug: > https://bugs.openjdk.java.net/browse/JDK-8236851 > > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8236851/webrev.00/ Looks fine to me. -- Thanks, -Aleksey From erik.osterlund at oracle.com Thu Jan 9 22:35:12 2020 From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=) Date: Thu, 9 Jan 2020 23:35:12 +0100 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC In-Reply-To: References: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> <69b38972-8061-b706-befe-28d49af42fe7@redhat.com> Message-ID: <40c3cc91-8b7c-9bd6-1ff5-933b8a7a3166@oracle.com> Hi Stuart and Andrew, Right, when it comes to native wrappers, we do inject entry barriers for that on x86. The main reason for that is that I am allergic to "special" nmethods that you have to remember work differently all the time. We have too many of them. The only nmethod that regrettably does not have entry barriers is the method handle intrinsic. That seems fine but I'm not quite happy about it. Other than that, we also do need the barriers for correctness. Last time I thought about that, I recall there were a few problematic hypothetical situations I wanted to avoid. For example, consider the following obscure race condition (suitable beverage while reading advised): 1. Load abstract class A with non-static method foo. 2. Load class B, inheriting from A, from a separate class loader, overriding foo with a native method (that gets a native wrapper). 3. JIT nmethod with a virtual call to A.foo. The compiler will with CHA decide that there is only a single concrete foo implementation in the system (B::foo), due to there being a single implementation of A, which turns out to be our native wrapper. When this happens an optimized virtual call is generated with a direct call emitted (originally pointing at a resolution stub for the very first call), but the holder oop of B (it's class loader) is not inserted to the oop section. 
Instead, an entry is added in the dependency context to keep track of this nmethod so the caller nmethod (calling the native wrapper) can get deoptimized if the unique callee for A assumption changes. 4. Call the JIT-compiled call of A.foo with an instance of B, resolve it and patch the direct call to the native wrapper (B.foo *verified* entry, due to being an optimized virtual call). 5. Release the reference to the class loader of B, and wait until the class loader dies, and hence B dies. 6. Before concurrent class unloading kicks in (concurrently) and walks dependency contexts of dead things to invalidate them (which would invalidate the caller nmethod), load a class C also inheriting from A and overriding a concrete implementation of foo. When loading that class, the dependency context walk for invalidating e.g. CHA inconsistencies skip over the is_unloading() nmethods (including the native wrapper), due to race conditions that ended up giving that responsibility to the concurrent GC thread (which has not gotten to it yet). 7. Reuse the same JIT-compiled virtual call of A.foo but pass in a new instance of C. The state of the callsite is now a direct call to B.foo, and it's about to get deoptimized, but isn't yet. But B.foo is_unloading() because B is dead, making the one oop of the native wrapper (the holder oop of B) dead, and hence the native wrapper is_unloading(). Now in this scenario, without an nmethod entry barrier, we can end up calling a dead method. The nmethod entry barrier guards that by enforcing the invariant that we can't enter dead nmethods. Hope this makes sense and helps understanding why the native wrapper ought to have an entry barrier. Thanks, /Erik On 2020-01-09 17:02, Stuart Monteith wrote: > Thank you Andrew, that compiles and runs without error - the > deoptimize method is definitely being provoked. and continues without > apparent problems. 
> > I've been trying to insert constants, and the issue you mention is > tripped when we enter a native method wrapper. Erik can perhaps > correct me, but I presume we might have to deoptimise a native method > if it was overriding a JIT-compiled method and it has subsequently been > unloaded. > > In x86 it is inserted here: > http://hg.openjdk.java.net/jdk/jdk/file/6d23020e3da0/src/hotspot/cpu/x86/sharedRuntime_x86_64.cpp#l2204 > on aarch64 I added the change in our generate_native_wrapper. > > > BR, > Stuart > > > On Wed, 8 Jan 2020 at 15:37, Andrew Haley wrote: >> On 1/8/20 2:23 PM, Stuart Monteith wrote: >>> I see there is LIR_Assembler::int_constant, which is only for C1, the >>> equivalent is MacroAssembler::ldr_constant, which uses an >>> InternalAddress. >> There is MacroAssembler::int_constant(n). It is there, and it returns an >> address that you can use with ADR and/or LDR. It won't work with a native >> method because they have no constant pool (int_constant() will return NULL) >> but I don't think you need barriers for native methods. >> >> (Um, perhaps you do, for synchronized ones? They have a reference to a class.) >> >> Anyway, this is your patch with a working (probably) deoptimize handler: >> >> http://cr.openjdk.java.net/~aph/aarch64-jdk-nmethod-barriers-3.patch >> >> -- >> Andrew Haley (he/him) >> Java Platform Lead Engineer >> Red Hat UK Ltd. >> https://keybase.io/andrewhaley >> EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 >> From zgu at redhat.com Fri Jan 10 12:01:52 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 10 Jan 2020 07:01:52 -0500 Subject: [14] RFR 8236902: Shenandoah: Missing string dedup roots in all root scanner Message-ID: String dedup roots are missing in all roots scanner. The problem was discovered while running the TestHeapDump.java test with StringDeduplication enabled; let's add a test there to make sure.
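As an illustration of the scenario being tested, a minimal jtreg-style sketch might look like this (file name and flags are hypothetical; the actual test change is in the webrev below):

```java
// Hypothetical sketch only: a heap-dump scenario run with string
// deduplication enabled. The @run line and class name are illustrative,
// not the actual test from the webrev.
// @run main/othervm -XX:+UseShenandoahGC -XX:+UseStringDeduplication TestHeapDumpSketch
class TestHeapDumpSketch {
    public static void main(String[] args) {
        // Two distinct String objects with equal contents are candidates
        // for deduplication of their backing character arrays.
        String a = new String("dedup-me");
        String b = new String("dedup-me");
        System.out.println(a.equals(b) && a != b); // prints "true"
    }
}
```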
Bug: https://bugs.openjdk.java.net/browse/JDK-8236902 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236902/webrev.00/index.html Test: hotspot_gc_shenandoah (fastdebug and release) Thanks, -Zhengyu From rkennke at redhat.com Fri Jan 10 12:04:56 2020 From: rkennke at redhat.com (Roman Kennke) Date: Fri, 10 Jan 2020 13:04:56 +0100 Subject: [14] RFR 8236902: Shenandoah: Missing string dedup roots in all root scanner In-Reply-To: References: Message-ID: <903438b8-87c6-a6fd-5ff0-8e60090228d5@redhat.com> Hi Zhengyu, the change looks good! Thanks! Roman > String dedup roots are missing in all roots scanner. > > The problem was discovered while running TestHeapDump.java test with > StringDeduplication enabled, let's add a test there to make sure. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8236902 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236902/webrev.00/index.html > > Test: > ? hotspot_gc_shenandoah (fastdebug and release) > > Thanks, > > -Zhengyu > From zgu at redhat.com Fri Jan 10 14:48:16 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 10 Jan 2020 09:48:16 -0500 Subject: [15] RFR 8236878: Use atomic instruction to update StringDedupTable's entries and entries_removed counters Message-ID: <97d04872-7abb-396d-7552-f85b4cf1b97b@redhat.com> Hi, Please review this small change that uses atomic operations to update StringDedupTable's entries and entries_removed counter. This is *not* a correctness fix or performance enhancement, but for Shenandoah GC to move StringDedupTable cleanup task into concurrent phase, while holding StringDedupTable_lock. Bug: https://bugs.openjdk.java.net/browse/JDK-8236878 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236878/webrev.00/index.html Test: hotspot_gc (fastdebug and release) on x86_64 Linux Submit test in progress. 
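For illustration only, in plain Java rather than the HotSpot C++ the patch actually changes, moving such counters to atomic updates looks roughly like this (names are hypothetical, not the actual StringDedupTable fields):

```java
import java.util.concurrent.atomic.AtomicLong;

// Plain-Java analogue of the change: counters that may be updated from a
// concurrent phase use atomic increments instead of plain field updates.
class DedupCounters {
    private final AtomicLong entries = new AtomicLong();
    private final AtomicLong entriesRemoved = new AtomicLong();

    void added()   { entries.incrementAndGet(); }
    void removed() { entries.decrementAndGet(); entriesRemoved.incrementAndGet(); }

    long entryCount()   { return entries.get(); }
    long removedCount() { return entriesRemoved.get(); }
}
```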
Thanks, -Zhengyu From stuart.monteith at linaro.org Fri Jan 10 16:56:16 2020 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Fri, 10 Jan 2020 16:56:16 +0000 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC In-Reply-To: <45a22fc8-af0f-6632-3505-fe333a67145d@redhat.com> References: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> <69b38972-8061-b706-befe-28d49af42fe7@redhat.com> <45a22fc8-af0f-6632-3505-fe333a67145d@redhat.com> Message-ID: Hello, Something like "adr(rscratch1, InternalAddress(nmethod field))" has been suggested before. My problem has been finding what the address of the nmethod fields would be without knowing where the instruction is relative to them before the nmethod has been allocated. Relocations can be performed between the sections - constants, instructions and stubs - with section_word_Relocation instead of internal_word_Relocation. However, there doesn't appear to be a concept for the CodeBlob header during code emissions. If we have a method here: total in heap [0x0000ffff68e0b010,0x0000ffff68e0b410] = 1024 relocation [0x0000ffff68e0b190,0x0000ffff68e0b1a8] = 24 main code [0x0000ffff68e0b1c0,0x0000ffff68e0b300] = 320 stub code [0x0000ffff68e0b300,0x0000ffff68e0b3a0] = 160 metadata [0x0000ffff68e0b3a0,0x0000ffff68e0b3b8] = 24 scopes data [0x0000ffff68e0b3b8,0x0000ffff68e0b3c8] = 16 scopes pcs [0x0000ffff68e0b3c8,0x0000ffff68e0b408] = 64 dependencies [0x0000ffff68e0b408,0x0000ffff68e0b410] = 8 The nmethod structure is at 0x0000ffff68e0b010. Between that address and the main code is the relocation section, which we don't know the size of during instruction emission into the CodeBuffer. It appears that to relocate references to the nmethod structure from the code section before the nmethod is constructed would require its own relocation of sorts. I'm looking at adding a HEADER CodeSection that would allow relocation of entries into the nmethod/CodeBlob header. 
There is no guarantee our CodeBuffer has a CodeBlob during instruction emission - I've not looked to see whether that would be useful. BR, Stuart On Thu, 9 Jan 2020 at 16:14, Andrew Haley wrote: > > On 1/9/20 4:02 PM, Stuart Monteith wrote: > > Thank you Andrew, that compiles and runs without error - the > > deoptimize method is definitely being provoked. and continues without > > apparent problems. > > > > I've been trying to insert constants, and the issue you mention is > > tripped when we enter a native method wrapper. Eric can perhaps > > correct me, but I presume we might have to deoptimise a native method > > if it was overriding a JIT-compiled method and it is subsequently been > > unloaded. > > OK, so we can stick with a field in nmethod. It is probably more economical > on space anyway. > > You should be able to use adr(rscratch1, InternalAddress(nmethod field)); > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 > From rkennke at redhat.com Mon Jan 13 14:06:44 2020 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 13 Jan 2020 15:06:44 +0100 Subject: [15] RFR 8236878: Use atomic instruction to update StringDedupTable's entries and entries_removed counters In-Reply-To: <97d04872-7abb-396d-7552-f85b4cf1b97b@redhat.com> References: <97d04872-7abb-396d-7552-f85b4cf1b97b@redhat.com> Message-ID: OK. Thanks, Roman > > Please review this small change that uses atomic operations to update > StringDedupTable's entries and entries_removed counter. > > This is *not* a correctness fix or performance enhancement, but for > Shenandoah GC to move StringDedupTable cleanup task into concurrent > phase, while holding StringDedupTable_lock. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8236878 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236878/webrev.00/index.html > > Test: > ? hotspot_gc (fastdebug and release) on x86_64 Linux > ? 
Submit test in progress. > > Thanks, > > -Zhengyu > From zgu at redhat.com Mon Jan 13 14:27:22 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 13 Jan 2020 09:27:22 -0500 Subject: [15] RFR 8236878: Use atomic instruction to update StringDedupTable's entries and entries_removed counters In-Reply-To: References: <97d04872-7abb-396d-7552-f85b4cf1b97b@redhat.com> Message-ID: Thanks, Roman. -Zhengyu On 1/13/20 9:06 AM, Roman Kennke wrote: > OK. > > Thanks, > Roman > > >> >> Please review this small change that uses atomic operations to update >> StringDedupTable's entries and entries_removed counter. >> >> This is *not* a correctness fix or performance enhancement, but for >> Shenandoah GC to move StringDedupTable cleanup task into concurrent >> phase, while holding StringDedupTable_lock. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8236878 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236878/webrev.00/index.html >> >> Test: >> ? hotspot_gc (fastdebug and release) on x86_64 Linux >> ? Submit test in progress. >> >> Thanks, >> >> -Zhengyu >> > From aph at redhat.com Mon Jan 13 14:59:59 2020 From: aph at redhat.com (Andrew Haley) Date: Mon, 13 Jan 2020 14:59:59 +0000 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC In-Reply-To: References: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> <69b38972-8061-b706-befe-28d49af42fe7@redhat.com> <45a22fc8-af0f-6632-3505-fe333a67145d@redhat.com> Message-ID: On 1/10/20 4:56 PM, Stuart Monteith wrote: > Something like "adr(rscratch1, InternalAddress(nmethod field))" has > been suggested before. My problem has been finding what the address of > the nmethod fields would be without knowing where the instruction is > relative to them before the nmethod has been allocated. Relocations > can be performed between the sections - constants, instructions and > stubs - with section_word_Relocation instead of > internal_word_Relocation. 
However, there doesn't appear to be a > concept for the CodeBlob header during code emissions. OK, I see.. > The nmethod structure is at 0x0000ffff68e0b010. Between that address > and the main code is the relocation section, which we don't know the > size of during instruction emission into the CodeBuffer. > It appears that to relocate references to the nmethod structure from > the code section before the nmethod is constructed would require its > own relocation of sorts. I'm looking at adding a HEADER CodeSection > that would allow relocation of entries into the nmethod/CodeBlob > header. We don't need another section. I think there are only two problems with using the constant section. There is a large (and probably pointless) alignment gap so that even a single-word constant takes up a lot of space. Also, native methods don't have constant sections, but that should be easy to fix. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Mon Jan 13 15:01:41 2020 From: aph at redhat.com (Andrew Haley) Date: Mon, 13 Jan 2020 15:01:41 +0000 Subject: aarch64: Concurrent class unloading, nmethod barriers, ZGC In-Reply-To: <40c3cc91-8b7c-9bd6-1ff5-933b8a7a3166@oracle.com> References: <65e96ab7-3625-d5be-e5e0-be66c3137c8b@linaro.org> <69b38972-8061-b706-befe-28d49af42fe7@redhat.com> <40c3cc91-8b7c-9bd6-1ff5-933b8a7a3166@oracle.com> Message-ID: <1bad4ee7-93c3-041d-d805-fa8c0c76d3e2@redhat.com> On 1/9/20 10:35 PM, Erik Österlund wrote: > Now in this scenario, without an nmethod entry barrier, we can end up > calling a dead method. The nmethod entry barrier guards that by > enforcing the invariant that we can't enter dead nmethods. Wow. :-) Thank you for that. It's useful to have some discussion of this permanently recorded online for posterity. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From zgu at redhat.com Mon Jan 13 15:11:02 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 13 Jan 2020 10:11:02 -0500 Subject: [15] RFR(T) 8237017: Shenandoah: Remove racy assertion Message-ID: During concurrent weak root processing, it tries to CAS in NULL if the oop is dead, then asserts that the slot is indeed NULL. The assertion is racy, because another thread can release the slot and then reuse it (that's why it uses CAS in the first place), which can cause the assertion to fail. Bug: https://bugs.openjdk.java.net/browse/JDK-8237017 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237017/webrev.00/index.html Test: hotspot_gc_shenandoah fastdebug on x86_64 Linux Thanks, -Zhengyu From per.liden at oracle.com Mon Jan 13 15:12:51 2020 From: per.liden at oracle.com (Per Liden) Date: Mon, 13 Jan 2020 16:12:51 +0100 Subject: RFR: 8236153: ZGC: gc/z/TestUncommit.java fails with java.lang.Exception: Uncommitted too fast Message-ID: <24c41e59-3e2c-9284-489a-af487e6cccc0@oracle.com> The test gc/z/TestUncommit.java fails now and then on Windows when using -Xcomp. This test can fail if it's severely starved on CPU, as it will cause the timing to be off. The logs confirm that the test took an unusually long time to execute, suggesting it was starved on CPU. This only happens in test tiers using -Xcomp, which is likely causing the unusually high load. This patch disables this test when using -Xcomp. I've enabled some GC logging, which should be helpful if this test ever fails again.
Bug: https://bugs.openjdk.java.net/browse/JDK-8236153 Webrev: http://cr.openjdk.java.net/~pliden/8236153/webrev.0 /Per From thomas.schatzl at oracle.com Mon Jan 13 15:15:09 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 13 Jan 2020 16:15:09 +0100 Subject: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics In-Reply-To: <5b24e235-5466-15a1-78a6-6f63bfa1878e@oracle.com> References: <5b24e235-5466-15a1-78a6-6f63bfa1878e@oracle.com> Message-ID: <43090624-d8be-8600-a55e-1e10b1920135@oracle.com> Hi Liang, thanks for your contribution! I looked through the change a bit and have a few comments. What I noticed quickly after initial browsing through it is that this change implements three different features: 1) moving the memory uncommit into the concurrent phase 2) uncommit at the end of (almost) every GC 3) SoftMaxHeapSize These should be split across three separate changes (I already filed JDK-8236926 last week). No particular order I think, but the concurrent uncommit changes are probably hardest and will probably take most time. Some additional initial comments: - in G1CollectedHeap::check_soft_max_heap_size_changed(), instead of the check for "AllocateOldGenAt != NULL" we probably want to ask the HeapRegionManager directly about whether it supports this. Also print some log message that it failed. Even on success, print a log message :) - in that same method, I recommend first doing the alignment adjustment (which probably needs to be done for that suggested soft_max_capacity() method below too) and then check if it changed. That saves the repeated != _prev_soft_max_heap_size check. Actually, just using the suggested soft_max_heap_size() method should be fine. - changes in G1CollectedHeap::resize_heap_if_necessary: please make the method always use the soft max heap size as I do not understand why you would not do that.
I recommend adding a "soft_max_capacity()" method in G1CollectedHeap, and let that return MIN2(align_up(SoftMaxHeapSize, heap_alignment), max_capacity()). There are a few places that check SoftMaxHeapSize validity (e.g. SoftMaxHeapSize <= capacity()); they could probably all be removed then. - doing timing: you might have noticed, we are currently transitioning to use Ticks/Tickspan for points in time and durations at least for the calculation, so in any new code please avoid using os::elapsedTime(). Use: Ticks start = Ticks::now(); // timed code phase_times()->record....((Ticks::now() - start).seconds() * 1000.0); instead. - in G1CollectedHeap::shrink_heap_after_young_collection() I would prefer a structure like in expand_heap_after_collection, i.e.: size_t shrink_bytes = _heap_sizing_policy->shrink_amount(); if (shrink_bytes > 0) { // do actual preparation for shrinking } and put all that logic determining the amount to shrink in that shrink_amount() method. - concurrent uncommit: - as mentioned, please split the related changes out from the other changes. This change is hard enough to get right as is. - I would really prefer if we did not need to introduce another helper thread for uncommitting memory. Did you try using the G1YoungRemSetSamplingThread? I understand that uncommit might then delay young gen sampling, but I do not expect these events to occur all the time (but I have no reference here). In the first implementation we could have another thread if others do not object, but every additional thread takes some time to startup and teardown, and memory for at least one stack page. - please move the change in G1CollectedHeap::abort_concurrent_cycle() into a separate method - waiting for completion of the concurrent uncommit and the concurrent marking are completely different concerns. - I admit I haven't looked at all cases in detail, but the split in is_available() and is_unavailable_for_allocation() in HeapRegionManager seems incomplete and unnecessary.
Particularly because of bad naming, as the documentation for is_available() says it's actually is_available[_for_allocation]. Disregarding the negation, these two look equivalent with the problem that !is_available() != is_unavailable..., which is really bad style. I have not found a case where it is harmful to not consider the _concurrent_resizing_map in is_available(). The split of the state of a region between two bitmaps in HeapRegionManager (the available_map and the _concurrent_resizing_map) may be susceptible to tricky races. Please consider changing this to a real "state" as in "Available -> Removing -> Unavailable". This would make the code easier to read too. (And in the future "Adding" if required). - it should be possible to disable concurrent uncommit/resize via an experimental flag. Also there should be no concurrent resize thread if the HeapRegionManager does not support it. G1 could immediately do the heap change in that case. The reason for this flag is to allow users to disable this if they experience problems. - not sure about why some methods in HeapRegionManager have "resizing" in their method name. As far as I can tell, the change only allows concurrent uncommit. Maybe use the above "remove" for regions instead of "uncommit" regions. Background: The code and comments in HeapRegionManager are aware and fairly consistent (I hope) to not use the wording commit/uncommit for operations on HeapRegions. Only operations on memory pages should use committed/uncommit. The naming in the added methods does not respect that. - some of the methods (e.g. to find free regions) should inform the caller that there are to-be-removed regions to maybe retry after waiting for completion of that thread to avoid unexpected OOM. - I have a feeling that if the concurrent uncommit thread worked on pages, not regions, the code would be easier to understand. It would also solve the issue you asked about with the G1RegionsSmallerThanCommitSizeMapper.
You may still need to pass region numbers anyway for logging, but otoh the logging could be done asynchronously. - s/parallely/concurrently a few times - there is a bug in the synchronization of the concurrent uncommit thread: it seems possible that the uncommit thread is still working (iterating over the list of regions to uncommit) while a completed (young) GC may add new regions to that list as young gcs do not wait for completion of the uncommit thread. - the concurrently uncommitted regions only become available for commit at the next gc, which seems very long. Why not make them available for commit "immediately"? Related to that is the use of par_set/clear_bit in e.g. the available bitmap: since all par/clear actions are asserted to be in the vm thread at a safepoint, there does not seem to be a need for using the parallel variants of set/clear bit (if keeping the current mechanism). - please document the supposed interactions and assumptions like in the above two paragraphs between the "resize" thread and the other threads and safepoints. - please use the existing HeapRegionManager::shrink_by() method+infrastructure for passing a shrink request to the HRM, either immediately shrinking the heap or deferring for later shrinking (probably controlled by a flag) instead of adding new methods for the same purpose (with mostly the same contents). E.g. there is a lot of code duplication in the new code in HeapRegionManager, particularly the combination of HeapRegionManager::concurrent_uncommit_regions_memory, HRM::synchronize_uncommit_regions_memory and HRM::uncommit_regions could probably be cut to almost 1/3rd. On 13.01.20 12:45, Thomas Schatzl wrote: > Hi Liang, > > On 07.01.20 17:33, Liang Mao wrote: >> Hi Thomas, >> >> As we previously discussed, I use the concurrent heap uncommit/commit >> mechanism to implement the SoftMaxHeapSize for G1. It is also for >> the further implementation of >> G1ElasticHeap for ergonomic >> change of heap size.
In the previous 8u implementation, we had some >> limitations which are all >> removed now in this patch. The concurrent uncommit/commit can also >> work with some scenarios for >> immediate heap expansion. >> >> Here is the webrev link: >> http://cr.openjdk.java.net/~luchsh/8236073.webrev/ >> >> We still have some questions. >> 1. Does the SoftMaxHeapSize limitation need to consider the GC time >> ratio as in >> expand_heap_after_young_collection? Now we haven't put the logic in yet. I am not completely clear what you are asking about, but the gc time ratio only affects the current "optimal" heap size which is bounded by SoftMaxHeapSize/max_capacity. >> 2. The concurrent uncommit/commit can only work for >> G1RegionsLargerThanCommitSizeMapper but not >> G1RegionsSmallerThanCommitSizeMapper which might need some locks to >> ensure the multi-thread >> synchronization issue (heap may expand immediately). I think bringing >> the lock synchronization >> may not be worthy for the little gain. Another idea is can we just not >> uncommit the pages of >> auxiliary data if in G1RegionsSmallerThanCommitSizeMapper? Heap >> regions should not be >> G1RegionsSmallerThanCommitSizeMapper most of the time I guess... >> >> Looking forward to your advice:) Thanks, Thomas From rkennke at redhat.com Mon Jan 13 16:24:49 2020 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 13 Jan 2020 17:24:49 +0100 Subject: [15] RFR(T) 8237017: Shenandoah: Remove racy assertion In-Reply-To: References: Message-ID: Yes please remove that. Thanks, Roman > During concurrent weak root processing, it tries to CAS in NULL if the > oop is dead, then asserts that the slot is indeed NULL. > > The assertion is racy, because another thread can release > the slot and then reuse it (that's why it uses CAS in the first place), > which can cause the assertion to fail.
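A plain-Java analogue of that race, for illustration only (AtomicReference standing in for the HotSpot slot; class and method names are hypothetical):

```java
import java.util.concurrent.atomic.AtomicReference;

// Why the assertion is racy: after thread 1 successfully CASes the dead
// value to null, thread 2 may release the now-empty slot and reuse it
// for a new value before thread 1 re-reads it. Asserting that the slot
// is still null after the CAS can therefore spuriously fail.
class WeakRootSlot {
    final AtomicReference<Object> slot = new AtomicReference<>();

    void clearIfDead(Object dead) {
        slot.compareAndSet(dead, null);
        // RACY if asserted here: another thread may already have reused
        // the slot, so slot.get() is not guaranteed to still be null.
    }
}
```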
> > Bug: https://bugs.openjdk.java.net/browse/JDK-8237017 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237017/webrev.00/index.html > > Test: > ? hotspot_gc_shenandoah fastdebug on x86_64 Linux > > Thanks, > > -Zhengyu > From zgu at redhat.com Mon Jan 13 18:18:57 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 13 Jan 2020 13:18:57 -0500 Subject: [14] RFR 8237038: Shenandoah: Reduce thread pool size in TestEvilSyncBug.java test Message-ID: Please review this small patch to reduce thread pool size in TestEvilSyncBug.java test. I have observed problems with the test on many core system, including crashes on arm server with 48 cores when running 4 concurrent test jobs. Bug: https://bugs.openjdk.java.net/browse/JDK-8237038 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237038/webrev.00/ Test: hotspot_gc_shenandoah (fastdebug and release) with 4 concurrent test jobs on 48 cores arm machine. Thanks, -Zhengyu From shade at redhat.com Mon Jan 13 18:24:39 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 13 Jan 2020 19:24:39 +0100 Subject: [14] RFR 8237038: Shenandoah: Reduce thread pool size in TestEvilSyncBug.java test In-Reply-To: References: Message-ID: <9750a26b-6428-06b1-331c-50c9eea1c8b0@redhat.com> On 1/13/20 7:18 PM, Zhengyu Gu wrote: > Please review this small patch to reduce thread pool size in > TestEvilSyncBug.java test. > > I have observed problems with the test on many core system, including > crashes on arm server with 48 cores when running 4 concurrent test jobs. > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237038 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237038/webrev.00/ Looks fine. Suggestion to name this thing "numJobs" and shorter comment (not tested): // Use 1/4 of available processors to avoid over-saturation. 
int numJobs = Math.max(1, Runtime.getRuntime().availableProcessors() / 4); ExecutorService pool = Executors.newFixedThreadPool(numJobs); -- Thanks, -Aleksey From zgu at redhat.com Mon Jan 13 18:54:15 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 13 Jan 2020 13:54:15 -0500 Subject: [14] RFR 8237038: Shenandoah: Reduce thread pool size in TestEvilSyncBug.java test In-Reply-To: <9750a26b-6428-06b1-331c-50c9eea1c8b0@redhat.com> References: <9750a26b-6428-06b1-331c-50c9eea1c8b0@redhat.com> Message-ID: <88df9400-e14e-9d15-3bf8-7b565d65019b@redhat.com> On 1/13/20 1:24 PM, Aleksey Shipilev wrote: > On 1/13/20 7:18 PM, Zhengyu Gu wrote: >> Please review this small patch to reduce thread pool size in >> TestEvilSyncBug.java test. >> >> I have observed problems with the test on many core system, including >> crashes on arm server with 48 cores when running 4 concurrent test jobs. >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237038 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237038/webrev.00/ > > Looks fine. > > Suggestion to name this thing "numJobs" and shorter comment (not tested): > > // Use 1/4 of available processors to avoid over-saturation. > int numJobs = Math.max(1, Runtime.getRuntime().availableProcessors() / 4); > ExecutorService pool = Executors.newFixedThreadPool(numJobs); > Updated as you suggested and pushed. Thanks, -Zhengyu > From maoliang.ml at alibaba-inc.com Tue Jan 14 09:07:50 2020 From: maoliang.ml at alibaba-inc.com (Liang Mao) Date: Tue, 14 Jan 2020 17:07:50 +0800 Subject: =?UTF-8?B?UmU6IFJGUjogODIzNjA3MzogRzE6IFVzZSBTb2Z0TWF4SGVhcFNpemUgdG8gZ3VpZGUgR0Mg?= =?UTF-8?B?aGV1cmlzdGljcw==?= In-Reply-To: <43090624-d8be-8600-a55e-1e10b1920135@oracle.com> References: <5b24e235-5466-15a1-78a6-6f63bfa1878e@oracle.com>, <43090624-d8be-8600-a55e-1e10b1920135@oracle.com> Message-ID: <359fbef8-6735-4958-b76f-56430f1a4108.maoliang.ml@alibaba-inc.com> Hi Thomas, Thank you for the detailed comments! 
Most of the suggestions I will follow to do the modification. And I still have some questions: >> 1. Does the SoftMaxHeapSize limitation need to consider the GC time >> ratio as in >> expand_heap_after_young_collection? Now we haven't put the logic in yet. > I am not completely clear what you are asking about, but the gc time > ratio only affects the current "optimal" heap size which is bounded by > SoftMaxHeapSize/max_capacity. The decision to shrink to SoftMaxHeapSize in this patch is based on the method "G1HeapSizingPolicy::can_shrink_heap_size_to" which counts "used" + "reserve" + "young". We will change it to _heap_sizing_policy->shrink_amount(); as you commented. I'm not considering the GC time ratio as a factor to determine whether the heap can be shrunk to SoftMaxHeapSize. > - changes in G1CollectedHeap::resize_heap_if_necessary: please make the > method to always use the soft max heap size as I do not understand why > you would not do that. Do you think we need to apply the logic "can_shrink_heap_size_to" inside resize_heap_if_necessary to determine whether to make the soft max size the limit? > - there is a bug in the synchronization of the concurrent uncommit > thread: it seems possible that the uncommit thread is still working > (iterating over the list of regions to uncommit) while a completed > (young) GC may add new regions to that list as young gcs do not wait for > completion of the uncommit thread. The uncommit thread could be working in parallel with the VMThread, but the VMThread will not add regions to the concurrent_resizing_list if the concurrent resizing thread is in "working" state. > Related to that is the use of par_set/clear_bit in e.g. the available > bitmap: since all par/clear actions are asserted to be in the vm thread > at a safepoint, there does not seem to be a need for using the parallel > variants of set/clear bit (if keeping the current mechanism).
For the above reason, because concurrent uncommit can run in parallel with the VMThread, the bit set/clear done in the VM thread at a safepoint has to use the parallel variants. > - I have a feeling that if the concurrent uncommit thread worked on > pages, not regions, the code would be easier to understand. It would > also solve the issue you asked about with the > G1RegionsSmallerThanCommitSizeMapper. You may still need to pass region > numbers anyway for logging, but otoh the logging could be done > asynchronously. I don't quite understand this part... For the G1RegionsSmallerThanCommitSizeMapper, a page can be simultaneously requested to commit in the VMThread to expand the heap and to uncommit in the concurrent thread to shrink the heap. Looks like lowering the uncommit work to page level couldn't help this... For the features you listed below, 1) moving the memory uncommit into the concurrent phase 2) uncommit at the end of (almost) every GC 3) SoftMaxHeapSize Since most of the code is for the concurrent framework, do you think 2) and 3) can be combined and implemented first? (The uncommit will happen immediately) Thanks, Liang ------------------------------------------------------------------ From:Thomas Schatzl Send Time:2020 Jan. 13 (Mon.) 23:15 To:"MAO, Liang" ; hotspot-gc-dev Subject:Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics Hi Liang, thanks for your contribution! I looked through the change a bit and have a few comments. What I noticed quickly after initial browsing through it is that this change implements three different features: 1) moving the memory uncommit into the concurrent phase 2) uncommit at the end of (almost) every GC 3) SoftMaxHeapSize These should be split across three separate changes (I already filed JDK-8236926 last week). No particular order I think, but the concurrent uncommit changes are probably hardest and will probably take most time.
Some additional initial comments: - in G1CollectedHeap::check_soft_max_heap_size_changed(), instead of the check for "AllocateOldGenAt != NULL" we probably want to ask the HeapRegionManager directly about whether it supports this. Also print some log message that it failed. Even on success, print a log message :) - in that same method, I recommend first doing the alignment adjustment (which probably needs to be done for that suggested soft_max_capacity() method below too) and then check if it changed. That saves the repeated != _prev_soft_max_heap_size check. Actually, just using the suggested soft_max_heap_size() method should be fine. - changes in G1CollectedHeap::resize_heap_if_necessary: please make the method always use the soft max heap size as I do not understand why you would not do that. I recommend adding a "soft_max_capacity()" method in G1CollectedHeap, and let that return MIN2(align_up(SoftMaxHeapSize, heap_alignment), max_capacity()). There are a few places that check SoftMaxHeapSize validity (e.g. SoftMaxHeapSize <= capacity()); they could probably all be removed then. - doing timing: you might have noticed, we are currently transitioning to use Ticks/Tickspan for points in time and durations at least for the calculation, so in any new code please avoid using os::elapsedTime(). Use: Ticks start = Ticks::now(); // timed code phase_times()->record....((Ticks::now() - start).seconds() * 1000.0); instead. - in G1CollectedHeap::shrink_heap_after_young_collection() I would prefer a structure like in expand_heap_after_collection, i.e.: size_t shrink_bytes = _heap_sizing_policy->shrink_amount(); if (shrink_bytes > 0) { // do actual preparation for shrinking } and put all that logic determining the amount to shrink in that shrink_amount() method. - concurrent uncommit: - as mentioned, please split the related changes out from the other changes. This change is hard enough to get right as is.
- I would really prefer if we did not need to introduce another helper thread for uncommitting memory. Did you try using the G1YoungRemSetSamplingThread? I understand that uncommit might then delay young gen sampling, but I do not expect these events to occur all the time (but I have no reference here). In the first implementation we could have another thread if others do not object, but every additional thread takes some time to startup and teardown, and memory for at least one stack page. - please move the change in G1CollectedHeap::abort_concurrent_cycle() into a separate method - waiting for completion of the concurrent uncommit and the concurrent marking are completely different concerns. - I admit I haven't looked at all cases in detail, but the split in is_available() and is_unavailable_for_allocation() in HeapRegionManager seems incomplete and unnecessary. Particularly because of bad naming, as the documentation for is_available() says it's actually is_available[_for_allocation]. Disregarding the negation, these two look equivalent, with the problem that !is_available() != is_unavailable..., which is really bad style. I have not found a case where it is harmful to not consider the _concurrent_resizing_map in is_available(). Splitting the state of a region between two bitmaps in HeapRegionManager (the available_map and the _concurrent_resizing_map) may be susceptible to tricky races. Please consider changing this to a real "state" as in "Available -> Removing -> Unavailable". This would make the code easier to read too. (And in the future "Adding" if required). - it should be possible to disable concurrent uncommit/resize via an experimental flag. Also there should be no concurrent resize thread if the HeapRegionManager does not support it. G1 could immediately do the heap change in that case. The reason for this flag is to allow users to disable this if they experience problems.
- not sure about why some methods in HeapRegionManager have "resizing" in their method name. As far as I can tell, the change only allows concurrent uncommit. Maybe use the above "remove" for regions instead of "uncommit" regions. Background: The code and comments in HeapRegionManager are aware and fairly consistent (I hope) to not use the wording commit/uncommit for operations on HeapRegions. Only operations on memory pages should use commit/uncommit. The naming in the added methods does not respect that. - some of the methods (e.g. to find free regions) should inform the caller that there are to-be-removed regions, so the caller can maybe retry after waiting for completion of that thread to avoid an unexpected OOM. - I have a feeling that if the concurrent uncommit thread worked on pages, not regions, the code would be easier to understand. It would also solve the issue you asked about with the G1RegionsSmallerThanCommitSizeMapper. You may still need to pass region numbers anyway for logging, but otoh the logging could be done asynchronously. - s/parallely/concurrently a few times - there is a bug in the synchronization of the concurrent uncommit thread: it seems possible that the uncommit thread is still working (iterating over the list of regions to uncommit) while a completed (young) GC may add new regions to that list, as young gcs do not wait for completion of the uncommit thread. - the concurrently uncommitted regions only become available for commit at the next gc, which seems very long. Why not make them available for commit "immediately"? Related to that is the use of par_set/clear_bit in e.g. the available bitmap: since all par/clear actions are asserted to be in the vm thread at a safepoint, there does not seem to be a need for using the parallel variants of set/clear bit (if keeping the current mechanism). - please document the supposed interactions and assumptions like in the above two paragraphs between the "resize" thread and the other threads and safepoints.
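The suggested replacement of the two bitmaps with one explicit region state could look roughly like this. Purely illustrative: all names here are hypothetical, not HotSpot's:

```cpp
#include <cassert>

// One explicit state per region instead of two bitmaps
// (available_map and _concurrent_resizing_map).
enum class RegionState { Unavailable, Available, Removing };

struct RegionStateMachine {
  RegionState state = RegionState::Unavailable;

  // VM thread at a safepoint: a freshly committed region becomes usable.
  void make_available() {
    assert(state == RegionState::Unavailable);
    state = RegionState::Available;
  }

  // VM thread at a safepoint: hand the region to the concurrent uncommit
  // thread; it is no longer available for allocation from this point on.
  void start_removal() {
    assert(state == RegionState::Available);
    state = RegionState::Removing;
  }

  // Concurrent uncommit thread: uncommit of the underlying pages finished.
  void finish_removal() {
    assert(state == RegionState::Removing);
    state = RegionState::Unavailable;
  }

  bool is_available() const { return state == RegionState::Available; }
};
```

An "Adding" state would slot into the same transition chain if concurrent commit is added later.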
- please use the existing HeapRegionManager::shrink_by() method+infrastructure for passing a shrink request to the HRM, either immediately shrinking the heap or deferring for later shrinking (probably controlled by a flag) instead of adding new methods for the same purpose (with mostly the same contents). E.g. there is a lot of code duplication in the new code in HeapRegionManager, particularly the combination of HeapRegionManager::concurrent_uncommit_regions_memory, HRM::synchronize_uncommit_regions_memory and HRM::uncommit_regions could probably be cut to almost 1/3rd. On 13.01.20 12:45, Thomas Schatzl wrote: > Hi Liang, > > On 07.01.20 17:33, Liang Mao wrote: >> Hi Thomas, >> >> As we previously discussed, I use the concurrent heap uncommit/commit >> mechanism to implement the SoftMaxHeapSize for G1. It is also for the further implementation of >> G1ElasticHeap for ergonomic >> change of heap size. In the previous 8u implementation, we had some >> limitations which are all >> removed now in this patch. The concurrent uncommit/commit can also >> work with some scenarios for >> immediate heap expansion. >> >> Here is the webrev link: >> http://cr.openjdk.java.net/~luchsh/8236073.webrev/ >> >> We still have some questions. >> 1. Does the SoftMaxHeapSize limitation need to consider the GC time >> ratio as in >> expand_heap_after_young_collection? Now we haven't put the logic in yet. I am not completely clear what you are asking about, but the gc time ratio only affects the current "optimal" heap size which is bounded by SoftMaxHeapSize/max_capacity. >> 2. The concurrent uncommit/commit can only work for >> G1RegionsLargerThanCommitSizeMapper but not >> G1RegionsSmallerThanCommitSizeMapper which might need some locks to >> ensure the multi-thread >> synchronization issue (heap may expand immediately). I think bringing >> the lock synchronization >> may not be worthy for the little gain.
Another idea is: can we just not >> uncommit the pages of >> auxiliary data if in G1RegionsSmallerThanCommitSizeMapper? Heap >> regions should not be >> G1RegionsSmallerThanCommitSizeMapper most of the time I guess... >> >> Looking forward to your advice:) Thanks, Thomas From erik.osterlund at oracle.com Tue Jan 14 09:33:31 2020 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Tue, 14 Jan 2020 10:33:31 +0100 Subject: RFR: 8236153: ZGC: gc/z/TestUncommit.java fails with java.lang.Exception: Uncommitted too fast In-Reply-To: <24c41e59-3e2c-9284-489a-af487e6cccc0@oracle.com> References: <24c41e59-3e2c-9284-489a-af487e6cccc0@oracle.com> Message-ID: <53eea1e2-d57a-1697-b5e3-282f4bf0f04d@oracle.com> Hi Per, Looks good. /Erik On 1/13/20 4:12 PM, Per Liden wrote: > The test gc/z/TestUncommit.java fails now and then on Windows when > using -Xcomp. This test can fail if it's severely starved on CPU, as > it will cause the timing to be off. The logs confirm that the test > took an unusually long time to execute, suggesting it was starved on > CPU. This only happens in test tiers using -Xcomp, which is likely > causing the unusually high load. This patch disables this test when > using -Xcomp. I've enabled some GC logging, which should be helpful if > this test ever fails again. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8236153 > Webrev: http://cr.openjdk.java.net/~pliden/8236153/webrev.0 > > /Per From per.liden at oracle.com Tue Jan 14 09:41:35 2020 From: per.liden at oracle.com (Per Liden) Date: Tue, 14 Jan 2020 10:41:35 +0100 Subject: RFR: 8236153: ZGC: gc/z/TestUncommit.java fails with java.lang.Exception: Uncommitted too fast In-Reply-To: <53eea1e2-d57a-1697-b5e3-282f4bf0f04d@oracle.com> References: <24c41e59-3e2c-9284-489a-af487e6cccc0@oracle.com> <53eea1e2-d57a-1697-b5e3-282f4bf0f04d@oracle.com> Message-ID: Thanks Erik! /Per On 1/14/20 10:33 AM, erik.osterlund at oracle.com wrote: > Hi Per, > > Looks good.
> > /Erik > > On 1/13/20 4:12 PM, Per Liden wrote: >> The test gc/z/TestUncommit.java fails now and then on Windows when >> using -Xcomp. This test can fail if it's severely starved on CPU, as >> it will cause the timing to be off. The logs confirm that the test >> took an unusually long time to execute, suggesting it was starved on >> CPU. This only happens in test tiers using -Xcomp, which is likely >> causing the unusually high load. This patch disables this test when >> using -Xcomp. I've enabled some GC logging, which should be helpful if >> this test ever fails again. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8236153 >> Webrev: http://cr.openjdk.java.net/~pliden/8236153/webrev.0 >> >> /Per > From thomas.schatzl at oracle.com Tue Jan 14 11:36:22 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 14 Jan 2020 12:36:22 +0100 Subject: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics In-Reply-To: <359fbef8-6735-4958-b76f-56430f1a4108.maoliang.ml@alibaba-inc.com> References: <5b24e235-5466-15a1-78a6-6f63bfa1878e@oracle.com> <43090624-d8be-8600-a55e-1e10b1920135@oracle.com> <359fbef8-6735-4958-b76f-56430f1a4108.maoliang.ml@alibaba-inc.com> Message-ID: <36b31e9a-ee86-50d9-8042-bc79e6756777@oracle.com> Hi, On 14.01.20 10:07, Liang Mao wrote: > Hi Thomas, > > Thank you for the detailed comments! > Most of the suggestions I will follow to do the modification. > And I still have some questions: > >>> 1. Does the SoftMaxHeapSize limitation need to consider the GC time >>> ratio as in >>> expand_heap_after_young_collection? Now we haven't put the logic in yet. > >> I am not completely clear what you are asking about, but the gc time >> ratio only affects the current "optimal" heap size which is bounded by >> SoftMaxHeapSize/max_capacity. > > The decision to shrink to SoftMaxHeapSize in this patch is based > on the method "G1HeapSizingPolicy::can_shrink_heap_size_to" which > counts "used" + "reserve" + "young".
We will change it to > _heap_sizing_policy->shrink_amount(); as you commented. > I'm not considering the GC time ratio as a factor to determine > whether the heap can be shrunk to SoftMaxHeapSize. Going back to the "spec": "When -XX:SoftMaxHeapSize is set, the GC should strive to not grow heap size beyond the specified size, unless the GC decides it's necessary to do so. The soft max heap size should not be allowed to be set to a value smaller than min heap size (-Xms) or greater than max heap size (-Xmx). When not set on the command-line, this flag should default to the max heap size." (https://bugs.openjdk.java.net/browse/JDK-8222181) This is a very loose definition, and "unless the GC decides it's necessary" may mean anything. Looking at ZGC code, it mostly uses it to drive the GCs (and determine expansion amount), and lets regular expansion/shrinking do the work without new rules. I would at first tend to do the same: if the existing heap policy indicates that an expansion at young gc (which takes GCTimeRatio into account) is needed for whatever reason, I would actually let G1 keep doing it; conversely I would also take GCTimeRatio into account when trying to shrink, to keep the policy symmetric. The current implementation probably preferentially shrinks towards SoftMaxHeapSize, correct? (this shrinking also seems to be limited to exactly SoftMaxHeapSize - why not below that if the collector thinks it would be okay?) Do you have measurements with/without GCTimeRatio in the shrinking? Can you describe, with over-time heap occupancy graphs, that this does not work at all in your workloads? Measurements of committed heap over time of the current solution would be appreciated too. (I haven't had the time yet to set up some useful testing application that can be used to simulate phases of such a workload to show heap shrinking, but I assume you have some data.)
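The symmetric policy argued for here (the same GCTimeRatio-derived signal driving both expansion and shrinking, with SoftMaxHeapSize acting as a goal rather than a hard bound) can be sketched abstractly. All names, thresholds and step sizes below are hypothetical illustrations, not G1's actual heuristics:

```cpp
#include <cassert>
#include <cstddef>

// Illustrative symmetric sizing decision. Positive result: bytes to
// expand; negative: bytes to shrink; 0: leave the heap alone.
long resize_amount(double gc_time_percent,    // recent GC time / total time
                   double target_gc_percent,  // derived from GCTimeRatio
                   size_t capacity,
                   size_t soft_max_capacity) {
  if (gc_time_percent > target_gc_percent) {
    // Spending too much time in GC: grow, even beyond the soft goal.
    return (long)(capacity / 10);             // hypothetical 10% step
  }
  if (gc_time_percent < target_gc_percent / 2 && capacity > soft_max_capacity) {
    // Comfortably under the target and above the soft goal: shrink toward it.
    return -(long)(capacity - soft_max_capacity);
  }
  return 0;
}
```

The point of the sketch is only that both branches consult the same GC time signal, so shrinking to SoftMaxHeapSize does not get a special, asymmetric rule.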
>> - changes in G1CollectedHeap::resize_heap_if_necessary: please make the >> method always use the soft max heap size, as I do not understand why >> you would not do that. > > Do you think we need to apply the logic "can_shrink_heap_size_to" > inside resize_heap_if_necessary to determine whether to make the soft > max size the limit? resize_heap_if_necessary() is the equivalent of adjust_heap_after_young_collection() for after full gc (the naming could be unified at some point). What the change implements right now is to respect SoftMaxHeapSize only on an explicit gc in resize_heap_if_necessary(), while always respecting it during young gcs. Do you have a reason for this decision? This seems like an inconsistency I can not find a reason for. As mentioned above, I would try to keep SoftMaxHeapSize only a goal for starting (concurrent) garbage collections, with "natural" sizing trying to keep the SoftMaxHeapSize goal. Particularly, if a (compacting) full gc won't meet the SoftMaxHeapSize, what else will? It is indeed unfortunate that you might need to tweak Min/MaxFreeRatio to achieve a higher uncommit ratio at full gc... This change (system.gc specifically trying to meet SoftMaxHeapSize) also seems to be an artifact of your usage of this feature - maybe you happen to always issue a system.gc() after you changed SoftMaxHeapSize? It would probably be better if a concurrent cycle were triggered automatically, similar to periodic gcs.
Okay, I now see the check in G1CollectedHeap::adjust_heap_after_young_collection(), but that also prohibits expansion during young GC, which seems relatively disruptive. I think I mentioned earlier that in this case (if we want to expand) it might be better to wait on completion of the parallel uncommit _if_ the other remaining regions are not enough. (Alternatively one could add another uncommit/commit request to a hypothetical uncommit task queue for that resize thread). My thinking is that at worst, this would result in the same behavior as before (i.e. blocking because of commit/uncommit in progress) instead of changing the overall behavior by denying expansion requests (silently, which is pretty bad). This could probably result in the heap sizing policy getting a few samples with high gc time ratio, expanding even more when G1 is finally allowed to expand. (Also, fortunately UseGCOverheadLimit is not implemented in G1, as that might kill the VM spuriously because of that...) Or do you have some reason for this particular implementation? (The reason I am asking so much about details is to get a common understanding of the solution, i.e. what should happen when, and hopefully including why, for the next guy ;)) Maybe it would be good to refactor the code in G1CollectedHeap::adjust_heap_after_young_collection() (similar to or expand in G1CollectedHeap::adjust_heap_after_young_collection()), e.g. calculate some heap_size_change() value from the heap sizing policy, and then use that value depending on whether it is positive or negative to expand or shrink. > >> Related to that is the use of par_set/clear_bit in e.g. the available >> bitmap: since all par/clear actions are asserted to be in the vm thread >> at a safepoint, there does not seem to be a need for using the parallel >> variants of set/clear bit (if keeping the current mechanism).
> > For the above reason that concurrent uncommit can run in parallel with > the VMThread, the bit set/clear in the VM thread at a safepoint has > to be parallel. If concurrent uncommit is working, both heap shrinking and expansion in the safepoint are disabled as far as I can tell. I.e. 2996 void G1CollectedHeap::adjust_heap_after_young_collection() { 2997 if (concurrent_heap_resize()->resize_thread()->during_cycle()) { 2998 // Do not adjust if concurrent resizing is in progress 2999 return; 3000 } 3001 3002 double start_time_ms = os::elapsedTime(); 3003 shrink_heap_after_young_collection(); 3004 phase_times()->recor...[..] 3005 // Shrinking might have started resizing 3006 if (!concurrent_heap_resize()->resize_thread()->during_cycle()) { 3007 expand_heap_after_young_collection(); 3008 } 3009 } This method seems to be the only caller of expand/shrink_heap_after_young_collection(). Another question I had but forgot: "resize" is used in the thread name and in other method and class names instead of e.g. "uncommit" or "shrink". Do you plan to add concurrent commit too? >> - I have a feeling that if the concurrent uncommit thread worked on >> pages, not regions, the code would be easier to understand. It would >> also solve the issue you asked about with the >> G1RegionsSmallerThanCommitSizeMapper. You may still need to pass region >> numbers anyway for logging, but otoh the logging could be done >> asynchronously. > > I don't quite understand this part... For > the G1RegionsSmallerThanCommitSizeMapper, > a page can be simultaneously requested to commit in the VMThread to expand > the heap and uncommit in the concurrent thread to shrink the heap. Looks like lowering > the uncommit work to the page level couldn't help this... I see the problem, but as above I think expansion (commit) and shrinking (uncommit) can't happen at the same time at the moment, and it actually might be premature to allow expansion and shrinking to occur at the same time.
Some more options are serializing these requests; one I described above (if a request is pending, wait for its completion before starting a new one). This might be the default option for other reasons anyway. Another is certainly to simply not uncommit them concurrently; then maybe we can still uncommit them immediately? > For the features you listed below, > 1) moving the memory uncommit into the concurrent phase > 2) uncommit at the end of (almost) every GC > 3) SoftMaxHeapSize > > Since most of the code is for the concurrent framework, > do you think 2) and 3) can be combined and implemented first? > (The uncommit will happen immediately) I think that would be fine. Thanks, Thomas From aph at redhat.com Tue Jan 14 14:22:30 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 14 Jan 2020 14:22:30 +0000 Subject: Is CPU_MULTI_COPY_ATOMIC the correct test here? Message-ID: AArch64 is multi-copy atomic, but it has a relaxed memory model. I'm looking at the CPU_MULTI_COPY_ATOMIC test in this code: template <class E, MEMFLAGS F, unsigned int N> bool GenericTaskQueue<E, F, N>::pop_global(volatile E& t) { Age oldAge = _age.get(); // Architectures with weak memory model require a barrier here // to guarantee that bottom is not older than age, // which is crucial for the correctness of the algorithm. #ifndef CPU_MULTI_COPY_ATOMIC OrderAccess::fence(); #endif uint localBot = Atomic::load_acquire(&_bottom); uint n_elems = size(localBot, oldAge.top()); if (n_elems == 0) { return false; } It seems to me that what we're asking here is not whether the CPU is multi-copy atomic, but whether it's TSO or not. I'd like to turn CPU_MULTI_COPY_ATOMIC off for AArch64, but I think that GenericTaskQueue::pop_global() will break if I do. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From zgu at redhat.com Tue Jan 14 14:37:18 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 14 Jan 2020 09:37:18 -0500 Subject: Is CPU_MULTI_COPY_ATOMIC the correct test here? In-Reply-To: References: Message-ID: <220ace02-5c1d-e184-9b98-3b586bc85bd4@redhat.com> Hi Andrew, Here is the discussion on this particular issue ... https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-September/034900.html -Zhengyu On 1/14/20 9:22 AM, Andrew Haley wrote: > AArch64 is multi-copy atomic, but it has a relaxed memory model. I'm looking > at the CPU_MULTI_COPY_ATOMIC in this code: > > template <class E, MEMFLAGS F, unsigned int N> > bool GenericTaskQueue<E, F, N>::pop_global(volatile E& t) { > Age oldAge = _age.get(); > // Architectures with weak memory model require a barrier here > // to guarantee that bottom is not older than age, > // which is crucial for the correctness of the algorithm. > #ifndef CPU_MULTI_COPY_ATOMIC > OrderAccess::fence(); > #endif > uint localBot = Atomic::load_acquire(&_bottom); > uint n_elems = size(localBot, oldAge.top()); > if (n_elems == 0) { > return false; > } > > It seems to me that what we're asking here is not whether the CPU is > multi-copy atomic, but whether it's TSO or not. I'd like to turn > CPU_MULTI_COPY_ATOMIC off for AArch64, but I think that > GenericTaskQueue::pop_global() will break if I do. > From aph at redhat.com Tue Jan 14 14:58:09 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 14 Jan 2020 14:58:09 +0000 Subject: Is CPU_MULTI_COPY_ATOMIC the correct test here? In-Reply-To: <220ace02-5c1d-e184-9b98-3b586bc85bd4@redhat.com> References: <220ace02-5c1d-e184-9b98-3b586bc85bd4@redhat.com> Message-ID: <1193d905-5d44-8688-0cca-2c4bd71b9f3d@redhat.com> On 1/14/20 2:37 PM, Zhengyu Gu wrote: > > Here is the discussion on this particular issue ...
> > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-September/034900.html The code (and its comment) seems to be related to whether this is a relaxed-memory machine. Much of the discussion seems to be related to that, too. I can't see any discussion about multi-copy atomicity. It seems to me that we do want this fence on AArch64, but we should not define CPU_MULTI_COPY_ATOMIC. I can't see why the concepts are mixed up. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Tue Jan 14 15:00:42 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 14 Jan 2020 15:00:42 +0000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> Message-ID: <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> On 8/15/19 4:49 PM, Derek White wrote: > However, setting CPU_MULTI_COPY_ATOMIC for AArch64 would result in changing behavior (removing fence in taskqueue) that should be looked at and tested by the aarch64 folks, so if Andrew Haley agrees, I suggest deferring changing this AArch64 behavior to a separate issue. Well, yes. What I don't understand is what any of this has to do with multi-copy atomicity. The fence is needed for _age.get() on all machines with relaxed memory consistency, AFAICS. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Tue Jan 14 15:05:01 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 14 Jan 2020 15:05:01 +0000 Subject: Is CPU_MULTI_COPY_ATOMIC the correct test here? In-Reply-To: <220ace02-5c1d-e184-9b98-3b586bc85bd4@redhat.com> References: <220ace02-5c1d-e184-9b98-3b586bc85bd4@redhat.com> Message-ID: On 1/14/20 2:37 PM, Zhengyu Gu wrote: > Here is the discussion on this particular issue ...
> > https://mail.openjdk.java.net/pipermail/hotspot-compiler-dev/2019-September/034900.html Ah, I found the answer in another thread: http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-March/008853.html -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Tue Jan 14 15:05:43 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 14 Jan 2020 15:05:43 +0000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> Message-ID: On 1/14/20 3:00 PM, Andrew Haley wrote: > On 8/15/19 4:49 PM, Derek White wrote: >> However, setting CPU_MULTI_COPY_ATOMIC for AArch64 would result in changing behavior (removing fence in taskqueue) that should be looked at and tested by the aarch64 folks, so if Andrew Haley agrees, I suggest deferring changing this AArch64 behavior to a separate issue. > > Well, yes. What I don't understand is what any of this has to do with > multi-copy atomicity. The fence is needed for _age.get() on all machines > with relaxed memory consistency, AFAICS. Ah, I found the answer in another thread: http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-March/008853.html -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Tue Jan 14 15:08:42 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 14 Jan 2020 15:08:42 +0000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> Message-ID: On 1/14/20 3:05 PM, Andrew Haley wrote: > On 1/14/20 3:00 PM, Andrew Haley wrote: >> On 8/15/19 4:49 PM, Derek White wrote: >>> However, setting CPU_MULTI_COPY_ATOMIC for AArch64 would result in changing behavior (removing fence in taskqueue) that should be looked at and tested by the aarch64 folks, so if Andrew Haley agrees, I suggest deferring changing this AArch64 behavior to a separate issue. >> >> Well, yes. What I don't understand is what any of this has to do with >> multi-copy atomicity. The fence is needed for _age.get() on all machines >> with relaxed memory consistency, AFAICS. > > Ah, I found the answer in another thread: > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-March/008853.html "No, the problem is not reordering. The problem is that _bottom, which is read after _age, might be older than _age because another processor didn't write it back yet. The fence (sync) makes the current thread wait until it has the new _bottom. "On Power, a write is not visible to all other threads simultaneously (no multiple-copy-atomicity)." So, my question: on a machine with relaxed memory, do we still need an acquire fence? -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd.
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From martin.doerr at sap.com Tue Jan 14 15:52:57 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 14 Jan 2020 15:52:57 +0000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> Message-ID: Hi Andrew, good catch. I think you're right. A multi-copy-atomic, but weak architecture (e.g. aarch64) needs an instruction which orders both volatile loads. (IA64 compilers use acquiring loads when accessing volatile fields; that's why IA64 is not affected by this problem.) Best regards, Martin > -----Original Message----- > From: Andrew Haley > Sent: Dienstag, 14. Januar 2020 16:09 > To: Derek White ; Doerr, Martin > ; David Holmes ; > hotspot-gc-dev at openjdk.java.net; Kim Barrett > Subject: Re: RFR(S): 8229422: Taskqueue: Outdated selection of weak > memory model platforms > > On 1/14/20 3:05 PM, Andrew Haley wrote: > > On 1/14/20 3:00 PM, Andrew Haley wrote: > >> On 8/15/19 4:49 PM, Derek White wrote: > >>> However, setting CPU_MULTI_COPY_ATOMIC for AArch64 would result > in changing behavior (removing fence in taskqueue) that should be looked at > and tested by the aarch64 folks, so if Andrew Haley agrees, I suggest > deferring changing this AArch64 behavior to a separate issue. > >> > >> Well, yes. What I don't understand is what any of this has to do with > >> multi-copy atomicity. The fence is needed for _age.get() on all machines > >> with relaxed memory consistency, AFAICS. > > > > Ah, I found the answer in another thread: > > > > http://mail.openjdk.java.net/pipermail/hotspot-dev/2013- > March/008853.html > > "No, the problem is not reordering. The problem is that _bottom, > which is read after _age, might be older than _age because another > processor didn't write it back yet.
The fence (sync) makes the > current thread wait until it has the new _bottom. > > "On Power, a write is not visible to all other threads simultaneously > (no multiple-copy-atomicity)." > > So, my question: on a machine with relaxed memory, do we still need an > acquire > fence? > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Tue Jan 14 16:15:36 2020 From: aph at redhat.com (Andrew Haley) Date: Tue, 14 Jan 2020 16:15:36 +0000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> Message-ID: <88f97b92-df9e-140c-a972-44982ae3f79b@redhat.com> On 1/14/20 3:52 PM, Doerr, Martin wrote: > good catch. I think you're right. A multi-copy-atomic, but weak > architecture (e.g. aarch64) needs an instruction which orders both > volatile loads. Good, I thought so. Given that TSO machines define OrderAccess::acquire() as no more than a compiler barrier, I believe that we could do something like #ifdef CPU_MULTI_COPY_ATOMIC OrderAccess::acquire(); #else OrderAccess::fence(); #endif -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From martin.doerr at sap.com Tue Jan 14 16:19:32 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Tue, 14 Jan 2020 16:19:32 +0000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: <88f97b92-df9e-140c-a972-44982ae3f79b@redhat.com> References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> <88f97b92-df9e-140c-a972-44982ae3f79b@redhat.com> Message-ID: Excellent. I'd propose the same fix. I've added Thomas Schatzl. Maybe he can have a look, too.
Best regards, Martin > -----Original Message----- > From: Andrew Haley > Sent: Dienstag, 14. Januar 2020 17:16 > To: Doerr, Martin ; Derek White > ; David Holmes ; > hotspot-gc-dev at openjdk.java.net; Kim Barrett > Subject: Re: RFR(S): 8229422: Taskqueue: Outdated selection of weak > memory model platforms > > On 1/14/20 3:52 PM, Doerr, Martin wrote: > > > good catch. I think you're right. A multi-copy-atomic, but weak > > architecture (e.g. aarch64) needs an instruction which orders both > > volatile loads. > > Good, I thought so. > > Given that TSO machines define OrderAccess::acquire() as no more than > a compiler barrier, I believe that we could do something like > > #ifdef CPU_MULTI_COPY_ATOMIC > OrderAccess::acquire(); > #else > OrderAccess::fence(); > #endif > > -- > Andrew Haley (he/him) > Java Platform Lead Engineer > Red Hat UK Ltd. > https://keybase.io/andrewhaley > EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From zgu at redhat.com Tue Jan 14 17:19:33 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 14 Jan 2020 12:19:33 -0500 Subject: [15] RFR 8236878: Use atomic instruction to update StringDedupTable's entries and entries_removed counters In-Reply-To: References: <97d04872-7abb-396d-7552-f85b4cf1b97b@redhat.com> Message-ID: Submit test also passed. May I get a second review? Thanks, -Zhengyu On 1/13/20 9:06 AM, Roman Kennke wrote: > OK. > > Thanks, > Roman > > >> >> Please review this small change that uses atomic operations to update >> StringDedupTable's entries and entries_removed counter. >> >> This is *not* a correctness fix or performance enhancement, but for >> Shenandoah GC to move StringDedupTable cleanup task into concurrent >> phase, while holding StringDedupTable_lock. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8236878 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236878/webrev.00/index.html >> >> Test: >> hotspot_gc (fastdebug and release) on x86_64 Linux >> Submit test in progress.
>> >> Thanks, >> -Zhengyu >> > From mikael.vidstedt at oracle.com Wed Jan 15 00:18:25 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Tue, 14 Jan 2020 16:18:25 -0800 Subject: 8237182(T): Update copyright header for shenandoah and epsilon files Message-ID: Please review this small change which adjusts the copyright headers for shenandoah and epsilon related files. JBS: https://bugs.openjdk.java.net/browse/JDK-8237182 webrev: http://cr.openjdk.java.net/~mikael/webrevs/8237182/webrev.00/open/webrev/ Description: Many/most of the shenandoah and epsilon related files are missing the following line in the copyright header: "DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER." This change simply adds that line to the relevant files. In src/hotspot/share/gc/shenandoah/shenandoahNormalMode.cpp there was also a missing empty line before one of the paragraphs in the header. Cheers, Mikael From igor.ignatyev at oracle.com Wed Jan 15 00:24:05 2020 From: igor.ignatyev at oracle.com (Igor Ignatyev) Date: Tue, 14 Jan 2020 16:24:05 -0800 Subject: 8237182(T): Update copyright header for shenandoah and epsilon files In-Reply-To: References: Message-ID: Hi Mikael, LGTM -- Igor > On Jan 14, 2020, at 4:18 PM, Mikael Vidstedt wrote: > > > Please review this small change which adjusts the copyright headers for shenandoah and epsilon related files. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8237182 > webrev: http://cr.openjdk.java.net/~mikael/webrevs/8237182/webrev.00/open/webrev/ > > Description: > > Many/most of the shenandoah and epsilon related files are missing the following line in the copyright header: > > "DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER." > > This change simply adds that line to the relevant files. In src/hotspot/share/gc/shenandoah/shenandoahNormalMode.cpp there was also a missing empty line before one of the paragraphs in the header.
> > Cheers, > Mikael > From david.holmes at oracle.com Wed Jan 15 01:00:55 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 15 Jan 2020 11:00:55 +1000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: <88f97b92-df9e-140c-a972-44982ae3f79b@redhat.com> References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> <88f97b92-df9e-140c-a972-44982ae3f79b@redhat.com> Message-ID: <23d3db9d-6603-c10c-8240-62cd82f4bae9@oracle.com> On 15/01/2020 2:15 am, Andrew Haley wrote: > On 1/14/20 3:52 PM, Doerr, Martin wrote: > >> good catch. I think you're right. A multi-copy-atomic, but weak >> architecture (e.g. aarch64) needs an instruction which orders both >> volatile loads. > > Good, I thought so. > > Given that TSO machines define OrderAccess::acquire() as no more than > a compiler barrier, I believe that we could do something like > > #ifdef CPU_MULTI_COPY_ATOMIC > OrderAccess::acquire(); > #else > OrderAccess::fence(); > #endif "acquire" isn't used to order loads it is used to pair with a "release" associated with the store of the variable now being loaded. If this is the code referred to: Age oldAge = _age.get(); // Architectures with weak memory model require a barrier here // to guarantee that bottom is not older than age, // which is crucial for the correctness of the algorithm. #ifndef CPU_MULTI_COPY_ATOMIC OrderAccess::fence(); #endif uint localBot = Atomic::load_acquire(&_bottom); then I think there is an assumption (perhaps incorrect) that the load_acquire will prevent reordering as well as performing the necessary "acquire" semantics. If the load_acquire doesn't prevent reordering then surely a loadload() barrier is what is needed. 
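A standalone model of the load sequence being debated here, using std::atomic instead of HotSpot's OrderAccess (CPU_MULTI_COPY_ATOMIC stands in for the real platform define, and the globals merely mirror the taskqueue's _age/_bottom fields, not the actual code):

```cpp
#include <atomic>
#include <cstdint>
#include <utility>

std::atomic<uint32_t> g_age{0};
std::atomic<uint32_t> g_bottom{0};

// _age must not be read later than _bottom; the open question in the
// thread is which barrier enforces that on each class of machine.
std::pair<uint32_t, uint32_t> read_age_then_bottom() {
  uint32_t old_age = g_age.load(std::memory_order_relaxed);
#ifdef CPU_MULTI_COPY_ATOMIC
  // Multi-copy-atomic but weakly ordered (e.g. aarch64): an acquiring
  // fence is enough to order the two loads.
  std::atomic_thread_fence(std::memory_order_acquire);
#else
  // Non-multi-copy-atomic (e.g. PowerPC): a full fence is required.
  std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
  uint32_t bottom = g_bottom.load(std::memory_order_acquire);
  return {old_age, bottom};
}
```

This is only a sketch of the semantics under discussion, not the taskqueue.hpp code; on TSO machines both branches reduce to a compiler barrier plus an ordinary load, which is why the `#ifdef` form was proposed.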
David ----- From zgu at redhat.com Wed Jan 15 03:11:56 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 14 Jan 2020 22:11:56 -0500 Subject: 8237182(T): Update copyright header for shenandoah and epsilon files In-Reply-To: References: Message-ID: <3925a75d-f7f0-cd69-670f-f35422a17250@redhat.com> Looks good to me. Thanks for fixing it. -Zhengyu On 1/14/20 7:18 PM, Mikael Vidstedt wrote: > > Please review this small change which adjusts the copyright headers for shenandoah and epsilon related files. > > JBS: https://bugs.openjdk.java.net/browse/JDK-8237182 > webrev: http://cr.openjdk.java.net/~mikael/webrevs/8237182/webrev.00/open/webrev/ > > Description: > > Many/most of the shenandoah and epsilon related files are missing the following line in the copyright header: > > "DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER." > > This change simply adds that line to the relevant files. In src/hotspot/share/gc/shenandoah/shenandoahNormalMode.cpp there was also a missing empty line before one of the paragraphs in the header. > > Cheers, > Mikael > From mikael.vidstedt at oracle.com Wed Jan 15 03:37:01 2020 From: mikael.vidstedt at oracle.com (Mikael Vidstedt) Date: Tue, 14 Jan 2020 19:37:01 -0800 Subject: 8237182(T): Update copyright header for shenandoah and epsilon files In-Reply-To: <3925a75d-f7f0-cd69-670f-f35422a17250@redhat.com> References: <3925a75d-f7f0-cd69-670f-f35422a17250@redhat.com> Message-ID: <7D1F566C-838D-4DAA-B525-F1012AD78DFC@oracle.com> Igor & Zhengyu, Thanks for the reviews! Cheers, Mikael > On Jan 14, 2020, at 7:11 PM, Zhengyu Gu wrote: > > Looks good to me. > > Thanks for fixing it. > > -Zhengyu > > On 1/14/20 7:18 PM, Mikael Vidstedt wrote: >> Please review this small change which adjusts the copyright headers for shenandoah and epsilon related files.
>> JBS: https://bugs.openjdk.java.net/browse/JDK-8237182 >> webrev: http://cr.openjdk.java.net/~mikael/webrevs/8237182/webrev.00/open/webrev/ >> Description: >> Many/most of the shenandoah and epsilon related files are missing the following line in the copyright header: >> "DO NOT ALTER OR REMOVE COPYRIGHT NOTICES OR THIS FILE HEADER." >> This change simply adds that line to the relevant files. In src/hotspot/share/gc/shenandoah/shenandoahNormalMode.cpp there was also a missing empty line before one of the paragraphs in the header. >> Cheers, >> Mikael > From maoliang.ml at alibaba-inc.com Wed Jan 15 03:52:02 2020 From: maoliang.ml at alibaba-inc.com (Liang Mao) Date: Wed, 15 Jan 2020 11:52:02 +0800 Subject: Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics In-Reply-To: <36b31e9a-ee86-50d9-8042-bc79e6756777@oracle.com> References: <5b24e235-5466-15a1-78a6-6f63bfa1878e@oracle.com> <43090624-d8be-8600-a55e-1e10b1920135@oracle.com> <359fbef8-6735-4958-b76f-56430f1a4108.maoliang.ml@alibaba-inc.com>, <36b31e9a-ee86-50d9-8042-bc79e6756777@oracle.com> Message-ID: Hi Thomas, I summarize the issues as follows: 1. Criterion of SoftMaxHeapSize I agree to keep the policy of SoftMaxHeapSize similar to ZGC to make it unified. So "expand_heap_after_young_collection" is used for meeting the basic GCTimeRatio and expands the heap immediately, which cannot be blocked for any reason. "adjust_heap_after_young_collection" cannot change that logic, and I will take both expansion and shrinking into consideration. Is my understanding correct here? 2. Full GC with SoftMaxHeapSize In my thought a non-explicit Full GC probably means insufficient heap capacity, so we may not keep shrinking within SoftMaxHeapSize, but an explicit FGC doesn't have that issue. That's the only reason why I checked if it is explicit.
But we will have the same logic to determine whether the heap can be shrunk, so the "explicit" check could be meaningless and I will remove it. 3. SoftMaxHeapSizeConstraintFunc doesn't check Xms The constraint function didn't make sure that SoftMaxHeapSize should be less than Xms. Do we need to add the check? It will not only affect G1... 4. commit/uncommit parallelism The concurrent uncommit will work while the VMThread is doing GC, and GC may request to expand the heap if there are not enough empty regions. So the parallelism is possible and immediate uncommit is a solution. 5. More heap expansion/shrink heuristics further We have some data and experience in dynamic heap adjustment in our workloads. The default GCTimeRatio 12 is a really well-tuned number: we found applications will have obvious timeout errors if it is less than ~12. So it is kind of a *hard* limit and we need to expand immediately if GCTimeRatio drops below 12. The difference in our workloads is that we will keep a GCTimeRatio near the original value 99 to keep GC in a healthy state, because allocation rate and outside input can vary so violently that we don't want frequent adjustment. You know that in our 8u implementation we just keep a conservative GC interval to achieve that. Compared to the current code in JDK15, keeping GCTimeRatio at 99 is a different behavior which might have more memory footprint. I propose that we can still use the original option "-XX:+G1ElasticHeap" to keep the GCTimeRatio around 99 or a specified number. The default flow will make sure the GCTimeRatio is above the threshold 12, and concurrent commit/uncommit will adjust the heap to keep GCTimeRatio at a proper number so that the adjustment is not urgent. Thanks, Liang ------------------------------------------------------------------ From:Thomas Schatzl Send Time:2020 Jan. 14 (Tue.)
19:36 To:"MAO, Liang" ; hotspot-gc-dev Subject:Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics Hi, On 14.01.20 10:07, Liang Mao wrote: > Hi Thomas, > > Thank you for the detailed comments! > Most of suggestions I will follow to do the modification. > And I still have some questions: > >>> 1. Does the SoftMaxHeapSize limitation need to consider the GC time >>> ratio as in >>> expand_heap_after_young_collection? Now we haven't put the logic in yet. > >> I am not completely clear what you are asking about, but the gc time >> ratio only affects the current "optimal" heap size which is bounded by >> SoftMaxHeapsize/max_capacity. > > The decision to shrink to SoftMaxHeapSize in this patch is based > on the method "G1HeapSizingPolicy::can_shrink_heap_size_to" which > counts "used" + "reserve" + "young". We will change it to > _heap_sizing_policy->shrink_amount(); as you commented. > I'm not considering the GC time ratio as a factor to determine > whether the heap can be shrinked to SoftMaxHeapSize. Going back to the "spec": "When -XX:SoftMaxHeapSize is set, the GC should strive to not grow heap size beyond the specified size, unless the GC decides it's necessary to do so. The soft max heap size should not be allowed to be set to a value smaller than min heap size (-Xms) or greater than max heap size (-Xmx). When not set on the command-line, this flag should default to the max heap size." (https://bugs.openjdk.java.net/browse/JDK-8222181) This is a very loose definition, and "unless the GC decides it's necessary" may mean anything. Looking at ZGC code, it mostly uses it to drive the GCs (and determine expansion amount), and let regular expansion/shrinking do the work without new rules. 
I would at first tend to do the same: if existing heap policy indicates that an expansion at young gc (which takes GCTimeRatio into account) is needed for whatever reason, I would actually let G1 keep doing it; conversely I would also take GCTimeRatio into account when trying to shrink to keep the policy symmetric. The current implementation probably preferentially shrinks towards SoftMaxHeapSize, correct? (this shrinking also seems to be limited to exactly SoftMaxHeapSize - why not below that if the collector thinks it would be okay?) Do you have measurements with/without GCTimeRatio in the shrinking? Can you describe, with over-time heap occupancy graphs that this does not work at all in your workloads? Measurements of committed heap over time of the current solution would be appreciated too. (I haven't had the time yet to set up some useful testing application that can be used to simulate phases of such a workload to show heap shrinking but I assume you have some data.) >> - changes in G1CollectedHeap::resize_heap_if_necessary: please make the >> method to always use the soft max heap size as I do not understand why >> you would not do that. > > Do you think we need to apply the logic "can_shrink_heap_size_to" > inside resize_heap_if_necessary to determine whether to make soft > max size as limit? Resize_heap_if_necessary() is the equivalent of adjust_heap_after_young_collection() for after full gc (the naming could be unified at some point). What the change implements right now is to respect SoftMaxHeapSize only on an explicit gc in resize_heap_if_necessary(), while always respecting it during young gcs. Do you have a reason for this decision? This seems like an inconsistency I can not find a reason for. As mentioned above, I would try to keep SoftMaxHeapSize only a goal for starting (concurrent) garbage collections, with "natural" sizing trying to keep the SoftMaxHeapSize goal. 
Particularly, if a (compacting) full gc won't meet the SoftMaxHeapSize, what else will? It is indeed unfortunate that you might need to tweak Min/MaxFreeRatio to achieve higher uncommit ratio at full gc... This change (system.gc specifically trying to meet SoftMaxHeapSize) also seems to be an artifact of your usage of this feature - maybe you happen to always issue a system.gc() after you changed SoftMaxHeapSize? It may probably be better if a concurrent cycle were triggered automatically similar to periodic gcs. >> - there is a bug in the synchronization of the concurrent uncommit >> thread: it seems possible that the uncommit thread is still working >> (iterating over the list of regions to uncommit) while a completed >> (young) GC may add new regions to that list as young gcs do not wait for >> completion of the uncommit thread. > > Uncommit thread could be working in parallel with VMThread but > VMThread will not add regions to the concurrent_resizing_list > if concurrent resizing thread is in "working" state. Okay, I now see the check in G1CollectedHeap::adjust_heap_after_young_collection(), but that also prohibits expansion during young GC which seems relatively disruptive. I think I mentioned earlier, that in this case (if we want to expand) it might be better to wait on completion of the parallel uncommit _if_ the other remaining regions are not enough. (Alternatively one could add another uncommit/commit request to a hypothetical uncommit task queue for that resize thread). My thinking is that at worst, this would result in the same behavior as before (i.e. blocking because of commit/uncommit in progress) instead of changing the overall behavior by denying expansion requests (silently, which is pretty bad). This could probably result in the heap sizing policy getting a few samples with high gc time ratio, expanding even more when G1 is finally allowed to expand.
(Also, fortunately UseGCOverheadLimit is not implemented in G1, as that might kill the VM spuriously because of that...) Or do you have some reason for this particular implementation? (The reason I am asking so much about details is to get a common understanding on the solution, i.e. what should happen when, and hopefully including why for the next guy ;)) Maybe it would be good to refactor the code in G1CollectedHeap::adjust_heap_after_young_collection() (similar to or expand in G1CollectedHeap::adjust_heap_after_young_collection()), e.g. calculate some heap_size_change() value from the heap sizing policy, and then use that value depending on whether it is positive or negative to expand or shrink. > >> Related to that is the use of par_set/clear_bit in e.g. the available >> bitmap: since all par/clear actions are asserted to be in the vm thread >> at a safepoint, there does not seem to be a need for using the parallel >> variants of set/clear bit (if keeping the current mechanism). > > For the above reason that concurrent uncommit can run in parallel with > VMThread, the bit set/clear in the VM thread at a safepoint have > to be parallel. If concurrent uncommit is working, both heap shrinking and expansion in the safepoint are disabled as far as I can tell. I.e. 2996 void G1CollectedHeap::adjust_heap_after_young_collection() { 2997 if (concurrent_heap_resize()->resize_thread()->during_cycle()) { 2998 // Do not adjust if concurrent resizing is in progress 2999 return; 3000 } 3001 3002 double start_time_ms = os::elapsedTime(); 3003 shrink_heap_after_young_collection(); 3004 phase_times()->recor...[..] 3005 // Shrinking might have started resizing 3006 if (!concurrent_heap_resize()->resize_thread()->during_cycle()) { 3007 expand_heap_after_young_collection(); 3008 } 3009 } This method seems to be the only caller to expand/shrink_heap_after_young_collection().
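A minimal sketch of the suggested refactoring: compute one signed heap_size_change() value from the sizing policy and branch on its sign. All names and types here are hypothetical, not the actual G1 code:

```cpp
#include <cstdint>

// Hypothetical policy result: positive means expand by that many
// bytes, negative means shrink, zero means leave the heap alone.
int64_t heap_size_change(int64_t target_capacity, int64_t current_capacity) {
  return target_capacity - current_capacity;
}

enum class Resize { None, Expand, Shrink };

// Single decision point replacing separate shrink-then-maybe-expand
// calls; the caller would dispatch to the existing expand/shrink paths.
Resize adjust_heap(int64_t delta) {
  if (delta > 0) return Resize::Expand;   // expand_heap_after_young_collection()
  if (delta < 0) return Resize::Shrink;   // shrink_heap_after_young_collection()
  return Resize::None;
}
```

The benefit of this shape is that expansion and shrinking can never both fire in one safepoint, which removes the need for the second during_cycle() re-check shown above.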
Another question I had but forgot is that the thread and in other places "resize" is used in the method and class names instead of e.g. "uncommit" or "shrink". Do you plan to add concurrent commit too? >> - I have a feeling that if the concurrent uncommit thread worked on >> pages, not regions, the code would be easier to understand. It would >> also solve the issue you asked about with the >> G1RegionsSmallerThanCommitSizeMapper. You may still need to pass region >> numbers anyway for logging, but otoh the logging could be done >> asynchronously. > > I don't quite understand this part... For > the G1RegionsSmallerThanCommitSizeMapper, > a page can be simultaneously requested to commit in VMThread to expand > heap and uncommit in concurrent thread to shrink heap. Looks like lowering > uncommit work to page level couldn't help this... I see the problem, but as above I think expansion (commit) and shrinking (uncommit) can't happen at the same time at the moment, and it actually might be premature to allow expansion and shrinking to occur at the same time. Some more options are serializing these requests; one I described above (if a request is pending, wait for its completion) before starting a new one. This might be the default option for other reasons anyway. Another is certainly to simply not uncommit them concurrently then, maybe we can still uncommit them immediately? > For the features you listed below, > 1) moving the memory uncommit into the concurrent phase > 2) uncommit at the end of (almost) every GC > 3) SoftMaxHeapSize > > Since most of the code is for the concurrent framework, > do you think 2) and 3) can be done together and implemented first? > (The uncommit will happen immediately) I think that would be fine.
Thanks, Thomas From felix.yang at huawei.com Wed Jan 15 06:58:32 2020 From: felix.yang at huawei.com (Yangfei (Felix)) Date: Wed, 15 Jan 2020 06:58:32 +0000 Subject: [RFC] ZGC proposal for aarch64 jdk11u Message-ID: Hi, Currently, we only have zgc for the jdk11 x86 platform. Zgc for aarch64 platform was later added in jdk13 by Stuart from Linaro. So it's an interesting question whether zgc for aarch64 platform should be backported to jdk11. Dozens of our arm-based cloud customers are switching to jdk11 and some of them have a strong demand for the zgc feature for their business. To satisfy that requirement, we took the action to backport this feature in our jdk11 release. But we think this work should not be kept private. Other jdk11 vendors may come to the same problem. It's appreciated if this work could be incorporated in the upstream jdk11 repo and further improved. We have backported the following zgc related patches to jdk11u: https://bugs.openjdk.java.net/browse/JDK-8217745 https://bugs.openjdk.java.net/browse/JDK-8224187 https://bugs.openjdk.java.net/browse/JDK-8214527 https://bugs.openjdk.java.net/browse/JDK-8224675 Basic test such as jtreg, jcstress looks good. Specjbb2015 test with zgc gives us anticipated max & critical score. I can provide more details and propose a webrev for the backport. But before that I would like to hear your comments & suggestions.
Thanks, Felix From thomas.schatzl at oracle.com Wed Jan 15 08:37:21 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 15 Jan 2020 09:37:21 +0100 Subject: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics In-Reply-To: References: <5b24e235-5466-15a1-78a6-6f63bfa1878e@oracle.com> <43090624-d8be-8600-a55e-1e10b1920135@oracle.com> <359fbef8-6735-4958-b76f-56430f1a4108.maoliang.ml@alibaba-inc.com> ,<36b31e9a-ee86-50d9-8042-bc79e6756777@oracle.com> Message-ID: <08e025f0a9520b12d06df8157d63d73b4e7e11a4.camel@oracle.com> Hi, On Wed, 2020-01-15 at 11:52 +0800, Liang Mao wrote: > Hi Thomas, > > I summarize the issues in as following: > > 1. Criterion of SoftMaxHeapSize > I agree to keep the policy of SoftMaxHeapSize similar with ZGC to > make it unified. So "expand_heap_after_young_collection" is used for > meeting the basic GCTimeRatio and expand heap immediately which > cannot be blocked by any > reasons. "adjust_heap_after_young_collection" cannot change the > logic > and I will take both expansion and shrink into consideration. Is my > understanding correct here? Yes, ideally we would be close to ZGC in behavior with SoftMaxHeapSize. If for some reason this does not work we may need to reconsider - but we need a reason if possible backed by numbers/graphs of actual behavior. > > 2. Full GC with SoftMaxHeapSize > In my thought non-explicit Full GC probably means the insufficiency > of heap capacity and we may not keep shrinking within SoftMaxHeapSize > but explicit FGC don't have that issue. That's the only reason why I People run explicit FGC for many reasons, and the one you describe is just one of them. E.g. explicit FGC can be converted to a concurrent cycle or disabled for other reasons, so having special behavior for this particular case may just not work as intended in many cases. Users may then need to decide whether they want this behavior, or the system.gc-starts-concurrent-cycle one they might also rely on.
The lone "System.gc()" call is insufficient to transport the actual intent of the user - but that is a different issue. > checked if it is explicit. But we will have the same determine logic > to check if the heap can be shrunk so "explicit" check could be > meaningless and I will remove that. Exactly. > > 3. SoftMaxHeapSizeConstraintFunc doesn't check Xms > The constraint function didn't make sure the SoftMaxHeapSize should > less than Xms. Do we need to add the checking? It will not only > affect G1... I will check again later, but from what I remember from yesterday it does check it at VM start (-Xms sets both minimum and initial heap size). The constraint func does not check when the user changes the value during runtime. So code using it must still maintain this invariant in behavior. > 4. commit/uncommit parallelism > The concurrent uncommit will work with VMThread doing GC and GC may > request to expand heap if not enough empty regions. So the > parallelism is possible and immediate uncommit is a solution. There may be others, but it actually seems easiest as blocking such a request seems actually harder to implement, at least it's less localized in the code. Completely *dropping* the request seems against the "SoftMaxHeapSize is a hint" guideline and may have other unforeseen consequences too. Like I said, since G1 does not expand then, there will be more GCs with the small heap, increasing the current GCTimeRatio more than it should. Which means that when the request ultimately comes through - and G1 will certainly try again - the increase may be huge. (The increase is proportional to the difference in actual and requested GCTimeRatio iirc). Again, if there are good reasons to do otherwise I am open to discussion, but it would be nice to have numbers to base decisions on. > 4. More heap expansion/shrink heuristics further > We have some data and experience in dynamic heap adjustment in our > workloads.
> The default GCTimeRatio 12 is a really well tuned number that we found > applications will have obvious timeout errors if it is less than ~12. It is actually *very* interesting to hear that the default G1 GCTimeRatio fits you well. Given over-time improvements in G1 gc performance, I was already privately asking myself whether to decrease the default percentage, increasing this value (I hope I got the directions right ;)) and similarly adjust the default MaxGCPauseMillis down to reflect that from time to time. > So it is kind of *hard* limit and we need to expand immediately if > GCTimeRatio drops below 12. The difference in our workloads is that > we will keep a GCTimeRatio nearly the original value 99 to make GC in I.e. you set it to 99 at startup? > a healthy state because allocation rate and outside input can vary > violently that we don't want frequent adjustment. You know that in > our 8u implementation we just keep a conservative GC interval to > achieve that. Comparing to the current code in JDK15, keeping > GCTimeRatio as 99 is a different behavior which might have more > memory footprint. As mentioned above, I think given that we are both thinking about this, we might actually evaluate changing the defaults. > I propose if we can still use the original option > "-XX:+G1ElasticHeap" to keep the GCTimeRatio around 99 or a specified > number. The default flow will make sure the GCTimeRatio is above the > threshold 12 and concurrent commit/uncommit will adjust the heap to > keep GCTimeRatio in a proper number that the adjustment is not > urgent. I am not completely sure what you want to achieve here or what the problem is. I probably need to understand more about the problem and potentially other solutions can be found. As for a new -XX:+G1ElasticHeap option, it does not seem to make a difference to set this or -XX:GCTimeRatio in this case (both are single options). But I do not completely know the details here.
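For reference, GCTimeRatio=N means the collector aims to spend at most 1/(1+N) of total time in GC, so the default of 12 allows roughly 7.7% GC time while 99 allows 1%. A sketch of the resulting expand check (a simplified formulation for illustration, not the actual G1 heuristic):

```cpp
// Returns true when the observed GC overhead exceeds what the
// configured GCTimeRatio permits, i.e. the heap should grow.
bool gc_overhead_exceeded(double gc_seconds, double total_seconds,
                          unsigned gc_time_ratio) {
  // GCTimeRatio 12 -> allowed fraction ~0.077; 99 -> 0.01.
  double allowed_fraction = 1.0 / (1.0 + gc_time_ratio);
  return (gc_seconds / total_seconds) > allowed_fraction;
}
```

This makes the trade-off in the thread visible: with a ratio of 99 the collector expands as soon as GC exceeds 1% of elapsed time, trading footprint for headroom, while the default 12 tolerates far more GC work before growing the heap.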
Thanks, Thomas From maoliang.ml at alibaba-inc.com Wed Jan 15 10:58:40 2020 From: maoliang.ml at alibaba-inc.com (Liang Mao) Date: Wed, 15 Jan 2020 18:58:40 +0800 Subject: Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics In-Reply-To: <08e025f0a9520b12d06df8157d63d73b4e7e11a4.camel@oracle.com> References: <5b24e235-5466-15a1-78a6-6f63bfa1878e@oracle.com> <43090624-d8be-8600-a55e-1e10b1920135@oracle.com> <359fbef8-6735-4958-b76f-56430f1a4108.maoliang.ml@alibaba-inc.com> , <36b31e9a-ee86-50d9-8042-bc79e6756777@oracle.com> , <08e025f0a9520b12d06df8157d63d73b4e7e11a4.camel@oracle.com> Message-ID: <4fe7f936-95bb-4a4e-85ed-e5c4423f9d06.maoliang.ml@alibaba-inc.com> Hi Thomas, >> 3. SoftMaxHeapSizeConstraintFunc doesn't check Xms >> The constraint function didn't make sure the SoftMaxHeapSize should >> less than Xms. Do we need to add the checking? It will not only >> affect G1... > I will check again later, but from what I remember from yesterday it > does check it at VM start (-Xms sets both minimum and initial heap > size). The constraint func does not check when the user changes the > value during runtime. So code using it must still maintain this > invariant in behavior. The default constraint function will be both checked in VM startup and during runtime via jinfo. By looking into the code, ZGC seems to allow SoftMaxHeapSize less than Xms. So do we need to create another mail thread to discuss it? >> 4. commit/uncommit parallelism >> The concurrent uncommit will work with VMThread doing GC and GC may >> request to expand heap if not enough empty regions. So the >> parallelism is possible and immediate uncommit is a solution. > There may be others, but it actually seems easiest as blocking such a > request seems actually harder to implement, at least it's less > localized in the code.
Completely *dropping* the request seems against > the rule that "SoftMaxHeapSize is a hint" guideline and may have other > unforeseen consequences too. Like I said, since G1 does not expand > then, there will be more GCs with the small heap, increasing the > current GCTimeRatio more than it should. Which means when ultimately > the request comes through as G1 will certainly try again, the increase > may be huge. (The increase is proportional to the difference in actual > and requested GCTimeRatio iirc). > Again, if there are good reasons to do otherwise I am open to > discussion, but it would be nice to have numbers to base decisions on. I'm not on the side of blocking the expand request:) G1RegionsLargerThanCommitSizeMapper can do uncommit/commit in parallel and G1RegionsSmallerThanCommitSizeMapper can do uncommit/commit immediately. So I think we don't have issues so far? >> So it is kind of *hard* limit and we need to expand immediately if >> GCTimeRatio drops below 12. The difference in our workloads is that >> we will keep a GCTimeRatio nearly the original value 99 to make GC in >I.e. you set it to 99 at startup? In fact we are not controlling GCTimeRatio. In a lot of applications running in exclusive containers we set Xms the same as Xmx to avoid any heap expansion during runtime which might cause allocation stalls and timeout. >> I propose if we can still use the original option >> "-XX:+G1ElasticHeap" to keep the GCTimeRatio around 99 or a specified >> number. The default flow will make sure the GCTimeRatio is above the >> threshold 12 and concurrent commit/uncommit will adjust the heap to >> keep GCTimeRatio in a proper number that the adjustment is not >> urgent. > I am not completely sure what you want to achieve here or what the > problem is. I probably need to understand more about the problem and > potentially other solutions can be found.
> As for a new -XX:+G1ElasticHeap option, it does not seem to make a > difference to set this or -XX:GCTimeRatio in this case (both are single > options). But I do not completely know the details here. Theoretically the Java heap will not return memory by default, and ZGC/Shenandoah have the options "ZUncommit" and "ShenandoahUncommit" to inform users that memory can be uncommitted... So I think G1 needs the same thing as well. In my opinion, there are two aspects here. The default value of GCTimeRatio is the baseline, so we might need to expand immediately to avoid frequent GCs if using the concurrent flow. But the G1ElasticHeap is an optimization to keep the balance of GC health and memory utilization, so the policy should be more conservative and we also need to do it concurrently so as not to bring any obvious pause overhead. Thanks, Liang ------------------------------------------------------------------ From:Thomas Schatzl Send Time:2020 Jan. 15 (Wed.) 16:37 To:"MAO, Liang" ; hotspot-gc-dev Subject:Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics Hi, On Wed, 2020-01-15 at 11:52 +0800, Liang Mao wrote: > Hi Thomas, > > I summarize the issues in as following: > > 1. Criterion of SoftMaxHeapSize > I agree to keep the policy of SoftMaxHeapSize similar with ZGC to > make it unified. So "expand_heap_after_young_collection" is used for > meeting the basic GCTimeRatio and expand heap immediately which > cannot be blocked by any > reasons. "adjust_heap_after_young_collection" cannot change the > logic > and I will take both expansion and shrink into consideration. Is my > understanding correct here? Yes, ideally we would be close to ZGC in behavior with SoftMaxHeapSize. If for some reason this does not work we may need to reconsider - but we need a reason if possible backed by numbers/graphs of actual behavior. > > 2.
Full GC with SoftMaxHeapSize > In my thought non-explicit Full GC probably means the insufficiency > of heap capacity and we may not keep shrinking within SoftMaxHeapSize > but explicit FGC don't have that issue. That's the only reason why I People run explicit FGC for many reasons, and the one you describe is just one of them. E.g. explicit FGC can be converted to a concurrent cycle or disabled for other reasons, so having special behavior for this particular case may just not work as intended in many cases. Users may then need to decide whether they want this behavior, or the system.gc-starts-concurrent-cycle one they might also rely on. The lone "System.gc()" call is insufficient to transport the actual intent of the user - but that is a different issue. > checked if it is explicit. But we will have the same determine logic > to check if the heap can be shrunk so "explicit" check could be > meaningless and I will remove that. Exactly. > > 3. SoftMaxHeapSizeConstraintFunc doesn't check Xms > The constraint function didn't make sure the SoftMaxHeapSize should > less than Xms. Do we need to add the checking? It will not only > affect G1... I will check again later, but from what I remember from yesterday it does check it at VM start (-Xms sets both minimum and initial heap size). The constraint func does not check when the user changes the value during runtime. So code using it must still maintain this invariant in behavior. > 4. commit/uncommit parallelism > The concurrent uncommit will work with VMThread doing GC and GC may > request to expand heap if not enough empty regions. So the > parallelism is possible and immediate uncommit is a solution. There may be others, but it actually seems easiest as blocking such a request seems actually harder to implement, at least it's less localized in the code. Completely *dropping* the request seems against the "SoftMaxHeapSize is a hint" guideline and may have other unforeseen consequences too.
Like I said, since G1 does not expand then, there will be more GCs with the small heap, increasing the current GCTimeRatio more than it should. Which means when ultimately the request comes through, as G1 will certainly try again, the increase may be huge. (The increase is proportional to the difference in actual and requested GCTimeRatio iirc). Again, if there are good reasons to do otherwise I am open to discussion, but it would be nice to have numbers to base decisions on. > 4. More heap expansion/shrink heuristics further > We have some data and experience in dynamic heap adjustment in our > workloads. > The default GCTimeRatio 12 is a really well-tuned number: we found > applications will have obvious timeout errors if it is less than ~12. It is actually *very* interesting to hear that the default G1 GCTimeRatio fits you well. Given over-time improvements in G1 gc performance, I was already privately asking myself whether to decrease the default percentage, increasing this value (I hope I got the directions right ;)) and similarly adjust the default MaxGCPauseMillis down to reflect that from time to time. > So it is kind of a *hard* limit and we need to expand immediately if > GCTimeRatio drops below 12. The difference in our workloads is that > we will keep a GCTimeRatio nearly the original value 99 to make GC in I.e. you set it to 99 at startup? > a healthy state, because the allocation rate and outside input can vary > so violently that we don't want frequent adjustment. You know that in > our 8u implementation we just keep a conservative GC interval to > achieve that. Compared to the current code in JDK15, keeping > GCTimeRatio at 99 is a different behavior which might have a larger > memory footprint. As mentioned above, I think given that we are both thinking hard about this, we might actually evaluate changing the defaults. > I propose if we can still use the original option > "-XX:+G1ElasticHeap" to keep the GCTimeRatio around 99 or a specified > number.
The default flow will make sure the > GCTimeRatio is above the > threshold 12 and concurrent commit/uncommit will adjust the heap to > keep GCTimeRatio in a proper number that the adjustment is not > urgent. I am not completely sure what you want to achieve here or what the problem is. I probably need to understand more about the problem and potentially other solutions can be found. As for a new -XX:+G1ElasticHeap option, it does not seem to make a difference to set this or -XX:GCTimeRatio in this case (both are single options). But I do not completely know the details here. Thanks, Thomas From thomas.schatzl at oracle.com Wed Jan 15 11:44:18 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 15 Jan 2020 12:44:18 +0100 Subject: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics In-Reply-To: <4fe7f936-95bb-4a4e-85ed-e5c4423f9d06.maoliang.ml@alibaba-inc.com> References: <5b24e235-5466-15a1-78a6-6f63bfa1878e@oracle.com> <43090624-d8be-8600-a55e-1e10b1920135@oracle.com> <359fbef8-6735-4958-b76f-56430f1a4108.maoliang.ml@alibaba-inc.com> <36b31e9a-ee86-50d9-8042-bc79e6756777@oracle.com> <08e025f0a9520b12d06df8157d63d73b4e7e11a4.camel@oracle.com> <4fe7f936-95bb-4a4e-85ed-e5c4423f9d06.maoliang.ml@alibaba-inc.com> Message-ID: <9d21b384-790b-c2e2-d801-0025c9257656@oracle.com> Hi, On 15.01.20 11:58, Liang Mao wrote: > Hi Thomas, > >>> 3. SoftMaxHeapSizeConstraintFunc doesn't check Xms >>> The constraint function didn't make sure the SoftMaxHeapSize should >>> less than Xms. Do we need to add the checking? It will not only >>> affect G1... > >> I will check again later, but from what I remember from yesterday it >> does check it at VM start (-Xms sets both minimum and initial heap >> size). The constraint func does not check when the user changes the >> value during runtime. So code using it must still maintain this >> invariant in behavior. > > The default constraint function will be both checked in VM startup > and during runtime via jinfo.
By looking into the code, ZGC seems > to allow SoftMaxHeapSize less than Xms. So do we need to create > another mail thread to discuss it? Colleagues mentioned that ZGC allows setting SoftMaxHeapSize below MinHeapSize, but does not uncommit memory below it. I do not see a problem with allowing the user to set SoftMaxHeapSize below MinHeapSize, though it may have limited use. If jinfo prevents this too, then it seems that the code can assume that SoftMaxHeapSize is within Min/MaxHeapSize. > >>> 4. commit/uncommit parallelism >>> The concurrent uncommit will work with the VMThread doing GC and GC may >>> request to expand the heap if there are not enough empty regions. So the >>> parallelism is possible and immediate uncommit is a solution. > >> There may be others, but it actually seems easiest as blocking such a >> request seems actually harder to implement, at least it's less >> localized in the code. Completely *dropping* the request seems against >> the "SoftMaxHeapSize is a hint" guideline and may have other >> unforeseen consequences too. Like I said, since G1 does not expand >> then, there will be more GCs with the small heap, increasing the >> current GCTimeRatio more than it should. Which means when ultimately >> the request comes through as G1 will certainly try again, the increase >> may be huge. (The increase is proportional to the difference in actual >> and requested GCTimeRatio iirc). > >> Again, if there are good reasons to do otherwise I am open to >> discussion, but it would be nice to have numbers to base decisions on. > > I'm not on the side of blocking the expand request:) > G1RegionsLargerThanCommitSizeMapper can do uncommit/commit > in parallel and G1RegionsSmallerThanCommitSizeMapper > can do uncommit/commit immediately. So I think we don't have issues > so far?
:) > >>> So it is kind of a *hard* limit and we need to expand immediately if >>> GCTimeRatio drops below 12. The difference in our workloads is that >>> we will keep a GCTimeRatio nearly the original value 99 to make GC in > >> I.e. you set it to 99 at startup? > > In fact we are not controlling GCTimeRatio. In a lot of applications > running in exclusive containers we set Xms same to Xmx to avoid > any heap expansion during runtime which might cause allocation > stalls and timeout. Okay. > >>> I propose if we can still use the original option >>> "-XX:+G1ElasticHeap" to keep the GCTimeRatio around 99 or a specified >>> number. The default flow will make sure the GCTimeRatio is above the >>> threshold 12 and concurrent commit/uncommit will adjust the heap to >>> keep GCTimeRatio in a proper number that the adjustment is not >>> urgent. > >> I am not completely sure what you want to achieve here or what the >> problem is. I probably need to understand more about the problem and >> potentially other solutions can be found. > >> As for a new -XX:+G1ElasticHeap option, it does not seem to make a >> difference to set this or -XX:GCTimeRatio in this case (both are single >> options). But I do not completely know the details here. > > Theoretically the Java heap will not return memory by default and > ZGC/Shenandoah have the "ZUncommit" and > "ShenandoahUncommit" options > to let the user control whether memory can be uncommitted... So I think G1 needs > the same thing as well. In my opinion, there are two aspects. G1 has uncommitted unused memory by default for a long time now. There is no flag to disable this behavior except setting -Xms == -Xmx. The policies for when to do so are also different (using Min/MaxHeapFreeRatio) compared to other collectors. However, only lately (JDK12 or 13) does it do so at the end of the Remark pause - earlier it only did so after full gc. The changes provided also enable shrinking of the heap during most young GCs.
It may be a problem that full gcs (including "concurrent full gc") and young gcs use a different policy btw, as it occurred to me yesterday after sending the email. That's something to explore. > default value of GCTimeRatio is the basic line so we might > need to expand immediately to avoid frequent GCs if using > concurrent flow. But the G1ElasticHeap is an optimization > to keep the balance of GC health and memory utility so the > policy should be more conservative and we also need to do it > concurrently by not bringing any obvious pause overhead. > Changing GCTimeRatio to a higher value should improve the response time on memory needs. The changes provided by you are also going to fix the concurrent (un-)commit. Thanks, Thomas From per.liden at oracle.com Wed Jan 15 12:07:42 2020 From: per.liden at oracle.com (Per Liden) Date: Wed, 15 Jan 2020 13:07:42 +0100 Subject: [RFC] ZGC proposal for aarch64 jdk11u In-Reply-To: References: Message-ID: <38a15dc5-9cee-0f44-13ee-98f185ee72ae@oracle.com> Hi, Please note that backporting JDK-8224675 "Late GC barrier insertion for ZGC" is not a great idea, since that patch introduced stability issues and the whole approach was later superseded by JDK-8230565 "ZGC: Redesign C2 load barrier to expand on the MachNode level". If you want to go down this path, I'd suggest that you either don't backport JDK-8224675 at all, or backport everything up to JDK-8224675 + JDK-8230565. Also note that if you include JDK-8230565 you want to be careful to also include any followup bug fixes, like JDK-8233506. In general, a lot of stability and performance improvements have gone into ZGC since JDK 11. If at all possible, I would strongly recommend using JDK 14 instead, where you already have aarch64 support and all other goodies. cheers, Per On 1/15/20 7:58 AM, Yangfei (Felix) wrote: > Hi, > > Currently, we only have zgc for the jdk11 x86 platform. Zgc for aarch64 platform was later added in jdk13 by Stuart from Linaro.
> So it's an interesting question whether zgc for aarch64 platform should be backported to jdk11. > > Dozens of our arm-based cloud customers are switching to jdk11 and some of them have a strong demand for the zgc feature for their business. > To satisfy that requirement, we took the action to backport this feature in our jdk11 release. > But we think this work should not be kept private. Other jdk11 vendors may run into the same problem. > It's appreciated if this work could be incorporated in the upstream jdk11 repo and further improved. > > We have backported the following zgc related patches to jdk11u: > https://bugs.openjdk.java.net/browse/JDK-8217745 > https://bugs.openjdk.java.net/browse/JDK-8224187 > https://bugs.openjdk.java.net/browse/JDK-8214527 > https://bugs.openjdk.java.net/browse/JDK-8224675 > > Basic tests such as jtreg and jcstress look good. Specjbb2015 test with zgc gives us the anticipated max & critical scores. > I can provide more details and propose a webrev for the backport. But before that I would like to hear your comments & suggestions.
> > Thanks, > Felix > From maoliang.ml at alibaba-inc.com Wed Jan 15 12:53:20 2020 From: maoliang.ml at alibaba-inc.com (Liang Mao) Date: Wed, 15 Jan 2020 20:53:20 +0800 Subject: Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics In-Reply-To: <9d21b384-790b-c2e2-d801-0025c9257656@oracle.com> References: <5b24e235-5466-15a1-78a6-6f63bfa1878e@oracle.com> <43090624-d8be-8600-a55e-1e10b1920135@oracle.com> <359fbef8-6735-4958-b76f-56430f1a4108.maoliang.ml@alibaba-inc.com> <36b31e9a-ee86-50d9-8042-bc79e6756777@oracle.com> <08e025f0a9520b12d06df8157d63d73b4e7e11a4.camel@oracle.com> <4fe7f936-95bb-4a4e-85ed-e5c4423f9d06.maoliang.ml@alibaba-inc.com>, <9d21b384-790b-c2e2-d801-0025c9257656@oracle.com> Message-ID: <693b04b7-d13d-4ef5-b425-febc81984dbc.maoliang.ml@alibaba-inc.com> Hi Thomas, So G1 doesn't need to shrink below Xms if SoftMaxHeapSize is below Xms, does it? Another question: whether or not we have an additional option, we had better have two criteria. The first is for urgent expansion, when GCTimeRatio is quite low and concurrent expansion with frequent GCs is more harmful, so expansion should be done immediately. It's the current default flow, as we found that 12 is a good number below which applications can obviously incur timeout errors. The second is to keep the GCTimeRatio and memory footprint in a balanced state, so any adjustments are better done concurrently. The original number 99 fits well here. If we have only one option, "GCTimeRatio", we might not be able to achieve both. Maybe we can have a LowGCTimeRatio, below which things are supposed to be not acceptable, and a HighGCTimeRatio which is certainly healthy. Thanks, Liang ------------------------------------------------------------------ From:Thomas Schatzl Send Time:2020 Jan. 15 (Wed.) 19:44 To:"MAO, Liang" ; hotspot-gc-dev Subject:Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics Hi, On 15.01.20 11:58, Liang Mao wrote: > Hi Thomas, > >>> 3.
SoftMaxHeapSizeConstraintFunc doesn't check Xms >>> The constraint function didn't make sure the SoftMaxHeapSize should >>> less than Xms. Do we need to add the checking? It will not only >>> affect G1... > >> I will check again later, but from what I remember from yesterday it >> does check it at VM start (-Xms sets both minimum and initial heap >> size). The constraint func does not check when the user changes the >> value during runtime. So code using it must still maintain this >> invariant in behavior. > > The default constraint function will be both checked in VM startup > and during runtime via jinfo. By looking into the code, ZGC seems > to allow SoftMaxHeapSize less than Xms. So do we need to create > another mail thread to discuss it? Colleagues mentioned that ZGC allows setting SoftMaxHeapSize below MinheapSize, but does not uncommit memory below it. I do not see a problem for allowing the user set SoftMaxHeapSize below MinHeapSize so it may have limited use. If jinfo prevents this too, then it seems that the code can assume that SoftMaxHeapSize is within Min/MaxHeapSize. > >>> 4. commit/uncommit parallelism >>> The concurrent uncommit will work with VMThread doing GC and GC may >>> request to expand heap if not enough empty regions. So the >>> parallelism is possible and immediate uncommit is a solution. > >> There may be others, but it actually seems easiest as blocking such a >> request seems actually harder to implement, at least it's less >> localized in the code. Completely *dropping* the request seems against >> the rule that "SoftMaxHeapSize is a hint" guideline and may have other >> unforeseen consequences too. Like I said, since G1 does not expand >> then, there will be more GCs with the small heap, increasing the >> current GCTimeRatio more than it should. Which means when ultimately >> the request comes through as G1 will certainly try again, the increase >> may be huge. 
(The increase is proportional to the difference in actual >> and requested GCTimeRatio iirc). > >> Again, if there are good reasons to do otherwise I am open to >> discussion, but it would be nice to have numbers to base decisions on. > > I'm not on the side of blocking the expand request:) > G1RegionsLargerThanCommitSizeMapper can do uncommit/commit > parallelly and G1RegionsSmallerThanCommitSizeMapper > can do uncommit/commit immediately. So I think we don't have issues > so far? :) > >>> So it is kind of *hard* limit and we need to expand immediately if >>> GCTimeRatio drops below 12. The difference in our workloads is that >>> we will keep a GCTimeRatio nearly the original value 99 to make GC in > >>I.e. you set it to 99 at startup? > > In fact we are not controlling GCTimeRatio. In a lot of applications > running in exclusive containers we set Xms same to Xmx to avoid > any heap expansion during runtime which might cause allocation > stalls and timeout. Okay. > >>> I propose if we can still use the original option >>> "-XX:+G1ElasticHeap" to keep the GCTimeRatio around 99 or a specified >>> number. The default flow will make sure the GCTimeRatio is above the >>> threshold 12 and concurrent commit/uncommit will adjust the heap to >>> keep GCTimeRatio in a proper number that the adjustment is not >>> urgent. > >> I am not completely sure what you want to achieve here or what the >> problem is. I probably need to understand more about the problem and >> potentially other solutions can be found. > >> As for a new -XX:+G1ElasticHeap option, it does not seem to make a >> difference to set this or -XX:GCTimeRatio in this case (both are single >> options). But I do not completely know the details here. > > Theoretically Java heap will not return memory in default and > ZGC/Shenandoah have options to control by "ZUncommit" and > "ShenandoahUncommit" > to info user that memory can be uncommit... So I think G1 needs > the same thing as well. 
In my opinion, here are 2 espects. The G1 uncommits unused memory by default since a long time ago. There is no flag to disable this behavior except setting -Xms == -Xmx. The policies when are also different (using Min/MaxHeapFreeRatio) compared to other collectors. However only lately (JDK12 or 13) it does so at the end of the Remark pause - earlier it only did so after full gc. The changes provided also enable shrinking of the heap during most young GCs. It may be a problem that full gcs (including "concurrent full gc") and young gcs use a different policy btw as occurred to me yesterday after sending the email. That's something to explore. > default value of GCTimeRatio is the basic line so we might > need to expand immediately to avoid frequent GCs if using > concurrent flow. But the G1ElasticHeap is an optimization > to keep the balance of GC health and memory utility so the > policy should be more conservative and we also need to do it > concurrently by not bringing any obvious pause overhead. > Changing GCTimeRatio to a higher value should improve the response time on memory needs. The changes provided by you are also going to fix the concurrent (un-)commit. Thanks, Thomas From per.liden at oracle.com Wed Jan 15 12:57:06 2020 From: per.liden at oracle.com (Per Liden) Date: Wed, 15 Jan 2020 13:57:06 +0100 Subject: RFR: 8237198+8237199+8237200: ZGC: Share heap multi-mapping code across platforms Message-ID: Hi, Please review this cleanup of the ZPhysicalMemory/ZBackingFile layer, which aims to de-duplicate some of the multi-mapping code. I've split the change into three separate patches, the main patch followed by two patches doing some renaming. 1) The ZBackingFile code was designed to allow platforms to decide if they want to use heap multi-mapping or some other (possibly HW supported) scheme. As of today, all our supported platforms do heap multi-mapping, so there's some degree of code duplication in ZBackingFile for each platform. 
This patch moves common multi-mapping code into ZPhysicalMemoryManager. If we in the future find that we want to support a platform that doesn't do multi-mapping, then we can introduce an abstraction for this again. Bug: https://bugs.openjdk.java.net/browse/JDK-8237198 Webrev: http://cr.openjdk.java.net/~pliden/8237198/webrev.0 2) Rename ZBackingFile to ZPhysicalMemoryBacking, since "File" is somewhat misleading on platforms other than Linux. Bug: https://bugs.openjdk.java.net/browse/JDK-8237199 Webrev: http://cr.openjdk.java.net/~pliden/8237199/webrev.0 3) Rename ZBackingPath to ZMountPoint, as it's a better name in light of JDK-8237199. Bug: https://bugs.openjdk.java.net/browse/JDK-8237200 Webrev: http://cr.openjdk.java.net/~pliden/8237200/webrev.0 cheers, Per From per.liden at oracle.com Wed Jan 15 13:03:39 2020 From: per.liden at oracle.com (Per Liden) Date: Wed, 15 Jan 2020 14:03:39 +0100 Subject: RFR: 8237201: ZGC: Remove unused ZRelocationSetSelector::fragmentation() Message-ID: <35727b46-b4d8-8336-b484-1119bff15468@oracle.com> ZRelocationSetSelector::fragmentation() is not used and can be removed. Bug: https://bugs.openjdk.java.net/browse/JDK-8237201 Webrev: http://cr.openjdk.java.net/~pliden/8237201/webrev.0 /Per From stuart.monteith at linaro.org Wed Jan 15 13:10:57 2020 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Wed, 15 Jan 2020 13:10:57 +0000 Subject: [aarch64-port-dev ] [RFC] ZGC proposal for aarch64 jdk11u In-Reply-To: References: Message-ID: Hello Felix, I'm pleased that there is interest in ZGC on aarch64, that the performance is at expected levels and is apparently trouble free. However, I'd like to understand why this backporting is being done. If it is for running in production, then I'd expect Per, etc, to not be upset or disagree when I say that ZGC on aarch64 in JDK 13 isn't production ready. I understand it would be easier to test with an existing software stack on top of JDK 11 rather than moving onto JDK 13, etc.
However, ZGC is an experimental VM feature, and the model OpenJDK has moved to is for 6 monthly releases. Per and his team have made lots of improvements, and fixes, in ZGC since 11, so I would expect people to run and test on current release to avoid hitting already known issues, whether that be on x86 or aarch64. There might be a possibility of backporting some features, if there is no effect when they are disabled. However, it is probably better to track ZGC development through the released versions as a single backport will not be enough, I wouldn't want to burden Per's team with maintaining something that is in as much development as ZGC at this point. BR, Stuart On Wed, 15 Jan 2020 at 06:59, Yangfei (Felix) wrote: > > Hi, > > Currently, we only have zgc for the jdk11 x86 platform. Zgc for aarch64 platform was later added in jdk13 by Stuart from Linaro. > So it?s an interesting question whether zgc for aarch64 platform should be backported to jdk11. > > Dozens of our arm-based cloud customers are switching to jdk11 and some of them have a strong demand for the zgc feature for their business. > To satisfy that requirement, we took the action to backport this feature in our jdk11 release. > But we think this work should not be kept private. Other jdk11 vendors may come to the same problem. > It?s appreciated if this work could be incorporated in the upstream jdk11 repo and further improved. > > We have backported the following zgc related patches to jdk11u: > https://bugs.openjdk.java.net/browse/JDK-8217745 > https://bugs.openjdk.java.net/browse/JDK-8224187 > https://bugs.openjdk.java.net/browse/JDK-8214527 > https://bugs.openjdk.java.net/browse/JDK-8224675 > > Basic test such as jtreg, jcstress looks good. Specjbb2015 test with zgc gives us anticipated max & critical score. > I can provide more details and propose a webrev for the backport. But before that I would like to hear your comments & suggestions. 
> > Thanks, > Felix From thomas.schatzl at oracle.com Wed Jan 15 15:05:15 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 15 Jan 2020 16:05:15 +0100 Subject: RFR: 8237201: ZGC: Remove unused ZRelocationSetSelector::fragmentation() In-Reply-To: <35727b46-b4d8-8336-b484-1119bff15468@oracle.com> References: <35727b46-b4d8-8336-b484-1119bff15468@oracle.com> Message-ID: <1a626930-3be4-4976-8738-9f3d716873ce@oracle.com> Hi, On 15.01.20 14:03, Per Liden wrote: > ZRelocationSetSelector::fragmentation() is not used and can be removed. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237201 > Webrev: http://cr.openjdk.java.net/~pliden/8237201/webrev.0 > > /Per looks good. Thomas From thomas.schatzl at oracle.com Wed Jan 15 17:57:49 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 15 Jan 2020 18:57:49 +0100 Subject: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics In-Reply-To: <693b04b7-d13d-4ef5-b425-febc81984dbc.maoliang.ml@alibaba-inc.com> References: <5b24e235-5466-15a1-78a6-6f63bfa1878e@oracle.com> <43090624-d8be-8600-a55e-1e10b1920135@oracle.com> <359fbef8-6735-4958-b76f-56430f1a4108.maoliang.ml@alibaba-inc.com> <36b31e9a-ee86-50d9-8042-bc79e6756777@oracle.com> <08e025f0a9520b12d06df8157d63d73b4e7e11a4.camel@oracle.com> <4fe7f936-95bb-4a4e-85ed-e5c4423f9d06.maoliang.ml@alibaba-inc.com> ,<9d21b384-790b-c2e2-d801-0025c9257656@oracle.com> <693b04b7-d13d-4ef5-b425-febc81984dbc.maoliang.ml@alibaba-inc.com> Message-ID: Hi, On Wed, 2020-01-15 at 20:53 +0800, Liang Mao wrote: > Hi Thomas, > > So G1 doesn't need to shrink below Xms if SoftMaxHeapSize is > below Xms, does it? > No, never shrink below MinHeapSize. > Another question is that no matter we have an additional option we > had better have 2 criterions. The first is for urgent expansion that > GCTimeRatio is quite low and concurrent expansion with frequent GCs > is more harmful and expansion should be done immediately. 
It's the > current default flow as we found that 12 is a good number below which > applications can obviously incur timeout errors. The second is to > keep the GCTimeRatio and memory footprint in a balanced state so > any adjustments are better done concurrently. The original number 99 > fits well here. If we have only one option "GCTimeRatio", we might > not be able to achieve both. Maybe we can have a LowGCTimeRatio below > which is supposed to be not acceptable and a HighGCTimeRatio which is > certainly healthy. So far the change has been about shrinking the heap concurrently, and not expansion. Let's concentrate on the issue at hand, i.e. see how heap shrinking at more places turns out. I believe there will be lots of tweaking needed for this change to not show too many regressions in other applications. Remember that the defaults should work well for a large body of applications, not just a few. There may be knobs to tune it for others. Then look at concurrent expansion and at application phase changes: how to detect them, and how to react best. Just for reference, last time we changed the sizing algorithm it took a few months to get it "right", with mostly improvements all around. Thanks, Thomas From aph at redhat.com Wed Jan 15 18:03:54 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 15 Jan 2020 18:03:54 +0000 Subject: [aarch64-port-dev ] [RFC] ZGC proposal for aarch64 jdk11u In-Reply-To: References: Message-ID: <6b716737-ba03-71c2-5488-b5654093c447@redhat.com> On 1/15/20 1:10 PM, Stuart Monteith wrote: > I'm pleased that there is interest in ZGC on aarch64, that the > performance is at expected levels and is apparently trouble free. > However, I'd like to understand why this backporting is being done. If > it is for running in production, then I'd expect Per, etc, to not be > upset or disagree when I say that ZGC on aarch64 in JDK 13 isn't > production ready.
In particular, it's perhaps odd that something which is still an experimental feature in mainline is being considered for a backport. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From shade at redhat.com Wed Jan 15 18:48:28 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 15 Jan 2020 19:48:28 +0100 Subject: RFR (T) 8237217: Incorrect G1StringDedupEntry type used in StringDedupTable destructor Message-ID: <1a60a3d0-7329-b919-cd3d-3dcddfa50b8e@redhat.com> Bug: https://bugs.openjdk.java.net/browse/JDK-8237217 Spotted this when reading the strdedup code. This is a trivial leftover from JDK-8203641. The G1StringDedupEntry symbol does not even exist, and the whole thing works because FREE_C_HEAP_ARRAY ignores that parameter. But it should be consistent with the constructor anyway. I would not bother with jdk-submit testing, as it looks pretty trivial. Fix: diff -r f7edb9ca045c src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp --- a/src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp Fri Jan 10 15:38:25 2020 +0100 +++ b/src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp Wed Jan 15 19:47:47 2020 +0100 @@ -234,11 +234,11 @@ _buckets = NEW_C_HEAP_ARRAY(StringDedupEntry*, _size, mtGC); memset(_buckets, 0, _size * sizeof(StringDedupEntry*)); } StringDedupTable::~StringDedupTable() { - FREE_C_HEAP_ARRAY(G1StringDedupEntry*, _buckets); + FREE_C_HEAP_ARRAY(StringDedupEntry*, _buckets); } Testing: x86_64 fastdebug build; hotspot_gc_shenandoah -- Thanks, -Aleksey From kim.barrett at oracle.com Wed Jan 15 18:51:03 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 15 Jan 2020 13:51:03 -0500 Subject: RFR (T) 8237217: Incorrect G1StringDedupEntry type used in StringDedupTable destructor In-Reply-To: <1a60a3d0-7329-b919-cd3d-3dcddfa50b8e@redhat.com> References: <1a60a3d0-7329-b919-cd3d-3dcddfa50b8e@redhat.com> Message-ID: > On Jan
15, 2020, at 1:48 PM, Aleksey Shipilev wrote: > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8237217 > > Spotted this when reading the strdedup code. This is a trivial leftover from JDK-8203641. > G1StringDedupEntry symbol does not even exist, and the whole thing works because FREE_C_HEAP_ARRAY > ignores that parameter. But it should be consistent anyway with constructor anyway. > > I would not bother with jdk-submit testing, as it looks pretty trivial. Looks good, and trivial. > > Fix: > > diff -r f7edb9ca045c src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp > --- a/src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp Fri Jan 10 15:38:25 2020 +0100 > +++ b/src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp Wed Jan 15 19:47:47 2020 +0100 > @@ -234,11 +234,11 @@ > _buckets = NEW_C_HEAP_ARRAY(StringDedupEntry*, _size, mtGC); > memset(_buckets, 0, _size * sizeof(StringDedupEntry*)); > } > > StringDedupTable::~StringDedupTable() { > - FREE_C_HEAP_ARRAY(G1StringDedupEntry*, _buckets); > + FREE_C_HEAP_ARRAY(StringDedupEntry*, _buckets); > } > > Testing: x86_64 fastdebug build; hotspot_gc_shenandoah > > -- > Thanks, > -Aleksey From zgu at redhat.com Wed Jan 15 18:51:50 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 15 Jan 2020 13:51:50 -0500 Subject: RFR (T) 8237217: Incorrect G1StringDedupEntry type used in StringDedupTable destructor In-Reply-To: <1a60a3d0-7329-b919-cd3d-3dcddfa50b8e@redhat.com> References: <1a60a3d0-7329-b919-cd3d-3dcddfa50b8e@redhat.com> Message-ID: <161f8838-c3f7-e718-d72e-8990809bed26@redhat.com> Good and trivial. -Zhengyu On 1/15/20 1:48 PM, Aleksey Shipilev wrote: > Bug: > https://bugs.openjdk.java.net/browse/JDK-8237217 > > Spotted this when reading the strdedup code. This is a trivial leftover from JDK-8203641. > G1StringDedupEntry symbol does not even exist, and the whole thing works because FREE_C_HEAP_ARRAY > ignores that parameter. But it should be consistent anyway with constructor anyway. 
> > I would not bother with jdk-submit testing, as it looks pretty trivial. > > Fix: > > diff -r f7edb9ca045c src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp > --- a/src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp Fri Jan 10 15:38:25 2020 +0100 > +++ b/src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp Wed Jan 15 19:47:47 2020 +0100 > @@ -234,11 +234,11 @@ > _buckets = NEW_C_HEAP_ARRAY(StringDedupEntry*, _size, mtGC); > memset(_buckets, 0, _size * sizeof(StringDedupEntry*)); > } > > StringDedupTable::~StringDedupTable() { > - FREE_C_HEAP_ARRAY(G1StringDedupEntry*, _buckets); > + FREE_C_HEAP_ARRAY(StringDedupEntry*, _buckets); > } > > Testing: x86_64 fastdebug build; hotspot_gc_shenandoah > From shade at redhat.com Wed Jan 15 19:05:11 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 15 Jan 2020 20:05:11 +0100 Subject: RFR (T) 8237217: Incorrect G1StringDedupEntry type used in StringDedupTable destructor In-Reply-To: References: <1a60a3d0-7329-b919-cd3d-3dcddfa50b8e@redhat.com> Message-ID: <506fa331-792c-8550-a546-2ed1a8d1f279@redhat.com> On 1/15/20 7:51 PM, Kim Barrett wrote: >> On Jan 15, 2020, at 1:48 PM, Aleksey Shipilev wrote: >> >> Bug: >> https://bugs.openjdk.java.net/browse/JDK-8237217 >> >> Spotted this when reading the strdedup code. This is a trivial leftover from JDK-8203641. >> G1StringDedupEntry symbol does not even exist, and the whole thing works because FREE_C_HEAP_ARRAY >> ignores that parameter. But it should be consistent anyway with constructor anyway. >> >> I would not bother with jdk-submit testing, as it looks pretty trivial. > > Looks good, and trivial. Thanks, pushed. 
-- Thanks, -Aleksey From shade at redhat.com Wed Jan 15 19:17:24 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 15 Jan 2020 20:17:24 +0100 Subject: [15] RFR 8236878: Use atomic instruction to update StringDedupTable's entries and entries_removed counters In-Reply-To: References: <97d04872-7abb-396d-7552-f85b4cf1b97b@redhat.com> Message-ID: <6471ea70-e89f-17ef-9585-20f4c16a3e23@redhat.com> On 1/14/20 6:19 PM, Zhengyu Gu wrote: >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8236878 >>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236878/webrev.00/index.html It is odd to mix the atomic update and locked update. We can lose locked updates, which do not expect anyone else to modify the field while the lock is held. It is probably fine for _entries_removed, as it is used for statistics.
-- Thanks, -Aleksey From shade at redhat.com Wed Jan 15 19:50:57 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 15 Jan 2020 20:50:57 +0100 Subject: RFR (XS) 8237223: Shenandoah: important flags should not be ergonomic for concurrent class unloading Message-ID: Bug: https://bugs.openjdk.java.net/browse/JDK-8237223 Fix: diff -r 53b6aad22933 src/hotspot/share/gc/shenandoah/shenandoahNormalMode.cpp --- a/src/hotspot/share/gc/shenandoah/shenandoahNormalMode.cpp Wed Jan 15 20:04:51 2020 +0100 +++ b/src/hotspot/share/gc/shenandoah/shenandoahNormalMode.cpp Wed Jan 15 20:49:30 2020 +0100 @@ -34,10 +34,11 @@ void ShenandoahNormalMode::initialize_flags() const { + if (ShenandoahConcurrentRoots::can_do_concurrent_class_unloading()) { + FLAG_SET_DEFAULT(ShenandoahSuspendibleWorkers, true); + FLAG_SET_DEFAULT(VerifyBeforeExit, false); + } + SHENANDOAH_ERGO_ENABLE_FLAG(ExplicitGCInvokesConcurrent); SHENANDOAH_ERGO_ENABLE_FLAG(ShenandoahImplicitGCInvokesConcurrent); - if (ShenandoahConcurrentRoots::can_do_concurrent_class_unloading()) { - SHENANDOAH_ERGO_ENABLE_FLAG(ShenandoahSuspendibleWorkers); - SHENANDOAH_ERGO_DISABLE_FLAG(VerifyBeforeExit); - } Testing: hotspot_gc_shenandoah -- Thanks, -Aleksey From zgu at redhat.com Wed Jan 15 20:03:44 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 15 Jan 2020 15:03:44 -0500 Subject: RFR (XS) 8237223: Shenandoah: important flags should not be ergonomic for concurrent class unloading In-Reply-To: References: Message-ID: Ah, okay. Looks good to me. 
Thanks, -Zhengyu On 1/15/20 2:50 PM, Aleksey Shipilev wrote: > Bug: > https://bugs.openjdk.java.net/browse/JDK-8237223 > > Fix: > > diff -r 53b6aad22933 src/hotspot/share/gc/shenandoah/shenandoahNormalMode.cpp > --- a/src/hotspot/share/gc/shenandoah/shenandoahNormalMode.cpp Wed Jan 15 20:04:51 2020 +0100 > +++ b/src/hotspot/share/gc/shenandoah/shenandoahNormalMode.cpp Wed Jan 15 20:49:30 2020 +0100 > @@ -34,10 +34,11 @@ > > void ShenandoahNormalMode::initialize_flags() const { > + if (ShenandoahConcurrentRoots::can_do_concurrent_class_unloading()) { > + FLAG_SET_DEFAULT(ShenandoahSuspendibleWorkers, true); > + FLAG_SET_DEFAULT(VerifyBeforeExit, false); > + } > + > SHENANDOAH_ERGO_ENABLE_FLAG(ExplicitGCInvokesConcurrent); > SHENANDOAH_ERGO_ENABLE_FLAG(ShenandoahImplicitGCInvokesConcurrent); > - if (ShenandoahConcurrentRoots::can_do_concurrent_class_unloading()) { > - SHENANDOAH_ERGO_ENABLE_FLAG(ShenandoahSuspendibleWorkers); > - SHENANDOAH_ERGO_DISABLE_FLAG(VerifyBeforeExit); > - } > > Testing: hotspot_gc_shenandoah > From zgu at redhat.com Wed Jan 15 21:29:34 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 15 Jan 2020 16:29:34 -0500 Subject: [15] RFR 8236878: Use atomic instruction to update StringDedupTable's entries and entries_removed counters In-Reply-To: <6471ea70-e89f-17ef-9585-20f4c16a3e23@redhat.com> References: <97d04872-7abb-396d-7552-f85b4cf1b97b@redhat.com> <6471ea70-e89f-17ef-9585-20f4c16a3e23@redhat.com> Message-ID: <8eeccdc6-960b-591a-d1b1-42bb50f868ad@redhat.com> On 1/15/20 2:17 PM, Aleksey Shipilev wrote: > On 1/14/20 6:19 PM, Zhengyu Gu wrote: >>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8236878 >>>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236878/webrev.00/index.html > > It is odd to mix the atomic update and locked update. We can lose locked updates that do not expect > anyone to modify the field when lock is held. It is probably fine for _entries_removed, as it is > used for statistics. 
It seems riskier to do for _table->_entries: are we sure nothing in the String > dedup table relies on that being very accurate? Atomic update and locked update do not overlap. The counter updates only happen at a safepoint (before this patch) or with StringDedupTable_lock held (with this patch). Added the following patch to make it more obvious. diff -r 7ce7d01e68ec src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp --- a/src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp Wed Jan 15 14:37:34 2020 -0500 +++ b/src/hotspot/share/gc/shared/stringdedup/stringDedupTable.cpp Wed Jan 15 14:48:21 2020 -0500 @@ -479,6 +479,7 @@ // Delayed update to avoid contention on the table lock if (removed > 0) { + assert_locked_or_safepoint_weak(StringDedupTable_lock); Atomic::sub(&_table->_entries, removed); Atomic::add(&_entries_removed, removed); } > > Can you explain a little bit why we cannot block on StringDedupTable_lock here? Is this a reentrancy > issue? Table rehashing is part of cleanup, and it utilizes workers to perform parallel rehashing; therefore, it needs to use a lock or atomic operations to update entry counters from each worker. The concurrent string dedup cleaning task needs to take StringDedupTable_lock to prevent mutators from modifying the table, so the workers cannot acquire the lock as well; otherwise, we would deadlock. As far as I know, _table->_entries is only used to make the rehashing decision, so there is no blocking requirement. Updated webrev: http://cr.openjdk.java.net/~zgu/JDK-8236878/webrev.01/index.html Test: Reran hotspot_gc test. Thanks, -Zhengyu > From manc at google.com Thu Jan 16 00:08:46 2020 From: manc at google.com (Man Cao) Date: Wed, 15 Jan 2020 16:08:46 -0800 Subject: Work-in-progress: 8236485: Epoch synchronization protocol for G1 concurrent refinement In-Reply-To: References: Message-ID: We had an offline discussion on this. To keep the community in the loop, here is what we discussed. a.
Using the Linux membarrier syscall or equivalent on other OSes seems a cleaner solution than thread-local handshake (TLH). But we need to have a backup mechanism for OSes and older Linuxes that do not have such a syscall. b. For the blocking property of TLH, https://bugs.openjdk.java.net/browse/JDK-8230594 may help solve the problem once it is implemented. c. TLH could be issued to a subset of all threads, e.g. only to threads that have not yet reached the global epoch. This could save a lot of time for the handshake. d. Compiler threads are Java threads but they are mostly not in Java state. They could be a source of problems for the epoch synchronization protocol. e. The filter in G1EpochSynchronizer::check_and_update_frontier() may be incorrect, because it racily reads a remote thread's state, which may not observe all threads in Java state. f. Implementing asynchronous processing of the dirty card buffers could avoid a lot of TLH requests, so the speed of TLH may not be hugely concerning. g. It may be OK to slow down the native post-write barrier a bit with more frequent execution of the StoreLoad fence. We could do some benchmarking to test this. A more debatable issue is whether we should make the native post-write barrier different from the post-write barrier in Java code, so that only the native barrier has the StoreLoad fence. I will work further on these issues. -Man On Sun, Dec 22, 2019 at 8:50 AM Man Cao wrote: > Hi all, > > I have written up a description and challenges for implementing an epoch > synchronization protocol. This protocol is necessary for removing the > StoreLoad fence in G1's post-write barrier (JDK-8226731) > > Description: https://bugs.openjdk.java.net/browse/JDK-8236485 > Work-in-progress webrev: > https://cr.openjdk.java.net/~manc/8236485/webrev_wip0/ > > There are two main challenges that I'm not sure how to resolve: > - Triggering a thread-local handshake is a blocking operation that can > pass a safepoint.
> - There are native post-write barriers executed by threads in native/VM > state. > > Discussions and suggestions are highly appreciated! > > -Man > From felix.yang at huawei.com Thu Jan 16 03:01:55 2020 From: felix.yang at huawei.com (Yangfei (Felix)) Date: Thu, 16 Jan 2020 03:01:55 +0000 Subject: [RFC] ZGC proposal for aarch64 jdk11u In-Reply-To: <38a15dc5-9cee-0f44-13ee-98f185ee72ae@oracle.com> References: <38a15dc5-9cee-0f44-13ee-98f185ee72ae@oracle.com> Message-ID: Hi, > Hi, > > Please note that backporting JDK-8224675 "Late GC barrier insertion for ZGC" > is not a great idea, since that patch introduced stability issues and the whole > approach was later superseded by JDK-8230565 "ZGC: Redesign C2 load > barrier to expand on the MachNode level". > > If you want to go down this path, I'd suggest that you either don't backport > JDK-8224675 at all, or backport everything up to JDK-8224675 + JDK-8230565. > Also note that if you include JDK-8230565 you want to be careful to also include > any followup bug fixes, like JDK-8233506. Thanks for pointing this out. It's helpful for our current work. We plan to start with the four patches and will check for other necessary ones. We noticed patches like JDK-8230565 are necessary for x86 zgc, but they're not there in jdk11. Users who want to stay with LTS versions like jdk11 will most likely run into these problems when they try zgc on the x86 platform. Is there a plan to incorporate these patches in jdk11? > In general, a lot of stability and performance improvements have gone into ZGC > since JDK 11. If at all possible, I would strongly recommend using JDK 14 > instead, where you already have aarch64 support and all other goodies. Does that mean zgc in jdk11 will not be maintained by the community?
Thanks, Felix From maoliang.ml at alibaba-inc.com Thu Jan 16 03:21:13 2020 From: maoliang.ml at alibaba-inc.com (Liang Mao) Date: Thu, 16 Jan 2020 11:21:13 +0800 Subject: Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics In-Reply-To: References: <5b24e235-5466-15a1-78a6-6f63bfa1878e@oracle.com> <43090624-d8be-8600-a55e-1e10b1920135@oracle.com> <359fbef8-6735-4958-b76f-56430f1a4108.maoliang.ml@alibaba-inc.com> <36b31e9a-ee86-50d9-8042-bc79e6756777@oracle.com> <08e025f0a9520b12d06df8157d63d73b4e7e11a4.camel@oracle.com> <4fe7f936-95bb-4a4e-85ed-e5c4423f9d06.maoliang.ml@alibaba-inc.com> , <9d21b384-790b-c2e2-d801-0025c9257656@oracle.com> <693b04b7-d13d-4ef5-b425-febc81984dbc.maoliang.ml@alibaba-inc.com>, Message-ID: Hi Thomas, Yes. We can focus on the current concurrent shrinking for now. You are right that changing the default behavior will be sensitive since you need to cover all types of applications including throughput and low-latency while our previous patch is mostly designed for low-latency. We'll figure this out later:) Thanks, Liang ------------------------------------------------------------------ From:Thomas Schatzl Send Time:2020 Jan. 16 (Thu.) 01:57 To:"MAO, Liang" ; hotspot-gc-dev Subject:Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics Hi, On Wed, 2020-01-15 at 20:53 +0800, Liang Mao wrote: > Hi Thomas, > > So G1 doesn't need to shrink below Xms if SoftMaxHeapSize is > below Xms, does it? > No, never shrink below MinHeapSize. > Another question is that no matter whether we have an additional option, we > had better have 2 criteria. The first is for urgent expansion, where > GCTimeRatio is quite low and concurrent expansion with frequent GCs > is more harmful and expansion should be done immediately. It's the > current default flow as we found that 12 is a good number below which > applications can obviously incur timeout errors.
The second is to keep the GCTimeRatio and memory footprint in a balanced state, so any adjustments are better done concurrently. The original number 99 fits well here. If we have only one option "GCTimeRatio", we might not be able to achieve both. Maybe we can have a LowGCTimeRatio below which the state is supposed to be unacceptable and a HighTimeRatio which is certainly healthy. So far the change has been about shrinking the heap concurrently, and not expansion. Let's concentrate on the issue at hand, i.e. see how heap shrinking at more places turns out. I believe there will be lots of tweaking needed for this change to not show too many regressions in other applications. Remember that the defaults should work well for a large body of applications, not just a few. There may be knobs to tune it for others. Then look at concurrent expansion and at phase changes in the application: how to detect them, and how best to react. Just for reference, last time we changed the sizing algorithm it took a few months to get it "right", with mostly improvements all around. Thanks, Thomas From kim.barrett at oracle.com Thu Jan 16 05:57:17 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 16 Jan 2020 00:57:17 -0500 Subject: RFR: 8237261: Concurrent refinement activation threshold not updated for card counts Message-ID: Please review this change to the activation threshold for the primary (first) concurrent refinement thread. The special calculation used for that thread's threshold wasn't updated to account for the change from using buffer counts to using counts of the cards in the buffers by JDK-8230109. Also fixed a parameter name that wasn't updated by that same change from buffer counts to card counts.
CR: https://bugs.openjdk.java.net/browse/JDK-8237261 Webrev: https://cr.openjdk.java.net/~kbarrett/8237261/open.00/ Testing: mach5 tier1 by itself mach5 tier1-5 and some perf testing with in development change for JDK-8237143 From felix.yang at huawei.com Thu Jan 16 06:42:29 2020 From: felix.yang at huawei.com (Yangfei (Felix)) Date: Thu, 16 Jan 2020 06:42:29 +0000 Subject: [aarch64-port-dev ] [RFC] ZGC proposal for aarch64 jdk11u In-Reply-To: <6b716737-ba03-71c2-5488-b5654093c447@redhat.com> References: <6b716737-ba03-71c2-5488-b5654093c447@redhat.com> Message-ID: Hi, > On 1/15/20 1:10 PM, Stuart Monteith wrote: > > I'm pleased that there is interest in ZGC on aarch64, that the > > performance is at expected levels and is apparently trouble free. > > However, I'd like to understand why this backporting is being done. If > > it is for running in production, then I'd expect Per, etc, to not be > > upset or disagree when I say that ZGC on aarch64 in JDK 13 isn't > > production ready. > > In particular, it's perhaps odd that something which is still an experimental > feature in mainline is being considered for a backport. So long as zgc in jdk11 is continually maintained by the community, it may not be a bad idea to enable it on one more arch provided that the risk is acceptable after code review. Otherwise, we are on our own. Users are much more conservative when it comes to migrating to a new jdk version. Also, they make their own decisions about which GC policy to use. Thanks, Felix From per.liden at oracle.com Thu Jan 16 08:22:33 2020 From: per.liden at oracle.com (Per Liden) Date: Thu, 16 Jan 2020 09:22:33 +0100 Subject: [RFC] ZGC proposal for aarch64 jdk11u In-Reply-To: References: <38a15dc5-9cee-0f44-13ee-98f185ee72ae@oracle.com> Message-ID: Hi, ZGC in JDK 11 is fairly stable as it is, so there's no super compelling reason to spend time and resources on backporting JDK-8233506 at this time.
However, backporting only JDK-8224675 would be a mistake, as it would destabilize ZGC (including the x86 port), so you would basically have to go all the way to JDK-8230565, or alternatively not backport JDK-8224675 and adjust the aarch64 port accordingly. Whatever path you take here, it would require significant work and testing, which is why I'd again recommend that you consider using JDK 14 (when it's GA) for these workloads. cheers, Per On 1/16/20 4:01 AM, Yangfei (Felix) wrote: > Hi, > >> Hi, >> >> Please note that backporting JDK-8224675 "Late GC barrier insertion for ZGC" >> is not great idea, since that patch introduced stability issues and the whole >> approach was later superseded by JDK-8230565 "ZGC: Redesign C2 load >> barrier to expand on the MachNode level". >> >> If you want to go down this path, I'd suggest that you either don't backport >> JDK-8224675 at all, or backport everything up to JDK-8224675 + JDK-8230565. >> Also note that if you include JDK-8230565 you want to be careful to also include >> any followup bug fixes, like JDK-8233506. > > Thanks for pointing this out. It's helpful for our current work. > We plan to start with the four patches and will check for other necessary ones. > We noticed patches like JDK-8230565 are necessary for x86 zgc, but it's not there in jdk11. > Users who want to stay with LTS versions like jdk11 will most likely come to the problems when they try zgc on the x86 platform. > Is there a plan to incorporate these patches in jdk11? > >> In general, a lot of stability and performance improvements have gone into ZGC >> since JDK 11. If at all possible, I would strongly recommend using JDK 14 >> instead, where you already have aarch64 support and all other goodies. > > Does that mean zgc in jdk11 will not be maintained by the community?
> > > Thanks, > Felix > From kim.barrett at oracle.com Thu Jan 16 08:51:17 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 16 Jan 2020 03:51:17 -0500 Subject: RFR: 8237143: Eliminate DirtyCardQ_cbl_mon Message-ID: <745E91C1-AE1A-4DA2-80EE-59B70897F4BF@oracle.com> Please review this change to eliminate the DirtyCardQ_cbl_mon. This is one of the two remaining super-special "access" ranked mutexes. (The other is the Shared_DirtyCardQ_lock, whose elimination is covered by JDK-8221360.) There are three main parts to this change. (1) Replace the under-a-lock FIFO queue in G1DirtyCardQueueSet with a lock-free FIFO queue. (2) Replace the use of a HotSpot monitor for signaling activation of concurrent refinement threads with a semaphore-based solution. (3) Handle pausing of buffer refinement in the middle of a buffer in order to handle a pending safepoint request. This can no longer just push the partially processed buffer back onto the queue, due to ABA problems now that the buffer is lock-free. CR: https://bugs.openjdk.java.net/browse/JDK-8237143 Webrev: https://cr.openjdk.java.net/~kbarrett/8237143/open.00/ Testing: mach5 tier1-5 Normal performance testing showed no significant change. specjbb2015 on a very big machine showed a 3.5% average critical-jOPS improvement, though not statistically significant; removing contention for that lock by many hardware threads may be a little bit noticeable. 
From shade at redhat.com Thu Jan 16 08:51:42 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 16 Jan 2020 09:51:42 +0100 Subject: [15] RFR 8236878: Use atomic instruction to update StringDedupTable's entries and entries_removed counters In-Reply-To: <8eeccdc6-960b-591a-d1b1-42bb50f868ad@redhat.com> References: <97d04872-7abb-396d-7552-f85b4cf1b97b@redhat.com> <6471ea70-e89f-17ef-9585-20f4c16a3e23@redhat.com> <8eeccdc6-960b-591a-d1b1-42bb50f868ad@redhat.com> Message-ID: <74b22231-bb01-0bc3-5707-0a1107065181@redhat.com> On 1/15/20 10:29 PM, Zhengyu Gu wrote: > Updated webrev: > http://cr.openjdk.java.net/~zgu/JDK-8236878/webrev.01/index.html OK, thanks for explaining. I guess that makes sense. This comment is outdated then: 480 // Delayed update to avoid contention on the table lock I'd suggest to rewrite it to: // Do atomic update here instead of taking StringDedupTable_lock. This allows concurrent // cleanup when multiple workers are cleaning up the table, while the mutators are blocked // on StringDedupTable_lock. ...or some such. -- Thanks, -Aleksey From per.liden at oracle.com Thu Jan 16 09:24:18 2020 From: per.liden at oracle.com (Per Liden) Date: Thu, 16 Jan 2020 10:24:18 +0100 Subject: RFR: 8237201: ZGC: Remove unused ZRelocationSetSelector::fragmentation() In-Reply-To: <1a626930-3be4-4976-8738-9f3d716873ce@oracle.com> References: <35727b46-b4d8-8336-b484-1119bff15468@oracle.com> <1a626930-3be4-4976-8738-9f3d716873ce@oracle.com> Message-ID: <2d04ae6b-c532-fd3a-48a5-62946b554fc3@oracle.com> Thanks for reviewing, Thomas! /Per On 1/15/20 4:05 PM, Thomas Schatzl wrote: > Hi, > > On 15.01.20 14:03, Per Liden wrote: >> ZRelocationSetSelector::fragmentation() is not used and can be removed. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237201 >> Webrev: http://cr.openjdk.java.net/~pliden/8237201/webrev.0 >> >> /Per > > ? looks good. 
> > Thomas From stefan.johansson at oracle.com Thu Jan 16 10:10:18 2020 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Thu, 16 Jan 2020 11:10:18 +0100 Subject: RFR: 8237261: Concurrent refinement activation threshold not updated for card counts In-Reply-To: References: Message-ID: <1f213c0e-5425-e06d-836e-4770bb7596f4@oracle.com> Hi Kim, On 2020-01-16 06:57, Kim Barrett wrote: > Please review this change to the activation threshold for the primary > (first) concurrent refinement thread. The special calculation used > for that thread's threshold wasn't updated to account for the change > from using buffer counts to using counts of the cards in the buffers > by JDK-8230109. > > Also fixed a parameter name that wasn't updated by that same change > from buffer counts to card counts. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8237261 > > Webrev: > https://cr.openjdk.java.net/~kbarrett/8237261/open.00/ > Looks good, Stefan > Testing: > mach5 tier1 by itself > mach5 tier1-5 and some perf testing with in development change for JDK-8237143 > From stefan.karlsson at oracle.com Thu Jan 16 10:13:53 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 16 Jan 2020 11:13:53 +0100 Subject: RFR: 8237198+8237199+8237200: ZGC: Share heap multi-mapping code across platforms In-Reply-To: References: Message-ID: <60712fdc-b2b6-4ecb-6216-a44b5c4996c2@oracle.com> Looks good. StefanK On 2020-01-15 13:57, Per Liden wrote: > Hi, > > Please review this cleanup of the ZPhysicalMemory/ZBackingFile layer, > which aims to de-duplicate some of the multi-mapping code. I've split > the change into three separate patches, the main patch followed by two > patches doing some renaming. > > > 1) The ZBackingFile code was designed to allow platforms to decide if > they want to use heap multi-mapping or some other (possibly HW > supported) scheme. 
As of today, all our supported platforms do heap > multi-mapping, so there's some degree of code duplication in > ZBackingFile for each platform. This patch moves common multi-mapping > code into ZPhysicalMemoryManager. If we in the future find that we want > to support a platform that doesn't do multi-mapping, then we can > introduce an abstraction for this again. > Bug: https://bugs.openjdk.java.net/browse/JDK-8237198 > Webrev: http://cr.openjdk.java.net/~pliden/8237198/webrev.0 > > > 2) Rename ZBackingFile to ZPhysicalMemoryBacking, since "File" is > somewhat misleading on platforms other than Linux. > Bug: https://bugs.openjdk.java.net/browse/JDK-8237199 > Webrev: http://cr.openjdk.java.net/~pliden/8237199/webrev.0 > > > 3) Rename ZBackingPath to ZMountPoint, as it's a better name in light of > JDK-8237199. > Bug: https://bugs.openjdk.java.net/browse/JDK-8237200 > Webrev: http://cr.openjdk.java.net/~pliden/8237200/webrev.0 > > > cheers, > Per From per.liden at oracle.com Thu Jan 16 10:50:12 2020 From: per.liden at oracle.com (Per Liden) Date: Thu, 16 Jan 2020 11:50:12 +0100 Subject: RFR: 8237198+8237199+8237200: ZGC: Share heap multi-mapping code across platforms In-Reply-To: <60712fdc-b2b6-4ecb-6216-a44b5c4996c2@oracle.com> References: <60712fdc-b2b6-4ecb-6216-a44b5c4996c2@oracle.com> Message-ID: <02128613-0ced-dc22-d6c1-d9b474063d76@oracle.com> Thanks Stefan! /Per On 1/16/20 11:13 AM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2020-01-15 13:57, Per Liden wrote: >> Hi, >> >> Please review this cleanup of the ZPhysicalMemory/ZBackingFile layer, >> which aims to de-duplicate some of the multi-mapping code. I've split >> the change into three separate patches, the main patch followed by two >> patches doing some renaming. >> >> >> 1) The ZBackingFile code was designed to allow platforms to decide if >> they want to use heap multi-mapping or some other (possibly HW >> supported) scheme. 
As of today, all our supported platforms do heap >> multi-mapping, so there's some degree of code duplication in >> ZBackingFile for each platform. This patch moves common multi-mapping >> code into ZPhysicalMemoryManager. If we in the future find that we >> want to support a platform that doesn't do multi-mapping, then we can >> introduce an abstraction for this again. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237198 >> Webrev: http://cr.openjdk.java.net/~pliden/8237198/webrev.0 >> >> >> 2) Rename ZBackingFile to ZPhysicalMemoryBacking, since "File" is >> somewhat misleading on platforms other than Linux. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237199 >> Webrev: http://cr.openjdk.java.net/~pliden/8237199/webrev.0 >> >> >> 3) Rename ZBackingPath to ZMountPoint, as it's a better name in light >> of JDK-8237199. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237200 >> Webrev: http://cr.openjdk.java.net/~pliden/8237200/webrev.0 >> >> >> cheers, >> Per From thomas.schatzl at oracle.com Thu Jan 16 11:37:17 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 16 Jan 2020 12:37:17 +0100 Subject: RFR: 8237261: Concurrent refinement activation threshold not updated for card counts In-Reply-To: <1f213c0e-5425-e06d-836e-4770bb7596f4@oracle.com> References: <1f213c0e-5425-e06d-836e-4770bb7596f4@oracle.com> Message-ID: <9fb2208e-677c-db88-b8c7-641c11a24405@oracle.com> Hi, On 16.01.20 11:10, Stefan Johansson wrote: > Hi Kim, > > On 2020-01-16 06:57, Kim Barrett wrote: >> Please review this change to the activation threshold for the primary >> (first) concurrent refinement thread.? The special calculation used >> for that thread's threshold wasn't updated to account for the change >> from using buffer counts to using counts of the cards in the buffers >> by JDK-8230109. >> >> Also fixed a parameter name that wasn't updated by that same change >> from buffer counts to card counts. 
>> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8237261 >> >> Webrev: >> https://cr.openjdk.java.net/~kbarrett/8237261/open.00/ >> > Looks good, > Stefan +1 Thomas From fweimer at redhat.com Thu Jan 16 12:06:42 2020 From: fweimer at redhat.com (Florian Weimer) Date: Thu, 16 Jan 2020 13:06:42 +0100 Subject: Work-in-progress: 8236485: Epoch synchronization protocol for G1 concurrent refinement In-Reply-To: (Man Cao's message of "Wed, 15 Jan 2020 16:08:46 -0800") References: Message-ID: <87blr3vc4t.fsf@oldenburg2.str.redhat.com> * Man Cao: > We had an offline discussion on this. To keep the community in the loop, > here is what we discussed. > > a. Using Linux membarrier syscall or equivalent on other OSes seems a > cleaner solution than thread-local handshake (TLH). But we need to have a > backup mechanism for OSes and older Linuxes that do not have such a > syscall. Can you do with a membarrier call that doesn't require registration? The usual fallback for membarrier is sending a special signal to all threads, and make sure that they have run code in a signal handler (possibly using a CPU barrier there). But of course this is rather slow. membarrier has seen some backporting activity, but as far as I can see, that hasn't been consistent across architectures. Thanks, Florian From thomas.schatzl at oracle.com Thu Jan 16 13:20:15 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 16 Jan 2020 14:20:15 +0100 Subject: [14] RFR (XS): 8235305: Corrupted oops embedded in nmethods due to parallel modification during optional evacuation Message-ID: <15678a97-f219-e0f4-c0b6-a4a2a06e6768@oracle.com> Hi all, can I get reviews for this change that fixes a bug in the abortable mixed gc algorithm where G1 might corrupt oops embedded in nmethods due to parallel modification during an optional evacuation phase? 
G1 currently collects embedded oops in nmethods twice: once in the optional roots list, and once as nmethods in the strong code roots list for a particular region. Now it can happen that this oop embedded in the code stream is unaligned, so if that oop is modified during relocation, word tearing may occur, causing follow-up crashes. The fix is to not collect oops from nmethods in the optional code root list as the strong code root list for a particular region already always contains it anyway. Thanks go to stefank, eriko and sjohanss for helping with analyzing, testing and the discussion around it. CR: https://bugs.openjdk.java.net/browse/JDK-8235305 Webrev: http://cr.openjdk.java.net/~tschatzl/8235305/webrev/ Testing: multiple runs of hs-tier1-5, multiple runs of the crashing application (24h kitchensink) with and without a VM modification and also with some G1 settings that caused crashes within 1-2 hours that reproduced the issue within 5 minutes. Currently starting perf test runs with and without this change: however since this change strictly reduces the work done at all times I am not expecting any regressions (and hence I am asking for review in advance). Thanks, Thomas From stuart.monteith at linaro.org Thu Jan 16 13:24:28 2020 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Thu, 16 Jan 2020 13:24:28 +0000 Subject: RFR: 8237198+8237199+8237200: ZGC: Share heap multi-mapping code across platforms In-Reply-To: References: Message-ID: Looks good to me, thanks. Stuart On Wed, 15 Jan 2020 at 12:57, Per Liden wrote: > > Hi, > > Please review this cleanup of the ZPhysicalMemory/ZBackingFile layer, > which aims to de-duplicate some of the multi-mapping code. I've split > the change into three separate patches, the main patch followed by two > patches doing some renaming. > > > 1) The ZBackingFile code was designed to allow platforms to decide if > they want to use heap multi-mapping or some other (possibly HW > supported) scheme.
As of today, all our supported platforms do heap > multi-mapping, so there's some degree of code duplication in > ZBackingFile for each platform. This patch moves common multi-mapping > code into ZPhysicalMemoryManager. If we in the future find that we want > to support a platform that doesn't do multi-mapping, then we can > introduce an abstraction for this again. > Bug: https://bugs.openjdk.java.net/browse/JDK-8237198 > Webrev: http://cr.openjdk.java.net/~pliden/8237198/webrev.0 > > > 2) Rename ZBackingFile to ZPhysicalMemoryBacking, since "File" is > somewhat misleading on platforms other than Linux. > Bug: https://bugs.openjdk.java.net/browse/JDK-8237199 > Webrev: http://cr.openjdk.java.net/~pliden/8237199/webrev.0 > > > 3) Rename ZBackingPath to ZMountPoint, as it's a better name in light of > JDK-8237199. > Bug: https://bugs.openjdk.java.net/browse/JDK-8237200 > Webrev: http://cr.openjdk.java.net/~pliden/8237200/webrev.0 > > > cheers, > Per From per.liden at oracle.com Thu Jan 16 13:31:34 2020 From: per.liden at oracle.com (Per Liden) Date: Thu, 16 Jan 2020 14:31:34 +0100 Subject: RFR: 8237198+8237199+8237200: ZGC: Share heap multi-mapping code across platforms In-Reply-To: References: Message-ID: <06c09f7f-a07a-b6ce-41f3-fa148f3af1c7@oracle.com> Thanks for reviewing, Stuart! cheers, Per On 1/16/20 2:24 PM, Stuart Monteith wrote: > Looks good to me, thanks. > > Stuart > > On Wed, 15 Jan 2020 at 12:57, Per Liden wrote: >> >> Hi, >> >> Please review this cleanup of the ZPhysicalMemory/ZBackingFile layer, >> which aims to de-duplicate some of the multi-mapping code. I've split >> the change into three separate patches, the main patch followed by two >> patches doing some renaming. >> >> >> 1) The ZBackingFile code was designed to allow platforms to decide if >> they want to use heap multi-mapping or some other (possibly HW >> supported) scheme. 
As of today, all our supported platforms do heap >> multi-mapping, so there's some degree of code duplication in >> ZBackingFile for each platform. This patch moves common multi-mapping >> code into ZPhysicalMemoryManager. If we in the future find that we want >> to support a platform that doesn't do multi-mapping, then we can >> introduce an abstraction for this again. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237198 >> Webrev: http://cr.openjdk.java.net/~pliden/8237198/webrev.0 >> >> >> 2) Rename ZBackingFile to ZPhysicalMemoryBacking, since "File" is >> somewhat misleading on platforms other than Linux. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237199 >> Webrev: http://cr.openjdk.java.net/~pliden/8237199/webrev.0 >> >> >> 3) Rename ZBackingPath to ZMountPoint, as it's a better name in light of >> JDK-8237199. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237200 >> Webrev: http://cr.openjdk.java.net/~pliden/8237200/webrev.0 >> >> >> cheers, >> Per From erik.osterlund at oracle.com Thu Jan 16 14:44:36 2020 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Thu, 16 Jan 2020 15:44:36 +0100 Subject: RFR: 8237198+8237199+8237200: ZGC: Share heap multi-mapping code across platforms In-Reply-To: References: Message-ID: <47579bf7-e727-6233-c821-82d9fb9bdfba@oracle.com> Hi Per, I like the red stuff. +1 /Erik On 1/15/20 1:57 PM, Per Liden wrote: > Hi, > > Please review this cleanup of the ZPhysicalMemory/ZBackingFile layer, > which aims to de-duplicate some of the multi-mapping code. I've split > the change into three separate patches, the main patch followed by two > patches doing some renaming. > > > 1) The ZBackingFile code was designed to allow platforms to decide if > they want to use heap multi-mapping or some other (possibly HW > supported) scheme. As of today, all our supported platforms do heap > multi-mapping, so there's some degree of code duplication in > ZBackingFile for each platform. 
This patch moves common multi-mapping > code into ZPhysicalMemoryManager. If we in the future find that we > want to support a platform that doesn't do multi-mapping, then we can > introduce an abstraction for this again. > Bug: https://bugs.openjdk.java.net/browse/JDK-8237198 > Webrev: http://cr.openjdk.java.net/~pliden/8237198/webrev.0 > > > 2) Rename ZBackingFile to ZPhysicalMemoryBacking, since "File" is > somewhat misleading on platforms other than Linux. > Bug: https://bugs.openjdk.java.net/browse/JDK-8237199 > Webrev: http://cr.openjdk.java.net/~pliden/8237199/webrev.0 > > > 3) Rename ZBackingPath to ZMountPoint, as it's a better name in light > of JDK-8237199. > Bug: https://bugs.openjdk.java.net/browse/JDK-8237200 > Webrev: http://cr.openjdk.java.net/~pliden/8237200/webrev.0 > > > cheers, > Per From per.liden at oracle.com Thu Jan 16 15:44:50 2020 From: per.liden at oracle.com (Per Liden) Date: Thu, 16 Jan 2020 16:44:50 +0100 Subject: RFR: 8237198+8237199+8237200: ZGC: Share heap multi-mapping code across platforms In-Reply-To: <47579bf7-e727-6233-c821-82d9fb9bdfba@oracle.com> References: <47579bf7-e727-6233-c821-82d9fb9bdfba@oracle.com> Message-ID: <1a06d55a-28b9-d00e-3d78-c81d9816be97@oracle.com> Thanks Erik! /Per On 1/16/20 3:44 PM, erik.osterlund at oracle.com wrote: > Hi Per, > > I like the red stuff. +1 > > /Erik > > On 1/15/20 1:57 PM, Per Liden wrote: >> Hi, >> >> Please review this cleanup of the ZPhysicalMemory/ZBackingFile layer, >> which aims to de-duplicate some of the multi-mapping code. I've split >> the change into three separate patches, the main patch followed by two >> patches doing some renaming. >> >> >> 1) The ZBackingFile code was designed to allow platforms to decide if >> they want to use heap multi-mapping or some other (possibly HW >> supported) scheme. As of today, all our supported platforms do heap >> multi-mapping, so there's some degree of code duplication in >> ZBackingFile for each platform. 
This patch moves common multi-mapping >> code into ZPhysicalMemoryManager. If we in the future find that we >> want to support a platform that doesn't do multi-mapping, then we can >> introduce an abstraction for this again. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237198 >> Webrev: http://cr.openjdk.java.net/~pliden/8237198/webrev.0 >> >> >> 2) Rename ZBackingFile to ZPhysicalMemoryBacking, since "File" is >> somewhat misleading on platforms other than Linux. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237199 >> Webrev: http://cr.openjdk.java.net/~pliden/8237199/webrev.0 >> >> >> 3) Rename ZBackingPath to ZMountPoint, as it's a better name in light >> of JDK-8237199. >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237200 >> Webrev: http://cr.openjdk.java.net/~pliden/8237200/webrev.0 >> >> >> cheers, >> Per > From leo.korinth at oracle.com Thu Jan 16 16:06:29 2020 From: leo.korinth at oracle.com (Leo Korinth) Date: Thu, 16 Jan 2020 17:06:29 +0100 Subject: RFR (M): 8235860: Obsolete the UseParallelOldGC option In-Reply-To: <48af99ac-9e7a-c112-800e-db13e3b3bbcb@oracle.com> References: <292ab94f-f2c8-b373-d5a5-46a45470540e@oracle.com> <2A4B1955-26D5-4544-B476-6E9E5E8009D4@oracle.com> <5e21e50d-a026-98ba-d03d-3f7aa1c31e21@oracle.com> <56A9296B-089E-4A00-9C43-5E3CBDF4A29B@oracle.com> <48af99ac-9e7a-c112-800e-db13e3b3bbcb@oracle.com> Message-ID: <84170868-4b43-c578-f134-bb169c4f2708@oracle.com> Hi! I believe _name and old_gen_name() in PSOldGen should be removed and the virtual name() should return the string literal directly. Change this if you want. With or without using my suggestions, your changes looks good to me. Thanks for cleaning this! 
Leo On 07/01/2020 11:55, Thomas Schatzl wrote: > Hi Kim, > > On 18.12.19 16:45, Kim Barrett wrote: >> >> >>> On Dec 18, 2019, at 4:52 AM, Thomas Schatzl >>> wrote: >>> >>> Fixed in >>> http://cr.openjdk.java.net/~tschatzl/8235860/webrev.0_to_1 (diff) >>> http://cr.openjdk.java.net/~tschatzl/8235860/webrev.1 (full) >> >> Looks good. >> > > Thanks for your review. > >>> >>>> ------------------------------------------------------------------------------ >>>> >>>> src/hotspot/share/gc/parallel/psParallelCompact.hpp >>>> Pre-existing: It seems like the big block comment before SplitInfo >>>> should have received some updates as part of the recent shadow-region >>>> patch, but it wasn't touched. >>>> ------------------------------------------------------------------------------ >>>> >>> >>> I am filing a CR for that. >> >> The comment before PSParallelCompact in the same file might also need >> some updating. >> >> (I was a bit confused in my earlier review about where the relevant >> comments were.) >> > > ? I filed JDK-8141637 before the holidays. I added your recent comment. > > Thanks, > ? Thomas From zgu at redhat.com Thu Jan 16 19:08:52 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 16 Jan 2020 14:08:52 -0500 Subject: [14] RFR 8237369: Shenandoah: failed vmTestbase/nsk/jvmti/AttachOnDemand/attach021/TestDescription.java test Message-ID: Please review this small patch. keep_alive is only applicable during marking phase. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8237369 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237369/webrev.00/ Test: hotspot_gc_shenandoah vmTestbase/nsk/jvmti/AttachOnDemand/attach021/TestDescription.java with Shenandoah GC (normal and traversal mode) Thanks, -Zhengyu From rkennke at redhat.com Thu Jan 16 19:22:30 2020 From: rkennke at redhat.com (Roman Kennke) Date: Thu, 16 Jan 2020 20:22:30 +0100 Subject: [14] RFR 8237369: Shenandoah: failed vmTestbase/nsk/jvmti/AttachOnDemand/attach021/TestDescription.java test In-Reply-To: References: Message-ID: <7f64ce2c-721b-90be-05ac-12386b7a55e8@redhat.com> Yes, good catch! Roman > Please review this small patch. keep_alive is only applicable during > marking phase. > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237369 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237369/webrev.00/ > > Test: > hotspot_gc_shenandoah > > vmTestbase/nsk/jvmti/AttachOnDemand/attach021/TestDescription.java > with Shenandoah GC (normal and traversal mode) > > > Thanks, > > -Zhengyu > From kim.barrett at oracle.com Thu Jan 16 19:53:45 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 16 Jan 2020 14:53:45 -0500 Subject: RFR: 8237261: Concurrent refinement activation threshold not updated for card counts In-Reply-To: <9fb2208e-677c-db88-b8c7-641c11a24405@oracle.com> References: <1f213c0e-5425-e06d-836e-4770bb7596f4@oracle.com> <9fb2208e-677c-db88-b8c7-641c11a24405@oracle.com> Message-ID: > On Jan 16, 2020, at 6:37 AM, Thomas Schatzl wrote: > > Hi, > > On 16.01.20 11:10, Stefan Johansson wrote: >> Hi Kim, >> On 2020-01-16 06:57, Kim Barrett wrote: >>> Please review this change to the activation threshold for the primary >>> (first) concurrent refinement thread. The special calculation used >>> for that thread's threshold wasn't updated to account for the change >>> from using buffer counts to using counts of the cards in the buffers >>> by JDK-8230109.
>>> >>> Also fixed a parameter name that wasn't updated by that same change >>> from buffer counts to card counts. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8237261 >>> >>> Webrev: >>> https://cr.openjdk.java.net/~kbarrett/8237261/open.00/ >>> >> Looks good, >> Stefan > > +1 > > Thomas Thanks. From kim.barrett at oracle.com Thu Jan 16 19:53:35 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 16 Jan 2020 14:53:35 -0500 Subject: RFR: 8237261: Concurrent refinement activation threshold not updated for card counts In-Reply-To: <1f213c0e-5425-e06d-836e-4770bb7596f4@oracle.com> References: <1f213c0e-5425-e06d-836e-4770bb7596f4@oracle.com> Message-ID: > On Jan 16, 2020, at 5:10 AM, Stefan Johansson wrote: > On 2020-01-16 06:57, Kim Barrett wrote: >> Please review this change to the activation threshold for the primary >> (first) concurrent refinement thread. The special calculation used >> for that thread's threshold wasn't updated to account for the change >> from using buffer counts to using counts of the cards in the buffers >> by JDK-8230109. >> Also fixed a parameter name that wasn't updated by that same change >> from buffer counts to card counts. >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8237261 >> Webrev: >> https://cr.openjdk.java.net/~kbarrett/8237261/open.00/ > Looks good, > Stefan Thanks. From zgu at redhat.com Thu Jan 16 20:21:38 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 16 Jan 2020 15:21:38 -0500 Subject: [14] RFR 8237392: Shenandoah: Remove unreliable assertion Message-ID: Offline discussion concluded that the assertion added by JDK-8237369 is not reliable. In a piggyback reference-updating cycle, the has_forwarded_objects flag is carried into the next GC cycle, and Shenandoah resets the marking bitmap concurrently just before the new cycle. So there is a short period without a reliable marking bitmap, which could trigger a false assertion.
Bug: https://bugs.openjdk.java.net/browse/JDK-8237392 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237392/webrev.00/ Test: hotspot_gc_shenandoah Thanks, -Zhengyu From shade at redhat.com Thu Jan 16 20:29:39 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Thu, 16 Jan 2020 21:29:39 +0100 Subject: [14] RFR 8237392: Shenandoah: Remove unreliable assertion In-Reply-To: References: Message-ID: <43b72cb9-d7b9-835e-8a76-5e03d3ce4259@redhat.com> On 1/16/20 9:21 PM, Zhengyu Gu wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8237392 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237392/webrev.00/ Looks good. -- Thanks, -Aleksey From zgu at redhat.com Thu Jan 16 23:37:48 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 16 Jan 2020 18:37:48 -0500 Subject: [15] RFR 8236878: Use atomic instruction to update StringDedupTable's entries and entries_removed counters In-Reply-To: <74b22231-bb01-0bc3-5707-0a1107065181@redhat.com> References: <97d04872-7abb-396d-7552-f85b4cf1b97b@redhat.com> <6471ea70-e89f-17ef-9585-20f4c16a3e23@redhat.com> <8eeccdc6-960b-591a-d1b1-42bb50f868ad@redhat.com> <74b22231-bb01-0bc3-5707-0a1107065181@redhat.com> Message-ID: <155fc21c-f6ca-a680-a681-c6d11482c34e@redhat.com> On 1/16/20 3:51 AM, Aleksey Shipilev wrote: > On 1/15/20 10:29 PM, Zhengyu Gu wrote: >> Updated webrev: >> http://cr.openjdk.java.net/~zgu/JDK-8236878/webrev.01/index.html > > OK, thanks for explaining. I guess that makes sense. > > This comment is outdated then: > 480 // Delayed update to avoid contention on the table lock > > I'd suggest to rewrite it to: > // Do atomic update here instead of taking StringDedupTable_lock. This allows concurrent > // cleanup when multiple workers are cleaning up the table, while the mutators are blocked > // on StringDedupTable_lock. Updated as you suggested and pushed. Thanks, -Zhengyu > > ...or some such. 
> From manc at google.com Fri Jan 17 00:53:04 2020 From: manc at google.com (Man Cao) Date: Thu, 16 Jan 2020 16:53:04 -0800 Subject: Discussion: improve humongous objects handling for G1 Message-ID: Hi all, While migrating our workload from CMS to G1, we found many production applications suffer from humongous allocations. The default threshold for humongous objects is often too small for our applications with heap sizes between 2GB-15GB. Humongous allocations caused noticeable increase in the frequency of concurrent old-gen collections, mixed collections and CPU usage. We could advise applications to increase G1HeapRegionSize. But some applications still suffer with G1HeapRegionSize=32M. We could also advise applications to refactor code to break down large objects. But it is a high cost effort that may not always be feasible. We'd like to work with the OpenJDK community together to improve G1's handling of humongous objects. Thomas Schatzl mentioned to me a few efforts/ideas on this front in an offline chat: a. Allocation into tail regions of humongous object: JDK-8172713, JDK-8031381 b. Commit additional virtual address space for humongous objects. c. Improve the region selection heuristics (e.g., first-fit, best-fit) for humongous objects. I didn't find open CRs for b. and c. Could someone give pointers? Are there any other ideas/prototypes on this front? -Man From felix.yang at huawei.com Fri Jan 17 08:00:24 2020 From: felix.yang at huawei.com (Yangfei (Felix)) Date: Fri, 17 Jan 2020 08:00:24 +0000 Subject: [RFC] ZGC proposal for aarch64 jdk11u In-Reply-To: References: <38a15dc5-9cee-0f44-13ee-98f185ee72ae@oracle.com> Message-ID: Hi, > ZGC in JDK 11 is fairly stable as it is, so there's no super compelling reason to > spend time and resources on backporting JDK-8233506 at this time. 
However, > backporting only JDK-8224675 would be a mistake, as it would destabilize ZGC > (including the x86 port) so you would basically have to go all the way to > JDK-8230565, or alternatively don't backport > JDK-8224675 and adjust the aarch64 port accordingly. Yes, we see your point here and will look into it. > Whatever path you take here, it would require significant work and testing, > which is why I'd again recommend that you to consider using JDK > 14 (when it's GA) for these workloads. Thanks again for your helpful comments. We will consider it when people are willing to switch to higher jdk versions. Best regards, Felix From shade at redhat.com Fri Jan 17 08:54:23 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 17 Jan 2020 09:54:23 +0100 Subject: Discussion: improve humongous objects handling for G1 In-Reply-To: References: Message-ID: <30800755-ca6b-bae2-98a5-e0be08c67166@redhat.com> On 1/17/20 1:53 AM, Man Cao wrote: > a. Allocation into tail regions of humongous object: JDK-8172713, JDK-8031381 Caveat: allocations near the "grandfather" humongous object would probably enjoy lots of nepotism. > b. Commit additional virtual address space for humongous objects. Caveat: users do like us not going over -Xmx! So this thing is better to be inside the "actual" heap. > c. Improve the region selection heuristics (e.g., first-fit, best-fit) for > humongous objects. That works for solving external fragmentation (splitting the free space with a humongous alloc, for next humongous alloc to not fit), right? Not the internal fragmentation (unused tail in the region). > Are there any other ideas/prototypes on this front? In Shenandoah, we found that compacting humongous regions, at least at Full GC, makes the collector survive heavy external fragmentation, albeit at grand cost. 
G1 has the RFE open here: https://bugs.openjdk.java.net/browse/JDK-8191565 I remember J9 people telling me their GCs have "arraylets" that they spread across the regions, and that works well right up to the point you need to do a JNI GetCritical on it. For quite some time, I speculated that carving out the adjustable subset of regions for humongous allocs and doing power-of-two buddy-system allocation there would be a thing to try. But, I have not researched this thing very deeply. -- Thanks, -Aleksey From thomas.schatzl at oracle.com Fri Jan 17 08:55:52 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 17 Jan 2020 09:55:52 +0100 Subject: [14] RFR (XS): 8235305: Corrupted oops embedded in nmethods due to parallel modification during optional evacuation In-Reply-To: <15678a97-f219-e0f4-c0b6-a4a2a06e6768@oracle.com> References: <15678a97-f219-e0f4-c0b6-a4a2a06e6768@oracle.com> Message-ID: <6cb6a34c-cb6b-f904-a0af-6e1b160073e1@oracle.com> Hi, On 16.01.20 14:20, Thomas Schatzl wrote: > Hi all, > > ? can I get reviews for this change that fixes a bug in the abortable > mixed gc algorithm where G1 might corrupt oops embedded in nmethods due > to parallel modification during an optional evacuation phase? > > G1 currently collects embedded oops in nmethods twice: once in the > optional roots list, and once as nmethods in the strong code roots list > for a particular region. > > Now it can happen that this oop embedded in in the code stream is > unaligned, so if that oop is modified during relocation word tearing may > occur, causing follow-up crashes. > > The fix is to not collect oops from nmethods in the optional code root > list as the strong code root list for a particular region already always > contains it anyway. > > Thanks go to stefank, eriko and sjohanss for helping with analyzing, > testing and the discussion around it. 
> > CR: > https://bugs.openjdk.java.net/browse/JDK-8235305 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8235305/webrev/ > Testing: > multiple runs of hs-tier1-5, multiple runs of the crashing application > (24h kitchensink) with and without a VM modification and also with some > G1 settings that caused crashes within 1-2 hours that reproduced the > issue within 5 minutes. > Currently starting perf test runs with and without this change: however > since this change strictly reduces the work done at all times I am not > expecting any regressions (and hence I am asking for review in advance). > no perf differences as expected. Another hs-tier1-5 completed, and hs-tier6-8 almost done without new issues. Thanks, Thomas From stefan.johansson at oracle.com Fri Jan 17 09:06:50 2020 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 17 Jan 2020 10:06:50 +0100 Subject: [14] RFR (XS): 8235305: Corrupted oops embedded in nmethods due to parallel modification during optional evacuation In-Reply-To: <15678a97-f219-e0f4-c0b6-a4a2a06e6768@oracle.com> References: <15678a97-f219-e0f4-c0b6-a4a2a06e6768@oracle.com> Message-ID: <1ecf88a0-68dd-276a-0a7a-f068c587168d@oracle.com> Hi Thomas, On 2020-01-16 14:20, Thomas Schatzl wrote: > Hi all, > > ? can I get reviews for this change that fixes a bug in the abortable > mixed gc algorithm where G1 might corrupt oops embedded in nmethods due > to parallel modification during an optional evacuation phase? > > G1 currently collects embedded oops in nmethods twice: once in the > optional roots list, and once as nmethods in the strong code roots list > for a particular region. > > Now it can happen that this oop embedded in in the code stream is > unaligned, so if that oop is modified during relocation word tearing may > occur, causing follow-up crashes. > > The fix is to not collect oops from nmethods in the optional code root > list as the strong code root list for a particular region already always > contains it anyway. 
> > Thanks go to stefank, eriko and sjohanss for helping with analyzing, > testing and the discussion around it. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8235305 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8235305/webrev/ Fix looks good. Just some things around the naming of the template parameter and enum after adding this. I don't have a much better idea but I don't think "barrier" is exactly what this is. I do think it would make sense to call the new value G1BarrierNMethod to be more inline with the other names. I also think it would make sense to move the comment about why this is needed to where we use it in g1OopClosures.inline.hpp. Me and StefanK talked a bit about this and if we move the comment and do the check for the barrier as a separate if-statement, it should be more obvious when this is needed. Thanks, Stefan > Testing: > multiple runs of hs-tier1-5, multiple runs of the crashing application > (24h kitchensink) with and without a VM modification and also with some > G1 settings that caused crashes within 1-2 hours that reproduced the > issue within 5 minutes. > Currently starting perf test runs with and without this change: however > since this change strictly reduces the work done at all times I am not > expecting any regressions (and hence I am asking for review in advance). > > Thanks, > ? Thomas From thomas.schatzl at oracle.com Fri Jan 17 10:00:23 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 17 Jan 2020 11:00:23 +0100 Subject: Discussion: improve humongous objects handling for G1 In-Reply-To: References: Message-ID: <4623ce42-7b6c-8b46-5915-8ff708b82f5c@oracle.com> Hi, On 17.01.20 01:53, Man Cao wrote: > Hi all, > > While migrating our workload from CMS to G1, we found many production > applications suffer from humongous allocations. > The default threshold for humongous objects is often too small for our > applications with heap sizes between 2GB-15GB. 
> Humongous allocations caused noticeable increase in the frequency of > concurrent old-gen collections, mixed collections and CPU usage. > We could advise applications to increase G1HeapRegionSize. But some > applications still suffer with G1HeapRegionSize=32M. > We could also advise applications to refactor code to break down large > objects. But it is a high cost effort that may not always be feasible. > > We'd like to work with the OpenJDK community together to improve G1's > handling of humongous objects. > Thomas Schatzl mentioned to me a few efforts/ideas on this front in an > offline chat: > a. Allocation into tail regions of humongous object: JDK-8172713, > JDK-8031381 > b. Commit additional virtual address space for humongous objects. > c. Improve the region selection heuristics (e.g., first-fit, best-fit) for > humongous objects. > > I didn't find open CRs for b. and c. Could someone give pointers? > Are there any other ideas/prototypes on this front? TLDR: we in the Oracle gc team have quite a few ideas that can decrease the issue significantly. We are happy to help with implementation of any of these. We would appreciate a sample application. Long version: The problems with humongous object allocation in G1: - internal fragmentation: the tail end of a humongous object is wasted space. - external fragmentation: sometimes you can't find enough contiguous space for a humongous object. There are quite a few CRs related to this problem in the bug tracker; I just now connected them together using a "g1-humongous" label [0]. Here's a rundown of our ideas, categorized a little (note that these CRs predate significant changes due to how G1 works now, so the ideas may need to be adapted to the current situation): - try to get rid of humongous asap, i.e. improve eager reclaim support by allowing eager reclaim with reference arrays (JDK-8048180) or non-objArrays (JDK-8073288). 
I remember the main problem with that was stale remembered set entries after removal (and SATB marking, but you could just not do eager reclaim during marking). In the applications we had at hand at that time, reference arrays tended to be not eager reclaimable most of the time, and humongous regular objects were rare. So the benefit of looking into this might be small. - allow allocation into the tail end of humongous objects (JDK-8172713); there was once an internal prototype for that, but it was abandoned because of implementation issues (it was a hack that has not been completed to a stable state, mainly because humongous object management had been full of odd quirks wrt region management. This has been fixed since. Also the example application benefitted more from eager reclaim). While the argument from Aleksey about nepotism in the other thread is valid (as far as I understand it), it depends on the implementation. The area at the tail end could be considered as a separate evacuation source, i.e. evacuated independently of the humongous object (and that would actually improve the code to clean out HeapRegion ;)). (This needs more care with single-region humongous objects but does not seem completely problematic; single-region humongous objects may nowadays not be a big issue to just move during GC). - external fragmentation can be approached in many ways: - or just ignored by letting G1 reserve a multiple of MaxHeapSize while only ever committing MaxHeapSize (JDK-8229373). The main drawback here is that it impacts the range of heaps where compressed oops can be used, and 32 bit (particularly Windows) VMs (if you still care, but the feature could be disabled as well). Compressed oops typically improve throughput significantly. Of course, as long as the total size of the reservation is below the threshold, it does not really matter. Fwiw, when using the HeterogeneousHeapRegionManager, this is already attempted (for other reasons).
- improve the region allocator to decrease the problem (JDK-8229373). The way G1 currently allocates regions is a first-fit approach which interferes a bit with destination region selection for old and survivor regions, likely creating more fragmentation than necessary. (Basically: it does not care at all, so go figure ;) ). Also during mixed gc one could explicitly prefer regions to evacuate that break long runs of free regions, weighing those regions higher (evacuating earlier). This needs to be done in conjunction with the remembered set selection at end of marking, before creating them. A long time ago, on a different regional collector, I started looking into this. - actively defragment the heap during GC. This may either be full gc (JDK-8191565) like shenandoah does, or any young gc assuming that G1 first kept remembered sets for potential candidates (JDK-8038487). - never create humongous objects - potentially implement one of the various ideas in the literature to break down large objects into smaller ones, J9's arraylets being one of them. There are other solutions like completely separate allocation of humongous objects like ZGC does, but that typically has the same problem as reserving more space (i.e. compressed oops range, but ZGC does not care at this time). I think it would help potential contributors if there were some application available where the impact of changes could be shown in some way. In the past, whenever there had been someone with that problem, these persons were happy to just increase heap region size - which is great for them, but does not fix the problem :) We would in any case help anyone taking a stab at one of these ideas (or others).
Thanks, Thomas [0] https://bugs.openjdk.java.net/browse/JDK-8237466?jql=labels%20%3D%20g1-humongous From thomas.schatzl at oracle.com Fri Jan 17 10:26:37 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 17 Jan 2020 11:26:37 +0100 Subject: RFR (M): 8235860: Obsolete the UseParallelOldGC option In-Reply-To: <84170868-4b43-c578-f134-bb169c4f2708@oracle.com> References: <292ab94f-f2c8-b373-d5a5-46a45470540e@oracle.com> <2A4B1955-26D5-4544-B476-6E9E5E8009D4@oracle.com> <5e21e50d-a026-98ba-d03d-3f7aa1c31e21@oracle.com> <56A9296B-089E-4A00-9C43-5E3CBDF4A29B@oracle.com> <48af99ac-9e7a-c112-800e-db13e3b3bbcb@oracle.com> <84170868-4b43-c578-f134-bb169c4f2708@oracle.com> Message-ID: <657d7f7d-95e5-e355-917a-2c527bac0436@oracle.com> Hi Leo, On 16.01.20 17:06, Leo Korinth wrote: > Hi! > > I believe _name and old_gen_name() in PSOldGen should be removed and the > virtual name() should return the string literal directly. Change this if > you want. > > With or without using my suggestions, your changes looks good to me. thanks for your review. 
Here are latest changes, fixing the issue: http://cr.openjdk.java.net/~tschatzl/8235860/webrev.1_to_2 (diff) http://cr.openjdk.java.net/~tschatzl/8235860/webrev.2 (full) Thanks, Thomas From leo.korinth at oracle.com Fri Jan 17 12:10:49 2020 From: leo.korinth at oracle.com (Leo Korinth) Date: Fri, 17 Jan 2020 13:10:49 +0100 Subject: RFR (M): 8235860: Obsolete the UseParallelOldGC option In-Reply-To: <657d7f7d-95e5-e355-917a-2c527bac0436@oracle.com> References: <292ab94f-f2c8-b373-d5a5-46a45470540e@oracle.com> <2A4B1955-26D5-4544-B476-6E9E5E8009D4@oracle.com> <5e21e50d-a026-98ba-d03d-3f7aa1c31e21@oracle.com> <56A9296B-089E-4A00-9C43-5E3CBDF4A29B@oracle.com> <48af99ac-9e7a-c112-800e-db13e3b3bbcb@oracle.com> <84170868-4b43-c578-f134-bb169c4f2708@oracle.com> <657d7f7d-95e5-e355-917a-2c527bac0436@oracle.com> Message-ID: <45de9071-a546-4c3b-939f-a7723f5a2cdf@oracle.com> On 17/01/2020 11:26, Thomas Schatzl wrote: > Hi Leo, > > On 16.01.20 17:06, Leo Korinth wrote: >> Hi! >> >> I believe _name and old_gen_name() in PSOldGen should be removed and >> the virtual name() should return the string literal directly. Change >> this if you want. >> >> With or without using my suggestions, your changes looks good to me. > > ? thanks for your review. > > Here are latest changes, fixing the issue: > > http://cr.openjdk.java.net/~tschatzl/8235860/webrev.1_to_2 (diff) > http://cr.openjdk.java.net/~tschatzl/8235860/webrev.2 (full) Looks good! Thanks, Leo > > Thanks, > ? 
Thomas From thomas.schatzl at oracle.com Fri Jan 17 12:59:31 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 17 Jan 2020 13:59:31 +0100 Subject: RFR (M): 8235860: Obsolete the UseParallelOldGC option In-Reply-To: <45de9071-a546-4c3b-939f-a7723f5a2cdf@oracle.com> References: <292ab94f-f2c8-b373-d5a5-46a45470540e@oracle.com> <2A4B1955-26D5-4544-B476-6E9E5E8009D4@oracle.com> <5e21e50d-a026-98ba-d03d-3f7aa1c31e21@oracle.com> <56A9296B-089E-4A00-9C43-5E3CBDF4A29B@oracle.com> <48af99ac-9e7a-c112-800e-db13e3b3bbcb@oracle.com> <84170868-4b43-c578-f134-bb169c4f2708@oracle.com> <657d7f7d-95e5-e355-917a-2c527bac0436@oracle.com> <45de9071-a546-4c3b-939f-a7723f5a2cdf@oracle.com> Message-ID: <156f1607-b336-3089-d264-ff71fd95ef6f@oracle.com> Hi, On 17.01.20 13:10, Leo Korinth wrote: > On 17/01/2020 11:26, Thomas Schatzl wrote: >> Hi Leo, >> >> On 16.01.20 17:06, Leo Korinth wrote: >>> Hi! >>> >>> I believe _name and old_gen_name() in PSOldGen should be removed and >>> the virtual name() should return the string literal directly. Change >>> this if you want. >>> >>> With or without using my suggestions, your changes looks good to me. >> >> ?? thanks for your review. >> >> Here are latest changes, fixing the issue: >> >> http://cr.openjdk.java.net/~tschatzl/8235860/webrev.1_to_2 (diff) >> http://cr.openjdk.java.net/~tschatzl/8235860/webrev.2 (full) > > Looks good! > > Thanks, > Leo thanks for your review. 
Thomas From thomas.schatzl at oracle.com Fri Jan 17 13:27:59 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 17 Jan 2020 14:27:59 +0100 Subject: [14] RFR (XS): 8235305: Corrupted oops embedded in nmethods due to parallel modification during optional evacuation In-Reply-To: <1ecf88a0-68dd-276a-0a7a-f068c587168d@oracle.com> References: <15678a97-f219-e0f4-c0b6-a4a2a06e6768@oracle.com> <1ecf88a0-68dd-276a-0a7a-f068c587168d@oracle.com> Message-ID: <78970e90-7b4c-1618-af6f-0b8e37af47f3@oracle.com> Hi Stefan, On 17.01.20 10:06, Stefan Johansson wrote: > Hi Thomas, > > On 2020-01-16 14:20, Thomas Schatzl wrote: >> Hi all, >> >> ?? can I get reviews for this change that fixes a bug in the abortable >> mixed gc algorithm where G1 might corrupt oops embedded in nmethods >> due to parallel modification during an optional evacuation phase? >> >> G1 currently collects embedded oops in nmethods twice: once in the >> optional roots list, and once as nmethods in the strong code roots >> list for a particular region. >> >> Now it can happen that this oop embedded in in the code stream is >> unaligned, so if that oop is modified during relocation word tearing >> may occur, causing follow-up crashes. >> >> The fix is to not collect oops from nmethods in the optional code root >> list as the strong code root list for a particular region already >> always contains it anyway. >> >> Thanks go to stefank, eriko and sjohanss for helping with analyzing, >> testing and the discussion around it. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8235305 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8235305/webrev/ > > Fix looks good. Thanks for your review. > Just some things around the naming of the template > parameter and enum after adding this. I don't have a much better idea > [...] 
Talked to them about this and I'm good with their suggestion: http://cr.openjdk.java.net/~tschatzl/8235305/webrev.1 (full) http://cr.openjdk.java.net/~tschatzl/8235305/webrev.0_to_1 (diff) Thanks, Thomas From stefan.karlsson at oracle.com Fri Jan 17 13:31:05 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 17 Jan 2020 14:31:05 +0100 Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure Message-ID: <1cb1e7ea-45dd-6a36-1731-94fe1fe25244@oracle.com> Hi all, Please review this patch to remove the automatic "is in heap" verification from OopIterateClosure. https://cr.openjdk.java.net/~stefank/8237363/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8237363 OopIterateClosure provides some automatic verification that loaded objects are inside the heap. Closures can opt out from this by overriding should_verify_oops(). I propose that we move this verification, and the way to turn it off, and instead let the implementations of the closures decide the kind of verification that is appropriate. I want to do this to de-clutter the closure APIs a bit. I've gone through all OopIterateClosures that don't override should_verify_oops() and added calls to assert_oop_field_points_to_object_in_heap[_or_null] where the closures didn't have equivalent checks. A lot of the places didn't explicitly check that the object is within the heap but they would check for other things like: - Is the corresponding bit index within the range - Is the heap region index within range - Is the object in the reserved heap range (weaker than is_in) I've added asserts to those places. If you think I should remove some of them, please let me know. 
Tested with tier1-3 Thanks, StefanK From thomas.schatzl at oracle.com Fri Jan 17 14:11:29 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 17 Jan 2020 15:11:29 +0100 Subject: [14] RFR (S): 8237079: gc/g1/mixedgc/TestLogging.java fails with "Pause Young (Mixed) (G1 Evacuation Pause) not found" Message-ID: Hi all, can I have reviews for this small test fix to unclutter CI with unnecessary failures? So this test checks the GC cycle in the logs, and this fails because for some unknown reason (timing?) we get to-space exhaustion and ultimately a full gc which prevents the expected mixed gc. The problem (demonstrated with an even more heap-reduced test) is that with a 10m heap, 2 regions are already taken by archive regions, leaving 8 regions for allocation. Default policy allows g1 to use 4 regions of eden straight away, meaning that if the right amount of fragmentation occurs, we could end up evacuating these 4 eden regions into just a bit more than 4 destination regions, causing the evacuation failure. The fix is to limit young gen size so that this situation can not occur (verified visually that the max number of regions used is significantly smaller than before); I added another small fix to not rely on an OOME exception to trigger the mixed gcs we want to check for. As I could never locally reproduce the issue with the original VM settings, I also added a bit more logging to the runs. I would like to push this into 14 to avoid noise there too as it also occurs there. 
CR: https://bugs.openjdk.java.net/browse/JDK-8237079 Webrev: http://cr.openjdk.java.net/~tschatzl/8237079/webrev Testing: 4k passed runs with the new test, local testing Thanks, Thomas From stefan.johansson at oracle.com Fri Jan 17 14:41:13 2020 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Fri, 17 Jan 2020 15:41:13 +0100 Subject: [14] RFR (XS): 8235305: Corrupted oops embedded in nmethods due to parallel modification during optional evacuation In-Reply-To: <78970e90-7b4c-1618-af6f-0b8e37af47f3@oracle.com> References: <15678a97-f219-e0f4-c0b6-a4a2a06e6768@oracle.com> <1ecf88a0-68dd-276a-0a7a-f068c587168d@oracle.com> <78970e90-7b4c-1618-af6f-0b8e37af47f3@oracle.com> Message-ID: <54852a22-369a-f6ae-7f4d-bffa1dc89aee@oracle.com> Hi Thomas, On 2020-01-17 14:27, Thomas Schatzl wrote: > Hi Stefan, > > On 17.01.20 10:06, Stefan Johansson wrote: >> Hi Thomas, >> >> On 2020-01-16 14:20, Thomas Schatzl wrote: >>> Hi all, >>> >>> ?? can I get reviews for this change that fixes a bug in the >>> abortable mixed gc algorithm where G1 might corrupt oops embedded in >>> nmethods due to parallel modification during an optional evacuation >>> phase? >>> >>> G1 currently collects embedded oops in nmethods twice: once in the >>> optional roots list, and once as nmethods in the strong code roots >>> list for a particular region. >>> >>> Now it can happen that this oop embedded in in the code stream is >>> unaligned, so if that oop is modified during relocation word tearing >>> may occur, causing follow-up crashes. >>> >>> The fix is to not collect oops from nmethods in the optional code >>> root list as the strong code root list for a particular region >>> already always contains it anyway. >>> >>> Thanks go to stefank, eriko and sjohanss for helping with analyzing, >>> testing and the discussion around it. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8235305 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8235305/webrev/ >> >> Fix looks good. 
> > Thanks for your review. > >> Just some things around the naming of the template parameter and enum >> after adding this. I don't have a much better idea > [...] > > Talked to them about this and I'm good with their suggestion: > > http://cr.openjdk.java.net/~tschatzl/8235305/webrev.1 (full) > http://cr.openjdk.java.net/~tschatzl/8235305/webrev.0_to_1 (diff) This looks good! Thanks, Stefan > > Thanks, > ? Thomas > From leo.korinth at oracle.com Fri Jan 17 15:07:19 2020 From: leo.korinth at oracle.com (Leo Korinth) Date: Fri, 17 Jan 2020 16:07:19 +0100 Subject: [14] RFR (S): 8237079: gc/g1/mixedgc/TestLogging.java fails with "Pause Young (Mixed) (G1 Evacuation Pause) not found" In-Reply-To: References: Message-ID: Hi Thomas, This is not a review. This code is basically the same code as is duplicated at least three times in the test code. One of the duplications you can blame me for, *sorry*. I believe it should be moved to a common library method. I also believe the last fix you did in TestG1ParallelPhases.java makes that version look cleaner than what you propose here (it does not need the last allocation loop at all). How about using the TestG1ParallelPhases.java version for all three test cases? If not, do the third version in TestOldGenCollectionUsage really work??? Thanks, Leo On 17/01/2020 15:11, Thomas Schatzl wrote: > Hi all, > > ? can I have reviews for this small test fix to unclutter CI with > unnecessary failures? > > So this attempts test checks the GC cycle, and in the logs this fails > because for some unknown reason (timing?) we get to-space exhaustion and > ultimately a full gc which prevents the expected mixed gc. > > The problem (demonstrated with an even more heap-reduced test) is that > with 10m heap, 2 regions are already taken by archive regions, leaving 8 > regions for allocation. 
Default policy allows g1 to use 4 regions of > eden straight away, meaning that if the right amount of fragmentation > occurs, we could expand these 4 eden regions in just a bit more than 4 > destination regions, causing the evacuation failure. > > The fix is to limit young gen size so that this situation can not occur > (verified that max number of regions used is significantly smaller than > before visually); I added another small fix to not rely on OOME > exception to trigger the mixed gcs we want to check for. > > As I could never locally reproduce the issue with original VM settings, > I also added a bit more logging to the runs. > > I would like to push this into 14 to avoid noise there too as it also > occurs there. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8237079 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8237079/webrev > Testing: > 4k passed runs with the new test, local testing > > Thanks, > ? Thomas From zgu at redhat.com Fri Jan 17 15:28:42 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 17 Jan 2020 10:28:42 -0500 Subject: [14] RFR 8237396: JvmtiTagMap::weak_oops_do() should not trigger barriers Message-ID: <0088d47f-9dc5-5275-7242-47d1b544cc33@redhat.com> Please review this small patch that avoids barriers in JvmtiTagMap::weak_oops_do() method. The method is used by GC and GC expects to see raw oops. Bug: https://bugs.openjdk.java.net/browse/JDK-8237396 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237396/webrev.00/ Test: hotspot_gc vmTestbase_nsk_jvmti (fastdebug and release) on x86_64 Linux Submit test in progress. Thanks, -Zhengyu From zgu at redhat.com Fri Jan 17 16:34:30 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Fri, 17 Jan 2020 11:34:30 -0500 Subject: [15] RFR 8236880: Shenandoah: Move string dedup cleanup into concurrent phase Message-ID: <837e7210-0bd7-e06f-907b-7c5fcc3c3684@redhat.com> Please review this patch that moves string deduplication cleanup task into concurrent phase. 
The cleanup task comprises two subtasks: StringDedupTable and StringDedupQueue cleanup. Concurrent StringDedupTable cleanup is very straightforward. GC takes the StringDedupTable_lock to block out mutators from modifying the table, then performs multi-threaded cleanup, just as it does at a STW pause. Concurrent StringDedupQueue cleanup is more complicated. GC takes the StringDedupQueue_lock, which only blocks queue structure changes, while mutators can still enqueue new string candidates and the dedup thread can still perform deduplication. So there are a couple of synchronizations that need to be established. 1) When a mutator enqueues a candidate, the enqueued oop should be valid before the slot can be made visible to GC threads. 2) When a GC thread updates an oop, it needs to make sure that the dedup thread does not see a partially updated oop. The implementation uses a load_acquire/release_store pair to ensure the above synchronization holds. GC threads may miss some oops just enqueued by mutators. This is not a concern, since the LRB guarantees they are in to-space. Bug: https://bugs.openjdk.java.net/browse/JDK-8236880 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236880/webrev.00/ Test: hotspot_gc_shenandoah with -XX:+UseStringDeduplication (fastdebug and release) on x86_64 and aarch64 Linux Thanks, -Zhengyu From thomas.schatzl at oracle.com Fri Jan 17 16:55:13 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Fri, 17 Jan 2020 17:55:13 +0100 Subject: [14] RFR (S): 8237079: gc/g1/mixedgc/TestLogging.java fails with "Pause Young (Mixed) (G1 Evacuation Pause) not found" In-Reply-To: References: Message-ID: <1ddd23c37080bf0069100116e923cee86a3115b1.camel@oracle.com> Hi, On Fri, 2020-01-17 at 16:07 +0100, Leo Korinth wrote: > Hi Thomas, > > This is not a review. This code is basically the same code as is > duplicated at least three times in the test code. One of the > duplications you can blame me for, *sorry*. I believe it should be > moved > to a common library method. 
I also believe the last fix you did in > TestG1ParallelPhases.java makes that version look cleaner than what > you > propose here (it does not need the last allocation loop at all). I figured that this code looked familiar but did not know where I saw that before. I should have looked through the other tests.... let me look at the other implementations and redo this change. > > How about using the TestG1ParallelPhases.java version for all three > test cases? If not, do the third version in TestOldGenCollectionUsage > really work??? I will check it out. Thomas From manc at google.com Sat Jan 18 04:08:05 2020 From: manc at google.com (Man Cao) Date: Fri, 17 Jan 2020 20:08:05 -0800 Subject: Discussion: improve humongous objects handling for G1 In-Reply-To: <4623ce42-7b6c-8b46-5915-8ff708b82f5c@oracle.com> References: <4623ce42-7b6c-8b46-5915-8ff708b82f5c@oracle.com> Message-ID: Thanks for the in-depth responses! For a sample application, I actually have a modified BigRamTester that allocates humongous objects, and it can demonstrate some of the problems. Would JDK-8204689 be addressed soon? Then we can merge the variants of BigRamTester. A possible concern is that the "humongous BigRamTester" is not representative of the production workload's problem with humongous objects. The humongous objects in production workload are more likely short-lived, whereas they are long-lived in "humongous BigRamTester". Perhaps we can modify it further to make it the humongous objects short-lived. I will keep this topic on my radar and see if I can find more realistic benchmarks. For OOMs due to fragmentation and ideas related to full GC (JDK-8191565, JDK-8038487), I'd like to point out that the near-OOM cases are less of a concern for our production applications. Their heap sizes are sufficiently large in order to keep GC overhead low with CMS in the past. When they move to G1, they almost never trigger full GCs even with a non-trivial number of humongous allocations. 
The problem is the high frequency of concurrent cycles and mixed collections as a result of humongous allocations. Fundamentally it is also due to fragmentation, but only addressing the near-OOM cases would not solve the problem. Doing more active defragmentation could indeed help. It might be better to first fully explore the feasibilities of those crazier ideas. If one of them works, then we don't need to continuously improve G1 here and there. So far there are 3 of them. They all can get rid of humongous regions completely if I understand correctly. a. let G1 reserve a multiple of MaxHeapSize while only ever committing MaxHeapSize (JDK-8229373) I like this approach most, especially since JDK-8211425 is already implemented. I'll further think about the issue with compressed oops. b. break down large objects into smaller ones like J9's arraylets A few questions on this approach: We probably don't need to handle large non-array objects, right? They should be extremely rare. Is this approach compliant with JLS [1] and JVMS [2]? I read about them but couldn't find evidence of noncompliance. Supporting JNI GetCritical does look tricky. Another tricky issue is that we should preserve O(1) complexity for accesses by index. c. carving out the adjustable subset of regions for humongous allocs and doing power-of-two buddy-system allocation I have also thought about a quite similar idea by introducing a dynamic-sized humongous space. It might be better to support multiple dynamic-sized humongous spaces. I admit I probably have not thought this approach as deep as Aleksey has. [1] https://docs.oracle.com/javase/specs/jls/se13/html/jls-10.html [2] https://docs.oracle.com/javase/specs/jvms/se13/html/jvms-6.html#jvms-6.5.newarray -Man On Fri, Jan 17, 2020 at 2:01 AM Thomas Schatzl wrote: > Hi, > > On 17.01.20 01:53, Man Cao wrote: > > Hi all, > > > > While migrating our workload from CMS to G1, we found many production > > applications suffer from humongous allocations. 
> > The default threshold for humongous objects is often too small for our > > applications with heap sizes between 2GB-15GB. > > Humongous allocations caused noticeable increase in the frequency of > > concurrent old-gen collections, mixed collections and CPU usage. > > We could advise applications to increase G1HeapRegionSize. But some > > applications still suffer with G1HeapRegionSize=32M. > > We could also advise applications to refactor code to break down large > > objects. But it is a high cost effort that may not always be feasible. > > > > We'd like to work with the OpenJDK community together to improve G1's > > handling of humongous objects. > > Thomas Schatzl mentioned to me a few efforts/ideas on this front in an > > offline chat: > > a. Allocation into tail regions of humongous object: JDK-8172713, > > JDK-8031381 > > b. Commit additional virtual address space for humongous objects. > > c. Improve the region selection heuristics (e.g., first-fit, best-fit) > for > > humongous objects. > > > > I didn't find open CRs for b. and c. Could someone give pointers? > > Are there any other ideas/prototypes on this front? > > TLDR: we in the Oracle gc team have quite a few ideas that can decrease > the issue significantly. We are happy to help with implementation of any > of these. > We would appreciate a sample application. > > Long version: > > The problems with humongous object allocation in G1: > > - internal fragmentation: the tail end of a humongous object is wasted > space. > > - external fragmentation: sometimes you can't find enough contiguous > space for a humongous object. > > There are quite a few CRs related to this problem in the bug tracker; I > just now connected them together using a "g1-humongous" label [0]. 
> > Here's a rundown of our ideas, categorized a little (note that these CRs > predate significant changes due to how G1 works now, so the ideas may > need to be adapted to the current situation): > > - try to get rid of humongous asap, i.e. improve eager reclaim support > by allowing eager reclaim with reference arrays (JDK-8048180) or > non-objArrays (JDK-8073288). > I remember the main problem with that were stale remembered set entries > after removal (and SATB marking, but you could just not do eager reclaim > during marking). > In the applications we had at hand at that time, reference arrays tended > to be not eager reclaimable most of the time, and humongous regular > objects were rare. > So the benefit to look into this might be small. > > - allow allocation into the tail end of humongous objects (JDK-8172713); > there has once been an internal prototype for that, but it has been > abandoned because of implementation issues (it was a hack that has not > been completed to a stable state, mainly because humongous object > management had been full of odd quirks wrt to region management. This > has been fixed since. Also the example application benefitted more from > eager reclaim). > > While the argument from Aleksey about nepotism in the other thread is > valid (as far as I understand it), it depends on the implementation. The > area at the tail end could be considered as a separate evacuation > source, i.e. evacuated independently of the humongous object (and that > would actually improve the code to clean out HeapRegion ;)). > (This needs more care with single-region humongous objects but does not > seem completely problematic; single-region humongous objects may > nowadays not be a big issue to just move during GC). > > - external fragmentation can be approached in many ways: > > - or just ignored by letting G1 reserve a multiple of MaxHeapSize > while only ever committing MaxHeapSize (JDK-8229373). 
The main drawback > here is that it impacts the range of heaps where compressed oops can be > used, and 32 bit (particularly Windows) VMs (if you still care, but the > feature could be disabled as well). > Compressed oops typically improve throughput significantly. Of course, > as long as the total size of the reservation is below the threshold, it > does not really matter. > > Fwiw, when using the HeterogeneousHeapRegionManager, this is already > attempted (for other reasons). > > - improve the region allocator to decrease the problem (JDK-8229373). > The way G1 currently allocates regions is a first-fit approach which > interferes a bit with destination region selection for old and survivor > regions, likely creating more fragmentation than necessary. (Basically: > it does not care at all, so go figure ;) ). > Also during mixed gc one could explicitly prefer regions to evacuate > that break long runs of free regions, weighing those regions higher > (evacuating earlier). This needs to be done in conjunction with the > remembered set selection at end of marking, before creating them. > > Long time ago, on a different regional collector, I started looking into > this. > > - actively defragment the heap during GC. This may either be full gc > (JDK-8191565) like shenandoah does, or any young gc assuming that G1 > first kept remembered sets for potential candidates (JDK-8038487). > > - never create humongous objects > > - potentially implement one of the various ideas in the literature to > break down large objects into smaller ones, J9's arraylets being one of > them. > > There are other solutions like completely separate allocation of > humongous objects like ZGC does, but that typically has the same problem > as reserving more space (i.e. compressed oops range, but ZGC does not > care at this time). > > I think it would help potential contributors if there were some > application available where the impact of changes could be shown on in > some way. 
In the past, whenever there had been someone with that > problem, these persons were happy to just increase heap region size - > which is great for them, but does not fix the problem :) > > We would in any case help anyone taking a stab at one of these ideas (or > others). > > Thanks, > Thomas > > [0] > > https://bugs.openjdk.java.net/browse/JDK-8237466?jql=labels%20%3D%20g1-humongous > From thomas.schatzl at oracle.com Sat Jan 18 10:13:31 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Sat, 18 Jan 2020 11:13:31 +0100 Subject: Discussion: improve humongous objects handling for G1 In-Reply-To: References: <4623ce42-7b6c-8b46-5915-8ff708b82f5c@oracle.com> Message-ID: Hi, On Fri, 2020-01-17 at 20:08 -0800, Man Cao wrote: > Thanks for the in-depth responses! > > For a sample application, I actually have a modified BigRamTester > that allocates humongous objects, and it can demonstrate some of the > problems. > Would JDK-8204689 be addressed soon? Then we can merge the variants Given previous track record on that, unfortunately not. > of BigRamTester. A possible concern is that the "humongous > BigRamTester" is not representative of the production workload's > problem with humongous objects. > The humongous objects in production workload are more likely short- > lived, whereas they are long-lived in "humongous BigRamTester". For short-lived humongous objects eager reclaim can do miracles. If your objects are non-objArrays, you could check for the reason why they are not eagerly reclaimed - maybe the threshold for the amount of remembered set entries to keep these humongous objects as eligible for eager reclaim is too low, and increasing that one would just make it work. Enabling gc+humongous=debug can give more information. Note that in JDK13 we (implicitly) increased this threshold, and in JDK14 we removed the main reason why the threshold is as low as it is (calculating the number of remembered set entries). 
It is likely possible to increase this threshold by one or even two orders of magnitude now, potentially increasing its effectiveness significantly with a one-liner change. I will file a CR for that, thought of it but forgot when doing the jdk14 modification. > Perhaps we can modify it further to make it the humongous objects > short-lived. I will keep this topic on my radar and see if I can find > more realistic benchmarks. > > For OOMs due to fragmentation and ideas related to full GC (JDK- > 8191565, JDK-8038487), I'd like to point out that the near-OOM cases > are less of a concern for our production applications. Their heap > sizes are sufficiently large in order to keep GC overhead low with > CMS in the past. When they move to G1, they almost never trigger full > GCs even with a non-trivial number of humongous allocations. > The problem is the high frequency of concurrent cycles and mixed > collections as a result of humongous allocations. Fundamentally it is Which indicates that eager reclaim does not work in this application for some reason. > also due to fragmentation, but only addressing the near-OOM cases > would not solve the problem. Doing more active defragmentation could > indeed help. To me, spending the effort on combating internal fragmentation (allow allocation in tail ends) and external fragmentation by actively defragmenting seems to be at least worth comparing to other options. It could help with all problems but cases where you allocate a very large number of humongous objects and you can't keep the humongous object tails filled. This option still keeps the invariant that humongous objects need to be allocated at a region boundary. Most of the other ideas you propose below also (seem to) retain this property. > It might be better to first fully explore the feasibilities of those > crazier ideas. If one of them works, then we don't need to > continuously improve G1 here and there. So far there are 3 of them. 
> They all can get rid of humongous regions completely if I understand > correctly. > a. let G1 reserve a multiple of MaxHeapSize while only ever > committing MaxHeapSize (JDK-8229373) > I like this approach most, especially since JDK-8211425 is > already implemented. I'll further think about the issue with > compressed oops. It is simplest, but does not solve the issue with internal fragmentation which is ultimately responsible for concurrent cycle frequency. Maybe it is sufficient as "most" applications only use single or low double-digit GB heaps at the moment where the entire reservation still fits into the 32gb barrier. If the heap is already larger than the compressed oops range, then this solution would certainly be simplest for the external fragmentation issue. If you are already way beyond that barrier, you might just use ZGC though for other reasons too if you are fine with any potential throughput hit. > b. break down large objects into smaller ones like J9's arraylets > A few questions on this approach: > We probably don't need to handle large non-array objects, right? > They should be extremely rare. Arraylets do not solve that problem either. > Is this approach compliant with JLS [1] and JVMS [2]? I read > about them but couldn't find evidence of noncompliance. I do not think there is an issue but I did not specifically read the specs again. Given that J9 is spec compliant afaik when they use arraylets (with the default balanced collector), so would Hotspot. > Supporting JNI GetCritical does look tricky. Another tricky issue You could double-map like https://blog.openj9.org/2019/05/01/double-map-arraylets/ does for native access. Btw the same text also indicates that copying seems like a non-starter anyway, as, quoting from the text "One use case, SPECjbb2015 benchmark is not being able to finish RT curve...". > is that we should preserve O(1) complexity for accesses by index. 
Not sure what prevents arraylets in particular from being O(1); a particular access is slower though due to the additional indirection with the spine. Using the double-mapped array for JITted code may have the same problem with compressed oops as other solutions; particularly if you do not know the size of the processed array in advance, you need to create extra code. Which means that there is significant optimization work needed to make array access "as fast" as before in jitted code. > c. carving out the adjustable subset of regions for humongous allocs > and doing power-of-two buddy-system allocation The buddy system (as I understand it, maybe Aleksey could share more details) still suffers from internal fragmentation, potentially even more than now. > I have also thought about a quite similar idea by introducing a > dynamic-sized humongous space. It might be better to support multiple > dynamic-sized humongous spaces. I admit I probably have not thought > this approach as deep as Aleksey has. This is the approach ZGC takes, which has the associated problems with compressed oops. I do not think we can completely give up the compressed oops use case at least until alternatives are explored. > > [1] https://docs.oracle.com/javase/specs/jls/se13/html/jls-10.html > [2] > https://docs.oracle.com/javase/specs/jvms/se13/html/jvms-6.html#jvms-6.5.newarray > Thanks, Thomas From maoliang.ml at alibaba-inc.com Sun Jan 19 07:08:38 2020 From: maoliang.ml at alibaba-inc.com (Liang Mao) Date: Sun, 19 Jan 2020 15:08:38 +0800 Subject: =?UTF-8?B?RGlzY3Vzc2lvbjogaW1wcm92ZSBodW1vbmdvdXMgb2JqZWN0cyBoYW5kbGluZyBmb3IgRzE=?= Message-ID: <695ee6a6-a182-40a9-bfbf-49214d2fdaaa.maoliang.ml@alibaba-inc.com> Hi Guys, We Alibaba have experienced the same problem as Man introduced. Some applications got frequent concurrent mark cycles and high cpu usage and even some to-space exhausted failures because of large amount of humongous object allocation even with G1HeapRegionSize=32m. 
But those applications worked fine with ParNew/CMS. We are working on some enhancements for better reclamation of humongous objects. Our first intention is to reduce the frequent concurrent cycles and possible to-space exhaustion, so heap utilization and arraylets are not taken into consideration yet. Our solution is more like a ParNew/CMS flow and will treat a humongous object as young or old. 1. Humongous object allocation in the mutator will be counted into the eden size and won't directly trigger a concurrent mark cycle. That will avoid the possible to-space exhaustion while concurrent mark is working and humongous allocations are "eating" the free regions. 2. Enhance the reclamation of short-lived humongous objects by covering object arrays; current eager reclaim only supports primitive types for now. This part looks similar to JDK-8048180 and JDK-8073288 that Thomas mentioned. The evacuation flow will iterate the humongous object array as a regular object if the humongous object is "young", which can be distinguished by the "age" field in the markoop. The patch is being tested. We will share it once it proves to work fine with our applications. I don't know if any similar approach has already been tried - any advice? Thanks, Liang From thomas.schatzl at oracle.com Mon Jan 20 10:46:06 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 20 Jan 2020 11:46:06 +0100 Subject: Discussion: improve humongous objects handling for G1 In-Reply-To: References: <4623ce42-7b6c-8b46-5915-8ff708b82f5c@oracle.com> Message-ID: <48d29ef4-800d-9402-3bfb-7dab1c895a54@oracle.com> Hi, On 18.01.20 11:13, Thomas Schatzl wrote: > Hi, > > On Fri, 2020-01-17 at 20:08 -0800, Man Cao wrote: >> Thanks for the in-depth responses! >> [...] > >> of BigRamTester. A possible concern is that the "humongous >> BigRamTester" is not representative of the production workload's >> problem with humongous objects. 
>> The humongous objects in production workload are more likely short- >> lived, whereas they are long-lived in "humongous BigRamTester". > > For short-lived humongous objects eager reclaim can do miracles. If > your objects are non-objArrays, you could check for the reason why they > are not eagerly reclaimed - maybe the threshold for the amount of > remembered set entries to keep these humongous objects as eligible for > eager reclaim is too low, and increasing that one would just make it > work. Enabling gc+humongous=debug can give more information. > > Note that in JDK13 we (implicitly) increased this threshold, and in > JDK14 we removed the main reason why the threshold is as low as it is > (calculating the number of rememebered set entries). > > It is likely possible to increase this threshold by one or even two > magnitudes now, potentially increasing its effectiveness significantly > with a one-liner change. I will file a CR for that, thought of it but > forgot when doing the jdk14 modification. JDK-8237500. >> >> For OOMs due to fragmentation and ideas related to full GC (JDK- >> 8191565, JDK-8038487), I'd like to point out that the near-OOM cases >> are less of a concern for our production applications. Their heap >> sizes are sufficiently large in order to keep GC overhead low with >> CMS in the past. When they move to G1, they almost never trigger full >> GCs even with a non-trivial number of humongous allocations. >> The problem is the high frequency of concurrent cycles and mixed >> collections as a result of humongous allocations. Fundamentally it is > > Which indicates that eager reclaim does not work in this application > for some reason. Note that it would be appreciated if we all were able to discuss issues on an actual log (gc+heap=debug,gc+humongous=debug; some rough comparison of gc's performed with g1 and CMS, with some distribution of g1 gc pauses) than trying to guess what each others actual problems are. 
>> also due to fragmentation, but only addressing the near-OOM cases >> would not solve the problem. Doing more active defragmentation could >> indeed help. > > To me, spending the effort on combating internal fragmentation (allow > allocation in tail ends) and external fragmentation by actively > defragmenting seems to be at least worth comparing to other options. > > It could help with all problems but cases where you allocate a very > large number of humongous objects and you can't keep the humongous object > tails filled. This option still keeps the invariant that humongous > objects need to be allocated at a region boundary. > > Most of the other ideas you propose below also (seem to) retain this > property. After some more thought, all these solutions actually seem to do so. Even the arraylets would suffer from the same internal fragmentation for the last arrayoid as they do now since they seem to stay humongous to avoid constant copying and remapping. There is a remark in a tech paper about arraylets (https://www.ibm.com/developerworks/websphere/techjournal/1108_sciampacone/1108_sciampacone.html) that indicates that the balanced collector does not seem to move the arrayoids either. ([...] Additionally, the balanced collector never needs to move an arraylet leaf once it has been allocated. The cost of relocating an array is limited to the cost of relocating the spine, so large arrays do not contribute to higher defragmentation times. [...]). Thanks, Thomas From roy.sunny.zhang007 at gmail.com Mon Jan 20 10:54:37 2020 From: roy.sunny.zhang007 at gmail.com (Roy Zhang) Date: Mon, 20 Jan 2020 18:54:37 +0800 Subject: Abnormal high sys time in G1 GC In-Reply-To: References: Message-ID: Sent to hotspot-gc-dev mail list as well :) Thank you for your help in advance!!! Thanks, Roy On Mon, Jan 20, 2020 at 6:22 PM Roy Zhang wrote: > Dear JVM experts, > > Recently we found a GC spike (long STW minor GC), and sys time is high when > GC time is high.
Normally sys time is near 0 seconds and minor GC is > less than 500ms. > > From > http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2017-October/020630.html > and https://blog.gceasy.io/2016/12/11/sys-time-greater-than-user-time/, > high sys time could be caused by an operating system problem/VM related > problem/memory constraint/disk IO pressure/Transparent Huge Pages. > > I checked them one by one and didn't find any clue; could you please > provide suggestions? Thanks in advance! > > 1. operating system problem > -- We have enough CPU/memory/disk (48 cpu cores + 373G RAM with 160G heap, > disk is enough), and there is no error in /var/log/dmesg > 2. memory constraint > -- We have enough available memory. available memory (free -m) is 263G > 3. disk IO pressure > -- No issue found in the disk info from the prometheus node exporter. > Granularity is 15s, and I can't find counterparts of the avgqu-sz & util metrics > (disk IO util and saturation metrics) which are part of iostat. Could it be > caused by the big granularity? > 4. VM related problem > -- We are using a physical machine > 5. Transparent Huge Pages. > It is madvise. It could be a problem, but we didn't have this issue > previously. It has been running for nearly 20 weeks.
> > *cat /sys/kernel/mm/transparent_hugepage/enabled* > *always [madvise] never* > > *JDK version:* > OpenJDK Runtime Environment, 1.8.0_222-b10 > > *Java Opts:* > -javaagent:/server/jmx_prometheus_javaagent-0.12.0.jar=xxxx:/server/config.yaml > > -server > -Dcom.sun.management.jmxremote > -Dcom.sun.management.jmxremote.port=xxxx > -Dcom.sun.management.jmxremote.rmi.port=xxxx > -Dcom.sun.management.jmxremote.local.only=false > -Dcom.sun.management.jmxremote.authenticate=false > -Dcom.sun.management.jmxremote.ssl=false > -Xloggc:/server/xxxx.log > -XX:+PrintGCDateStamps > -XX:AutoBoxCacheMax=1000000 > -XX:+UseG1GC > -XX:MaxGCPauseMillis=500 > -XX:+UnlockExperimentalVMOptions > -XX:G1NewSizePercent=50 > -XX:InitiatingHeapOccupancyPercent=70 > -XX:+ParallelRefProcEnabled > -XX:+ExplicitGCInvokesConcurrent > -XX:+UseStringDeduplication > -XX:+PrintGCDetails > -XX:+PrintGCTimeStamps > -Xms160g > -Xmx160g > -XX:+HeapDumpOnOutOfMemoryError > > *Snippet of GC log:* > > 2020-01-20T07:27:03.166+0000: 2756.665: [GC pause (G1 Evacuation Pause) > (young), *6.2899024 secs*] > [Parallel Time: 6255.0 ms, GC Workers: 33] > [GC Worker Start (ms): Min: 2756664.9, Avg: 2756665.5, Max: > 2756666.1, Diff: 1.2] > [Ext Root Scanning (ms): Min: 0.0, Avg: 0.5, Max: 5.3, Diff: 5.3, > Sum: 16.8] > [Update RS (ms): Min: 0.0, Avg: 0.8, Max: 1.1, Diff: 1.1, Sum: 25.6] > [Processed Buffers: Min: 0, Avg: 1.6, Max: 4, Diff: 4, Sum: 53] > [Scan RS (ms): Min: 142.0, Avg: 145.3, Max: 146.4, Diff: 4.4, Sum: > 4794.1] > [Code Root Scanning (ms): Min: 0.0, Avg: 0.3, Max: 3.5, Diff: 3.5, > Sum: 8.8] > * [Object Copy (ms): Min: 6100.1, Avg: 6101.8, Max: 6106.5, Diff: > 6.4, Sum: 201358.4]* > [Termination (ms): Min: 0.1, Avg: 5.2, Max: 6.7, Diff: 6.6, Sum: > 172.9] > [Termination Attempts: Min: 1, Avg: 1353.0, Max: 1476, Diff: > 1475, Sum: 44650] > [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.4, Diff: 0.4, Sum: > 7.0] > [GC Worker Total (ms): Min: 6253.4, Avg: 6254.1, Max: 6254.7, Diff: > 1.2, Sum:
206383.7] > [GC Worker End (ms): Min: 2762919.4, Avg: 2762919.6, Max: 2762919.8, > Diff: 0.4] > [Code Root Fixup: 0.6 ms] > [Code Root Purge: 0.0 ms] > [String Dedup Fixup: 0.7 ms, GC Workers: 33] > [Queue Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.1, Diff: 0.1, Sum: 0.4] > [Table Fixup (ms): Min: 0.0, Avg: 0.1, Max: 0.6, Diff: 0.6, Sum: 2.0] > [Clear CT: 4.0 ms] > [Other: 29.6 ms] > [Choose CSet: 0.1 ms] > [Ref Proc: 10.3 ms] > [Ref Enq: 0.6 ms] > [Redirty Cards: 11.3 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.0 ms] > [Free CSet: 6.5 ms] > [Eden: 72576.0M(72576.0M)->0.0B(80896.0M) Survivors: 9344.0M->1024.0M > Heap: 83520.0M(160.0G)->11046.9M(160.0G)] > * [Times: user=27.19 sys=162.28, real=6.30 secs] * > > 2020-01-20T06:59:23.382+0000: 1096.881: [GC pause (G1 Evacuation Pause) > (young) (initial-mark), *4.1248088 secs*] > [Parallel Time: 4098.0 ms, GC Workers: 33] > [GC Worker Start (ms): Min: 1096882.1, Avg: 1096882.8, Max: > 1096883.2, Diff: 1.2] > [Ext Root Scanning (ms): Min: 4.0, Avg: 4.8, Max: 6.1, Diff: 2.0, > Sum: 159.7] > [Update RS (ms): Min: 0.0, Avg: 0.3, Max: 1.1, Diff: 1.1, Sum: 9.5] > [Processed Buffers: Min: 0, Avg: 1.3, Max: 6, Diff: 6, Sum: 43] > * [Scan RS (ms): Min: 2001.2, Avg: 2012.2, Max: 2013.4, Diff: 12.2, > Sum: 66401.0]* > [Code Root Scanning (ms): Min: 0.0, Avg: 0.6, Max: 10.7, Diff: 10.7, > Sum: 18.5] > * [Object Copy (ms): Min: 2039.3, Avg: 2049.2, Max: 2079.5, Diff: > 40.2, Sum: 67623.1]* > [Termination (ms): Min: 0.0, Avg: 29.6, Max: 39.7, Diff: 39.7, Sum: > 978.0] > [Termination Attempts: Min: 1, Avg: 6587.0, Max: 8068, Diff: > 8067, Sum: 217372] > [GC Worker Other (ms): Min: 0.0, Avg: 0.2, Max: 0.5, Diff: 0.4, Sum: > 7.9] > [GC Worker Total (ms): Min: 4096.3, Avg: 4096.9, Max: 4097.7, Diff: > 1.4, Sum: 135197.8] > [GC Worker End (ms): Min: 1100979.5, Avg: 1100979.7, Max: 1100979.9, > Diff: 0.4] > [Code Root Fixup: 0.6 ms] > [Code Root Purge: 0.2 ms] > [String Dedup Fixup: 1.0 ms, GC Workers: 33] > [Queue Fixup (ms): 
Min: 0.0, Avg: 0.0, Max: 0.0, Diff: 0.0, Sum: 0.0] > [Table Fixup (ms): Min: 0.0, Avg: 0.0, Max: 0.7, Diff: 0.7, Sum: 1.4] > [Clear CT: 3.4 ms] > [Other: 21.7 ms] > [Choose CSet: 0.0 ms] > [Ref Proc: 9.1 ms] > [Ref Enq: 0.9 ms] > [Redirty Cards: 4.3 ms] > [Humongous Register: 0.2 ms] > [Humongous Reclaim: 0.0 ms] > [Free CSet: 5.3 ms] > [Eden: 81184.0M(81184.0M)->0.0B(72576.0M) Survivors: 736.0M->9344.0M > Heap: 83508.0M(160.0G)->10944.0M(160.0G)] > > * [Times: user=68.40 sys=9.11, real=4.13 secs] * > > Thanks, > Roy > From thomas.schatzl at oracle.com Mon Jan 20 11:11:18 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 20 Jan 2020 12:11:18 +0100 Subject: Discussion: improve humongous objects handling for G1 In-Reply-To: <695ee6a6-a182-40a9-bfbf-49214d2fdaaa.maoliang.ml@alibaba-inc.com> References: <695ee6a6-a182-40a9-bfbf-49214d2fdaaa.maoliang.ml@alibaba-inc.com> Message-ID: <50da5a35-3a7c-566e-67d1-0659f1e068c2@oracle.com> Hi Liang, On 19.01.20 08:08, Liang Mao wrote: > Hi Guys, > > We Alibaba have experienced the same problem as Man introduced. > Some applications got frequent concurrent mark cycles and high > cpu usage and even some to-space exhausted failures because of > large amount of humongous object allocation even with > G1HeapRegionSize=32m. But those applications worked fine > with ParNew/CMS. We are working on some enhancements for better Can you provide logs? (with gc+heap=debug,gc+humongous=debug) > reclamation of humongous objects. Our first intention is to reduce > the frequent concurrent cycles and possible to-space exhausted so > the heap utility or arraylets are not taken into consideration yet. > > Our solution is more like a ParNew/CMS flow and will treat a > humongous object as young or old. > 1. Humongous object allocation in mutator will be considered into > eden size and won't directly trigger concurrent mark cycle. 
That > will avoid the possible to-space exhausted while concurrent mark > is working and humongous allocations are "eating" the free regions. (I am trying to imagine situations here where this would be a problem since I do not have a log) That helps if G1 is already trying to do a marking cycle if the space is tight and already eating into the reserve that has explicitly been set aside for this case (G1ReservePercent - did you try increasing that for a workaround?). It does make young collections much more frequent than necessary otherwise. Particularly if these humongous regions are eager-reclaimable. In these cases the humongous allocations would be "free", while with that policy they would cause a young gc. The other issue, if these humongous allocations cause too many concurrent cycles could be managed by looking into canceling the concurrent marking if that concurrent start gc freed lots and lots of humongous objects, e.g. getting way below the mark threshold again. I did not think this through though, of course at some point you do need to start the concurrent mark. Some (or most) of that heap pressure might have been caused by the internal fragmentation, so allowing allocation into the tail ends would very likely decrease that pressure too. This would likely be the first thing I would be looking into if the logs indicate that. > 2. Enhance the reclamation of short-live humongous object by > covering object array that current eager reclaim only supports > primitive type for now. This part looks same to JDK-8048180 and > JDK-8073288 Thomas mentioned. The evacuation flow will iterate > the humongous object array as a regular object if the humongous > object is "young" which can be distinguished by the "age" field > in markoop. > > The patch is being tested. We will share it once it proves to > work fine with our applications. I don't know if any similar > approach has been already tried and any advices? 
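The two points of the quoted proposal can be modelled in a few lines. Everything here (class name, method names, the age cap, the sizes) is a hypothetical sketch of the described policy, not the actual patch:

```java
// Point 1: humongous allocations are debited against the eden allowance, so a
// burst of humongous allocation triggers an ordinary young GC rather than a
// concurrent mark cycle. Point 2: a humongous object array is scanned like a
// young object only while its age (taken from the mark word) is below a small
// cap. All names and values are illustrative assumptions.
public class HumongousPolicySketch {
    static boolean shouldTriggerYoungGC(long edenUsed, long humongousAllocated, long edenCapacity) {
        // Count humongous bytes as if they were eden allocations.
        return edenUsed + humongousAllocated >= edenCapacity;
    }

    static boolean treatAsYoung(int age, int maxHumongousAge) {
        return age < maxHumongousAge;
    }

    public static void main(String[] args) {
        long mb = 1024 * 1024;
        System.out.println(shouldTriggerYoungGC(500 * mb, 100 * mb, 1024 * mb)); // false
        System.out.println(shouldTriggerYoungGC(900 * mb, 200 * mb, 1024 * mb)); // true
        System.out.println(treatAsYoung(0, 5)); // freshly allocated: iterate directly
        System.out.println(treatAsYoung(7, 5)); // old enough: fall back to remset/eager reclaim
    }
}
```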
The problem with treating humongous reference arrays as young is that this heuristic significantly increases the garbage collection time if that object survives the collection. I.e. the collector needs to iterate over all young objects, and while you do save the time to copy the object by in-place aging, scanning the references tends to take more time than copying. In that "different regional collector" I referenced in the other email exactly this had been implemented with the above issues. That collector also had configurable regions down to 64k (well, basically even less, but anything below that was just for experimentation, and 64k had been very debatable too), so the humongous object problem had been a lot larger. It might not be the case with G1's "giant" humongous objects. Treating them as old like they are now within G1 allows you to be a lot more selective about what you take in for garbage collection. Now the policy isn't particularly smart (just take humongous objects of a particular type with less than a low, fixed threshold of remembered set entries), but that could be improved. I.e. G1 has a measure of how long scanning a remembered set entry approximately takes, so that could be made dependent on available time. Thanks, Thomas From stefan.karlsson at oracle.com Mon Jan 20 11:28:03 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 20 Jan 2020 12:28:03 +0100 Subject: [14] RFR (XS): 8235305: Corrupted oops embedded in nmethods due to parallel modification during optional evacuation In-Reply-To: <78970e90-7b4c-1618-af6f-0b8e37af47f3@oracle.com> References: <15678a97-f219-e0f4-c0b6-a4a2a06e6768@oracle.com> <1ecf88a0-68dd-276a-0a7a-f068c587168d@oracle.com> <78970e90-7b4c-1618-af6f-0b8e37af47f3@oracle.com> Message-ID: <10988744-5a87-5c70-eaae-cbad447dc2b3@oracle.com> Looks good. 
StefanK On 2020-01-17 14:27, Thomas Schatzl wrote: > Hi Stefan, > > On 17.01.20 10:06, Stefan Johansson wrote: >> Hi Thomas, >> >> On 2020-01-16 14:20, Thomas Schatzl wrote: >>> Hi all, >>> >>> ?? can I get reviews for this change that fixes a bug in the >>> abortable mixed gc algorithm where G1 might corrupt oops embedded in >>> nmethods due to parallel modification during an optional evacuation >>> phase? >>> >>> G1 currently collects embedded oops in nmethods twice: once in the >>> optional roots list, and once as nmethods in the strong code roots >>> list for a particular region. >>> >>> Now it can happen that this oop embedded in in the code stream is >>> unaligned, so if that oop is modified during relocation word tearing >>> may occur, causing follow-up crashes. >>> >>> The fix is to not collect oops from nmethods in the optional code >>> root list as the strong code root list for a particular region >>> already always contains it anyway. >>> >>> Thanks go to stefank, eriko and sjohanss for helping with analyzing, >>> testing and the discussion around it. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8235305 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8235305/webrev/ >> >> Fix looks good. > > Thanks for your review. > >> Just some things around the naming of the template parameter and enum >> after adding this. I don't have a much better idea > [...] > > Talked to them about this and I'm good with their suggestion: > > http://cr.openjdk.java.net/~tschatzl/8235305/webrev.1 (full) > http://cr.openjdk.java.net/~tschatzl/8235305/webrev.0_to_1 (diff) > > Thanks, > ? 
Thomas > From thomas.schatzl at oracle.com Mon Jan 20 11:57:03 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Mon, 20 Jan 2020 12:57:03 +0100 Subject: [14] RFR (XS): 8235305: Corrupted oops embedded in nmethods due to parallel modification during optional evacuation In-Reply-To: <10988744-5a87-5c70-eaae-cbad447dc2b3@oracle.com> References: <15678a97-f219-e0f4-c0b6-a4a2a06e6768@oracle.com> <1ecf88a0-68dd-276a-0a7a-f068c587168d@oracle.com> <78970e90-7b4c-1618-af6f-0b8e37af47f3@oracle.com> <10988744-5a87-5c70-eaae-cbad447dc2b3@oracle.com> Message-ID: <5337c7ba-5872-70c7-1082-abe149a02645@oracle.com> Hi Stefan + Stefan, thanks for your reviews :) Thomas On 20.01.20 12:28, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2020-01-17 14:27, Thomas Schatzl wrote: >> Hi Stefan, >> >> On 17.01.20 10:06, Stefan Johansson wrote: >>> Hi Thomas, >>> >>> On 2020-01-16 14:20, Thomas Schatzl wrote: >>>> Hi all, >>>> >>>> ?? can I get reviews for this change that fixes a bug in the >>>> abortable mixed gc algorithm where G1 might corrupt oops embedded in >>>> nmethods due to parallel modification during an optional evacuation >>>> phase? >>>> >>>> G1 currently collects embedded oops in nmethods twice: once in the >>>> optional roots list, and once as nmethods in the strong code roots >>>> list for a particular region. >>>> >>>> Now it can happen that this oop embedded in in the code stream is >>>> unaligned, so if that oop is modified during relocation word tearing >>>> may occur, causing follow-up crashes. >>>> >>>> The fix is to not collect oops from nmethods in the optional code >>>> root list as the strong code root list for a particular region >>>> already always contains it anyway. >>>> >>>> Thanks go to stefank, eriko and sjohanss for helping with analyzing, >>>> testing and the discussion around it. 
>>>> >>>> CR: >>>> https://bugs.openjdk.java.net/browse/JDK-8235305 >>>> Webrev: >>>> http://cr.openjdk.java.net/~tschatzl/8235305/webrev/ >>> >>> Fix looks good. >> >> Thanks for your review. >> >>> Just some things around the naming of the template parameter and enum >>> after adding this. I don't have a much better idea >> [...] >> >> Talked to them about this and I'm good with their suggestion: >> >> http://cr.openjdk.java.net/~tschatzl/8235305/webrev.1 (full) >> http://cr.openjdk.java.net/~tschatzl/8235305/webrev.0_to_1 (diff) >> >> Thanks, >> ?? Thomas >> From stefan.karlsson at oracle.com Mon Jan 20 13:03:14 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Mon, 20 Jan 2020 14:03:14 +0100 Subject: [14] RFR 8237396: JvmtiTagMap::weak_oops_do() should not trigger barriers In-Reply-To: <0088d47f-9dc5-5275-7242-47d1b544cc33@redhat.com> References: <0088d47f-9dc5-5275-7242-47d1b544cc33@redhat.com> Message-ID: <96981f56-0406-af11-d184-62c819d90cab@oracle.com> Hi Zhengyu, On 2020-01-17 16:28, Zhengyu Gu wrote: > Please review this small patch that avoids barriers in > JvmtiTagMap::weak_oops_do() method. > > The method is used by GC and GC expects to see raw oops. For the record, ZGC doesn't require to see raw oops here. The unnecessary load barriers will simply pre-clean the oops before the closures are applied. > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237396 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237396/webrev.00/ Looks good. I've tested with ZGC as well. Thanks, StefanK > > Test: > ? hotspot_gc > ? vmTestbase_nsk_jvmti > ? (fastdebug and release) on x86_64 Linux > > ? Submit test in progress. 
> > Thanks, > > -Zhengyu > From zgu at redhat.com Mon Jan 20 13:15:55 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 20 Jan 2020 08:15:55 -0500 Subject: [14] RFR 8237396: JvmtiTagMap::weak_oops_do() should not trigger barriers In-Reply-To: <96981f56-0406-af11-d184-62c819d90cab@oracle.com> References: <0088d47f-9dc5-5275-7242-47d1b544cc33@redhat.com> <96981f56-0406-af11-d184-62c819d90cab@oracle.com> Message-ID: <27f41387-5219-23fa-5b52-bd0b1f689438@redhat.com> Thanks, Stefan. -Zhengyu On 1/20/20 8:03 AM, Stefan Karlsson wrote: > Hi Zhengyu, > > On 2020-01-17 16:28, Zhengyu Gu wrote: >> Please review this small patch that avoids barriers in >> JvmtiTagMap::weak_oops_do() method. >> >> The method is used by GC and GC expects to see raw oops. > > For the record, ZGC doesn't require to see raw oops here. The > unnecessary load barriers will simply pre-clean the oops before the > closures are applied. > >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237396 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237396/webrev.00/ > > Looks good. > > I've tested with ZGC as well. > > Thanks, > StefanK > >> >> Test: >> ?? hotspot_gc >> ?? vmTestbase_nsk_jvmti >> ?? (fastdebug and release) on x86_64 Linux >> >> ?? Submit test in progress. >> >> Thanks, >> >> -Zhengyu >> > From rkennke at redhat.com Mon Jan 20 13:43:42 2020 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 20 Jan 2020 14:43:42 +0100 Subject: [14] RFR 8237396: JvmtiTagMap::weak_oops_do() should not trigger barriers In-Reply-To: <0088d47f-9dc5-5275-7242-47d1b544cc33@redhat.com> References: <0088d47f-9dc5-5275-7242-47d1b544cc33@redhat.com> Message-ID: <1a6a6fa8-164a-8bba-29fa-9a271ac0e9b3@redhat.com> Hi Zhengyu, The change looks good to me. I was worried about other GCs, and Stefan has confirmed that it's ok there. Thanks, Roman > Please review this small patch that avoids barriers in > JvmtiTagMap::weak_oops_do() method. > > The method is used by GC and GC expects to see raw oops. 
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237396 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237396/webrev.00/ > > Test: > ? hotspot_gc > ? vmTestbase_nsk_jvmti > ? (fastdebug and release) on x86_64 Linux > > ? Submit test in progress. > > Thanks, > > -Zhengyu > From zgu at redhat.com Mon Jan 20 13:56:10 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 20 Jan 2020 08:56:10 -0500 Subject: [14] RFR 8237396: JvmtiTagMap::weak_oops_do() should not trigger barriers In-Reply-To: <1a6a6fa8-164a-8bba-29fa-9a271ac0e9b3@redhat.com> References: <0088d47f-9dc5-5275-7242-47d1b544cc33@redhat.com> <1a6a6fa8-164a-8bba-29fa-9a271ac0e9b3@redhat.com> Message-ID: Thanks, Roman. -Zhengyu On 1/20/20 8:43 AM, Roman Kennke wrote: > Hi Zhengyu, > > The change looks good to me. I was worried about other GCs, and Stefan > has confirmed that it's ok there. > > Thanks, > Roman > > >> Please review this small patch that avoids barriers in >> JvmtiTagMap::weak_oops_do() method. >> >> The method is used by GC and GC expects to see raw oops. >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237396 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237396/webrev.00/ >> >> Test: >> ? hotspot_gc >> ? vmTestbase_nsk_jvmti >> ? (fastdebug and release) on x86_64 Linux >> >> ? Submit test in progress. >> >> Thanks, >> >> -Zhengyu >> > From rkennke at redhat.com Mon Jan 20 15:20:13 2020 From: rkennke at redhat.com (Roman Kennke) Date: Mon, 20 Jan 2020 16:20:13 +0100 Subject: RFR: 8237543: Shenandoah: More asserts around code roots Message-ID: We are still observing occasional corrupted code roots in Traversal GC. The assert always happens in code roots, and always at init-traversal. There are two ways this seems likely to happen: either when new code is generated, or during the previous GC cycle. We should plant some verifications there to ensure we fail earlier. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8237543 Webrev: http://cr.openjdk.java.net/~rkennke/JDK-8237543/webrev/ Testing: hotspot_gc_shenandoah (fastdebug+release) ok Can I please get a review? Thanks, Roman From zgu at redhat.com Mon Jan 20 15:36:06 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 20 Jan 2020 10:36:06 -0500 Subject: RFR: 8237543: Shenandoah: More asserts around code roots In-Reply-To: References: Message-ID: Okay, looks good to me. -Zhengyu On 1/20/20 10:20 AM, Roman Kennke wrote: > We are still observing occasional corrupted code roots in Traversal GC. > The assert always happens in code roots, and always at init-traversal. > There are two ways this seems likely to happen: either when new code is > generated, or during the previous GC cycle. We should plant some > verifications there to ensure we fail earlier. > > Bug: > https://bugs.openjdk.java.net/browse/JDK-8237543 > Webrev: > http://cr.openjdk.java.net/~rkennke/JDK-8237543/webrev/ > > Testing: hotspot_gc_shenandoah (fastdebug+release) ok > > Can I please get a review? > > Thanks, > Roman > From maoliang.ml at alibaba-inc.com Tue Jan 21 06:25:51 2020 From: maoliang.ml at alibaba-inc.com (Liang Mao) Date: Tue, 21 Jan 2020 14:25:51 +0800 Subject: Discussion: improve humongous objects handling for G1 Message-ID: <3b35ff1d-b717-42c8-bd1f-28a54a5a0ec4.maoliang.ml@alibaba-inc.com> Hi Thomas, In fact we saw this issue with 8u. One issue I forgot to mention is that when CPU usage is quite high, nearly 100%, the concurrent mark will get very slow, so to-space exhaustion happened. BTW, are there any improvements for this point in JDK11 or higher versions? I didn't notice any so far. Increasing the reserve percent could alleviate the problem but does not seem a complete solution. Cancelling the concurrent mark cycle in the initial-mark pause seems a delicate optimization which can cover some issues if a lot of humongous regions have been reclaimed in this pause.
It can avoid the unnecessary cm cycle and also trigger cm earlier if needed. We will take this into consideration. Thanks for the great idea :) If there is a short-lived humongous object array which also references other short-lived objects, the situation could be worse. If we increase G1HeapRegionSize, some humongous objects become normal objects, the behavior is more like CMS, and everything goes fine. I don't think we have to disallow humongous objects behaving as normal ones. A newly allocated humongous object array can probably reference objects in the young generation, and scanning the object array via the remset couldn't be better than directly iterating the array during evacuation because of possible prefetching. We can have an alternative max survivor age for humongous objects, maybe 5 or 8 at most; otherwise let eager reclaim handle it. A tradeoff can be made to balance the pause time and the reclamation possibility of short-lived objects. So the enhanced solution can be: 1. Cancelling concurrent mark if not necessary. 2. Increasing the reclamation possibility of short-lived humongous objects. An important reason for this issue is that Java developers easily ask why CMS can handle the application without a significant CPU usage increase (caused by concurrent mark) but G1 cannot. Personally I believe G1 can do anything at least as well as CMS :) This proposal aims at the throughput gap compared to CMS. If it works together with the barrier optimization proposed by Man and Google, imho the gap could be reduced noticeably. Thanks, Liang ------------------------------------------------------------------ From:Thomas Schatzl Send Time:2020 Jan. 20 (Mon.) 19:11 To:"MAO, Liang" ; Man Cao ; hotspot-gc-dev Subject:Re: Discussion: improve humongous objects handling for G1 Hi Liang, On 19.01.20 08:08, Liang Mao wrote: > Hi Guys, > > We Alibaba have experienced the same problem as Man introduced.
> Some applications got frequent concurrent mark cycles and high > cpu usage and even some to-space exhausted failures because of > large amount of humongous object allocation even with > G1HeapRegionSize=32m. But those applications worked fine > with ParNew/CMS. We are working on some enhancements for better Can you provide logs? (with gc+heap=debug,gc+humongous=debug) > reclamation of humongous objects. Our first intention is to reduce > the frequent concurrent cycles and possible to-space exhausted so > the heap utility or arraylets are not taken into consideration yet. > > Our solution is more like a ParNew/CMS flow and will treat a > humongous object as young or old. > 1. Humongous object allocation in mutator will be considered into > eden size and won't directly trigger concurrent mark cycle. That > will avoid the possible to-space exhausted while concurrent mark > is working and humongous allocations are "eating" the free regions. (I am trying to imagine situations here where this would be a problem since I do not have a log) That helps if G1 is already trying to do a marking cycle if the space is tight and already eating into the reserve that has explicitly been set aside for this case (G1ReservePercent - did you try increasing that for a workaround?). It does make young collections much more frequent than necessary otherwise. Particularly if these humongous regions are eager-reclaimable. In these cases the humongous allocations would be "free", while with that policy they would cause a young gc. The other issue, if these humongous allocations cause too many concurrent cycles could be managed by looking into canceling the concurrent marking if that concurrent start gc freed lots and lots of humongous objects, e.g. getting way below the mark threshold again. I did not think this through though, of course at some point you do need to start the concurrent mark. 
Some (or most) of that heap pressure might have been caused by the internal fragmentation, so allowing allocation into the tail ends would very likely decrease that pressure too. This would likely be the first thing I would be looking into if the logs indicate that. > 2. Enhance the reclamation of short-live humongous object by > covering object array that current eager reclaim only supports > primitive type for now. This part looks same to JDK-8048180 and > JDK-8073288 Thomas mentioned. The evacuation flow will iterate > the humongous object array as a regular object if the humongous > object is "young" which can be distinguished by the "age" field > in markoop. > > The patch is being tested. We will share it once it proves to > work fine with our applications. I don't know if any similar > approach has been already tried and any advices? The problem with treating humongous reference arrays as young is that this heuristic significantly increases the garbage collection time if that object survives the collection. I.e. the collector needs to iterate over all young objects, and while you do save the time to copy the object by in-place aging, scanning the references tends to take more time than copying. In that "different regional collector" I referenced in the other email exactly this had been implemented with the above issues. That collector also had configurable regions down to 64k (well, basically even less, but anything below that was just for experimentation, and 64k had been very debatable too), so the humongous object problem had been a lot larger. It might not be the case with G1's "giant" humongous objects. Treating them as old like they are now within G1 allows you to be a lot more selective about what you take in for garbage collection. Now the policy isn't particularly smart (just take humongous objects of a particular type with less than a low, fixed threshold of remembered set entries), but that could be improved. I.e. 
G1 has a measure of how long scanning a remembered set entry approximately takes, so that could be made dependent on available time. Thanks, Thomas From kim.barrett at oracle.com Tue Jan 21 08:31:20 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 21 Jan 2020 03:31:20 -0500 Subject: RFR: 8233822: VM_G1CollectForAllocation should always check for upgrade to full Message-ID: <5389B188-BA91-412F-A12E-0DB5A96FF0A9@oracle.com> Please review this G1 change to always check whether a full collection should be performed after a non-full collection pause, e.g. the collection needs to be "upgraded" to a full collection. There are various conditions which can lead to needing to do that, and as the CR suggests, we need to be consistent about checking for and performing such an upgrade. This is accomplished by moving most of do_collection_pause_at_safepoint into a helper function and changing that existing function to call the helper, then check for and, if needed, perform a needed upgrade to a full collection. Callers of that function are updated to remove explict conditional upgrading, where present. This also addresses the surprisingly placed call in a G1-specific block of code in gc/shared (see also JDK-8237567). CR: https://bugs.openjdk.java.net/browse/JDK-8233822 Webrev: https://cr.openjdk.java.net/~kbarrett/8233822/open.00/ Testing: mach5 tier1-5 Locally (linux-x64) ran modified InfiniteList.java test (allocate small rather than arrays) and verified some upgrades occurred as expected. From per.liden at oracle.com Tue Jan 21 10:18:26 2020 From: per.liden at oracle.com (Per Liden) Date: Tue, 21 Jan 2020 11:18:26 +0100 Subject: RFR: 8234440: ZGC: Print relocation information on info level Message-ID: <36df9c0c-9b35-0d6f-df7f-8b9b781818cb@oracle.com> When using -Xlog:gc*, I now and then find that I miss basic relocation information, since it's currently printed at the debug level on the relocation set selector. 
I think we should leave the current logging as is, since that's still useful when debugging the relocation set selector itself. However, I think we should propagate some of the high level information and print it on the info level. Here's an example of what the output looks like with this patch: [...] [68.926s][info][gc,reloc ] GC(6) Small Pages: 529 / 1058M(93%), Empty: 350M(31%), Compacting: 450M(40%)->20M(2%) [68.926s][info][gc,reloc ] GC(6) Medium Pages: 2 / 64M(6%), Empty: 0M(0%), Compacting: 64M(6%)->32M(3%) [68.926s][info][gc,reloc ] GC(6) Large Pages: 2 / 12M(1%), Empty: 6M(1%), Compacting: 0M(0%)->0M(0%) [68.926s][info][gc,reloc ] GC(6) Relocation: Successful [...] Bug: https://bugs.openjdk.java.net/browse/JDK-8234440 Webrev: http://cr.openjdk.java.net/~pliden/8234440/webrev.0 /Per From thomas.schatzl at oracle.com Tue Jan 21 10:19:52 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 21 Jan 2020 11:19:52 +0100 Subject: Discussion: improve humongous objects handling for G1 In-Reply-To: <3b35ff1d-b717-42c8-bd1f-28a54a5a0ec4.maoliang.ml@alibaba-inc.com> References: <3b35ff1d-b717-42c8-bd1f-28a54a5a0ec4.maoliang.ml@alibaba-inc.com> Message-ID: Hi, On 21.01.20 07:25, Liang Mao wrote: > Hi Thomas, > > In fact we saw this issue with 8u. One issue I forgot to tell is that when > CPU usage is quite high which is nearly 100% the concurrent mark will > get very slow so the to-space exhuasted happened. BTW, is there any > improvements for this point in JDK11 or higher versions? I didn't notice so far. JDK13 has some implicit increases in the thresholds to take more humongous candidate regions. Not a lot though. > Increasing reserve percent could alleviate the problem but seems not a completed > solution. It would be nicer if g1 automatically adjusted this reserve based on actual allocation of course. ;) Which is another option btw - there are many ways to avoid the evacuation failure situation. 
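The G1ReservePercent workaround mentioned above comes down to simple sizing arithmetic; a rough model for illustration only (G1's real accounting works in whole regions and interacts with other heuristics):

```java
// Headroom G1 keeps free to absorb evacuation: increasing G1ReservePercent
// trades usable heap for a larger buffer against to-space exhaustion.
// This is a simplification, not G1's exact accounting.
public class ReserveSketch {
    static long reserveBytes(long heapBytes, int reservePercent) {
        return heapBytes * reservePercent / 100;
    }

    public static void main(String[] args) {
        long gb = 1024L * 1024 * 1024;
        long regionSize = 32L * 1024 * 1024;          // -XX:G1HeapRegionSize=32m
        long reserve = reserveBytes(160 * gb, 10);    // 160g heap, default G1ReservePercent=10
        System.out.println(reserve / gb);             // 16 (GB held back)
        System.out.println(reserve / regionSize);     // 512 regions
    }
}
```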
> Cancelling concurrent mark cycle in initial-mark pause seems a delicate > optimization which can cover some issues if a lot of humongous regions have been > reclaimed in this pause. It can avoid the unnecessary cm cycle and also trigger cm > earlier if needed. > We will take this into consideration. Thanks for the great idea:) > > If there is a short-lived humongous object array which also references other > short-lived objects the situation could be worse. If we increase the > G1HeapRegionSize, some humongous objects become normal objects and the behavior > is more like CMS then everything goes fine. I don't think we have to not allow humongous > objects to behave as normal ones. A newly allocated humongous object array can probably > reference objects in young generation and scanning the object array by remset > couldn't be better than directly iterating the array in evacuation because of possible > prefetch. We can have an alternative max survivor age for humongous objects, maybe 5 or 8 If I read this paragraph correctly you argue that keeping a large humongous objArray in young is okay because a) if you increase the heap region size, it has a high chance that it would be below the thresholds anyway, so you would scan it anyway b) scanning a humongous objArray with a few references is not much different performance wise than targeted scanning of the corresponding cards in the remembered set because of hardware. Regarding a) Since I have yet to see logs, I can't tell what the typical size of these arrays are (and I have not seen a "typical" humongous object distribution graph for these applications). However region sizes are kind of proportional with heap size which kind of corresponds to the hardware that you need to use. I.e. you likely won't see G1 using 100 threads on 200m heap with 32m regions with current ergonomics. 
Even then this limits objArrays to 16M (at 32m region size), which limits the time spent scanning the object (and if ergonomics select 32m regions, the heap and the machine are probably quite big anyway). From what you and Man were telling, you seem to have a significant amount of humongous objects of unknown type that are much(?) larger than that. Regarding b) that has been wrong years ago when I did experiments on that (even the "limit age on humongous obj arrays" workaround - you can easily go as low as a max tenuring threshold of 1 to catch almost all of the relevant ones), and very likely still is. Let me do some over-the-thumb calculations: Assuming that we have 32M objects (random number, i.e. ~8m references), with, say 1k references (which is more than a handful), the remembered set would make you scan only 1.5% max (1000*512 bytes/card) of the object. I seriously doubt that prefetching or some magic hardware will make that amount of additional work disappear. From a performance POV, with 20 GB/s bandwidth available, (which I am not sure you will reach during GC for whatever reasons; random number), you are spending 1.5ms (if I calculated correctly) cpu time just for finding out that the 32M object is completely full of null-s in the worst case. That's also the minimum amount of time you need per such object. Keeping it outside of young gen, particularly since it has been allocated just recently and won't have a lot of remembered set entries, would likely be much cheaper than that (as mentioned, G1 has a good measure of how long scanning a card will take so we could take this number). Only if G1 is going to scan it almost completely anyway (which we agree on is unlikely to be the case as it has "just" been allocated), then keeping it outside is disadvantageous. Note that its allocation could still be counted against the eden allowance in some situations. This could be seen as a way to slow down the mutator while it is busy trying to complete the marking. 
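The over-the-thumb calculation above can be reproduced in a few lines. This is a hedged sketch in plain Python using only the assumed values from the discussion (a 32M humongous objArray, 512-byte cards, ~1k remembered set entries, 20 GB/s scan bandwidth); none of these are measured constants.

```python
# Back-of-the-envelope check of the card-scanning argument above.
# All inputs are the hypothetical values from the discussion.
OBJ_BYTES = 32 * 1024 * 1024          # assumed 32M humongous objArray
CARD_BYTES = 512                      # G1 card size
REMSET_CARDS = 1000                   # assumed ~1k remembered set entries
BANDWIDTH_BYTES_PER_S = 20 * 1024**3  # assumed 20 GB/s scan bandwidth

# Fraction of the object touched when scanning only the cards in the
# remembered set, instead of the whole object.
scanned_fraction = REMSET_CARDS * CARD_BYTES / OBJ_BYTES

# Minimum CPU time to sweep the whole object if it stays in young gen.
full_scan_ms = OBJ_BYTES / BANDWIDTH_BYTES_PER_S * 1000

print(f"remset-driven scan covers {scanned_fraction:.1%} of the object")
print(f"full scan needs at least {full_scan_ms:.2f} ms")
```

Run as-is it prints a scan fraction of about 1.5% and a minimum full-scan time of about 1.56 ms, matching the figures quoted in the discussion.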
I am however not sure if it helps a lot assuming that changes to perform eager reclaim on objArrays won't work during marking btw. There would be need for a different kind of enforcing such an allocation penalty. Without more thinking and measurements I would not know when and how to account that, and what has to happen with existing mechanisms to absorb allocation spikes (i.e. G1ReservePercent). I just assume that you probably do not want both. Also something to consider. > at most otherwise let eager reclaim do it. A tradeoff can be made to balance the > pause time and reclamation possibility of short-lived objects. > > So the enhanced solution can be > 1. Cancelling concurrent mark if not necessary. > 2. Increase the reclamation possibility of short-lived humongous objects. These are valid possibilities to improve the overall situation without fixing actual fragmentation issues ;) > An important reason for this issue is that Java developers easily > challenge that CMS can handle the application without significant CPU usage increase > (caused by concurrent mark) > but why G1 cannot. Personally I believe G1 can do anything not worse > than CMS:) > This proposal aims for the throughput gap compared to CMS. If it works > with the barrier optimization which is proposed by Man and Google, imho the gap could be > obviously reduced. Thanks, Thomas From shade at redhat.com Tue Jan 21 10:21:01 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 21 Jan 2020 11:21:01 +0100 Subject: RFR (S) 8237570: Shenandoah: cleanup uses of allocation/free threshold in static heuristics Message-ID: RFE: https://bugs.openjdk.java.net/browse/JDK-8237570 Fix: https://cr.openjdk.java.net/~shade/8237570/webrev.01/ As noted by Justinas in the separate thread, ShAllocThresh has no effect on "static" heuristics, so it should not be adjusted. Also, it should use ShMinFreeThresh, as other heuristics use. This makes ShFreeThresh unused, and it is removed for clarity. 
Testing: hotspot_gc_shenandoah -- Thanks, -Aleksey From rkennke at redhat.com Tue Jan 21 10:43:35 2020 From: rkennke at redhat.com (Roman Kennke) Date: Tue, 21 Jan 2020 11:43:35 +0100 Subject: RFR (S) 8237570: Shenandoah: cleanup uses of allocation/free threshold in static heuristics In-Reply-To: References: Message-ID: <51cf1b77-e89a-653c-abd3-152813ea936d@redhat.com> Makes sense. Good! Thanks, Roman > RFE: > https://bugs.openjdk.java.net/browse/JDK-8237570 > > Fix: > https://cr.openjdk.java.net/~shade/8237570/webrev.01/ > > As noted by Justinas in the separate thread, ShAllocThresh has no effect on "static" heuristics, so > it should not be adjusted. Also, it should use ShMinFreeThresh, as other heuristics use. This makes > ShFreeThresh unused, and it is removed for clarity. > > Testing: hotspot_gc_shenandoah > From per.liden at oracle.com Tue Jan 21 12:56:07 2020 From: per.liden at oracle.com (Per Liden) Date: Tue, 21 Jan 2020 13:56:07 +0100 Subject: RFR: 8234440: ZGC: Print relocation information on info level In-Reply-To: <36df9c0c-9b35-0d6f-df7f-8b9b781818cb@oracle.com> References: <36df9c0c-9b35-0d6f-df7f-8b9b781818cb@oracle.com> Message-ID: <9700c26c-8ee9-f58d-1c1c-a1fa6a53b24b@oracle.com> I got some off-line comments from Stefan, updated webrev: Diff: http://cr.openjdk.java.net/~pliden/8234440/webrev.1-diff Full: http://cr.openjdk.java.net/~pliden/8234440/webrev.1 /Per On 1/21/20 11:18 AM, Per Liden wrote: > When using -Xlog:gc*, I now and then find that I miss basic relocation > information, since it's currently printed at the debug level on the > relocation set selector. I think we should leave the current logging as > is, since that's still useful when debugging the relocation set selector > itself. However, I think we should propagate some of the high level > information and print it on the info level. > > Here's an example of what the output looks like with this patch: > > [...] > [68.926s][info][gc,reloc 
] GC(6) Small Pages: 529 / 1058M(93%), > Empty: 350M(31%), Compacting: 450M(40%)->20M(2%) > [68.926s][info][gc,reloc ] GC(6) Medium Pages: 2 / 64M(6%), Empty: > 0M(0%), Compacting: 64M(6%)->32M(3%) > [68.926s][info][gc,reloc ] GC(6) Large Pages: 2 / 12M(1%), Empty: > 6M(1%), Compacting: 0M(0%)->0M(0%) > [68.926s][info][gc,reloc ] GC(6) Relocation: Successful > [...] > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234440 > Webrev: http://cr.openjdk.java.net/~pliden/8234440/webrev.0 > > /Per From stefan.karlsson at oracle.com Tue Jan 21 13:37:19 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Tue, 21 Jan 2020 14:37:19 +0100 Subject: RFR: 8234440: ZGC: Print relocation information on info level In-Reply-To: <9700c26c-8ee9-f58d-1c1c-a1fa6a53b24b@oracle.com> References: <36df9c0c-9b35-0d6f-df7f-8b9b781818cb@oracle.com> <9700c26c-8ee9-f58d-1c1c-a1fa6a53b24b@oracle.com> Message-ID: <11776dca-f130-46db-505d-8604d69a0cff@oracle.com> Looks good. StefanK On 2020-01-21 13:56, Per Liden wrote: > I got some off-line comments from Stefan, updated webrev: > > Diff: http://cr.openjdk.java.net/~pliden/8234440/webrev.1-diff > Full: http://cr.openjdk.java.net/~pliden/8234440/webrev.1 > > /Per > > On 1/21/20 11:18 AM, Per Liden wrote: >> When using -Xlog:gc*, I now and then find that I miss basic relocation >> information, since it's currently printed at the debug level on the >> relocation set selector. I think we should leave the current logging >> as is, since that's still useful when debugging the relocation set >> selector itself. However, I think we should propagate some of the high >> level information and print it on the info level. >> >> Here's an example of what the output looks like with this patch: >> >> [...] >> [68.926s][info][gc,reloc ] GC(6) Small Pages: 529 / 1058M(93%), >> Empty: 350M(31%), Compacting: 450M(40%)->20M(2%) >> [68.926s][info][gc,reloc 
] GC(6) Medium Pages: 2 / 64M(6%), Empty: >> 0M(0%), Compacting: 64M(6%)->32M(3%) >> [68.926s][info][gc,reloc ] GC(6) Large Pages: 2 / 12M(1%), Empty: >> 6M(1%), Compacting: 0M(0%)->0M(0%) >> [68.926s][info][gc,reloc ] GC(6) Relocation: Successful >> [...] >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234440 >> Webrev: http://cr.openjdk.java.net/~pliden/8234440/webrev.0 >> >> /Per From per.liden at oracle.com Tue Jan 21 13:58:08 2020 From: per.liden at oracle.com (Per Liden) Date: Tue, 21 Jan 2020 14:58:08 +0100 Subject: RFR: 8234440: ZGC: Print relocation information on info level In-Reply-To: <11776dca-f130-46db-505d-8604d69a0cff@oracle.com> References: <36df9c0c-9b35-0d6f-df7f-8b9b781818cb@oracle.com> <9700c26c-8ee9-f58d-1c1c-a1fa6a53b24b@oracle.com> <11776dca-f130-46db-505d-8604d69a0cff@oracle.com> Message-ID: <5e36116a-377e-7d8c-391a-fc59f66bba6f@oracle.com> Thanks Stefan! /Per On 1/21/20 2:37 PM, Stefan Karlsson wrote: > Looks good. > > StefanK > > On 2020-01-21 13:56, Per Liden wrote: >> I got some off-line comments from Stefan, updated webrev: >> >> Diff: http://cr.openjdk.java.net/~pliden/8234440/webrev.1-diff >> Full: http://cr.openjdk.java.net/~pliden/8234440/webrev.1 >> >> /Per >> >> On 1/21/20 11:18 AM, Per Liden wrote: >>> When using -Xlog:gc*, I now and then find that I miss basic >>> relocation information, since it's currently printed at the debug >>> level on the relocation set selector. I think we should leave the >>> current logging as is, since that's still useful when debugging the >>> relocation set selector itself. However, I think we should propagate >>> some of the high level information and print it on the info level. >>> >>> Here's an example of what the output looks like with this patch: >>> >>> [...] >>> [68.926s][info][gc,reloc ] GC(6) Small Pages: 529 / 1058M(93%), >>> Empty: 350M(31%), Compacting: 450M(40%)->20M(2%) >>> [68.926s][info][gc,reloc 
] GC(6) Medium Pages: 2 / 64M(6%), Empty: >>> 0M(0%), Compacting: 64M(6%)->32M(3%) >>> [68.926s][info][gc,reloc ] GC(6) Large Pages: 2 / 12M(1%), Empty: >>> 6M(1%), Compacting: 0M(0%)->0M(0%) >>> [68.926s][info][gc,reloc ] GC(6) Relocation: Successful >>> [...] >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234440 >>> Webrev: http://cr.openjdk.java.net/~pliden/8234440/webrev.0 >>> >>> /Per From maoliang.ml at alibaba-inc.com Tue Jan 21 14:26:42 2020 From: maoliang.ml at alibaba-inc.com (Liang Mao) Date: Tue, 21 Jan 2020 22:26:42 +0800 Subject: Re: Discussion: improve humongous objects handling for G1 In-Reply-To: References: <3b35ff1d-b717-42c8-bd1f-28a54a5a0ec4.maoliang.ml@alibaba-inc.com>, Message-ID: <54f40191-82ff-43af-aaa5-5821efc59bed.maoliang.ml@alibaba-inc.com> Hi Thomas, Thank you for pointing out my mistake for comparing iterating object array with card scanning that I missed the detail that card scanning doesn't need to scan the whole object array. I didn't provide gc log because I haven't sufficient statistics data about humongous distribution or the object arrays. The solution is just straightforward because increasing G1HeapRegionSize fixes the problem so I want to do the same to G1HeapRegionSize=32m. In my earlier memory of tuning some typical applications, humongous objects occupy more than half of the used heap after young GC with default G1HeapRegionSize. I guess perhaps half of our applications may encounter the issue with default setting. So currently we use the G1HeapRegionSize as approximately 1/500 of Xmx. I know that iterating a humongous object array in young GC might significantly degrade the pause-time-oriented philosophy. But if the pause time is already in expectation with CMS such behavior isn't doing anything worse but avoids the GC turbulence by concurrent mark. Besides the obvious penalty to pause time, do you have any other concerns? 
> Note that its allocation could still be counted against the eden > allowance in some situations. This could be seen as a way to slow down > the mutator while it is busy trying to complete the marking. > I am however not sure if it helps a lot assuming that changes to perform > eager reclaim on objArrays won't work during marking btw. There would be > need for a different kind of enforcing such an allocation penalty. I'm sorry I didn't get these 2 paragraphs. Could you please explain more? Thanks, Liang ------------------------------------------------------------------ From:Thomas Schatzl Send Time:2020 Jan. 21 (Tue.) 18:20 To:"MAO, Liang" ; Man Cao ; hotspot-gc-dev Subject:Re: Discussion: improve humongous objects handling for G1 > [...] From shade at redhat.com Tue Jan 21 18:29:36 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 21 Jan 2020 19:29:36 +0100 Subject: RFR (S) 8237586: Shenandoah: provide option to disable periodic GC Message-ID: RFE: https://bugs.openjdk.java.net/browse/JDK-8237586 Webrev: https://cr.openjdk.java.net/~shade/8237586/webrev.01/ The VM option is unsigned, which leaves us with "0" as special value. It also matches the behavior of G1PeriodicGCInterval and GuaranteedSafepointInterval. Testing: hotspot_gc_shenandoah (includes new testcases) -- Thanks, -Aleksey From zgu at redhat.com Tue Jan 21 19:18:09 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 21 Jan 2020 14:18:09 -0500 Subject: RFR (S) 8237586: Shenandoah: provide option to disable periodic GC In-Reply-To: References: Message-ID: Fix looks good to me. 
Please update copyright years. Thanks, -Zhengyu On 1/21/20 1:29 PM, Aleksey Shipilev wrote: > RFE: > https://bugs.openjdk.java.net/browse/JDK-8237586 > > Webrev: > https://cr.openjdk.java.net/~shade/8237586/webrev.01/ > > The VM option is unsigned, which leaves us with "0" as special value. It also matches the behavior > of G1PeriodicGCInterval and GuaranteedSafepointInterval. > > Testing: hotspot_gc_shenandoah (includes new testcases) > From manc at google.com Wed Jan 22 05:05:34 2020 From: manc at google.com (Man Cao) Date: Tue, 21 Jan 2020 21:05:34 -0800 Subject: Discussion: improve humongous objects handling for G1 In-Reply-To: <54f40191-82ff-43af-aaa5-5821efc59bed.maoliang.ml@alibaba-inc.com> References: <3b35ff1d-b717-42c8-bd1f-28a54a5a0ec4.maoliang.ml@alibaba-inc.com> <54f40191-82ff-43af-aaa5-5821efc59bed.maoliang.ml@alibaba-inc.com> Message-ID: Hi all, Thanks for the great discussion from Thomas and Liang! Regarding GC logs, histograms of humongous allocations, and a more concrete example, I guess we are in the same boat here. We only advised users to increase G1HeapRegionSize, which would work around many cases of the problem. We have not yet closely studied patterns of the problematic humongous allocations. I will do such a study and follow up with some statistics and GC logs when I get my hands on them. >> maybe the threshold for the amount of >> remembered set entries to keep these humongous objects as eligible for >> eager reclaim is too low, and increasing that one would just make it work. > JDK-8237500 Thanks for this. I will definitely try tuning this if the humongous objects are non-objArrays. > You could double-map like > https://blog.openj9.org/2019/05/01/double-map-arraylets/ does for > native access. > There is some remark in some tech paper about arraylets > ( https://www.ibm.com/developerworks/websphere/techjournal/1108_sciampacone/1108_sciampacone.html > that indicates that the balanced collector seems to not move the > arrayoids too. 
Thanks for digging into the details of arraylets. I didn't do much research on it. > Btw the same text also indicates that copying seems like a non-starter > anyway, as, quoting from the text "One use case, SPECjbb2015 benchmark > is not being able to finish RT curve...". > Not sure what prevents arraylets in particular from being O(1); a > particular access is slower though due to the additional indirection > with the spine. > ... > Which means that there is significant optimization work needed to make > array access "as fast" as before in jitted code These two issues: (1) copying for JNI Critical (2) slowing down typical jitted code for array accesses do sound like performance deal-breakers, particularly if they are only required for G1+arraylets but not other collectors. There are some use cases of JNI Critical on arrays that are solely for performance reasons, and we'd rather not slow them down. > It could help with all problems but cases where you allocate a very > large number of humongous objects and you can't keep the humongous object > tails filled. This option still keeps the invariant that humongous > objects need to be allocated at a region boundary. > > Most of the other ideas you propose below also (seem to) retain this > property. Agreed. It seems that JDK-8172713 would help most ideas anyway. > Maybe it is sufficient as "most" applications only use single or low > double-digit GB heaps at the moment where the entire reservation still > fits into the 32gb barrier. I also had the same thought. Most of our important workloads have heap sizes less than 20GB. If the "reserve multiple MaxHeapSize" approach could work with compressed oops for <16GB heap, then it is quite acceptable. That said, now I do agree that I should first study the patterns of humongous allocations and look into improvement on eager reclamation. For the approach from Liang/Alibaba, I'm optimistic that it could solve many problems migrating from ParNew+CMS to G1. 
Because it handles humongous allocations in a similar way as ParNew+CMS does, plus G1 has the advantage of not copying humongous objects, the pause duration/frequency would probably not degrade compared to ParNew/CMS. I also agree with Thomas that it may increase pause duration compared to current G1 due to extra scanning, and allocation spikes might affect other aspects of G1. I noticed in the description for JDK-8027959: "a) logically keep LOBs in young gen, doing in-place aging", which sounds like the GC team have explored this approach for eager reclamation before? It might be the best of both worlds if we could make eager reclamation of humongous objArrays work without putting them in young gen, and further improve eager reclamation in general. -Man From erik.osterlund at oracle.com Wed Jan 22 09:04:33 2020 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Wed, 22 Jan 2020 10:04:33 +0100 Subject: RFR: 8234440: ZGC: Print relocation information on info level In-Reply-To: <9700c26c-8ee9-f58d-1c1c-a1fa6a53b24b@oracle.com> References: <36df9c0c-9b35-0d6f-df7f-8b9b781818cb@oracle.com> <9700c26c-8ee9-f58d-1c1c-a1fa6a53b24b@oracle.com> Message-ID: <5f1ca9a1-0130-cdb1-30e6-4bc88aa3a684@oracle.com> Hi Per, Looks good. /Erik On 1/21/20 1:56 PM, Per Liden wrote: > I got some off-line comments from Stefan, updated webrev: > > Diff: http://cr.openjdk.java.net/~pliden/8234440/webrev.1-diff > Full: http://cr.openjdk.java.net/~pliden/8234440/webrev.1 > > /Per > > On 1/21/20 11:18 AM, Per Liden wrote: >> When using -Xlog:gc*, I now and then find that I miss basic >> relocation information, since it's currently printed at the debug >> level on the relocation set selector. I think we should leave the >> current logging as is, since that's still useful when debugging the >> relocation set selector itself. However, I think we should propagate >> some of the high level information and print it on the info level. 
>> >> Here's an example of what the output looks like with this patch: >> >> [...] >> [68.926s][info][gc,reloc ] GC(6) Small Pages: 529 / 1058M(93%), >> Empty: 350M(31%), Compacting: 450M(40%)->20M(2%) >> [68.926s][info][gc,reloc ] GC(6) Medium Pages: 2 / 64M(6%), Empty: >> 0M(0%), Compacting: 64M(6%)->32M(3%) >> [68.926s][info][gc,reloc ] GC(6) Large Pages: 2 / 12M(1%), Empty: >> 6M(1%), Compacting: 0M(0%)->0M(0%) >> [68.926s][info][gc,reloc ] GC(6) Relocation: Successful >> [...] >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8234440 >> Webrev: http://cr.openjdk.java.net/~pliden/8234440/webrev.0 >> >> /Per From thomas.schatzl at oracle.com Wed Jan 22 09:40:04 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 22 Jan 2020 10:40:04 +0100 Subject: Discussion: improve humongous objects handling for G1 In-Reply-To: <54f40191-82ff-43af-aaa5-5821efc59bed.maoliang.ml@alibaba-inc.com> References: <3b35ff1d-b717-42c8-bd1f-28a54a5a0ec4.maoliang.ml@alibaba-inc.com> <54f40191-82ff-43af-aaa5-5821efc59bed.maoliang.ml@alibaba-inc.com> Message-ID: Hi Liang, On 21.01.20 15:26, Liang Mao wrote: > Hi Thomas, > > Thank you for pointing out my mistake for comparing iterating object array > with card scanning that I missed the detail that card scanning doesn't need > to scan the whole object array. > > I didn't provide gc log because I haven't sufficient statistics data about > humongous distribution or the object arrays. The solution is just > straightforward > because increasing G1HeapRegionSize fixes the problem so I want to do > the same to > G1HeapRegionSize=32m. In my earlier memory of tuning some > typical applications, > humongous objects occupy more than half of the used heap after young GC with > default G1HeapRegionSize. I guess perhaps half of our applications may > encounter > the issue with default setting. So currently we use the G1HeapRegionSize as > approximately 1/500 of Xmx. 
> I know that iterating a humongous object array in young GC might significantly > degrade the pause-time-oriented philosophy. But if the pause time is > already in > expectation with CMS such behavior isn't doing anything worse but avoids > the GC > turbulence by concurrent mark. Besides the obvious penalty to pause time, do > you have any other concerns? Ultimately, no, but given that there are options that seem all-around better or there are things (e.g. humongous object tail allocations) to do first I would not spend time on that at this time :) > >> Note that its allocation could still be counted against the eden >> allowance in some situations. This could be seen as a way to slow down >> the mutator while it is busy trying to complete the marking. > >> I am however not sure if it helps a lot assuming that changes to perform >> eager reclaim on objArrays won't work during marking btw. There would be >> need for a different kind of enforcing such an allocation penalty. > > I'm sorry I didn't get these 2 paragraphs. Could you please explain more? The first sentence starts talking about a hybrid approach: keep the object in old gen, but still account it against the allowed eden allocation. The problem is how to account this: you do not really want to account the full regions against it, because that would cause more gcs than CMS as we don't/can't allocate into the tail of humongous objects at all. Counting fractions of regions would leave you with the decision to be conservative and make an eden allocation region unused. E.g. the eden allocation budget is 50 regions, and you allocate 50 humongous objects half a region each. Should that exhaust the budget (in case regions are counted fully)? This would mean that G1 would do collections at double the rate of CMS. An option would be to make them count at their real size (for purposes of determining when eden is "exhausted"), but that might leave you with a fraction of a region for the last region. 
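The 50-region example above can be made concrete with a toy calculation. This sketch uses only the hypothetical numbers from the text (a 50-region eden budget and 50 humongous objects of half a region each); it is an illustration of the two accounting policies, not G1 code.

```python
# Toy comparison of the two eden-accounting policies discussed above.
EDEN_BUDGET_REGIONS = 50
allocations_regions = [0.5] * 50  # 50 humongous objects, half a region each

# Policy 1: charge every humongous allocation a full region.
full_region_charge = len(allocations_regions)

# Policy 2: charge each allocation its actual size.
actual_size_charge = sum(allocations_regions)

# Policy 1 exhausts the budget (a GC at roughly double the CMS rate),
# while policy 2 leaves half of the budget unused.
print(full_region_charge >= EDEN_BUDGET_REGIONS)
print(actual_size_charge >= EDEN_BUDGET_REGIONS)
```

With these numbers, policy 1 charges 50 regions and triggers a collection, while policy 2 charges only 25, which is exactly the tension between over-collecting and under-accounting described above.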
The other problem is of course that, while it's accounted for its actual size, it still takes twice the space in the heap (i.e. the to-space exhaustion issue). This is why the mention of "in some situations" - if you are eating into the reserve already, it's probably better to account them in full anyway. Note that otoh if you keep the object in young gen logically too, you could imagine allocating into its tail. However keeping humongous (objArray) objects in young gen logically has the other bad properties we talked about earlier. You could even vary the strategy between objArrays and non-objArrays. So this needs some more thought, and the policy written down, and tested on "real" applications. :) The last sentence in that paragraph refers to somehow slowing down the mutator when eating into the reserve to complete the marking in time. The second paragraph is about questioning the slowing down mechanism a bit: even during this situation, when you are marking, and slowing down the mutator, additional gcs do not help you a lot wrt eager reclaim of humongous objArray objects, as the most likely (initial) implementation of that would not do eager reclaim during marking (it's doable, you need to keep the satb invariant). Yeah, these two paragraphs compressed my thoughts maybe a bit too much. ;) Thanks, Thomas From per.liden at oracle.com Wed Jan 22 09:54:17 2020 From: per.liden at oracle.com (Per Liden) Date: Wed, 22 Jan 2020 10:54:17 +0100 Subject: RFR: 8234440: ZGC: Print relocation information on info level In-Reply-To: <5f1ca9a1-0130-cdb1-30e6-4bc88aa3a684@oracle.com> References: <36df9c0c-9b35-0d6f-df7f-8b9b781818cb@oracle.com> <9700c26c-8ee9-f58d-1c1c-a1fa6a53b24b@oracle.com> <5f1ca9a1-0130-cdb1-30e6-4bc88aa3a684@oracle.com> Message-ID: <8ea2cd7c-04b5-dd60-3525-c531d343c63f@oracle.com> Thanks Erik! /Per On 1/22/20 10:04 AM, erik.osterlund at oracle.com wrote: > Hi Per, > > Looks good. 
> > /Erik > > On 1/21/20 1:56 PM, Per Liden wrote: >> I got some off-line comments from Stefan, updated webrev: >> >> Diff: http://cr.openjdk.java.net/~pliden/8234440/webrev.1-diff >> Full: http://cr.openjdk.java.net/~pliden/8234440/webrev.1 >> >> /Per >> >> On 1/21/20 11:18 AM, Per Liden wrote: >>> When using -Xlog:gc*, I now and then find that I miss basic >>> relocation information, since it's currently printed at the debug >>> level on the relocation set selector. I think we should leave the >>> current logging as is, since that's still useful when debugging the >>> relocation set selector itself. However, I think we should propagate >>> some of the high level information and print it on the info level. >>> >>> Here's an example of what the output looks like with this patch: >>> >>> [...] >>> [68.926s][info][gc,reloc    ] GC(6) Small Pages: 529 / 1058M(93%), >>> Empty: 350M(31%), Compacting: 450M(40%)->20M(2%) >>> [68.926s][info][gc,reloc    ] GC(6) Medium Pages: 2 / 64M(6%), Empty: >>> 0M(0%), Compacting: 64M(6%)->32M(3%) >>> [68.926s][info][gc,reloc    ] GC(6) Large Pages: 2 / 12M(1%), Empty: >>> 6M(1%), Compacting: 0M(0%)->0M(0%) >>> [68.926s][info][gc,reloc    ] GC(6) Relocation: Successful >>> [...] >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8234440 >>> Webrev: http://cr.openjdk.java.net/~pliden/8234440/webrev.0 >>> >>> /Per > From per.liden at oracle.com Wed Jan 22 09:57:25 2020 From: per.liden at oracle.com (Per Liden) Date: Wed, 22 Jan 2020 10:57:25 +0100 Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: <1cb1e7ea-45dd-6a36-1731-94fe1fe25244@oracle.com> References: <1cb1e7ea-45dd-6a36-1731-94fe1fe25244@oracle.com> Message-ID: <03e96d75-4449-460a-fbc9-a6e1d5f7639c@oracle.com> Looks good to me. /Per On 1/17/20 2:31 PM, Stefan Karlsson wrote: > Hi all, > > Please review this patch to remove the automatic "is in heap" > verification from OopIterateClosure.
> > https://cr.openjdk.java.net/~stefank/8237363/webrev.01/ > > https://bugs.openjdk.java.net/browse/JDK-8237363 > > > > OopIterateClosure provides some automatic verification that loaded > > objects are inside the heap. Closures can opt out from this by > > overriding should_verify_oops(). > > > > I propose that we move this verification, and the way to turn it off, > > and instead let the implementations of the closures decide the kind of > > verification that is appropriate. I want to do this to de-clutter the > > closure APIs a bit. > > > > I've gone through all OopIterateClosures that don't override > > should_verify_oops() and added calls to > > assert_oop_field_points_to_object_in_heap[_or_null] where the closures > > didn't have equivalent checks. > > > > A lot of the places didn't explicitly check that the object is within > > the heap but they would check for other things like: > > - Is the corresponding bit index within the range > > - Is the heap region index within range > > - Is the object in the reserved heap range (weaker than is_in) > > > > I've added asserts to those places. If you think I should remove some of > > them, please let me know. > > > > Tested with tier1-3 > > > > Thanks, > > StefanK From maoliang.ml at alibaba-inc.com Wed Jan 22 10:02:03 2020 From: maoliang.ml at alibaba-inc.com (Liang Mao) Date: Wed, 22 Jan 2020 18:02:03 +0800 Subject: Re: Discussion: improve humongous objects handling for G1 In-Reply-To: References: <3b35ff1d-b717-42c8-bd1f-28a54a5a0ec4.maoliang.ml@alibaba-inc.com> <54f40191-82ff-43af-aaa5-5821efc59bed.maoliang.ml@alibaba-inc.com>, Message-ID: <45600e6d-08c9-4035-bcc5-78db69bfe8b0.maoliang.ml@alibaba-inc.com> Hi Thomas, Thanks for your rich explanation. I saw those problems but didn't think so much:) My previous approach definitely would increase the GC frequency so after you provided the idea of canceling the cm cycle I thought it would be a better one.
Thanks, Liang ------------------------------------------------------------------ From:Thomas Schatzl Send Time:2020 Jan. 22 (Wed.) 17:40 To:"MAO, Liang" ; Man Cao ; hotspot-gc-dev Subject:Re: Discussion: improve humongous objects handling for G1 Hi Liang, On 21.01.20 15:26, Liang Mao wrote: > Hi Thomas, > > Thank you for pointing out my mistake for comparing iterating object array > with card scanning that I missed the detail that card scanning doesn't need > to scan the whole object array. > > I didn't provide gc log because I haven't sufficient statistics data about > humongous distribution or the object arrays. The solution is just > straightforward > because increasing G1HeapRegionSize fixes the problem so I want to do > the same to > G1HeapRegionSize=32m. In my earlier memory of tuning some > typical applications, > humongous objects occupy more than half of the used heap after young GC with > default G1HeapRegionSize. I guess perhaps half of our applications may > encounter > the issue with default setting. So currently we use the G1HeapRegionSize as > approximately 1/500 of Xmx. > I know that iterating humongous object array in young GC might significantly > degrade the pause time oriented philosophy. But if the pause time is > already in > expectation with CMS such behavior isn't doing anything worse but avoid > the GC > turbulence by concurrent mark. Besides the obvious penalty to pause time, do > you have any other concerns? Ultimately, no, but given that there are options that seem all-around better or there are things (e.g. humongous object tail allocations) to do first I would not spend time on that at this time :) > >> Note that its allocation could still be counted against the eden >> allowance in some situations. This could be seen as a way to slow down >> the mutator while it is busy trying to complete the marking.
> >> I am however not sure if it helps a lot assuming that changes to perform >> eager reclaim on objArrays won't work during marking btw. There would be >> need for a different kind of enforcing such an allocation penalty. > > I'm sorry I didn't get these 2 paragraphs. Could you please explain more? The first sentence starts talking about a hybrid approach: keep the object in old gen, but still account it against the allowed eden allocation. The problem is how to account this: you do not really want to account the full regions against it, because that would cause more gcs than CMS as we don't/can't allocate into tail of humongous objects at all. Counting fractions of regions would leave you with the decision to be conservative and make an eden allocation region unused. E.g. eden allocation budget is 50 regions, you allocate 50 humongous objects half a region each. Should that exhaust the budget (in case regions are counted fully)? This would mean that G1 would do collections at double the rate of CMS. An option would be to make them count at their real size (for purposes of determining when eden is "exhausted"), but that might leave you with a fraction of a region for the last region. The other problem is of course, while it's accounted for its actual size, it still takes twice space in the heap (i.e. the to-space exhaustion issue). This is why the mention of "in some situations" - if you are eating into the reserve already, it's probably better to account them in full anyway. Note that otoh if you keep the object in young gen logically too, you could imagine allocating into its tail. However keeping humongous (objArray) objects in young gen logically has the other bad properties we talked about earlier. You could even vary the strategy between objArray and non-objArrays. So this needs some more thought, and the policy written down, and tested on "real" applications. 
:) The last sentence in that paragraph refers to somehow slow down the mutator when eating in the reserve to complete the marking in time. The second paragraph is about questioning the slowing down mechanism a bit: even during this situation, when you are marking, and slowing down the mutator, additional gcs do not help you a lot wrt to eager reclaim of humongous objArray objects, as the most likely (initial) implementation of that would not do eager reclaim during marking (it's doable, you need to keep the satb invariant). Yeah, these two paragraphs compressed my thoughts maybe a bit too much. ;) Thanks, Thomas From thomas.schatzl at oracle.com Wed Jan 22 10:05:19 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 22 Jan 2020 11:05:19 +0100 Subject: Discussion: improve humongous objects handling for G1 In-Reply-To: References: <3b35ff1d-b717-42c8-bd1f-28a54a5a0ec4.maoliang.ml@alibaba-inc.com> <54f40191-82ff-43af-aaa5-5821efc59bed.maoliang.ml@alibaba-inc.com> Message-ID: Hi Man, On 22.01.20 06:05, Man Cao wrote: > Hi all, > > Thanks for the great discussion from Thomas and Liang! > > Regarding to GC logs, histogram of humongous allocations, and a more > concrete example, I guess we are in the same boat here. We only advised > users to increase G1HeapRegionSize, which would work around many cases > of the problem. We have not yet closely studied patterns of the > problematic humongous allocations. I will do such a study and follow up > with some statistics and GC logs when I get my hands on them. > [...] > > > Btw the same text also indicates that copying seems like a non-starter > > anyway, as, quoting from the text "One use case, SPECjbb2015 benchmark > > is not being able to finish RT curve...". > > Not sure what prevents arraylets in particular from being O(1); a > > particular access is slower though due to the additional indirection > > with the spine. > > ...
> > Which means that there is significant optimization work needed to make > > array access "as fast" as before in jitted code > These two issues: > (1) copying for JNI Critical > (2) slowing down typical jitted code for array accesses > do sound like performance deal-breakers, particularly if they are only > required for G1+arraylets but not other collectors. There are some use > cases of JNI Critical on arrays that are solely for performance reasons, > and we'd rather not slow them down +1 > > > It could help with all problems but cases where you allocate a very > > large number of humongous objects and you can't keep the humongous object > > tails filled. This option still keeps the invariant that humongous > > objects need to be allocated at a region boundary. > > > > Most of the other ideas you propose below also (seem to) retain this > > property. > Agreed. It seems that JDK-8172713 would help most ideas anyway. Yeah :) > > Maybe it is sufficient as "most" applications only use single or low > > double-digit GB heaps at the moment where the entire reservation still > > fits into the 32gb barrier. > I also had the same thought. Most of our important workloads have heap > sizes less than 20GB. > If the "reserve multiple MaxHeapSize" approach could work with > compressed oops for <16GB heap, then it is quite acceptable. I agree, it could help in a lot of cases while not (apparently) costing much. On 32 bit systems this is simply disabled (ie. -Xmx == reservation size). I think even a small over-reservation would help in a lot of cases for external fragmentation. One could think of "cheating" a little with the actual memory usage/commit size by only committing what the humongous object actually needs if you wanted. This would complicate size accounting quite a bit though (and increase commit/uncommit calls), so JDK-8172713 seems favorable at least.
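The compressed oops constraint discussed above boils down to simple arithmetic (a hypothetical helper, not a VM API; it assumes the whole reservation must stay below the 32 GB boundary that compressed oops with 8-byte object alignment can address):

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical check for the "reserve a multiple of MaxHeapSize" idea:
// the entire reservation, not just -Xmx, has to fit below the boundary.
bool reservation_fits_compressed_oops(uint64_t max_heap_bytes,
                                      uint64_t reservation_factor) {
  const uint64_t kCompressedOopLimit = 32ull << 30; // 32 GB
  return max_heap_bytes * reservation_factor <= kCompressedOopLimit;
}
```

For example, a 2x over-reservation still allows compressed oops up to a 16 GB -Xmx, which matches the "<16GB heap" case mentioned above, while a 20 GB heap with the same factor would lose compressed oops.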
> That said, now I do agree that I should first study the patterns of > humongous allocation and look into improvement on eager reclamation. > > For the approach from Liang/Alibaba, I'm optimistic that it could solve > many problems migrating from ParNew+CMS to G1. Because it handles > humongous allocations in a similar way as ParNew+CMS does, plus G1 has > the advantage of not copying humongous objects, the pause > duration/frequency would probably not degrade compared to ParNew/CMS. > I also agree with Thomas that it may increase pause duration compared to > current G1 due to extra scanning, and allocation spikes might affect > other aspects of G1. I noticed in the description for JDK-8027959: "a) > logically keep LOBs in young gen, doing in-place aging", which sounds > like the GC team have explored this approach for eager reclamation > before? Yes, with the issue described before (with objArray humongous objects - non-objArrays are not an issue from a scanning POV, but they still are from an accounting one). It might be the best of both worlds if we could make eager > reclamation of humongous objArrays work without putting them in young > gen, and further improve eager reclamation in general. > Considering only the "avoid humongous object fragmentation" area, yes :) Thanks, Thomas From stefan.karlsson at oracle.com Wed Jan 22 10:20:36 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 22 Jan 2020 11:20:36 +0100 Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: <03e96d75-4449-460a-fbc9-a6e1d5f7639c@oracle.com> References: <1cb1e7ea-45dd-6a36-1731-94fe1fe25244@oracle.com> <03e96d75-4449-460a-fbc9-a6e1d5f7639c@oracle.com> Message-ID: <6ddc429b-a10d-7ec8-0fbb-1e82f7d15dfa@oracle.com> Thanks, Per. StefanK On 2020-01-22 10:57, Per Liden wrote: > Looks good to me.
> > /Per > > On 1/17/20 2:31 PM, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to remove the automatic "is in heap" >> verification from OopIterateClosure. >> >> https://cr.openjdk.java.net/~stefank/8237363/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8237363 >> >> OopIterateClosure provides some automatic verification that loaded >> objects are inside the heap. Closures can opt out from this by >> overriding should_verify_oops(). >> >> I propose that we move this verification, and the way to turn it off, >> and instead let the implementations of the closures decide the kind of >> verification that is appropriate. I want to do this to de-clutter the >> closure APIs a bit. >> >> I've gone through all OopIterateClosures that don't override >> should_verify_oops() and added calls to >> assert_oop_field_points_to_object_in_heap[_or_null] where the closures >> didn't have equivalent checks. >> >> A lot of the places didn't explicitly check that the object is within >> the heap but they would check for other things like: >> - Is the corresponding bit index within the range >> - Is the heap region index within range >> - Is the object in the reserved heap range (weaker than is_in) >> >> I've added asserts to those places. If you think I should remove some >> of them, please let me know. >> >> Tested with tier1-3 >> >> Thanks, >> StefanK From thomas.schatzl at oracle.com Wed Jan 22 10:46:26 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 22 Jan 2020 11:46:26 +0100 Subject: [14] RFR (S): 8237079: gc/g1/mixedgc/TestLogging.java fails with "Pause Young (Mixed) (G1 Evacuation Pause) not found" In-Reply-To: References: Message-ID: <39d5a2fd-0a5d-97dd-3ecd-106dbdb8434f@oracle.com> Hi all, On 17.01.20 16:07, Leo Korinth wrote: > Hi Thomas, > > This is not a review. I took it as one anyway ;) > This code is basically the same code as is > duplicated at least three times in the test code.
One of the > duplications you can blame me for, *sorry*. I believe it should be moved > to a common library method. I also believe the last fix you did in > TestG1ParallelPhases.java makes that version look cleaner than what you > propose here (it does not need the last allocation loop at all). > > How about using the TestG1ParallelPhases.java version for all three test > cases? If not, do the third version in TestOldGenCollectionUsage really > work??? > Here's a webrev incorporating these suggestions to unify the code: http://cr.openjdk.java.net/~tschatzl/8237079/webrev.1 (full) There is no point to provide a diff webrev here as the whole change has been redone. - factor out and use a MixedGCProvoker class in all three of those tests. - some changes in the various tests to align their options a bit more and stabilize them - TestOldCollectionUsage assumed that there were no previous old gen allocations; actually it failed if there were. Since we can't guarantee that, loosened the condition to require update of the mixed gc usage only. - for gc/g1/mixedgc/TestLogging.java removed the need to match the whole log message including the "G1 Evacuation Pause" gc cause message. It did not seem to be the point of the test to check that the mixed gc has been caused "naturally" by eden exhaustion or via whitebox.
Thanks, Thomas From aph at redhat.com Wed Jan 22 10:52:52 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 22 Jan 2020 10:52:52 +0000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: <23d3db9d-6603-c10c-8240-62cd82f4bae9@oracle.com> References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> <88f97b92-df9e-140c-a972-44982ae3f79b@redhat.com> <23d3db9d-6603-c10c-8240-62cd82f4bae9@oracle.com> Message-ID: <12ca6024-bec5-8bd1-57d8-5a880fd5ad96@redhat.com> On 1/15/20 1:00 AM, David Holmes wrote: > On 15/01/2020 2:15 am, Andrew Haley wrote: >> On 1/14/20 3:52 PM, Doerr, Martin wrote: >> >>> good catch. I think you're right. A multi-copy-atomic, but weak >>> architecture (e.g. aarch64) needs an instruction which orders both >>> volatile loads. >> >> Good, I thought so. >> >> Given that TSO machines define OrderAccess::acquire() as no more than >> a compiler barrier, I believe that we could do something like >> >> #ifdef CPU_MULTI_COPY_ATOMIC >> OrderAccess::acquire(); >> #else >> OrderAccess::fence(); >> #endif > > "acquire" isn't used to order loads it is used to pair with a "release" > associated with the store of the variable now being loaded. > > If this is the code referred to: > > Age oldAge = _age.get(); > // Architectures with weak memory model require a barrier here > // to guarantee that bottom is not older than age, > // which is crucial for the correctness of the algorithm. > #ifndef CPU_MULTI_COPY_ATOMIC > OrderAccess::fence(); > #endif > uint localBot = Atomic::load_acquire(&_bottom); > > then I think there is an assumption (perhaps incorrect) that the > load_acquire will prevent reordering as well as performing the necessary > "acquire" semantics. It depends on how _age is written to. 
As far as I can see there is no ordering between setting _bottom and setting _age, void set_empty() { _bottom = 0; _age.set(0); } so it looks like any kind of fence on the reader side is pointless anyway. In that case, I don't know why we're doing any of this if it doesn't matter what order the reader threads see updates to _age and _bottom. It's all rather baffling. _bottom is declared volatile, as is _age, so I guess there must be some ordering requirements, but no fences on the writing side to enforce it. What actually are the ordering requirements between _bottom and _age? -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From thomas.schatzl at oracle.com Wed Jan 22 11:02:51 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 22 Jan 2020 12:02:51 +0100 Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: <1cb1e7ea-45dd-6a36-1731-94fe1fe25244@oracle.com> References: <1cb1e7ea-45dd-6a36-1731-94fe1fe25244@oracle.com> Message-ID: <20587626-5998-d756-5c9b-893ce42f40cd@oracle.com> Hi, On 17.01.20 14:31, Stefan Karlsson wrote: > Hi all, > > Please review this patch to remove the automatic "is in heap" > verification from OopIterateClosure. > > https://cr.openjdk.java.net/~stefank/8237363/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8237363 > > OopIterateClosure provides some automatic verification that loaded > objects are inside the heap. Closures can opt out from this by > overriding should_verify_oops(). > > I propose that we move this verification, and the way to turn it off, > and instead let the implementations of the closures decide the kind of > verification that is appropriate. I want to do this to de-clutter the > closure APIs a bit. > While the change is correct, I am not really convinced it is a good idea to trade verification in one place to the same verification in many places.
The closure API does not seem to be particularly "cluttered up" by this particular API to me. It is true that other code typically has many other asserts that would fail anyway, but it would be an additional safety net when writing new closures. This is not a hard no for this change, but is there something else you are planning to do in this area where this code would be in the way? > I've gone through all OopIterateClosures that don't override > should_verify_oops() and added calls to > assert_oop_field_points_to_object_in_heap[_or_null] where the closures > didn't have equivalent checks. > > A lot of the places didn't explicitly check that the object is within > the heap but they would check for other things like: > - Is the corresponding bit index within the range > - Is the heap region index within range > - Is the object in the reserved heap range (weaker than is_in) > > I've added asserts to those places. If you think I should remove some of > them, please let me know. > > Tested with tier1-3 > Thanks, Thomas From maoliang.ml at alibaba-inc.com Wed Jan 22 11:57:06 2020 From: maoliang.ml at alibaba-inc.com (Liang Mao) Date: Wed, 22 Jan 2020 19:57:06 +0800 Subject: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics Message-ID: <90aa2259-afce-44af-abb2-31700caea4a0.maoliang.ml@alibaba-inc.com> Hi Thomas, I just uploaded the new patch for SoftMaxHeapSize. The shrink works immediately. The concurrent uncommit will be in a different patch. http://cr.openjdk.java.net/~luchsh/8236073.webrev.2/ Thanks, Liang ------------------------------------------------------------------ From:MAO, Liang Send Time:2020 Jan. 16 (Thu.) 11:21 To:Thomas Schatzl ; hotspot-gc-dev Subject:Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics Hi Thomas, Yes. We can focus on the current concurrent shrinking for now.
You are right that changing the default behavior will be sensitive since you need to cover all types of applications including throughput and low-latency while our previous patch is mostly designed for low-latency. We'll figure this out later:) Thanks, Liang ------------------------------------------------------------------ From:Thomas Schatzl Send Time:2020 Jan. 16 (Thu.) 01:57 To:"MAO, Liang" ; hotspot-gc-dev Subject:Re: RFR: 8236073: G1: Use SoftMaxHeapSize to guide GC heuristics Hi, On Wed, 2020-01-15 at 20:53 +0800, Liang Mao wrote: > Hi Thomas, > > So G1 doesn't need to shrink below Xms if SoftMaxHeapSize is > below Xms, does it? > No, never shrink below MinHeapSize. > Another question is that no matter we have an additional option we > had better have 2 criteria. The first is for urgent expansion that > GCTimeRatio is quite low and concurrent expansion with frequent GCs > is more harmful and expansion should be done immediately. It's the > current default flow as we found that 12 is a good number below which > applications can obviously incur timeout errors. The second is to > keep the GCTimeRatio and memory footprint in a balanced state so > any adjustments are better to be concurrent. The original number 99 > fits well here. If we have only one option "GCTimeRatio", we might > not be able to achieve both. Maybe we can have a LowGCTimeRatio below > which is supposed to be not acceptable and a HighTimeRatio which is > certainly healthy. So far the change has been about shrinking the heap concurrently, and not expansion. Let's concentrate on the issue at hand, i.e. see how heap shrinking at more places turns out. I believe there will be lots of tweaking needed for this change to not show too many regressions in other applications. Remember that the defaults should work well for a large body of applications, not just a few. There may be knobs to tune it for others.
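The two-threshold idea quoted above could be sketched roughly like this (a purely hypothetical illustration: LowGCTimeRatio and HighGCTimeRatio are not existing HotSpot flags, and the classification is made up to show the shape of such a policy):

```cpp
#include <cassert>

// Classify the measured mutator-to-GC time ratio against two hypothetical
// thresholds: below the low threshold GC overhead is unacceptable and the
// heap should be expanded immediately; between the thresholds the heap is
// out of balance and can be adjusted concurrently; above the high
// threshold the footprint may even be reduced.
enum class SizingAction { UrgentExpand, ConcurrentAdjust, SteadyOrShrink };

SizingAction classify_gc_time_ratio(double measured_ratio,
                                    double low_ratio /* e.g. 12 */,
                                    double high_ratio /* e.g. 99 */) {
  if (measured_ratio < low_ratio) {
    return SizingAction::UrgentExpand;
  }
  if (measured_ratio < high_ratio) {
    return SizingAction::ConcurrentAdjust;
  }
  return SizingAction::SteadyOrShrink;
}
```

The example thresholds 12 and 99 are taken from the numbers mentioned in the quoted mail; whether a real policy would use exactly this split is one of the open tuning questions discussed here.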
Then look concurrent expansion, at application phase changes in the application, how to detect, and how to react best. Just for reference, last time we changed the sizing algorithm it took a few months to get it "right", with mostly improvements all around. Thanks, Thomas From david.holmes at oracle.com Wed Jan 22 11:59:21 2020 From: david.holmes at oracle.com (David Holmes) Date: Wed, 22 Jan 2020 21:59:21 +1000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: <12ca6024-bec5-8bd1-57d8-5a880fd5ad96@redhat.com> References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> <88f97b92-df9e-140c-a972-44982ae3f79b@redhat.com> <23d3db9d-6603-c10c-8240-62cd82f4bae9@oracle.com> <12ca6024-bec5-8bd1-57d8-5a880fd5ad96@redhat.com> Message-ID: <6ef9347b-8139-0eb7-7150-8dce4b5e4dc6@oracle.com> On 22/01/2020 8:52 pm, Andrew Haley wrote: > On 1/15/20 1:00 AM, David Holmes wrote: >> On 15/01/2020 2:15 am, Andrew Haley wrote: >>> On 1/14/20 3:52 PM, Doerr, Martin wrote: >>> >>>> good catch. I think you're right. A multi-copy-atomic, but weak >>>> architecture (e.g. aarch64) needs an instruction which orders both >>>> volatile loads. >>> >>> Good, I thought so. >>> >>> Given that TSO machines define OrderAccess::acquire() as no more than >>> a compiler barrier, I believe that we could do something like >>> >>> #ifdef CPU_MULTI_COPY_ATOMIC >>> OrderAccess::acquire(); >>> #else >>> OrderAccess::fence(); >>> #endif >> >> "acquire" isn't used to order loads it is used to pair with a "release" >> associated with the store of the variable now being loaded. >> >> If this is the code referred to: >> >> Age oldAge = _age.get(); >> // Architectures with weak memory model require a barrier here >> // to guarantee that bottom is not older than age, >> // which is crucial for the correctness of the algorithm. 
>> #ifndef CPU_MULTI_COPY_ATOMIC >> OrderAccess::fence(); >> #endif >> uint localBot = Atomic::load_acquire(&_bottom); >> >> then I think there is an assumption (perhaps incorrect) that the >> load_acquire will prevent reordering as well as performing the necessary >> "acquire" semantics. > > It depends on how _age is written to. > > As far as I can see there is no ordering between setting _bottom and setting > _age, > > void set_empty() { > _bottom = 0; > _age.set(0); > } > > so it looks like any kind of fence on the reader side is pointless anyway. In > that case, I don't know why we're doing any of this if it doesn't matter > what order the reader threads see updates to _age and _bottom. > > It's all rather baffling. _bottom is declared volatile, as is _age, so I > guess there must be some ordering requirements, but no fences on the > writing side to enforce it. > > What actually are the ordering requirements between _bottom and _age? I'm assuming the ordering requirement is to preserve the order as expressed in the code. There is likely an assumption that by declaring both as volatile that the the compiler will not reorder them; and that the load_acquire will prevent the hardware from reordering them. I'm not sure if either of those assumptions are actually valid. But that doesn't explain the complete lack of barriers in set_empty. The GC folk will need to chime in on the detailed semantic requirements of this algorithm. 
David From stefan.karlsson at oracle.com Wed Jan 22 13:16:53 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 22 Jan 2020 14:16:53 +0100 Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: <20587626-5998-d756-5c9b-893ce42f40cd@oracle.com> References: <1cb1e7ea-45dd-6a36-1731-94fe1fe25244@oracle.com> <20587626-5998-d756-5c9b-893ce42f40cd@oracle.com> Message-ID: <246ce191-0f61-1bd5-caec-71299bfcebef@oracle.com> On 2020-01-22 12:02, Thomas Schatzl wrote: > Hi, > > On 17.01.20 14:31, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to remove the automatic "is in heap" >> verification from OopIterateClosure. >> >> https://cr.openjdk.java.net/~stefank/8237363/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8237363 >> >> OopIterateClosure provides some automatic verification that loaded >> objects are inside the heap. Closures can opt out from this by >> overriding should_verify_oops(). >> >> I propose that we move this verification, and the way to turn it off, >> and instead let the implementations of the closures decide the kind of >> verification that is appropriate. I want to do this to de-clutter the >> closure APIs a bit. >> > > While the change is correct, I am not really convinced it is a good idea > to trade verification in one place to the same verification in many place. An alternative would be to simply remove the verification altogether. As I said, we almost always check the result of the object address. > > The closure API does not seem to be particularly "cluttered up" by this > particular API to me. It's a slippery slope. Previously, we had a lot of GC specific functions in these interfaces. I've been cleaning this over the years, and this is one of the last non-essential parts of that interface that implementors need to consider. With my removal people don't have to think about this anymore. 
It is true that other code typically has many > other asserts that would fail anyway, but it would be an additional > safety net when writing new closures. It's a safety net that works for G1, but almost always incorrectly trips the assert with ZGC. > > This is not a hard no for this change, but is there something else you > are planning to do in this area where this code would be in the way? No. StefanK > >> I've gone through all OopIterateClosures that don't override >> should_verify_oops() and added calls to >> assert_oop_field_points_to_object_in_heap[_or_null] where the closures >> didn't have equivalent checks. >> >> A lot of the places didn't explicitly check that the object is within >> the heap but they would check for other things like: >> - Is the corresponding bit index within the range >> - Is the heap region index within range >> - Is the object in the reserved heap range (weaker than is_in) >> >> I've added asserts to those places. If you think I should remove some >> of them, please let me know. >> >> Tested with tier1-3 >> > > Thanks, > Thomas > From stefan.karlsson at oracle.com Wed Jan 22 14:02:32 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Wed, 22 Jan 2020 15:02:32 +0100 Subject: RFR: 8237645: Remove OopsInGenClosure::par_do_barrier Message-ID: <34dcda2a-ef2b-65e0-b6f2-fae553d95983@oracle.com> Hi all, Please review this patch to remove some dead code after the CMS removal.
https://cr.openjdk.java.net/~stefank/8237645/webrev.01/ https://bugs.openjdk.java.net/browse/JDK-8237645 Thanks, StefanK From rkennke at redhat.com Wed Jan 22 14:33:17 2020 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 22 Jan 2020 15:33:17 +0100 Subject: [15] RFR 8236880: Shenandoah: Move string dedup cleanup into concurrent phase In-Reply-To: <837e7210-0bd7-e06f-907b-7c5fcc3c3684@redhat.com> References: <837e7210-0bd7-e06f-907b-7c5fcc3c3684@redhat.com> Message-ID: Hi Zhengyu, Would it be possible to use scoped lockers instead in: src/hotspot/share/gc/shenandoah/shenandoahRootProcessor.cpp The rest looks ok to me. Thanks, Roman > Please review this patch that moves string deduplication cleanup task > into concurrent phase. > > The cleanup task comprises two subtasks: StringDedupTable and > StringDedupQueue cleanup. > > Concurrent StringDedupTable cleanup is very straightforward. GC takes > StringDedupTable_lock to block out mutators from modifying the table, > then performs multi-thread cleanup, just as it does at STW pause. > > Concurrent StringDedupQueue cleanup is more complicated. GC takes > StringDedupQueue_lock, only blocks queue structure changes, while > mutators can still enqueue new string candidates and dedup thread can > still perform deduplication. So there are a couple of synchronizations > that need to be established. > > 1) When mutator enqueues a candidate, the enqueued oop should be valid > before the slot can be made visible to GC threads. > > 2) When GC thread updates oop, it needs to make sure that dedup thread > does not see partially updated oop. > > The implementation uses load_acquire/release_store pair to ensure above > synchronization held. > > GC threads may miss some just enqueued oops by mutators. This is not a > concern, since LRB guarantees they are in to-space. > > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8236880 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236880/webrev.00/ > > > Test: >
hotspot_gc_shenandoah with -XX:+UseStringDeduplication > (fastdebug and release) on x86_64 and aarch64 Linux > > Thanks, > > -Zhengyu > From zgu at redhat.com Wed Jan 22 14:37:12 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 22 Jan 2020 09:37:12 -0500 Subject: [15] RFR 8236880: Shenandoah: Move string dedup cleanup into concurrent phase In-Reply-To: References: <837e7210-0bd7-e06f-907b-7c5fcc3c3684@redhat.com> Message-ID: <06325fc4-3ae6-c25c-d293-47f58962417d@redhat.com> Hi Roman, Thanks for the review. On 1/22/20 9:33 AM, Roman Kennke wrote: > Hi Zhengyu, > > Would it be possible to use scoped lockers instead in: > > src/hotspot/share/gc/shenandoah/shenandoahRootProcessor.cpp > They are conditional and somewhat already scoped, e.g. lock in constructor and unlock in destructor. -Zhengyu > The rest looks ok to me. > > Thanks, > Roman > >> Please review this patch that moves string deduplication cleanup task >> into concurrent phase. >> >> The cleanup task composites two subtasks: StringDedupTable and >> StringDedupQueue cleanup. >> >> Concurrent StringDedupTable cleanup is very straightforward. GC takes >> StringDedupTable_lock to block out mutators from modifying the table, >> then performs multi-thread cleanup, just as it does at STW pause. >> >> Concurrent StringDedupQueue cleanup is more complicated. GC takes >> StringDedupQueue_lock, only blocks queue structure changes, while >> mutators can still enqueue new string candidates and dedup thread can >> still perform deduplication. So there are a couple of synchronizations >> need to be established. >> >> 1) When mutator enqueues a candidate, the enqueued oop should be valid >> before the slot can be made visible to GC threads. >> >> 2) When GC thread updates oop, it needs to make sure that dedup thread >> does not see partially updated oop. >> >> The implementation uses load_acquire/release_store pair to ensure above >> synchronization held. >> >> GC threads may miss some just enqueued oops by mutators.
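[Editorial aside: the load_acquire/release_store publication pattern described in the quoted RFR can be sketched as a toy model. Names and the counter-based publish scheme below are invented for illustration; this is not the Shenandoah StringDedupQueue code.]

```cpp
#include <atomic>
#include <cassert>

// Toy publish pattern: a mutator release-stores so a slot only becomes
// visible to GC/dedup threads once the entry in it is fully valid.
struct ToyDedupQueue {
  static const int N = 16;
  std::atomic<void*> _slots[N];
  std::atomic<int> _top{0};

  bool enqueue(void* oop) {        // mutator side
    int t = _top.load(std::memory_order_relaxed);
    if (t >= N) return false;      // toy queue is full
    _slots[t].store(oop, std::memory_order_relaxed);
    // release: the slot contents are written before the count is bumped
    _top.store(t + 1, std::memory_order_release);
    return true;
  }

  void* load_slot(int i) const {   // GC/dedup thread side
    // acquire: pairs with the release store above, so if we observe the
    // new count we also observe the fully written slot
    int t = _top.load(std::memory_order_acquire);
    return (i < t) ? _slots[i].load(std::memory_order_relaxed) : nullptr;
  }
};
```

A reader that races with enqueue may miss the newest entry (it sees the old count), which mirrors the RFR's note that GC threads may miss just-enqueued oops.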
This is not a >> concern, since LRB guarantees they are in to-space. >> >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8236880 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236880/webrev.00/ >> >> >> Test: >> ? hotspot_gc_shenandoah with -XX:+UseStringDeduplication >> ? (fastdebug and release) on x86_64 and aarch64 Linux >> >> Thanks, >> >> -Zhengyu >> > From martin.doerr at sap.com Wed Jan 22 15:01:01 2020 From: martin.doerr at sap.com (Doerr, Martin) Date: Wed, 22 Jan 2020 15:01:01 +0000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: <6ef9347b-8139-0eb7-7150-8dce4b5e4dc6@oracle.com> References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> <88f97b92-df9e-140c-a972-44982ae3f79b@redhat.com> <23d3db9d-6603-c10c-8240-62cd82f4bae9@oracle.com> <12ca6024-bec5-8bd1-57d8-5a880fd5ad96@redhat.com> <6ef9347b-8139-0eb7-7150-8dce4b5e4dc6@oracle.com> Message-ID: Hi Andrew and David, the scenario for which these barriers are needed is not so trivial: Thread1: set bottom (push) Thread2: read age, read bottom, set age (pop_global) Thread3: read age, read bottom (pop_global) The requirement is that Thread3 must never read an older bottom value than Thread2 after Thread3 has seen the age value from Thread2. The age is updated by cmpxchg in pop_global which already implies strict ordering so there's no extra release barrier. I'd rather choose OrderAccess::loadload in pop_global, but I don't think it's a big deal. I'll be glad if somebody from GC can double-check if I remembered this stuff correctly. Best regards, Martin > -----Original Message----- > From: David Holmes > Sent: Mittwoch, 22. 
Januar 2020 12:59 > To: Andrew Haley ; Doerr, Martin > ; Derek White ; hotspot- > gc-dev at openjdk.java.net; Kim Barrett > Subject: Re: RFR(S): 8229422: Taskqueue: Outdated selection of weak > memory model platforms > > On 22/01/2020 8:52 pm, Andrew Haley wrote: > > On 1/15/20 1:00 AM, David Holmes wrote: > >> On 15/01/2020 2:15 am, Andrew Haley wrote: > >>> On 1/14/20 3:52 PM, Doerr, Martin wrote: > >>> > >>>> good catch. I think you're right. A multi-copy-atomic, but weak > >>>> architecture (e.g. aarch64) needs an instruction which orders both > >>>> volatile loads. > >>> > >>> Good, I thought so. > >>> > >>> Given that TSO machines define OrderAccess::acquire() as no more than > >>> a compiler barrier, I believe that we could do something like > >>> > >>> #ifdef CPU_MULTI_COPY_ATOMIC > >>> OrderAccess::acquire(); > >>> #else > >>> OrderAccess::fence(); > >>> #endif > >> > >> "acquire" isn't used to order loads it is used to pair with a "release" > >> associated with the store of the variable now being loaded. > >> > >> If this is the code referred to: > >> > >> Age oldAge = _age.get(); > >> // Architectures with weak memory model require a barrier here > >> // to guarantee that bottom is not older than age, > >> // which is crucial for the correctness of the algorithm. > >> #ifndef CPU_MULTI_COPY_ATOMIC > >> OrderAccess::fence(); > >> #endif > >> uint localBot = Atomic::load_acquire(&_bottom); > >> > >> then I think there is an assumption (perhaps incorrect) that the > >> load_acquire will prevent reordering as well as performing the necessary > >> "acquire" semantics. > > > > It depends on how _age is written to. > > > > As far as I can see there is no ordering between setting _bottom and > setting > > _age, > > > > void set_empty() { > > _bottom = 0; > > _age.set(0); > > } > > > > so it looks like any kind of fence on the reader side is pointless anyway. 
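[Editorial aside: the reader-side sequence under discussion can be modeled in isolation with C++11 atomics. This is a deliberately simplified, single-producer toy, not the HotSpot TaskQueue code; the names, the plain unsigned "age", and the unconditional full fence standing in for the CPU_MULTI_COPY_ATOMIC case are all illustrative only.]

```cpp
#include <atomic>
#include <cassert>

struct ToyTaskQueue {
  static const int N = 8;
  int _elems[N];
  std::atomic<unsigned> _bottom{0};
  std::atomic<unsigned> _age{0};   // stands in for the (top, tag) pair

  void push(int e) {               // owner thread only
    unsigned b = _bottom.load(std::memory_order_relaxed);
    _elems[b % N] = e;
    // release: the element is visible before the new bottom
    _bottom.store(b + 1, std::memory_order_release);
  }

  bool pop_global(int& e) {        // stealing threads
    unsigned age = _age.load(std::memory_order_acquire);
    // The barrier under discussion: ensure bottom is not read "older"
    // than age on weakly ordered, non-multi-copy-atomic CPUs.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    unsigned b = _bottom.load(std::memory_order_acquire);
    if (b <= age) return false;    // queue looks empty
    e = _elems[age % N];
    // cmpxchg on age publishes the claim; failure means a racing steal won
    return _age.compare_exchange_strong(age, age + 1);
  }
};
```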
In > > that case, I don't know why we're doing any of this if it doesn't matter > > what order the reader threads see updates to _age and _bottom. > > > > It's all rather baffling. _bottom is declared volatile, as is _age, so I > > guess there must be some ordering requirements, but no fences on the > > writing side to enforce it. > > > > What actually are the ordering requirements between _bottom and _age? > > I'm assuming the ordering requirement is to preserve the order as > expressed in the code. There is likely an assumption that by declaring > both as volatile that the the compiler will not reorder them; and that > the load_acquire will prevent the hardware from reordering them. I'm not > sure if either of those assumptions are actually valid. > > But that doesn't explain the complete lack of barriers in set_empty. > > The GC folk will need to chime in on the detailed semantic requirements > of this algorithm. > > David From rkennke at redhat.com Wed Jan 22 15:57:30 2020 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 22 Jan 2020 16:57:30 +0100 Subject: [15] RFR 8236880: Shenandoah: Move string dedup cleanup into concurrent phase In-Reply-To: <06325fc4-3ae6-c25c-d293-47f58962417d@redhat.com> References: <837e7210-0bd7-e06f-907b-7c5fcc3c3684@redhat.com> <06325fc4-3ae6-c25c-d293-47f58962417d@redhat.com> Message-ID: Hi Zhengyu, >> Hi Zhengyu, >> >> Would it be possible to use scoped lockers instead in: >> >> src/hotspot/share/gc/shenandoah/shenandoahRootProcessor.cpp >> > > They are conditional and somewhat already scoped, e.g. lock in > constructor and unlock i destructor. Hmmhmm. Ok then. Roman > > -Zhengyu > >> The rest looks ok to me. >> >> Thanks, >> Roman >> >>> Please review this patch that moves string deduplication cleanup task >>> into concurrent phase. >>> >>> The cleanup task composites two subtasks: StringDedupTable and >>> StringDedupQueue cleanup. >>> >>> Concurrent StringDedupTable cleanup is very straightforward. 
GC takes >>> StringDedupTable_lock to block out mutators from modifying the table, >>> then performs multi-thread cleanup, just as it does at STW pause. >>> >>> Concurrent StringDedupQueue cleanup is more complicated. GC takes >>> StringDedupQueue_lock, only blocks queue structure changes, while >>> mutators can still enqueue new string candidates and dedup thread can >>> still perform deduplication. So there are a couple of synchronizations >>> need to be established. >>> >>> 1) When mutator enqueues a candidate, the enqueued oop should be valid >>> before the slot can be made visible to GC threads. >>> >>> 2) When GC thread updates oop, it needs to make sure that dedup thread >>> does not see partially updated oop. >>> >>> The implementation uses load_acquire/release_store pair to ensure above >>> synchronization held. >>> >>> GC threads may miss some just enqueued oops by mutators. This is not a >>> concern, since LRB guarantees they are in to-space. >>> >>> >>> >>> Bug: https://bugs.openjdk.java.net/browse/JDK-8236880 >>> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236880/webrev.00/ >>> >>> >>> Test: >>> ?? hotspot_gc_shenandoah with -XX:+UseStringDeduplication >>> ?? (fastdebug and release) on x86_64 and aarch64 Linux >>> >>> Thanks, >>> >>> -Zhengyu >>> >> > From leo.korinth at oracle.com Wed Jan 22 16:02:06 2020 From: leo.korinth at oracle.com (Leo Korinth) Date: Wed, 22 Jan 2020 17:02:06 +0100 Subject: [14] RFR (S): 8237079: gc/g1/mixedgc/TestLogging.java fails with "Pause Young (Mixed) (G1 Evacuation Pause) not found" In-Reply-To: <39d5a2fd-0a5d-97dd-3ecd-106dbdb8434f@oracle.com> References: <39d5a2fd-0a5d-97dd-3ecd-106dbdb8434f@oracle.com> Message-ID: <0f272a73-d817-150e-ed68-687737445580@oracle.com> On 22/01/2020 11:46, Thomas Schatzl wrote: > Hi all, > > On 17.01.20 16:07, Leo Korinth wrote: >> Hi Thomas, >> >> This is not a review. 
This code is basically the same code as is > > I took it as one anyway ;) > >> duplicated at least three times in the test code. One of the >> duplications you can blame me for, *sorry*. I believe it should be >> moved to a common library method. I also believe the last fix you did >> in TestG1ParallelPhases.java makes that version look cleaner than what >> you propose here (it does not need the last allocation loop at all). >> >> How about using the TestG1ParallelPhases.java version for all three >> test cases? If not, do the third version in TestOldGenCollectionUsage >> really work??? >> > > Here's a webrev incorporating these suggestions to unify the code: > > http://cr.openjdk.java.net/~tschatzl/8237079/webrev.1 (full) > > There is no point in providing a diff webrev here as the whole change has > been redone. > > - factor out and use a MixedGCProvoker class in all three of those tests. > - some changes in the various tests to align their options a bit more > and stabilize them > - TestOldCollectionUsage assumed that there were no previous old gen > allocations, actually it failed if there were. Since we can't guarantee > that, loosened the condition to require update of the mixed gc usage only. > - for gc/g1/mixedgc/Testlogging.java removed the need to match the > whole log message including the "G1 Evacuation Pause" gc cause message. > It did not seem to be the point of the test to check that the mixed gc has > been caused "naturally" by eden exhaustion or via whitebox. I think this looks *really* good. Also, thanks for taking the extra time to make it shared and reusable code. /Leo > > Thanks, >
Thomas > From thomas.schatzl at oracle.com Wed Jan 22 16:12:39 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 22 Jan 2020 17:12:39 +0100 Subject: RFR: 8237143: Eliminate DirtyCardQ_cbl_mon In-Reply-To: <745E91C1-AE1A-4DA2-80EE-59B70897F4BF@oracle.com> References: <745E91C1-AE1A-4DA2-80EE-59B70897F4BF@oracle.com> Message-ID: Hi Kim, On 16.01.20 09:51, Kim Barrett wrote: > Please review this change to eliminate the DirtyCardQ_cbl_mon. This > is one of the two remaining super-special "access" ranked mutexes. > (The other is the Shared_DirtyCardQ_lock, whose elimination is covered > by JDK-8221360.) > > There are three main parts to this change. > > (1) Replace the under-a-lock FIFO queue in G1DirtyCardQueueSet with a > lock-free FIFO queue. > > (2) Replace the use of a HotSpot monitor for signaling activation of > concurrent refinement threads with a semaphore-based solution. > > (3) Handle pausing of buffer refinement in the middle of a buffer in > order to handle a pending safepoint request. This can no longer just > push the partially processed buffer back onto the queue, due to ABA > problems now that the buffer is lock-free. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8237143 > > Webrev: > https://cr.openjdk.java.net/~kbarrett/8237143/open.00/ > > Testing: > mach5 tier1-5 > Normal performance testing showed no significant change. > specjbb2015 on a very big machine showed a 3.5% average critical-jOPS > improvement, though not statistically significant; removing contention > for that lock by many hardware threads may be a little bit noticeable. > > initial comments only, and so far only about comments :( The code itself looks good to me, but I want to look over it again. - noticed at least in maybe_deactivate, the description of the method and parameter and return value documentation is in the cpp file at the definition. I would really prefer this information in the hpp file for easier reference. 
- documentation about the inner workings of the class should imho be put in the hpp file too (in the class documentation). Same as documentation about member variables in seemingly random places now. This makes the code unusually hard to read and reference for me - i.e. you can easily flip to the definition (hpp file) using a click (or accelerator key) when on the identifier to see how stuff is supposed to be used (or what it is used for) than literally grepping through the cpp file (and these blocks of documentation even reference each other). Also, IDEs show comments on methods in the hpp as pop-ups too. E.g. the documentation of _notifier and _should_notify in g1ConcurrentRefineThread.hpp:51, or the description of G1DirtCardQueueSet::_concurrency in g1DirtyCardQueue.cpp:117 (followed by description of (Non)ConcurrentVerifier, or the huge description of _completed_buffers_{head,tail} and others in G1DirtyCardQueue.cpp:163, and the description of refinement processing in g1DirtyCardQueue.cpp:301. I mean, if a user tries to get an overview on how the class works or what some members do I would argue that you would first reference the hpp file - and the only new comment there right now is some reasoning for the padding of the new members without any comment about the purpose of these. Maybe some of the comments should be split into more generic parts, and very specific implementation details (which can stay in the cpp file where they are needed - these seem sufficient now?). - just a random note: I wasn't really happy with the name of G1ConcurrentRefineThread::_should_notify, but several attempts at renaming failed for me. It seems rather generic. - some comments on the long comment about the FIFO queue handling. // _completed_buffers_{head,tail} and _num_cards provide a lock-free // FIFO of buffers, linked through their next() fields. 
Not sure about whether _num_cards provides anything about the FIFO (it's not even the number of buffers), it seems to be solely counting the cards held in the FIFO. Which is fine to mention, but not necessarily here. Also I would put that description of _completed_buffers to the variables in the hpp files. // The key idea to make this work is that pop (get_completed_buffer) // never returns an element of the queue if it is the only accessible // element, If I understand this correctly, maybe "if there is only one buffer in the FIFO" is easier to understand than "only accessible element". (or define "accessible element"). // e.g. its "next" value is NULL. It is expected that there s/e.g./i.e.? The code seems to unnecessarily use the NULL_buffer constant. Maybe use it here too. Overall I am not sure about the usefulness of using NULL_buffer in the code. The NULL value in Hotspot code is generally accepted as a special value, and the name "NULL_buffer" does not seem to add any information. // will be a later push/append that will make that element available to I would prefer if the documentation would be consistent with the nomenclature in the code, i.e. use either append/enqueue/get throughout or some variants of push/pop (there is no method that starts with either push or pop anywhere). While I understand that you'd typically use push/pop on a FIFO I would prefer to either rename the methods or drop the use of push/pop in this documentation. Additional identical terminology seems confusing. At least it takes time to find out what is exactly what. What would be nice in this context would be that G1DirtyCardQueueSet is implemented as a FIFO of buffers, and that append/enqueue/get* methods are the usual operations. (in the hpp file, in the class description) // a future pop, or there will eventually be a complete transfer // (take_all_completed_buffers). // // An append operation atomically exchanges the new tail with the queue // tail. 
It then sets the "next" value of the old tail to the head of // the list being appended. (It is an invariant that the old tail's // "next" value is NULL.) Maybe put this invariant somewhere more prominent in the text, not as side note. // But if the old tail is NULL then the queue was // empty. In this case the head of the list being appended is instead // stored in the queue head (which must be NULL). I would mention the invariant that if old tail is NULL then head must be NULL too right next to the "old tail is NULL" sentence. // A push operation is just a degenerate append, where the buffer being // pushed is both the head and the tail of the list being appended. Defining a push operation does not seem to help at all, because the documentation always mentions the pair push/append anyway (and there is no explicit "push" method in the code). I would suggest to delete this paragraph. // // This means there is a period between the exchange and the old tail // update where the queue sequence is split into two parts, the list // from the queue head to the old tail, and the list being appended. If // there are concurrent push/append operations, each may introduce // another such segment. But they all eventually get resolved by their // respective updates of their old tail's "next" value. // // pop gets the queue head as the candidate result (returning NULL if // the queue head was NULL), and then gets that result node's "next" // value. If that "next" value is NULL and the queue head hasn't // changed, then there is only one element in the (accessible) list. We It would be nice to define the "accessible" list somewhere explicitly - or drop that property because it seems to be the standard "elements within the current head and tail" anyway. // can't return that element, because it may be the old tail of a // concurrent push/append. So return NULL in this case. 
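[Editorial aside: the append and the "never take the last element" pop described in this comment block can be sketched with portable atomics. This is a toy model only; the real G1DirtyCardQueueSet differs in detail and additionally protects pop with GlobalCounter critical sections against ABA, which the sketch omits.]

```cpp
#include <atomic>
#include <cassert>

struct Node { std::atomic<Node*> next{nullptr}; int payload{0}; };

// Toy lock-free FIFO of externally owned nodes.
struct ToyFifo {
  std::atomic<Node*> _head{nullptr};
  std::atomic<Node*> _tail{nullptr};

  // Append a pre-linked segment [first..last]; a single-node push is the
  // degenerate case first == last.
  void append(Node* first, Node* last) {
    last->next.store(nullptr, std::memory_order_relaxed);
    Node* old_tail = _tail.exchange(last);   // atomic swap of the tail
    if (old_tail == nullptr) {
      _head.store(first);                    // queue was empty
    } else {
      old_tail->next.store(first);           // link in the old segment
    }
  }

  Node* pop() {
    while (true) {
      Node* h = _head.load();
      if (h == nullptr) return nullptr;      // empty
      Node* next = h->next.load();
      if (next == nullptr) return nullptr;   // refuse the last element:
                                             // its next may still be
                                             // written by a racing append
      if (_head.compare_exchange_strong(h, next)) return h;
      // lost a race with another pop; retry
    }
  }
};
```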
Otherwise, // attempt to cmpxchg that "next" value into the queue head, retrying // the whole operation if that fails. This is the "usual" lock-free pop // from head of slist, with the additional restriction on taking the slist? ;) // last element. s/taking/popping (or "get"-ing) the last element. // In order to address the ABA problem for pop, a pop operation protects // its access to the head of the list with a GlobalCounter critical // section. This works with the buffer allocator's use of GlobalCounter // synchronization to prevent ABA from arising in the normal buffer // usage cycle. The paused buffer handling prevents another ABA source // (see record_paused_buffer and enqueue_previous_paused_buffers). - g1DirtyCardQueue.cpp: s/"// Unreachable"/ShouldNotReachHere(); (or just delete) Thanks, Thomas From thomas.schatzl at oracle.com Wed Jan 22 16:14:10 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 22 Jan 2020 17:14:10 +0100 Subject: [14] RFR (S): 8237079: gc/g1/mixedgc/TestLogging.java fails with "Pause Young (Mixed) (G1 Evacuation Pause) not found" In-Reply-To: <0f272a73-d817-150e-ed68-687737445580@oracle.com> References: <39d5a2fd-0a5d-97dd-3ecd-106dbdb8434f@oracle.com> <0f272a73-d817-150e-ed68-687737445580@oracle.com> Message-ID: Hi Leo, On 22.01.20 17:02, Leo Korinth wrote: > On 22/01/2020 11:46, Thomas Schatzl wrote: >> Hi all, >> >> On 17.01.20 16:07, Leo Korinth wrote: >>> Hi Thomas, >>> >>> This is not a review. This code is basically the same code as is >> [...] >>> >>> How about using the TestG1ParallelPhases.java version for all three >>> test cases? If not, do the third version in TestOldGenCollectionUsage >>> really work??? >>> >> >> Here's a webrev incorporating these suggestions to unify the code: >> >> http://cr.openjdk.java.net/~tschatzl/8237079/webrev.1 (full) > >> >> There is no point to provide a diff webrev here as the whole change >> has been redone. >> [...] > > I think this looks *really* good. 
Also, thanks for taking the extra time > to make it shared and reusable code. > > /Leo Thanks for your review. Thomas From aph at redhat.com Wed Jan 22 17:45:20 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 22 Jan 2020 17:45:20 +0000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: <6ef9347b-8139-0eb7-7150-8dce4b5e4dc6@oracle.com> References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> <88f97b92-df9e-140c-a972-44982ae3f79b@redhat.com> <23d3db9d-6603-c10c-8240-62cd82f4bae9@oracle.com> <12ca6024-bec5-8bd1-57d8-5a880fd5ad96@redhat.com> <6ef9347b-8139-0eb7-7150-8dce4b5e4dc6@oracle.com> Message-ID: On 1/22/20 11:59 AM, David Holmes wrote: > I'm assuming the ordering requirement is to preserve the order as > expressed in the code. There is likely an assumption that by declaring > both as volatile that the the compiler will not reorder them; and that > the load_acquire will prevent the hardware from reordering them. I'm not > sure if either of those assumptions are actually valid. The compiler won't reorder the stores, but the hardware will. > But that doesn't explain the complete lack of barriers in set_empty. > > The GC folk will need to chime in on the detailed semantic requirements > of this algorithm. OK, but this looks like a separate problem: we can deal with it later if we need to. Thanks. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. 
https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From aph at redhat.com Wed Jan 22 17:47:35 2020 From: aph at redhat.com (Andrew Haley) Date: Wed, 22 Jan 2020 17:47:35 +0000 Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms In-Reply-To: References: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com> <9dbfd063-ea45-10e0-b541-7e84d662581c@redhat.com> <88f97b92-df9e-140c-a972-44982ae3f79b@redhat.com> <23d3db9d-6603-c10c-8240-62cd82f4bae9@oracle.com> <12ca6024-bec5-8bd1-57d8-5a880fd5ad96@redhat.com> <6ef9347b-8139-0eb7-7150-8dce4b5e4dc6@oracle.com> Message-ID: <18cb433d-2694-3e2f-2a31-c850486d12c9@redhat.com> On 1/22/20 3:01 PM, Doerr, Martin wrote: > Thread1: set bottom (push) > Thread2: read age, read bottom, set age (pop_global) > Thread3: read age, read bottom (pop_global) > > The requirement is that Thread3 must never read an older bottom value than Thread2 after Thread3 has seen the age value from Thread2. OK, so all we need here is a LoadLoad between read age and read bottom in pop_global, as David Holmes said. -- Andrew Haley (he/him) Java Platform Lead Engineer Red Hat UK Ltd. https://keybase.io/andrewhaley EAC8 43EB D3EF DB98 CC77 2FAD A5CD 6035 332F A671 From shade at redhat.com Wed Jan 22 18:21:47 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Wed, 22 Jan 2020 19:21:47 +0100 Subject: [15] RFR 8236880: Shenandoah: Move string dedup cleanup into concurrent phase In-Reply-To: <837e7210-0bd7-e06f-907b-7c5fcc3c3684@redhat.com> References: <837e7210-0bd7-e06f-907b-7c5fcc3c3684@redhat.com> Message-ID: On 1/17/20 5:34 PM, Zhengyu Gu wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8236880 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236880/webrev.00/ *) shenandoahHeap.cpp: does this change relate to this RFR? // When concurrent root is in progress, weak roots may contain dead oops, // they should not be used for root scanning. 
if (is_concurrent_root_in_progress()) { Otherwise looks okay. -- Thanks, -Aleksey From zgu at redhat.com Wed Jan 22 18:27:30 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 22 Jan 2020 13:27:30 -0500 Subject: [15] RFR 8236880: Shenandoah: Move string dedup cleanup into concurrent phase In-Reply-To: References: <837e7210-0bd7-e06f-907b-7c5fcc3c3684@redhat.com> Message-ID: <46f39b06-84f4-66aa-12a6-56904fb3c085@redhat.com> On 1/22/20 1:21 PM, Aleksey Shipilev wrote: > On 1/17/20 5:34 PM, Zhengyu Gu wrote: >> Bug: https://bugs.openjdk.java.net/browse/JDK-8236880 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8236880/webrev.00/ > > *) shenandoahHeap.cpp: does this change relate to this RFR? > > // When concurrent root is in progress, weak roots may contain dead oops, > // they should not be used for root scanning. > if (is_concurrent_root_in_progress()) { Yes. However, we may need to take another look due to JDK-8237632. Thanks, -Zhengyu > > Otherwise looks okay. > > From zgu at redhat.com Wed Jan 22 20:15:22 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 22 Jan 2020 15:15:22 -0500 Subject: [15] RFR 8234399: Shenandoah: Cleanup native load barrier Message-ID: <2933db7c-f29e-ddbb-3015-05430488a180@redhat.com> Please review this cleanup of a hack, which was added to workaround the problem manifested in JDK-8237396. With JDK-8237396 resolved, let's remove it. Bug: https://bugs.openjdk.java.net/browse/JDK-8234399 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8234399/webrev.00/ Test: hotspot_gc_shenandoah (fastdebug and release) Thanks, -Zhengyu From rkennke at redhat.com Wed Jan 22 20:45:13 2020 From: rkennke at redhat.com (Roman Kennke) Date: Wed, 22 Jan 2020 21:45:13 +0100 Subject: [15] RFR 8234399: Shenandoah: Cleanup native load barrier In-Reply-To: <2933db7c-f29e-ddbb-3015-05430488a180@redhat.com> References: <2933db7c-f29e-ddbb-3015-05430488a180@redhat.com> Message-ID: <6c972c66-7a31-74f5-2c8c-b7b67013334e@redhat.com> Ok. Thank you! 
Roman > Please review this cleanup of a hack, which was added to workaround the > problem manifested in JDK-8237396. > > With JDK-8237396 resolved, let's remove it. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8234399 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8234399/webrev.00/ > > Test: > ? hotspot_gc_shenandoah (fastdebug and release) > > Thanks, > > -Zhengyu > From kim.barrett at oracle.com Wed Jan 22 22:21:32 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Wed, 22 Jan 2020 17:21:32 -0500 Subject: [14] RFR (S): 8237079: gc/g1/mixedgc/TestLogging.java fails with "Pause Young (Mixed) (G1 Evacuation Pause) not found" In-Reply-To: <39d5a2fd-0a5d-97dd-3ecd-106dbdb8434f@oracle.com> References: <39d5a2fd-0a5d-97dd-3ecd-106dbdb8434f@oracle.com> Message-ID: <3D0FC9F2-6D20-48A5-902C-9615A84C5FBF@oracle.com> > On Jan 22, 2020, at 5:46 AM, Thomas Schatzl wrote: > > Hi all, > > On 17.01.20 16:07, Leo Korinth wrote: >> Hi Thomas, >> This is not a review. This code is basically the same code as is > > I took it as one anyway ;) > >> duplicated at least three times in the test code. One of the duplications you can blame me for, *sorry*. I believe it should be moved to a common library method. I also believe the last fix you did in TestG1ParallelPhases.java makes that version look cleaner than what you propose here (it does not need the last allocation loop at all). >> How about using the TestG1ParallelPhases.java version for all three test cases? If not, do the third version in TestOldGenCollectionUsage really work??? > > Here's a webrev incorporating these suggestions to unify the code: > > http://cr.openjdk.java.net/~tschatzl/8237079/webrev.1 (full) Looks good. 
From stefan.johansson at oracle.com Thu Jan 23 09:58:11 2020 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Thu, 23 Jan 2020 10:58:11 +0100 Subject: RFR: 8237645: Remove OopsInGenClosure::par_do_barrier In-Reply-To: <34dcda2a-ef2b-65e0-b6f2-fae553d95983@oracle.com> References: <34dcda2a-ef2b-65e0-b6f2-fae553d95983@oracle.com> Message-ID: <2f87f787-8c31-c7d0-2e74-ac4e838cd32b@oracle.com> On 2020-01-22 15:02, Stefan Karlsson wrote: > Hi all, > > Please review this patch to some dead code after the CMS removal. > > https://cr.openjdk.java.net/~stefank/8237645/webrev.01/ > https://bugs.openjdk.java.net/browse/JDK-8237645 Looks good, StefanJ > > Thanks, > StefanK From stefan.karlsson at oracle.com Thu Jan 23 10:01:23 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Thu, 23 Jan 2020 11:01:23 +0100 Subject: RFR: 8237645: Remove OopsInGenClosure::par_do_barrier In-Reply-To: <2f87f787-8c31-c7d0-2e74-ac4e838cd32b@oracle.com> References: <34dcda2a-ef2b-65e0-b6f2-fae553d95983@oracle.com> <2f87f787-8c31-c7d0-2e74-ac4e838cd32b@oracle.com> Message-ID: <0d1966b4-c4b7-3b2a-a9f0-fa6ffd7000c2@oracle.com> Thanks, Stefan. I'll push this with only one Reviewer. StefanK On 2020-01-23 10:58, Stefan Johansson wrote: > > On 2020-01-22 15:02, Stefan Karlsson wrote: >> Hi all, >> >> Please review this patch to some dead code after the CMS removal. >> >> https://cr.openjdk.java.net/~stefank/8237645/webrev.01/ >> https://bugs.openjdk.java.net/browse/JDK-8237645 > Looks good, > StefanJ > >> >> Thanks, >> StefanK From per.liden at oracle.com Thu Jan 23 10:02:12 2020 From: per.liden at oracle.com (Per Liden) Date: Thu, 23 Jan 2020 11:02:12 +0100 Subject: RFR: 8237758: ZGC: Move get_mempolicy() syscall wrapper to ZSyscall Message-ID: <3496f019-6790-7c91-cf5d-62779274287b@oracle.com> System call wrappers should live in ZSyscall, but the wrapper for get_mempolicy() currently lives in ZNUMA. We should move it. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8237758 Webrev: http://cr.openjdk.java.net/~pliden/8237758/webrev.0 /Per From per.liden at oracle.com Thu Jan 23 10:02:07 2020 From: per.liden at oracle.com (Per Liden) Date: Thu, 23 Jan 2020 11:02:07 +0100 Subject: RFR: 8237649: ZGC: Improved NUMA support when using small pages Message-ID: The NUMA allocation support in ZGC works as expected only when using -XX:+UseLargePages. The reason is that, on Linux, small pages are allocated at commit/fallocate time and is controlled by the NUMA policy of the current thread, while large pages are allocated at page fault time and is controlled by the NUMA policy of the memory range. ZGC currently only sets up the NUMA policy for the memory range, which has no effect on small pages (since they are allocated by tmpfs rather than being anonymous mappings). We should fix this, so that the NUMA allocation support works equally well for small pages. Bug: https://bugs.openjdk.java.net/browse/JDK-8237649 Webrev: http://cr.openjdk.java.net/~pliden/8237649/webrev.0 /Per From thomas.schatzl at oracle.com Thu Jan 23 10:24:48 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 23 Jan 2020 11:24:48 +0100 Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: <246ce191-0f61-1bd5-caec-71299bfcebef@oracle.com> References: <1cb1e7ea-45dd-6a36-1731-94fe1fe25244@oracle.com> <20587626-5998-d756-5c9b-893ce42f40cd@oracle.com> <246ce191-0f61-1bd5-caec-71299bfcebef@oracle.com> Message-ID: Hi, On 22.01.20 14:16, Stefan Karlsson wrote: > > > On 2020-01-22 12:02, Thomas Schatzl wrote: >> Hi, >> >> On 17.01.20 14:31, Stefan Karlsson wrote: >>> Hi all, >>> >>> Please review this patch to remove the automatic "is in heap" >>> verification from OopIterateClosure. 
>>> >>> https://cr.openjdk.java.net/~stefank/8237363/webrev.01/ >>> https://bugs.openjdk.java.net/browse/JDK-8237363 >>> >>> OopIterateClosure provides some automatic verification that loaded >>> objects are inside the heap. Closures can opt out from this by >>> overriding should_verify_oops(). >>> >>> I propose that we move this verification, and the way to turn it off, >>> and instead let the implementations of the closures decide the kind >>> of verification that is appropriate. I want to do this to de-clutter >>> the closure APIs a bit. >>> >> >> While the change is correct, I am not really convinced it is a good >> idea to trade verification in one place to the same verification in >> many place. > > An alternative would be to simply remove the verification altogether. As > I said, we almost always check the result of the object address. > Emphasis on "almost". >> >> The closure API does not seem to be particularly "cluttered up" by >> this particular API to me. > > It's a slippery slope. Previously, we had a lot of GC specific functions > in these interfaces. I've been cleaning this over the years, and this is > one of the last non-essential parts of that interface that implementors > need to consider. > > With my removal people don't have to think about this anymore. But with the change people have to think about making sure to do the verification manually. This does not seem an improvement at all. > > ?It is true that other code typically has many >> other asserts that would fail anyway, but it would be an additional >> safety net when writing new closures. > > It's a safety net that works for G1, but almost always is incorrectly > trips in the assert with ZGC. > It works for all GCs (+leak profiler) but ZGC given the webrev. This does not suggest that this is GC-specific functionality at all. The verification method also seems to only uses an innocuous CollectedHeap::is_in() call that seems something very basic to support for a GC. 
What is it in ZGC that prevents CollectedHeap::is_in() from returning the expected value? And the opt-out method has been designed for unusual cases. Thanks, Thomas From per.liden at oracle.com Thu Jan 23 13:01:38 2020 From: per.liden at oracle.com (Per Liden) Date: Thu, 23 Jan 2020 14:01:38 +0100 Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: References: <1cb1e7ea-45dd-6a36-1731-94fe1fe25244@oracle.com> <20587626-5998-d756-5c9b-893ce42f40cd@oracle.com> <246ce191-0f61-1bd5-caec-71299bfcebef@oracle.com> Message-ID: <4667326e-bceb-6db9-e6c1-f078c90a2e2f@oracle.com> Hi, On 1/23/20 11:24 AM, Thomas Schatzl wrote: [...] >> It is true that other code typically has many >>> other asserts that would fail anyway, but it would be an additional >>> safety net when writing new closures. >> >> It's a safety net that works for G1, but almost always is incorrectly >> trips in the assert with ZGC. >> > > It works for all GCs (+leak profiler) but ZGC given the webrev. This > does not suggest that this is GC-specific functionality at all. The > verification method also seems to only uses an innocuous > CollectedHeap::is_in() call that seems something very basic to support > for a GC. > > What is it in ZGC that prevents CollectedHeap::is_in() from returning the > expected value? ZGC is returning the expected value. The problem here is that the verification happens _before_ the closure is applied, i.e. it asks if an oop that has not yet been fixed points into the heap. ZGC's is_in() is precise (i.e. cares about which heap view an oop points into), so an oop with a bad color is not considered to point into the heap. It's a feature, as it allows for exact verification and will catch oops with bad colors. cheers, Per > > And the opt-out method has been designed for unusual cases. > > Thanks, >
Thomas From thomas.schatzl at oracle.com Thu Jan 23 13:33:05 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 23 Jan 2020 14:33:05 +0100 Subject: RFR: 8237363: Remove automatic is in heap verification in OopIterateClosure In-Reply-To: <4667326e-bceb-6db9-e6c1-f078c90a2e2f@oracle.com> References: <1cb1e7ea-45dd-6a36-1731-94fe1fe25244@oracle.com> <20587626-5998-d756-5c9b-893ce42f40cd@oracle.com> <246ce191-0f61-1bd5-caec-71299bfcebef@oracle.com> <4667326e-bceb-6db9-e6c1-f078c90a2e2f@oracle.com> Message-ID: <97657cf1-435b-e4cf-8653-7fa29d89c310@oracle.com> Hi, On 23.01.20 14:01, Per Liden wrote: > Hi, > > On 1/23/20 11:24 AM, Thomas Schatzl wrote: > [...] >>> It is true that other code typically has many >>>> other asserts that would fail anyway, but it would be an additional >>>> safety net when writing new closures. >>> >>> It's a safety net that works for G1, but almost always is incorrectly >>> trips in the assert with ZGC. >>> >> >> It works for all GCs (+leak profiler) but ZGC given the webrev. This >> does not suggest that this is GC-specific functionality at all. The >> verification method also seems to only uses an innocuous >> CollectedHeap::is_in() call that seems something very basic to support >> for a GC. >> >> What is it in ZGC that prevents CollectedHeap::is_in() to return the >> expected value? > > ZGC is returning the expected value. The problem here is that the > verification happens _before_ the closure is applied, i.e. it asks if an > oop that has not yet been fixed points into the heap. ZGC's is_in() is > precise (i.e. cares about which heap view an oop points into), so an oop > with a bad color is not considered to point into the heap. It's a > feature, as it allows for exact verification and will catch oops with > bad colors. thanks. After some internal discussions: looks good.
Thomas From thomas.schatzl at oracle.com Thu Jan 23 16:06:29 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 23 Jan 2020 17:06:29 +0100 Subject: RFR: 8233822: VM_G1CollectForAllocation should always check for upgrade to full In-Reply-To: <5389B188-BA91-412F-A12E-0DB5A96FF0A9@oracle.com> References: <5389B188-BA91-412F-A12E-0DB5A96FF0A9@oracle.com> Message-ID: <21fc0e88-719a-5779-6855-1830b4bf325a@oracle.com> Hi Kim, On 21.01.20 09:31, Kim Barrett wrote: > Please review this G1 change to always check whether a full collection > should be performed after a non-full collection pause, e.g. the > collection needs to be "upgraded" to a full collection. There are > various conditions which can lead to needing to do that, and as the CR > suggests, we need to be consistent about checking for and performing > such an upgrade. > > This is accomplished by moving most of do_collection_pause_at_safepoint > into a helper function and changing that existing function to call the > helper, then check for and, if needed, perform a needed upgrade to a > full collection. Callers of that function are updated to remove > explicit conditional upgrading, where present. This also addresses the > surprisingly placed call in a G1-specific block of code in gc/shared > (see also JDK-8237567). > > CR: > https://bugs.openjdk.java.net/browse/JDK-8233822 > > Webrev: > https://cr.openjdk.java.net/~kbarrett/8233822/open.00/ > > Testing: > mach5 tier1-5 > Locally (linux-x64) ran modified InfiniteList.java test (allocate > small rather than arrays) and verified some upgrades occurred as > expected. Minor nit you can ignore: in g1VMOperations.cpp:129 I would have probably folded the two if's into a single one. Looks good with and without any change in this area.
Thanks for fixing this, Thomas From kim.barrett at oracle.com Thu Jan 23 19:25:58 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 23 Jan 2020 14:25:58 -0500 Subject: RFR: 8233822: VM_G1CollectForAllocation should always check for upgrade to full In-Reply-To: <21fc0e88-719a-5779-6855-1830b4bf325a@oracle.com> References: <5389B188-BA91-412F-A12E-0DB5A96FF0A9@oracle.com> <21fc0e88-719a-5779-6855-1830b4bf325a@oracle.com> Message-ID: <472779BA-E6A4-4231-8B62-9BFC381DC59D@oracle.com> > On Jan 23, 2020, at 11:06 AM, Thomas Schatzl wrote: > > Hi Kim, > > On 21.01.20 09:31, Kim Barrett wrote: >> Please review this G1 change to always check whether a full collection >> should be performed after a non-full collection pause, e.g. the >> collection needs to be "upgraded" to a full collection. There are >> various conditions which can lead to needing to do that, and as the CR >> suggests, we need to be consistent about checking for and performing >> such an upgrade. >> This is accomplished by moving most of do_collection_pause_at_safepoint >> into a helper function and changing that existing function to call the >> helper, then check for and, if needed, perform a needed upgrade to a >> full collection. Callers of that function are updated to remove >> explicit conditional upgrading, where present. This also addresses the >> surprisingly placed call in a G1-specific block of code in gc/shared >> (see also JDK-8237567). >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8233822 >> Webrev: >> https://cr.openjdk.java.net/~kbarrett/8233822/open.00/ >> Testing: >> mach5 tier1-5 >> Locally (linux-x64) ran modified InfiniteList.java test (allocate >> small rather than arrays) and verified some upgrades occurred as >> expected. > > Minor nit you can ignore: in g1VMOperations.cpp:129 I would have probably folded the two if's into a single one. Sure, I'll do that. > Looks good with and without any change in this area. > > Thanks for fixing this, > Thomas Thanks.
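As an aside for readers following the thread above: the shape of the refactoring Kim describes — a helper performs the pause, and the single public entry point checks whether an upgrade to a full collection is needed — can be sketched roughly as follows. All names and the success/failure policy here are invented for illustration; this is not the actual G1 code, for which see the webrev.

```cpp
// Hypothetical sketch of "always check for upgrade to full after a pause".
// Names and policy are invented; only the control-flow shape matters.
struct HeapSketch {
  bool incremental_pause_succeeded;  // outcome of the non-full pause
  bool full_gc_ran;

  // Helper: the former body of the pause operation.
  bool do_collection_pause_helper() {
    return incremental_pause_succeeded;
  }

  // Entry point: callers no longer perform explicit conditional upgrading;
  // the upgrade check now lives in exactly one place.
  void do_collection_pause_at_safepoint() {
    if (!do_collection_pause_helper()) {
      full_gc_ran = true;  // "upgrade" to a full collection
    }
  }
};
```

The point of centralizing the check is that every caller gets the upgrade behavior for free, instead of each caller having to remember to do it.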
From kim.barrett at oracle.com Thu Jan 23 20:10:48 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 23 Jan 2020 15:10:48 -0500 Subject: RFR: 8237143: Eliminate DirtyCardQ_cbl_mon In-Reply-To: References: <745E91C1-AE1A-4DA2-80EE-59B70897F4BF@oracle.com> Message-ID: <86BABDA8-E402-49F3-B478-ED0E70490015@oracle.com> > On Jan 22, 2020, at 11:12 AM, Thomas Schatzl wrote: > On 16.01.20 09:51, Kim Barrett wrote: >> Please review this change to eliminate the DirtyCardQ_cbl_mon. This >> is one of the two remaining super-special "access" ranked mutexes. >> (The other is the Shared_DirtyCardQ_lock, whose elimination is covered >> by JDK-8221360.) >> There are three main parts to this change. >> (1) Replace the under-a-lock FIFO queue in G1DirtyCardQueueSet with a >> lock-free FIFO queue. >> (2) Replace the use of a HotSpot monitor for signaling activation of >> concurrent refinement threads with a semaphore-based solution. >> (3) Handle pausing of buffer refinement in the middle of a buffer in >> order to handle a pending safepoint request. This can no longer just >> push the partially processed buffer back onto the queue, due to ABA >> problems now that the buffer is lock-free. >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8237143 >> Webrev: >> https://cr.openjdk.java.net/~kbarrett/8237143/open.00/ >> Testing: >> mach5 tier1-5 >> Normal performance testing showed no significant change. >> specjbb2015 on a very big machine showed a 3.5% average critical-jOPS >> improvement, though not statistically significant; removing contention >> for that lock by many hardware threads may be a little bit noticeable. > > initial comments only, and so far only about comments :( The code itself looks good to me, but I want to look over it again. After some offline discussion with Thomas, I'm doing some restructuring that makes it probably not very efficient for anyone else to do a careful review of the open.00 version.
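For readers who want a feel for the lock-free FIFO idea mentioned in the thread above: below is a deliberately tiny single-producer/single-consumer ring buffer built on C++ atomics. The queue in the actual patch is multi-producer and must cope with ABA and safepoint interactions, none of which this sketch attempts; it only illustrates the general idea of coordinating through atomic loads and stores instead of a mutex.

```cpp
#include <atomic>
#include <cstddef>

// Simplified wait-free SPSC FIFO. Capacity is N-1: one slot is left
// empty to distinguish "full" from "empty". Not the G1 implementation.
template <typename T, size_t N>
class SpscFifo {
  T buf_[N];
  std::atomic<size_t> head_{0};  // next slot to pop (consumer-owned)
  std::atomic<size_t> tail_{0};  // next slot to push (producer-owned)

public:
  bool push(const T& v) {  // producer thread only
    size_t t = tail_.load(std::memory_order_relaxed);
    size_t next = (t + 1) % N;
    if (next == head_.load(std::memory_order_acquire)) {
      return false;  // full
    }
    buf_[t] = v;
    tail_.store(next, std::memory_order_release);  // publish the element
    return true;
  }

  bool pop(T& out) {  // consumer thread only
    size_t h = head_.load(std::memory_order_relaxed);
    if (h == tail_.load(std::memory_order_acquire)) {
      return false;  // empty
    }
    out = buf_[h];
    head_.store((h + 1) % N, std::memory_order_release);  // free the slot
    return true;
  }
};
```

The acquire/release pairing is what replaces the mutex: the producer's release store of `tail_` makes the written element visible to the consumer's acquire load, and symmetrically for `head_`.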
From stuart.monteith at linaro.org Fri Jan 24 09:52:04 2020 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Fri, 24 Jan 2020 09:52:04 +0000 Subject: RFR: 8237649: ZGC: Improved NUMA support when using small pages In-Reply-To: References: Message-ID: Hello Per, I notice you've left "UseNewCode" in size_t ZPhysicalMemoryBacking::commit(size_t offset, size_t length). I presume this is accidental. The rest looks ok to me. Thanks, Stuart On Thu, 23 Jan 2020 at 10:02, Per Liden wrote: > > The NUMA allocation support in ZGC works as expected only when using > -XX:+UseLargePages. The reason is that, on Linux, small pages are > allocated at commit/fallocate time and is controlled by the NUMA policy > of the current thread, while large pages are allocated at page fault > time and is controlled by the NUMA policy of the memory range. ZGC > currently only sets up the NUMA policy for the memory range, which has > no effect on small pages (since they are allocated by tmpfs rather than > being anonymous mappings). > > We should fix this, so that the NUMA allocation support works equally > well for small pages. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237649 > Webrev: http://cr.openjdk.java.net/~pliden/8237649/webrev.0 > > /Per From per.liden at oracle.com Fri Jan 24 10:25:59 2020 From: per.liden at oracle.com (Per Liden) Date: Fri, 24 Jan 2020 11:25:59 +0100 Subject: RFR: 8237649: ZGC: Improved NUMA support when using small pages In-Reply-To: References: Message-ID: <3ad95202-b82c-9894-5408-4bbdac111827@oracle.com> Hi Stuart, On 1/24/20 10:52 AM, Stuart Monteith wrote: > Hello Per, > I notice you've left "UseNewCode" in size_t > ZPhysicalMemoryBacking::commit(size_t offset, size_t length). I > presume this is accidental. The rest looks ok to me. Oops... good catch! I'll remove that. Thanks for reviewing! 
cheers, Per > > Thanks, > Stuart > > On Thu, 23 Jan 2020 at 10:02, Per Liden wrote: >> >> The NUMA allocation support in ZGC works as expected only when using >> -XX:+UseLargePages. The reason is that, on Linux, small pages are >> allocated at commit/fallocate time and is controlled by the NUMA policy >> of the current thread, while large pages are allocated at page fault >> time and is controlled by the NUMA policy of the memory range. ZGC >> currently only sets up the NUMA policy for the memory range, which has >> no effect on small pages (since they are allocated by tmpfs rather than >> being anonymous mappings). >> >> We should fix this, so that the NUMA allocation support works equally >> well for small pages. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237649 >> Webrev: http://cr.openjdk.java.net/~pliden/8237649/webrev.0 >> >> /Per From stefan.karlsson at oracle.com Fri Jan 24 14:29:37 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 24 Jan 2020 15:29:37 +0100 Subject: RFR: 8237821: 8237637 Broke Shenandoah Message-ID: Hi all, I broke Shenandoah release builds with 8237637. During the review I was asked to do some simplification and get rid of redundant casts. I did the same for Shenandoah, but the patch wasn't complete. So, to fix the current build I've reverted all the Shenandoah changes from 8237637: https://cr.openjdk.java.net/~stefank/8237821/webrev.01.revert/ Then I only do the simple change from casting with (HeapWord*) to using cast_from_oop: https://cr.openjdk.java.net/~stefank/8237821/webrev.01.fix/ Both of these combined give this final patch: https://cr.openjdk.java.net/~stefank/8237821/webrev.01.combined/ I've compiled Shenandoah with both fastdebug and release builds, and am currently testing this locally.
Thanks, StefanK From shade at redhat.com Fri Jan 24 14:35:45 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 24 Jan 2020 15:35:45 +0100 Subject: RFR: 8237821: 8237637 Broke Shenandoah In-Reply-To: References: Message-ID: On 1/24/20 3:29 PM, Stefan Karlsson wrote: > Both these combined gives this final patch: > https://cr.openjdk.java.net/~stefank/8237821/webrev.01.combined/ Unfortunately, those casts end up being rather ugly. Please let the Shenandoah people fix the code to make it both clear and correct. -- Thanks, -Aleksey From stefan.karlsson at oracle.com Fri Jan 24 14:37:26 2020 From: stefan.karlsson at oracle.com (Stefan Karlsson) Date: Fri, 24 Jan 2020 15:37:26 +0100 Subject: RFR: 8237821: 8237637 Broke Shenandoah In-Reply-To: References: Message-ID: <4d720013-2e06-cf7f-d941-4e41cf83934b@oracle.com> On 2020-01-24 15:35, Aleksey Shipilev wrote: > On 1/24/20 3:29 PM, Stefan Karlsson wrote: >> Both these combined gives this final patch: >> https://cr.openjdk.java.net/~stefank/8237821/webrev.01.combined/ > > Unfortunately, those casts end up being rather ugly. > Please let the Shenandoah people fix the code to make it both clear and correct. Sounds good to me. StefanK > From per.liden at oracle.com Fri Jan 24 14:49:32 2020 From: per.liden at oracle.com (Per Liden) Date: Fri, 24 Jan 2020 15:49:32 +0100 Subject: RFR: 8237825: ZGC: Replace -XX:ZPath with -XX:AllocateHeapAt Message-ID: <4142fa91-5b9c-2880-7781-7774f92e56a1@oracle.com> ZGC has the option -XX:ZPath to allow a user to explicitly specify where the backing file system is located. However, after ZPath was introduced, a new generic option -XX:AllocateHeapAt was introduced. This option is used by the other GCs and has the same meaning/purpose, and it's arguably better named. There's no good reason why ZGC shouldn't use that too, instead of -XX:ZPath.
Bug: https://bugs.openjdk.java.net/browse/JDK-8237825 Webrev: http://cr.openjdk.java.net/~pliden/8237825/webrev.0 /Per From erik.osterlund at oracle.com Fri Jan 24 14:58:01 2020 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Fri, 24 Jan 2020 15:58:01 +0100 Subject: RFR: 8237649: ZGC: Improved NUMA support when using small pages In-Reply-To: References: Message-ID: Hi Per, Looks good. Thanks, /Erik On 1/23/20 11:02 AM, Per Liden wrote: > The NUMA allocation support in ZGC works as expected only when using > -XX:+UseLargePages. The reason is that, on Linux, small pages are > allocated at commit/fallocate time and is controlled by the NUMA > policy of the current thread, while large pages are allocated at page > fault time and is controlled by the NUMA policy of the memory range. > ZGC currently only sets up the NUMA policy for the memory range, which > has no effect on small pages (since they are allocated by tmpfs rather > than being anonymous mappings). > > We should fix this, so that the NUMA allocation support works equally > well for small pages. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237649 > Webrev: http://cr.openjdk.java.net/~pliden/8237649/webrev.0 > > /Per From erik.osterlund at oracle.com Fri Jan 24 15:01:44 2020 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Fri, 24 Jan 2020 16:01:44 +0100 Subject: RFR: 8237825: ZGC: Replace -XX:ZPath with -XX:AllocateHeapAt In-Reply-To: <4142fa91-5b9c-2880-7781-7774f92e56a1@oracle.com> References: <4142fa91-5b9c-2880-7781-7774f92e56a1@oracle.com> Message-ID: <9571eb79-286d-3035-eaf3-a15dab48b52d@oracle.com> Hi Per, Looks good! /Erik On 1/24/20 3:49 PM, Per Liden wrote: > ZGC has the option -XX:ZPath to allow a user to explicitly specify > where the backing file system is located. However, after ZPath was > introduced, a new generic option -XX:AllocateHeapAt was introduced. 
> This option is used the other GCs and have the same meaning/purpose > and it's arguably better named. There's no good reason why ZGC > shouldn't use that too, instead of -XX:ZPath. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237825 > Webrev: http://cr.openjdk.java.net/~pliden/8237825/webrev.0 > > /Per From per.liden at oracle.com Fri Jan 24 15:03:55 2020 From: per.liden at oracle.com (Per Liden) Date: Fri, 24 Jan 2020 16:03:55 +0100 Subject: RFR: 8237649: ZGC: Improved NUMA support when using small pages In-Reply-To: References: Message-ID: Thanks Erik! /Per On 1/24/20 3:58 PM, erik.osterlund at oracle.com wrote: > Hi Per, > > Looks good. > > Thanks, > /Erik > > On 1/23/20 11:02 AM, Per Liden wrote: >> The NUMA allocation support in ZGC works as expected only when using >> -XX:+UseLargePages. The reason is that, on Linux, small pages are >> allocated at commit/fallocate time and is controlled by the NUMA >> policy of the current thread, while large pages are allocated at page >> fault time and is controlled by the NUMA policy of the memory range. >> ZGC currently only sets up the NUMA policy for the memory range, which >> has no effect on small pages (since they are allocated by tmpfs rather >> than being anonymous mappings). >> >> We should fix this, so that the NUMA allocation support works equally >> well for small pages. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237649 >> Webrev: http://cr.openjdk.java.net/~pliden/8237649/webrev.0 >> >> /Per > From erik.osterlund at oracle.com Fri Jan 24 15:04:08 2020 From: erik.osterlund at oracle.com (erik.osterlund at oracle.com) Date: Fri, 24 Jan 2020 16:04:08 +0100 Subject: RFR: 8237758: ZGC: Move get_mempolicy() syscall wrapper to ZSyscall In-Reply-To: <3496f019-6790-7c91-cf5d-62779274287b@oracle.com> References: <3496f019-6790-7c91-cf5d-62779274287b@oracle.com> Message-ID: Hi Per, Looks good! 
/Erik On 1/23/20 11:02 AM, Per Liden wrote: > System call wrappers should live in ZSyscall, but the wrapper for > get_mempolicy() currently lives in ZNUMA. We should move it. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237758 > Webrev: http://cr.openjdk.java.net/~pliden/8237758/webrev.0 > > /Per From per.liden at oracle.com Fri Jan 24 15:04:32 2020 From: per.liden at oracle.com (Per Liden) Date: Fri, 24 Jan 2020 16:04:32 +0100 Subject: RFR: 8237758: ZGC: Move get_mempolicy() syscall wrapper to ZSyscall In-Reply-To: References: <3496f019-6790-7c91-cf5d-62779274287b@oracle.com> Message-ID: Thanks Erik! /Per On 1/24/20 4:04 PM, erik.osterlund at oracle.com wrote: > Hi Per, > > Looks good! > > /Erik > > On 1/23/20 11:02 AM, Per Liden wrote: >> System call wrappers should live in ZSyscall, but the wrapper for >> get_mempolicy() currently lives in ZNUMA. We should move it. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237758 >> Webrev: http://cr.openjdk.java.net/~pliden/8237758/webrev.0 >> >> /Per > From per.liden at oracle.com Fri Jan 24 15:04:14 2020 From: per.liden at oracle.com (Per Liden) Date: Fri, 24 Jan 2020 16:04:14 +0100 Subject: RFR: 8237825: ZGC: Replace -XX:ZPath with -XX:AllocateHeapAt In-Reply-To: <9571eb79-286d-3035-eaf3-a15dab48b52d@oracle.com> References: <4142fa91-5b9c-2880-7781-7774f92e56a1@oracle.com> <9571eb79-286d-3035-eaf3-a15dab48b52d@oracle.com> Message-ID: Thanks Erik! /Per On 1/24/20 4:01 PM, erik.osterlund at oracle.com wrote: > Hi Per, > > Looks good! > > /Erik > > On 1/24/20 3:49 PM, Per Liden wrote: >> ZGC has the option -XX:ZPath to allow a user to explicitly specify >> where the backing file system is located. However, after ZPath was >> introduced, a new generic option -XX:AllocateHeapAt was introduced. >> This option is used the other GCs and have the same meaning/purpose >> and it's arguably better named. There's no good reason why ZGC >> shouldn't use that too, instead of -XX:ZPath. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237825 >> Webrev: http://cr.openjdk.java.net/~pliden/8237825/webrev.0 >> >> /Per > From shade at redhat.com Fri Jan 24 17:51:13 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Fri, 24 Jan 2020 18:51:13 +0100 Subject: RFR (S) 8237821: Shenandoah: build broken after JDK-8237637 (Remove dubious type conversions from oop) In-Reply-To: <1b170e9c-8a4d-984d-e03b-c6bccf6dc765@redhat.com> References: <1b170e9c-8a4d-984d-e03b-c6bccf6dc765@redhat.com> Message-ID: (should have copied hotspot-gc-dev@ as well) On 1/24/20 6:50 PM, Aleksey Shipilev wrote: > Bug: > https://bugs.openjdk.java.net/browse/JDK-8237821 > > Current release build in jdk/jdk is broken because of this. Instead of reverting the 8237637, let's > complete the Shenandoah parts. > > I believe it is in the spirit of the original patch to be explicit about oop/HeapWord*/void* in > Shenandoah. cset->is_in(oop) should be doing the proper cast_from_oop dance that would go through > CHECK_UNHANDLED_OOPS and friends. So we need an overload to handle potentially interior pointers > (HeapWord*/void*). And it also removes some of the template mess we have, that led to this failure. > > Fix: > https://cr.openjdk.java.net/~shade/8237821/webrev.01 > > Testing: Linux x86_64 {release,fastdebug,slowdebug} hotspot_gc_shenandoah > -- Thanks, -Aleksey From per.liden at oracle.com Mon Jan 27 13:54:21 2020 From: per.liden at oracle.com (Per Liden) Date: Mon, 27 Jan 2020 14:54:21 +0100 Subject: RFR: 8237884: ZGC: Use clamp() instead of MIN2(MAX2()) Message-ID: <8df285ce-3a79-30d5-88b2-f9ebfe213abe@oracle.com> JDK-8233702 introduced clamp(), but ZHeuristics still uses MIN2(MAX2()) in one place.
Bug: https://bugs.openjdk.java.net/browse/JDK-8237884 Webrev: http://cr.openjdk.java.net/~pliden/8237884/webrev.0 /Per From per.liden at oracle.com Mon Jan 27 13:54:16 2020 From: per.liden at oracle.com (Per Liden) Date: Mon, 27 Jan 2020 14:54:16 +0100 Subject: RFR: 8237882: ZGC: Removed ZUtils::round_{up,down}_power_of_2() declarations Message-ID: <95acba6b-dcda-6f4d-c99b-bff42b45e68e@oracle.com> JDK-8234331 removed ZUtils::round_{up,down}_power_of_2() but left the function declarations in the ZUtils class. Bug: https://bugs.openjdk.java.net/browse/JDK-8237882 Webrev: http://cr.openjdk.java.net/~pliden/8237882/webrev.0 /Per From stuart.monteith at linaro.org Mon Jan 27 15:24:08 2020 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Mon, 27 Jan 2020 15:24:08 +0000 Subject: RFR: 8237884: ZGC: Use clamp() instead of MIN2(MAX2()) In-Reply-To: <8df285ce-3a79-30d5-88b2-f9ebfe213abe@oracle.com> References: <8df285ce-3a79-30d5-88b2-f9ebfe213abe@oracle.com> Message-ID: That looks trivially Ok to me. (Not a reviewer). On Mon, 27 Jan 2020 at 13:55, Per Liden wrote: > > JDK-8233702 introduced clamp(), but ZHeuristics still uses MIN2(MAX2()) > in one place. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237884 > Webrev: http://cr.openjdk.java.net/~pliden/8237884/webrev.0 > > /Per From per.liden at oracle.com Mon Jan 27 15:25:54 2020 From: per.liden at oracle.com (Per Liden) Date: Mon, 27 Jan 2020 16:25:54 +0100 Subject: RFR: 8237884: ZGC: Use clamp() instead of MIN2(MAX2()) In-Reply-To: References: <8df285ce-3a79-30d5-88b2-f9ebfe213abe@oracle.com> Message-ID: <5e5f4be6-5ff9-5d05-5e33-c8dc0a1d18f2@oracle.com> Thanks for reviewing, Stuart. cheers, Per On 1/27/20 4:24 PM, Stuart Monteith wrote: > That looks trivially Ok to me. (Not a reviewer). > > On Mon, 27 Jan 2020 at 13:55, Per Liden wrote: >> >> JDK-8233702 introduced clamp(), but ZHeuristics still uses MIN2(MAX2()) >> in one place. 
>> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237884 >> Webrev: http://cr.openjdk.java.net/~pliden/8237884/webrev.0 >> >> /Per From stuart.monteith at linaro.org Mon Jan 27 15:30:51 2020 From: stuart.monteith at linaro.org (Stuart Monteith) Date: Mon, 27 Jan 2020 15:30:51 +0000 Subject: RFR: 8237882: ZGC: Removed ZUtils::round_{up, down}_power_of_2() declarations In-Reply-To: <95acba6b-dcda-6f4d-c99b-bff42b45e68e@oracle.com> References: <95acba6b-dcda-6f4d-c99b-bff42b45e68e@oracle.com> Message-ID: This looks OK (not a reviewer). On Mon, 27 Jan 2020 at 13:57, Per Liden wrote: > > JDK-8234331 removed ZUtils::round_{up,down}_power_of_2() but left the > function declarations in the ZUtils class. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237882 > Webrev: http://cr.openjdk.java.net/~pliden/8237882/webrev.0 > > /Per From per.liden at oracle.com Mon Jan 27 15:42:19 2020 From: per.liden at oracle.com (Per Liden) Date: Mon, 27 Jan 2020 16:42:19 +0100 Subject: RFR: 8237882: ZGC: Removed ZUtils::round_{up,down}_power_of_2() declarations In-Reply-To: References: <95acba6b-dcda-6f4d-c99b-bff42b45e68e@oracle.com> Message-ID: Thanks Stuart! /Per On 1/27/20 4:30 PM, Stuart Monteith wrote: > This looks OK (not a reviewer). > > On Mon, 27 Jan 2020 at 13:57, Per Liden wrote: >> >> JDK-8234331 removed ZUtils::round_{up,down}_power_of_2() but left the >> function declarations in the ZUtils class. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237882 >> Webrev: http://cr.openjdk.java.net/~pliden/8237882/webrev.0 >> >> /Per From zgu at redhat.com Mon Jan 27 16:25:15 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Mon, 27 Jan 2020 11:25:15 -0500 Subject: [15] RFR(T) 8237874: Shenandoah: Backout JDK-8234399 Message-ID: <78491ea8-d877-04e8-b265-74f8a30997e0@redhat.com> I would like to backout JDK-8234399, as Shenandoah still triggers barriers on GC paths. 
Bug: https://bugs.openjdk.java.net/browse/JDK-8237874 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237874/webrev/ Test: hotspot_gc_shenandoah and test case in bug report. Thanks, -Zhengyu From shade at redhat.com Mon Jan 27 16:28:49 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Mon, 27 Jan 2020 17:28:49 +0100 Subject: [15] RFR(T) 8237874: Shenandoah: Backout JDK-8234399 In-Reply-To: <78491ea8-d877-04e8-b265-74f8a30997e0@redhat.com> References: <78491ea8-d877-04e8-b265-74f8a30997e0@redhat.com> Message-ID: On 1/27/20 5:25 PM, Zhengyu Gu wrote: > I would like to backout JDK-8234399, as Shenandoah still triggers > barriers on GC paths. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237874 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237874/webrev/ Yes, please. -- Thanks, -Aleksey From kim.barrett at oracle.com Mon Jan 27 17:36:39 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 27 Jan 2020 12:36:39 -0500 Subject: RFR: 8237884: ZGC: Use clamp() instead of MIN2(MAX2()) In-Reply-To: <8df285ce-3a79-30d5-88b2-f9ebfe213abe@oracle.com> References: <8df285ce-3a79-30d5-88b2-f9ebfe213abe@oracle.com> Message-ID: <08A78C8C-782C-4D9F-92B5-E7C6BD909212@oracle.com> > On Jan 27, 2020, at 8:54 AM, Per Liden wrote: > > JDK-8233702 introduced clamp(), but ZHeuristics still uses MIN2(MAX2()) in one place. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237884 > Webrev: http://cr.openjdk.java.net/~pliden/8237884/webrev.0 > > /Per Looks good. 
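For reference, the clamp() cleanup reviewed above is a pure spelling change: clamp(v, lo, hi) computes the same value as the nested MIN2(MAX2()) it replaces. A standalone sketch of the equivalence, using minimal stand-ins for HotSpot's helpers (the real definitions live elsewhere in the HotSpot sources):

```cpp
// Minimal stand-ins for HotSpot's MIN2/MAX2, for illustration only.
template <typename T> T MIN2(T a, T b) { return a < b ? a : b; }
template <typename T> T MAX2(T a, T b) { return a > b ? a : b; }

// clamp(v, lo, hi) pins v into [lo, hi] -- equivalent to MIN2(MAX2(v, lo), hi),
// but it states the intent directly.
template <typename T> T clamp(T v, T lo, T hi) {
  return MIN2(MAX2(v, lo), hi);
}
```

Values below the range come back as `lo`, values above it as `hi`, and in-range values pass through unchanged.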
From kim.barrett at oracle.com Mon Jan 27 17:37:58 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Mon, 27 Jan 2020 12:37:58 -0500 Subject: RFR: 8237882: ZGC: Removed ZUtils::round_{up,down}_power_of_2() declarations In-Reply-To: <95acba6b-dcda-6f4d-c99b-bff42b45e68e@oracle.com> References: <95acba6b-dcda-6f4d-c99b-bff42b45e68e@oracle.com> Message-ID: <4F32DC9F-DF27-4247-94A5-B5315235279C@oracle.com> > On Jan 27, 2020, at 8:54 AM, Per Liden wrote: > > JDK-8234331 removed ZUtils::round_{up,down}_power_of_2() but left the function declarations in the ZUtils class. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237882 > Webrev: http://cr.openjdk.java.net/~pliden/8237882/webrev.0 > > /Per Looks good. From manc at google.com Tue Jan 28 03:13:02 2020 From: manc at google.com (Man Cao) Date: Mon, 27 Jan 2020 19:13:02 -0800 Subject: RFR (XS): 8234608: [TESTBUG] Memory leak in gc/g1/unloading/libdefine.cpp Message-ID: Hi all, Could anyone review this small patch contributed by my colleague Ian Rogers (irogers at google.com)? Bug: https://bugs.openjdk.java.net/browse/JDK-8234608 Webrev: https://cr.openjdk.java.net/~manc/8234608/webrev.00/ -Man From per.liden at oracle.com Tue Jan 28 07:10:15 2020 From: per.liden at oracle.com (Per Liden) Date: Tue, 28 Jan 2020 08:10:15 +0100 Subject: RFR: 8237882: ZGC: Removed ZUtils::round_{up,down}_power_of_2() declarations In-Reply-To: <4F32DC9F-DF27-4247-94A5-B5315235279C@oracle.com> References: <95acba6b-dcda-6f4d-c99b-bff42b45e68e@oracle.com> <4F32DC9F-DF27-4247-94A5-B5315235279C@oracle.com> Message-ID: <430790ac-227d-88ef-ee8d-7acdfea5df06@oracle.com> Thanks Kim! /Per On 1/27/20 6:37 PM, Kim Barrett wrote: >> On Jan 27, 2020, at 8:54 AM, Per Liden wrote: >> >> JDK-8234331 removed ZUtils::round_{up,down}_power_of_2() but left the function declarations in the ZUtils class. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237882 >> Webrev: http://cr.openjdk.java.net/~pliden/8237882/webrev.0 >> >> /Per > > Looks good. 
> From per.liden at oracle.com Tue Jan 28 07:10:28 2020 From: per.liden at oracle.com (Per Liden) Date: Tue, 28 Jan 2020 08:10:28 +0100 Subject: RFR: 8237884: ZGC: Use clamp() instead of MIN2(MAX2()) In-Reply-To: <08A78C8C-782C-4D9F-92B5-E7C6BD909212@oracle.com> References: <8df285ce-3a79-30d5-88b2-f9ebfe213abe@oracle.com> <08A78C8C-782C-4D9F-92B5-E7C6BD909212@oracle.com> Message-ID: <3001ce1d-213e-3e31-efec-91ace42dc572@oracle.com> Thanks Kim! /Per On 1/27/20 6:36 PM, Kim Barrett wrote: >> On Jan 27, 2020, at 8:54 AM, Per Liden wrote: >> >> JDK-8233702 introduced clamp(), but ZHeuristics still uses MIN2(MAX2()) in one place. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237884 >> Webrev: http://cr.openjdk.java.net/~pliden/8237884/webrev.0 >> >> /Per > > Looks good. > From ivan.walulya at oracle.com Tue Jan 28 08:05:13 2020 From: ivan.walulya at oracle.com (Ivan Walulya) Date: Tue, 28 Jan 2020 09:05:13 +0100 Subject: RFR: 8232689: Remove ParCompactionManager::Action enum Message-ID: <2AE55972-F775-471A-8F1A-86E552A8788D@oracle.com> Hi all, Please review the removal of unused Enum from the parallel GC. Bug: https://bugs.openjdk.java.net/browse/JDK-8232689 Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8232689/ Testing: Tier 1 - Tier 3 //Ivan From leo.korinth at oracle.com Tue Jan 28 08:10:47 2020 From: leo.korinth at oracle.com (Leo Korinth) Date: Tue, 28 Jan 2020 09:10:47 +0100 Subject: RFR: 8232689: Remove ParCompactionManager::Action enum In-Reply-To: <2AE55972-F775-471A-8F1A-86E552A8788D@oracle.com> References: <2AE55972-F775-471A-8F1A-86E552A8788D@oracle.com> Message-ID: Hi! On 28/01/2020 09:05, Ivan Walulya wrote: > Hi all, > > Please review the removal of unused Enum from the parallel GC. Thanks for fixing this! > > Bug: https://bugs.openjdk.java.net/browse/JDK-8232689 > Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8232689/ Looks good Ivan. I will sponsor this change for you. 
Thanks, Leo > Testing: Tier 1 - Tier 3 > > //Ivan > From ivan.walulya at oracle.com Tue Jan 28 08:16:52 2020 From: ivan.walulya at oracle.com (Ivan Walulya) Date: Tue, 28 Jan 2020 09:16:52 +0100 Subject: RFR: 8232689: Remove ParCompactionManager::Action enum In-Reply-To: References: <2AE55972-F775-471A-8F1A-86E552A8788D@oracle.com> Message-ID: Thanks Leo! //Ivan > On 28 Jan 2020, at 09:10, Leo Korinth wrote: > > Hi! > > On 28/01/2020 09:05, Ivan Walulya wrote: >> Hi all, >> Please review the removal of unused Enum from the parallel GC. > > Thanks for fixing this! > >> Bug: https://bugs.openjdk.java.net/browse/JDK-8232689 >> Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8232689/ > > Looks good Ivan. I will sponsor this change for you. > > Thanks, > Leo > > >> Testing: Tier 1 - Tier 3 >> //Ivan From kim.barrett at oracle.com Tue Jan 28 08:46:01 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 28 Jan 2020 03:46:01 -0500 Subject: RFR: 8232689: Remove ParCompactionManager::Action enum In-Reply-To: <2AE55972-F775-471A-8F1A-86E552A8788D@oracle.com> References: <2AE55972-F775-471A-8F1A-86E552A8788D@oracle.com> Message-ID: <256CF4B8-6742-497C-A5B8-E0D942465B14@oracle.com> > On Jan 28, 2020, at 3:05 AM, Ivan Walulya wrote: > > Hi all, > > Please review the removal of unused Enum from the parallel GC. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8232689 > Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8232689/ > > Testing: Tier 1 - Tier 3 > > //Ivan Looks good. From ivan.walulya at oracle.com Tue Jan 28 08:50:54 2020 From: ivan.walulya at oracle.com (Ivan Walulya) Date: Tue, 28 Jan 2020 09:50:54 +0100 Subject: RFR: 8232689: Remove ParCompactionManager::Action enum In-Reply-To: <256CF4B8-6742-497C-A5B8-E0D942465B14@oracle.com> References: <2AE55972-F775-471A-8F1A-86E552A8788D@oracle.com> <256CF4B8-6742-497C-A5B8-E0D942465B14@oracle.com> Message-ID: Thanks Kim! 
//Ivan > On 28 Jan 2020, at 09:46, Kim Barrett wrote: > >> On Jan 28, 2020, at 3:05 AM, Ivan Walulya wrote: >> >> Hi all, >> >> Please review the removal of unused Enum from the parallel GC. >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8232689 >> Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8232689/ >> >> Testing: Tier 1 - Tier 3 >> >> //Ivan > > Looks good. > From thomas.schatzl at oracle.com Tue Jan 28 09:22:13 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 28 Jan 2020 10:22:13 +0100 Subject: RFR: 8232689: Remove ParCompactionManager::Action enum In-Reply-To: <2AE55972-F775-471A-8F1A-86E552A8788D@oracle.com> References: <2AE55972-F775-471A-8F1A-86E552A8788D@oracle.com> Message-ID: Hi, On 28.01.20 09:05, Ivan Walulya wrote: > Hi all, > > Please review the removal of unused Enum from the parallel GC. > > Bug: https://bugs.openjdk.java.net/browse/JDK-8232689 > Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8232689/ > > Testing: Tier 1 - Tier 3 > > //Ivan > a bit late, but looks good :) Thomas From ivan.walulya at oracle.com Tue Jan 28 09:44:15 2020 From: ivan.walulya at oracle.com (Ivan Walulya) Date: Tue, 28 Jan 2020 10:44:15 +0100 Subject: RFR: 8232689: Remove ParCompactionManager::Action enum In-Reply-To: References: <2AE55972-F775-471A-8F1A-86E552A8788D@oracle.com> Message-ID: <85DBDB9C-5F6F-4DF9-955D-DA476B33521A@oracle.com> Thanks Thomas //Ivan > On 28 Jan 2020, at 10:22, Thomas Schatzl wrote: > > Hi, > > On 28.01.20 09:05, Ivan Walulya wrote: >> Hi all, >> Please review the removal of unused Enum from the parallel GC. 
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8232689 >> Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8232689/ >> Testing: Tier 1 - Tier 3 >> //Ivan > > a bit late, but looks good :) > > Thomas From thomas.schatzl at oracle.com Tue Jan 28 14:23:52 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Tue, 28 Jan 2020 15:23:52 +0100 Subject: RFR (XS): 8234608: [TESTBUG] Memory leak in gc/g1/unloading/libdefine.cpp In-Reply-To: References: Message-ID: Hi Ian/Man, On 28.01.20 04:13, Man Cao wrote: > Hi all, > > Could anyone review this small patch contributed by my colleague Ian Rogers > (irogers at google.com)? > Bug: https://bugs.openjdk.java.net/browse/JDK-8234608 > Webrev: https://cr.openjdk.java.net/~manc/8234608/webrev.00/ > > -Man > the change looks good - but the test testing this is broken. In fact, the tests doing the redefinition via JVMTI do not even run at all. This may be why the error that has been "fixed" in this change about passing the byte stream to the RedefineClass method has never been noticed before. I did some hacking to enable redefinition in the tests, and then immediately had to fix the JNI method name, which once more indicates that the appropriate tests were never run... :( My changes are available at http://cr.openjdk.java.net/~tschatzl/8234608/webrev/ ; however the test(s) fail at the RedefineClasses call (with and without your patch) with [...]vmTestbase/gc/g1/unloading/libdefine.cpp: Failed to call RedefineClasses(): the function returned error 60 For more info about this error see the JVMTI spec which means JVMTI_ERROR_INVALID_CLASS_FORMAT (60) A new class file is malformed (the VM would return a ClassFormatError). I have no further clue about what's wrong here. Maybe you are interested/have time to investigate more but I need to give up for today at least. Otherwise it's probably best to just add some links to the CR for somebody else to continue. 
Thanks, Thomas From zgu at redhat.com Tue Jan 28 15:34:49 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 28 Jan 2020 10:34:49 -0500 Subject: [15] RFR 8237963: Shenandoah: Heap iteration should use single-threaded string dedup oops_do_slow() Message-ID: <9cef2092-fc85-07c5-3764-a1103c11ce43@redhat.com> Please review this patch that uses single-threaded string dedup's oops_do() implementation for heap iteration. The bug was reported by SAP on Windows, but it is not Windows specific. The bug is that heap iteration uses the parallel version of string dedup's oops_do() implementation, which can interfere with the concurrent string dedup cleaning task. Bug: https://bugs.openjdk.java.net/browse/JDK-8237963 Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237963/webrev.00/index.html Test: gc/shenandoah/jvmti/TestHeapDump.java test with -XX:+UseStringDeduplication (fastdebug and release) on x86_64 Linux. Thanks, -Zhengyu From shade at redhat.com Tue Jan 28 17:54:25 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 28 Jan 2020 18:54:25 +0100 Subject: [15] RFR 8237963: Shenandoah: Heap iteration should use single-threaded string dedup oops_do_slow() In-Reply-To: <9cef2092-fc85-07c5-3764-a1103c11ce43@redhat.com> References: <9cef2092-fc85-07c5-3764-a1103c11ce43@redhat.com> Message-ID: <889edf0c-db71-24e1-a3dc-7bc5ea433541@redhat.com> On 1/28/20 4:34 PM, Zhengyu Gu wrote: > Bug: https://bugs.openjdk.java.net/browse/JDK-8237963 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237963/webrev.00/index.html Looks fine! 
-- Thanks, -Aleksey From zgu at redhat.com Tue Jan 28 19:03:34 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 28 Jan 2020 14:03:34 -0500 Subject: [15] RFR 8237963: Shenandoah: Heap iteration should use single-threaded string dedup oops_do_slow() In-Reply-To: <9cef2092-fc85-07c5-3764-a1103c11ce43@redhat.com> References: <9cef2092-fc85-07c5-3764-a1103c11ce43@redhat.com> Message-ID: <2de8acb4-bb48-ae91-47c5-b5a3adfa301c@redhat.com> Sorry, the earlier fix is incorrect. The concurrent string dedup cleaning task may change string dedup table/queue structures, which makes it unsafe for heap iteration to walk them concurrently. Instead, heap iteration should use the concurrent version, so the two block each other out. Updated webrev: http://cr.openjdk.java.net/~zgu/JDK-8237963/webrev.00/index.html Also changed bug synopsis to: Shenandoah: Heap iteration should use concurrent version of string dedup roots Test: gc/shenandoah/jvmti/TestHeapDump.java test with -XX:+UseStringDeduplication (fastdebug and release) on x86_64 Linux, in a loop with 10 iterations. Thanks, -Zhengyu On 1/28/20 10:34 AM, Zhengyu Gu wrote: > Please review this patch that uses single-threaded string dedup's > oops_do() implementation for heap iteration. > > The bug was reported by SAP on Windows, but it is not Windows specific. > The bug is due to heap iteration uses parallel version of string dedup's > oops_do() implementation, which can interfere concurrent string dedup > cleaning task. > > > Bug: https://bugs.openjdk.java.net/browse/JDK-8237963 > Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237963/webrev.00/index.html > > Test: > gc/shenandoah/jvmti/TestHeapDump.java test with > -XX:+UseStringDeduplication (fastdebug and release) on x86_64 Linux. 
> > Thanks, > > -Zhengyu From zgu at redhat.com Tue Jan 28 19:12:04 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 28 Jan 2020 14:12:04 -0500 Subject: [15] RFR 8237963: Shenandoah: Heap iteration should use single-threaded string dedup oops_do_slow() In-Reply-To: <2de8acb4-bb48-ae91-47c5-b5a3adfa301c@redhat.com> References: <9cef2092-fc85-07c5-3764-a1103c11ce43@redhat.com> <2de8acb4-bb48-ae91-47c5-b5a3adfa301c@redhat.com> Message-ID: <32f8b945-658d-f7d4-eff1-834fb2e9f6e9@redhat.com> Correction: Updated webrev: http://cr.openjdk.java.net/~zgu/JDK-8237963/webrev.01/index.html Thanks, -Zhengyu On 1/28/20 2:03 PM, Zhengyu Gu wrote: > Sorry, the early fix is incorrect. > > Concurrent string dedup cleaning task may change string dedup > table/queue structures, that makes it unsafe for heap iteration to walk > them concurently. > > Instead, heap iteration should use concurrent version, to block out each > other. > > Updated webrev: > http://cr.openjdk.java.net/~zgu/JDK-8237963/webrev.00/index.html > > Also changed bug synopsis to: > > Shenandoah: Heap iteration should use concurrent version of string dedup > roots > > Test: > ??? gc/shenandoah/jvmti/TestHeapDump.java test with > ?? -XX:+UseStringDeduplication (fastdebug and release) on x86_64 Linux. > ?? in loop with 10 iterations. > > Thanks, > > -Zhengyu > > On 1/28/20 10:34 AM, Zhengyu Gu wrote: >> Please review this patch that uses single-threaded string dedup's >> oops_do() implementation for heap iteration. >> >> The bug was reported by SAP on Windows, but it is not Windows >> specific. The bug is due to heap iteration uses parallel version of >> string dedup's oops_do() implementation, which can interfere >> concurrent string dedup cleaning task. >> >> >> Bug: https://bugs.openjdk.java.net/browse/JDK-8237963 >> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8237963/webrev.00/index.html >> >> Test: >> ??? 
gc/shenandoah/jvmti/TestHeapDump.java test with >> -XX:+UseStringDeduplication (fastdebug and release) on x86_64 Linux. >> >> Thanks, >> >> -Zhengyu From shade at redhat.com Tue Jan 28 19:15:29 2020 From: shade at redhat.com (Aleksey Shipilev) Date: Tue, 28 Jan 2020 20:15:29 +0100 Subject: [15] RFR 8237963: Shenandoah: Heap iteration should use single-threaded string dedup oops_do_slow() In-Reply-To: <32f8b945-658d-f7d4-eff1-834fb2e9f6e9@redhat.com> References: <9cef2092-fc85-07c5-3764-a1103c11ce43@redhat.com> <2de8acb4-bb48-ae91-47c5-b5a3adfa301c@redhat.com> <32f8b945-658d-f7d4-eff1-834fb2e9f6e9@redhat.com> Message-ID: <37e1ebb9-456e-357e-dca5-44d4f7875710@redhat.com> On 1/28/20 8:12 PM, Zhengyu Gu wrote: > http://cr.openjdk.java.net/~zgu/JDK-8237963/webrev.01/index.html Looks good. This still passes tests, right? -- Thanks, -Aleksey From zgu at redhat.com Tue Jan 28 19:16:47 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Tue, 28 Jan 2020 14:16:47 -0500 Subject: [15] RFR 8237963: Shenandoah: Heap iteration should use single-threaded string dedup oops_do_slow() In-Reply-To: <37e1ebb9-456e-357e-dca5-44d4f7875710@redhat.com> References: <9cef2092-fc85-07c5-3764-a1103c11ce43@redhat.com> <2de8acb4-bb48-ae91-47c5-b5a3adfa301c@redhat.com> <32f8b945-658d-f7d4-eff1-834fb2e9f6e9@redhat.com> <37e1ebb9-456e-357e-dca5-44d4f7875710@redhat.com> Message-ID: <18717403-3080-5a4b-9a4a-6d3b3ec686f8@redhat.com> Thanks, Aleksey. On 1/28/20 2:15 PM, Aleksey Shipilev wrote: > On 1/28/20 8:12 PM, Zhengyu Gu wrote: >> http://cr.openjdk.java.net/~zgu/JDK-8237963/webrev.01/index.html > Looks good. > > This still passes tests, right? Of course. -Zhengyu > From manc at google.com Tue Jan 28 19:18:36 2020 From: manc at google.com (Man Cao) Date: Tue, 28 Jan 2020 11:18:36 -0800 Subject: RFR (XS): 8234608: [TESTBUG] Memory leak in gc/g1/unloading/libdefine.cpp In-Reply-To: References: Message-ID: Thanks, Thomas. I'll take a look briefly. 
Agreed that the test should be able to run first, before pushing this change. -Man From sangheon.kim at oracle.com Wed Jan 29 00:40:32 2020 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Tue, 28 Jan 2020 16:40:32 -0800 Subject: RFR: 8233822: VM_G1CollectForAllocation should always check for upgrade to full In-Reply-To: <5389B188-BA91-412F-A12E-0DB5A96FF0A9@oracle.com> References: <5389B188-BA91-412F-A12E-0DB5A96FF0A9@oracle.com> Message-ID: <25625fd5-7ced-ed64-aea8-78f3367d1f02@oracle.com> Hi Kim, On 1/21/20 12:31 AM, Kim Barrett wrote: > Please review this G1 change to always check whether a full collection > should be performed after a non-full collection pause, e.g. the > collection needs to be "upgraded" to a full collection. There are > various conditions which can lead to needing to do that, and as the CR > suggests, we need to be consistent about checking for and performing > such an upgrade. > > This is accomplished by moving most of do_collection_pause_at_safepoint > into a helper function and changing that existing function to call the > helper, then check for and, if needed, perform a needed upgrade to a > full collection. Callers of that function are updated to remove > explict conditional upgrading, where present. This also addresses the > surprisingly placed call in a G1-specific block of code in gc/shared > (see also JDK-8237567). > > CR: > https://bugs.openjdk.java.net/browse/JDK-8233822 > > Webrev: > https://cr.openjdk.java.net/~kbarrett/8233822/open.00/ Looks good. Not related to your patch (so can ignore) but as you are changing VM_G1CollectForAllocation class, comments for relation at g1VMOperations.hpp seems out-dated. :) Thanks, Sangheon > > Testing: > mach5 tier1-5 > Locally (linux-x64) ran modified InfiniteList.java test (allocate > small rather than arrays) and verified some upgrades occurred as > expected. 
> > From kim.barrett at oracle.com Wed Jan 29 01:07:26 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Tue, 28 Jan 2020 20:07:26 -0500 Subject: RFR: 8233822: VM_G1CollectForAllocation should always check for upgrade to full In-Reply-To: <25625fd5-7ced-ed64-aea8-78f3367d1f02@oracle.com> References: <5389B188-BA91-412F-A12E-0DB5A96FF0A9@oracle.com> <25625fd5-7ced-ed64-aea8-78f3367d1f02@oracle.com> Message-ID: <966AA66F-3726-4088-8EE2-AD758B5EBCAA@oracle.com> > On Jan 28, 2020, at 7:40 PM, sangheon.kim at oracle.com wrote: > On 1/21/20 12:31 AM, Kim Barrett wrote: >> [?] >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8233822 >> >> Webrev: >> https://cr.openjdk.java.net/~kbarrett/8233822/open.00/ > Looks good. Thanks. > Not related to your patch (so can ignore) but as you are changing VM_G1CollectForAllocation class, > comments for relation at g1VMOperations.hpp seems out-dated. :) I assume you are referring to the comment near the top of the file summarizing some parts of the class hierarchy? There's more wrong than right in that comment! It wrongly suggests VM_G1Concurrent and VM_G1CollectForAllocation both (directly) derive from VM_GC_Operation, and it's missing VM_G1TryInitiateConcMark. I started to correct the comment, but decided that beyond the first line summary it doesn't provide any information that isn't obvious from the code, and is also messy, so plan to just delete all but that first line. From thomas.schatzl at oracle.com Wed Jan 29 13:04:00 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Wed, 29 Jan 2020 14:04:00 +0100 Subject: RFR (M): 8215297: Remove ParallelTaskTerminator Message-ID: Hi all, can I have reviews for this change that removes the ParallelTaskTerminator code? In JDK12 we introduced another implementation, the OWSTTaskTerminator, that seems to work just fine as default for more than a year now, so I think it is time to remove the old implementation. 
@Shenandoah-Team: I left the ShenandoahTaskTerminator wrapper, but am removing the TaskTerminator wrapper for the other GCs. I can remove that too, but did not know if you wanted to keep the name. In a follow-up change I would like to change the name of the OWSTTaskTerminator to just TaskTerminator (also renaming the files, but keeping it separate from taskqueue.?pp). Tell me if there are any concerns. CR: https://bugs.openjdk.java.net/browse/JDK-8215297 Webrev: http://cr.openjdk.java.net/~tschatzl/8215297/webrev/ Testing: hs-tier1-5, builds with Shenandoah Thanks, Thomas From zgu at redhat.com Wed Jan 29 13:45:36 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Wed, 29 Jan 2020 08:45:36 -0500 Subject: RFR (M): 8215297: Remove ParallelTaskTerminator In-Reply-To: References: Message-ID: <46581590-d830-3096-d0ac-0dab873fb862@redhat.com> Hi Thomas, Shared changes look good to me. I filed follow-up CR(JDK-8238162) to remove ShenandoahTaskTerminator wrapper. Thanks, -Zhengyu On 1/29/20 8:04 AM, Thomas Schatzl wrote: > Hi all, > > can I have reviews for this change that removes the > ParallelTaskTerminator code? In JDK12 we introduced another > implementation, the OWSTTaskTerminator, that seems to work just fine as > default for more than a year now, so I think it is time to remove the > old implementation. > > @Shenandoah-Team: I left the ShenandoahTaskTerminator wrapper, but > am removing the TaskTerminator wrapper for the other GCs. I can remove that > too, but did not know if you wanted to keep the name. > > In a follow-up change I would like to change the name of the > OWSTTaskTerminator to just TaskTerminator (also renaming the files, but > keeping it separate from taskqueue.?pp). Tell me if there are any concerns. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8215297 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8215297/webrev/ > Testing: > hs-tier1-5, builds with Shenandoah > > Thanks, > 
Thomas > From thomas.schatzl at oracle.com Thu Jan 30 10:45:04 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 30 Jan 2020 11:45:04 +0100 Subject: RFR (M): 8215297: Remove ParallelTaskTerminator In-Reply-To: <46581590-d830-3096-d0ac-0dab873fb862@redhat.com> References: <46581590-d830-3096-d0ac-0dab873fb862@redhat.com> Message-ID: <1a5fa317-cd95-46f3-e82b-e7b8d24c9b0a@oracle.com> Hi Zhengyu, On 29.01.20 14:45, Zhengyu Gu wrote: > Hi Thomas, > > Shared changes look good to me. thanks for your review; I touched up the files a little in http://cr.openjdk.java.net/~tschatzl/8215297/webrev.0_to_1 (diff) http://cr.openjdk.java.net/~tschatzl/8215297/webrev.1 (full) slightly after some feedback and finding some minor (pre-existing) issues. > > I filed follow-up CR(JDK-8238162) to remove ShenandoahTaskTerminator > wrapper. Okay. Thanks, Thomas From thomas.schatzl at oracle.com Thu Jan 30 11:08:35 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 30 Jan 2020 12:08:35 +0100 Subject: RFR (S): 8238160: Uniformize Parallel GC task queue variable names Message-ID: <8d350538-9a82-b420-e7de-319edaf8605c@oracle.com> Hi all, can I have reviews for this small change that moves some global typedefs used only by Parallel GC from taskqueue.hpp to parallel gc files, and further makes naming of instances of these more uniform? CR: https://bugs.openjdk.java.net/browse/JDK-8238160 Webrev: http://cr.openjdk.java.net/~tschatzl/8238160/webrev/ Testing: local compilation Thanks, Thomas From thomas.schatzl at oracle.com Thu Jan 30 11:34:34 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 30 Jan 2020 12:34:34 +0100 Subject: RFR (S): 8238220: Rename OWSTTaskTerminator to TaskTerminator Message-ID: <5f99b054-e286-2a8c-5a37-d641eb4932f1@oracle.com> Hi all, can I have reviews for this renaming change of OWSTTaskTerminator to TaskTerminator now that there is only one task termination protocol implementation? 
I believe that the OWST prefix only makes the code harder to read without conveying interesting information at the uses. Based on JDK-8215297. CR: https://bugs.openjdk.java.net/browse/JDK-8238220 Webrev: http://cr.openjdk.java.net/~tschatzl/8238220/webrev/ Testing: local compilation Thanks, Thomas From thomas.schatzl at oracle.com Thu Jan 30 11:56:53 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 30 Jan 2020 12:56:53 +0100 Subject: RFR (XS): 8238229: Remove TRACESPINNING debug code Message-ID: <77430bd4-19d8-0c6e-edc8-750dae163d96@oracle.com> Hi all, can I have reviews for this removal of some debug code in the TaskTerminator class? The code counts the total number of yields/peeks/spins during task termination. Since it is guarded by a define, the code is not included in any regular build, so there is potential for bit-rotting. Since the code is not very complicated (and I believe it is too simple for real measurements), and needs a rebuild anyway to be used, I propose to just remove it instead of trying to improve it for unknown requirements. Based on JDK-8238220. CR: https://bugs.openjdk.java.net/browse/JDK-8238229 Webrev: http://cr.openjdk.java.net/~tschatzl/8238229/webrev/ Testing: local compilation Thanks, Thomas From zgu at redhat.com Thu Jan 30 13:23:28 2020 From: zgu at redhat.com (Zhengyu Gu) Date: Thu, 30 Jan 2020 08:23:28 -0500 Subject: RFR (M): 8215297: Remove ParallelTaskTerminator In-Reply-To: <1a5fa317-cd95-46f3-e82b-e7b8d24c9b0a@oracle.com> References: <46581590-d830-3096-d0ac-0dab873fb862@redhat.com> <1a5fa317-cd95-46f3-e82b-e7b8d24c9b0a@oracle.com> Message-ID: <93ce4809-34d9-a2f7-b38d-565fea5b6d81@redhat.com> Still good. Thanks, -Zhengyu On 1/30/20 5:45 AM, Thomas Schatzl wrote: > Hi Zhengyu, > > On 29.01.20 14:45, Zhengyu Gu wrote: >> Hi Thomas, >> >> Shared changes look good to me. > > 
thanks for your review; I touched up the files a little in > > http://cr.openjdk.java.net/~tschatzl/8215297/webrev.0_to_1 (diff) > http://cr.openjdk.java.net/~tschatzl/8215297/webrev.1 (full) > > slightly after some feedback and finding some minor (pre-existing) issues. > >> >> I filed follow-up CR(JDK-8238162) to remove ShenandoahTaskTerminator >> wrapper. > > Okay. > > Thanks, > ? Thomas > From stefan.johansson at oracle.com Thu Jan 30 15:24:09 2020 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Thu, 30 Jan 2020 16:24:09 +0100 Subject: RFR (XS): 8238229: Remove TRACESPINNING debug code In-Reply-To: <77430bd4-19d8-0c6e-edc8-750dae163d96@oracle.com> References: <77430bd4-19d8-0c6e-edc8-750dae163d96@oracle.com> Message-ID: <48885c09-77c2-8924-d9ec-2a825fd60f29@oracle.com> Looks good, StefanJ On 2020-01-30 12:56, Thomas Schatzl wrote: > Hi all, > > ? can I have reviews for this removal of some debug code in the > TaskTerminator class? > > The code counts the total number of yields/peeks/spins during task > termination. Since it is guarded by a define, the code is not included > in any regular build, so there is potential for bit-rotting. > > Since the code is not very complicated (and I believe it is too simple > for real measurements), and needs rebuild anyway for use I propose to > just remove it instead of trying to improve it for unknown requirements. > > Based on JDK-8238220. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8238229 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8238229/webrev/ I agree that this can be removed, and there is even more code that should go. 
The call from each collected heap: src/hotspot/share/gc/shared/genCollectedHeap.cpp:680:#ifdef TRACESPINNING src/hotspot/share/gc/parallel/psParallelCompact.cpp:1972:#ifdef TRACESPINNING src/hotspot/share/gc/parallel/psScavenge.cpp:734:#ifdef TRACESPINNING src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1135:#ifdef TRACESPINNING src/hotspot/share/gc/g1/g1CollectedHeap.cpp:3143:#ifdef TRACESPINNING Otherwise, looks good, StefanJ > Testing: > local compilation > > Thanks, > ? Thomas From stefan.johansson at oracle.com Thu Jan 30 15:30:25 2020 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Thu, 30 Jan 2020 16:30:25 +0100 Subject: RFR (S): 8238220: Rename OWSTTaskTerminator to TaskTerminator In-Reply-To: <5f99b054-e286-2a8c-5a37-d641eb4932f1@oracle.com> References: <5f99b054-e286-2a8c-5a37-d641eb4932f1@oracle.com> Message-ID: Hi Thomas, On 2020-01-30 12:34, Thomas Schatzl wrote: > Hi all, > > ? can I have reviews for this renaming change of OWSTTaskTerminator to > TaskTerminator now that there is only one task termination protocol > implementation? > > I believe that the OWST prefix only makes the code harder to read > without conveying interesting information at the uses. > > Based on JDK-8215297. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8238220 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8238220/webrev/ Looks good, StefanJ > Testing: > local compilation > > Thanks, > ? 
Thomas From stefan.johansson at oracle.com Thu Jan 30 15:47:36 2020 From: stefan.johansson at oracle.com (Stefan Johansson) Date: Thu, 30 Jan 2020 16:47:36 +0100 Subject: RFR (M): 8215297: Remove ParallelTaskTerminator In-Reply-To: <1a5fa317-cd95-46f3-e82b-e7b8d24c9b0a@oracle.com> References: <46581590-d830-3096-d0ac-0dab873fb862@redhat.com> <1a5fa317-cd95-46f3-e82b-e7b8d24c9b0a@oracle.com> Message-ID: <5f9aed62-f413-2808-c410-e5ef634aba88@oracle.com> Hi Thomas, On 2020-01-30 11:45, Thomas Schatzl wrote: > Hi Zhengyu, > > On 29.01.20 14:45, Zhengyu Gu wrote: >> Hi Thomas, >> >> Shared changes look good to me. > > thanks for your review; I touched up the files a little in > > http://cr.openjdk.java.net/~tschatzl/8215297/webrev.0_to_1 (diff) > http://cr.openjdk.java.net/~tschatzl/8215297/webrev.1 (full) Seems to be some webrev-hiccup with webrev.1, but I guess nothing changed in g1ConcurrentMark.cpp since the first webrev, so I looked there instead. Looks good, especially after the follow-up patches. Thanks, StefanJ > > slightly after some feedback and finding some minor (pre-existing) issues. > >> >> I filed follow-up CR(JDK-8238162) to remove ShenandoahTaskTerminator >> wrapper. > > Okay. > > Thanks, > Thomas From thomas.schatzl at oracle.com Thu Jan 30 16:40:01 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 30 Jan 2020 17:40:01 +0100 Subject: RFR (M): 8215297: Remove ParallelTaskTerminator In-Reply-To: <5f9aed62-f413-2808-c410-e5ef634aba88@oracle.com> References: <46581590-d830-3096-d0ac-0dab873fb862@redhat.com> <1a5fa317-cd95-46f3-e82b-e7b8d24c9b0a@oracle.com> <5f9aed62-f413-2808-c410-e5ef634aba88@oracle.com> Message-ID: <16476d4a-26f7-bb6c-4ffd-68d071ca38ac@oracle.com> Hi, On 30.01.20 16:47, Stefan Johansson wrote: > Hi Thomas, > > On 2020-01-30 11:45, Thomas Schatzl wrote: >> Hi Zhengyu, >> >> On 29.01.20 14:45, Zhengyu Gu wrote: >>> Hi Thomas, >>> >>> Shared changes look good to me. >> >> 
thanks for your review; I touched up the files a little in >> >> http://cr.openjdk.java.net/~tschatzl/8215297/webrev.0_to_1 (diff) >> http://cr.openjdk.java.net/~tschatzl/8215297/webrev.1 (full) > Seems to be some webrev-hickup with webrev.1, but I guess nothing > changed in g1ConcurrentMark.cpp since first webrev so looked there instead. > > Looks good, especially after the follow-up patches. Yes, I'm having issues with webrev lately. I managed to fix the webrev. Thanks for your review, Thomas From thomas.schatzl at oracle.com Thu Jan 30 16:43:16 2020 From: thomas.schatzl at oracle.com (Thomas Schatzl) Date: Thu, 30 Jan 2020 17:43:16 +0100 Subject: RFR (XS): 8238229: Remove TRACESPINNING debug code In-Reply-To: <48885c09-77c2-8924-d9ec-2a825fd60f29@oracle.com> References: <77430bd4-19d8-0c6e-edc8-750dae163d96@oracle.com> <48885c09-77c2-8924-d9ec-2a825fd60f29@oracle.com> Message-ID: <00eec1c7-d524-44c1-a331-95088bb74f3c@oracle.com> Hi, On 30.01.20 16:24, Stefan Johansson wrote: > Looks good, > StefanJ all fixed. Idk why these were missing in that webrev, I regenerated it. Thanks, Thomas > > On 2020-01-30 12:56, Thomas Schatzl wrote: >> Hi all, >> >> ?? can I have reviews for this removal of some debug code in the >> TaskTerminator class? >> >> The code counts the total number of yields/peeks/spins during task >> termination. Since it is guarded by a define, the code is not included >> in any regular build, so there is potential for bit-rotting. >> >> Since the code is not very complicated (and I believe it is too simple >> for real measurements), and needs rebuild anyway for use I propose to >> just remove it instead of trying to improve it for unknown requirements. >> >> Based on JDK-8238220. >> >> CR: >> https://bugs.openjdk.java.net/browse/JDK-8238229 >> Webrev: >> http://cr.openjdk.java.net/~tschatzl/8238229/webrev/ > > I agree that this can be removed, and there is even more code that > should go. 
The call from each collected heap: > src/hotspot/share/gc/shared/genCollectedHeap.cpp:680:#ifdef TRACESPINNING > src/hotspot/share/gc/parallel/psParallelCompact.cpp:1972:#ifdef > TRACESPINNING > src/hotspot/share/gc/parallel/psScavenge.cpp:734:#ifdef TRACESPINNING > src/hotspot/share/gc/g1/g1CollectedHeap.cpp:1135:#ifdef TRACESPINNING > src/hotspot/share/gc/g1/g1CollectedHeap.cpp:3143:#ifdef TRACESPINNING > > Otherwise, looks good, > StefanJ > >> Testing: >> local compilation >> >> Thanks, >> ?? Thomas From sangheon.kim at oracle.com Thu Jan 30 18:08:48 2020 From: sangheon.kim at oracle.com (sangheon.kim at oracle.com) Date: Thu, 30 Jan 2020 10:08:48 -0800 Subject: RFR (S): 8238220: Rename OWSTTaskTerminator to TaskTerminator In-Reply-To: <5f99b054-e286-2a8c-5a37-d641eb4932f1@oracle.com> References: <5f99b054-e286-2a8c-5a37-d641eb4932f1@oracle.com> Message-ID: Hi Thomas, On 1/30/20 3:34 AM, Thomas Schatzl wrote: > Hi all, > > ? can I have reviews for this renaming change of OWSTTaskTerminator to > TaskTerminator now that there is only one task termination protocol > implementation? > > I believe that the OWST prefix only makes the code harder to read > without conveying interesting information at the uses. > > Based on JDK-8215297. > > CR: > https://bugs.openjdk.java.net/browse/JDK-8238220 > Webrev: > http://cr.openjdk.java.net/~tschatzl/8238220/webrev/ Looks good as is. One thing to note is the order of renamed header file. It looks like you are treating uppercase first? :) e.g. at g1CollectedHeap.cpp +#include "gc/shared/taskTerminator.hpp" #include "gc/shared/taskqueue.inline.hpp" I expect alphabet order first and then upper-lowercase. :) Thanks, Sangheon > Testing: > local compilation > > Thanks, > ? 
Thomas From kim.barrett at oracle.com Thu Jan 30 23:14:23 2020 From: kim.barrett at oracle.com (Kim Barrett) Date: Thu, 30 Jan 2020 18:14:23 -0500 Subject: RFR (XS): 8238229: Remove TRACESPINNING debug code In-Reply-To: <00eec1c7-d524-44c1-a331-95088bb74f3c@oracle.com> References: <77430bd4-19d8-0c6e-edc8-750dae163d96@oracle.com> <48885c09-77c2-8924-d9ec-2a825fd60f29@oracle.com> <00eec1c7-d524-44c1-a331-95088bb74f3c@oracle.com> Message-ID: > On Jan 30, 2020, at 11:43 AM, Thomas Schatzl wrote: > > Hi, > > On 30.01.20 16:24, Stefan Johansson wrote: >> Looks good, >> StefanJ > > all fixed. Idk why these were missing in that webrev, I regenerated it. > > Thanks, > Thomas > >> On 2020-01-30 12:56, Thomas Schatzl wrote: >>> Hi all, >>> >>> can I have reviews for this removal of some debug code in the TaskTerminator class? >>> >>> The code counts the total number of yields/peeks/spins during task termination. Since it is guarded by a define, the code is not included in any regular build, so there is potential for bit-rotting. >>> >>> Since the code is not very complicated (and I believe it is too simple for real measurements), and needs rebuild anyway for use I propose to just remove it instead of trying to improve it for unknown requirements. >>> >>> Based on JDK-8238220. >>> >>> CR: >>> https://bugs.openjdk.java.net/browse/JDK-8238229 >>> Webrev: >>> http://cr.openjdk.java.net/~tschatzl/8238229/webrev/ >> I agree that this can be removed, and there is even more code that should go. The call from each collected heap: Looks good. From manc at google.com Fri Jan 31 03:27:55 2020 From: manc at google.com (Man Cao) Date: Thu, 30 Jan 2020 19:27:55 -0800 Subject: RFR (XS): 8234608: [TESTBUG] Memory leak in gc/g1/unloading/libdefine.cpp In-Reply-To: References: Message-ID: Hi, I have incorporated Thomas's changes, and fixed the tests and updated the CR. 
New webrev: https://cr.openjdk.java.net/~manc/8234608/webrev.01/

The issue is that the signature of makeRedefinition0() in libdefine.cpp was wrong. It missed the "jclass clazz" parameter.

I have tested using 'make test TEST="test/hotspot/jtreg/vmTestbase/gc/g1/unloading/tests/unloading_redefinition_*" ', for both fastdebug and product builds. I suppose Submit repo would not run these tests, because it only runs tier1. Am I correct?

-Man

On Tue, Jan 28, 2020 at 11:18 AM Man Cao wrote:

> Thanks, Thomas. I'll take a look briefly.
> Agreed that the test should be able to run first, before pushing this
> change.
>
> -Man
>

From rwestrel at redhat.com Fri Jan 31 08:47:12 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 31 Jan 2020 09:47:12 +0100
Subject: RFR(S): 8237776: Shenandoah: Wrong result with Lucene test
Message-ID: <87wo98t3lb.fsf@redhat.com>

http://cr.openjdk.java.net/~roland/8237776/webrev.00/

xmm0 (an argument to a call) gets corrupted in the c2i adapter (when going from c1 code to the interpreter) at the ShenandoahRuntime::write_ref_field_pre_entry() runtime call. That call is in the c2i because of c2i_entry_barrier() and resolve_weak_handle(). The proposed fix saves all floating point argument registers.

Roland.

From rkennke at redhat.com Fri Jan 31 09:51:26 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 31 Jan 2020 10:51:26 +0100
Subject: RFR(S): 8237776: Shenandoah: Wrong result with Lucene test
In-Reply-To: <87wo98t3lb.fsf@redhat.com>
References: <87wo98t3lb.fsf@redhat.com>
Message-ID:

Hey Roland,

the patch looks good, but it lacks the x86_32 counterpart. Or would you rather handle that separately?

Thanks,
Roman

Roland Westrelin schrieb am Fr., 31. Jan. 2020, 09:47:

>
> http://cr.openjdk.java.net/~roland/8237776/webrev.00/
>
> xmm0 (an argument to a call) gets corrupted in the c2i adapter (when
> going from c1 code to the interpreter) at the
> ShenandoahRuntime::write_ref_field_pre_entry() runtime call. That call
> is in the c2i because of c2i_entry_barrier() and
> resolve_weak_handle(). The proposed fix saves all floating point
> argument registers.
>
> Roland.
>
>

From rwestrel at redhat.com Fri Jan 31 09:55:13 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 31 Jan 2020 10:55:13 +0100
Subject: RFR(S): 8237776: Shenandoah: Wrong result with Lucene test
In-Reply-To:
References: <87wo98t3lb.fsf@redhat.com>
Message-ID: <87tv4ct0fy.fsf@redhat.com>

> the patch looks good, but it lacks the x86_32 counterpart. Or would you
> rather handle that separately?

Actually, AFAIU, the 64 bits fix covers 32 bits too. 32bits needs xmm0 and xmm1 saved. So we're saving too many registers on 32 bits but that seems pretty armless.

Roland.

From rkennke at redhat.com Fri Jan 31 09:58:39 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 31 Jan 2020 10:58:39 +0100
Subject: RFR(S): 8237776: Shenandoah: Wrong result with Lucene test
In-Reply-To: <87tv4ct0fy.fsf@redhat.com>
References: <87wo98t3lb.fsf@redhat.com>
 <87tv4ct0fy.fsf@redhat.com>
Message-ID:

Roland Westrelin schrieb am Fr., 31. Jan. 2020, 10:55:

>
> > the patch looks good, but it lacks the x86_32 counterpart. Or would you
> > rather handle that separately?
>
> Actually, AFAIU, the 64 bits fix covers 32 bits too. 32bits needs xmm0
> and xmm1 saved. So we're saving too many registers on 32 bits but that
> seems pretty armless.
>
> Roland.
>

Ah OK. Good then!

(armless... nice typo! :-D )

Thanks,
Roman

>

From thomas.schatzl at oracle.com Fri Jan 31 10:41:13 2020
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 31 Jan 2020 11:41:13 +0100
Subject: RFR (S): 8238220: Rename OWSTTaskTerminator to TaskTerminator
In-Reply-To:
References: <5f99b054-e286-2a8c-5a37-d641eb4932f1@oracle.com>
Message-ID: <10c01fdb-d6e3-01a3-6cee-a8f467fac372@oracle.com>

Hi Sangheon,

On 30.01.20 19:08, sangheon.kim at oracle.com wrote:
> Hi Thomas,
>
> On 1/30/20 3:34 AM, Thomas Schatzl wrote:
>> Hi all,
>>
>> can I have reviews for this renaming change of OWSTTaskTerminator to
>> TaskTerminator now that there is only one task termination protocol
>> implementation?
>>
>> I believe that the OWST prefix only makes the code harder to read
>> without conveying interesting information at the uses.
>>
>> Based on JDK-8215297.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8238220
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8238220/webrev/
> Looks good as is.
>
> One thing to note is the order of renamed header file.
> It looks like you are treating uppercase first? :)
>
> e.g. at g1CollectedHeap.cpp
>
> +#include "gc/shared/taskTerminator.hpp"
> #include "gc/shared/taskqueue.inline.hpp"
>
>
> I expect alphabet order first and then upper-lowercase. :)
>

by default, upper case sorts before lower case in many if not all
situations on computers since typically all upper case letters are
"before" lower case letters in character sets.

I would like to keep it as is unless you or somebody else really
objects - there does not seem to be a precedence in hotspot files.

Thanks,
Thomas

From ivan.walulya at oracle.com Fri Jan 31 12:22:35 2020
From: ivan.walulya at oracle.com (Ivan Walulya)
Date: Fri, 31 Jan 2020 13:22:35 +0100
Subject: JDK-8233220: Remove Space::_par_seq_tasks member as it was only used by CMS
Message-ID: <32A6D3AA-F9CD-42C4-B810-2CA05AEFBFC0@oracle.com>

Hi all,

Please review a minor enhancement to remove Space::_par_seq_tasks member which was only used by CMS.
Bug: https://bugs.openjdk.java.net/browse/JDK-8233220
Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8233220/

Testing: Tier 1 - Tier 3

//Ivan

From per.liden at oracle.com Fri Jan 31 12:32:11 2020
From: per.liden at oracle.com (Per Liden)
Date: Fri, 31 Jan 2020 13:32:11 +0100
Subject: JDK-8233220: Remove Space::_par_seq_tasks member as it was only used by CMS
In-Reply-To: <32A6D3AA-F9CD-42C4-B810-2CA05AEFBFC0@oracle.com>
References: <32A6D3AA-F9CD-42C4-B810-2CA05AEFBFC0@oracle.com>
Message-ID: <91bccd09-a5c7-faea-204e-cf7c9a0dc561@oracle.com>

Looks good.

/Per

On 1/31/20 1:22 PM, Ivan Walulya wrote:
> Hi all,
>
> Please review a minor enhancement to remove Space::_par_seq_tasks member which was only used by CMS.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8233220
> Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8233220/
>
>
> Testing: Tier 1 - Tier 3

looks good.

From thomas.schatzl at oracle.com Fri Jan 31 12:34:04 2020
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 31 Jan 2020 13:34:04 +0100
Subject: JDK-8233220: Remove Space::_par_seq_tasks member as it was only used by CMS
In-Reply-To: <32A6D3AA-F9CD-42C4-B810-2CA05AEFBFC0@oracle.com>
References: <32A6D3AA-F9CD-42C4-B810-2CA05AEFBFC0@oracle.com>
Message-ID:

Hi,

On 31.01.20 13:22, Ivan Walulya wrote:
> Hi all,
>
> Please review a minor enhancement to remove Space::_par_seq_tasks member which was only used by CMS.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8233220
> Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8233220/
>
>
> Testing: Tier 1 - Tier 3

looks good.

Thomas

From ivan.walulya at oracle.com Fri Jan 31 12:35:57 2020
From: ivan.walulya at oracle.com (Ivan Walulya)
Date: Fri, 31 Jan 2020 13:35:57 +0100
Subject: JDK-8233220: Remove Space::_par_seq_tasks member as it was only used by CMS
In-Reply-To: <91bccd09-a5c7-faea-204e-cf7c9a0dc561@oracle.com>
References: <32A6D3AA-F9CD-42C4-B810-2CA05AEFBFC0@oracle.com>
 <91bccd09-a5c7-faea-204e-cf7c9a0dc561@oracle.com>
Message-ID:

Thanks Per!

//Ivan

> On 31 Jan 2020, at 13:32, Per Liden wrote:
>
> Looks good.
>
> /Per
>
> On 1/31/20 1:22 PM, Ivan Walulya wrote:
>> Hi all,
>> Please review a minor enhancement to remove Space::_par_seq_tasks member which was only used by CMS.
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8233220
>> Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8233220/
>> Testing: Tier 1 - Tier 3
>> //Ivan

From ivan.walulya at oracle.com Fri Jan 31 12:36:12 2020
From: ivan.walulya at oracle.com (Ivan Walulya)
Date: Fri, 31 Jan 2020 13:36:12 +0100
Subject: JDK-8233220: Remove Space::_par_seq_tasks member as it was only used by CMS
In-Reply-To:
References: <32A6D3AA-F9CD-42C4-B810-2CA05AEFBFC0@oracle.com>
Message-ID: <58A3B3BE-BD5C-433D-81C9-6859FB973E25@oracle.com>

Thanks Thomas!

//Ivan

> On 31 Jan 2020, at 13:34, Thomas Schatzl wrote:
>
> Hi,
>
> On 31.01.20 13:22, Ivan Walulya wrote:
>> Hi all,
>> Please review a minor enhancement to remove Space::_par_seq_tasks member which was only used by CMS.
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8233220
>> Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8233220/
>> Testing: Tier 1 - Tier 3
>
> looks good.
>
> Thomas

From leo.korinth at oracle.com Fri Jan 31 12:41:51 2020
From: leo.korinth at oracle.com (Leo Korinth)
Date: Fri, 31 Jan 2020 13:41:51 +0100
Subject: JDK-8233220: Remove Space::_par_seq_tasks member as it was only used by CMS
In-Reply-To: <32A6D3AA-F9CD-42C4-B810-2CA05AEFBFC0@oracle.com>
References: <32A6D3AA-F9CD-42C4-B810-2CA05AEFBFC0@oracle.com>
Message-ID: <42a03b0e-e48b-28c5-b16b-dd5f04d10f3e@oracle.com>

On 31/01/2020 13:22, Ivan Walulya wrote:
> Hi all,
>
> Please review a minor enhancement to remove Space::_par_seq_tasks member which was only used by CMS.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8233220
> Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8233220/

Looks good, I will sponsor it for you.

Thanks,
Leo

>
>
> Testing: Tier 1 - Tier 3
>
> //Ivan
>

From rkennke at redhat.com Fri Jan 31 12:49:48 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 31 Jan 2020 13:49:48 +0100
Subject: RFR(S): 8237776: Shenandoah: Wrong result with Lucene test
In-Reply-To:
References: <87wo98t3lb.fsf@redhat.com>
 <87tv4ct0fy.fsf@redhat.com>
Message-ID:

Thinking more about it: we should probably change the synopsis for the bug and RFR:

- Fix is not in Shenandoah code, but shared. Even though only Shenandoah currently seems to fall over it. Also, should probably draw attention of non-Shenandoah reviewers...
- Maybe reflect what the problem is and/or what the fix is?

Thanks,
Roman

Roman Kennke schrieb am Fr., 31. Jan. 2020, 10:58:

>
>
> Roland Westrelin schrieb am Fr., 31. Jan. 2020,
> 10:55:
>
>>
>> > the patch looks good, but it lacks the x86_32 counterpart. Or would you
>> > rather handle that separately?
>>
>> Actually, AFAIU, the 64 bits fix covers 32 bits too. 32bits needs xmm0
>> and xmm1 saved. So we're saving too many registers on 32 bits but that
>> seems pretty armless.
>>
>> Roland.
>>
>
> Ah OK. Good then!
>
> (armless... nice typo! :-D )
>
> Thanks,
> Roman
>
>
>>

From rwestrel at redhat.com Fri Jan 31 12:54:21 2020
From: rwestrel at redhat.com (Roland Westrelin)
Date: Fri, 31 Jan 2020 13:54:21 +0100
Subject: RFR(S): 8237776: Shenandoah: Wrong result with Lucene test
In-Reply-To:
References: <87wo98t3lb.fsf@redhat.com>
 <87tv4ct0fy.fsf@redhat.com>
Message-ID: <87lfpnu6pu.fsf@redhat.com>

> - Fix is not in Shenandoah code, but shared. Even though only Shenandoah
> currently seems to fall over it. Also, should probably draw attention of
> non-Shenandoah reviewers...

Fix IS in shenandoah code.

Roland.

From rkennke at redhat.com Fri Jan 31 12:58:48 2020
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 31 Jan 2020 13:58:48 +0100
Subject: RFR(S): 8237776: Shenandoah: Wrong result with Lucene test
In-Reply-To: <87lfpnu6pu.fsf@redhat.com>
References: <87wo98t3lb.fsf@redhat.com>
 <87tv4ct0fy.fsf@redhat.com>
 <87lfpnu6pu.fsf@redhat.com>
Message-ID:

Roland Westrelin schrieb am Fr., 31. Jan. 2020, 13:54:

>
> > - Fix is not in Shenandoah code, but shared. Even though only Shenandoah
> > currently seems to fall over it. Also, should probably draw attention of
> > non-Shenandoah reviewers...
>
> Fix IS in shenandoah code.
>

/me looks again. Duh. I should not review stuff on the phone, in the train. Nevermind then..

Thanks,
Roman

>
>
> Roland.
>
>

From ivan.walulya at oracle.com Fri Jan 31 13:32:17 2020
From: ivan.walulya at oracle.com (Ivan Walulya)
Date: Fri, 31 Jan 2020 14:32:17 +0100
Subject: JDK-8233220: Remove Space::_par_seq_tasks member as it was only used by CMS
In-Reply-To: <42a03b0e-e48b-28c5-b16b-dd5f04d10f3e@oracle.com>
References: <32A6D3AA-F9CD-42C4-B810-2CA05AEFBFC0@oracle.com>
 <42a03b0e-e48b-28c5-b16b-dd5f04d10f3e@oracle.com>
Message-ID:

Thanks Leo!

//Ivan

> On 31 Jan 2020, at 13:41, Leo Korinth wrote:
>
> On 31/01/2020 13:22, Ivan Walulya wrote:
>> Hi all,
>> Please review a minor enhancement to remove Space::_par_seq_tasks member which was only used by CMS.
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8233220
>> Webrev: http://cr.openjdk.java.net/~lkorinth/ivan/8233220/
>
> Looks good, I will sponsor it for you.
>
> Thanks,
> Leo
>
>> Testing: Tier 1 - Tier 3
>> //Ivan

From zgu at redhat.com Fri Jan 31 13:55:52 2020
From: zgu at redhat.com (Zhengyu Gu)
Date: Fri, 31 Jan 2020 08:55:52 -0500
Subject: RFR(S): 8237776: Shenandoah: Wrong result with Lucene test
In-Reply-To: <87wo98t3lb.fsf@redhat.com>
References: <87wo98t3lb.fsf@redhat.com>
Message-ID: <316762d7-b1b0-97c4-6cdb-ac31bd2d7e6b@redhat.com>

Looks good to me.

-Zhengyu

On 1/31/20 3:47 AM, Roland Westrelin wrote:
>
> http://cr.openjdk.java.net/~roland/8237776/webrev.00/
>
> xmm0 (an argument to a call) gets corrupted in the c2i adapter (when
> going from c1 code to the interpreter) at the
> ShenandoahRuntime::write_ref_field_pre_entry() runtime call. That call
> is in the c2i because of c2i_entry_barrier() and
> resolve_weak_handle(). The proposed fix saves all floating point
> argument registers.
>
> Roland.
>

From sangheon.kim at oracle.com Fri Jan 31 17:54:02 2020
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Fri, 31 Jan 2020 09:54:02 -0800
Subject: RFR (S): 8238220: Rename OWSTTaskTerminator to TaskTerminator
In-Reply-To: <10c01fdb-d6e3-01a3-6cee-a8f467fac372@oracle.com>
References: <5f99b054-e286-2a8c-5a37-d641eb4932f1@oracle.com>
 <10c01fdb-d6e3-01a3-6cee-a8f467fac372@oracle.com>
Message-ID: <65ce518b-56da-92a8-010a-e58c5c015a7e@oracle.com>

Hi Thomas,

On 1/31/20 2:41 AM, Thomas Schatzl wrote:
> Hi Sangheon,
>
> On 30.01.20 19:08, sangheon.kim at oracle.com wrote:
>> Hi Thomas,
>>
>> On 1/30/20 3:34 AM, Thomas Schatzl wrote:
>>> Hi all,
>>>
>>> can I have reviews for this renaming change of OWSTTaskTerminator
>>> to TaskTerminator now that there is only one task termination
>>> protocol implementation?
>>>
>>> I believe that the OWST prefix only makes the code harder to read
>>> without conveying interesting information at the uses.
>>>
>>> Based on JDK-8215297.
>>>
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8238220
>>> Webrev:
>>> http://cr.openjdk.java.net/~tschatzl/8238220/webrev/
>> Looks good as is.
>>
>> One thing to note is the order of renamed header file.
>> It looks like you are treating uppercase first? :)
>>
>> e.g. at g1CollectedHeap.cpp
>>
>> +#include "gc/shared/taskTerminator.hpp"
>> #include "gc/shared/taskqueue.inline.hpp"
>>
>>
>> I expect alphabet order first and then upper-lowercase. :)
>>
>
> by default, upper case sorts before lower case in many if not all
> situations on computers since typically all upper case letters are
> "before" lower case letters in character sets.
>
> I would like to keep it as is unless you or somebody else really
> objects - there does not seem to be a precedence in hotspot files.

I'm fine with current order.
As you said personally, hotspot style just says "Keep the include lines sorted".
https://wiki.openjdk.java.net/display/HotSpot/StyleGuide

Thanks,
Sangheon

>
> Thanks,
> Thomas

From kim.barrett at oracle.com Fri Jan 31 22:25:34 2020
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 31 Jan 2020 17:25:34 -0500
Subject: RFR: 8237143: Eliminate DirtyCardQ_cbl_mon
In-Reply-To: <86BABDA8-E402-49F3-B478-ED0E70490015@oracle.com>
References: <745E91C1-AE1A-4DA2-80EE-59B70897F4BF@oracle.com>
 <86BABDA8-E402-49F3-B478-ED0E70490015@oracle.com>
Message-ID: <40479EE1-74EF-4C5F-A04B-8877F0ED9ACB@oracle.com>

> On Jan 23, 2020, at 3:10 PM, Kim Barrett wrote:
>
>> On Jan 22, 2020, at 11:12 AM, Thomas Schatzl wrote:
>> On 16.01.20 09:51, Kim Barrett wrote:
>>> Please review this change to eliminate the DirtyCardQ_cbl_mon. This
>>> is one of the two remaining super-special "access" ranked mutexes.
>>> (The other is the Shared_DirtyCardQ_lock, whose elimination is covered
>>> by JDK-8221360.)
>>> There are three main parts to this change.
>>> (1) Replace the under-a-lock FIFO queue in G1DirtyCardQueueSet with a
>>> lock-free FIFO queue.
>>> (2) Replace the use of a HotSpot monitor for signaling activation of
>>> concurrent refinement threads with a semaphore-based solution.
>>> (3) Handle pausing of buffer refinement in the middle of a buffer in
>>> order to handle a pending safepoint request. This can no longer just
>>> push the partially processed buffer back onto the queue, due to ABA
>>> problems now that the buffer is lock-free.
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8237143
>>> Webrev:
>>> https://cr.openjdk.java.net/~kbarrett/8237143/open.00/
>>> Testing:
>>> mach5 tier1-5
>>> Normal performance testing showed no significant change.
>>> specjbb2015 on a very big machine showed a 3.5% average critical-jOPS
>>> improvement, though not statistically significant; removing contention
>>> for that lock by many hardware threads may be a little bit noticeable.
>>
>> initial comments only, and so far only about comments :( The code itself looks good to me, but I want to look over it again.
>
> After some offline discussion with Thomas, I'm doing some restructuring that
> makes it probably not very efficient for anyone else to do a careful review of
> the open.00 version.

Here's a new webrev:
https://cr.openjdk.java.net/~kbarrett/8237143/open.02/

Testing:
mach5 tier1-5
Performance testing showed no significant change.

I didn't bother providing an incremental webrev, because the changes to g1DirtyCardQueue.[ch]pp are pretty substantial. Those are the only files changed, except for the suggested move of the comment for G1ConcurrentRefineThread::maybe_deactivate and some related comment improvements nearby.

Most of this round of changes are refactoring within G1DirtyCardQueueSet, mainly adding internal helper classes for the FIFO queue and for the paused buffers, each with their own (commented) APIs. I think that has addressed a lot of Thomas's comments about the comments, and I hope has made the code easier to understand.
I've also improved the mechanism for handling "paused" buffers, simplifying it by making better use of some invariants. > On Jan 22, 2020, at 11:12 AM, Thomas Schatzl wrote: > // The key idea to make this work is that pop (get_completed_buffer) > // never returns an element of the queue if it is the only accessible > // element, > If I understand this correctly, maybe "if there is only one buffer in the FIFO" is easier to understand than "only accessible element". (or define "accessible element?). I specifically don't want to say it that way because we could have a situation like (1) Start with a queue having exactly one element. (2) Thread1 starts a push by updating tail, but has not yet linked the old tail to the new. (3) Thread2 performs a push. The buffer pushed by Thread2 is "in the queue" by some reasonable definition, so the queue contains two buffers. But that buffer is not yet accessible, because Thread1 hasn't completed its push. The alternative is to (in the description) somehow divorce a completed push from the notion of the number of buffers in the queue, which seems worse to me. I expanded the discussion a bit though, including what is meant by "accessible". > The code seems to unnecessarily use the NULL_buffer constant. Maybe use it here too. Overall I am not sure about the usefulness of using NULL_buffer in the code. The NULL value in Hotspot code is generally accepted as a special value, and the name "NULL_buffer" does not seem to add any information. The point of NULL_buffer was to avoid casts of NULL in Atomic operations, and I then used it consistently. But I've changed to using such casts, since it turned out there weren't that many and we can get rid of those uniformly here and elsewhere when we have C++11 nullptr and nullptr_t.