From per.liden at oracle.com  Thu Aug  1 08:28:54 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 1 Aug 2019 10:28:54 +0200
Subject: RFR: 8227226: Segmented array clearing for ZGC
In-Reply-To: <2a64dceb-6188-f740-708e-983a5f6f681e@oracle.com>
References: <5809FFE3-ED37-429B-9189-49D8FD14D092@amazon.com>
 <625131FC-8B02-4BEC-80B5-F1757232277A@amazon.com>
 <7C95DE35-C9F8-4F11-8D07-9511748128C4@amazon.com>
 <ec82f3db-5ced-2fd5-d584-d636fe6f824e@oracle.com>
 <62b08b98-f54c-5359-4cc0-be36d43febdd@oracle.com>
 <2a64dceb-6188-f740-708e-983a5f6f681e@oracle.com>
Message-ID: <5efec517-72f7-2f90-234e-3a9f2ec0248c@oracle.com>

Hi Thomas,

On 7/31/19 7:59 PM, Thomas Schatzl wrote:
> Hi,
> 
> On 31.07.19 10:19, Per Liden wrote:
>> Hi,
>>
>> I found some time to benchmark the "GC clears pages"-approach, and 
>> it's fairly clear that it's not paying off. So ditching that idea.
>>
>> However, I'm still looking for something that would not just do 
>> segmented clearing of arrays in large zpages. Letting oop arrays 
>> temporarily be typed arrays while it's being cleared could be an 
>> option. I did a prototype for that, which looks like this:
>>
>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
>>
>> There's at least one issue here, the code doing allocation sampling 
>> will see that we allocated long arrays instead of oop arrays, so the 
>> reporting there will be skewed. That can be addressed if we go down 
>> this path. The code is otherwise fairly simple and contained. Feel 
>> free to spot any issues.
> 
> That looks like a really neat way of doing this.
> 
> Looking over this there does not seem to be any real dependency on ZGC 
> code, so if you went this way, would it be possible to provide this 
> solution for all collectors?

This is potentially dangerous for any GC doing concurrent oop_iterate(), 
as in that case the klass pointer must only be read once, with acquire 
ordering.

An example in G1 where this would break is 
HeapRegion::do_oops_on_memregion_in_humongous(), and I'm thinking there 
are more cases. For example, when a half zeroed type array in young is 
promoted to old, and then we switch the klass pointer.
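
To make the hazard concrete, here is a minimal standalone sketch (not 
HotSpot code) of the read-once-with-acquire pattern. Klass, ArrayHeader, 
oops_to_visit() and publish_klass() are hypothetical stand-ins invented 
for illustration, and std::atomic is used in place of HotSpot's own 
atomic primitives.

```cpp
#include <atomic>
#include <cstddef>

// Hypothetical stand-ins for HotSpot's Klass and array header; not the real types.
struct Klass {
  bool is_oop_array;
};

struct ArrayHeader {
  std::atomic<const Klass*> klass{nullptr};
  std::size_t length = 0;
};

// Returns how many elements a concurrent visitor would treat as oops.
// The klass pointer is loaded exactly once, with acquire ordering, and every
// later decision uses that local snapshot instead of re-reading the header.
inline std::size_t oops_to_visit(const ArrayHeader& header) {
  const Klass* const k = header.klass.load(std::memory_order_acquire);
  if (k == nullptr || !k->is_oop_array) {
    return 0;  // Not published yet, or still typed as a primitive array.
  }
  return header.length;
}

// Allocating thread: install a klass with release ordering. For the final
// oop array klass this would only happen after the payload is fully zeroed.
inline void publish_klass(ArrayHeader& header, const Klass* klass) {
  header.klass.store(klass, std::memory_order_release);
}
```

A visitor that instead reads the klass twice could see the temporary 
long array type for one decision and the oop array type for the next, 
which is the kind of breakage I'm worried about.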

I wouldn't be surprised if CMS has similar problems, but I haven't checked.

However, this would probably work fine for Serial and Parallel. On the 
other hand, depending on the performance impact, it's not completely 
obvious that you'd want it there.

We could perhaps add this code to the shared ObjArrayAllocator, and 
introduce a CollectedHeap::supports_segmented_array_clearing() so that 
GCs can easily opt-in when they are ready to do so.
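
A rough sketch of what such an opt-in could look like, assuming a 
simplified class hierarchy. The CollectedHeap below is a toy stand-in, 
and ZLikeHeap, G1LikeHeap and clearing_segment_bytes() are hypothetical 
names that exist only in this example.

```cpp
#include <cstddef>

// Hypothetical, simplified stand-in for CollectedHeap; not the HotSpot class.
class CollectedHeap {
public:
  virtual ~CollectedHeap() = default;
  // Conservative default: a collector must opt in explicitly.
  virtual bool supports_segmented_array_clearing() const { return false; }
};

class ZLikeHeap : public CollectedHeap {
public:
  bool supports_segmented_array_clearing() const override { return true; }
};

class G1LikeHeap : public CollectedHeap {};  // Has not opted in.

// Size of the chunks the shared allocator would clear between safepoint polls.
// Returns the whole array size ("clear in one go") when the heap has not
// opted in or the array already fits in a single segment.
inline std::size_t clearing_segment_bytes(const CollectedHeap& heap,
                                          std::size_t array_bytes,
                                          std::size_t segment_bytes) {
  if (!heap.supports_segmented_array_clearing() || array_bytes <= segment_bytes) {
    return array_bytes;
  }
  return segment_bytes;
}
```

With a default of false, collectors that do concurrent oop_iterate() 
keep today's behavior until someone has audited them.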

> 
> For other collectors slightly larger segment sizes might be sufficient 
> too to slightly favor performance.
> 
> Did you measure the impact on zeroing throughput of this?

I haven't done any performance measurements of this yet. The current 4K 
segment size was just an educated guess, but it might not be the optimal 
number.
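
For anyone who wants to experiment with the segment size, the clearing 
loop can be modelled outside the VM roughly like this. It is a 
standalone sketch, not the webrev code: clear_segmented() and its poll 
callback are hypothetical, with the callback standing in for the 
safepoint check done between segments.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstring>

// Zeroes `bytes` bytes at `mem` in segments of at most `segment_bytes`,
// invoking `poll` after each segment. `poll` is a hypothetical hook standing
// in for a safepoint check. Returns the number of segments cleared.
template <typename Poll>
std::size_t clear_segmented(void* mem, std::size_t bytes,
                            std::size_t segment_bytes, Poll poll) {
  if (segment_bytes == 0) {
    segment_bytes = bytes;  // Degenerate segment size: clear in one go.
  }
  char* cursor = static_cast<char*>(mem);
  std::size_t remaining = bytes;
  std::size_t segments = 0;
  while (remaining > 0) {
    const std::size_t n = std::min(remaining, segment_bytes);
    std::memset(cursor, 0, n);
    cursor += n;
    remaining -= n;
    ++segments;
    poll();
  }
  return segments;
}
```

Timing this with different segment_bytes values would give a first hint 
of how much zeroing throughput the extra polls cost, although real 
numbers have to come from benchmarks in the VM.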

cheers,
Per


From thomas.schatzl at oracle.com  Thu Aug  1 09:43:52 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 1 Aug 2019 02:43:52 -0700
Subject: RFR: 8227226: Segmented array clearing for ZGC
In-Reply-To: <5efec517-72f7-2f90-234e-3a9f2ec0248c@oracle.com>
References: <5809FFE3-ED37-429B-9189-49D8FD14D092@amazon.com>
 <625131FC-8B02-4BEC-80B5-F1757232277A@amazon.com>
 <7C95DE35-C9F8-4F11-8D07-9511748128C4@amazon.com>
 <ec82f3db-5ced-2fd5-d584-d636fe6f824e@oracle.com>
 <62b08b98-f54c-5359-4cc0-be36d43febdd@oracle.com>
 <2a64dceb-6188-f740-708e-983a5f6f681e@oracle.com>
 <5efec517-72f7-2f90-234e-3a9f2ec0248c@oracle.com>
Message-ID: <543741a4-f510-96f5-8b5a-eb1f27cf7df0@oracle.com>

On 01.08.19 01:28, Per Liden wrote:
> Hi Thomas,
> 
> On 7/31/19 7:59 PM, Thomas Schatzl wrote:
>> Hi,
>>
>> On 31.07.19 10:19, Per Liden wrote:
>>> Hi,
>>>
>>> I found some time to benchmark the "GC clears pages"-approach, and 
>>> it's fairly clear that it's not paying off. So ditching that idea.
>>>
>>> However, I'm still looking for something that would not just do 
>>> segmented clearing of arrays in large zpages. Letting oop arrays 
>>> temporarily be typed arrays while it's being cleared could be an 
>>> option. I did a prototype for that, which looks like this:
>>>
>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
>>>
>>> There's at least one issue here, the code doing allocation sampling 
>>> will see that we allocated long arrays instead of oop arrays, so the 
>>> reporting there will be skewed. That can be addressed if we go down 
>>> this path. The code is otherwise fairly simple and contained. Feel 
>>> free to spot any issues.
>>
>> That looks like a really neat way of doing this.
>>
>> Looking over this there does not seem to be any real dependency on ZGC 
>> code, so if you went this way, would it be possible to provide this 
>> solution for all collectors?
> 
> This is potentially dangerous for any GC doing concurrent oop_iterate(), 
> as in that case the klass pointer must only be read once, with acquire 
> ordering.
> 
> An example in G1 where this would break is 
> HeapRegion::do_oops_on_memregion_in_humongous(), and I'm thinking there 
> are more cases. 

Point taken, you are completely right, I was not thinking it through.

However for humongous objects it might be sufficient to just zero 
manually in a loop with basically the same safepoint polling loop while 
the klass is still NULL (and make sure it is not done again later).

Of course, also making sure that these seemingly empty regions are not 
reclaimed during the safepoint somehow in a different way. :)

> For example, when a half zeroed type array in young is
> promoted to old, and then we switch the klass pointer.

In G1 we are probably not so worried about "large" objects in young 
gen - while an object of the 16M max size takes some time to clear, only 
handling the humongous objects would already help a lot, I believe.

Actually another approach could be the GC completing the zeroing in 
parallel for young gen objects - at that time it does have all memory 
bandwidth for itself. That would at least improve the situation unless 
many threads do that at the same time (still, these objects may be at 
most 16M in size).

Or just guaranteeing that such objects stay in survivor "zeroing" 
regions during a gc (in case of evac failure, do the work in the pause). 
Another option would be delaying refinement for cards in these regions 
if after gc we have such objects until completed (which may be not 
enough due to memory visibility issues, but I just like that idea right 
now :) ).

It is unclear if such large effort makes sense though, and probably 
there are better options with a bit more thought :).

> I wouldn't be surprised if CMS has similar problems, but I haven't checked.

At this time I would not spend time on any new feature for CMS that is 
not absolutely necessary.

> However, this would probably work fine for Serial and Parallel. On the 
> other hand, depending on the performance impact, it's not completely 
> obvious that you'd want it there.
> 
> We could perhaps add this code to the shared ObjArrayAllocator, and 
> introduce a CollectedHeap::supports_segmented_array_clearing() so that 
> GCs can easily opt-in when they are ready to do so.

Not sure. It is probably worth looking into how this would work in the 
other collectors in a different CR, I would keep it ZGC local for now 
after all.

>>
>> For other collectors slightly larger segment sizes might be sufficient 
>> too to slightly favor performance.
>>
>> Did you measure the impact on zeroing throughput of this?
> 
> I haven't done any performance measurements of this yet. The current 4K 
> segment size was just an educated guess, but it might not be the optimal 
> number.
> 

Okay, thanks.

Thomas.


From per.liden at oracle.com  Thu Aug  1 10:19:22 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 1 Aug 2019 12:19:22 +0200
Subject: RFR: 8227226: Segmented array clearing for ZGC
In-Reply-To: <543741a4-f510-96f5-8b5a-eb1f27cf7df0@oracle.com>
References: <5809FFE3-ED37-429B-9189-49D8FD14D092@amazon.com>
 <625131FC-8B02-4BEC-80B5-F1757232277A@amazon.com>
 <7C95DE35-C9F8-4F11-8D07-9511748128C4@amazon.com>
 <ec82f3db-5ced-2fd5-d584-d636fe6f824e@oracle.com>
 <62b08b98-f54c-5359-4cc0-be36d43febdd@oracle.com>
 <2a64dceb-6188-f740-708e-983a5f6f681e@oracle.com>
 <5efec517-72f7-2f90-234e-3a9f2ec0248c@oracle.com>
 <543741a4-f510-96f5-8b5a-eb1f27cf7df0@oracle.com>
Message-ID: <cc5de3e4-17eb-08aa-2a8e-3335bd538efc@oracle.com>

Hi Thomas,

On 8/1/19 11:43 AM, Thomas Schatzl wrote:
> On 01.08.19 01:28, Per Liden wrote:
>> Hi Thomas,
>>
>> On 7/31/19 7:59 PM, Thomas Schatzl wrote:
>>> Hi,
>>>
>>> On 31.07.19 10:19, Per Liden wrote:
>>>> Hi,
>>>>
>>>> I found some time to benchmark the "GC clears pages"-approach, and 
>>>> it's fairly clear that it's not paying off. So ditching that idea.
>>>>
>>>> However, I'm still looking for something that would not just do 
>>>> segmented clearing of arrays in large zpages. Letting oop arrays 
>>>> temporarily be typed arrays while it's being cleared could be an 
>>>> option. I did a prototype for that, which looks like this:
>>>>
>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
>>>>
>>>> There's at least one issue here, the code doing allocation sampling 
>>>> will see that we allocated long arrays instead of oop arrays, so the 
>>>> reporting there will be skewed. That can be addressed if we go down 
>>>> this path. The code is otherwise fairly simple and contained. Feel 
>>>> free to spot any issues.
>>>
>>> That looks like a really neat way of doing this.
>>>
>>> Looking over this there does not seem to be any real dependency on 
>>> ZGC code, so if you went this way, would it be possible to provide 
>>> this solution for all collectors?
>>
>> This is potentially dangerous for any GC doing concurrent 
>> oop_iterate(), as in that case the klass pointer must only be read 
>> once, with acquire ordering.
>>
>> An example in G1 where this would break is 
>> HeapRegion::do_oops_on_memregion_in_humongous(), and I'm thinking 
>> there are more cases. 
> 
> Point taken, you are completely right, I was not thinking it through.
> 
> However for humongous objects it might be sufficient to just zero 
> manually in a loop with basically the same safepoint polling loop while 
> the klass is still NULL (and make sure it is not done again later).
> 
> Of course, also making sure that these seemingly empty regions are not 
> reclaimed during the safepoint somehow in a different way. :)

Yes, something like that could probably be done, and it's not completely 
different from Stefan's original patch for this where he pinned the page 
(stopping it from being collected) while it was being cleared.

However, for ZGC, I'd really like to solve this problem for all arrays, 
not just those allocated in large zpages, which is why I've been keen on 
exploring some other options.

> 
> > For example, when a half zeroed type array in young is
> > promoted to old, and then we switch the klass pointer.
> 
> In G1 we are probably not so worried about "large" objects in young 
> gen - while an object of the 16M max size takes some time to clear, only 
> handling the humongous objects would already help a lot, I believe.
> 
> Actually another approach could be the GC completing the zeroing in 
> parallel for young gen objects - at that time it does have all memory 
> bandwidth for itself. That would at least improve the situation unless 
> many threads do that at the same time (still, these objects may be at 
> most 16M in size).
> 
> Or just guaranteeing that such objects stay in survivor "zeroing" 
> regions during a gc (in case of evac failure, do the work in the pause). 
> Another option would be delaying refinement for cards in these regions 
> if after gc we have such objects until completed (which may be not 
> enough due to memory visibility issues, but I just like that idea right 
> now :) ).
> 
> It is unclear if such large effort makes sense though, and probably 
> there are better options with a bit more thought :).
> 
>> I wouldn't be surprised if CMS has similar problems, but I haven't checked.
> 
> At this time I would not spend time on any new feature for CMS that is 
> not absolutely necessary.
> 
>> However, this would probably work fine for Serial and Parallel. On the 
>> other hand, depending on the performance impact, it's not completely 
>> obvious that you'd want it there.
>>
>> We could perhaps add this code to the shared ObjArrayAllocator, and 
>> introduce a CollectedHeap::supports_segmented_array_clearing() so that 
>> GCs can easily opt-in when they are ready to do so.
> 
> Not sure. It is probably worth looking into how this would work in the 
> other collectors in a different CR, I would keep it ZGC local for now 
> after all.

I agree. I'd like to keep this ZGC-specific for now. A future RFE could 
look into bringing this feature (perhaps solved in a different way) to 
other collectors, if deemed important.

cheers,
Per

> 
>>>
>>> For other collectors slightly larger segment sizes might be 
>>> sufficient too to slightly favor performance.
>>>
>>> Did you measure the impact on zeroing throughput of this?
>>
>> I haven't done any performance measurements of this yet. The current 
>> 4K segment size was just an educated guess, but it might not be the 
>> optimal number.
>>
> 
> Okay, thanks.
> 
> Thomas.


From per.liden at oracle.com  Thu Aug  1 14:14:25 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 1 Aug 2019 16:14:25 +0200
Subject: RFR: 8227226: Segmented array clearing for ZGC
In-Reply-To: <62b08b98-f54c-5359-4cc0-be36d43febdd@oracle.com>
References: <5809FFE3-ED37-429B-9189-49D8FD14D092@amazon.com>
 <625131FC-8B02-4BEC-80B5-F1757232277A@amazon.com>
 <7C95DE35-C9F8-4F11-8D07-9511748128C4@amazon.com>
 <ec82f3db-5ced-2fd5-d584-d636fe6f824e@oracle.com>
 <62b08b98-f54c-5359-4cc0-be36d43febdd@oracle.com>
Message-ID: <f5548322-fed7-ec09-5be2-e8207ddf6e15@oracle.com>

Here's an updated webrev that should be complete, i.e. fixes the issues 
related to allocation sampling/reporting that I mentioned. This patch 
makes MemAllocator::finish() virtual, so that we can do our thing and 
install the correct klass pointer before the Allocation destructor 
executes. This seems to be the least intrusive way of doing this.

http://cr.openjdk.java.net/~pliden/8227226/webrev.2
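
As a rough illustration of why finish() has to run before the Allocation 
destructor, here is a heavily simplified toy model. It is not the webrev 
code: Object, the string "klass", the samples vector and 
SegmentedObjArrayAllocator are all hypothetical, and the real 
MemAllocator::finish() has a different signature.

```cpp
#include <string>
#include <utility>
#include <vector>

// Toy object header: the string stands in for the klass pointer.
struct Object {
  std::string klass;
};

// Hypothetical, simplified model of MemAllocator; not the HotSpot class.
class MemAllocator {
public:
  explicit MemAllocator(std::string klass) : _klass(std::move(klass)) {}
  virtual ~MemAllocator() = default;

  // Initializes `obj` and records the klass seen by the sampler in `samples`.
  void allocate(Object& obj, std::vector<std::string>& samples) const {
    Allocation allocation(obj, samples);
    finish(obj);
  }  // ~Allocation runs here, after finish() has returned.

protected:
  // Virtual so that a GC-specific allocator can wrap the klass installation.
  virtual void finish(Object& obj) const { obj.klass = _klass; }

  const std::string _klass;

private:
  // RAII helper whose destructor plays the role of allocation sampling.
  class Allocation {
  public:
    Allocation(Object& obj, std::vector<std::string>& samples)
      : _obj(obj), _samples(samples) {}
    ~Allocation() { _samples.push_back(_obj.klass); }

  private:
    Object& _obj;
    std::vector<std::string>& _samples;
  };
};

class SegmentedObjArrayAllocator : public MemAllocator {
public:
  using MemAllocator::MemAllocator;

protected:
  void finish(Object& obj) const override {
    obj.klass = "long[]";       // Temporary primitive array type while clearing.
    // ... segmented clearing with safepoint polls would happen here ...
    MemAllocator::finish(obj);  // Install the real klass before sampling runs.
  }
};
```

In this model the sampler only ever sees the final klass, because the 
override has returned by the time the Allocation object goes out of 
scope.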

This passed function testing, but proper benchmarking remains to be done.

cheers,
Per

On 7/31/19 7:19 PM, Per Liden wrote:
> Hi,
> 
> I found some time to benchmark the "GC clears pages"-approach, and it's 
> fairly clear that it's not paying off. So ditching that idea.
> 
> However, I'm still looking for something that would not just do 
> segmented clearing of arrays in large zpages. Letting oop arrays 
> temporarily be typed arrays while it's being cleared could be an option. 
> I did a prototype for that, which looks like this:
> 
> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
> 
> There's at least one issue here, the code doing allocation sampling will 
> see that we allocated long arrays instead of oop arrays, so the 
> reporting there will be skewed. That can be addressed if we go down this 
> path. The code is otherwise fairly simple and contained. Feel free to 
> spot any issues.
> 
> cheers,
> Per
> 
> On 7/26/19 2:27 PM, Per Liden wrote:
>> Hi Ryan & Erik,
>>
>> I had a look at this and started exploring a slightly different 
>> approach. Instead of doing segmented clearing in the allocation path, we 
>> can have the concurrent GC thread clear pages when they are reclaimed 
>> and not do any clearing in the allocation path at all.
>>
>> That would look like this:
>>
>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
>>
>> (I've had to temporarily comment out three lines of assert/debug code 
>> to make this work)
>>
>> The relocation set selection phase will now be tasked with some 
>> potentially expensive clearing work, so we'll want to make that part 
>> parallel also.
>>
>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
>>
>> Moving this work from Java threads onto the concurrent GC threads 
>> means we will potentially prolong the RelocationSetSelection and 
>> Relocation phases. That might be a trade-off worth doing. In return, 
>> we get:
>>
>> * Faster array allocations, as there's now less work done in the 
>> allocation path.
>> * This benefits all arrays, not just those allocated in large pages.
>> * No need to consider/tune a "chunk size".
>> * I also tend to think we'll end up with slightly less complex code, that 
>> is a bit easier to reason about. Can be debated of course.
>>
>> This approach might also "survive" longer, because the YC scheme we've 
>> been loosely thinking about currently requires newly allocated pages 
>> to be cleared anyway. It's of course too early to tell if that 
>> requirement will stand in the end, but it's possible anyway.
>>
>> I'll need to do some more testing and benchmarking to make sure 
>> there's no regression or bugs here. The commented out debug code also 
>> needs to be addressed of course.
>>
>> Comments? Other ideas?
>>
>> cheers,
>> Per
>>
>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
>>>
>>> Somehow I lost the RFR off the front and started a new thread.
>>> Now that we're both off vacation I'd like to revisit this.  Can you 
>>> take a look?
>>>
>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of Sciampacone, Ryan" 
>>> <hotspot-gc-dev-bounces at openjdk.java.net on behalf of sci at amazon.com> 
>>> wrote:
>>>
>>>      http://cr.openjdk.java.net/~phh/8227226/webrev.01/
>>>      This shifts away from abusing the constructor do_zero magic in 
>>> exchange for virtualizing mem_clear() and specializing for the Z 
>>> version.  It does create a change in mem_clear in that it returns an 
>>> updated version of mem.  It does create change outside of the Z code 
>>> however it does feel cleaner.
>>>      I didn't make a change to PinAllocating - looking at it, the 
>>> safety of keeping it constructor / destructor based still seemed 
>>> appropriate to me.  If the objection is to using the sequence numbers 
>>> to pin (and instead using handles to update) - this to me seems less 
>>> error prone.  I had originally discussed handles with Stefan but the 
>>> proposal came down to this which looks much cleaner.
>>>      On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of Sciampacone, 
>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of 
>>> sci at amazon.com> wrote:
>>>          1) Yes this was a conscious decision.  There was discussion 
>>> on determining the optimal point for breakup but given the existing 
>>> sizes this seemed sufficient.  This doesn't preclude the ability to 
>>> go down that path if it's deemed absolutely necessary.  The path for 
>>> more complex decisions is now available.
>>>          2) Agree
>>>          3) I'm not clear here.  Do you mean effectively going direct 
>>> to ZHeap and bypassing the single function PinAllocating?  Agree. 
>>> Otherwise I'll ask you to be a bit clearer.
>>>          4) Agree
>>>          5) I initially had the exact same reaction but I played 
>>> around with a few other versions (including breaking up 
>>> initialization points between header and body to get the desired 
>>> results) and this ended up looking correct.  I'll try mixing in the 
>>> mem clearer function again (fresh start) to see if it looks any better.
>>>          On 7/8/19, 5:49 AM, "Per Liden" <per.liden at oracle.com> wrote:
>>>              Hi Ryan,
>>>              A few general comments:
>>>              1) It looks like this still only works for large pages?
>>>              2) The log_info stuff should be removed.
>>>              3) I'm not a huge fan of single-use utilities like PinAllocating, at
>>>              least not when, IMO, the alternative is more straightforward and less code.
>>>              4) Please make locals const when possible.
>>>              5) Duplicating _do_zero looks odd. Injecting a "mem clearer", similar to
>>>              what Stefan's original patch did, seems worth exploring.
>>>              cheers,
>>>              /Per
>>>              (Btw, I'm on vacation so I might not be super-responsive to emails)
>>>              On 2019-07-08 12:42, Erik Österlund wrote:
>>>              > Hi Ryan,
>>>              >
>>>              > This looks good in general. Just some stylistic things...
>>>              >
>>>              > 1) In the ZGC project we like the letter 'Z' so much that we put it in
>>>              > front of everything we possibly can, including all class names.
>>>              > 2) We also explicitly state things are private even though it's
>>>              > bleedingly obvious.
>>>              >
>>>              > So:
>>>              >
>>>              > 39 class PinAllocating {
>>>              > 40 HeapWord* _mem;
>>>              > 41 public:
>>>              >
>>>              > ->
>>>              >
>>>              > 39 class ZPinAllocating {
>>>              > 40 private:
>>>              > 41 HeapWord* _mem;
>>>              > 42
>>>              > 41 public:
>>>              >
>>>              > I can be your sponsor and push this change for you. I don't
>>>              > think there is a need for another webrev for my small stylistic remarks,
>>>              > so I can just fix that before pushing this for you. On that note, I'll
>>>              > add me and StefanK to the contributed-by section as we all worked out
>>>              > the right solution to this problem collaboratively. I have run through
>>>              > mach5 tier1-5, and found no issues with this patch.
>>>              >
>>>              > Thanks,
>>>              > /Erik
>>>              >
>>>              > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
>>>              >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
>>>              >> https://bugs.openjdk.java.net/browse/JDK-8227226
>>>              >>
>>>              >> This patch introduces safe point checks into array clearing during
>>>              >> allocation for ZGC.  The patch isolates the changes to ZGC as (in
>>>              >> particular with the more modern collectors) the approach to
>>>              >> incrementalizing or respecting safe point checks is going to be
>>>              >> different.
>>>              >>
>>>              >> The approach is to keep the region holding the array in the allocating
>>>              >> state (pin logic) while updating the color to the array after checks.
>>>              >>
>>>              >> Can I get a review?  Thanks.
>>>              >>
>>>              >> Ryan
>>>              >
>>>

From erik.osterlund at oracle.com  Thu Aug  1 15:56:10 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Thu, 1 Aug 2019 17:56:10 +0200
Subject: RFR: 8227226: Segmented array clearing for ZGC
In-Reply-To: <f5548322-fed7-ec09-5be2-e8207ddf6e15@oracle.com>
References: <5809FFE3-ED37-429B-9189-49D8FD14D092@amazon.com>
 <625131FC-8B02-4BEC-80B5-F1757232277A@amazon.com>
 <7C95DE35-C9F8-4F11-8D07-9511748128C4@amazon.com>
 <ec82f3db-5ced-2fd5-d584-d636fe6f824e@oracle.com>
 <62b08b98-f54c-5359-4cc0-be36d43febdd@oracle.com>
 <f5548322-fed7-ec09-5be2-e8207ddf6e15@oracle.com>
Message-ID: <323E8C1C-FA9B-44E1-9C4F-7275255B3906@oracle.com>

Hi Per,

I like that this approach is unintrusive, does its thing at the right abstraction layer, and also handles medium sized arrays. Looks good.

Thanks,
/Erik

> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
> 
> Here's an updated webrev that should be complete, i.e. fixes the issues related to allocation sampling/reporting that I mentioned. This patch makes MemAllocator::finish() virtual, so that we can do our thing and install the correct klass pointer before the Allocation destructor executes. This seems to be the least intrusive way of doing this.
> 
> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
> 
> This passed function testing, but proper benchmarking remains to be done.
> 
> cheers,
> Per
> 
>> On 7/31/19 7:19 PM, Per Liden wrote:
>> Hi,
>> I found some time to benchmark the "GC clears pages"-approach, and it's fairly clear that it's not paying off. So ditching that idea.
>> However, I'm still looking for something that would not just do segmented clearing of arrays in large zpages. Letting oop arrays temporarily be typed arrays while it's being cleared could be an option. I did a prototype for that, which looks like this:
>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
>> There's at least one issue here, the code doing allocation sampling will see that we allocated long arrays instead of oop arrays, so the reporting there will be skewed. That can be addressed if we go down this path. The code is otherwise fairly simple and contained. Feel free to spot any issues.
>> cheers,
>> Per
>>> On 7/26/19 2:27 PM, Per Liden wrote:
>>> Hi Ryan & Erik,
>>> 
>>> I had a look at this and started exploring a slightly different approach. Instead of doing segmented clearing in the allocation path, we can have the concurrent GC thread clear pages when they are reclaimed and not do any clearing in the allocation path at all.
>>> 
>>> That would look like this:
>>> 
>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
>>> 
>>> (I've had to temporarily comment out three lines of assert/debug code to make this work)
>>> 
>>> The relocation set selection phase will now be tasked with some potentially expensive clearing work, so we'll want to make that part parallel also.
>>> 
>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
>>> 
>>> Moving this work from Java threads onto the concurrent GC threads means we will potentially prolong the RelocationSetSelection and Relocation phases. That might be a trade-off worth doing. In return, we get:
>>> 
>>> * Faster array allocations, as there's now less work done in the allocation path.
>>> * This benefits all arrays, not just those allocated in large pages.
>>> * No need to consider/tune a "chunk size".
>>> * I also tend to think we'll end up with slightly less complex code, that is a bit easier to reason about. Can be debated of course.
>>> 
>>> This approach might also "survive" longer, because the YC scheme we've been loosely thinking about currently requires newly allocated pages to be cleared anyway. It's of course too early to tell if that requirement will stand in the end, but it's possible anyway.
>>> 
>>> I'll need to do some more testing and benchmarking to make sure there's no regression or bugs here. The commented out debug code also needs to be addressed of course.
>>> 
>>> Comments? Other ideas?
>>> 
>>> cheers,
>>> Per
>>> 
>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
>>>> 
>>>> Somehow I lost the RFR off the front and started a new thread.
>>>> Now that we're both off vacation I'd like to revisit this.  Can you take a look?
>>>> 
>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of Sciampacone, Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of sci at amazon.com> wrote:
>>>> 
>>>>      http://cr.openjdk.java.net/~phh/8227226/webrev.01/
>>>>      This shifts away from abusing the constructor do_zero magic in exchange for virtualizing mem_clear() and specializing for the Z version.  It does create a change in mem_clear in that it returns an updated version of mem.  It does create change outside of the Z code however it does feel cleaner.
>>>>      I didn't make a change to PinAllocating - looking at it, the safety of keeping it constructor / destructor based still seemed appropriate to me.  If the objection is to using the sequence numbers to pin (and instead using handles to update) - this to me seems less error prone.  I had originally discussed handles with Stefan but the proposal came down to this which looks much cleaner.
>>>>      On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of Sciampacone, Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of sci at amazon.com> wrote:
>>>>          1) Yes this was a conscious decision.  There was discussion on determining the optimal point for breakup but given the existing sizes this seemed sufficient.  This doesn't preclude the ability to go down that path if it's deemed absolutely necessary.  The path for more complex decisions is now available.
>>>>          2) Agree
>>>>          3) I'm not clear here.  Do you mean effectively going direct to ZHeap and bypassing the single function PinAllocating?  Agree. Otherwise I'll ask you to be a bit clearer.
>>>>          4) Agree
>>>>          5) I initially had the exact same reaction but I played around with a few other versions (including breaking up initialization points between header and body to get the desired results) and this ended up looking correct.  I'll try mixing in the mem clearer function again (fresh start) to see if it looks any better.
>>>>          On 7/8/19, 5:49 AM, "Per Liden" <per.liden at oracle.com> wrote:
>>>>              Hi Ryan,
>>>>              A few general comments:
>>>>              1) It looks like this still only works for large pages?
>>>>              2) The log_info stuff should be removed.
>>>>              3) I'm not a huge fan of single-use utilities like PinAllocating, at
>>>>              least not when, IMO, the alternative is more straightforward and less code.
>>>>              4) Please make locals const when possible.
>>>>              5) Duplicating _do_zero looks odd. Injecting a "mem clearer", similar to
>>>>              what Stefan's original patch did, seems worth exploring.
>>>>              cheers,
>>>>              /Per
>>>>              (Btw, I'm on vacation so I might not be super-responsive to emails)
>>>>              On 2019-07-08 12:42, Erik Österlund wrote:
>>>>              > Hi Ryan,
>>>>              >
>>>>              > This looks good in general. Just some stylistic things...
>>>>              >
>>>>              > 1) In the ZGC project we like the letter 'Z' so much that we put it in
>>>>              > front of everything we possibly can, including all class names.
>>>>              > 2) We also explicitly state things are private even though it's
>>>>              > bleedingly obvious.
>>>>              >
>>>>              > So:
>>>>              >
>>>>              > 39 class PinAllocating {
>>>>              > 40 HeapWord* _mem;
>>>>              > 41 public:
>>>>              >
>>>>              > ->
>>>>              >
>>>>              > 39 class ZPinAllocating {
>>>>              > 40 private:
>>>>              > 41 HeapWord* _mem;
>>>>              > 42
>>>>              > 41 public:
>>>>              >
>>>>              > I can be your sponsor and push this change for you. I don't
>>>>              > think there is a need for another webrev for my small stylistic remarks,
>>>>              > so I can just fix that before pushing this for you. On that note, I'll
>>>>              > add me and StefanK to the contributed-by section as we all worked out
>>>>              > the right solution to this problem collaboratively. I have run through
>>>>              > mach5 tier1-5, and found no issues with this patch.
>>>>              >
>>>>              > Thanks,
>>>>              > /Erik
>>>>              >
>>>>              > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
>>>>              >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
>>>>              >> https://bugs.openjdk.java.net/browse/JDK-8227226
>>>>              >>
>>>>              >> This patch introduces safe point checks into array clearing during
>>>>              >> allocation for ZGC.  The patch isolates the changes to ZGC as (in
>>>>              >> particular with the more modern collectors) the approach to
>>>>              >> incrementalizing or respecting safe point checks is going to be
>>>>              >> different.
>>>>              >>
>>>>              >> The approach is to keep the region holding the array in the allocating
>>>>              >> state (pin logic) while updating the color to the array after checks.
>>>>              >>
>>>>              >> Can I get a review?  Thanks.
>>>>              >>
>>>>              >> Ryan
>>>>              >
>>>> 


From leeharland at gmail.com  Fri Aug  2 08:04:23 2019
From: leeharland at gmail.com (Lee Harland)
Date: Fri, 2 Aug 2019 09:04:23 +0100
Subject: jdk11.0.2+9 G1BarrierSetC2::eliminate_gc_barrier JVM crash
In-Reply-To: <CAP63v+ZstchXUgGtWGsyz8oMJzJUsbi7fnHLyxvQfQE2MdS0mg@mail.gmail.com>
References: <CAP63v+ZstchXUgGtWGsyz8oMJzJUsbi7fnHLyxvQfQE2MdS0mg@mail.gmail.com>
Message-ID: <CAP63v+ZckH+tVNB+8gi9355A=Veni3z2UO1nTCMkUUbhuUgimw@mail.gmail.com>

Hi - I encountered this error while running Adopt Java 9+ and posted it on
the Adopt GitHub, and they suggested I post it here. Link to the specific
test case I made:
https://github.com/AdoptOpenJDK/openjdk-build/issues/967#issuecomment-514122684

Apologies if this is the wrong list or there's not enough info etc, just doing what I was told
:)

cheers


From shade at redhat.com  Fri Aug  2 08:16:44 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 2 Aug 2019 10:16:44 +0200
Subject: jdk11.0.2+9 G1BarrierSetC2::eliminate_gc_barrier JVM crash
In-Reply-To: <CAP63v+ZckH+tVNB+8gi9355A=Veni3z2UO1nTCMkUUbhuUgimw@mail.gmail.com>
References: <CAP63v+ZstchXUgGtWGsyz8oMJzJUsbi7fnHLyxvQfQE2MdS0mg@mail.gmail.com>
 <CAP63v+ZckH+tVNB+8gi9355A=Veni3z2UO1nTCMkUUbhuUgimw@mail.gmail.com>
Message-ID: <3ce2641d-a4bd-c0eb-1ba9-51a15e218e92@redhat.com>

On 8/2/19 10:04 AM, Lee Harland wrote:
> Hi - i'd encountered this error while running Adopt Java9+ and posted it on
> the adopt github and they suggested I post it here. Link to the specific
> test case i made
> https://github.com/AdoptOpenJDK/openjdk-build/issues/967#issuecomment-514122684

This reproduces on current jdk/jdk:

$ build/linux-x86_64-server-fastdebug/images/jdk/bin/java -cp .:tinylog-1.3.5.jar JVMCrashOnJDK11

#
# A fatal error has been detected by the Java Runtime Environment:
#
#  Internal Error (/home/shade/trunks/jdk-jdk/src/hotspot/share/gc/g1/c2/g1BarrierSetC2.cpp:665), pid=18721, tid=18748
#  assert(node->Opcode() == Op_CastP2X) failed: ConvP2XNode required
#
# JRE version: OpenJDK Runtime Environment (14.0) (fastdebug build 14-internal+0-adhoc.shade.jdk-jdk)
# Java VM: OpenJDK 64-Bit Server VM (fastdebug 14-internal+0-adhoc.shade.jdk-jdk, mixed mode, sharing, tiered, compressed oops, g1 gc, linux-amd64)
# Problematic frame:
# V  [libjvm.so+0xbefedb]  G1BarrierSetC2::eliminate_gc_barrier(PhaseMacroExpand*, Node*) const+0x4b
#
# Core dump will be written. Default location: Core dumps may be processed with "/usr/share/apport/apport %p %s %c %d %P" (or dumping to /home/shade/temp/gc/core.18721)
#
# If you would like to submit a bug report, please visit:
#   http://bugreport.java.com/bugreport/crash.jsp
#

V  [libjvm.so+0xbefedb]  G1BarrierSetC2::eliminate_gc_barrier(PhaseMacroExpand*, Node*) const+0x4b
V  [libjvm.so+0x1227bc0]  PhaseMacroExpand::process_users_of_allocation(CallNode*)+0x660
V  [libjvm.so+0x1228ccb]  PhaseMacroExpand::eliminate_allocate_node(AllocateNode*) [clone .part.202]+0x32b
V  [libjvm.so+0x122e8eb]  PhaseMacroExpand::eliminate_macro_nodes()+0x5cb

Looks like a regression. Submitted:
  https://bugs.openjdk.java.net/browse/JDK-8229016

-- 
Thanks,
-Aleksey


From per.liden at oracle.com  Fri Aug  2 09:11:26 2019
From: per.liden at oracle.com (Per Liden)
Date: Fri, 2 Aug 2019 11:11:26 +0200
Subject: RFR: 8227226: Segmented array clearing for ZGC
In-Reply-To: <323E8C1C-FA9B-44E1-9C4F-7275255B3906@oracle.com>
References: <5809FFE3-ED37-429B-9189-49D8FD14D092@amazon.com>
 <625131FC-8B02-4BEC-80B5-F1757232277A@amazon.com>
 <7C95DE35-C9F8-4F11-8D07-9511748128C4@amazon.com>
 <ec82f3db-5ced-2fd5-d584-d636fe6f824e@oracle.com>
 <62b08b98-f54c-5359-4cc0-be36d43febdd@oracle.com>
 <f5548322-fed7-ec09-5be2-e8207ddf6e15@oracle.com>
 <323E8C1C-FA9B-44E1-9C4F-7275255B3906@oracle.com>
Message-ID: <de27afe0-1613-e4d7-0914-d6bcff321ba5@oracle.com>

Hi Erik,

On 8/1/19 5:56 PM, Erik Osterlund wrote:
> Hi Per,
> 
> I like that this approach is unintrusive, does its thing at the right abstraction layer, and also handles medium sized arrays.

It even handles small arrays (i.e. arrays in small zpages) ;)

> Looks good.

Thanks! I'll test various segment sizes and see how that affects 
performance and TTSP.

cheers,
Per

> 
> Thanks,
> /Erik
> 
>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
>>
>> Here's an updated webrev that should be complete, i.e. fixes the issues related to allocation sampling/reporting that I mentioned. This patch makes MemAllocator::finish() virtual, so that we can do our thing and install the correct klass pointer before the Allocation destructor executes. This seems to be the least intrusive way of doing this.
>>
>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
>>
>> This passed function testing, but proper benchmarking remains to be done.
>>
>> cheers,
>> Per
>>
>>> On 7/31/19 7:19 PM, Per Liden wrote:
>>> Hi,
>>> I found some time to benchmark the "GC clears pages"-approach, and it's fairly clear that it's not paying off. So ditching that idea.
>>> However, I'm still looking for something that would not just do segmented clearing of arrays in large zpages. Letting oop arrays temporarily be typed arrays while it's being cleared could be an option. I did a prototype for that, which looks like this:
>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
>>> There's at least one issue here, the code doing allocation sampling will see that we allocated long arrays instead of oop arrays, so the reporting there will be skewed. That can be addressed if we go down this path. The code is otherwise fairly simple and contained. Feel free to spot any issues.
>>> cheers,
>>> Per
>>>> On 7/26/19 2:27 PM, Per Liden wrote:
>>>> Hi Ryan & Erik,
>>>>
>>>> I had a look at this and started exploring a slightly different approach. Instead of doing segmented clearing in the allocation path, we can have the concurrent GC thread clear pages when they are reclaimed and not do any clearing in the allocation path at all.
>>>>
>>>> That would look like this:
>>>>
>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
>>>>
>>>> (I've had to temporarily comment out three lines of assert/debug code to make this work)
>>>>
>>>> The relocation set selection phase will now be tasked with some potentially expensive clearing work, so we'll want to make that part parallel also.
>>>>
>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
>>>>
>>>> Moving this work from Java threads onto the concurrent GC threads means we will potentially prolong the RelocationSetSelection and Relocation phases. That might be a trade-off worth doing. In return, we get:
>>>>
>>>> * Faster array allocations, as there's now less work done in the allocation path.
>>>> * This benefits all arrays, not just those allocated in large pages.
>>>> * No need to consider/tune a "chunk size".
>>>> * I also tend to think we'll end up with slightly less complex code, that is a bit easier to reason about. Can be debated of course.
>>>>
>>>> This approach might also "survive" longer, because the YC scheme we've been loosely thinking about currently requires newly allocated pages to be cleared anyway. It's of course too early to tell if that requirement will stand in the end, but it's possible anyway.
>>>>
>>>> I'll need to do some more testing and benchmarking to make sure there's no regression or bugs here. The commented out debug code also needs to be addressed of course.
>>>>
>>>> Comments? Other ideas?
>>>>
>>>> cheers,
>>>> Per
>>>>
>>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
>>>>>
>>>>> Somehow I lost the RFR off the front and started a new thread.
>>>>> Now that we're both off vacation I'd like to revisit this.  Can you take a look?
>>>>>
>>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of Sciampacone, Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of sci at amazon.com> wrote:
>>>>>
>>>>>       http://cr.openjdk.java.net/~phh/8227226/webrev.01/
>>>>>       This shifts away from abusing the constructor do_zero magic in exchange for virtualizing mem_clear() and specializing for the Z version.  It does create a change in mem_clear in that it returns an updated version of mem.  It does create change outside of the Z code however it does feel cleaner.
>>>>>       I didn't make a change to PinAllocating - looking at it, the safety of keeping it constructor / destructor based still seemed appropriate to me.  If the objection is to using the sequence numbers to pin (and instead using handles to update) - this to me seems less error prone.  I had originally discussed handles with Stefan but the proposal came down to this which looks much cleaner.
>>>>>       On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of Sciampacone, Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of sci at amazon.com> wrote:
>>>>>           1) Yes this was a conscious decision.  There was discussion on determining the optimal point for breakup but given the existing sizes this seemed sufficient.  This doesn't preclude the ability to go down that path if it's deemed absolutely necessary.  The path for more complex decisions is now available.
>>>>>           2) Agree
>>>>>           3) I'm not clear here.  Do you mean effectively going direct to ZHeap and bypassing the single function PinAllocating?  Agree. Otherwise I'll ask you to be a bit clearer.
>>>>>           4) Agree
>>>>>           5) I initially had the exact same reaction but I played around with a few other versions (including breaking up initialization points between header and body to get the desired results) and this ended up looking correct.  I'll try mixing in the mem clearer function again (fresh start) to see if it looks any better.
>>>>>           On 7/8/19, 5:49 AM, "Per Liden" <per.liden at oracle.com> wrote:
>>>>>               Hi Ryan,
>>>>>               A few general comments:
>>>>>               1) It looks like this still only works for large pages?
>>>>>               2) The log_info stuff should be removed.
>>>>>               3) I'm not a huge fan of single-use utilities like PinAllocating, at
>>>>>               least not when, IMO, the alternative is more straight forward and less code.
>>>>>               4) Please make locals const when possible.
>>>>>               5) Duplicating _do_zero looks odd. Injecting a "mem clearer", similar to
>>>>>               what Stefan's original patch did, seems worth exploring.
>>>>>               cheers,
>>>>>               /Per
>>>>>               (Btw, I'm on vacation so I might not be super-responsive to emails)
>>>>>               On 2019-07-08 12:42, Erik Österlund wrote:
>>>>>               > Hi Ryan,
>>>>>               >
>>>>>               > This looks good in general. Just some stylistic things...
>>>>>               >
>>>>>               > 1) In the ZGC project we like the letter 'Z' so much that we put it in
>>>>>               > front of everything we possibly can, including all class names.
>>>>>               > 2) We also explicitly state things are private even though it's
>>>>>               > bleedingly obvious.
>>>>>               >
>>>>>               > So:
>>>>>               >
>>>>>               > 39 class PinAllocating {
>>>>>               > 40 HeapWord* _mem;
>>>>>               > 41 public:
>>>>>               >
>>>>>               > ->
>>>>>               >
>>>>>               > 39 class ZPinAllocating {
>>>>>               > 40 private:
>>>>>               > 41 HeapWord* _mem;
>>>>>               > 42
>>>>>               > 41 public:
>>>>>               >
>>>>>               > I can be your sponsor and push this change for you. I don't
>>>>>               > think there is a need for another webrev for my small stylistic remarks,
>>>>>               > so I can just fix that before pushing this for you. On that note, I'll
>>>>>               > add me and StefanK to the contributed-by section as we all worked out
>>>>>               > the right solution to this problem collaboratively. I have run through
>>>>>               > mach5 tier1-5, and found no issues with this patch.
>>>>>               >
>>>>>               > Thanks,
>>>>>               > /Erik
>>>>>               >
>>>>>               > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
>>>>>               >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
>>>>>               >> https://bugs.openjdk.java.net/browse/JDK-8227226
>>>>>               >>
>>>>>               >> This patch introduces safe point checks into array clearing during
>>>>>               >> allocation for ZGC.  The patch isolates the changes to ZGC as (in
>>>>>               >> particular with the more modern collectors) the approach to
>>>>>               >> incrementalizing or respecting safe point checks is going to be
>>>>>               >> different.
>>>>>               >>
>>>>>               >> The approach is to keep the region holding the array in the allocating
>>>>>               >> state (pin logic) while updating the color to the array after checks.
>>>>>               >>
>>>>>               >> Can I get a review?  Thanks.
>>>>>               >>
>>>>>               >> Ryan
>>>>>               >
>>>>>
> 


From rkennke at redhat.com  Fri Aug  2 09:16:36 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 2 Aug 2019 11:16:36 +0200
Subject: RFR: JDK-8229002: Shenandoah: Missing node types in
 ShenandoahLoadReferenceBarrier::needs_barrier_impl()
Message-ID: <5f158b40-34c7-d650-fe05-507643a6186f@redhat.com>

JDK11 testing brought up some missing cases in our C2 optimizer. Those
are trivial and non-fatal because we do the conservative thing there
anyway. But they hold up testing.

http://cr.openjdk.java.net/~rkennke/JDK-8229002/webrev.00/

Testing: hotspot_gc_shenandoah

Ok to push to jdk/jdk?

I would also like to cherry-pick this straight into shenandoah/jdk11 to
unbreak testing. Ok with that too?

Roman


From shade at redhat.com  Fri Aug  2 09:20:26 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 2 Aug 2019 11:20:26 +0200
Subject: RFR: JDK-8229002: Shenandoah: Missing node types in
 ShenandoahLoadReferenceBarrier::needs_barrier_impl()
In-Reply-To: <5f158b40-34c7-d650-fe05-507643a6186f@redhat.com>
References: <5f158b40-34c7-d650-fe05-507643a6186f@redhat.com>
Message-ID: <d1a472ae-ebd2-0d6e-b13f-b09a8d0bdce9@redhat.com>

On 8/2/19 11:16 AM, Roman Kennke wrote:
> JDK11 testing brought up some missing cases in our C2 optimizer. Those
> are trivial and non-fatal because we do the conservative thing there
> anyway. But they hold up testing.
> 
> http://cr.openjdk.java.net/~rkennke/JDK-8229002/webrev.00/
> 
> Testing: hotspot_gc_shenandoah
> 
> Ok to push to jdk/jdk?

Yes.

> I would also like to cherry-pick this straight into shenandoah/jdk11 to
> unbreak testing. Ok with that too?

Yes.


-- 
Thanks,
-Aleksey


From per.liden at oracle.com  Fri Aug  2 09:40:22 2019
From: per.liden at oracle.com (Per Liden)
Date: Fri, 2 Aug 2019 11:40:22 +0200
Subject: RFR: 8229017: ZGC: Various cleanups of ZVerify
Message-ID: <cb119e29-f6a2-2054-5b00-afcf8c8167e7@oracle.com>

Hi,

This patch does various cleanups of ZVerify, basically a post-commit 
review of JDK-8227175. The patch mostly moves some code around and 
adjusts a few names. However, there's also one bug fix and one logic change:

* ZVerify::roots_strong() didn't have a ZStatTimerDisable.

* The call to ClassLoaderDataGraph::clear_claimed_marks() was moved from 
ZMarkConcurrentRootsTask() to ZConcurrentRootsIterator(), and now only 
clears the claim type the iterator actually used (instead of all types).

Bug: https://bugs.openjdk.java.net/browse/JDK-8229017
Webrev: http://cr.openjdk.java.net/~pliden/8229017/webrev.0

/Per


From fujie at loongson.cn  Fri Aug  2 09:44:31 2019
From: fujie at loongson.cn (Jie Fu)
Date: Fri, 2 Aug 2019 17:44:31 +0800
Subject: RFR: 8229020: Failure on CPUs allowing loads reordering:
 assert(_tasks[t] == 1) failed: What else?
Message-ID: <9dc6f2dc-4880-161f-b23b-045c6c252dd8@loongson.cn>

Hi all,

JBS:    https://bugs.openjdk.java.net/browse/JDK-8229020
Webrev: http://cr.openjdk.java.net/~jiefu/8229020/webrev.00/

*Background*
The failure was first observed on our Loongson CPUs which allow loads 
reordering with the following test
---------------------------------------------------------
make test 
TEST="compiler/codecache/stress/UnexpectedDeoptimizationTest.java" 
CONF=fastdebug
---------------------------------------------------------

*Analysis*
The failure was caused by the loads reordering on CPUs with weak memory 
consistency.
Just imagine the following case:
  - If the load of _tasks[t] in line 436 is floating up before the load 
of _tasks[t] in line 432, and
  - the load in line 436 may read 0 and the load in line 432 may read 1,
  - then the if-condition in line 433 is false, so Atomic::cmpxchg 
won't be executed,
  - then the assert in line 436 fails.
---------------------------------------------------------
429
430 bool SubTasksDone::try_claim_task(uint t) {
431   assert(t < _n_tasks, "bad task id.");
432   uint old = _tasks[t];
433   if (old == 0) {
434     old = Atomic::cmpxchg(1u, &_tasks[t], 0u);
435   }
436   assert(_tasks[t] == 1, "What else?");
437   bool res = old == 0;
438 #ifdef ASSERT
439   if (res) {
440     assert(_claimed < _n_tasks, "Too many tasks claimed; missing clear?");
441     Atomic::inc(&_claimed);
442   }
444?? return res;
445 }
446
---------------------------------------------------------

*Fix*
It would be better to insert a memory fence before line 436 to prevent 
the load from floating up.
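
For illustration only, here is a minimal standalone sketch of where such a
fence would sit. It is written against std::atomic rather than HotSpot's
Atomic/OrderAccess API, and the class name SubTasksSketch and its layout are
hypothetical, not taken from the webrev. As far as I understand the C++
memory model, std::atomic already gives read-read coherence on a single
location, so in this form the fence is redundant; it is kept only to mark the
position of the proposed barrier between the load in line 432 and the re-read
in line 436.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for SubTasksDone (not the HotSpot class).
class SubTasksSketch {
  std::vector<std::atomic<unsigned>> _tasks;

public:
  explicit SubTasksSketch(std::size_t n) : _tasks(n) {
    for (auto& t : _tasks) {
      t.store(0u, std::memory_order_relaxed);  // 0 means "unclaimed"
    }
  }

  // Returns true only for the caller that moves the task from 0 to 1.
  bool try_claim_task(std::size_t t) {
    assert(t < _tasks.size() && "bad task id.");
    unsigned old = _tasks[t].load(std::memory_order_relaxed);
    if (old == 0) {
      unsigned expected = 0u;
      // On failure, 'expected' is overwritten with the value actually seen,
      // which mirrors a cmpxchg that returns the old value.
      _tasks[t].compare_exchange_strong(expected, 1u);
      old = expected;
    }
    // Position of the proposed barrier: the re-read below must not be
    // satisfied ahead of the first load above.
    std::atomic_thread_fence(std::memory_order_acquire);
    assert(_tasks[t].load(std::memory_order_relaxed) == 1u && "What else?");
    return old == 0;
  }
};
```

The alternative is to drop the re-reading assert altogether, which removes
the second load and with it the ordering question.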

Could you please review it?

Thanks a lot.
Best regards,
Jie


From per.liden at oracle.com  Fri Aug  2 13:11:04 2019
From: per.liden at oracle.com (Per Liden)
Date: Fri, 2 Aug 2019 15:11:04 +0200
Subject: RFR: 8227226: Segmented array clearing for ZGC
In-Reply-To: <de27afe0-1613-e4d7-0914-d6bcff321ba5@oracle.com>
References: <5809FFE3-ED37-429B-9189-49D8FD14D092@amazon.com>
 <625131FC-8B02-4BEC-80B5-F1757232277A@amazon.com>
 <7C95DE35-C9F8-4F11-8D07-9511748128C4@amazon.com>
 <ec82f3db-5ced-2fd5-d584-d636fe6f824e@oracle.com>
 <62b08b98-f54c-5359-4cc0-be36d43febdd@oracle.com>
 <f5548322-fed7-ec09-5be2-e8207ddf6e15@oracle.com>
 <323E8C1C-FA9B-44E1-9C4F-7275255B3906@oracle.com>
 <de27afe0-1613-e4d7-0914-d6bcff321ba5@oracle.com>
Message-ID: <be7bbe2f-4c45-0641-5b8b-653d2037ae24@oracle.com>

Did some micro-benchmarking (on a Xeon E5-2630) with various segment 
sizes between 4K and 512K, and 64K seems to offer a good trade-off. For 
a 1G array, the allocation time increases by ~1%, but in exchange the 
worst case TTSP drops from ~280ms to ~0.6ms.
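
To make the trade-off concrete, here is a rough standalone sketch of
segmented clearing. It is not the code in the webrev: the names
clear_segmented and kSegmentBytes are made up for this example, and the poll
callback is only a stand-in for whatever safepoint check the VM would do
between segments.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>

// Hypothetical segment size, matching the 64K figure mentioned above.
constexpr std::size_t kSegmentBytes = 64 * 1024;

// Zeroes 'bytes' bytes starting at 'mem', one segment at a time, and calls
// 'poll' after each segment so the thread gets a chance to yield.
// Returns the number of segments cleared.
template <typename Poll>
std::size_t clear_segmented(void* mem, std::size_t bytes, Poll poll) {
  assert(mem != nullptr || bytes == 0);
  auto* p = static_cast<unsigned char*>(mem);
  std::size_t segments = 0;
  for (std::size_t done = 0; done < bytes;) {
    const std::size_t n = std::min(kSegmentBytes, bytes - done);
    std::memset(p + done, 0, n);
    done += n;
    ++segments;
    poll();  // stand-in for a safepoint check between segments
  }
  return segments;
}
```

With a fixed segment size, the longest stretch without a poll is bounded by
the time to clear one segment rather than the whole array, which is the
effect the TTSP numbers above reflect.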

Updated webrev using 64K:

http://cr.openjdk.java.net/~pliden/8227226/webrev.3

cheers,
Per

On 8/2/19 11:11 AM, Per Liden wrote:
> Hi Erik,
> 
> On 8/1/19 5:56 PM, Erik Osterlund wrote:
>> Hi Per,
>>
>> I like that this approach is unintrusive, does its thing at the right 
>> abstraction layer, and also handles medium sized arrays.
> 
> It even handles small arrays (i.e. arrays in small zpages) ;)
> 
>> Looks good.
> 
> Thanks! I'll test various segment sizes and see how that affects 
> performance and TTSP.
> 
> cheers,
> Per
> 
>>
>> Thanks,
>> /Erik
>>
>>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
>>>
>>> Here's an updated webrev that should be complete, i.e. fixes the 
>>> issues related to allocation sampling/reporting that I mentioned. 
>>> This patch makes MemAllocator::finish() virtual, so that we can do 
>>> our thing and install the correct klass pointer before the Allocation 
>>> destructor executes. This seems to be the least intrusive way of 
>>> doing this.
>>>
>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
>>>
>>> This passed function testing, but proper benchmarking remains to be 
>>> done.
>>>
>>> cheers,
>>> Per
>>>
>>>> On 7/31/19 7:19 PM, Per Liden wrote:
>>>> Hi,
>>>> I found some time to benchmark the "GC clears pages"-approach, and 
>>>> it's fairly clear that it's not paying off. So ditching that idea.
>>>> However, I'm still looking for something that would not just do 
>>>> segmented clearing of arrays in large zpages. Letting oop arrays 
>>>> temporarily be typed arrays while it's being cleared could be an 
>>>> option. I did a prototype for that, which looks like this:
>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
>>>> There's at least one issue here, the code doing allocation sampling 
>>>> will see that we allocated long arrays instead of oop arrays, so the 
>>>> reporting there will be skewed. That can be addressed if we go down 
>>>> this path. The code is otherwise fairly simple and contained. Feel 
>>>> free to spot any issues.
>>>> cheers,
>>>> Per
>>>>> On 7/26/19 2:27 PM, Per Liden wrote:
>>>>> Hi Ryan & Erik,
>>>>>
>>>>> I had a look at this and started exploring a slightly different 
>>>>> approach. Instead of doing segmented clearing in the allocation path, 
>>>>> we can have the concurrent GC thread clear pages when they are 
>>>>> reclaimed and not do any clearing in the allocation path at all.
>>>>>
>>>>> That would look like this:
>>>>>
>>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
>>>>>
>>>>> (I've had to temporarily comment out three lines of assert/debug 
>>>>> code to make this work)
>>>>>
>>>>> The relocation set selection phase will now be tasked with some 
>>>>> potentially expensive clearing work, so we'll want to make that 
>>>>> part parallel also.
>>>>>
>>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
>>>>>
>>>>> Moving this work from Java threads onto the concurrent GC threads 
>>>>> means we will potentially prolong the RelocationSetSelection and 
>>>>> Relocation phases. That might be a trade-off worth doing. In 
>>>>> return, we get:
>>>>>
>>>>> * Faster array allocations, as there's now less work done in the 
>>>>> allocation path.
>>>>> * This benefits all arrays, not just those allocated in large pages.
>>>>> * No need to consider/tune a "chunk size".
>>>>> * I also tend to think we'll end up with slightly less complex code, 
>>>>> that is a bit easier to reason about. Can be debated of course.
>>>>>
>>>>> This approach might also "survive" longer, because the YC scheme 
>>>>> we've been loosely thinking about currently requires newly 
>>>>> allocated pages to be cleared anyway. It's of course too early to 
>>>>> tell if that requirement will stand in the end, but it's possible 
>>>>> anyway.
>>>>>
>>>>> I'll need to do some more testing and benchmarking to make sure 
>>>>> there's no regression or bugs here. The commented out debug code 
>>>>> also needs to be addressed of course.
>>>>>
>>>>> Comments? Other ideas?
>>>>>
>>>>> cheers,
>>>>> Per
>>>>>
>>

From thomas.schatzl at oracle.com  Fri Aug  2 15:39:54 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 2 Aug 2019 08:39:54 -0700
Subject: RFR: 8229020: Failure on CPUs allowing loads reordering:
 assert(_tasks[t] == 1) failed: What else?
In-Reply-To: <9dc6f2dc-4880-161f-b23b-045c6c252dd8@loongson.cn>
References: <9dc6f2dc-4880-161f-b23b-045c6c252dd8@loongson.cn>
Message-ID: <4e221bfe-8ed6-24a1-7dae-cc8cc76579e0@oracle.com>

Hi Jie,

On 02.08.19 02:44, Jie Fu wrote:
> Hi all,
> 
> JBS:    https://bugs.openjdk.java.net/browse/JDK-8229020
> Webrev: http://cr.openjdk.java.net/~jiefu/8229020/webrev.00/
> 
> *Background*
> The failure was first observed on our Loongson CPUs which allow loads 
> reordering with the following test
> ---------------------------------------------------------
> make test 
> TEST="compiler/codecache/stress/UnexpectedDeoptimizationTest.java" 
> CONF=fastdebug
> ---------------------------------------------------------
> 
> *Analysis*
> The failure was caused by the loads reordering on CPUs with weak memory 
> consistency.
> Just imagine the following case:
>   - If the load of _tasks[t] in line 436 is floating up before the load 
> of _tasks[t] in line 432, and
>   - the load in line 436 may read 0 and the load in line 432 may read 1,
>   - then the if-condition in line 433 is false, so Atomic::cmpxchg 
> won't be executed,
>   - then the assert in line 436 fails.
> ---------------------------------------------------------
> 429
> 430 bool SubTasksDone::try_claim_task(uint t) {
> 431   assert(t < _n_tasks, "bad task id.");
> 432   uint old = _tasks[t];
> 433   if (old == 0) {
> 434     old = Atomic::cmpxchg(1u, &_tasks[t], 0u);
> 435   }
> 436   assert(_tasks[t] == 1, "What else?");
> 437   bool res = old == 0;
> 438 #ifdef ASSERT
> 439   if (res) {
> 440     assert(_claimed < _n_tasks, "Too many tasks claimed; missing clear?");
> 441     Atomic::inc(&_claimed);
> 442   }
> 443 #endif
> 444?? return res;
> 445 }
> 446
> ---------------------------------------------------------
> 
> *Fix*
> It would be better to insert a memory fence before line 436 to prevent 
> the load from floating up.
> 
> Could you please review it?

   remove the assert. It's old paranoid code adding no further 
information imho.

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Fri Aug  2 15:46:07 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 2 Aug 2019 08:46:07 -0700
Subject: RFR(S): 8227225: ParallelGC: add subspace transitions for young
 gen for gc+heap=info log lines
In-Reply-To: <7df6ab28-aa69-c3c7-d2f8-18a34c0304c5@oracle.com>
References: <CAOzU2inXqbueM=9k26Dre956FPKGiuDq3tnEhot_3vHLSgm6Eg@mail.gmail.com>
 <877bd17cff3be20bf8acd95d9eddbb5c9cfb7cf5.camel@oracle.com>
 <CAOzU2imCNQSvUG0szfAs9+RwqYvQSCBJg8t0mERPiFKEmZh4pA@mail.gmail.com>
 <c8a2d4fa1119bbc517161064a9ba93b5e262222c.camel@oracle.com>
 <CAOzU2i=HxwSFD5TS91ipTWW_Natewmd6um=cQbzdg84ygMq5KQ@mail.gmail.com>
 <CAOzU2ink2-2kq8bOQ4s3A8vPenqWzuPRL+8P=nn6_uD3Wpi7hg@mail.gmail.com>
 <7df6ab28-aa69-c3c7-d2f8-18a34c0304c5@oracle.com>
Message-ID: <a8f16e2e-2628-b986-92b7-05da6f6bf528@oracle.com>

Hi all,

On 31.07.19 09:44, Thomas Schatzl wrote:
> Hi,
> 
> On 29.07.19 07:41, Tony Printezis wrote:
>> Hi Thomas,
>>
>> Latest webrev here:
>>
>> http://cr.openjdk.java.net/~tonyp/8227225/webrev.1/
>>
>> Main change: I renamed the PreGCValues class to PreGenGCValues so that 
>> it?s clear it?s mainly for generational GCs.
>>
> 
>    please also update the file name containing only PreGenGCValues.
> 
> It would also be useful to provide a diff webrev from the last version; 
> using Mercurial mq and webrev -r XX may help with maintaining all 
> changes that could be folded before pushing.
> 

   talked to Tony about this and we agreed to keep it as is because 
there will be support for non-generational collectors.

> Looks good otherwise.


Thanks,
   Thomas


From kim.barrett at oracle.com  Fri Aug  2 20:30:21 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 2 Aug 2019 16:30:21 -0400
Subject: RFR: 8229020: Failure on CPUs allowing loads reordering:
 assert(_tasks[t] == 1) failed: What else?
In-Reply-To: <4e221bfe-8ed6-24a1-7dae-cc8cc76579e0@oracle.com>
References: <9dc6f2dc-4880-161f-b23b-045c6c252dd8@loongson.cn>
 <4e221bfe-8ed6-24a1-7dae-cc8cc76579e0@oracle.com>
Message-ID: <80564FF3-F7B2-43AC-B31F-29E1F38B5069@oracle.com>

> On Aug 2, 2019, at 11:39 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi Jie,
> 
> On 02.08.19 02:44, Jie Fu wrote:
>> Hi all,
>> JBS:    https://bugs.openjdk.java.net/browse/JDK-8229020
>> Webrev: http://cr.openjdk.java.net/~jiefu/8229020/webrev.00/
>> *Background*
>> The failure was first observed on our Loongson CPUs which allow loads reordering with the following test
>> ---------------------------------------------------------
>> make test TEST="compiler/codecache/stress/UnexpectedDeoptimizationTest.java" CONF=fastdebug
>> ---------------------------------------------------------
>> *Analysis*
>> The failure was caused by the loads reordering on CPUs with weak memory consistency.
>> Just imagine the following case:
>>   - If the load of _tasks[t] in line 436 is floating up before the load of _tasks[t] in line 432, and
>>   - the load in line 436 may read 0 and the load in line 432 may read 1,
>>   - then the if-condition in line 433 is false, so Atomic::cmpxchg won't be executed,
>>   - then the assert in line 436 fails.
>> ---------------------------------------------------------
>> 429
>> 430 bool SubTasksDone::try_claim_task(uint t) {
>> 431   assert(t < _n_tasks, "bad task id.");
>> 432   uint old = _tasks[t];
>> 433   if (old == 0) {
>> 434     old = Atomic::cmpxchg(1u, &_tasks[t], 0u);
>> 435   }
>> 436   assert(_tasks[t] == 1, "What else?");
>> 437   bool res = old == 0;
>> 438 #ifdef ASSERT
>> 439   if (res) {
>> 440     assert(_claimed < _n_tasks, "Too many tasks claimed; missing clear?");
>> 441     Atomic::inc(&_claimed);
>> 442   }
>> 443 #endif
>> 444   return res;
>> 445 }
>> 446
>> ---------------------------------------------------------
>> *Fix*
>> It would be better to insert a memory fence before line 436 to prevent the load from floating up.
>> Could you please review it?
> 
>  remove the assert. It's old paranoid code adding no further information imho.
> 
> Thanks,
>  Thomas

I agree with Thomas; remove the assert.  I'm slightly surprised this
hasn't come up before with other weakly ordered platforms.
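
For illustration only (this is not the patch in the webrev): a minimal
standalone sketch of the claim logic once the re-reading assert is gone,
written against std::atomic instead of HotSpot's Atomic class. The class
name SubTasksDoneSketch and its constructor are assumptions made for the
example.

```cpp
#include <atomic>
#include <cassert>
#include <cstddef>
#include <memory>

// Hypothetical stand-in for SubTasksDone; not HotSpot code.
class SubTasksDoneSketch {
  std::size_t _n_tasks;
  std::unique_ptr<std::atomic<unsigned>[]> _tasks;

public:
  explicit SubTasksDoneSketch(std::size_t n)
      : _n_tasks(n), _tasks(new std::atomic<unsigned>[n]) {
    for (std::size_t i = 0; i < n; i++) {
      _tasks[i].store(0u, std::memory_order_relaxed);
    }
  }

  // Returns true for exactly one caller per task id.
  bool try_claim_task(std::size_t t) {
    assert(t < _n_tasks && "bad task id.");
    // Cheap pre-check; a stale 0 is harmless because the CAS decides.
    unsigned old = _tasks[t].load(std::memory_order_relaxed);
    if (old == 0u) {
      unsigned expected = 0u;
      // Only the thread that flips 0 -> 1 sees success.
      return _tasks[t].compare_exchange_strong(expected, 1u);
    }
    // No second load of _tasks[t]: the result depends only on 'old' and
    // the CAS outcome, so there is no pair of loads left to be reordered.
    return false;
  }
};
```

The point of the sketch is that the return value never depends on a second,
independent load of the same slot, which is what the removed assert did.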


From fujie at loongson.cn  Sat Aug  3 01:27:05 2019
From: fujie at loongson.cn (Jie Fu)
Date: Sat, 3 Aug 2019 09:27:05 +0800
Subject: RFR: 8229020: Failure on CPUs allowing loads reordering:
 assert(_tasks[t] == 1) failed: What else?
In-Reply-To: <80564FF3-F7B2-43AC-B31F-29E1F38B5069@oracle.com>
References: <9dc6f2dc-4880-161f-b23b-045c6c252dd8@loongson.cn>
 <4e221bfe-8ed6-24a1-7dae-cc8cc76579e0@oracle.com>
 <80564FF3-F7B2-43AC-B31F-29E1F38B5069@oracle.com>
Message-ID: <c3ef2620-52a3-1cbb-50fa-4b7e9e828040@loongson.cn>

Hi Thomas and Kim,

Thanks for your review and nice suggestion.

Updated: http://cr.openjdk.java.net/~jiefu/8229020/webrev.02/

I have removed the assert and added reviewers in the patch.
I need a sponsor. Could someone help to push it?

Thanks a lot.
Best regards,
Jie

On 2019/8/3 4:30 AM, Kim Barrett wrote:
>> On Aug 2, 2019, at 11:39 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>
>> Hi Jie,
>>
>> On 02.08.19 02:44, Jie Fu wrote:
>>> Hi all,
>>> JBS:    https://bugs.openjdk.java.net/browse/JDK-8229020
>>> Webrev: http://cr.openjdk.java.net/~jiefu/8229020/webrev.00/
>>> *Background*
>>> The failure was first observed on our Loongson CPUs which allow loads reordering with the following test
>>> ---------------------------------------------------------
>>> make test TEST="compiler/codecache/stress/UnexpectedDeoptimizationTest.java" CONF=fastdebug
>>> ---------------------------------------------------------
>>> *Analysis*
>>> The failure was caused by the loads reordering on CPUs with weak memory consistency.
>>> Just imagine the following case:
>>>    - If the load of _tasks[t] in line 436 is floating up before the load of _tasks[t] in line 432, and
>>>    - the load in line 436 may read 0 and the load in line 432 may read 1,
>>>    - then the if-condition in line 433 is false, so Atomic::cmpxchg won't be executed,
>>>    - then the assert in line 436 fails.
>>> ---------------------------------------------------------
>>> 429
>>> 430 bool SubTasksDone::try_claim_task(uint t) {
>>> 431   assert(t < _n_tasks, "bad task id.");
>>> 432   uint old = _tasks[t];
>>> 433   if (old == 0) {
>>> 434     old = Atomic::cmpxchg(1u, &_tasks[t], 0u);
>>> 435   }
>>> 436   assert(_tasks[t] == 1, "What else?");
>>> 437   bool res = old == 0;
>>> 438 #ifdef ASSERT
>>> 439   if (res) {
>>> 440     assert(_claimed < _n_tasks, "Too many tasks claimed; missing clear?");
>>> 441     Atomic::inc(&_claimed);
>>> 442   }
>>> 443 #endif
>>> 444   return res;
>>> 445 }
>>> 446
>>> ---------------------------------------------------------
>>> *Fix*
>>> It would be better to insert a memory fence before line 436 to prevent the load from floating up.
>>> Could you please review it?
>>   remove the assert. It's old paranoid code adding no further information imho.
>>
>> Thanks,
>>   Thomas
> I agree with Thomas; remove the assert.  I'm slightly surprised this
> hasn't come up before with other weakly ordered platforms.
>


From thomas.schatzl at oracle.com  Sat Aug  3 01:29:07 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 2 Aug 2019 18:29:07 -0700
Subject: RFR: 8229020: Failure on CPUs allowing loads reordering:
 assert(_tasks[t] == 1) failed: What else?
In-Reply-To: <c3ef2620-52a3-1cbb-50fa-4b7e9e828040@loongson.cn>
References: <9dc6f2dc-4880-161f-b23b-045c6c252dd8@loongson.cn>
 <4e221bfe-8ed6-24a1-7dae-cc8cc76579e0@oracle.com>
 <80564FF3-F7B2-43AC-B31F-29E1F38B5069@oracle.com>
 <c3ef2620-52a3-1cbb-50fa-4b7e9e828040@loongson.cn>
Message-ID: <5bc48d2f-33f3-f042-16c9-f2e08623618d@oracle.com>

Hi,

On 02.08.19 18:27, Jie Fu wrote:
> Hi Thomas and Kim,
> 
> Thanks for your review and nice suggestion.
> 
> Updated: http://cr.openjdk.java.net/~jiefu/8229020/webrev.02/
> 
> I had removed the assert and added reviewers in the patch.
> I need a sponsor. Could someone help to push it?
> 

   looks good. Unless Kim gets to it earlier, I can push it on Monday.

Thomas


From fujie at loongson.cn  Sat Aug  3 01:35:19 2019
From: fujie at loongson.cn (Jie Fu)
Date: Sat, 3 Aug 2019 09:35:19 +0800
Subject: RFR: 8229020: Failure on CPUs allowing loads reordering:
 assert(_tasks[t] == 1) failed: What else?
In-Reply-To: <5bc48d2f-33f3-f042-16c9-f2e08623618d@oracle.com>
References: <9dc6f2dc-4880-161f-b23b-045c6c252dd8@loongson.cn>
 <4e221bfe-8ed6-24a1-7dae-cc8cc76579e0@oracle.com>
 <80564FF3-F7B2-43AC-B31F-29E1F38B5069@oracle.com>
 <c3ef2620-52a3-1cbb-50fa-4b7e9e828040@loongson.cn>
 <5bc48d2f-33f3-f042-16c9-f2e08623618d@oracle.com>
Message-ID: <15742de1-75d1-6c9c-7e9a-b4377b4be309@loongson.cn>

Thanks again for your help.

On 2019/8/3 9:29 AM, Thomas Schatzl wrote:
> Hi,
>
> On 02.08.19 18:27, Jie Fu wrote:
>> Hi Thomas and Kim,
>>
>> Thanks for your review and nice suggestion.
>>
>> Updated: http://cr.openjdk.java.net/~jiefu/8229020/webrev.02/
>>
>> I had removed the assert and added reviewers in the patch.
>> I need a sponsor. Could someone help to push it?
>>
>
>   looks good. Unless Kim gets to it earlier, I can push it on Monday.
>
> Thomas


From kim.barrett at oracle.com  Sat Aug  3 08:43:15 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sat, 3 Aug 2019 04:43:15 -0400
Subject: RFR: 8229044: G1RedirtyCardsQueueSet should be local to a collection 
Message-ID: <09AAF075-80C5-42B4-BDD6-5DB2265FE9C0@oracle.com>

Please review this change to the use of G1RedirtyCardsQueueSet.
Rather than a singleton instance in the G1CollectedHeap that is reused
by each collection pause, we now (stack) allocate one for use by the
current collection pause.
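
As a rough sketch of the shape of this change (not the code in the webrev),
the set becomes a local of the pause and is passed explicitly to the work
that needs it. RedirtyCardsSketch, evacuate_sketch and
collection_pause_sketch are hypothetical names used only for this example.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical stand-in for G1RedirtyCardsQueueSet; not HotSpot code.
class RedirtyCardsSketch {
  std::vector<std::size_t> _cards;

public:
  void enqueue(std::size_t card_index) { _cards.push_back(card_index); }
  std::size_t size() const { return _cards.size(); }
};

// Work that needs the set receives it as an explicit parameter.
void evacuate_sketch(RedirtyCardsSketch* rdcqs, std::size_t cards_to_redirty) {
  for (std::size_t i = 0; i < cards_to_redirty; i++) {
    rdcqs->enqueue(i);
  }
}

// One pause: the set is a stack-allocated local, so it starts out empty
// and is destroyed when the pause ends. Nothing carries over between pauses.
std::size_t collection_pause_sketch(std::size_t cards_to_redirty) {
  RedirtyCardsSketch rdcqs;
  evacuate_sketch(&rdcqs, cards_to_redirty);
  return rdcqs.size();
}
```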

CR:
https://bugs.openjdk.java.net/browse/JDK-8229044

Webrev:
http://cr.openjdk.java.net/~kbarrett/8229044/open.00/

Testing:
mach5 tier1-5


From thomas.schatzl at oracle.com  Sat Aug  3 19:10:17 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Sat, 3 Aug 2019 12:10:17 -0700
Subject: RFR: 8229044: G1RedirtyCardsQueueSet should be local to a
 collection
In-Reply-To: <09AAF075-80C5-42B4-BDD6-5DB2265FE9C0@oracle.com>
References: <09AAF075-80C5-42B4-BDD6-5DB2265FE9C0@oracle.com>
Message-ID: <19ad902e-0974-79fd-a0c6-67d53d686dcd@oracle.com>

Hi,

On 03.08.19 01:43, Kim Barrett wrote:
> Please review this change to the use of G1RedirtyCardsQueueSet.
> Rather than a singleton instance in the G1CollectedHeap that is reused
> by each collection pause, we now (stack) allocate one for use by the
> current collection pause.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8229044
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8229044/open.00/
> 
> Testing:
> mach5 tier1-5

   looks good. Thanks.

I was wondering whether at some point we should extract all transient GC 
state coupled with the actual algorithms applied into a separate class. 
But that's probably something for the future :)

Thomas


From kim.barrett at oracle.com  Sat Aug  3 19:22:21 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sat, 3 Aug 2019 15:22:21 -0400
Subject: RFR: 8229044: G1RedirtyCardsQueueSet should be local to a
 collection
In-Reply-To: <19ad902e-0974-79fd-a0c6-67d53d686dcd@oracle.com>
References: <09AAF075-80C5-42B4-BDD6-5DB2265FE9C0@oracle.com>
 <19ad902e-0974-79fd-a0c6-67d53d686dcd@oracle.com>
Message-ID: <1BA16061-A0AA-4022-B9FB-D5AB11A8924B@oracle.com>

> On Aug 3, 2019, at 3:10 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi,
> 
> On 03.08.19 01:43, Kim Barrett wrote:
>> Please review this change to the use of G1RedirtyCardsQueueSet.
>> Rather than a singleton instance in the G1CollectedHeap that is reused
>> by each collection pause, we now (stack) allocate one for use by the
>> current collection pause.
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8229044
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8229044/open.00/
>> Testing:
>> mach5 tier1-5
> 
>  looks good. Thanks.
> 
> I was wondering whether at some point we should extract all transient GC state coupled with the actual algorithms applied into a separate class. But that's probably something for the future :)
> 
> Thomas

Thanks.

I thought about putting the redirty set directly in the ParScanThreadStateSet with an
accessor and passing that set to more places, but that seemed like it would make it
more difficult to understand the usage of the ParScanThreadState[Set].

I also thought about putting it in the EvacuationInfo, but what's there seems to be
accounting stuff and not otherwise interesting data structures.  And again, I'd probably
prefer to extract the redirty set to pass into call trees that need it and not all the other
stuff.

I think part of the problem is that there's just a lot of varied state shared between various
largish pieces of the collector.  Finding ways to reduce that would be nice, but detangling
is often hard work.


From thomas.schatzl at oracle.com  Sat Aug  3 19:27:07 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Sat, 3 Aug 2019 12:27:07 -0700
Subject: RFR (XS): Optimize branch frequency of G1's write post-barrier in
 C2
In-Reply-To: <CA+w6HxbB7iV7sDo2SLrhkQaQrGx3MrQG3SXrH5pFCQ2SMvBtjA@mail.gmail.com>
References: <CA+w6HxZ6pVzNhdJ9r7DOX75gpS4fvOFmuf5UJqvBhK=PxKWgDg@mail.gmail.com>
 <ca559898d832da3dfde9bef800f39e9a7c9c6b45.camel@oracle.com>
 <CA+w6HxbB7iV7sDo2SLrhkQaQrGx3MrQG3SXrH5pFCQ2SMvBtjA@mail.gmail.com>
Message-ID: <41520e0a-671a-de55-24ab-6615fc456459@oracle.com>

ping at compiler team to have a quick look.

Thanks,
   Thomas

On 11.07.19 16:35, Man Cao wrote:
> Thanks Thomas for the review and running experiments!
> 
>  > - can you share the code changes to generate the statistics? It would
>  > be nice to confirm these on a few more applications and play around
>  > with them a bit :)
>  > I would like to confirm some very old numbers we have for other older
>  > benchmarks that this is indeed the best probability distribution.
>  > Particularly I do not understand that from these numbers we did not
>  > change the probabilities as you suggested :( There were other changes
>  > mostly related to barrier elision in that time frame, but it seems
>  > likelihood changes were not attempted.
> 
> It is here: http://cr.openjdk.java.net/~manc/8225776/branch_profiling/
> I also added a comment in 
> https://bugs.openjdk.java.net/browse/JDK-8225776 to clarify the methodology.
> 
>  > - these numbers (and yours) also indicate that the not-young check is
>  > very likely to be not taken (i.e. you jump over the storeload). Did you
>  > also perform some experiments changing the order a bit?
>  > It might be detrimental for this particular case where the StoreLoad is
>  > expensive, and the xor/non-null filter out at least some additional of
>  > those, but maybe
>  > if (young) -> exit
>  > if (different-region) -> exit
>  > if (non-null) -> exit
>  > StoreLoad
>  > ...
>  > may be better to do? I am aware that the "young" check adds a load,
>  > which is also expensive (but not as much as the StoreLoad), but it
>  > seems to be an interesting case to look at.
>  >
>  > In our old results (as far as I can interpret them) it did not seem to
>  > have any advantage/disadvantage, so I am just curious whether you did
>  > such tests and their conclusion.
> 
> Yes, I did this experiment. The load from card table on the fast path 
> turns out to be expensive for several benchmarks:
> https://cr.openjdk.java.net/~manc/8225776/20190516-jdk11G1WriteBarrier-dacapoDefault4G-YoungCheckFirst.html
> For this experiment, I set a 4G heap with -XX:NewRatio=1, so most 
> writes happen to young objects, and GC happens very infrequently.
> The implementation had a bug that caused some benchmarks to crash while 
> running. I didn't look into fixing the bug, as this direction does not 
> seem worthwhile.
> 
>  > - internal (quick) perf testing showed no overall score changes, except
>  > that maxJOPS on SpecJBB2015 seemed to improve by ~1.2% (only had time
>  > for very few experiments at this time, will rerun, so there is some
>  > chance that this has been a fluke) which is definitely nice.
> 
> Good to hear that!
> -Man
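
For readers following the filter-order discussion quoted above, here is a
simplified sketch of a post-write barrier in the order the discussion
implies for the existing code: cross-region check, null check, young-card
check, StoreLoad fence, dirty check. It is a plain C++ model, not the code
C2 generates. The region and card sizes, the card values, and all names
(BarrierSketch, post_write, kYoung, ...) are assumptions made for the
example.

```cpp
#include <atomic>
#include <cstddef>
#include <cstdint>
#include <vector>

// Illustrative constants; not HotSpot's actual values or names.
constexpr unsigned kLogRegionBytes = 20;  // assume 1 MiB regions
constexpr unsigned kLogCardBytes = 9;     // assume 512-byte cards
enum CardValue : std::uint8_t { kClean = 0, kDirty = 1, kYoung = 2 };

struct BarrierSketch {
  std::vector<std::uint8_t> cards;
  std::vector<std::size_t> dirty_queue;

  explicit BarrierSketch(std::size_t heap_bytes)
      : cards(heap_bytes >> kLogCardBytes, kClean) {}

  // field_addr and new_val are byte offsets into an imaginary heap;
  // 0 stands for null. Returns true if a card was dirtied and enqueued.
  bool post_write(std::uintptr_t field_addr, std::uintptr_t new_val) {
    // 1. Same-region stores need no remembered-set work.
    if (((field_addr ^ new_val) >> kLogRegionBytes) == 0) return false;
    // 2. Null stores create no cross-region reference.
    if (new_val == 0) return false;
    // 3. Cards covering young regions are filtered out (this is the load
    //    that moving the young check to the front would put on the fast path).
    std::uint8_t& card = cards[field_addr >> kLogCardBytes];
    if (card == kYoung) return false;
    // 4. The expensive StoreLoad, reached only if all filters above fail.
    std::atomic_thread_fence(std::memory_order_seq_cst);
    // 5. Already-dirty cards are not enqueued again.
    if (card == kDirty) return false;
    card = kDirty;
    dirty_queue.push_back(field_addr >> kLogCardBytes);
    return true;
  }
};
```

The branch probabilities under review decide which side of each of these
checks the compiler lays out as the fall-through path.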


From thomas.schatzl at oracle.com  Sat Aug  3 20:37:25 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Sat, 3 Aug 2019 13:37:25 -0700
Subject: RFC: JEP: Remove the Concurrent Mark Sweep Garbage Collector
Message-ID: <0148c4d5-ec42-f2b1-d954-bc95b33cfd07@oracle.com>

Hi all,

   as already discussed during the OCW last week, the Oracle garbage 
collection team is set to remove the CMS collector from OpenJDK in JDK 14, 
for the reasons stated there and in the JEP.

I wrote up a first draft available at

https://bugs.openjdk.java.net/browse/JDK-8229049

Comments and reviewers to move it along appreciated ;)

Thanks,
   Thomas


From per.liden at oracle.com  Mon Aug  5 10:08:44 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 5 Aug 2019 12:08:44 +0200
Subject: RFR: 8229127: Make some methods in the allocation path non-virtual
Message-ID: <7c536ddb-cc72-fc13-bbb0-5cccdfcf7b21@oracle.com>

Some virtual methods in the allocation path are no longer overridden by 
any GC, so we can make them non-virtual until such a need arises again. 
Keeping CollectedHeap::array_allocate() virtual because ZGC still wants 
to override that path (as part of JDK-8227226, which is currently out 
for review).
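
A minimal sketch of the resulting shape, with hypothetical class and
method names (HeapSketch, ZHeapSketch, obj_allocate) that only loosely
mirror the real interface: paths nobody overrides become plain member
functions, while the array path stays virtual for the one collector that
still overrides it.

```cpp
#include <cstddef>
#include <string>

// Hypothetical stand-ins; not the real CollectedHeap interface.
class HeapSketch {
public:
  virtual ~HeapSketch() = default;

  // No collector overrides this any more, so it is a plain member function.
  std::string obj_allocate(std::size_t /* size */) {
    return "shared obj path";
  }

  // Still virtual, because one collector overrides it.
  virtual std::string array_allocate(std::size_t /* length */) {
    return "shared array path";
  }
};

class ZHeapSketch : public HeapSketch {
public:
  std::string array_allocate(std::size_t /* length */) override {
    return "collector-specific array path";
  }
};
```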

Bug: https://bugs.openjdk.java.net/browse/JDK-8229127
Webrev: http://cr.openjdk.java.net/~pliden/8229127/webrev.0

/Per


From per.liden at oracle.com  Mon Aug  5 11:37:45 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 5 Aug 2019 13:37:45 +0200
Subject: RFR: 8229128: ZGC: Remove unused ZThreadRootsIterator
Message-ID: <77b7a1be-7965-6fca-a6c3-8f05b0e5b168@oracle.com>

ZThreadRootsIterator is no longer used and can be removed. Trivial?

Bug: https://bugs.openjdk.java.net/browse/JDK-8229128
Webrev: http://cr.openjdk.java.net/~pliden/8229128/webrev.0

/Per


From erik.osterlund at oracle.com  Mon Aug  5 11:50:56 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Mon, 5 Aug 2019 13:50:56 +0200
Subject: RFR: 8229127: Make some methods in the allocation path non-virtual
In-Reply-To: <7c536ddb-cc72-fc13-bbb0-5cccdfcf7b21@oracle.com>
References: <7c536ddb-cc72-fc13-bbb0-5cccdfcf7b21@oracle.com>
Message-ID: <83fe853c-c45d-1e62-2aa2-a09cb4b3b365@oracle.com>

Hi Per,

Looks good and trivial.

Thanks,
/Erik

On 2019-08-05 12:08, Per Liden wrote:
> Some virtual methods in the allocation path are no longer overridden 
> by any GC, so we can make them non-virtual until such a need arise 
> again. Keeping CollectedHeap::array_allocate() virtual because ZGC 
> still wants to override that path (as part of JDK-8227226, which is 
> currently out for review).
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229127
> Webrev: http://cr.openjdk.java.net/~pliden/8229127/webrev.0
>
> /Per


From per.liden at oracle.com  Mon Aug  5 11:50:07 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 5 Aug 2019 13:50:07 +0200
Subject: RFR: 8229129: ZGC: Fix incorrect format string for doubles
Message-ID: <c181ea08-dba7-c1ff-fc28-f5cf02d467ab@oracle.com>

ZGC sometimes prints doubles with an incorrect format string, "%lf" 
instead of "%f". The "l" doesn't cause any problems, but it also has no 
meaning when printing doubles, so it should be removed.
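
A small standalone check of the claim above, assuming a C99/C++11 or later
printf family, where the "l" length modifier has no effect on an "f"
conversion. The helper names format_with_f and format_with_lf are made up
for this example.

```cpp
#include <cstdio>
#include <string>

// Formats a double with "%f".
std::string format_with_f(double value) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%f", value);
  return buf;
}

// Formats a double with "%lf"; the "l" is accepted but changes nothing.
std::string format_with_lf(double value) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%lf", value);
  return buf;
}
```

Both helpers produce the same text for the same input, which is why the
change is a cleanup rather than a behavioural fix. Note that this only
holds for printf; with scanf, "%f" and "%lf" expect different pointer types.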

Bug: https://bugs.openjdk.java.net/browse/JDK-8229129
Webrev: http://cr.openjdk.java.net/~pliden/8229129/webrev.0

/Per


From erik.osterlund at oracle.com  Mon Aug  5 11:51:44 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Mon, 5 Aug 2019 13:51:44 +0200
Subject: RFR: 8229128: ZGC: Remove unused ZThreadRootsIterator
In-Reply-To: <77b7a1be-7965-6fca-a6c3-8f05b0e5b168@oracle.com>
References: <77b7a1be-7965-6fca-a6c3-8f05b0e5b168@oracle.com>
Message-ID: <fc990588-d92a-79e9-bdaf-52f5d46d0ef0@oracle.com>

Hi Per,

Looks good and trivial.

Thanks,
/Erik

On 2019-08-05 13:37, Per Liden wrote:
> ZThreadRootsIterator is no longer used and can be removed. Trivial?
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229128
> Webrev: http://cr.openjdk.java.net/~pliden/8229128/webrev.0
>
> /Per


From per.liden at oracle.com  Mon Aug  5 11:56:35 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 5 Aug 2019 13:56:35 +0200
Subject: RFR: 8229128: ZGC: Remove unused ZThreadRootsIterator
In-Reply-To: <fc990588-d92a-79e9-bdaf-52f5d46d0ef0@oracle.com>
References: <77b7a1be-7965-6fca-a6c3-8f05b0e5b168@oracle.com>
 <fc990588-d92a-79e9-bdaf-52f5d46d0ef0@oracle.com>
Message-ID: <098c0a59-7381-f3c8-7126-4ccb966b6dd4@oracle.com>

Thanks Erik!

/Per

On 8/5/19 1:51 PM, Erik Österlund wrote:
> Hi Per,
> 
> Looks good and trivial.
> 
> Thanks,
> /Erik
> 
> On 2019-08-05 13:37, Per Liden wrote:
>> ZThreadRootsIterator is no longer used and can be removed. Trivial?
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229128
>> Webrev: http://cr.openjdk.java.net/~pliden/8229128/webrev.0
>>
>> /Per
> 


From per.liden at oracle.com  Mon Aug  5 11:55:53 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 5 Aug 2019 13:55:53 +0200
Subject: RFR: 8229127: Make some methods in the allocation path non-virtual
In-Reply-To: <83fe853c-c45d-1e62-2aa2-a09cb4b3b365@oracle.com>
References: <7c536ddb-cc72-fc13-bbb0-5cccdfcf7b21@oracle.com>
 <83fe853c-c45d-1e62-2aa2-a09cb4b3b365@oracle.com>
Message-ID: <87b4c3b3-f16a-3778-2e46-f3e09083618c@oracle.com>

Thanks Erik!

/Per

On 8/5/19 1:50 PM, Erik Österlund wrote:
> Hi Per,
> 
> Looks good and trivial.
> 
> Thanks,
> /Erik
> 
> On 2019-08-05 12:08, Per Liden wrote:
>> Some virtual methods in the allocation path are no longer overridden 
>> by any GC, so we can make them non-virtual until such a need arise 
>> again. Keeping CollectedHeap::array_allocate() virtual because ZGC 
>> still wants to override that path (as part of JDK-8227226, which is 
>> currently out for review).
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229127
>> Webrev: http://cr.openjdk.java.net/~pliden/8229127/webrev.0
>>
>> /Per
> 


From shade at redhat.com  Mon Aug  5 13:36:09 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 5 Aug 2019 15:36:09 +0200
Subject: RFR (XS) 8229134: [TESTBUG] 32-bit build fails
 gc/arguments/TestSurvivorAlignmentInBytesOption.java after JDK-8228855
Message-ID: <8c694fae-1f0d-ec6f-ae1e-de79ad5eef45@redhat.com>

Testbug:
  https://bugs.openjdk.java.net/browse/JDK-8229134

ObjectAlignmentInBytes is not available on 32-bit VMs. So the fix is to check for that before trying:
  http://cr.openjdk.java.net/~shade/8229134/webrev.01/

Testing: affected tests on x86_32, x86_64

-- 
Thanks,
-Aleksey


From per.liden at oracle.com  Mon Aug  5 13:47:54 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 5 Aug 2019 15:47:54 +0200
Subject: RFR: 8229135: ZGC: Adding missing ZStatTimerDisable before call to
 ZVerify::roots_strong()
Message-ID: <dfdbd982-6b68-1a8a-dda5-35285e667e3a@oracle.com>

ZVerify::roots_strong() is called outside of a ZStatTimerDisable scope, 
which means the root scanning stat counters/samplers will be polluted.
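
To illustrate the general pattern (this is not ZGC's implementation), a
scoped "disable" object can suppress sampling for everything called inside
its scope. StatTimerDisableSketch, disable_depth, recorded_samples and
record_sample are hypothetical names used only for this sketch.

```cpp
#include <cassert>

// Per-thread nesting depth of active "disable" scopes (hypothetical).
thread_local unsigned disable_depth = 0;

// Number of samples that were actually recorded (hypothetical).
unsigned long recorded_samples = 0;

// RAII guard: sampling is suppressed on this thread while one is alive.
class StatTimerDisableSketch {
public:
  StatTimerDisableSketch() { ++disable_depth; }
  ~StatTimerDisableSketch() {
    assert(disable_depth > 0);
    --disable_depth;
  }
  StatTimerDisableSketch(const StatTimerDisableSketch&) = delete;
  StatTimerDisableSketch& operator=(const StatTimerDisableSketch&) = delete;
};

// Stand-in for a stat counter/sampler update done during root scanning.
void record_sample() {
  if (disable_depth == 0) {
    ++recorded_samples;
  }
}
```

In this model, calling the verification code without first constructing the
guard lets its root scanning bump recorded_samples, which is the kind of
pollution described above.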

(This fix was originally part of JDK-8229017, but Stefan asked me to 
break this out into a separate fix)

Bug: https://bugs.openjdk.java.net/browse/JDK-8229135
Webrev: http://cr.openjdk.java.net/~pliden/8229135/webrev.0

/Per


From per.liden at oracle.com  Mon Aug  5 13:52:35 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 5 Aug 2019 15:52:35 +0200
Subject: RFR: 8229017: ZGC: Various cleanups of ZVerify
In-Reply-To: <cb119e29-f6a2-2054-5b00-afcf8c8167e7@oracle.com>
References: <cb119e29-f6a2-2054-5b00-afcf8c8167e7@oracle.com>
Message-ID: <9dd7c352-024e-adab-a7f0-756cc3b99218@oracle.com>

Stefan asked me to break out the ZStatTimerDisable fix into a separate 
fix, which I did (JDK-8229135), so here's an updated webrev without that 
part:

http://cr.openjdk.java.net/~pliden/8229017/webrev.1

/Per

On 8/2/19 11:40 AM, Per Liden wrote:
> Hi,
> 
> This patch does various cleanups of ZVerify, basically a post-commit 
> review of JDK-8227175. The patch mostly moves some code around and 
> adjusts a few names. However, there's also one bug fix and one logic 
> change:
> 
> * ZVerify::roots_strong() didn't have a ZStatTimerDisable.
> 
> * The call to ClassLoaderDataGraph::clear_claimed_marks() was moved from 
> ZMarkConcurrentRootsTask() to ZConcurrentRootsIterator(), and now only 
> clears the claim type the iterator actually used (instead of all types).
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229017
> Webrev: http://cr.openjdk.java.net/~pliden/8229017/webrev.0
> 
> /Per


From stefan.karlsson at oracle.com  Mon Aug  5 14:01:27 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 5 Aug 2019 16:01:27 +0200
Subject: RFR: 8229135: ZGC: Adding missing ZStatTimerDisable before call
 to ZVerify::roots_strong()
In-Reply-To: <dfdbd982-6b68-1a8a-dda5-35285e667e3a@oracle.com>
References: <dfdbd982-6b68-1a8a-dda5-35285e667e3a@oracle.com>
Message-ID: <77d9866e-1a57-8838-997e-5eb71e708434@oracle.com>

Looks good.

StefanK

On 2019-08-05 15:47, Per Liden wrote:
> ZVerify::roots_strong() is called outside of a ZStatTimerDisable scope, 
> which means the root scanning stat counters/samplers will be polluted.
> 
> (This fix was originally part of JDK-8229017, but Stefan asked me to 
> break this out into a separate fix)
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229135
> Webrev: http://cr.openjdk.java.net/~pliden/8229135/webrev.0
> 
> /Per


From stefan.karlsson at oracle.com  Mon Aug  5 14:03:15 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 5 Aug 2019 16:03:15 +0200
Subject: RFR: 8229128: ZGC: Remove unused ZThreadRootsIterator
In-Reply-To: <77b7a1be-7965-6fca-a6c3-8f05b0e5b168@oracle.com>
References: <77b7a1be-7965-6fca-a6c3-8f05b0e5b168@oracle.com>
Message-ID: <6c87b3ff-0c7e-aa20-40b0-791946693d14@oracle.com>

Looks good.

StefanK

On 2019-08-05 13:37, Per Liden wrote:
> ZThreadRootsIterator is no longer used and can be removed. Trivial?
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229128
> Webrev: http://cr.openjdk.java.net/~pliden/8229128/webrev.0
> 
> /Per


From per.liden at oracle.com  Mon Aug  5 14:04:05 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 5 Aug 2019 16:04:05 +0200
Subject: RFR: 8229135: ZGC: Adding missing ZStatTimerDisable before call
 to ZVerify::roots_strong()
In-Reply-To: <77d9866e-1a57-8838-997e-5eb71e708434@oracle.com>
References: <dfdbd982-6b68-1a8a-dda5-35285e667e3a@oracle.com>
 <77d9866e-1a57-8838-997e-5eb71e708434@oracle.com>
Message-ID: <1d27fe13-523b-598e-57c9-03ca16637ef6@oracle.com>

Thanks Stefan!

/Per

On 8/5/19 4:01 PM, Stefan Karlsson wrote:
> Looks good.
> 
> StefanK
> 
> On 2019-08-05 15:47, Per Liden wrote:
>> ZVerify::roots_strong() is called outside of a ZStatTimerDisable 
>> scope, which means the root scanning stat counters/samplers will be 
>> polluted.
>>
>> (This fix was originally part of JDK-8229017, but Stefan asked me to 
>> break this out into a separate fix)
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229135
>> Webrev: http://cr.openjdk.java.net/~pliden/8229135/webrev.0
>>
>> /Per


From per.liden at oracle.com  Mon Aug  5 14:04:20 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 5 Aug 2019 16:04:20 +0200
Subject: RFR: 8229128: ZGC: Remove unused ZThreadRootsIterator
In-Reply-To: <6c87b3ff-0c7e-aa20-40b0-791946693d14@oracle.com>
References: <77b7a1be-7965-6fca-a6c3-8f05b0e5b168@oracle.com>
 <6c87b3ff-0c7e-aa20-40b0-791946693d14@oracle.com>
Message-ID: <9023184c-a1e2-3352-e724-3588de998776@oracle.com>

Thanks Stefan!

/Per

On 8/5/19 4:03 PM, Stefan Karlsson wrote:
> Looks good.
> 
> StefanK
> 
> On 2019-08-05 13:37, Per Liden wrote:
>> ZThreadRootsIterator is no longer used and can be removed. Trivial?
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229128
>> Webrev: http://cr.openjdk.java.net/~pliden/8229128/webrev.0
>>
>> /Per


From stefan.karlsson at oracle.com  Mon Aug  5 14:07:15 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 5 Aug 2019 16:07:15 +0200
Subject: RFR: 8229129: ZGC: Fix incorrect format string for doubles
In-Reply-To: <c181ea08-dba7-c1ff-fc28-f5cf02d467ab@oracle.com>
References: <c181ea08-dba7-c1ff-fc28-f5cf02d467ab@oracle.com>
Message-ID: <1039fd75-3b7a-0279-87da-952032088e68@oracle.com>

Looks good.

StefanK

On 2019-08-05 13:50, Per Liden wrote:
> ZGC sometimes prints doubles with an incorrect format string, "%lf" 
> instead of "%f". The "l" doesn't cause any problems, but it also has no 
> meaning when printing doubles, so it should be removed.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229129
> Webrev: http://cr.openjdk.java.net/~pliden/8229129/webrev.0
> 
> /Per


From per.liden at oracle.com  Mon Aug  5 14:08:05 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 5 Aug 2019 16:08:05 +0200
Subject: RFR: 8229129: ZGC: Fix incorrect format string for doubles
In-Reply-To: <1039fd75-3b7a-0279-87da-952032088e68@oracle.com>
References: <c181ea08-dba7-c1ff-fc28-f5cf02d467ab@oracle.com>
 <1039fd75-3b7a-0279-87da-952032088e68@oracle.com>
Message-ID: <a54ddb79-2b6b-3008-1b92-dd2f351aeec8@oracle.com>

Thanks Stefan!

/Per

On 8/5/19 4:07 PM, Stefan Karlsson wrote:
> Looks good.
> 
> StefanK
> 
> On 2019-08-05 13:50, Per Liden wrote:
>> ZGC sometimes prints doubles with an incorrect format string, "%lf" 
>> instead of "%f". The "l" doesn't cause any problems, but it also has 
>> no meaning when printing doubles, so it should be removed.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229129
>> Webrev: http://cr.openjdk.java.net/~pliden/8229129/webrev.0
>>
>> /Per


From leo.korinth at oracle.com  Mon Aug  5 15:41:56 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Mon, 5 Aug 2019 17:41:56 +0200
Subject: RFR: 8224663: Parallel GC: Use WorkGang (5: ScavengeRootsTask)
In-Reply-To: <8CA8F38A-1B9F-4B7D-902E-846BF7718D2C@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <3ea47da5-4b1e-0301-461a-ede9047cb638@oracle.com>
 <8CA8F38A-1B9F-4B7D-902E-846BF7718D2C@oracle.com>
Message-ID: <2d970c64-0c42-8213-6432-754d73839783@oracle.com>

Hi!

On 20/07/2019 04:12, Kim Barrett wrote:
>> On May 27, 2019, at 12:30 PM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>
>> Hi,
>>
>> Here is the fifth patch in a proposed patch series of eight that
>> removes gcTaskManager and uses the WorkGang abstraction instead.
>>
>> ScavengeRootsTask, ThreadRootsTask and OldToYoungRootsTask is replaced
>> with ScavengeRootsTask. Code is basically the same, the major
>> difference is that roots are visited using EnumClaimer and
>> Threads::possibly_parallel_threads_do. Here we can reuse the RootType
>> and EnumClaimer from patch number two.
>>
>> The reason "case threads:" was removed is that the code is dead. That
>> part is confusing as the code is done (in parallel from the calling
>> function).
>>
>> Enhancement:
>>   https://bugs.openjdk.java.net/browse/JDK-8224663
>>
>> Webrev:
>> http://cr.openjdk.java.net/~lkorinth/workgang/0/_8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask/
>>   http://cr.openjdk.java.net/~lkorinth/workgang/0/all/
>>
>> Testing (on the whole patch series):
>>   mach5 remote-build-and-test --build-profiles linux-x64,linux-x64-debug,macosx-x64,solaris-sparcv9,windows-x64 --test open/test/hotspot/jtreg/:hotspot_gc -a -XX:+UseParallelGC
>>   gc test suite
>>
>> Thanks,
>> Leo
> 
> 8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask
> 8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask-fixup-1
> 8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask-fixup-2
> 
> Looks good, other than a couple formatting nits and some stale comments.
> 
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/parallel/psScavenge.cpp
> 55 #if INCLUDE_JVMCI
> 56 #include "jvmci/jvmci.hpp"
> 57 #endif
> 
> Conditional includes go at the end, per the style guide. There are
> probably counter-examples :)

Will correct my "fix".

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/parallel/psScavenge.cpp
> 337 };
> 338 //
> 339 // OldToYoungRootsTask
> 
> Add a blank line between the class and the block comment.

Will fix.

> ------------------------------------------------------------------------------
> 393   ScavengeRootsTask(
> 394     PSOldGen* old_gen,
> 395     HeapWord* gen_top,
> 396     uint active_workers,
> 397     bool is_empty) :
> 
> It's more usual to line these up with the "(" with the first parameter
> on the same line as the function name.

Will fix.

> 
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/parallel/psScavenge.cpp
> 339 // OldToYoungRootsTask
> 340 //
> 341 // This task is used to scan old to young roots in parallel
> 
> This comment seems like it needs some update, and maybe is misplaced?
> This seems like it's really a description of scavenge_contents_parallel,
> in conjunction with the old task creation model.

The comment is taken (without changes) from psTasks.hpp and describes 
the first part (OldToYoungRootsTask). The new task ScavengeRootsTask now 
does three things from the old code:
- OldToYoungRootsTask (where the comment is from)
- ScavengeRootsTask
- StealTask (depending on thread count)

I guess the problem could be bad naming from my side, maybe the name of 
ScavengeRootsTask ought to reflect this "merge". Maybe I should rename 
the task ScavengeRootsTask and maybe I should extract the method 
old_to_young_roots_task() and place the comment there? Maybe the comment 
is not needed at all?

How would you prefer to have it?

> 
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/parallel/psScavenge.cpp
> 446     // If active_workers can exceed 1, add a StrealTask.
> 447     // PSPromotionManager::drain_stacks_depth() does not fully drain its
> 448     // stacks and expects a StealTask to complete the draining if
> 449     // ParallelGCThreads is > 1.
> 
> Stale comment now?

I believe the comment is still valid; the logic is meant to be the same. 
Do you think the comment would be better if I s/Str?ealTask/steal_task(), 
or have I missed something? No change in behaviour was intended.

> 
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/parallel/psScavenge.cpp
> 438     for (Parallel::RootType::Value root_type; _enum_claimer.try_claim(root_type); /* empty */) {
> 439       scavenge_roots_task(root_type, worker_id);
> 440     }
> 
> For the future, maybe serial processing phases should be moved
> earlier.  OK to leave it for now to maintain correlation with the old
> code.

Sorry Kim, I do not understand what you suggest here. Are you referring 
to the order of the members of enum ParallelRootType, and thus the order 
in which they are dispatched (in parallel)?

Thanks,
Leo

> ------------------------------------------------------------------------------
> 


From leo.korinth at oracle.com  Mon Aug  5 16:13:02 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Mon, 5 Aug 2019 18:13:02 +0200
Subject: RFR: 8224664: Parallel GC: Use WorkGang (6: PSRefProcTaskProxy)
In-Reply-To: <40c31548-5e28-9f03-0bff-5afabd140f74@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <c71d7922-328a-b68d-1ec6-e6c435eaaa01@oracle.com>
 <40c31548-5e28-9f03-0bff-5afabd140f74@oracle.com>
Message-ID: <76cf4207-939a-4161-af94-7d3131b99ccb@oracle.com>

Hi!

On 28/07/2019 19:54, Thomas Schatzl wrote:
> Hi,
> 
> On 27.05.19 10:34, Leo Korinth wrote:
>> Hi,
>>
>> Here is the sixth patch in a proposed patch series of eight that
>> removes gcTaskManager and uses the WorkGang abstraction instead.
>>
>> Here, the new PSRefProcTask is composed of PSRefProcTaskProxy and the
>> old StealTask (that was partially moved before).
>>
>> Now both psTasks.* and pcTasks.* are removed.
>>
>> Enhancement:
>>    https://bugs.openjdk.java.net/browse/JDK-8224664
>>
>> Webrev:
>>
>> http://cr.openjdk.java.net/~lkorinth/workgang/0/_8224664-Parallel-GC-Use-WorkGang-6-PSRefProcTaskProxy/ 
>>
>>    http://cr.openjdk.java.net/~lkorinth/workgang/0/all/
>>
>> Testing (on the whole patch series):
>>    mach5 remote-build-and-test --build-profiles 
>> linux-x64,linux-x64-debug,macosx-x64,solaris-sparcv9,windows-x64 
>> --test open/test/hotspot/jtreg/:hotspot_gc -a -XX:+UseParallelGC
>>    gc test suite
> 
> 
> http://cr.openjdk.java.net/~lkorinth/workgang/2/_8224664-Parallel-GC-Use-WorkGang-6-PSRefProcTaskProxy 
> 
> 
> Potentially psCardTable.cpp could get a copyright update, it's up to 
> you. No need for an extra webrev for this change.

The copyright gets updated in patch #7, though it should already have been 
updated in #6, as you noted. I will not move the copyright update, since 
that would give you more to review. I guess you prefer it that way.

Thanks for reviewing,
Leo

> 
> Thanks,
>    Thomas
> 
> 
> 


From leo.korinth at oracle.com  Mon Aug  5 16:15:13 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Mon, 5 Aug 2019 18:15:13 +0200
Subject: RFR: 8224665: Parallel GC: Use WorkGang (7: remove task manager)
In-Reply-To: <3a695282-8b5a-54b0-8f52-e3509ae572c2@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <5bc88bed-7c2c-b4e7-fef2-2d457d7ba4f1@oracle.com>
 <75E8E7E0-DDE4-44EC-B561-3AC1DCCE7C86@oracle.com>
 <3a695282-8b5a-54b0-8f52-e3509ae572c2@oracle.com>
Message-ID: <8e47f17f-55f3-ef66-728f-2ab7c2c4f22e@oracle.com>

Thanks for reviewing Kim and Thomas
/Leo

On 28/07/2019 19:55, Thomas Schatzl wrote:
> Hi,
> 
> On 20.07.19 04:43, Kim Barrett wrote:
>>> On May 27, 2019, at 1:44 PM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>>
>>> Hi,
>>>
>>> Here is the seventh patch in a proposed patch series of eight that
>>> removes gcTaskManager and uses the WorkGang abstraction instead.
>>>
>>> We try to remove everything task manager and task thread related.
>>>
>>> Some utility methods (gc_threads_do, print_gc_threads_on) are changed
>>> to use the WorkGang versions. Some worker counts are fetched from
>>> WorkGang as well. Most of the change is just code deletion.
>>>
>>> Enhancement:
>>>   https://bugs.openjdk.java.net/browse/JDK-8224665
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~lkorinth/workgang/0/_8224665-Parallel-GC-Use-WorkGang-7-remove-task-manager/ 
>>>
>>>   http://cr.openjdk.java.net/~lkorinth/workgang/0/all/
>>>
>>> Testing (on the whole patch series):
>>>   mach5 remote-build-and-test --build-profiles 
>>> linux-x64,linux-x64-debug,macosx-x64,solaris-sparcv9,windows-x64 
>>> --test open/test/hotspot/jtreg/:hotspot_gc -a -XX:+UseParallelGC
>>>   gc test suite
>>>
>>> Thanks,
>>> Leo
>>
>> Looks good.
>>
> 
>    looks good.
> 
> Thomas


From leo.korinth at oracle.com  Mon Aug  5 16:16:39 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Mon, 5 Aug 2019 18:16:39 +0200
Subject: RFR: 8224662: Parallel GC: Use WorkGang (4:
 SharedRestorePreservedMarksTaskExecutor)
In-Reply-To: <b2b1ff2f-db31-0b7c-7a17-9d367c364715@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <226f085b-6767-5ca8-a32e-76c0eea2de63@oracle.com>
 <CF985BD7-16D3-43A5-88F0-3DBD9D29466A@oracle.com>
 <b2b1ff2f-db31-0b7c-7a17-9d367c364715@oracle.com>
Message-ID: <b1024464-a947-e4fa-4082-0dee22130e6a@oracle.com>

Thanks for reviewing Kim and Thomas!
/Leo

On 28/07/2019 19:43, Thomas Schatzl wrote:
> Hi,
> 
> On 02.07.19 18:06, Kim Barrett wrote:
>>> On May 27, 2019, at 4:56 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>> [...]
>>>
>>> Enhancement:
>>>   https://bugs.openjdk.java.net/browse/JDK-8224662
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~lkorinth/workgang/0/_8224662-Parallel-GC-Use-WorkGang-4-SharedRestorePreservedMarksTaskExecutor/ 
>>>
>>>   http://cr.openjdk.java.net/~lkorinth/workgang/0/all/
>>>
>>> Testing (on the whole patch series):
>>>   mach5 remote-build-and-test --build-profiles 
>>> linux-x64,linux-x64-debug,macosx-x64,solaris-sparcv9,windows-x64 
>>> --test open/test/hotspot/jtreg/:hotspot_gc -a -XX:+UseParallelGC
>>>   gc test suite
>>>
>>> Thanks,
>>> Leo
>>
>> Looks good.
>>
> 
>    looks good.
> 
> Thanks,
>    Thomas


From leo.korinth at oracle.com  Mon Aug  5 16:27:01 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Mon, 5 Aug 2019 18:27:01 +0200
Subject: RFR: 8224666: Parallel GC: Use WorkGang (8: obsolete and remove
 flags)
In-Reply-To: <86684073-36cc-5c86-7f0c-313c1bb129d8@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <c03f4b8d-48df-0c66-dc84-3f84c5c7b26a@oracle.com>
 <CD0928E2-2B37-42D4-A7F6-5BC13ED45796@oracle.com>
 <86684073-36cc-5c86-7f0c-313c1bb129d8@oracle.com>
Message-ID: <c91eafb8-3420-3a12-d1b5-24ac32d3e378@oracle.com>


On 28/07/2019 20:19, Thomas Schatzl wrote:
> Hi,
> 
> On 20.07.19 04:55, Kim Barrett wrote:
>>> On May 29, 2019, at 8:55 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>>
>>> Hi,
>>>
>>> Here is the eighth and last patch that removes gcTaskManager and uses
>>> the WorkGang abstraction instead.
> [...]
>>>
>>> Enhancement:
>>>   https://bugs.openjdk.java.net/browse/JDK-8224666
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~lkorinth/workgang/0/_8224666-Parallel-GC-Use-WorkGang-8-obsolete-and-remove-flags/ 
>>>
>>>   http://cr.openjdk.java.net/~lkorinth/workgang/0/all/
>>>
>>> Testing (on the whole patch series):
>>>   mach5 remote-build-and-test --build-profiles 
>>> linux-x64,linux-x64-debug,macosx-x64,solaris-sparcv9,windows-x64 
>>> --test open/test/hotspot/jtreg/:hotspot_gc -a -XX:+UseParallelGC
>>>   gc test suite
>>>
>>> Thanks,
>>> Leo
>>
>> 8224666-Parallel-GC-Use-WorkGang-8-obsolete-and-remove-flags
>> 8224666-Parallel-GC-Use-WorkGang-8-obsolete-and-remove-flags-fixup-1
>>
>> Looks good.
>>
>> I see there is a draft CSR for removal of these product flags
>> (JDK-8224668).  I think it needs to be pushed along in the state
>> machine so it can be reviewed.  I've reviewed the current text and it
>> looks okay to me.
>>
> 
> I updated the CSR a little too and signed it. Not sure if we should keep 
> the name since it is not descriptive enough. We should ask if it is 
> possible to have different names for CSR and bugfix.
> 
> Thanks,
>    Thomas

I changed the title to:
Parallel GC: Obsolete and remove flags BindGCTaskThreadsToCPUs and 
UseGCTaskAffinity

Is that better? If you have a better suggestion please change the title, 
I will probably agree :-)

Is there anything more I need to do?

Thanks,
Leo


From dean.long at oracle.com  Mon Aug  5 20:04:09 2019
From: dean.long at oracle.com (dean.long at oracle.com)
Date: Mon, 5 Aug 2019 13:04:09 -0700
Subject: RFR (XS): Optimize branch frequency of G1's write post-barrier in
 C2
In-Reply-To: <41520e0a-671a-de55-24ab-6615fc456459@oracle.com>
References: <CA+w6HxZ6pVzNhdJ9r7DOX75gpS4fvOFmuf5UJqvBhK=PxKWgDg@mail.gmail.com>
 <ca559898d832da3dfde9bef800f39e9a7c9c6b45.camel@oracle.com>
 <CA+w6HxbB7iV7sDo2SLrhkQaQrGx3MrQG3SXrH5pFCQ2SMvBtjA@mail.gmail.com>
 <41520e0a-671a-de55-24ab-6615fc456459@oracle.com>
Message-ID: <70d36c8e-4730-e58a-a186-57bd4ad2728d@oracle.com>

Looks OK to me

dl

On 8/3/19 12:27 PM, Thomas Schatzl wrote:
> ping at compiler team to have a quick look.
>
> Thanks,
>   Thomas
>
> On 11.07.19 16:35, Man Cao wrote:
>> Thanks Thomas for the review and running experiments!
>>
>>  > - can you share the code changes to generate the statistics? It would
>>  > be nice to confirm these on a few more applications and play around
>>  > with them a bit :)
>>  > I would like to confirm some very old numbers we have for other older
>>  > benchmarks that this is indeed the best probability distribution.
>>  > Particularly I do not understand that from these numbers we did not
>>  > change the probabilities as you suggested :( There were other changes
>>  > mostly related to barrier elision in that time frame, but it seems
>>  > likelihood changes were not attempted.
>>
>> It is here: http://cr.openjdk.java.net/~manc/8225776/branch_profiling/
>> I also added a comment in 
>> https://bugs.openjdk.java.net/browse/JDK-8225776 to clarify the 
>> methodology.
>>
>>  > - these numbers (and yours) also indicate that the not-young check is
>>  > very likely to be not taken (i.e. you jump over the storeload). Did you
>>  > also perform some experiments changing the order a bit?
>>  > It might be detrimental for this particular case where the StoreLoad is
>>  > expensive, and the xor/non-null filter out at least some additional of
>>  > those, but maybe
>>  > if (young) -> exit
>>  > if (different-region) -> exit
>>  > if (non-null) -> exit
>>  > StoreLoad
>>  > ...
>>  > may be better to do? I am aware that the "young" check adds a load,
>>  > which is also expensive (but not as much as the StoreLoad), but it
>>  > seems to be an interesting case to look at.
>>  >
>>  > In our old results (as far as I can interpret them) it did not seem to
>>  > have any advantage/disadvantage, so I am just curious whether you did
>>  > such tests and their conclusion.
>>
>> Yes, I did this experiment. The load from card table on the fast path 
>> turns out to be expensive for several benchmarks:
>> https://cr.openjdk.java.net/~manc/8225776/20190516-jdk11G1WriteBarrier-dacapoDefault4G-YoungCheckFirst.html 
>>
>> For this experiment, I was setting 4G heap with -XX:NewRatio=1, so 
>> most writes happen to young objects, and GC happens very infrequently.
>> The implementation had a bug that caused some benchmarks to crash while 
>> running. I didn't look into fixing the bug, as this direction does 
>> not seem worthwhile.
>>
>>  > - internal (quick) perf testing showed no overall score changes, except
>>  > that maxJOPS on SpecJBB2015 seemed to improve by ~1.2% (only had time
>>  > for very few experiments at this time, will rerun, so there is some
>>  > chance that this has been a fluke) which is definitely nice.
>>
>> Good to hear that!
>> -Man
>


From manc at google.com  Mon Aug  5 20:15:58 2019
From: manc at google.com (Man Cao)
Date: Mon, 5 Aug 2019 13:15:58 -0700
Subject: RFR (XS): Optimize branch frequency of G1's write post-barrier in
 C2
In-Reply-To: <70d36c8e-4730-e58a-a186-57bd4ad2728d@oracle.com>
References: <CA+w6HxZ6pVzNhdJ9r7DOX75gpS4fvOFmuf5UJqvBhK=PxKWgDg@mail.gmail.com>
 <ca559898d832da3dfde9bef800f39e9a7c9c6b45.camel@oracle.com>
 <CA+w6HxbB7iV7sDo2SLrhkQaQrGx3MrQG3SXrH5pFCQ2SMvBtjA@mail.gmail.com>
 <41520e0a-671a-de55-24ab-6615fc456459@oracle.com>
 <70d36c8e-4730-e58a-a186-57bd4ad2728d@oracle.com>
Message-ID: <CA+w6HxYgr=ZGS46+H-2hOLd0h+G9tZKhQM_sbwsAgJeAopj5KA@mail.gmail.com>

Thanks for the reviews!

-Man


On Mon, Aug 5, 2019 at 1:04 PM <dean.long at oracle.com> wrote:

> Looks OK to me
>
> dl
>
> On 8/3/19 12:27 PM, Thomas Schatzl wrote:
> > ping at compiler team to have a quick look.
> >
> > Thanks,
> >   Thomas
> >
> > On 11.07.19 16:35, Man Cao wrote:
> >> Thanks Thomas for the review and running experiments!
> >>
> >>  > - can you share the code changes to generate the statistics? It would
> >>  > be nice to confirm these on a few more applications and play around
> >>  > with them a bit :)
> >>  > I would like to confirm some very old numbers we have for other older
> >>  > benchmarks that this is indeed the best probability distribution.
> >>  > Particularly I do not understand that from these numbers we did not
> >>  > change the probabilities as you suggested :( There were other changes
> >>  > mostly related to barrier elision in that time frame, but it seems
> >>  > likelihood changes were not attempted.
> >>
> >> It is here: http://cr.openjdk.java.net/~manc/8225776/branch_profiling/
> >> I also added a comment in
> >> https://bugs.openjdk.java.net/browse/JDK-8225776 to clarify the
> >> methodology.
> >>
> >>  > - these numbers (and yours) also indicate that the not-young check is
> >>  > very likely to be not taken (i.e. you jump over the storeload). Did you
> >>  > also perform some experiments changing the order a bit?
> >>  > It might be detrimental for this particular case where the StoreLoad is
> >>  > expensive, and the xor/non-null filter out at least some additional of
> >>  > those, but maybe
> >>  > if (young) -> exit
> >>  > if (different-region) -> exit
> >>  > if (non-null) -> exit
> >>  > StoreLoad
> >>  > ...
> >>  > may be better to do? I am aware that the "young" check adds a load,
> >>  > which is also expensive (but not as much as the StoreLoad), but it
> >>  > seems to be an interesting case to look at.
> >>  >
> >>  > In our old results (as far as I can interpret them) it did not seem to
> >>  > have any advantage/disadvantage, so I am just curious whether you did
> >>  > such tests and their conclusion.
> >>
> >> Yes, I did this experiment. The load from card table on the fast path
> >> turns out to be expensive for several benchmarks:
> >> https://cr.openjdk.java.net/~manc/8225776/20190516-jdk11G1WriteBarrier-dacapoDefault4G-YoungCheckFirst.html
> >>
> >> For this experiment, I was setting 4G heap with -XX:NewRatio=1, so
> >> most writes happen to young objects, and GC happens very infrequently.
> >> The implementation had a bug that caused some benchmarks to crash while
> >> running. I didn't look into fixing the bug, as this direction does
> >> not seem worthwhile.
> >>
> >>  > - internal (quick) perf testing showed no overall score changes, except
> >>  > that maxJOPS on SpecJBB2015 seemed to improve by ~1.2% (only had time
> >>  > for very few experiments at this time, will rerun, so there is some
> >>  > chance that this has been a fluke) which is definitely nice.
> >>
> >> Good to hear that!
> >> -Man
> >
>
>


From kim.barrett at oracle.com  Mon Aug  5 21:51:49 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 5 Aug 2019 17:51:49 -0400
Subject: RFR: 8224663: Parallel GC: Use WorkGang (5: ScavengeRootsTask)
In-Reply-To: <2d970c64-0c42-8213-6432-754d73839783@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <3ea47da5-4b1e-0301-461a-ede9047cb638@oracle.com>
 <8CA8F38A-1B9F-4B7D-902E-846BF7718D2C@oracle.com>
 <2d970c64-0c42-8213-6432-754d73839783@oracle.com>
Message-ID: <E3F39DF8-79D1-4916-B59C-785A57C9E5BD@oracle.com>

> On Aug 5, 2019, at 11:41 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
> 
> Hi!
> 
> On 20/07/2019 04:12, Kim Barrett wrote:
>>> On May 27, 2019, at 12:30 PM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>> 
>>> Hi,
>>> 
>>> Here is the fifth patch in a proposed patch series of eight that
>>> removes gcTaskManager and uses the WorkGang abstraction instead.
>>> 
>>> ScavengeRootsTask, ThreadRootsTask and OldToYoungRootsTask are replaced
>>> with ScavengeRootsTask. Code is basically the same, the major
>>> difference is that roots are visited using EnumClaimer and
>>> Threads::possibly_parallel_threads_do. Here we can reuse the RootType
>>> and EnumClaimer from patch number two.
>>> 
>>> The reason "case threads:" was removed is that the code is dead. That
>>> part is confusing as the code is done (in parallel from the calling
>>> function).
>>> 
>>> Enhancement:
>>>  https://bugs.openjdk.java.net/browse/JDK-8224663
>>> 
>>> Webrev:
>>> http://cr.openjdk.java.net/~lkorinth/workgang/0/_8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask/
>>>  http://cr.openjdk.java.net/~lkorinth/workgang/0/all/
>>> 
>>> Testing (on the whole patch series):
>>>  mach5 remote-build-and-test --build-profiles linux-x64,linux-x64-debug,macosx-x64,solaris-sparcv9,windows-x64 --test open/test/hotspot/jtreg/:hotspot_gc -a -XX:+UseParallelGC
>>>  gc test suite
>>> 
>>> Thanks,
>>> Leo
>> 8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask
>> 8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask-fixup-1
>> 8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask-fixup-2
>> Looks good, other than a couple formatting nits and some stale comments.
>> ------------------------------------------------------------------------------
>> src/hotspot/share/gc/parallel/psScavenge.cpp
>> 55 #if INCLUDE_JVMCI
>> 56 #include "jvmci/jvmci.hpp"
>> 57 #endif
>> Conditional includes go at the end, per the style guide. There are
>> probably counter-examples :)
> 
> Will correct my "fix".
> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/gc/parallel/psScavenge.cpp
>> 337 };
>> 338 //
>> 339 // OldToYoungRootsTask
>> Add a blank line between the class and the block comment.
> 
> Will fix.
> 
>> ------------------------------------------------------------------------------
>> 393   ScavengeRootsTask(
>> 394     PSOldGen* old_gen,
>> 395     HeapWord* gen_top,
>> 396     uint active_workers,
>> 397     bool is_empty) :
>> It's more usual to line these up with the "(" with the first parameter
>> on the same line as the function name.
> 
> Will fix.
> 
>> ------------------------------------------------------------------------------
>> src/hotspot/share/gc/parallel/psScavenge.cpp
>> 339 // OldToYoungRootsTask
>> 340 //
>> 341 // This task is used to scan old to young roots in parallel
>> This comment seems like it needs some update, and maybe is misplaced?
>> This seems like it's really a description of scavenge_contents_parallel,
>> in conjunction with the old task creation model.
> 
> The comment is taken (without changes) from psTasks.hpp and describes the first part (OldToYoungRootsTask). The new task ScavengeRootsTask now does three things from the old code:
> - OldToYoungRootsTask (where the comment is from)
> - ScavengeRootsTask
> - StealTask (depending on thread count)
> 
> I guess the problem could be bad naming from my side, maybe the name of ScavengeRootsTask ought to reflect this "merge". Maybe I should rename the task ScavengeRootsTask and maybe I should extract the method old_to_young_roots_task() and place the comment there? Maybe the comment is not needed at all?
> 
> How would you prefer to have it?

OldToYoungRootsTask basically consisted of the call to
PSCardTable::scavenge_contents_parallel, and this comment really seems
to be about how that function works.  Maybe it should be moved there,
and tidied up for that new location.

>> src/hotspot/share/gc/parallel/psScavenge.cpp
>> 446     // If active_workers can exceed 1, add a StrealTask.
>> 447     // PSPromotionManager::drain_stacks_depth() does not fully drain its
>> 448     // stacks and expects a StealTask to complete the draining if
>> 449     // ParallelGCThreads is > 1.
>> Stale comment now?
> 
> I believe the comment is still valid, the logic is meant to be the same. Do you think the comment is better if I s/Str?ealTask/steal_task() or have I missed something? Because no change in the behaviour was done on purpose.

Oh, I see.  The comment's reference to PSPM::drain_stacks_depth()
really should be to the call to drain_stacks(false) earlier in the function
containing the comment.  (Which were in different functions before
your changes, so yay for better co-location of code and comments.)
[pre-existing]

There's no such thing as a StealTask anymore.  It should be referring
to steal_task().

In workgang nomenclature, scavenge_roots_task and steal_task are
perhaps misnamed.  They aren't tasks, they are helper work functions.
Maybe they should be called scavenge_roots_work and steal_work?

I don't remember if there were similar possible naming issues
elsewhere in this cluster of changes.  And you should verify the
naming convention with someone else before making any changes in
response to this comment.
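
As a rough sketch of the distinction (stand-in types only, not the real
HotSpot classes): GangTaskSketch, RootClaimerSketch, RootKind and
ScavengeRootsTaskSketch are all hypothetical names invented for this
illustration. The point is just that there is a single task whose work()
is run by each worker, and the *_work functions are helpers it calls.

```cpp
#include <atomic>

// Hypothetical stand-in for a gang task: one task object whose work()
// is executed once by every active worker.
class GangTaskSketch {
public:
  virtual ~GangTaskSketch() {}
  virtual void work(unsigned worker_id) = 0;
};

// Hypothetical root kinds; the real root type enum has other members.
enum RootKind { universe_roots, jni_roots, class_roots, root_kind_count };

// Hypothetical claimer: each root kind is handed to exactly one worker.
class RootClaimerSketch {
  std::atomic<int> _next;
public:
  RootClaimerSketch() : _next(0) {}
  bool try_claim(RootKind& kind) {
    int claimed = _next.fetch_add(1);
    if (claimed >= root_kind_count) {
      return false;  // everything has already been claimed
    }
    kind = static_cast<RootKind>(claimed);
    return true;
  }
};

class ScavengeRootsTaskSketch : public GangTaskSketch {
  RootClaimerSketch _claimer;
  unsigned          _active_workers;
  std::atomic<int>  _roots_processed;
  std::atomic<int>  _steal_calls;

  // Helper work functions: not tasks, just pieces of work().
  // The bodies only count calls; real code would scan roots and steal.
  void scavenge_roots_work(RootKind /* kind */, unsigned /* worker_id */) {
    _roots_processed.fetch_add(1);
  }
  void steal_work(unsigned /* worker_id */) {
    _steal_calls.fetch_add(1);
  }

public:
  explicit ScavengeRootsTaskSketch(unsigned active_workers)
    : _active_workers(active_workers), _roots_processed(0), _steal_calls(0) {}

  virtual void work(unsigned worker_id) {
    RootKind kind;
    while (_claimer.try_claim(kind)) {
      scavenge_roots_work(kind, worker_id);
    }
    // Stealing only makes sense when there is more than one worker.
    if (_active_workers > 1) {
      steal_work(worker_id);
    }
  }

  int roots_processed() const { return _roots_processed.load(); }
  int steal_calls() const { return _steal_calls.load(); }
};
```

Whichever suffix is chosen, the class is the only thing handed to the
work gang; the helpers never are.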

>> src/hotspot/share/gc/parallel/psScavenge.cpp
>> 438     for (Parallel::RootType::Value root_type; _enum_claimer.try_claim(root_type); /* empty */) {
>> 439       scavenge_roots_task(root_type, worker_id);
>> 440     }
>> For the future, maybe serial processing phases should be moved
>> earlier.  OK to leave it for now to maintain correlation with the old
>> code.
> 
> Sorry Kim, I do not understand what you suggest here. Are you referring to the order of the members of enum ParallelRootType, and thus the order in which they are dispatched (in parallel)?

I was wondering if the serial subtasks in scavenge_roots_task might
not be better scheduled before card_table->scavenge_contents_parallel().
But now that I better understand how the latter works, I no longer
think so.

I think we already talked about the order of the ParallelRootType enumerators
and how that might be important for scheduling, and agreed that could be
looked at later.

And finally, a couple more minor things that I missed earlier, both pre-existing.

------------------------------------------------------------------------------
src/hotspot/share/gc/parallel/psScavenge.cpp
416       assert(!_old_gen->object_space()->is_empty(),
417         "Should not be called is there is no work");
418       assert(_old_gen != NULL, "Sanity");

[pre-existing]
Checking _old_gen != NULL after already using it in the previous
assert is kind of pointless.  These asserts should be reordered.
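
A minimal sketch of the reordering, using hypothetical stand-in types
(SpaceSketch, OldGenSketch, old_gen_has_work) and the standard assert
macro rather than the real HotSpot classes and assert; only the order
of the two checks is the point.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical stand-ins for the old generation and its object space.
struct SpaceSketch {
  bool empty;
  bool is_empty() const { return empty; }
};

struct OldGenSketch {
  SpaceSketch space;
  SpaceSketch* object_space() { return &space; }
};

bool old_gen_has_work(OldGenSketch* old_gen) {
  // Check the pointer first, so a NULL old_gen trips this assert
  // instead of being dereferenced by the next one.
  assert(old_gen != NULL && "Sanity");
  assert(!old_gen->object_space()->is_empty() &&
         "Should not be called if there is no work");
  return !old_gen->object_space()->is_empty();
}
```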

------------------------------------------------------------------------------
src/hotspot/share/gc/parallel/psScavenge.cpp
381 // to the start of stride 0 in slice 1.

[pre-existing]
s/stride/stripe/

------------------------------------------------------------------------------


From kim.barrett at oracle.com  Tue Aug  6 00:45:47 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 5 Aug 2019 20:45:47 -0400
Subject: RFR(S): 8227225: ParallelGC: add subspace transitions for young
 gen for gc+heap=info log lines
In-Reply-To: <CAOzU2ink2-2kq8bOQ4s3A8vPenqWzuPRL+8P=nn6_uD3Wpi7hg@mail.gmail.com>
References: <CAOzU2inXqbueM=9k26Dre956FPKGiuDq3tnEhot_3vHLSgm6Eg@mail.gmail.com>
 <877bd17cff3be20bf8acd95d9eddbb5c9cfb7cf5.camel@oracle.com>
 <CAOzU2imCNQSvUG0szfAs9+RwqYvQSCBJg8t0mERPiFKEmZh4pA@mail.gmail.com>
 <c8a2d4fa1119bbc517161064a9ba93b5e262222c.camel@oracle.com>
 <CAOzU2i=HxwSFD5TS91ipTWW_Natewmd6um=cQbzdg84ygMq5KQ@mail.gmail.com>
 <CAOzU2ink2-2kq8bOQ4s3A8vPenqWzuPRL+8P=nn6_uD3Wpi7hg@mail.gmail.com>
Message-ID: <3FB0759C-FB9A-418A-8BDA-9BE948C8FC00@oracle.com>

> On Jul 29, 2019, at 10:41 AM, Tony Printezis <tprintezis at twitter.com> wrote:
> 
> Hi Thomas,
> 
> Latest webrev here:
> 
> http://cr.openjdk.java.net/~tonyp/8227225/webrev.1/
> 
> Main change: I renamed the PreGCValues class to PreGenGCValues so that it's
> clear it's mainly for generational GCs.

------------------------------------------------------------------------------
src/hotspot/share/gc/shared/preGCValues.hpp 
  63   const size_t _young_gen_used;
  64   const size_t _young_gen_capacity;
  65   const size_t _eden_used;
  66   const size_t _eden_capacity;
  67   const size_t _from_used;
  68   const size_t _from_capacity;
  69   const size_t _old_gen_used;
  70   const size_t _old_gen_capacity;
  71   const metaspace::MetaspaceSizesSnapshot _meta_sizes;

Making these members const prevents assignment by the default
assignment operator.  I don't know if that's intentional, but it seems
unnecessary.
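
To illustrate what I mean, here is a small sketch with two hypothetical
structs (ConstValuesSketch and PlainValuesSketch, not the class in the
webrev). It assumes a C++11 compiler for <type_traits> and static_assert,
which is only for the demonstration.

```cpp
#include <cstddef>
#include <type_traits>

// Hypothetical cut-down version with const members.
struct ConstValuesSketch {
  const std::size_t _young_gen_used;
  const std::size_t _young_gen_capacity;
};

// The same data without the const qualifiers.
struct PlainValuesSketch {
  std::size_t _young_gen_used;
  std::size_t _young_gen_capacity;
};

// Copy construction is unaffected by const members...
static_assert(std::is_copy_constructible<ConstValuesSketch>::value,
              "const members do not prevent copy construction");
// ...but the implicitly declared copy assignment operator is deleted.
static_assert(!std::is_copy_assignable<ConstValuesSketch>::value,
              "const members prevent default assignment");
static_assert(std::is_copy_assignable<PlainValuesSketch>::value,
              "non-const members allow default assignment");
```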

The _meta_sizes const qualifier is pre-existing.

------------------------------------------------------------------------------

Other than that, looks good.  I don't need a new webrev if you decide
to remove those const qualifiers.


From kim.barrett at oracle.com  Tue Aug  6 01:25:20 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 5 Aug 2019 21:25:20 -0400
Subject: RFR(T): 8229156: ProblemList
 gc/stress/gclocker/TestExcessGCLockerCollections.java 
Message-ID: <7A59994D-1E02-462E-8814-14C34089FD2D@oracle.com>

Please review adding the named test to the ProblemList. It's a brand
new test that turned out to be both a not very good test and to
intermittently provide false negatives.  (mea culpa)

The fix it is testing (JDK-8048556, where this test was added) has
received additional manual checking and still looks good, but we need
in-progress changes for some RFEs (JDK-8227225 and followups for other
collectors) to really fix this test.

diff -r c38cca5ffb66 -r 0c6e374d42e8 test/hotspot/jtreg/ProblemList.txt
--- a/test/hotspot/jtreg/ProblemList.txt	Mon Aug 05 11:16:48 2019 -0400
+++ b/test/hotspot/jtreg/ProblemList.txt	Mon Aug 05 21:12:11 2019 -0400
@@ -77,6 +77,7 @@
 gc/g1/humongousObjects/objectGraphTest/TestObjectGraphAfterGC.java 8156755 generic-all
 gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all
 gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all
+gc/stress/gclocker/TestExcessGCLockerCollections.java 8229120 generic-all
 gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
 gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
 gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all


From leo.korinth at oracle.com  Tue Aug  6 08:17:15 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Tue, 6 Aug 2019 10:17:15 +0200
Subject: RFR: 8224663: Parallel GC: Use WorkGang (5: ScavengeRootsTask)
In-Reply-To: <E3F39DF8-79D1-4916-B59C-785A57C9E5BD@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <3ea47da5-4b1e-0301-461a-ede9047cb638@oracle.com>
 <8CA8F38A-1B9F-4B7D-902E-846BF7718D2C@oracle.com>
 <2d970c64-0c42-8213-6432-754d73839783@oracle.com>
 <E3F39DF8-79D1-4916-B59C-785A57C9E5BD@oracle.com>
Message-ID: <7c703ebd-7ebe-5ec0-c2e9-587798b91c28@oracle.com>


On 05/08/2019 23:51, Kim Barrett wrote:
>> On Aug 5, 2019, at 11:41 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>
>> Hi!
>>
>> On 20/07/2019 04:12, Kim Barrett wrote:
>>>> On May 27, 2019, at 12:30 PM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> Here is the fifth patch in a proposed patch series of eight that
>>>> removes gcTaskManager and uses the WorkGang abstraction instead.
>>>>
>>>> ScavengeRootsTask, ThreadRootsTask and OldToYoungRootsTask are replaced
>>>> with ScavengeRootsTask. Code is basically the same, the major
>>>> difference is that roots are visited using EnumClaimer and
>>>> Threads::possibly_parallel_threads_do. Here we can reuse the RootType
>>>> and EnumClaimer from patch number two.
>>>>
>>>> The reason "case threads:" was removed is that the code is dead. That
>>>> part is confusing as the code is done (in parallel from the calling
>>>> function).
>>>>
>>>> Enhancement:
>>>>   https://bugs.openjdk.java.net/browse/JDK-8224663
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~lkorinth/workgang/0/_8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask/
>>>>   http://cr.openjdk.java.net/~lkorinth/workgang/0/all/
>>>>
>>>> Testing (on the whole patch series):
>>>>   mach5 remote-build-and-test --build-profiles linux-x64,linux-x64-debug,macosx-x64,solaris-sparcv9,windows-x64 --test open/test/hotspot/jtreg/:hotspot_gc -a -XX:+UseParallelGC
>>>>   gc test suite
>>>>
>>>> Thanks,
>>>> Leo
>>> 8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask
>>> 8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask-fixup-1
>>> 8224663-Parallel-GC-Use-WorkGang-5-ScavengeRootsTask-fixup-2
>>> Looks good, other than a couple formatting nits and some stale comments.
>>> ------------------------------------------------------------------------------
>>> src/hotspot/share/gc/parallel/psScavenge.cpp
>>> 55 #if INCLUDE_JVMCI
>>> 56 #include "jvmci/jvmci.hpp"
>>> 57 #endif
>>> Conditional includes go at the end, per the style guide. There are
>>> probably counter-examples :)
>>
>> Will correct my "fix".
>>
>>> ------------------------------------------------------------------------------
>>> src/hotspot/share/gc/parallel/psScavenge.cpp
>>> 337 };
>>> 338 //
>>> 339 // OldToYoungRootsTask
>>> Add a blank line between the class and the block comment.
>>
>> Will fix.
>>
>>> ------------------------------------------------------------------------------
>>> 393   ScavengeRootsTask(
>>> 394     PSOldGen* old_gen,
>>> 395     HeapWord* gen_top,
>>> 396     uint active_workers,
>>> 397     bool is_empty) :
>>> It's more usual to line these up with the "(" with the first parameter
>>> on the same line as the function name.
>>
>> Will fix.
>>
>>> ------------------------------------------------------------------------------
>>> src/hotspot/share/gc/parallel/psScavenge.cpp
>>> 339 // OldToYoungRootsTask
>>> 340 //
>>> 341 // This task is used to scan old to young roots in parallel
>>> This comment seems like it needs some update, and maybe is misplaced?
>>> This seems like it's really a description of scavenge_contents_parallel,
>>> in conjunction with the old task creation model.
>>
>> The comment is taken (without changes) from psTasks.hpp and describes the first part (OldToYoungRootsTask). The new task ScavengeRootsTask now does three things from the old code:
>> - OldToYoungRootsTask (where the comment is from)
>> - ScavengeRootsTask
>> - StealTask (depending on thread count)
>>
>> I guess the problem could be bad naming from my side, maybe the name of ScavengeRootsTask ought to reflect this "merge". Maybe I should rename the task ScavengeRootsTask and maybe I should extract the method old_to_young_roots_task() and place the comment there? Maybe the comment is not needed at all?
>>
>> How would you prefer to have it?
> 
> OldToYoungRootsTask basically consisted of the call to
> PSCardTable::scavenge_contents_parallel, and this comment really seems
> to be about how that function works.  Maybe it should be moved there,
> and tidied up for that new location.

Okay. Will try to do that.

> 
>>> src/hotspot/share/gc/parallel/psScavenge.cpp
>>> 446     // If active_workers can exceed 1, add a StrealTask.
>>> 447     // PSPromotionManager::drain_stacks_depth() does not fully drain its
>>> 448     // stacks and expects a StealTask to complete the draining if
>>> 449     // ParallelGCThreads is > 1.
>>> Stale comment now?
>>
>> I believe the comment is still valid, the logic is meant to be the same. Do you think the comment is better if I s/Str?ealTask/steal_task() or have I missed something? Because no change in the behaviour was done on purpose.
> 
> Oh, I see.  The comment's reference to PSPM::drain_stacks_depth()
> really should be to the call to drain_stacks(false) earlier in the function
> containing the comment.  (Which were in different functions before
> your changes, so yay for better co-location of code and comments.)
> [pre-existing]
> 
> There's no such thing as a StealTask anymore.  It should be referring
> to steal_task().
> 
> In workgang nomenclature, scavenge_roots_task and steal_task are
> perhaps misnamed.  They aren't tasks, they are helper work functions.
> Maybe they should be called scavenge_roots_work and steal_work?

You are probably right.

> 
> I don't remember if there were similar possible naming issues
> elsewhere in this cluster of changes.  And you should verify the
> naming convention with someone else before making any changes in
> response to this comment.
> 

I have consistently named all these "*_work" functions "*_task" throughout
all patches to reflect the old usage of the code :-(. I will rename them
if Thomas agrees. Is that okay with you, Thomas?

>>> src/hotspot/share/gc/parallel/psScavenge.cpp
>>> 438     for (Parallel::RootType::Value root_type; _enum_claimer.try_claim(root_type); /* empty */) {
>>> 439       scavenge_roots_task(root_type, worker_id);
>>> 440     }
>>> For the future, maybe serial processing phases should be moved
>>> earlier.  OK to leave it for now to maintain correlation with the old
>>> code.
>>
>> Sorry Kim, I do not understand what you suggest here. Are you referring to the order of the members of enum ParallelRootType, and thus the order in which they are dispatched (in parallel)?
> 
> I was wondering if the serial subtasks in scavenge_roots_task might
> not be better scheduled before card_table->scavenge_contents_parallel().
> But now that I better understand how the latter works, I no longer
> think so.
> 
> I think we already talked about the order of the ParallelRootType enumerators
> and how that might be important for scheduling, and agreed that could be
> looked at later.

Okay, will not change then.

> 
> And finally, a couple more minor things that I missed earlier, both pre-existing.
> 
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/parallel/psScavenge.cpp
> 416       assert(!_old_gen->object_space()->is_empty(),
> 417         "Should not be called is there is no work");
> 418       assert(_old_gen != NULL, "Sanity");
> 
> [pre-existing]
> Checking _old_gen != NULL after already using it in the previous
> assert is kind of pointless.  These asserts should be reordered.

Yes, the order should obviously be changed. Nice catch!
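
A minimal sketch of the reordered checks, using plain C++ assert and hypothetical stand-in types (ToySpace, ToyOldGen) rather than the real PSOldGen classes. The only point is that the null check comes before the first dereference:

```cpp
#include <cassert>

// Hypothetical stand-ins for the HotSpot types; illustration only.
struct ToySpace {
  bool empty;
  bool is_empty() const { return empty; }
};

struct ToyOldGen {
  ToySpace space;
  ToySpace* object_space() { return &space; }
};

// Check the pointer before dereferencing it, so that a null pointer
// trips the "Sanity" assert instead of crashing inside the other one.
inline bool has_scavenge_work(ToyOldGen* old_gen) {
  assert(old_gen != nullptr && "Sanity");
  assert(!old_gen->object_space()->is_empty() &&
         "Should not be called if there is no work");
  return true;
}
```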

> 
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/parallel/psScavenge.cpp
> 381 // to the start of stride 0 in slice 1.
> 
> [pre-existing]
> s/stride/stripe/

Will fix.

Thanks,
Leo

> 
> ------------------------------------------------------------------------------
> 
> 


From shade at redhat.com  Tue Aug  6 08:18:43 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 6 Aug 2019 10:18:43 +0200
Subject: RFR(T): 8229156: ProblemList
 gc/stress/gclocker/TestExcessGCLockerCollections.java
In-Reply-To: <7A59994D-1E02-462E-8814-14C34089FD2D@oracle.com>
References: <7A59994D-1E02-462E-8814-14C34089FD2D@oracle.com>
Message-ID: <26d385b8-cf68-1c0f-8392-740f4c1d9089@redhat.com>

On 8/6/19 3:25 AM, Kim Barrett wrote:
> Please review adding the named test to the ProblemList. It's a brand
> new test that turned out to be both a not very good test and to
> intermittently provide false negatives.  (mea culpa)
> 
> The fix it is testing (JDK-8048556, where this test was added) has
> received additional manual checking and still looks good, but we need
> in-progress changes for some RFEs (JDK-8227225 and followups for other
> collectors) to really fix this test.
> 
> diff -r c38cca5ffb66 -r 0c6e374d42e8 test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt	Mon Aug 05 11:16:48 2019 -0400
> +++ b/test/hotspot/jtreg/ProblemList.txt	Mon Aug 05 21:12:11 2019 -0400
> @@ -77,6 +77,7 @@
>  gc/g1/humongousObjects/objectGraphTest/TestObjectGraphAfterGC.java 8156755 generic-all
>  gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all
>  gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all
> +gc/stress/gclocker/TestExcessGCLockerCollections.java 8229120 generic-all
>  gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
>  gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
>  gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all

Looks good and trivial.

-Aleksey


From fujie at loongson.cn  Tue Aug  6 08:21:53 2019
From: fujie at loongson.cn (Jie Fu)
Date: Tue, 6 Aug 2019 16:21:53 +0800
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
Message-ID: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>

Hi all,

JBS:    https://bugs.openjdk.java.net/browse/JDK-8229169
Webrev: http://cr.openjdk.java.net/~jiefu/8229169/webrev.00/

*Background*
Various GC crashes were observed on our Loongson CPUs which support a 
weak memory model.
These crashes can be reproduced with the following test.
---------------------------------------------------------
make test 
TEST="hotspot/jtreg/gc/g1/humongousObjects/TestHumongousClassLoader.java" 
CONF=release
---------------------------------------------------------

*Analysis*
Crashes were caused by the false failure of GenericTaskQueue::pop_local [1].
A corner case was observed on architectures that allow load
reordering, which led to the various GC crashes.

On weak memory architectures, the load of '_age.get()' in line 187 may
float up before the load of '_age.top()' in line 179.
In some corner cases, the work stealing algorithm becomes
incorrect if this reordering occurs.
---------------------------------------------------------
153
154 template<class E, MEMFLAGS F, unsigned int N> inline bool
155 GenericTaskQueue<E, F, N>::pop_local(volatile E& t, uint threshold) {
156   uint localBot = _bottom;
157   // This value cannot be N-1.  That can only occur as a result of
158   // the assignment to bottom in this method.  If it does, this method
159   // resets the size to 0 before the next call (which is sequential,
160   // since this is pop_local.)
161   uint dirty_n_elems = dirty_size(localBot, _age.top());
162   assert(dirty_n_elems != N - 1, "Shouldn't be possible...");
163   if (dirty_n_elems <= threshold) return false;
164   localBot = decrement_index(localBot);
165   _bottom = localBot;
166   // This is necessary to prevent any read below from being reordered
167   // before the store just above.
168   OrderAccess::fence();
169   // g++ complains if the volatile result of the assignment is
170   // unused, so we cast the volatile away.  We cannot cast directly
171   // to void, because gcc treats that as not using the result of the
172   // assignment.  However, casting to E& means that we trigger an
173   // unused-value warning.  So, we cast the E& to void.
174   (void) const_cast<E&>(t = _elems[localBot]);
175   // This is a second read of "age"; the "size()" above is the first.
176   // If there's still at least one element in the queue, based on the
177   // "_bottom" and "age" we've read, then there can be no interference with
178   // a "pop_global" operation, and we're done.
179   idx_t tp = _age.top();    // XXX
180   if (size(localBot, tp) > 0) {
181     assert(dirty_size(localBot, tp) != N - 1, "sanity");
182     TASKQUEUE_STATS_ONLY(stats.record_pop());
183     return true;
184   } else {
185     // Otherwise, the queue contained exactly one element; we take the slow
186     // path.
187     return pop_local_slow(localBot, _age.get());
188   }
189 }
190
---------------------------------------------------------

Assume that after the execution of line 168, the status of the task
queue is:
        _top = 8825; _bottom = 8826; N = 131072

Now imagine the following corner case:
 -1) The load of '_age.get()' in line 187 floats up before the
load of '_age.top()' in line 179, and gets _age._top = 8825.
     Concurrently, another thread steals a task from the queue, and
changes the task queue status to:
        _top = 8826; _bottom = 8826; N = 131072
     This status means that there is still one task (indexed by _top =
8826) in the queue.
 -2) Then the load of '_age.top()' in line 179 gets tp = _age._top = 8826.
 -3) Then the if-condition in line 180 is false, and pop_local_slow is
called with parameters localBot = 8826 and _age.get()._top = 8825.
 -4) pop_local_slow will return false since localBot !=
_age.get().top() [2], and the task queue will be set empty [3].
     It's obviously incorrect to empty the task queue if the remaining
task in step 1) hasn't been processed yet.
 -5) Then pop_local returns false, which is wrong again if the
remaining task hasn't been stolen by another thread.
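
The index arithmetic behind steps 2) to 4) can be checked with a small sketch. The helpers below are assumptions: queue_dirty_size and queue_size are modelled on what dirty_size() and size() are understood to do (a masked difference, with the transient N - 1 state treated as empty), and slow_path_can_claim only models the comparison described in step 4). None of this is the actual HotSpot source.

```cpp
#include <cassert>
#include <cstdint>

// Queue capacity from the example above; assumed to be a power of two.
constexpr uint32_t N = 131072;
constexpr uint32_t MOD_N_MASK = N - 1;

// Distance from top to bottom, modulo N. May transiently be N - 1.
constexpr uint32_t queue_dirty_size(uint32_t bot, uint32_t top) {
  return (bot - top) & MOD_N_MASK;
}

// Same as queue_dirty_size, but the transient N - 1 state counts as empty.
constexpr uint32_t queue_size(uint32_t bot, uint32_t top) {
  return queue_dirty_size(bot, top) == N - 1 ? 0 : queue_dirty_size(bot, top);
}

// Models the check from step 4): the slow path only tries to claim the
// last element when localBot equals the top value it was handed.
constexpr bool slow_path_can_claim(uint32_t local_bot, uint32_t old_top) {
  return local_bot == old_top;
}

// With the stale top (8825) one element appears to be left.
static_assert(queue_size(8826, 8825) == 1, "stale top sees one element");
// With the fresh top (8826) the condition on line 180 is false.
static_assert(queue_size(8826, 8826) == 0, "fresh top sees an empty range");
// The slow path receives the stale top and therefore gives up.
static_assert(!slow_path_can_claim(8826, 8825), "stale top fails the check");
```

Had the slow path received the fresh value 8826 instead, the comparison would have succeeded, which is why the order of the two loads matters.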


*Fix*
For weak memory architectures, a memory fence before line 187 is 
required to prevent the load of _age from floating up.
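
As a rough illustration of where such a barrier sits, here is a sketch in portable C++11. Everything in it is hypothetical: MiniQueue is a heavily simplified model, and load_load_barrier() uses an acquire fence as a stand-in for a HotSpot load-load barrier. Note that for a single std::atomic object the C++ coherence rules already prevent the second load from observing an older value than the first, so the sketch shows barrier placement only and does not reproduce the failure.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for a load-load barrier. An acquire fence is
// typically implemented with a barrier that also orders load-load.
inline void load_load_barrier() {
  std::atomic_thread_fence(std::memory_order_acquire);
}

// Simplified model: "top" is the only part of the age word kept here.
struct MiniQueue {
  std::atomic<uint32_t> top{0};
  uint32_t bottom = 0;

  // Returns true if the fast path applies. Otherwise stores the value
  // of top that a slow path would receive into *slow_path_top.
  bool fast_path_or_reload(uint32_t* slow_path_top) const {
    uint32_t tp = top.load(std::memory_order_relaxed);     // like line 179
    if (bottom != tp) {
      return true;                                         // not empty
    }
    load_load_barrier();                                   // proposed fix
    *slow_path_top = top.load(std::memory_order_relaxed);  // like line 187
    return false;
  }
};
```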

Could you please review it?

Thanks a lot.
Best regards,
Jie

[1] 
http://hg.openjdk.java.net/jdk/jdk/file/8f067351c370/src/hotspot/share/gc/shared/taskqueue.inline.hpp#l155
[2] 
http://hg.openjdk.java.net/jdk/jdk/file/8f067351c370/src/hotspot/share/gc/shared/taskqueue.inline.hpp#l135
[3] 
http://hg.openjdk.java.net/jdk/jdk/file/8f067351c370/src/hotspot/share/gc/shared/taskqueue.inline.hpp#l149


From shade at redhat.com  Tue Aug  6 09:10:31 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 6 Aug 2019 11:10:31 +0200
Subject: RFR (XS) 8229176: Shenandoah should acquire CodeCache_lock without
 safepoint check
Message-ID: <9cb954c2-c445-f0fd-b67a-b214dfebbe6c@redhat.com>

P1 bug:
  https://bugs.openjdk.java.net/browse/JDK-8229176

CodeCache_lock is defined with Monitor::_safepoint_check_never, so it should be acquired without a safepoint
check. The new, stronger assert introduced by JDK-8229000 fails because of that.

Fix:

diff -r 8f067351c370 src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.cpp
--- a/src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.cpp   Mon Aug 05 16:27:30 2019 -0700
+++ b/src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.cpp   Tue Aug 06 11:08:36 2019 +0200
@@ -201,5 +201,5 @@
     }
     case 2: {
-      CodeCache_lock->lock();
+      CodeCache_lock->lock_without_safepoint_check();
       break;
     }

Testing: hotspot_gc_shenandoah (massive failures before, no failures after)
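
For readers unfamiliar with the rule being enforced, here is a toy model. ToyLock and SafepointCheck are hypothetical names, not the HotSpot Monitor API, and the mismatch is reported through a return value where the VM would assert. The idea is that a lock declared as "never check" may only be taken through the no-check entry point.

```cpp
#include <cassert>

// Hypothetical toy model of a lock that records how it must be acquired.
enum class SafepointCheck { Never, Always };

class ToyLock {
 public:
  explicit ToyLock(SafepointCheck required) : _required(required) {}

  // Only legal for locks declared with SafepointCheck::Always.
  bool lock_with_check() { return acquire(SafepointCheck::Always); }
  // Only legal for locks declared with SafepointCheck::Never.
  bool lock_without_check() { return acquire(SafepointCheck::Never); }

  bool is_locked() const { return _locked; }
  void unlock() { _locked = false; }

 private:
  // Refuses the acquisition when the entry point does not match the
  // declared mode. The real VM would fail an assert here instead.
  bool acquire(SafepointCheck used) {
    if (used != _required) {
      return false;
    }
    _locked = true;
    return true;
  }

  SafepointCheck _required;
  bool _locked = false;
};
```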

-- 
Thanks,
-Aleksey


From martin.doerr at sap.com  Tue Aug  6 09:29:13 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 6 Aug 2019 09:29:13 +0000
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
Message-ID: <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>

Hi Jie,

thanks for reporting and analyzing this issue.

From your description, OrderAccess::loadload() seems to be the appropriate barrier.
It should be used on all platforms because it contains a compiler barrier for TSO platforms which prevents compilers from reordering the load accesses.

We should also check that the writer of _age._top uses at least a release (or storestore) barrier in your scenario.

I wonder why I've never seen this issue on PPC64. The test "TestHumongousClassLoader" seems to run stably. But it could be that we just never hit this corner case by chance.

If I missed anything please let me know.

Best regards,
Martin




From rkennke at redhat.com  Tue Aug  6 09:56:49 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 6 Aug 2019 11:56:49 +0200
Subject: RFR (XS) 8229176: Shenandoah should acquire CodeCache_lock
 without safepoint check
In-Reply-To: <9cb954c2-c445-f0fd-b67a-b214dfebbe6c@redhat.com>
References: <9cb954c2-c445-f0fd-b67a-b214dfebbe6c@redhat.com>
Message-ID: <50e42ec6-3f39-813e-62d8-bee3c0f60f82@redhat.com>

I needed to make this exact fix when backporting to shenandoah/jdk11.
Yes, is good. Thanks!

Roman


> P1 bug:
>   https://bugs.openjdk.java.net/browse/JDK-8229176
> 
> CodeCache_lock is defined with Monitor::_safepoint_check_never, should be acquired without safepoint
> check. New stronger assert introduced by JDK-8229000 fails because of that.
> 
> Fix:
> 
> diff -r 8f067351c370 src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.cpp
> --- a/src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.cpp   Mon Aug 05 16:27:30 2019 -0700
> +++ b/src/hotspot/share/gc/shenandoah/shenandoahCodeRoots.cpp   Tue Aug 06 11:08:36 2019 +0200
> @@ -201,5 +201,5 @@
>      }
>      case 2: {
> -      CodeCache_lock->lock();
> +      CodeCache_lock->lock_without_safepoint_check();
>        break;
>      }
> 
> Testing: hotspot_gc_shenandoah (massive failures before, no failures after)
> 


From shade at redhat.com  Tue Aug  6 10:01:45 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 6 Aug 2019 12:01:45 +0200
Subject: RFR (XS) 8229176: Shenandoah should acquire CodeCache_lock
 without safepoint check
In-Reply-To: <50e42ec6-3f39-813e-62d8-bee3c0f60f82@redhat.com>
References: <9cb954c2-c445-f0fd-b67a-b214dfebbe6c@redhat.com>
 <50e42ec6-3f39-813e-62d8-bee3c0f60f82@redhat.com>
Message-ID: <c2e1fe4d-dd3b-58bb-5795-30fb7a946457@redhat.com>

On 8/6/19 11:56 AM, Roman Kennke wrote:
> I needed to make this exact fix when backporting to shenandoah/jdk11.
> Yes, is good. Thanks!
Okay, dropped 11-shenandoah.

Pushed!

-- 
Thanks,
-Aleksey


From thomas.schatzl at oracle.com  Tue Aug  6 10:10:14 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 6 Aug 2019 12:10:14 +0200
Subject: RFR (XS) 8229134: [TESTBUG] 32-bit build fails
 gc/arguments/TestSurvivorAlignmentInBytesOption.java after JDK-8228855
In-Reply-To: <8c694fae-1f0d-ec6f-ae1e-de79ad5eef45@redhat.com>
References: <8c694fae-1f0d-ec6f-ae1e-de79ad5eef45@redhat.com>
Message-ID: <4a89deeb-acd2-4c06-f5a3-dd92d400d8ce@oracle.com>

Hi,

On 05.08.19 15:36, Aleksey Shipilev wrote:
> Testbug:
>    https://bugs.openjdk.java.net/browse/JDK-8229134
> 
> ObjectAlignmentInBytes is not available on 32-bit VMs. So the fix is to check for that before trying:
>    http://cr.openjdk.java.net/~shade/8229134/webrev.01/
> 
> Testing: affected tests on x86_32, x86_64
> 

   looks good.

Thomas


From shade at redhat.com  Tue Aug  6 10:34:03 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 6 Aug 2019 12:34:03 +0200
Subject: RFR (XS) 8229134: [TESTBUG] 32-bit build fails
 gc/arguments/TestSurvivorAlignmentInBytesOption.java after JDK-8228855
In-Reply-To: <4a89deeb-acd2-4c06-f5a3-dd92d400d8ce@oracle.com>
References: <8c694fae-1f0d-ec6f-ae1e-de79ad5eef45@redhat.com>
 <4a89deeb-acd2-4c06-f5a3-dd92d400d8ce@oracle.com>
Message-ID: <023672a4-3e87-3c7a-c51d-317f52986c70@redhat.com>

On 8/6/19 12:10 PM, Thomas Schatzl wrote:
> On 05.08.19 15:36, Aleksey Shipilev wrote:
>> Testbug:
>>    https://bugs.openjdk.java.net/browse/JDK-8229134
>>
>> ObjectAlignmentInBytes is not available on 32-bit VMs. So the fix is to check for that before trying:
>>    http://cr.openjdk.java.net/~shade/8229134/webrev.01/
> 
>   looks good.

Thanks, retested and pushed.

-- 
Thanks,
-Aleksey


From fujie at loongson.cn  Tue Aug  6 11:03:40 2019
From: fujie at loongson.cn (Jie Fu)
Date: Tue, 6 Aug 2019 19:03:40 +0800
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
 <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
Message-ID: <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>

Hi Martin,

Thanks for your review and valuable comments.

Updated: http://cr.openjdk.java.net/~jiefu/8229169/webrev.01/

Please see comments inline.

On 2019/8/6 5:29 PM, Doerr, Martin wrote:
>  From your description, OrderAccess ::loadload() seems to be the appropriate barrier.
> It should be used on all platforms because it contains a compiler barrier for TSO platforms which prevent compilers from reordering the load accesses.
Done. Thanks.
>
> We should also check that the writer of _age._top uses at least a release (or storestore) barrier in your scenario.
The writer calls GenericTaskQueue<E, F, N>::pop_global to update
_age._top with _age.cmpxchg(newAge, oldAge) [1].
I think it already has release (or storestore) barrier semantics.
What do you think?
> I wonder why I've never seen this issue on PPC64. The test "TestHumongousClassLoader" seems to work stable. But could be that we just never hit this corner case by chance.
Here is my reproducer to debug this issue: 
http://cr.openjdk.java.net/~jiefu/8229169/reproducer/
And this is my patch used to catch the corner case: 
http://cr.openjdk.java.net/~jiefu/8229169/gc-debug.diff
HTH.

Any comments?

Thanks a lot.
Best regards,
Jie

[1] 
http://hg.openjdk.java.net/jdk/jdk/file/8f067351c370/src/hotspot/share/gc/shared/taskqueue.inline.hpp#l222


From stefan.karlsson at oracle.com  Tue Aug  6 13:11:03 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 6 Aug 2019 15:11:03 +0200
Subject: RFR: 8229017: ZGC: Various cleanups of ZVerify
In-Reply-To: <9dd7c352-024e-adab-a7f0-756cc3b99218@oracle.com>
References: <cb119e29-f6a2-2054-5b00-afcf8c8167e7@oracle.com>
 <9dd7c352-024e-adab-a7f0-756cc3b99218@oracle.com>
Message-ID: <1061c55e-f902-eecb-36b3-d3c5b68679fb@oracle.com>

Looks good.

StefanK

On 2019-08-05 15:52, Per Liden wrote:
> Stefan asked me to break out the ZStatTimerDisable fix into a separate 
> fix, which I did (JDK-8229135), so here's an updated webrev without that 
> part:
> 
> http://cr.openjdk.java.net/~pliden/8229017/webrev.1
> 
> /Per
> 
> On 8/2/19 11:40 AM, Per Liden wrote:
>> Hi,
>>
>> This patch does various cleanups of ZVerify, basically a post-commit 
>> review of JDK-8227175. The patch mostly moves some code around and 
>> adjusts a few names. However, there's also one bug fix and one logic 
>> change:
>>
>> * ZVerify::roots_strong() didn't have a ZStatTimerDisable.
>>
>> * The call to ClassLoaderDataGraph::clear_claimed_marks() was moved 
>> from ZMarkConcurrentRootsTask() to ZConcurrentRootsIterator(), and now 
>> only clears the claim type the iterator actually used (instead of all 
>> types).
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229017
>> Webrev: http://cr.openjdk.java.net/~pliden/8229017/webrev.0
>>
>> /Per


From erik.osterlund at oracle.com  Tue Aug  6 13:11:57 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 6 Aug 2019 15:11:57 +0200
Subject: RFR: 8229017: ZGC: Various cleanups of ZVerify
In-Reply-To: <9dd7c352-024e-adab-a7f0-756cc3b99218@oracle.com>
References: <cb119e29-f6a2-2054-5b00-afcf8c8167e7@oracle.com>
 <9dd7c352-024e-adab-a7f0-756cc3b99218@oracle.com>
Message-ID: <4241ea5f-4f21-1f2c-aa90-964bd2179082@oracle.com>

Hi Per,

Looks good.

Thanks,
/Erik

On 2019-08-05 15:52, Per Liden wrote:
> Stefan asked me to break out the ZStatTimerDisable fix into a separate 
> fix, which I did (JDK-8229135), so here's an updated webrev without 
> that part:
>
> http://cr.openjdk.java.net/~pliden/8229017/webrev.1
>
> /Per
>
> On 8/2/19 11:40 AM, Per Liden wrote:
>> Hi,
>>
>> This patch does various cleanups of ZVerify, basically a post-commit 
>> review of JDK-8227175. The patch mostly moves some code around and 
>> adjusts a few names. However, there's also one bug fix and one logic 
>> change:
>>
>> * ZVerify::roots_strong() didn't have a ZStatTimerDisable.
>>
>> * The call to ClassLoaderDataGraph::clear_claimed_marks() was moved 
>> from ZMarkConcurrentRootsTask() to ZConcurrentRootsIterator(), and 
>> now only clears the claim type the iterator actually used (instead of 
>> all types).
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229017
>> Webrev: http://cr.openjdk.java.net/~pliden/8229017/webrev.0
>>
>> /Per


From per.liden at oracle.com  Tue Aug  6 13:12:49 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 6 Aug 2019 15:12:49 +0200
Subject: RFR: 8229017: ZGC: Various cleanups of ZVerify
In-Reply-To: <4241ea5f-4f21-1f2c-aa90-964bd2179082@oracle.com>
References: <cb119e29-f6a2-2054-5b00-afcf8c8167e7@oracle.com>
 <9dd7c352-024e-adab-a7f0-756cc3b99218@oracle.com>
 <4241ea5f-4f21-1f2c-aa90-964bd2179082@oracle.com>
Message-ID: <33f03699-f363-9970-22a6-a7dc26afbaa0@oracle.com>

Thanks Erik!

/Per

On 8/6/19 3:11 PM, Erik Österlund wrote:
> Hi Per,
> 
> Looks good.
> 
> Thanks,
> /Erik
> 
> On 2019-08-05 15:52, Per Liden wrote:
>> Stefan asked me to break out the ZStatTimerDisable fix into a separate 
>> fix, which I did (JDK-8229135), so here's an updated webrev without 
>> that part:
>>
>> http://cr.openjdk.java.net/~pliden/8229017/webrev.1
>>
>> /Per
>>
>> On 8/2/19 11:40 AM, Per Liden wrote:
>>> Hi,
>>>
>>> This patch does various cleanups of ZVerify, basically a post-commit 
>>> review of JDK-8227175. The patch mostly moves some code around and 
>>> adjusts a few names. However, there's also one bug fix and one logic 
>>> change:
>>>
>>> * ZVerify::roots_strong() didn't have a ZStatTimerDisable.
>>>
>>> * The call to ClassLoaderDataGraph::clear_claimed_marks() was moved 
>>> from ZMarkConcurrentRootsTask() to ZConcurrentRootsIterator(), and 
>>> now only clears the claim type the iterator actually used (instead of 
>>> all types).
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229017
>>> Webrev: http://cr.openjdk.java.net/~pliden/8229017/webrev.0
>>>
>>> /Per
> 


From per.liden at oracle.com  Tue Aug  6 13:13:06 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 6 Aug 2019 15:13:06 +0200
Subject: RFR: 8229135: ZGC: Adding missing ZStatTimerDisable before call
 to ZVerify::roots_strong()
In-Reply-To: <c81456b6-892c-4c37-5c76-8f52d8b40563@oracle.com>
References: <dfdbd982-6b68-1a8a-dda5-35285e667e3a@oracle.com>
 <c81456b6-892c-4c37-5c76-8f52d8b40563@oracle.com>
Message-ID: <d73bbeeb-6596-06d1-bd1d-ef81b74b9dfd@oracle.com>

Thanks Erik!

/Per

On 8/6/19 3:13 PM, Erik Österlund wrote:
> Hi Per,
> 
> Looks good.
> 
> Thanks,
> /Erik
> 
> On 2019-08-05 15:47, Per Liden wrote:
>> ZVerify::roots_strong() is called outside of a ZStatTimerDisable 
>> scope, which means the root scanning stat counters/samplers will be 
>> polluted.
>>
>> (This fix was originally part of JDK-8229017, but Stefan asked me to 
>> break this out into a separate fix)
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229135
>> Webrev: http://cr.openjdk.java.net/~pliden/8229135/webrev.0
>>
>> /Per
> 


From per.liden at oracle.com  Tue Aug  6 13:12:39 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 6 Aug 2019 15:12:39 +0200
Subject: RFR: 8229017: ZGC: Various cleanups of ZVerify
In-Reply-To: <1061c55e-f902-eecb-36b3-d3c5b68679fb@oracle.com>
References: <cb119e29-f6a2-2054-5b00-afcf8c8167e7@oracle.com>
 <9dd7c352-024e-adab-a7f0-756cc3b99218@oracle.com>
 <1061c55e-f902-eecb-36b3-d3c5b68679fb@oracle.com>
Message-ID: <7739d35d-3c2d-9571-6cfa-db39b61a69ba@oracle.com>

Thanks Stefan!

/Per

On 8/6/19 3:11 PM, Stefan Karlsson wrote:
> Looks good.
> 
> StefanK
> 
> On 2019-08-05 15:52, Per Liden wrote:
>> Stefan asked me to break out the ZStatTimerDisable fix into a separate 
>> fix, which I did (JDK-8229135), so here's an updated webrev without 
>> that part:
>>
>> http://cr.openjdk.java.net/~pliden/8229017/webrev.1
>>
>> /Per
>>
>> On 8/2/19 11:40 AM, Per Liden wrote:
>>> Hi,
>>>
>>> This patch does various cleanups of ZVerify, basically a post-commit 
>>> review of JDK-8227175. The patch mostly moves some code around and 
>>> adjusts a few names. However, there's also one bug fix and one logic 
>>> change:
>>>
>>> * ZVerify::roots_strong() didn't have a ZStatTimerDisable.
>>>
>>> * The call to ClassLoaderDataGraph::clear_claimed_marks() was moved 
>>> from ZMarkConcurrentRootsTask() to ZConcurrentRootsIterator(), and 
>>> now only clears the claim type the iterator actually used (instead of 
>>> all types).
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229017
>>> Webrev: http://cr.openjdk.java.net/~pliden/8229017/webrev.0
>>>
>>> /Per


From erik.osterlund at oracle.com  Tue Aug  6 13:13:55 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 6 Aug 2019 15:13:55 +0200
Subject: RFR: 8229135: ZGC: Adding missing ZStatTimerDisable before call
 to ZVerify::roots_strong()
In-Reply-To: <dfdbd982-6b68-1a8a-dda5-35285e667e3a@oracle.com>
References: <dfdbd982-6b68-1a8a-dda5-35285e667e3a@oracle.com>
Message-ID: <c81456b6-892c-4c37-5c76-8f52d8b40563@oracle.com>

Hi Per,

Looks good.

Thanks,
/Erik

On 2019-08-05 15:47, Per Liden wrote:
> ZVerify::roots_strong() is called outside of a ZStatTimerDisable 
> scope, which means the root scanning stat counters/samplers will be 
> polluted.
>
> (This fix was originally part of JDK-8229017, but Stefan asked me to 
> break this out into a separate fix)
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229135
> Webrev: http://cr.openjdk.java.net/~pliden/8229135/webrev.0
>
> /Per
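
The scoping idea can be sketched with a small RAII guard. This is a toy model under stated assumptions: StatDisableScope, ToySampler and verify_roots are hypothetical names, and the per-thread nesting counter is a guess at one reasonable design, not the ZStatTimerDisable implementation.

```cpp
#include <cassert>

// Hypothetical toy model of a "disable stats in this scope" guard.
class StatDisableScope {
 public:
  StatDisableScope()  { ++_active; }
  ~StatDisableScope() { --_active; }
  StatDisableScope(const StatDisableScope&) = delete;
  StatDisableScope& operator=(const StatDisableScope&) = delete;

  static bool is_active() { return _active > 0; }

 private:
  static thread_local int _active;  // nesting depth for this thread
};

thread_local int StatDisableScope::_active = 0;

// Toy sampler that ignores samples while a disable scope is active.
struct ToySampler {
  int samples = 0;
  void record() {
    if (!StatDisableScope::is_active()) {
      ++samples;
    }
  }
};

// Stand-in for verification work that would otherwise pollute the stats.
inline void verify_roots(ToySampler& sampler) {
  StatDisableScope disable;  // stats stay off until this function returns
  sampler.record();          // ignored
}
```

Calling the verification code outside such a scope is the situation the bug describes: its samples would be counted together with the real root scanning samples.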


From erik.osterlund at oracle.com  Tue Aug  6 13:15:19 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 6 Aug 2019 15:15:19 +0200
Subject: RFR: 8229129: ZGC: Fix incorrect format string for doubles
In-Reply-To: <c181ea08-dba7-c1ff-fc28-f5cf02d467ab@oracle.com>
References: <c181ea08-dba7-c1ff-fc28-f5cf02d467ab@oracle.com>
Message-ID: <f4bb8861-bc2c-5db0-d9cb-3ba4de057318@oracle.com>

Hi Per,

Looks good.

Thanks,
/Erik

On 2019-08-05 13:50, Per Liden wrote:
> ZGC sometimes prints doubles with an incorrect format string, "%lf" 
> instead of "%f". The "l" doesn't cause any problems, but it also has 
> no meaning when printing doubles, so it should be removed.
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229129
> Webrev: http://cr.openjdk.java.net/~pliden/8229129/webrev.0
>
> /Per
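
A quick way to see that the "l" is harmless but redundant: in C99 and later (and C++11 and later), the "l" length modifier has no effect on the "f" conversion in printf-style functions. The helper names below are made up for this sketch.

```cpp
#include <cassert>
#include <cstdio>
#include <string>

// Formats a double with "%lf". The "l" has no effect for printf.
inline std::string format_with_lf(double value) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%lf", value);
  return std::string(buf);
}

// Formats a double with "%f", the conventional spelling.
inline std::string format_with_f(double value) {
  char buf[64];
  std::snprintf(buf, sizeof(buf), "%f", value);
  return std::string(buf);
}
```

The distinction does matter for scanf-style functions, where "%f" reads a float and "%lf" reads a double, which may be where the habit comes from.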


From per.liden at oracle.com  Tue Aug  6 13:14:51 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 6 Aug 2019 15:14:51 +0200
Subject: RFR: 8229129: ZGC: Fix incorrect format string for doubles
In-Reply-To: <f4bb8861-bc2c-5db0-d9cb-3ba4de057318@oracle.com>
References: <c181ea08-dba7-c1ff-fc28-f5cf02d467ab@oracle.com>
 <f4bb8861-bc2c-5db0-d9cb-3ba4de057318@oracle.com>
Message-ID: <36eba8ea-dc3f-60e0-ad6f-6c1bb5b67494@oracle.com>

Thanks Erik!

/Per

On 8/6/19 3:15 PM, Erik Österlund wrote:
> Hi Per,
> 
> Looks good.
> 
> Thanks,
> /Erik
> 
> On 2019-08-05 13:50, Per Liden wrote:
>> ZGC sometimes prints doubles with an incorrect format string, "%lf" 
>> instead of "%f". The "l" doesn't cause any problems, but it also has 
>> no meaning when printing doubles, so it should be removed.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229129
>> Webrev: http://cr.openjdk.java.net/~pliden/8229129/webrev.0
>>
>> /Per
> 


From martin.doerr at sap.com  Tue Aug  6 14:12:20 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 6 Aug 2019 14:12:20 +0000
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
 <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
 <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
Message-ID: <AM6PR02MB4788A0E5F7B77E09AF4C9A429AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>

Hi Jie,

thanks for the new webrev.
 
> The writer calls GenericTaskQueue<E, F, N>::pop_global to update
> _age._top with _age.cmpxchg(newAge, oldAge) [1] .
> I think it already contains the release (or storestore) barrier semantic.
Yes. I agree.

> What do you think?
Looks reasonable to me, but somebody from GC team should also take a look.

> Here is my reproducer to debug this issue:
> http://cr.openjdk.java.net/~jiefu/8229169/reproducer/
Maybe I didn't figure out how to run it correctly. Where's the source code for ClassLoaderGenerator.class?
Did you get the error while running javac?
It didn't crash on PPC64.

Best regards,
Martin
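
As an illustration of the ordering point discussed above, here is a minimal sketch using std::atomic rather than HotSpot's own atomics. ToyTaskQueue is hypothetical and much simpler than GenericTaskQueue: it has no pop_local and no tag in the age word. It only shows a pop_global-style update of the top index through a compare-and-swap whose success ordering includes release semantics.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical, simplified queue: one owner pushes, others take from the top.
struct ToyTaskQueue {
  static constexpr std::uint32_t N = 8;

  std::atomic<std::uint32_t> top{0};
  std::atomic<std::uint32_t> bottom{0};
  int elems[N] = {};

  // Owner side: publish a new element by storing bottom with release ordering.
  bool push(int value) {
    std::uint32_t b = bottom.load(std::memory_order_relaxed);
    std::uint32_t t = top.load(std::memory_order_acquire);
    if (b - t == N) {
      return false;  // full
    }
    elems[b % N] = value;
    bottom.store(b + 1, std::memory_order_release);
    return true;
  }

  // Taker side: claim elems[top] by advancing top with a CAS.
  // acq_rel on success means the update also acts as a release store.
  bool pop_global(int& out) {
    std::uint32_t old_top = top.load(std::memory_order_acquire);
    std::uint32_t b = bottom.load(std::memory_order_acquire);
    if (old_top == b) {
      return false;  // empty
    }
    int candidate = elems[old_top % N];
    if (top.compare_exchange_strong(old_top, old_top + 1,
                                    std::memory_order_acq_rel,
                                    std::memory_order_acquire)) {
      out = candidate;
      return true;
    }
    return false;  // lost the race to another taker
  }
};
```

A release on the writer side only helps if the reader observes the value with matching acquire (or stronger) ordering.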


From tprintezis at twitter.com  Tue Aug  6 14:23:08 2019
From: tprintezis at twitter.com (Tony Printezis)
Date: Tue, 6 Aug 2019 07:23:08 -0700
Subject: RFR(S): 8227225: ParallelGC: add subspace transitions for young
 gen for gc+heap=info log lines
In-Reply-To: <3FB0759C-FB9A-418A-8BDA-9BE948C8FC00@oracle.com>
References: <CAOzU2inXqbueM=9k26Dre956FPKGiuDq3tnEhot_3vHLSgm6Eg@mail.gmail.com>
 <877bd17cff3be20bf8acd95d9eddbb5c9cfb7cf5.camel@oracle.com>
 <CAOzU2imCNQSvUG0szfAs9+RwqYvQSCBJg8t0mERPiFKEmZh4pA@mail.gmail.com>
 <c8a2d4fa1119bbc517161064a9ba93b5e262222c.camel@oracle.com>
 <CAOzU2i=HxwSFD5TS91ipTWW_Natewmd6um=cQbzdg84ygMq5KQ@mail.gmail.com>
 <CAOzU2ink2-2kq8bOQ4s3A8vPenqWzuPRL+8P=nn6_uD3Wpi7hg@mail.gmail.com>
 <3FB0759C-FB9A-418A-8BDA-9BE948C8FC00@oracle.com>
Message-ID: <CAOzU2ik1GpUG5_fzmvsBp2MKYHtsyV2v4cUdUMkzWn+xBM4BFA@mail.gmail.com>

Hi Kim,

The way instances of this class are used is to "checkpoint" the values at
the start of the GC and just report the size transitions at the end of the
GC. We haven't had a need to assign to an instance, once it's been
constructed. So, I'd be inclined to just leave the const modifiers in. If
we want to use the assignment operator in the future, we could just remove
the const modifiers then?

Tony


-----
Tony Printezis | @TonyPrintezis | tprintezis at twitter.com


On August 5, 2019 at 8:45:54 PM, Kim Barrett (kim.barrett at oracle.com) wrote:

> On Jul 29, 2019, at 10:41 AM, Tony Printezis <tprintezis at twitter.com>
wrote:
>
> Hi Thomas,
>
> Latest webrev here:
>
> http://cr.openjdk.java.net/~tonyp/8227225/webrev.1/
>
> Main change: I renamed the PreGCValues class to PreGenGCValues so that
it's
> clear it's mainly for generational GCs.

------------------------------------------------------------------------------

src/hotspot/share/gc/shared/preGCValues.hpp
63 const size_t _young_gen_used;
64 const size_t _young_gen_capacity;
65 const size_t _eden_used;
66 const size_t _eden_capacity;
67 const size_t _from_used;
68 const size_t _from_capacity;
69 const size_t _old_gen_used;
70 const size_t _old_gen_capacity;
71 const metaspace::MetaspaceSizesSnapshot _meta_sizes;

Making these members const prevents assignment by the default
assignment operator. I don't know if that's intentional, but it seems
unnecessary.

The _meta_sizes const qualifier is pre-existing.

------------------------------------------------------------------------------


Other than that, looks good. I don't need a new webrev if you decide
to remove those const qualifiers.
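
The effect of const data members on assignment can be shown with standard type traits. This is a minimal sketch: ConstSnapshot and MutableSnapshot are hypothetical structs made up for the illustration, not the class under review. A class with const non-static members still copy-constructs, but its implicit copy assignment operator is defined as deleted.

```cpp
#include <cassert>
#include <cstddef>
#include <type_traits>

// Hypothetical snapshot type with const members.
struct ConstSnapshot {
  const std::size_t used;
  const std::size_t capacity;
};

// The same layout without const.
struct MutableSnapshot {
  std::size_t used;
  std::size_t capacity;
};

// Copy construction works in both cases.
static_assert(std::is_copy_constructible<ConstSnapshot>::value,
              "const members do not block copy construction");

// Only the non-const variant can be assigned to after construction.
static_assert(!std::is_copy_assignable<ConstSnapshot>::value,
              "const members delete the implicit copy assignment");
static_assert(std::is_copy_assignable<MutableSnapshot>::value,
              "non-const members keep the implicit copy assignment");
```

For a value that is captured once and only read afterwards, losing assignment costs nothing in practice.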


From fujie at loongson.cn  Tue Aug  6 14:58:21 2019
From: fujie at loongson.cn (Jie Fu)
Date: Tue, 6 Aug 2019 22:58:21 +0800
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <AM6PR02MB4788A0E5F7B77E09AF4C9A429AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
 <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
 <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
 <AM6PR02MB4788A0E5F7B77E09AF4C9A429AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
Message-ID: <a17366b9-d6e0-4c08-f964-fb34f850fb81@loongson.cn>

Hi Martin,

On 2019/8/6 10:12 PM, Doerr, Martin wrote:
>> Here is my reproducer to debug this issue:
>> http://cr.openjdk.java.net/~jiefu/8229169/reproducer/
> Maybe I didn't figure out how to run it correctly. Where's the source code for ClassLoaderGenerator.class?

The reproducer is constructed from 
hotspot/jtreg/gc/g1/humongousObjects/TestHumongousClassLoader.java.

So the source code for ClassLoaderGenerator.class is 
test/hotspot/jtreg/gc/g1/humongousObjects/ClassLoaderGenerator.java.

> Did you get the error while running javac?
Yes, it crashed while running javac.
> It didn't crash on PPC64.

This bug can be easily reproduced on our latest Loongson 3A3000/4000 
processors.

Thanks a lot.
Best regards,
Jie


From thomas.schatzl at oracle.com  Tue Aug  6 15:03:13 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 6 Aug 2019 17:03:13 +0200
Subject: RFR: 8224663: Parallel GC: Use WorkGang (5: ScavengeRootsTask)
In-Reply-To: <7c703ebd-7ebe-5ec0-c2e9-587798b91c28@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <3ea47da5-4b1e-0301-461a-ede9047cb638@oracle.com>
 <8CA8F38A-1B9F-4B7D-902E-846BF7718D2C@oracle.com>
 <2d970c64-0c42-8213-6432-754d73839783@oracle.com>
 <E3F39DF8-79D1-4916-B59C-785A57C9E5BD@oracle.com>
 <7c703ebd-7ebe-5ec0-c2e9-587798b91c28@oracle.com>
Message-ID: <c169f7c7-ac60-013a-65cd-0a27ede96356@oracle.com>

Hi,

On 06.08.19 10:17, Leo Korinth wrote:
> 
> 
> On 05/08/2019 23:51, Kim Barrett wrote:
[...]

>> In workgang nomenclature, scavenge_roots_task and steal_task are
>> perhaps misnamed.  They aren't tasks, they are helper work functions.
>> Maybe they should be called scavenge_roots_work and steal_work?
> 
> You are probably right.
> 
>>
>> I don't remember if there were similar possible naming issues
>> elsewhere in this cluster of changes.  And you should verify the
>> naming convention with someone else before making any changes in
>> response to this comment.
>>
> 
> I have named all these "*_work" functions "*_task" consistently through 
> all patches to reflect the old usage of the code :-(, I will rename them 
> if Thomas agrees, is that okay with you Thomas?

I am good with that suggestion.

Thanks,
   Thomas


From kim.barrett at oracle.com  Tue Aug  6 15:36:55 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 6 Aug 2019 11:36:55 -0400
Subject: RFR(S): 8227225: ParallelGC: add subspace transitions for young
 gen for gc+heap=info log lines
In-Reply-To: <CAOzU2ik1GpUG5_fzmvsBp2MKYHtsyV2v4cUdUMkzWn+xBM4BFA@mail.gmail.com>
References: <CAOzU2inXqbueM=9k26Dre956FPKGiuDq3tnEhot_3vHLSgm6Eg@mail.gmail.com>
 <877bd17cff3be20bf8acd95d9eddbb5c9cfb7cf5.camel@oracle.com>
 <CAOzU2imCNQSvUG0szfAs9+RwqYvQSCBJg8t0mERPiFKEmZh4pA@mail.gmail.com>
 <c8a2d4fa1119bbc517161064a9ba93b5e262222c.camel@oracle.com>
 <CAOzU2i=HxwSFD5TS91ipTWW_Natewmd6um=cQbzdg84ygMq5KQ@mail.gmail.com>
 <CAOzU2ink2-2kq8bOQ4s3A8vPenqWzuPRL+8P=nn6_uD3Wpi7hg@mail.gmail.com>
 <3FB0759C-FB9A-418A-8BDA-9BE948C8FC00@oracle.com>
 <CAOzU2ik1GpUG5_fzmvsBp2MKYHtsyV2v4cUdUMkzWn+xBM4BFA@mail.gmail.com>
Message-ID: <77B06735-235C-4807-A2E8-3CFE7DC05B3F@oracle.com>

> On Aug 6, 2019, at 10:23 AM, Tony Printezis <tprintezis at twitter.com> wrote:
> 
> Hi Kim,
> 
> The way instances of this class are used is to "checkpoint" the values at the start of the GC and just report the size transitions at the end of the GC. We haven't had a need to assign to an instance, once it's been constructed. So, I'd be inclined to just leave the const modifiers in. If we want to use the assignment operator in the future, we could just remove the const modifiers then?

That's fine.  It just seemed odd.

> 
> Tony
> 
> 
> -----
> Tony Printezis | @TonyPrintezis | tprintezis at twitter.com
> 
> 
> On August 5, 2019 at 8:45:54 PM, Kim Barrett (kim.barrett at oracle.com) wrote:
> 
>> > On Jul 29, 2019, at 10:41 AM, Tony Printezis <tprintezis at twitter.com> wrote: 
>> >  
>> > Hi Thomas, 
>> >  
>> > Latest webrev here: 
>> >  
>> > http://cr.openjdk.java.net/~tonyp/8227225/webrev.1/ 
>> >  
>> > Main change: I renamed the PreGCValues class to PreGenGCValues so that it's 
>> > clear it's mainly for generational GCs. 
>> 
>> ------------------------------------------------------------------------------ 
>> src/hotspot/share/gc/shared/preGCValues.hpp  
>> 63 const size_t _young_gen_used; 
>> 64 const size_t _young_gen_capacity; 
>> 65 const size_t _eden_used; 
>> 66 const size_t _eden_capacity; 
>> 67 const size_t _from_used; 
>> 68 const size_t _from_capacity; 
>> 69 const size_t _old_gen_used; 
>> 70 const size_t _old_gen_capacity; 
>> 71 const metaspace::MetaspaceSizesSnapshot _meta_sizes; 
>> 
>> Making these members const prevents assignment by the default 
>> assignment operator. I don't know if that's intentional, but it seems 
>> unnecessary. 
>> 
>> The _meta_sizes const qualifier is pre-existing. 
>> 
>> ------------------------------------------------------------------------------ 
>> 
>> Other than that, looks good. I don't need a new webrev if you decide 
>> to remove those const qualifiers.


From kim.barrett at oracle.com  Tue Aug  6 15:37:29 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 6 Aug 2019 11:37:29 -0400
Subject: RFR(T): 8229156: ProblemList
 gc/stress/gclocker/TestExcessGCLockerCollections.java
In-Reply-To: <26d385b8-cf68-1c0f-8392-740f4c1d9089@redhat.com>
References: <7A59994D-1E02-462E-8814-14C34089FD2D@oracle.com>
 <26d385b8-cf68-1c0f-8392-740f4c1d9089@redhat.com>
Message-ID: <FC122A53-4FD0-4906-AC3B-CDBE77B48F60@oracle.com>

> On Aug 6, 2019, at 4:18 AM, Aleksey Shipilev <shade at redhat.com> wrote:
> 
> On 8/6/19 3:25 AM, Kim Barrett wrote:
>> Please review adding the named test to the ProblemList. It's a brand
>> new test that turned out to be both a not very good test and to
>> intermittently provide false negatives.  (mea culpa)
>> 
>> The fix it is testing (JDK-8048556, where this test was added) has
>> received additional manual checking and still looks good, but we need
>> in-progress changes for some RFEs (JDK-8227225 and followups for other
>> collectors) to really fix this test.
>> 
>> diff -r c38cca5ffb66 -r 0c6e374d42e8 test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt	Mon Aug 05 11:16:48 2019 -0400
>> +++ b/test/hotspot/jtreg/ProblemList.txt	Mon Aug 05 21:12:11 2019 -0400
>> @@ -77,6 +77,7 @@
>> gc/g1/humongousObjects/objectGraphTest/TestObjectGraphAfterGC.java 8156755 generic-all
>> gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all
>> gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all
>> +gc/stress/gclocker/TestExcessGCLockerCollections.java 8229120 generic-all
>> gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
>> gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
>> gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all
> 
> Looks good and trivial.
> 
> -Aleksey

Thanks.


From tprintezis at twitter.com  Tue Aug  6 15:49:47 2019
From: tprintezis at twitter.com (Tony Printezis)
Date: Tue, 6 Aug 2019 08:49:47 -0700
Subject: RFR(S): 8227225: ParallelGC: add subspace transitions for young
 gen for gc+heap=info log lines
In-Reply-To: <77B06735-235C-4807-A2E8-3CFE7DC05B3F@oracle.com>
References: <CAOzU2inXqbueM=9k26Dre956FPKGiuDq3tnEhot_3vHLSgm6Eg@mail.gmail.com>
 <877bd17cff3be20bf8acd95d9eddbb5c9cfb7cf5.camel@oracle.com>
 <CAOzU2imCNQSvUG0szfAs9+RwqYvQSCBJg8t0mERPiFKEmZh4pA@mail.gmail.com>
 <c8a2d4fa1119bbc517161064a9ba93b5e262222c.camel@oracle.com>
 <CAOzU2i=HxwSFD5TS91ipTWW_Natewmd6um=cQbzdg84ygMq5KQ@mail.gmail.com>
 <CAOzU2ink2-2kq8bOQ4s3A8vPenqWzuPRL+8P=nn6_uD3Wpi7hg@mail.gmail.com>
 <3FB0759C-FB9A-418A-8BDA-9BE948C8FC00@oracle.com>
 <CAOzU2ik1GpUG5_fzmvsBp2MKYHtsyV2v4cUdUMkzWn+xBM4BFA@mail.gmail.com>
 <77B06735-235C-4807-A2E8-3CFE7DC05B3F@oracle.com>
Message-ID: <CAOzU2ik3aSkzC1v9p55HEiNp2qxyRRouUA9q=6S56O=OnZHPqg@mail.gmail.com>

Thanks Kim! I'll leave the const modifiers in. I'll also do one more
jdk-submit submission before pushing.

Tony


-----
Tony Printezis | @TonyPrintezis | tprintezis at twitter.com


On August 6, 2019 at 11:37:07 AM, Kim Barrett (kim.barrett at oracle.com)
wrote:

> On Aug 6, 2019, at 10:23 AM, Tony Printezis <tprintezis at twitter.com>
wrote:
>
> Hi Kim,
>
> The way instances of this class are used is to "checkpoint" the values at
the start of the GC and just report the size transitions at the end of the
GC. We haven't had a need to assign to an instance, once it's been
constructed. So, I'd be inclined to just leave the const modifiers in. If
we want to use the assignment operator in the future, we could just remove
the const modifiers then?

That's fine. It just seemed odd.

>
> Tony
>
>
> -----
> Tony Printezis | @TonyPrintezis | tprintezis at twitter.com
>
>
> On August 5, 2019 at 8:45:54 PM, Kim Barrett (kim.barrett at oracle.com)
wrote:
>
>> > On Jul 29, 2019, at 10:41 AM, Tony Printezis <tprintezis at twitter.com>
wrote:
>> >
>> > Hi Thomas,
>> >
>> > Latest webrev here:
>> >
>> > http://cr.openjdk.java.net/~tonyp/8227225/webrev.1/
>> >
>> > Main change: I renamed the PreGCValues class to PreGenGCValues so that
it's
>> > clear it's mainly for generational GCs.
>>
>>
------------------------------------------------------------------------------

>> src/hotspot/share/gc/shared/preGCValues.hpp
>> 63 const size_t _young_gen_used;
>> 64 const size_t _young_gen_capacity;
>> 65 const size_t _eden_used;
>> 66 const size_t _eden_capacity;
>> 67 const size_t _from_used;
>> 68 const size_t _from_capacity;
>> 69 const size_t _old_gen_used;
>> 70 const size_t _old_gen_capacity;
>> 71 const metaspace::MetaspaceSizesSnapshot _meta_sizes;
>>
>> Making these members const prevents assignment by the default
>> assignment operator. I don't know if that's intentional, but it seems
>> unnecessary.
>>
>> The _meta_sizes const qualifier is pre-existing.
>>
>>
------------------------------------------------------------------------------

>>
>> Other than that, looks good. I don't need a new webrev if you decide
>> to remove those const qualifiers.


From thomas.schatzl at oracle.com  Tue Aug  6 15:51:14 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 6 Aug 2019 17:51:14 +0200
Subject: RFR(T): 8229156: ProblemList
 gc/stress/gclocker/TestExcessGCLockerCollections.java
In-Reply-To: <7A59994D-1E02-462E-8814-14C34089FD2D@oracle.com>
References: <7A59994D-1E02-462E-8814-14C34089FD2D@oracle.com>
Message-ID: <5bf34dbd-f40a-16d7-c7ec-ea4ef339051b@oracle.com>

Hi Kim,

On 06.08.19 03:25, Kim Barrett wrote:
> Please review adding the named test to the ProblemList. It's a brand
> new test that turned out to be both a not very good test and to
> intermittently provide false negatives.  (mea culpa)
> 
> The fix it is testing (JDK-8048556, where this test was added) has
> received additional manual checking and still looks good, but we need
> in-progress changes for some RFEs (JDK-8227225 and followups for other
> collectors) to really fix this test.
> 
> diff -r c38cca5ffb66 -r 0c6e374d42e8 test/hotspot/jtreg/ProblemList.txt
> --- a/test/hotspot/jtreg/ProblemList.txt	Mon Aug 05 11:16:48 2019 -0400
> +++ b/test/hotspot/jtreg/ProblemList.txt	Mon Aug 05 21:12:11 2019 -0400
> @@ -77,6 +77,7 @@
>   gc/g1/humongousObjects/objectGraphTest/TestObjectGraphAfterGC.java 8156755 generic-all
>   gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all
>   gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all
> +gc/stress/gclocker/TestExcessGCLockerCollections.java 8229120 generic-all
>   gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
>   gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
>   gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all
> 

  looks good.

Thomas


From tprintezis at twitter.com  Tue Aug  6 19:49:33 2019
From: tprintezis at twitter.com (Tony Printezis)
Date: Tue, 6 Aug 2019 12:49:33 -0700
Subject: RFR(S): 8227225: ParallelGC: add subspace transitions for young
 gen for gc+heap=info log lines
In-Reply-To: <CAOzU2ik3aSkzC1v9p55HEiNp2qxyRRouUA9q=6S56O=OnZHPqg@mail.gmail.com>
References: <CAOzU2inXqbueM=9k26Dre956FPKGiuDq3tnEhot_3vHLSgm6Eg@mail.gmail.com>
 <877bd17cff3be20bf8acd95d9eddbb5c9cfb7cf5.camel@oracle.com>
 <CAOzU2imCNQSvUG0szfAs9+RwqYvQSCBJg8t0mERPiFKEmZh4pA@mail.gmail.com>
 <c8a2d4fa1119bbc517161064a9ba93b5e262222c.camel@oracle.com>
 <CAOzU2i=HxwSFD5TS91ipTWW_Natewmd6um=cQbzdg84ygMq5KQ@mail.gmail.com>
 <CAOzU2ink2-2kq8bOQ4s3A8vPenqWzuPRL+8P=nn6_uD3Wpi7hg@mail.gmail.com>
 <3FB0759C-FB9A-418A-8BDA-9BE948C8FC00@oracle.com>
 <CAOzU2ik1GpUG5_fzmvsBp2MKYHtsyV2v4cUdUMkzWn+xBM4BFA@mail.gmail.com>
 <77B06735-235C-4807-A2E8-3CFE7DC05B3F@oracle.com>
 <CAOzU2ik3aSkzC1v9p55HEiNp2qxyRRouUA9q=6S56O=OnZHPqg@mail.gmail.com>
Message-ID: <CAOzU2i=7j4vxydy++K+4DM-X72DDpdR6OgKi=TFL2C5NOdwTNA@mail.gmail.com>

So, CMS is apparently going away. Is Serial going away too? Is there need
for a similar change for GenCollectedHeap (which will work for both GCs)?

Tony


-----
Tony Printezis | @TonyPrintezis | tprintezis at twitter.com


On August 6, 2019 at 11:49:47 AM, Tony Printezis (tprintezis at twitter.com)
wrote:

Thanks Kim! I'll leave the const modifiers in. I'll also do one more
jdk-submit submission before pushing.

Tony


-----
Tony Printezis | @TonyPrintezis | tprintezis at twitter.com


On August 6, 2019 at 11:37:07 AM, Kim Barrett (kim.barrett at oracle.com)
wrote:

> On Aug 6, 2019, at 10:23 AM, Tony Printezis <tprintezis at twitter.com>
wrote:
>
> Hi Kim,
>
> The way instances of this class are used is to "checkpoint" the values at
the start of the GC and just report the size transitions at the end of the
GC. We haven't had a need to assign to an instance, once it's been
constructed. So, I'd be inclined to just leave the const modifiers in. If
we want to use the assignment operator in the future, we could just remove
the const modifiers then?

That's fine. It just seemed odd.

>
> Tony
>
>
> -----
> Tony Printezis | @TonyPrintezis | tprintezis at twitter.com
>
>
> On August 5, 2019 at 8:45:54 PM, Kim Barrett (kim.barrett at oracle.com)
wrote:
>
>> > On Jul 29, 2019, at 10:41 AM, Tony Printezis <tprintezis at twitter.com>
wrote:
>> >
>> > Hi Thomas,
>> >
>> > Latest webrev here:
>> >
>> > http://cr.openjdk.java.net/~tonyp/8227225/webrev.1/
>> >
>> > Main change: I renamed the PreGCValues class to PreGenGCValues so that
it's
>> > clear it's mainly for generational GCs.
>>
>>
------------------------------------------------------------------------------
>> src/hotspot/share/gc/shared/preGCValues.hpp
>> 63 const size_t _young_gen_used;
>> 64 const size_t _young_gen_capacity;
>> 65 const size_t _eden_used;
>> 66 const size_t _eden_capacity;
>> 67 const size_t _from_used;
>> 68 const size_t _from_capacity;
>> 69 const size_t _old_gen_used;
>> 70 const size_t _old_gen_capacity;
>> 71 const metaspace::MetaspaceSizesSnapshot _meta_sizes;
>>
>> Making these members const prevents assignment by the default
>> assignment operator. I don't know if that's intentional, but it seems
>> unnecessary.
>>
>> The _meta_sizes const qualifier is pre-existing.
>>
>>
------------------------------------------------------------------------------
>>
>> Other than that, looks good. I don't need a new webrev if you decide
>> to remove those const qualifiers.


From kim.barrett at oracle.com  Tue Aug  6 22:45:50 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 6 Aug 2019 18:45:50 -0400
Subject: RFR(S): 8227225: ParallelGC: add subspace transitions for young
 gen for gc+heap=info log lines
In-Reply-To: <CAOzU2i=7j4vxydy++K+4DM-X72DDpdR6OgKi=TFL2C5NOdwTNA@mail.gmail.com>
References: <CAOzU2inXqbueM=9k26Dre956FPKGiuDq3tnEhot_3vHLSgm6Eg@mail.gmail.com>
 <877bd17cff3be20bf8acd95d9eddbb5c9cfb7cf5.camel@oracle.com>
 <CAOzU2imCNQSvUG0szfAs9+RwqYvQSCBJg8t0mERPiFKEmZh4pA@mail.gmail.com>
 <c8a2d4fa1119bbc517161064a9ba93b5e262222c.camel@oracle.com>
 <CAOzU2i=HxwSFD5TS91ipTWW_Natewmd6um=cQbzdg84ygMq5KQ@mail.gmail.com>
 <CAOzU2ink2-2kq8bOQ4s3A8vPenqWzuPRL+8P=nn6_uD3Wpi7hg@mail.gmail.com>
 <3FB0759C-FB9A-418A-8BDA-9BE948C8FC00@oracle.com>
 <CAOzU2ik1GpUG5_fzmvsBp2MKYHtsyV2v4cUdUMkzWn+xBM4BFA@mail.gmail.com>
 <77B06735-235C-4807-A2E8-3CFE7DC05B3F@oracle.com>
 <CAOzU2ik3aSkzC1v9p55HEiNp2qxyRRouUA9q=6S56O=OnZHPqg@mail.gmail.com>
 <CAOzU2i=7j4vxydy++K+4DM-X72DDpdR6OgKi=TFL2C5NOdwTNA@mail.gmail.com>
Message-ID: <17F4E98C-A74E-4CDD-B162-F3E9C54AAFA6@oracle.com>

> On Aug 6, 2019, at 3:49 PM, Tony Printezis <tprintezis at twitter.com> wrote:
> 
> So, CMS is apparently going away. Is Serial going away too? Is there need for a similar change for GenCollectedHeap (which will work for both GCs)?

I've not heard of any proposal to make Serial go away.
There might be some post-CMS-removal cleanup, since it would be the only GenCollectedHeap.
But I don't think adding subspace transition reporting should wait for that.


From kim.barrett at oracle.com  Tue Aug  6 22:47:17 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 6 Aug 2019 18:47:17 -0400
Subject: RFR(T): 8229156: ProblemList
 gc/stress/gclocker/TestExcessGCLockerCollections.java
In-Reply-To: <5bf34dbd-f40a-16d7-c7ec-ea4ef339051b@oracle.com>
References: <7A59994D-1E02-462E-8814-14C34089FD2D@oracle.com>
 <5bf34dbd-f40a-16d7-c7ec-ea4ef339051b@oracle.com>
Message-ID: <9B3C7399-9009-4FBD-AC91-F5A7249E9072@oracle.com>

> On Aug 6, 2019, at 11:51 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi Kim,
> 
> On 06.08.19 03:25, Kim Barrett wrote:
>> Please review adding the named test to the ProblemList. It's a brand
>> new test that turned out to be both a not very good test and to
>> intermittently provide false negatives.  (mea culpa)
>> The fix it is testing (JDK-8048556, where this test was added) has
>> received additional manual checking and still looks good, but we need
>> in-progress changes for some RFEs (JDK-8227225 and followups for other
>> collectors) to really fix this test.
>> diff -r c38cca5ffb66 -r 0c6e374d42e8 test/hotspot/jtreg/ProblemList.txt
>> --- a/test/hotspot/jtreg/ProblemList.txt	Mon Aug 05 11:16:48 2019 -0400
>> +++ b/test/hotspot/jtreg/ProblemList.txt	Mon Aug 05 21:12:11 2019 -0400
>> @@ -77,6 +77,7 @@
>>  gc/g1/humongousObjects/objectGraphTest/TestObjectGraphAfterGC.java 8156755 generic-all
>>  gc/g1/logging/TestG1LoggingFailure.java 8169634 generic-all
>>  gc/g1/humongousObjects/TestHeapCounters.java 8178918 generic-all
>> +gc/stress/gclocker/TestExcessGCLockerCollections.java 8229120 generic-all
>>  gc/stress/gclocker/TestGCLockerWithParallel.java 8180622 generic-all
>>  gc/stress/gclocker/TestGCLockerWithG1.java 8180622 generic-all
>>  gc/stress/TestJNIBlockFullGC/TestJNIBlockFullGC.java 8192647 generic-all
> 
> looks good.
> 
> Thomas

Thanks.


From tprintezis at twitter.com  Tue Aug  6 23:06:51 2019
From: tprintezis at twitter.com (Tony Printezis)
Date: Tue, 6 Aug 2019 16:06:51 -0700
Subject: RFR(S): 8227225: ParallelGC: add subspace transitions for young
 gen for gc+heap=info log lines
In-Reply-To: <17F4E98C-A74E-4CDD-B162-F3E9C54AAFA6@oracle.com>
References: <CAOzU2inXqbueM=9k26Dre956FPKGiuDq3tnEhot_3vHLSgm6Eg@mail.gmail.com>
 <877bd17cff3be20bf8acd95d9eddbb5c9cfb7cf5.camel@oracle.com>
 <CAOzU2imCNQSvUG0szfAs9+RwqYvQSCBJg8t0mERPiFKEmZh4pA@mail.gmail.com>
 <c8a2d4fa1119bbc517161064a9ba93b5e262222c.camel@oracle.com>
 <CAOzU2i=HxwSFD5TS91ipTWW_Natewmd6um=cQbzdg84ygMq5KQ@mail.gmail.com>
 <CAOzU2ink2-2kq8bOQ4s3A8vPenqWzuPRL+8P=nn6_uD3Wpi7hg@mail.gmail.com>
 <3FB0759C-FB9A-418A-8BDA-9BE948C8FC00@oracle.com>
 <CAOzU2ik1GpUG5_fzmvsBp2MKYHtsyV2v4cUdUMkzWn+xBM4BFA@mail.gmail.com>
 <77B06735-235C-4807-A2E8-3CFE7DC05B3F@oracle.com>
 <CAOzU2ik3aSkzC1v9p55HEiNp2qxyRRouUA9q=6S56O=OnZHPqg@mail.gmail.com>
 <CAOzU2i=7j4vxydy++K+4DM-X72DDpdR6OgKi=TFL2C5NOdwTNA@mail.gmail.com>
 <17F4E98C-A74E-4CDD-B162-F3E9C54AAFA6@oracle.com>
Message-ID: <CAOzU2immgy73gROmy7_CrCcB9d++OGCc9u55jXLfDwTDz=xcnQ@mail.gmail.com>

Kim,

Thanks. Yeah, it's also a very simple change. I'll post the webrev tomorrow.

Tony


-----
Tony Printezis | @TonyPrintezis | tprintezis at twitter.com


On August 6, 2019 at 6:45:57 PM, Kim Barrett (kim.barrett at oracle.com) wrote:

> On Aug 6, 2019, at 3:49 PM, Tony Printezis <tprintezis at twitter.com>
wrote:
>
> So, CMS is apparently going away. Is Serial going away too? Is there need
for a similar change for GenCollectedHeap (which will work for both GCs)?

I've not heard of any proposal to make Serial go away.
There might be some post-CMS-removal cleanup, since it would be the only
GenCollectedHeap.
But I don't think adding subspace transition reporting should wait for
that.


From zgu at redhat.com  Tue Aug  6 23:47:44 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 6 Aug 2019 19:47:44 -0400
Subject: RFR(T) 8229206: Shenandoah: ShenandoahWeakRoot::oops_do() uses wrong
 timing phase
Message-ID: <739c7633-459a-8955-d982-a1016131798f@redhat.com>

ShenandoahWeakRoot::oops_do() uses the wrong timing phase.


Bug: https://bugs.openjdk.java.net/browse/JDK-8229206
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8229206/webrev.00/

Test:
   hotspot_gc_shenandoah (fastdebug and release)

Thanks,

-Zhengyu


From sci at amazon.com  Wed Aug  7 01:05:11 2019
From: sci at amazon.com (Sciampacone, Ryan)
Date: Wed, 7 Aug 2019 01:05:11 +0000
Subject: 8227226: Segmented array clearing for ZGC
Message-ID: <46566545-B860-4C23-9450-860FD1FBC597@amazon.com>

Although least intrusive, it goes back to some of the earlier complaints about using false in the constructor for do_zero.  It also makes a fair number of assumptions (and goes against the hierarchy's intent) on initialization logic to hide in finish().  That said, I agree that it is fairly clean - and definitely addresses the missed cases of the earlier webrev.

2 things,

1. Isn't the substitute_oop_array_klass() check too narrow?  It will only detect type Object[], and not any other type of reference array (such as String[])?  I believe there's a bug here (correct me if I'm wrong).
2. I'd want to see an assert() on the sizeof(long) == sizeof(void *) dependency.  I realize what code base this is in but it would be properly defensive.

What does the reporting look like in this case?  Is the long[] type reported accepted?  I'm wondering if this depletes some of the simplicity.
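
The defensive check suggested in point 2 could be expressed at compile time. This is a minimal sketch under assumptions: element_t and bytes_for_refs are hypothetical names for the illustration, and std::int64_t stands in for the 64-bit element type of the temporary long[] array (a plain C++ long is not 64 bits wide on every platform, which is part of why such a check is useful). The assertion only holds on 64-bit targets.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>

// Hypothetical alias for the element type of the temporary primitive array.
using element_t = std::int64_t;

// Fails the build if a reference slot and a primitive slot differ in size,
// in which case one array type could not be substituted for the other.
static_assert(sizeof(element_t) == sizeof(void*),
              "long[] elements must be the same size as reference slots");

// With equal slot sizes, the payload size in bytes is the same for both types.
constexpr std::size_t bytes_for_refs(std::size_t length) {
  return length * sizeof(void*);
}
```

A compile-time check costs nothing at run time and documents the dependency next to the code that relies on it.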

On 8/2/19, 6:13 AM, "hotspot-gc-dev on behalf of Per Liden" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of per.liden at oracle.com> wrote:

    Did some micro-benchmarking (on a Xeon E5-2630) with various segment 
    sizes between 4K and 512K, and 64K seems to offer a good trade-off. For 
    a 1G array, the allocation time increases by ~1%, but in exchange the 
    worst case TTSP drops from ~280ms to ~0.6ms.
    
    Updated webrev using 64K:
    
    http://cr.openjdk.java.net/~pliden/8227226/webrev.3
    
    cheers,
    Per
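
The segment-size trade-off described above can be sketched as a loop that clears a large buffer in bounded chunks. This is a minimal illustration and not the code in the webrev: clear_segmented and its between_segments callback are hypothetical, the callback marks the point where a thread could respond to a pending safepoint request, and the temporary change of array type is not modelled at all.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>
#include <vector>

// Clears `size` bytes starting at `mem` in chunks of at most `segment` bytes.
// `between_segments` is invoked after each chunk. Returns the chunk count.
std::size_t clear_segmented(void* mem, std::size_t size, std::size_t segment,
                            const std::function<void()>& between_segments) {
  assert(segment > 0);
  char* p = static_cast<char*>(mem);
  std::size_t chunks = 0;
  for (std::size_t off = 0; off < size; off += segment) {
    // The last chunk may be shorter than a full segment.
    const std::size_t n = std::min(segment, size - off);
    std::memset(p + off, 0, n);
    ++chunks;
    between_segments();
  }
  return chunks;
}
```

Smaller segments bound the time spent between callbacks more tightly, at the cost of more loop iterations, which matches the ~1% allocation cost versus TTSP trade-off reported above.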
    
    On 8/2/19 11:11 AM, Per Liden wrote:
    > Hi Erik,
    > 
    > On 8/1/19 5:56 PM, Erik Osterlund wrote:
    >> Hi Per,
    >>
    >> I like that this approach is unintrusive, does its thing at the right 
    >> abstraction layer, and also handles medium sized arrays.
    > 
    > It even handles small arrays (i.e. arrays in small zpages) ;)
    > 
    >> Looks good.
    > 
    > Thanks! I'll test various segment sizes and see how that affects 
    > performance and TTSP.
    > 
    > cheers,
    > Per
    > 
    >>
    >> Thanks,
    >> /Erik
    >>
    >>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
    >>>
    >>> Here's an updated webrev that should be complete, i.e. fixes the 
    >>> issues related to allocation sampling/reporting that I mentioned. 
    >>> This patch makes MemAllocator::finish() virtual, so that we can do 
    >>> our thing and install the correct klass pointer before the Allocation 
    >>> destructor executes. This seems to be the least intrusive way of 
    >>> doing this.
    >>>
    >>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
    >>>
    >>> This passed function testing, but proper benchmarking remains to be 
    >>> done.
    >>>
    >>> cheers,
    >>> Per
    >>>
    >>>> On 7/31/19 7:19 PM, Per Liden wrote:
    >>>> Hi,
    >>>> I found some time to benchmark the "GC clears pages"-approach, and 
    >>>> it's fairly clear that it's not paying off. So ditching that idea.
    >>>> However, I'm still looking for something that would not just do 
    >>>> segmented clearing of arrays in large zpages. Letting oop arrays 
    >>>> temporarily be typed arrays while it's being cleared could be an 
    >>>> option. I did a prototype for that, which looks like this:
    >>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
    >>>> There's at least one issue here, the code doing allocation sampling 
    >>>> will see that we allocated long arrays instead of oop arrays, so the 
    >>>> reporting there will be skewed. That can be addressed if we go down 
    >>>> this path. The code is otherwise fairly simple and contained. Feel 
    >>>> free to spot any issues.
    >>>> cheers,
    >>>> Per
    >>>>> On 7/26/19 2:27 PM, Per Liden wrote:
    >>>>> Hi Ryan & Erik,
    >>>>>
    >>>>> I had a look at this and started exploring a slightly different 
    >>>>> approach. Instead of doing segmented clearing in the allocation path, 
    >>>>> we can have the concurrent GC thread clear pages when they are 
    >>>>> reclaimed and not do any clearing in the allocation path at all.
    >>>>>
    >>>>> That would look like this:
    >>>>>
    >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
    >>>>>
    >>>>> (I've had to temporarily comment out three lines of assert/debug 
    >>>>> code to make this work)
    >>>>>
    >>>>> The relocation set selection phase will now be tasked with some 
    >>>>> potentially expensive clearing work, so we'll want to make that 
    >>>>> part parallel also.
    >>>>>
    >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
    >>>>>
    >>>>> Moving this work from Java threads onto the concurrent GC threads 
    >>>>> means we will potentially prolong the RelocationSetSelection and 
    >>>>> Relocation phases. That might be a trade-off worth doing. In 
    >>>>> return, we get:
    >>>>>
    >>>>> * Faster array allocations, as there's now less work done in the 
    >>>>> allocation path.
    >>>>> * This benefits all arrays, not just those allocated in large pages.
    >>>>> * No need to consider/tune a "chunk size".
    >>>>> * I also tend to think we'll end up with slightly less complex code, 
    >>>>> that is a bit easier to reason about. Can be debated of course.
    >>>>>
    >>>>> This approach might also "survive" longer, because the YC scheme 
    >>>>> we've been loosely thinking about currently requires newly 
    >>>>> allocated pages to be cleared anyway. It's of course too early to 
    >>>>> tell if that requirement will stand in the end, but it's possible 
    >>>>> anyway.
    >>>>>
    >>>>> I'll need to do some more testing and benchmarking to make sure 
    >>>>> there's no regression or bugs here. The commented out debug code 
    >>>>> also needs to be addressed of course.
    >>>>>
    >>>>> Comments? Other ideas?
    >>>>>
    >>>>> cheers,
    >>>>> Per
    >>>>>
    >>>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
    >>>>>>
    >>>>>> Somehow I lost the RFR off the front and started a new thread.
    >>>>>> Now that we're both off vacation I'd like to revisit this.  Can 
    >>>>>> you take a look?
    >>>>>>
    >>>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of Sciampacone, 
    >>>>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of 
    >>>>>> sci at amazon.com> wrote:
    >>>>>>
    >>>>>>       http://cr.openjdk.java.net/~phh/8227226/webrev.01/
    >>>>>>       This shifts away from abusing the constructor do_zero magic 
    >>>>>> in exchange for virtualizing mem_clear() and specializing for the 
    >>>>>> Z version.  It does create a change in mem_clear in that it 
    >>>>>> returns an updated version of mem.  It does create change outside 
    >>>>>> of the Z code however it does feel cleaner.
    >>>>>>       I didn't make a change to PinAllocating - looking at it, the 
    >>>>>> safety of keeping it constructor / destructor based still seemed 
    >>>>>> appropriate to me.  If the objection is to using the sequence 
    >>>>>> numbers to pin (and instead using handles to update) - this to me 
    >>>>>> seems less error prone.  I had originally discussed handles with 
    >>>>>> Stefan but the proposal came down to this which looks much cleaner.
    >>>>>>       On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of 
    >>>>>> Sciampacone, Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on 
    >>>>>> behalf of sci at amazon.com> wrote:
    >>>>>>           1) Yes this was a conscious decision.  There was 
    >>>>>> discussion on determining the optimal point for breakup but given 
    >>>>>> the existing sizes this seemed sufficient.  This doesn't preclude 
    >>>>>> the ability to go down that path if it's deemed absolutely 
    >>>>>> necessary.  The path for more complex decisions is now available.
    >>>>>>           2) Agree
    >>>>>>           3) I'm not clear here.  Do you mean effectively going 
    >>>>>> direct to ZHeap and bypassing the single function PinAllocating?  
    >>>>>> Agree. Otherwise I'll ask you to be a bit clearer.
    >>>>>>           4) Agree
    >>>>>>           5) I initially had the exact same reaction but I played 
    >>>>>> around with a few other versions (including breaking up 
    >>>>>> initialization points between header and body to get the desired 
    >>>>>> results) and this ended up looking correct.  I'll try mixing in 
    >>>>>> the mem clearer function again (fresh start) to see if it looks 
    >>>>>> any better.
    >>>>>>           On 7/8/19, 5:49 AM, "Per Liden" <per.liden at oracle.com> 
    >>>>>> wrote:
    >>>>>>               Hi Ryan,
    >>>>>>               A few general comments:
    >>>>>>               1) It looks like this still only works for large pages?
    >>>>>>               2) The log_info stuff should be removed.
    >>>>>>               3) I'm not a huge fan of single-use utilities like 
    >>>>>> PinAllocating, at
    >>>>>>               least not when, IMO, the alternative is more 
    >>>>>> straight forward and less code.
    >>>>>>               4) Please make locals const when possible.
    >>>>>>               5) Duplicating _do_zero looks odd. Injecting a "mem 
    >>>>>> clearer", similar to
    >>>>>>               what Stefan's original patch did, seems worth exploring.
    >>>>>>               cheers,
    >>>>>>               /Per
    >>>>>>               (Btw, I'm on vacation so I might not be 
    >>>>>> super-responsive to emails)
    >>>>>>               On 2019-07-08 12:42, Erik Österlund wrote:
    >>>>>>               > Hi Ryan,
    >>>>>>               >
    >>>>>>               > This looks good in general. Just some stylistic 
    >>>>>> things...
    >>>>>>               >
    >>>>>>               > 1) In the ZGC project we like the letter 'Z' so 
    >>>>>> much that we put it in
    >>>>>>               > front of everything we possibly can, including all 
    >>>>>> class names.
    >>>>>>               > 2) We also explicitly state things are private 
    >>>>>> even though it's
    >>>>>>               > bleedingly obvious.
    >>>>>>               >
    >>>>>>               > So:
    >>>>>>               >
    >>>>>>               > 39 class PinAllocating {
    >>>>>>               > 40 HeapWord* _mem;
    >>>>>>               > 41 public:
    >>>>>>               >
    >>>>>>               > ->
    >>>>>>               >
    >>>>>>               > 39 class ZPinAllocating {
    >>>>>>               > 40 private:
    >>>>>>               > 41 HeapWord* _mem;
    >>>>>>               > 42
    >>>>>>               > 41 public:
    >>>>>>               >
    >>>>>>               > I can be your sponsor and push this change for you. I don't
    >>>>>>               > think there is a need for another webrev for my small stylistic remarks,
    >>>>>>               > so I can just fix that before pushing this for you. On that note, I'll
    >>>>>>               > add me and StefanK to the contributed-by section as we all worked out
    >>>>>>               > the right solution to this problem collaboratively. I have run through
    >>>>>>               > mach5 tier1-5, and found no issues with this patch.
    >>>>>>               >
    >>>>>>               > Thanks,
    >>>>>>               > /Erik
    >>>>>>               >
    >>>>>>               > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
    >>>>>>               >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
    >>>>>>               >> https://bugs.openjdk.java.net/browse/JDK-8227226
    >>>>>>               >>
    >>>>>>               >> This patch introduces safe point checks into 
    >>>>>> array clearing during
    >>>>>>               >> allocation for ZGC.  The patch isolates the 
    >>>>>> changes to ZGC as (in
    >>>>>>               >> particular with the more modern collectors) the 
    >>>>>> approach to
    >>>>>>               >> incrementalizing or respecting safe point checks 
    >>>>>> is going to be
    >>>>>>               >> different.
    >>>>>>               >>
    >>>>>>               >> The approach is to keep the region holding the 
    >>>>>> array in the allocating
    >>>>>>               >> state (pin logic) while updating the color to the 
    >>>>>> array after checks.
    >>>>>>               >>
    >>>>>>               >> Can I get a review?  Thanks.
    >>>>>>               >>
    >>>>>>               >> Ryan
    >>>>>>               >
    >>>>>>
    >>
    

From zgu at redhat.com  Wed Aug  7 01:11:44 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 6 Aug 2019 21:11:44 -0400
Subject: RFR 8229213: Shenandoah: Allow VM global oop storage to be processed
 concurrently
Message-ID: <cbbd19b9-893c-e237-cd3a-f426b7b8ccdd@redhat.com>

JDK-8227653 introduced new VM global oop storage and piggybacked it onto 
SystemDictionary for oop processing. This is not desirable for 
Shenandoah, since SystemDictionary is defined as a serial root that can 
only be processed at pauses.

This patch refactors the oop-storage-backed roots and groups JNI handles 
and VM global oop storage into ShenandoahVMRoots, which can be 
processed concurrently or at pauses, as ShenandoahJNIHandleRoots 
currently does.


Bug: https://bugs.openjdk.java.net/browse/JDK-8229213
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8229213/webrev.00/index.html

Test:
   hotspot_gc_shenandoah (fastdebug and release)

Thanks,

-Zhengyu


From kim.barrett at oracle.com  Wed Aug  7 02:17:21 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 6 Aug 2019 22:17:21 -0400
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
 <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
 <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
Message-ID: <962978C2-A633-42E9-AB3D-C2C4ED7F5464@oracle.com>

> On Aug 6, 2019, at 7:03 AM, Jie Fu <fujie at loongson.cn> wrote:
> 
> Hi Martin,
> 
> Thanks for your review and valuable comments.
> 
> Updated: http://cr.openjdk.java.net/~jiefu/8229169/webrev.01/
> 
> Please see comments inline.
> 
> On 2019/8/6 5:29, Doerr, Martin wrote:
>> From your description, OrderAccess::loadload() seems to be the appropriate barrier.
>> It should be used on all platforms because it contains a compiler barrier for TSO platforms, which prevents compilers from reordering the load accesses.
> Done. Thanks.
>> 
>> We should also check that the writer of _age._top uses at least a release (or storestore) barrier in your scenario.
> The writer calls GenericTaskQueue<E, F, N>::pop_global to update _age._top with _age.cmpxchg(newAge, oldAge) [1] .
> I think it already provides release (or storestore) barrier semantics.
> What do you think?
>> I wonder why I've never seen this issue on PPC64. The test "TestHumongousClassLoader" seems to run stably. But it could be that we have just never hit this corner case by chance.
> Here is my reproducer to debug this issue: http://cr.openjdk.java.net/~jiefu/8229169/reproducer/
> And this is my patch used to catch the corner case: http://cr.openjdk.java.net/~jiefu/8229169/gc-debug.diff
> HTH.
> 
> Any comments?
> 
> Thanks a lot.
> Best regards,
> Jie
> 
> [1] http://hg.openjdk.java.net/jdk/jdk/file/8f067351c370/src/hotspot/share/gc/shared/taskqueue.inline.hpp#l222

The additional loadload barrier looks good.  I can sponsor this change.

I'd like the comment to be explicit that the barrier is to prevent
reordering the two reads of _age.  Perhaps note that if size == 0 then
_age cannot change after the read used for the size check.

I think a possible alternative to adding the barrier would be to
ensure the two uses of _age really are using the same data, as they
appear to be (incorrectly) assuming.  That is, read _age once and use
the result of that read in both places, e.g. something like

  Age curAge = _age.get();  // read once for consistent value below
  idx_t tp = curAge.top();
  ...
  } else {
    return pop_local_slow(localBot, curAge);
  }

I think I prefer the additional loadload barrier though. Maybe wait a
couple of days to see if anyone else wants to chime in though; these
sorts of things can be hard to review. I'll ask if anyone else here
has time to take a look.

Martin, I assume you also saw JDK-8229020, also from Jie?  I'm
somewhat surprised neither of these has been previously reported.
Maybe these Loongson CPUs are more aggressively re-ordering reads than
are other platforms?
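
For readers following the read-once alternative, here is a minimal sketch of the idea. It is not the HotSpot code: the `Age` layout, `pack`/`unpack`, `Snapshot` and `read_age_once` are hypothetical names, and `std::atomic` stands in for the queue's actual `_age` field and OrderAccess primitives. The point is only that the top index and the full age value passed on to the slow path are derived from a single load, so they cannot disagree.

```cpp
#include <atomic>
#include <cstdint>

// Hypothetical stand-in for the queue's packed (top, tag) "age" word.
struct Age {
  std::uint32_t top;
  std::uint32_t tag;
};

// Pack the two fields into one 64-bit word so they can be loaded together.
inline std::uint64_t pack(Age a) {
  return (static_cast<std::uint64_t>(a.tag) << 32) | a.top;
}

inline Age unpack(std::uint64_t v) {
  return Age{static_cast<std::uint32_t>(v), static_cast<std::uint32_t>(v >> 32)};
}

// The top index used for the size check plus the age handed to the slow path.
struct Snapshot {
  std::uint32_t top;
  Age age;
};

// Performs exactly ONE load of the age word and derives both values from it,
// so no reordering between "two reads of _age" is possible.
inline Snapshot read_age_once(const std::atomic<std::uint64_t>& age_word) {
  const Age cur = unpack(age_word.load(std::memory_order_acquire));
  return Snapshot{cur.top, cur};
}
```

With two separate reads, a weakly ordered CPU could satisfy the second read with an older value than the first unless a loadload barrier separates them; with one read there is nothing to reorder.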


From fujie at loongson.cn  Wed Aug  7 03:51:33 2019
From: fujie at loongson.cn (Jie Fu)
Date: Wed, 7 Aug 2019 11:51:33 +0800
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <962978C2-A633-42E9-AB3D-C2C4ED7F5464@oracle.com>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
 <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
 <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
 <962978C2-A633-42E9-AB3D-C2C4ED7F5464@oracle.com>
Message-ID: <94330f3b-168e-4e7b-d4da-a9a56266bb49@loongson.cn>

Hi Kim,

Thank you so much.

Updated: http://cr.openjdk.java.net/~jiefu/8229169/webrev.02/

I have added an explicit comment and the reviewers to it.

Thanks a lot.
Best regards,
Jie

On 2019/8/7 10:17 AM, Kim Barrett wrote:
>> On Aug 6, 2019, at 7:03 AM, Jie Fu <fujie at loongson.cn> wrote:
>>
>> Hi Martin,
>>
>> Thanks for your review and valuable comments.
>>
>> Updated: http://cr.openjdk.java.net/~jiefu/8229169/webrev.01/
>>
>> Please see comments inline.
>>
>> On 2019/8/6 5:29, Doerr, Martin wrote:
>>> From your description, OrderAccess::loadload() seems to be the appropriate barrier.
>>> It should be used on all platforms because it contains a compiler barrier for TSO platforms, which prevents compilers from reordering the load accesses.
>> Done. Thanks.
>>> We should also check that the writer of _age._top uses at least a release (or storestore) barrier in your scenario.
>> The writer calls GenericTaskQueue<E, F, N>::pop_global to update _age._top with _age.cmpxchg(newAge, oldAge) [1] .
>> I think it already provides release (or storestore) barrier semantics.
>> What do you think?
>>> I wonder why I've never seen this issue on PPC64. The test "TestHumongousClassLoader" seems to run stably. But it could be that we have just never hit this corner case by chance.
>> Here is my reproducer to debug this issue: http://cr.openjdk.java.net/~jiefu/8229169/reproducer/
>> And this is my patch used to catch the corner case: http://cr.openjdk.java.net/~jiefu/8229169/gc-debug.diff
>> HTH.
>>
>> Any comments?
>>
>> Thanks a lot.
>> Best regards,
>> Jie
>>
>> [1] http://hg.openjdk.java.net/jdk/jdk/file/8f067351c370/src/hotspot/share/gc/shared/taskqueue.inline.hpp#l222
> The additional loadload barrier looks good.  I can sponsor this change.
>
> I'd like the comment to be explicit that the barrier is to prevent
> reordering the two reads of _age.  Perhaps note that if size == 0 then
> _age cannot change after the read used for the size check.
>
> I think a possible alternative to adding the barrier would be to
> ensure the two uses of _age really are using the same data, as they
> appear to be (incorrectly) assuming.  That is, read _age once and use
> the result of that read in both places, e.g. something like
>
>    Age curAge = _age.get();  // read once for consistent value below
>    idx_t tp = curAge.top();
>    ...
>    } else {
>      return pop_local_slow(localBot, curAge);
>    }
>
> I think I prefer the additional loadload barrier though. Maybe wait a
> couple of days to see if anyone else wants to chime in though; these
> sorts of things can be hard to review. I'll ask if anyone else here
> has time to take a look.
>
> Martin, I assume you also saw JDK-8229020, also from Jie?  I'm
> somewhat surprised neither of these has been previously reported.
> Maybe these Loongson CPUs are more aggressively re-ordering reads than
> are other platforms?
>


From shade at redhat.com  Wed Aug  7 07:08:01 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 7 Aug 2019 09:08:01 +0200
Subject: RFR(T) 8229206: Shenandoah: ShenandoahWeakRoot::oops_do() uses
 wrong timing phase
In-Reply-To: <739c7633-459a-8955-d982-a1016131798f@redhat.com>
References: <739c7633-459a-8955-d982-a1016131798f@redhat.com>
Message-ID: <f44bb695-9d1b-f2a8-c75d-d3378488a0a8@redhat.com>

On 8/7/19 1:47 AM, Zhengyu Gu wrote:
> ShenandoahWeakRoot::oops_do() uses the wrong timing phase.
> 
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229206
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8229206/webrev.00/

Oops. Looks good and trivial.

-- 
Thanks,
-Aleksey


From shade at redhat.com  Wed Aug  7 07:10:26 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 7 Aug 2019 09:10:26 +0200
Subject: RFR 8229213: Shenandoah: Allow VM global oop storage to be
 processed concurrently
In-Reply-To: <cbbd19b9-893c-e237-cd3a-f426b7b8ccdd@redhat.com>
References: <cbbd19b9-893c-e237-cd3a-f426b7b8ccdd@redhat.com>
Message-ID: <46f1b6c2-44dd-e393-b434-24dc758f8857@redhat.com>

On 8/7/19 3:11 AM, Zhengyu Gu wrote:
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229213
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8229213/webrev.00/index.html

Looks okay to me. Roman needs to take a look too?

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Wed Aug  7 07:17:41 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 7 Aug 2019 09:17:41 +0200
Subject: RFR 8229213: Shenandoah: Allow VM global oop storage to be
 processed concurrently
In-Reply-To: <cbbd19b9-893c-e237-cd3a-f426b7b8ccdd@redhat.com>
References: <cbbd19b9-893c-e237-cd3a-f426b7b8ccdd@redhat.com>
Message-ID: <861955ee-a815-71f5-c20c-4f4f59effe8e@redhat.com>

Yes, that looks good to me. Thanks!

Roman


> JDK-8227653 introduced new VM global oop storage and piggybacked it onto
> SystemDictionary for oop processing. This is not desirable for
> Shenandoah, since SystemDictionary is defined as a serial root that can
> only be processed at pauses.
> 
> This patch refactors the oop-storage-backed roots and groups JNI handles
> and VM global oop storage into ShenandoahVMRoots, which can be
> processed concurrently or at pauses, as ShenandoahJNIHandleRoots
> currently does.
> 
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229213
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8229213/webrev.00/index.html
> 
> Test:
>    hotspot_gc_shenandoah (fastdebug and release)
> 
> Thanks,
> 
> -Zhengyu


From thomas.schatzl at oracle.com  Wed Aug  7 08:10:41 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 7 Aug 2019 10:10:41 +0200
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <94330f3b-168e-4e7b-d4da-a9a56266bb49@loongson.cn>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
 <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
 <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
 <962978C2-A633-42E9-AB3D-C2C4ED7F5464@oracle.com>
 <94330f3b-168e-4e7b-d4da-a9a56266bb49@loongson.cn>
Message-ID: <2cc1cbd3-5986-d975-68ec-ca62a518c926@oracle.com>

Hi,

On 07.08.19 05:51, Jie Fu wrote:
> Hi Kim,
> 
> Thank you so much.
> 
> Updated: http://cr.openjdk.java.net/~jiefu/8229169/webrev.02/
> 
> I have added an explicit comment and the reviewers to it.
> 

   change looks good to me.

Thanks,
   Thomas


From martin.doerr at sap.com  Wed Aug  7 08:24:24 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Wed, 7 Aug 2019 08:24:24 +0000
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <962978C2-A633-42E9-AB3D-C2C4ED7F5464@oracle.com>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
 <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
 <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
 <962978C2-A633-42E9-AB3D-C2C4ED7F5464@oracle.com>
Message-ID: <AM6PR02MB4788716ABC2DF8A3EA77BCB79AD40@AM6PR02MB4788.eurprd02.prod.outlook.com>

Hi Jie and Kim,

I've run Jie's test for a while on a Power 8 machine, but the issue didn't show up.

> Martin, I assume you also saw JDK-8229020, also from Jie?  I'm
> somewhat surprised neither of these has been previously reported.
> Maybe these Loongson CPUs are more aggressively re-ordering reads than
> are other platforms?
There are other effects that may prevent re-ordering, such as data being on the same cache line, branches being predicted differently, ...
So I'm not surprised that we haven't observed all possible re-orderings.

Thanks, Jie, for improving the comment in your latest webrev. Looks good to me.
I appreciate that this gets fixed. Such problems are hard to find.

Best regards,
Martin


From fujie at loongson.cn  Wed Aug  7 09:08:14 2019
From: fujie at loongson.cn (Jie Fu)
Date: Wed, 7 Aug 2019 17:08:14 +0800
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <2cc1cbd3-5986-d975-68ec-ca62a518c926@oracle.com>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
 <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
 <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
 <962978C2-A633-42E9-AB3D-C2C4ED7F5464@oracle.com>
 <94330f3b-168e-4e7b-d4da-a9a56266bb49@loongson.cn>
 <2cc1cbd3-5986-d975-68ec-ca62a518c926@oracle.com>
Message-ID: <c980ebf5-ff85-ae98-4277-9decf04ff735@loongson.cn>

Hi Thomas,

Thanks for your review.

Updated the reviewers: http://cr.openjdk.java.net/~jiefu/8229169/webrev.03/

Thanks a lot.
Best regards,
Jie

On 2019/8/7 4:10 PM, Thomas Schatzl wrote:
> Hi,
>
> On 07.08.19 05:51, Jie Fu wrote:
>> Hi Kim,
>>
>> Thank you so much.
>>
>> Updated: http://cr.openjdk.java.net/~jiefu/8229169/webrev.02/
>>
>> I have added an explicit comment and the reviewers to it.
>>
>
>    change looks good to me.
>
> Thanks,
>    Thomas


From fujie at loongson.cn  Wed Aug  7 09:27:07 2019
From: fujie at loongson.cn (Jie Fu)
Date: Wed, 7 Aug 2019 17:27:07 +0800
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <AM6PR02MB4788716ABC2DF8A3EA77BCB79AD40@AM6PR02MB4788.eurprd02.prod.outlook.com>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
 <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
 <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
 <962978C2-A633-42E9-AB3D-C2C4ED7F5464@oracle.com>
 <AM6PR02MB4788716ABC2DF8A3EA77BCB79AD40@AM6PR02MB4788.eurprd02.prod.outlook.com>
Message-ID: <19e54994-424d-2593-e7b7-fbc189c3a0cf@loongson.cn>

We (Loongson) would highly appreciate it if this bug could be fixed.
Thanks Martin, Kim and Thomas for your help.

And special thanks to Kim for sponsoring it.

Thanks a lot.
Best regards,
Jie

On 2019/8/7 4:24 PM, Doerr, Martin wrote:
> Hi Jie and Kim,
>
> I've run Jie's test for a while on a Power 8 machine, but the issue didn't show up.
>
>> Martin, I assume you also saw JDK-8229020, also from Jie?  I'm
>> somewhat surprised neither of these has been previously reported.
>> Maybe these Loongson CPUs are more aggressively re-ordering reads than
>> are other platforms?
> There are other effects that may prevent re-ordering, such as data being on the same cache line, branches being predicted differently, ...
> So I'm not surprised that we haven't observed all possible re-orderings.
>
> Thanks, Jie, for improving the comment in your latest webrev. Looks good to me.
> I appreciate that this gets fixed. Such problems are hard to find.
>
> Best regards,
> Martin
>


From per.liden at oracle.com  Wed Aug  7 09:59:42 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 7 Aug 2019 11:59:42 +0200
Subject: 8227226: Segmented array clearing for ZGC
In-Reply-To: <46566545-B860-4C23-9450-860FD1FBC597@amazon.com>
References: <46566545-B860-4C23-9450-860FD1FBC597@amazon.com>
Message-ID: <5d2d713e-3a8d-3b7d-72a7-7a538a17532b@oracle.com>

Hi Ryan,

On 8/7/19 3:05 AM, Sciampacone, Ryan wrote:
> Although least intrusive, it goes back to some of the earlier complaints about using false in the constructor for do_zero.  It also makes a fair number of 

My earlier comment about this was not about passing false to the 
constructor, but the duplication of the _do_zero member, which I thought 
looked a bit odd. In this patch, this was avoided by separating these 
paths already in ZCollectedHeap::array_allocate().

> assumptions (and goes against the hierarchy's intent) on initialization logic to hide in finish().  That said, I agree that it is fairly clean - and definitely addresses the missed cases of the earlier webrev.
> 

We've had the same discussions here and concluded that we might want to 
restructure parts of MemAllocator to better accommodate this use case, 
but that overriding finish() seems ok for now. A patch to restructure 
MemAllocator could come later if we think it's needed.

> 2 things,
> 
> 1. Isn't the substitute_oop_array_klass() check too narrow?  It will only detect types Object[], and not any other type of reference array (such as String[]) ?  I believe there's a bug here (correct me if I'm wrong).

On the JVM level, Object[], String[] and int[][] all have the same 
Klass, so we should catch them all with this single check.

> 2. I'd want to see an assert() on the sizeof(long) == sizeof(void *) dependency.  I realize what code base this is in but it would be properly defensive.

Sounds good.

> 
> What does the reporting look like in this case?  Is the long[] type reported accepted?  I'm wondering if this depletes some of the simplicity.

By overriding finish(), the sampling/reporting remains correct and 
unaffected, as it will never see the intermediate long[].
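
As a rough illustration of the scheme discussed in this thread (not the webrev's code), the sketch below clears an array payload in fixed-size segments, keeps the array typed as a long array while clearing is in progress, and installs the final type only at the end. `FakeArray`, `ArrayType`, `kSegmentBytes`, `clear_segmented` and the `poll` callback are all hypothetical names; `poll` merely stands in for a point where a safepoint could be honoured, and the 64K segment size is the value reported in this thread.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical type tags standing in for klass pointers.
enum class ArrayType { Long, Object };

// Hypothetical array layout: a type tag followed by 8-byte payload words.
struct FakeArray {
  ArrayType type;
  std::vector<std::uint64_t> payload;
};

// Segment size in bytes (assumption: 64K, as benchmarked in this thread).
constexpr std::size_t kSegmentBytes = 64 * 1024;

// Clears the payload one segment at a time. While clearing is in progress the
// array is typed as a long array; the final type is installed only afterwards.
// `poll` is invoked after each segment. Returns the number of segments cleared.
inline std::size_t clear_segmented(FakeArray& array, ArrayType final_type,
                                   const std::function<void()>& poll) {
  array.type = ArrayType::Long;
  const std::size_t words_per_segment = kSegmentBytes / sizeof(std::uint64_t);
  std::size_t segments = 0;
  for (std::size_t start = 0; start < array.payload.size();
       start += words_per_segment) {
    const std::size_t count =
        std::min(words_per_segment, array.payload.size() - start);
    std::memset(array.payload.data() + start, 0, count * sizeof(std::uint64_t));
    ++segments;
    poll();  // a long clear is interrupted here instead of running to the end
  }
  array.type = final_type;  // analogous to installing the real klass last
  return segments;
}
```

The segment size bounds how long a thread can go without reaching a poll, which is the trade-off behind the TTSP numbers quoted below.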

Updated webrev:

http://cr.openjdk.java.net/~pliden/8227226/webrev.4

cheers,
Per

> 
> On 8/2/19, 6:13 AM, "hotspot-gc-dev on behalf of Per Liden" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of per.liden at oracle.com> wrote:
> 
>      Did some micro-benchmarking (on a Xeon E5-2630) with various segment
>      sizes between 4K and 512K, and 64K seems to offer a good trade-off. For
>      a 1G array, the allocation time increases by ~1%, but in exchange the
>      worst case TTSP drops from ~280ms to ~0.6ms.
>      
>      Updated webrev using 64K:
>      
>      http://cr.openjdk.java.net/~pliden/8227226/webrev.3
>      
>      cheers,
>      Per
>      
>      On 8/2/19 11:11 AM, Per Liden wrote:
>      > Hi Erik,
>      >
>      > On 8/1/19 5:56 PM, Erik Osterlund wrote:
>      >> Hi Per,
>      >>
>      >> I like that this approach is unintrusive, does its thing at the right
>      >> abstraction layer, and also handles medium sized arrays.
>      >
>      > It even handles small arrays (i.e. arrays in small zpages) ;)
>      >
>      >> Looks good.
>      >
>      > Thanks! I'll test various segment sizes and see how that affects
>      > performance and TTSP.
>      >
>      > cheers,
>      > Per
>      >
>      >>
>      >> Thanks,
>      >> /Erik
>      >>
>      >>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
>      >>>
>      >>> Here's an updated webrev that should be complete, i.e. fixes the
>      >>> issues related to allocation sampling/reporting that I mentioned.
>      >>> This patch makes MemAllocator::finish() virtual, so that we can do
>      >>> our thing and install the correct klass pointer before the Allocation
>      >>> destructor executes. This seems to be the least intrusive way of
>      >>> doing this.
>      >>>
>      >>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
>      >>>
>      >>> This passed function testing, but proper benchmarking remains to be
>      >>> done.
>      >>>
>      >>> cheers,
>      >>> Per
>      >>>
>      >>>> On 7/31/19 7:19 PM, Per Liden wrote:
>      >>>> Hi,
>      >>>> I found some time to benchmark the "GC clears pages"-approach, and
>      >>>> it's fairly clear that it's not paying off. So ditching that idea.
>      >>>> However, I'm still looking for something that would not just do
>      >>>> segmented clearing of arrays in large zpages. Letting oop arrays
>      >>>> temporarily be typed arrays while it's being cleared could be an
>      >>>> option. I did a prototype for that, which looks like this:
>      >>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
>      >>>> There's at least one issue here, the code doing allocation sampling
>      >>>> will see that we allocated long arrays instead of oop arrays, so the
>      >>>> reporting there will be skewed. That can be addressed if we go down
>      >>>> this path. The code is otherwise fairly simple and contained. Feel
>      >>>> free to spot any issues.
>      >>>> cheers,
>      >>>> Per
>      >>>>> On 7/26/19 2:27 PM, Per Liden wrote:
>      >>>>> Hi Ryan & Erik,
>      >>>>>
>      >>>>> I had a look at this and started exploring a slightly different
>      >>>>> approach. Instead of doing segmented clearing in the allocation path,
>      >>>>> we can have the concurrent GC thread clear pages when they are
>      >>>>> reclaimed and not do any clearing in the allocation path at all.
>      >>>>>
>      >>>>> That would look like this:
>      >>>>>
>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
>      >>>>>
>      >>>>> (I've had to temporarily comment out three lines of assert/debug
>      >>>>> code to make this work)
>      >>>>>
>      >>>>> The relocation set selection phase will now be tasked with some
>      >>>>> potentially expensive clearing work, so we'll want to make that
>      >>>>> part parallel also.
>      >>>>>
>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
>      >>>>>
>      >>>>> Moving this work from Java threads onto the concurrent GC threads
>      >>>>> means we will potentially prolong the RelocationSetSelection and
>      >>>>> Relocation phases. That might be a trade-off worth doing. In
>      >>>>> return, we get:
>      >>>>>
>      >>>>> * Faster array allocations, as there's now less work done in the
>      >>>>> allocation path.
>      >>>>> * This benefits all arrays, not just those allocated in large pages.
>      >>>>> * No need to consider/tune a "chunk size".
>      >>>>> * I also tend to think we'll end up with slightly less complex code,
>      >>>>> that is a bit easier to reason about. Can be debated of course.
>      >>>>>
>      >>>>> This approach might also "survive" longer, because the YC scheme
>      >>>>> we've been loosely thinking about currently requires newly
>      >>>>> allocated pages to be cleared anyway. It's of course too early to
>      >>>>> tell if that requirement will stand in the end, but it's possible
>      >>>>> anyway.
>      >>>>>
>      >>>>> I'll need to do some more testing and benchmarking to make sure
>      >>>>> there's no regression or bugs here. The commented out debug code
>      >>>>> also needs to be addressed of course.
>      >>>>>
>      >>>>> Comments? Other ideas?
>      >>>>>
>      >>>>> cheers,
>      >>>>> Per
>      >>>>>
>      >>>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
>      >>>>>>
>      >>>>>> Somehow I lost the RFR off the front and started a new thread.
>      >>>>>> Now that we're both off vacation I'd like to revisit this.  Can
>      >>>>>> you take a look?
>      >>>>>>
>      >>>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of Sciampacone,
>      >>>>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of
>      >>>>>> sci at amazon.com> wrote:
>      >>>>>>
>      >>>>>>       http://cr.openjdk.java.net/~phh/8227226/webrev.01/
>      >>>>>>       This shifts away from abusing the constructor do_zero magic
>      >>>>>> in exchange for virtualizing mem_clear() and specializing for the
>      >>>>>> Z version.  It does create a change in mem_clear in that it
>      >>>>>> returns an updated version of mem.  It does create change outside
>      >>>>>> of the Z code however it does feel cleaner.
>      >>>>>>       I didn't make a change to PinAllocating - looking at it, the
>      >>>>>> safety of keeping it constructor / destructor based still seemed
>      >>>>>> appropriate to me.  If the objection is to using the sequence
>      >>>>>> numbers to pin (and instead using handles to update) - this to me
>      >>>>>> seems less error prone.  I had originally discussed handles with
>      >>>>>> Stefan but the proposal came down to this which looks much cleaner.
>      >>>>>>       On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of
>      >>>>>> Sciampacone, Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on
>      >>>>>> behalf of sci at amazon.com> wrote:
>      >>>>>>           1) Yes this was a conscious decision.  There was
>      >>>>>> discussion on determining the optimal point for breakup but given
>      >>>>>> the existing sizes this seemed sufficient.  This doesn't preclude
>      >>>>>> the ability to go down that path if it's deemed absolutely
>      >>>>>> necessary.  The path for more complex decisions is now available.
>      >>>>>>           2) Agree
>      >>>>>>           3) I'm not clear here.  Do you mean effectively going
>      >>>>>> direct to ZHeap and bypassing the single function PinAllocating?
>      >>>>>> Agree. Otherwise I'll ask you to be a bit clearer.
>      >>>>>>           4) Agree
>      >>>>>>           5) I initially had the exact same reaction but I played
>      >>>>>> around with a few other versions (including breaking up
>      >>>>>> initialization points between header and body to get the desired
>      >>>>>> results) and this ended up looking correct.  I'll try mixing in
>      >>>>>> the mem clearer function again (fresh start) to see if it looks
>      >>>>>> any better.
>      >>>>>>           On 7/8/19, 5:49 AM, "Per Liden" <per.liden at oracle.com>
>      >>>>>> wrote:
>      >>>>>>               Hi Ryan,
>      >>>>>>               A few general comments:
>      >>>>>>               1) It looks like this still only works for large pages?
>      >>>>>>               2) The log_info stuff should be removed.
>      >>>>>>               3) I'm not a huge fan of single-use utilities like
>      >>>>>>               PinAllocating, at least not when, IMO, the alternative is
>      >>>>>>               more straightforward and less code.
>      >>>>>>               4) Please make locals const when possible.
>      >>>>>>               5) Duplicating _do_zero looks odd. Injecting a "mem clearer",
>      >>>>>>               similar to what Stefan's original patch did, seems worth
>      >>>>>>               exploring.
>      >>>>>>               cheers,
>      >>>>>>               /Per
>      >>>>>>               (Btw, I'm on vacation so I might not be
>      >>>>>> super-responsive to emails)
>      >>>>>>               On 2019-07-08 12:42, Erik Österlund wrote:
>      >>>>>>               > Hi Ryan,
>      >>>>>>               >
>      >>>>>>               > This looks good in general. Just some stylistic
>      >>>>>> things...
>      >>>>>>               >
>      >>>>>>               > 1) In the ZGC project we like the letter 'Z' so
>      >>>>>> much that we put it in
>      >>>>>>               > front of everything we possibly can, including all
>      >>>>>> class names.
>      >>>>>>               > 2) We also explicitly state things are private
>      >>>>>> even though it's
>      >>>>>>               > bleedingly obvious.
>      >>>>>>               >
>      >>>>>>               > So:
>      >>>>>>               >
>      >>>>>>               > 39 class PinAllocating {
>      >>>>>>               > 40 HeapWord* _mem;
>      >>>>>>               > 41 public:
>      >>>>>>               >
>      >>>>>>               > ->
>      >>>>>>               >
>      >>>>>>               > 39 class ZPinAllocating {
>      >>>>>>               > 40 private:
>      >>>>>>               > 41 HeapWord* _mem;
>      >>>>>>               > 42
>      >>>>>>               > 41 public:
>      >>>>>>               >
>      >>>>>>               > I can be your sponsor and push this change for you. I don't
>      >>>>>>               > think there is a need for another webrev for my small
>      >>>>>>               > stylistic remarks, so I can just fix that before pushing
>      >>>>>>               > this for you. On that note, I'll add me and StefanK to the
>      >>>>>>               > contributed-by section as we all worked out the right
>      >>>>>>               > solution to this problem collaboratively. I have run
>      >>>>>>               > through mach5 tier1-5, and found no issues with this patch.
>      >>>>>>               >
>      >>>>>>               > Thanks,
>      >>>>>>               > /Erik
>      >>>>>>               >
>      >>>>>>               > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
>      >>>>>>               >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
>      >>>>>>               >> https://bugs.openjdk.java.net/browse/JDK-8227226
>      >>>>>>               >>
>      >>>>>>               >> This patch introduces safe point checks into
>      >>>>>> array clearing during
>      >>>>>>               >> allocation for ZGC.  The patch isolates the
>      >>>>>> changes to ZGC as (in
>      >>>>>>               >> particular with the more modern collectors) the
>      >>>>>> approach to
>      >>>>>>               >> incrementalizing or respecting safe point checks
>      >>>>>> is going to be
>      >>>>>>               >> different.
>      >>>>>>               >>
>      >>>>>>               >> The approach is to keep the region holding the
>      >>>>>> array in the allocating
>      >>>>>>               >> state (pin logic) while updating the color to the
>      >>>>>> array after checks.
>      >>>>>>               >>
>      >>>>>>               >> Can I get a review?  Thanks.
>      >>>>>>               >>
>      >>>>>>               >> Ryan
>      >>>>>>               >
>      >>>>>>
>      >>
>      
> 


From per.liden at oracle.com  Wed Aug  7 10:23:13 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 7 Aug 2019 12:23:13 +0200
Subject: 8227226: Segmented array clearing for ZGC
In-Reply-To: <5d2d713e-3a8d-3b7d-72a7-7a538a17532b@oracle.com>
References: <46566545-B860-4C23-9450-860FD1FBC597@amazon.com>
 <5d2d713e-3a8d-3b7d-72a7-7a538a17532b@oracle.com>
Message-ID: <eb005dec-18c3-8079-1cd1-49b9b9cda907@oracle.com>

Hi again,

On 8/7/19 11:59 AM, Per Liden wrote:
> Hi Ryan,
> 
> On 8/7/19 3:05 AM, Sciampacone, Ryan wrote:
>> Although least intrusive, it goes back to some of the earlier
>> complaints about using false in the constructor for do_zero.  It also
>> makes a fair number of
> 
> My earlier comment about this was not about passing false to the
> constructor, but the duplication of the _do_zero member, which I thought
> looked a bit odd. In this patch, this was avoided by separating these
> paths already in ZCollectedHeap::array_allocate().
> 
>> assumptions (and goes against the hierarchy's intent) on
>> initialization logic to hide in finish().  That said, I agree that it is
>> fairly clean - and definitely addresses the missed cases of the
>> earlier webrev.
>>
> 
> We've had the same discussions here and concluded that we might want to 
> restructure parts of MemAllocator to better accommodate this use case, 
> but that overriding finish() seems ok for now. A patch to restructure 
> MemAllocator could come later if we think it's needed.
> 
>> 2 things,
>>
>> 1. Isn't the substitute_oop_array_klass() check too narrow?  It will
>> only detect types Object[], and not any other type of reference array
>> (such as String[])?  I believe there's a bug here (correct me if I'm
>> wrong).
> 
> On the JVM level, Object[], String[] and int[][] all have the same 
> Klass, so we should catch them all with this single check.

Sorry, I'm of course wrong here. Changed the check to call 
klass->is_objArray_klass() instead. Thanks!

Updated webrev.4 in-place.

cheers,
Per
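
A minimal sketch of why the check had to change, assuming (as Ryan's comment suggests) that the earlier check compared against one particular klass. The Klass struct and both helper functions below are hypothetical stand-ins for illustration only; just the name is_objArray_klass() comes from this thread.

```cpp
#include <cassert>
#include <string>

// Hypothetical model, not HotSpot's real Klass hierarchy.
struct Klass {
  std::string name;
  bool obj_array;  // true for any array whose elements are references
  bool is_objArray_klass() const { return obj_array; }
};

// Too narrow: only matches the one klass it is compared against.
bool matches_by_identity(const Klass* k, const Klass* object_array_klass) {
  return k == object_array_klass;
}

// Matches every reference array type, whatever its element type.
bool matches_by_kind(const Klass* k) {
  return k->is_objArray_klass();
}
```

With separate klass objects for Object[], String[] and int[], the identity check accepts only Object[], while the kind check accepts both reference arrays and rejects int[].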

> 
>> 2. I'd want to see an assert() on the sizeof(long) == sizeof(void *)
>> dependency.  I realize what code base this is in but it would be
>> properly defensive.
> 
> Sounds good.
> 
>>
>> What does the reporting look like in this case?  Is the long[] type
>> reported accepted?  I'm wondering if this depletes some of the
>> simplicity.
> 
> By overriding finish(), the sampling/reporting remains correct and 
> unaffected, as it will never see the intermediate long[].
> 
> Updated webrev:
> 
> http://cr.openjdk.java.net/~pliden/8227226/webrev.4
> 
> cheers,
> Per
> 

From thomas.schatzl at oracle.com  Wed Aug  7 10:39:46 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 7 Aug 2019 12:39:46 +0200
Subject: RFR (S): 8227442: Make young_index_in_cset zero-based
Message-ID: <60e6e3c6-efe7-7f5c-5f60-15b531cac76f@oracle.com>

Hi all,

   can I have reviews for this refactoring that changes the minimum 
index for the young indices (used for determining survivors per young 
region) from -1 to 0?

This avoids an imho unnecessary increment in the
copy_to_survivor_space() method.

CR:
https://bugs.openjdk.java.net/browse/JDK-8227442
Webrev:
http://cr.openjdk.java.net/~tschatzl/8227442/webrev/
Testing:
hs-tier1-5 almost done with no issues

Thanks,
   Thomas
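
A rough sketch of the kind of indexing change described above. This is not the code from the webrev: the function names and the per-region statistics vector are hypothetical, and it assumes the old scheme used -1 for "not young" and therefore had to add 1 on every lookup.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Old scheme (assumed): -1 marks "not young", young regions use 0..n-1,
// so every lookup into the statistics array has to add 1.
void record_survivor_old(std::vector<std::size_t>& words, int young_index,
                         std::size_t n) {
  words[static_cast<std::size_t>(young_index + 1)] += n;
}

// New scheme (assumed): 0 marks "not young", young regions use 1..n,
// so the index can be used directly.
void record_survivor_new(std::vector<std::size_t>& words, unsigned young_index,
                         std::size_t n) {
  words[young_index] += n;
}
```

Both variants end up writing to the same slots; the zero-based one simply drops the addition from the hot path.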


From sci at amazon.com  Wed Aug  7 13:55:46 2019
From: sci at amazon.com (Sciampacone, Ryan)
Date: Wed, 7 Aug 2019 13:55:46 +0000
Subject: 8227226: Segmented array clearing for ZGC
In-Reply-To: <eb005dec-18c3-8079-1cd1-49b9b9cda907@oracle.com>
References: <46566545-B860-4C23-9450-860FD1FBC597@amazon.com>
 <5d2d713e-3a8d-3b7d-72a7-7a538a17532b@oracle.com>
 <eb005dec-18c3-8079-1cd1-49b9b9cda907@oracle.com>
Message-ID: <8DD2FD76-8995-4BCB-A075-3215F466915E@amazon.com>

    > By overriding finish(), the sampling/reporting remains correct and 
    > unaffected, as it will never see the intermediate long[].
  
I learned something today.  Thank you.

For MemAllocator, I think we all agree the flow is locked in a bit too rigidly, but that rigidity helps with some of the VM/GC assumptions, so we end up battling it.  That said, I'm with you - if there's a rewrite to be had, it's not in this patch.

Otherwise, fwiw lgtm.
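
For readers following along, here is a minimal sketch of the idea as discussed in this thread, not the code from webrev.4. The header is typed as a long array while the payload is cleared in 64K segments, and the real array type is installed only at the end, which is the role the overridden finish() plays. All names (ArrayHeader, KlassKind, clear_segmented, the poll callback) are hypothetical, and the callback merely marks the place where a real VM could reach a safepoint.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <cstring>
#include <functional>
#include <vector>

// Hypothetical stand-ins; none of these names come from HotSpot.
enum class KlassKind { LongArray, ObjArray };

struct ArrayHeader {
  KlassKind klass;     // what a concurrent heap walker would see
  std::size_t length;  // number of 8-byte elements
};

// The trick relies on a Java long and a reference slot having the same size.
static_assert(sizeof(std::int64_t) == sizeof(void*),
              "long[] can only stand in for an oop array if element sizes match");

constexpr std::size_t kSegmentBytes = 64 * 1024;  // segment size benchmarked in this thread

// Clears the payload segment by segment and calls poll() after each segment.
// Returns the number of segments cleared.
std::size_t clear_segmented(ArrayHeader& header, std::int64_t* payload,
                            const std::function<void(const ArrayHeader&)>& poll) {
  header.klass = KlassKind::LongArray;  // no references to scan while clearing
  const std::size_t total_bytes = header.length * sizeof(std::int64_t);
  char* const base = reinterpret_cast<char*>(payload);
  std::size_t segments = 0;
  for (std::size_t done = 0; done < total_bytes; done += kSegmentBytes) {
    const std::size_t n = std::min(kSegmentBytes, total_bytes - done);
    std::memset(base + done, 0, n);
    ++segments;
    poll(header);  // a real VM could stop for a safepoint here
  }
  header.klass = KlassKind::ObjArray;  // the finish() step: install the real type last
  return segments;
}
```

Anything that observes the header between segments sees only the intermediate long array, and anything that runs after the function returns sees only the final reference array, which matches the observation that sampling/reporting never sees the intermediate long[].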


On 8/7/19, 3:24 AM, "Per Liden" <per.liden at oracle.com> wrote:

    Hi again,
    
    On 8/7/19 11:59 AM, Per Liden wrote:
    > Hi Ryan,
    > 
    > On 8/7/19 3:05 AM, Sciampacone, Ryan wrote:
    >> Although least intrusive, it goes back to some of the earlier 
    >> complaints about using false in the constructor for do_zero.  It also 
    >> makes a fair number of 
    > 
    > My earlier comment about this was not about passing false to the 
    > constructor, but the duplication of the _do_zero member, which I thought 
    > looked a bit odd. In this patch, this was avoided by separating these 
    > paths already in ZCollectedHeap::array_allocate().
    > 
    >> assumptions (and goes against the hierarchy's intent) on 
    >> initialization logic to hide in finish().  That said, I agree that it is 
    >> fairly clean - and definitely addresses the missed cases of the 
    >> earlier webrev.
    >>
    > 
    > We've had the same discussions here and concluded that we might want to 
    > restructure parts of MemAllocator to better accommodate this use case, 
    > but that overriding finish() seems ok for now. A patch to restructure 
    > MemAllocator could come later if we think it's needed.
    > 
    >> 2 things,
    >>
    >> 1. Isn't the substitute_oop_array_klass() check too narrow?  It will 
    >> only detect types Object[], and not any other type of reference array 
    >> (such as String[]) ?  I believe there's a bug here (correct me if I'm 
    >> wrong).
    > 
    > On the JVM level, Object[], String[] and int[][] all have the same 
    > Klass, so we should catch them all with this single check.
    
    Sorry, I'm of course wrong here. Changed the check to call 
    klass->is_objArray_klass() instead. Thanks!
    
    Updated webrev.4 in-place.
    
    cheers,
    Per
    
    > 
    >> 2. I'd want to see an assert() on the sizeof(long) == sizeof(void *) 
    >> dependency.  I realize what code base this is in but it would be 
    >> properly defensive.
    > 
    > Sounds good.
    > 
    >>
    >> What does the reporting look like in this case?  Is the long[] type 
    >> reported accepted?  I'm wondering if this depletes some of the 
    >> simplicity.
    > 
    > By overriding finish(), the sampling/reporting remains correct and 
    > unaffected, as it will never see the intermediate long[].
    > 
    > Updated webrev:
    > 
    > http://cr.openjdk.java.net/~pliden/8227226/webrev.4
    > 
    > cheers,
    > Per
    > 
    >>
    >> On 8/2/19, 6:13 AM, "hotspot-gc-dev on behalf of Per Liden" 
    >> <hotspot-gc-dev-bounces at openjdk.java.net on behalf of 
    >> per.liden at oracle.com> wrote:
    >>
    >>      Did some micro-benchmarking (on a Xeon E5-2630) with various segment
    >>      sizes between 4K and 512K, and 64K seems to offer a good 
    >> trade-off. For
    >>      a 1G array, the allocation time increases by ~1%, but in exchange 
    >> the
    >>      worst case TTSP drops from ~280ms to ~0.6ms.
    >>      Updated webrev using 64K:
    >>      http://cr.openjdk.java.net/~pliden/8227226/webrev.3
    >>      cheers,
    >>      Per
    >>      On 8/2/19 11:11 AM, Per Liden wrote:
    >>      > Hi Erik,
    >>      >
    >>      > On 8/1/19 5:56 PM, Erik Osterlund wrote:
    >>      >> Hi Per,
    >>      >>
    >>      >> I like that this approach is unintrusive, does its thing at 
    >> the right
    >>      >> abstraction layer, and also handles medium sized arrays.
    >>      >
    >>      > It even handles small arrays (i.e. arrays in small zpages) ;)
    >>      >
    >>      >> Looks good.
    >>      >
    >>      > Thanks! I'll test various segment sizes and see how that affects
    >>      > performance and TTSP.
    >>      >
    >>      > cheers,
    >>      > Per
    >>      >
    >>      >>
    >>      >> Thanks,
    >>      >> /Erik
    >>      >>
    >>      >>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
    >>      >>>
    >>      >>> Here's an updated webrev that should be complete, i.e. fixes the
    >>      >>> issues related to allocation sampling/reporting that I 
    >> mentioned.
    >>      >>> This patch makes MemAllocator::finish() virtual, so that we 
    >> can do
    >>      >>> our thing and install the correct klass pointer before the 
    >> Allocation
    >>      >>> destructor executes. This seems to be the least intrusive way of
    >>      >>> doing this.
    >>      >>>
    >>      >>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
    >>      >>>
    >>      >>> This passed function testing, but proper benchmarking remains 
    >> to be
    >>      >>> done.
    >>      >>>
    >>      >>> cheers,
    >>      >>> Per
    >>      >>>
    >>      >>>> On 7/31/19 7:19 PM, Per Liden wrote:
    >>      >>>> Hi,
    >>      >>>> I found some time to benchmark the "GC clears 
    >> pages"-approach, and
    >>      >>>> it's fairly clear that it's not paying off. So ditching that 
    >> idea.
    >>      >>>> However, I'm still looking for something that would not just do
    >>      >>>> segmented clearing of arrays in large zpages. Letting oop 
    >> arrays
    >>      >>>> temporarily be typed arrays while it's being cleared could 
    >> be an
    >>      >>>> option. I did a prototype for that, which looks like this:
    >>      >>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
    >>      >>>> There's at least one issue here, the code doing allocation 
    >> sampling
    >>      >>>> will see that we allocated long arrays instead of oop 
    >> arrays, so the
    >>      >>>> reporting there will be skewed. That can be addressed if we 
    >> go down
    >>      >>>> this path. The code is otherwise fairly simple and 
    >> contained. Feel
    >>      >>>> free to spot any issues.
    >>      >>>> cheers,
    >>      >>>> Per
    >>      >>>>> On 7/26/19 2:27 PM, Per Liden wrote:
    >>      >>>>> Hi Ryan & Erik,
    >>      >>>>>
    >>      >>>>> I had a look at this and started exploring a slightly 
    >> different
    >>      >>>>> approach. Instead of doing segmented clearing in the 
    >> allocation path,
    >>      >>>>> we can have the concurrent GC thread clear pages when they are
    >>      >>>>> reclaimed and not do any clearing in the allocation path at 
    >> all.
    >>      >>>>>
    >>      >>>>> That would look like this:
    >>      >>>>>
    >>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
    >>      >>>>>
    >>      >>>>> (I've had to temporarily comment out three lines of 
    >> assert/debug
    >>      >>>>> code to make this work)
    >>      >>>>>
    >>      >>>>> The relocation set selection phase will now be tasked with 
    >> some
    >>      >>>>> potentially expensive clearing work, so we'll want to make 
    >> that
    >>      >>>>> part parallel also.
    >>      >>>>>
    >>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
    >>      >>>>>
    >>      >>>>> Moving this work from Java threads onto the concurrent GC 
    >> threads
    >>      >>>>> means we will potentially prolong the 
    >> RelocationSetSelection and
    >>      >>>>> Relocation phases. That might be a trade-off worth doing. In
    >>      >>>>> return, we get:
    >>      >>>>>
    >>      >>>>> * Faster array allocations, as there's now less work done 
    >> in the
    >>      >>>>> allocation path.
    >>      >>>>> * This benefits all arrays, not just those allocated in 
    >> large pages.
    >>      >>>>> * No need to consider/tune a "chunk size".
    >>      >>>>> * I also tend to think we'll end up with slightly less complex 
    >> code,
    >>      >>>>> that is a bit easier to reason about. Can be debated of 
    >> course.
    >>      >>>>>
    >>      >>>>> This approach might also "survive" longer, because the YC 
    >> scheme
    >>      >>>>> we've been loosely thinking about currently requires newly
    >>      >>>>> allocated pages to be cleared anyway. It's of course too 
    >> early to
    >>      >>>>> tell if that requirement will stand in the end, but it's 
    >> possible
    >>      >>>>> anyway.
    >>      >>>>>
    >>      >>>>> I'll need to do some more testing and benchmarking to make 
    >> sure
    >>      >>>>> there's no regression or bugs here. The commented out debug 
    >> code
    >>      >>>>> also needs to be addressed of course.
    >>      >>>>>
    >>      >>>>> Comments? Other ideas?
    >>      >>>>>
    >>      >>>>> cheers,
    >>      >>>>> Per
    >>      >>>>>
    >>      >>>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
    >>      >>>>>>
    >>      >>>>>> Somehow I lost the RFR off the front and started a new 
    >> thread.
    >>      >>>>>> Now that we're both off vacation I'd like to revisit 
    >> this.  Can
    >>      >>>>>> you take a look?
    >>      >>>>>>
    >>      >>>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of 
    >> Sciampacone,
    >>      >>>>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of
    >>      >>>>>> sci at amazon.com> wrote:
    >>      >>>>>>
    >>      >>>>>>       http://cr.openjdk.java.net/~phh/8227226/webrev.01/
    >>      >>>>>>       This shifts away from abusing the constructor 
    >> do_zero magic
    >>      >>>>>> in exchange for virtualizing mem_clear() and specializing 
    >> for the
    >>      >>>>>> Z version.  It does create a change in mem_clear in that it
    >>      >>>>>> returns an updated version of mem.  It does create change 
    >> outside
    >>      >>>>>> of the Z code however it does feel cleaner.
    >>      >>>>>>       I didn't make a change to PinAllocating - looking at 
    >> it, the
    >>      >>>>>> safety of keeping it constructor / destructor based still 
    >> seemed
    >>      >>>>>> appropriate to me.  If the objection is to using the sequence
    >>      >>>>>> numbers to pin (and instead using handles to update) - 
    >> this to me
    >>      >>>>>> seems less error prone.  I had originally discussed 
    >> handles with
    >>      >>>>>> Stefan but the proposal came down to this which looks much 
    >> cleaner.
    >>      >>>>>>       On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of
    >>      >>>>>> Sciampacone, Ryan" 
    >> <hotspot-gc-dev-bounces at openjdk.java.net on
    >>      >>>>>> behalf of sci at amazon.com> wrote:
    >>      >>>>>>           1) Yes this was a conscious decision.  There was
    >>      >>>>>> discussion on determining the optimal point for breakup 
    >> but given
    >>      >>>>>> the existing sizes this seemed sufficient.  This doesn't 
    >> preclude
    >>      >>>>>> the ability to go down that path if it's deemed absolutely
    >>      >>>>>> necessary.  The path for more complex decisions is now 
    >> available.
    >>      >>>>>>           2) Agree
    >>      >>>>>>           3) I'm not clear here.  Do you mean effectively 
    >> going
    >>      >>>>>> direct to ZHeap and bypassing the single function 
    >> PinAllocating?
    >>      >>>>>> Agree. Otherwise I'll ask you to be a bit clearer.
    >>      >>>>>>           4) Agree
    >>      >>>>>>           5) I initially had the exact same reaction but I 
    >> played
    >>      >>>>>> around with a few other versions (including breaking up
    >>      >>>>>> initialization points between header and body to get the 
    >> desired
    >>      >>>>>> results) and this ended up looking correct.  I'll try 
    >> mixing in
    >>      >>>>>> the mem clearer function again (fresh start) to see if it 
    >> looks
    >>      >>>>>> any better.
    >>      >>>>>>           On 7/8/19, 5:49 AM, "Per Liden" 
    >> <per.liden at oracle.com>
    >>      >>>>>> wrote:
    >>      >>>>>>               Hi Ryan,
    >>      >>>>>>               A few general comments:
    >>      >>>>>>               1) It looks like this still only works for
    >> large pages?
    >>      >>>>>>               2) The log_info stuff should be removed.
    >>      >>>>>>               3) I'm not a huge fan of single-use 
    >> utilities like
    >>      >>>>>> PinAllocating, at
    >>      >>>>>>               least not when, IMO, the alternative is more
    >>      >>>>>> straight forward and less code.
    >>      >>>>>>               4) Please make locals const when possible.
    >>      >>>>>>               5) Duplicating _do_zero looks odd. Injecting 
    >> a "mem
    >>      >>>>>> clearer", similar to
    >>      >>>>>>               what Stefan's original patch did, seems worth
    >> exploring.
    >>      >>>>>>               cheers,
    >>      >>>>>>               /Per
    >>      >>>>>>               (Btw, I'm on vacation so I might not be
    >>      >>>>>> super-responsive to emails)
    >>      >>>>>>               On 2019-07-08 12:42, Erik Österlund wrote:
    >>      >>>>>>               > Hi Ryan,
    >>      >>>>>>               >
    >>      >>>>>>               > This looks good in general. Just some 
    >> stylistic
    >>      >>>>>> things...
    >>      >>>>>>               >
    >>      >>>>>>               > 1) In the ZGC project we like the letter 
    >> 'Z' so
    >>      >>>>>> much that we put it in
    >>      >>>>>>               > front of everything we possibly can, 
    >> including all
    >>      >>>>>> class names.
    >>      >>>>>>               > 2) We also explicitly state things are 
    >> private
    >>      >>>>>> even though it's
    >>      >>>>>>               > bleedingly obvious.
    >>      >>>>>>               >
    >>      >>>>>>               > So:
    >>      >>>>>>               >
    >>      >>>>>>               > class PinAllocating {
    >>      >>>>>>               >   HeapWord* _mem;
    >>      >>>>>>               > public:
    >>      >>>>>>               >
    >>      >>>>>>               > ->
    >>      >>>>>>               >
    >>      >>>>>>               > class ZPinAllocating {
    >>      >>>>>>               > private:
    >>      >>>>>>               >   HeapWord* _mem;
    >>      >>>>>>               >
    >>      >>>>>>               > public:
    >>      >>>>>>               >
    >>      >>>>>>               > I can be your sponsor and push this change for you. I don't
    >>      >>>>>>               > think there is a need for another webrev for my small
    >>      >>>>>>               > stylistic remarks, so I can just fix that before pushing this
    >>      >>>>>>               > for you. On that note, I'll add me and StefanK to the
    >>      >>>>>>               > contributed-by section as we all worked out the right solution
    >>      >>>>>>               > to this problem collaboratively. I have run through mach5
    >>      >>>>>>               > tier1-5, and found no issues with this patch.
    >>      >>>>>>               >
    >>      >>>>>>               > Thanks,
    >>      >>>>>>               > /Erik
    >>      >>>>>>               >
    >>      >>>>>>               > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
    >>      >>>>>>               >> 
    >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
    >>      >>>>>>               >> 
    >> https://bugs.openjdk.java.net/browse/JDK-8227226
    >>      >>>>>>               >>
    >>      >>>>>>               >> This patch introduces safe point checks into
    >>      >>>>>> array clearing during
    >>      >>>>>>               >> allocation for ZGC.  The patch isolates the
    >>      >>>>>> changes to ZGC as (in
    >>      >>>>>>               >> particular with the more modern 
    >> collectors) the
    >>      >>>>>> approach to
    >>      >>>>>>               >> incrementalizing or respecting safe point 
    >> checks
    >>      >>>>>> is going to be
    >>      >>>>>>               >> different.
    >>      >>>>>>               >>
    >>      >>>>>>               >> The approach is to keep the region 
    >> holding the
    >>      >>>>>> array in the allocating
    >>      >>>>>>               >> state (pin logic) while updating the 
    >> color to the
    >>      >>>>>> array after checks.
    >>      >>>>>>               >>
    >>      >>>>>>               >> Can I get a review?  Thanks.
    >>      >>>>>>               >>
    >>      >>>>>>               >> Ryan
    >>      >>>>>>               >
    >>      >>>>>>
    >>      >>
    >>
    

From tprintezis at twitter.com  Wed Aug  7 14:05:10 2019
From: tprintezis at twitter.com (Tony Printezis)
Date: Wed, 7 Aug 2019 07:05:10 -0700
Subject: RFR(S): 8227224: GenCollectedHeap: add subspace transitions for young
 gen for gc+heap=info log lines
Message-ID: <CAOzU2inYb4RFbBiQC5PjnTWTCYTouNAyPQ-vX2SpENgUQdvQPQ@mail.gmail.com>

Hi all,

Similar to 8227225 but for the GenCollectedHeap GCs. Webrev is here:

http://cr.openjdk.java.net/~tonyp/8227224/webrev.0/

Tony


Tony Printezis | @TonyPrintezis | tprintezis at twitter.com


From christoph.langer at sap.com  Wed Aug  7 14:29:49 2019
From: christoph.langer at sap.com (Langer, Christoph)
Date: Wed, 7 Aug 2019 14:29:49 +0000
Subject: RFR(XS) 8228359: [TESTBUG]
 jdk.jfr.e.g.c.TestGCHeapConfigurationEventWith32BitOops.java does not expect
 MinHeapSize to be aligned to HeapAlignment
In-Reply-To: <6d437432-a43c-3e27-5b48-84d5486ea5aa@oracle.com>
References: <VI1PR02MB4829816E753D15A3BF84726D9BC80@VI1PR02MB4829.eurprd02.prod.outlook.com>
 <6d437432-a43c-3e27-5b48-84d5486ea5aa@oracle.com>
Message-ID: <DB7PR02MB5193AF52F7E77CD6216C0C028AD40@DB7PR02MB5193.eurprd02.prod.outlook.com>

Hi Richard,

this little test fix looks good to me as well, and I have seen it working perfectly in our JDK13 nightlies for the last few days. If nobody on gc-dev has objections, I'll push it later today to JDK13.

Thanks
Christoph


> -----Original Message-----
> From: hotspot-gc-dev <hotspot-gc-dev-bounces at openjdk.java.net> On
> Behalf Of mikhailo.seledtsov at oracle.com
> Sent: Wednesday, July 24, 2019 02:02
> To: Reingruber, Richard <richard.reingruber at sap.com>; hotspot-jfr-
> dev at openjdk.java.net; hotspot-gc-dev at openjdk.java.net
> Subject: Re: RFR(XS) 8228359: [TESTBUG]
> jdk.jfr.e.g.c.TestGCHeapConfigurationEventWith32BitOops.java does not
> expect MinHeapSize to be aligned to HeapAlignment
> 
> Looks good from my POV.
> 
> Adding hotspot-gc-dev at openjdk.java.net since this test concerns a GC
> event.
> 
> 
> Thank you,
> 
> Misha
> 
> On 7/18/19 6:56 AM, Reingruber, Richard wrote:
> > Hi,
> >
> > could I please get reviews for this small TESTBUG fix?
> >
> > Webrev:
> http://cr.openjdk.java.net/~rrich/webrevs/2019/8228359/webrev/
> > Bug:    https://bugs.openjdk.java.net/browse/JDK-8228359
> >
> > Since JDK-8223837, MinHeapSize is aligned to HeapAlignment. The test
> > therefore fails on linuxppc, where we have 64k pages and HeapAlignment
> > is proportional to the page size.
> > The fix changes the test to expect the configured size of 100m to be
> > aligned to HeapAlignment (32m on linuxppc).
> >
> > Tested on linuxppcle and on linuxx86_64.
> >
> > Thanks, Richard.
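
For readers following the arithmetic in Richard's description, here is a minimal sketch of what aligning 100m to a 32m HeapAlignment yields. It assumes the alignment rounds up, which the mail above does not state explicitly; `align_up` and `M` are illustrative names here, not quotes from the test or from HotSpot.

```cpp
#include <cassert>
#include <cstddef>

// One mebibyte, matching the "m" suffix used for the heap sizes above.
constexpr std::size_t M = 1024 * 1024;

// Round size up to the next multiple of alignment.
// Assumption: alignment is a power of two, as page-derived alignments are.
inline std::size_t align_up(std::size_t size, std::size_t alignment) {
  assert(alignment != 0 && (alignment & (alignment - 1)) == 0);
  return (size + alignment - 1) & ~(alignment - 1);
}
```

Under that assumption a configured 100m becomes 128m with a 32m alignment, while a size that is already a multiple of 32m is left unchanged.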

From sgehwolf at redhat.com  Wed Aug  7 14:36:49 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Wed, 07 Aug 2019 16:36:49 +0200
Subject: PING? [8u] RFR: 8135318: CMS wrong max_eden_size for
 check_gc_overhead_limit
In-Reply-To: <c3ab08b2d702c6f01d3f2b939bca45cc8ce5c121.camel@redhat.com>
References: <c3ab08b2d702c6f01d3f2b939bca45cc8ce5c121.camel@redhat.com>
Message-ID: <5a5800679e314ee3b5545323bb156eff5475b091.camel@redhat.com>

On Tue, 2019-07-09 at 14:37 +0200, Severin Gehwolf wrote:
> Hi,
> 
> Could I please get a review for this OpenJDK 8u backport? It's
> essentially the same patch as in JDK 9+, but it didn't apply cleanly
> since the surrounding code is different and the files were moved
> around.
> 
> Adjustments I've done:
> 
> JDK 8u uses a different file layout:
> 
> src/share/vm/gc/cms/concurrentMarkSweepGeneration.cpp =>
> src/share/vm/gc_implementation/concurrentMarkSweep/concurrentMarkSweepGeneration.cpp
> 
> JDK 8u doesn't have JDK-8065972 and JDK-8065992, which account for the
> surrounding code changes. Note that JDK-8065992 changes the _young_gen
> instance variable from Generation* to ParNewGeneration*.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8135318
> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8135318/01/webrev/
> JDK 9 change: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/dd0c55eac358
> 
> Testing: "tier1" testing of Linux x86_64 which showed no new regressions.
> 
> Thoughts?

Anyone?

Thanks,
Severin


From shade at redhat.com  Wed Aug  7 15:47:41 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 7 Aug 2019 17:47:41 +0200
Subject: PING? [8u] RFR: 8135318: CMS wrong max_eden_size for
 check_gc_overhead_limit
In-Reply-To: <5a5800679e314ee3b5545323bb156eff5475b091.camel@redhat.com>
References: <c3ab08b2d702c6f01d3f2b939bca45cc8ce5c121.camel@redhat.com>
 <5a5800679e314ee3b5545323bb156eff5475b091.camel@redhat.com>
Message-ID: <cc57413e-ce3b-2627-40d9-bba8db114752@redhat.com>

On 8/7/19 4:36 PM, Severin Gehwolf wrote:
> On Tue, 2019-07-09 at 14:37 +0200, Severin Gehwolf wrote:
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8135318
>> webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8135318/01/webrev/
>> JDK 9 change: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/dd0c55eac358

Backport looks good.

Looks like a simple fix without any bug tail. So the code in 8u would now compute eden size as
"max_young - 2*survivor", which makes sense: eden, plus from/to survivor spaces. Before it computed
it as "max_young - 3*survivor", which is erroneous. (It is weird DefNewGeneration::max_capacity()
discounts survivor space size, leading to this problem...) Right?
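
To make the arithmetic concrete, here is a minimal sketch of the two computations as described above. It is a simplified model, not the 8u source: the function names are hypothetical, and the young generation is assumed to be eden plus two equally sized survivor spaces.

```cpp
#include <cassert>
#include <cstddef>

// Simplified model (hypothetical names): young gen = eden + from + to,
// with both survivor spaces the same size.
inline std::size_t eden_size(std::size_t max_young, std::size_t survivor) {
  assert(max_young >= 2 * survivor);
  return max_young - 2 * survivor;
}

// The pre-fix arithmetic as described above: one survivor too many is
// subtracted, so eden is under-reported by exactly one survivor space.
inline std::size_t eden_size_before_fix(std::size_t max_young, std::size_t survivor) {
  assert(max_young >= 3 * survivor);
  return max_young - 3 * survivor;
}
```

With a 100-unit young generation and 10-unit survivors, the first gives 80 and the second 70.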

-- 
Thanks,
-Aleksey


From kim.barrett at oracle.com  Wed Aug  7 20:38:29 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 7 Aug 2019 16:38:29 -0400
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <94330f3b-168e-4e7b-d4da-a9a56266bb49@loongson.cn>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
 <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
 <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
 <962978C2-A633-42E9-AB3D-C2C4ED7F5464@oracle.com>
 <94330f3b-168e-4e7b-d4da-a9a56266bb49@loongson.cn>
Message-ID: <00988C76-CEFA-4858-A921-A9109FE80608@oracle.com>

> On Aug 6, 2019, at 11:51 PM, Jie Fu <fujie at loongson.cn> wrote:
> 
> Hi Kim,
> 
> Thank you so much.
> 
> Updated: http://cr.openjdk.java.net/~jiefu/8229169/webrev.02/
> 
> I have added an explicit comment and the reviewers in it.

Updated comment is good.

I pushed the change a little while ago.


From kim.barrett at oracle.com  Wed Aug  7 22:42:44 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 7 Aug 2019 18:42:44 -0400
Subject: RFR(S): 8227224: GenCollectedHeap: add subspace transitions for
 young gen for gc+heap=info log lines
In-Reply-To: <CAOzU2inYb4RFbBiQC5PjnTWTCYTouNAyPQ-vX2SpENgUQdvQPQ@mail.gmail.com>
References: <CAOzU2inYb4RFbBiQC5PjnTWTCYTouNAyPQ-vX2SpENgUQdvQPQ@mail.gmail.com>
Message-ID: <B2C21495-2C3B-4FDB-B00A-4FA1668C71F4@oracle.com>

> On Aug 7, 2019, at 10:05 AM, Tony Printezis <tprintezis at twitter.com> wrote:
> 
> Hi all,
> 
> Similar to 8227225 but for the GenCollectedHeap GCs. Webrev is here:
> 
> http://cr.openjdk.java.net/~tonyp/8227224/webrev.0/
> 
> Tony
> 
> 
> Tony Printezis | @TonyPrintezis | tprintezis at twitter.com

Looks good.

I had a somewhat longer comment about the similarities between this new
code for GenCollectedHeap and the recently added code for
ParallelScavengeHeap, but re-reading the previous thread I agree with
the plan to keep things relatively isolated for now and perhaps look for
possible refactorings once we have more complete coverage, particularly
including non-generational collectors.

One pre-existing nit I noticed while looking at the two side by side.
In ParallelScavengeHeap::get_pre_gc_values() the young->to_space()
value is fetched but not used.  I don't need a new webrev for removal
of that line.


From fujie at loongson.cn  Thu Aug  8 00:26:16 2019
From: fujie at loongson.cn (Jie Fu)
Date: Thu, 8 Aug 2019 08:26:16 +0800
Subject: RFR: 8229169: False failure of GenericTaskQueue::pop_local on
 architectures with weak memory model
In-Reply-To: <00988C76-CEFA-4858-A921-A9109FE80608@oracle.com>
References: <21d55328-1178-b9b9-d215-3aaf3b149f3b@loongson.cn>
 <AM6PR02MB47886EDB133D565ED05FC6529AD50@AM6PR02MB4788.eurprd02.prod.outlook.com>
 <94d048ea-7db4-2829-9fd5-3da7635af371@loongson.cn>
 <962978C2-A633-42E9-AB3D-C2C4ED7F5464@oracle.com>
 <94330f3b-168e-4e7b-d4da-a9a56266bb49@loongson.cn>
 <00988C76-CEFA-4858-A921-A9109FE80608@oracle.com>
Message-ID: <05e1f8e4-8c53-04eb-b199-659877c762de@loongson.cn>

Thank you very much, Kim.

On 2019/8/8 4:38 AM, Kim Barrett wrote:
>> On Aug 6, 2019, at 11:51 PM, Jie Fu <fujie at loongson.cn> wrote:
>>
>> Hi Kim,
>>
>> Thank you so much.
>>
>> Updated: http://cr.openjdk.java.net/~jiefu/8229169/webrev.02/
>>
>> I have added an explicit comment and the reviewers in it.
> Updated comment is good.
>
> I pushed the change a little while ago.
>


From per.liden at oracle.com  Thu Aug  8 08:42:59 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 8 Aug 2019 10:42:59 +0200
Subject: 8227226: Segmented array clearing for ZGC
In-Reply-To: <8DD2FD76-8995-4BCB-A075-3215F466915E@amazon.com>
References: <46566545-B860-4C23-9450-860FD1FBC597@amazon.com>
 <5d2d713e-3a8d-3b7d-72a7-7a538a17532b@oracle.com>
 <eb005dec-18c3-8079-1cd1-49b9b9cda907@oracle.com>
 <8DD2FD76-8995-4BCB-A075-3215F466915E@amazon.com>
Message-ID: <192aed90-07cc-6d59-2a5b-06b665630739@oracle.com>

On 8/7/19 3:55 PM, Sciampacone, Ryan wrote:
>      > By overriding finish(), the sampling/reporting remains correct and
>      > unaffected, as it will never see the intermediate long[].
>    
> I learned something today.  Thank you.
> 
> For MemAllocator, I think we all agree the flow is locked in a bit too rigidly but this helps with some of the VM/GC assumptions so we end up battling it.  That said I'm with you - if there's a rewrite to be had, it's not in this patch.
> 
> Otherwise, fwiw lgtm.

Thanks Ryan!

(Since you don't have an OpenJDK id I can't add you as Reviewed-by, but 
Stefan, Erik and you will all be added as Contributed-by)

cheers,
Per

> 
> 
> On 8/7/19, 3:24 AM, "Per Liden" <per.liden at oracle.com> wrote:
> 
>      Hi again,
>      
>      On 8/7/19 11:59 AM, Per Liden wrote:
>      > Hi Ryan,
>      >
>      > On 8/7/19 3:05 AM, Sciampacone, Ryan wrote:
>      >> Although least intrusive, it goes back to some of the earlier
>      >> complaints about using false in the constructor for do_zero.  It also
>      >> makes a fair number of
>      >
>      > My earlier comment about this was not about passing false to the
>      > constructor, but the duplication of the _do_zero member, which I thought
>      > looked a bit odd. In this patch, this was avoided by separating these
>      > paths already in ZCollectedHeap::array_allocate().
>      >
>      >> assumptions (and goes against the hierarchy's intent) on
>      >> initialization logic to hide in finish().  That said, I agree that is
>      >> fairly clean - and definitely addresses the missed cases of the
>      >> earlier webrev.
>      >>
>      >
>      > We've had the same discussions here and concluded that we might want to
>      > restructure parts of MemAllocator to better accommodate this use case,
>      > but that overriding finish() seems ok for now. A patch to restructure
>      > MemAllocator could come later if we think it's needed.
>      >
>      >> 2 things,
>      >>
>      >> 1. Isn't the substitute_oop_array_klass() check too narrow?  It will
>      >> only detect types Object[], and not any other type of reference array
>      >> (such as String[]) ?  I believe there's a bug here (correct me if I'm
>      >> wrong).
>      >
>      > On the JVM level, Object[], String[] and int[][] all have the same
>      > Klass, so we should catch them all with this single check.
>      
>      Sorry, I'm of course wrong here. Changed the check to call
>      klass->is_objArray_klass() instead. Thanks!
>      
>      Updated webrev.4 in-place.
>      
>      cheers,
>      Per
>      
>      >
>      >> 2. I'd want to see an assert() on the sizeof(long) == sizeof(void *)
>      >> dependency.  I realize what code base this is in but it would be
>      >> properly defensive.
>      >
>      > Sounds good.
>      >
>      >>
>      >> What does the reporting look like in this case?  Is the long[] type
>      >> reported accepted?  I'm wondering if this depletes some of the
>      >> simplicity.
>      >
>      > By overriding finish(), the sampling/reporting remains correct and
>      > unaffected, as it will never see the intermediate long[].
>      >
>      > Updated webrev:
>      >
>      > http://cr.openjdk.java.net/~pliden/8227226/webrev.4
>      >
>      > cheers,
>      > Per
>      >
>      >>
>      >> On 8/2/19, 6:13 AM, "hotspot-gc-dev on behalf of Per Liden"
>      >> <hotspot-gc-dev-bounces at openjdk.java.net on behalf of
>      >> per.liden at oracle.com> wrote:
>      >>
>      >>      Did some micro-benchmarking (on a Xeon E5-2630) with various segment
>      >>      sizes between 4K and 512K, and 64K seems to offer a good
>      >> trade-off. For
>      >>      a 1G array, the allocation time increases by ~1%, but in exchange
>      >> the
>      >>      worst case TTSP drops from ~280ms to ~0.6ms.
>      >>      Updated webrev using 64K:
>      >>      http://cr.openjdk.java.net/~pliden/8227226/webrev.3
>      >>      cheers,
>      >>      Per
>      >>      On 8/2/19 11:11 AM, Per Liden wrote:
>      >>      > Hi Erik,
>      >>      >
>      >>      > On 8/1/19 5:56 PM, Erik Osterlund wrote:
>      >>      >> Hi Per,
>      >>      >>
>      >>      >> I like that this approach is unintrusive, does its thing at
>      >> the right
>      >>      >> abstraction layer, and also handles medium sized arrays.
>      >>      >
>      >>      > It even handles small arrays (i.e. arrays in small zpages) ;)
>      >>      >
>      >>      >> Looks good.
>      >>      >
>      >>      > Thanks! I'll test various segment sizes and see how that affects
>      >>      > performance and TTSP.
>      >>      >
>      >>      > cheers,
>      >>      > Per
>      >>      >
>      >>      >>
>      >>      >> Thanks,
>      >>      >> /Erik
>      >>      >>
>      >>      >>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
>      >>      >>>
>      >>      >>> Here's an updated webrev that should be complete, i.e. fixes the
>      >>      >>> issues related to allocation sampling/reporting that I
>      >> mentioned.
>      >>      >>> This patch makes MemAllocator::finish() virtual, so that we
>      >> can do
>      >>      >>> our thing and install the correct klass pointer before the
>      >> Allocation
>      >>      >>> destructor executes. This seems to be the least intrusive way of
>      >>      >>> doing this.
>      >>      >>>
>      >>      >>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
>      >>      >>>
>      >>      >>> This passed function testing, but proper benchmarking remains
>      >> to be
>      >>      >>> done.
>      >>      >>>
>      >>      >>> cheers,
>      >>      >>> Per
>      >>      >>>
>      >>      >>>> On 7/31/19 7:19 PM, Per Liden wrote:
>      >>      >>>> Hi,
>      >>      >>>> I found some time to benchmark the "GC clears
>      >> pages"-approach, and
>      >>      >>>> it's fairly clear that it's not paying off. So ditching that
>      >> idea.
>      >>      >>>> However, I'm still looking for something that would not just do
>      >>      >>>> segmented clearing of arrays in large zpages. Letting oop
>      >> arrays
>      >>      >>>> temporarily be typed arrays while it's being cleared could
>      >> be an
>      >>      >>>> option. I did a prototype for that, which looks like this:
>      >>      >>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
>      >>      >>>> There's at least one issue here, the code doing allocation
>      >> sampling
>      >>      >>>> will see that we allocated long arrays instead of oop
>      >> arrays, so the
>      >>      >>>> reporting there will be skewed. That can be addressed if we
>      >> go down
>      >>      >>>> this path. The code is otherwise fairly simple and
>      >> contained. Feel
>      >>      >>>> free to spot any issues.
>      >>      >>>> cheers,
>      >>      >>>> Per
>      >>      >>>>> On 7/26/19 2:27 PM, Per Liden wrote:
>      >>      >>>>> Hi Ryan & Erik,
>      >>      >>>>>
>      >>      >>>>> I had a look at this and started exploring a slightly
>      >> different
>      >>      >>>>> approach. Instead of doing segmented clearing in the
>      >> allocation path,
>      >>      >>>>> we can have the concurrent GC thread clear pages when they are
>      >>      >>>>> reclaimed and not do any clearing in the allocation path at
>      >> all.
>      >>      >>>>>
>      >>      >>>>> That would look like this:
>      >>      >>>>>
>      >>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
>      >>      >>>>>
>      >>      >>>>> (I've had to temporarily comment out three lines of
>      >> assert/debug
>      >>      >>>>> code to make this work)
>      >>      >>>>>
>      >>      >>>>> The relocation set selection phase will now be tasked with
>      >> some
>      >>      >>>>> potentially expensive clearing work, so we'll want to make
>      >> that
>      >>      >>>>> part parallel also.
>      >>      >>>>>
>      >>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
>      >>      >>>>>
>      >>      >>>>> Moving this work from Java threads onto the concurrent GC
>      >> threads
>      >>      >>>>> means we will potentially prolong the
>      >> RelocationSetSelection and
>      >>      >>>>> Relocation phases. That might be a trade-off worth doing. In
>      >>      >>>>> return, we get:
>      >>      >>>>>
>      >>      >>>>> * Faster array allocations, as there's now less work done
>      >> in the
>      >>      >>>>> allocation path.
>      >>      >>>>> * This benefits all arrays, not just those allocated in
>      >> large pages.
>      >>      >>>>> * No need to consider/tune a "chunk size".
>      >>      >>>>> * I also tend to think we'll end up with slightly less complex
>      >> code,
>      >>      >>>>> that is a bit easier to reason about. Can be debated of
>      >> course.
>      >>      >>>>>
>      >>      >>>>> This approach might also "survive" longer, because the YC
>      >> scheme
>      >>      >>>>> we've been loosely thinking about currently requires newly
>      >>      >>>>> allocated pages to be cleared anyway. It's of course too
>      >> early to
>      >>      >>>>> tell if that requirement will stand in the end, but it's
>      >> possible
>      >>      >>>>> anyway.
>      >>      >>>>>
>      >>      >>>>> I'll need to do some more testing and benchmarking to make
>      >> sure
>      >>      >>>>> there's no regression or bugs here. The commented out debug
>      >> code
>      >>      >>>>> also needs to be addressed of course.
>      >>      >>>>>
>      >>      >>>>> Comments? Other ideas?
>      >>      >>>>>
>      >>      >>>>> cheers,
>      >>      >>>>> Per
>      >>      >>>>>
>      >>      >>>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
>      >>      >>>>>>
>      >>      >>>>>> Somehow I lost the RFR off the front and started a new
>      >> thread.
>      >>      >>>>>> Now that we're both off vacation I'd like to revisit
>      >> this.  Can
>      >>      >>>>>> you take a look?
>      >>      >>>>>>
>      >>      >>>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of
>      >> Sciampacone,
>      >>      >>>>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of
>      >>      >>>>>> sci at amazon.com> wrote:
>      >>      >>>>>>
>      >>      >>>>>>       http://cr.openjdk.java.net/~phh/8227226/webrev.01/
>      >>      >>>>>>       This shifts away from abusing the constructor
>      >> do_zero magic
>      >>      >>>>>> in exchange for virtualizing mem_clear() and specializing
>      >> for the
>      >>      >>>>>> Z version.  It does create a change in mem_clear in that it
>      >>      >>>>>> returns an updated version of mem.  It does create change
>      >> outside
>      >>      >>>>>> of the Z code however it does feel cleaner.
>      >>      >>>>>>       I didn't make a change to PinAllocating - looking at
>      >> it, the
>      >>      >>>>>> safety of keeping it constructor / destructor based still
>      >> seemed
>      >>      >>>>>> appropriate to me.  If the objection is to using the sequence
>      >>      >>>>>> numbers to pin (and instead using handles to update) -
>      >> this to me
>      >>      >>>>>> seems less error prone.  I had originally discussed
>      >> handles with
>      >>      >>>>>> Stefan but the proposal came down to this which looks much
>      >> cleaner.
>      >>      >>>>>>       On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of
>      >>      >>>>>> Sciampacone, Ryan"
>      >> <hotspot-gc-dev-bounces at openjdk.java.net on
>      >>      >>>>>> behalf of sci at amazon.com> wrote:
>      >>      >>>>>>           1) Yes this was a conscious decision.  There was
>      >>      >>>>>> discussion on determining the optimal point for breakup
>      >> but given
>      >>      >>>>>> the existing sizes this seemed sufficient.  This doesn't
>      >> preclude
>      >>      >>>>>> the ability to go down that path if it's deemed absolutely
>      >>      >>>>>> necessary.  The path for more complex decisions is now
>      >> available.
>      >>      >>>>>>           2) Agree
>      >>      >>>>>>           3) I'm not clear here.  Do you mean effectively
>      >> going
>      >>      >>>>>> direct to ZHeap and bypassing the single function
>      >> PinAllocating?
>      >>      >>>>>> Agree. Otherwise I'll ask you to be a bit clearer.
>      >>      >>>>>>           4) Agree
>      >>      >>>>>>           5) I initially had the exact same reaction but I
>      >> played
>      >>      >>>>>> around with a few other versions (including breaking up
>      >>      >>>>>> initialization points between header and body to get the
>      >> desired
>      >>      >>>>>> results) and this ended up looking correct.  I'll try
>      >> mixing in
>      >>      >>>>>> the mem clearer function again (fresh start) to see if it
>      >> looks
>      >>      >>>>>> any better.
>      >>      >>>>>>           On 7/8/19, 5:49 AM, "Per Liden"
>      >> <per.liden at oracle.com>
>      >>      >>>>>> wrote:
>      >>      >>>>>>               Hi Ryan,
>      >>      >>>>>>               A few general comments:
>      >>      >>>>>>               1) It looks like this still only works for
>      >> large pages?
>      >>      >>>>>>               2) The log_info stuff should be removed.
>      >>      >>>>>>               3) I'm not a huge fan of single-use
>      >> utilities like
>      >>      >>>>>> PinAllocating, at
>      >>      >>>>>>               least not when, IMO, the alternative is more
>      >>      >>>>>> straight forward and less code.
>      >>      >>>>>>               4) Please make locals const when possible.
>      >>      >>>>>>               5) Duplicating _do_zero looks odd. Injecting
>      >> a "mem
>      >>      >>>>>> clearer", similar to
>      >>      >>>>>>               what Stefan's original patch did, seems worth
>      >> exploring.
>      >>      >>>>>>               cheers,
>      >>      >>>>>>               /Per
>      >>      >>>>>>               (Btw, I'm on vacation so I might not be
>      >>      >>>>>> super-responsive to emails)
>      >>      >>>>>>               On 2019-07-08 12:42, Erik Österlund wrote:
>      >>      >>>>>>               > Hi Ryan,
>      >>      >>>>>>               >
>      >>      >>>>>>               > This looks good in general. Just some
>      >> stylistic
>      >>      >>>>>> things...
>      >>      >>>>>>               >
>      >>      >>>>>>               > 1) In the ZGC project we like the letter
>      >> 'Z' so
>      >>      >>>>>> much that we put it in
>      >>      >>>>>>               > front of everything we possibly can,
>      >> including all
>      >>      >>>>>> class names.
>      >>      >>>>>>               > 2) We also explicitly state things are
>      >> private
>      >>      >>>>>> even though it's
>      >>      >>>>>>               > bleedingly obvious.
>      >>      >>>>>>               >
>      >>      >>>>>>               > So:
>      >>      >>>>>>               >
>      >>      >>>>>>               > class PinAllocating {
>      >>      >>>>>>               >   HeapWord* _mem;
>      >>      >>>>>>               > public:
>      >>      >>>>>>               >
>      >>      >>>>>>               > ->
>      >>      >>>>>>               >
>      >>      >>>>>>               > class ZPinAllocating {
>      >>      >>>>>>               > private:
>      >>      >>>>>>               >   HeapWord* _mem;
>      >>      >>>>>>               >
>      >>      >>>>>>               > public:
>      >>      >>>>>>               >
>      >>      >>>>>>               > I can be your sponsor and push this change for you. I don't
>      >>      >>>>>>               > think there is a need for another webrev for my small
>      >>      >>>>>>               > stylistic remarks, so I can just fix that before pushing this
>      >>      >>>>>>               > for you. On that note, I'll add me and StefanK to the
>      >>      >>>>>>               > contributed-by section as we all worked out the right solution
>      >>      >>>>>>               > to this problem collaboratively. I have run through mach5
>      >>      >>>>>>               > tier1-5, and found no issues with this patch.
>      >>      >>>>>>               >
>      >>      >>>>>>               > Thanks,
>      >>      >>>>>>               > /Erik
>      >>      >>>>>>               >
>      >>      >>>>>>               > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
>      >>      >>>>>>               >>
>      >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
>      >>      >>>>>>               >>
>      >> https://bugs.openjdk.java.net/browse/JDK-8227226
>      >>      >>>>>>               >>
>      >>      >>>>>>               >> This patch introduces safe point checks into
>      >>      >>>>>> array clearing during
>      >>      >>>>>>               >> allocation for ZGC.  The patch isolates the
>      >>      >>>>>> changes to ZGC as (in
>      >>      >>>>>>               >> particular with the more modern
>      >> collectors) the
>      >>      >>>>>> approach to
>      >>      >>>>>>               >> incrementalizing or respecting safe point
>      >> checks
>      >>      >>>>>> is going to be
>      >>      >>>>>>               >> different.
>      >>      >>>>>>               >>
>      >>      >>>>>>               >> The approach is to keep the region
>      >> holding the
>      >>      >>>>>> array in the allocating
>      >>      >>>>>>               >> state (pin logic) while updating the
>      >> color to the
>      >>      >>>>>> array after checks.
>      >>      >>>>>>               >>
>      >>      >>>>>>               >> Can I get a review?  Thanks.
>      >>      >>>>>>               >>
>      >>      >>>>>>               >> Ryan
>      >>      >>>>>>               >
>      >>      >>>>>>
>      >>      >>
>      >>
>      
> 
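
As a companion to the quoted thread above, here is a rough sketch of segmented clearing: zero the array body one fixed-size segment at a time and give the thread a chance to reach a safepoint between segments. This is not the webrev code. `clear_segmented` and the `poll` callback are hypothetical stand-ins, and the sketch leaves out the part of the patch that keeps the array typed as long[] until clearing is done.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstring>
#include <functional>

// Zero `bytes` bytes starting at `mem`, at most `segment_bytes` at a time.
// `poll` is called after every segment; in a VM this is where a safepoint
// check would go (here it is just a caller-supplied callback).
// Returns the number of segments that were cleared.
inline std::size_t clear_segmented(void* mem, std::size_t bytes,
                                   std::size_t segment_bytes,
                                   const std::function<void()>& poll) {
  assert(segment_bytes > 0);  // a zero segment size would never make progress
  char* cur = static_cast<char*>(mem);
  std::size_t remaining = bytes;
  std::size_t segments = 0;
  while (remaining > 0) {
    const std::size_t n = std::min(remaining, segment_bytes);
    std::memset(cur, 0, n);
    cur += n;
    remaining -= n;
    ++segments;
    poll();
  }
  return segments;
}
```

The segment size is the tuning knob discussed in the thread: smaller segments bound the time between polls more tightly but add per-segment overhead, which matches the trade-off reported for 64K segments (about 1% slower allocation of a 1G array in exchange for a worst-case TTSP of about 0.6ms instead of about 280ms).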


From erik.osterlund at oracle.com  Thu Aug  8 09:46:16 2019
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Thu, 8 Aug 2019 11:46:16 +0200
Subject: RFR: 8229278: Improve hs_err location printing to assume less about
 GC internals
Message-ID: <7d2f5a53-682e-1a04-4727-d2a5a08be3eb@oracle.com>

Hi,

Today when we crash and print hs_err files, the printing utility for 
describing heap locations assumes:
1) That the Java heap memory reservation is contiguous
2) That the Java heap is parseable
We should let the GC describe a location instead, opting in to such 
concepts.

This patch adds a print_location pure virtual function on CollectedHeap
allowing the GC to choose printing strategy. A new LocationPrinter 
utility was added, allowing GCs to implement the functionality easily 
without much code duplication.
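
To make the shape of the change concrete, here is a minimal standalone sketch of the idea, not the code in the webrev. The class names (HeapSketch, ContiguousHeapSketch), the describe() helper, the stream type and the message wording are all made up for illustration. The only thing taken from the description above is that the heap interface exposes a pure virtual print_location and each GC decides how to describe an address.

```cpp
#include <cstdint>
#include <iostream>
#include <sstream>
#include <string>

// Hypothetical stand-in for the heap interface; not the real CollectedHeap.
class HeapSketch {
public:
  virtual ~HeapSketch() = default;
  // Describe addr on out. Returns false if this heap does not recognize addr,
  // so the caller can fall back to a generic description.
  virtual bool print_location(std::ostream& out, std::uintptr_t addr) const = 0;
};

// A heap that happens to be one contiguous range [base, base + size).
// A GC with a discontiguous heap would implement print_location differently
// and would never need to pretend it has a single range.
class ContiguousHeapSketch : public HeapSketch {
  std::uintptr_t _base;
  std::uintptr_t _size;
public:
  ContiguousHeapSketch(std::uintptr_t base, std::uintptr_t size)
    : _base(base), _size(size) {}

  bool print_location(std::ostream& out, std::uintptr_t addr) const override {
    if (addr < _base || addr - _base >= _size) {
      return false;
    }
    out << "0x" << std::hex << addr
        << " is in the heap at offset 0x" << (addr - _base);
    return true;
  }
};

// The crash reporter only asks the heap; it assumes nothing about its layout.
std::string describe(const HeapSketch& heap, std::uintptr_t addr) {
  std::ostringstream out;
  if (!heap.print_location(out, addr)) {
    out << "0x" << std::hex << addr << " is an unknown value";
  }
  return out.str();
}
```

The point of the boolean return in this sketch is that the generic printer keeps its fallback path while the layout knowledge lives entirely in the GC.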

Webrev:
http://cr.openjdk.java.net/~eosterlund/8229278/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8229278

Thanks,
/Erik


From tprintezis at twitter.com  Thu Aug  8 13:50:32 2019
From: tprintezis at twitter.com (Tony Printezis)
Date: Thu, 8 Aug 2019 13:50:32 +0000
Subject: RFR(S): 8227224: GenCollectedHeap: add subspace transitions for
 young gen for gc+heap=info log lines
In-Reply-To: <B2C21495-2C3B-4FDB-B00A-4FA1668C71F4@oracle.com>
References: <CAOzU2inYb4RFbBiQC5PjnTWTCYTouNAyPQ-vX2SpENgUQdvQPQ@mail.gmail.com>
 <B2C21495-2C3B-4FDB-B00A-4FA1668C71F4@oracle.com>
Message-ID: <CAOzU2ikiLsXvPZ_UgR6Hp3dXWubE=1kZo0c5VN66LxbOAz27dA@mail.gmail.com>

Hi Kim,

Inline.


-----
Tony Printezis | @TonyPrintezis | tprintezis at twitter.com


On August 7, 2019 at 6:42:56 PM, Kim Barrett (kim.barrett at oracle.com) wrote:

>> On Aug 7, 2019, at 10:05 AM, Tony Printezis <tprintezis at twitter.com> wrote:
>>
>> Hi all,
>>
>> Similar to 8227225 but for the GenCollectedHeap GCs. Webrev is here:
>>
>> http://cr.openjdk.java.net/~tonyp/8227224/webrev.0/
>>
>> Tony
>>
>>
>> -----
>> Tony Printezis | @TonyPrintezis | tprintezis at twitter.com
>
> Looks good.
>
> I had a somewhat longer comment about the similarities between this new
> code for GenCollectedHeap and the recently added code for
> ParallelScavengeHeap, but re-reading the previous thread I agree with
> the plan to keep things relatively isolated for now and perhaps look for
> possible refactorings once we have more complete coverage, particularly
> including non-generational collectors.


Yeah, we can definitely increase the amount of code sharing here. And IMHO
the main benefit is not to decrease the amount of code, but to ensure that
the output is consistent across all GCs. But can I also point out that,
before, there was NO code sharing whatsoever (all this code was replicated
multiple times). Now at least there's some common code and common macros.
And we can improve on that further.

While we're at it: I'm happy to work on follow-ups. What's a good next
step? As Thomas had suggested, I can change the formatting code to use more
appropriate units instead of always K. Another possibility is to update the
'gc' log lines to the same format? E.g.,

[29.884s][info][gc           ] GC(24) Pause Young (Allocation Failure) 6147M->3M(9216M) 2.705ms
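
For the "more appropriate units" follow-up, a rough sketch of what I have in mind is below. This is standalone illustration code, not what is in the webrev: the name format_size and the policy of only moving to the next unit once the value reaches 10 * 1024 of the current one are assumptions picked for the example, and the value is truncated rather than rounded.

```cpp
#include <cstdint>
#include <string>

// Hypothetical helper: render a byte count with a unit chosen by magnitude
// instead of always printing K.
std::string format_size(std::uint64_t bytes) {
  static const char* const kUnits[] = {"B", "K", "M", "G"};
  const std::uint64_t kStep = 1024;

  std::uint64_t value = bytes;
  int unit = 0;
  // Only switch to the next unit while the value would still keep at least
  // two significant digits there (assumed threshold: 10 * 1024).
  while (unit < 3 && value >= 10 * kStep) {
    value /= kStep;  // integer division, so the result is truncated
    ++unit;
  }
  return std::to_string(value) + kUnits[unit];
}
```

With this policy a 6147 MiB heap still prints in M, a 3 MiB value prints as 3072K, and only values of 10 GiB and up move to G. Whether that threshold is the right one is exactly the kind of thing that should be decided once, in shared code, so all GCs agree.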


> One pre-existing nit I noticed while looking at the two side by side.
> In ParallelScavengeHeap::get_pre_gc_values() the young->to_space()
> value is fetched but not used. I don't need a new webrev for removal
> of that line.


Thanks for pointing this out. I had added the to-space transition in the
output but, after discussing it with Thomas, we decided that it's probably
not necessary. So I removed it from the code. But I clearly forgot that
assignment. I'll add the removal of that line to this change.


Tony


From kirk at kodewerk.com  Thu Aug  8 16:37:16 2019
From: kirk at kodewerk.com (Kirk Pepperdine)
Date: Thu, 8 Aug 2019 09:37:16 -0700
Subject: RFC: JEP: Remove the Concurrent Mark Sweep Garbage Collector
In-Reply-To: <0148c4d5-ec42-f2b1-d954-bc95b33cfd07@oracle.com>
References: <0148c4d5-ec42-f2b1-d954-bc95b33cfd07@oracle.com>
Message-ID: <81F15403-E42B-46B0-9BE4-88B8BFE8100F@kodewerk.com>

Hi Thomas,

"In the meantime the Oracle garbage collection team introduced a new garbage collector, ZGC, and Red Hat contributed the Shenandoah collector. Oracle further improved G1, which has been its designated successor since initial introduction in JDK6u14, to a point where we believe there is little reason to use the CMS collector in deployments."

I fear my experience in tuning GC on 1000s of JVMs leaves me at odds with the premise that there is little reason to use CMS. In my experience CMS overheads are nowhere near the level of those seen with G1. This is not just my experience but there are other organizations that have reached the same conclusion. Furthermore, with the removal of CMS we are now recommending that customers consider Parallel GC as it offers a far better experience than G1. Again, I'm not alone in seeing this as a growing trend.

Although I do have high hopes for both ZGC and Shenandoah, they are not an option for most sites at this point in time. I would suggest that deprecation of CMS was premature as there was no viable alternative. I would further suggest that removal is also premature as there is still no viable alternative for the majority of workloads that work exceptionally well with CMS.

Kind regards,
Kirk Pepperdine


> On Aug 3, 2019, at 1:37 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi all,
> 
>  as already discussed during the OCW last week the Oracle garbage collection team is set to remove the CMS collector from OpenJDK for the reasons stated there and in the JEP in JDK 14.
> 
> I wrote up a first draft available at
> 
> https://bugs.openjdk.java.net/browse/JDK-8229049
> 
> Comments and reviewers to move it along appreciated ;)
> 
> Thanks,
>  Thomas


From leo.korinth at oracle.com  Thu Aug  8 17:24:11 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Thu, 8 Aug 2019 19:24:11 +0200
Subject: RFR: 8224663: Parallel GC: Use WorkGang (5: ScavengeRootsTask)
In-Reply-To: <c169f7c7-ac60-013a-65cd-0a27ede96356@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <3ea47da5-4b1e-0301-461a-ede9047cb638@oracle.com>
 <8CA8F38A-1B9F-4B7D-902E-846BF7718D2C@oracle.com>
 <2d970c64-0c42-8213-6432-754d73839783@oracle.com>
 <E3F39DF8-79D1-4916-B59C-785A57C9E5BD@oracle.com>
 <7c703ebd-7ebe-5ec0-c2e9-587798b91c28@oracle.com>
 <c169f7c7-ac60-013a-65cd-0a27ede96356@oracle.com>
Message-ID: <e8ef2f86-aaee-1599-2109-9a8fa8fd9316@oracle.com>

Hi!

Here is the latest fixup (fixup3) rebased on latest.
http://cr.openjdk.java.net/~lkorinth/workgang/3/fixups/

I will push it like this:
http://cr.openjdk.java.net/~lkorinth/workgang/3/fixups_collapsed/

All changes collapsed:
http://cr.openjdk.java.net/~lkorinth/workgang/3/all_collapsed/

I did fix JVMCI include order in:
src/hotspot/share/gc/parallel/psParallelCompact.cpp
but the include was rebased away in:
src/hotspot/share/gc/parallel/psScavenge.cpp

all 8 collapsed changes do compile and pass the :hotspot_gc with 
-XX:+UseParallelGC

If you are okay with this I will run more testing on all changes, again 
checking performance and running slowdebugs etc.

I also added StefanK as reviewer. He has suggested many improvements 
before the first public review.

Are you okay with all the 8 commits?

Thanks,
Leo


From claes.redestad at oracle.com  Thu Aug  8 17:33:08 2019
From: claes.redestad at oracle.com (Claes Redestad)
Date: Thu, 8 Aug 2019 19:33:08 +0200
Subject: RFC: JEP: Remove the Concurrent Mark Sweep Garbage Collector
In-Reply-To: <81F15403-E42B-46B0-9BE4-88B8BFE8100F@kodewerk.com>
References: <0148c4d5-ec42-f2b1-d954-bc95b33cfd07@oracle.com>
 <81F15403-E42B-46B0-9BE4-88B8BFE8100F@kodewerk.com>
Message-ID: <e7ead617-3f0a-1011-ddfe-1fc513eb6c10@oracle.com>

Hi Kirk,

sorry for barging in, but this genuinely piqued my interest.

On 2019-08-08 18:37, Kirk Pepperdine wrote:
> Hi Thomas,
> 
> "In the meantime the Oracle garbage collection team introduced a new garbage collector, ZGC, and Red Hat contributed the Shenandoah collector. Oracle further improved G1, which has been its designated successor since initial introduction in JDK6u14, to a point where we believe there is little reason to use the CMS collector in deployments."
> 
> I fear my experience in tuning GC on 1000s of JVMs leaves me at odds with the premise that there is little reason to use CMS. In my experience CMS overheads are nowhere near the level of those seen with G1. This is not just my experience but there are other organizations that have reached the same conclusion. Furthermore, with the removal of CMS we are now recommending that customers consider Parallel GC as it offers a far better experience than G1. Again, I'm not alone in seeing this as a growing trend.

Which JDK version were these JVMs running? I'm curious if/how you've
taken into account the tuning work and improvements made to G1 in
recent releases in your assessments.

> 
> Although I do have high hopes for both ZGC and Shenandoah, they are not an option for most sites at this point in time. I would suggest that deprecation of CMS was premature as there was no viable alternative. I would further suggest that removal is also premature as there is still no viable alternative for the majority of workloads that work exceptionally well with CMS.

Can you clarify why they aren't options at this point in time? Or why
you think they still won't be once CMS is actually removed, be it from
JDK 14 or a later release?

Thanks!

/Claes

> 
> Kind regards,
> Kirk Pepperdine
> 
> 
>> On Aug 3, 2019, at 1:37 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>
>> Hi all,
>>
>>   as already discussed during the OCW last week the Oracle garbage collection team is set to remove the CMS collector from OpenJDK for the reasons stated there and in the JEP in JDK 14.
>>
>> I wrote up a first draft available at
>>
>> https://bugs.openjdk.java.net/browse/JDK-8229049
>>
>> Comments and reviewers to move it along appreciated ;)
>>
>> Thanks,
>>   Thomas
> 


From rkennke at redhat.com  Thu Aug  8 17:43:38 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 08 Aug 2019 19:43:38 +0200
Subject: RFC: JEP: Remove the Concurrent Mark Sweep Garbage Collector
In-Reply-To: <e7ead617-3f0a-1011-ddfe-1fc513eb6c10@oracle.com>
References: <0148c4d5-ec42-f2b1-d954-bc95b33cfd07@oracle.com>
 <81F15403-E42B-46B0-9BE4-88B8BFE8100F@kodewerk.com>
 <e7ead617-3f0a-1011-ddfe-1fc513eb6c10@oracle.com>
Message-ID: <D4DF9903-4FB0-4D74-AD42-0F5CECF1D39E@redhat.com>

+1 (my questions too)

Plus, so far nobody has actually come forward and expressed interest in actually maintaining CMS.

Roman


On 8 August 2019 19:33:08 CEST, Claes Redestad <claes.redestad at oracle.com> wrote:
>Hi Kirk,
>
>sorry for barging in, but this genuinely piqued my interest.
>
>On 2019-08-08 18:37, Kirk Pepperdine wrote:
>> Hi Thomas,
>> 
>> "In the meantime the Oracle garbage collection team introduced a new
>> garbage collector, ZGC, and Red Hat contributed the Shenandoah
>> collector. Oracle further improved G1, which has been its designated
>> successor since initial introduction in JDK6u14, to a point where we
>> believe there is little reason to use the CMS collector in
>> deployments."
>> 
>> I fear my experience in tuning GC on 1000s of JVMs leaves me at odds
>> with the premise that there is little reason to use CMS. In my
>> experience CMS overheads are nowhere near the level of those seen with
>> G1. This is not just my experience but there are other organizations
>> that have reached the same conclusion. Furthermore, with the removal
>> of CMS we are now recommending that customers consider Parallel GC as
>> it offers a far better experience than G1. Again, I'm not alone in
>> seeing this as a growing trend.
>
>Which JDK version were these JVMs running? I'm curious if/how you've
>taken into account the tuning work and improvements made to G1 in
>recent releases in your assessments.
>
>> 
>> Although I do have high hopes for both ZGC and Shenandoah, they are
>> not an option for most sites at this point in time. I would suggest
>> that deprecation of CMS was premature as there was no viable
>> alternative. I would further suggest that removal is also premature as
>> there is still no viable alternative for the majority of workloads that
>> work exceptionally well with CMS.
>
>Can you clarify why they aren't options at this point in time? Or why
>you think they still won't be once CMS is actually removed, be it from
>JDK 14 or a later release?
>
>Thanks!
>
>/Claes
>
>> 
>> Kind regards,
>> Kirk Pepperdine
>> 
>> 
>>> On Aug 3, 2019, at 1:37 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>>
>>> Hi all,
>>>
>>>   as already discussed during the OCW last week the Oracle garbage
>>> collection team is set to remove the CMS collector from OpenJDK for the
>>> reasons stated there and in the JEP in JDK 14.
>>>
>>> I wrote up a first draft available at
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-8229049
>>>
>>> Comments and reviewers to move it along appreciated ;)
>>>
>>> Thanks,
>>>   Thomas
>> 

-- 
This message was sent from my Android device with K-9 Mail.

From kirk at kodewerk.com  Thu Aug  8 17:59:58 2019
From: kirk at kodewerk.com (Kirk Pepperdine)
Date: Thu, 8 Aug 2019 10:59:58 -0700
Subject: RFC: JEP: Remove the Concurrent Mark Sweep Garbage Collector
In-Reply-To: <e7ead617-3f0a-1011-ddfe-1fc513eb6c10@oracle.com>
References: <0148c4d5-ec42-f2b1-d954-bc95b33cfd07@oracle.com>
 <81F15403-E42B-46B0-9BE4-88B8BFE8100F@kodewerk.com>
 <e7ead617-3f0a-1011-ddfe-1fc513eb6c10@oracle.com>
Message-ID: <7B7C4630-6E6D-4303-85EB-E4CFF7769210@kodewerk.com>

Hi Claes,

Thanks for your interest.


> On Aug 8, 2019, at 10:33 AM, Claes Redestad <claes.redestad at oracle.com> wrote:
> 
> Hi Kirk,
> 
> sorry for barging in, but this genuinely piqued my interest.
> 
> On 2019-08-08 18:37, Kirk Pepperdine wrote:
>> Hi Thomas,
>> "In the meantime the Oracle garbage collection team introduced a new garbage collector, ZGC, and Red Hat contributed the Shenandoah collector. Oracle further improved G1, which has been its designated successor since initial introduction in JDK6u14, to a point where we believe there is little reason to use the CMS collector in deployments."
>> I fear my experience in tuning GC on 1000s of JVMs leaves me at odds with the premise that there is little reason to use CMS. In my experience CMS overheads are nowhere near the level of those seen with G1. This is not just my experience but there are other organizations that have reached the same conclusion. Furthermore, with the removal of CMS we are now recommending that customers consider Parallel GC as it offers a far better experience than G1. Again, I'm not alone in seeing this as a growing trend.
> 
> Which JDK version were these JVMs running? I'm curious if/how you've
> taken into account the tuning work and improvements made to G1 in
> recent releases in your assessments.

To be fair I've not tested since JDK 12, though I didn't see much in the way of improvement. I know things have improved in 13. However, many clients are running on 8 and will be for some time to come. Others are sticking to LTS releases for whatever reason. But the big issue is that I've not been able to get the collector to scale down to the point where it overlaps with the parallel collector. It is this gap that CMS fills, and it's a very common gap for the apps that we are generally involved with (including a number of apps like PeopleSoft).
> 
>> Although I do have high hopes for both ZGC and Shenandoah, they are not an option for most sites at this point in time. I would suggest that deprecation of CMS was premature as there was no viable alternative. I would further suggest that removal is also premature as there is still no viable alternative for the majority of workloads that work exceptionally well with CMS.
> 
> Can you clarify why they aren't options at this point in time? Or why
> you think they still won't be once CMS is actually removed, be it from
> JDK 14 or a later release?

As I stated, at this point in time these collectors are experimental options and thus I can't recommend them to be used in production environments. Furthermore, all of the benchmarking suggests that applications will take a 10-15% hit in throughput. I take that to mean a proportionate hit in the power bill.

Most of what I've stated are things that have been stated by myself and others when CMS deprecation was first announced. We did have a meeting while at JavaOne to discuss. Like I said, since then I've seen improvements but there just seem to be some overheads that are unavoidable when building a mostly concurrent collector. That is coupled with the fact that, for CMS, there is a range of tenured occupancy where it simply works better than any other option.

Kind regards,
Kirk


From kirk at kodewerk.com  Thu Aug  8 19:19:47 2019
From: kirk at kodewerk.com (Kirk Pepperdine)
Date: Thu, 8 Aug 2019 12:19:47 -0700
Subject: RFC: JEP: Remove the Concurrent Mark Sweep Garbage Collector
In-Reply-To: <D4DF9903-4FB0-4D74-AD42-0F5CECF1D39E@redhat.com>
References: <0148c4d5-ec42-f2b1-d954-bc95b33cfd07@oracle.com>
 <81F15403-E42B-46B0-9BE4-88B8BFE8100F@kodewerk.com>
 <e7ead617-3f0a-1011-ddfe-1fc513eb6c10@oracle.com>
 <D4DF9903-4FB0-4D74-AD42-0F5CECF1D39E@redhat.com>
Message-ID: <BA9FB106-E808-4D7E-89AC-6BE07D44E731@kodewerk.com>

Hi Roman,

> 
> Plus, so far nobody has actually come forward and expressed interest in actually maintaining CMS.

Yes, I understand that this is an important issue. However, way back when, we were told by Oracle that they didn't want to keep bearing the costs of testing, meaning that even if someone did decide to take on the role of maintaining CMS, it wouldn't have addressed the testing costs that seemed to be an important driver for the original decision to deprecate. That, and the decision makers were convinced that G1 would manage CMS-level workloads efficiently, which, IME, it can't.

Kind regards,
Kirk


From kim.barrett at oracle.com  Thu Aug  8 19:35:01 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 8 Aug 2019 15:35:01 -0400
Subject: RFR: 8224663: Parallel GC: Use WorkGang (5: ScavengeRootsTask)
In-Reply-To: <e8ef2f86-aaee-1599-2109-9a8fa8fd9316@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <3ea47da5-4b1e-0301-461a-ede9047cb638@oracle.com>
 <8CA8F38A-1B9F-4B7D-902E-846BF7718D2C@oracle.com>
 <2d970c64-0c42-8213-6432-754d73839783@oracle.com>
 <E3F39DF8-79D1-4916-B59C-785A57C9E5BD@oracle.com>
 <7c703ebd-7ebe-5ec0-c2e9-587798b91c28@oracle.com>
 <c169f7c7-ac60-013a-65cd-0a27ede96356@oracle.com>
 <e8ef2f86-aaee-1599-2109-9a8fa8fd9316@oracle.com>
Message-ID: <2DC954B9-9003-43CC-9310-9A01866F50B7@oracle.com>

> On Aug 8, 2019, at 1:24 PM, Leo Korinth <leo.korinth at oracle.com> wrote:
> 
> Hi!
> 
> Here is the latest fixup (fixup3) rebased on latest.
> http://cr.openjdk.java.net/~lkorinth/workgang/3/fixups/
> 
> I will push it like this:
> http://cr.openjdk.java.net/~lkorinth/workgang/3/fixups_collapsed/
> 
> All changes collapsed:
> http://cr.openjdk.java.net/~lkorinth/workgang/3/all_collapsed/
> 
> I did fix JVMCI include order in:
> src/hotspot/share/gc/parallel/psParallelCompact.cpp
> but the include was rebased away in:
> src/hotspot/share/gc/parallel/psScavenge.cpp
> 
> all 8 collapsed changes do compile and pass the :hotspot_gc with -XX:+UseParallelGC
> 
> If you are okay with this I will run more testing on all changes, again checking performance and running slowdebugs etc.
> 
> I also added StefanK as reviewer. He has suggested many improvements before the first public review.
> 
> Are you okay with all the 8 commits?
> 
> Thanks,
> Leo

Looks good.


From thomas.schatzl at oracle.com  Fri Aug  9 03:56:06 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 9 Aug 2019 05:56:06 +0200
Subject: RFR: 8224663: Parallel GC: Use WorkGang (5: ScavengeRootsTask)
In-Reply-To: <e8ef2f86-aaee-1599-2109-9a8fa8fd9316@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <3ea47da5-4b1e-0301-461a-ede9047cb638@oracle.com>
 <8CA8F38A-1B9F-4B7D-902E-846BF7718D2C@oracle.com>
 <2d970c64-0c42-8213-6432-754d73839783@oracle.com>
 <E3F39DF8-79D1-4916-B59C-785A57C9E5BD@oracle.com>
 <7c703ebd-7ebe-5ec0-c2e9-587798b91c28@oracle.com>
 <c169f7c7-ac60-013a-65cd-0a27ede96356@oracle.com>
 <e8ef2f86-aaee-1599-2109-9a8fa8fd9316@oracle.com>
Message-ID: <22e7c37f-e86b-8ab7-24bc-0737f89be682@oracle.com>

Hi Leo,

psScavenge.cpp:
364       // There are not old-to-young pointers if the old gen is empty.

s/not/no

No need for re-review for fixing the typo. Looks good otherwise.

Thomas

On 08.08.19 19:24, Leo Korinth wrote:
> Hi!
> 
> Here is the latest fixup (fixup3) rebased on latest.
> http://cr.openjdk.java.net/~lkorinth/workgang/3/fixups/
> 
> I will push it like this:
> http://cr.openjdk.java.net/~lkorinth/workgang/3/fixups_collapsed/
> 
> All changes collapsed:
> http://cr.openjdk.java.net/~lkorinth/workgang/3/all_collapsed/
> 
> I did fix JVMCI include order in:
> src/hotspot/share/gc/parallel/psParallelCompact.cpp
> but the include was rebased away in:
> src/hotspot/share/gc/parallel/psScavenge.cpp
> 
> all 8 collapsed changes do compile and pass the :hotspot_gc with 
> -XX:+UseParallelGC
> 
> If you are okay with this I will run more testing on all changes, again 
> checking performance and running slowdebugs etc.
> 
> I also added StefanK as reviewer. He has suggested many improvements 
> before the first public review.
> 
> Are you okay with all the 8 commits?
> 
> Thanks,
> Leo


From thomas.schatzl at oracle.com  Fri Aug  9 03:59:06 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 9 Aug 2019 05:59:06 +0200
Subject: RFR(S): 8227224: GenCollectedHeap: add subspace transitions for
 young gen for gc+heap=info log lines
In-Reply-To: <CAOzU2inYb4RFbBiQC5PjnTWTCYTouNAyPQ-vX2SpENgUQdvQPQ@mail.gmail.com>
References: <CAOzU2inYb4RFbBiQC5PjnTWTCYTouNAyPQ-vX2SpENgUQdvQPQ@mail.gmail.com>
Message-ID: <826948e1-837f-1885-4e64-9c9d363b363a@oracle.com>

Hi,

On 07.08.19 16:05, Tony Printezis wrote:
> Hi all,
> 
> Similar to 8227225 but for the GenCollectedHeap GCs. Webrev is here:
> 
> http://cr.openjdk.java.net/~tonyp/8227224/webrev.0/
> 
> Tony
> 

   looks good.

Thomas


From leo.korinth at oracle.com  Fri Aug  9 08:50:15 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Fri, 9 Aug 2019 10:50:15 +0200
Subject: RFR: 8224663: Parallel GC: Use WorkGang (5: ScavengeRootsTask)
In-Reply-To: <2DC954B9-9003-43CC-9310-9A01866F50B7@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <3ea47da5-4b1e-0301-461a-ede9047cb638@oracle.com>
 <8CA8F38A-1B9F-4B7D-902E-846BF7718D2C@oracle.com>
 <2d970c64-0c42-8213-6432-754d73839783@oracle.com>
 <E3F39DF8-79D1-4916-B59C-785A57C9E5BD@oracle.com>
 <7c703ebd-7ebe-5ec0-c2e9-587798b91c28@oracle.com>
 <c169f7c7-ac60-013a-65cd-0a27ede96356@oracle.com>
 <e8ef2f86-aaee-1599-2109-9a8fa8fd9316@oracle.com>
 <2DC954B9-9003-43CC-9310-9A01866F50B7@oracle.com>
Message-ID: <e7c0adae-a76b-21e8-870b-e18a5572ffbe@oracle.com>


On 08/08/2019 21:35, Kim Barrett wrote:

> 
> Looks good.
> 

Thanks!

/Leo


From leo.korinth at oracle.com  Fri Aug  9 08:56:11 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Fri, 9 Aug 2019 10:56:11 +0200
Subject: RFR: 8224663: Parallel GC: Use WorkGang (5: ScavengeRootsTask)
In-Reply-To: <22e7c37f-e86b-8ab7-24bc-0737f89be682@oracle.com>
References: <68496c5c-2b3b-37e5-4d02-69fed10f172e@oracle.com>
 <3ea47da5-4b1e-0301-461a-ede9047cb638@oracle.com>
 <8CA8F38A-1B9F-4B7D-902E-846BF7718D2C@oracle.com>
 <2d970c64-0c42-8213-6432-754d73839783@oracle.com>
 <E3F39DF8-79D1-4916-B59C-785A57C9E5BD@oracle.com>
 <7c703ebd-7ebe-5ec0-c2e9-587798b91c28@oracle.com>
 <c169f7c7-ac60-013a-65cd-0a27ede96356@oracle.com>
 <e8ef2f86-aaee-1599-2109-9a8fa8fd9316@oracle.com>
 <22e7c37f-e86b-8ab7-24bc-0737f89be682@oracle.com>
Message-ID: <754b8201-1ebd-2e6c-75c6-6345b499964e@oracle.com>

On 09/08/2019 05:56, Thomas Schatzl wrote:
> Hi Leo,
> 
> psScavenge.cpp:
> 364       // There are not old-to-young pointers if the old gen is empty.
> 
> s/not/no

will fix.

> 
> No need for re-review for fixing the typo. Looks good otherwise.
> 
> Thomas

Thanks!
/Leo


From sgehwolf at redhat.com  Fri Aug  9 09:57:36 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Fri, 09 Aug 2019 11:57:36 +0200
Subject: PING? [8u] RFR: 8135318: CMS wrong max_eden_size for
 check_gc_overhead_limit
In-Reply-To: <cc57413e-ce3b-2627-40d9-bba8db114752@redhat.com>
References: <c3ab08b2d702c6f01d3f2b939bca45cc8ce5c121.camel@redhat.com>
 <5a5800679e314ee3b5545323bb156eff5475b091.camel@redhat.com>
 <cc57413e-ce3b-2627-40d9-bba8db114752@redhat.com>
Message-ID: <a52ce5b2c02710ed4db49ade87c4a2f72c9d5f90.camel@redhat.com>

On Wed, 2019-08-07 at 17:47 +0200, Aleksey Shipilev wrote:
> On 8/7/19 4:36 PM, Severin Gehwolf wrote:
> > On Tue, 2019-07-09 at 14:37 +0200, Severin Gehwolf wrote:
> > > Bug: https://bugs.openjdk.java.net/browse/JDK-8135318
> > > webrev: http://cr.openjdk.java.net/~sgehwolf/webrevs/JDK-8135318/01/webrev/
> > > JDK 9 change: http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/dd0c55eac358
> 
> Backport looks good.

Thanks for the review!

> Looks like a simple fix without any bug tail. So the code in 8u would now compute eden size as
> "max_young - 2*survivor", which makes sense: eden, plus from/to survivor spaces. Before it computed
> it as "max_young - 3*survivor", which is erroneous. (It is weird DefNewGeneration::max_capacity()
> discounts survivor space size, leading to this problem...) Right?

That's my understanding. It's been there since 2007 "Initial load" :)
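
To make the arithmetic concrete, here is a small standalone sketch with made-up sizes. The struct and member names are hypothetical and this is not the HotSpot code; it only mirrors the description above, where max_capacity() already discounts one survivor space and the old computation then subtracted two more.

```cpp
#include <cstddef>

// Hypothetical model of the young gen sizes discussed above.
struct YoungGenSketch {
  std::size_t max_young;  // total young gen: eden + from + to
  std::size_t survivor;   // size of ONE survivor space

  // As described in the review: one survivor space is already discounted.
  std::size_t max_capacity() const { return max_young - survivor; }

  // Before the fix (as described): two survivors subtracted from a value
  // that had already lost one, i.e. max_young - 3 * survivor.
  std::size_t max_eden_before() const { return max_capacity() - 2 * survivor; }

  // After the fix: eden is what remains once from and to are taken out,
  // i.e. max_young - 2 * survivor.
  std::size_t max_eden_after() const { return max_young - 2 * survivor; }
};
```

With max_young = 100 and survivor = 10 (arbitrary units), the old computation yields 70 while the fixed one yields 80, so the limit check was working with an eden that was one survivor space too small.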

Thanks,
Severin


From erik.osterlund at oracle.com  Fri Aug  9 09:59:38 2019
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Fri, 9 Aug 2019 11:59:38 +0200
Subject: RFR: 8224820: ZGC: Support discontiguous heap reservations
Message-ID: <5567c0b8-00e6-29f0-12c6-e067c924fdce@oracle.com>

Hi,

Today ZGC reserves a huge chunk of virtual address space when the JVM 
starts. This typically succeeds because we grab the VA before anyone 
else has time to do so. But if some ASLR library or something was to 
grab a tiny part of the desired VA, ZGC can't start. We should support 
discontiguous heap reservations to support this.

On Linux, by default, this does not happen. But on OS X, it does happen 
relatively frequently. So we need to fix this to allow a Mac port.

This patch implements a recursive algorithm for finding holes at 2MB 
granularities in the normally contiguous reservation when initializing 
the heap, removing them from our VA.
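
For readers who want the gist without opening the webrev, the recursion can be sketched roughly as below. This is a standalone illustration, not the patch: reserve_around_holes, TryReserve, make_fake_os and the fake "OS" that refuses ranges overlapping occupied granules are all invented for the example. Only the 2MB granularity and the "split and retry around holes" idea come from the description above.

```cpp
#include <cstdint>
#include <functional>
#include <set>
#include <utility>
#include <vector>

constexpr std::uint64_t kGranule = 2ull * 1024 * 1024;  // 2MB granularity

using Range = std::pair<std::uint64_t, std::uint64_t>;  // {start, size}
using TryReserve = std::function<bool(std::uint64_t start, std::uint64_t size)>;

// Try to reserve [start, start + size), where size is a multiple of kGranule.
// If the whole range cannot be reserved, split it at a granule-aligned
// midpoint and retry each half; a single granule that fails is a hole and is
// skipped. Returns the number of bytes actually reserved.
std::uint64_t reserve_around_holes(std::uint64_t start, std::uint64_t size,
                                   const TryReserve& try_reserve,
                                   std::vector<Range>& reserved) {
  if (size == 0) {
    return 0;
  }
  if (try_reserve(start, size)) {
    reserved.push_back({start, size});
    return size;
  }
  if (size < 2 * kGranule) {
    return 0;  // cannot split any further: this granule is a hole
  }
  const std::uint64_t first = (size / 2 / kGranule) * kGranule;
  // Evaluate the halves in address order so 'reserved' stays sorted.
  const std::uint64_t lo = reserve_around_holes(start, first, try_reserve, reserved);
  const std::uint64_t hi =
      reserve_around_holes(start + first, size - first, try_reserve, reserved);
  return lo + hi;
}

// Fake OS for illustration: a reservation fails if it overlaps any granule
// index listed in occupied_granules.
TryReserve make_fake_os(std::set<std::uint64_t> occupied_granules) {
  return [occupied = std::move(occupied_granules)](std::uint64_t start,
                                                   std::uint64_t size) {
    for (std::uint64_t addr = start; addr < start + size; addr += kGranule) {
      if (occupied.count(addr / kGranule) != 0) {
        return false;
      }
    }
    return true;
  };
}
```

In this sketch adjacent successful ranges are not coalesced, so a single hole in an 8-granule range leaves three separate reservations. The cost is a logarithmic number of failed attempts per hole, which seems acceptable for something that only runs at heap initialization.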

This patch depends on 8224815, which depends on 8229189 and 8229278. 
They are all out for review.

Webrev:
http://cr.openjdk.java.net/~eosterlund/8224820/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8224820

Thanks,
/Erik


From shade at redhat.com  Fri Aug  9 10:47:54 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Fri, 9 Aug 2019 12:47:54 +0200
Subject: RFR (XS) 8229350: Shenandoah does not need barriers before CreateEx
Message-ID: <63b00f0f-b31c-66fb-61b7-dc4c60996adc@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8229350

Fix:

diff -r 2e38a71e6038 src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp
--- a/src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp  Fri Aug 09 03:51:20 2019 +0200
+++ b/src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp  Fri Aug 09 12:18:09 2019 +0200
@@ -3091,10 +3091,12 @@
     case Op_CMoveP:
       return needs_barrier_impl(phase, n->in(2), visited) ||
              needs_barrier_impl(phase, n->in(3), visited);
     case Op_ShenandoahEnqueueBarrier:
       return needs_barrier_impl(phase, n->in(1), visited);
+    case Op_CreateEx:
+      return false;
     default:
       break;
   }
 #ifdef ASSERT
   tty->print("need barrier on?: ");


Testing: {x86_64, x86_32} x {hotspot_gc_shenandoah, vmTestbase_vm_mlvm}

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Fri Aug  9 10:54:44 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 9 Aug 2019 12:54:44 +0200
Subject: RFR (XS) 8229350: Shenandoah does not need barriers before
 CreateEx
In-Reply-To: <63b00f0f-b31c-66fb-61b7-dc4c60996adc@redhat.com>
References: <63b00f0f-b31c-66fb-61b7-dc4c60996adc@redhat.com>
Message-ID: <7438200e-7d51-c424-fddd-31f25cd9724e@redhat.com>

Ok! Thanks!

Roman


> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8229350
> 
> Fix:
> 
> diff -r 2e38a71e6038 src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp
> --- a/src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp  Fri Aug 09 03:51:20 2019 +0200
> +++ b/src/hotspot/share/gc/shenandoah/c2/shenandoahSupport.cpp  Fri Aug 09 12:18:09 2019 +0200
> @@ -3091,10 +3091,12 @@
>      case Op_CMoveP:
>        return needs_barrier_impl(phase, n->in(2), visited) ||
>               needs_barrier_impl(phase, n->in(3), visited);
>      case Op_ShenandoahEnqueueBarrier:
>        return needs_barrier_impl(phase, n->in(1), visited);
> +    case Op_CreateEx:
> +      return false;
>      default:
>        break;
>    }
>  #ifdef ASSERT
>    tty->print("need barrier on?: ");
> 
> 
> Testing: {x86_64, x86_32} x {hotspot_gc_shenandoah, vmTestbase_vm_mlvm}
> 


From rkennke at redhat.com  Fri Aug  9 12:40:24 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Fri, 9 Aug 2019 14:40:24 +0200
Subject: RFR: 8228369: Shenandoah: Refactor LRB C1 stubs
In-Reply-To: <72b9fa73-9ae4-7322-09a4-8e66058043db@redhat.com>
References: <e83a4469-4939-f435-fc19-a7ef97ac349a@redhat.com>
 <0788a5a5-3826-6693-608f-b48dd5805ef9@redhat.com>
 <72b9fa73-9ae4-7322-09a4-8e66058043db@redhat.com>
Message-ID: <29ecb1b9-4de5-2e26-22a3-f371a61f1e1a@redhat.com>

Hi Aleksey,

Finally found time to pick this up again.

> On 7/19/19 4:27 PM, Roman Kennke wrote:
>> Please review this updated patch:
>>
>> http://cr.openjdk.java.net/~rkennke/JDK-8228369/webrev.01/
> 
> Brief look:
> 
> *) The comment here is inconsistent with the implementation: mentions "not_resolved", while argument
> is "resolved": shenandoahBarrierSetAssembler_aarch64.cpp
> 
>  485 // Generate check if object is resolved. Branch to not_resolved label, if not. Otherwise return
> resolved
>  486 // object in obj register.
>  487 // obj: object, resolved object on normal return
>  488 // tmp1, tmp2: temp registers, trashed
>  489 void ShenandoahBarrierSetAssembler::gen_resolved_check(MacroAssembler* masm, Register obj,
> Register tmp1, Register tmp2, Label& resolved) {
>  490   __ mov(tmp2, obj);
>  491   resolve_forward_pointer_not_null(masm, obj, tmp1);
>  492   __ cmp(tmp2, obj);
>  493   __ br(Assembler::EQ, resolved);
>  494 }

All that stuff goes away with this:

> *) Is there any actual benefit for separating gen_cset_check and gen_resolved_check? It seems to
> have only two uses, and copy-paste looks like less evil.

Ok. I usually don't like to keep two identical code paths around, but it
doesn't seem catastrophic here, plus it allows for some tiny
improvements in branching.

> *) In x86 code, can't you use resolve_forward_pointer too, instead of doing decoding by hand?

The opposite: the x86 is better because it avoids the actual
decoding+comparison+use of extra register. The aarch64 is a result of
laziness where I simply plugged in the original path into it. The new
version is a little more efficient.

Incremental:
http://cr.openjdk.java.net/~rkennke/JDK-8228369/webrev.02.diff/
Full:
http://cr.openjdk.java.net/~rkennke/JDK-8228369/webrev.02/

Better now?

Roman


From charlie.hunt at oracle.com  Fri Aug  9 19:24:09 2019
From: charlie.hunt at oracle.com (charlie hunt)
Date: Fri, 9 Aug 2019 14:24:09 -0500
Subject: RFC: JEP: Remove the Concurrent Mark Sweep Garbage Collector
In-Reply-To: <7B7C4630-6E6D-4303-85EB-E4CFF7769210@kodewerk.com>
References: <0148c4d5-ec42-f2b1-d954-bc95b33cfd07@oracle.com>
 <81F15403-E42B-46B0-9BE4-88B8BFE8100F@kodewerk.com>
 <e7ead617-3f0a-1011-ddfe-1fc513eb6c10@oracle.com>
 <7B7C4630-6E6D-4303-85EB-E4CFF7769210@kodewerk.com>
Message-ID: <9ac1c0f7-0479-6d72-2ad4-005c01ec9b44@oracle.com>

Hi Kirk,

A couple questions ...

Could you expand on this statement?

> [....] the big issue is that I've not been able to get the [G1] collector to scale down to the point where it overlaps with the parallel collector. It is this gap that CMS fills.

I am not clear on what you mean when you say "scale down to the 
point where it overlaps with the parallel collector".

Also, you mentioned PeopleSoft as an example application where you do 
not see an alternative collector. Are there other applications you could 
mention as an example, or some common characteristics of this/these 
kind(s) of application? I would like to see if we could either run an 
existing app or construct something that could exhibit your observation(s).

thanks,

charlie

On 8/8/19 12:59 PM, Kirk Pepperdine wrote:
> Hi Claes,
>
> Thanks for your interest.
>
>
>> On Aug 8, 2019, at 10:33 AM, Claes Redestad <claes.redestad at oracle.com> wrote:
>>
>> Hi Kirk,
>>
>> sorry for barging in, but this genuinely piqued my interest.
>>
>> On 2019-08-08 18:37, Kirk Pepperdine wrote:
>>> Hi Thomas,
>>> "In the meantime the Oracle garbage collection team introduced a new garbage collector, ZGC, and Red Hat contributed the Shenandoah collector. Oracle further improved G1, which has been its designated successor since initial introduction in JDK6u14, to a point where we believe there is little reason to use the CMS collector in deployments."
>>> I fear my experience in tuning GC for 1000s of JVMs leaves me at odds with the premise that there is little reason to use CMS. In my experience CMS overheads are nowhere near the level of those seen with G1. This is not just my experience; there are other organizations that have reached the same conclusion. Furthermore, with the removal of CMS we are now recommending that customers consider Parallel GC as it offers a far better experience than G1. Again, I'm not alone in seeing this as a growing trend.
>> Which JDK version were these JVMs running? I'm curious if/how you've
>> taken into account the tuning work and improvements made to G1 in
>> recent releases in your assessments.
> To be fair I've not tested since JDK 12, though I didn't see much in the way of improvement. I know things have improved in 13. However, many clients are running on 8 and will be for some time to come. Others are sticking to LTS releases for whatever reason. But the big issue is that I've not been able to get the collector to scale down to the point where it overlaps with the parallel collector. It is this gap that CMS fills, and it's a very common gap for the apps that we are generally involved with (including a number of apps like PeopleSoft).
>>> Although I do have high hopes for both ZGC and Shenandoah, they are not an option for most sites at this point in time. I would suggest that deprecation of CMS was premature as there was no viable alternative. I would further suggest that removal is also premature as there is still no viable alternative for the majority of workloads that work exceptionally well with CMS.
>> Can you clarify why they aren't options at this point in time? Or why
>> you think they still won't be once CMS is actually removed, be it from
>> JDK 14 or a later release?
> As I stated, at this point in time these collectors are experimental options and thus I can't recommend them for use in production environments. Furthermore, all of the benchmarking suggests that applications will take a 10-15% hit in throughput. I take that to mean a proportionate hit in the power bill.
>
> Most of what I've stated are things that have been stated by myself and others when CMS deprecation was first announced. We did have a meeting while at JavaOne to discuss it. Like I said, since then I've seen improvements, but there just seem to be some overheads that are unavoidable when building a mostly concurrent collector. That is coupled with the fact that, for CMS, there is a range of tenured occupancy where it simply works better than any other option.
>
> Kind regards,
> Kirk
>


From per.liden at oracle.com  Sun Aug 11 09:07:09 2019
From: per.liden at oracle.com (Per Liden)
Date: Sun, 11 Aug 2019 11:07:09 +0200
Subject: ZGC: Fix an incorrect column in GC statistics
In-Reply-To: <2012c86a-f76b-43cb-b2a2-2f5d0d92cd33.albert.th@alibaba-inc.com>
References: <b725cdea-ba33-4278-8d40-1536a66bfe3e.>
 <9bb35b5a-92bd-4c90-877d-04a5868334a4.>
 <2012c86a-f76b-43cb-b2a2-2f5d0d92cd33.albert.th@alibaba-inc.com>
Message-ID: <305e52fa-15b5-1970-f237-3fd9b2095b1e@oracle.com>

Hi,

Nice catch! Change looks good. As far as I can tell there's no JIRA 
entry for this yet, so I'll create one and sponsor your fix.

thanks,
Per

On 2019-08-11 02:16, Hao Tang wrote:
> Hi, I found that column "total" of ZGC log's statistics always reports obviously incorrect average values after running program with ZGC for over 10 hours (36000s). See the GC log below.
> 
> [36000.080s][info][gc,stats    ] === Garbage Collection Statistics =======================================================================================================================
> [36000.080s][info][gc,stats    ]                                                              Last 10s              Last 10m              Last 10h                Total
> [36000.080s][info][gc,stats    ]                                                              Avg / Max             Avg / Max             Avg / Max             Avg / Max
> [36000.080s][info][gc,stats    ]   Collector: Garbage Collection Cycle                     33.269 / 34.686       32.699 / 36.862       32.938 / 60.703        0.000 / 60.703      ms
> [36000.080s][info][gc,stats    ]  Contention: Mark Segment Reset Contention                     0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
> [36000.080s][info][gc,stats    ]  Contention: Mark SeqNum Reset Contention                      0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
> [36000.080s][info][gc,stats    ]  Contention: Relocation Contention                             0 / 0                 0 / 0                 0 / 2277              1 / 2277        ops/s
> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                              0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                                  0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                               0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                                   0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
> [36000.080s][info][gc,stats    ]      Memory: Allocation Rate                                  45 / 48               45 / 48               45 / 90                0 / 90          MB/s
> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Mark                            126 / 130             125 / 140             125 / 164               0 / 164         MB
> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Relocation                       48 / 52               47 / 52               47 / 54                0 / 54          MB
> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Mark                           124 / 130             123 / 138             124 / 158               0 / 158         MB
> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Relocation                      54 / 58               53 / 60               53 / 60                0 / 60          MB
> 
> The bug is caused by mistakenly adding "nsamples" to ZStatSamplerData::_sum. I am applying the patch below to fix the bug. Please give advice on this fix.
> diff -r 8f067351c370 src/hotspot/share/gc/z/zStat.cpp
> --- a/src/hotspot/share/gc/z/zStat.cpp Mon Aug 05 16:27:30 2019 -0700
> +++ b/src/hotspot/share/gc/z/zStat.cpp Fri Aug 09 11:22:58 2019 +0800
> @@ -63,7 +63,7 @@
>    void add(const ZStatSamplerData& new_sample) {
>      _nsamples += new_sample._nsamples;
> -    _sum += new_sample._nsamples;
> +    _sum += new_sample._sum;
>      _max = MAX2(_max, new_sample._max);
>    }
> };
> This patch can enable ZGC to print correct statistics after 10 hours.
> 
> [36000.080s][info][gc,stats    ] === Garbage Collection Statistics =======================================================================================================================
> [36000.080s][info][gc,stats    ]                                                              Last 10s              Last 10m              Last 10h                Total
> [36000.080s][info][gc,stats    ]                                                              Avg / Max             Avg / Max             Avg / Max             Avg / Max
> [36000.080s][info][gc,stats    ]   Collector: Garbage Collection Cycle                     32.794 / 32.985       32.865 / 38.454       33.212 / 80.477       33.212 / 80.477      ms
> [36000.080s][info][gc,stats    ]  Contention: Mark Segment Reset Contention                     0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
> [36000.080s][info][gc,stats    ]  Contention: Mark SeqNum Reset Contention                      0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
> [36000.080s][info][gc,stats    ]  Contention: Relocation Contention                             0 / 0                 0 / 0                 0 / 5978              0 / 5978        ops/s
> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                              0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                                  0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                               0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                                   0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
> [36000.080s][info][gc,stats    ]      Memory: Allocation Rate                                  45 / 46               45 / 48               45 / 92               45 / 92          MB/s
> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Mark                            124 / 126             125 / 148             126 / 164             126 / 164         MB
> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Relocation                       48 / 52               48 / 52               47 / 54               47 / 54          MB
> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Mark                           122 / 124             123 / 148             124 / 158             124 / 158         MB
> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Relocation                      54 / 58               54 / 60               53 / 60               53 / 60          MB
> 
> 
> Best regards,
> Hao Tang
> 
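
The effect of the one-line patch quoted above can be reproduced in isolation. The following is only a minimal sketch: SamplerData, add_buggy(), add_fixed() and total_average() are hypothetical stand-ins, not the real ZStatSamplerData code, and the sample values are made up. It shows why adding the sample count to the sum collapses the average to 1 raw unit while leaving the maximum intact, which would presumably print as 1 or, after unit scaling, as 0 in the "Total" column.

```cpp
#include <algorithm>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for ZStatSamplerData; names are illustrative only.
struct SamplerData {
  uint64_t nsamples = 0;
  uint64_t sum = 0;
  uint64_t max = 0;

  // Pre-patch behaviour: the sample count is added to the sum.
  void add_buggy(const SamplerData& s) {
    nsamples += s.nsamples;
    sum += s.nsamples;
    max = std::max(max, s.max);
  }

  // Patched behaviour: the sum is accumulated from the sum.
  void add_fixed(const SamplerData& s) {
    nsamples += s.nsamples;
    sum += s.sum;
    max = std::max(max, s.max);
  }

  uint64_t avg() const { return nsamples == 0 ? 0 : sum / nsamples; }
};

// Folds `windows` identical sample windows (10 samples, sum 330, max 40)
// into a running total and returns the resulting average.
inline uint64_t total_average(bool fixed, int windows) {
  SamplerData window;
  window.nsamples = 10;
  window.sum = 330;
  window.max = 40;

  SamplerData total;
  for (int i = 0; i < windows; i++) {
    if (fixed) {
      total.add_fixed(window);
    } else {
      total.add_buggy(window);
    }
  }
  assert(total.max == 40);  // the maximum is unaffected by the bug
  return total.avg();
}
```

With three windows, total_average(false, 3) is 30 / 30 = 1, while total_average(true, 3) is 990 / 30 = 33.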


From poonam.bajaj at oracle.com  Sun Aug 11 14:25:42 2019
From: poonam.bajaj at oracle.com (Poonam Parhar)
Date: Sun, 11 Aug 2019 07:25:42 -0700
Subject: RFR 8215523: jstat reports incorrect values for OU for CMS GC
In-Reply-To: <c379ab0d-999a-189a-81f3-ef2ed45ccff6@oracle.com>
References: <64c5a88a-a39d-fedb-7738-998bca61b6a7@oracle.com>
 <4bb04022-9460-3956-cf86-447bafab0bf4@oracle.com>
 <3744fdba-2755-34fb-f0d5-76543a1faa68@redhat.com>
 <4e7eee4a06416761e1ee63be678c71f1426c9adb.camel@oracle.com>
 <c379ab0d-999a-189a-81f3-ef2ed45ccff6@oracle.com>
Message-ID: <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com>

Hello,

The fix for this bug had to be backed out with '8227178: Backout of 
8215523' because it had caused timeout failures for some of the CMS 
tests. Those failures get resolved by adding the following check before 
calling recalculate_used_stable() in CompactibleFreeListSpace::allocate():

  // During GC we do not need to recalculate the stable used value for
  // every allocation in old gen. It is done once at the end of GC instead
  // for performance reasons.
  if (!CMSHeap::heap()->is_gc_active()) {
    recalculate_used_stable();
  }

Please review the updated webrev:
http://cr.openjdk.java.net/~poonam/8215523/webrev.02/

Thanks,
Poonam


On 7/2/19 6:42 AM, Poonam Parhar wrote:
> Hi Aleksey, Thomas,
>
> It wasn't meant to be non-public. I have opened it.
>
> Thanks,
> Poonam
>
> On 7/2/19 3:36 AM, Thomas Schatzl wrote:
>> Hi,
>>
>> On Tue, 2019-07-02 at 10:10 +0200, Aleksey Shipilev wrote:
>>> Hi,
>>>
>>> On 6/21/19 10:30 PM, Poonam Parhar wrote:
>>>> On 6/21/19 12:21 PM, Poonam Parhar wrote:
>>>>> Bug 8215523 <https://bugs.openjdk.java.net/browse/JDK-8215523>:
>>>>> jstat reports incorrect values for
>>>>> OU for CMS GC
>>> This bug is non-public, was it really meant to be?
>>>
>>    there does not seem to be anything confidential in the public areas
>> of the bug. Maybe Poonam can open it after looking at it again, and
>> eventually open it (and add a token "Description" ;) ).
>>
>> Thanks,
>>    Thomas
>>
>


From thomas.schatzl at oracle.com  Mon Aug 12 03:34:46 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 12 Aug 2019 05:34:46 +0200
Subject: RFR 8215523: jstat reports incorrect values for OU for CMS GC
In-Reply-To: <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com>
References: <64c5a88a-a39d-fedb-7738-998bca61b6a7@oracle.com>
 <4bb04022-9460-3956-cf86-447bafab0bf4@oracle.com>
 <3744fdba-2755-34fb-f0d5-76543a1faa68@redhat.com>
 <4e7eee4a06416761e1ee63be678c71f1426c9adb.camel@oracle.com>
 <c379ab0d-999a-189a-81f3-ef2ed45ccff6@oracle.com>
 <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com>
Message-ID: <5cd399da-2186-2ff1-9c8e-1faba4358120@oracle.com>

Hi,

On 11.08.19 16:25, Poonam Parhar wrote:
> Hello,
> 
> The fix for this bug had to be backed out with '8227178: Backout of 
> 8215523' because it had caused timeout failures for some of the CMS 
> tests. Those failures get resolved by adding the following check before 
> calling recalculate_used_stable() in CompactibleFreeListSpace::allocate():
> 
>   // During GC we do not need to recalculate the stable used value for
>   // every allocation in old gen. It is done once at the end of GC instead
>   // for performance reasons.
>   if (!CMSHeap::heap()->is_gc_active()) {
>     recalculate_used_stable();
>   }

   looks good.

For others: the problem with the original 8215523 has been that the 
additional verification in used() (used in recalculate_used_stable()) is 
very slow; since in some cases the heap is resized quite often during 
GC, this caused a significant slowdown.

This is the "point-fix" to work around this problem, as the comment 
indicates. The alternative would have been making a special path for 
old-gen allocations during GC, I think. This was considered too 
complicated.

Poonam tested the change in some small reproducer, causing no more 
significant slowdowns in debug mode (and none in product anyway).

We performed multiple hs-tier1-7 runs with no more failures due to 
timeouts that we had seen earlier.
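
The idea behind the point fix described above can be sketched outside HotSpot. This is only an illustration under stated assumptions: OldSpace, gc_epilogue(), used_calls() and used_calls_for_gc() are hypothetical names, the call counter stands in for the cost of the slow verified used(), and none of this is the actual CompactibleFreeListSpace code. It shows that skipping the refresh while a GC is active turns one expensive call per old-gen allocation into a single call at the end of the collection.

```cpp
#include <cassert>
#include <cstddef>

// Hypothetical model of an old-generation space; not the HotSpot classes.
class OldSpace {
 public:
  // Stand-in for the slow, verified used() computation; counts its calls.
  size_t used() {
    ++_used_calls;
    return _allocated;
  }

  // Refreshes the cached value that monitoring tools would read.
  void recalculate_used_stable() { _used_stable = used(); }

  // Allocation path; skips the refresh while a GC is active.
  void allocate(size_t bytes, bool gc_active) {
    _allocated += bytes;
    if (!gc_active) {
      recalculate_used_stable();
    }
  }

  // Called once when a collection finishes.
  void gc_epilogue() { recalculate_used_stable(); }

  size_t used_stable() const { return _used_stable; }
  int used_calls() const { return _used_calls; }

 private:
  size_t _allocated = 0;
  size_t _used_stable = 0;
  int _used_calls = 0;
};

// Simulates `promotions` old-gen allocations during a GC followed by the
// end-of-GC refresh, and returns how often the slow used() was called.
inline int used_calls_for_gc(int promotions) {
  OldSpace space;
  for (int i = 0; i < promotions; i++) {
    space.allocate(64, /*gc_active=*/true);
  }
  space.gc_epilogue();
  assert(space.used_stable() == static_cast<size_t>(promotions) * 64);
  return space.used_calls();
}
```

In this model, a thousand allocations during a GC cost one used() call in total, whereas each allocation outside a GC still refreshes the cached value immediately.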

> 
> Please review the updated webrev:
> http://cr.openjdk.java.net/~poonam/8215523/webrev.02/

Some procedural issue: this change must be pushed under a new CR that 
still needs to be created, called "[Redo] jstat reports incorrect values 
for OU for CMS GC", linking 8215523. Thanks :)

Thanks,
   Thomas


From sgehwolf at redhat.com  Mon Aug 12 08:22:43 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Mon, 12 Aug 2019 10:22:43 +0200
Subject: RFR 8215523: jstat reports incorrect values for OU for CMS GC
In-Reply-To: <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com>
References: <64c5a88a-a39d-fedb-7738-998bca61b6a7@oracle.com>
 <4bb04022-9460-3956-cf86-447bafab0bf4@oracle.com>
 <3744fdba-2755-34fb-f0d5-76543a1faa68@redhat.com>
 <4e7eee4a06416761e1ee63be678c71f1426c9adb.camel@oracle.com>
 <c379ab0d-999a-189a-81f3-ef2ed45ccff6@oracle.com>
 <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com>
Message-ID: <e448ef906331f8d78ed6145ead4c1f7b131e2904.camel@redhat.com>

Hi,

On Sun, 2019-08-11 at 07:25 -0700, Poonam Parhar wrote:
> Hello,
> 
> The fix for this bug had to be backed out with '8227178: Backout of
> 8215523' because it had caused timeout failures for some of the CMS
> tests. Those failures get  resolved by adding the following check
> before calling recalculate_used_stable() in
> CompactibleFreeListSpace::allocate():
> 
>   // During GC we do not need to recalculate the stable used value for
>   // every allocation in old gen. It is done once at the end of GC instead
>   // for performance reasons.
>   if (!CMSHeap::heap()->is_gc_active()) {
>     recalculate_used_stable();
>   }
> 
> Please review the updated webrev:
> http://cr.openjdk.java.net/~poonam/8215523/webrev.02/

+  // Returns monotonically increasing stable used space bytes for CMS.
+  // This is required for jhat and other memory monitoring tools

jhat has been removed a while ago: jhat => jstat

Aside: Why has a new bug, "Redo: jstat reports incorrect values for OU
for CMS GC", not been filed? It's confusing to look at JDK-8215523, see
it resolved, and find a pushed commit mentioned in the comments. Isn't
filing a new bug what's usually been done for backouts?

Thanks,
Severin


From david.holmes at oracle.com  Mon Aug 12 08:40:38 2019
From: david.holmes at oracle.com (David Holmes)
Date: Mon, 12 Aug 2019 18:40:38 +1000
Subject: RFR 8215523: jstat reports incorrect values for OU for CMS GC
In-Reply-To: <e448ef906331f8d78ed6145ead4c1f7b131e2904.camel@redhat.com>
References: <64c5a88a-a39d-fedb-7738-998bca61b6a7@oracle.com>
 <4bb04022-9460-3956-cf86-447bafab0bf4@oracle.com>
 <3744fdba-2755-34fb-f0d5-76543a1faa68@redhat.com>
 <4e7eee4a06416761e1ee63be678c71f1426c9adb.camel@oracle.com>
 <c379ab0d-999a-189a-81f3-ef2ed45ccff6@oracle.com>
 <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com>
 <e448ef906331f8d78ed6145ead4c1f7b131e2904.camel@redhat.com>
Message-ID: <2eb84736-d28d-0107-f79c-a168faa70b94@oracle.com>

Poonam,

A new bug must be filed to redo the changes originally done under 8215523.

Thanks,
David

On 12/08/2019 6:22 pm, Severin Gehwolf wrote:
> Hi,
> 
> On Sun, 2019-08-11 at 07:25 -0700, Poonam Parhar wrote:
>> Hello,
>>
>> The fix for this bug had to be backed out with '8227178: Backout of
>> 8215523' because it had caused timeout failures for some of the CMS
>> tests. Those failures get  resolved by adding the following check
>> before calling recalculate_used_stable() in
>> CompactibleFreeListSpace::allocate():
>>
>>   // During GC we do not need to recalculate the stable used value for
>>   // every allocation in old gen. It is done once at the end of GC instead
>>   // for performance reasons.
>>   if (!CMSHeap::heap()->is_gc_active()) {
>>     recalculate_used_stable();
>>   }
>>
>> Please review the updated webrev:
>> http://cr.openjdk.java.net/~poonam/8215523/webrev.02/
> 
> +  // Returns monotonically increasing stable used space bytes for CMS.
> +  // This is required for jhat and other memory monitoring tools
> 
> jhat has been removed a while ago: jhat => jstat
> 
> Aside: Why has a new bug, "Redo: jstat reports incorrect values for OU
> for CMS GC", not been filed? It's confusing to look at JDK-8215523, see
> it resolved, and find a pushed commit mentioned in the comments. Isn't
> filing a new bug what's usually been done for backouts?
> 
> Thanks,
> Severin
> 


From erik.osterlund at oracle.com  Mon Aug 12 08:50:49 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Mon, 12 Aug 2019 10:50:49 +0200
Subject: ZGC: Fix an incorrect column in GC statistics
In-Reply-To: <305e52fa-15b5-1970-f237-3fd9b2095b1e@oracle.com>
References: <b725cdea-ba33-4278-8d40-1536a66bfe3e.>
 <9bb35b5a-92bd-4c90-877d-04a5868334a4.>
 <2012c86a-f76b-43cb-b2a2-2f5d0d92cd33.albert.th@alibaba-inc.com>
 <305e52fa-15b5-1970-f237-3fd9b2095b1e@oracle.com>
Message-ID: <07ba71b8-96ad-5cd3-d944-c8123b417aff@oracle.com>

Hi,

Looks good to me too.

/Erik

On 2019-08-11 11:07, Per Liden wrote:
> Hi,
> 
> Nice catch! Change looks good. As far as I can tell there's no JIRA 
> entry for this yet, so I'll create one and sponsor your fix.
> 
> thanks,
> Per
> 
> On 2019-08-11 02:16, Hao Tang wrote:
>> Hi, I found that column "total" of ZGC log's statistics always reports 
>> obviously incorrect average values after running program with ZGC for 
>> over 10 hours (36000s). See the GC log below.
>>
>> [36000.080s][info][gc,stats    ] === Garbage Collection Statistics =======================================================================================================================
>> [36000.080s][info][gc,stats    ]                                                              Last 10s              Last 10m              Last 10h                Total
>> [36000.080s][info][gc,stats    ]                                                              Avg / Max             Avg / Max             Avg / Max             Avg / Max
>> [36000.080s][info][gc,stats    ]   Collector: Garbage Collection Cycle                     33.269 / 34.686       32.699 / 36.862       32.938 / 60.703        0.000 / 60.703      ms
>> [36000.080s][info][gc,stats    ]  Contention: Mark Segment Reset Contention                     0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
>> [36000.080s][info][gc,stats    ]  Contention: Mark SeqNum Reset Contention                      0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
>> [36000.080s][info][gc,stats    ]  Contention: Relocation Contention                             0 / 0                 0 / 0                 0 / 2277              1 / 2277        ops/s
>> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                              0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
>> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                                  0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
>> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                               0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
>> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                                   0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
>> [36000.080s][info][gc,stats    ]      Memory: Allocation Rate                                  45 / 48               45 / 48               45 / 90                0 / 90          MB/s
>> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Mark                            126 / 130             125 / 140             125 / 164               0 / 164         MB
>> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Relocation                       48 / 52               47 / 52               47 / 54                0 / 54          MB
>> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Mark                           124 / 130             123 / 138             124 / 158               0 / 158         MB
>> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Relocation                      54 / 58               53 / 60               53 / 60                0 / 60          MB
>>
>> The bug is caused by mistakenly adding "nsamples" to 
>> ZStatSamplerData::_sum. I am applying the patch below to fix the bug. 
>> Please give advice on this fix.
>> diff -r 8f067351c370 src/hotspot/share/gc/z/zStat.cpp
>> --- a/src/hotspot/share/gc/z/zStat.cpp Mon Aug 05 16:27:30 2019 -0700
>> +++ b/src/hotspot/share/gc/z/zStat.cpp Fri Aug 09 11:22:58 2019 +0800
>> @@ -63,7 +63,7 @@
>>    void add(const ZStatSamplerData& new_sample) {
>>      _nsamples += new_sample._nsamples;
>> -    _sum += new_sample._nsamples;
>> +    _sum += new_sample._sum;
>>      _max = MAX2(_max, new_sample._max);
>>    }
>> };
>> This patch can enable ZGC to print correct statistics after 10 hours.
>>
>> [36000.080s][info][gc,stats    ] === Garbage Collection Statistics =======================================================================================================================
>> [36000.080s][info][gc,stats    ]                                                              Last 10s              Last 10m              Last 10h                Total
>> [36000.080s][info][gc,stats    ]                                                              Avg / Max             Avg / Max             Avg / Max             Avg / Max
>> [36000.080s][info][gc,stats    ]   Collector: Garbage Collection Cycle                     32.794 / 32.985       32.865 / 38.454       33.212 / 80.477       33.212 / 80.477      ms
>> [36000.080s][info][gc,stats    ]  Contention: Mark Segment Reset Contention                     0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
>> [36000.080s][info][gc,stats    ]  Contention: Mark SeqNum Reset Contention                      0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
>> [36000.080s][info][gc,stats    ]  Contention: Relocation Contention                             0 / 0                 0 / 0                 0 / 5978              0 / 5978        ops/s
>> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                              0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
>> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                                  0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
>> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                               0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
>> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                                   0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
>> [36000.080s][info][gc,stats    ]      Memory: Allocation Rate                                  45 / 46               45 / 48               45 / 92               45 / 92          MB/s
>> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Mark                            124 / 126             125 / 148             126 / 164             126 / 164         MB
>> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Relocation                       48 / 52               48 / 52               47 / 54               47 / 54          MB
>> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Mark                           122 / 124             123 / 148             124 / 158             124 / 158         MB
>> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Relocation                      54 / 58               54 / 60               53 / 60               53 / 60          MB
>>
>>
>> Best regards,
>> Hao Tang
>>

From per.liden at oracle.com  Mon Aug 12 09:49:22 2019
From: per.liden at oracle.com (Per Liden)
Date: Mon, 12 Aug 2019 11:49:22 +0200
Subject: ZGC: Fix an incorrect column in GC statistics
In-Reply-To: <07ba71b8-96ad-5cd3-d944-c8123b417aff@oracle.com>
References: <b725cdea-ba33-4278-8d40-1536a66bfe3e.>
 <9bb35b5a-92bd-4c90-877d-04a5868334a4.>
 <2012c86a-f76b-43cb-b2a2-2f5d0d92cd33.albert.th@alibaba-inc.com>
 <305e52fa-15b5-1970-f237-3fd9b2095b1e@oracle.com>
 <07ba71b8-96ad-5cd3-d944-c8123b417aff@oracle.com>
Message-ID: <d1bfc252-69ca-3881-ebf9-6222c45068e2@oracle.com>

Pushed.

https://hg.openjdk.java.net/jdk/jdk/rev/8ebc8f74f2d2

cheers,
Per

On 8/12/19 10:50 AM, Erik Österlund wrote:
> Hi,
> 
> Looks good to me too.
> 
> /Erik
> 
> On 2019-08-11 11:07, Per Liden wrote:
>> Hi,
>>
>> Nice catch! Change looks good. As far as I can tell there's no JIRA 
>> entry for this yet, so I'll create one and sponsor your fix.
>>
>> thanks,
>> Per
>>
>> On 2019-08-11 02:16, Hao Tang wrote:
>>> Hi, I found that column "total" of ZGC log's statistics always 
>>> reports obviously incorrect average values after running program with 
>>> ZGC for over 10 hours (36000s). See the GC log below.
>>>
>>> [36000.080s][info][gc,stats    ] === Garbage Collection Statistics =======================================================================================================================
>>> [36000.080s][info][gc,stats    ]                                                              Last 10s              Last 10m              Last 10h                Total
>>> [36000.080s][info][gc,stats    ]                                                              Avg / Max             Avg / Max             Avg / Max             Avg / Max
>>> [36000.080s][info][gc,stats    ]   Collector: Garbage Collection Cycle                     33.269 / 34.686       32.699 / 36.862       32.938 / 60.703        0.000 / 60.703      ms
>>> [36000.080s][info][gc,stats    ]  Contention: Mark Segment Reset Contention                     0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
>>> [36000.080s][info][gc,stats    ]  Contention: Mark SeqNum Reset Contention                      0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
>>> [36000.080s][info][gc,stats    ]  Contention: Relocation Contention                             0 / 0                 0 / 0                 0 / 2277              1 / 2277        ops/s
>>> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                              0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
>>> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                                  0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
>>> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                               0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
>>> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                                   0 / 0                 0 / 0                 0 / 0                 1 / 0           ops/s
>>> [36000.080s][info][gc,stats    ]      Memory: Allocation Rate                                  45 / 48               45 / 48               45 / 90                0 / 90          MB/s
>>> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Mark                            126 / 130             125 / 140             125 / 164               0 / 164         MB
>>> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Relocation                       48 / 52               47 / 52               47 / 54                0 / 54          MB
>>> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Mark                           124 / 130             123 / 138             124 / 158               0 / 158         MB
>>> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Relocation                      54 / 58               53 / 60               53 / 60                0 / 60          MB
>>>
>>> The bug is caused by mistakenly adding "nsamples" to 
>>> ZStatSamplerData::_sum. I am applying the patch below to fix the bug. 
>>> Please give advice on this fix.
>>> diff -r 8f067351c370 src/hotspot/share/gc/z/zStat.cpp
>>> --- a/src/hotspot/share/gc/z/zStat.cpp Mon Aug 05 16:27:30 2019 -0700
>>> +++ b/src/hotspot/share/gc/z/zStat.cpp Fri Aug 09 11:22:58 2019 +0800
>>> @@ -63,7 +63,7 @@
>>>    void add(const ZStatSamplerData& new_sample) {
>>>      _nsamples += new_sample._nsamples;
>>> -    _sum += new_sample._nsamples;
>>> +    _sum += new_sample._sum;
>>>      _max = MAX2(_max, new_sample._max);
>>>    }
>>> };
>>> This patch can enable ZGC to print correct statistics after 10 hours.
>>>
>>> [36000.080s][info][gc,stats    ] === Garbage Collection Statistics =======================================================================================================================
>>> [36000.080s][info][gc,stats    ]                                                              Last 10s              Last 10m              Last 10h                Total
>>> [36000.080s][info][gc,stats    ]                                                              Avg / Max             Avg / Max             Avg / Max             Avg / Max
>>> [36000.080s][info][gc,stats    ]   Collector: Garbage Collection Cycle                     32.794 / 32.985       32.865 / 38.454       33.212 / 80.477       33.212 / 80.477      ms
>>> [36000.080s][info][gc,stats    ]  Contention: Mark Segment Reset Contention                     0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
>>> [36000.080s][info][gc,stats    ]  Contention: Mark SeqNum Reset Contention                      0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
>>> [36000.080s][info][gc,stats    ]  Contention: Relocation Contention                             0 / 0                 0 / 0                 0 / 5978              0 / 5978        ops/s
>>> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                              0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
>>> [36000.080s][info][gc,stats    ]    Critical: Allocation Stall                                  0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
>>> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                               0.000 / 0.000         0.000 / 0.000         0.000 / 0.000         0.000 / 0.000       ms
>>> [36000.080s][info][gc,stats    ]    Critical: GC Locker Stall                                   0 / 0                 0 / 0                 0 / 0                 0 / 0           ops/s
>>> [36000.080s][info][gc,stats    ]      Memory: Allocation Rate                                  45 / 46               45 / 48               45 / 92               45 / 92          MB/s
>>> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Mark                            124 / 126             125 / 148             126 / 164             126 / 164         MB
>>> [36000.080s][info][gc,stats    ]      Memory: Heap Used After Relocation                       48 / 52               48 / 52               47 / 54               47 / 54          MB
>>> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Mark                           122 / 124             123 / 148             124 / 158             124 / 158         MB
>>> [36000.080s][info][gc,stats    ]      Memory: Heap Used Before Relocation                      54 / 58               54 / 60               53 / 60               53 / 60          MB
>>>
>>>
>>> Best regards,
>>> Hao Tang
>>>
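
For readers following the quoted ZStat patch, the effect of the one-token
change can be shown with a small stand-alone sketch. Everything here is an
assumption made for illustration: SamplerData, add_buggy, add_fixed, average
and merge_repeated are hypothetical names, not the HotSpot code, and the
struct only mimics the three fields touched by the diff.

```cpp
#include <algorithm>
#include <cstdint>

// Simplified stand-in for ZStatSamplerData (illustrative names only).
struct SamplerData {
  std::uint64_t nsamples = 0;
  std::uint64_t sum = 0;
  std::uint64_t max = 0;

  // Shape of the code before the patch: the sample *count* is added to the sum.
  void add_buggy(const SamplerData& s) {
    nsamples += s.nsamples;
    sum += s.nsamples;
    max = std::max(max, s.max);
  }

  // Shape of the code after the patch: the sample *sum* is added to the sum.
  void add_fixed(const SamplerData& s) {
    nsamples += s.nsamples;
    sum += s.sum;
    max = std::max(max, s.max);
  }

  // Integer average over all merged samples; 0 when nothing was merged.
  std::uint64_t average() const {
    return nsamples == 0 ? 0 : sum / nsamples;
  }
};

// Merges the same interval `times` times into an empty accumulator.
SamplerData merge_repeated(const SamplerData& interval, int times, bool fixed) {
  SamplerData total;
  for (int i = 0; i < times; ++i) {
    if (fixed) {
      total.add_fixed(interval);
    } else {
      total.add_buggy(interval);
    }
  }
  return total;
}
```

With an interval of 10 samples summing to 450, the fixed merge keeps the
average at 45, while the buggy merge collapses it to 1 because the sum and
the count grow in lockstep.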

From poonam.bajaj at oracle.com  Mon Aug 12 12:56:01 2019
From: poonam.bajaj at oracle.com (Poonam Parhar)
Date: Mon, 12 Aug 2019 05:56:01 -0700
Subject: RFR 8215523: jstat reports incorrect values for OU for CMS GC
In-Reply-To: <e448ef906331f8d78ed6145ead4c1f7b131e2904.camel@redhat.com>
References: <64c5a88a-a39d-fedb-7738-998bca61b6a7@oracle.com>
 <4bb04022-9460-3956-cf86-447bafab0bf4@oracle.com>
 <3744fdba-2755-34fb-f0d5-76543a1faa68@redhat.com>
 <4e7eee4a06416761e1ee63be678c71f1426c9adb.camel@oracle.com>
 <c379ab0d-999a-189a-81f3-ef2ed45ccff6@oracle.com>
 <5b320fa0-fb15-8509-80d5-fcc7f12bd253@oracle.com>
 <e448ef906331f8d78ed6145ead4c1f7b131e2904.camel@redhat.com>
Message-ID: <e74b8056-d66d-39d6-6eea-4864f1b1c040@oracle.com>

Hello Severin,

On 8/12/19 1:22 AM, Severin Gehwolf wrote:
> Hi,
>
> On Sun, 2019-08-11 at 07:25 -0700, Poonam Parhar wrote:
>> Hello,
>>
>> The fix for this bug had to be backed out with '8227178: Backout of
>> 8215523' because it had caused timeout failures for some of the CMS
>> tests. Those failures are resolved by adding the following check
>> before calling recalculate_used_stable() in
>> CompactibleFreeListSpace::allocate():
>>
>>   // During GC we do not need to recalculate the stable used value for
>>   // every allocation in old gen. It is done once at the end of GC instead
>>   // for performance reasons.
>>   if (!CMSHeap::heap()->is_gc_active()) {
>>     recalculate_used_stable();
>>   }
>>
>>
>> Please review the updated webrev:
>> http://cr.openjdk.java.net/~poonam/8215523/webrev.02/
> +  // Returns monotonically increasing stable used space bytes for CMS.
> +  // This is required for jhat and other memory monitoring tools
>
> jhat was removed a while ago: jhat => jstat
A typo from the previous changes. Will fix it.
>
> Aside: Why has there not been a new bug filed, "Redo: jstat reports
> incorrect values for OU for CMS GC"? It's confusing to look at
> JDK-8215523, see it resolved, and find a pushed commit mentioned in the
> comments. Isn't filing a new bug what's usually been done for backouts?
My mistake. I will file another bug and will then re-submit the review 
request.

Thanks,
Poonam
> Thanks,
> Severin
>


From shade at redhat.com  Mon Aug 12 13:13:17 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 12 Aug 2019 15:13:17 +0200
Subject: RFR: 8228369: Shenandoah: Refactor LRB C1 stubs
In-Reply-To: <29ecb1b9-4de5-2e26-22a3-f371a61f1e1a@redhat.com>
References: <e83a4469-4939-f435-fc19-a7ef97ac349a@redhat.com>
 <0788a5a5-3826-6693-608f-b48dd5805ef9@redhat.com>
 <72b9fa73-9ae4-7322-09a4-8e66058043db@redhat.com>
 <29ecb1b9-4de5-2e26-22a3-f371a61f1e1a@redhat.com>
Message-ID: <0cf34058-26ee-1791-2049-c0bf268d8d20@redhat.com>

On 8/9/19 2:40 PM, Roman Kennke wrote:
> Incremental:
> http://cr.openjdk.java.net/~rkennke/JDK-8228369/webrev.02.diff/
> Full:
> http://cr.openjdk.java.net/~rkennke/JDK-8228369/webrev.02/

*) Note stuff like:
  623   __ blrt(lr, 1, 0, MacroAssembler::ret_type_integral);

...would not compile after AArch64 simulator removal (JDK-8228400). The equivalent is:
  __ blr(lr);

*) In src/hotspot/share/gc/shenandoah/c1/shenandoahBarrierSetC1.cpp,
_load_reference_barrier_rt_code_blob would be left uninitialized with -ShenandoahLoadRefBarrier?

  if (ShenandoahLoadRefBarrier) {
    C1ShenandoahLoadReferenceBarrierCodeGenClosure lrb_code_gen_cl;
    _load_reference_barrier_rt_code_blob = ...
  }

Otherwise looks good.

-- 
Thanks,
-Aleksey


From shade at redhat.com  Mon Aug 12 15:50:09 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 12 Aug 2019 17:50:09 +0200
Subject: RFR (S) 8229416: Shenandoah: Demote or remove
 ShenandoahOptimize*Final optimizations
Message-ID: <2e068144-7d36-8c53-b906-4ad8c1bf801e@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8229416
  https://cr.openjdk.java.net/~shade/8229416/webrev.01/

There are three Shenandoah optimizations at the moment:
 ShenandoahOptimizeStaticFinals (enabled by default)
 ShenandoahOptimizeInstanceFinals (disabled by default)
 ShenandoahOptimizeStableFinals (disabled by default)

The last two are known to break some programs, and they are definitely incorrect in the
post-LRB/post-nofwdptr world, where exposing a from-space object with an unusual markword would
wreak havoc. These should be removed.

The first optimization eliminates barriers on constants, which are handled separately and never
get exposed as from-space objects. We should keep that optimization on, but to aid future debugging,
we would want to keep the flag as diagnostic.

Testing: hotspot_gc_shenandoah {fastdebug,release}

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Mon Aug 12 16:18:54 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 12 Aug 2019 18:18:54 +0200
Subject: RFR: 8228369: Shenandoah: Refactor LRB C1 stubs
In-Reply-To: <0cf34058-26ee-1791-2049-c0bf268d8d20@redhat.com>
References: <e83a4469-4939-f435-fc19-a7ef97ac349a@redhat.com>
 <0788a5a5-3826-6693-608f-b48dd5805ef9@redhat.com>
 <72b9fa73-9ae4-7322-09a4-8e66058043db@redhat.com>
 <29ecb1b9-4de5-2e26-22a3-f371a61f1e1a@redhat.com>
 <0cf34058-26ee-1791-2049-c0bf268d8d20@redhat.com>
Message-ID: <c94a56b4-9fa3-df21-2abe-1c18f78cd893@redhat.com>

Hi Aleksey,

> *) Note stuff like:
>   623   __ blrt(lr, 1, 0, MacroAssembler::ret_type_integral);
> 
> ...would not compile after AArch64 simulator removal (JDK-8228400). The equivalent is:
>   __ blr(lr);

Right. Fixed that.

> *) In src/hotspot/share/gc/shenandoah/c1/shenandoahBarrierSetC1.cpp,
> _load_reference_barrier_rt_code_blob would be left uninitialized with -ShenandoahLoadRefBarrier?
> 
>   if (ShenandoahLoadRefBarrier) {
>     C1ShenandoahLoadReferenceBarrierCodeGenClosure lrb_code_gen_cl;
>     _load_reference_barrier_rt_code_blob = ...
>   }

I added initialization to NULL in the ShBarrierSetC1 constructor, and
an assert != NULL in the accessors for both stubs.
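
The pattern described here (pointer member initialized to NULL, accessor
asserting it is set) could look roughly like the sketch below. This is not
the webrev code: BarrierSetC1Sketch, generate_runtime_stubs and
has_load_reference_barrier_stub are hypothetical names, CodeBlob is an empty
stand-in type, and the standard assert macro replaces HotSpot's own.

```cpp
#include <cassert>
#include <cstddef>

struct CodeBlob {};  // opaque stand-in for HotSpot's CodeBlob (assumption)

class BarrierSetC1Sketch {
  CodeBlob* _load_reference_barrier_rt_code_blob;
  bool _lrb_enabled;  // stands in for the ShenandoahLoadRefBarrier flag

public:
  // The member always starts out as a well-defined null pointer.
  explicit BarrierSetC1Sketch(bool lrb_enabled)
    : _load_reference_barrier_rt_code_blob(nullptr),
      _lrb_enabled(lrb_enabled) {}

  // The stub is only installed when the barrier is enabled.
  void generate_runtime_stubs(CodeBlob* blob) {
    if (_lrb_enabled) {
      _load_reference_barrier_rt_code_blob = blob;
    }
  }

  bool has_load_reference_barrier_stub() const {
    return _load_reference_barrier_rt_code_blob != nullptr;
  }

  // The accessor fails loudly in debug builds instead of returning garbage.
  CodeBlob* load_reference_barrier_rt_code_blob() const {
    assert(_load_reference_barrier_rt_code_blob != nullptr &&
           "load reference barrier stub was not generated");
    return _load_reference_barrier_rt_code_blob;
  }
};
```

With the flag disabled the member stays null, so a stray call to the accessor
trips the assert rather than reading an uninitialized pointer.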

Incremental changes:
http://cr.openjdk.java.net/~rkennke/JDK-8228369/webrev.03.diff/
Full:
http://cr.openjdk.java.net/~rkennke/JDK-8228369/webrev.03/

Good now?

Roman


From shade at redhat.com  Mon Aug 12 17:31:00 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 12 Aug 2019 19:31:00 +0200
Subject: RFR (S) 8229416: Shenandoah: Demote or remove
 ShenandoahOptimize*Final optimizations
In-Reply-To: <4846e265-f2ae-641c-cca6-a02ed6e723f0@kennke.org>
References: <2e068144-7d36-8c53-b906-4ad8c1bf801e@redhat.com>
 <4846e265-f2ae-641c-cca6-a02ed6e723f0@kennke.org>
Message-ID: <efdb28ec-34dc-626f-b183-6cf36ab3a845@redhat.com>

On 8/12/19 6:21 PM, Roman Kennke wrote:
>> The first optimization eliminates barriers on constants, which are handled separately and never
>> get exposed as from-space objects. We should keep that optimization on, but to aid future debugging,
>> we would want to keep the flag as diagnostic.
> 
> I believe this optimization actually does nothing (interesting). C2
> already optimizes access to static-finals to inlined constants, and we
> eliminate barriers on inlined constants. We might want to check & verify
> this, but I strongly suspect this optimization is actually a no-op (at
> least in most/all interesting cases).

Maybe?

I would rather keep the current behavior as is, and treat this as a cleanup. We can remove
ShenandoahOptimizeStaticFinals once we prove it is irrelevant.

-- 
Thanks,
-Aleksey


From martin.doerr at sap.com  Mon Aug 12 17:33:41 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 12 Aug 2019 17:33:41 +0000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model
 platforms
Message-ID: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>

Hi,

I recently noticed that the selection of weak memory model platforms is outdated in the task queue implementation:
s390 is unnecessarily treated as a weak memory model platform.

I could simply fix it by adding "defined S390", but I'd like to get rid of the platform whitelist in the middle of the implementation.

My favorite implementation looks like this:
http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-atomic/webrev.00/

I've moved the selection to the platform files. They define if they have the required property.
I have also cleaned up some PPC64 related stuff which should use the same property.
The change really improves only s390. It's only cleanup for other ones (no functional change).

I'd like to get reviews from GC first. I guess I'll have to get reviews from runtime and compiler afterwards, too.

Thanks and best regards,
Martin


From rkennke at redhat.com  Mon Aug 12 18:10:59 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Mon, 12 Aug 2019 20:10:59 +0200
Subject: RFR (S) 8229416: Shenandoah: Demote or remove
 ShenandoahOptimize*Final optimizations
In-Reply-To: <efdb28ec-34dc-626f-b183-6cf36ab3a845@redhat.com>
References: <2e068144-7d36-8c53-b906-4ad8c1bf801e@redhat.com>
 <4846e265-f2ae-641c-cca6-a02ed6e723f0@kennke.org>
 <efdb28ec-34dc-626f-b183-6cf36ab3a845@redhat.com>
Message-ID: <52dd6418-cf6a-1eb0-a877-01c720fd879f@redhat.com>


>>> The first optimization eliminates barriers on constants, which are handled separately and never
>>> get exposed as from-space objects. We should keep that optimization on, but to aid future debugging,
>>> we would want to keep the flag as diagnostic.
>>
>> I believe this optimization actually does nothing (interesting). C2
>> already optimizes access to static-finals to inlined constants, and we
>> eliminate barriers on inlined constants. We might want to check & verify
>> this, but I strongly suspect this optimization is actually a no-op (at
>> least in most/all interesting cases).
> 
> Maybe?
> 
> I would rather keep the current behavior as is, and treat this as a cleanup. We can remove
> ShenandoahOptimizeStaticFinals once we prove it is irrelevant.

Yes, go!

Thanks,
Roman


From kim.barrett at oracle.com  Mon Aug 12 18:35:30 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 12 Aug 2019 14:35:30 -0400
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
Message-ID: <9C7071B1-C358-461E-BEE2-E2C5955ECFF3@oracle.com>

> On Aug 12, 2019, at 1:33 PM, Doerr, Martin <martin.doerr at sap.com> wrote:
> 
> Hi,
> 
> I recently noticed that the selection of weak memory model platforms is outdated in the task queue implementation:
> s390 is unnecessarily treated as weak memory model platform.
> 
> I could simply fix it by adding "defined S390", but I'd like to get rid of the platform whitelist in the middle of the implementation.
> 
> My favorite implementation looks like this:
> http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-atomic/webrev.00/
> 
> I've moved the selection to the platform files. They define if they have the required property.
> I have also cleaned up some PPC64 related stuff which should use the same property.
> The change really improves only s390. It's only cleanup for other ones (no functional change).
> 
> I'd like to get reviews from GC first. I guess I'll have to get reviews from runtime and compiler afterwards, too.
> 
> Thanks and best regards,
> Martin

Based on the code in pop_global and the comment explaining the need
for the barrier, I wonder whether a full fence is needed. It seems
like having an acquire barrier for _age.get() would be sufficient.
That we don't need an explicit barrier for the listed architectures
also argues that an acquire barrier is sufficient. And then we
wouldn't need the architecture-based conditionalization, as it would
be handled appropriately by the platform-specific implementation of
the acquire barrier.
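
To make the two shapes under discussion concrete, here is a sketch using
standard C++ atomics rather than HotSpot's OrderAccess. QueueSketch, Snapshot
and both function names are hypothetical, and the field types are simplified.
The sketch only contrasts the two code shapes; whether the acquire variant is
actually sufficient for pop_global on every platform is the open question of
this thread and is not settled by it.

```cpp
#include <atomic>
#include <cstdint>

// Simplified stand-in for the task queue's shared indices (assumption).
struct QueueSketch {
  std::atomic<std::uint32_t> age{0};
  std::atomic<std::uint32_t> bottom{0};
};

struct Snapshot {
  std::uint32_t age;
  std::uint32_t bottom;
};

// Shape 1: read age, then a full two-way fence, then read bottom.
inline Snapshot read_with_full_fence(const QueueSketch& q) {
  std::uint32_t a = q.age.load(std::memory_order_relaxed);
  std::atomic_thread_fence(std::memory_order_seq_cst);
  std::uint32_t b = q.bottom.load(std::memory_order_acquire);
  return Snapshot{a, b};
}

// Shape 2: read age with acquire ordering, so the later read of bottom
// cannot be reordered before it; no separate fence and no platform #ifdef.
inline Snapshot read_with_acquire(const QueueSketch& q) {
  std::uint32_t a = q.age.load(std::memory_order_acquire);
  std::uint32_t b = q.bottom.load(std::memory_order_acquire);
  return Snapshot{a, b};
}
```

In the second shape the per-platform cost is decided by how the compiler
lowers the acquire load, which is the point about dropping the
architecture-based conditionalization.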


From shade at redhat.com  Mon Aug 12 19:28:20 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Mon, 12 Aug 2019 21:28:20 +0200
Subject: RFR: 8228369: Shenandoah: Refactor LRB C1 stubs
In-Reply-To: <c94a56b4-9fa3-df21-2abe-1c18f78cd893@redhat.com>
References: <e83a4469-4939-f435-fc19-a7ef97ac349a@redhat.com>
 <0788a5a5-3826-6693-608f-b48dd5805ef9@redhat.com>
 <72b9fa73-9ae4-7322-09a4-8e66058043db@redhat.com>
 <29ecb1b9-4de5-2e26-22a3-f371a61f1e1a@redhat.com>
 <0cf34058-26ee-1791-2049-c0bf268d8d20@redhat.com>
 <c94a56b4-9fa3-df21-2abe-1c18f78cd893@redhat.com>
Message-ID: <3a6329a0-a6ab-fe2c-e7fa-37ad0337e397@redhat.com>

On 8/12/19 6:18 PM, Roman Kennke wrote:
> Incremental changes:
> http://cr.openjdk.java.net/~rkennke/JDK-8228369/webrev.03.diff/
> Full:
> http://cr.openjdk.java.net/~rkennke/JDK-8228369/webrev.03/

Good!

-- 
Thanks,
-Aleksey


From david.holmes at oracle.com  Mon Aug 12 22:46:21 2019
From: david.holmes at oracle.com (David Holmes)
Date: Tue, 13 Aug 2019 08:46:21 +1000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
Message-ID: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>

Hi Martin,

On 13/08/2019 3:33 am, Doerr, Martin wrote:
> Hi,
> 
> I recently noticed that the selection of weak memory model platforms is 
> outdated in the task queue implementation:
> 
> s390 is unnecessarily treated as weak memory model platform.
> 
> I could simply fix it by adding "defined S390", but I'd like to get rid
> of the platform whitelist in the middle of the implementation.
> 
> My favorite implementation looks like this:
> 
> http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-atomic/webrev.00/
> 
> I've moved the selection to the platform files. They define if they have
> the required property.

I find the inversion of the ifdef slightly confusing. I also don't like 
a comment to say we don't have a given property. Wouldn't it be better 
to set CPU_MULTI_COPY_ATOMIC to 0 or 1 as appropriate?
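
A sketch of what I mean, with the caveat that the helper names
(cpu_is_multi_copy_atomic, needs_iriw_barrier) are made up for illustration
and the value 0 is just an example setting for one hypothetical platform:

```cpp
// Stand-in for a platform header: every platform defines the macro
// explicitly, to 1 or to 0, instead of leaving it undefined.
#define CPU_MULTI_COPY_ATOMIC 0  // example value for a hypothetical platform

// A platform that forgets the definition fails to build, rather than
// silently taking the "not defined" branch of an #ifdef.
#ifndef CPU_MULTI_COPY_ATOMIC
#error "each platform must define CPU_MULTI_COPY_ATOMIC as 0 or 1"
#endif

// Shared code tests the value, so both settings read as positive statements.
constexpr bool cpu_is_multi_copy_atomic() {
  return CPU_MULTI_COPY_ATOMIC != 0;
}

// Hypothetical consumer: extra ordering is needed only without the property.
constexpr bool needs_iriw_barrier() {
  return !cpu_is_multi_copy_atomic();
}
```

Shared code can then use `#if CPU_MULTI_COPY_ATOMIC` or the constexpr helper,
and no platform file needs a comment saying that a property is absent.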

> I have also cleaned up some PPC64 related stuff which should use the 
> same property.

Can't comment on ppc64 specifics.

> The change really improves only s390. It's only cleanup for other ones
> (no functional change).

It's not at all obvious to me that the need for the fence() in 
pop_global is directly related to CPU_MULTI_COPY_ATOMIC. I prefer to see 
that define connected only with the IRIW issue as it currently is.

Thanks,
David
-----

> I'd like to get reviews from GC first. I guess I'll have to get reviews
> from runtime and compiler afterwards, too.
> 
> Thanks and best regards,
> 
> Martin
> 


From per.liden at oracle.com  Tue Aug 13 08:42:14 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 13 Aug 2019 10:42:14 +0200
Subject: RFR: 8229451: ZGC: Make some roots invisible to the heap iterator
Message-ID: <dfba9c11-7812-ce9a-e49e-9d20ed7e5785@oracle.com>

JDK-8227226 can temporarily create long[] objects on the heap, which 
later become oop arrays, when the array initialization has been 
completed. This is fine from a sampling/reporting point of view (the 
things done in the MemAllocator::Allocation destructor), since that only 
happens after the final klass pointer has been installed. However, if a 
heap iteration (via ZHeapIterator) happens before the final klass 
pointer has been installed, it will then see the long[]. As far as I can 
tell, this isn't a big deal, unless that heap iteration is out to 
JVMTI-tag all long[] instances. In that case, we tag a long[] which will 
later become an oop array, which seems wrong and potentially 
problematic. To avoid this, we want to be able to hide these roots from 
the heap iterator until the final klass pointer has been installed.

The approach here is that these temporary long[] objects are not kept 
alive in a Handle, but instead treated as a special root in 
ZThreadLocalData, that can optionally be made invisible to the 
ZRootsIterator.
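
As a rough illustration of the idea (not the webrev code): the sketch below
keeps one optional "invisible" root per thread and lets the caller decide
whether the iteration includes it. ThreadLocalDataSketch, collect_roots and
the visit_invisible parameter are hypothetical names, and Object is an empty
stand-in type.

```cpp
#include <cstddef>
#include <vector>

struct Object {};  // stand-in for a heap object (assumption)

// Stand-in for per-thread GC data holding the temporary long[] as a root.
struct ThreadLocalDataSketch {
  const Object* invisible_root = nullptr;
};

// Collects the roots an iteration would visit. The GC proper would pass
// visit_invisible = true so the object stays alive; a heap iterator would
// pass false so it never sees the not-yet-final object.
std::vector<const Object*> collect_roots(
    const std::vector<const Object*>& ordinary_roots,
    const std::vector<ThreadLocalDataSketch>& threads,
    bool visit_invisible) {
  std::vector<const Object*> out(ordinary_roots);
  if (visit_invisible) {
    for (const ThreadLocalDataSketch& t : threads) {
      if (t.invisible_root != nullptr) {
        out.push_back(t.invisible_root);
      }
    }
  }
  return out;
}
```

Once the final klass pointer is installed, the thread would clear its
invisible root and hand the object over through the normal path.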

Bug: https://bugs.openjdk.java.net/browse/JDK-8229451
Webrev: http://cr.openjdk.java.net/~pliden/8229451/webrev.0

/Per


From per.liden at oracle.com  Tue Aug 13 08:49:17 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 13 Aug 2019 10:49:17 +0200
Subject: 8227226: Segmented array clearing for ZGC
In-Reply-To: <192aed90-07cc-6d59-2a5b-06b665630739@oracle.com>
References: <46566545-B860-4C23-9450-860FD1FBC597@amazon.com>
 <5d2d713e-3a8d-3b7d-72a7-7a538a17532b@oracle.com>
 <eb005dec-18c3-8079-1cd1-49b9b9cda907@oracle.com>
 <8DD2FD76-8995-4BCB-A075-3215F466915E@amazon.com>
 <192aed90-07cc-6d59-2a5b-06b665630739@oracle.com>
Message-ID: <3f472891-779e-1b72-db27-73f2df249f07@oracle.com>

Just a follow up. Stefan correctly noted that a heap iteration that 
happens before the final klass pointer has been installed will see the 
temporary long[], which can be problematic if that heap iteration is 
doing JVMTI-tagging on all long[] instances. In that case, that tag will 
eventually end up on the oop array when we later install the final klass 
pointer.

I filed JDK-8229451 to deal with this, and sent out a patch for review.
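
For anyone joining the thread late, the overall scheme from JDK-8227226 can
be sketched as follows. This is a toy model under stated assumptions, not the
patch itself: ArraySketch, KlassSketch and clear_segmented are hypothetical
names, a std::vector stands in for the array payload, and the between_segments
callback stands in for the point where a safepoint could be taken.

```cpp
#include <algorithm>
#include <cstddef>
#include <cstdint>
#include <vector>

enum class KlassSketch { LongArray, ObjArray };

struct ArraySketch {
  KlassSketch klass;
  std::vector<std::uint64_t> words;
};

// Clears the payload in segments of segment_bytes, invoking between_segments
// after each one, and installs the final klass only when clearing is done.
// Returns the number of segments cleared.
template <typename Poll>
std::size_t clear_segmented(ArraySketch& a, std::size_t segment_bytes,
                            Poll between_segments) {
  const std::size_t segment_words =
      std::max<std::size_t>(1, segment_bytes / sizeof(std::uint64_t));
  std::size_t segments = 0;

  // While clearing is in progress the object is typed as a primitive array,
  // so nothing tries to follow its (still uninitialized) elements as oops.
  a.klass = KlassSketch::LongArray;

  for (std::size_t start = 0; start < a.words.size(); start += segment_words) {
    const std::size_t end = std::min(a.words.size(), start + segment_words);
    std::fill(a.words.begin() + static_cast<std::ptrdiff_t>(start),
              a.words.begin() + static_cast<std::ptrdiff_t>(end),
              std::uint64_t{0});
    ++segments;
    between_segments();
  }

  // All elements are now null, so the array may become an oop array.
  a.klass = KlassSketch::ObjArray;
  return segments;
}
```

The window between the first and the last statement of clear_segmented is
exactly where a heap iteration could observe the temporary long[].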

cheers,
Per

On 8/8/19 10:42 AM, Per Liden wrote:
> On 8/7/19 3:55 PM, Sciampacone, Ryan wrote:
>>      > By overriding finish(), the sampling/reporting remains correct and
>>      > unaffected, as it will never see the intermediate long[].
>> I learned something today.  Thank you.
>>
>> For MemAllocator, I think we all agree the flow is locked in a bit too
>> rigidly but this helps with some of the VM/GC assumptions so we end up
>> battling it.  That said I'm with you - if there's a rewrite to be had,
>> it's not in this patch.
>>
>> Otherwise, fwiw lgtm.
> 
> Thanks Ryan!
> 
> (Since you don't have an OpenJDK id I can't add you as Reviewed-by, but 
> Stefan, Erik and you will all be added as Contributed-by)
> 
> cheers,
> Per
> 
>>
>>
>> On 8/7/19, 3:24 AM, "Per Liden" <per.liden at oracle.com> wrote:
>>
>>      Hi again,
>>      On 8/7/19 11:59 AM, Per Liden wrote:
>>      > Hi Ryan,
>>      >
>>      > On 8/7/19 3:05 AM, Sciampacone, Ryan wrote:
>>      >> Although least intrusive, it goes back to some of the earlier
>>      >> complaints about using false in the constructor for do_zero.  It also
>>      >> makes a fair number of
>>      >
>>      > My earlier comment about this was not about passing false to the
>>      > constructor, but the duplication of the _do_zero member, which I thought
>>      > looked a bit odd. In this patch, this was avoided by separating these
>>      > paths already in ZCollectedHeap::array_allocate().
>>      >
>>      >> assumptions (and goes against the hierarchy's intent) on
>>      >> initialization logic to hide in finish().  That said, I agree that is
>>      >> fairly clean - and definitely addresses the missed cases of the
>>      >> earlier webrev.
>>      >>
>>      >
>>      > We've had the same discussions here and concluded that we might want to
>>      > restructure parts of MemAllocator to better accommodate this use case,
>>      > but that overriding finish() seems ok for now. A patch to restructure
>>      > MemAllocator could come later if we think it's needed.
>>      >
>>      >> 2 things,
>>      >>
>>      >> 1. Isn't the substitute_oop_array_klass() check too narrow?  It will
>>      >> only detect types Object[], and not any other type of reference array
>>      >> (such as String[])?  I believe there's a bug here (correct me if I'm
>>      >> wrong).
>>      >
>>      > On the JVM level, Object[], String[] and int[][] all have the same
>>      > Klass, so we should catch them all with this single check.
>>      Sorry, I'm of course wrong here. Changed the check to call
>>      klass->is_objArray_klass() instead. Thanks!
>>      Updated webrev.4 in-place.
>>      cheers,
>>      Per
>>      >
>>      >> 2. I'd want to see an assert() on the sizeof(long) == sizeof(void *)
>>      >> dependency.  I realize what code base this is in but it would be
>>      >> properly defensive.
>>      >
>>      > Sounds good.
>>      >
>>      >>
>>      >> What does the reporting look like in this case?  Is the long[] type
>>      >> reported accepted?  I'm wondering if this depletes some of the
>>      >> simplicity.
>>      >
>>      > By overriding finish(), the sampling/reporting remains correct and
>>      > unaffected, as it will never see the intermediate long[].
>>      >
>>      > Updated webrev:
>>      >
>>      > http://cr.openjdk.java.net/~pliden/8227226/webrev.4
>>      >
>>      > cheers,
>>      > Per
>>      >
>>      >>
>>      >> On 8/2/19, 6:13 AM, "hotspot-gc-dev on behalf of Per Liden"
>>      >> <hotspot-gc-dev-bounces at openjdk.java.net on behalf of
>>      >> per.liden at oracle.com> wrote:
>>      >>
>>      >>      Did some micro-benchmarking (on a Xeon E5-2630) with various segment
>>      >>      sizes between 4K and 512K, and 64K seems to offer a good trade-off. For
>>      >>      a 1G array, the allocation time increases by ~1%, but in exchange the
>>      >>      worst case TTSP drops from ~280ms to ~0.6ms.
>>      >>      Updated webrev using 64K:
>>      >>      http://cr.openjdk.java.net/~pliden/8227226/webrev.3
>>      >>      cheers,
>>      >>      Per
>>      >>      On 8/2/19 11:11 AM, Per Liden wrote:
>>      >>      > Hi Erik,
>>      >>      >
>>      >>      > On 8/1/19 5:56 PM, Erik Osterlund wrote:
>>      >>      >> Hi Per,
>>      >>      >>
>>      >>      >> I like that this approach is unintrusive, does its thing at the right
>>      >>      >> abstraction layer, and also handles medium sized arrays.
>>      >>      >
>>      >>      > It even handles small arrays (i.e. arrays in small zpages) ;)
>>      >>      >
>>      >>      >> Looks good.
>>      >>      >
>>      >>      > Thanks! I'll test various segment sizes and see how that affects
>>      >>      > performance and TTSP.
>>      >>      >
>>      >>      > cheers,
>>      >>      > Per
>>      >>      >
>>      >>      >>
>>      >>      >> Thanks,
>>      >>      >> /Erik
>>      >>      >>
>>      >>      >>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
>>      >>      >>>
>>      >>      >>> Here's an updated webrev that should be complete, i.e. fixes the
>>      >>      >>> issues related to allocation sampling/reporting that I mentioned.
>>      >>      >>> This patch makes MemAllocator::finish() virtual, so that we can do
>>      >>      >>> our thing and install the correct klass pointer before the Allocation
>>      >>      >>> destructor executes. This seems to be the least intrusive way of
>>      >>      >>> doing this.
>>      >>      >>>
>>      >>      >>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
>>      >>      >>>
>>      >>      >>> This passed function testing, but proper benchmarking remains to be
>>      >>      >>> done.
>>      >>      >>>
>>      >>      >>> cheers,
>>      >>      >>> Per
>>      >>      >>>
>>      >>      >>>> On 7/31/19 7:19 PM, Per Liden wrote:
>>      >>      >>>> Hi,
>>      >>      >>>> I found some time to benchmark the "GC clears pages"-approach, and
>>      >>      >>>> it's fairly clear that it's not paying off. So ditching that idea.
>>      >>      >>>> However, I'm still looking for something that would not just do
>>      >>      >>>> segmented clearing of arrays in large zpages. Letting oop arrays
>>      >>      >>>> temporarily be typed arrays while it's being cleared could be an
>>      >>      >>>> option. I did a prototype for that, which looks like this:
>>      >>      >>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
>>      >>      >>>> There's at least one issue here, the code doing allocation sampling
>>      >>      >>>> will see that we allocated long arrays instead of oop arrays, so the
>>      >>      >>>> reporting there will be skewed. That can be addressed if we go down
>>      >>      >>>> this path. The code is otherwise fairly simple and contained. Feel
>>      >>      >>>> free to spot any issues.
>>      >>      >>>> cheers,
>>      >>      >>>> Per
>>      >>      >>>>> On 7/26/19 2:27 PM, Per Liden wrote:
>>      >>      >>>>> Hi Ryan & Erik,
>>      >>      >>>>>
>>      >>      >>>>> I had a look at this and started exploring a slightly different
>>      >>      >>>>> approach. Instead of doing segmented clearing in the allocation path,
>>      >>      >>>>> we can have the concurrent GC thread clear pages when they are
>>      >>      >>>>> reclaimed and not do any clearing in the allocation path at all.
>>      >>      >>>>>
>>      >>      >>>>> That would look like this:
>>      >>      >>>>>
>>      >>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
>>      >>      >>>>>
>>      >>      >>>>> (I've had to temporarily comment out three lines of assert/debug
>>      >>      >>>>> code to make this work)
>>      >>      >>>>>
>>      >>      >>>>> The relocation set selection phase will now be tasked with some
>>      >>      >>>>> potentially expensive clearing work, so we'll want to make that
>>      >>      >>>>> part parallel also.
>>      >>      >>>>>
>>      >>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
>>      >>      >>>>>
>>      >>      >>>>> Moving this work from Java threads onto the concurrent GC threads
>>      >>      >>>>> means we will potentially prolong the RelocationSetSelection and
>>      >>      >>>>> Relocation phases. That might be a trade-off worth doing. In
>>      >>      >>>>> return, we get:
>>      >>      >>>>>
>>      >>      >>>>> * Faster array allocations, as there's now less work done in the
>>      >>      >>>>> allocation path.
>>      >>      >>>>> * This benefits all arrays, not just those allocated in large pages.
>>      >>      >>>>> * No need to consider/tune a "chunk size".
>>      >>      >>>>> * I also tend to think we'll end up with slightly less complex code,
>>      >>      >>>>> that is a bit easier to reason about. Can be debated of course.
>>      >>      >>>>>
>>      >>      >>>>> This approach might also "survive" longer, because the YC scheme
>>      >>      >>>>> we've been loosely thinking about currently requires newly
>>      >>      >>>>> allocated pages to be cleared anyway. It's of course too early to
>>      >>      >>>>> tell if that requirement will stand in the end, but it's possible
>>      >>      >>>>> anyway.
>>      >>      >>>>>
>>      >>      >>>>> I'll need to do some more testing and benchmarking to make sure
>>      >>      >>>>> there's no regression or bugs here. The commented out debug code
>>      >>      >>>>> also needs to be addressed of course.
>>      >>      >>>>>
>>      >>      >>>>> Comments? Other ideas?
>>      >>      >>>>>
>>      >>      >>>>> cheers,
>>      >>      >>>>> Per
>>      >>      >>>>>
>> >> >>>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
>> >> >>>>>>
>> >> >>>>>> Somehow I lost the RFR off the front and started a new thread.
>> >> >>>>>> Now that we're both off vacation I'd like to revisit this.  Can
>> >> >>>>>> you take a look?
>> >> >>>>>>
>> >> >>>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of Sciampacone,
>> >> >>>>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of
>> >> >>>>>> sci at amazon.com> wrote:
>> >> >>>>>>
>> >> >>>>>>       http://cr.openjdk.java.net/~phh/8227226/webrev.01/
>> >> >>>>>>       This shifts away from abusing the constructor do_zero magic
>> >> >>>>>>       in exchange for virtualizing mem_clear() and specializing for
>> >> >>>>>>       the Z version.  It does create a change in mem_clear in that
>> >> >>>>>>       it returns an updated version of mem.  It does create change
>> >> >>>>>>       outside of the Z code however it does feel cleaner.
>> >> >>>>>>       I didn't make a change to PinAllocating - looking at it, the
>> >> >>>>>>       safety of keeping it constructor / destructor based still
>> >> >>>>>>       seemed appropriate to me.  If the objection is to using the
>> >> >>>>>>       sequence numbers to pin (and instead using handles to
>> >> >>>>>>       update) - this to me seems less error prone.  I had
>> >> >>>>>>       originally discussed handles with Stefan but the proposal
>> >> >>>>>>       came down to this which looks much cleaner.
>> >> >>>>>>       On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of
>> >> >>>>>>       Sciampacone, Ryan" <hotspot-gc-dev-bounces at openjdk.java.net
>> >> >>>>>>       on behalf of sci at amazon.com> wrote:
>> >> >>>>>>           1) Yes this was a conscious decision.  There was
>> >> >>>>>>           discussion on determining the optimal point for breakup
>> >> >>>>>>           but given the existing sizes this seemed sufficient.
>> >> >>>>>>           This doesn't preclude the ability to go down that path
>> >> >>>>>>           if it's deemed absolutely necessary.  The path for more
>> >> >>>>>>           complex decisions is now available.
>> >> >>>>>>           2) Agree
>> >> >>>>>>           3) I'm not clear here.  Do you mean effectively going
>> >> >>>>>>           direct to ZHeap and bypassing the single function
>> >> >>>>>>           PinAllocating?  Agree. Otherwise I'll ask you to be a
>> >> >>>>>>           bit clearer.
>> >> >>>>>>           4) Agree
>> >> >>>>>>           5) I initially had the exact same reaction but I played
>> >> >>>>>>           around with a few other versions (including breaking up
>> >> >>>>>>           initialization points between header and body to get the
>> >> >>>>>>           desired results) and this ended up looking correct.
>> >> >>>>>>           I'll try mixing in the mem clearer function again (fresh
>> >> >>>>>>           start) to see if it looks any better.
>> >> >>>>>>           On 7/8/19, 5:49 AM, "Per Liden" <per.liden at oracle.com>
>> >> >>>>>>           wrote:
>> >> >>>>>>               Hi Ryan,
>> >> >>>>>>               A few general comments:
>> >> >>>>>>               1) It looks like this still only works for large
>> >> >>>>>>               pages?
>> >> >>>>>>               2) The log_info stuff should be removed.
>> >> >>>>>>               3) I'm not a huge fan of single-use utilities like
>> >> >>>>>>               PinAllocating, at least not when, IMO, the
>> >> >>>>>>               alternative is more straight forward and less code.
>> >> >>>>>>               4) Please make locals const when possible.
>> >> >>>>>>               5) Duplicating _do_zero looks odd. Injecting a "mem
>> >> >>>>>>               clearer", similar to what Stefan's original patch
>> >> >>>>>>               did, seems worth exploring.
>> >> >>>>>>               cheers,
>> >> >>>>>>               /Per
>> >> >>>>>>               (Btw, I'm on vacation so I might not be
>> >> >>>>>>               super-responsive to emails)
>> >> >>>>>>               On 2019-07-08 12:42, Erik Österlund wrote:
>> >> >>>>>>               > Hi Ryan,
>> >> >>>>>>               >
>> >> >>>>>>               > This looks good in general. Just some stylistic
>> >> >>>>>>               > things...
>> >> >>>>>>               >
>> >> >>>>>>               > 1) In the ZGC project we like the letter 'Z' so
>> >> >>>>>>               > much that we put it in front of everything we
>> >> >>>>>>               > possibly can, including all class names.
>> >> >>>>>>               > 2) We also explicitly state things are private
>> >> >>>>>>               > even though it's bleedingly obvious.
>> >> >>>>>>               >
>> >> >>>>>>               > So:
>> >> >>>>>>               >
>> >> >>>>>>               > class PinAllocating {
>> >> >>>>>>               >   HeapWord* _mem;
>> >> >>>>>>               > public:
>> >> >>>>>>               >
>> >> >>>>>>               > ->
>> >> >>>>>>               >
>> >> >>>>>>               > class ZPinAllocating {
>> >> >>>>>>               > private:
>> >> >>>>>>               >   HeapWord* _mem;
>> >> >>>>>>               >
>> >> >>>>>>               > public:
>> >> >>>>>>               >
>> >> >>>>>>               > I can be your sponsor and push this change for
>> >> >>>>>>               > you. I don't think there is a need for another
>> >> >>>>>>               > webrev for my small stylistic remarks, so I can
>> >> >>>>>>               > just fix that before pushing this for you. On that
>> >> >>>>>>               > note, I'll add me and StefanK to the
>> >> >>>>>>               > contributed-by section as we all worked out the
>> >> >>>>>>               > right solution to this problem collaboratively. I
>> >> >>>>>>               > have run through mach5 tier1-5, and found no
>> >> >>>>>>               > issues with this patch. Thanks, /Erik
>> >> >>>>>>               >
>> >> >>>>>>               > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
>> >> >>>>>>               >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
>> >> >>>>>>               >> https://bugs.openjdk.java.net/browse/JDK-8227226
>> >> >>>>>>               >>
>> >> >>>>>>               >> This patch introduces safe point checks into
>> >> >>>>>>               >> array clearing during allocation for ZGC.  The
>> >> >>>>>>               >> patch isolates the changes to ZGC as (in
>> >> >>>>>>               >> particular with the more modern collectors) the
>> >> >>>>>>               >> approach to incrementalizing or respecting safe
>> >> >>>>>>               >> point checks is going to be different.
>> >> >>>>>>               >>
>> >> >>>>>>               >> The approach is to keep the region holding the
>> >> >>>>>>               >> array in the allocating state (pin logic) while
>> >> >>>>>>               >> updating the color to the array after checks.
>> >> >>>>>>               >>
>> >> >>>>>>               >> Can I get a review?  Thanks.
>> >> >>>>>>               >>
>> >> >>>>>>               >> Ryan
>> >> >>>>>>               >
>> >> >>>>>>
>> >> >>
>> >>
>>
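
For readers skimming the thread, the general idea behind "segmented" clearing discussed in the quoted messages can be sketched as below. This is only an illustration under assumptions: `clear_in_segments`, its parameters, and the `poll` callback (standing in for a safepoint check) are hypothetical names, and none of this is the code from either webrev.

```cpp
#include <algorithm>
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <functional>
#include <vector>

// Hypothetical helper, not HotSpot code: zero `words` 64-bit words starting
// at `mem`, at most `segment_words` at a time, calling `poll` between
// segments. Returns the number of segments that were cleared.
inline std::size_t clear_in_segments(std::uint64_t* mem,
                                     std::size_t words,
                                     std::size_t segment_words,
                                     const std::function<void()>& poll) {
  assert(mem != nullptr || words == 0);
  if (segment_words == 0) {
    segment_words = words;  // degenerate case: clear everything in one go
  }
  std::size_t segments = 0;
  std::size_t done = 0;
  while (done < words) {
    const std::size_t n = std::min(segment_words, words - done);
    std::fill_n(mem + done, n, std::uint64_t{0});
    done += n;
    ++segments;
    if (done < words) {
      poll();  // a real VM would check for a pending safepoint here
    }
  }
  return segments;
}
```

The segment size is the "chunk size" tuning knob mentioned in the list above; the sketch makes no claim about what a good value is.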

From martin.doerr at sap.com  Tue Aug 13 10:30:20 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Tue, 13 Aug 2019 10:30:20 +0000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
Message-ID: <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>

Hi Kim and David,

thank you for looking into this issue.

@Kim:
I've replied to your comment in the issue.

@David:
> I find the inversion of the ifdef slightly confusing. I also don't like
> a comment to say we don't have a given property. Wouldn't it be better
> to set CPU_MULTI_COPY_ATOMIC to 0 or 1 as appropriate?
Hmm. We could change that. I'm not sure what is better.
I think it should be designed such that correct usage is easy and wrong usage is difficult.

It has already happened that people used an #ifdef for a macro which is always defined (0 or 1) by mistake.
That's why I'm not a big fan of defining things to 0 or 1.

With the #define or not define approach, all platforms except those which explicitly specify the property are conservatively treated as non-multi-copy atomic.

But if your version is preferred by all reviewers, I can use it.
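
To make the trade-off concrete, here is a minimal sketch of the two macro styles. DEMO_FEATURE_A and DEMO_FEATURE_B are made-up names for illustration only; they are not the macros from the patch.

```cpp
#include <cassert>

// Style A: the macro is always defined, to 0 or 1. Only `#if` is correct;
// `#ifdef` compiles fine but silently gives the wrong answer.
#define DEMO_FEATURE_A 0  // hypothetical: the property is NOT present

// Style B: the macro is defined only on platforms that have the property.
// DEMO_FEATURE_B is deliberately left undefined here.

inline bool style_a_checked_with_if() {
#if DEMO_FEATURE_A
  return true;
#else
  return false;
#endif
}

inline bool style_a_checked_with_ifdef() {
#ifdef DEMO_FEATURE_A  // the mistake: true even though the value is 0
  return true;
#else
  return false;
#endif
}

inline bool style_b_checked_with_ifdef() {
#ifdef DEMO_FEATURE_B  // undefined, so the conservative branch is taken
  return true;
#else
  return false;
#endif
}
```

With style B, `#if DEMO_FEATURE_B` would also take the conservative branch, because an undefined identifier evaluates to 0 in a preprocessor condition, so both spellings behave the same.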


> Can't comment on ppc64 specifics.
I'll ask for additional reviews once the main issue has been reviewed.


> It's not at all obvious to me that the need for the fence() in
> pop_global is directly related to CPU_MULTI_COPY_ATOMIC. I prefer to see
> that define connected only with the IRIW issue as it currently is.
This was explained in the email thread a few emails later:
http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-March/008853.html


Best regards,
Martin


From rkennke at redhat.com  Tue Aug 13 10:55:25 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 13 Aug 2019 12:55:25 +0200
Subject: RFR: 8229470: Shenandoah: Fix C1 getAndSetObject() failure
Message-ID: <9ce79688-130d-9542-6731-42d6e21b3ab4@redhat.com>

JDK-8228369 broke C1's getAndSetObject() intrinsic with Shenandoah:

# Internal Error (/home/shade/trunks/jdk-jdk/src/hotspot/share/c1/c1_LIRGenerator.hpp:224), pid=20194, tid=20205
# assert(!opr->is_register() || opr->is_virtual()) failed: should never set result to a physical register

V [libjvm.so+0x78417e] LIRGenerator::set_result(Instruction*, LIR_OprDesc*)+0x12e
V [libjvm.so+0x782a67] LIRGenerator::do_UnsafeGetAndSetObject(UnsafeGetAndSetObject*)+0xe7
V [libjvm.so+0x76b043] LIRGenerator::do_root(Instruction*)+0x93

Fix:
http://cr.openjdk.java.net/~rkennke/JDK-8229470/webrev.00/
Testing: failed testcase, hotspot_gc_shenandoah

Ok?

Roman


From shade at redhat.com  Tue Aug 13 11:03:08 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 13 Aug 2019 13:03:08 +0200
Subject: RFR: 8229470: Shenandoah: Fix C1 getAndSetObject() failure
In-Reply-To: <9ce79688-130d-9542-6731-42d6e21b3ab4@redhat.com>
References: <9ce79688-130d-9542-6731-42d6e21b3ab4@redhat.com>
Message-ID: <6e1ba4aa-f5b4-cb8d-15ee-df72043b08ca@redhat.com>

On 8/13/19 12:55 PM, Roman Kennke wrote:
> Fix:
> http://cr.openjdk.java.net/~rkennke/JDK-8229470/webrev.00/
> Testing: failed testcase, hotspot_gc_shenandoah

It fixes the failure for me.

But how does this fix work? Does that failure happen in the pre_barrier path? If so, should it be moved
under the ShenandoahSATBBarrier branch, or into the pre_barrier itself?

-- 
Thanks,
-Aleksey


From shade at redhat.com  Tue Aug 13 12:48:02 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 13 Aug 2019 14:48:02 +0200
Subject: RFR: 8229470: Shenandoah: Fix C1 getAndSetObject() failure
In-Reply-To: <f00d45e1-bdd1-4ffc-ae53-e5271a0dfe45@kennke.org>
References: <9ce79688-130d-9542-6731-42d6e21b3ab4@redhat.com>
 <6e1ba4aa-f5b4-cb8d-15ee-df72043b08ca@redhat.com>
 <f00d45e1-bdd1-4ffc-ae53-e5271a0dfe45@kennke.org>
Message-ID: <9e125b73-fe99-cedd-b756-28ed243bf119@redhat.com>

On 8/13/19 1:28 PM, Roman Kennke wrote:
>>> Fix:
>>> http://cr.openjdk.java.net/~rkennke/JDK-8229470/webrev.00/
>>> Testing: failed testcase, hotspot_gc_shenandoah
>>
>> It fixes the failure for me.
>>
>> But how does this fix work? Does that failure happen at pre_barrier path? If so, should it be moved
>> under ShenandoahSATBBarrier branch, or into the pre_barrier itself?
> 
> The intrinsic code doesn't like a physical register in set_result().
> However, yesterday I changed to:
> 
>   LIR_Opr result = gen->result_register_for(obj->value_type());
> 
> in load_reference_barrier(), which allocates a physical reg (rax on x86,
> r0 on aarch64). This is done so that we have the result reg in the right
> place for the runtime call, which returns the oop in that register. This
> avoids some register shuffling in the LRB stub. However, it doesn't work
> for xchg, and this fix allocates a (virtual) tmp reg, and copies the
> result into that. 

OK, fine.

-- 
Thanks,
-Aleksey
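
The register explanation quoted above can be modelled with a toy generator. This is a sketch under loud assumptions: `ToyOpr` and `ToyGenerator` are invented for illustration and only mimic the rule from the assertion (a result must not be a physical register); they are not C1's LIR classes, and no real move instruction is emitted.

```cpp
#include <cassert>

// Toy stand-in for a C1 LIR operand; not the real LIR_Opr.
struct ToyOpr {
  int  number;
  bool is_virtual;
};

class ToyGenerator {
  int    _next_virtual = 100;
  ToyOpr _result{-1, true};
  bool   _has_result = false;

public:
  // Fixed register in which a runtime call hands back its value
  // (the role rax / r0 play in the explanation above).
  ToyOpr result_register() const { return ToyOpr{0, false}; }

  ToyOpr new_register() { return ToyOpr{_next_virtual++, true}; }

  // Mirrors the failing assertion: a physical register is rejected.
  bool set_result(ToyOpr opr) {
    if (!opr.is_virtual) {
      return false;
    }
    _result = opr;
    _has_result = true;
    return true;
  }

  // The shape of the fix: allocate a fresh virtual register for the value
  // held in the physical one and publish that instead.
  bool set_result_via_tmp(ToyOpr physical) {
    ToyOpr tmp = new_register();  // a real generator would also emit a move
    (void)physical;
    return set_result(tmp);
  }

  bool   has_result() const { return _has_result; }
  ToyOpr result() const { return _result; }
};
```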


From zgu at redhat.com  Tue Aug 13 14:46:08 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 13 Aug 2019 10:46:08 -0400
Subject: RFR 8229474: Shenandoah: Cleanup CM::update_roots()
Message-ID: <b3a1f71f-aa20-2080-ff4f-be9a256b94fd@redhat.com>

After the recent root processor refactoring, only two phases need full
root updates, and both require updating code roots. Let's clean this up
and remove the always-true parameter of ShenandoahRootUpdater.


Bug: https://bugs.openjdk.java.net/browse/JDK-8229474
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8229474/webrev.00/

Test:
   hotspot_gc_shenandoah (fastdebug and release)

Thanks,

-Zhengyu


From shangxinli at hotmail.com  Tue Aug 13 15:01:51 2019
From: shangxinli at hotmail.com (shang xinli)
Date: Tue, 13 Aug 2019 15:01:51 +0000
Subject: G1GC is not able reclaim bytes 
Message-ID: <CY4PR19MB14772263A32B99446C81A93CC6D20@CY4PR19MB1477.namprd19.prod.outlook.com>

Two back-to-back full GCs were unable to reclaim any memory. Here are the GC logs:

 [Times: user=0.35 sys=0.00, real=0.03 secs]
2019-08-01T15:12:14.594+0000: 755123.074: [Full GC (Allocation Failure)  132G->132G(180G), 37.4525886 secs]
   [Eden: 0.0B(9216.0M)->0.0B(9216.0M) Survivors: 0.0B->0.0B Heap: 132.1G(180.0G)->132.1G(180.0G)], [Metaspace: 204550K->204459K(229376K)]
 [Times: user=68.14 sys=0.07, real=37.45 secs]
2019-08-01T15:12:52.048+0000: 755160.527: [Full GC (Allocation Failure)  132G->132G(180G), 37.4584894 secs]
   [Eden: 0.0B(9216.0M)->0.0B(9216.0M) Survivors: 0.0B->0.0B Heap: 132.1G(180.0G)->132.1G(180.0G)], [Metaspace: 204459K->204451K(229376K)]
 [Times: user=68.38 sys=0.05, real=37.46 secs]

Here is the chart. We can see that the heap occupancy is almost the same before and after each GC.
[chart: heap occupancy before and after GC; image not preserved in the archive]

The service eventually runs out of memory (OOM), but we have not configured the JVM to generate a heap dump. Is there a way to investigate why?


From rkennke at redhat.com  Tue Aug 13 16:23:37 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 13 Aug 2019 18:23:37 +0200
Subject: RFR 8229474: Shenandoah: Cleanup CM::update_roots()
In-Reply-To: <b3a1f71f-aa20-2080-ff4f-be9a256b94fd@redhat.com>
References: <b3a1f71f-aa20-2080-ff4f-be9a256b94fd@redhat.com>
Message-ID: <9faac5cd-6618-5077-9b98-5c0cdd474208@redhat.com>

Makes sense, looks good!

Thanks,
Roman


> After recent root processor refactoring, there are only two phases that
> need full root updates and both require to update code roots. Let's
> clean it up and remove always true parameter of ShenandoahRootUpdater.
> 
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229474
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8229474/webrev.00/
> 
> Test:
>   hotspot_gc_shenandoah (fastdebug and release)
> 
> Thanks,
> 
> -Zhengyu


From erik.osterlund at oracle.com  Wed Aug 14 07:57:26 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Wed, 14 Aug 2019 09:57:26 +0200
Subject: RFR: 8229451: ZGC: Make some roots invisible to the heap iterator
In-Reply-To: <dfba9c11-7812-ce9a-e49e-9d20ed7e5785@oracle.com>
References: <dfba9c11-7812-ce9a-e49e-9d20ed7e5785@oracle.com>
Message-ID: <EE263944-8EF1-4C89-81FE-D56410B66950@oracle.com>

Hi Per,

Unfortunate with another special path in the root iterator. But alternatives also look bad. Looks good.

Thanks,
/Erik

> On 13 Aug 2019, at 10:42, Per Liden <per.liden at oracle.com> wrote:
> 
> JDK-8227226 can temporarily create long[] objects on the heap, which later become oop arrays, when the array initialization has been completed. This is fine from a sampling/reporting point of view (the things done in the MemAllocator::Allocation destructor), since that only happens after the final klass pointer has been installed. However, if a heap iteration (via ZHeapIterator) happens before the final klass pointer has been installed, it will then see the long[]. As far as I can tell, this isn't a big deal, unless that heap iteration is out to JVMTI-tag all long[] instances. In that case, we tag a long[] which will later become an oop array, which seems wrong and potentially problematic. To avoid this, we want to be able to hide these roots from the heap iterator until the final klass pointer has been installed.
> 
> The approach here is that these temporary long[] objects are not kept alive in a Handle, but instead treated as a special root in ZThreadLocalData, that can optionally be made invisible to the ZRootsIterator.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229451
> Webrev: http://cr.openjdk.java.net/~pliden/8229451/webrev.0
> 
> /Per
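
The approach described in the quoted message can be pictured with a small model. This is a hedged sketch only: `ToyObject`, `ToyThreadLocalData` and `visit_roots` are hypothetical names, not ZThreadLocalData or ZRootsIterator, and the real patch may be structured quite differently.

```cpp
#include <cassert>
#include <functional>
#include <vector>

struct ToyObject { int id; };

// Per-thread slot for an array that is still being initialized, if any.
struct ToyThreadLocalData {
  ToyObject* invisible_root = nullptr;
};

// Always visits the ordinary roots; visits the per-thread "invisible" roots
// only when the caller asks for them. Returns the number of roots visited.
inline int visit_roots(const std::vector<ToyObject*>& ordinary_roots,
                       const std::vector<ToyThreadLocalData>& threads,
                       bool visit_invisible,
                       const std::function<void(ToyObject*)>& visitor) {
  int visited = 0;
  for (ToyObject* o : ordinary_roots) {
    visitor(o);
    ++visited;
  }
  if (visit_invisible) {
    for (const ToyThreadLocalData& t : threads) {
      if (t.invisible_root != nullptr) {
        visitor(t.invisible_root);
        ++visited;
      }
    }
  }
  return visited;
}
```

In this model a collector pass would call it with `visit_invisible = true` so the half-initialized array stays alive, while a heap iteration would pass `false` and never see the temporary long[].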


From per.liden at oracle.com  Wed Aug 14 08:27:12 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 14 Aug 2019 10:27:12 +0200
Subject: RFR: 8229451: ZGC: Make some roots invisible to the heap iterator
In-Reply-To: <EE263944-8EF1-4C89-81FE-D56410B66950@oracle.com>
References: <dfba9c11-7812-ce9a-e49e-9d20ed7e5785@oracle.com>
 <EE263944-8EF1-4C89-81FE-D56410B66950@oracle.com>
Message-ID: <dd0a4ef5-a545-224b-998d-ca3ca05a75ef@oracle.com>

Thanks Erik!

I agree that another path in ZRootIterator is unfortunate, but the 
alternatives I've managed to come up with tend to be worse.

/Per

On 8/14/19 9:57 AM, Erik Osterlund wrote:
> Hi Per,
> 
> Unfortunate with another special path in the root iterator. But alternatives also look bad. Looks good.
> 
> Thanks,
> /Erik
> 
>> On 13 Aug 2019, at 10:42, Per Liden <per.liden at oracle.com> wrote:
>>
>> JDK-8227226 can temporarily create long[] objects on the heap, which later become oop arrays, when the array initialization has been completed. This is fine from a sampling/reporting point of view (the things done in the MemAllocator::Allocation destructor), since that only happens after the final klass pointer has been installed. However, if a heap iteration (via ZHeapIterator) happens before the final klass pointer has been installed, it will then see the long[]. As far as I can tell, this isn't a big deal, unless that heap iteration is out to JVMTI-tag all long[] instances. In that case, we tag a long[] which will later become an oop array, which seems wrong and potentially problematic. To avoid this, we want to be able to hide these roots from the heap iterator until the final klass pointer has been installed.
>>
>> The approach here is that these temporary long[] objects are not kept alive in a Handle, but instead treated as a special root in ZThreadLocalData, that can optionally be made invisible to the ZRootsIterator.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229451
>> Webrev: http://cr.openjdk.java.net/~pliden/8229451/webrev.0
>>
>> /Per
> 


From thomas.schatzl at oracle.com  Wed Aug 14 12:55:27 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 14 Aug 2019 14:55:27 +0200
Subject: RFC: JEP: Remove the Concurrent Mark Sweep Garbage Collector
In-Reply-To: <81F15403-E42B-46B0-9BE4-88B8BFE8100F@kodewerk.com>
References: <0148c4d5-ec42-f2b1-d954-bc95b33cfd07@oracle.com>
 <81F15403-E42B-46B0-9BE4-88B8BFE8100F@kodewerk.com>
Message-ID: <bd2aafe9-f4d0-ada0-6089-ba39c854e0a6@oracle.com>

Hi Kirk,

   sorry for the late answer.

On 08.08.19 18:37, Kirk Pepperdine wrote:
> Hi Thomas,
> 
> "In the meantime the Oracle garbage collection team introduced a new 
> garbage collector, ZGC, and Red Hat contributed the Shenandoah 
> collector. Oracle further improved G1, which has been its designated 
> successor since initial introduction in JDK6u14, to a point where we 
> believe there is little reason to use the CMS collector in deployments."
> 
> I fear my experience in tuning GC 1000s of JVMs leaves me at odds with 
> the premise that there is little reason to use CMS. In my experience CMS 

Nobody is in any way guaranteeing that any of the many alternatives will 
be the absolute best garbage collector for all the applications out 
there ever written. That's impossible.

CMS has been available in Hotspot for 15+ years, and we understand that 
there is a lot of experience tuning that garbage collector; we also know 
that there are applications that have been specifically (re-)written to 
work best with CMS. These applications in particular are unlikely to 
work better (for a particular measure of "better") on any other 
collector unless that collector (probably) reimplements significant 
parts of CMS.

However, we think that for these (imho few) applications the overhead of 
at least one of the alternatives is small enough that even they can move 
away from CMS _now_.

I will adapt that paragraph in the JEP to make this more clear. Thanks 
for your feedback.

Looking in the bug tracker for particular issues, during the last two 
years of CMS deprecation we got very little actionable feedback from the 
community on what exactly G1 (in particular) is missing.
All of the examples we have either run better than CMS with some G1 
tuning (e.g. JDK-8062128; with jdk8!) or are simply throughput 
applications for which Parallel (e.g. JDK-8133055) is well suited.

In some other email you mentioned that most people are still at JDK8 
(which we know), and most people are only moving from one LTS version 
to another (which we also know). CMS support is not going away in either 
8 or 11, and 17 will be released in two years.

Since the CMS deprecation we have demonstrated, together with the 
community, that we improved G1 a lot (e.g. [2],[3]). At the recent JVMLS 
there was a workshop that showed a selection of these G1 improvements 
since JDK8 ([0]), along with an imho decent roadmap for the future that, 
among other things, aims to address, with the community, concerns that 
have been described well enough to actually be worked on.

So I would like to reiterate Charlie's request to work with us to allow 
us to analyze these applications that exhibit your observations.

There has also always been the option to organize maintenance of CMS in 
the community, but nobody has even stepped up to start fixing the 
long-standing known minor issues in CMS (which would let contributors get 
to know the CMS code and give us confidence that these persons can take 
over maintenance of such a large component).

> Although I do have high hopes for both ZGC and Shenandoah, they are not 
> an option for most sites at this point in time. I would suggest that 
> deprecation of CMS was premature as there was no viable alternative. I 
> would further suggest that removal is also premature as there is still
> no viable alternative for the majority of workloads that work
> exceptionally well with CMS.

The answer to this is the same as it was two years ago when CMS was 
deprecated: work together with the community to evaluate, understand and 
try to address these issues if viable (e.g. Oracle is unlikely to 
directly reimplement CMS' free list memory management on top of G1 if 
this is the real problem).

As the JEP points out, CMS maintenance (e.g. adapting code to interface 
changes required by improvements to other components, fixing important 
issues, or even small improvements, but mostly testing) takes up a lot 
of resources that we think could be better spent elsewhere.

Doing so seems better to me than relentlessly stating: "There is no 
viable alternative".

Thanks,
   Thomas

[0] http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2019-August/002816.html ; feel free to ask about particulars.
[2] https://www.youtube.com/watch?v=LppgqvKOUKsv
[3] https://www.optaplanner.org/blog/2019/01/17/HowMuchFasterIsJava11.html#table1 ; note that is a pure throughput load


From poonam.bajaj at oracle.com  Wed Aug 14 13:13:53 2019
From: poonam.bajaj at oracle.com (Poonam Parhar)
Date: Wed, 14 Aug 2019 06:13:53 -0700
Subject: RFR 8229420: [Redo] jstat reports incorrect values for OU for CMS GC
Message-ID: <573749a6-2389-6de2-4f2f-abc3b0e1dd6d@oracle.com>

Hello,

The fix for JDK-8215523 
<https://bugs.openjdk.java.net/browse/JDK-8215523> had to be backed out 
with '8227178: Backout of 8215523' because it had caused timeout 
failures for some of the CMS tests.

Changeset of JDK-8215523: 
http://hg.openjdk.java.net/jdk/jdk/rev/734e58d8477b

Those failures get resolved by adding the following check before calling 
recalculate_used_stable() in CompactibleFreeListSpace::allocate():

  // During GC we do not need to recalculate the stable used value for
  // every allocation in old gen. It is done once at the end of GC instead
  // for performance reasons.
  if (!CMSHeap::heap()->is_gc_active()) {
    recalculate_used_stable();
  }
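
To show why the guard helps, here is a small stand-alone model of the pattern. It is a sketch under assumptions: `ToySpace` and its members are invented for illustration and are not CompactibleFreeListSpace; only the idea (skip the per-allocation recalculation while a GC is active, and recalculate once when it ends) is taken from the description above.

```cpp
#include <cassert>
#include <cstddef>

class ToySpace {
  std::size_t _used = 0;             // exact value, changes on every allocation
  std::size_t _used_stable = 0;      // snapshot reported to monitoring tools
  std::size_t _recalculations = 0;   // how often the snapshot was refreshed
  bool        _gc_active = false;

public:
  void recalculate_used_stable() {
    _used_stable = _used;
    ++_recalculations;
  }

  void allocate(std::size_t bytes) {
    _used += bytes;
    // Same shape as the check above: skip the refresh while a GC is running.
    if (!_gc_active) {
      recalculate_used_stable();
    }
  }

  void begin_gc() { _gc_active = true; }

  void end_gc() {
    _gc_active = false;
    recalculate_used_stable();  // done once at the end of GC instead
  }

  std::size_t used() const { return _used; }
  std::size_t used_stable() const { return _used_stable; }
  std::size_t recalculations() const { return _recalculations; }
};
```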


Please review the webrev with the updated fix:
http://cr.openjdk.java.net/~poonam/8229420/webrev.00/

Thanks,
Poonam


From thomas.schatzl at oracle.com  Wed Aug 14 13:51:01 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Wed, 14 Aug 2019 15:51:01 +0200
Subject: RFR 8229420: [Redo] jstat reports incorrect values for OU for CMS
 GC
In-Reply-To: <573749a6-2389-6de2-4f2f-abc3b0e1dd6d@oracle.com>
References: <573749a6-2389-6de2-4f2f-abc3b0e1dd6d@oracle.com>
Message-ID: <0195ae49-cc0f-d3e8-4bee-fa53c29184c1@oracle.com>

Hi,

On 14.08.19 15:13, Poonam Parhar wrote:
> Hello,
> 
> The fix for JDK-8215523 
> <https://bugs.openjdk.java.net/browse/JDK-8215523> had to be backed out 
> with '8227178: Backout of 8215523' because it had caused timeout 
> failures for some of the CMS tests.
> 
> Changeset of JDK-8215523: 
> http://hg.openjdk.java.net/jdk/jdk/rev/734e58d8477b
> 
> Those failures get resolved by adding the following check before calling 
> recalculate_used_stable() in CompactibleFreeListSpace::allocate():
> 
>   // During GC we do not need to recalculate the stable used value for
>   // every allocation in old gen. It is done once at the end of GC instead
>   // for performance reasons.
>   if (!CMSHeap::heap()->is_gc_active()) {
>     recalculate_used_stable();
>   }
> 
> 
> Please review the webrev with the updated fix:
> http://cr.openjdk.java.net/~poonam/8229420/webrev.00/

   still good.

Thanks,
   Thomas


From kirk at kodewerk.com  Wed Aug 14 14:11:47 2019
From: kirk at kodewerk.com (Kirk Pepperdine)
Date: Wed, 14 Aug 2019 07:11:47 -0700
Subject: RFC: JEP: Remove the Concurrent Mark Sweep Garbage Collector
In-Reply-To: <bd2aafe9-f4d0-ada0-6089-ba39c854e0a6@oracle.com>
References: <0148c4d5-ec42-f2b1-d954-bc95b33cfd07@oracle.com>
 <81F15403-E42B-46B0-9BE4-88B8BFE8100F@kodewerk.com>
 <bd2aafe9-f4d0-ada0-6089-ba39c854e0a6@oracle.com>
Message-ID: <F857070B-57AA-424E-8372-F2E4A9862B8D@kodewerk.com>

Hi Thomas,


> On Aug 14, 2019, at 5:55 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi Kirk,
> 
>  sorry for the late answer.

No worries and I appreciate your response.

Believe me, I have no great affinity for CMS. It's improperly configured out of the box, and most people, including many support organizations, simply offer very, very bad tuning advice, so moving to a collector that is more self-tuned out of the box is a huge win. But then, for many workloads it has offered the best balance between pause times and overhead. Also, I consider the benchmarks that I've run in the past to be an indirect measure of electron budget, so it's not just about pause time, nor is it just about throughput; it's about the power and cooling bill.

As for the community picking up work on CMS: while in theory it's possible that individuals could or might step up, in practice a more likely scenario is that those individuals would need to be sponsored by a larger organization that has the resources to help. But I believe you are correct on this point: at the end of the day, if we want CMS, we're going to have to step up and do something about it. I'll start by restoring and rerunning some benchmarks with the most recent versions of G1 and see where that goes.

Kind regards,
Kirk

> 
> On 08.08.19 18:37, Kirk Pepperdine wrote:
>> Hi Thomas,
>> "In the meantime the Oracle garbage collection team introduced a new garbage collector, ZGC, and Red Hat contributed the Shenandoah collector. Oracle further improved G1, which has been its designated successor since initial introduction in JDK6u14, to a point where we believe there is little reason to use the CMS collector in deployments."
>> I fear my experience in tuning GC 1000s of JVMs leaves me at odds with the premise that there is little reason to use CMS. In my experience CMS 
> 
> Nobody is in any way guaranteeing that any of the many alternatives will be the absolute best garbage collector for all the applications out there ever written. That's impossible.
> 
> CMS has been available in Hotspot for 15+ years, and we understand that there is a lot of experience tuning that garbage collector; we also know that there are applications that have been specifically (re-)written to work best with CMS. Particularly these applications are unlikely to work better (for a particular measure of "better") on any different collector unless that (probably?) collector reimplements significant parts of CMS.
> 
> However we think that the overhead of one of the alternatives for these imho few applications is small enough even for those to be able to move away from CMS _now_.
> 
> I will adapt that paragraph in the JEP to make this more clear. Thanks for your feedback.
> 
> Looking in the bug tracker for particular issues, during the last two years of CMS deprecation we got very little actionable feedback from the community on what exactly G1 (in particular) is missing.
> All of the example code we have run better than CMS with some G1 tuning (e.g. JDK-8062128; with jdk8!) or are simply throughput applications for which Parallel (e.g. JDK-8133055) is suited well.
> 
> In some other email you mentioned that most people are still at JDK8 (which we know), and most people are only moving from one LTS versions to another (which we also know). CMS support is not going away in either 8 or 11, and 17 will be released in two years.
> 
> Since CMS deprecation we demonstrated that we improved G1 a lot (e.g. [2],[3]) together with the community; at the recent JVMLS there has been a workshop that showed a selection of these G1 improvements since JDK8 ([0]), and an imho decent roadmap for the future to, among other things, with the community, address concerns that were described sufficiently so they can actually be worked on.
> 
> So I would like to reiterate Charlie's request to work with us to allow us to analyze these applications that exhibit your observations.
> 
> There has also always been the option to organize maintenance of CMS in the community, but nobody even stepped up starting to fix the long-standing existing known minor issues CMS (to get contributors to know CMS code and to give us confidence that these persons can take over maintenance of such a large component).
> 
>> Although I do have high hopes for both ZGC and Shenandoah, they are not an option for most sites at this point in time. I would suggest that deprecation of CMS was premature as there was no viable alternative. I
>> would further suggest that removal is also premature as there is still
>> no viable alternative for the majority of workloads that work
>> exceptionally well with CMS.
> 
> The answer to this is the same as it has been two years ago when CMS has been deprecated: work together with the community to evaluate, understand and try to address these issues if viable (e.g. Oracle is unlikely to directly reimplement CMS' free list memory management on top of G1 if this is the real problem).
> 
> As the JEP points out, CMS maintenance, e.g. adapting code to interface changes required by improvements to other components, fixing important issues, or even small improvements, but mostly testing, takes up a lot of resources that we think could be better spent elsewhere.
> 
> Doing so seems better to me than relentlessly stating: "There is no viable alternative".
> 
> Thanks,
>  Thomas
> 
> [0] http://mail.openjdk.java.net/pipermail/hotspot-gc-use/2019-August/002816.html ; feel free to ask about particulars.
> [2] https://www.youtube.com/watch?v=LppgqvKOUKsv
> [3] https://www.optaplanner.org/blog/2019/01/17/HowMuchFasterIsJava11.html#table1 ; note that is a pure throughput load


From shade at redhat.com  Wed Aug 14 16:49:24 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 14 Aug 2019 18:49:24 +0200
Subject: RFR (XS) 8229709: x86_32 build and test failures after JDK-8228369
 (Shenandoah: Refactor LRB C1 stubs)
Message-ID: <a4e9eb5a-8f95-fa6f-b3e4-c0770eb45fc4@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8229709

This is a recent regression. Unfortunately, the fix is complicated due to C1 giving us the
non-byte-capable tmp registers in 32-bit mode, which requires some workarounds. We could have
instead rewritten the affected parts to use *ptr versions, but that would regress x86_64 perf and
maybe introduce bugs there. So I opted to guard the new code with LP64 checks.

Fix:
  http://cr.openjdk.java.net/~shade/8229709/webrev.01/

Testing: {x86_32, x86_64} hotspot_gc_shenandoah; {x86_32, x86_64} CTW tests

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Wed Aug 14 16:56:45 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 14 Aug 2019 18:56:45 +0200
Subject: RFR (XS) 8229709: x86_32 build and test failures after
 JDK-8228369 (Shenandoah: Refactor LRB C1 stubs)
In-Reply-To: <a4e9eb5a-8f95-fa6f-b3e4-c0770eb45fc4@redhat.com>
References: <a4e9eb5a-8f95-fa6f-b3e4-c0770eb45fc4@redhat.com>
Message-ID: <9ecd58cf-d913-e309-b9e3-e2fc794ed3e1@redhat.com>

Ok, looks good. Well it doesn't exactly look good, but it's ok
nonetheless ;-)

Thanks,
Roman


> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8229709
> 
> This is a recent regression. Unfortunately, the fix is complicated due to C1 giving us the
> non-byte-capable tmp registers in 32-bit mode, which requires some workarounds. We could have
> instead rewrote the affected parts to use *ptr versions, but that would regress x86_64 perf and
> maybe introduce bugs there. So I opted to LP64 new code.
> 
> Fix:
>   http://cr.openjdk.java.net/~shade/8229709/webrev.01/
> 
> Testing: {x86_32, x86_64} hotspot_gc_shenandoah; {x86_32, x86_64} CTW tests
> 


From shade at redhat.com  Wed Aug 14 17:47:04 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 14 Aug 2019 19:47:04 +0200
Subject: RFR (XS) 8229707: [TESTBUG] Some Shenandoah tests assume Server VM by
 default
Message-ID: <a9c089b0-61e3-399d-81a4-4568e678b1a1@redhat.com>

Testbug:
  https://bugs.openjdk.java.net/browse/JDK-8229707

x86_32 client VM should not try to run some Shenandoah tests that expect C2 to be there.

Fix:

diff -r 6f919024e550 test/hotspot/jtreg/gc/shenandoah/compiler/TestWriteBarrierClearControl.java
--- a/test/hotspot/jtreg/gc/shenandoah/compiler/TestWriteBarrierClearControl.java       Wed Aug 14 18:52:30 2019 +0200
+++ b/test/hotspot/jtreg/gc/shenandoah/compiler/TestWriteBarrierClearControl.java       Wed Aug 14 19:44:31 2019 +0200
@@ -27,4 +27,5 @@
  * @key gc
  * @requires vm.gc.Shenandoah & !vm.graal.enabled
+ * @requires vm.flavor == "server"
  * @run main/othervm -XX:-BackgroundCompilation -XX:-UseOnStackReplacement -XX:-TieredCompilation
  *                   -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC

diff -r 6f919024e550 test/hotspot/jtreg/gc/shenandoah/options/TestLoopMiningArguments.java
--- a/test/hotspot/jtreg/gc/shenandoah/options/TestLoopMiningArguments.java     Wed Aug 14 18:52:30 2019 +0200
+++ b/test/hotspot/jtreg/gc/shenandoah/options/TestLoopMiningArguments.java     Wed Aug 14 19:44:31 2019 +0200
@@ -27,4 +27,5 @@
  * @key gc
  * @requires vm.gc.Shenandoah & !vm.graal.enabled
+ * @requires vm.flavor == "server"
  * @library /test/lib
  * @run driver TestLoopMiningArguments


Testing: {x86_32-client, x86_64-server} hotspot_gc_shenandoah

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Wed Aug 14 18:16:50 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 14 Aug 2019 20:16:50 +0200
Subject: RFR (XS) 8229707: [TESTBUG] Some Shenandoah tests assume Server
 VM by default
In-Reply-To: <a9c089b0-61e3-399d-81a4-4568e678b1a1@redhat.com>
References: <a9c089b0-61e3-399d-81a4-4568e678b1a1@redhat.com>
Message-ID: <f8f545df-93d9-5aed-1d6c-176a80817a15@redhat.com>

Okidoki.

Roman

> Testbug:
>   https://bugs.openjdk.java.net/browse/JDK-8229707
> 
> x86_32 client VM should not try to run some Shenandoah tests that expect C2 to be there.
> 
> Fix:
> 
> diff -r 6f919024e550 test/hotspot/jtreg/gc/shenandoah/compiler/TestWriteBarrierClearControl.java
> --- a/test/hotspot/jtreg/gc/shenandoah/compiler/TestWriteBarrierClearControl.java       Wed Aug 14 18:52:30 2019 +0200
> +++ b/test/hotspot/jtreg/gc/shenandoah/compiler/TestWriteBarrierClearControl.java       Wed Aug 14 19:44:31 2019 +0200
> @@ -27,4 +27,5 @@
>   * @key gc
>   * @requires vm.gc.Shenandoah & !vm.graal.enabled
> + * @requires vm.flavor == "server"
>   * @run main/othervm -XX:-BackgroundCompilation -XX:-UseOnStackReplacement -XX:-TieredCompilation
>   *                   -XX:+UnlockExperimentalVMOptions -XX:+UseShenandoahGC
> 
> diff -r 6f919024e550 test/hotspot/jtreg/gc/shenandoah/options/TestLoopMiningArguments.java
> --- a/test/hotspot/jtreg/gc/shenandoah/options/TestLoopMiningArguments.java     Wed Aug 14 18:52:30 2019 +0200
> +++ b/test/hotspot/jtreg/gc/shenandoah/options/TestLoopMiningArguments.java     Wed Aug 14 19:44:31 2019 +0200
> @@ -27,4 +27,5 @@
>   * @key gc
>   * @requires vm.gc.Shenandoah & !vm.graal.enabled
> + * @requires vm.flavor == "server"
>   * @library /test/lib
>   * @run driver TestLoopMiningArguments
> 
> 
> Testing: {x86_32-client, x86_64-server} hotspot_gc_shenandoah
> 


From sangheon.kim at oracle.com  Wed Aug 14 22:17:34 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Wed, 14 Aug 2019 15:17:34 -0700
Subject: RFR: 8229044: G1RedirtyCardsQueueSet should be local to a
 collection
In-Reply-To: <1BA16061-A0AA-4022-B9FB-D5AB11A8924B@oracle.com>
References: <09AAF075-80C5-42B4-BDD6-5DB2265FE9C0@oracle.com>
 <19ad902e-0974-79fd-a0c6-67d53d686dcd@oracle.com>
 <1BA16061-A0AA-4022-B9FB-D5AB11A8924B@oracle.com>
Message-ID: <eb215c43-5735-8c45-bcdf-17bfa4416ddb@oracle.com>

Hi Kim,

On 8/3/19 12:22 PM, Kim Barrett wrote:
>> On Aug 3, 2019, at 3:10 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>
>> Hi,
>>
>> On 03.08.19 01:43, Kim Barrett wrote:
>>> Please review this change to the use of G1RedirtyCardsQueueSet.
>>> Rather than a singleton instance in the G1CollectedHeap that is reused
>>> by each collection pause, we now (stack) allocate one for use by the
>>> current collection pause.
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8229044
>>> Webrev:
>>> http://cr.openjdk.java.net/~kbarrett/8229044/open.00/
>>> Testing:
>>> mach5 tier1-5
>>   looks good. Thanks.
>>
>> I was wondering whether at some point we should extract all transient GC state coupled with the actual algorithms applied into a separate class. But that's probably something for the future :)
>>
>> Thomas
> Thanks.
>
> I thought about putting the redirty set directly in the ParScanThreadStateSet with an
> accessor and passing that set to more places, but that seemed like it would make it
> more difficult to understand the usage of the ParScanThreadState[Set].
>
> I also thought about putting it in the EvacuationInfo, but what's there seems to be
> accounting stuff and not otherwise interesting data structures.  And again, I'd probably
> prefer to extract the redirty set to pass into call trees that need it and not all the other
> stuff.
>
> I think part of the problem is that there's just a lot of varied state shared between various
> largish pieces of the collector.  Finding ways to reduce that would be nice, but detangling
> is often hard work.
Looks good.

And it would be really nice if the above ideas were accomplished. :)

Thanks,
Sangheon


From kim.barrett at oracle.com  Thu Aug 15 05:32:42 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 15 Aug 2019 01:32:42 -0400
Subject: RFR: 8229044: G1RedirtyCardsQueueSet should be local to a
 collection
In-Reply-To: <eb215c43-5735-8c45-bcdf-17bfa4416ddb@oracle.com>
References: <09AAF075-80C5-42B4-BDD6-5DB2265FE9C0@oracle.com>
 <19ad902e-0974-79fd-a0c6-67d53d686dcd@oracle.com>
 <1BA16061-A0AA-4022-B9FB-D5AB11A8924B@oracle.com>
 <eb215c43-5735-8c45-bcdf-17bfa4416ddb@oracle.com>
Message-ID: <D3A8E440-FE7D-4EA9-BB95-89EE78CA1C48@oracle.com>

> On Aug 14, 2019, at 6:17 PM, sangheon.kim at oracle.com wrote:
> 
> Hi Kim,
> 
> On 8/3/19 12:22 PM, Kim Barrett wrote:
>>> On Aug 3, 2019, at 3:10 PM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>> 
>>> Hi,
>>> 
>>> On 03.08.19 01:43, Kim Barrett wrote:
>>>> Please review this change to the use of G1RedirtyCardsQueueSet.
>>>> Rather than a singleton instance in the G1CollectedHeap that is reused
>>>> by each collection pause, we now (stack) allocate one for use by the
>>>> current collection pause.
>>>> CR:
>>>> https://bugs.openjdk.java.net/browse/JDK-8229044
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~kbarrett/8229044/open.00/
>>>> Testing:
>>>> mach5 tier1-5
>>>  looks good. Thanks.
>>> 
>>> I was wondering whether at some point we should extract all transient GC state coupled with the actual algorithms applied into a separate class. But that's probably something for the future :)
>>> 
>>> Thomas
>> Thanks.
>> 
>> I thought about putting the redirty set directly in the ParScanThreadStateSet with an
>> accessor and passing that set to more places, but that seemed like it would make it
>> more difficult to understand the usage of the ParScanThreadState[Set].
>> 
>> I also thought about putting it in the EvacuationInfo, but what's there seems to be
>> accounting stuff and not otherwise interesting data structures.  And again, I'd probably
>> prefer to extract the redirty set to pass into call trees that need it and not all the other
>> stuff.
>> 
>> I think part of the problem is that there's just a lot of varied state shared between various
>> largish pieces of the collector.  Finding ways to reduce that would be nice, but detangling
>> is often hard work.
> Looks good.
> 
> And it would be really nice if above ideas are accomplished. :)
> 
> Thanks,
> Sangheon

Thanks.
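
For readers following the thread, the change under review (a per-pause, stack-allocated queue set instead of a heap-wide singleton) can be sketched roughly as below. This is only an illustration of the ownership pattern: RedirtySet, evacuate() and collection_pause() are hypothetical names, not the actual G1 classes or functions from the webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-in for a set of redirtied cards; not the HotSpot class.
class RedirtySet {
  std::vector<std::size_t> _cards;
public:
  void enqueue(std::size_t card) { _cards.push_back(card); }
  std::size_t size() const { return _cards.size(); }
};

// A worker receives only the piece of state it needs, by reference.
inline void evacuate(RedirtySet& rdcqs, std::size_t first_card, std::size_t count) {
  for (std::size_t i = 0; i < count; i++) {
    rdcqs.enqueue(first_card + i);
  }
}

// One pause: the set lives on the stack, is passed down the call tree,
// and is destroyed on return, so no state carries over to the next pause.
inline std::size_t collection_pause(std::size_t cards_to_redirty) {
  RedirtySet rdcqs;
  evacuate(rdcqs, 0, cards_to_redirty);
  return rdcqs.size();
}

// Two consecutive pauses do not see each other's cards.
inline void check_pauses_are_independent() {
  assert(collection_pause(3) == 3);
  assert(collection_pause(0) == 0);
}
```

With a singleton, the second pause would have to rely on the first one having emptied the set; with the stack-allocated form that property holds by construction.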


From derekw at marvell.com  Thu Aug 15 15:49:37 2019
From: derekw at marvell.com (Derek White)
Date: Thu, 15 Aug 2019 15:49:37 +0000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
Message-ID: <BN6PR18MB0946D4C8E696226FBC191032D2AC0@BN6PR18MB0946.namprd18.prod.outlook.com>

Hi Martin,

This is a good area to clean up! I have 2 issues to consider with this patch:

Re - the AArch64 side of things:
Arm retconned the ARMv8 spec [1][2] to decide that multi-copy atomicity was a good idea after all (after checking that no CPU implementations took advantage of this level of weakness).

However, setting CPU_MULTI_COPY_ATOMIC for AArch64 would result in changing behavior (removing fence in taskqueue) that should be looked at and tested by the aarch64 folks, so if Andrew Haley agrees, I suggest deferring changing this AArch64 behavior to a separate issue.

BTW, this could be a nice improvement for AArch64 - Thanks for bringing this up!

	[1] https://www.cl.cam.ac.uk/~pes20/armv8-mca/
	[2] https://bugs.openjdk.java.net/browse/JDK-8165058

Re - the patch generally:

The changes around CPU_NOT_MULTIPLE_COPY_ATOMIC in src/hotspot/share/utilities/globalDefinitions.hpp are not simply a change to GC behavior for S390, but also change interpreter and compiler behavior on arm32 (and maybe AArch64?).

I believe this change will remove required fences from the arm32 interpreter and jitted code relating to JMM volatile accesses.

In any case I think it should be reviewed beyond the context of hotspot-gc-dev.

Thanks again for bringing this up!
 - Derek

-----Original Message-----
From: hotspot-gc-dev <hotspot-gc-dev-bounces at openjdk.java.net> On Behalf Of Doerr, Martin
Sent: Tuesday, August 13, 2019 6:30 AM
To: David Holmes <david.holmes at oracle.com>; hotspot-gc-dev at openjdk.java.net; Kim Barrett <kim.barrett at oracle.com>
Subject: [EXT] RE: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms

----------------------------------------------------------------------
Hi Kim and David,

thank you for looking into this issue.

@Kim:
I've replied to your comment in the issue.

@David:
> I find the inversion of the ifdef slightly confusing. I also don't 
> like a comment to say we don't have a given property. Wouldn't it be 
> better to set CPU_MULTI_COPY_ATOMIC to 0 or 1 as appropriate?
Hmm. We could change that. I'm not sure what is better.
I think it should be designed such that correct usage is easy and wrong usage is difficult.

It has already happened that people used an #ifdef for a macro which is always defined (0 or 1) by mistake.
That's why I'm not a big fan of defining things to 0 or 1.

With the #define or not define approach, all platforms except those which explicitly specify the property are conservatively treated as non-multi-copy atomic.

But if your version is preferred by all reviewers, I can use it.
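
The pitfall mentioned above (using #ifdef on a macro that is always defined as 0 or 1) can be sketched as follows. This is a minimal illustration, not HotSpot code; the macro names FEATURE_ALWAYS_DEFINED and FEATURE_OPT_IN are hypothetical.

```cpp
#include <cassert>

// Hypothetical macros, for illustration only (not from HotSpot).
// Style A: always defined, value 0 or 1. Here the property is absent.
#define FEATURE_ALWAYS_DEFINED 0
// Style B: defined only on platforms that have the property.
// FEATURE_OPT_IN is intentionally left undefined here.

// Wrong usage with style A: #ifdef tests existence, not value,
// so this returns true even though the macro is 0.
inline bool style_a_checked_with_ifdef() {
#ifdef FEATURE_ALWAYS_DEFINED
  return true;
#else
  return false;
#endif
}

// Correct usage with style A: #if tests the value.
inline bool style_a_checked_with_if() {
#if FEATURE_ALWAYS_DEFINED
  return true;
#else
  return false;
#endif
}

// Style B: #ifdef is the natural check, and a platform that says
// nothing is conservatively treated as not having the property.
inline bool style_b_checked_with_ifdef() {
#ifdef FEATURE_OPT_IN
  return true;
#else
  return false;
#endif
}

// Self-check documenting the three outcomes.
inline void check_macro_styles() {
  assert(style_a_checked_with_ifdef());    // surprising: the macro is 0
  assert(!style_a_checked_with_if());      // value is honored
  assert(!style_b_checked_with_ifdef());   // undefined means "not present"
}
```

The trade-off is the one discussed in the thread: style A makes the absent case explicit but silently accepts a mistaken #ifdef, while style B defaults to the conservative answer when a platform does not define the macro.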


> Can't comment on ppc64 specifics.
I'll ask for additional reviews once the main issue has been reviewed.


> It's not at all obvious to me that the need for the fence() in 
> pop_global is directly related to CPU_MULTI_COPY_ATOMIC. I prefer to 
> see that define connected only with the IRIW issue as it currently is.
This was explained in the email thread a few emails later:
http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-March/008853.html


Best regards,
Martin


From martin.doerr at sap.com  Thu Aug 15 16:28:41 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Thu, 15 Aug 2019 16:28:41 +0000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <BN6PR18MB0946D4C8E696226FBC191032D2AC0@BN6PR18MB0946.namprd18.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <BN6PR18MB0946D4C8E696226FBC191032D2AC0@BN6PR18MB0946.namprd18.prod.outlook.com>
Message-ID: <DB8PR02MB582083391827FA63F02235FB9AAC0@DB8PR02MB5820.eurprd02.prod.outlook.com>

Hi Derek,

thanks for pointing me to the aarch64 issue and paper.
I have added a link to that issue.

> The changes around CPU_NOT_MULTIPLE_COPY_ATOMIC in
> src/hotspot/share/utilities/globalDefinitions.hpp are not simply a change to
> GC behavior for S390, but are changing interpreter and compiler behavior
> on arm32 (and maybe Aarch64?).

My initial webrev is only a functional change for s390.

Current implementation:
- CPU_NOT_MULTIPLE_COPY_ATOMIC is only used to control the following:
  support_IRIW_for_not_multiple_copy_atomic_cpu: only true on PPC64
- TaskQueue uses fence on all platforms except SPARC and x86

My webrev:
- Still:
  support_IRIW_for_not_multiple_copy_atomic_cpu: only true on PPC64
- TaskQueue uses fence on all platforms except SPARC, x86 and s390
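
As a rough model of the platform selection described above, the sketch below shows a steal-side read sequence that pays for a full fence unless the platform declares itself multi-copy atomic. It is a simplification under stated assumptions: MiniQueue, observable_size() and uses_fence() are hypothetical names, std::atomic stands in for HotSpot's OrderAccess, and the code is not a complete or correct work-stealing queue.

```cpp
#include <atomic>
#include <cassert>
#include <cstdint>

// Hypothetical stand-in for the platform macro discussed in the thread.
// Uncomment to model a multi-copy-atomic CPU, where the fence is omitted.
// #define CPU_MULTI_COPY_ATOMIC

struct MiniQueue {
  std::atomic<uint32_t> age{0};     // top index, advanced by stealing threads
  std::atomic<uint32_t> bottom{0};  // advanced by the owning thread on push

  // Number of elements a stealing thread may observe.
  uint32_t observable_size() const {
    // Acquire keeps the later load of bottom from moving before this load.
    uint32_t old_age = age.load(std::memory_order_acquire);
#ifndef CPU_MULTI_COPY_ATOMIC
    // Conservative default: a full fence, so that the value of bottom
    // read below is not older than the value of age read above.
    std::atomic_thread_fence(std::memory_order_seq_cst);
#endif
    uint32_t local_bottom = bottom.load(std::memory_order_acquire);
    return local_bottom - old_age;
  }
};

// Reports whether this build takes the fenced path.
constexpr bool uses_fence() {
#ifndef CPU_MULTI_COPY_ATOMIC
  return true;
#else
  return false;
#endif
}

// Single-threaded self-check of the arithmetic and the default selection.
inline void check_mini_queue() {
  MiniQueue q;
  q.bottom.store(5);
  q.age.store(2);
  assert(q.observable_size() == 3);
  assert(uses_fence());
}
```

Because the macro is opt-in, a platform that defines nothing keeps the fence, which matches the conservative default argued for earlier in the thread.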


According to the paper, we could enable CPU_MULTI_COPY_ATOMIC on aarch64 (not arm32). But I think this should be worth a separate issue once we have decided how to proceed with this one.
I think it'd also be good to have support_IRIW_for_not_multiple_copy_atomic_cpu switchable for PPC64.

> In any case I think it should be reviewed beyond the context of hotspot-gc-dev.
Definitely, yes. But since the taskqueue belongs to GC, I'd like to get feedback from GC, first.

It is always surprising how many ideas and opinions show up when touching this code.

Best regards,
Martin


> -----Original Message-----
> From: Derek White <derekw at marvell.com>
> Sent: Donnerstag, 15. August 2019 17:50
> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes
> <david.holmes at oracle.com>; hotspot-gc-dev at openjdk.java.net; Kim
> Barrett <kim.barrett at oracle.com>; Andrew Haley <aph at redhat.com>
> Subject: RE: RFR(S): 8229422: Taskqueue: Outdated selection of weak
> memory model platforms
> 
> Hi Martin,
> 
> This is a good area to clean up! I have 2 issues to consider with this patch:
> 
> Re - the AArch64 side of things:
> Arm retconned the ARMv8 spec [1][2] to decide that multi-copy atomicity
> was a good idea after all (after checking that no CPU implementations took
> advantage of this level of weakness).
> 
> However, setting CPU_MULTI_COPY_ATOMIC for AArch64 would result in
> changing behavior (removing fence in taskqueue) that should be looked at
> and tested by the aarch64 folks, so if Andrew Haley agrees, I suggest
> deferring changing this AArch64 behavior to a separate issue.
> 
> BTW, this could be a nice improvement for AArch64 - Thanks for bringing this
> up!
> 
> 	[1] https://www.cl.cam.ac.uk/~pes20/armv8-mca/
> 	[2] https://bugs.openjdk.java.net/browse/JDK-8165058
> 
> Re - the patch generally:
> 
> There changes around CPU_NOT_MULTIPLE_COPY_ATOMIC in
> src/hotspot/share/utilities/globalDefinitions.hpp are not simply a change to
> GC behavior for S390, but are changing interpreter and compiler behavior
> arm32 (and maybe Aarch64?).
> 
> I believe this change will remove required fences from arm32 interpreter and
> Jitted code relating to JMM volatile accesses.
> 
> In any case I think it should be reviewed beyond the context of hotspot-gc-dev.
> 
> Thanks again bringing this up!
>  - Derek
> 
> -----Original Message-----
> From: hotspot-gc-dev <hotspot-gc-dev-bounces at openjdk.java.net> On
> Behalf Of Doerr, Martin
> Sent: Tuesday, August 13, 2019 6:30 AM
> To: David Holmes <david.holmes at oracle.com>; hotspot-gc-dev at openjdk.java.net;
> Kim Barrett <kim.barrett at oracle.com>
> Subject: [EXT] RE: RFR(S): 8229422: Taskqueue: Outdated selection of weak
> memory model platforms
> 
> External Email
> 
> ----------------------------------------------------------------------
> Hi Kim and David,
> 
> thank you for looking into this issue.
> 
> @Kim:
> I've replied to your comment in the issue.
> 
> @David:
> > I find the inversion of the ifdef slightly confusing. I also don't
> > like a comment to say we don't have a given property. Wouldn't it be
> > better to set CPU_MULTI_COPY_ATOMIC to 0 or 1 as appropriate?
> Hmm. We could change that. I'm not sure what is better.
> I think it should be designed such that correct usage is easy and wrong usage
> is difficult.
> 
> It has already happened that people used an #ifdef for a macro which is
> always defined (0 or 1) by mistake.
> That's why I'm not a big fan of defining things to 0 or 1.
> 
> With the #define or not define approach, all platforms except those which
> explicitly specify the property are conservatively treated as non-multi-copy
> atomic.
> 
> But if your version is preferred by all reviewers, I can use it.
> 
> 
> > Can't comment on ppc64 specifics.
> I'll ask for additional reviews once the main issue was reviewed.
> 
> 
> > It's not at all obvious to me that the need for the fence() in
> > pop_global is directly related to CPU_MULTI_COPY_ATOMIC. I prefer to
> > see that define connected only with the IRIW issue as it currently is.
> This was explained in the email thread a few emails later:
> http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-March/008853.html
> 
> 
> Best regards,
> Martin


From derekw at marvell.com  Thu Aug 15 21:29:24 2019
From: derekw at marvell.com (Derek White)
Date: Thu, 15 Aug 2019 21:29:24 +0000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <DB8PR02MB582083391827FA63F02235FB9AAC0@DB8PR02MB5820.eurprd02.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <BN6PR18MB0946D4C8E696226FBC191032D2AC0@BN6PR18MB0946.namprd18.prod.outlook.com>
 <DB8PR02MB582083391827FA63F02235FB9AAC0@DB8PR02MB5820.eurprd02.prod.outlook.com>
Message-ID: <BN6PR18MB0946BDAC7CA2D72EE82272BED2AC0@BN6PR18MB0946.namprd18.prod.outlook.com>

Hi Martin,

You are right about the non-impact on arm32 or aarch64. I read over the existing implementation too quickly. Sorry!

And I agree that we should enable CPU_MULTI_COPY_ATOMIC as a separate issue.

So no more issues on this from me.

Thanks,
 - Derek

-----Original Message-----
From: Doerr, Martin <martin.doerr at sap.com> 
Sent: Thursday, August 15, 2019 12:29 PM
To: Derek White <derekw at marvell.com>; David Holmes <david.holmes at oracle.com>; hotspot-gc-dev at openjdk.java.net; Kim Barrett <kim.barrett at oracle.com>; Andrew Haley <aph at redhat.com>
Subject: [EXT] RE: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms

----------------------------------------------------------------------
Hi Derek,

thanks for pointing me to the aarch64 issue and paper.
I have added a link to that issue.

> There changes around CPU_NOT_MULTIPLE_COPY_ATOMIC in 
> src/hotspot/share/utilities/globalDefinitions.hpp are not simply a 
> change to GC behavior for S390, but are changing interpreter and 
> compiler behavior
> arm32 (and maybe Aarch64?).

My initial webrev is only a functional change for s390.

Current implementation:
- CPU_NOT_MULTIPLE_COPY_ATOMIC is only used to control the following:
  support_IRIW_for_not_multiple_copy_atomic_cpu: only true on PPC64
- TaskQueue uses fence on all platforms except SPARC and x86

My webrev:
- Still:
  support_IRIW_for_not_multiple_copy_atomic_cpu: only true on PPC64
- TaskQueue uses fence on all platforms except SPARC, x86 and s390


According to the paper, we could enable CPU_MULTI_COPY_ATOMIC on aarch64 (not arm32). But I think this should be worth a separate issue once we have decided how to proceed with this one.
I think it'd be also good to have support_IRIW_for_not_multiple_copy_atomic_cpu switchable for PPC64.

> In any case I think it should be reviewed beyond the context of hotspot-gc-dev.
Definitely, yes. But since the taskqueue belongs to GC, I'd like to get feedback from GC, first.

It is always surprising how many ideas and opinions show up when touching this code.

Best regards,
Martin


> -----Original Message-----
> From: Derek White <derekw at marvell.com>
> Sent: Donnerstag, 15. August 2019 17:50
> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes 
> <david.holmes at oracle.com>; hotspot-gc-dev at openjdk.java.net; Kim 
> Barrett <kim.barrett at oracle.com>; Andrew Haley <aph at redhat.com>
> Subject: RE: RFR(S): 8229422: Taskqueue: Outdated selection of weak 
> memory model platforms
> 
> Hi Martin,
> 
> This is a good area to clean up! I have 2 issues to consider with this patch:
> 
> Re - the AArch64 side of things:
> Arm retconned the ARMv8 spec [1][2] to decide that multi-copy 
> atomicity was a good idea after all (after checking that no CPU 
> implementations took advantage of this level of weakness).
> 
> However, setting CPU_MULTI_COPY_ATOMIC for AArch64 would result in 
> changing behavior (removing fence in taskqueue) that should be looked 
> at and tested by the aarch64 folks, so if Andrew Haley agrees, I 
> suggest deferring changing this AArch64 behavior to a separate issue.
> 
> BTW, this could be a nice improvement for AArch64 - Thanks for 
> bringing this up!
> 
> 	[1] https://www.cl.cam.ac.uk/~pes20/armv8-mca/
> 	[2] https://bugs.openjdk.java.net/browse/JDK-8165058
> 
> Re - the patch generally:
> 
> There changes around CPU_NOT_MULTIPLE_COPY_ATOMIC in 
> src/hotspot/share/utilities/globalDefinitions.hpp are not simply a 
> change to GC behavior for S390, but are changing interpreter and 
> compiler behavior
> arm32 (and maybe Aarch64?).
> 
> I believe this change will remove required fences from arm32 
> interpreter and Jitted code relating to JMM volatile accesses.
> 
> In any case I think it should be reviewed beyond the context of hotspot-gc-dev.
> 
> Thanks again bringing this up!
>  - Derek
> 
> -----Original Message-----
> From: hotspot-gc-dev <hotspot-gc-dev-bounces at openjdk.java.net> On 
> Behalf Of Doerr, Martin
> Sent: Tuesday, August 13, 2019 6:30 AM
> To: David Holmes <david.holmes at oracle.com>; hotspot-gc-dev at openjdk.java.net;
> Kim Barrett <kim.barrett at oracle.com>
> Subject: [EXT] RE: RFR(S): 8229422: Taskqueue: Outdated selection of 
> weak memory model platforms
> 
> External Email
> 
> ----------------------------------------------------------------------
> Hi Kim and David,
> 
> thank you for looking into this issue.
> 
> @Kim:
> I've replied to your comment in the issue.
> 
> @David:
> > I find the inversion of the ifdef slightly confusing. I also don't 
> > like a comment to say we don't have a given property. Wouldn't it be 
> > better to set CPU_MULTI_COPY_ATOMIC to 0 or 1 as appropriate?
> Hmm. We could change that. I'm not sure what is better.
> I think it should be designed such that correct usage is easy and 
> wrong usage is difficult.
> 
> It has already happened that people used an #ifdef for a macro which 
> is always defined (0 or 1) by mistake.
> That's why I'm not a big fan of defining things to 0 or 1.
> 
> With the #define or not define approach, all platforms except those 
> which explicitly specify the property are conservatively treated as 
> non-multi-copy atomic.
> 
> But if your version is preferred by all reviewers, I can use it.
> 
> 
> > Can't comment on ppc64 specifics.
> I'll ask for additional reviews once the main issue was reviewed.
> 
> 
> > It's not at all obvious to me that the need for the fence() in 
> > pop_global is directly related to CPU_MULTI_COPY_ATOMIC. I prefer to 
> > see that define connected only with the IRIW issue as it currently is.
> This was explained in the email thread a few emails later:
> http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-March/008853.html
> 
> 
> Best regards,
> Martin


From derekw at marvell.com  Thu Aug 15 21:34:16 2019
From: derekw at marvell.com (Derek White)
Date: Thu, 15 Aug 2019 21:34:16 +0000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <BN6PR18MB0946BDAC7CA2D72EE82272BED2AC0@BN6PR18MB0946.namprd18.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <BN6PR18MB0946D4C8E696226FBC191032D2AC0@BN6PR18MB0946.namprd18.prod.outlook.com>
 <DB8PR02MB582083391827FA63F02235FB9AAC0@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <BN6PR18MB0946BDAC7CA2D72EE82272BED2AC0@BN6PR18MB0946.namprd18.prod.outlook.com>
Message-ID: <BN6PR18MB0946F913E708898A8211FE61D2AC0@BN6PR18MB0946.namprd18.prod.outlook.com>

"And I agree that we should enable CPU_MULTI_COPY_ATOMIC as a separate issue."
                                                        ^ on AARCH64

Somebody take my keyboard from me!
- Derek

-----Original Message-----
From: hotspot-gc-dev <hotspot-gc-dev-bounces at openjdk.java.net> On Behalf Of Derek White
Sent: Thursday, August 15, 2019 5:29 PM
To: Doerr, Martin <martin.doerr at sap.com>; David Holmes <david.holmes at oracle.com>; hotspot-gc-dev at openjdk.java.net; Kim Barrett <kim.barrett at oracle.com>; Andrew Haley <aph at redhat.com>
Subject: [EXT] RE: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms

----------------------------------------------------------------------
Hi Martin,

You are right about the non-impact on arm32 or aarch64. I read over the existing implementation too quickly. Sorry!

And I agree that we should enable CPU_MULTI_COPY_ATOMIC as a separate issue.

So no more issues on this from me.

Thanks,
 - Derek

-----Original Message-----
From: Doerr, Martin <martin.doerr at sap.com>
Sent: Thursday, August 15, 2019 12:29 PM
To: Derek White <derekw at marvell.com>; David Holmes <david.holmes at oracle.com>; hotspot-gc-dev at openjdk.java.net; Kim Barrett <kim.barrett at oracle.com>; Andrew Haley <aph at redhat.com>
Subject: [EXT] RE: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory model platforms

----------------------------------------------------------------------
Hi Derek,

thanks for pointing me to the aarch64 issue and paper.
I have added a link to that issue.

> There changes around CPU_NOT_MULTIPLE_COPY_ATOMIC in 
> src/hotspot/share/utilities/globalDefinitions.hpp are not simply a 
> change to GC behavior for S390, but are changing interpreter and 
> compiler behavior
> arm32 (and maybe Aarch64?).

My initial webrev is only a functional change for s390.

Current implementation:
- CPU_NOT_MULTIPLE_COPY_ATOMIC is only used to control the following:
  support_IRIW_for_not_multiple_copy_atomic_cpu: only true on PPC64
- TaskQueue uses fence on all platforms except SPARC and x86

My webrev:
- Still:
  support_IRIW_for_not_multiple_copy_atomic_cpu: only true on PPC64
- TaskQueue uses fence on all platforms except SPARC, x86 and s390


According to the paper, we could enable CPU_MULTI_COPY_ATOMIC on aarch64 (not arm32). But I think this should be worth a separate issue once we have decided how to proceed with this one.
I think it'd be also good to have support_IRIW_for_not_multiple_copy_atomic_cpu switchable for PPC64.

> In any case I think it should be reviewed beyond the context of hotspot-gc-dev.
Definitely, yes. But since the taskqueue belongs to GC, I'd like to get feedback from GC, first.

It is always surprising how many ideas and opinions show up when touching this code.

Best regards,
Martin


> -----Original Message-----
> From: Derek White <derekw at marvell.com>
> Sent: Donnerstag, 15. August 2019 17:50
> To: Doerr, Martin <martin.doerr at sap.com>; David Holmes 
> <david.holmes at oracle.com>; hotspot-gc-dev at openjdk.java.net; Kim 
> Barrett <kim.barrett at oracle.com>; Andrew Haley <aph at redhat.com>
> Subject: RE: RFR(S): 8229422: Taskqueue: Outdated selection of weak 
> memory model platforms
> 
> Hi Martin,
> 
> This is a good area to clean up! I have 2 issues to consider with this patch:
> 
> Re - the AArch64 side of things:
> Arm retconned the ARMv8 spec [1][2] to decide that multi-copy 
> atomicity was a good idea after all (after checking that no CPU 
> implementations took advantage of this level of weakness).
> 
> However, setting CPU_MULTI_COPY_ATOMIC for AArch64 would result in 
> changing behavior (removing fence in taskqueue) that should be looked 
> at and tested by the aarch64 folks, so if Andrew Haley agrees, I 
> suggest deferring changing this AArch64 behavior to a separate issue.
> 
> BTW, this could be a nice improvement for AArch64 - Thanks for 
> bringing this up!
> 
> 	[1] https://www.cl.cam.ac.uk/~pes20/armv8-mca/
> 	[2] https://bugs.openjdk.java.net/browse/JDK-8165058
> 
> Re - the patch generally:
> 
> The changes around CPU_NOT_MULTIPLE_COPY_ATOMIC in 
> src/hotspot/share/utilities/globalDefinitions.hpp are not simply a 
> change to GC behavior for S390, but are changing interpreter and 
> compiler behavior on arm32 (and maybe AArch64?).
> 
> I believe this change will remove required fences from arm32 
> interpreter and Jitted code relating to JMM volatile accesses.
> 
> In any case I think it should be reviewed beyond the context of
> hotspot-gc-dev.
> 
> Thanks again for bringing this up!
>  - Derek
> 
> -----Original Message-----
> From: hotspot-gc-dev <hotspot-gc-dev-bounces at openjdk.java.net> On 
> Behalf Of Doerr, Martin
> Sent: Tuesday, August 13, 2019 6:30 AM
> To: David Holmes <david.holmes at oracle.com>;
> hotspot-gc-dev at openjdk.java.net; Kim Barrett <kim.barrett at oracle.com>
> Subject: [EXT] RE: RFR(S): 8229422: Taskqueue: Outdated selection of 
> weak memory model platforms
> 
> External Email
> 
> ----------------------------------------------------------------------
> Hi Kim and David,
> 
> thank you for looking into this issue.
> 
> @Kim:
> I've replied to your comment in the issue.
> 
> @David:
> > I find the inversion of the ifdef slightly confusing. I also don't 
> > like a comment to say we don't have a given property. Wouldn't it be 
> > better to set CPU_MULTI_COPY_ATOMIC to 0 or 1 as appropriate?
> Hmm. We could change that. I'm not sure what is better.
> I think it should be designed such that correct usage is easy and 
> wrong usage is difficult.
> 
> It has already happened that people used an #ifdef for a macro which 
> is always defined (0 or 1) by mistake.
> That's why I'm not a big fan of defining things to 0 or 1.
> 
> With the #define or not define approach, all platforms except those 
> which explicitly specify the property are conservatively treated as 
> non-multi-copy atomic.
> 
> But if your version is preferred by all reviewers, I can use it.
> 
> 
> > Can't comment on ppc64 specifics.
> I'll ask for additional reviews once the main issue was reviewed.
> 
> 
> > It's not at all obvious to me that the need for the fence() in 
> > pop_global is directly related to CPU_MULTI_COPY_ATOMIC. I prefer to 
> > see that define connected only with the IRIW issue as it currently is.
> This was explained in the email thread a few emails later:
> http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-March/008853.html
> 
> 
> Best regards,
> Martin


From david.holmes at oracle.com  Fri Aug 16 02:02:05 2019
From: david.holmes at oracle.com (David Holmes)
Date: Fri, 16 Aug 2019 12:02:05 +1000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
Message-ID: <7035ccb8-000c-3a58-b5ac-fb0a3b949784@oracle.com>

Hi Martin,

On 13/08/2019 8:30 pm, Doerr, Martin wrote:
> Hi Kim and David,
> 
> thank you for looking into this issue.
> 
> @Kim:
> I've replied to your comment in the issue.
> 
> @David:
>> I find the inversion of the ifdef slightly confusing. I also don't like
>> a comment to say we don't have a given property. Wouldn't it be better
>> to set CPU_MULTI_COPY_ATOMIC to 0 or 1 as appropriate?
> Hmm. We could change that. I'm not sure what is better.
> I think it should be designed such that correct usage is easy and wrong usage is difficult.
> 
> It has already happened that people used an #ifdef for a macro which is always defined (0 or 1) by mistake.
> That's why I'm not a big fan of defining things to 0 or 1.
> 
> With the #define or not define approach, all platforms except those which explicitly specify the property are conservatively treated as non-multi-copy atomic.

If you need to put a comment saying "we don't have this property" then 
to me that means there should be something more than a comment to 
indicate that.
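
To make the trade-off in the quoted discussion concrete, here is a minimal
sketch of the #ifdef pitfall with a macro that is always defined to 0 or 1.
The macro names FEATURE_ALWAYS_DEFINED and FEATURE_DEFINED_OR_NOT are made up
for illustration; neither is a HotSpot identifier.

```cpp
// Hypothetical macros for illustration only; neither exists in HotSpot.
#define FEATURE_ALWAYS_DEFINED 0  // "0 or 1" style: defined even when the property is absent
// FEATURE_DEFINED_OR_NOT is deliberately left undefined ("define or not" style).

// Wrong usage of the 0/1 style: #ifdef only asks whether the macro exists,
// so this returns true although the value is 0.
inline bool buggy_ifdef_check() {
#ifdef FEATURE_ALWAYS_DEFINED
  return true;
#else
  return false;
#endif
}

// Correct usage of the 0/1 style: #if tests the value.
inline bool correct_if_check() {
#if FEATURE_ALWAYS_DEFINED
  return true;
#else
  return false;
#endif
}

// "Define or not" style: a platform that says nothing gets the conservative answer.
inline bool define_or_not_check() {
#ifdef FEATURE_DEFINED_OR_NOT
  return true;
#else
  return false;
#endif
}
```

The flip side is that with the define-or-not style, a platform lacking the
property leaves no trace in the code other than a comment.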

> But if your version is preferred by all reviewers, I can use it.

I'll defer to others/majority.

> 
>> Can't comment on ppc64 specifics.
> I'll ask for additional reviews once the main issue was reviewed.
> 
> 
>> It's not at all obvious to me that the need for the fence() in
>> pop_global is directly related to CPU_MULTI_COPY_ATOMIC. I prefer to see
>> that define connected only with the IRIW issue as it currently is.
> This was explained in the email thread a few emails later:
> http://mail.openjdk.java.net/pipermail/hotspot-dev/2013-March/008853.html

Okay I refreshed my memory. Yes the fence() does relate to 
non-multi-copy-atomic systems. But we were only using the 
CPU_NOT_MULTIPLE_COPY_ATOMIC define to control the setting of 
support_IRIW_for_not_multiple_copy_atomic_cpu. I don't know why we added 
the cpu-based ifdefs around the fence() rather than using 
CPU_NOT_MULTIPLE_COPY_ATOMIC in the first place, but its use for that 
does seem valid.
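
A rough sketch of what guarding the fence with a single property macro,
instead of per-CPU ifdefs, could look like. This is not the webrev's code:
fence_if_not_multi_copy_atomic is a hypothetical helper, and
std::atomic_thread_fence stands in for HotSpot's OrderAccess::fence().

```cpp
#include <atomic>

// Assumption: a platform header would define CPU_MULTI_COPY_ATOMIC on CPUs
// that have the property. It is left undefined here, so this build takes the
// conservative path and issues the fence.

// Hypothetical helper, not HotSpot code: returns true if a fence was issued.
inline bool fence_if_not_multi_copy_atomic() {
#ifdef CPU_MULTI_COPY_ATOMIC
  return false;  // stores become visible to all other CPUs at once; no fence needed here
#else
  std::atomic_thread_fence(std::memory_order_seq_cst);  // full two-way barrier
  return true;
#endif
}
```

With this shape, a new platform that defines nothing gets the fence by
default, which matches the conservative treatment described in the thread.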

Thanks,
David


> 
> Best regards,
> Martin
> 


From sgehwolf at redhat.com  Fri Aug 16 08:38:39 2019
From: sgehwolf at redhat.com (Severin Gehwolf)
Date: Fri, 16 Aug 2019 10:38:39 +0200
Subject: RFR 8229420: [Redo] jstat reports incorrect values for OU for
 CMS GC
In-Reply-To: <573749a6-2389-6de2-4f2f-abc3b0e1dd6d@oracle.com>
References: <573749a6-2389-6de2-4f2f-abc3b0e1dd6d@oracle.com>
Message-ID: <053248daacbd69bd29b4bcc4db7678323784b7f0.camel@redhat.com>

On Wed, 2019-08-14 at 06:13 -0700, Poonam Parhar wrote:
> Please review the webrev with the updated fix:
> http://cr.openjdk.java.net/~poonam/8229420/webrev.00/

As far as the jhat/jstat typo is concerned this looks good. I haven't
reviewed other bits. Thanks for doing this via new bug.

Thanks,
Severin


From martin.doerr at sap.com  Fri Aug 16 14:22:19 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Fri, 16 Aug 2019 14:22:19 +0000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <7035ccb8-000c-3a58-b5ac-fb0a3b949784@oracle.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <7035ccb8-000c-3a58-b5ac-fb0a3b949784@oracle.com>
Message-ID: <DB8PR02MB5820A602CE4151359659D18A9AAF0@DB8PR02MB5820.eurprd02.prod.outlook.com>

Hi David,

> If you need to put a comment saying "we don't have this property" then
> to me that means there should be something more than a comment to
> indicate that.

I need to think a little more about that. Especially after feedback from Derek.
Maybe I should replace support_IRIW_for_not_multiple_copy_atomic_cpu by a macro in platform code as well.

I can also make further changes if desired by arm/aarch64 folks.


> Okay I refreshed my memory. Yes the fence() does relate to
> non-multi-copy-atomic systems. But we were only using the
> CPU_NOT_MULTIPLE_COPY_ATOMIC define to control the setting of
> support_IRIW_for_not_multiple_copy_atomic_cpu. I don't know why we
> added
> the cpu-based ifdefs around the fence() rather than using
> CPU_NOT_MULTIPLE_COPY_ATOMIC in the first place, but its use for that
> does seem valid.

Thanks for looking into that again and for confirming.
Note that the cpu-based ifdefs for the task queue were discussed in 2013 while the IRIW support was introduced by JDK-8029101 in 2014.
Nobody has cleaned these things up, yet.

Best regards,
Martin


From david.holmes at oracle.com  Fri Aug 16 22:39:28 2019
From: david.holmes at oracle.com (David Holmes)
Date: Sat, 17 Aug 2019 08:39:28 +1000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <DB8PR02MB5820A602CE4151359659D18A9AAF0@DB8PR02MB5820.eurprd02.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <7035ccb8-000c-3a58-b5ac-fb0a3b949784@oracle.com>
 <DB8PR02MB5820A602CE4151359659D18A9AAF0@DB8PR02MB5820.eurprd02.prod.outlook.com>
Message-ID: <381f185e-ca2e-50c4-fe35-1e5e62ff88f6@oracle.com>

On 17/08/2019 12:22 am, Doerr, Martin wrote:
> Hi David,
> 
>> If you need to put a comment saying "we don't have this property" then
>> to me that means there should be something more than a comment to
>> indicate that.
> 
> I need to think a little more about that. Especially after feedback from Derek.
> Maybe I should replace support_IRIW_for_not_multiple_copy_atomic_cpu by a macro in platform code as well.
> 
> I can also make further changes if desired by arm/aarch64 folks.
> 
> 
>> Okay I refreshed my memory. Yes the fence() does relate to
>> non-multi-copy-atomic systems. But we were only using the
>> CPU_NOT_MULTIPLE_COPY_ATOMIC define to control the setting of
>> support_IRIW_for_not_multiple_copy_atomic_cpu. I don't know why we
>> added
>> the cpu-based ifdefs around the fence() rather than using
>> CPU_NOT_MULTIPLE_COPY_ATOMIC in the first place, but its use for that
>> does seem valid.
> 
> Thanks for looking into that again and for confirming.
> Note that the cpu-based ifdefs for the task queue were discussed in 2013 while the IRIW support was introduced by JDK-8029101 in 2014.
> Nobody has cleaned these things up, yet.

Ah I see - makes more sense now. I wonder if there are other ifdef'd 
memory barriers that might need to be cleaned up ...

Thanks,
David

> Best regards,
> Martin
> 


From kim.barrett at oracle.com  Sat Aug 17 04:48:55 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sat, 17 Aug 2019 00:48:55 -0400
Subject: RFR (S): 8227442: Make young_index_in_cset zero-based
In-Reply-To: <60e6e3c6-efe7-7f5c-5f60-15b531cac76f@oracle.com>
References: <60e6e3c6-efe7-7f5c-5f60-15b531cac76f@oracle.com>
Message-ID: <DC6171F2-612E-4542-B6E2-2A8394788438@oracle.com>

> On Aug 7, 2019, at 6:39 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi all,
> 
>  can I have reviews for this refactoring that changes the minimum index for the young indices (used for determining survivors per young region) from -1 to 0?
> 
> This avoids some imho unnecessary increment in the copy_to_survivor_space() method.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8227442
> Webrev:
> http://cr.openjdk.java.net/~tschatzl/8227442/webrev/
> Testing:
> hs-tier1-5 almost done with no issues
> 
> Thanks,
>  Thomas
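
One way to read the description quoted above: if "not a young region" is
encoded as -1, every access to a per-region table needs an index + 1; if it
is encoded as 0 and young regions are numbered from 1, the stored value can be
used directly. The sketch below illustrates only that idea. All names are
hypothetical, and this is not the code in the webrev.

```cpp
#include <cstddef>
#include <vector>

// Hypothetical table of surviving words: slot 0 is for "not a young region",
// slots 1..n are for the n young regions.
struct SurvivingWords {
  std::vector<std::size_t> slots;

  explicit SurvivingWords(std::size_t young_regions)
      : slots(young_regions + 1, 0) {}

  // Old scheme: indices run from -1 (not young) to n-1, so every access
  // has to shift the index by one.
  void add_minus_one_based(int young_index, std::size_t words) {
    slots[static_cast<std::size_t>(young_index + 1)] += words;
  }

  // New scheme: indices run from 0 (not young) to n, no adjustment needed.
  void add_zero_based(unsigned young_index, std::size_t words) {
    slots[young_index] += words;
  }
};
```

Both schemes fill the same slots; the zero-based one simply drops the
increment from the hot path.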

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegion.hpp
 578     assert(_surv_rate_group != NULL, "pre-condition" );
and several others

If you are going to the bother of removing leading whitespace from
these asserts, why not also do the trailing whitespace?

------------------------------------------------------------------------------ 
src/hotspot/share/gc/g1/heapRegion.hpp
 572     assert(index != 0, "just checking");
 573     assert((index == 0) || is_young(), "pre-condition" );

`index == 0` check on 573 is not useful with the preceding check.

------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1ParScanThreadState.cpp

In G1ParScanThreadState::copy_to_survivor_space:

[existing and retained, though modified for +1 removal]
 225   HeapRegion* const from_region = _g1h->heap_region_containing(old);
 226 
 227   const int young_index = from_region->young_index_in_cset();

[added later in the same function]
 280     HeapRegion* const from_region = _g1h->heap_region_containing(old);
 281     const uint young_index = from_region->young_index_in_cset();

I'm assuming the new addition was to put young_index closer to the
scope where it is actually used?  I think the earlier declaration is
now unused except by the immediately following assert.

If it's important to have that assert up front, the associated
from_region and young_index ought to be debug-only too.

I'd prefer not having these nested declarations for the same variables.

------------------------------------------------------------------------------


From thomas.schatzl at oracle.com  Mon Aug 19 08:15:12 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 19 Aug 2019 10:15:12 +0200
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <DB8PR02MB582083391827FA63F02235FB9AAC0@DB8PR02MB5820.eurprd02.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <BN6PR18MB0946D4C8E696226FBC191032D2AC0@BN6PR18MB0946.namprd18.prod.outlook.com>
 <DB8PR02MB582083391827FA63F02235FB9AAC0@DB8PR02MB5820.eurprd02.prod.outlook.com>
Message-ID: <12fb4ffa-ee1f-7883-74ac-5a6b579b4c14@oracle.com>

Hi,

On 15.08.19 18:28, Doerr, Martin wrote:
> Hi Derek,
> 
> thanks for pointing me to the aarch64 issue and paper.
> I have added a link to that issue.
> 
>> The changes around CPU_NOT_MULTIPLE_COPY_ATOMIC in
>> src/hotspot/share/utilities/globalDefinitions.hpp are not simply a change to
>> GC behavior for S390, but are changing interpreter and compiler behavior
>> on arm32 (and maybe AArch64?).
> 
> My initial webrev is only a functional change for s390.
> 
> Current implementation:
> - CPU_NOT_MULTIPLE_COPY_ATOMIC is only used to control the following:
>    support_IRIW_for_not_multiple_copy_atomic_cpu: only true on PPC64
> - TaskQueue uses fence on all platforms except SPARC and x86
> 
> My webrev:
> - Still:
>    support_IRIW_for_not_multiple_copy_atomic_cpu: only true on PPC64
> - TaskQueue uses fence on all platforms except SPARC, x86 and s390

Maybe the comments above the define seem superfluous: they do not add 
more information. Same for the "we don't have this property" comments. 
Otherwise I would prefer something like "<platform> is [not] multiple 
copy atomic", but that would only repeat the code just below.

I would tend to just remove the comments; I do not need to see another 
webrev for just such a comment removal.

Not sure if making support_IRIW_for_not_multiple_copy_atomic_cpu 
another define makes the code better.

Also thanks for moving the other ideas pointed out to separate RFRs.

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Mon Aug 19 09:43:55 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 19 Aug 2019 11:43:55 +0200
Subject: G1GC is not able reclaim bytes
In-Reply-To: <CY4PR19MB14772263A32B99446C81A93CC6D20@CY4PR19MB1477.namprd19.prod.outlook.com>
References: <CY4PR19MB14772263A32B99446C81A93CC6D20@CY4PR19MB1477.namprd19.prod.outlook.com>
Message-ID: <994b71be-de12-df4b-6ff2-1243a372a4f6@oracle.com>

Hi,

   (cc'ed hotspot-gc-use as it is the appropriate mailing list for such 
questions).

On 13.08.19 17:01, shang xinli wrote:
> There are two connected full GC but they are not able to reclaim bytes. Here are gc logs.
> 
> 
>   [Times: user=0.35 sys=0.00, real=0.03 secs]
> 
> 2019-08-01T15:12:14.594+0000: 755123.074: [Full GC (Allocation Failure)  132G->132G(180G), 37.4525886 secs]
> 
>     [Eden: 0.0B(9216.0M)->0.0B(9216.0M) Survivors: 0.0B->0.0B Heap: 132.1G(180.0G)->132.1G(180.0G)], [Metaspace: 204550K->204459K(229376K)]
> 
>   [Times: user=68.14 sys=0.07, real=37.45 secs]
> 
> 2019-08-01T15:12:52.048+0000: 755160.527: [Full GC (Allocation Failure)  132G->132G(180G), 37.4584894 secs]
> 
>     [Eden: 0.0B(9216.0M)->0.0B(9216.0M) Survivors: 0.0B->0.0B Heap: 132.1G(180.0G)->132.1G(180.0G)], [Metaspace: 204459K->204451K(229376K)]
> 
>   [Times: user=68.38 sys=0.05, real=37.46 secs]
> 
> Here is the chart. We can see before and after GC, the heap is almost the same.
> [image: chart of heap usage before and after GC; not preserved in the archive]
> 
> 
> The service OOMs eventually, but we don't configure to generate heap dump. Is there a way to investigate why?
> 
>
The reason for the full GCs might be humongous objects that use up lots of 
additional space, given that you get full GCs at 132 GB out of 180.

So does your application happen to use many - and I mean many - large 
objects (~16M in size)?

Can you add -XX:+UnlockExperimentalVMOptions -XX:G1LogLevel=finest to 
your command line? (I assume you use some JDK 8 given the log output, but 
I am not sure.)

Of particular interest are the following lines.

[Humongous Register: x.x ms]
   [Humongous Total: y]
   [Humongous Candidate: z]

Where y shows you the number of live humongous objects at the time of 
the GC(s).

If that value is rather high, we found the issue.

The reason is that objects that have a total size that is just above the 
humongous object threshold always use up full regions. This can waste a 
lot of space.

That threshold is >= half a region size, i.e. 16M in your case, since 
regions are 32M at that heap size by default. Such a humongous object 
wastes the other half of the region. Remember that an object adds a 
small header, so an object that has exactly 16M of usable data is 
actually 16M + a little bit in size.

If possible, try to lower your object size a bit. Then the waste will be 
much smaller in any case.
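
A back-of-the-envelope sketch of that arithmetic, assuming 32M regions and
the rule of thumb above (an object of at least half a region is humongous and
occupies whole regions). The struct and function names are made up for
illustration, the 16-byte header in the usage note is just an assumed figure,
and this is a simplified model rather than G1's actual code.

```cpp
#include <cassert>
#include <cstddef>

const std::size_t M = 1024 * 1024;
const std::size_t kRegionBytes = 32 * M;  // assumed region size for this heap

struct HumongousFootprint {
  bool        is_humongous;
  std::size_t regions;       // whole regions occupied (0 if not humongous)
  std::size_t wasted_bytes;  // unused tail of the last region
};

// Simplified model: objects below half a region are allocated normally,
// everything else is rounded up to whole regions.
inline HumongousFootprint humongous_footprint(std::size_t object_bytes) {
  assert(object_bytes > 0 && "object size must be positive");
  if (object_bytes < kRegionBytes / 2) {
    return HumongousFootprint{false, 0, 0};
  }
  std::size_t regions = (object_bytes + kRegionBytes - 1) / kRegionBytes;
  return HumongousFootprint{true, regions, regions * kRegionBytes - object_bytes};
}
```

Under these assumptions, humongous_footprint(16 * M + 16) reports one region
with almost 16M unused, while humongous_footprint(15 * M) is not humongous at
all.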

The only other options I can see are increasing the heap or using a 
different collector.

Thanks,
   Thomas


From thomas.schatzl at oracle.com  Mon Aug 19 13:46:20 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Mon, 19 Aug 2019 15:46:20 +0200
Subject: RFR (S): 8227442: Make young_index_in_cset zero-based
In-Reply-To: <DC6171F2-612E-4542-B6E2-2A8394788438@oracle.com>
References: <60e6e3c6-efe7-7f5c-5f60-15b531cac76f@oracle.com>
 <DC6171F2-612E-4542-B6E2-2A8394788438@oracle.com>
Message-ID: <4d95e3b6-fba0-6760-5cc5-fc9c01595c7c@oracle.com>

Hi Kim,

   thanks for your review.

On 17.08.19 06:48, Kim Barrett wrote:
>> On Aug 7, 2019, at 6:39 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>>
>> Hi all,
>>
>>   can I have reviews for this refactoring that changes the minimum index for the young indices (used for determining survivors per young region) from -1 to 0?
>>
>> This avoids some imho unnecessary increment in the copy_to_survivor_space() method.
>>
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8227442
>> Webrev:
>> http://cr.openjdk.java.net/~tschatzl/8227442/webrev/
>> Testing:
>> hs-tier1-5 almost done with no issues
>>
>> Thanks,
>>   Thomas
> 
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegion.hpp
>   578     assert(_surv_rate_group != NULL, "pre-condition" );
> and several others
> 
> If you are going to the bother of removing leading whitespace from
> these asserts, why not also do the trailing whitespace?

Done.

> 
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegion.hpp
>   572     assert(index != 0, "just checking");
>   573     assert((index == 0) || is_young(), "pre-condition" );
> 
> `index == 0` check on 573 is not useful with the preceding check.

Removed.

> 
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
> 
> In G1ParScanThreadState::copy_to_survivor_space:
> 
> [existing and retained, though modified for +1 removal]
>   225   HeapRegion* const from_region = _g1h->heap_region_containing(old);
>   226
>   227   const int young_index = from_region->young_index_in_cset();
> 
> [added later in the same function]
>   280     HeapRegion* const from_region = _g1h->heap_region_containing(old);
>   281     const uint young_index = from_region->young_index_in_cset();
> 
> I'm assuming the new addition was to put young_index closer to the
> scope where it is actually used?  I think the earlier declaration is
> now unused except by the immediately following assert.
> 
> If it's important to have that assert up front, the associated
> from_region and young_index ought to be debug-only too.
> 
> I'd prefer not having these nested declarations for the same variables.

Removed the first occurrence.

Sorry for these issues, which stem from me moving around this change in 
my patch queue :(

http://cr.openjdk.java.net/~tschatzl/8227442/webrev.0_to_1/ (diff)
http://cr.openjdk.java.net/~tschatzl/8227442/webrev.1/ (full)

Passes hs-tier1-3

Thanks,
   Thomas


From stefan.karlsson at oracle.com  Mon Aug 19 14:11:31 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Mon, 19 Aug 2019 16:11:31 +0200
Subject: RFR: 8229839: Break circular dependency between oop.inline.hpp and
 markWord.inline.hpp
Message-ID: <17f6973a-3545-149a-9a55-5b54588caf04@oracle.com>

Hi all,

Please review this patch to break the circular dependency between 
oop.inline.hpp and markWord.inline.hpp.

http://cr.openjdk.java.net/~stefank/8229839/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8229839

The patch removes the call to oopDesc::klass() from markWord.inline.hpp. 
This is done by passing in the klass from callers to the different 
markWord::must_be_preserved functions.

Some of the paths inside markWord::must_be_preserved don't need the 
klass, and calling oopDesc::klass() in those cases would be wasteful. To 
prevent this, I changed the code to allow the callers to provide a 
KlassProxy that can resolve to a const Klass* when and if a Klass is 
needed. I'm not sure if this is needed or not, but I didn't want to 
pessimise the code by introducing new calls to oopDesc::klass().
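
As a rough illustration of the proxy idea described above (a guess at the
general shape, not the code in the webrev): the check is templated on a
proxy type, and the klass is only resolved on the path that needs it. All
types and names below are hypothetical stand-ins for the HotSpot classes.

```cpp
// Hypothetical stand-ins, not the HotSpot classes.
struct Klass {
  bool needs_preserving;
};

struct Obj {
  const Klass* k;
  mutable int klass_loads;  // counts how often the klass was fetched
  const Klass* klass() const { ++klass_loads; return k; }
};

// Proxy that defers the klass() load until someone asks for it.
struct LazyKlassProxy {
  const Obj* obj;
  const Klass* get() const { return obj->klass(); }
};

// Proxy for callers that already hold the Klass.
struct EagerKlassProxy {
  const Klass* k;
  const Klass* get() const { return k; }
};

// Simplified check: only the slow path resolves the klass through the proxy.
template <typename KlassProxyT>
bool must_be_preserved_sketch(bool mark_is_trivial, KlassProxyT proxy) {
  if (mark_is_trivial) {
    return false;  // fast path: the klass is never loaded
  }
  return proxy.get()->needs_preserving;
}
```

Callers that already have the klass can pass EagerKlassProxy; callers that do
not can pass LazyKlassProxy and pay for the load only on the slow path.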

I also took the opportunity to consolidate and remove some code 
duplication in must_be_preserved functions. This could of course be 
split into a separate patch if that's requested.

Testing done locally. Will run tier123.

Thanks,
StefanK


From martin.doerr at sap.com  Mon Aug 19 15:43:43 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 19 Aug 2019 15:43:43 +0000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <381f185e-ca2e-50c4-fe35-1e5e62ff88f6@oracle.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <7035ccb8-000c-3a58-b5ac-fb0a3b949784@oracle.com>
 <DB8PR02MB5820A602CE4151359659D18A9AAF0@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <381f185e-ca2e-50c4-fe35-1e5e62ff88f6@oracle.com>
Message-ID: <DB8PR02MB5820A9D3D815B26DAE38E5649AA80@DB8PR02MB5820.eurprd02.prod.outlook.com>

Hi everybody,

thanks a lot for reviewing it, Thomas, David and Derek!

@Derek & Andrew:
I've created a webrev which provides full control for platform maintainers:
http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-atomic/webrev.01/

It also allows making the IRIW stuff switchable in platform code which was desired by
https://bugs.openjdk.java.net/browse/JDK-8165058

As an example, I've added an experimental switch for PPC64 "SupportIRIW" which was already on Michihiro's wish list (CC'ed).

So if you prefer this implementation for arm/aarch64, too, we can go forward with webrev.01.

This webrev doesn't contain functional changes except for s390 (taskqueue) and PPC64 (experimental switch added).
Both platforms are maintained by us, anyway.

I'll request reviews from runtime and compiler mailing lists later.

Best regards,
Martin


From kim.barrett at oracle.com  Mon Aug 19 18:34:22 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 19 Aug 2019 14:34:22 -0400
Subject: RFR (S): 8227442: Make young_index_in_cset zero-based
In-Reply-To: <4d95e3b6-fba0-6760-5cc5-fc9c01595c7c@oracle.com>
References: <60e6e3c6-efe7-7f5c-5f60-15b531cac76f@oracle.com>
 <DC6171F2-612E-4542-B6E2-2A8394788438@oracle.com>
 <4d95e3b6-fba0-6760-5cc5-fc9c01595c7c@oracle.com>
Message-ID: <6FEB8B70-C1E8-4A2B-BA3C-7282884BF41C@oracle.com>

> On Aug 19, 2019, at 9:46 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> http://cr.openjdk.java.net/~tschatzl/8227442/webrev.0_to_1/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8227442/webrev.1/ (full)
> 
> Passes hs-tier1-3
> 
> Thanks,
>  Thomas

Looks good.


From tprintezis at twitter.com  Mon Aug 19 18:37:59 2019
From: tprintezis at twitter.com (Tony Printezis)
Date: Mon, 19 Aug 2019 14:37:59 -0400
Subject: RFR: 8229839: Break circular dependency between oop.inline.hpp
 and markWord.inline.hpp
In-Reply-To: <17f6973a-3545-149a-9a55-5b54588caf04@oracle.com>
References: <17f6973a-3545-149a-9a55-5b54588caf04@oracle.com>
Message-ID: <CAOzU2ik=w6a+q7wOPfy4sOcYQ6rp6iT85_=PhTmNtmDTAGMvmA@mail.gmail.com>

Hey Stefan,

This looks good. Would it be helpful to introduce:

inline bool oopDesc::mark_must_be_preserved() const {
  return mark_must_be_preserved(mark_raw());
}

for the cases where you don't already have the mark word? But I only saw of
places where that's the case...

Tony

-----
Tony Printezis | @TonyPrintezis | tprintezis at twitter.com


On August 19, 2019 at 10:21:58 AM, Stefan Karlsson (
stefan.karlsson at oracle.com) wrote:

Hi all,

Please review this patch to break the circular dependency between
oop.inline.hpp and markWord.inline.hpp.

http://cr.openjdk.java.net/~stefank/8229839/webrev.01/
https://bugs.openjdk.java.net/browse/JDK-8229839

The patch removes the call to oopDesc::klass() from markWord.inline.hpp.
This is done by passing in the klass from callers to the different
markWord::must_be_preserved functions.

Some of the paths inside markWord::must_be_preserved don't need the
klass, and calling oopDesc::klass() in those cases would be wasteful. To
prevent this, I changed the code to allow the callers to provide a
KlassProxy that can resolve to a const Klass* when and if a Klass is
needed. I'm not sure if this is needed or not, but I didn't want to
pessimise the code by introducing new calls to oopDesc::klass().

I also took the opportunity to consolidate and remove some code
duplication in must_be_preserved functions. This could of course be
split into a separate patch if that's requested.

Testing done locally. Will run tier123.

Thanks,
StefanK


From sangheon.kim at oracle.com  Tue Aug 20 04:14:34 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Mon, 19 Aug 2019 21:14:34 -0700
Subject: RFR (S): 8227442: Make young_index_in_cset zero-based
In-Reply-To: <4d95e3b6-fba0-6760-5cc5-fc9c01595c7c@oracle.com>
References: <60e6e3c6-efe7-7f5c-5f60-15b531cac76f@oracle.com>
 <DC6171F2-612E-4542-B6E2-2A8394788438@oracle.com>
 <4d95e3b6-fba0-6760-5cc5-fc9c01595c7c@oracle.com>
Message-ID: <9479b16d-3d17-a258-3138-d66bf4bc7e33@oracle.com>

Hi Thomas,

On 8/19/19 6:46 AM, Thomas Schatzl wrote:
> Hi Kim,
>
>   thanks for your review.
>
> On 17.08.19 06:48, Kim Barrett wrote:
>>> On Aug 7, 2019, at 6:39 AM, Thomas Schatzl 
>>> <thomas.schatzl at oracle.com> wrote:
>>>
>>> Hi all,
>>>
>>>   can I have reviews for this refactoring that changes the minimum 
>>> index for the young indices (used for determining survivors per 
>>> young region) from -1 to 0?
>>>
>>> This avoids some imho unnecessary increment in the 
>>> copy_to_survivor_space() method.
>>>
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8227442
>>> Webrev:
>>> http://cr.openjdk.java.net/~tschatzl/8227442/webrev/
>>> Testing:
>>> hs-tier1-5 almost done with no issues
>>>
>>> Thanks,
>>>   Thomas
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/heapRegion.hpp
>>   578     assert(_surv_rate_group != NULL, "pre-condition" );
>> and several others
>>
>> If you are going to the bother of removing leading whitespace from
>> these asserts, why not also do the trailing whitespace?
>
> Done.
>
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/heapRegion.hpp
>>   572     assert(index != 0, "just checking");
>>   573     assert((index == 0) || is_young(), "pre-condition" );
>>
>> `index == 0` check on 573 is not useful with the preceding check.
>
> Removed.
>
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1ParScanThreadState.cpp
>>
>> In G1ParScanThreadState::copy_to_survivor_space:
>>
>> [existing and retained, though modified for +1 removal]
>>   225   HeapRegion* const from_region = _g1h->heap_region_containing(old);
>>   226
>>   227   const int young_index = from_region->young_index_in_cset();
>>
>> [added later in the same function]
>>   280     HeapRegion* const from_region = _g1h->heap_region_containing(old);
>>   281     const uint young_index = from_region->young_index_in_cset();
>>
>> I'm assuming the new addition was to put young_index closer to the
>> scope where it is actually used?  I think the earlier declaration is
>> now unused except by the immediately following assert.
>>
>> If it's important to have that assert up front, the associated
>> from_region and young_index ought to be debug-only too.
>>
>> I'd prefer not having these nested declarations for the same variables.
>
> Removed the first occurrence.
>
> Sorry for these issues, which stem from me moving around this change 
> in my patch queue :(
>
> http://cr.openjdk.java.net/~tschatzl/8227442/webrev.0_to_1/ (diff)
> http://cr.openjdk.java.net/~tschatzl/8227442/webrev.1/ (full)
webrev.1 looks good.

One minor nit:
src/hotspot/share/gc/g1/heapRegion.hpp

618 assert( _age_index == -1, "pre-condition");

- Still there is a whitespace before '_age_index'. Please remove it 
before the push.

Thanks,
Sangheon


>
> Passes hs-tier1-3
>
> Thanks,
>   Thomas


From thomas.schatzl at oracle.com  Tue Aug 20 08:08:26 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Tue, 20 Aug 2019 10:08:26 +0200
Subject: RFR (S): 8227442: Make young_index_in_cset zero-based
In-Reply-To: <9479b16d-3d17-a258-3138-d66bf4bc7e33@oracle.com>
References: <60e6e3c6-efe7-7f5c-5f60-15b531cac76f@oracle.com>
 <DC6171F2-612E-4542-B6E2-2A8394788438@oracle.com>
 <4d95e3b6-fba0-6760-5cc5-fc9c01595c7c@oracle.com>
 <9479b16d-3d17-a258-3138-d66bf4bc7e33@oracle.com>
Message-ID: <a0906443addfab6d0b063d637eeb84a514c4e5b4.camel@oracle.com>

Hi Kim, Sangheon,

On Mon, 2019-08-19 at 21:14 -0700, sangheon.kim at oracle.com wrote:
> Hi Thomas,
> 
> On 8/19/19 6:46 AM, Thomas Schatzl wrote:
> > Hi Kim, 
> > 
> >   thanks for your review. 
> > 
> > On 17.08.19 06:48, Kim Barrett wrote: 
> > > > On Aug 7, 2019, at 6:39 AM, Thomas Schatzl <
> > > > thomas.schatzl at oracle.com> wrote: 
> > > > 
> > > > Hi all, 
> > > > 
> > > >   can I have reviews for this refactoring that changes the
> > > > minimum index for the young indices (used for determining
> > > > survivors per young region) from -1 to 0? 
> > > > 
> > > > This avoids some imho unnecessary increment in the
> > > > copy_to_survivor_space() method. 
> > > > 
> > > > [...]
> >  
> > http://cr.openjdk.java.net/~tschatzl/8227442/webrev.0_to_1/ (diff) 
> > http://cr.openjdk.java.net/~tschatzl/8227442/webrev.1/ (full) 
>  webrev.1 looks good.
> 
> One minor nit:
> src/hotspot/share/gc/g1/heapRegion.hpp
>  618       assert( _age_index == -1, "pre-condition");
> - Still there is a whitespace before '_age_index'. Please remove it
> before the push.

  thanks for your reviews. Pushed with this suggestion.

Thanks,
  Thomas


From per.liden at oracle.com  Tue Aug 20 08:43:40 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 20 Aug 2019 10:43:40 +0200
Subject: RFR: 8229839: Break circular dependency between oop.inline.hpp
 and markWord.inline.hpp
In-Reply-To: <17f6973a-3545-149a-9a55-5b54588caf04@oracle.com>
References: <17f6973a-3545-149a-9a55-5b54588caf04@oracle.com>
Message-ID: <31e087f5-7940-eef3-d422-f2f3bae99ea4@oracle.com>

Looks good!

/Per

On 2019-08-19 16:11, Stefan Karlsson wrote:
> Hi all,
> 
> Please review this patch to break the circular dependency between 
> oop.inline.hpp and markWord.inline.hpp.
> 
> http://cr.openjdk.java.net/~stefank/8229839/webrev.01/
> https://bugs.openjdk.java.net/browse/JDK-8229839
> 
> The patch removes the call to oopDesc::klass() from markWord.inline.hpp. 
> This is done by passing in the klass from callers to the different 
> markWord::must_be_preserved functions.
> 
> Some of the paths inside markWord::must_be_preserved don't need the 
> klass, and calling oopDesc::klass() in those cases would be wasteful. To 
> prevent this, I changed the code to allow the callers to provide a 
> KlassProxy that can resolve to a const Klass* when and if a Klass is 
> needed. I'm not sure if this is needed or not, but I didn't want to 
> pessimise the code by introducing new calls to oopDesc::klass().
> 
> I also took the opportunity to consolidate and remove some code 
> duplication in must_be_preserved functions. This could of course be 
> split into a separate patch if that's requested.
> 
> Testing done locally. Will run tier123.
> 
> Thanks,
> StefanK
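
To make the KlassProxy idea quoted above concrete, here is a minimal
sketch. It is not the code from the webrev: FakeKlass, FakeOop, LazyKlass,
KnownKlass, the needs_mark_preserved flag and the "default mark" value are
all hypothetical stand-ins. It only shows how a caller-supplied proxy lets
the mark-word check avoid loading the klass on paths that do not need it.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for illustration; these are not the HotSpot types.
struct FakeKlass {
  bool needs_mark_preserved;   // assumed per-class property
};

struct FakeOop {
  std::uintptr_t mark;
  const FakeKlass* klass_ptr;
  mutable int klass_loads;     // counts how often klass() was called

  FakeOop(std::uintptr_t m, const FakeKlass* k)
    : mark(m), klass_ptr(k), klass_loads(0) {}

  const FakeKlass* klass() const { ++klass_loads; return klass_ptr; }
};

// Proxy that defers the klass load until get() is called.
class LazyKlass {
  const FakeOop* _obj;
public:
  explicit LazyKlass(const FakeOop* obj) : _obj(obj) {}
  const FakeKlass* get() const { return _obj->klass(); }
};

// Proxy for callers that already hold the klass.
class KnownKlass {
  const FakeKlass* _klass;
public:
  explicit KnownKlass(const FakeKlass* klass) : _klass(klass) {}
  const FakeKlass* get() const { return _klass; }
};

// The mark-word side never calls back into the object. The klass comes from
// the caller through the proxy and is resolved only on the slow path.
template <typename KlassProxyT>
bool must_be_preserved(std::uintptr_t mark, KlassProxyT klass) {
  const std::uintptr_t default_mark = 1;      // hypothetical "nothing to save" value
  if (mark != default_mark) {
    return true;                              // fast path: klass never touched
  }
  return klass.get()->needs_mark_preserved;   // slow path: resolve the klass now
}
```

A caller without the klass at hand passes LazyKlass(&obj), and a caller
that already loaded it passes KnownKlass(k). Either way the mark-word
code no longer needs to include the oop header.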


From stefan.karlsson at oracle.com  Tue Aug 20 08:54:56 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 20 Aug 2019 10:54:56 +0200
Subject: RFR: 8229839: Break circular dependency between oop.inline.hpp
 and markWord.inline.hpp
In-Reply-To: <CAOzU2ik=w6a+q7wOPfy4sOcYQ6rp6iT85_=PhTmNtmDTAGMvmA@mail.gmail.com>
References: <17f6973a-3545-149a-9a55-5b54588caf04@oracle.com>
 <CAOzU2ik=w6a+q7wOPfy4sOcYQ6rp6iT85_=PhTmNtmDTAGMvmA@mail.gmail.com>
Message-ID: <a39c519c-68a6-9c98-99f5-51fac188bb2e@oracle.com>

Hi Tony,

On 2019-08-19 20:37, Tony Printezis wrote:
> Hey Stefan,
> 
> This looks good. Would it be helpful to introduce:
> 
> inline bool oopDesc::mark_must_be_preserved() const {
>   return mark_must_be_preserved(mark_raw());
> }
> 
> for the cases where you don't already have the mark word? But I only saw 
> of places where that's the case...

Thanks for reviewing!

I've added the function you suggested, and updated the places where this 
could be used:
  http://cr.openjdk.java.net/~stefank/8229839/webrev.02.delta/
  http://cr.openjdk.java.net/~stefank/8229839/webrev.02/

It passes tier1-3

Thanks,
StefanK

> 
> Tony
> 
> -----
> Tony Printezis | @TonyPrintezis | tprintezis at twitter.com 
> <mailto:tprintezis at twitter.com>
> 
> 
> On August 19, 2019 at 10:21:58 AM, Stefan Karlsson 
> (stefan.karlsson at oracle.com <mailto:stefan.karlsson at oracle.com>) wrote:
> 
>> Hi all,
>>
>> Please review this patch to break the circular dependency between
>> oop.inline.hpp and markWord.inline.hpp.
>>
>> http://cr.openjdk.java.net/~stefank/8229839/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8229839
>>
>> The patch removes the call to oopDesc::klass() from markWord.inline.hpp.
>> This is done by passing in the klass from callers to the different
>> markWord::must_be_preserved functions.
>>
>> Some of the paths inside markWord::must_be_preserved don't need the
>> klass, and calling oopDesc::klass() in those cases would be wasteful. To
>> prevent this, I changed the code to allow the callers to provide a
>> KlassProxy that can resolve to a const Klass* when and if a Klass is
>> needed. I'm not sure if this is needed or not, but I didn't want to
>> pessimise the code by introducing new calls to oopDesc::klass().
>>
>> I also took the opportunity to consolidate and remove some code
>> duplication in must_be_preserved functions. This could of course be
>> split into a separate patch if that's requested.
>>
>> Testing done locally. Will run tier123.
>>
>> Thanks,
>> StefanK


From stefan.karlsson at oracle.com  Tue Aug 20 08:56:03 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 20 Aug 2019 10:56:03 +0200
Subject: RFR: 8229839: Break circular dependency between oop.inline.hpp
 and markWord.inline.hpp
In-Reply-To: <31e087f5-7940-eef3-d422-f2f3bae99ea4@oracle.com>
References: <17f6973a-3545-149a-9a55-5b54588caf04@oracle.com>
 <31e087f5-7940-eef3-d422-f2f3bae99ea4@oracle.com>
Message-ID: <86bf5c53-03b4-8b16-b356-4ec17f1288f3@oracle.com>

Thanks Per.

As you might have seen in my reply to Tony, I've updated the patch slightly:
  http://cr.openjdk.java.net/~stefank/8229839/webrev.02.delta/
  http://cr.openjdk.java.net/~stefank/8229839/webrev.02/

Thanks,
StefanK

On 2019-08-20 10:43, Per Liden wrote:
> Looks good!
> 
> /Per
> 
> On 2019-08-19 16:11, Stefan Karlsson wrote:
>> Hi all,
>>
>> Please review this patch to break the circular dependency between 
>> oop.inline.hpp and markWord.inline.hpp.
>>
>> http://cr.openjdk.java.net/~stefank/8229839/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8229839
>>
>> The patch removes the call to oopDesc::klass() from 
>> markWord.inline.hpp. This is done by passing in the klass from callers 
>> to the different markWord::must_be_preserved functions.
>>
>> Some of the paths inside markWord::must_be_preserved don't need the 
>> klass, and calling oopDesc::klass() in those cases would be wasteful. 
>> To prevent this, I changed the code to allow the callers to provide a 
>> KlassProxy that can resolve to a const Klass* when and if a Klass is 
>> needed. I'm not sure if this is needed or not, but I didn't want to 
>> pessimise the code by introducing new calls to oopDesc::klass().
>>
>> I also took the opportunity to consolidate and remove some code 
>> duplication in must_be_preserved functions. This could of course be 
>> split into a separate patch if that's requested.
>>
>> Testing done locally. Will run tier123.
>>
>> Thanks,
>> StefanK


From per.liden at oracle.com  Tue Aug 20 10:07:24 2019
From: per.liden at oracle.com (Per Liden)
Date: Tue, 20 Aug 2019 12:07:24 +0200
Subject: RFR: 8229839: Break circular dependency between oop.inline.hpp
 and markWord.inline.hpp
In-Reply-To: <86bf5c53-03b4-8b16-b356-4ec17f1288f3@oracle.com>
References: <17f6973a-3545-149a-9a55-5b54588caf04@oracle.com>
 <31e087f5-7940-eef3-d422-f2f3bae99ea4@oracle.com>
 <86bf5c53-03b4-8b16-b356-4ec17f1288f3@oracle.com>
Message-ID: <64afaf27-cfac-1a38-5c00-131dc5c21171@oracle.com>

Still looks good.

cheers,
Per

On 2019-08-20 10:56, Stefan Karlsson wrote:
> Thanks Per.
> 
> As you might have seen in my reply to Tony, I've updated the patch 
> slightly:
>   http://cr.openjdk.java.net/~stefank/8229839/webrev.02.delta/
>   http://cr.openjdk.java.net/~stefank/8229839/webrev.02/
> 
> Thanks,
> StefanK
> 
> On 2019-08-20 10:43, Per Liden wrote:
>> Looks good!
>>
>> /Per
>>
>> On 2019-08-19 16:11, Stefan Karlsson wrote:
>>> Hi all,
>>>
>>> Please review this patch to break the circular dependency between 
>>> oop.inline.hpp and markWord.inline.hpp.
>>>
>>> http://cr.openjdk.java.net/~stefank/8229839/webrev.01/
>>> https://bugs.openjdk.java.net/browse/JDK-8229839
>>>
>>> The patch removes the call to oopDesc::klass() from 
>>> markWord.inline.hpp. This is done by passing in the klass from 
>>> callers to the different markWord::must_be_preserved functions.
>>>
>>> Some of the paths inside markWord::must_be_preserved don't need the 
>>> klass, and calling oopDesc::klass() in those cases would be wasteful. 
>>> To prevent this, I changed the code to allow the callers to provide a 
>>> KlassProxy that can resolve to a const Klass* when and if a Klass is 
>>> needed. I'm not sure if this is needed or not, but I didn't want to 
>>> pessimise the code by introducing new calls to oopDesc::klass().
>>>
>>> I also took the opportunity to consolidate and remove some code 
>>> duplication in must_be_preserved functions. This could of course be 
>>> split into a separate patch if that's requested.
>>>
>>> Testing done locally. Will run tier123.
>>>
>>> Thanks,
>>> StefanK


From tprintezis at twitter.com  Tue Aug 20 12:46:46 2019
From: tprintezis at twitter.com (Tony Printezis)
Date: Tue, 20 Aug 2019 08:46:46 -0400
Subject: RFR: 8229839: Break circular dependency between oop.inline.hpp
 and markWord.inline.hpp
In-Reply-To: <a39c519c-68a6-9c98-99f5-51fac188bb2e@oracle.com>
References: <17f6973a-3545-149a-9a55-5b54588caf04@oracle.com>
 <CAOzU2ik=w6a+q7wOPfy4sOcYQ6rp6iT85_=PhTmNtmDTAGMvmA@mail.gmail.com>
 <a39c519c-68a6-9c98-99f5-51fac188bb2e@oracle.com>
Message-ID: <CAOzU2inicxR7Hy2F9uZKVi0_Pah7qHAJGvZyqz3YQzsEv4hmwQ@mail.gmail.com>

Yep, looks good - thanks!


-----
Tony Printezis | @TonyPrintezis | tprintezis at twitter.com


On August 20, 2019 at 4:55:05 AM, Stefan Karlsson (
stefan.karlsson at oracle.com) wrote:

Hi Tony,

On 2019-08-19 20:37, Tony Printezis wrote:
> Hey Stefan,
>
> This looks good. Would it be helpful to introduce:
>
> inline bool oopDesc::mark_must_be_preserved() const {
>   return mark_must_be_preserved(mark_raw());
> }
>
> for the cases where you don't already have the mark word? But I only saw
> of places where that's the case...

Thanks for reviewing!

I've added the function you suggested, and updated the places where this
could be used:
http://cr.openjdk.java.net/~stefank/8229839/webrev.02.delta/
http://cr.openjdk.java.net/~stefank/8229839/webrev.02/

It passes tier1-3

Thanks,
StefanK

>
> Tony
>
> -----
> Tony Printezis | @TonyPrintezis | tprintezis at twitter.com
> <mailto:tprintezis at twitter.com>
>
>
> On August 19, 2019 at 10:21:58 AM, Stefan Karlsson
> (stefan.karlsson at oracle.com <mailto:stefan.karlsson at oracle.com>) wrote:
>
>> Hi all,
>>
>> Please review this patch to break the circular dependency between
>> oop.inline.hpp and markWord.inline.hpp.
>>
>> http://cr.openjdk.java.net/~stefank/8229839/webrev.01/
>> https://bugs.openjdk.java.net/browse/JDK-8229839
>>
>> The patch removes the call to oopDesc::klass() from markWord.inline.hpp.
>> This is done by passing in the klass from callers to the different
>> markWord::must_be_preserved functions.
>>
>> Some of the paths inside markWord::must_be_preserved don't need the
>> klass, and calling oopDesc::klass() in those cases would be wasteful. To
>> prevent this, I changed the code to allow the callers to provide a
>> KlassProxy that can resolve to a const Klass* when and if a Klass is
>> needed. I'm not sure if this is needed or not, but I didn't want to
>> pessimise the code by introducing new calls to oopDesc::klass().
>>
>> I also took the opportunity to consolidate and remove some code
>> duplication in must_be_preserved functions. This could of course be
>> split into a separate patch if that's requested.
>>
>> Testing done locally. Will run tier123.
>>
>> Thanks,
>> StefanK


From rkennke at redhat.com  Tue Aug 20 13:15:40 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Tue, 20 Aug 2019 15:15:40 +0200
Subject: RFR: JDK-8229921: Shenandoah: Make Traversal mode non-experimental
Message-ID: <b93b814e-fdfe-12b9-e78c-753756e6b64f@redhat.com>

Currently, Shenandoah's Traversal mode is experimental. We should make
it a product mode/heuristic instead.

This is mostly a symbolic change, because Shenandoah is behind
experimental anyway.

http://cr.openjdk.java.net/~rkennke/JDK-8229921/webrev.00/

Testing: hotspot_gc_shenandoah

Ok?

Roman


From shade at redhat.com  Tue Aug 20 13:35:37 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Tue, 20 Aug 2019 15:35:37 +0200
Subject: RFR: JDK-8229921: Shenandoah: Make Traversal mode non-experimental
In-Reply-To: <b93b814e-fdfe-12b9-e78c-753756e6b64f@redhat.com>
References: <b93b814e-fdfe-12b9-e78c-753756e6b64f@redhat.com>
Message-ID: <da2d4ada-74d9-ce37-0e64-0b19cbe95d00@redhat.com>

On 8/20/19 3:15 PM, Roman Kennke wrote:
> Currently, Shenandoah's Traversal mode is experimental. We should make
> it a product mode/heuristic instead.
> 
> This is mostly a symbolic change, because Shenandoah is behind
> experimental anyway.
> 
> http://cr.openjdk.java.net/~rkennke/JDK-8229921/webrev.00/

OK!

-- 
Thanks,
-Aleksey


From zgu at redhat.com  Tue Aug 20 15:41:45 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 20 Aug 2019 11:41:45 -0400
Subject: RFR 8229923: Shenandoah: Fix JVM selections for Shenandoah critical
 native tests
Message-ID: <48fae518-1806-f258-8d66-cb961aeadbbe@redhat.com>

A couple of Shenandoah critical native tests are only valid on a 64-bit
JVM on x86. The os.arch selector alone is not enough to guarantee that,
so a vm.bits selector needs to be added.

Bug: https://bugs.openjdk.java.net/browse/JDK-8229923
Webrev: http://cr.openjdk.java.net/~zgu/JDK-8229923/webrev.00/

Test:
   32-bit JVM on x86_64 Linux
   64-bit JVM on x86_64 Linux
   64-bit JVM on aarch64 Linux

Thanks,

-Zhengyu


From zgu at redhat.com  Tue Aug 20 19:10:40 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Tue, 20 Aug 2019 15:10:40 -0400
Subject: RFR 8229923: Shenandoah: Fix JVM selections for Shenandoah
 critical native tests
In-Reply-To: <48fae518-1806-f258-8d66-cb961aeadbbe@redhat.com>
References: <48fae518-1806-f258-8d66-cb961aeadbbe@redhat.com>
Message-ID: <4a3fc213-59b9-051c-cb46-d18c10084fe0@redhat.com>

I had an offline conversation with Aleksey. We can tolerate failures on 
x86_32 until JDK-8229919. Therefore, I would like to withdraw this RFR.

Thanks,

-Zhengyu

On 8/20/19 11:41 AM, Zhengyu Gu wrote:
> A couple of Shenandoah critical native tests are only valid on a 64-bit 
> JVM on x86. The os.arch selector alone is not enough to guarantee that, 
> so a vm.bits selector needs to be added.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8229923
> Webrev: http://cr.openjdk.java.net/~zgu/JDK-8229923/webrev.00/
> 
> Test:
>    32-bit JVM on x86_64 Linux
>    64-bit JVM on x86_64 Linux
>    64-bit JVM on aarch64 Linux
> 
> Thanks,
> 
> -Zhengyu


From per.liden at oracle.com  Wed Aug 21 13:43:25 2019
From: per.liden at oracle.com (Per Liden)
Date: Wed, 21 Aug 2019 15:43:25 +0200
Subject: RFR: 8229451: ZGC: Make some roots invisible to the heap iterator
In-Reply-To: <dd0a4ef5-a545-224b-998d-ca3ca05a75ef@oracle.com>
References: <dfba9c11-7812-ce9a-e49e-9d20ed7e5785@oracle.com>
 <EE263944-8EF1-4C89-81FE-D56410B66950@oracle.com>
 <dd0a4ef5-a545-224b-998d-ca3ca05a75ef@oracle.com>
Message-ID: <2c853a9b-879f-1b04-86df-c795a0d6e4aa@oracle.com>

Returned to this patch and did some more testing, and realized that the 
UnhandledOops checker will complain about the raw oop. Adjusted the code 
to avoid that.

Diff: http://cr.openjdk.java.net/~pliden/8229451/webrev.1-diff
Full: http://cr.openjdk.java.net/~pliden/8229451/webrev.1

Testing: Passed tier1-3

cheers,
Per

On 8/14/19 10:27 AM, Per Liden wrote:
> Thanks Erik!
> 
> I agree that another path in ZRootsIterator is unfortunate, but the 
> alternatives I've managed to come up with tend to be worse.
> 
> /Per
> 
> On 8/14/19 9:57 AM, Erik Osterlund wrote:
>> Hi Per,
>>
>> Unfortunate with another special path in the root iterator. But 
>> alternatives also look bad. Looks good.
>>
>> Thanks,
>> /Erik
>>
>>> On 13 Aug 2019, at 10:42, Per Liden <per.liden at oracle.com> wrote:
>>>
>>> JDK-8227226 can temporarily create long[] objects on the heap, which 
>>> later become oop arrays, when the array initialization has been 
>>> completed. This is fine from a sampling/reporting point of view (the 
>>> things done in the MemAllocator::Allocation destructor), since that 
>>> only happens after the final klass pointer has been installed. 
>>> However, if a heap iteration (via ZHeapIterator) happens before the 
>>> final klass pointer has been installed, it will then see the long[]. 
>>> As far as I can tell, this isn't a big deal, unless that heap 
>>> iteration is out to JVMTI-tag all long[] instances. In that case, we 
>>> tag a long[] which will later become an oop array, which seems wrong 
>>> and potentially problematic. To avoid this, we want to be able to 
>>> hide these roots from the heap iterator until the final klass pointer 
>>> has been installed.
>>>
>>> The approach here is that these temporary long[] objects are not kept 
>>> alive in a Handle, but instead treated as a special root in 
>>> ZThreadLocalData, that can optionally be made invisible to the 
>>> ZRootsIterator.
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229451
>>> Webrev: http://cr.openjdk.java.net/~pliden/8229451/webrev.0
>>>
>>> /Per
>>
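
The mechanism described in the quoted RFR can be sketched roughly as
follows. This is an illustration under stated assumptions, not the ZGC
code: FakeObject, FakeThreadData, FakeRootsIterator and
count_visited_roots are hypothetical names. The idea shown is that an
object still being initialized lives in a per-thread slot rather than a
Handle, and the root iterator takes a flag saying whether to report
that slot.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Hypothetical stand-ins; not the ZGC types.
struct FakeObject {
  bool klass_finalized;          // false while the array is still a long[] placeholder
};

// Per-thread slot holding an object whose final klass is not installed yet.
struct FakeThreadData {
  FakeObject* invisible_root;
  FakeThreadData() : invisible_root(nullptr) {}
};

class FakeRootsIterator {
  bool _visit_invisible;         // GC passes true, a heap iterator passes false
public:
  explicit FakeRootsIterator(bool visit_invisible)
    : _visit_invisible(visit_invisible) {}

  template <typename Closure>
  void roots_do(std::vector<FakeObject*>& ordinary_roots,
                std::vector<FakeThreadData*>& threads,
                Closure cl) const {
    for (FakeObject*& root : ordinary_roots) {
      cl(&root);                 // ordinary roots are always reported
    }
    if (_visit_invisible) {
      for (FakeThreadData* thread : threads) {
        if (thread->invisible_root != nullptr) {
          cl(&thread->invisible_root);   // special path for in-progress objects
        }
      }
    }
  }
};

// Small driver: one ordinary root plus one in-progress object on one thread.
inline std::size_t count_visited_roots(bool visit_invisible) {
  FakeObject finished = { true };
  FakeObject in_progress = { false };

  std::vector<FakeObject*> ordinary_roots;
  ordinary_roots.push_back(&finished);

  FakeThreadData thread;
  thread.invisible_root = &in_progress;
  std::vector<FakeThreadData*> threads;
  threads.push_back(&thread);

  std::size_t visited = 0;
  FakeRootsIterator iter(visit_invisible);
  iter.roots_do(ordinary_roots, threads,
                [&visited](FakeObject**) { ++visited; });
  return visited;
}
```

In this sketch the collector would construct the iterator with true so
the in-progress object stays alive, while a JVMTI-style heap walk would
pass false and never see the temporary long[].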


From rkennke at redhat.com  Wed Aug 21 16:10:37 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Wed, 21 Aug 2019 18:10:37 +0200
Subject: RFR: JDK-8229977: Shenandoah: save/restore FPU state aroud LRB
 runtime call
Message-ID: <f1e90dc0-b688-de9f-1c45-190cd3727316@redhat.com>

Nightlies show failures in CAS/CAE related jcstress tests, and they only
seem to affect float/double variants.
The root cause is JDK-8228369 which removed
save_vector_registers()/restore_vector_registers() around the LRB
runtime call. It turns out that we actually need them, because the LRB
stub is used by C1/C2 CAS intrinsics.

Bug:
https://bugs.openjdk.java.net/browse/JDK-8229977

Instead of re-introducing save/restore_vector_registers(), my proposed
change uses push/pop_FPU_state() which uses whatever the platform
supports (e.g. fxsave/fxrestore) to push the whole FPU state on stack. A
little quirk is the requirement on 16-byte-alignment of the stack, hence
the extra setup code for this.
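
The alignment part can be pictured with plain address arithmetic. The
following is only a sketch of the idea, not the assembler code in the
webrev: align_down_16, AlignedFrame, enter_aligned and leave_aligned are
hypothetical names, and the stack is modelled as an integer address that
grows downwards.

```cpp
#include <cassert>
#include <cstdint>

// Round a stack address down to the next 16-byte boundary. The stack is
// assumed to grow towards lower addresses, so rounding down only reserves
// extra space and never overlaps the caller's data.
inline std::uintptr_t align_down_16(std::uintptr_t sp) {
  return sp & ~static_cast<std::uintptr_t>(15);
}

// Remember the original stack pointer so it can be restored afterwards.
struct AlignedFrame {
  std::uintptr_t saved_sp;
  std::uintptr_t aligned_sp;
};

// "Setup": save the incoming sp, then switch to an aligned one before the
// FPU state would be pushed.
inline AlignedFrame enter_aligned(std::uintptr_t sp) {
  AlignedFrame frame;
  frame.saved_sp = sp;
  frame.aligned_sp = align_down_16(sp);
  return frame;
}

// "Teardown": after the FPU state would be popped, go back to the saved sp.
inline std::uintptr_t leave_aligned(const AlignedFrame& frame) {
  return frame.saved_sp;
}
```

The extra state (the saved stack pointer) is why the save/restore pair
needs a few more instructions than a bare push and pop.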

Testing:
The failing tests are passing with this change, and
hotspot_gc_shenandoah is happy too.

Webrev:
http://cr.openjdk.java.net/~rkennke/JDK-8229977/webrev.02/

Ok?

Roman


From shade at redhat.com  Wed Aug 21 17:08:05 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Wed, 21 Aug 2019 19:08:05 +0200
Subject: RFR: JDK-8229977: Shenandoah: save/restore FPU state aroud LRB
 runtime call
In-Reply-To: <f1e90dc0-b688-de9f-1c45-190cd3727316@redhat.com>
References: <f1e90dc0-b688-de9f-1c45-190cd3727316@redhat.com>
Message-ID: <54b16c27-3dec-2a53-efa6-7d49467f0ca9@redhat.com>

On 8/21/19 6:10 PM, Roman Kennke wrote:
> Webrev:
> http://cr.openjdk.java.net/~rkennke/JDK-8229977/webrev.02/

This line looks misaligned:

1002   #ifdef _LP64

Otherwise good.

-- 
Thanks,
-Aleksey


From erik.osterlund at oracle.com  Thu Aug 22 08:03:19 2019
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Thu, 22 Aug 2019 10:03:19 +0200
Subject: RFR: 8229451: ZGC: Make some roots invisible to the heap iterator
In-Reply-To: <2c853a9b-879f-1b04-86df-c795a0d6e4aa@oracle.com>
References: <dfba9c11-7812-ce9a-e49e-9d20ed7e5785@oracle.com>
 <EE263944-8EF1-4C89-81FE-D56410B66950@oracle.com>
 <dd0a4ef5-a545-224b-998d-ca3ca05a75ef@oracle.com>
 <2c853a9b-879f-1b04-86df-c795a0d6e4aa@oracle.com>
Message-ID: <3cb906bc-bfba-cea1-af26-1e0bbccbb267@oracle.com>

Hi Per,

Looks reasonable.

/Erik

On 2019-08-21 15:43, Per Liden wrote:
> Returned to this patch and did some more testing, and realized that 
> the UnhandledOops checker will complain about the raw oop. Adjusted 
> the code to avoid that.
>
> Diff: http://cr.openjdk.java.net/~pliden/8229451/webrev.1-diff
> Full: http://cr.openjdk.java.net/~pliden/8229451/webrev.1
>
> Testing: Passed tier1-3
>
> cheers,
> Per
>
> On 8/14/19 10:27 AM, Per Liden wrote:
>> Thanks Erik!
>>
>> I agree that another path in ZRootsIterator is unfortunate, but the 
>> alternatives I've managed to come up with tend to be worse.
>>
>> /Per
>>
>> On 8/14/19 9:57 AM, Erik Osterlund wrote:
>>> Hi Per,
>>>
>>> Unfortunate with another special path in the root iterator. But 
>>> alternatives also look bad. Looks good.
>>>
>>> Thanks,
>>> /Erik
>>>
>>>> On 13 Aug 2019, at 10:42, Per Liden <per.liden at oracle.com> wrote:
>>>>
>>>> JDK-8227226 can temporarily create long[] objects on the heap, 
>>>> which later become oop arrays, when the array initialization has 
>>>> been completed. This is fine from a sampling/reporting point of 
>>>> view (the things done in the MemAllocator::Allocation destructor), 
>>>> since that only happens after the final klass pointer has been 
>>>> installed. However, if a heap iteration (via ZHeapIterator) happens 
>>>> before the final klass pointer has been installed, it will then see 
>>>> the long[]. As far as I can tell, this isn't a big deal, unless 
>>>> that heap iteration is out to JVMTI-tag all long[] instances. In 
>>>> that case, we tag a long[] which will later become an oop array, 
>>>> which seems wrong and potentially problematic. To avoid this, we 
>>>> want to be able to hide these roots from the heap iterator until 
>>>> the final klass pointer has been installed.
>>>>
>>>> The approach here is that these temporary long[] objects are not 
>>>> kept alive in a Handle, but instead treated as a special root in 
>>>> ZThreadLocalData, that can optionally be made invisible to the 
>>>> ZRootsIterator.
>>>>
>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229451
>>>> Webrev: http://cr.openjdk.java.net/~pliden/8229451/webrev.0
>>>>
>>>> /Per
>>>


From per.liden at oracle.com  Thu Aug 22 08:52:55 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 22 Aug 2019 10:52:55 +0200
Subject: RFR: 8229451: ZGC: Make some roots invisible to the heap iterator
In-Reply-To: <3cb906bc-bfba-cea1-af26-1e0bbccbb267@oracle.com>
References: <dfba9c11-7812-ce9a-e49e-9d20ed7e5785@oracle.com>
 <EE263944-8EF1-4C89-81FE-D56410B66950@oracle.com>
 <dd0a4ef5-a545-224b-998d-ca3ca05a75ef@oracle.com>
 <2c853a9b-879f-1b04-86df-c795a0d6e4aa@oracle.com>
 <3cb906bc-bfba-cea1-af26-1e0bbccbb267@oracle.com>
Message-ID: <609c18e2-f87a-0923-aa28-35d46b1b9889@oracle.com>

Thanks Erik!

/Per

On 8/22/19 10:03 AM, Erik Österlund wrote:
> Hi Per,
> 
> Looks reasonable.
> 
> /Erik
> 
> On 2019-08-21 15:43, Per Liden wrote:
>> Returned to this patch and did some more testing, and realized that 
>> the UnhandledOops checker will complain about the raw oop. Adjusted 
>> the code to avoid that.
>>
>> Diff: http://cr.openjdk.java.net/~pliden/8229451/webrev.1-diff
>> Full: http://cr.openjdk.java.net/~pliden/8229451/webrev.1
>>
>> Testing: Passed tier1-3
>>
>> cheers,
>> Per
>>
>> On 8/14/19 10:27 AM, Per Liden wrote:
>>> Thanks Erik!
>>>
>>> I agree that another path in ZRootsIterator is unfortunate, but the 
>>> alternatives I've managed to come up with tend to be worse.
>>>
>>> /Per
>>>
>>> On 8/14/19 9:57 AM, Erik Osterlund wrote:
>>>> Hi Per,
>>>>
>>>> Unfortunate with another special path in the root iterator. But 
>>>> alternatives also look bad. Looks good.
>>>>
>>>> Thanks,
>>>> /Erik
>>>>
>>>>> On 13 Aug 2019, at 10:42, Per Liden <per.liden at oracle.com> wrote:
>>>>>
>>>>> JDK-8227226 can temporarily create long[] objects on the heap, 
>>>>> which later become oop arrays, when the array initialization has 
>>>>> been completed. This is fine from a sampling/reporting point of 
>>>>> view (the things done in the MemAllocator::Allocation destructor), 
>>>>> since that only happens after the final klass pointer has been 
>>>>> installed. However, if a heap iteration (via ZHeapIterator) happens 
>>>>> before the final klass pointer has been installed, it will then see 
>>>>> the long[]. As far as I can tell, this isn't a big deal, unless 
>>>>> that heap iteration is out to JVMTI-tag all long[] instances. In 
>>>>> that case, we tag a long[] which will later become an oop array, 
>>>>> which seems wrong and potentially problematic. To avoid this, we 
>>>>> want to be able to hide these roots from the heap iterator until 
>>>>> the final klass pointer has been installed.
>>>>>
>>>>> The approach here is that these temporary long[] objects are not 
>>>>> kept alive in a Handle, but instead treated as a special root in 
>>>>> ZThreadLocalData, that can optionally be made invisible to the 
>>>>> ZRootsIterator.
>>>>>
>>>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8229451
>>>>> Webrev: http://cr.openjdk.java.net/~pliden/8229451/webrev.0
>>>>>
>>>>> /Per
>>>>
> 


From shade at redhat.com  Thu Aug 22 10:29:01 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 22 Aug 2019 12:29:01 +0200
Subject: RFR (S) 8230024: Shenandoah: remove unnecessary
 ShenandoahTimingConverter
Message-ID: <06718016-c6d2-14ae-a1af-39b510a794e5@redhat.com>

RFE:
  https://bugs.openjdk.java.net/browse/JDK-8230024

Webrev:
  https://cr.openjdk.java.net/~shade/8230024/webrev.01/

This is not used, and it is the source of many build failures when WeakProcessor and OopStorage
shapes change a lot. Should be removed for clarity. This blocks the fix for JDK-8229998. I believe
this one should go in separately to be backportable.

Testing: hotspot_gc_shenandoah {fastdebug|release}, grepping around the source code

-- 
Thanks,
-Aleksey


From shade at redhat.com  Thu Aug 22 10:29:12 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 22 Aug 2019 12:29:12 +0200
Subject: RFR (XS) 8229998: Build failure after JDK-8227054
Message-ID: <8c7a83c0-1426-a179-828b-2448aaa0183b@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8229998

Fix:
  https://cr.openjdk.java.net/~shade/8229998/webrev.01/

This is caused by rewiring for new OopStorage accessors. I think I got the new mappings right.
The other part of the build fix is JDK-8230024, which I do separately as backportable change.

Testing: hotspot_gc_shenandoah {fastdebug,release}

-- 
Thanks,
-Aleksey


From rkennke at redhat.com  Thu Aug 22 10:31:33 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 22 Aug 2019 12:31:33 +0200
Subject: RFR (S) 8230024: Shenandoah: remove unnecessary
 ShenandoahTimingConverter
In-Reply-To: <06718016-c6d2-14ae-a1af-39b510a794e5@redhat.com>
References: <06718016-c6d2-14ae-a1af-39b510a794e5@redhat.com>
Message-ID: <da86cd44-3da6-2e6d-e2f4-77b5c38e5a30@redhat.com>

Looks ok to me. Let Zhengyu also look at it.

Thanks,
Roman


> RFE:
>   https://bugs.openjdk.java.net/browse/JDK-8230024
> 
> Webrev:
>   https://cr.openjdk.java.net/~shade/8230024/webrev.01/
> 
> This is not used, and it is the source of many build failures when WeakProcessor and OopStorage
> shapes change a lot. Should be removed for clarity. This blocks the fix for JDK-8229998. I believe
> this one should go in separately to be backportable.
> 
> Testing: hotspot_gc_shenandoah {fastdebug|release}, grepping around the source code
> 


From rkennke at redhat.com  Thu Aug 22 10:31:44 2019
From: rkennke at redhat.com (Roman Kennke)
Date: Thu, 22 Aug 2019 12:31:44 +0200
Subject: RFR (XS) 8229998: Build failure after JDK-8227054
In-Reply-To: <8c7a83c0-1426-a179-828b-2448aaa0183b@redhat.com>
References: <8c7a83c0-1426-a179-828b-2448aaa0183b@redhat.com>
Message-ID: <9f653a02-8719-10bd-3bd0-215096d8c228@redhat.com>

Looks ok to me. Let Zhengyu also look at it.

Thanks,
Roman

> Bug:
>   https://bugs.openjdk.java.net/browse/JDK-8229998
> 
> Fix:
>   https://cr.openjdk.java.net/~shade/8229998/webrev.01/
> 
> This is caused by rewiring for new OopStorage accessors. I think I got the new mappings correctly.
> The other part of the build fix is JDK-8230024, which I do separately as backportable change.
> 
> Testing: hotspot_gc_shenandoah {fastdebug,release}
> 


From zgu at redhat.com  Thu Aug 22 11:06:25 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 22 Aug 2019 07:06:25 -0400
Subject: RFR (S) 8230024: Shenandoah: remove unnecessary
 ShenandoahTimingConverter
In-Reply-To: <06718016-c6d2-14ae-a1af-39b510a794e5@redhat.com>
References: <06718016-c6d2-14ae-a1af-39b510a794e5@redhat.com>
Message-ID: <e49f8b3c-17f6-777c-7564-c78d65fb6908@redhat.com>

Good to me.

Thanks,

-Zhengyu

On 8/22/19 6:29 AM, Aleksey Shipilev wrote:
> RFE:
>    https://bugs.openjdk.java.net/browse/JDK-8230024
> 
> Webrev:
>    https://cr.openjdk.java.net/~shade/8230024/webrev.01/
> 
> This is not used, and it is the source of many build failures when WeakProcessor and OopStorage
> shapes change a lot. Should be removed for clarity. This blocks the fix for JDK-8229998. I believe
> this one should go in separately to be backportable.
> 
> Testing: hotspot_gc_shenandoah {fastdebug|release}, grepping around the source code
> 


From zgu at redhat.com  Thu Aug 22 11:08:22 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 22 Aug 2019 07:08:22 -0400
Subject: RFR (XS) 8229998: Build failure after JDK-8227054
In-Reply-To: <9f653a02-8719-10bd-3bd0-215096d8c228@redhat.com>
References: <8c7a83c0-1426-a179-828b-2448aaa0183b@redhat.com>
 <9f653a02-8719-10bd-3bd0-215096d8c228@redhat.com>
Message-ID: <30e2353e-6df8-94aa-49be-63ba28e1f8e3@redhat.com>

Good to me too.

Thanks,

-Zhengyu

On 8/22/19 6:31 AM, Roman Kennke wrote:
> Looks ok to me. Let Zhengyu also look at it.
> 
> Thanks,
> Roman
> 
>> Bug:
>>    https://bugs.openjdk.java.net/browse/JDK-8229998
>>
>> Fix:
>>    https://cr.openjdk.java.net/~shade/8229998/webrev.01/
>>
>> This is caused by rewiring for new OopStorage accessors. I think I got the new mappings correctly.
>> The other part of the build fix is JDK-8230024, which I do separately as backportable change.
>>
>> Testing: hotspot_gc_shenandoah {fastdebug,release}
>>
> 


From kim.barrett at oracle.com  Thu Aug 22 17:54:42 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 22 Aug 2019 13:54:42 -0400
Subject: RFR (XS) 8229998: Build failure after JDK-8227054
In-Reply-To: <8c7a83c0-1426-a179-828b-2448aaa0183b@redhat.com>
References: <8c7a83c0-1426-a179-828b-2448aaa0183b@redhat.com>
Message-ID: <A69E6915-FE83-4422-A9A4-7E766BF1DDEE@oracle.com>

> On Aug 22, 2019, at 6:29 AM, Aleksey Shipilev <shade at redhat.com> wrote:
> 
> Bug:
>  https://bugs.openjdk.java.net/browse/JDK-8229998
> 
> Fix:
>  https://cr.openjdk.java.net/~shade/8229998/webrev.01/
> 
> This is caused by rewiring for new OopStorage accessors. I think I got the new mappings correctly.
> The other part of the build fix is JDK-8230024, which I do separately as backportable change.
> 
> Testing: hotspot_gc_shenandoah {fastdebug,release}
> 
> -- 
> Thanks,
> -Aleksey

Drat!  Sorry about that.  I even had a Shenandoah update for an earlier version of
the change, but dropped the ball on that with the later revision.

Your change looks good.


From shade at redhat.com  Thu Aug 22 18:01:06 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 22 Aug 2019 20:01:06 +0200
Subject: RFR (XS) 8229998: Build failure after JDK-8227054
In-Reply-To: <A69E6915-FE83-4422-A9A4-7E766BF1DDEE@oracle.com>
References: <8c7a83c0-1426-a179-828b-2448aaa0183b@redhat.com>
 <A69E6915-FE83-4422-A9A4-7E766BF1DDEE@oracle.com>
Message-ID: <97f6dad3-84d3-83c1-ef66-ead9531075d1@redhat.com>

On 8/22/19 7:54 PM, Kim Barrett wrote:
>> On Aug 22, 2019, at 6:29 AM, Aleksey Shipilev <shade at redhat.com> wrote:
>>
>> Bug:
>>  https://bugs.openjdk.java.net/browse/JDK-8229998
>>
>> Fix:
>>  https://cr.openjdk.java.net/~shade/8229998/webrev.01/
>>
>> This is caused by rewiring for new OopStorage accessors. I think I got the new mappings correctly.
>> The other part of the build fix is JDK-8230024, which I do separately as backportable change.
>>
>> Testing: hotspot_gc_shenandoah {fastdebug,release}
>>
>> -- 
>> Thanks,
>> -Aleksey
> 
> Drat!  Sorry about that.  I even had a Shenandoah update for an earlier version of
> the change, but dropped the ball on that with the later revision.
> 
> Your change looks good.

Thanks, we have already pushed it :)

-- 
Thanks,
-Aleksey


From shade at redhat.com  Thu Aug 22 18:17:50 2019
From: shade at redhat.com (Aleksey Shipilev)
Date: Thu, 22 Aug 2019 20:17:50 +0200
Subject: RFR (XS) 8230046: Build failure after JDK-8230003
Message-ID: <8c9a892b-7e1b-af95-4b43-9208268bae33@redhat.com>

Bug:
  https://bugs.openjdk.java.net/browse/JDK-8230046

Fix:

diff -r 01d9a1cff83a src/hotspot/share/gc/shenandoah/shenandoahAsserts.cpp
--- a/src/hotspot/share/gc/shenandoah/shenandoahAsserts.cpp     Thu Aug 22 18:54:56 2019 +0100
+++ b/src/hotspot/share/gc/shenandoah/shenandoahAsserts.cpp     Thu Aug 22 20:12:05 2019 +0200
@@ -364,5 +364,5 @@
 }

-void ShenandoahAsserts::assert_locked_or_shenandoah_safepoint(const Monitor* lock, const char* file, int line) {
+void ShenandoahAsserts::assert_locked_or_shenandoah_safepoint(const Mutex* lock, const char* file, int line) {
   if (ShenandoahSafepoint::is_at_shenandoah_safepoint()) {
     return;

diff -r 01d9a1cff83a src/hotspot/share/gc/shenandoah/shenandoahAsserts.hpp
--- a/src/hotspot/share/gc/shenandoah/shenandoahAsserts.hpp     Thu Aug 22 18:54:56 2019 +0100
+++ b/src/hotspot/share/gc/shenandoah/shenandoahAsserts.hpp     Thu Aug 22 20:12:05 2019 +0200
@@ -67,5 +67,5 @@
   static void assert_rp_isalive_installed(const char *file, int line);

-  static void assert_locked_or_shenandoah_safepoint(const Monitor* lock, const char*file, int line);
+  static void assert_locked_or_shenandoah_safepoint(const Mutex* lock, const char*file, int line);

 #ifdef ASSERT


Testing: build, tier1_gc_shenandoah {fastdebug,release}

-- 
Thanks,
-Aleksey


From zgu at redhat.com  Thu Aug 22 18:20:04 2019
From: zgu at redhat.com (Zhengyu Gu)
Date: Thu, 22 Aug 2019 14:20:04 -0400
Subject: RFR (XS) 8230046: Build failure after JDK-8230003
In-Reply-To: <8c9a892b-7e1b-af95-4b43-9208268bae33@redhat.com>
References: <8c9a892b-7e1b-af95-4b43-9208268bae33@redhat.com>
Message-ID: <bfa12ed8-0857-0eeb-7855-cc6dea5744fb@redhat.com>

Looks good.

Thanks,

-Zhengyu

On 8/22/19 2:17 PM, Aleksey Shipilev wrote:
> Bug:
>    https://bugs.openjdk.java.net/browse/JDK-8230046
> 
> Fix:
> 
> diff -r 01d9a1cff83a src/hotspot/share/gc/shenandoah/shenandoahAsserts.cpp
> --- a/src/hotspot/share/gc/shenandoah/shenandoahAsserts.cpp     Thu Aug 22 18:54:56 2019 +0100
> +++ b/src/hotspot/share/gc/shenandoah/shenandoahAsserts.cpp     Thu Aug 22 20:12:05 2019 +0200
> @@ -364,5 +364,5 @@
>   }
> 
> -void ShenandoahAsserts::assert_locked_or_shenandoah_safepoint(const Monitor* lock, const char* file, int line) {
> +void ShenandoahAsserts::assert_locked_or_shenandoah_safepoint(const Mutex* lock, const char* file, int line) {
>     if (ShenandoahSafepoint::is_at_shenandoah_safepoint()) {
>       return;
> 
> diff -r 01d9a1cff83a src/hotspot/share/gc/shenandoah/shenandoahAsserts.hpp
> --- a/src/hotspot/share/gc/shenandoah/shenandoahAsserts.hpp     Thu Aug 22 18:54:56 2019 +0100
> +++ b/src/hotspot/share/gc/shenandoah/shenandoahAsserts.hpp     Thu Aug 22 20:12:05 2019 +0200
> @@ -67,5 +67,5 @@
>     static void assert_rp_isalive_installed(const char *file, int line);
> 
> -  static void assert_locked_or_shenandoah_safepoint(const Monitor* lock, const char*file, int line);
> +  static void assert_locked_or_shenandoah_safepoint(const Mutex* lock, const char*file, int line);
> 
>   #ifdef ASSERT
> 
> 
> Testing: build, tier1_gc_shenandoah {fastdebug,release}
> 


From per.liden at oracle.com  Fri Aug 23 08:42:08 2019
From: per.liden at oracle.com (Per Liden)
Date: Fri, 23 Aug 2019 10:42:08 +0200
Subject: RFR: 8230090: ZGC: Introduce ZSyscall
Message-ID: <e65f2412-a307-462f-801a-68e8bf9577b0@oracle.com>

Move the raw syscalls done in ZBackingFile into a class that can be 
shared by all CPUs on Linux. Only the SYS_<number> macros need to be CPU 
specific. This paves the way for JDK-8230092, where I consolidate 
ZBackingFile, ZBackingPath and ZPhysicalMemoryBacking, which today are 
duplicated on linux_x86 and linux_aarch64.
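
The split described above can be pictured with a minimal sketch. This is
not the code in the webrev: the class name SketchSyscall and the choice of
memfd_create as the wrapped call are assumptions made for illustration.
The only per-CPU piece is the syscall number, and it is needed only when
<sys/syscall.h> does not already provide it.

```cpp
#include <cassert>
#include <sys/syscall.h>
#include <unistd.h>

// CPU-specific part: only the syscall number differs between platforms.
// The fallback numbers are used only if <sys/syscall.h> does not already
// define SYS_memfd_create.
#ifndef SYS_memfd_create
#if defined(__x86_64__)
#define SYS_memfd_create 319
#elif defined(__aarch64__)
#define SYS_memfd_create 279
#else
#error "SYS_memfd_create is not known for this CPU"
#endif
#endif

// Shared part: one wrapper class for every Linux CPU.
// SketchSyscall is a hypothetical name, not the class from the webrev.
class SketchSyscall {
public:
  // Returns a file descriptor on success, or -1 with errno set on failure.
  static int memfd_create(const char* name, unsigned int flags) {
    assert(name != nullptr);
    return static_cast<int>(::syscall(SYS_memfd_create, name, flags));
  }
};
```

With this shape, code shared across CPUs calls SketchSyscall::memfd_create()
and never mentions a CPU, which is what would let the callers move out of
the os_cpu directories.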

Bug: https://bugs.openjdk.java.net/browse/JDK-8230090
Webrev: http://cr.openjdk.java.net/~pliden/8230090/webrev.0

Testing: Builds on linux_x86 and linux_aarch64, passes dacapo

/Per


From per.liden at oracle.com  Fri Aug 23 08:42:14 2019
From: per.liden at oracle.com (Per Liden)
Date: Fri, 23 Aug 2019 10:42:14 +0200
Subject: RFR: 8230092: ZGC: Consolidate ZBackingFile, ZBackingPath and
 ZPhysicalMemoryBacking on Linux
Message-ID: <3eb978e9-928b-f421-5127-5f64ddeb93fd@oracle.com>

After JDK-8230090, we can move ZBackingFile, ZBackingPath and 
ZPhysicalMemoryBacking to os/linux/gc/z rather than having them 
duplicated in os_cpu/linux_{x86,aarch64}/gc/z.

Bug: https://bugs.openjdk.java.net/browse/JDK-8230092
Webrev: http://cr.openjdk.java.net/~pliden/8230092/webrev.0

Testing: Builds on linux_x86 and linux_aarch64, passes dacapo

/Per


From per.liden at oracle.com  Fri Aug 23 11:14:33 2019
From: per.liden at oracle.com (Per Liden)
Date: Fri, 23 Aug 2019 13:14:33 +0200
Subject: RFR: 8230096: ZGC: Remove unused ZObjectAllocator::_nworkers
Message-ID: <65a4a062-b50c-0e50-278c-6df730203373@oracle.com>

ZObjectAllocator::_nworkers is unused and should be removed.

Bug: https://bugs.openjdk.java.net/browse/JDK-8230096
Webrev: http://cr.openjdk.java.net/~pliden/8230096/webrev.0

/Per


From stefan.karlsson at oracle.com  Fri Aug 23 12:29:52 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Fri, 23 Aug 2019 14:29:52 +0200
Subject: RFR: 8230090: ZGC: Introduce ZSyscall
In-Reply-To: <e65f2412-a307-462f-801a-68e8bf9577b0@oracle.com>
References: <e65f2412-a307-462f-801a-68e8bf9577b0@oracle.com>
Message-ID: <7b5521dd-7110-3ef7-2b86-f20a2ce1b9ba@oracle.com>

Looks good.

StefanK

On 2019-08-23 10:42, Per Liden wrote:
> Move the raw syscalls done in ZBackingFile into a class that can be 
> shared by all CPUs on Linux. Only the SYS_<number> macros need to be CPU 
> specific. This paves the way for JDK-8230092, where I consolidate 
> ZBackingFile, ZBackingPath and ZPhysicalMemoryBacking, which today are 
> duplicated on linux_x86 and linux_aarch64.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8230090
> Webrev: http://cr.openjdk.java.net/~pliden/8230090/webrev.0
> 
> Testing: Builds on linux_x86 and linux_aarch64, passes dacapo
> 
> /Per


From stefan.karlsson at oracle.com  Fri Aug 23 12:33:48 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Fri, 23 Aug 2019 14:33:48 +0200
Subject: RFR: 8230092: ZGC: Consolidate ZBackingFile, ZBackingPath and
 ZPhysicalMemoryBacking on Linux
In-Reply-To: <3eb978e9-928b-f421-5127-5f64ddeb93fd@oracle.com>
References: <3eb978e9-928b-f421-5127-5f64ddeb93fd@oracle.com>
Message-ID: <9da62f29-4267-6457-c04f-321971fb57ab@oracle.com>

Looks good.

StefanK

On 2019-08-23 10:42, Per Liden wrote:
> After JDK-8230090, we can move ZBackingFile, ZBackingPath and 
> ZPhysicalMemoryBacking to os/linux/gc/z rather than having them 
> duplicated in os_cpu/linux_{x86,aarch64}/gc/z.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8230092
> Webrev: http://cr.openjdk.java.net/~pliden/8230092/webrev.0
> 
> Testing: Builds on linux_x86 and linux_aarch64, passes dacapo
> 
> /Per


From stefan.karlsson at oracle.com  Fri Aug 23 12:34:47 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Fri, 23 Aug 2019 14:34:47 +0200
Subject: RFR: 8230096: ZGC: Remove unused ZObjectAllocator::_nworkers
In-Reply-To: <65a4a062-b50c-0e50-278c-6df730203373@oracle.com>
References: <65a4a062-b50c-0e50-278c-6df730203373@oracle.com>
Message-ID: <dea54194-2eb6-931b-55e2-aa2c1a36d50e@oracle.com>

Looks good.

StefanK

On 2019-08-23 13:14, Per Liden wrote:
> ZObjectAllocator::_nworkers is unused and should be removed.
> 
> Bug: https://bugs.openjdk.java.net/browse/JDK-8230096
> Webrev: http://cr.openjdk.java.net/~pliden/8230096/webrev.0
> 
> /Per


From per.liden at oracle.com  Fri Aug 23 12:37:53 2019
From: per.liden at oracle.com (Per Liden)
Date: Fri, 23 Aug 2019 14:37:53 +0200
Subject: RFR: 8230090: ZGC: Introduce ZSyscall
In-Reply-To: <7b5521dd-7110-3ef7-2b86-f20a2ce1b9ba@oracle.com>
References: <e65f2412-a307-462f-801a-68e8bf9577b0@oracle.com>
 <7b5521dd-7110-3ef7-2b86-f20a2ce1b9ba@oracle.com>
Message-ID: <d46816aa-f9e5-d7cb-6eb8-6b87161f6ed5@oracle.com>

Thanks!

/Per

On 8/23/19 2:29 PM, Stefan Karlsson wrote:
> Looks good.
> 
> StefanK
> 
> On 2019-08-23 10:42, Per Liden wrote:
>> Move the raw syscalls done in ZBackingFile into a class that can be 
>> shared by all CPUs on Linux. Only the SYS_<number> macros need to be 
>> CPU specific. This paves the way for JDK-8230092, where I 
>> consolidate ZBackingFile, ZBackingPath and ZPhysicalMemoryBacking, 
>> which today are duplicated on linux_x86 and linux_aarch64.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8230090
>> Webrev: http://cr.openjdk.java.net/~pliden/8230090/webrev.0
>>
>> Testing: Builds on linux_x86 and linux_aarch64, passes dacapo
>>
>> /Per


From per.liden at oracle.com  Fri Aug 23 12:38:01 2019
From: per.liden at oracle.com (Per Liden)
Date: Fri, 23 Aug 2019 14:38:01 +0200
Subject: RFR: 8230092: ZGC: Consolidate ZBackingFile, ZBackingPath and
 ZPhysicalMemoryBacking on Linux
In-Reply-To: <9da62f29-4267-6457-c04f-321971fb57ab@oracle.com>
References: <3eb978e9-928b-f421-5127-5f64ddeb93fd@oracle.com>
 <9da62f29-4267-6457-c04f-321971fb57ab@oracle.com>
Message-ID: <758f8eab-5e3d-1a37-0299-ae4034a92610@oracle.com>

Thanks!

/Per

On 8/23/19 2:33 PM, Stefan Karlsson wrote:
> Looks good.
> 
> StefanK
> 
> On 2019-08-23 10:42, Per Liden wrote:
>> After JDK-8230090, we can move ZBackingFile, ZBackingPath and 
>> ZPhysicalMemoryBacking to os/linux/gc/z rather than having them 
>> duplicated in os_cpu/linux_{x86,aarch64}/gc/z.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8230092
>> Webrev: http://cr.openjdk.java.net/~pliden/8230092/webrev.0
>>
>> Testing: Builds on linux_x86 and linux_aarch64, passes dacapo
>>
>> /Per


From per.liden at oracle.com  Fri Aug 23 12:38:09 2019
From: per.liden at oracle.com (Per Liden)
Date: Fri, 23 Aug 2019 14:38:09 +0200
Subject: RFR: 8230096: ZGC: Remove unused ZObjectAllocator::_nworkers
In-Reply-To: <dea54194-2eb6-931b-55e2-aa2c1a36d50e@oracle.com>
References: <65a4a062-b50c-0e50-278c-6df730203373@oracle.com>
 <dea54194-2eb6-931b-55e2-aa2c1a36d50e@oracle.com>
Message-ID: <9c6c8dc0-2f56-6d4d-d940-64ece473d371@oracle.com>

Thanks!

/Per

On 8/23/19 2:34 PM, Stefan Karlsson wrote:
> Looks good.
> 
> StefanK
> 
> On 2019-08-23 13:14, Per Liden wrote:
>> ZObjectAllocator::_nworkers is unused and should be removed.
>>
>> Bug: https://bugs.openjdk.java.net/browse/JDK-8230096
>> Webrev: http://cr.openjdk.java.net/~pliden/8230096/webrev.0
>>
>> /Per


From kim.barrett at oracle.com  Sat Aug 24 04:02:33 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sat, 24 Aug 2019 00:02:33 -0400
Subject: RFR: 8230109: G1DirtyCardQueueSet should use card counts rather than
 buffer counts 
Message-ID: <97BAE9AC-B0F6-4D6B-924C-4AB6C24777B3@oracle.com>

Please review this change to G1DirtyCardQueueSet and its clients to
consistently use card counts, rather than a mix of card counts and
buffer counts, to measure pending work and work performed.

JDK-8227719 already changed DCQS to use card counts internally, but
retained the buffer count API (estimating based on card counts and the
buffer size) to reduce the fanout from that change.  This change
removes that buffer count API and updates clients to consistently use
card counts.  It also updates some names accordingly.  For example,
*log_buffer_entry* => *logged_cards*.
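
To make the unit change concrete, here is a minimal sketch of the two
conversions involved. The function names, the saturating behavior and the
round-up estimate are assumptions for illustration, not the code in the
webrev.

```cpp
#include <cassert>
#include <cstddef>
#include <limits>

// Hypothetical helper: convert a count of buffers to a count of cards,
// saturating at the maximum value instead of overflowing.
inline std::size_t buffers_to_cards(std::size_t buffers,
                                    std::size_t cards_per_buffer) {
  assert(cards_per_buffer > 0);
  const std::size_t max = std::numeric_limits<std::size_t>::max();
  if (buffers > max / cards_per_buffer) {
    return max;
  }
  return buffers * cards_per_buffer;
}

// Hypothetical estimate in the other direction, the kind of value a
// buffer count API has to derive once only card counts are tracked.
// Rounding up is an assumption of this sketch.
inline std::size_t cards_to_buffers_estimate(std::size_t cards,
                                             std::size_t cards_per_buffer) {
  assert(cards_per_buffer > 0);
  return cards / cards_per_buffer + (cards % cards_per_buffer != 0 ? 1 : 0);
}
```

Clients that work in card counts throughout need neither function, which
is the simplification this change is after.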

There aren't any _intentional_ behavioral changes here, just unit and
nomenclature changes.

A lingering use of buffer counts is DCQS::_processed_buffers_(mut|rs_thread).
These are only used in the RemSetSummary, to print some statistics.
I'm planning to address that as part of other work.

CR:
https://bugs.openjdk.java.net/browse/JDK-8230109

Webrev:
http://cr.openjdk.java.net/~kbarrett/8230109/open.00/

Testing:
mach5 tier1-3

Manually examined log output for gcbasher to verify refinement related
values were consistent with using card units rather than buffer units.


From mandrikov at gmail.com  Sun Aug 25 19:28:07 2019
From: mandrikov at gmail.com (Evgeny Mandrikov)
Date: Sun, 25 Aug 2019 21:28:07 +0200
Subject: RFR: JDK-8215166: Remove unused G1PretouchAuxiliaryMemory option
Message-ID: <CAEPFu6_go4sfFAfuf3s79JOAMHX0Y4v7YknH-+odDTNHvZdbZQ@mail.gmail.com>

Hello!

Please review patch [1] for JDK-8215166 [2]. Also it needs a sponsor since
I have only author status in OpenJDK Census [3].

After this change, tier1 tests pass on my machine and I find no other
occurrences of "G1PretouchAuxiliaryMemory".


With best regards,
Evgeny Mandrikov

[1] http://cr.openjdk.java.net/~godin/8215166/webrev.00/
[2] https://bugs.openjdk.java.net/browse/JDK-8215166
[3] https://openjdk.java.net/census#godin


From stefan.johansson at oracle.com  Mon Aug 26 08:52:40 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 26 Aug 2019 10:52:40 +0200
Subject: RFR: 8230109: G1DirtyCardQueueSet should use card counts rather
 than buffer counts
In-Reply-To: <97BAE9AC-B0F6-4D6B-924C-4AB6C24777B3@oracle.com>
References: <97BAE9AC-B0F6-4D6B-924C-4AB6C24777B3@oracle.com>
Message-ID: <FD199B1F-9D15-4FCF-ABDA-5568E982EDBB@oracle.com>

Hi Kim,

> 24 aug. 2019 kl. 06:02 skrev Kim Barrett <kim.barrett at oracle.com>:
> 
> Please review this change to G1DirtyCardQueueSet and its clients to
> consistently use card counts, rather than a mix of card counts and
> buffer counts, to measure pending work and work performed.
> 
> JDK-8227719 already changed DCQS to use card counts internally, but
> retained the buffer count API (estimating based on card counts and the
> buffer size) to reduce the fanout from that change.  This change
> removes that buffer count API and updates clients to consistently use
> card counts.  It also updates some names accordingly.  For example,
> *log_buffer_entry* => *logged_cards*.
> 
> There aren't any _intentional_ behavioral changes here, just unit and
> nomenclature changes.
> 
> A lingering use of buffer counts is DCQS::_processed_buffers_(mut|rs_thread).
> These are only used in the RemSetSummary, to print some statistics.
> I'm planning to address that as part of other work.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230109
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8230109/open.00/

I really like the cleanups you've done in this area of the code base and this one is no exception. Looks good, just one question around the different G1ConcRefinement-flags (threshold and zones). Couldn't we make these be number of cards and get rid of the buffers_to_cards conversion in g1ConcurrentRefine.cpp?

Thanks,
Stefan 

> 
> Testing:
> mach5 tier1-3
> 
> Manually examined log output for gcbasher to verify refinement related
> values were consistent with using card units rather than buffer units.
> 


From martin.doerr at sap.com  Mon Aug 26 13:04:27 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Mon, 26 Aug 2019 13:04:27 +0000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <DB8PR02MB5820A9D3D815B26DAE38E5649AA80@DB8PR02MB5820.eurprd02.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <7035ccb8-000c-3a58-b5ac-fb0a3b949784@oracle.com>
 <DB8PR02MB5820A602CE4151359659D18A9AAF0@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <381f185e-ca2e-50c4-fe35-1e5e62ff88f6@oracle.com>
 <DB8PR02MB5820A9D3D815B26DAE38E5649AA80@DB8PR02MB5820.eurprd02.prod.outlook.com>
Message-ID: <VI1PR0201MB2479CCFC54585663AF96B1E79AA10@VI1PR0201MB2479.eurprd02.prod.outlook.com>

Hi all,

I noticed that the selection of platforms which need a fence in taskqueue.inline.hpp should be updated.

My initial webrev
http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-atomic/webrev.00/
was already reviewed on hotspot-gc-dev. It is an attempt to make things more consistent, especially the property "CPU_MULTI_COPY_ATOMIC".
Also the compiler constant "support_IRIW_for_not_multiple_copy_atomic_cpu" depends on this property (currently only used on PPC64).

We could go one step further and move even more #defines into the platform files to give platform maintainers more control.
I haven't got feedback from arm/aarch64 folks about this addition, yet:
http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-atomic/webrev.01/
With this proposal, each platform which is "CPU_MULTI_COPY_ATOMIC" is supposed to define this macro.
Other platforms must define SUPPORT_IRIW_FOR_NOT_MULTI_COPY_ATOMIC_CPU and IRIW_WITH_RELEASE_VOLATILE_IN_CONSTRUCTOR for fine-grained control of the memory ordering behavior.
We can even control them dynamically (added an experimental switch for PPC64 as an example).
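
A minimal sketch of how such a platform property can gate a fence follows.
It is not taken from either webrev: the platform list, the function names
and the use of std::atomic are assumptions for illustration. Only the
macro name CPU_MULTI_COPY_ATOMIC comes from the proposal.

```cpp
#include <atomic>
#include <cassert>

// Assumption for this sketch: a platform file defines the macro. The two
// platforms listed here are illustrative, not the list from the webrev.
#if defined(__x86_64__) || defined(__s390x__)
#define CPU_MULTI_COPY_ATOMIC
#endif

// True when two dependent loads need a full fence between them to be
// observed in order on this platform.
constexpr bool needs_fence_between_loads() {
#ifdef CPU_MULTI_COPY_ATOMIC
  return false;
#else
  return true;
#endif
}

// Loads 'first' and then 'second', adding a full fence only where the
// platform property says it is required.
inline void load_in_order(const std::atomic<unsigned>& first,
                          const std::atomic<unsigned>& second,
                          unsigned& first_value,
                          unsigned& second_value) {
  first_value = first.load(std::memory_order_acquire);
  if (needs_fence_between_loads()) {
    std::atomic_thread_fence(std::memory_order_seq_cst);
  }
  second_value = second.load(std::memory_order_acquire);
}
```

Keeping the decision in one macro per platform means shared code such as
the taskqueue does not need its own list of CPU names.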

Note that neither webrev.00 nor webrev.01 contains any functional changes other than the taskqueue update for s390 (and the experimental switch for PPC64 in webrev.01).

Feedback is welcome, including any preference between webrev.00 and webrev.01.

Best regards,
Martin


From kim.barrett at oracle.com  Mon Aug 26 15:42:29 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 26 Aug 2019 11:42:29 -0400
Subject: RFR: 8230109: G1DirtyCardQueueSet should use card counts rather
 than buffer counts
In-Reply-To: <FD199B1F-9D15-4FCF-ABDA-5568E982EDBB@oracle.com>
References: <97BAE9AC-B0F6-4D6B-924C-4AB6C24777B3@oracle.com>
 <FD199B1F-9D15-4FCF-ABDA-5568E982EDBB@oracle.com>
Message-ID: <284FD120-AEFB-4DAA-BCA4-E81803A73290@oracle.com>

> On Aug 26, 2019, at 4:52 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>> 
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8230109
>> 
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8230109/open.00/
> 
> I really like the cleanups you've done in this area of the code base and this one is no exception. Looks good, just one question around the different G1ConcRefinement-flags (threshold and zones). Couldn't we make these be number of cards and get rid of the buffers_to_cards conversion in g1ConcurrentRefine.cpp?

Those are product options; changing their semantics like that is not so easy.


From stefan.johansson at oracle.com  Mon Aug 26 18:29:15 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Mon, 26 Aug 2019 20:29:15 +0200
Subject: RFR: 8230109: G1DirtyCardQueueSet should use card counts rather
 than buffer counts
In-Reply-To: <284FD120-AEFB-4DAA-BCA4-E81803A73290@oracle.com>
References: <97BAE9AC-B0F6-4D6B-924C-4AB6C24777B3@oracle.com>
 <FD199B1F-9D15-4FCF-ABDA-5568E982EDBB@oracle.com>
 <284FD120-AEFB-4DAA-BCA4-E81803A73290@oracle.com>
Message-ID: <DD3D7A06-3D50-4C97-85C3-FA04E4DEFF05@oracle.com>


> 26 aug. 2019 kl. 17:42 skrev Kim Barrett <kim.barrett at oracle.com>:
> 
>> On Aug 26, 2019, at 4:52 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>> 
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8230109
>>> 
>>> Webrev:
>>> http://cr.openjdk.java.net/~kbarrett/8230109/open.00/
>> 
>> I really like the cleanups you've done in this area of the code base and this one is no exception. Looks good, just one question around the different G1ConcRefinement-flags (threshold and zones). Couldn't we make these be number of cards and get rid of the buffers_to_cards conversion in g1ConcurrentRefine.cpp?
> 
> Those are product options; changing their semantics like that is not so easy.
> 

True, but I think it is something we want to do in the longer run, so maybe create an enhancement to track it?

From kim.barrett at oracle.com  Mon Aug 26 20:48:40 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 26 Aug 2019 16:48:40 -0400
Subject: RFR: 8230109: G1DirtyCardQueueSet should use card counts rather
 than buffer counts
In-Reply-To: <DD3D7A06-3D50-4C97-85C3-FA04E4DEFF05@oracle.com>
References: <97BAE9AC-B0F6-4D6B-924C-4AB6C24777B3@oracle.com>
 <FD199B1F-9D15-4FCF-ABDA-5568E982EDBB@oracle.com>
 <284FD120-AEFB-4DAA-BCA4-E81803A73290@oracle.com>
 <DD3D7A06-3D50-4C97-85C3-FA04E4DEFF05@oracle.com>
Message-ID: <00AE231F-2AE6-440C-92DE-D5EC76EFFE0E@oracle.com>

> On Aug 26, 2019, at 2:29 PM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
> 
> 
> 
>> 26 aug. 2019 kl. 17:42 skrev Kim Barrett <kim.barrett at oracle.com>:
>> 
>>> On Aug 26, 2019, at 4:52 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>>> 
>>>> CR:
>>>> https://bugs.openjdk.java.net/browse/JDK-8230109
>>>> 
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~kbarrett/8230109/open.00/
>>> 
>>> I really like the cleanups you've done in this area of the code base and this one is no exception. Looks good, just one question around the different G1ConcRefinement-flags (threshold and zones). Couldn't we make these be number of cards and get rid of the buffers_to_cards conversion in g1ConcurrentRefine.cpp?
>> 
>> Those are product options; changing their semantics like that is not so easy.
>> 
> 
> True, but I think it is something we want to do in the longer run, so maybe create an enhancement to track it?

Maybe, and maybe not.

If expressed in card counts, some of these numbers probably ought to
have minimum values that are some multiple of the buffer size (which
is also controlled by a CLA, G1UpdateBufferSize).  Having the external
value be in buffers, internally scaled by the buffer size, might have
usability benefits.  And suggesting or encouraging apparently higher
precision by having options in card counts isn't necessarily useful.

Also, some of those options might go away entirely.  I did some
experimental work a while ago on improving the adaptive controllers in
this area, which I want to get back to.  Some of those options were no
longer relevant or even interfering with that work.

To make such a change would involve adding new options that have card
units (and probably have worse / longer names than the existing names
that are already a mouthful).  I'd probably want them to be experimental
because of the above mentioned possibility of elimination or other
changes.  Deprecate the old buffer-based options.  Add code to reject
explicit use of both.  Retain the new code to convert buffer options
to card units.  Somewhere down the road remove the deprecated options,
and perhaps upgrade to product the new options; or not, if some go
away entirely.
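
The migration steps above (new card-based options, deprecated buffer-based
options, rejection of explicit use of both, conversion of the old values)
can be sketched as follows. All names are hypothetical stand-ins, not
existing flags or HotSpot code, and the exception is just one way to
signal the rejection.

```cpp
#include <cassert>
#include <cstddef>
#include <stdexcept>

// Hypothetical option value: whether it was set explicitly, and to what.
struct OptionValue {
  bool is_set;
  std::size_t value;
};

// Resolves a threshold in cards from a deprecated buffer-based option and
// a new card-based option.
inline std::size_t resolve_threshold_in_cards(OptionValue old_in_buffers,
                                              OptionValue new_in_cards,
                                              std::size_t cards_per_buffer,
                                              std::size_t default_in_cards) {
  assert(cards_per_buffer > 0);
  if (old_in_buffers.is_set && new_in_cards.is_set) {
    // Reject explicit use of both options.
    throw std::invalid_argument(
        "buffer-based and card-based options are mutually exclusive");
  }
  if (new_in_cards.is_set) {
    return new_in_cards.value;
  }
  if (old_in_buffers.is_set) {
    // Keep converting the deprecated option to card units.
    return old_in_buffers.value * cards_per_buffer;
  }
  return default_in_cards;
}
```

The sketch also shows the cost of the plan: every threshold gains a second
option and a conflict check, for options that may later be removed anyway.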

So I'd prefer they were left alone until such time as we have a better
understanding of what we actually want / need here.


From kim.barrett at oracle.com  Mon Aug 26 23:05:02 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Mon, 26 Aug 2019 19:05:02 -0400
Subject: RFR(T): 8230192: Rename G1RedirtyCardsBufferList to G1BufferNodeList 
Message-ID: <5C3B055B-2E56-46C6-B573-868EAB37762F@oracle.com>

Please review this trivial refactoring that renames a simple utility
struct and moves its definition to its own .hpp/.cpp files.  This will
permit the class to be used in other contexts that aren't related to
redirtying cards.  (I have a change in development that does just
that, and broke this piece out.)

CR:
https://bugs.openjdk.java.net/browse/JDK-8230192

Webrev:
http://cr.openjdk.java.net/~kbarrett/8230192/open.00/

Testing:
Local (linux-x64) hotspot:tier1.


From stefan.johansson at oracle.com  Tue Aug 27 07:34:50 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 27 Aug 2019 09:34:50 +0200
Subject: RFR: 8230109: G1DirtyCardQueueSet should use card counts rather
 than buffer counts
In-Reply-To: <00AE231F-2AE6-440C-92DE-D5EC76EFFE0E@oracle.com>
References: <97BAE9AC-B0F6-4D6B-924C-4AB6C24777B3@oracle.com>
 <FD199B1F-9D15-4FCF-ABDA-5568E982EDBB@oracle.com>
 <284FD120-AEFB-4DAA-BCA4-E81803A73290@oracle.com>
 <DD3D7A06-3D50-4C97-85C3-FA04E4DEFF05@oracle.com>
 <00AE231F-2AE6-440C-92DE-D5EC76EFFE0E@oracle.com>
Message-ID: <B55A783A-E822-44EA-946C-8D8ED6285912@oracle.com>


> 26 aug. 2019 kl. 22:48 skrev Kim Barrett <kim.barrett at oracle.com>:
> 
>> On Aug 26, 2019, at 2:29 PM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>> 
>> 
>> 
>>> 26 aug. 2019 kl. 17:42 skrev Kim Barrett <kim.barrett at oracle.com>:
>>> 
>>>> On Aug 26, 2019, at 4:52 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>>>> 
>>>>> CR:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8230109
>>>>> 
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~kbarrett/8230109/open.00/
>>>> 
>>>> I really like the cleanups you've done in this area of the code base and this one is no exception. Looks good, just one question around the different G1ConcRefinement-flags (threshold and zones). Couldn't we make these be number of cards and get rid of the buffers_to_cards conversion in g1ConcurrentRefine.cpp?
>>> 
>>> Those are product options; changing their semantics like that is not so easy.
>>> 
>> 
>> True, but I think it is something we want to do in the longer run, so maybe create an enhancement to track it?
> 
> Maybe, and maybe not.
> 
> If expressed in card counts, some of these numbers probably ought to
> have minimum values that are some multiple of the buffer size (which
> is also controlled by a CLA, G1UpdateBufferSize).  Having the external
> value be in buffers, internally scaled by the buffer size, might have
> usability benefits.  And suggesting or encouraging apparently higher
> precision by having options in card counts isn't necessarily useful.
> 
> Also, some of those options might go away entirely.  I did some
> experimental work a while ago on improving the adaptive controllers in
> this area, which I want to get back to.  Some of those options were no
> longer relevant or even interfering with that work.
> 
> To make such a change would involve adding new options that have card
> units (and probably have worse / longer names than the existing names
> that are already a mouthful).  I'd probably want them to be experimental
> because of the above mentioned possibility of elimination or other
> changes.  Deprecate the old buffer-based options.  Add code to reject
> explicit use of both.  Retain the new code to convert buffer options
> to card units.  Somewhere down the road remove the deprecated options,
> and perhaps upgrade to product the new options; or not, if some go
> away entirely.
> 
> So I'd prefer they were left alone until such time as we have a better
> understanding of what we actually want / need here.
> 
Sounds like a good plan, and let's hope we can figure out some good names :)


From stefan.johansson at oracle.com  Tue Aug 27 08:10:00 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Tue, 27 Aug 2019 10:10:00 +0200
Subject: RFR(T): 8230192: Rename G1RedirtyCardsBufferList to
 G1BufferNodeList
In-Reply-To: <5C3B055B-2E56-46C6-B573-868EAB37762F@oracle.com>
References: <5C3B055B-2E56-46C6-B573-868EAB37762F@oracle.com>
Message-ID: <F07A8288-34A1-4897-8A5E-C1DC587F6A2F@oracle.com>

Hi Kim,

> 27 aug. 2019 kl. 01:05 skrev Kim Barrett <kim.barrett at oracle.com>:
> 
> Please review this trivial refactoring that renames a simple utility
> struct and moves its definition to its own .hpp/.cpp files.  This will
> permit the class to be used in other contexts that aren't related to
> redirtying cards.  (I have a change in development that does just
> that, and broke this piece out.)
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230192
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8230192/open.00/
> 
Looks good,
StefanJ

> Testing:
> Local (linux-x64) hotspot:tier1.
> 


From leo.korinth at oracle.com  Tue Aug 27 09:43:58 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Tue, 27 Aug 2019 11:43:58 +0200
Subject: RFR(T): 8230192: Rename G1RedirtyCardsBufferList to
 G1BufferNodeList
In-Reply-To: <5C3B055B-2E56-46C6-B573-868EAB37762F@oracle.com>
References: <5C3B055B-2E56-46C6-B573-868EAB37762F@oracle.com>
Message-ID: <c67ab137-a6a9-976d-3a6c-1073b83a3b4f@oracle.com>

On 27/08/2019 01:05, Kim Barrett wrote:
> Please review this trivial refactoring that renames a simple utility
> struct and moves its definition to its own .hpp/.cpp files.  This will
> permit the class to be used in other contexts that aren't related to
> redirtying cards.  (I have a change in development that does just
> that, and broke this piece out.)
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230192
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8230192/open.00/
> 
> Testing:
> Local (linux-x64) hotspot:tier1.
> 

Looks good to me.

Thanks,
Leo


From stefan.karlsson at oracle.com  Tue Aug 27 13:20:51 2019
From: stefan.karlsson at oracle.com (Stefan Karlsson)
Date: Tue, 27 Aug 2019 15:20:51 +0200
Subject: RFR: 8229278: Improve hs_err location printing to assume less
 about GC internals
In-Reply-To: <7d2f5a53-682e-1a04-4727-d2a5a08be3eb@oracle.com>
References: <7d2f5a53-682e-1a04-4727-d2a5a08be3eb@oracle.com>
Message-ID: <2c700b4b-6590-de84-bde4-b6a04e69f708@oracle.com>

Hi Erik,

On 2019-08-08 11:46, Erik Österlund wrote:
> Hi,
> 
> Today when we crash and print hs_err files, the printing utility for 
> describing heap locations assumes:
> 1) That the Java heap memory reservation is contiguous
> 2) That the Java heap is parseable
> We should let the GC describe a location instead, opting in to such 
> concepts.
> 
> This patch adds a print_location pure virtual function on CollectedHeap
> allowing the GC to choose printing strategy. A new LocationPrinter 
> utility was added, allowing GCs to implement the functionality easily 
> without much code duplication.
> 
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8229278/webrev.00/

This looks good, but I have some cleanups I'd like to make. If you agree 
we can either fold this into your patch, or I can create a separate RFE 
for it:

http://cr.openjdk.java.net/~stefank/8229278/webrev.01.delta/

1) Fixed some includes.

2) Replaced address with void*. I'd like to not propagate the usage of 
'address' into GC code.

3) Fixed the usage of is_readable_range to actually include a check that 
the address of the klass field is readable.

4) Changed/removed some casts.

5) Removed redundant is_in check in ZGC's print_location.
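
For readers without the webrev at hand, the shape of the interface under
review can be sketched roughly as below. The class names are stand-ins,
std::ostream replaces HotSpot's output stream type, and the bool return
value is an assumption of the sketch. The address parameter is void*, in
line with item 2 above.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <ostream>
#include <sstream>
#include <string>

// Stand-in for CollectedHeap: each GC decides how to describe an address.
class SketchHeap {
public:
  virtual ~SketchHeap() {}
  // Returns true if addr was recognized and a description was printed.
  virtual bool print_location(std::ostream& st, void* addr) const = 0;
};

// A collector that opts in to the "contiguous reservation" assumption.
class ContiguousSketchHeap : public SketchHeap {
  std::uintptr_t _base;
  std::size_t _size;

public:
  ContiguousSketchHeap(void* base, std::size_t size)
    : _base(reinterpret_cast<std::uintptr_t>(base)), _size(size) {
    assert(base != nullptr && size > 0);
  }

  bool print_location(std::ostream& st, void* addr) const override {
    const std::uintptr_t p = reinterpret_cast<std::uintptr_t>(addr);
    if (p < _base || p - _base >= _size) {
      return false;  // Not in this heap; the caller tries something else.
    }
    st << "heap offset " << (p - _base);
    return true;
  }
};
```

A collector with a non-contiguous or non-parseable heap would supply its
own override instead of inheriting assumptions from shared code.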

Thanks,
StefanK

> 
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8229278
> 
> Thanks,
> /Erik


From kim.barrett at oracle.com  Tue Aug 27 14:46:50 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 27 Aug 2019 10:46:50 -0400
Subject: RFR(T): 8230192: Rename G1RedirtyCardsBufferList to
 G1BufferNodeList
In-Reply-To: <c67ab137-a6a9-976d-3a6c-1073b83a3b4f@oracle.com>
References: <5C3B055B-2E56-46C6-B573-868EAB37762F@oracle.com>
 <c67ab137-a6a9-976d-3a6c-1073b83a3b4f@oracle.com>
Message-ID: <3B176186-2FB9-40E2-ACCC-0F41ED041C3B@oracle.com>

> On Aug 27, 2019, at 5:43 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
> 
> On 27/08/2019 01:05, Kim Barrett wrote:
>> Please review this trivial refactoring that renames a simple utility
>> struct and moves its definition to its own .hpp/.cpp files.  This will
>> permit the class to be used in other contexts that aren't related to
>> redirtying cards.  (I have a change in development that does just
>> that, and broke this piece out.)
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8230192
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8230192/open.00/
>> Testing:
>> Local (linux-x64) hotspot:tier1.
> 
> Looks good to me.
> 
> Thanks,
> Leo

Thanks.


From kim.barrett at oracle.com  Tue Aug 27 14:46:42 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 27 Aug 2019 10:46:42 -0400
Subject: RFR(T): 8230192: Rename G1RedirtyCardsBufferList to
 G1BufferNodeList
In-Reply-To: <F07A8288-34A1-4897-8A5E-C1DC587F6A2F@oracle.com>
References: <5C3B055B-2E56-46C6-B573-868EAB37762F@oracle.com>
 <F07A8288-34A1-4897-8A5E-C1DC587F6A2F@oracle.com>
Message-ID: <126ABB2B-BA14-410A-93E5-1E58C9294675@oracle.com>

> On Aug 27, 2019, at 4:10 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
> 
> Hi Kim,
> 
>> 27 aug. 2019 kl. 01:05 skrev Kim Barrett <kim.barrett at oracle.com>:
>> 
>> Please review this trivial refactoring that renames a simple utility
>> struct and moves its definition to its own .hpp/.cpp files.  This will
>> permit the class to be used in other contexts that aren't related to
>> redirtying cards.  (I have a change in development that does just
>> that, and broke this piece out.)
>> 
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8230192
>> 
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8230192/open.00/
>> 
> Looks good,
> StefanJ
> 
>> Testing:
>> Local (linux-x64) hotspot:tier1.

Thanks.


From erik.osterlund at oracle.com  Tue Aug 27 14:49:50 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Tue, 27 Aug 2019 16:49:50 +0200
Subject: RFR: 8229278: Improve hs_err location printing to assume less
 about GC internals
In-Reply-To: <2c700b4b-6590-de84-bde4-b6a04e69f708@oracle.com>
References: <7d2f5a53-682e-1a04-4727-d2a5a08be3eb@oracle.com>
 <2c700b4b-6590-de84-bde4-b6a04e69f708@oracle.com>
Message-ID: <4440cf8c-c492-6040-9db2-747e9b638bde@oracle.com>

Hi Stefan,

Thank you for the review. I like your cleanups. I folded them in to the 
next webrev: http://cr.openjdk.java.net/~eosterlund/8229278/webrev.01/

/Erik

On 2019-08-27 15:20, Stefan Karlsson wrote:
> Hi Erik,
> 
> On 2019-08-08 11:46, Erik Österlund wrote:
>> Hi,
>>
>> Today when we crash and print hs_err files, the printing utility for 
>> describing heap locations assumes:
>> 1) That the Java heap memory reservation is contiguous
>> 2) That the Java heap is parseable
>> We should let the GC describe a location instead, opting in to such 
>> concepts.
>>
>> This patch adds a print_location pure virtual function on CollectedHeap
>> allowing the GC to choose printing strategy. A new LocationPrinter 
>> utility was added, allowing GCs to implement the functionality easily 
>> without much code duplication.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8229278/webrev.00/
> 
> This looks good, but I have some cleanups I'd like to make. If you agree 
> we can either fold this into your patch, or I can create a separate RFE 
> for it:
> 
> http://cr.openjdk.java.net/~stefank/8229278/webrev.01.delta/
> 
> 1) Fixed some includes.
> 
> 2) Replaced address with void*. I'd like to not propagate the usage of 
> 'address' into GC code.
> 
> 3) Fixed the usage of is_readable_range to actually check that 
> the address of the klass field is readable.
> 
> 4) Changed/removed some casts.
> 
> 5) Removed redundant is_in check in ZGC's print_location.
> 
> Thanks,
> StefanK
> 
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8229278
>>
>> Thanks,
>> /Erik


From kim.barrett at oracle.com  Tue Aug 27 14:54:03 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 27 Aug 2019 10:54:03 -0400
Subject: RFR: 8230109: G1DirtyCardQueueSet should use card counts rather
 than buffer counts
In-Reply-To: <B55A783A-E822-44EA-946C-8D8ED6285912@oracle.com>
References: <97BAE9AC-B0F6-4D6B-924C-4AB6C24777B3@oracle.com>
 <FD199B1F-9D15-4FCF-ABDA-5568E982EDBB@oracle.com>
 <284FD120-AEFB-4DAA-BCA4-E81803A73290@oracle.com>
 <DD3D7A06-3D50-4C97-85C3-FA04E4DEFF05@oracle.com>
 <00AE231F-2AE6-440C-92DE-D5EC76EFFE0E@oracle.com>
 <B55A783A-E822-44EA-946C-8D8ED6285912@oracle.com>
Message-ID: <586198B3-4ABE-4A81-96F2-627AF82F47FA@oracle.com>

> On Aug 27, 2019, at 3:34 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
> 
> 
> 
>> 26 aug. 2019 kl. 22:48 skrev Kim Barrett <kim.barrett at oracle.com>:
>> 
>>> On Aug 26, 2019, at 2:29 PM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>> 
>>> 
>>> 
>>>> 26 aug. 2019 kl. 17:42 skrev Kim Barrett <kim.barrett at oracle.com>:
>>>> 
>>>>> On Aug 26, 2019, at 4:52 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>>>>> 
>>>>>> CR:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230109
>>>>>> 
>>>>>> Webrev:
>>>>>> http://cr.openjdk.java.net/~kbarrett/8230109/open.00/
>>>>> 
>>>>> I really like the cleanups you've done in this area of the code base and this one is no exception. Looks good, just one question around the different G1ConcRefinement-flags (threshold and zones). Couldn't we make these be number of cards and get rid of the buffers_to_cards conversion in g1ConcurrentRefine.cpp?
>>>> 
>>>> Those are product options; changing their semantics like that is not so easy.
>>>> 
>>> 
>>> True, but I think it is something we want to do in the longer run so maybe creating an enhancement for it to track it?
>> 
>> Maybe, and maybe not.
>> 
>> […]
>> 
>> So I'd prefer they were left alone until such time as we have a better
>> understanding of what we actually want / need here.
>> 
> Sounds like a good plan, and let's hope we can figure out some good names :)

Thanks.


From kim.barrett at oracle.com  Tue Aug 27 19:22:41 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 27 Aug 2019 15:22:41 -0400
Subject: RFR(S): 8230126: delay_to_keep_mmu can delay shutdown 
Message-ID: <4F197A94-03FD-4143-85E4-3DF09EF918FA@oracle.com>

Please review this change to G1ConcurrentMarkThread::delay_to_keep_mmu
to use a wait-with-timeout on CGC_lock rather than os::sleep to
implement the delay.  This allows the delay to be terminated early by
a thread termination request.
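
As an illustration of the wait-with-timeout idea described above, here is a
minimal sketch in standard C++. It is not the HotSpot code: DelayMonitor,
delay_or_terminate and request_termination are hypothetical names, and
std::mutex / std::condition_variable stand in for CGC_lock and Monitor::wait.

```cpp
#include <chrono>
#include <condition_variable>
#include <mutex>

// Hypothetical stand-in for a lock/monitor pair such as CGC_lock.
struct DelayMonitor {
  std::mutex lock;
  std::condition_variable cv;
  bool should_terminate = false;  // set by a termination request
};

// Waits up to delay_ms, returning early if termination is requested.
// Returns true if termination was requested, false if the delay elapsed.
bool delay_or_terminate(DelayMonitor& m, long delay_ms) {
  std::unique_lock<std::mutex> guard(m.lock);
  // wait_for with a predicate rechecks the flag after every wakeup and
  // returns the predicate's final value.
  return m.cv.wait_for(guard, std::chrono::milliseconds(delay_ms),
                       [&m] { return m.should_terminate; });
}

// Called by the thread requesting shutdown.
void request_termination(DelayMonitor& m) {
  {
    std::lock_guard<std::mutex> guard(m.lock);
    m.should_terminate = true;
  }
  m.cv.notify_all();  // wake the delaying thread immediately
}
```

Unlike a plain sleep, the delaying thread here is woken as soon as the flag
is set, so shutdown does not have to wait out the remainder of the delay.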

Also fixed a units bug in the calculation of the delay.  The call to
G1MMUTracker::when_ms was with "now" in seconds and "prediction" in
milliseconds.  This function expects both arguments to be in seconds,
and returns a millisecond result.  This bug was largely masked by the
calculation first clipping the prediction value to the max GC time.

CR:
https://bugs.openjdk.java.net/browse/JDK-8230126

Webrev:
http://cr.openjdk.java.net/~kbarrett/8230126/open.00/

Testing:
mach5 tier1-3

Locally (linux-x64) ran gcbasher test with some additional logging
added to the changed functions to verify the various values looked
okay.


From sangheon.kim at oracle.com  Tue Aug 27 23:06:40 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 27 Aug 2019 16:06:40 -0700
Subject: RFR(S): 8230126: delay_to_keep_mmu can delay shutdown
In-Reply-To: <4F197A94-03FD-4143-85E4-3DF09EF918FA@oracle.com>
References: <4F197A94-03FD-4143-85E4-3DF09EF918FA@oracle.com>
Message-ID: <45b37a40-6bea-9f28-8c2c-d0a6ee6bd2ee@oracle.com>

Hi Kim,

On 8/27/19 12:22 PM, Kim Barrett wrote:
> Please review this change to G1ConcurrentMarkThread::delay_to_keep_mmu
> to use a wait-with-timeout on CGC_lock rather than os::sleep to
> implement the delay.  This allows the delay to be terminated early by
> a thread termination request.
>
> Also fixed a units bug in the calculation of the delay.  The call to
> G1MMUTracker::when_ms was with "now" in seconds and "prediction" in
> milliseconds.  This function expects both arguments to be in seconds,
> and returns a millisecond result.  This bug was largely masked by the
> calculation first clipping the prediction value to the max GC time.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230126
>
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8230126/open.00/
Couldn't we always use 'ms'?
I think 'os::elapsedTime() * 1000' would be simpler than the 'ms -> sec 
-> ms' conversion, and might remove the ceil() call.

Thanks,
Sangheon


>
> Testing:
> mach5 tier1-3
>
> Locally (linux-x64) ran gcbasher test with some additional logging
> added to the changed functions to verify the various values looked
> okay.
>


From kim.barrett at oracle.com  Tue Aug 27 23:07:35 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 27 Aug 2019 19:07:35 -0400
Subject: RFR: 8229278: Improve hs_err location printing to assume less
 about GC internals
In-Reply-To: <4440cf8c-c492-6040-9db2-747e9b638bde@oracle.com>
References: <7d2f5a53-682e-1a04-4727-d2a5a08be3eb@oracle.com>
 <2c700b4b-6590-de84-bde4-b6a04e69f708@oracle.com>
 <4440cf8c-c492-6040-9db2-747e9b638bde@oracle.com>
Message-ID: <74A6DF3D-4175-4CD1-8146-0413F4533B0D@oracle.com>

> On Aug 27, 2019, at 10:49 AM, Erik Österlund <erik.osterlund at oracle.com> wrote:
> 
> Hi Stefan,
> 
> Thank you for the review. I like your cleanups. I folded them in to the next webrev: http://cr.openjdk.java.net/~eosterlund/8229278/webrev.01/

------------------------------------------------------------------------------

It's kind of annoying that we have both locationPrinter.hpp and
locationPrinter.inline.hpp.  I think locationPrinter.hpp probably
isn't useful without the .inline file also being included.

I was going to suggest merging into the .hpp and eliminating the
.inline.hpp, but noticed the use of compressedOops.inline.hpp.
Maybe go the other way, no .hpp file?  That's a little odd, but I
think there are some other places where that's been done.

Your call on doing anything with this comment.

------------------------------------------------------------------------------
src/hotspot/share/gc/shared/collectedHeap.hpp

Removed pure virtual block_start and block_is_obj.  However, some of
the derived classes still (now unnecessarily) declare their
implementations virtual.  (Too bad they aren't declared using the
C++11 "override" virtual specifier.)
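
To illustrate the remark about "override": a small sketch, using
hypothetical Heap / MyHeap classes rather than the real CollectedHeap
hierarchy, of how a leftover "virtual" in a derived class goes unnoticed
once the base declaration is removed, whereas "override" would turn it into
a compile error.

```cpp
// Hypothetical base class, after a virtual function has been removed from it.
class Heap {
public:
  virtual ~Heap() = default;
  virtual bool is_in(const void* p) const = 0;
  // virtual bool block_is_obj(const void* p) const = 0;  // removed
};

class MyHeap : public Heap {
public:
  // Checked by the compiler: Heap must declare a matching virtual function.
  bool is_in(const void* p) const override { return p != nullptr; }

  // Still compiles: 'virtual' here silently introduces a brand new virtual
  // function, so nothing signals that the base declaration is gone.
  virtual bool block_is_obj(const void* p) const { return p != nullptr; }

  // Declared with 'override' instead, the stale declaration would be
  // rejected at compile time:
  // bool block_is_obj(const void* p) const override;
};
```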

------------------------------------------------------------------------------
src/hotspot/share/gc/z/zCollectedHeap.cpp
352 bool ZCollectedHeap::print_location(outputStream* st, void* addr) const {
353   if (LocationPrinter::is_valid_obj(addr)) {
...
358   }
359   return false;
360 }

ZGC won't attempt to determine whether addr is a pointer into the
middle of an object and print something useful in that case, correct?

I'm not sure whether it was previously attempting to do so either; I
don't really want to study the code being deleted. :)

But that seems like a loss of useful functionality compared to the
other collectors, assuming it's actually possible to do.  Which I'm
guessing it isn't?

------------------------------------------------------------------------------

This looks good, except for the now extraneous virtual specifiers.  I
don't need a new webrev for that.


From kim.barrett at oracle.com  Tue Aug 27 23:19:46 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Tue, 27 Aug 2019 19:19:46 -0400
Subject: RFR(S): 8230126: delay_to_keep_mmu can delay shutdown
In-Reply-To: <45b37a40-6bea-9f28-8c2c-d0a6ee6bd2ee@oracle.com>
References: <4F197A94-03FD-4143-85E4-3DF09EF918FA@oracle.com>
 <45b37a40-6bea-9f28-8c2c-d0a6ee6bd2ee@oracle.com>
Message-ID: <63A69459-5DE5-44FF-B4E2-0F7323D559BD@oracle.com>

> On Aug 27, 2019, at 7:06 PM, sangheon.kim at oracle.com wrote:
> 
> Hi Kim,
> 
> On 8/27/19 12:22 PM, Kim Barrett wrote:
>> […]
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8230126
>> 
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8230126/open.00/
> Couldn't we always use 'ms'?
> I think 'os::elapsedTime() * 1000' would be simpler than the 'ms -> sec -> ms' conversion, and might remove the ceil() call.

The only ms -> sec conversion is the prediction in mmu_delay_end, and
that's necessary because we have a prediction value provided in ms and
the tracker wants sec.  Having the delay_end return a ms value would
add another sec -> ms conversion (using when_ms rather than when_sec;
remember that both take arguments in secs, and there was a bug in the
old code with respect to that).

In delay_to_keep_mmu, there's necessarily going to be a sec -> ms
conversion somewhere because os::elapsedTime() returns secs and we
need to compute a ms value based on it, for Monitor::wait.

The use of ceil doesn't go away if the sec -> ms conversion is moved.
The floating point time value is a minimum delay; conversion of a
positive value to a jlong will truncate, which is the wrong rounding
direction.
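
A small sketch of the rounding point, in standard C++ with hypothetical
helper names (int64_t stands in for jlong): a fractional delay in seconds is
a minimum, so it has to be rounded up when converted to whole milliseconds.

```cpp
#include <cmath>
#include <cstdint>

// Converts a minimum delay in seconds to whole milliseconds, rounding up so
// that the resulting wait is never shorter than requested.
int64_t min_delay_ms(double delay_sec) {
  return static_cast<int64_t>(std::ceil(delay_sec * 1000.0));
}

// Plain conversion truncates toward zero, which for a positive delay can
// yield a wait shorter than the minimum (possibly zero).
int64_t truncated_delay_ms(double delay_sec) {
  return static_cast<int64_t>(delay_sec * 1000.0);
}
```

For example, a required delay of 0.0004 s becomes 1 ms with ceil, but 0 ms
(no delay at all) with truncation.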


From sangheon.kim at oracle.com  Wed Aug 28 04:37:01 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Tue, 27 Aug 2019 21:37:01 -0700
Subject: RFR(S): 8230126: delay_to_keep_mmu can delay shutdown
In-Reply-To: <63A69459-5DE5-44FF-B4E2-0F7323D559BD@oracle.com>
References: <4F197A94-03FD-4143-85E4-3DF09EF918FA@oracle.com>
 <45b37a40-6bea-9f28-8c2c-d0a6ee6bd2ee@oracle.com>
 <63A69459-5DE5-44FF-B4E2-0F7323D559BD@oracle.com>
Message-ID: <982cd976-e927-ab9d-60e2-aba834874f79@oracle.com>


On 8/27/19 4:19 PM, Kim Barrett wrote:
>> On Aug 27, 2019, at 7:06 PM, sangheon.kim at oracle.com wrote:
>>
>> Hi Kim,
>>
>> On 8/27/19 12:22 PM, Kim Barrett wrote:
>>> […]
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8230126
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~kbarrett/8230126/open.00/
>> Couldn't we always use 'ms'?
>> I think 'os::elapsedTime() * 1000' would be simpler than the 'ms -> sec -> ms' conversion, and might remove the ceil() call.
> The only ms -> sec conversion is the prediction in mmu_delay_end, and
> that's necessary because we have a prediction value provided in ms and
> the tracker wants sec.  Having the delay_end return a ms value would
> add another sec -> ms conversion (using when_ms rather than when_sec;
> remember that both take arguments in secs, and there was a bug in the
> old code with respect to that).
>
> In delay_to_keep_mmu, there's necessarily going to be a sec -> ms
> conversion somewhere because os::elapsedTime() returns secs and we
> need to compute a ms value based on it, for Monitor::wait.
You are right.
I misunderstood the signature of when_ms().
I thought it gets (ms, ms) but it isn't. Sorry for the noise.

>
> The use of ceil doesn't go away if the sec -> ms conversion is moved.
> The floating point time value is a minimum delay; conversion of a
> positive value to a jlong will truncate, which is the wrong rounding
> direction.
I do understand your intent here, but I think GC code doesn't care about 
anything less than 1ms. Does it?
I'm okay with ceil().

Thanks,
Sangheon


From kim.barrett at oracle.com  Wed Aug 28 04:54:29 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 28 Aug 2019 00:54:29 -0400
Subject: RFR(S): 8230126: delay_to_keep_mmu can delay shutdown
In-Reply-To: <982cd976-e927-ab9d-60e2-aba834874f79@oracle.com>
References: <4F197A94-03FD-4143-85E4-3DF09EF918FA@oracle.com>
 <45b37a40-6bea-9f28-8c2c-d0a6ee6bd2ee@oracle.com>
 <63A69459-5DE5-44FF-B4E2-0F7323D559BD@oracle.com>
 <982cd976-e927-ab9d-60e2-aba834874f79@oracle.com>
Message-ID: <779F13DE-F3F4-4381-A86E-58A7C8B687CD@oracle.com>

> On Aug 28, 2019, at 12:37 AM, sangheon.kim at oracle.com wrote:
> 
> 
> 
> On 8/27/19 4:19 PM, Kim Barrett wrote:
>>> On Aug 27, 2019, at 7:06 PM, sangheon.kim at oracle.com
>>>  wrote:
>>> 
>>> Hi Kim,
>>> 
>>> On 8/27/19 12:22 PM, Kim Barrett wrote:
>>> 
>>>> […]
>>>> CR:
>>>> 
>>>> https://bugs.openjdk.java.net/browse/JDK-8230126
>>>> 
>>>> 
>>>> Webrev:
>>>> 
>>>> http://cr.openjdk.java.net/~kbarrett/8230126/open.00/
>>> Couldn't we always use 'ms'?
>>> I think 'os::elapsedTime() * 1000' would be simpler than the 'ms -> sec -> ms' conversion, and might remove the ceil() call.
>>> 
>> The only ms -> sec conversion is the prediction in mmu_delay_end, and
>> that's necessary because we have a prediction value provided in ms and
>> the tracker wants sec.  Having the delay_end return a ms value would
>> add another sec -> ms conversion (using when_ms rather than when_sec;
>> remember that both take arguments in secs, and there was a bug in the
>> old code with respect to that).
>> 
>> In delay_to_keep_mmu, there's necessarily going to be a sec -> ms
>> conversion somewhere because os::elapsedTime() returns secs and we
>> need to compute a ms value based on it, for Monitor::wait.
>> 
> You are right. 
> I misunderstood the signature of when_ms().
> I thought it gets (ms, ms) but it isn't. Sorry for the noise.

Yeah, the name can easily fool one.

>> The use of ceil doesn't go away if the sec -> ms conversion is moved.
>> The floating point time value is a minimum delay; conversion of a
>> positive value to a jlong will truncate, which is the wrong rounding
>> direction.
>> 
> I do understand your intent here, but I think GC code doesn't care about anything less than 1ms. Does it?
> I'm okay with ceil().
> 
> Thanks,
> Sangheon

Thanks.


From erik.osterlund at oracle.com  Wed Aug 28 07:08:00 2019
From: erik.osterlund at oracle.com (Erik Osterlund)
Date: Wed, 28 Aug 2019 09:08:00 +0200
Subject: RFR: 8229278: Improve hs_err location printing to assume less
 about GC internals
In-Reply-To: <74A6DF3D-4175-4CD1-8146-0413F4533B0D@oracle.com>
References: <7d2f5a53-682e-1a04-4727-d2a5a08be3eb@oracle.com>
 <2c700b4b-6590-de84-bde4-b6a04e69f708@oracle.com>
 <4440cf8c-c492-6040-9db2-747e9b638bde@oracle.com>
 <74A6DF3D-4175-4CD1-8146-0413F4533B0D@oracle.com>
Message-ID: <FDB5227C-D34A-4E00-80D7-173BF88B9C47@oracle.com>

Hi Kim,

Thanks for reviewing this.

On 28 Aug 2019, at 01:07, Kim Barrett <kim.barrett at oracle.com> wrote:

>> On Aug 27, 2019, at 10:49 AM, Erik Österlund <erik.osterlund at oracle.com> wrote:
>> 
>> Hi Stefan,
>> 
>> Thank you for the review. I like your cleanups. I folded them in to the next webrev: http://cr.openjdk.java.net/~eosterlund/8229278/webrev.01/
> 
> ------------------------------------------------------------------------------
> 
> It's kind of annoying that we have both locationPrinter.hpp and
> locationPrinter.inline.hpp.  I think locationPrinter.hpp probably
> isn't useful without the .inline file also being included.
> 
> I was going to suggest merging into the .hpp and eliminating the
> .inline.hpp, but noticed the use of compressedOops.inline.hpp.
> Maybe go the other way, no .hpp file?  That's a little odd, but I
> think there are some other places where that's been done.
> 
> Your call on doing anything with this comment.
> 
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/shared/collectedHeap.hpp
> 
> Removed pure virtual block_start and block_is_obj.  However, some of
> the derived classes still (now unnecessarily) declare their
> implementations virtual.  (Too bad they aren't declared using the
> C++11 "override" virtual specifier.)

Ahhh yes. I will remove the unnecessary virtual specifiers.

> ------------------------------------------------------------------------------
> src/hotspot/share/gc/z/zCollectedHeap.cpp
> 352 bool ZCollectedHeap::print_location(outputStream* st, void* addr) const {
> 353   if (LocationPrinter::is_valid_obj(addr)) {
> ...
> 358   }
> 359   return false;
> 360 }
> 
> ZGC won't attempt to determine whether addr is a pointer into the
> middle of an object and print something useful in that case, correct?

Correct.

> I'm not sure whether it was previously attempting to do so either; I
> don't really want to study the code being deleted. :)

No it failed to do that in the past as well.

> But that seems like a loss of useful functionality compared to the
> other collectors, assuming it's actually possible to do.  Which I'm
> guessing it isn't?

Possible is a relative thing. ;)

The ZGC heap isn't parsable, and we don't have block offset tables. That makes it pretty hard to find the surrounding object of an arbitrary address in the heap. When we have to walk the heap to find things, we use tracing instead of iterative scanning due to the lack of parsability. It might be possible to sometimes find the base pointer using tracing, assuming you can find it in the object graph. But then again, we just crashed when we get here, so the tracing might crash too, and interesting values might be hidden in garbage.

So while it would be nice to improve heap location printing in crash reports, the solution is unfortunately not clear yet.

> ------------------------------------------------------------------------------
> 
> This looks good, except for the now extraneous virtual specifiers.  I
> don't need a new webrev for that.

Thanks for the review!

/Erik


From erik.osterlund at oracle.com  Wed Aug 28 12:42:23 2019
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Wed, 28 Aug 2019 14:42:23 +0200
Subject: RFR: ZGC: Make zGlobals and zArguments OS agnostic
Message-ID: <89e94425-2c9b-0007-6784-a4147ffd3783@oracle.com>

Hi,

The contents of zGlobals and zArguments are cpu-specific, but not 
os_cpu-specific. Therefore, these files should be moved to be 
cpu-specific only.

Webrev:
http://cr.openjdk.java.net/~eosterlund/8230307/webrev.00/

Bug:
https://bugs.openjdk.java.net/browse/JDK-8230307

Thanks,
/Erik


From stefan.johansson at oracle.com  Wed Aug 28 13:16:45 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Wed, 28 Aug 2019 15:16:45 +0200
Subject: RFR(S): 8230126: delay_to_keep_mmu can delay shutdown
In-Reply-To: <4F197A94-03FD-4143-85E4-3DF09EF918FA@oracle.com>
References: <4F197A94-03FD-4143-85E4-3DF09EF918FA@oracle.com>
Message-ID: <45213d47-c37f-4a3d-c58e-a0f6e97f1de8@oracle.com>

Hi Kim,

On 2019-08-27 21:22, Kim Barrett wrote:
> Please review this change to G1ConcurrentMarkThread::delay_to_keep_mmu
> to use a wait-with-timeout on CGC_lock rather than os::sleep to
> implement the delay.  This allows the delay to be terminated early by
> a thread termination request.
> 
> Also fixed a units bug in the calculation of the delay.  The call to
> G1MMUTracker::when_ms was with "now" in seconds and "prediction" in
> milliseconds.  This function expects both arguments to be in seconds,
> and returns a millisecond result.  This bug was largely masked by the
> calculation first clipping the prediction value to the max GC time.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230126
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8230126/open.00/
> 
Looks good, thanks for fixing this.

Cheers,
Stefan

> Testing:
> mach5 tier1-3
> 
> Locally (linux-x64) ran gcbasher test with some additional logging
> added to the changed functions to verify the various values looked
> okay.
> 


From stuart.monteith at linaro.org  Wed Aug 28 14:42:10 2019
From: stuart.monteith at linaro.org (Stuart Monteith)
Date: Wed, 28 Aug 2019 15:42:10 +0100
Subject: RFR: ZGC: Make zGlobals and zArguments OS agnostic
In-Reply-To: <89e94425-2c9b-0007-6784-a4147ffd3783@oracle.com>
References: <89e94425-2c9b-0007-6784-a4147ffd3783@oracle.com>
Message-ID: <CAEGA6kY_7qvSYYRwADoxC3A8JpiVXarFGK=6b_k5L9Yrmi-Zrg@mail.gmail.com>

Looks OK to me. Built and tested on aarch64.

On Wed, 28 Aug 2019 at 13:43, Erik Österlund <erik.osterlund at oracle.com> wrote:
>
> Hi,
>
> The contents of zGlobals and zArguments are cpu-specific, but not
> os_cpu-specific. Therefore, these files should be moved to be
> cpu-specific only.
>
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8230307/webrev.00/
>
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8230307
>
> Thanks,
> /Erik


From erik.osterlund at oracle.com  Wed Aug 28 15:05:26 2019
From: erik.osterlund at oracle.com (=?UTF-8?Q?Erik_=c3=96sterlund?=)
Date: Wed, 28 Aug 2019 17:05:26 +0200
Subject: RFR: ZGC: Make zGlobals and zArguments OS agnostic
In-Reply-To: <CAEGA6kY_7qvSYYRwADoxC3A8JpiVXarFGK=6b_k5L9Yrmi-Zrg@mail.gmail.com>
References: <89e94425-2c9b-0007-6784-a4147ffd3783@oracle.com>
 <CAEGA6kY_7qvSYYRwADoxC3A8JpiVXarFGK=6b_k5L9Yrmi-Zrg@mail.gmail.com>
Message-ID: <2c6c8d93-afb3-9a21-ec0d-ef044cd35a84@oracle.com>

Hi Stuart,

Thanks for the review!

/Erik

On 2019-08-28 16:42, Stuart Monteith wrote:
> Looks OK to me. Built and tested on aarch64.
> 
> On Wed, 28 Aug 2019 at 13:43, Erik Österlund <erik.osterlund at oracle.com> wrote:
>>
>> Hi,
>>
>> The contents of zGlobals and zArguments are cpu-specific, but not
>> os_cpu-specific. Therefore, these files should be moved to be
>> cpu-specific only.
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8230307/webrev.00/
>>
>> Bug:
>> https://bugs.openjdk.java.net/browse/JDK-8230307
>>
>> Thanks,
>> /Erik


From kim.barrett at oracle.com  Wed Aug 28 17:39:51 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 28 Aug 2019 13:39:51 -0400
Subject: RFR(S): 8230126: delay_to_keep_mmu can delay shutdown
In-Reply-To: <45213d47-c37f-4a3d-c58e-a0f6e97f1de8@oracle.com>
References: <4F197A94-03FD-4143-85E4-3DF09EF918FA@oracle.com>
 <45213d47-c37f-4a3d-c58e-a0f6e97f1de8@oracle.com>
Message-ID: <B2CB6413-8274-46A6-8566-10E6498B9D30@oracle.com>

> On Aug 28, 2019, at 9:16 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
> 
> Hi Kim,
> 
> On 2019-08-27 21:22, Kim Barrett wrote:
>> Please review this change to G1ConcurrentMarkThread::delay_to_keep_mmu
>> to use a wait-with-timeout on CGC_lock rather than os::sleep to
>> implement the delay.  This allows the delay to be terminated early by
>> a thread termination request.
>> Also fixed a units bug in the calculation of the delay.  The call to
>> G1MMUTracker::when_ms was with "now" in seconds and "prediction" in
>> milliseconds.  This function expects both arguments to be in seconds,
>> and returns a millisecond result.  This bug was largely masked by the
>> calculation first clipping the prediction value to the max GC time.
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8230126
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8230126/open.00/
> Looks good, thanks for fixing this.
> 
> Cheers,
> Stefan

Thanks.

> 
>> Testing:
>> mach5 tier1-3
>> Locally (linux-x64) ran gcbasher test with some additional logging
>> added to the changed functions to verify the various values looked
>> okay.


From kim.barrett at oracle.com  Wed Aug 28 18:12:58 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 28 Aug 2019 14:12:58 -0400
Subject: RFR: JDK-8215166: Remove unused G1PretouchAuxiliaryMemory option
In-Reply-To: <CAEPFu6_go4sfFAfuf3s79JOAMHX0Y4v7YknH-+odDTNHvZdbZQ@mail.gmail.com>
References: <CAEPFu6_go4sfFAfuf3s79JOAMHX0Y4v7YknH-+odDTNHvZdbZQ@mail.gmail.com>
Message-ID: <9CE4B77F-EC98-4804-B744-6C2E15030936@oracle.com>

> On Aug 25, 2019, at 3:28 PM, Evgeny Mandrikov <mandrikov at gmail.com> wrote:
> 
> Hello!
> 
> Please review patch [1] for JDK-8215166 [2]. Also it needs a sponsor since
> I have only author status in OpenJDK Census [3].
> 
> After this change tier1 tests pass on my machine and I don't find other
> occurrences of "G1PretouchAuxiliaryMemory".
> 
> 
> With best regards,
> Evgeny Mandrikov
> 
> [1] http://cr.openjdk.java.net/~godin/8215166/webrev.00/
> [2] https://bugs.openjdk.java.net/browse/JDK-8215166
> [3] https://openjdk.java.net/census#godin

Looks good, and "trivial" (only one Reviewer needed).

I will sponsor.


From mikhailo.seledtsov at oracle.com  Wed Aug 28 18:42:49 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Wed, 28 Aug 2019 11:42:49 -0700
Subject: RFR(XS): 8229210: [TESTBUG] Move gc stress tests from JFR directory
 tree to gc/stress
Message-ID: <8440d9c8-996f-581f-9b39-a51cb8b19539@oracle.com>

Please review this change that simply moves GC+JFR stress tests from 
test/jdk/jdk/jfr/event/gc/detailed/TestStress*
to test/hotspot/jtreg/gc/stress/jfr/

This change originated with the JFR team, and was later informally 
discussed with GC team members. The goal is to move heavy, 
time-consuming GC event stress tests from the JFR hierarchy to 
gc/stress, where they fit better.

    JBS: https://bugs.openjdk.java.net/browse/JDK-8229210
    Webrev: http://cr.openjdk.java.net/~mseledtsov/8229210.00/
        Just a simple move, no changes to the tests:
        hg mv test/jdk/jdk/jfr/event/gc/detailed/TestStress* test/hotspot/jtreg/gc/stress/jfr/

    Testing:
        1. Sanity: Ran test/hotspot/jtreg/gc/stress/jfr/ - PASS
        2. Sanity: running jdk_jfr - in progress


Thank you,
Misha


From kim.barrett at oracle.com  Wed Aug 28 18:59:00 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 28 Aug 2019 14:59:00 -0400
Subject: RFR(S): 8227224: GenCollectedHeap: add subspace transitions for
 young gen for gc+heap=info log lines
In-Reply-To: <CAOzU2ikiLsXvPZ_UgR6Hp3dXWubE=1kZo0c5VN66LxbOAz27dA@mail.gmail.com>
References: <CAOzU2inYb4RFbBiQC5PjnTWTCYTouNAyPQ-vX2SpENgUQdvQPQ@mail.gmail.com>
 <B2C21495-2C3B-4FDB-B00A-4FA1668C71F4@oracle.com>
 <CAOzU2ikiLsXvPZ_UgR6Hp3dXWubE=1kZo0c5VN66LxbOAz27dA@mail.gmail.com>
Message-ID: <5A76D42A-91DA-491E-B78B-39966B9C62F0@oracle.com>

> On Aug 8, 2019, at 9:50 AM, Tony Printezis <tprintezis at twitter.com> wrote:
> 
> Hi Kim,
> 
> Inline.

Lost track of this and finally responding.  Sorry for the delay.

> Yeah, we can definitely increase the amount of code sharing here. And IMHO the main benefit is not to decrease the amount of code, but to ensure that the output is consistent across all GCs. But can I also point out that, before, there was NO code sharing whatsoever (all this code was replicated multiple times). Now at least there's some common code and common macros. And we can improve on that further.

Right, the key points are (1) provide some of this information at all, and (2) provide it in a common format to
make it easier for analysis tools to extract.

> While we're at it: I'm happy to work on follow-ups. What's a good next step? As Thomas had suggested, I can change the formatting code to use more appropriate units instead of always K. Another possibility is to update the 'gc' log lines to the same format? E.g.,
> 
> [29.884s][info][gc           ] GC(24) Pause Young (Allocation Failure) 6147M->3M(9216M) 2.705ms

Similar log line format would be nice.  Varying the units might make analysis tools a
bit more complex, but would probably be nicer for human readers.  Though sometimes
those low order digits actually are important.

The main followup I'm interested in is a similar change for G1.  I have a test bug whose fix
is waiting for all of parallel, gch, and g1 to provide this transition information.  Are you planning
to do the G1 version too, or leave that to someone else?


From mandy.chung at oracle.com  Wed Aug 28 22:58:43 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Wed, 28 Aug 2019 15:58:43 -0700
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be
 quicker for self thread
In-Reply-To: <CDB47B55-1327-4EFA-93E0-61E04A9EB61F@amazon.com>
References: <CDB47B55-1327-4EFA-93E0-61E04A9EB61F@amazon.com>
Message-ID: <ad45e908-b608-2f86-9c77-7a6e19144275@oracle.com>

Hi Paul,

The CSR proposes this method in java.lang.management.ThreadMXBean as a 
Java SE feature.

Has this been discussed with the GC team, to commit to measuring the 
current thread's allocated bytes as a Java SE feature?  Can this be 
supported by all JVM implementations?  What is the overhead if this is 
enabled by default?  Does it need to be disabled?  This metric comes from 
the TLAB, which might be okay.  This needs advice/discussion with GC experts.

I see that the CSR mentions it can be disabled and links to the 
isThreadAllocatedMemoryEnabled() and setThreadAllocatedMemoryEnabled() 
methods, but these methods are defined in com.sun.management.ThreadMXBean.

As Alan points out, current thread makes sense only in local VM 
management.  When this is monitored from a JMX client (e.g. jconsole 
connected to a running JVM), is the "currentThreadAllocatedBytes" 
attribute for the current thread in the jconsole process, the one 
invoking Thread::currentThread?

Mandy

On 8/28/19 12:22 PM, Hohensee, Paul wrote:
>
> Please review a performance improvement for 
> ThreadMXBean.getThreadAllocatedBytes and the addition of 
> getCurrentThreadAllocatedBytes.
>
> JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266
>
> Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/
>
> CSR: https://bugs.openjdk.java.net/browse/JDK-8230311
>
> Previous email threads:
> https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html
> https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html
>
> The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes. It'd 
> be great for someone to review it.
>
> I took Mandy's advice and put the fast paths in the library code. I 
> added a new JMM method GetOneThreadsAllocatedBytes that works the same 
> as GetThreadCpuTime: it uses a thread_id value of zero to distinguish 
> the current thread. On my Mac laptop, the result runs 47x faster for 
> the current thread than the old implementation.
>
> The 3 tests in test/jdk/com/sun/management/ThreadMXBean all pass. I 
> added code to ThreadAllocatedMemory.java to test 
> getCurrentThreadAllocatedBytes as well as variations on 
> getThreadAllocatedBytes(id). A submit repo job is in progress.
>
> Thanks,
>
> Paul
>


From kim.barrett at oracle.com  Thu Aug 29 00:27:10 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 28 Aug 2019 20:27:10 -0400
Subject: RFR(T): 8230327: Make G1DirtyCardQueueSet free-id init unconditional 
Message-ID: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>

Please review this trivial cleanup of G1DirtyCardQueueSet
initialization. With the separation of G1RedirtyCardsQueueSet from
G1DirtyCardQueueSet, the latter is now a singleton class that should
always have an associated G1FreeIdSet.  We can now unconditionally
construct that object when constructing the DCQS.

CR:
https://bugs.openjdk.java.net/browse/JDK-8230327

Webrev:
http://cr.openjdk.java.net/~kbarrett/8230327/open.00/

Testing:
Local (linux-x64) hotspot:tier1.


From kim.barrett at oracle.com  Thu Aug 29 01:05:11 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Wed, 28 Aug 2019 21:05:11 -0400
Subject: RFR(T): 8230332: G1DirtyCardQueueSet _notify_when_complete is always
 true 
Message-ID: <F0AC42B1-6817-457D-9DA6-6951AE3422F2@oracle.com>

Please review this trivial cleanup of G1DirtyCardQueueSet.  With the
separation of G1RedirtyCardsQueueSet from G1DirtyCardQueueSet, the
latter is now a singleton class.  As a result, the
_notify_when_complete member (used to control whether adding completed
buffers should notify the completed buffer monitor) is always true.

This change removes that member and changes the conditional
notifications to be unconditional.  Also cleaned up some locker usage
for _cbl_mon when notification is needed, and changed to consistently
use notify_all() (there's no good reason to use notify(), and
definitely no good reason to use a mix of the two here).
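
The notification pattern described above can be sketched roughly as
follows. This is only an illustration using standard C++ primitives,
not HotSpot's Monitor or locker classes; CompletedBufferList and its
members are hypothetical names, and the buffers are plain ints.

```cpp
#include <cassert>
#include <condition_variable>
#include <cstddef>
#include <deque>
#include <mutex>

// Hypothetical stand-in for a completed-buffer list guarded by a monitor.
class CompletedBufferList {
  std::mutex _lock;
  std::condition_variable _cv;
  std::deque<int> _buffers;

public:
  // Enqueue a buffer and always wake every waiter. There is no flag
  // deciding whether to notify, and no mix of notify_one()/notify_all().
  void enqueue(int buffer) {
    std::lock_guard<std::mutex> guard(_lock);
    _buffers.push_back(buffer);
    _cv.notify_all();
  }

  // Block until a buffer is available, then take the oldest one.
  int dequeue() {
    std::unique_lock<std::mutex> guard(_lock);
    _cv.wait(guard, [this] { return !_buffers.empty(); });
    int buffer = _buffers.front();
    _buffers.pop_front();
    return buffer;
  }

  // Number of buffers currently queued.
  std::size_t size() {
    std::lock_guard<std::mutex> guard(_lock);
    return _buffers.size();
  }
};
```

Waiters re-check their predicate after waking, so waking all of them
is safe even when only one can make progress.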

CR:
https://bugs.openjdk.java.net/browse/JDK-8230332

Webrev:
http://cr.openjdk.java.net/~kbarrett/8230332/open.00/

Testing:
Local (linux-x64) hotspot:tier1


From stefan.johansson at oracle.com  Thu Aug 29 06:21:08 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Thu, 29 Aug 2019 08:21:08 +0200
Subject: RFR(T): 8230327: Make G1DirtyCardQueueSet free-id init
 unconditional
In-Reply-To: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
References: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
Message-ID: <d77cd9f0-94e3-36e1-56ff-d02e51911e39@oracle.com>

Hi Kim,

On 2019-08-29 02:27, Kim Barrett wrote:
> Please review this trivial cleanup of G1DirtyCardQueueSet
> initialization. With the separation of G1RedirtyCardsQueueSet from
> G1DirtyCardQueueSet, the latter is now a singleton class that should
> always have an associated G1FreeIdSet.  We can now unconditionally
> construct that object when constructing the DCQS.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230327
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8230327/open.00/
Looks good.

Cheers,
Stefan

> 
> Testing:
> Local (linux-x64) hotspot:tier1.
> 


From kim.barrett at oracle.com  Thu Aug 29 06:26:38 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 29 Aug 2019 02:26:38 -0400
Subject: RFR(T): 8230327: Make G1DirtyCardQueueSet free-id init
 unconditional
In-Reply-To: <d77cd9f0-94e3-36e1-56ff-d02e51911e39@oracle.com>
References: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
 <d77cd9f0-94e3-36e1-56ff-d02e51911e39@oracle.com>
Message-ID: <2D084C83-09FE-4A6E-BC55-6F1E1F334B9A@oracle.com>

> On Aug 29, 2019, at 2:21 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
> 
> Hi Kim,
> 
> On 2019-08-29 02:27, Kim Barrett wrote:
>> Please review this trivial cleanup of G1DirtyCardQueueSet
>> initialization. With the separation of G1RedirtyCardsQueueSet from
>> G1DirtyCardQueueSet, the latter is now a singleton class that should
>> always have an associated G1FreeIdSet.  We can now unconditionally
>> construct that object when constructing the DCQS.
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8230327
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8230327/open.00/
> Looks good.
> 
> Cheers,
> Stefan
> 
>> Testing:
>> Local (linux-x64) hotspot:tier1.

Thanks.


From stefan.johansson at oracle.com  Thu Aug 29 07:10:00 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Thu, 29 Aug 2019 09:10:00 +0200
Subject: RFR(T): 8230332: G1DirtyCardQueueSet _notify_when_complete is
 always true
In-Reply-To: <F0AC42B1-6817-457D-9DA6-6951AE3422F2@oracle.com>
References: <F0AC42B1-6817-457D-9DA6-6951AE3422F2@oracle.com>
Message-ID: <d9b9afdd-b609-4c00-b649-bc1edd55fd49@oracle.com>

Hi Kim,

On 2019-08-29 03:05, Kim Barrett wrote:
> Please review this trivial cleanup of G1DirtyCardQueueSet.  With the
> separation of G1RedirtyCardsQueueSet from G1DirtyCardQueueSet, the
> latter is now a singleton class.  As a result, the
> _notify_when_complete member (used to control whether adding completed
> buffers should notify the completed buffer monitor) is always true.
> 
> This change removes that member and changes the conditional
> notifications to be unconditional.  Also cleaned up some locker usage
> for _cbl_mon when notification is needed, and changed to consistently
> use notify_all() (there's no good reason to use notify(), and
> definitely no good reason to use a mix of the two here).
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230332
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8230332/open.00/
Looks good, nice cleanup.
Stefan

> 
> Testing:
> Local (linux-x64) hotspot:tier1
> 


From Alan.Bateman at oracle.com  Thu Aug 29 07:18:49 2019
From: Alan.Bateman at oracle.com (Alan Bateman)
Date: Thu, 29 Aug 2019 08:18:49 +0100
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be
 quicker for self thread
In-Reply-To: <ad45e908-b608-2f86-9c77-7a6e19144275@oracle.com>
References: <CDB47B55-1327-4EFA-93E0-61E04A9EB61F@amazon.com>
 <ad45e908-b608-2f86-9c77-7a6e19144275@oracle.com>
Message-ID: <9b62699c-0aac-a8fd-27a6-a8d4d3820dd2@oracle.com>

On 28/08/2019 23:58, Mandy Chung wrote:
> Hi Paul,
>
> The CSR proposes this method in java.lang.management.ThreadMXBean as a 
> Java SE feature.
>
> Has this been discussed with the GC team to commit measuring current 
> thread's allocated bytes as Java SE feature?  Can this be supported 
> by all JVM implementation?  What is the overhead if this is enabled 
> by default?  Does it need to be disabled?  This metric is from TLAB 
> that might be okay.  This needs advice/discussion with GC experts.
The webrev adds it to jdk.management/com.sun.management.ThreadMXBean so 
I suspect it is a typo in the CSR and the proposal is for it to be 
JDK-specific.

-Alan.


From thomas.schatzl at oracle.com  Thu Aug 29 08:52:52 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Thu, 29 Aug 2019 10:52:52 +0200
Subject: RFR(T): 8230332: G1DirtyCardQueueSet _notify_when_complete is
 always true
In-Reply-To: <F0AC42B1-6817-457D-9DA6-6951AE3422F2@oracle.com>
References: <F0AC42B1-6817-457D-9DA6-6951AE3422F2@oracle.com>
Message-ID: <9aa7dc33-8331-baf8-6ff2-649a2f5990a0@oracle.com>

Hi,

On 29.08.19 03:05, Kim Barrett wrote:
> Please review this trivial cleanup of G1DirtyCardQueueSet.  With the
> separation of G1RedirtyCardsQueueSet from G1DirtyCardQueueSet, the
> latter is now a singleton class.  As a result, the
> _notify_when_complete member (used to control whether adding completed
> buffers should notify the completed buffer monitor) is always true.
> 
> This change removes that member and changes the conditional
> notifications to be unconditional.  Also cleaned up some locker usage
> for _cbl_mon when notification is needed, and changed to consistently
> use notify_all() (there's no good reason to use notify(), and
> definitely no good reason to use a mix of the two here).
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230332
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8230332/open.00/
> 
> Testing:
> Local (linux-x64) hotspot:tier1

   looks good.

Thanks,
   Thomas


From leo.korinth at oracle.com  Thu Aug 29 09:49:39 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Thu, 29 Aug 2019 11:49:39 +0200
Subject: RFR(T): 8230327: Make G1DirtyCardQueueSet free-id init
 unconditional
In-Reply-To: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
References: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
Message-ID: <2d5e36aa-987e-8562-86e9-8627642bfd7d@oracle.com>

On 29/08/2019 02:27, Kim Barrett wrote:
> Please review this trivial cleanup of G1DirtyCardQueueSet
> initialization. With the separation of G1RedirtyCardsQueueSet from
> G1DirtyCardQueueSet, the latter is now a singleton class that should
> always have an associated G1FreeIdSet.  We can now unconditionally
> construct that object when constructing the DCQS.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230327
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8230327/open.00/
> 
> Testing:
> Local (linux-x64) hotspot:tier1.
> 


Looks good! How about making G1FreeIdSet inline instead of a pointer? I 
am fine with either.

Thanks,
Leo


Something like:

diff --git a/src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp 
b/src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp
index 91982fd4be..e62e9a80b9 100644
--- a/src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp
+++ b/src/hotspot/share/gc/g1/g1DirtyCardQueue.cpp
@@ -27,7 +27,6 @@
  #include "gc/g1/g1CardTableEntryClosure.hpp"
  #include "gc/g1/g1CollectedHeap.inline.hpp"
  #include "gc/g1/g1DirtyCardQueue.hpp"
-#include "gc/g1/g1FreeIdSet.hpp"
  #include "gc/g1/g1RedirtyCardsQueue.hpp"
  #include "gc/g1/g1RemSet.hpp"
  #include "gc/g1/g1ThreadLocalData.hpp"
@@ -91,7 +90,7 @@ G1DirtyCardQueueSet::G1DirtyCardQueueSet(bool 
notify_when_complete) :
    _notify_when_complete(notify_when_complete),
    _max_completed_buffers(MaxCompletedBuffersUnlimited),
    _completed_buffers_padding(0),
-  _free_ids(new G1FreeIdSet(0, num_par_ids())),
+  _free_ids(0, num_par_ids()),
    _processed_buffers_mut(0),
    _processed_buffers_rs_thread(0)
  {
@@ -100,7 +99,6 @@ G1DirtyCardQueueSet::G1DirtyCardQueueSet(bool 
notify_when_complete) :

  G1DirtyCardQueueSet::~G1DirtyCardQueueSet() {
    abandon_completed_buffers();
-  delete _free_ids;
  }

  // Determines how many mutator threads can process the buffers in 
parallel.
@@ -287,10 +285,10 @@ bool 
G1DirtyCardQueueSet::process_or_enqueue_completed_buffer(BufferNode* node)
  }

  bool G1DirtyCardQueueSet::mut_process_buffer(BufferNode* node) {
-  uint worker_i = _free_ids->claim_par_id(); // temporarily claim an id
+  uint worker_i = _free_ids.claim_par_id(); // temporarily claim an id
    G1RefineCardConcurrentlyClosure cl;
    bool result = apply_closure_to_buffer(&cl, node, worker_i);
-  _free_ids->release_par_id(worker_i); // release the id
+  _free_ids.release_par_id(worker_i); // release the id

    if (result) {
      assert_fully_consumed(node, buffer_size());
diff --git a/src/hotspot/share/gc/g1/g1DirtyCardQueue.hpp 
b/src/hotspot/share/gc/g1/g1DirtyCardQueue.hpp
index a9eb5c2d96..c94867c950 100644
--- a/src/hotspot/share/gc/g1/g1DirtyCardQueue.hpp
+++ b/src/hotspot/share/gc/g1/g1DirtyCardQueue.hpp
@@ -25,12 +25,12 @@
  #ifndef SHARE_GC_G1_G1DIRTYCARDQUEUE_HPP
  #define SHARE_GC_G1_G1DIRTYCARDQUEUE_HPP

+#include "gc/g1/g1FreeIdSet.hpp"
  #include "gc/shared/ptrQueue.hpp"
  #include "memory/allocation.hpp"

  class G1CardTableEntryClosure;
  class G1DirtyCardQueueSet;
-class G1FreeIdSet;
  class G1RedirtyCardsQueueSet;
  class Thread;
  class Monitor;
@@ -118,7 +118,7 @@ class G1DirtyCardQueueSet: public PtrQueueSet {
    size_t _completed_buffers_padding;
    static const size_t MaxCompletedBuffersUnlimited = SIZE_MAX;

-  G1FreeIdSet* _free_ids;
+  G1FreeIdSet _free_ids;

    // The number of completed buffers processed by mutator and rs thread,
    // respectively.

/Leo
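
For readers who want the idea of the patch above without the HotSpot
context, here is a small standalone sketch of holding the id set by
value instead of through a pointer. FreeIdSet and QueueSet are
simplified, hypothetical stand-ins, not the real G1FreeIdSet or
G1DirtyCardQueueSet, and the claim logic is an assumption made only so
the example runs.

```cpp
#include <cassert>
#include <vector>

// Hypothetical, simplified id set; not the real G1FreeIdSet.
class FreeIdSet {
  unsigned _start;
  std::vector<bool> _claimed;

public:
  FreeIdSet(unsigned start, unsigned size)
    : _start(start), _claimed(size, false) {}

  // Claim the lowest free id. This sketch asserts if none is free.
  unsigned claim_par_id() {
    for (unsigned i = 0; i < _claimed.size(); i++) {
      if (!_claimed[i]) {
        _claimed[i] = true;
        return _start + i;
      }
    }
    assert(false && "no free id");
    return _start;
  }

  // Return a previously claimed id to the set.
  void release_par_id(unsigned id) {
    assert(id >= _start && id - _start < _claimed.size());
    _claimed[id - _start] = false;
  }
};

// Owner holding the id set by value: it is constructed in the
// initializer list and destroyed automatically, so the destructor
// needs no explicit delete.
class QueueSet {
  FreeIdSet _free_ids;

public:
  explicit QueueSet(unsigned num_par_ids) : _free_ids(0, num_par_ids) {}

  // Temporarily claim an id, run the work with it, then release it.
  template <typename Work>
  unsigned with_claimed_id(Work work) {
    unsigned id = _free_ids.claim_par_id();
    work(id);
    _free_ids.release_par_id(id);
    return id;
  }
};
```

The cost of the by-value member is the one visible in the diff: the
owner's header has to include the full definition of the member type
rather than forward-declaring it.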
-------------- next part --------------
A non-text attachment was scrubbed...
Name: inline.patch
Type: text/x-patch
Size: 2707 bytes
Desc: not available
URL: <https://mail.openjdk.org/pipermail/hotspot-gc-dev/attachments/20190829/be45d43b/inline.patch>

From erik.gahlin at oracle.com  Thu Aug 29 13:01:57 2019
From: erik.gahlin at oracle.com (Erik Gahlin)
Date: Thu, 29 Aug 2019 15:01:57 +0200
Subject: RFR(XS): 8229210: [TESTBUG] Move gc stress tests from JFR
 directory tree to gc/stress
In-Reply-To: <8440d9c8-996f-581f-9b39-a51cb8b19539@oracle.com>
References: <8440d9c8-996f-581f-9b39-a51cb8b19539@oracle.com>
Message-ID: <5D67CCC5.1050007@oracle.com>

Looks good!

Erik
> Please review this change that simply moves GC+JFR stress tests from 
> test/jdk/jdk/jfr/event/gc/detailed/TestStress*
> to test/hotspot/jtreg/gc/stress/jfr/
>
> This change was originated by JFR team, and later informally discussed 
> with GC team members. The goal is to remove
> heavy time-consuming GC event stress tests from JFR hierarchy to 
> GC/stress, where they better fit.
>
>     JBS: https://bugs.openjdk.java.net/browse/JDK-8229210
>     Webrev: http://cr.openjdk.java.net/~mseledtsov/8229210.00/
>        Just a simple move, no changes to the tests:
>        hg mv test/jdk/jdk/jfr/event/gc/detailed/TestStress* 
> test/hotspot/jtreg/gc/stress/jfr/
>
>     Testing:
>         1. Sanity: Ran test/hotspot/jtreg/gc/stress/jfr/ - PASS
>         2. Sanity: running jdk_jfr - in progress
>
>
> Thank you,
> Misha
>


From per.liden at oracle.com  Thu Aug 29 13:44:37 2019
From: per.liden at oracle.com (Per Liden)
Date: Thu, 29 Aug 2019 15:44:37 +0200
Subject: RFR: ZGC: Make zGlobals and zArguments OS agnostic
In-Reply-To: <2c6c8d93-afb3-9a21-ec0d-ef044cd35a84@oracle.com>
References: <89e94425-2c9b-0007-6784-a4147ffd3783@oracle.com>
 <CAEGA6kY_7qvSYYRwADoxC3A8JpiVXarFGK=6b_k5L9Yrmi-Zrg@mail.gmail.com>
 <2c6c8d93-afb3-9a21-ec0d-ef044cd35a84@oracle.com>
Message-ID: <a7fba1a1-1a6e-a627-e9ba-83f94c2f7f9f@oracle.com>

Looks good!

/Per

On 8/28/19 5:05 PM, Erik Österlund wrote:
> Hi Stuart,
> 
> Thanks for the review!
> 
> /Erik
> 
> On 2019-08-28 16:42, Stuart Monteith wrote:
>> Looks OK to me. Built and tested on aarch64.
>>
>> On Wed, 28 Aug 2019 at 13:43, Erik Österlund 
>> <erik.osterlund at oracle.com> wrote:
>>>
>>> Hi,
>>>
>>> The contents of zGlobals and zArguments are cpu-specific, but not
>>> os_cpu-specific. Therefore, these files should be moved to be
>>> cpu-specific only.
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~eosterlund/8230307/webrev.00/
>>>
>>> Bug:
>>> https://bugs.openjdk.java.net/browse/JDK-8230307
>>>
>>> Thanks,
>>> /Erik


From erik.osterlund at oracle.com  Thu Aug 29 14:05:47 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Thu, 29 Aug 2019 16:05:47 +0200
Subject: RFR: ZGC: Make zGlobals and zArguments OS agnostic
In-Reply-To: <a7fba1a1-1a6e-a627-e9ba-83f94c2f7f9f@oracle.com>
References: <89e94425-2c9b-0007-6784-a4147ffd3783@oracle.com>
 <CAEGA6kY_7qvSYYRwADoxC3A8JpiVXarFGK=6b_k5L9Yrmi-Zrg@mail.gmail.com>
 <2c6c8d93-afb3-9a21-ec0d-ef044cd35a84@oracle.com>
 <a7fba1a1-1a6e-a627-e9ba-83f94c2f7f9f@oracle.com>
Message-ID: <09803903-3c33-04bd-42e6-d550fc61d498@oracle.com>

Hi Per,

Thanks for the review!

/Erik

On 2019-08-29 15:44, Per Liden wrote:
> Looks good!
> 
> /Per
> 
> On 8/28/19 5:05 PM, Erik Österlund wrote:
>> Hi Stuart,
>>
>> Thanks for the review!
>>
>> /Erik
>>
>> On 2019-08-28 16:42, Stuart Monteith wrote:
>>> Looks OK to me. Built and tested on aarch64.
>>>
>>> On Wed, 28 Aug 2019 at 13:43, Erik Österlund 
>>> <erik.osterlund at oracle.com> wrote:
>>>>
>>>> Hi,
>>>>
>>>> The contents of zGlobals and zArguments are cpu-specific, but not
>>>> os_cpu-specific. Therefore, these files should be moved to be
>>>> cpu-specific only.
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~eosterlund/8230307/webrev.00/
>>>>
>>>> Bug:
>>>> https://bugs.openjdk.java.net/browse/JDK-8230307
>>>>
>>>> Thanks,
>>>> /Erik


From mikhailo.seledtsov at oracle.com  Thu Aug 29 15:25:23 2019
From: mikhailo.seledtsov at oracle.com (mikhailo.seledtsov at oracle.com)
Date: Thu, 29 Aug 2019 08:25:23 -0700
Subject: RFR(XS): 8229210: [TESTBUG] Move gc stress tests from JFR
 directory tree to gc/stress
In-Reply-To: <5D67CCC5.1050007@oracle.com>
References: <8440d9c8-996f-581f-9b39-a51cb8b19539@oracle.com>
 <5D67CCC5.1050007@oracle.com>
Message-ID: <3a314f50-7dff-e5d0-f041-cfde0f70893b@oracle.com>

Thank you,

Misha

On 8/29/19 6:01 AM, Erik Gahlin wrote:
> Looks good!
>
> Erik
>> Please review this change that simply moves GC+JFR stress tests from 
>> test/jdk/jdk/jfr/event/gc/detailed/TestStress*
>> to test/hotspot/jtreg/gc/stress/jfr/
>>
>> This change was originated by JFR team, and later informally 
>> discussed with GC team members. The goal is to remove
>> heavy time-consuming GC event stress tests from JFR hierarchy to 
>> GC/stress, where they better fit.
>>
>>     JBS: https://bugs.openjdk.java.net/browse/JDK-8229210
>>     Webrev: http://cr.openjdk.java.net/~mseledtsov/8229210.00/
>>        Just a simple move, no changes to the tests:
>>        hg mv test/jdk/jdk/jfr/event/gc/detailed/TestStress* 
>> test/hotspot/jtreg/gc/stress/jfr/
>>
>>     Testing:
>>         1. Sanity: Ran test/hotspot/jtreg/gc/stress/jfr/ - PASS
>>         2. Sanity: running jdk_jfr - in progress
>>
>>
>> Thank you,
>> Misha
>>
>


From kim.barrett at oracle.com  Thu Aug 29 17:00:38 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 29 Aug 2019 13:00:38 -0400
Subject: RFR(T): 8230332: G1DirtyCardQueueSet _notify_when_complete is
 always true
In-Reply-To: <d9b9afdd-b609-4c00-b649-bc1edd55fd49@oracle.com>
References: <F0AC42B1-6817-457D-9DA6-6951AE3422F2@oracle.com>
 <d9b9afdd-b609-4c00-b649-bc1edd55fd49@oracle.com>
Message-ID: <CDF7DD22-B2EE-4DDA-95B3-E7CB85AEC6C4@oracle.com>

> On Aug 29, 2019, at 3:10 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
> 
> Hi Kim,
> 
> On 2019-08-29 03:05, Kim Barrett wrote:
>> Please review this trivial cleanup of G1DirtyCardQueueSet.  With the
>> separation of G1RedirtyCardsQueueSet from G1DirtyCardQueueSet, the
>> latter is now a singleton class.  As a result, the
>> _notify_when_complete member (used to control whether adding completed
>> buffers should notify the completed buffer monitor) is always true.
>> This change removes that member and changes the conditional
>> notifications to be unconditional.  Also cleaned up some locker usage
>> for _cbl_mon when notification is needed, and changed to consistently
>> use notify_all() (there's no good reason to use notify(), and
>> definitely no good reason to use a mix of the two here).
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8230332
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8230332/open.00/
> Looks good, nice cleanup.
> Stefan
> 
>> Testing:
>> Local (linux-x64) hotspot:tier1

Thanks.


From kim.barrett at oracle.com  Thu Aug 29 17:00:29 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 29 Aug 2019 13:00:29 -0400
Subject: RFR(T): 8230332: G1DirtyCardQueueSet _notify_when_complete is
 always true
In-Reply-To: <9aa7dc33-8331-baf8-6ff2-649a2f5990a0@oracle.com>
References: <F0AC42B1-6817-457D-9DA6-6951AE3422F2@oracle.com>
 <9aa7dc33-8331-baf8-6ff2-649a2f5990a0@oracle.com>
Message-ID: <ECB43EF9-738A-4A4D-8E15-7C3A69BA17BB@oracle.com>

> On Aug 29, 2019, at 4:52 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi,
> 
> On 29.08.19 03:05, Kim Barrett wrote:
>> Please review this trivial cleanup of G1DirtyCardQueueSet.  With the
>> separation of G1RedirtyCardsQueueSet from G1DirtyCardQueueSet, the
>> latter is now a singleton class.  As a result, the
>> _notify_when_complete member (used to control whether adding completed
>> buffers should notify the completed buffer monitor) is always true.
>> This change removes that member and changes the conditional
>> notifications to be unconditional.  Also cleaned up some locker usage
>> for _cbl_mon when notification is needed, and changed to consistently
>> use notify_all() (there's no good reason to use notify(), and
>> definitely no good reason to use a mix of the two here).
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8230332
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8230332/open.00/
>> Testing:
>> Local (linux-x64) hotspot:tier1
> 
>  looks good.
> 
> Thanks,
>  Thomas

Thanks.


From hohensee at amazon.com  Thu Aug 29 17:01:17 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Thu, 29 Aug 2019 17:01:17 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be
 quicker for self thread
In-Reply-To: <ad45e908-b608-2f86-9c77-7a6e19144275@oracle.com>
References: <CDB47B55-1327-4EFA-93E0-61E04A9EB61F@amazon.com>
 <ad45e908-b608-2f86-9c77-7a6e19144275@oracle.com>
Message-ID: <AC4FEEF0-3FD7-4C48-AC1F-429C52EEBCF3@amazon.com>

My bad, Mandy. The webrev puts getCurrentThreadAllocatedBytes in com.sun.management.ThreadMXBean along with the current two getThreadAllocatedBytes methods for the reasons you list. I've updated the CSR to specify com.sun.management and added a rationale. AllocatedBytes is currently enabled by Hotspot by default because the overhead of recording TLAB occupancy is negligible.

There's no new GC code, nor will there be, so imo we don't have to involve the GC folks. I.e., the new JMM method GetOneThreadsAllocatedBytes uses the existing cooked_allocated_bytes JavaThread method, and getCurrentThreadAllocatedBytes is the same as getThreadAllocatedBytes: it just bypasses the thread lookup code.
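
To make the dispatch concrete, here is a rough standalone sketch of the "id zero means the current thread" convention. It is not the code in management.cpp: ThreadRecord, ThreadRegistry and the -1 "not found" value are hypothetical names and choices made only for illustration.

```cpp
#include <cassert>
#include <cstdint>
#include <map>

// Hypothetical per-thread record; not HotSpot's JavaThread.
struct ThreadRecord {
  std::int64_t id;
  std::int64_t allocated_bytes;
};

// Hypothetical registry standing in for the VM's thread list.
class ThreadRegistry {
  std::map<std::int64_t, ThreadRecord*> _threads;
  ThreadRecord* _current;

public:
  ThreadRegistry() : _current(nullptr) {}

  void add(ThreadRecord* t) { _threads[t->id] = t; }
  void set_current(ThreadRecord* t) { _current = t; }

  // thread_id == 0 means "the calling thread": skip the lookup entirely.
  // Any other id goes through the slower search; -1 signals "not found".
  std::int64_t allocated_bytes(std::int64_t thread_id) const {
    if (thread_id == 0) {
      return _current != nullptr ? _current->allocated_bytes : -1;
    }
    auto it = _threads.find(thread_id);
    return it != _threads.end() ? it->second->allocated_bytes : -1;
  }
};
```

The fast path and the lookup path return the same value for the same thread; only the cost of getting there differs.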

I hadn't previously tracked down what happens when getCurrentThreadUserTime and getCurrentThreadCpuTime are called, but if I'm not mistaken, the code in jcmd() in attachListener.cpp will call GetThreadCpuTimeWithKind in management.cpp, and it will ultimately use Thread::current() as the subject of the call; see os::current_thread_cpu_time in os_linux.cpp. That means that the CurrentThread methods should work remotely the same way they do locally. GetOneThreadsAllocatedBytes in management.cpp uses THREAD as its subject when called on behalf of getCurrentThreadAllocatedBytes, so it will also use the current remote Java thread. Even if these methods only worked locally, there are many setups where apps are self-monitoring that could use the performance improvement.

Thanks,

Paul

From: Mandy Chung <mandy.chung at oracle.com>
Date: Wednesday, August 28, 2019 at 3:59 PM
To: "Hohensee, Paul" <hohensee at amazon.com>
Cc: OpenJDK Serviceability <serviceability-dev at openjdk.java.net>, "hotspot-gc-dev at openjdk.java.net" <hotspot-gc-dev at openjdk.java.net>
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

Hi Paul,

The CSR proposes this method in java.lang.management.ThreadMXBean as a Java SE feature.

Has this been discussed with the GC team to commit measuring current thread's allocated bytes as Java SE feature?   Can this be supported by all JVM implementation?   What is the overhead if this is enabled by default?  Does it need to be disabled?   This metric is from TLAB that might be okay.  This needs advice/discussion with GC experts.

I see that CSR mentions it can be disabled and link to isThreadAllocatedMemoryEnabled() and setThreadAllocatedMemoryEnabled() methods but these methods are defined in com.sun.management.ThreadMXBean.

As Alan points out, current thread makes sense only in local VM management.  When this is monitored from a JMX client (e.g. jconsole connecting to a running JVM), is the "currentThreadAllocatedBytes" attribute the current thread in the jconsole process which is invoking Thread::currentThread?

Mandy
On 8/28/19 12:22 PM, Hohensee, Paul wrote:
Please review a performance improvement for ThreadMXBean.getThreadAllocatedBytes and the addition of getCurrentThreadAllocatedBytes.

JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266
Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/
CSR: https://bugs.openjdk.java.net/browse/JDK-8230311

Previous email threads:
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html

The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes. It'd be great for someone to review it.

I took Mandy's advice and put the fast paths in the library code. I added a new JMM method GetOneThreadsAllocatedBytes that works the same as GetThreadCpuTime: it uses a thread_id value of zero to distinguish the current thread. On my Mac laptop, the result runs 47x faster for the current thread than the old implementation.

The 3 tests in test/jdk/com/sun/management/ThreadMXBean all pass. I added code to ThreadAllocatedMemory.java to test getCurrentThreadAllocatedBytes as well as variations on getThreadAllocatedBytes(id). A submit repo job is in progress.

Thanks,

Paul


From hohensee at amazon.com  Thu Aug 29 18:05:04 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Thu, 29 Aug 2019 18:05:04 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be
 quicker for self thread
In-Reply-To: <9b62699c-0aac-a8fd-27a6-a8d4d3820dd2@oracle.com>
References: <CDB47B55-1327-4EFA-93E0-61E04A9EB61F@amazon.com>
 <ad45e908-b608-2f86-9c77-7a6e19144275@oracle.com>
 <9b62699c-0aac-a8fd-27a6-a8d4d3820dd2@oracle.com>
Message-ID: <C4241AFF-E9D2-4014-BB50-F760C268B76C@amazon.com>

Yes. See previous email.

Thanks,

On 8/29/19, 12:19 AM, "Alan Bateman" <Alan.Bateman at oracle.com> wrote:

    On 28/08/2019 23:58, Mandy Chung wrote:
    > Hi Paul,
    >
    > The CSR proposes this method in java.lang.management.ThreadMXBean as a 
    > Java SE feature.
    >
    > Has this been discussed with the GC team to commit measuring current 
    > thread's allocated bytes as Java SE feature?   Can this be supported 
    > by all JVM implementation?   What is the overhead if this is enabled 
    > by default?  Does it need to be disabled?   This metric is from TLAB 
    > that might be okay.  This needs advice/discussion with GC experts.
    The webrev adds it to jdk.management/com.sun.management.ThreadMXBean so 
    I suspect it is a typo in the CSR and the proposal is for it to be 
    JDK-specific.
    
    -Alan.
    

From kim.barrett at oracle.com  Thu Aug 29 18:35:01 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 29 Aug 2019 14:35:01 -0400
Subject: RFR(T): 8230327: Make G1DirtyCardQueueSet free-id init
 unconditional
In-Reply-To: <2d5e36aa-987e-8562-86e9-8627642bfd7d@oracle.com>
References: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
 <2d5e36aa-987e-8562-86e9-8627642bfd7d@oracle.com>
Message-ID: <DB308B25-86BA-4970-B1DA-935F91C0F52C@oracle.com>

> On Aug 29, 2019, at 5:49 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
> 
> On 29/08/2019 02:27, Kim Barrett wrote:
>> Please review this trivial cleanup of G1DirtyCardQueueSet
>> initialization. With the separation of G1RedirtyCardsQueueSet from
>> G1DirtyCardQueueSet, the latter is now a singleton class that should
>> always have an associated G1FreeIdSet.  We can now unconditionally
>> construct that object when constructing the DCQS.
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8230327
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8230327/open.00/
>> Testing:
>> Local (linux-x64) hotspot:tier1.
> 
> 
> Looks good! How about making G1FreeIdSet inline instead of a pointer? I am fine with either.

Sure.

New webrevs:
full: http://cr.openjdk.java.net/~kbarrett/8230327/open.01/
incr: http://cr.openjdk.java.net/~kbarrett/8230327/open.01.inc/


From mandrikov at gmail.com  Thu Aug 29 18:42:44 2019
From: mandrikov at gmail.com (Evgeny Mandrikov)
Date: Thu, 29 Aug 2019 20:42:44 +0200
Subject: RFR: JDK-8215166: Remove unused G1PretouchAuxiliaryMemory option
In-Reply-To: <9CE4B77F-EC98-4804-B744-6C2E15030936@oracle.com>
References: <CAEPFu6_go4sfFAfuf3s79JOAMHX0Y4v7YknH-+odDTNHvZdbZQ@mail.gmail.com>
 <9CE4B77F-EC98-4804-B744-6C2E15030936@oracle.com>
Message-ID: <CAEPFu68BKRchsH2SdL5Twjq8JBaXdT0Q6oWeOeFcM4PcS7-kYQ@mail.gmail.com>

Thank you, Kim!


Regards,
Evgeny


On Wed, Aug 28, 2019 at 8:13 PM Kim Barrett <kim.barrett at oracle.com> wrote:

> > On Aug 25, 2019, at 3:28 PM, Evgeny Mandrikov <mandrikov at gmail.com>
> wrote:
> >
> > Hello!
> >
> > Please review patch [1] for JDK-8215166 [2]. Also it needs a sponsor
> since
> > I have only author status in OpenJDK Census [3].
> >
> > After this change tier1 tests pass on my machine and I don't find other
> > occurrences of "G1PretouchAuxiliaryMemory".
> >
> >
> > With best regards,
> > Evgeny Mandrikov
> >
> > [1] http://cr.openjdk.java.net/~godin/8215166/webrev.00/
> > [2] https://bugs.openjdk.java.net/browse/JDK-8215166
> > [3] https://openjdk.java.net/census#godin
>
> Looks good, and "trivial" (only one Reviewer needed).
>
> I will sponsor.
>
>


From sangheon.kim at oracle.com  Thu Aug 29 21:26:59 2019
From: sangheon.kim at oracle.com (sangheon.kim at oracle.com)
Date: Thu, 29 Aug 2019 14:26:59 -0700
Subject: RFR: 8230109: G1DirtyCardQueueSet should use card counts rather
 than buffer counts
In-Reply-To: <586198B3-4ABE-4A81-96F2-627AF82F47FA@oracle.com>
References: <97BAE9AC-B0F6-4D6B-924C-4AB6C24777B3@oracle.com>
 <FD199B1F-9D15-4FCF-ABDA-5568E982EDBB@oracle.com>
 <284FD120-AEFB-4DAA-BCA4-E81803A73290@oracle.com>
 <DD3D7A06-3D50-4C97-85C3-FA04E4DEFF05@oracle.com>
 <00AE231F-2AE6-440C-92DE-D5EC76EFFE0E@oracle.com>
 <B55A783A-E822-44EA-946C-8D8ED6285912@oracle.com>
 <586198B3-4ABE-4A81-96F2-627AF82F47FA@oracle.com>
Message-ID: <27178dd0-1098-d51c-cab4-9f213dbd7f2b@oracle.com>

Hi Kim,

On 8/27/19 7:54 AM, Kim Barrett wrote:
>> On Aug 27, 2019, at 3:34 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>
>>
>>
>>> 26 aug. 2019 kl. 22:48 skrev Kim Barrett <kim.barrett at oracle.com>:
>>>
>>>> On Aug 26, 2019, at 2:29 PM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>>>
>>>>
>>>>
>>>>> 26 aug. 2019 kl. 17:42 skrev Kim Barrett <kim.barrett at oracle.com>:
>>>>>
>>>>>> On Aug 26, 2019, at 4:52 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>>>>>> CR:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230109
>>>>>>>
>>>>>>> Webrev:
>>>>>>> http://cr.openjdk.java.net/~kbarrett/8230109/open.00/
>>>>>> I really like the cleanups you've done in this area of the code base and this one is no exception. Looks good, just one question around the different G1ConcRefinement-flags (threshold and zones). Couldn't we make these be number of cards and get rid of the buffers_to_cards conversion in g1ConcurrentRefine.cpp?
>>>>> Those are product options; changing their semantics like that is not so easy.
>>>>>
>>>> True, but I think it is something we want to do in the longer run so maybe creating an enhancement for it to track it?
>>> Maybe, and maybe not.
>>>
>>> [...]
>>>
>>> So I'd prefer they were left alone until such time as we have a better
>>> understanding of what we actually want / need here.
>>>
>> Sounds like a good plan, and let's hope we can figure out some good names :)
> Thanks.
Looks good to me too.

Thanks,
Sangheon


From kim.barrett at oracle.com  Thu Aug 29 21:49:42 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 29 Aug 2019 17:49:42 -0400
Subject: RFR: 8230109: G1DirtyCardQueueSet should use card counts rather
 than buffer counts
In-Reply-To: <27178dd0-1098-d51c-cab4-9f213dbd7f2b@oracle.com>
References: <97BAE9AC-B0F6-4D6B-924C-4AB6C24777B3@oracle.com>
 <FD199B1F-9D15-4FCF-ABDA-5568E982EDBB@oracle.com>
 <284FD120-AEFB-4DAA-BCA4-E81803A73290@oracle.com>
 <DD3D7A06-3D50-4C97-85C3-FA04E4DEFF05@oracle.com>
 <00AE231F-2AE6-440C-92DE-D5EC76EFFE0E@oracle.com>
 <B55A783A-E822-44EA-946C-8D8ED6285912@oracle.com>
 <586198B3-4ABE-4A81-96F2-627AF82F47FA@oracle.com>
 <27178dd0-1098-d51c-cab4-9f213dbd7f2b@oracle.com>
Message-ID: <5E3178B6-4743-4449-9858-35CB78F78491@oracle.com>

> On Aug 29, 2019, at 5:26 PM, sangheon.kim at oracle.com wrote:
> 
> Hi Kim,
> 
> On 8/27/19 7:54 AM, Kim Barrett wrote:
>>> On Aug 27, 2019, at 3:34 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>> 
>>> 
>>> 
>>>> 26 aug. 2019 kl. 22:48 skrev Kim Barrett <kim.barrett at oracle.com>:
>>>> 
>>>>> On Aug 26, 2019, at 2:29 PM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>>>> 
>>>>> 
>>>>> 
>>>>>> 26 aug. 2019 kl. 17:42 skrev Kim Barrett <kim.barrett at oracle.com>:
>>>>>> 
>>>>>>> On Aug 26, 2019, at 4:52 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
>>>>>>>> CR:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8230109
>>>>>>>> 
>>>>>>>> Webrev:
>>>>>>>> http://cr.openjdk.java.net/~kbarrett/8230109/open.00/
>>>>>>> I really like the cleanups you've done in this area of the code base and this one is no exception. Looks good, just one question around the different G1ConcRefinement-flags (threshold and zones). Couldn't we make these be number of cards and get rid of the buffers_to_cards conversion in g1ConcurrentRefine.cpp?
>>>>>> Those are product options; changing their semantics like that is not so easy.
>>>>>> 
>>>>> True, but I think it is something we want to do in the longer run so maybe creating an enhancement for it to track it?
>>>> Maybe, and maybe not.
>>>> 
>>>> [...]
>>>> 
>>>> So I'd prefer they were left alone until such time as we have a better
>>>> understanding of what we actually want / need here.
>>>> 
>>> Sounds like a good plan, and let's hope we can figure out some good names :)
>> Thanks.
> Looks good to me too.
> 
> Thanks,
> Sangheon

Thanks.


From kim.barrett at oracle.com  Thu Aug 29 23:24:03 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Thu, 29 Aug 2019 19:24:03 -0400
Subject: RFR(T): 8230372: Remove G1GCPhaseTimes::MergeLBProcessedBuffers 
Message-ID: <ED296D10-2430-4795-8B9F-12723A7F1AFC@oracle.com>

Please review this trivial cleanup.  We're removing a phase time work
item that is no longer used for anything except logging.  It is no
longer interesting even for that, having been superseded by other,
more useful information that is similarly logged.

CR:
https://bugs.openjdk.java.net/browse/JDK-8230372

Webrev:
http://cr.openjdk.java.net/~kbarrett/8230372/open.00/

Testing:
Local (linux-x64) hotspot:tier1.  That includes gc/g1/TestGCLogMessages.java,
which was checking for the associated 


From stefan.johansson at oracle.com  Fri Aug 30 07:06:12 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Fri, 30 Aug 2019 09:06:12 +0200
Subject: RFR(T): 8230327: Make G1DirtyCardQueueSet free-id init
 unconditional
In-Reply-To: <DB308B25-86BA-4970-B1DA-935F91C0F52C@oracle.com>
References: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
 <2d5e36aa-987e-8562-86e9-8627642bfd7d@oracle.com>
 <DB308B25-86BA-4970-B1DA-935F91C0F52C@oracle.com>
Message-ID: <751fd937-7e09-ef00-9c24-f82615ba4de8@oracle.com>


On 2019-08-29 20:35, Kim Barrett wrote:
>> On Aug 29, 2019, at 5:49 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>
>> On 29/08/2019 02:27, Kim Barrett wrote:
>>> Please review this trivial cleanup of G1DirtyCardQueueSet
>>> initialization. With the separation of G1RedirtyCardsQueueSet from
>>> G1DirtyCardQueueSet, the latter is now a singleton class that should
>>> always have an associated G1FreeIdSet.  We can now unconditionally
>>> construct that object when constructing the DCQS.
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8230327
>>> Webrev:
>>> http://cr.openjdk.java.net/~kbarrett/8230327/open.00/
>>> Testing:
>>> Local (linux-x64) hotspot:tier1.
>>
>>
>> Looks good! How about making G1FreeIdSet inline instead of a pointer? I am fine with either.
> 
> Sure.
> 
> New webrevs:
> full: http://cr.openjdk.java.net/~kbarrett/8230327/open.01/
> incr: http://cr.openjdk.java.net/~kbarrett/8230327/open.01.inc/
> 
Still good,
Stefan


From thomas.schatzl at oracle.com  Fri Aug 30 07:43:53 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 30 Aug 2019 09:43:53 +0200
Subject: RFR(T): 8230327: Make G1DirtyCardQueueSet free-id init
 unconditional
In-Reply-To: <DB308B25-86BA-4970-B1DA-935F91C0F52C@oracle.com>
References: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
 <2d5e36aa-987e-8562-86e9-8627642bfd7d@oracle.com>
 <DB308B25-86BA-4970-B1DA-935F91C0F52C@oracle.com>
Message-ID: <ca2f8a43-b24c-9a34-f916-7bf615c4aefe@oracle.com>

Hi,

On 29.08.19 20:35, Kim Barrett wrote:
>> On Aug 29, 2019, at 5:49 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>
>> On 29/08/2019 02:27, Kim Barrett wrote:
>>> Please review this trivial cleanup of G1DirtyCardQueueSet
>>> initialization. With the separation of G1RedirtyCardsQueueSet from
>>> G1DirtyCardQueueSet, the latter is now a singleton class that should
>>> always have an associated G1FreeIdSet.  We can now unconditionally
>>> construct that object when constructing the DCQS.
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8230327
>>> Webrev:
>>> http://cr.openjdk.java.net/~kbarrett/8230327/open.00/
>>> Testing:
>>> Local (linux-x64) hotspot:tier1.
>>
>>
>> Looks good! How about making G1FreeIdSet inline instead of a pointer? I am fine with either.
> 
> Sure.
> 
> New webrevs:
> full: http://cr.openjdk.java.net/~kbarrett/8230327/open.01/
> incr: http://cr.openjdk.java.net/~kbarrett/8230327/open.01.inc/
> 

   looks good.

Thomas


From thomas.schatzl at oracle.com  Fri Aug 30 07:53:07 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 30 Aug 2019 09:53:07 +0200
Subject: RFR(T): 8230372: Remove G1GCPhaseTimes::MergeLBProcessedBuffers
In-Reply-To: <ED296D10-2430-4795-8B9F-12723A7F1AFC@oracle.com>
References: <ED296D10-2430-4795-8B9F-12723A7F1AFC@oracle.com>
Message-ID: <18e065dd-f02d-47d7-63c6-1d4fc3d900b7@oracle.com>

Hi,

On 30.08.19 01:24, Kim Barrett wrote:
> Please review this trivial cleanup.  We're removing a phase time work
> item that is no longer used for anything except logging.  It is no
> longer interesting even for that, having been superseded by other,
> more useful information that is similarly logged.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230372
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8230372/open.00/
> 
> Testing:
> Local (linux-x64) hotspot:tier1.  That includes gc/g1/TestGCLogMessages.java,
> which was checking for the associated
> 

   looks good.


Thanks,
   Thomas


From stefan.johansson at oracle.com  Fri Aug 30 08:09:58 2019
From: stefan.johansson at oracle.com (Stefan Johansson)
Date: Fri, 30 Aug 2019 10:09:58 +0200
Subject: RFR(T): 8230372: Remove G1GCPhaseTimes::MergeLBProcessedBuffers
In-Reply-To: <ED296D10-2430-4795-8B9F-12723A7F1AFC@oracle.com>
References: <ED296D10-2430-4795-8B9F-12723A7F1AFC@oracle.com>
Message-ID: <424bef8f-f7d2-2f91-6d6e-fa58b1f4c039@oracle.com>

Hi Kim,

On 2019-08-30 01:24, Kim Barrett wrote:
> Please review this trivial cleanup.  We're removing a phase time work
> item that is no longer used for anything except logging.  It is no
> longer interesting even for that, having been superseded by other,
> more useful information that is similarly logged.
> 
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8230372
> 
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8230372/open.00/
> 
> Testing:
> Local (linux-x64) hotspot:tier1.  That includes gc/g1/TestGCLogMessages.java,
> which was checking for the associated
> 

Looks good,
Stefan


From thomas.schatzl at oracle.com  Fri Aug 30 08:32:50 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 30 Aug 2019 10:32:50 +0200
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <VI1PR0201MB2479CCFC54585663AF96B1E79AA10@VI1PR0201MB2479.eurprd02.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <7035ccb8-000c-3a58-b5ac-fb0a3b949784@oracle.com>
 <DB8PR02MB5820A602CE4151359659D18A9AAF0@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <381f185e-ca2e-50c4-fe35-1e5e62ff88f6@oracle.com>
 <DB8PR02MB5820A9D3D815B26DAE38E5649AA80@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <VI1PR0201MB2479CCFC54585663AF96B1E79AA10@VI1PR0201MB2479.eurprd02.prod.outlook.com>
Message-ID: <95f94a8d-d32f-c2e4-25a0-9d7471f74e08@oracle.com>

Hi,

On 26.08.19 15:04, Doerr, Martin wrote:
> Hi all,
> 
> I had noticed that the platforms selection which need a fence in taskqueue.inline.hpp should get updated.
> 
> My initial webrev
> http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-atomic/webrev.00/
> was already reviewed on hotspot-gc-dev. It is an attempt to make things more consistent, especially the property "CPU_MULTI_COPY_ATOMIC".
> Also the compiler constant "support_IRIW_for_not_multiple_copy_atomic_cpu" depends on this property (currently only used on PPC64).
> 
> We could go one step further and move even more #defines into the platform files to give platform maintainers more control.
> I haven't got feedback from arm/aarch64 folks about this addition, yet:
> http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-atomic/webrev.01/
> With this proposal, each platform which is "CPU_MULTI_COPY_ATOMIC" is supposed to define this macro.
> Other platforms must define SUPPORT_IRIW_FOR_NOT_MULTI_COPY_ATOMIC_CPU and IRIW_WITH_RELEASE_VOLATILE_IN_CONSTRUCTOR for fine-grained control of the memory ordering behavior.
> We can even control them dynamically (added an experimental switch for PPC64 as an example).
> 
> Note that neither webrev.00 nor webrev.01 contain any functional changes other than the taskqueue update for s390 (and the experimental switch for PPC64 in webrev.01).
> 
> Feedback is welcome. Also if you have a preference wrt. webrev.00 vs. webrev.01.

   for pushing I would prefer the minimal amount of changes to solve the 
original issue, and move all other changes to a different CR.

Also, I would prefer if all globalDefinitions files contained all 
defines, commented out if needed. I.e. to try to show that not defining 
a particular macro has been deliberate and not an oversight.

(Like in the 00 webrev where the code at least states for aarch64:
37 // aarch64 is not CPU_MULTI_COPY_ATOMIC

I am aware that this is not correct given new information, but in 
context of the CR it is/was)

Further, let's avoid "TODOs" in the sources, the correct place for those 
is JIRA imho. :)

Thanks,
   Thomas


From martin.doerr at sap.com  Fri Aug 30 11:14:45 2019
From: martin.doerr at sap.com (Doerr, Martin)
Date: Fri, 30 Aug 2019 11:14:45 +0000
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <95f94a8d-d32f-c2e4-25a0-9d7471f74e08@oracle.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <7035ccb8-000c-3a58-b5ac-fb0a3b949784@oracle.com>
 <DB8PR02MB5820A602CE4151359659D18A9AAF0@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <381f185e-ca2e-50c4-fe35-1e5e62ff88f6@oracle.com>
 <DB8PR02MB5820A9D3D815B26DAE38E5649AA80@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <VI1PR0201MB2479CCFC54585663AF96B1E79AA10@VI1PR0201MB2479.eurprd02.prod.outlook.com>
 <95f94a8d-d32f-c2e4-25a0-9d7471f74e08@oracle.com>
Message-ID: <VI1PR0201MB2479543A538C039B68B8AA639ABD0@VI1PR0201MB2479.eurprd02.prod.outlook.com>

Hi Thomas,

good proposal.

Here's the minimal version:
http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-atomic/webrev.02/

I've removed the compiler part. I can create a separate issue for making C1 and C2 consistent.

Arm32/aarch64 folks can create new issues if they like further changes.
I don't have any further requirements for s390 and PPC64 at the moment.

Can I consider it as reviewed by Thomas, David and Derek?

Best regards,
Martin


> -----Original Message-----
> From: Thomas Schatzl <thomas.schatzl at oracle.com>
> Sent: Freitag, 30. August 2019 10:33
> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-runtime-
> dev at openjdk.java.net; 'hotspot-compiler-dev at openjdk.java.net' <hotspot-
> compiler-dev at openjdk.java.net>
> Cc: hotspot-gc-dev at openjdk.java.net; David Holmes
> (david.holmes at oracle.com) <david.holmes at oracle.com>; Derek White
> <derekw at marvell.com>
> Subject: Re: RFR(S): 8229422: Taskqueue: Outdated selection of weak
> memory model platforms
> 
> Hi,
> 
> On 26.08.19 15:04, Doerr, Martin wrote:
> > Hi all,
> >
> > I had noticed that the platforms selection which need a fence in
> taskqueue.inline.hpp should get updated.
> >
> > My initial webrev
> > http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-
> atomic/webrev.00/
> > was already reviewed on hotspot-gc-dev. It is an attempt to make things
> more consistent, especially the property "CPU_MULTI_COPY_ATOMIC".
> > Also the compiler constant
> "support_IRIW_for_not_multiple_copy_atomic_cpu" depends on this
> property (currently only used on PPC64).
> >
> > We could go one step further and move even more #defines into the
> platform files to give platform maintainers more control.
> > I haven't got feedback from arm/aarch64 folks about this addition, yet:
> > http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-
> atomic/webrev.01/
> > With this proposal, each platform which is "CPU_MULTI_COPY_ATOMIC" is
> supposed to define this macro.
> > Other platforms must define
> SUPPORT_IRIW_FOR_NOT_MULTI_COPY_ATOMIC_CPU and
> IRIW_WITH_RELEASE_VOLATILE_IN_CONSTRUCTOR for fine-grained control
> of the memory ordering behavior.
> > We can even control them dynamically (added an experimental switch for
> PPC64 as an example).
> >
> > Note that neither webrev.00 nor webrev.01 contain any functional changes
> other than the taskqueue update for s390 (and the experimental switch for
> PPC64 in webrev.01).
> >
> > Feedback is welcome. Also if you have a preference wrt. webrev.00 vs.
> webrev.01.
> 
>    for pushing I would prefer the minimal amount of changes to solve the
> original issue, and move all other changes to a different CR.
> 
> Also, I would prefer if all globalDefinitions files contained all
> defines, commented out if needed. I.e. to try to show that not defining
> a particular macro has been deliberate and not an oversight.
> 
> (Like in the 00 webrev where the code at least states for aarch64:
> 37 // aarch64 is not CPU_MULTI_COPY_ATOMIC
> 
> I am aware that this is not correct given new information, but in
> context of the CR it is/was)
> 
> Further, let's avoid "TODOs" in the sources, the correct place for those
> is JIRA imho. :)
> 
> Thanks,
>    Thomas
> 


From thomas.schatzl at oracle.com  Fri Aug 30 11:34:24 2019
From: thomas.schatzl at oracle.com (Thomas Schatzl)
Date: Fri, 30 Aug 2019 13:34:24 +0200
Subject: RFR(S): 8229422: Taskqueue: Outdated selection of weak memory
 model platforms
In-Reply-To: <VI1PR0201MB2479543A538C039B68B8AA639ABD0@VI1PR0201MB2479.eurprd02.prod.outlook.com>
References: <DB8PR02MB58205880F81417945629A86E9AD30@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <9d9819fe-560f-13f0-1907-794e063ee687@oracle.com>
 <DB8PR02MB5820AB8B23A7B4A173DB875B9AD20@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <7035ccb8-000c-3a58-b5ac-fb0a3b949784@oracle.com>
 <DB8PR02MB5820A602CE4151359659D18A9AAF0@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <381f185e-ca2e-50c4-fe35-1e5e62ff88f6@oracle.com>
 <DB8PR02MB5820A9D3D815B26DAE38E5649AA80@DB8PR02MB5820.eurprd02.prod.outlook.com>
 <VI1PR0201MB2479CCFC54585663AF96B1E79AA10@VI1PR0201MB2479.eurprd02.prod.outlook.com>
 <95f94a8d-d32f-c2e4-25a0-9d7471f74e08@oracle.com>
 <VI1PR0201MB2479543A538C039B68B8AA639ABD0@VI1PR0201MB2479.eurprd02.prod.outlook.com>
Message-ID: <55b931eb-6cc9-1352-02a8-12e51d1231e9@oracle.com>

Hi Martin,

On 30.08.19 13:14, Doerr, Martin wrote:
> Hi Thomas,
> 
> good proposal.
> 
> Here's the minimal version:
> http://cr.openjdk.java.net/~mdoerr/8229422_multi-copy-atomic/webrev.02/
> 
> I've removed the compiler part. I can create a separate issue for making C1 and C2 consistent.
> 
> Arm32/aarch64 folks can create new issues if they like further changes.
> I don't have any further requirements for s390 and PPC64 at the moment.
> 
> Can I consider it as reviewed by Thomas, David and Derek?
> 

   looks good. I filed JDK-8230392 to pick up and test by Aarch64 
maintainers.

I am not so knowledgeable about the other proposals made here earlier, 
so I defer filing and fixing these to the respective maintainers.

Thanks,
   Thomas


From tprintezis at twitter.com  Fri Aug 30 14:29:37 2019
From: tprintezis at twitter.com (Tony Printezis)
Date: Fri, 30 Aug 2019 16:29:37 +0200
Subject: RFR(S): 8227224: GenCollectedHeap: add subspace transitions for
 young gen for gc+heap=info log lines
In-Reply-To: <826948e1-837f-1885-4e64-9c9d363b363a@oracle.com>
References: <CAOzU2inYb4RFbBiQC5PjnTWTCYTouNAyPQ-vX2SpENgUQdvQPQ@mail.gmail.com>
 <826948e1-837f-1885-4e64-9c9d363b363a@oracle.com>
Message-ID: <CAOzU2imDBrZ3YJC7trioj91Wb04o1rRQsfSwh4dVqEvE1SPmuw@mail.gmail.com>

Thomas,

Apologies - For some reason I completely missed your response! Thanks, I'll
push this today.

Tony


-----
Tony Printezis | @TonyPrintezis | tprintezis at twitter.com


On August 9, 2019 at 12:10:02 AM, Thomas Schatzl (thomas.schatzl at oracle.com)
wrote:

Hi,

On 07.08.19 16:05, Tony Printezis wrote:
> Hi all,
>
> Similar to 8227225 but for the GenCollectedHeap GCs. Webrev is here:
>
> http://cr.openjdk.java.net/~tonyp/8227224/webrev.0/
>
> Tony
>

looks good.

Thomas


From erik.osterlund at oracle.com  Fri Aug 30 15:17:48 2019
From: erik.osterlund at oracle.com (Erik Österlund)
Date: Fri, 30 Aug 2019 17:17:48 +0200
Subject: RFR: 8219724: ZGC: Make inline cache cleaning more robust
Message-ID: <29e3a713-113a-1486-bffc-c7c2e6a988f7@oracle.com>

Hi,

Today, during the nmethod unlinking phase, the per-nmethod lock is held 
across first an is_unloading() call on the nmethod and then inline cache 
cleaning, which may take the nmethod locks of all nmethods referred to 
from the inline caches.
If care is not taken, an nmethod A can have an inline cache pointing at 
nmethod B, and B can have an inline cache pointing back at A. This could 
potentially cause a deadlock. Today it is subtly safe, because between 
calling is_unloading() and cleaning the inline caches, the nmethod entry 
barrier is disarmed, which causes an mfence in the patching code. This 
ensures that the racing threads do not enter a deadlock situation, 
because they will observe the is_unloading state that was published as a 
cache by the other thread in the race, causing the locks that would 
cause the deadlock to not be taken.

I would like to move the locks so that this becomes more robust, and 
does not rely on the implicit fencing between is_unloading() and 
cleaning the inline caches.
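
To illustrate the locking hazard described above, here is a minimal sketch in Java rather than HotSpot C++. It is not the change in the webrev: MethodSketch and all of its members are hypothetical stand-ins for an nmethod, its per-nmethod lock and its inline caches. It shows one way to arrange the locks so that no thread ever holds two per-method locks at once, which means an A -> B / B -> A cycle cannot deadlock regardless of any implicit fencing.

```java
import java.util.ArrayList;
import java.util.List;
import java.util.concurrent.locks.ReentrantLock;

// Hypothetical stand-in for an nmethod; none of these names are HotSpot's.
final class MethodSketch {
    private final ReentrantLock lock = new ReentrantLock();
    private final List<MethodSketch> inlineCacheTargets = new ArrayList<>();
    private final boolean unloading;

    MethodSketch(boolean unloading) {
        this.unloading = unloading;
    }

    void addInlineCacheTarget(MethodSketch target) {
        lock.lock();
        try {
            inlineCacheTargets.add(target);
        } finally {
            lock.unlock();
        }
    }

    // Takes only this method's lock, and releases it before returning.
    boolean isUnloading() {
        lock.lock();
        try {
            return unloading;
        } finally {
            lock.unlock();
        }
    }

    int targetCount() {
        lock.lock();
        try {
            return inlineCacheTargets.size();
        } finally {
            lock.unlock();
        }
    }

    // Removes inline cache entries that point at unloading methods and
    // returns how many were removed. The own lock is dropped before any
    // target's lock is taken, so two locks are never held together.
    int cleanInlineCaches() {
        List<MethodSketch> snapshot;
        lock.lock();
        try {
            snapshot = new ArrayList<>(inlineCacheTargets);
        } finally {
            lock.unlock();
        }

        List<MethodSketch> dead = new ArrayList<>();
        for (MethodSketch target : snapshot) {
            if (target.isUnloading()) { // only the target's lock, briefly
                dead.add(target);
            }
        }

        lock.lock();
        try {
            int before = inlineCacheTargets.size();
            inlineCacheTargets.removeAll(dead);
            return before - inlineCacheTargets.size();
        } finally {
            lock.unlock();
        }
    }
}
```

The hazardous shape would be the same method with the first lock held across the loop: thread 1 holding A's lock and waiting for B's, while thread 2 holds B's lock and waits for A's.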

Bug:
https://bugs.openjdk.java.net/browse/JDK-8219724

Webrev:
http://cr.openjdk.java.net/~eosterlund/8219724/webrev.00/

Thanks,
/Erik


From kim.barrett at oracle.com  Fri Aug 30 17:14:46 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 30 Aug 2019 13:14:46 -0400
Subject: RFR(T): 8230327: Make G1DirtyCardQueueSet free-id init
 unconditional
In-Reply-To: <751fd937-7e09-ef00-9c24-f82615ba4de8@oracle.com>
References: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
 <2d5e36aa-987e-8562-86e9-8627642bfd7d@oracle.com>
 <DB308B25-86BA-4970-B1DA-935F91C0F52C@oracle.com>
 <751fd937-7e09-ef00-9c24-f82615ba4de8@oracle.com>
Message-ID: <79F3D17A-B9C3-4487-ADF1-252F1A1FCE85@oracle.com>

> On Aug 30, 2019, at 3:06 AM, Stefan Johansson <stefan.johansson at oracle.com> wrote:
> 
> 
> 
> On 2019-08-29 20:35, Kim Barrett wrote:
>>> On Aug 29, 2019, at 5:49 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>> 
>>> On 29/08/2019 02:27, Kim Barrett wrote:
>>>> Please review this trivial cleanup of G1DirtyCardQueueSet
>>>> initialization. With the separation of G1RedirtyCardsQueueSet from
>>>> G1DirtyCardQueueSet, the latter is now a singleton class that should
>>>> always have an associated G1FreeIdSet.  We can now unconditionally
>>>> construct that object when constructing the DCQS.
>>>> CR:
>>>> https://bugs.openjdk.java.net/browse/JDK-8230327
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~kbarrett/8230327/open.00/
>>>> Testing:
>>>> Local (linux-x64) hotspot:tier1.
>>> 
>>> 
>>> Looks good! How about making G1FreeIdSet inline instead of a pointer? I am fine with either.
>> Sure.
>> New webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8230327/open.01/
>> incr: http://cr.openjdk.java.net/~kbarrett/8230327/open.01.inc/
> Still good,
> Stefan

Thanks.


From kim.barrett at oracle.com  Fri Aug 30 17:15:09 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 30 Aug 2019 13:15:09 -0400
Subject: RFR(T): 8230327: Make G1DirtyCardQueueSet free-id init
 unconditional
In-Reply-To: <ca2f8a43-b24c-9a34-f916-7bf615c4aefe@oracle.com>
References: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
 <2d5e36aa-987e-8562-86e9-8627642bfd7d@oracle.com>
 <DB308B25-86BA-4970-B1DA-935F91C0F52C@oracle.com>
 <ca2f8a43-b24c-9a34-f916-7bf615c4aefe@oracle.com>
Message-ID: <0C58C041-5233-44CC-8393-6C6C8B12AFDC@oracle.com>

> On Aug 30, 2019, at 3:43 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
>> New webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8230327/open.01/
>> incr: http://cr.openjdk.java.net/~kbarrett/8230327/open.01.inc/
> 
>  looks good.
> 
> Thomas

Thanks.


From kim.barrett at oracle.com  Fri Aug 30 17:16:11 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 30 Aug 2019 13:16:11 -0400
Subject: RFR(T): 8230372: Remove G1GCPhaseTimes::MergeLBProcessedBuffers
In-Reply-To: <18e065dd-f02d-47d7-63c6-1d4fc3d900b7@oracle.com>
References: <ED296D10-2430-4795-8B9F-12723A7F1AFC@oracle.com>
 <18e065dd-f02d-47d7-63c6-1d4fc3d900b7@oracle.com>
Message-ID: <482AC0C9-D228-4669-8CCE-5E85D778B944@oracle.com>

> On Aug 30, 2019, at 3:53 AM, Thomas Schatzl <thomas.schatzl at oracle.com> wrote:
> 
> Hi,
> 
> On 30.08.19 01:24, Kim Barrett wrote:
>> Please review this trivial cleanup.  We're removing a phase time work
>> item that is no longer used for anything except logging.  It is no
>> longer interesting even for that, having been superseded by other,
>> more useful information that is similarly logged.
>> CR:
>> https://bugs.openjdk.java.net/browse/JDK-8230372
>> Webrev:
>> http://cr.openjdk.java.net/~kbarrett/8230372/open.00/
>> Testing:
>> Local (linux-x64) hotspot:tier1.  That includes gc/g1/TestGCLogMessages.java,
>> which was checking for the associated
> 
>  looks good.
> 
> 
> Thanks,
>  Thomas

Thanks, Thomas and Stefan.


From leo.korinth at oracle.com  Fri Aug 30 17:20:02 2019
From: leo.korinth at oracle.com (Leo Korinth)
Date: Fri, 30 Aug 2019 19:20:02 +0200
Subject: RFR(T): 8230327: Make G1DirtyCardQueueSet free-id init
 unconditional
In-Reply-To: <DB308B25-86BA-4970-B1DA-935F91C0F52C@oracle.com>
References: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
 <2d5e36aa-987e-8562-86e9-8627642bfd7d@oracle.com>
 <DB308B25-86BA-4970-B1DA-935F91C0F52C@oracle.com>
Message-ID: <b71c0e65-657a-c3f5-8273-0c4731598c3b@oracle.com>

Still looks good!

Thanks,
Leo

On 29/08/2019 20:35, Kim Barrett wrote:
>> On Aug 29, 2019, at 5:49 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>
>> On 29/08/2019 02:27, Kim Barrett wrote:
>>> Please review this trivial cleanup of G1DirtyCardQueueSet
>>> initialization. With the separation of G1RedirtyCardsQueueSet from
>>> G1DirtyCardQueueSet, the latter is now a singleton class that should
>>> always have an associated G1FreeIdSet.  We can now unconditionally
>>> construct that object when constructing the DCQS.
>>> CR:
>>> https://bugs.openjdk.java.net/browse/JDK-8230327
>>> Webrev:
>>> http://cr.openjdk.java.net/~kbarrett/8230327/open.00/
>>> Testing:
>>> Local (linux-x64) hotspot:tier1.
>>
>>
>> Looks good! How about making G1FreeIdSet inline instead of a pointer? I am fine with either.
> 
> Sure.
> 
> New webrevs:
> full: http://cr.openjdk.java.net/~kbarrett/8230327/open.01/
> incr: http://cr.openjdk.java.net/~kbarrett/8230327/open.01.inc/
> 


From mandy.chung at oracle.com  Fri Aug 30 17:21:50 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 30 Aug 2019 10:21:50 -0700
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be
 quicker for self thread
In-Reply-To: <AC4FEEF0-3FD7-4C48-AC1F-429C52EEBCF3@amazon.com>
References: <CDB47B55-1327-4EFA-93E0-61E04A9EB61F@amazon.com>
 <ad45e908-b608-2f86-9c77-7a6e19144275@oracle.com>
 <AC4FEEF0-3FD7-4C48-AC1F-429C52EEBCF3@amazon.com>
Message-ID: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com>

OK.  That's better.  Some review comments:

The javadoc of getCurrentThreadAllocatedBytes() can simply say:

"Returns an approximation of the total amount of memory, in bytes,
allocated in heap memory for the current thread.

This is a convenient method for local management use and is equivalent
to calling getThreadAllocatedBytes(Thread.currentThread().getId()).


src/hotspot/share/include/jmm.h

GetOneThreadsAllocatedMemory: s/OneThreads/OneThread/

sun/management/ThreadImpl.java

  43     private static final String THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED =
  44         "Thread allocated memory measurement is not supported.";

if (!isThreadAllocatedMemorySupported()) {
   throw new UnsupportedOperationException(THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED);
}

Perhaps the above can be refactored as 
throwIfAllocatedMemoryUnsupported() method.
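
A minimal sketch of what that refactoring could look like. AllocatedMemoryGuard is a hypothetical class used only to keep the example self-contained, and the boolean field stands in for whatever ThreadImpl really queries; only the message constant, isThreadAllocatedMemorySupported() and the proposed helper name come from the review above.

```java
// Sketch only: AllocatedMemoryGuard is a hypothetical stand-in for the
// relevant part of sun.management.ThreadImpl.
final class AllocatedMemoryGuard {
    private static final String THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED =
        "Thread allocated memory measurement is not supported.";

    // Assumption: stands in for the real VM capability query.
    private final boolean supported;

    AllocatedMemoryGuard(boolean supported) {
        this.supported = supported;
    }

    boolean isThreadAllocatedMemorySupported() {
        return supported;
    }

    // The proposed helper: one place that performs the check and throws,
    // so callers can replace the repeated if/throw with a single call.
    void throwIfAllocatedMemoryUnsupported() {
        if (!isThreadAllocatedMemorySupported()) {
            throw new UnsupportedOperationException(
                THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED);
        }
    }
}
```

Each method that currently repeats the if/throw would then start with a call to throwIfAllocatedMemoryUnsupported().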

 391         if (ids.length == 1) {
 392             sizes[0] = -1;
 :
 398             if (ids.length == 1) {
 399                 long id = ids[0];
 400                 sizes[0] = getThreadAllocatedMemory0(
 401                     Thread.currentThread().getId() == id ? 0 : id);
 402             } else {

It seems cleaner to handle the 1-element array case at the beginning
of this method:
   if (ids.length == 1) {
       long size = getThreadAllocatedBytes(ids[0]);
       return new long[] { size };
   }
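
For illustration, the early return could sit in the array overload roughly as sketched here. SingleIdFastPath and the injected lookup function are hypothetical, chosen so the example runs without the native methods; the loop at the end is only a placeholder for the existing bulk path, and the explicit null check is an assumption about how a null ids reference should be handled.

```java
import java.util.Objects;
import java.util.function.LongUnaryOperator;

// Sketch only: SingleIdFastPath is a hypothetical stand-in for ThreadImpl.
final class SingleIdFastPath {
    // Assumption: stands in for the existing single-id lookup.
    private final LongUnaryOperator singleThreadLookup;

    SingleIdFastPath(LongUnaryOperator singleThreadLookup) {
        this.singleThreadLookup = singleThreadLookup;
    }

    long getThreadAllocatedBytes(long id) {
        return singleThreadLookup.applyAsLong(id);
    }

    long[] getThreadAllocatedBytes(long[] ids) {
        Objects.requireNonNull(ids, "ids");

        // 1-element case handled up front by delegating to the single-id
        // method, so any fast path there is reused without duplication.
        if (ids.length == 1) {
            long size = getThreadAllocatedBytes(ids[0]);
            return new long[] { size };
        }

        // Placeholder for the existing multi-id path.
        long[] sizes = new long[ids.length];
        for (int i = 0; i < ids.length; i++) {
            sizes[i] = getThreadAllocatedBytes(ids[i]);
        }
        return sizes;
    }
}
```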

I didn't review the hotspot implementation and the test.

Mandy

On 8/29/19 10:01 AM, Hohensee, Paul wrote:
>
> My bad, Mandy. The webrev puts getCurrentThreadAllocatedBytes in 
> com.sun.management.ThreadMXBean along with the current two 
> getThreadAllocatedBytes methods for the reasons you list. I've updated 
> the CSR to specify com.sun.management and added a rationale. 
> AllocatedBytes is currently enabled by Hotspot by default because the 
> overhead of recording TLAB occupancy is negligible.
>
> There's no new GC code, nor will there be, so imo we don't have to 
> involve the GC folks. I.e., the new JMM method 
> GetOneThreadsAllocatedBytes uses the existing cooked_allocated_bytes 
> JavaThread method, and getCurrentThreadAllocatedBytes is the same as 
> getThreadAllocatedBytes: it just bypasses the thread lookup code.
>
> I hadn't tracked down what happens when getCurrentThreadUserTime and 
> getCurrentThreadCpuTime are called before, but if I'm not mistaken, 
> the code in jcmd() in attachListener.cpp will call 
> GetThreadCpuTimeWithKind in management.cpp, and it will ultimately use 
> Thread::current() as the subject of the call, see 
> os::current_thread_cpu_time in os_linux.cpp. That means that the 
> CurrentThread methods should work remotely the same way they do 
> locally. GetOneThreadsAllocatedBytes in management.cpp uses THREAD as 
> its subject when called on behalf of getCurrentThreadAllocatedBytes, 
> so it will also use the current remote Java thread. Even if these 
> methods only worked locally, there are many setups where apps are 
> self-monitoring that could use the performance improvement.
>
> Thanks,
>
> Paul
>
> *From: *Mandy Chung <mandy.chung at oracle.com>
> *Date: *Wednesday, August 28, 2019 at 3:59 PM
> *To: *"Hohensee, Paul" <hohensee at amazon.com>
> *Cc: *OpenJDK Serviceability <serviceability-dev at openjdk.java.net>, 
> "hotspot-gc-dev at openjdk.java.net" <hotspot-gc-dev at openjdk.java.net>
> *Subject: *Re: RFR (M): 8207266: 
> ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread
>
> Hi Paul,
>
> The CSR proposes this method in java.lang.management.ThreadMXBean as a 
> Java SE feature.
>
> Has this been discussed with the GC team to commit measuring current 
> thread's allocated bytes as Java SE feature?  Can this be supported 
> by all JVM implementation?  What is the overhead if this is enabled 
> by default?  Does it need to be disabled?  This metric is from TLAB 
> that might be okay.  This needs advice/discussion with GC experts.
>
> I see that CSR mentions it can be disabled and link to 
> isThreadAllocatedMemoryEnabled() and setThreadAllocatedMemoryEnabled() 
> methods but these methods are defined in com.sun.management.ThreadMXBean.
>
> As Alan points out, current thread makes sense only in local VM 
> management.  When this is monitored from a JMX client (e.g. jconsole 
> connected to a running JVM), is the "currentThreadAllocatedBytes" 
> attribute the current thread in the jconsole process, which is invoking 
> Thread::currentThread?
>
> Mandy
>
> On 8/28/19 12:22 PM, Hohensee, Paul wrote:
>
>     Please review a performance improvement for
>     ThreadMXBean.getThreadAllocatedBytes and the addition of
>     getCurrentThreadAllocatedBytes.
>
>     JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266
>
>     Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/
>
>     CSR: https://bugs.openjdk.java.net/browse/JDK-8230311
>
>     Previous email threads:
>     https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html
>     https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html
>
>     The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes.
>     It'd be great for someone to review it.
>
>     I took Mandy's advice and put the fast paths in the library code.
>     I added a new JMM method GetOneThreadsAllocatedBytes that works
>     the same as GetThreadCpuTime: it uses a thread_id value of zero to
>     distinguish the current thread. On my Mac laptop, the result runs
>     47x faster for the current thread than the old implementation.
>
>     The 3 tests in test/jdk/com/sun/management/ThreadMXBean all pass. I
>     added code to ThreadAllocatedMemory.java to test
>     getCurrentThreadAllocatedBytes as well as variations on
>     getThreadAllocatedBytes(id). A submit repo job is in progress.
>
>     Thanks,
>
>     Paul
>
>
>


From kim.barrett at oracle.com  Fri Aug 30 17:30:08 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Fri, 30 Aug 2019 13:30:08 -0400
Subject: RFR(T): 8230327: Make G1DirtyCardQueueSet free-id init
 unconditional
In-Reply-To: <b71c0e65-657a-c3f5-8273-0c4731598c3b@oracle.com>
References: <309EA90A-4C68-4836-A3AA-CCB2F44C81FC@oracle.com>
 <2d5e36aa-987e-8562-86e9-8627642bfd7d@oracle.com>
 <DB308B25-86BA-4970-B1DA-935F91C0F52C@oracle.com>
 <b71c0e65-657a-c3f5-8273-0c4731598c3b@oracle.com>
Message-ID: <A68D7037-3B1B-4FD4-B03C-3CBCDAB9A533@oracle.com>

> On Aug 30, 2019, at 1:20 PM, Leo Korinth <leo.korinth at oracle.com> wrote:
> 
> Still looks good!
> 
> Thanks,
> Leo
> 
>> New webrevs:
>> full: http://cr.openjdk.java.net/~kbarrett/8230327/open.01/
>> incr: http://cr.openjdk.java.net/~kbarrett/8230327/open.01.inc/

Thanks.


From hohensee at amazon.com  Fri Aug 30 22:33:05 2019
From: hohensee at amazon.com (Hohensee, Paul)
Date: Fri, 30 Aug 2019 22:33:05 +0000
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be
 quicker for self thread
In-Reply-To: <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com>
References: <CDB47B55-1327-4EFA-93E0-61E04A9EB61F@amazon.com>
 <ad45e908-b608-2f86-9c77-7a6e19144275@oracle.com>
 <AC4FEEF0-3FD7-4C48-AC1F-429C52EEBCF3@amazon.com>
 <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com>
Message-ID: <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com>

Thanks for your review, Mandy. Revised webrev at http://cr.openjdk.java.net/~phh/8207266/webrev.02/.

I updated the CSR with your suggested javadoc for getCurrentThreadAllocatedBytes. It now matches that for getCurrentThreadUserTime and getCurrentThreadCpuTime. I also fixed the "convenient" -> "convenience" typos in j.l.m.ThreadMXBean.java.

I meant GetOneThreads to be the possessive, but don't feel strongly either way so I'm fine with GetOneThread.


I updated ThreadImpl.java as you suggested, though in getThreadAllocatedBytes(long[] ids) I had to add a redundant-in-the-not-length-1-case check for a null ids reference.


Would someone take a look at the Hotspot side and the test please?


Paul

From: Mandy Chung <mandy.chung at oracle.com>
Date: Friday, August 30, 2019 at 10:22 AM
To: "Hohensee, Paul" <hohensee at amazon.com>
Cc: OpenJDK Serviceability <serviceability-dev at openjdk.java.net>, "hotspot-gc-dev at openjdk.java.net" <hotspot-gc-dev at openjdk.java.net>
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

OK.  That's better.  Some review comments:

The javadoc of getCurrentThreadAllocatedBytes() can simply say:

"Returns an approximation of the total amount of memory, in bytes,
allocated in heap memory for the current thread.

This is a convenient method for local management use and is equivalent
to calling getThreadAllocatedBytes(Thread.currentThread().getId()).


src/hotspot/share/include/jmm.h

GetOneThreadsAllocatedMemory: s/OneThreads/OneThread/

sun/management/ThreadImpl.java

  43     private static final String THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED =
  44         "Thread allocated memory measurement is not supported.";

if (!isThreadAllocatedMemorySupported()) {
   throw new UnsupportedOperationException(THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED);
}

Perhaps the above can be refactored into a throwIfAllocatedMemoryUnsupported() method.

 391         if (ids.length == 1) {
 392             sizes[0] = -1;
 :
 398             if (ids.length == 1) {
 399                 long id = ids[0];
 400                 sizes[0] = getThreadAllocatedMemory0(
 401                     Thread.currentThread().getId() == id ? 0 : id);
 402             } else {

It seems cleaner to handle the 1-element array case at the beginning
of this method:
   if (ids.length == 1) {
       long size = getThreadAllocatedBytes(ids[0]);
       return new long[] { size };
   }
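The restructuring suggested above could look roughly like the following. This is a sketch under stated assumptions, not the ThreadImpl code: AllocatedBytesSketch, record and lookupAllocatedBytes are hypothetical names, and an in-memory map stands in for the native getThreadAllocatedMemory0 call. The convention that an id of 0 means "the current thread" is taken from the description of the new JMM method elsewhere in this thread.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical stand-in for sun.management.ThreadImpl; names are illustrative only.
class AllocatedBytesSketch {
    // Fake per-thread counters replacing the native VM call (assumption for the sketch).
    private final Map<Long, Long> counters = new HashMap<>();
    private final long currentThreadId;

    AllocatedBytesSketch(long currentThreadId) {
        this.currentThreadId = currentThreadId;
    }

    void record(long id, long bytes) {
        counters.merge(id, bytes, Long::sum);
    }

    // Stand-in for the native lookup: id 0 means "the calling thread".
    private long lookupAllocatedBytes(long id) {
        long key = (id == 0) ? currentThreadId : id;
        return counters.getOrDefault(key, -1L);
    }

    long getThreadAllocatedBytes(long id) {
        if (id <= 0) {
            throw new IllegalArgumentException("Invalid thread ID parameter: " + id);
        }
        // Fast path: pass 0 so the lookup can skip searching for the current thread.
        return lookupAllocatedBytes(id == currentThreadId ? 0 : id);
    }

    long[] getThreadAllocatedBytes(long[] ids) {
        // Handle the 1-element case first, as suggested in the review.
        if (ids.length == 1) {
            return new long[] { getThreadAllocatedBytes(ids[0]) };
        }
        long[] sizes = new long[ids.length];
        for (int i = 0; i < ids.length; i++) {
            sizes[i] = getThreadAllocatedBytes(ids[i]);
        }
        return sizes;
    }
}
```

Delegating to the single-id method keeps the current-thread test in one place, so the array method needs no special casing beyond the early return.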

I didn't review the hotspot implementation and the test.

Mandy

On 8/29/19 10:01 AM, Hohensee, Paul wrote:
My bad, Mandy. The webrev puts getCurrentThreadAllocatedBytes in com.sun.management.ThreadMXBean along with the current two getThreadAllocatedBytes methods for the reasons you list. I've updated the CSR to specify com.sun.management and added a rationale. AllocatedBytes is currently enabled by Hotspot by default because the overhead of recording TLAB occupancy is negligible.

There's no new GC code, nor will there be, so imo we don't have to involve the GC folks. I.e., the new JMM method GetOneThreadsAllocatedBytes uses the existing cooked_allocated_bytes JavaThread method, and getCurrentThreadAllocatedBytes is the same as getThreadAllocatedBytes: it just bypasses the thread lookup code.

I hadn't previously tracked down what happens when getCurrentThreadUserTime and getCurrentThreadCpuTime are called, but if I'm not mistaken, the code in jcmd() in attachListener.cpp will call GetThreadCpuTimeWithKind in management.cpp, and it will ultimately use Thread::current() as the subject of the call; see os::current_thread_cpu_time in os_linux.cpp. That means that the CurrentThread methods should work remotely the same way they do locally. GetOneThreadsAllocatedBytes in management.cpp uses THREAD as its subject when called on behalf of getCurrentThreadAllocatedBytes, so it will also use the current remote Java thread. Even if these methods only worked locally, there are many setups where apps are self-monitoring that could use the performance improvement.

Thanks,

Paul

From: Mandy Chung <mandy.chung at oracle.com>
Date: Wednesday, August 28, 2019 at 3:59 PM
To: "Hohensee, Paul" <hohensee at amazon.com>
Cc: OpenJDK Serviceability <serviceability-dev at openjdk.java.net>, "hotspot-gc-dev at openjdk.java.net" <hotspot-gc-dev at openjdk.java.net>
Subject: Re: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be quicker for self thread

Hi Paul,

The CSR proposes this method in java.lang.management.ThreadMXBean as a Java SE feature.

Has this been discussed with the GC team, to commit to measuring the current thread's allocated bytes as a Java SE feature? Can this be supported by all JVM implementations? What is the overhead if this is enabled by default? Does it need to be disabled? This metric is from TLAB, so that might be okay. This needs advice/discussion with GC experts.

I see that the CSR mentions it can be disabled and links to the isThreadAllocatedMemoryEnabled() and setThreadAllocatedMemoryEnabled() methods, but these methods are defined in com.sun.management.ThreadMXBean.

As Alan points out, current thread makes sense only in local VM management. When this is monitored from a JMX client (e.g. jconsole connected to a running JVM), is the "currentThreadAllocatedBytes" attribute that of the current thread in the jconsole process, which is the one invoking Thread::currentThread?

Mandy

On 8/28/19 12:22 PM, Hohensee, Paul wrote:
Please review a performance improvement for ThreadMXBean.getThreadAllocatedBytes and the addition of getCurrentThreadAllocatedBytes.

JBS issue: https://bugs.openjdk.java.net/browse/JDK-8207266
Webrev: http://cr.openjdk.java.net/~phh/8207266/webrev.00/
CSR: https://bugs.openjdk.java.net/browse/JDK-8230311

Previous email threads:
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-July/024441.html
https://mail.openjdk.java.net/pipermail/serviceability-dev/2018-August/024763.html

The CSR is for adding ThreadMXBean.getCurrentThreadAllocatedBytes. It'd be great for someone to review it.

I took Mandy's advice and put the fast paths in the library code. I added a new JMM method GetOneThreadsAllocatedBytes that works the same as GetThreadCpuTime: it uses a thread_id value of zero to distinguish the current thread. On my Mac laptop, the result runs 47x faster for the current thread than the old implementation.

The 3 tests in test/jdk/com/sun/management/ThreadMXBean all pass. I added code to ThreadAllocatedMemory.java to test getCurrentThreadAllocatedBytes as well as variations on getThreadAllocatedBytes(id). A submit repo job is in progress.

Thanks,

Paul


From mandy.chung at oracle.com  Fri Aug 30 23:25:30 2019
From: mandy.chung at oracle.com (Mandy Chung)
Date: Fri, 30 Aug 2019 16:25:30 -0700
Subject: RFR (M): 8207266: ThreadMXBean::getThreadAllocatedBytes() can be
 quicker for self thread
In-Reply-To: <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com>
References: <CDB47B55-1327-4EFA-93E0-61E04A9EB61F@amazon.com>
 <ad45e908-b608-2f86-9c77-7a6e19144275@oracle.com>
 <AC4FEEF0-3FD7-4C48-AC1F-429C52EEBCF3@amazon.com>
 <588a91ec-8d4a-1157-5d72-88bb1eef1e6e@oracle.com>
 <30EA5D0C-1AEC-4242-B17B-CA4D39ECAF71@amazon.com>
Message-ID: <0d42d653-d158-a6e4-45b6-84f087c7e592@oracle.com>

CSR reviewed.

management.cpp
2083     java_thread = (JavaThread*)THREAD;
2084     if (java_thread->is_Java_thread()) {
2085       return java_thread->cooked_allocated_bytes();
2086     }

The cast should be done after the is_Java_thread() test.

ThreadImpl.java
 162     private void throwIfNullThreadIds(long[] ids) {

Even better: simply use Objects::requireNonNull and this method can be
removed.

This suggests a positive naming alternative to
throwIfThreadAllocatedMemoryNotSupported:
"ensureThreadAllocatedMemorySupported" (sorry, I should have suggested that).

ThreadMXBean.java
 130      * @throws java.lang.UnsupportedOperationException if the Java virtual

Nit: "java.lang." can be dropped.

@since 14 is missing.
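
The two ThreadImpl.java suggestions above (Objects::requireNonNull in place of a hand-written null check, and a positively named guard method) might be sketched as follows. This is an illustration only: GuardSketch, countIds and the boolean field standing in for isThreadAllocatedMemorySupported() are assumptions and do not appear in the webrev.

```java
import java.util.Objects;

// Hypothetical sketch; not the actual sun.management.ThreadImpl source.
class GuardSketch {
    private static final String THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED =
        "Thread allocated memory measurement is not supported.";

    // Stands in for isThreadAllocatedMemorySupported() (assumption for the sketch).
    private final boolean supported;

    GuardSketch(boolean supported) {
        this.supported = supported;
    }

    // Positive naming: the name states what must hold, not what gets thrown.
    private void ensureThreadAllocatedMemorySupported() {
        if (!supported) {
            throw new UnsupportedOperationException(THREAD_ALLOCATED_MEMORY_NOT_SUPPORTED);
        }
    }

    int countIds(long[] ids) {
        // Objects.requireNonNull throws NullPointerException for a null argument,
        // so no dedicated throwIfNullThreadIds helper is needed.
        Objects.requireNonNull(ids, "Null ids parameter.");
        ensureThreadAllocatedMemorySupported();
        return ids.length;
    }
}
```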

Mandy



From kim.barrett at oracle.com  Sat Aug 31 06:09:18 2019
From: kim.barrett at oracle.com (Kim Barrett)
Date: Sat, 31 Aug 2019 02:09:18 -0400
Subject: RFR: 8230404: Refactor logged card refinement support in
 G1DirtyCardQueueSet 
Message-ID: <9794C377-E574-43FC-9AD1-6A4F5BA8EE00@oracle.com>

Please review this refactoring of logged card refinement.

We presently have two cases embedded in G1DirtyCardQueueSet: there is
support for both concurrent refinement and during-GC (STW) refinement.
These cases share some code, but that code sharing causes more
problems than it solves.

The yield-to-safepoint request handling in the concurrent case is
embedded in the closure and reported via the bool return value, making
for some awkwardness and ambiguity in the shared helper code as to
whether the current card was processed or not.

STW refinement needs to go through the DCQS lock and deal with the
head/tail/count in the DCQS list representation. It performs a useless
stop_at check.  And it needs to have a closure that returns bool but
must always return true.

By separating these two cases we end up removing more code than we
add, and substantially simplify both.  This also fixes a bug in
counting buffers processed concurrently: _processed_buffers_rs_thread
was being incremented for both concurrent refinement and STW
refinement, but being reported as being for concurrent refinement.
(The only use of that information currently is in logging.)  This
might get a small performance benefit from STW refinement dealing with
a lock-free stack rather than dealing with the DCQS lock, but I didn't
try to measure that.

CR:
https://bugs.openjdk.java.net/browse/JDK-8230404

Webrev:
http://cr.openjdk.java.net/~kbarrett/8230404/open.00/

Testing:
mach5 tier1-5


From mandrikov at gmail.com  Sat Aug 31 23:57:04 2019
From: mandrikov at gmail.com (Evgeny Mandrikov)
Date: Sun, 1 Sep 2019 01:57:04 +0200
Subject: RFR: JDK-8073188: Remove disable of old MSVC++ warning C4786
Message-ID: <CAEPFu69xvLWsDrP9vRP=DgUZrAXRdBG5_eHn_RsnX11fAtxYiA@mail.gmail.com>

Hello!

Please review patch [1] for JDK-8073188 [2]. It also needs a sponsor, since
I have only Author status in the OpenJDK Census [3].

After this change the build passes without warnings on my machine.


With best regards,
Evgeny Mandrikov

[1] http://cr.openjdk.java.net/~godin/8073188/webrev.00/
[2] https://bugs.openjdk.java.net/browse/JDK-8073188
[3] https://openjdk.java.net/census#godin