RFR: 8227226: Segmented array clearing for ZGC

Per Liden per.liden at oracle.com
Thu Aug 1 14:14:25 UTC 2019


Here's an updated webrev that should be complete, i.e. fixes the issues 
related to allocation sampling/reporting that I mentioned. This patch 
makes MemAllocator::finish() virtual, so that we can do our thing and 
install the correct klass pointer before the Allocation destructor 
executes. This seems to be the least intrusive way of doing this.

http://cr.openjdk.java.net/~pliden/8227226/webrev.2
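
To give a feel for the shape of that change, here is a simplified,
standalone sketch (placeholder types and a made-up subclass name, not the
actual patch):

// Standalone sketch only -- Klass/HeapWord are stand-ins, not HotSpot code.
#include <cstddef>
#include <cstring>

struct Klass {};
typedef size_t HeapWord;

class MemAllocator {
protected:
  Klass* const _klass;
  const size_t _word_size;

  void set_klass(HeapWord* mem, Klass* klass) const {
    std::memcpy(mem, &klass, sizeof(Klass*));   // stand-in for header setup
  }

public:
  MemAllocator(Klass* klass, size_t word_size)
    : _klass(klass), _word_size(word_size) {}
  virtual ~MemAllocator() {}

  // Virtual so a GC-specific allocator can finish the object its own way.
  virtual HeapWord* finish(HeapWord* mem) const {
    set_klass(mem, _klass);   // default: install the final klass right away
    return mem;
  }
};

class ZSegmentedObjArrayAllocator : public MemAllocator {  // made-up name
public:
  using MemAllocator::MemAllocator;

  virtual HeapWord* finish(HeapWord* mem) const {
    // A real version would clear the payload in segments, taking safepoint
    // checks in between, and only then install the correct klass, so the
    // object is complete before the Allocation destructor runs the
    // sampling/reporting code.
    std::memset(mem + 1, 0, (_word_size - 1) * sizeof(HeapWord));  // 1-word header assumed
    set_klass(mem, _klass);
    return mem;
  }
};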

This passed functional testing, but proper benchmarking remains to be done.

cheers,
Per

On 7/31/19 7:19 PM, Per Liden wrote:
> Hi,
> 
> I found some time to benchmark the "GC clears pages"-approach, and it's 
> fairly clear that it's not paying off. So ditching that idea.
> 
> However, I'm still looking for something that would not just do 
> segmented clearing of arrays in large zpages. Letting an oop array 
> temporarily be a typed (primitive) array while it's being cleared could 
> be an option. 
> I did a prototype for that, which looks like this:
> 
> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
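> 
> (To illustrate the idea only: a standalone sketch, where install_klass() 
> and safepoint_poll() are made-up placeholders rather than real HotSpot 
> functions. The array carries a primitive-array klass while the loop runs, 
> so the GC never sees uncleared oop slots, and only gets its real objArray 
> klass once clearing has finished.)
> 
> // Standalone sketch, not the prototype code.
> #include <cstddef>
> #include <cstring>
> 
> typedef size_t HeapWord;
> struct Klass {};
> 
> static const size_t segment_words = 64 * 1024;   // example segment size
> 
> void install_klass(HeapWord* mem, Klass* klass) {
>   std::memcpy(mem, &klass, sizeof(Klass*));      // stand-in for header setup
> }
> 
> void safepoint_poll() { /* placeholder: cooperate with the GC if requested */ }
> 
> void clear_and_retype(HeapWord* mem, size_t payload_words,
>                       Klass* long_array_klass, Klass* obj_array_klass) {
>   install_klass(mem, long_array_klass);          // temporarily a typed array
>   for (size_t i = 0; i < payload_words; i += segment_words) {
>     const size_t n = (payload_words - i < segment_words) ? payload_words - i
>                                                          : segment_words;
>     std::memset(mem + 1 + i, 0, n * sizeof(HeapWord));  // 1-word header assumed
>     safepoint_poll();                            // safe: object is parsable
>   }
>   install_klass(mem, obj_array_klass);           // now it is a real oop array
> }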
> 
> There's at least one issue here: the code doing allocation sampling will 
> see that we allocated long arrays instead of oop arrays, so the 
> reporting there will be skewed. That can be addressed if we go down this 
> path. The code is otherwise fairly simple and contained. Feel free to 
> spot any issues.
> 
> cheers,
> Per
> 
> On 7/26/19 2:27 PM, Per Liden wrote:
>> Hi Ryan & Erik,
>>
>> I had a look at this and started exploring a slightly different 
>> approach. Instead of doing segmented clearing in the allocation path, we 
>> can have the concurrent GC thread clear pages when they are reclaimed 
>> and not do any clearing in the allocation path at all.
>>
>> That would look like this:
>>
>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
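>>
>> (Standalone sketch of the idea only, with made-up types instead of the 
>> real ZPage/ZPageAllocator code -- the point is just that the clearing 
>> happens on the GC side, when the page is reclaimed, so the allocation 
>> path can skip it:)
>>
>> #include <cstring>
>> #include <vector>
>>
>> struct Page {
>>   char*  start;
>>   size_t size;
>> };
>>
>> // Run by a concurrent GC thread when pages are reclaimed: pre-clear them
>> // before handing them back to the allocator, so Java threads allocating
>> // from these pages never have to zero the memory themselves.
>> void reclaim_and_clear(std::vector<Page*>& reclaimed,
>>                        std::vector<Page*>& free_list) {
>>   for (Page* page : reclaimed) {
>>     std::memset(page->start, 0, page->size);
>>     free_list.push_back(page);
>>   }
>>   reclaimed.clear();
>> }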
>>
>> (I've had to temporarily comment out three lines of assert/debug code 
>> to make this work)
>>
>> The relocation set selection phase will now be tasked with some 
>> potentially expensive clearing work, so we'll want to make that part 
>> parallel also.
>>
>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
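>>
>> (Again only a sketch, with std::thread standing in for the GC worker 
>> threads -- the real change would of course use the existing ZGC 
>> worker/task machinery:)
>>
>> #include <cstddef>
>> #include <cstring>
>> #include <thread>
>> #include <vector>
>>
>> struct Page {
>>   char*  start;
>>   size_t size;
>> };
>>
>> void clear_pages_in_parallel(std::vector<Page*>& pages, unsigned nworkers) {
>>   std::vector<std::thread> workers;
>>   for (unsigned w = 0; w < nworkers; w++) {
>>     workers.emplace_back([&pages, w, nworkers]() {
>>       // Each worker clears every nworkers-th page (simple static striping).
>>       for (size_t i = w; i < pages.size(); i += nworkers) {
>>         std::memset(pages[i]->start, 0, pages[i]->size);
>>       }
>>     });
>>   }
>>   for (std::thread& t : workers) {
>>     t.join();
>>   }
>> }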
>>
>> Moving this work from Java threads onto the concurrent GC threads 
>> means we will potentially prolong the RelocationSetSelection and 
>> Relocation phases. That might be a trade-off worth doing. In return, 
>> we get:
>>
>> * Faster array allocations, as there's now less work done in the 
>> allocation path.
>> * This benefits all arrays, not just those allocated in large pages.
>> * No need to consider/tune a "chunk size".
>> * I also tend to think we'll end up with slightly less complex code that 
>> is a bit easier to reason about. Can be debated of course.
>>
>> This approach might also "survive" longer, because the YC scheme we've 
>> been loosely thinking about currently requires newly allocated pages 
>> to be cleared anyway. It's of course too early to tell if that 
>> requirement will stand in the end, but it's possible anyway.
>>
>> I'll need to do some more testing and benchmarking to make sure 
>> there's no regression or bugs here. The commented out debug code also 
>> needs to be addressed of course.
>>
>> Comments? Other ideas?
>>
>> cheers,
>> Per
>>
>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
>>>
>>> Somehow I lost the RFR off the front and started a new thread.
>>> Now that we're both off vacation I'd like to revisit this.  Can you 
>>> take a look?
>>>
>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of Sciampacone, Ryan" 
>>> <hotspot-gc-dev-bounces at openjdk.java.net on behalf of sci at amazon.com> 
>>> wrote:
>>>
>>>      http://cr.openjdk.java.net/~phh/8227226/webrev.01/
>>>      This shifts away from abusing the constructor do_zero magic in 
>>> exchange for virtualizing mem_clear() and specializing it for the Z 
>>> version.  It does change mem_clear() in that it now returns an updated 
>>> version of mem.  That does mean a change outside of the Z code, but it 
>>> does feel cleaner.
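>>> 
>>> (Roughly, as a standalone sketch with placeholder types and a made-up 
>>> subclass name rather than the actual webrev code, the shape of that is:)
>>> 
>>> // Standalone sketch only -- not the real MemAllocator.
>>> #include <cstddef>
>>> #include <cstring>
>>> 
>>> typedef size_t HeapWord;
>>> 
>>> class MemAllocator {
>>> protected:
>>>   const size_t _word_size;
>>> public:
>>>   explicit MemAllocator(size_t word_size) : _word_size(word_size) {}
>>>   virtual ~MemAllocator() {}
>>> 
>>>   // Now returns the (possibly updated) address to continue initializing
>>>   // from, so a specialized version that takes safepoint checks while
>>>   // clearing can hand back a pointer that is still good afterwards.
>>>   virtual HeapWord* mem_clear(HeapWord* mem) const {
>>>     std::memset(mem, 0, _word_size * sizeof(HeapWord));
>>>     return mem;
>>>   }
>>> };
>>> 
>>> class ZArrayAllocator : public MemAllocator {   // name is made up
>>> public:
>>>   using MemAllocator::MemAllocator;
>>> 
>>>   virtual HeapWord* mem_clear(HeapWord* mem) const {
>>>     // The Z version would clear in segments, allow safepoints in between,
>>>     // and return the updated address for the rest of the initialization.
>>>     return MemAllocator::mem_clear(mem);        // placeholder for that work
>>>   }
>>> };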
>>>      I didn't make a change to PinAllocating - looking at it, the 
>>> safety of keeping it constructor / destructor based still seemed 
>>> appropriate to me.  If the objection is to using the sequence numbers 
>>> to pin (and instead using handles to update) - this to me seems less 
>>> error prone.  I had originally discussed handles with Stefan but the 
>>> proposal came down to this which looks much cleaner.
>>>      On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of Sciampacone, 
>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of 
>>> sci at amazon.com> wrote:
>>>          1) Yes this was a conscious decision.  There was discussion 
>>> on determining the optimal point for breakup but given the existing 
>>> sizes this seemed sufficient.  This doesn't preclude the ability to 
>>> go down that path if it's deemed absolutely necessary.  The path for 
>>> more complex decisions is now available.
>>>          2) Agree
>>>          3) I'm not clear here.  Do you mean effectively going directly 
>>> to ZHeap and bypassing the single function PinAllocating?  Agree. 
>>> Otherwise I'll ask you to be a bit clearer.
>>>          4) Agree
>>>          5) I initially had the exact same reaction but I played 
>>> around with a few other versions (including breaking up 
>>> initialization points between header and body to get the desired 
>>> results) and this ended up looking correct.  I'll try mixing in the 
>>> mem clearer function again (fresh start) to see if it looks any better.
>>>          On 7/8/19, 5:49 AM, "Per Liden" <per.liden at oracle.com> wrote:
>>>              Hi Ryan,
>>>              A few general comments:
>>>              1) It looks like this still only works for large pages?
>>>              2) The log_info stuff should be removed.
>>>              3) I'm not a huge fan of single-use utilities like 
>>> PinAllocating, at
>>>              least not when, IMO, the alternative is more 
>>> straightforward and less code.
>>>              4) Please make locals const when possible.
>>>              5) Duplicating _do_zero looks odd. Injecting a "mem 
>>> clearer", similar to
>>>              what Stefan's original patch did, seems worth exploring.
>>>              cheers,
>>>              /Per
>>>              (Btw, I'm on vacation so I might not be super-responsive 
>>> to emails)
>>>              On 2019-07-08 12:42, Erik Österlund wrote:
>>>              > Hi Ryan,
>>>              >
>>>              > This looks good in general. Just some stylistic things...
>>>              >
>>>              > 1) In the ZGC project we like the letter 'Z' so much 
>>> that we put it in
>>>              > front of everything we possibly can, including all 
>>> class names.
>>>              > 2) We also explicitly state things are private even 
>>> though it's
>>>              > bleedingly obvious.
>>>              >
>>>              > So:
>>>              >
>>>              > class PinAllocating {
>>>              >   HeapWord* _mem;
>>>              > public:
>>>              >
>>>              > ->
>>>              >
>>>              > class ZPinAllocating {
>>>              > private:
>>>              >   HeapWord* _mem;
>>>              >
>>>              > public:
>>>              >
>>>              > I can be your sponsor and push this change for you. I don't
>>>              > think there is a need for another webrev for my small 
>>> stylistic remarks,
>>>              > so I can just fix that before pushing this for you. On 
>>> that note, I'll
>>>              > add me and StefanK to the contributed-by section as we 
>>> all worked out
>>>              > the right solution to this problem collaboratively. I 
>>> have run through
>>>              > mach5 tier1-5, and found no issues with this patch. 
>>> Thanks, /Erik
>>>              >
>>>              > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
>>>              >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
>>>              >> https://bugs.openjdk.java.net/browse/JDK-8227226
>>>              >>
>>>              >> This patch introduces safepoint checks into array 
>>> clearing during allocation for ZGC.  The patch isolates the changes 
>>> to ZGC as (in particular with the more modern collectors) the 
>>> approach to incrementalizing or respecting safepoint checks is 
>>> going to be different.
>>>              >>
>>>              >> The approach is to keep the region holding the array 
>>> in the allocating state (pin logic) while it is being cleared, and to 
>>> update the color of the array reference after the safepoint checks.
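>>>              >>
>>>              >> (A standalone sketch of the pin idea only -- the simple 
>>> counter below stands in for whatever mechanism the patch actually uses, 
>>> e.g. the sequence numbers discussed later in the thread:)
>>>              >>
>>>              >> // Standalone sketch, not the real ZGC API: a page with a
>>>              >> // non-zero in-flight allocation count stays "allocating"
>>>              >> // and is not selected for relocation, so it is safe to
>>>              >> // take safepoint checks while an array in it is cleared.
>>>              >> #include <atomic>
>>>              >>
>>>              >> struct Page {
>>>              >>   std::atomic<int> allocating_count{0};
>>>              >> };
>>>              >>
>>>              >> class PinAllocating {
>>>              >>   Page& _page;
>>>              >> public:
>>>              >>   explicit PinAllocating(Page& page) : _page(page) {
>>>              >>     _page.allocating_count.fetch_add(1);   // pin for the clearing loop
>>>              >>   }
>>>              >>   ~PinAllocating() {
>>>              >>     _page.allocating_count.fetch_sub(1);   // unpin once clearing is done
>>>              >>   }
>>>              >> };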
>>>              >>
>>>              >> Can I get a review?  Thanks.
>>>              >>
>>>              >> Ryan
>>>              >
>>>

