RFR: 8227226: Segmented array clearing for ZGC
Per Liden
per.liden at oracle.com
Fri Aug 2 13:11:04 UTC 2019
Did some micro-benchmarking (on a Xeon E5-2630) with various segment
sizes between 4K and 512K, and 64K seems to offer a good trade-off. For
a 1G array, the allocation time increases by ~1%, but in exchange the
worst case TTSP drops from ~280ms to ~0.6ms.
Updated webrev using 64K:
http://cr.openjdk.java.net/~pliden/8227226/webrev.3
cheers,
Per
On 8/2/19 11:11 AM, Per Liden wrote:
> Hi Erik,
>
> On 8/1/19 5:56 PM, Erik Osterlund wrote:
>> Hi Per,
>>
>> I like that this approach is unintrusive, does its thing at the right
>> abstraction layer, and also handles medium sized arrays.
>
> It even handles small arrays (i.e. arrays in small zpages) ;)
>
>> Looks good.
>
> Thanks! I'll test various segment sizes and see how that affects
> performance and TTSP.
>
> cheers,
> Per
>
>>
>> Thanks,
>> /Erik
>>
>>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
>>>
>>> Here's an updated webrev that should be complete, i.e. fixes the
>>> issues related to allocation sampling/reporting that I mentioned.
>>> This patch makes MemAllocator::finish() virtual, so that we can do
>>> our thing and install the correct klass pointer before the Allocation
>>> destructor executes. This seems to be the least intrusive why of
>>> doing this.
>>>
>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
>>>
>>> This passed function testing, but proper benchmarking remains to be
>>> done.
>>>
>>> cheers,
>>> Per
>>>
>>>> On 7/31/19 7:19 PM, Per Liden wrote:
>>>> Hi,
>>>> I found some time to benchmark the "GC clears pages"-approach, and
>>>> it's fairly clear that it's not paying off. So ditching that idea.
>>>> However, I'm still looking for something that would not just do
>>>> segmented clearing of arrays in large zpages. Letting oop arrays
>>>> temporarily be typed arrays while it's being cleared could be an
>>>> option. I did a prototype for that, which looks like this:
>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
>>>> There's at least one issue here, the code doing allocation sampling
>>>> will see that we allocated long arrays instead of oop arrays, so the
>>>> reporting there will be skewed. That can be addressed if we go down
>>>> this path. The code is otherwise fairly simple and contained. Feel
>>>> free to spot any issues.
>>>> cheers,
>>>> Per
>>>>> On 7/26/19 2:27 PM, Per Liden wrote:
>>>>> Hi Ryan & Erik,
>>>>>
>>>>> I had a look at this and started exploring a slightly different
>>>>> approach. Instead doing segmented clearing in the allocation path,
>>>>> we can have the concurrent GC thread clear pages when they are
>>>>> reclaimed and not do any clearing in the allocation path at all.
>>>>>
>>>>> That would look like this:
>>>>>
>>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
>>>>>
>>>>> (I've had to temporarily comment out three lines of assert/debug
>>>>> code to make this work)
>>>>>
>>>>> The relocation set selection phase will now be tasked with some
>>>>> potentially expensive clearing work, so we'll want to make that
>>>>> part parallel also.
>>>>>
>>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
>>>>>
>>>>> Moving this work from Java threads onto the concurrent GC threads
>>>>> means we will potentially prolong the RelocationSetSelection and
>>>>> Relocation phases. That might be a trade-off worth doing. In
>>>>> return, we get:
>>>>>
>>>>> * Faster array allocations, as there's now less work done in the
>>>>> allocation path.
>>>>> * This benefits all arrays, not just those allocated in large pages.
>>>>> * No need to consider/tune a "chunk size".
>>>>> * I also tend think we'll end up with slightly less complex code,
>>>>> that is a bit easier to reason about. Can be debated of course.
>>>>>
>>>>> This approach might also "survive" longer, because the YC scheme
>>>>> we've been loosely thinking about currently requires newly
>>>>> allocated pages to be cleared anyway. It's of course too early to
>>>>> tell if that requirement will stand in the end, but it's possible
>>>>> anyway.
>>>>>
>>>>> I'll need to do some more testing and benchmarking to make sure
>>>>> there's no regression or bugs here. The commented out debug code
>>>>> also needs to be addressed of course.
>>>>>
>>>>> Comments? Other ideas?
>>>>>
>>>>> cheers,
>>>>> Per
>>>>>
>>>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
>>>>>>
>>>>>> Somehow I lost the RFR off the front and started a new thread.
>>>>>> Now that we're both off vacation I'd like to revisit this. Can
>>>>>> you take a look?
>>>>>>
>>>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of Sciampacone,
>>>>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of
>>>>>> sci at amazon.com> wrote:
>>>>>>
>>>>>> http://cr.openjdk.java.net/~phh/8227226/webrev.01/
>>>>>> This shifts away from abusing the constructor do_zero magic
>>>>>> in exchange for virtualizing mem_clear() and specializing for the
>>>>>> Z version. It does create a change in mem_clear in that it
>>>>>> returns an updated version of mem. It does create change outside
>>>>>> of the Z code however it does feel cleaner.
>>>>>> I didn't make a change to PinAllocating - looking at it, the
>>>>>> safety of keeping it constructor / destructor based still seemed
>>>>>> appropriate to me. If the objection is to using the sequence
>>>>>> numbers to pin (and instead using handles to update) - this to me
>>>>>> seems less error prone. I had originally discussed handles with
>>>>>> Stefan but the proposal came down to this which looks much cleaner.
>>>>>> On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of
>>>>>> Sciampacone, Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on
>>>>>> behalf of sci at amazon.com> wrote:
>>>>>> 1) Yes this was a conscious decision. There was
>>>>>> discussion on determining the optimal point for breakup but given
>>>>>> the existing sizes this seemed sufficient. This doesn't preclude
>>>>>> the ability to go down that path if its deemed absolutely
>>>>>> necessary. The path for more complex decisions is now available.
>>>>>> 2) Agree
>>>>>> 3) I'm not clear here. Do you mean effectively going
>>>>>> direct to ZHeap and bypassing the single function PinAllocating?
>>>>>> Agree. Otherwise I'll ask you to be a bit clearer.
>>>>>> 4) Agree
>>>>>> 5) I initially had the exact same reaction but I played
>>>>>> around with a few other versions (including breaking up
>>>>>> initialization points between header and body to get the desired
>>>>>> results) and this ended up looking correct. I'll try mixing in
>>>>>> the mem clearer function again (fresh start) to see if it looks
>>>>>> any better.
>>>>>> On 7/8/19, 5:49 AM, "Per Liden" <per.liden at oracle.com>
>>>>>> wrote:
>>>>>> Hi Ryan,
>>>>>> A few general comments:
>>>>>> 1) It looks like this still only work for large pages?
>>>>>> 2) The log_info stuff should be removed.
>>>>>> 3) I'm not a huge fan of single-use utilities like
>>>>>> PinAllocating, at
>>>>>> least not when, IMO, the alternative is more
>>>>>> straight forward and less code.
>>>>>> 4) Please make locals const when possible.
>>>>>> 5) Duplicating _do_zero looks odd. Injecting a "mem
>>>>>> clearer", similar to
>>>>>> what Stefans original patch did, seems worth exploring.
>>>>>> cheers,
>>>>>> /Per
>>>>>> (Btw, I'm on vacation so I might not be
>>>>>> super-responsive to emails)
>>>>>> On 2019-07-08 12:42, Erik Österlund wrote:
>>>>>> > Hi Ryan,
>>>>>> >
>>>>>> > This looks good in general. Just some stylistic
>>>>>> things...
>>>>>> >
>>>>>> > 1) In the ZGC project we like the letter 'Z' so
>>>>>> much that we put it in
>>>>>> > front of everything we possibly can, including all
>>>>>> class names.
>>>>>> > 2) We also explicitly state things are private
>>>>>> even though it's
>>>>>> > bleedingly obvious.
>>>>>> >
>>>>>> > So:
>>>>>> >
>>>>>> > 39 class PinAllocating {
>>>>>> > 40 HeapWord* _mem;
>>>>>> > 41 public: -> 39 class ZPinAllocating { 40
>>>>>> private: 41 HeapWord* _mem;
>>>>>> > 42
>>>>>> > 41 public: I can be your sponsor and push this
>>>>>> change for you. I don't
>>>>>> > think there is a need for another webrev for my
>>>>>> small stylistic remarks,
>>>>>> > so I can just fix that before pushing this for
>>>>>> you. On that note, I'll
>>>>>> > add me and StefanK to the contributed-by section
>>>>>> as we all worked out
>>>>>> > the right solution to this problem
>>>>>> collaboratively. I have run through
>>>>>> > mach5 tier1-5, and found no issues with this
>>>>>> patch. Thanks, /Erik
>>>>>> >
>>>>>> > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
>>>>>> >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
>>>>>> >> https://bugs.openjdk.java.net/browse/JDK-8227226
>>>>>> >>
>>>>>> >> This patch introduces safe point checks into
>>>>>> array clearing during
>>>>>> >> allocation for ZGC. The patch isolates the
>>>>>> changes to ZGC as (in
>>>>>> >> particular with the more modern collectors) the
>>>>>> approach to
>>>>>> >> incrementalizing or respecting safe point checks
>>>>>> is going to be
>>>>>> >> different.
>>>>>> >>
>>>>>> >> The approach is to keep the region holding the
>>>>>> array in the allocating
>>>>>> >> state (pin logic) while updating the color to the
>>>>>> array after checks.
>>>>>> >>
>>>>>> >> Can I get a review? Thanks.
>>>>>> >>
>>>>>> >> Ryan
>>>>>> >
>>>>>>
>>
More information about the hotspot-gc-dev
mailing list