8227226: Segmented array clearing for ZGC
Per Liden
per.liden at oracle.com
Wed Aug 7 09:59:42 UTC 2019
Hi Ryan,
On 8/7/19 3:05 AM, Sciampacone, Ryan wrote:
> Although least intrusive, it goes back to some of the earlier complaints about using false in the constructor for do_zero. It also makes a fair number of
My earlier comment about this was not about passing false to the
constructor, but the duplication of the _do_zero member, which I thought
looked a bit odd. In this patch, this was avoided by separation these
paths already in ZCollectedHeap::array_allocate().
> assumptions (and goes against the hierarchies intent) on initialization logic to hide in finish(). That said, I agree that is fairly clean - and definitely addresses the missed cases of the earlier webrev.
>
We've had the same discussions here and concluded that we might want to
restructure parts of MemAllocator to better accommodate this use case,
but that overriding finish() seems ok for now. A patch to restructure
MemAllocator could come later if we think it's needed.
> 2 things,
>
> 1. Isn't the substitute_oop_array_klass() check too narrow? It will only detect types Object[], and not any other type of reference array (such as String[]) ? I believe there's a bug here (correct me if I'm wrong).
On the JVM level, Object[], String[] and int[][] all have the same
Klass, so we should catch them all with this single check.
> 2. I'd want to see an assert() on the sizeof(long) == sizeof(void *) dependency. I realize what code base this is in but it would be properly defensive.
Sounds good.
>
> What does the reporting look like in this case? Is the long[] type reported accepted? I'm wondering if this depletes some of the simplicity.
By overriding finish(), the sampling/reporting remains correct and
unaffected, as it will never see the intermediate long[].
Updated webrev:
http://cr.openjdk.java.net/~pliden/8227226/webrev.4
cheers,
Per
>
> On 8/2/19, 6:13 AM, "hotspot-gc-dev on behalf of Per Liden" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of per.liden at oracle.com> wrote:
>
> Did some micro-benchmarking (on a Xeon E5-2630) with various segment
> sizes between 4K and 512K, and 64K seems to offer a good trade-off. For
> a 1G array, the allocation time increases by ~1%, but in exchange the
> worst case TTSP drops from ~280ms to ~0.6ms.
>
> Updated webrev using 64K:
>
> http://cr.openjdk.java.net/~pliden/8227226/webrev.3
>
> cheers,
> Per
>
> On 8/2/19 11:11 AM, Per Liden wrote:
> > Hi Erik,
> >
> > On 8/1/19 5:56 PM, Erik Osterlund wrote:
> >> Hi Per,
> >>
> >> I like that this approach is unintrusive, does its thing at the right
> >> abstraction layer, and also handles medium sized arrays.
> >
> > It even handles small arrays (i.e. arrays in small zpages) ;)
> >
> >> Looks good.
> >
> > Thanks! I'll test various segment sizes and see how that affects
> > performance and TTSP.
> >
> > cheers,
> > Per
> >
> >>
> >> Thanks,
> >> /Erik
> >>
> >>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
> >>>
> >>> Here's an updated webrev that should be complete, i.e. fixes the
> >>> issues related to allocation sampling/reporting that I mentioned.
> >>> This patch makes MemAllocator::finish() virtual, so that we can do
> >>> our thing and install the correct klass pointer before the Allocation
> >>> destructor executes. This seems to be the least intrusive why of
> >>> doing this.
> >>>
> >>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
> >>>
> >>> This passed function testing, but proper benchmarking remains to be
> >>> done.
> >>>
> >>> cheers,
> >>> Per
> >>>
> >>>> On 7/31/19 7:19 PM, Per Liden wrote:
> >>>> Hi,
> >>>> I found some time to benchmark the "GC clears pages"-approach, and
> >>>> it's fairly clear that it's not paying off. So ditching that idea.
> >>>> However, I'm still looking for something that would not just do
> >>>> segmented clearing of arrays in large zpages. Letting oop arrays
> >>>> temporarily be typed arrays while it's being cleared could be an
> >>>> option. I did a prototype for that, which looks like this:
> >>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
> >>>> There's at least one issue here, the code doing allocation sampling
> >>>> will see that we allocated long arrays instead of oop arrays, so the
> >>>> reporting there will be skewed. That can be addressed if we go down
> >>>> this path. The code is otherwise fairly simple and contained. Feel
> >>>> free to spot any issues.
> >>>> cheers,
> >>>> Per
> >>>>> On 7/26/19 2:27 PM, Per Liden wrote:
> >>>>> Hi Ryan & Erik,
> >>>>>
> >>>>> I had a look at this and started exploring a slightly different
> >>>>> approach. Instead doing segmented clearing in the allocation path,
> >>>>> we can have the concurrent GC thread clear pages when they are
> >>>>> reclaimed and not do any clearing in the allocation path at all.
> >>>>>
> >>>>> That would look like this:
> >>>>>
> >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
> >>>>>
> >>>>> (I've had to temporarily comment out three lines of assert/debug
> >>>>> code to make this work)
> >>>>>
> >>>>> The relocation set selection phase will now be tasked with some
> >>>>> potentially expensive clearing work, so we'll want to make that
> >>>>> part parallel also.
> >>>>>
> >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
> >>>>>
> >>>>> Moving this work from Java threads onto the concurrent GC threads
> >>>>> means we will potentially prolong the RelocationSetSelection and
> >>>>> Relocation phases. That might be a trade-off worth doing. In
> >>>>> return, we get:
> >>>>>
> >>>>> * Faster array allocations, as there's now less work done in the
> >>>>> allocation path.
> >>>>> * This benefits all arrays, not just those allocated in large pages.
> >>>>> * No need to consider/tune a "chunk size".
> >>>>> * I also tend think we'll end up with slightly less complex code,
> >>>>> that is a bit easier to reason about. Can be debated of course.
> >>>>>
> >>>>> This approach might also "survive" longer, because the YC scheme
> >>>>> we've been loosely thinking about currently requires newly
> >>>>> allocated pages to be cleared anyway. It's of course too early to
> >>>>> tell if that requirement will stand in the end, but it's possible
> >>>>> anyway.
> >>>>>
> >>>>> I'll need to do some more testing and benchmarking to make sure
> >>>>> there's no regression or bugs here. The commented out debug code
> >>>>> also needs to be addressed of course.
> >>>>>
> >>>>> Comments? Other ideas?
> >>>>>
> >>>>> cheers,
> >>>>> Per
> >>>>>
> >>>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
> >>>>>>
> >>>>>> Somehow I lost the RFR off the front and started a new thread.
> >>>>>> Now that we're both off vacation I'd like to revisit this. Can
> >>>>>> you take a look?
> >>>>>>
> >>>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of Sciampacone,
> >>>>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of
> >>>>>> sci at amazon.com> wrote:
> >>>>>>
> >>>>>> http://cr.openjdk.java.net/~phh/8227226/webrev.01/
> >>>>>> This shifts away from abusing the constructor do_zero magic
> >>>>>> in exchange for virtualizing mem_clear() and specializing for the
> >>>>>> Z version. It does create a change in mem_clear in that it
> >>>>>> returns an updated version of mem. It does create change outside
> >>>>>> of the Z code however it does feel cleaner.
> >>>>>> I didn't make a change to PinAllocating - looking at it, the
> >>>>>> safety of keeping it constructor / destructor based still seemed
> >>>>>> appropriate to me. If the objection is to using the sequence
> >>>>>> numbers to pin (and instead using handles to update) - this to me
> >>>>>> seems less error prone. I had originally discussed handles with
> >>>>>> Stefan but the proposal came down to this which looks much cleaner.
> >>>>>> On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of
> >>>>>> Sciampacone, Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on
> >>>>>> behalf of sci at amazon.com> wrote:
> >>>>>> 1) Yes this was a conscious decision. There was
> >>>>>> discussion on determining the optimal point for breakup but given
> >>>>>> the existing sizes this seemed sufficient. This doesn't preclude
> >>>>>> the ability to go down that path if its deemed absolutely
> >>>>>> necessary. The path for more complex decisions is now available.
> >>>>>> 2) Agree
> >>>>>> 3) I'm not clear here. Do you mean effectively going
> >>>>>> direct to ZHeap and bypassing the single function PinAllocating?
> >>>>>> Agree. Otherwise I'll ask you to be a bit clearer.
> >>>>>> 4) Agree
> >>>>>> 5) I initially had the exact same reaction but I played
> >>>>>> around with a few other versions (including breaking up
> >>>>>> initialization points between header and body to get the desired
> >>>>>> results) and this ended up looking correct. I'll try mixing in
> >>>>>> the mem clearer function again (fresh start) to see if it looks
> >>>>>> any better.
> >>>>>> On 7/8/19, 5:49 AM, "Per Liden" <per.liden at oracle.com>
> >>>>>> wrote:
> >>>>>> Hi Ryan,
> >>>>>> A few general comments:
> >>>>>> 1) It looks like this still only work for large pages?
> >>>>>> 2) The log_info stuff should be removed.
> >>>>>> 3) I'm not a huge fan of single-use utilities like
> >>>>>> PinAllocating, at
> >>>>>> least not when, IMO, the alternative is more
> >>>>>> straight forward and less code.
> >>>>>> 4) Please make locals const when possible.
> >>>>>> 5) Duplicating _do_zero looks odd. Injecting a "mem
> >>>>>> clearer", similar to
> >>>>>> what Stefans original patch did, seems worth exploring.
> >>>>>> cheers,
> >>>>>> /Per
> >>>>>> (Btw, I'm on vacation so I might not be
> >>>>>> super-responsive to emails)
> >>>>>> On 2019-07-08 12:42, Erik Österlund wrote:
> >>>>>> > Hi Ryan,
> >>>>>> >
> >>>>>> > This looks good in general. Just some stylistic
> >>>>>> things...
> >>>>>> >
> >>>>>> > 1) In the ZGC project we like the letter 'Z' so
> >>>>>> much that we put it in
> >>>>>> > front of everything we possibly can, including all
> >>>>>> class names.
> >>>>>> > 2) We also explicitly state things are private
> >>>>>> even though it's
> >>>>>> > bleedingly obvious.
> >>>>>> >
> >>>>>> > So:
> >>>>>> >
> >>>>>> > 39 class PinAllocating {
> >>>>>> > 40 HeapWord* _mem;
> >>>>>> > 41 public: -> 39 class ZPinAllocating { 40
> >>>>>> private: 41 HeapWord* _mem;
> >>>>>> > 42
> >>>>>> > 41 public: I can be your sponsor and push this
> >>>>>> change for you. I don't
> >>>>>> > think there is a need for another webrev for my
> >>>>>> small stylistic remarks,
> >>>>>> > so I can just fix that before pushing this for
> >>>>>> you. On that note, I'll
> >>>>>> > add me and StefanK to the contributed-by section
> >>>>>> as we all worked out
> >>>>>> > the right solution to this problem
> >>>>>> collaboratively. I have run through
> >>>>>> > mach5 tier1-5, and found no issues with this
> >>>>>> patch. Thanks, /Erik
> >>>>>> >
> >>>>>> > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
> >>>>>> >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
> >>>>>> >> https://bugs.openjdk.java.net/browse/JDK-8227226
> >>>>>> >>
> >>>>>> >> This patch introduces safe point checks into
> >>>>>> array clearing during
> >>>>>> >> allocation for ZGC. The patch isolates the
> >>>>>> changes to ZGC as (in
> >>>>>> >> particular with the more modern collectors) the
> >>>>>> approach to
> >>>>>> >> incrementalizing or respecting safe point checks
> >>>>>> is going to be
> >>>>>> >> different.
> >>>>>> >>
> >>>>>> >> The approach is to keep the region holding the
> >>>>>> array in the allocating
> >>>>>> >> state (pin logic) while updating the color to the
> >>>>>> array after checks.
> >>>>>> >>
> >>>>>> >> Can I get a review? Thanks.
> >>>>>> >>
> >>>>>> >> Ryan
> >>>>>> >
> >>>>>>
> >>
>
>
More information about the hotspot-gc-dev
mailing list