8227226: Segmented array clearing for ZGC

Wed Aug 7 13:55:46 UTC 2019

    > By overriding finish(), the sampling/reporting remains correct and 
    > unaffected, as it will never see the intermediate long[].

I learned something today.  Thank you.

For MemAllocator, I think we all agree the flow is locked in a bit too rigidly but this helps with some of the VM/GC assumptions so we end up battling it.  That said I'm with you - if there's a rewrite to be had, it's not in this patch.

Otherwise, fwiw lgtm.

On 8/7/19, 3:24 AM, "Per Liden" <per.liden at oracle.com> wrote:

    Hi again,

    On 8/7/19 11:59 AM, Per Liden wrote:
    > Hi Ryan,
    > 
    > On 8/7/19 3:05 AM, Sciampacone, Ryan wrote:
    >> Although least intrusive, it goes back to some of the earlier 
    >> complaints about using false in the constructor for do_zero.  It also 
    >> makes a fair number of 
    > 
    > My earlier comment about this was not about passing false to the 
    > constructor, but the duplication of the _do_zero member, which I thought 
    > looked a bit odd. In this patch, this was avoided by separation these 
    > paths already in ZCollectedHeap::array_allocate().
    > 
    >> assumptions (and goes against the hierarchies intent) on 
    >> initialization logic to hide in finish().  That said, I agree that is 
    >> fairly clean - and definitely addresses the missed cases of the 
    >> earlier webrev.
    >>
    > 
    > We've had the same discussions here and concluded that we might want to 
    > restructure parts of MemAllocator to better accommodate this use case, 
    > but that overriding finish() seems ok for now. A patch to restructure 
    > MemAllocator could come later if we think it's needed.
    > 
    >> 2 things,
    >>
    >> 1. Isn't the substitute_oop_array_klass() check too narrow?  It will 
    >> only detect types Object[], and not any other type of reference array 
    >> (such as String[]) ?  I believe there's a bug here (correct me if I'm 
    >> wrong).
    > 
    > On the JVM level, Object[], String[] and int[][] all have the same 
    > Klass, so we should catch them all with this single check.

    Sorry, I'm of course wrong here. Changed the check to call 
    klass->is_objArray_klass() instead. Thanks!

    Updated webrev.4 in-place.

    cheers,
    Per

    > 
    >> 2. I'd want to see an assert() on the sizeof(long) == sizeof(void *) 
    >> dependency.  I realize what code base this is in but it would be 
    >> properly defensive.
    > 
    > Sounds good.
    > 
    >>
    >> What does the reporting look like in this case?  Is the long[] type 
    >> reported accepted?  I'm wondering if this depletes some of the 
    >> simplicity.
    > 
    > By overriding finish(), the sampling/reporting remains correct and 
    > unaffected, as it will never see the intermediate long[].
    > 
    > Updated webrev:
    > 
    > http://cr.openjdk.java.net/~pliden/8227226/webrev.4
    > 
    > cheers,
    > Per
    > 
    >>
    >> On 8/2/19, 6:13 AM, "hotspot-gc-dev on behalf of Per Liden" 
    >> <hotspot-gc-dev-bounces at openjdk.java.net on behalf of 
    >> per.liden at oracle.com> wrote:
    >>
    >>      Did some micro-benchmarking (on a Xeon E5-2630) with various segment
    >>      sizes between 4K and 512K, and 64K seems to offer a good 
    >> trade-off. For
    >>      a 1G array, the allocation time increases by ~1%, but in exchange 
    >> the
    >>      worst case TTSP drops from ~280ms to ~0.6ms.
    >>      Updated webrev using 64K:
    >>      http://cr.openjdk.java.net/~pliden/8227226/webrev.3
    >>      cheers,
    >>      Per
    >>      On 8/2/19 11:11 AM, Per Liden wrote:
    >>      > Hi Erik,
    >>      >
    >>      > On 8/1/19 5:56 PM, Erik Osterlund wrote:
    >>      >> Hi Per,
    >>      >>
    >>      >> I like that this approach is unintrusive, does its thing at 
    >> the right
    >>      >> abstraction layer, and also handles medium sized arrays.
    >>      >
    >>      > It even handles small arrays (i.e. arrays in small zpages) ;)
    >>      >
    >>      >> Looks good.
    >>      >
    >>      > Thanks! I'll test various segment sizes and see how that affects
    >>      > performance and TTSP.
    >>      >
    >>      > cheers,
    >>      > Per
    >>      >
    >>      >>
    >>      >> Thanks,
    >>      >> /Erik
    >>      >>
    >>      >>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
    >>      >>>
    >>      >>> Here's an updated webrev that should be complete, i.e. fixes the
    >>      >>> issues related to allocation sampling/reporting that I 
    >> mentioned.
    >>      >>> This patch makes MemAllocator::finish() virtual, so that we 
    >> can do
    >>      >>> our thing and install the correct klass pointer before the 
    >> Allocation
    >>      >>> destructor executes. This seems to be the least intrusive why of
    >>      >>> doing this.
    >>      >>>
    >>      >>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
    >>      >>>
    >>      >>> This passed function testing, but proper benchmarking remains 
    >> to be
    >>      >>> done.
    >>      >>>
    >>      >>> cheers,
    >>      >>> Per
    >>      >>>
    >>      >>>> On 7/31/19 7:19 PM, Per Liden wrote:
    >>      >>>> Hi,
    >>      >>>> I found some time to benchmark the "GC clears 
    >> pages"-approach, and
    >>      >>>> it's fairly clear that it's not paying off. So ditching that 
    >> idea.
    >>      >>>> However, I'm still looking for something that would not just do
    >>      >>>> segmented clearing of arrays in large zpages. Letting oop 
    >> arrays
    >>      >>>> temporarily be typed arrays while it's being cleared could 
    >> be an
    >>      >>>> option. I did a prototype for that, which looks like this:
    >>      >>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
    >>      >>>> There's at least one issue here, the code doing allocation 
    >> sampling
    >>      >>>> will see that we allocated long arrays instead of oop 
    >> arrays, so the
    >>      >>>> reporting there will be skewed. That can be addressed if we 
    >> go down
    >>      >>>> this path. The code is otherwise fairly simple and 
    >> contained. Feel
    >>      >>>> free to spot any issues.
    >>      >>>> cheers,
    >>      >>>> Per
    >>      >>>>> On 7/26/19 2:27 PM, Per Liden wrote:
    >>      >>>>> Hi Ryan & Erik,
    >>      >>>>>
    >>      >>>>> I had a look at this and started exploring a slightly 
    >> different
    >>      >>>>> approach. Instead doing segmented clearing in the 
    >> allocation path,
    >>      >>>>> we can have the concurrent GC thread clear pages when they are
    >>      >>>>> reclaimed and not do any clearing in the allocation path at 
    >> all.
    >>      >>>>>
    >>      >>>>> That would look like this:
    >>      >>>>>
    >>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
    >>      >>>>>
    >>      >>>>> (I've had to temporarily comment out three lines of 
    >> assert/debug
    >>      >>>>> code to make this work)
    >>      >>>>>
    >>      >>>>> The relocation set selection phase will now be tasked with 
    >> some
    >>      >>>>> potentially expensive clearing work, so we'll want to make 
    >> that
    >>      >>>>> part parallel also.
    >>      >>>>>
    >>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
    >>      >>>>>
    >>      >>>>> Moving this work from Java threads onto the concurrent GC 
    >> threads
    >>      >>>>> means we will potentially prolong the 
    >> RelocationSetSelection and
    >>      >>>>> Relocation phases. That might be a trade-off worth doing. In
    >>      >>>>> return, we get:
    >>      >>>>>
    >>      >>>>> * Faster array allocations, as there's now less work done 
    >> in the
    >>      >>>>> allocation path.
    >>      >>>>> * This benefits all arrays, not just those allocated in 
    >> large pages.
    >>      >>>>> * No need to consider/tune a "chunk size".
    >>      >>>>> * I also tend think we'll end up with slightly less complex 
    >> code,
    >>      >>>>> that is a bit easier to reason about. Can be debated of 
    >> course.
    >>      >>>>>
    >>      >>>>> This approach might also "survive" longer, because the YC 
    >> scheme
    >>      >>>>> we've been loosely thinking about currently requires newly
    >>      >>>>> allocated pages to be cleared anyway. It's of course too 
    >> early to
    >>      >>>>> tell if that requirement will stand in the end, but it's 
    >> possible
    >>      >>>>> anyway.
    >>      >>>>>
    >>      >>>>> I'll need to do some more testing and benchmarking to make 
    >> sure
    >>      >>>>> there's no regression or bugs here. The commented out debug 
    >> code
    >>      >>>>> also needs to be addressed of course.
    >>      >>>>>
    >>      >>>>> Comments? Other ideas?
    >>      >>>>>
    >>      >>>>> cheers,
    >>      >>>>> Per
    >>      >>>>>
    >>      >>>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
    >>      >>>>>>
    >>      >>>>>> Somehow I lost the RFR off the front and started a new 
    >> thread.
    >>      >>>>>> Now that we're both off vacation I'd like to revisit 
    >> this.  Can
    >>      >>>>>> you take a look?
    >>      >>>>>>
    >>      >>>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of 
    >> Sciampacone,
    >>      >>>>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of
    >>      >>>>>> sci at amazon.com> wrote:
    >>      >>>>>>
    >>      >>>>>>       http://cr.openjdk.java.net/~phh/8227226/webrev.01/
    >>      >>>>>>       This shifts away from abusing the constructor 
    >> do_zero magic
    >>      >>>>>> in exchange for virtualizing mem_clear() and specializing 
    >> for the
    >>      >>>>>> Z version.  It does create a change in mem_clear in that it
    >>      >>>>>> returns an updated version of mem.  It does create change 
    >> outside
    >>      >>>>>> of the Z code however it does feel cleaner.
    >>      >>>>>>       I didn't make a change to PinAllocating - looking at 
    >> it, the
    >>      >>>>>> safety of keeping it constructor / destructor based still 
    >> seemed
    >>      >>>>>> appropriate to me.  If the objection is to using the sequence
    >>      >>>>>> numbers to pin (and instead using handles to update) - 
    >> this to me
    >>      >>>>>> seems less error prone.  I had originally discussed 
    >> handles with
    >>      >>>>>> Stefan but the proposal came down to this which looks much 
    >> cleaner.
    >>      >>>>>>       On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of
    >>      >>>>>> Sciampacone, Ryan" 
    >> <hotspot-gc-dev-bounces at openjdk.java.net on
    >>      >>>>>> behalf of sci at amazon.com> wrote:
    >>      >>>>>>           1) Yes this was a conscious decision.  There was
    >>      >>>>>> discussion on determining the optimal point for breakup 
    >> but given
    >>      >>>>>> the existing sizes this seemed sufficient.  This doesn't 
    >> preclude
    >>      >>>>>> the ability to go down that path if its deemed absolutely
    >>      >>>>>> necessary.  The path for more complex decisions is now 
    >> available.
    >>      >>>>>>           2) Agree
    >>      >>>>>>           3) I'm not clear here.  Do you mean effectively 
    >> going
    >>      >>>>>> direct to ZHeap and bypassing the single function 
    >> PinAllocating?
    >>      >>>>>> Agree. Otherwise I'll ask you to be a bit clearer.
    >>      >>>>>>           4) Agree
    >>      >>>>>>           5) I initially had the exact same reaction but I 
    >> played
    >>      >>>>>> around with a few other versions (including breaking up
    >>      >>>>>> initialization points between header and body to get the 
    >> desired
    >>      >>>>>> results) and this ended up looking correct.  I'll try 
    >> mixing in
    >>      >>>>>> the mem clearer function again (fresh start) to see if it 
    >> looks
    >>      >>>>>> any better.
    >>      >>>>>>           On 7/8/19, 5:49 AM, "Per Liden" 
    >> <per.liden at oracle.com>
    >>      >>>>>> wrote:
    >>      >>>>>>               Hi Ryan,
    >>      >>>>>>               A few general comments:
    >>      >>>>>>               1) It looks like this still only work for 
    >> large pages?
    >>      >>>>>>               2) The log_info stuff should be removed.
    >>      >>>>>>               3) I'm not a huge fan of single-use 
    >> utilities like
    >>      >>>>>> PinAllocating, at
    >>      >>>>>>               least not when, IMO, the alternative is more
    >>      >>>>>> straight forward and less code.
    >>      >>>>>>               4) Please make locals const when possible.
    >>      >>>>>>               5) Duplicating _do_zero looks odd. Injecting 
    >> a "mem
    >>      >>>>>> clearer", similar to
    >>      >>>>>>               what Stefans original patch did, seems worth 
    >> exploring.
    >>      >>>>>>               cheers,
    >>      >>>>>>               /Per
    >>      >>>>>>               (Btw, I'm on vacation so I might not be
    >>      >>>>>> super-responsive to emails)
    >>      >>>>>>               On 2019-07-08 12:42, Erik Österlund wrote:
    >>      >>>>>>               > Hi Ryan,
    >>      >>>>>>               >
    >>      >>>>>>               > This looks good in general. Just some 
    >> stylistic
    >>      >>>>>> things...
    >>      >>>>>>               >
    >>      >>>>>>               > 1) In the ZGC project we like the letter 
    >> 'Z' so
    >>      >>>>>> much that we put it in
    >>      >>>>>>               > front of everything we possibly can, 
    >> including all
    >>      >>>>>> class names.
    >>      >>>>>>               > 2) We also explicitly state things are 
    >> private
    >>      >>>>>> even though it's
    >>      >>>>>>               > bleedingly obvious.
    >>      >>>>>>               >
    >>      >>>>>>               > So:
    >>      >>>>>>               >
    >>      >>>>>>               > 39 class PinAllocating {
    >>      >>>>>>               > 40 HeapWord* _mem;
    >>      >>>>>>               > 41 public: -> 39 class ZPinAllocating { 40
    >>      >>>>>> private: 41 HeapWord* _mem;
    >>      >>>>>>               >    42
    >>      >>>>>>               >   41 public: I can be your sponsor and 
    >> push this
    >>      >>>>>> change for you. I don't
    >>      >>>>>>               > think there is a need for another webrev 
    >> for my
    >>      >>>>>> small stylistic remarks,
    >>      >>>>>>               > so I can just fix that before pushing this 
    >> for
    >>      >>>>>> you. On that note, I'll
    >>      >>>>>>               > add me and StefanK to the contributed-by 
    >> section
    >>      >>>>>> as we all worked out
    >>      >>>>>>               > the right solution to this problem
    >>      >>>>>> collaboratively. I have run through
    >>      >>>>>>               > mach5 tier1-5, and found no issues with this
    >>      >>>>>> patch. Thanks, /Erik
    >>      >>>>>>               >
    >>      >>>>>>               > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
    >>      >>>>>>               >> 
    >> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
    >>      >>>>>>               >> 
    >> https://bugs.openjdk.java.net/browse/JDK-8227226
    >>      >>>>>>               >>
    >>      >>>>>>               >> This patch introduces safe point checks into
    >>      >>>>>> array clearing during
    >>      >>>>>>               >> allocation for ZGC.  The patch isolates the
    >>      >>>>>> changes to ZGC as (in
    >>      >>>>>>               >> particular with the more modern 
    >> collectors) the
    >>      >>>>>> approach to
    >>      >>>>>>               >> incrementalizing or respecting safe point 
    >> checks
    >>      >>>>>> is going to be
    >>      >>>>>>               >> different.
    >>      >>>>>>               >>
    >>      >>>>>>               >> The approach is to keep the region 
    >> holding the
    >>      >>>>>> array in the allocating
    >>      >>>>>>               >> state (pin logic) while updating the 
    >> color to the
    >>      >>>>>> array after checks.
    >>      >>>>>>               >>
    >>      >>>>>>               >> Can I get a review?  Thanks.
    >>      >>>>>>               >>
    >>      >>>>>>               >> Ryan
    >>      >>>>>>               >
    >>      >>>>>>
    >>      >>
    >>