8227226: Segmented array clearing for ZGC

Per Liden per.liden at oracle.com
Wed Aug 7 10:23:13 UTC 2019


Hi again,

On 8/7/19 11:59 AM, Per Liden wrote:
> Hi Ryan,
> 
> On 8/7/19 3:05 AM, Sciampacone, Ryan wrote:
>> Although least intrusive, it goes back to some of the earlier 
>> complaints about using false in the constructor for do_zero.  It also 
>> makes a fair number of 
> 
> My earlier comment about this was not about passing false to the 
> constructor, but the duplication of the _do_zero member, which I thought 
> looked a bit odd. In this patch, this was avoided by separation these 
> paths already in ZCollectedHeap::array_allocate().
> 
>> assumptions (and goes against the hierarchies intent) on 
>> initialization logic to hide in finish().  That said, I agree that is 
>> fairly clean - and definitely addresses the missed cases of the 
>> earlier webrev.
>>
> 
> We've had the same discussions here and concluded that we might want to 
> restructure parts of MemAllocator to better accommodate this use case, 
> but that overriding finish() seems ok for now. A patch to restructure 
> MemAllocator could come later if we think it's needed.
> 
>> 2 things,
>>
>> 1. Isn't the substitute_oop_array_klass() check too narrow?  It will 
>> only detect types Object[], and not any other type of reference array 
>> (such as String[]) ?  I believe there's a bug here (correct me if I'm 
>> wrong).
> 
> On the JVM level, Object[], String[] and int[][] all have the same 
> Klass, so we should catch them all with this single check.

Sorry, I'm of course wrong here. Changed the check to call 
klass->is_objArray_klass() instead. Thanks!

Updated webrev.4 in-place.

cheers,
Per

> 
>> 2. I'd want to see an assert() on the sizeof(long) == sizeof(void *) 
>> dependency.  I realize what code base this is in but it would be 
>> properly defensive.
> 
> Sounds good.
> 
>>
>> What does the reporting look like in this case?  Is the long[] type 
>> reported accepted?  I'm wondering if this depletes some of the 
>> simplicity.
> 
> By overriding finish(), the sampling/reporting remains correct and 
> unaffected, as it will never see the intermediate long[].
> 
> Updated webrev:
> 
> http://cr.openjdk.java.net/~pliden/8227226/webrev.4
> 
> cheers,
> Per
> 
>>
>> On 8/2/19, 6:13 AM, "hotspot-gc-dev on behalf of Per Liden" 
>> <hotspot-gc-dev-bounces at openjdk.java.net on behalf of 
>> per.liden at oracle.com> wrote:
>>
>>      Did some micro-benchmarking (on a Xeon E5-2630) with various segment
>>      sizes between 4K and 512K, and 64K seems to offer a good 
>> trade-off. For
>>      a 1G array, the allocation time increases by ~1%, but in exchange 
>> the
>>      worst case TTSP drops from ~280ms to ~0.6ms.
>>      Updated webrev using 64K:
>>      http://cr.openjdk.java.net/~pliden/8227226/webrev.3
>>      cheers,
>>      Per
>>      On 8/2/19 11:11 AM, Per Liden wrote:
>>      > Hi Erik,
>>      >
>>      > On 8/1/19 5:56 PM, Erik Osterlund wrote:
>>      >> Hi Per,
>>      >>
>>      >> I like that this approach is unintrusive, does its thing at 
>> the right
>>      >> abstraction layer, and also handles medium sized arrays.
>>      >
>>      > It even handles small arrays (i.e. arrays in small zpages) ;)
>>      >
>>      >> Looks good.
>>      >
>>      > Thanks! I'll test various segment sizes and see how that affects
>>      > performance and TTSP.
>>      >
>>      > cheers,
>>      > Per
>>      >
>>      >>
>>      >> Thanks,
>>      >> /Erik
>>      >>
>>      >>> On 1 Aug 2019, at 16:14, Per Liden <per.liden at oracle.com> wrote:
>>      >>>
>>      >>> Here's an updated webrev that should be complete, i.e. fixes the
>>      >>> issues related to allocation sampling/reporting that I 
>> mentioned.
>>      >>> This patch makes MemAllocator::finish() virtual, so that we 
>> can do
>>      >>> our thing and install the correct klass pointer before the 
>> Allocation
>>      >>> destructor executes. This seems to be the least intrusive why of
>>      >>> doing this.
>>      >>>
>>      >>> http://cr.openjdk.java.net/~pliden/8227226/webrev.2
>>      >>>
>>      >>> This passed function testing, but proper benchmarking remains 
>> to be
>>      >>> done.
>>      >>>
>>      >>> cheers,
>>      >>> Per
>>      >>>
>>      >>>> On 7/31/19 7:19 PM, Per Liden wrote:
>>      >>>> Hi,
>>      >>>> I found some time to benchmark the "GC clears 
>> pages"-approach, and
>>      >>>> it's fairly clear that it's not paying off. So ditching that 
>> idea.
>>      >>>> However, I'm still looking for something that would not just do
>>      >>>> segmented clearing of arrays in large zpages. Letting oop 
>> arrays
>>      >>>> temporarily be typed arrays while it's being cleared could 
>> be an
>>      >>>> option. I did a prototype for that, which looks like this:
>>      >>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.1
>>      >>>> There's at least one issue here, the code doing allocation 
>> sampling
>>      >>>> will see that we allocated long arrays instead of oop 
>> arrays, so the
>>      >>>> reporting there will be skewed. That can be addressed if we 
>> go down
>>      >>>> this path. The code is otherwise fairly simple and 
>> contained. Feel
>>      >>>> free to spot any issues.
>>      >>>> cheers,
>>      >>>> Per
>>      >>>>> On 7/26/19 2:27 PM, Per Liden wrote:
>>      >>>>> Hi Ryan & Erik,
>>      >>>>>
>>      >>>>> I had a look at this and started exploring a slightly 
>> different
>>      >>>>> approach. Instead doing segmented clearing in the 
>> allocation path,
>>      >>>>> we can have the concurrent GC thread clear pages when they are
>>      >>>>> reclaimed and not do any clearing in the allocation path at 
>> all.
>>      >>>>>
>>      >>>>> That would look like this:
>>      >>>>>
>>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-base
>>      >>>>>
>>      >>>>> (I've had to temporarily comment out three lines of 
>> assert/debug
>>      >>>>> code to make this work)
>>      >>>>>
>>      >>>>> The relocation set selection phase will now be tasked with 
>> some
>>      >>>>> potentially expensive clearing work, so we'll want to make 
>> that
>>      >>>>> part parallel also.
>>      >>>>>
>>      >>>>> http://cr.openjdk.java.net/~pliden/8227226/webrev.0-parallel
>>      >>>>>
>>      >>>>> Moving this work from Java threads onto the concurrent GC 
>> threads
>>      >>>>> means we will potentially prolong the 
>> RelocationSetSelection and
>>      >>>>> Relocation phases. That might be a trade-off worth doing. In
>>      >>>>> return, we get:
>>      >>>>>
>>      >>>>> * Faster array allocations, as there's now less work done 
>> in the
>>      >>>>> allocation path.
>>      >>>>> * This benefits all arrays, not just those allocated in 
>> large pages.
>>      >>>>> * No need to consider/tune a "chunk size".
>>      >>>>> * I also tend think we'll end up with slightly less complex 
>> code,
>>      >>>>> that is a bit easier to reason about. Can be debated of 
>> course.
>>      >>>>>
>>      >>>>> This approach might also "survive" longer, because the YC 
>> scheme
>>      >>>>> we've been loosely thinking about currently requires newly
>>      >>>>> allocated pages to be cleared anyway. It's of course too 
>> early to
>>      >>>>> tell if that requirement will stand in the end, but it's 
>> possible
>>      >>>>> anyway.
>>      >>>>>
>>      >>>>> I'll need to do some more testing and benchmarking to make 
>> sure
>>      >>>>> there's no regression or bugs here. The commented out debug 
>> code
>>      >>>>> also needs to be addressed of course.
>>      >>>>>
>>      >>>>> Comments? Other ideas?
>>      >>>>>
>>      >>>>> cheers,
>>      >>>>> Per
>>      >>>>>
>>      >>>>>> On 7/24/19 4:37 PM, Sciampacone, Ryan wrote:
>>      >>>>>>
>>      >>>>>> Somehow I lost the RFR off the front and started a new 
>> thread.
>>      >>>>>> Now that we're both off vacation I'd like to revisit 
>> this.  Can
>>      >>>>>> you take a look?
>>      >>>>>>
>>      >>>>>> On 7/8/19, 10:40 AM, "hotspot-gc-dev on behalf of 
>> Sciampacone,
>>      >>>>>> Ryan" <hotspot-gc-dev-bounces at openjdk.java.net on behalf of
>>      >>>>>> sci at amazon.com> wrote:
>>      >>>>>>
>>      >>>>>>       http://cr.openjdk.java.net/~phh/8227226/webrev.01/
>>      >>>>>>       This shifts away from abusing the constructor 
>> do_zero magic
>>      >>>>>> in exchange for virtualizing mem_clear() and specializing 
>> for the
>>      >>>>>> Z version.  It does create a change in mem_clear in that it
>>      >>>>>> returns an updated version of mem.  It does create change 
>> outside
>>      >>>>>> of the Z code however it does feel cleaner.
>>      >>>>>>       I didn't make a change to PinAllocating - looking at 
>> it, the
>>      >>>>>> safety of keeping it constructor / destructor based still 
>> seemed
>>      >>>>>> appropriate to me.  If the objection is to using the sequence
>>      >>>>>> numbers to pin (and instead using handles to update) - 
>> this to me
>>      >>>>>> seems less error prone.  I had originally discussed 
>> handles with
>>      >>>>>> Stefan but the proposal came down to this which looks much 
>> cleaner.
>>      >>>>>>       On 7/8/19, 6:36 AM, "hotspot-gc-dev on behalf of
>>      >>>>>> Sciampacone, Ryan" 
>> <hotspot-gc-dev-bounces at openjdk.java.net on
>>      >>>>>> behalf of sci at amazon.com> wrote:
>>      >>>>>>           1) Yes this was a conscious decision.  There was
>>      >>>>>> discussion on determining the optimal point for breakup 
>> but given
>>      >>>>>> the existing sizes this seemed sufficient.  This doesn't 
>> preclude
>>      >>>>>> the ability to go down that path if its deemed absolutely
>>      >>>>>> necessary.  The path for more complex decisions is now 
>> available.
>>      >>>>>>           2) Agree
>>      >>>>>>           3) I'm not clear here.  Do you mean effectively 
>> going
>>      >>>>>> direct to ZHeap and bypassing the single function 
>> PinAllocating?
>>      >>>>>> Agree. Otherwise I'll ask you to be a bit clearer.
>>      >>>>>>           4) Agree
>>      >>>>>>           5) I initially had the exact same reaction but I 
>> played
>>      >>>>>> around with a few other versions (including breaking up
>>      >>>>>> initialization points between header and body to get the 
>> desired
>>      >>>>>> results) and this ended up looking correct.  I'll try 
>> mixing in
>>      >>>>>> the mem clearer function again (fresh start) to see if it 
>> looks
>>      >>>>>> any better.
>>      >>>>>>           On 7/8/19, 5:49 AM, "Per Liden" 
>> <per.liden at oracle.com>
>>      >>>>>> wrote:
>>      >>>>>>               Hi Ryan,
>>      >>>>>>               A few general comments:
>>      >>>>>>               1) It looks like this still only work for 
>> large pages?
>>      >>>>>>               2) The log_info stuff should be removed.
>>      >>>>>>               3) I'm not a huge fan of single-use 
>> utilities like
>>      >>>>>> PinAllocating, at
>>      >>>>>>               least not when, IMO, the alternative is more
>>      >>>>>> straight forward and less code.
>>      >>>>>>               4) Please make locals const when possible.
>>      >>>>>>               5) Duplicating _do_zero looks odd. Injecting 
>> a "mem
>>      >>>>>> clearer", similar to
>>      >>>>>>               what Stefans original patch did, seems worth 
>> exploring.
>>      >>>>>>               cheers,
>>      >>>>>>               /Per
>>      >>>>>>               (Btw, I'm on vacation so I might not be
>>      >>>>>> super-responsive to emails)
>>      >>>>>>               On 2019-07-08 12:42, Erik Österlund wrote:
>>      >>>>>>               > Hi Ryan,
>>      >>>>>>               >
>>      >>>>>>               > This looks good in general. Just some 
>> stylistic
>>      >>>>>> things...
>>      >>>>>>               >
>>      >>>>>>               > 1) In the ZGC project we like the letter 
>> 'Z' so
>>      >>>>>> much that we put it in
>>      >>>>>>               > front of everything we possibly can, 
>> including all
>>      >>>>>> class names.
>>      >>>>>>               > 2) We also explicitly state things are 
>> private
>>      >>>>>> even though it's
>>      >>>>>>               > bleedingly obvious.
>>      >>>>>>               >
>>      >>>>>>               > So:
>>      >>>>>>               >
>>      >>>>>>               > 39 class PinAllocating {
>>      >>>>>>               > 40 HeapWord* _mem;
>>      >>>>>>               > 41 public: -> 39 class ZPinAllocating { 40
>>      >>>>>> private: 41 HeapWord* _mem;
>>      >>>>>>               >    42
>>      >>>>>>               >   41 public: I can be your sponsor and 
>> push this
>>      >>>>>> change for you. I don't
>>      >>>>>>               > think there is a need for another webrev 
>> for my
>>      >>>>>> small stylistic remarks,
>>      >>>>>>               > so I can just fix that before pushing this 
>> for
>>      >>>>>> you. On that note, I'll
>>      >>>>>>               > add me and StefanK to the contributed-by 
>> section
>>      >>>>>> as we all worked out
>>      >>>>>>               > the right solution to this problem
>>      >>>>>> collaboratively. I have run through
>>      >>>>>>               > mach5 tier1-5, and found no issues with this
>>      >>>>>> patch. Thanks, /Erik
>>      >>>>>>               >
>>      >>>>>>               > On 2019-07-05 17:18, Sciampacone, Ryan wrote:
>>      >>>>>>               >> 
>> http://cr.openjdk.java.net/~phh/8227226/webrev.00/
>>      >>>>>>               >> 
>> https://bugs.openjdk.java.net/browse/JDK-8227226
>>      >>>>>>               >>
>>      >>>>>>               >> This patch introduces safe point checks into
>>      >>>>>> array clearing during
>>      >>>>>>               >> allocation for ZGC.  The patch isolates the
>>      >>>>>> changes to ZGC as (in
>>      >>>>>>               >> particular with the more modern 
>> collectors) the
>>      >>>>>> approach to
>>      >>>>>>               >> incrementalizing or respecting safe point 
>> checks
>>      >>>>>> is going to be
>>      >>>>>>               >> different.
>>      >>>>>>               >>
>>      >>>>>>               >> The approach is to keep the region 
>> holding the
>>      >>>>>> array in the allocating
>>      >>>>>>               >> state (pin logic) while updating the 
>> color to the
>>      >>>>>> array after checks.
>>      >>>>>>               >>
>>      >>>>>>               >> Can I get a review?  Thanks.
>>      >>>>>>               >>
>>      >>>>>>               >> Ryan
>>      >>>>>>               >
>>      >>>>>>
>>      >>
>>


More information about the hotspot-gc-dev mailing list