RFR: 8165443: Free Collection Set serial phase takes very long on large heaps
Thomas Schatzl
thomas.schatzl at oracle.com
Thu Dec 5 13:00:02 UTC 2019
Hi Stefan,
On 03.12.19 19:21, Stefan Johansson wrote:> Hi Thomas,
>
> Thanks for taking a look.
>
> On 2019-11-29 11:55, Thomas Schatzl wrote:
>> Hi,
>> On 26.11.19 16:01, Stefan Johansson wrote:
>>> Hi,
>>>
>>> Please review this fix to improve freeing the collection set when
>>> having a lot of regions.
>>>
>>> Issue: https://bugs.openjdk.java.net/browse/JDK-8165443
>>> Webrev: http://cr.openjdk.java.net/~sjohanss/8165443/00/
>>>[...]
>> - HeapRegionManager::is_available() is (mostly) meant as internal
>> function, but due to assert'ing I was forced to make it public
>> (probably should have a non-public version and a public one that is
>> only available with assertions).
>> The use in G1RebuildFreeListTask::work() kind of violates this idea
>> (and sorry for not mentioning it anywhere in the code).
>
> This was just a very small optimization instead of using at_or_null(),
> so reverted back to use that one.
Okay :)
>
>> Maybe move the entire freelist rebuild task into HeapRegionManager
>> where it imho fits much better? HeapRegionManager is and should be
>> the "owner" of the free list. The call to
>> HeapRegion::unlink_from_list() can probably be made earlier, or see
>> below.
>> - another change I do not really like is the difference between
>> "abandoning" the free list (and then later clearing the HeapRegion
>> links in parallel) and "removing" the free list.
>> While it makes sense from a performance POV, I would be happier if we
>> could get away without introducing this tiny semantic difference.
>> An option would be that there were a FreeRegionList::add_to_tail that
>> just overwrites the links. This would remove the need for the new
>> "abandon" and other support methods in a few places and hide the
>> ugliness there.
>> What do you think?
>
> Discussed this a bit with Thomas offline and we both agree that there
> can be further cleanups improvements here, but I think we agreed on
> just moving stuff into the HeapRegionManager but keep the added
> concepts. We both agree that having both remove_all and abandon for
> the freelist is not good in the long run, but getting rid of
> remove_all or fix remove_all to perform well is out of scope for this
> enhancement.
>
Fine with me.
>> - while you mentioned that you did not look into balancing the work
>> for the rebuild free list action, but please limit the workers in the
>> G1RebuildFreeListTask by at least the number of regions in the heap.
>> (Move the chunk sizing outside of the G1RebuildFreeListTask for
>> that.)
>
> Added a clamp to make sure we never use more workers than we have
> regions, even though in most cases this should never happen.
>
>> - instead of workers()->run_task() please use
>> G1CollectedHeap::run_task(). That also removes some timing code for
>> you around these places :)
>
> Can’t use that one since I need to time the destructor of the task as
> well.
>
Okay.
>
> Here are the updated webrevs:
> Full: http://cr.openjdk.java.net/~sjohanss/8165443/01/
> Inc: http://cr.openjdk.java.net/~sjohanss/8165443/00-01/
>
Some more nits (sorry for not mentioning these in the first review - I
was a bit distracted by the placement of the rebuild-free-list code):
- HeapRegionManager::abandon_free_list should be private, and the
implementation could easily be moved to the cpp file.
Also HRM::append_to_free_list.
I would probably just manually inline these two one-liner methods though.
- G1Collectionset::iterate_part_from can be private
- I think there will be some merge errors with latest changes (i.e. the
update to the WorkerDataArray at least)
- heapRegionManager.hpp: please move G1RebuildFreeListTask into the cpp
file. It is only ever used there. This allows undo of the changes to the
includes in the hpp file too.
- in G1FreeCollectionSetClosure::handle_evacuated_region, the code could
remove the call to clear_remset_locked() and set the skip_remset
parameter of the free_region call to false. Since this seems to have
been the only case of the skip_remset==true, the parameter can be removed.
Also the "locked" parameter seems to be always true (now?), so this one
can be removed too. The removal of these parameters propagates a bit,
but that would reduce complexity of related methods quite a bit :)
Please add an is_at_safepoint() or so call there then.
I am open to moving this to a separate follow-up RFE, but it does not
seem that big of a change, and I think it makes the code easier to
understand.
- (this is imho) in the method FreeCSetStats::update_used_before() and
FreeCSetStats::increment_regions_freed() are always called together I
recommend merging them. To me it distracts a bit from the flow of the
code with details (given that you already added methods to the
FreeCSetStats class). I would also merge the calls to
update_failure_used/update_failure_waste/update_used_after in
handle_failed_region.
Looks good otherwise.
Thanks,
Thomas
More information about the hotspot-gc-dev
mailing list