RFR: 8186571: Implementation: JEP 307: Parallel Full GC for G1
Stefan Johansson
stefan.johansson at oracle.com
Wed Oct 18 09:45:07 UTC 2017
Hi again,
A lot more internal review has been done and here are the latest webrevs:
Full: http://cr.openjdk.java.net/~sjohanss/8186571/hotspot.04/
Incremental: http://cr.openjdk.java.net/~sjohanss/8186571/hotspot.03-04/
Incremental (from previous mail):
http://cr.openjdk.java.net/~sjohanss/8186571/hotspot.01-04/
Summary of changes:
* Updated calculations for heap sizing after gc.
* Removed G1FullGCWorkerData and moved data into G1FullCollector instead.
* Renamed G1MarkStack to G1FullGCMarker.
* Renamed G1CompactionPoint to G1FullGCCompactionPoint.
* Removed now unused RebuildRSOopClosure and par_write_ref from G1RemSet.
* Updated comments to be more informative.
* Better naming of functions and variables.
* Updated copyright for a lot of files.
Big thanks to Erik D, Thomas S and Sangheon K for working your way
through this big change.
Cheers,
Stefan
On 2017-09-19 17:32, Stefan Johansson wrote:
> Hi,
>
> We're moving forward with the review internally and doing some
> performance enhancements as well. Here are updated webrevs:
> Full: http://cr.openjdk.java.net/~sjohanss/8186571/hotspot.01/
> Incremental: http://cr.openjdk.java.net/~sjohanss/8186571/hotspot.00-01/
>
> Note that the full webrev is based on the new consolidated repo, but
> the incremental was generated with the old structure.
>
> Highlight in this update:
> * Cleaned out unused code in PreservedMarks.
> * Fixed memory leak in GenericTaskQueueSet.
> * HeapRegionClaimerBase has been removed and instead we now have two
> functions to iterate through all heap regions.
> * General cleanups and renames to ease understanding the code.
> * G1 Hot Card Cache cleanup made parallel and moved into appropriate
> phase.
> * Updated HeapRegion::apply_to_marked_objects to be a template
> function to avoid virtual call.
>
> Thanks Erik D and Thomas S for all comments so far.
>
> Cheers,
> Stefan
>
>
> On 2017-09-04 17:36, Stefan Johansson wrote:
>> Hi,
>>
>> Please review the implementation of JEP-307:
>> https://bugs.openjdk.java.net/browse/JDK-8172890
>>
>> Webrev:
>> http://cr.openjdk.java.net/~sjohanss/8186571/hotspot.00/
>>
>> Summary:
>> As communicated late last year [1], I've been working on
>> parallelizing the Full GC for G1. The implementation is now ready for
>> review.
>>
>> The approach I chose was to redo marking at the start of the Full GC
>> and not reuse the marking information from the concurrent mark cycle.
>> The main reason behind this is to maximize the chance of freeing up
>> memory. I reused the marking bitmap from the concurrent mark code
>> though, so instead of marking in the mark word a bitmap is used. The
>> mark word is still used for forwarding pointers, so marks will still
>> have to be preserved for some objects.
>>
>> The algorithm is still a four phased mark-compact but each phase is
>> handled by parallel workers. Marking and reference processing is done
>> in phase 1. In phase 2 all worker threads work through the heap
>> claiming regions which they prepare for compaction. This is done by
>> installing forwarding pointers into the mark word of the live objects
>> that will move. The regions claimed by a worker in this phase will be
>> the same regions that the worker will compact in phase 4. This
>> ensures that objects are not overwritten before compacted.
>>
>> In phase 3, all pointers to other objects are updated by looking at
>> the forwarding pointers. At this point all information needed to
>> create new remembered sets is available and this rebuilding has been
>> added to phase 3. In the old version remembered set rebuilding was
>> done separately after the compaction, but this is more efficient.
>>
>> As mentioned phase 4 is when the compaction is done. In this first
>> version, to avoid some complexity, there is no work stealing in this
>> phase. This will lead to some imbalance between the workers, but this
>> can be treated as a separate RFE in the future.
>>
>> The part of this work that has generated the most questions during
>> internal discussions are the serial parts of phase 2 and 4. They are
>> executed if no regions are to be freed up by the parallel workers. It
>> is kind of a safety mechanism to avoid throwing a premature OOM. In
>> the case of no regions being freed by the parallel code path a single
>> threaded pass over the last region of each worker is done (at most
>> number-of-workers regions are handled) to further compact these
>> regions and hopefully free up some regions.
>>
>> Testing:
>> * A lot of local sanity testing, both functional and performance.
>> * Passed tier 1-5 of internal testing on supported platforms.
>> * No regressions in performance testing.
>>
>> Cheers,
>> Stefan
>>
>> [1]
>> http://mail.openjdk.java.net/pipermail/hotspot-gc-dev/2016-November/019216.html
>
More information about the hotspot-gc-dev
mailing list