RFC: Parallel full collection for G1

Thu Nov 17 10:18:39 UTC 2016

Hi,

On 2016-11-09 16:15, Erik Helin wrote:
> Hi all,
>
> Currently, G1 Full GC is very slow because it runs serially. In some
> cases this might not be optimal.
>
> Although the concept is pretty well known, there are differences in 
> how well a solution would fit with the rest of the G1 code. The most 
> direct approach for implementing a parallel full GC for G1 is to 
> create a parallel version based on the closures used by the "Serial"
> full GC. Another approach for the marking phase would be to build
> upon the concurrent marking code, essentially running a concurrent 
> mark in the foreground. These two approaches have different advantages:
>
> - piggybacking on an already running concurrent mark is preferable when
>   G1 is about to encounter a concurrent mode failure. Typically the
>   concurrent mark is nearly finished at that point, unless the dynamic
>   IHOP predictions were completely off, so most of the marking work
>   is already done. This could save a significant amount of time in the
>   following compaction phase of that full GC.
>
>   From this point on, G1 Full GC might either compact in-place as
>   before, or do an evacuating collection if or as soon as there is a
>   suitable reserve of regions.
>
> - parallelizing the closures used within the "MarkSweep framework" will
>   result in a parallel full GC that can handle the worst case
>   from-scratch Full GC better.  I.e. even though this algorithm will
>   have to redo marking in a STW pause, it will get the most precise
>   liveness information and so will be able to compact the heap more
>   densely.  This approach can also handle the case when G1 is
>   completely out of regions.

I have started looking at this approach. The project is just ramping up
and I'm still in the investigation phase. I'll get back with more
information once I have a worked-through project plan and a JEP.

Thanks,
Stefan
>
> Both approaches will most likely also tie into the idea of rebuilding 
> remembered sets concurrently. Any kind of full GC implementation needs
> to rebuild all the remembered sets, unless the non-essential
> remembered sets can be rebuilt during a concurrent phase. Since after a
> full GC G1 will resume doing young collections, the remembered sets 
> can be rebuilt later.
>
> Even though a full collection is still a failure mode for G1, having a
> parallel version will make the impact less dramatic if it happens.
>
> Thanks,
> Erik