RFR(S) 8245925 G1 allocates EDEN region after CDS has executed GC
Ioi Lam
ioi.lam at oracle.com
Sat May 30 05:44:14 UTC 2020
On 5/29/20 8:40 PM, Jiangli Zhou wrote:
> On Fri, May 29, 2020 at 7:30 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>> https://bugs.openjdk.java.net/browse/JDK-8245925
>> http://cr.openjdk.java.net/~iklam/jdk15/8245925-g1-eden-after-cds-gc.v01/
>>
>>
>> Summary:
>>
>> CDS supports archived heap objects only for G1. During -Xshare:dump,
>> CDS executes a full GC so that G1 will compact the heap regions, leaving
>> maximum contiguous free space at the top of the heap. Then, the archived
>> heap regions are allocated from the top of the heap.
>>
>> Under some circumstances, java.lang.ref.Cleaners will execute
>> after the GC has completed. The cleaners may allocate or synchronized, which
>> will cause G1 to allocate an EDEN region at the top of the heap.
> This is an interesting one. Please give more details on under what
> circumstances java.lang.ref.Cleaners causes the issue. It's unclear to
> me why it hasn't been showing up before.
Hi Jiangli,
Thanks for the review. It's very helpful.
The assert (see my comment in JDK-8245925) happened in my prototype for
JDK-8244778
http://cr.openjdk.java.net/~iklam/jdk15/8244778-archive-full-module-graph.v00.8/
I have to archive AppClassLoader and PlatformClassLoader, but need to
clear their URLClassPath field (the "ucp" field). See
clear_loader_states in metaspaceShared.cpp. Because of this, some
java.util.zip.ZipFiles referenced by the URLClassPath become garbage,
and their Cleaners are executed after full GC has finished.
I think the bug has always existed, but is just never triggered because
we have not activated the Cleaners.
>
>>
>> The fix is simple -- after CDS has entered a safepoint, if EDEN regions
>> exist,
>> exit the safepoint, run GC, and try again. Eventually all the cleaners will
>> be executed and no more allocation can happen.
>>
>> For safety, I limit the retry count to 30 (or about total 9 seconds).
>>
> I think it's better to skip the top allocated region(s) in such cases
> and avoid retrying. Dump time performance is important, as we are
> moving the cost from runtime to CDS dump time. It's desirable to keep
> the dump time cost as low as possible, so using CDS delivers better
> net gain overall.
>
> Here are some comments for your current webrev itself.
>
> 1611 static bool has_unwanted_g1_eden_regions() {
> 1612 #if INCLUDE_G1GC
> 1613 return HeapShared::is_heap_object_archiving_allowed() && UseG1GC &&
> 1614 G1CollectedHeap::heap()->eden_regions_count() > 0;
> 1615 #else
> 1616 return false;
> 1617 #endif
> 1618 }
>
> You can remove 'UseG1GC' from line 1613, as
> is_heap_object_archiving_allowed() check already covers it:
>
> static bool is_heap_object_archiving_allowed() {
> CDS_JAVA_HEAP_ONLY(return (UseG1GC && UseCompressedOops &&
> UseCompressedClassPointers);)
> NOT_CDS_JAVA_HEAP(return false;)
> }
>
> Please include heap archiving code under #if INCLUDE_CDS_JAVA_HEAP.
> It's better to extract the GC handling code in
> VM_PopulateDumpSharedSpace::doit() into a separate API in
> heapShared.*.
>
> It's time to enhance heap archiving to use a separate buffer when
> copying the objects at dump time (discussed before), as a longer term
> fix. I'll file a RFE.
Thanks for reminding me. I think that is a better way to fix this
problem. It should be fairly easy to do, as we can already relocate the
heap regions using HeapShared::patch_archived_heap_embedded_pointers().
Let me try to implement it.
BTW, the GC speed is very fast, because the heap is not used very much
during -Xshare:dump. -Xlog:gc shows:
[0.259s][info][gc ] GC(0) Pause Full (Full GC for -Xshare:dump)
4M->1M(32M) 8.220ms
So we have allocated only 4MB of objects, and only 1MB of those are
reachable.
Anyway, I think we can even avoid running the GC altogether. We can scan
for contiguous free space from the top of the heap (below the EDEN
region). If there's more contiguous free space than the current
allocated heap regions, we know for sure that we can archive all the
heap objects that we need without doing the GC. That can be done as an
RFE after copying the objects. It won't save much though (8ms out of
about 700ms of total -Xshare:dump time).
I'll withdraw this patch for now, and will try to implement the object
copying.
Thanks
- Ioi
>
> Best,
> Jiangli
>
>> Thanks
>> - Ioi
>>
>>
>> <https://bugs.openjdk.java.net/browse/JDK-8245925>
More information about the hotspot-gc-dev
mailing list