RFR(S) 8245925 G1 allocates EDEN region after CDS has executed GC

Thu Jun 4 18:02:43 UTC 2020

Hi, Ioi

   Glad you make it simple.

metaspaceShared.cpp:

1612 void VM_PopulateDumpSharedSpace::doit() {
1613 if (HeapShared::is_heap_object_archiving_allowed()) {
1614 // Avoid fragmentation while archiving heap objects.
1615 if (GCLocker::is_active()) {
1616 // This should rarely happen during -Xshare:dump, if at all, but 
just to be safe.
1617 log_debug(cds)("GCLocker::is_active() ... try again");
1618 return;
1619 }
1620
1621 log_debug(cds)("Run GC ...");
1622 Universe::heap()->collect_as_vm_thread(GCCause::_archive_time_gc);
1623 log_debug(cds)("Run GC done");
1624 } I think the check for GCLocker should be after the collection, 
the G1 full collection will check active GC: bool 
G1CollectedHeap::do_full_collection(bool explicit_gc, bool 
clear_all_soft_refs) { assert_at_safepoint_on_vm_thread(); if 
(GCLocker::check_active_before_gc()) { // Full GC was not completed. 
return false; } It should be very rare for dumping that full collection 
aborted premature, but put the checking after collection seems safer. 
For test GCDuringDump.java: 64 String extraArg = (i == 0) ? 
"-showversion" : "-javaagent:" + agentJar; 65 String extraOption = (i == 
0) ? "-showversion" : "-XX:+AllowArchivingWithJavaAgent"; 66 String 
extraOption2 = (i != 2) ? "-showversion" : "-Dtest.with.cleaner=true"; 
67 Is that correct for line 65? for i = 0, both extraArg and extraOption 
will be same, looks a bug. Others looks OK to me. Thanks Yumin

On 6/3/20 10:53 PM, Ioi Lam wrote:
>
>> On 5/29/20 9:40 PM, Yumin Qi wrote:
>>> HI, Ioi
>>>
>>>   If the allocation of EDEN happens between GC and dump, should we 
>>> put the GC action in VM_PopulateDumpSharedSpace? This way, at 
>>> safepoint there should no allocation happens. The stack trace showed 
>>> it happened with a Java Thread, which should be blocked at safepoint.
>>>
>>
>> Hi Yumin,
>>
>> I think GCs cannot be executed inside a safepoint, because some parts 
>> of GC need to execute in a safepoint, so they will be blocked until 
>> VM_PopulateDumpSharedSpace::doit has returned.
>>
>> Anyway, as I mentioned in my reply to Jiangli, there's a better way 
>> to fix this, so I will withdraw the current patch.
>>
>
> Hi Yumin,
>
> Actually, I changed my mind again, and implemented your suggestion :-)
>
> There's actually a way to invoke GC inside a safepoint (it's used by 
> "jcmd gc.heap_dump", for example). So I changed the CDS code to do the 
> same thing. It's a much simpler change and does what I want -- no 
> other thread will be able to make any heap allocation after the GC has 
> completed, so no EDEN region will be allocated:
>
> http://cr.openjdk.java.net/~iklam/jdk15/8245925-g1-eden-after-cds-gc.v02/
>
> The check for "if (GCLocker::is_active())" should almost always be 
> false. I left it there just for safety:
>
> During -Xshare:dump, we execute Java code to build the module graph, 
> load classes, etc. So theoretically someone could try to parallelize 
> some of that Java code in the future. Theoretically when CDS has 
> entered the safepoint, another thread could be in the middle a JNI 
> method that has held the GCLock.
>
> Thanks
> - Ioi
>
>>
>>>
>>> On 5/29/20 7:29 PM, Ioi Lam wrote:
>>>> https://bugs.openjdk.java.net/browse/JDK-8245925
>>>> http://cr.openjdk.java.net/~iklam/jdk15/8245925-g1-eden-after-cds-gc.v01/ 
>>>>
>>>>
>>>>
>>>> Summary:
>>>>
>>>> CDS supports archived heap objects only for G1. During -Xshare:dump,
>>>> CDS executes a full GC so that G1 will compact the heap regions, 
>>>> leaving
>>>> maximum contiguous free space at the top of the heap. Then, the 
>>>> archived
>>>> heap regions are allocated from the top of the heap.
>>>>
>>>> Under some circumstances, java.lang.ref.Cleaners will execute
>>>> after the GC has completed. The cleaners may allocate or 
>>>> synchronized, which
>>>> will cause G1 to allocate an EDEN region at the top of the heap.
>>>>
>>>> The fix is simple -- after CDS has entered a safepoint, if EDEN 
>>>> regions exist,
>>>> exit the safepoint, run GC, and try again. Eventually all the 
>>>> cleaners will
>>>> be executed and no more allocation can happen.
>>>>
>>>> For safety, I limit the retry count to 30 (or about total 9 seconds).
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>>
>>>> <https://bugs.openjdk.java.net/browse/JDK-8245925>
>>>
>>
>