RFR(S) 8245925 G1 allocates EDEN region after CDS has executed GC

Thu Jun 4 18:36:54 UTC 2020

On 6/4/20 11:02 AM, Yumin Qi wrote:
> Hi, Ioi
>
>   Glad you make it simple.
>
> metaspaceShared.cpp:
>
> 1612 void VM_PopulateDumpSharedSpace::doit() {
> 1613 if (HeapShared::is_heap_object_archiving_allowed()) {
> 1614 // Avoid fragmentation while archiving heap objects.
> 1615 if (GCLocker::is_active()) {
> 1616 // This should rarely happen during -Xshare:dump, if at all, but 
> just to be safe.
> 1617 log_debug(cds)("GCLocker::is_active() ... try again");
> 1618 return;
> 1619 }
> 1620
> 1621 log_debug(cds)("Run GC ...");
> 1622 Universe::heap()->collect_as_vm_thread(GCCause::_archive_time_gc);
> 1623 log_debug(cds)("Run GC done");
> 1624 }
>
> I think the check for GCLocker should be after the collection, the G1 
> full collection will check active GC:
>
> bool G1CollectedHeap::do_full_collection(bool explicit_gc, bool 
> clear_all_soft_refs) {
>   assert_at_safepoint_on_vm_thread();
>   if (GCLocker::check_active_before_gc()) {
>     // Full GC was not completed. return false;
>   }
>
> It should be very rare for dumping that full collection aborted 
> premature, but put the checking after collection seems safer.
>
>
I copied the code from here (src/hotspot/share/services/heapDumper.cpp)

void VM_HeapDumper::doit() {
   CollectedHeap* ch = Universe::heap();

   ch->ensure_parsability(false); // must happen, even if collection does
                                  // not happen (e.g. due to GCLocker)

   if (_gc_before_heap_dump) {
     if (GCLocker::is_active()) {
       warning("GC locker is held; pre-heapdump GC was skipped");
     } else {
       ch->collect_as_vm_thread(GCCause::_heap_dump);
     }
   }

Could someone on the GC team comment whether this is the correct way to 
do it?

(I should probably add ensure_parsability() to my code as well??).

> For test GCDuringDump.java:
>
> 64 String extraArg = (i == 0) ? "-showversion" : "-javaagent:" + agentJar;
> 65 String extraOption = (i == 0) ? "-showversion" : 
> "-XX:+AllowArchivingWithJavaAgent";
> 66 String extraOption2 = (i != 2) ? "-showversion" : 
> "-Dtest.with.cleaner=true";
>
> Is that correct for line 65? for i = 0, both extraArg and extraOption 
> will be same, looks a bug.

"-showversion" is just a way to pass a "no-op" to TestCommon.testDump. 
It's a harmless option and can be repeated several times.

Thanks
- Ioi

>
> Others looks OK to me. Thanks Yumin
>
> On 6/3/20 10:53 PM, Ioi Lam wrote:
>>
>>> On 5/29/20 9:40 PM, Yumin Qi wrote:
>>>> HI, Ioi
>>>>
>>>>   If the allocation of EDEN happens between GC and dump, should we 
>>>> put the GC action in VM_PopulateDumpSharedSpace? This way, at 
>>>> safepoint there should no allocation happens. The stack trace 
>>>> showed it happened with a Java Thread, which should be blocked at 
>>>> safepoint.
>>>>
>>>
>>> Hi Yumin,
>>>
>>> I think GCs cannot be executed inside a safepoint, because some 
>>> parts of GC need to execute in a safepoint, so they will be blocked 
>>> until VM_PopulateDumpSharedSpace::doit has returned.
>>>
>>> Anyway, as I mentioned in my reply to Jiangli, there's a better way 
>>> to fix this, so I will withdraw the current patch.
>>>
>>
>> Hi Yumin,
>>
>> Actually, I changed my mind again, and implemented your suggestion :-)
>>
>> There's actually a way to invoke GC inside a safepoint (it's used by 
>> "jcmd gc.heap_dump", for example). So I changed the CDS code to do 
>> the same thing. It's a much simpler change and does what I want -- no 
>> other thread will be able to make any heap allocation after the GC 
>> has completed, so no EDEN region will be allocated:
>>
>> http://cr.openjdk.java.net/~iklam/jdk15/8245925-g1-eden-after-cds-gc.v02/ 
>>
>>
>> The check for "if (GCLocker::is_active())" should almost always be 
>> false. I left it there just for safety:
>>
>> During -Xshare:dump, we execute Java code to build the module graph, 
>> load classes, etc. So theoretically someone could try to parallelize 
>> some of that Java code in the future. Theoretically when CDS has 
>> entered the safepoint, another thread could be in the middle a JNI 
>> method that has held the GCLock.
>>
>> Thanks
>> - Ioi
>>
>>>
>>>>
>>>> On 5/29/20 7:29 PM, Ioi Lam wrote:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8245925
>>>>> http://cr.openjdk.java.net/~iklam/jdk15/8245925-g1-eden-after-cds-gc.v01/ 
>>>>>
>>>>>
>>>>>
>>>>> Summary:
>>>>>
>>>>> CDS supports archived heap objects only for G1. During -Xshare:dump,
>>>>> CDS executes a full GC so that G1 will compact the heap regions, 
>>>>> leaving
>>>>> maximum contiguous free space at the top of the heap. Then, the 
>>>>> archived
>>>>> heap regions are allocated from the top of the heap.
>>>>>
>>>>> Under some circumstances, java.lang.ref.Cleaners will execute
>>>>> after the GC has completed. The cleaners may allocate or 
>>>>> synchronized, which
>>>>> will cause G1 to allocate an EDEN region at the top of the heap.
>>>>>
>>>>> The fix is simple -- after CDS has entered a safepoint, if EDEN 
>>>>> regions exist,
>>>>> exit the safepoint, run GC, and try again. Eventually all the 
>>>>> cleaners will
>>>>> be executed and no more allocation can happen.
>>>>>
>>>>> For safety, I limit the retry count to 30 (or about total 9 seconds).
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>
>>>>> <https://bugs.openjdk.java.net/browse/JDK-8245925>
>>>>
>>>
>>
>