RFR(S) 8245925 G1 allocates EDEN region after CDS has executed GC

Ioi Lam ioi.lam at oracle.com
Fri Jun 5 04:37:18 UTC 2020


Hi Jiangli,

Updated webrev is here:

http://cr.openjdk.java.net/~iklam/jdk15/8245925-g1-eden-after-cds-gc.v03/

The only difference from the previous version is this part:

1975 MutexLocker ml(THREAD, HeapShared::is_heap_object_archiving_allowed() ?
1976                Heap_lock : NULL);     // needed for collect_as_vm_thread
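
For background: HotSpot's MutexLocker treats a NULL mutex as a no-op,
so Heap_lock is taken only when heap object archiving is enabled,
which is the only case where the dump-time GC (and hence
collect_as_vm_thread) runs. A minimal sketch of the conditional-lock
pattern, with hypothetical names (some_condition, SomeLock):

    // MutexLocker skips locking when the mutex is NULL, so the lock
    // is acquired only when the condition is true.
    MutexLocker ml(THREAD, some_condition() ? SomeLock : NULL);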

Thanks
- Ioi


On 6/4/20 5:23 PM, Jiangli Zhou wrote:
> Ioi, do you have a new webrev?
>
> Best,
> Jiangli
>
> On Thu, Jun 4, 2020 at 4:54 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>
>>
>> On 6/4/20 12:04 PM, Jiangli Zhou wrote:
>>> On Wed, Jun 3, 2020 at 10:56 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>> On 5/29/20 9:40 PM, Yumin Qi wrote:
>>>>>> HI, Ioi
>>>>>>
>>>>>>     If the allocation of the EDEN region happens between the GC and
>>>>>> the dump, should we put the GC action in VM_PopulateDumpSharedSpace?
>>>>>> That way, no allocation should happen at the safepoint. The stack
>>>>>> trace showed the allocation happened in a Java thread, which should
>>>>>> be blocked at the safepoint.
>>>>>>
>>>>> Hi Yumin,
>>>>>
>>>>> I think a GC cannot normally be executed inside a safepoint, because
>>>>> parts of the GC themselves need to execute in a safepoint, so they
>>>>> would be blocked until VM_PopulateDumpSharedSpace::doit has returned.
>>>>>
>>>>> Anyway, as I mentioned in my reply to Jiangli, there's a better way to
>>>>> fix this, so I will withdraw the current patch.
>>>>>
>>>> Hi Yumin,
>>>>
>>>> Actually, I changed my mind again, and implemented your suggestion :-)
>>>>
>>>> There's actually a way to invoke a GC inside a safepoint (it's used by
>>>> "jcmd GC.heap_dump", for example). So I changed the CDS code to do the
>>>> same thing. It's a much simpler change and does what I want -- no other
>>>> thread will be able to make any heap allocation after the GC has
>>>> completed, so no EDEN region will be allocated:
>>>>
>>>> http://cr.openjdk.java.net/~iklam/jdk15/8245925-g1-eden-after-cds-gc.v02/
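>>>>
>>>> For reference, the mechanism is the collect_as_vm_thread call quoted
>>>> further down in this thread. A rough sketch of how it is invoked from
>>>> inside the VM operation (the exact placement is in the v02 webrev):
>>>>
>>>>     void VM_PopulateDumpSharedSpace::doit() {
>>>>       // Already at a safepoint, in the VM thread, so a full GC can
>>>>       // be requested directly instead of scheduling a separate VM
>>>>       // operation:
>>>>       Universe::heap()->collect_as_vm_thread(GCCause::_archive_time_gc);
>>>>       ...
>>>>     }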
>>> Going with the simple approach for the short term sounds ok.
>>>
>>> 1612 void VM_PopulateDumpSharedSpace::doit() {
>>> ...
>>> 1615     if (GCLocker::is_active()) {
>>> 1616       // This should rarely happen during -Xshare:dump, if at all, but just to be safe.
>>> 1617       log_debug(cds)("GCLocker::is_active() ... try again");
>>> 1618       return;
>>> 1619     }
>>>
>>> MetaspaceShared::preload_and_dump()
>>> ...
>>> 1945     while (true) {
>>> 1946       {
>>> 1947         MutexLocker ml(THREAD, Heap_lock);
>>> 1948         VMThread::execute(&op);
>>> 1949       }
>>> 1950       // If dumping has finished, the VM would have exited. The only reason to
>>> 1951       // come back here is to wait for the GCLocker.
>>> 1952       assert(HeapShared::is_heap_object_archiving_allowed(), "sanity");
>>> 1953       os::naked_short_sleep(1);
>>> 1954     }
>>> 1955   }
>>>
>>>
>>> Instead of doing the while/retry, wouldn't calling
>>> GCLocker::stall_until_clear() in MetaspaceShared::preload_and_dump
>>> before VM_PopulateDumpSharedSpace be much cleaner?
>> Hi Jiangli, I tried your suggestion, but GCLocker::stall_until_clear()
>> cannot be called in the VM thread:
>>
>> #  assert(thread->is_Java_thread()) failed: just checking
>>
>>> Please also add a comment in MetaspaceShared::preload_and_dump to
>>> explain that Universe::heap()->collect_as_vm_thread expects that
>>> Heap_lock is already held by the thread, and that's the reason for the
>>> call to MutexLocker ml(THREAD, Heap_lock).
>> I changed the code to be like this:
>>
>>       MutexLocker ml(THREAD, HeapShared::is_heap_object_archiving_allowed() ?
>>                      Heap_lock : NULL);     // needed for collect_as_vm_thread
>>
>>> Do you also need to call
>>> Universe::heap()->soft_ref_policy()->set_should_clear_all_soft_refs(true)
>>> before GC?
>> There's no need:
>>
>> void CollectedHeap::collect_as_vm_thread(GCCause::Cause cause) {
>>        ...
>>       case GCCause::_archive_time_gc:
>>       case GCCause::_metadata_GC_clear_soft_refs: {
>>         HandleMark hm;
>>         do_full_collection(true);         // do clear all soft refs
>>
>> Thanks
>> - Ioi
>>
>>> Best,
>>> Jiangli
>>>
>>>> The check for "if (GCLocker::is_active())" should almost always be
>>>> false. I left it there just for safety:
>>>>
>>>> During -Xshare:dump, we execute Java code to build the module graph,
>>>> load classes, etc. So theoretically someone could try to parallelize
>>>> some of that Java code in the future. In that case, when CDS has
>>>> entered the safepoint, another thread could be in the middle of a JNI
>>>> method that holds the GCLocker.
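>>>>
>>>> For context, the GCLocker is held while a thread is inside a JNI
>>>> critical region, e.g. between GetPrimitiveArrayCritical and
>>>> ReleasePrimitiveArrayCritical -- a sketch:
>>>>
>>>>     // Between these two JNI calls the thread holds the GCLocker,
>>>>     // which prevents a GC from starting.
>>>>     jbyte* p = (jbyte*) env->GetPrimitiveArrayCritical(array, NULL);
>>>>     // ... use p without calling back into the JVM ...
>>>>     env->ReleasePrimitiveArrayCritical(array, p, 0);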
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>>>> On 5/29/20 7:29 PM, Ioi Lam wrote:
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8245925
>>>>>>> http://cr.openjdk.java.net/~iklam/jdk15/8245925-g1-eden-after-cds-gc.v01/
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Summary:
>>>>>>>
>>>>>>> CDS supports archived heap objects only for G1. During -Xshare:dump,
>>>>>>> CDS executes a full GC so that G1 will compact the heap regions,
>>>>>>> leaving maximum contiguous free space at the top of the heap. Then,
>>>>>>> the archived heap regions are allocated from the top of the heap.
>>>>>>>
>>>>>>> Under some circumstances, java.lang.ref.Cleaners will execute
>>>>>>> after the GC has completed. The cleaners may allocate or
>>>>>>> synchronize, which will cause G1 to allocate an EDEN region at
>>>>>>> the top of the heap.
>>>>>>>
>>>>>>> The fix is simple -- after CDS has entered a safepoint, if EDEN
>>>>>>> regions exist, exit the safepoint, run GC, and try again.
>>>>>>> Eventually all the cleaners will be executed and no more
>>>>>>> allocation can happen.
>>>>>>>
>>>>>>> For safety, I limit the retry count to 30 (about 9 seconds total).
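>>>>>>>
>>>>>>> Roughly, the retry logic looks like this (hypothetical helper
>>>>>>> names; the actual code is in the v01 webrev):
>>>>>>>
>>>>>>>     // Retry up to 30 times: run a full GC, then attempt the dump
>>>>>>>     // at a safepoint; abandon the attempt if EDEN regions exist.
>>>>>>>     for (int attempts = 0; attempts < 30; attempts++) {
>>>>>>>       run_full_gc();                  // hypothetical helper
>>>>>>>       if (try_dump_at_safepoint()) {  // hypothetical; false if EDEN regions exist
>>>>>>>         break;
>>>>>>>       }
>>>>>>>     }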
>>>>>>>
>>>>>>> Thanks
>>>>>>> - Ioi
>>>>>>>
>>>>>>>


