RFR(XS) 8213250 CDS archive creation aborts due to metaspace object allocation failure

Jiangli Zhou jiangli.zhou at oracle.com
Sat Nov 3 05:42:48 UTC 2018


Hi Ioi,


On 11/2/18 5:27 PM, Ioi Lam wrote:
> Hi Jiangli,
>
> Thanks for the review
>
>
> On 11/2/18 1:24 PM, Jiangli Zhou wrote:
>> Hi Ioi,
>>
>> - src/hotspot/share/classfile/verifier.cpp
>>
>> You moved 
>> SystemDictionaryShared::finalize_verification_constraints(_klass) to 
>> ClassVerifier::verify_class(). We should be very careful in this 
>> area. Have you tested with cases where classes fail verification?
>
> void ClassVerifier::verify_class(TRAPS) {
>   log_info(verification)("Verifying class %s with new format", 
> _klass->external_name());
>
>   Array<Method*>* methods = _klass->methods();
>   int num_methods = methods->length();
>
>   for (int index = 0; index < num_methods; index++) {
>     // Check for recursive re-verification before each method.
>     if (was_recursively_verified())  return;
>
>     Method* m = methods->at(index);
>     if (m->is_native() || m->is_abstract() || m->is_overpass()) {
>       // If m is native or abstract, skip it.  It is checked in class 
> file
>       // parser that methods do not override a final method. Overpass 
> methods
>       // are trusted since the VM generates them.
>       continue;
>     }
>     verify_method(methodHandle(THREAD, m), CHECK_VERIFY(this));
>   }
>
>   if (DumpSharedSpaces) {
> SystemDictionaryShared::finalize_verification_constraints(_klass);
>   }
>
> If verification failed, the CHECK_VERIFY macro will force the function 
> to return. So we will call finalize_verification_constraints only when 
> verification has succeeded.

Ok, thanks for checking.

An alternative is to allocate from the RO space directly after 
relocation. That way you can avoid adding another CDS specific in the 
verifier code. There is also a FIXME for using the symbol offset in 
SharedDictionaryEntry::finalize_verification_constraints, which you 
could handle together.
>
>
>> I haven't ran your code yet, we should avoid copying the constraints 
>> for classes that fails verification at dump time.
>
> If a class has failed verification, the class will not be archived. So 
> whether we have called finalize_verification_constraints on it really 
> doesn't matter.

Since it doesn't do the copying when verification fails, it is okay in 
this case. With the work for dynamic archiving, we want to avoid 
unnecessary work at dump time, since performance/footprint will become 
higher priority with archiving at runtime.
>
>> I also need to think through any potential impact on dynamic 
>> archiving (possibly none).
>>
>> - src/hotspot/share/memory/metaspace.cpp
>>
>> -    if (is_init_completed() && !(DumpSharedSpaces && 
>> THREAD->is_VM_thread())) {
>> +    if (is_init_completed()) {
>>
>> Please also add an assert to make sure we don't allocate metaspace 
>> objects from the VM_thread at dump time now.
>>
>
> I think it's better to add a blanket assert(!THREAD->is_VM_thread()), 
> because no one should be allocating metaspace objects inside VM thread.
>
> Adding this has a wider impact, so I will do it in a separate RFE 
> (https://bugs.openjdk.java.net/browse/JDK-8213331), in case we need to 
> back it out.

I'd suggest including the assert for dump time with the fix so we are 
sure it is safe. Then you can follow up with the broader assert that you 
are planning for JDK-8213331.

Thanks,
Jiangli
>
> Thanks
> - Ioi
>> Thanks,
>>
>> Jiangli
>>
>>
>> On 11/1/18 9:47 PM, Ioi Lam wrote:
>>> Hi,
>>>
>>> I've revised the webrev to get to the the root of the problem -- we 
>>> shouldn't allocate metaspace objects from within the VM Thread!
>>>
>>> http://cr.openjdk.java.net/~iklam/jdk12/8213250-cds-dump-run-out-of-metaspace.v02/ 
>>>
>>>
>>> The original bug that caused the "!(DumpSharedSpaces && 
>>> THREAD->is_VM_thread())" check to be added in Metaspace::allocate() was
>>>
>>>     https://bugs.openjdk.java.net/browse/JDK-8196626
>>>     UseAppCDS.java crashes with "VM thread using lock Heap_lock (not 
>>> allowed to block on)"
>>>
>>> The fix in JDK-8196626 is just a band-aid solution -- avoiding the 
>>> GC. The proper fix is not to do any allocation inside the VM thread. 
>>> So instead of calling finalize_verification_constraints, which 
>>> allocates MetaspaceObjs, very late in the CDS dumping process, now 
>>> we call be as soon as a class has been verified.
>>>
>>> Thanks
>>> - Ioi
>>>
>>>
>>>
>>>
>>>
>>> On 11/1/18 11:07 AM, Ioi Lam wrote:
>>>> https://bugs.openjdk.java.net/browse/JDK-8213250
>>>> http://cr.openjdk.java.net/~iklam/jdk12/8213250-cds-dump-run-out-of-metaspace.v01/ 
>>>>
>>>>
>>>> When Metaspace::allocate() fails, usually we first try to GC to see if
>>>> some MetaspaceObjs can be freed. This is an optimization to avoid 
>>>> prematurely
>>>> expanding the metaspace.
>>>>
>>>> However, GC cannot run in the last phase of CDS which runs inside 
>>>> the VM thread.
>>>> The old code just aborts the CDS dump. The fix is to 
>>>> unconditionally expand the
>>>> metaspace without doing a GC.
>>>>
>>>> (During GC dump time, no classes are unloaded, so even if we force 
>>>> a GC, it would
>>>> not free up any metaspace anyway).
>>>>
>>>> Testing:
>>>>
>>>> Before the fix, the test case (jruby) would fail to dump once a 
>>>> couple of times. Now
>>>> it has dumped hundreds of times without failure.
>>>>
>>>> I am also running hs-tier 1/2
>>>>
>>>> Thanks
>>>> - Ioi
>>>
>>
>



More information about the hotspot-dev mailing list