RFR(XS) 8213250 CDS archive creation aborts due to metaspace object allocation failure

Ioi Lam ioi.lam at oracle.com
Sat Nov 3 00:27:40 UTC 2018


Hi Jiangli,

Thanks for the review


On 11/2/18 1:24 PM, Jiangli Zhou wrote:
> Hi Ioi,
>
> - src/hotspot/share/classfile/verifier.cpp
>
> You moved 
> SystemDictionaryShared::finalize_verification_constraints(_klass) to 
> ClassVerifier::verify_class(). We should be very careful in this area. 
> Have you tested with cases where classes fail verification?

void ClassVerifier::verify_class(TRAPS) {
   log_info(verification)("Verifying class %s with new format", 
_klass->external_name());

   Array<Method*>* methods = _klass->methods();
   int num_methods = methods->length();

   for (int index = 0; index < num_methods; index++) {
     // Check for recursive re-verification before each method.
     if (was_recursively_verified())  return;

     Method* m = methods->at(index);
     if (m->is_native() || m->is_abstract() || m->is_overpass()) {
       // If m is native or abstract, skip it.  It is checked in class file
       // parser that methods do not override a final method. Overpass 
methods
       // are trusted since the VM generates them.
       continue;
     }
     verify_method(methodHandle(THREAD, m), CHECK_VERIFY(this));
   }

   if (DumpSharedSpaces) {
SystemDictionaryShared::finalize_verification_constraints(_klass);
   }

If verification failed, the CHECK_VERIFY macro will force the function 
to return. So we will call finalize_verification_constraints only when 
verification has succeeded.


> I haven't ran your code yet, we should avoid copying the constraints 
> for classes that fails verification at dump time.

If a class has failed verification, the class will not be archived. So 
whether we have called finalize_verification_constraints on it really 
doesn't matter.

> I also need to think through any potential impact on dynamic archiving 
> (possibly none).
>
> - src/hotspot/share/memory/metaspace.cpp
>
> -    if (is_init_completed() && !(DumpSharedSpaces && 
> THREAD->is_VM_thread())) {
> +    if (is_init_completed()) {
>
> Please also add an assert to make sure we don't allocate metaspace 
> objects from the VM_thread at dump time now.
>

I think it's better to add a blanket assert(!THREAD->is_VM_thread()), 
because no one should be allocating metaspace objects inside VM thread.

Adding this has a wider impact, so I will do it in a separate RFE 
(https://bugs.openjdk.java.net/browse/JDK-8213331), in case we need to 
back it out.

Thanks
- Ioi
> Thanks,
>
> Jiangli
>
>
> On 11/1/18 9:47 PM, Ioi Lam wrote:
>> Hi,
>>
>> I've revised the webrev to get to the the root of the problem -- we 
>> shouldn't allocate metaspace objects from within the VM Thread!
>>
>> http://cr.openjdk.java.net/~iklam/jdk12/8213250-cds-dump-run-out-of-metaspace.v02/ 
>>
>>
>> The original bug that caused the "!(DumpSharedSpaces && 
>> THREAD->is_VM_thread())" check to be added in Metaspace::allocate() was
>>
>>     https://bugs.openjdk.java.net/browse/JDK-8196626
>>     UseAppCDS.java crashes with "VM thread using lock Heap_lock (not 
>> allowed to block on)"
>>
>> The fix in JDK-8196626 is just a band-aid solution -- avoiding the 
>> GC. The proper fix is not to do any allocation inside the VM thread. 
>> So instead of calling finalize_verification_constraints, which 
>> allocates MetaspaceObjs, very late in the CDS dumping process, now we 
>> call be as soon as a class has been verified.
>>
>> Thanks
>> - Ioi
>>
>>
>>
>>
>>
>> On 11/1/18 11:07 AM, Ioi Lam wrote:
>>> https://bugs.openjdk.java.net/browse/JDK-8213250
>>> http://cr.openjdk.java.net/~iklam/jdk12/8213250-cds-dump-run-out-of-metaspace.v01/ 
>>>
>>>
>>> When Metaspace::allocate() fails, usually we first try to GC to see if
>>> some MetaspaceObjs can be freed. This is an optimization to avoid 
>>> prematurely
>>> expanding the metaspace.
>>>
>>> However, GC cannot run in the last phase of CDS which runs inside 
>>> the VM thread.
>>> The old code just aborts the CDS dump. The fix is to unconditionally 
>>> expand the
>>> metaspace without doing a GC.
>>>
>>> (During GC dump time, no classes are unloaded, so even if we force a 
>>> GC, it would
>>> not free up any metaspace anyway).
>>>
>>> Testing:
>>>
>>> Before the fix, the test case (jruby) would fail to dump once a 
>>> couple of times. Now
>>> it has dumped hundreds of times without failure.
>>>
>>> I am also running hs-tier 1/2
>>>
>>> Thanks
>>> - Ioi
>>
>



More information about the hotspot-dev mailing list