RFR(XS) 8213250 CDS archive creation aborts due to metaspace object allocation failure
Ioi Lam
ioi.lam at oracle.com
Sat Nov 3 00:27:40 UTC 2018
Hi Jiangli,
Thanks for the review
On 11/2/18 1:24 PM, Jiangli Zhou wrote:
> Hi Ioi,
>
> - src/hotspot/share/classfile/verifier.cpp
>
> You moved
> SystemDictionaryShared::finalize_verification_constraints(_klass) to
> ClassVerifier::verify_class(). We should be very careful in this area.
> Have you tested with cases where classes fail verification?
void ClassVerifier::verify_class(TRAPS) {
log_info(verification)("Verifying class %s with new format",
_klass->external_name());
Array<Method*>* methods = _klass->methods();
int num_methods = methods->length();
for (int index = 0; index < num_methods; index++) {
// Check for recursive re-verification before each method.
if (was_recursively_verified()) return;
Method* m = methods->at(index);
if (m->is_native() || m->is_abstract() || m->is_overpass()) {
// If m is native or abstract, skip it. It is checked in class file
// parser that methods do not override a final method. Overpass
methods
// are trusted since the VM generates them.
continue;
}
verify_method(methodHandle(THREAD, m), CHECK_VERIFY(this));
}
if (DumpSharedSpaces) {
SystemDictionaryShared::finalize_verification_constraints(_klass);
}
If verification failed, the CHECK_VERIFY macro will force the function
to return. So we will call finalize_verification_constraints only when
verification has succeeded.
> I haven't ran your code yet, we should avoid copying the constraints
> for classes that fails verification at dump time.
If a class has failed verification, the class will not be archived. So
whether we have called finalize_verification_constraints on it really
doesn't matter.
> I also need to think through any potential impact on dynamic archiving
> (possibly none).
>
> - src/hotspot/share/memory/metaspace.cpp
>
> - if (is_init_completed() && !(DumpSharedSpaces &&
> THREAD->is_VM_thread())) {
> + if (is_init_completed()) {
>
> Please also add an assert to make sure we don't allocate metaspace
> objects from the VM_thread at dump time now.
>
I think it's better to add a blanket assert(!THREAD->is_VM_thread()),
because no one should be allocating metaspace objects inside VM thread.
Adding this has a wider impact, so I will do it in a separate RFE
(https://bugs.openjdk.java.net/browse/JDK-8213331), in case we need to
back it out.
Thanks
- Ioi
> Thanks,
>
> Jiangli
>
>
> On 11/1/18 9:47 PM, Ioi Lam wrote:
>> Hi,
>>
>> I've revised the webrev to get to the the root of the problem -- we
>> shouldn't allocate metaspace objects from within the VM Thread!
>>
>> http://cr.openjdk.java.net/~iklam/jdk12/8213250-cds-dump-run-out-of-metaspace.v02/
>>
>>
>> The original bug that caused the "!(DumpSharedSpaces &&
>> THREAD->is_VM_thread())" check to be added in Metaspace::allocate() was
>>
>> https://bugs.openjdk.java.net/browse/JDK-8196626
>> UseAppCDS.java crashes with "VM thread using lock Heap_lock (not
>> allowed to block on)"
>>
>> The fix in JDK-8196626 is just a band-aid solution -- avoiding the
>> GC. The proper fix is not to do any allocation inside the VM thread.
>> So instead of calling finalize_verification_constraints, which
>> allocates MetaspaceObjs, very late in the CDS dumping process, now we
>> call be as soon as a class has been verified.
>>
>> Thanks
>> - Ioi
>>
>>
>>
>>
>>
>> On 11/1/18 11:07 AM, Ioi Lam wrote:
>>> https://bugs.openjdk.java.net/browse/JDK-8213250
>>> http://cr.openjdk.java.net/~iklam/jdk12/8213250-cds-dump-run-out-of-metaspace.v01/
>>>
>>>
>>> When Metaspace::allocate() fails, usually we first try to GC to see if
>>> some MetaspaceObjs can be freed. This is an optimization to avoid
>>> prematurely
>>> expanding the metaspace.
>>>
>>> However, GC cannot run in the last phase of CDS which runs inside
>>> the VM thread.
>>> The old code just aborts the CDS dump. The fix is to unconditionally
>>> expand the
>>> metaspace without doing a GC.
>>>
>>> (During GC dump time, no classes are unloaded, so even if we force a
>>> GC, it would
>>> not free up any metaspace anyway).
>>>
>>> Testing:
>>>
>>> Before the fix, the test case (jruby) would fail to dump once a
>>> couple of times. Now
>>> it has dumped hundreds of times without failure.
>>>
>>> I am also running hs-tier 1/2
>>>
>>> Thanks
>>> - Ioi
>>
>
More information about the hotspot-dev
mailing list