RFR(S): 8173743: Failures during class definition can lead to memory leaks in metaspace

Wed Feb 8 19:42:47 UTC 2017

Hi Karen,

thanks a lot for reviewing my change. Please find my comments inline:

On Tue, Feb 7, 2017 at 7:06 PM, Karen Kinnear <karen.kinnear at oracle.com> wrote:
> Volker,
>
> Thank you so much for this fix.
> The current code looks good.
>

You're welcome :)

> Thank you for the mix-and-match flag testing for UnsyncloadClass and AllowParallelDefineClass !
>
> A couple of notes:
> 1) \ I think there is also a need for a check
> inside load_instance_klass for the null loader case, i.e. after the call to find_or_define_instance_class.
>

This does indeed look like another place which needs fixing. I've
tried to come up with a test case, but didn't succeed.
I've than looked at the source code and observed the following:

- the place you mention inside load_instance_class() is only relevant
for the null (i.e. bootstrap) loader

- SystemDictionary::resolve_instance_class_or_null(), which is the
caller of load_instance_class() has the following comments and code:

    // add placeholder entry to record loading instance class
    // Five cases:
    ...
    // case4. Bootstrap classloader - don't own objectLocker
    //    This classloader supports parallelism at the classloader level,
    //    but only allows a single load of a class/classloader pair.
    //    No performance benefit and no deadlock issues.
    ...
              // case 4: bootstrap classloader: prevent futile classloading,
              // wait on first requestor
              if (class_loader.is_null()) {
                SystemDictionary_lock->wait();

So it looks to me like  find_or_define_instance_class() can not return
a different instanceKlass for the bootstrap class loader.

What do you think? Did I missed something? If yes, do you have any
idea how this error could be triggered with a test case?

Do you want me do add a check for  "defined_k == k" anyway (for any case)?

I've uploaded a new webrev which contains the changes (only for the
test) for all the suggestions I've received until now:

http://cr.openjdk.java.net/~simonis/webrevs/2017/8173743.v2/

Thanks,
Volker

> 2) Just a note - you could also get this problem if you ran into OOME e.g. adding a class to the ClassLoader vector
> in define_class. You have this covered by covering all callers of define_instance_class.
>
> 3) one clarification below
>
> thanks,
> Karen
>
>> On Feb 6, 2017, at 1:40 PM, Volker Simonis <volker.simonis at gmail.com> wrote:
>>
>> Hi,
>>
>> can somebody please review the following change:
>>
>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8173743.v1/
>> https://bugs.openjdk.java.net/browse/JDK-8173743
>>
>> It fixes some problems during class definitions where instance klasses
>> can leak into the metaspace in cases where the class definition fails
>> after the class was successfully loaded from the bytecode stream.
>>
>> There are actually two cases to consider:
>>
>> 1. If we load the same class several times into the same class loader,
>> we will get LinkageErrors after the first successful load. But
>> nevertheless we will first construct a new instanceKlass for each new
>> load attempt and this instanceKlass will never be deleted.
> Just to be sure we are saying the same thing.
> If we try to define a class several times in the same class loader, …
>
> And I’m sure you already know - we do that so that we correctly throw
> any class file format errors on the new byte stream passed in.
>>
>> 2. If we have a parallel capable class loader, set
>> -XX:-UnsyncloadClass and/or -XX:+AllowParallelDefineClass and load a
>> class from several threads at the same time in parallel, it can happen
>> that we create several instance klasses for the same class. At the end
>> only one of them will be added to the system dictionary, but all the
>> other ones will never be deleted. Notice that if we run this scenario
>> without setting either of -XX:-UnsyncloadClass or
>> -XX:+AllowParallelDefineClass, this scenario will degrade into the
>> case above and we will get LinkageErrors for all but the first
>> successful load.
>>
>> The change comes with a regression test which checks for the two cases
>> just describe and also for the failing class redefinition case, which
>> currently doesn't produce a memory leak.
>>
>> I've already committed this to the hs-demo-submit/hotspot/ forest and
>> it went through without a problem. So in theory it should have passed
>> the internal JPRT tests although I'm not sure if the test set of the
>> "demo-submit" forest and the real hotspot repo are exactly the same
>> (CC'ed Tim and Brian).
>>
>> Thank you and best regards,
>> Volker
>