RFR(S): 8173743: Failures during class definition can lead to memory leaks in metaspace

David Holmes david.holmes at oracle.com
Thu Feb 9 03:00:17 UTC 2017


Hi Volker,

On 9/02/2017 5:41 AM, Volker Simonis wrote:
> Hi David,
>
> thanks a lot for you're review. I think I've addressed all your
> points. Please see my answers inline for more details.

Yes all points addressed - thanks. A couple of further nits (no need to 
see updated webrev):

  235         catch (Exception e) {
  236             System.out.println(e);
  237             System.out.println(e.getCause());
  238             e.getCause().printStackTrace();
  239             throw new RuntimeException("Test failed because we 
can't read the class statistics!");
  240         }

Not sure why you print the stacktrace for the cause instead of the 
primary exception that you caught? e.printStackTrace() will also print 
its cause's stacktrace. Further, I think you should set 'e' as the cause 
of the new RuntimeException that you are throwing. If you do that I 
don't think you really need to print anything here as it will all be 
printed when the new exception causes the test to fail.

  368                     throw new RuntimeException("Class redifinition 
isn't allowed to change method names!");

typo: redifinition

Thanks,
David

> On Tue, Feb 7, 2017 at 6:06 AM, David Holmes <david.holmes at oracle.com> wrote:
>> Hi Volker,
>>
>> On 7/02/2017 4:40 AM, Volker Simonis wrote:
>>>
>>> Hi,
>>>
>>> can somebody please review the following change:
>>>
>>> http://cr.openjdk.java.net/~simonis/webrevs/2017/8173743.v1/
>>> https://bugs.openjdk.java.net/browse/JDK-8173743
>>
>>
>> Functional change looks good - took me a while to work through the
>> permutations. :)
>>
>> Have a few comments on the test, mostly nits.
>>
>> I believe an Oracle copyright line is also required.
>>
>
> This is a common misunderstanding :)
> It is not, and there are plenty examples where we already have only
> non-Oracle copyrights:
>
> $ grep -r -l "SAP" test/ | xargs grep -L "Oracle and" | wc -l
> 24
>
> The general rule of thumb is to have both, Oracle and SAP copyrights
> if a file was derived from another file in the source base which
> already has the Oracle copyright. That's why most of the files in our
> ports have both copyrights. On the other hand, if we contribute a new
> file which we've written from scratch, a single copyright notice is
> fine.
>
>> linkageErrors is not incremented in a thread-safe way, and should also be
>> volatile. I think you are just using this as a rough count, but if so
>> document that and at least declare it volatile so it is reported with a
>> current value if needed.
>>
>
> You're right. I just used it for debugging purpose, but once it's
> there, it should be correct. I made 'linkageErrors' volatile and
> 'incrementLinkageErrors()' synchronized.
>
>> 256             // executing and one another version
>>
>> Either: "one other version" or "another version" (occurs elsewhere)
>>
>
> Fixed.
>
>>  283                 catch (java.lang.SecurityException jlse) {
>>  284                     // Expected
>>  285                 }
>>
>> If the SecurityException is always expected then should not absence of it be
>> reported as an error?
>>
>
> Fixed.
>
>> 288             // We expect to have two instances of DefineClass here: the
>> initial version in which we are
>> ...
>> 291             System.out.println("Should have 1 DefineClass instances and
>> we have: " + count);
>>
>> Comment and code do not match: do we expect one instance or two?
>>
>>  292             System.gc();
>>  293             System.out.println("System.gc()");
>>  294             count = getClassStats("DefineClass");
>>  295             // At least after System.gc() the failed loading attempts
>> should leave no instances around!
>>
>
> Good catch. This was the typical copy-and paste error :)
> Fixed.
>
>> Are we guaranteed, for all GC's, that a single call to System.gc(), will
>> clear the unwanted instances? (This applies to all test cases.)
>>
>
> I think the call isn't actually needed because I've just realized that
> the diagnostic command which we use to get the class statistics (i.e.
> ClassStatsDCmd) already triggers a full GC on every invocation (by
> instantiating a VM_GC_HeapInspection instance with request_full_gc =
> true). But I think it's OK to leave the call there to 'document' this
> in the Java sources from where this is not directly evident.
>
>>  353             // kept alive by this main methidf
>>
>> Typo: methidf
>>
>
> Fixed.
>
>> 367                 catch (UnsupportedOperationException uoe) {
>> 368                     // Expected
>> 369                 }
>>
>> Ditto re not getting the exception - should this report an error?
>>
>
> Fixed.
>
>> Thanks,
>> David
>
> I've uploaded a new webrev which contains the changes (only for the
> test) for all suggestions I've received until now:
>
> http://cr.openjdk.java.net/~simonis/webrevs/2017/8173743.v2/
>
> Thanks,
> Volker
>> ------
>>
>>
>>>
>>> It fixes some problems during class definitions where instance klasses
>>> can leak into the metaspace in cases where the class definition fails
>>> after the class was successfully loaded from the bytecode stream.
>>>
>>> There are actually two cases to consider:
>>>
>>> 1. If we load the same class several times into the same class loader,
>>> we will get LinkageErrors after the first successful load. But
>>> nevertheless we will first construct a new instanceKlass for each new
>>> load attempt and this instanceKlass will never be deleted.
>>>
>>> 2. If we have a parallel capable class loader, set
>>> -XX:-UnsyncloadClass and/or -XX:+AllowParallelDefineClass and load a
>>> class from several threads at the same time in parallel, it can happen
>>> that we create several instance klasses for the same class. At the end
>>> only one of them will be added to the system dictionary, but all the
>>> other ones will never be deleted. Notice that if we run this scenario
>>> without setting either of -XX:-UnsyncloadClass or
>>> -XX:+AllowParallelDefineClass, this scenario will degrade into the
>>> case above and we will get LinkageErrors for all but the first
>>> successful load.
>>>
>>> The change comes with a regression test which checks for the two cases
>>> just describe and also for the failing class redefinition case, which
>>> currently doesn't produce a memory leak.
>>>
>>> I've already committed this to the hs-demo-submit/hotspot/ forest and
>>> it went through without a problem. So in theory it should have passed
>>> the internal JPRT tests although I'm not sure if the test set of the
>>> "demo-submit" forest and the real hotspot repo are exactly the same
>>> (CC'ed Tim and Brian).
>>>
>>> Thank you and best regards,
>>> Volker
>>>
>>


More information about the hotspot-runtime-dev mailing list