RFR 8243572: Multiple tests fail with assert(cld->klasses() != 0LL) failed: unexpected NULL for cld->klasses()

David Holmes david.holmes at oracle.com
Wed Apr 29 23:38:34 UTC 2020


Hi Harold,

On 29/04/2020 11:40 pm, Harold Seigel wrote:
> Hi David,
> 
> Please see inline comments.

Thanks for the explanations.

David

> On 4/29/2020 12:01 AM, David Holmes wrote:
>> Hi Harold,
>>
>> Not a review ...
>>
>> On 29/04/2020 3:27 am, Harold Seigel wrote:
>>> Hi,
>>>
>>> Please review this fix for JDK-8243572 and JDK-8243336.  Both 
>>> failures were caused by calling function ClassLoaderData::klasses() 
>>> and expecting a non-null return value. However, when the CLD had no 
>>> classes then NULL was returned causing the assertion failure in one 
>>> case and SIGSEGV in the other.
>>>
>>> Function klasses() was called in these places to determine if the 
>>> ClassLoaderData was for a hidden class or an unsafe anonymous class. 
>>> This was done during CLD statistics collection and for JFR events 
>>> involving CLDs.
>>>
>>> Since the JDK has replaced uses of unsafe anonymous classes with 
>>> hidden classes, there should be very few unsafe anonymous classes. 
>>> So, it was decided (with mgronlun and mchung) that the VM and JFR 
>>> need no longer distinguish between hidden and unsafe anonymous 
>>> classes when gathering CLD statistics and when CLDs are displayed in 
>>> JFR events.  Instead, unsafe anonymous classes will be counted as 
>>> hidden classes for CLD statistics, and JFR will show CLDs for both 
>>> hidden and unsafe anonymous classes as hidden.
>>
>> Okay, but putting aside the decision to no longer distinguish between 
>> old VM unsafe anonymous classes and new hidden classes, what was the 
>> actual source of the failure here? The CLD has no classes, but what 
>> code was expecting to find classes and why? Was it just an oversight 
>> with the new hidden classes code?
> 
> It's a bug in the new hidden classes code.  The problem is in two 
> different places, in ClassLoaderStatsClosure::do_cld() and in 
> MetaspaceTracer::send_allocation_failure_event().  Both functions call 
> ClassLoaderData::klasses() and expect the result to be non-null.  The 
> former case gets a SIGSEGV because it dereferences the result.  The 
> latter case asserts if the result is null.  These calls to 
> ClassLoaderData::klasses() were added as part of the hidden classes 
> implementation.
> 
> The SIGSEGV occurs when the VM is creating a hidden or unsafe anonymous 
> class.  It creates a special ClassLoaderData and adds it to the 
> ClassLoaderDataGraph before loading the class.  If statistics collection 
> occurs between the time it creates the CLD and loads the class, then the 
> CLD will not have any classes and the new call to 
> ClassLoaderData::klasses() will return NULL, eventually causing the 
> SIGSEGV.
> 
> The MetaspaceTracer::send_allocation_failure_event() assert happens when 
> classFileParser has created the special ClassLoaderData for the hidden 
> class but then runs out of metaspace when trying to load the hidden 
> class.  If JFR is enabled then the JVM tries to create an allocation 
> failure event by calling send_allocation_failure_event(), causing the 
> assertion failure.
> 
> I hope this is helpful.
> 
> Thanks, Harold
> 
>>
>> Thanks,
>> David
>>
>>> Open Webrev: 
>>> http://cr.openjdk.java.net/~hseigel/bug_8243572/webrev/index.html
>>>
>>> JBS Bugs: https://bugs.openjdk.java.net/browse/JDK-8243572 and 
>>> https://bugs.openjdk.java.net/browse/JDK-8243336
>>>
>>> The fix was regression tested by running Mach5 tiers 1 and 2 tests 
>>> and builds on Linux-x64, Solaris, Windows, and Mac OS X, by running 
>>> Mach5 tiers 3-5 tests on Linux-x64, and running tier 7 tests multiple 
>>> times on Windows and also on Mac OS X.  Tier 7 testing on Linux-x64 
>>> is in progress.
>>>
>>> Thanks, Harold
>>>
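The failure mode described in the thread (a loader-data object that is already in the graph but has no classes yet) can be illustrated with a small standalone sketch. This is not HotSpot code and not the contents of the webrev: LoaderData, Klass, Stats, for_hidden_or_anonymous and collect() are hypothetical stand-ins, and deciding from a flag set when the loader data is created is only one plausible way to avoid consulting the class list.

```cpp
#include <cassert>
#include <cstddef>
#include <vector>

// Simplified stand-ins, NOT the HotSpot types. All names are hypothetical.
struct Klass {
    bool is_hidden;
    Klass* next;
};

struct LoaderData {
    Klass* klasses_head = nullptr;         // stays null until the first class is loaded
    bool for_hidden_or_anonymous = false;  // assumed to be known at creation time

    Klass* klasses() const { return klasses_head; }
};

struct Stats {
    std::size_t hidden_loaders = 0;
    std::size_t regular_loaders = 0;
};

// Fragile: asks the first class, so it dereferences null when the loader
// data has been created but its class has not been loaded yet.
// Shown for contrast only; never called on an empty loader here.
inline bool is_hidden_fragile(const LoaderData& cld) {
    return cld.klasses()->is_hidden;
}

// Robust: relies on a property of the loader data itself, so it gives an
// answer whether or not any class has been loaded.
inline bool is_hidden_robust(const LoaderData& cld) {
    return cld.for_hidden_or_anonymous;
}

// Statistics pass over the whole graph, tolerant of empty loader data.
inline Stats collect(const std::vector<LoaderData>& graph) {
    Stats s;
    for (const LoaderData& cld : graph) {
        if (is_hidden_robust(cld)) {
            ++s.hidden_loaders;
        } else {
            ++s.regular_loaders;
        }
    }
    return s;
}
```

Calling collect() on a graph that contains an empty LoaderData marked for_hidden_or_anonymous counts it as hidden without touching klasses(), which mirrors the window between creating the CLD and loading its class.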


More information about the hotspot-runtime-dev mailing list