RFR(S): 8251860: ClassLoaderData::loaded_classes_do fails with "assert(ZAddress::is_marked(addr)) failed: Should be marked"

calvin.cheung at oracle.com calvin.cheung at oracle.com
Mon Aug 31 23:34:27 UTC 2020


Hi Coleen,

Thanks for taking another look.

Calvin

On 8/31/20 3:22 PM, Coleen Phillimore wrote:
>
>
> On 8/31/20 11:57 AM, calvin.cheung at oracle.com wrote:
>> Hi Ioi,
>>
>> Thanks for your review!
>>
>> On 8/28/20 3:54 PM, Ioi Lam wrote:
>>> Hi Calvin,
>>>
>>>  300 void ClassLoaderData::inc_keep_alive() {
>>>  301   if (has_class_mirror_holder()) {
>>>  302     if (!Arguments::is_dumping_archive()) {
>>>  303       assert(_keep_alive > 0, "Invalid keep alive increment 
>>> count");
>>>  304     }
>>>  305     _keep_alive++;
>>>  306   }
>>>  307 }
>>>
>>> Why is 302 needed? I think in CollectCLDClosure, you wouldn't 
>>> collect any CLD whose _keep_alive count is zero.
>>
>> The (!cld->is_unloading()) doesn't mean that the _keep_alive count is 
>> non-zero.
>>
>> We currently have a few tests which would hit the assert if line 302 
>> is not there. One of them is the appcds/dynamicArchive/LinkClassTest. 
>> While the "Parent x = new Child()" is being verified during linking, 
>> the VerificationType::resolve_and_check_assignability will be called. 
>> This will eventually call the SystemDictionary::register_loader but 
>> it will not add a class_loader to the ClassLoaderDataGraph. Prior to 
>> this patch, the _keep_alive will be incremented at 2 places:
>>
>> 1) java_lang_Class::set_mirror_module_field
>>
>>    In this case, the _keep_alive will be incremented during vm startup.
>>
>> 2) ClassLoaderData::ClassLoaderData
>>
>>    _keep_alive((has_class_mirror_holder || h_class_loader.is_null()) 
>> ? 1 : 0),
>>
>> Neither of the above code path will be taken for the assignability 
>> check.
>>
>>>
>>> I think MetaspaceShared::link_and_cleanup_shared_classes() can be 
>>> simplified so you don't need to scan the CLDs in two passes.
>>>
>>> 1393  while (true) {
>>>         bool has_linked = false;
>>>         for (int i = 0; i < _loaded_cld->length(); i++) {
>>>           ClassLoaderData* cld = _loaded_cld->at(i);
>>>           for (Klass* klass = cld->klasses(); klass != NULL; klass = 
>>> klass->next_link()) {
>>>             if (klass->is_instance_klass()) {
>>>               InstanceKlass* ik = InstanceKlass::cast(ik);
>>>               if (inking_required(ik)) {
>>>                 has_linked = link_class_for_cds(ik, THREAD);
>>>               }
>>>             }
>>>           }
>>>         }
>>>
>>>         if (!has_linked) {
>>>           break;
>>>         }
>>>         // Class linking includes verification which may load more 
>>> classes.
>>>         // Keep scanning until we have linked no more classes.
>>>       }
>>>
>> I think the has_linked assignment should be "has_linked |= 
>> link_class_for_cds(ik, THREAD)" ?
>>
>> The above single loop is cleaner but I think it is less efficient. 
>> For the simple case if there's no linking required after the first 
>> loop. It will need to do another full loop to check if all classes 
>> don't need to be linked in order to exit the loop. With the 2 passes 
>> approach, it only needs to inspect the first element of the _klasses 
>> list of each CLD to determine if we can exit the loop. I'm good with 
>> the single loop since CDS dump isn't too performance critical and it 
>> makes the code easier to read.
>>
>>>
>>> Also link_class_for_cds should return the value of 
>>> MetaspaceShared::try_link_class(), so it has the same behavior as 
>>> before.
>>
>> Fixed.
>>
>> Here's an updated webrev:
>>
>>     http://cr.openjdk.java.net/~ccheung/jdk16/8251860/webrev.02/
>
> This looks good!
> Coleen
>>
>> thanks,
>> Calvin
>>>
>>> The rest of the changes looks good.
>>>
>>> Thanks
>>> - Ioi
>>>
>>> On 8/28/20 10:42 AM, calvin.cheung at oracle.com wrote:
>>>> Hi Coleen,
>>>>
>>>> Thanks for your review.
>>>>
>>>> I've re-implemented the fix with your suggestions.
>>>>
>>>> Here's an updated webrev:
>>>>
>>>> http://cr.openjdk.java.net/~ccheung/jdk16/8251860/webrev.01/
>>>>
>>>> On 8/26/20 3:24 PM, Coleen Phillimore wrote:
>>>>>
>>>>> Hi Calvin,
>>>>>
>>>>> I think you should change this to call 
>>>>> ClassLoaderDataGraph::loaded_cld_do(&collect_cld) and collect 
>>>>> ClassLoaderData in the GrowableArray, since you then walk through 
>>>>> them again to link the classes in the class loader data.
>>>> With the ClassLoaderDataGraph::loaded_cld_do approach, some klasses 
>>>> in the _klasses list may already in the link state. So I have to 
>>>> account for that and not link those klasses again. It complicates 
>>>> the fix a little bit.
>>>>>
>>>>> Can linking one klass cause a new class to be added to a _klasses 
>>>>> list of a CLD that you've already walked?  It seems like you need 
>>>>> the do() loop in the current code wrapped around walking the clds, 
>>>>> then link the classes until no progress. You can fast exit when 
>>>>> you find a klass linked, I believe, since the classes are added to 
>>>>> the front of the _klasses list.
>>>> Yes, it is possible. The updated patch will check if the first 
>>>> klass in the _klass list of each CLD hasn't been linked, then it 
>>>> will loop back again to go through all the CLD's _klasses lists.
>>>>>
>>>>> Also with this change, you can remove locked_loaded_classes_do() 
>>>>> because it was added for CDS and apparently the comment is wrong.
>>>>>
>>>>> // This case can block but cannot do unloading (called from CDS)
>>>>> void 
>>>>> ClassLoaderDataGraph::unlocked_loaded_classes_do(KlassClosure* 
>>>>> klass_closure) {
>>>>>   for (ClassLoaderData* cld = _head; cld != NULL; cld = 
>>>>> cld->next()) {
>>>>>     cld->loaded_classes_do(klass_closure);
>>>>>   }
>>>>> }
>>>>
>>>> I've removed the above.
>>>>
>>>> thanks,
>>>>
>>>> Calvin
>>>>
>>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>> On 8/26/20 5:35 PM, calvin.cheung at oracle.com wrote:
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8251860
>>>>>>
>>>>>> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8251860/webrev.00/
>>>>>>
>>>>>> Please refer to the bug report for an analysis of the crash.
>>>>>>
>>>>>> The proposed fix is to acquire the ClassLoaderDataGraph_lock 
>>>>>> before calling 
>>>>>> ClassLoaderDataGraph::loaded_classes_do(&link_closure). The 
>>>>>> link_closure will not do the actual linking but just to store the 
>>>>>> InstanceKlass'es in an array and increment the keep_alive counter 
>>>>>> of the class_loader_data of each InstanceKlass. After the call to 
>>>>>> ClassLoaderDataGraph::loaded_classes_do, each InstanceKlass in 
>>>>>> the array will be linked and the keep_alive counter of the 
>>>>>> class_loader_data will be decremented.
>>>>>>
>>>>>> Testing:
>>>>>>     - tested about 40 times on the MacOS host where the crash was 
>>>>>> seen
>>>>>>        (before the fix, the crash was seen about once in 20 runs)
>>>>>>     - Tier1 - 4.
>>>>>> thanks,
>>>>>> Calvin
>>>>>
>>>
>


More information about the hotspot-runtime-dev mailing list