RFR(S): 8251860: ClassLoaderData::loaded_classes_do fails with "assert(ZAddress::is_marked(addr)) failed: Should be marked"
Coleen Phillimore
coleen.phillimore at oracle.com
Mon Aug 31 22:22:42 UTC 2020
On 8/31/20 11:57 AM, calvin.cheung at oracle.com wrote:
> Hi Ioi,
>
> Thanks for your review!
>
> On 8/28/20 3:54 PM, Ioi Lam wrote:
>> Hi Calvin,
>>
>> 300 void ClassLoaderData::inc_keep_alive() {
>> 301 if (has_class_mirror_holder()) {
>> 302 if (!Arguments::is_dumping_archive()) {
>> 303 assert(_keep_alive > 0, "Invalid keep alive increment
>> count");
>> 304 }
>> 305 _keep_alive++;
>> 306 }
>> 307 }
>>
>> Why is 302 needed? I think in CollectCLDClosure, you wouldn't collect
>> any CLD whose _keep_alive count is zero.
>
> The (!cld->is_unloading()) doesn't mean that the _keep_alive count is
> non-zero.
>
> We currently have a few tests which would hit the assert if line 302
> is not there. One of them is the appcds/dynamicArchive/LinkClassTest.
> While the "Parent x = new Child()" is being verified during linking,
> the VerificationType::resolve_and_check_assignability will be called.
> This will eventually call the SystemDictionary::register_loader but it
> will not add a class_loader to the ClassLoaderDataGraph. Prior to this
> patch, the _keep_alive will be incremented at 2 places:
>
> 1) java_lang_Class::set_mirror_module_field
>
> In this case, the _keep_alive will be incremented during vm startup.
>
> 2) ClassLoaderData::ClassLoaderData
>
> _keep_alive((has_class_mirror_holder || h_class_loader.is_null()) ?
> 1 : 0),
>
> Neither of the above code path will be taken for the assignability check.
>
>>
>> I think MetaspaceShared::link_and_cleanup_shared_classes() can be
>> simplified so you don't need to scan the CLDs in two passes.
>>
>> 1393 while (true) {
>> bool has_linked = false;
>> for (int i = 0; i < _loaded_cld->length(); i++) {
>> ClassLoaderData* cld = _loaded_cld->at(i);
>> for (Klass* klass = cld->klasses(); klass != NULL; klass =
>> klass->next_link()) {
>> if (klass->is_instance_klass()) {
>> InstanceKlass* ik = InstanceKlass::cast(ik);
>> if (inking_required(ik)) {
>> has_linked = link_class_for_cds(ik, THREAD);
>> }
>> }
>> }
>> }
>>
>> if (!has_linked) {
>> break;
>> }
>> // Class linking includes verification which may load more
>> classes.
>> // Keep scanning until we have linked no more classes.
>> }
>>
> I think the has_linked assignment should be "has_linked |=
> link_class_for_cds(ik, THREAD)" ?
>
> The above single loop is cleaner but I think it is less efficient. For
> the simple case if there's no linking required after the first loop.
> It will need to do another full loop to check if all classes don't
> need to be linked in order to exit the loop. With the 2 passes
> approach, it only needs to inspect the first element of the _klasses
> list of each CLD to determine if we can exit the loop. I'm good with
> the single loop since CDS dump isn't too performance critical and it
> makes the code easier to read.
>
>>
>> Also link_class_for_cds should return the value of
>> MetaspaceShared::try_link_class(), so it has the same behavior as
>> before.
>
> Fixed.
>
> Here's an updated webrev:
>
> http://cr.openjdk.java.net/~ccheung/jdk16/8251860/webrev.02/
This looks good!
Coleen
>
> thanks,
> Calvin
>>
>> The rest of the changes looks good.
>>
>> Thanks
>> - Ioi
>>
>> On 8/28/20 10:42 AM, calvin.cheung at oracle.com wrote:
>>> Hi Coleen,
>>>
>>> Thanks for your review.
>>>
>>> I've re-implemented the fix with your suggestions.
>>>
>>> Here's an updated webrev:
>>>
>>> http://cr.openjdk.java.net/~ccheung/jdk16/8251860/webrev.01/
>>>
>>> On 8/26/20 3:24 PM, Coleen Phillimore wrote:
>>>>
>>>> Hi Calvin,
>>>>
>>>> I think you should change this to call
>>>> ClassLoaderDataGraph::loaded_cld_do(&collect_cld) and collect
>>>> ClassLoaderData in the GrowableArray, since you then walk through
>>>> them again to link the classes in the class loader data.
>>> With the ClassLoaderDataGraph::loaded_cld_do approach, some klasses
>>> in the _klasses list may already in the link state. So I have to
>>> account for that and not link those klasses again. It complicates
>>> the fix a little bit.
>>>>
>>>> Can linking one klass cause a new class to be added to a _klasses
>>>> list of a CLD that you've already walked? It seems like you need
>>>> the do() loop in the current code wrapped around walking the clds,
>>>> then link the classes until no progress. You can fast exit when you
>>>> find a klass linked, I believe, since the classes are added to the
>>>> front of the _klasses list.
>>> Yes, it is possible. The updated patch will check if the first klass
>>> in the _klass list of each CLD hasn't been linked, then it will loop
>>> back again to go through all the CLD's _klasses lists.
>>>>
>>>> Also with this change, you can remove locked_loaded_classes_do()
>>>> because it was added for CDS and apparently the comment is wrong.
>>>>
>>>> // This case can block but cannot do unloading (called from CDS)
>>>> void ClassLoaderDataGraph::unlocked_loaded_classes_do(KlassClosure*
>>>> klass_closure) {
>>>> for (ClassLoaderData* cld = _head; cld != NULL; cld = cld->next()) {
>>>> cld->loaded_classes_do(klass_closure);
>>>> }
>>>> }
>>>
>>> I've removed the above.
>>>
>>> thanks,
>>>
>>> Calvin
>>>
>>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>> On 8/26/20 5:35 PM, calvin.cheung at oracle.com wrote:
>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8251860
>>>>>
>>>>> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8251860/webrev.00/
>>>>>
>>>>> Please refer to the bug report for an analysis of the crash.
>>>>>
>>>>> The proposed fix is to acquire the ClassLoaderDataGraph_lock
>>>>> before calling
>>>>> ClassLoaderDataGraph::loaded_classes_do(&link_closure). The
>>>>> link_closure will not do the actual linking but just to store the
>>>>> InstanceKlass'es in an array and increment the keep_alive counter
>>>>> of the class_loader_data of each InstanceKlass. After the call to
>>>>> ClassLoaderDataGraph::loaded_classes_do, each InstanceKlass in the
>>>>> array will be linked and the keep_alive counter of the
>>>>> class_loader_data will be decremented.
>>>>>
>>>>> Testing:
>>>>> - tested about 40 times on the MacOS host where the crash was
>>>>> seen
>>>>> (before the fix, the crash was seen about once in 20 runs)
>>>>> - Tier1 - 4.
>>>>> thanks,
>>>>> Calvin
>>>>
>>
More information about the hotspot-runtime-dev
mailing list