RFR(S): 8251860: ClassLoaderData::loaded_classes_do fails with "assert(ZAddress::is_marked(addr)) failed: Should be marked"
Ioi Lam
ioi.lam at oracle.com
Fri Aug 28 22:54:14 UTC 2020
Hi Calvin,
300 void ClassLoaderData::inc_keep_alive() {
301 if (has_class_mirror_holder()) {
302 if (!Arguments::is_dumping_archive()) {
303 assert(_keep_alive > 0, "Invalid keep alive increment count");
304 }
305 _keep_alive++;
306 }
307 }
Why is 302 needed? I think in CollectCLDClosure, you wouldn't collect
any CLD whose _keep_alive count is zero.
I think MetaspaceShared::link_and_cleanup_shared_classes() can be
simplified so you don't need to scan the CLDs in two passes.
1393 while (true) {
bool has_linked = false;
for (int i = 0; i < _loaded_cld->length(); i++) {
ClassLoaderData* cld = _loaded_cld->at(i);
for (Klass* klass = cld->klasses(); klass != NULL; klass =
klass->next_link()) {
if (klass->is_instance_klass()) {
InstanceKlass* ik = InstanceKlass::cast(ik);
if (inking_required(ik)) {
has_linked = link_class_for_cds(ik, THREAD);
}
}
}
}
if (!has_linked) {
break;
}
// Class linking includes verification which may load more classes.
// Keep scanning until we have linked no more classes.
}
Also link_class_for_cds should return the value of
MetaspaceShared::try_link_class(), so it has the same behavior as before.
The rest of the changes looks good.
Thanks
- Ioi
On 8/28/20 10:42 AM, calvin.cheung at oracle.com wrote:
> Hi Coleen,
>
> Thanks for your review.
>
> I've re-implemented the fix with your suggestions.
>
> Here's an updated webrev:
>
> http://cr.openjdk.java.net/~ccheung/jdk16/8251860/webrev.01/
>
> On 8/26/20 3:24 PM, Coleen Phillimore wrote:
>>
>> Hi Calvin,
>>
>> I think you should change this to call
>> ClassLoaderDataGraph::loaded_cld_do(&collect_cld) and collect
>> ClassLoaderData in the GrowableArray, since you then walk through
>> them again to link the classes in the class loader data.
> With the ClassLoaderDataGraph::loaded_cld_do approach, some klasses in
> the _klasses list may already in the link state. So I have to account
> for that and not link those klasses again. It complicates the fix a
> little bit.
>>
>> Can linking one klass cause a new class to be added to a _klasses
>> list of a CLD that you've already walked? It seems like you need the
>> do() loop in the current code wrapped around walking the clds, then
>> link the classes until no progress. You can fast exit when you find
>> a klass linked, I believe, since the classes are added to the front
>> of the _klasses list.
> Yes, it is possible. The updated patch will check if the first klass
> in the _klass list of each CLD hasn't been linked, then it will loop
> back again to go through all the CLD's _klasses lists.
>>
>> Also with this change, you can remove locked_loaded_classes_do()
>> because it was added for CDS and apparently the comment is wrong.
>>
>> // This case can block but cannot do unloading (called from CDS)
>> void ClassLoaderDataGraph::unlocked_loaded_classes_do(KlassClosure*
>> klass_closure) {
>> for (ClassLoaderData* cld = _head; cld != NULL; cld = cld->next()) {
>> cld->loaded_classes_do(klass_closure);
>> }
>> }
>
> I've removed the above.
>
> thanks,
>
> Calvin
>
>>
>> Thanks,
>> Coleen
>>
>> On 8/26/20 5:35 PM, calvin.cheung at oracle.com wrote:
>>> bug: https://bugs.openjdk.java.net/browse/JDK-8251860
>>>
>>> webrev: http://cr.openjdk.java.net/~ccheung/jdk16/8251860/webrev.00/
>>>
>>> Please refer to the bug report for an analysis of the crash.
>>>
>>> The proposed fix is to acquire the ClassLoaderDataGraph_lock before
>>> calling ClassLoaderDataGraph::loaded_classes_do(&link_closure). The
>>> link_closure will not do the actual linking but just to store the
>>> InstanceKlass'es in an array and increment the keep_alive counter of
>>> the class_loader_data of each InstanceKlass. After the call to
>>> ClassLoaderDataGraph::loaded_classes_do, each InstanceKlass in the
>>> array will be linked and the keep_alive counter of the
>>> class_loader_data will be decremented.
>>>
>>> Testing:
>>> - tested about 40 times on the MacOS host where the crash was seen
>>> (before the fix, the crash was seen about once in 20 runs)
>>> - Tier1 - 4.
>>> thanks,
>>> Calvin
>>
More information about the hotspot-runtime-dev
mailing list