RFR(S) 8231125 Improve testing of parallel loading of shared classes by the boot class loader
coleen.phillimore at oracle.com
coleen.phillimore at oracle.com
Thu Sep 19 22:54:11 UTC 2019
On 9/19/19 5:42 PM, Ioi Lam wrote:
>
>
> On 9/19/19 8:14 AM, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 9/18/19 11:30 PM, Ioi Lam wrote:
>>>
>>>
>>> On 9/18/19 6:47 PM, coleen.phillimore at oracle.com wrote:
>>>>
>>>> I have to say that the 01 assert matched better what I thought it
>>>> should assert, ie. that boot class loaders cannot get to that code
>>>> in parallel.
>>>>
>>>> The mirror has a bunch of other logic with sharing that just
>>>> complicates this basic premise. Also, by asserting in the code
>>>> that takes out the lock, it doesn't make it clear that two threads
>>>> with the bootloader cannot execute that code in parallel.
>>>>
>>>
>>> But the problem with version 01 is two threads could both come to
>>> the assert at the same time, and both see that the class loader is
>>> not set, and then both would proceed to restore the class.
>>
>> I think the assert should show that two threads cannot call
>> load_shared_class() at the same time, because the second thread is
>> waiting on the SystemDictionary_lock because it finds the class in
>> the Placeholder table being loaded by a different thread.
>>
>
> Hi Coleen,
>
> I moved the assert to the top of load_shared_class(). I can't assert
> (ik->class_loader_data() == 0) for non-boot classes, so I added a new
> method ik->is_unshareable_info_restored().
>
> http://cr.openjdk.java.net/~iklam/jdk14/8231125-shared-boot-class-parallel-test.v04/
>
>
> I will file separate bugs to: (1) eliminate the if (ik != NULL) at the
> top of load_shared_class(). No one should call this function with a
> NULL klass. (2) remove the lock before calling
> restore_unshareable_info. It shouldn't be necessary.
(1) yes, definitely. This looked odd.
(2) if this function is subtly single threaded, I agree that lock
shouldn't be needed.
This last change looks fine.
Thanks,
Coleen
>
> Thanks
> - Ioi
>
>> Coleen
>>>
>>> Anyway, maybe we should not put the assert in load_shared_class.
>>> Instead, move the assert to Klass::restore_unsharable_info():
>>>
>>> http://cr.openjdk.java.net/~iklam/jdk14/8231125-shared-boot-class-parallel-test.v03/
>>>
>>>
>>> I also fixed the spacing in the "for" loops in the other test file
>>> as requested by David.
>>>
>>> Thanks
>>> - Ioi
>>>
>>>> I like 01 better (except the test fixes are good).
>>>> thanks,
>>>> Coleen
>>>>
>>>> On 9/18/19 2:23 AM, Ioi Lam wrote:
>>>>>
>>>>>
>>>>> On 9/17/19 6:27 PM, David Holmes wrote:
>>>>>> Hi Ioi,
>>>>>>
>>>>>> On 18/09/2019 7:24 am, coleen.phillimore at oracle.com wrote:
>>>>>>> http://cr.openjdk.java.net/~iklam/jdk14/8231125-shared-boot-class-parallel-test.v01/src/hotspot/share/classfile/systemDictionary.cpp.udiff.html
>>>>>>>
>>>>>>>
>>>>>>> Can you add a line something like.
>>>>>>>
>>>>>>> + if (class_loader() == NULL) {
>>>>>>> + // See case 3 in SystemDictionary::resolve_instance_class_or_null
>>>>>>> + // multiple threads wait on the SystemDictionary_lock if
>>>>>>> the boot loader class is in the placeholder table, so only
>>>>>>> + // one thread loads the class.
>>>>>>> + assert(ik->class_loader_data() == NULL, "boot loader classes
>>>>>>> are not loaded in parallel");
>>>>>>> + }
>>>>>>> +
>>>>>>
>>>>>> Even with Coleen's suggested comment I still don't see what this
>>>>>> code has to do with loading in parallel? AFAICS all you are
>>>>>> asserting is that if the class is to be loaded by the boot-loader
>>>>>> then the class should have been dumped by the boot-loader - no?
>>>>>>
>>>>> Hi David,
>>>>>
>>>>> When we were analyzing a recent random crash that might be caused
>>>>> by race condition during class loading, we looked at this call to
>>>>> restore_unshareable_info() in SystemDictionary::load_shared_class()
>>>>>
>>>>> ClassLoaderData* loader_data =
>>>>> ClassLoaderData::class_loader_data(class_loader());
>>>>> {
>>>>> HandleMark hm(THREAD);
>>>>> Handle lockObject = compute_loader_lock_object(class_loader,
>>>>> THREAD);
>>>>> check_loader_lock_contention(lockObject, THREAD);
>>>>> ObjectLocker ol(lockObject, THREAD, true);
>>>>> // prohibited package check assumes all classes loaded from
>>>>> archive call
>>>>> // restore_unshareable_info which calls ik->set_package()
>>>>> ik->restore_unshareable_info(loader_data, protection_domain,
>>>>> CHECK_NULL);
>>>>> }
>>>>>
>>>>> Because all the crash seemed to be with shared classes loaded by
>>>>> the boot loader, we suspect that two threads competing to load the
>>>>> same class may arrive here simultaneously, and we will end up
>>>>> calling ik->restore_unshareable_info() twice, with harmful side
>>>>> effects.
>>>>>
>>>>> Upon further investigation, we found that this is not the case.
>>>>> But it will be good to have an assert to check for this -- that we
>>>>> won't call ik->restore_unshareable_info() more than once.
>>>>>
>>>>> I realized that I put the assert at the wrong place, and asserted
>>>>> on the wrong field (non-boot classes will enter
>>>>> load_shared_class() with ik->class_loader_data() != NULL). So I
>>>>> changed the assert to be on the java_mirror. That way, the assert
>>>>> works for shared classes for all loaders.
>>>>>
>>>>> assert(!ik->is_java_mirror_restored(), "don't load the same
>>>>> shared class twice");
>>>>> ik->restore_unshareable_info(loader_data, protection_domain,
>>>>> CHECK_NULL);
>>>>> assert(ik->is_java_mirror_restored() ||
>>>>> !SystemDictionary::Class_klass_loaded(), "must be");
>>>>>
>>>>> Here's an updated webrev, which also contains fixes you suggested
>>>>> below.
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>> The test seems okay - the proof is in the running :)
>>>>>>
>>>>>> One style nit:
>>>>>>
>>>>>> ! for (int i=0; i<APP_LOOPS; i++) {
>>>>>>
>>>>>> spaces around binary operators please. This seems to impact all
>>>>>> for loops.
>>>>>>
>>>>>> Also:
>>>>>>
>>>>>> 175 { // Avoid logging in this block so the threads
>>>>>> can proceeds without
>>>>>>
>>>>>> s/proceeds/proceed/
>>>>>>
>>>>>> Thanks,
>>>>>> David
>>>>>> -----
>>>>>>
>>>>>>> I wonder if the ObjectLocker is really needed for non-bootclass
>>>>>>> loaders below this code, since it appears to not be needed for
>>>>>>> boot class loaders.
>>>>>>>
>>>>>>> The test changes look good. I think you're supposed to have
>>>>>>> @bug tags in them.
>>>>>>>
>>>>>>> Thanks for going through this code looking for races!
>>>>>>> Coleen
>>>>>>>
>>>>>>>
>>>>>>> On 9/17/19 1:10 PM, Ioi Lam wrote:
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8231125
>>>>>>>> http://cr.openjdk.java.net/~iklam/jdk14/8231125-shared-boot-class-parallel-test.v01/
>>>>>>>>
>>>>>>>>
>>>>>>>>
>>>>>>>> [1] Modified ParallelLoadTest.java to also test the loading of
>>>>>>>> shared classes
>>>>>>>> into the boot loader.
>>>>>>>> [2] Modified ParallelLoad.java to minimize the initial time lag
>>>>>>>> when the 4 parallel
>>>>>>>> threads compete to load the same class.
>>>>>>>> [3] Added an assert in the JVM to verify that for a shared
>>>>>>>> classes loaded
>>>>>>>> the boot class loader, SystemDictionary::load_shared_class
>>>>>>>> is not called
>>>>>>>> multiple times.
>>>>>>>>
>>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list