RFR: JDK-8236604: Optimize SystemDictionary::resolve_well_known_classes for CDS

Yumin Qi yumin.qi at oracle.com
Wed Feb 26 16:55:20 UTC 2020


Hi, Ioi and all

    Updated webrev :http://cr.openjdk.java.net/~minqi/8236604/webrev-01/

   Add back the assert for Compile_lock in add_to_hierarchy and its 
calling paths. Changed them to conditional guard with 
Universe::is_fully_initialized(), the testing result showed almost same 
as first version.

   Test passed hs-tier1-4.

   Thanks

   Yumin

On 2/25/20 9:05 AM, Yumin Qi wrote:
> Hi, Ioi
>
> On 2/24/20 11:04 PM, Ioi Lam wrote:
>> Hi Yumin,
>>
>> This looks good. A few small nits:
>>
>> For the assets related to Compile_lock, I think you can change them to:
>>
>>   + if (Universe::fully_initialized()) {
>>       assert_locked_or_safepoint(Compile_lock);
>>   + }
>>
>> That way you don't have to remove the asserts, and you can call
>> SystemDictionary::add_to_hierarchy without making a copy of the code.
>>
>> Also in add_to_hierarchy:
>>
>>  + if (Universe::fully_initialized()) {
>>      CodeCache::flush_dependents_on(k);
>>  + }
>>
> This can save changes to the asserts but will add instructions to 
> runtime though it is not much. Let me check the effect.
>>
>> systemnDictionary.hpp:
>>
>> It seems unnecessary to make add_to_hierarchy() a public method.
>>
>> systemnDictionary.cpp:
>>
>> add_package_service() -- this should be a member of SystemDictionary.
>> It's better to not move the code up, so we can minimize the
>> amount of code changes.
>>
>> Also, the name add_package_service() is kind of unclear. I can't think
>> of a better name. Since we are just splitting out the bottom half of
>> load_shared_class, how about
>>
>>      ik->restore_unshareable_info(loader_data, protection_domain, 
>> CHECK_NULL);
>>    }
>> +  load_shared_class0(ik, loader_data, CHECK_NULL);
>> +  return ik;
>> + }
>> +
>> +
>> + // This common code used by both load_shared_class() and 
>> quick_resolve().
>> + void SystemDictionary::load_shared_class0(InstanceKlass* ik,
>> +          ClassLoaderData* loader_data, TRAPS) {
>>   ik->print_class_load_logging(loader_data, NULL, NULL);
>>
> I don't know how to name it either, maybe load_shared_class_misc for 
> doing the rest of the logging, adding, printing work?
>> ===============
>>
>> For SystemDictionary::quick_resolve(), I think we should add an 
>> assert for k->is_shared().
> OK, will add.
>>
>> ===============
>>
>> Is the check for MAX_QUICK_RESOLVE_ID necessary?
>>
>>   if (UseSharedSpaces && !JvmtiExport::should_post_class_prepare()
>>                       && id > MAX_QUICK_RESOLVE_ID) {
>>
> This still in need. If the check removed, we will assert on shared 
> archive objects are not fixed. That is, the resolving of Object_klass 
> should not use quick_resolve. Fixing shared archive objects needs 
> Object fully loaded, but if we quick_resolve Object class, the 
> restoring un-shareable info will try to fix the mirror, point to 
> shared archive object heap. There, assert on shared archive heap not 
> fixed --- the fixation is after resolving Object_klass. This part 
> confused me so I use the macro here for a guard, the Object_klass has 
> to follow the old path for resolution. Or it can be changed to 
> !is_wk_klass_loaded(Object_klass)
>> ===============
>>
>> +  if ((*klassp) == NULL || (*klassp)->class_loader_data() == NULL) {
>>
>> Maybe it's better to change to
>>
>>    if (is_wk_klass_loaded(*klassp)) {
>>
> Do you mean
>
> if (!is_wk_klass_loaded(*klassp)) ?
>
>> ===============
>>  void SystemDictionary::resolve_well_known_classes(TRAPS) {
>> -  assert(WK_KLASS(Object_klass) == NULL, "well-known classes should 
>> only be initialized once");
>>
>> Maybe it's better to leave this assert here, but change to
>>
>>     assert(!Object_klass_loaded(), ...);
>>
> Yes, after the patch, the Object class is not NULL and not loaded at 
> this point, but for non shared vesion, have to check if 
> Object_klass_loaded() will encounter NULL pointer first.
>> ===============
>>
>> While you are changing 
>> SystemDictionary::resolve_well_known_classes(), these
>> comments are very old (we don't have perm gen anymore). Could you 
>> delete them?
>>
>> 2038   // These calls iterate over the objects currently in the perm gen
>> 2039   // so calling them at this point is matters (not before when 
>> there
>> 2040   // are fewer objects and not later after there are more objects
>> 2041   // in the perm gen.
>>
>>
> Sure
>
>
> Thanks
>
> Yumin
>
>
>> On 2/24/20 9:02 PM, Yumin Qi wrote:
>>> Hi,
>>>
>>>    Please review fix:
>>>
>>>    Bug: https://bugs.openjdk.java.net/browse/JDK-8236604
>>>
>>>    Webrev:http://cr.openjdk.java.net/~minqi/8236604/webrev-00
>>>
>>>     Description: Optimize well known classes initialization for CDS 
>>> during jvm startup.
>>>
>>>     When run with CDS, initialize well known classes (95 classes) 
>>> will call resolve_or_fail thus go though the resolve functions, 
>>> locks etc. The initialization of well-known classes happens in fact 
>>> in very early stage, those lock can be avoided. The fix is 
>>> serializing SystemDictionay::_well_known_klasses into CDS  and in 
>>> runtime restore them by avoiding call resolve_or_fail. Since 
>>> Compile_lock has to be held for SystemDictionary::add_to_hierachy, 
>>> the fix avoids calling it by copying the code from it in 
>>> SytemDictionary::quick_resolve, but this way I have to remove two 
>>> asserts which will assert on Compile_lock. The reminding usage of 
>>> the lock was added as comments on the function declarations of 
>>> Klass::append_to_sibling_list and InstanceKlass::add_implementor. 
>>> The two functions calling paths have been checked to make sure they 
>>> are not used in other places.
>>>
>>>      Performance measured by take time before and after 
>>> SystemDictionary::resolve_well_known_classes(manually modified 
>>> orignal/new versions), since this is the most direct measurement and 
>>> accurate(excluded in review code):
>>>
>>>     + jlong s0 = os::javaTimeNanos();
>>>      resolve_well_known_classes(CHECK);
>>>
>>>      + jlong s1 = os::javaTimeNanos();
>>>
>>>      // print out s1 - s0
>>>
>>>      Run -version 2000 times for original/new versions respectively 
>>> and took the averages. The saving is about 18% (2.9ms vs 2.4ms), it 
>>> is not a big saving but still helps to improve the startup time.
>>>
>>>     Testing: local jtreg test on linux-x86.
>>>
>>>     pending mach5 hs-tier1-4
>>>
>>>
>>> Thanks
>>>
>>> Yumin
>>>
>>


More information about the hotspot-runtime-dev mailing list