RFR: JDK-8236604: Optimize SystemDictionary::resolve_well_known_classes for CDS

Yumin Qi yumin.qi at oracle.com
Tue Feb 25 15:59:16 UTC 2020


Hi Claes,

   Thanks for the confirmation. I look forward to your full review!

Yumin

On 2/24/20 9:41 PM, Claes Redestad wrote:
> Hi,
>
> before diving into a full review, I took your patch for a spin and
> can confirm a speed-up of around 0.5 ms, along with about a 1% reduction
> in #instructions and #branches on Hello World [1]. Sweet!
>
> /Claes
>
> [1]
> Baseline:
>          53.028269      task-clock (msec)         #    1.396 CPUs utilized            ( +-  0.34% )
>                244      context-switches          #    0.005 M/sec                    ( +-  0.60% )
>                 32      cpu-migrations            #    0.607 K/sec                    ( +-  0.39% )
>              3,634      page-faults               #    0.069 M/sec                    ( +-  0.03% )
>        138,590,680      cycles                    #    2.614 GHz                      ( +-  0.25% )
>        112,946,802      instructions              #    0.81  insns per cycle          ( +-  0.06% )
>         22,428,330      branches                  #  422.950 M/sec                    ( +-  0.06% )
>            771,755      branch-misses             #    3.44% of all branches          ( +-  0.13% )
>
>        0.037981200 seconds time elapsed          ( +-  0.39% )
>
> Patched:
>          52.537826      task-clock (msec)         #    1.409 CPUs utilized            ( +-  0.26% )
>                244      context-switches          #    0.005 M/sec                    ( +-  0.65% )
>                 32      cpu-migrations            #    0.607 K/sec                    ( +-  0.43% )
>              3,637      page-faults               #    0.069 M/sec                    ( +-  0.04% )
>        137,129,495      cycles                    #    2.610 GHz                      ( +-  0.23% )
>        111,931,764      instructions              #    0.82  insns per cycle          ( +-  0.07% )
>         22,215,346      branches                  #  422.845 M/sec                    ( +-  0.07% )
>            766,573      branch-misses             #    3.45% of all branches          ( +-  0.13% )
>
>        0.037283845 seconds time elapsed          ( +-  0.30% )
>
>
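The relative changes behind the perf stat figures quoted above can be recomputed with a few lines of C++. This is only an illustrative sketch: the helper names reduction_percent and print_perf_deltas are made up for this note, and the inputs are the averages copied from the two runs above.

```cpp
#include <cassert>
#include <cstdio>

// Percentage by which `patched` is lower than `baseline`.
double reduction_percent(double baseline, double patched) {
    assert(baseline > 0.0);
    return (baseline - patched) / baseline * 100.0;
}

// Prints the relative reduction for each counter, using the averages
// reported in the Baseline and Patched runs.
void print_perf_deltas() {
    std::printf("task-clock:   %.2f%%\n", reduction_percent(53.028269, 52.537826));
    std::printf("cycles:       %.2f%%\n", reduction_percent(138590680.0, 137129495.0));
    std::printf("instructions: %.2f%%\n", reduction_percent(112946802.0, 111931764.0));
    std::printf("branches:     %.2f%%\n", reduction_percent(22428330.0, 22215346.0));
}
```

Calling print_perf_deltas() from main gives reductions just under 1% for instructions and branches, in line with the "about 1%" summary, and the task-clock difference of 53.028269 - 52.537826 is roughly 0.49 ms.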
> On 2020-02-25 06:02, Yumin Qi wrote:
>> Hi,
>>
>>     Please review this fix:
>>
>>     Bug: https://bugs.openjdk.java.net/browse/JDK-8236604
>>
>>     Webrev: http://cr.openjdk.java.net/~minqi/8236604/webrev-00
>>
>>      Description: Optimize well-known class initialization for CDS
>> during JVM startup.
>>
>>      When running with CDS, initializing the well-known classes (95
>> classes) calls resolve_or_fail for each one and thus goes through the
>> resolve functions, locks, etc. Because this initialization happens at
>> a very early stage of startup, those locks can be avoided. The fix
>> serializes SystemDictionary::_well_known_klasses into the CDS archive
>> and restores the classes at runtime without calling resolve_or_fail.
>> Since Compile_lock has to be held for
>> SystemDictionary::add_to_hierarchy, the fix avoids calling that
>> function by copying its code into SystemDictionary::quick_resolve. As
>> a consequence, I had to remove two asserts that would otherwise fire
>> on Compile_lock. The lock requirement is instead noted in comments on
>> the function declarations of Klass::append_to_sibling_list and
>> InstanceKlass::add_implementor. I checked the calling paths of these
>> two functions to make sure they are not used in other places.
>>
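The idea in the description above, replacing one locked lookup per class with a bulk restore of a table saved at dump time, can be sketched with a toy model. This is not HotSpot code: ToyKlass, ToyDictionary, resolve_each and restore_from_archive are hypothetical names, and the sketch ignores everything a real CDS archive has to handle (pointer relocation, class hierarchy bookkeeping, validation).

```cpp
#include <array>
#include <cstddef>
#include <mutex>
#include <string>
#include <unordered_map>

// Toy stand-in for a loaded class; not HotSpot's Klass.
struct ToyKlass { std::string name; };

// Tiny table for illustration; the thread mentions 95 classes in HotSpot.
constexpr std::size_t kWellKnownCount = 3;

using WellKnownNames = std::array<std::string, kWellKnownCount>;
using WellKnownTable = std::array<ToyKlass*, kWellKnownCount>;

// Toy dictionary whose lookups take a lock, standing in for the
// general-purpose resolve path.
class ToyDictionary {
public:
    void add(ToyKlass* k) {
        std::lock_guard<std::mutex> guard(lock_);
        table_[k->name] = k;
    }
    ToyKlass* resolve(const std::string& name) {
        std::lock_guard<std::mutex> guard(lock_);
        auto it = table_.find(name);
        return it == table_.end() ? nullptr : it->second;
    }
private:
    std::mutex lock_;
    std::unordered_map<std::string, ToyKlass*> table_;
};

// Slow path: one locked dictionary lookup per well-known class.
WellKnownTable resolve_each(ToyDictionary& dict, const WellKnownNames& names) {
    WellKnownTable out{};
    for (std::size_t i = 0; i < names.size(); ++i) {
        out[i] = dict.resolve(names[i]);
    }
    return out;
}

// Fast path: the table was saved earlier, so it is copied back as a
// whole with no locks and no per-class lookups.
WellKnownTable restore_from_archive(const WellKnownTable& archived) {
    return archived;
}
```

Both paths end with the same table contents; the fast path just skips the per-class work, which is only safe under the assumption that nothing else can touch the table that early in startup.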
>>       Performance was measured by taking timestamps before and after
>> SystemDictionary::resolve_well_known_classes (in manually modified
>> original/new versions), since this is the most direct and accurate
>> measurement (the timing code is excluded from the review code):
>>
>>      + jlong s0 = os::javaTimeNanos();
>>        resolve_well_known_classes(CHECK);
>>      + jlong s1 = os::javaTimeNanos();
>>
>>        // print out s1 - s0
>>
>>       I ran -version 2000 times each for the original and new versions
>> and took the averages. The saving is about 18% (2.9ms vs 2.4ms). It
>> is not a big saving, but it still helps improve startup time.
>>
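The timestamp-and-average approach described above can be mimicked outside the JVM with standard C++. This is a sketch under assumptions: average_nanos is a made-up helper, it uses std::chrono::steady_clock rather than os::javaTimeNanos, and it repeats the call inside one process, whereas the measurement above launched the JVM 2000 separate times.

```cpp
#include <cassert>
#include <chrono>
#include <cstdint>

// Average elapsed nanoseconds of `fn` over `runs` invocations, taking a
// timestamp immediately before and after each call.
template <typename Fn>
double average_nanos(Fn fn, int runs) {
    assert(runs > 0);
    std::int64_t total = 0;
    for (int i = 0; i < runs; ++i) {
        auto s0 = std::chrono::steady_clock::now();
        fn();
        auto s1 = std::chrono::steady_clock::now();
        total += std::chrono::duration_cast<std::chrono::nanoseconds>(s1 - s0).count();
    }
    return static_cast<double>(total) / runs;
}
```

Comparing two variants then comes down to (old - new) / old; with the rounded averages quoted above, (2.9 - 2.4) / 2.9 is roughly 17%.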
>>      Testing: local jtreg test on linux-x86.
>>
>>      pending mach5 hs-tier1-4
>>
>>
>> Thanks
>>
>> Yumin
>>


More information about the hotspot-runtime-dev mailing list