RFR JDK-8232222: Set state to 'linked' when an archived class is restored at runtime

Ioi Lam ioi.lam at oracle.com
Mon Jun 1 22:56:34 UTC 2020



On 5/31/20 11:14 PM, David Holmes wrote:
> Hi Jiangli,
>
> On 29/05/2020 9:02 am, Jiangli Zhou wrote:
>> (Looping in serviceability-dev at openjdk.java.net ...)
>>
>> Hi David and Ioi,
>>
>> On Wed, May 27, 2020 at 11:15 PM David Holmes 
>> <david.holmes at oracle.com> wrote:
>>>
>>> Hi Jiangli,
>>>
>>> On 28/05/2020 11:35 am, Ioi Lam wrote:
>>>>
>>>>
>>>> On 5/27/20 6:17 PM, Jiangli Zhou wrote:
>>>>> On Wed, May 27, 2020 at 1:56 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>> On 5/26/20 6:21 PM, Jiangli Zhou wrote:
>>>>>>
>>>>>>> Focusing on the link state for archived classes in this thread, I
>>>>>>> updated the webrev to only set archived boot classes to 'linked' 
>>>>>>> state
>>>>>>> at restore time. More investigations can be done for archived 
>>>>>>> classes
>>>>>>> for other builtin loaders.
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8232222
>>>>>>> http://cr.openjdk.java.net/~jiangli/8232222/webrev.02/
>>>>>>>
>>>>>>> Please let me know if there is any additional concerns to the 
>>>>>>> change.
>>>>>>>
>>>>>>> Best regards,
>>>>>>> Jiangli
>>>>>>>
>>>>>> Hi Jiangli,
>>>>>>
>>>>>> I think the change is fine. I am wondering if this
>>>>>>
>>>>>> 2530   if (!BytecodeVerificationLocal &&
>>>>>> 2531 loader_data->is_the_null_class_loader_data()) {
>>>>>> 2532     _init_state = linked;
>>>>>> 2533   }
>>>>>>
>>>>>>
>>>>>> can be changed to
>>>>>>
>>>>>>           if (!BytecodeVerificationLocal &&
>>>>>> loader_data->is_the_null_class_loader_data() &&
>>>>>>               !JvmtiExport::should_post_class_prepare())
>>>>>>
>>>>>> That way, there's no need to change systemDictionary.cpp.
>>>>>>
>>>>>>
>>>>> I was going to take the suggestion, but realized that it would add
>>>>> unnecessary complications for archived boot classes with class
>>>>> pre-initialization support. Some agents may set
>>>>> JvmtiExport::should_post_class_prepare(). It's worthwhile to support
>>>>> class pre-init uniformly for archived boot classes with
>>>>> JvmtiExport::should_post_class_prepare() enabled or disabled.
>>>>
>>>> This would introduce behavioral changes when JVMTI is enabled:
>>>>
>>>> + The order of JvmtiExport::post_class_prepare is different than 
>>>> before
>>>> + JvmtiExport::post_class_prepare may be called for a class that 
>>>> was not
>>>> called before (if the class is never linked during run time)
>>>> + JvmtiExport::post_class_prepare was called inside the init_lock, now
>>>> it's called outside of the init_lock
>>>
>>> I have to say I share Ioi's concerns here. This change will impact JVM
>>> TI agents in a way we can't be sure of. From a specification 
>>> perspective
>>> I think we are fine as linking can be lazy or eager, so there's no
>>> implied order either. But this would be a behavioural change that will
>>> be observable by agents. (I'm less concerned about the init_lock
>>> situation as it seems potentially buggy to me to call out to an agent
>>> with the init_lock held in the first place! I find it hard to 
>>> imagine an
>>> agent only working correctly if the init_lock is held.)
>>>
>>
>> Totally agree that we need to be very careful here (that's also part
>> of the reason why I separated this into an individual RFE for the
>> dedicated discussion). David, thanks for the analysis from the spec
>> perspective! Agreed with the init_lock comment also. In the future, I
>> think we can even get rid of the needs for init_lock completely for
>> some of the pre-initialized classes.
>>
>> This change has gone through extensive testing since the later part of
>> last year and has been in use (with the default CDS) with agents that
>> do post_class_prepare. Hopefully that would ease some of the concerns.
>
> That is good to know, but that is just one sample of a set of agents.
>
>>> This would need a CSR request and involvement of the serviceabilty 
>>> folk,
>>> to work through any potential issues.
>>>
>>
>> I've looped in serviceability-dev at openjdk.java.net for this
>> discussion. Chris or Serguei could you please take a look of the
>> change, http://cr.openjdk.java.net/~jiangli/8232222/webrev.02/,
>> specifically the JvmtiExport::post_class_prepare change in
>> systemDictionary.cpp.
>>
>> Filing a CSR request sounds good to me. The CSR looks after source,
>> binary, and behavioral compatibility. From a behavior point of view,
>> the change most likely does not cause any visible effects to a JVMTI
>> agent (based on what's observed in testing and usages). What should be
>> included in the CSR?
>
> The CSR request should explain the behavioural change that will be 
> observable by agents, and all of the potential compatibility issues 
> that might arise from that - pointing out of course that as the spec 
> (JVMS 5.4**) allows for eager or lazy linking, agents shouldn't be 
> relying on the exact timing or order of events.
>
> ** I note this section has some additional constraints regarding 
> dynamically computed constants that might also come into play with 
> this pre-linking for CDS classes.
>
I think the CSR should also include the benefit of doing this. It's not 
a lot of code change, but now we have to maintain two different code 
paths for post_class_prepare to be called.

JVMTI agents will typically introduce quite a bit of overhead in 
start-up, so a reduction in the range of 0.2~0.4ms seems a drop to the 
bucket. I'd rather keep the VM simple unless we have a strong reason to 
make it more complicated.

Thanks
- Ioi

> Cheers,
> David
> -----
>
>>> Ioi's suggestion avoids this problem, but, as you note, at the expense
>>> of disabling this optimisation if an agent is attached and wants class
>>> prepare events.
>>>
>>
>> Right, if we handle that case conditionally, we would alway need to
>> store the cached static field values separately since the dump time
>> cannot foresee if the runtime can set boot classes in 'linked' state
>> (and 'fully_initialized' state with the planned changes) at restore
>> time. As a result, we need to handle all pre-initialized static fields
>> like what we are doing today, which is storing them in the archived
>> class_info_records then installing them to the related fields at
>> runtime. That causes both unwanted memory and CPU overhead at runtime.
>>
>> I also updated the webrev.02 in place with typo fixes. Thanks!
>>
>> Best regards,
>> Jiangli
>>
>>> Thanks,
>>> David
>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>>>
>>>>>> BTW, I was wondering where the performance came from, so I wrote an
>>>>>> investigative patch:
>>>>>>
>>>>>> diff -r 0702191777c9 src/hotspot/share/oops/instanceKlass.cpp
>>>>>> --- a/src/hotspot/share/oops/instanceKlass.cpp    Thu May 21 
>>>>>> 15:56:27
>>>>>> 2020 -0700
>>>>>> +++ b/src/hotspot/share/oops/instanceKlass.cpp    Wed May 27 
>>>>>> 10:48:57
>>>>>> 2020 -0700
>>>>>> @@ -866,6 +866,13 @@
>>>>>>         return true;
>>>>>>       }
>>>>>>
>>>>>> +  if (UseSharedSpaces && !BytecodeVerificationLocal &&
>>>>>> is_shared_boot_class()) {
>>>>>> +    Handle h_init_lock(THREAD, init_lock());
>>>>>> +    ObjectLocker ol(h_init_lock, THREAD, h_init_lock() != NULL);
>>>>>> +    set_init_state(linked);
>>>>>> +    return true;
>>>>>> +  }
>>>>>> +
>>>>>>       // trace only the link time for this klass that includes
>>>>>>       // the verification time
>>>>>>       PerfClassTraceTime 
>>>>>> vmtimer(ClassLoader::perf_class_link_time(),
>>>>>>
>>>>>>
>>>>>> Benchmarking results (smaller numbers are better):
>>>>>>
>>>>>> (baseline vs your patch)
>>>>>>
>>>>>>              baseline jiangli                           baseline
>>>>>> jiangli
>>>>>>       1:     58514375    57755638 (-758737) -----     40.266
>>>>>> 40.135 (
>>>>>> -0.131)      -
>>>>>>       2:     58506426    57754623 (-751803) -----     40.367
>>>>>> 39.417 (
>>>>>> -0.950)      -----
>>>>>>       3:     58498554    57759735 (-738819) -----     40.513
>>>>>> 39.970 (
>>>>>> -0.543)      ---
>>>>>>       4:     58491265    57751296 (-739969) -----     40.439
>>>>>> 40.268 (
>>>>>> -0.171)      -
>>>>>>       5:     58500588    57750975 (-749613) -----     40.569
>>>>>> 40.080 (
>>>>>> -0.489)      --
>>>>>>       6:     58497015    57744418 (-752597) -----     41.097
>>>>>> 40.147 (
>>>>>> -0.950)      -----
>>>>>>       7:     58494335    57749909 (-744426) -----     39.983 40.214
>>>>>> (  0.231)     +
>>>>>>       8:     58500401    57750305 (-750096) -----     40.235 40.417
>>>>>> (  0.182)     +
>>>>>>       9:     58490728    57767463 (-723265) -----     40.354
>>>>>> 39.928 (
>>>>>> -0.426)      --
>>>>>>      10:     58497858    57746557 (-751301) -----     40.756
>>>>>> 39.706 (
>>>>>> -1.050)      -----
>>>>>> ============================================================
>>>>>>              58499154    57753091 (-746062) -----     40.457
>>>>>> 40.027 (
>>>>>> -0.430)      --
>>>>>> instr delta =      -746062    -1.2753%
>>>>>> time  delta =       -0.430 ms -1.0619%
>>>>>>
>>>>>>
>>>>>> (baseline vs my patch)
>>>>>>
>>>>>>              baseline    ioi baseline  ioi
>>>>>>       1:     58503574    57821124 (-682450)      ----- 40.554    
>>>>>> 39.783 (
>>>>>> -0.771)      -----
>>>>>>       2:     58499325    57819459 (-679866) -----     40.092 40.325
>>>>>> (  0.233)    ++
>>>>>>       3:     58492362    57811978 (-680384) -----     40.546
>>>>>> 39.826 (
>>>>>> -0.720)      -----
>>>>>>       4:     58488655    57828878 (-659777) -----     40.270 40.550
>>>>>> (  0.280)    ++
>>>>>>       5:     58501567    57830179 (-671388) -----     40.382
>>>>>> 40.145 (
>>>>>> -0.237)      --
>>>>>>       6:     58496552    57808774 (-687778) -----     40.702
>>>>>> 40.527 (
>>>>>> -0.175)      -
>>>>>>       7:     58482701    57808925 (-673776) -----     40.268
>>>>>> 39.849 (
>>>>>> -0.419)      ---
>>>>>>       8:     58493831    57807810 (-686021) -----     40.396
>>>>>> 39.940 (
>>>>>> -0.456)      ---
>>>>>>       9:     58489388    57811354 (-678034) -----     40.575
>>>>>> 40.078 (
>>>>>> -0.497)      ---
>>>>>>      10:     58482512    57795489 (-687023) -----     40.084 40.247
>>>>>> (  0.163)     +
>>>>>> ============================================================
>>>>>>              58493046    57814396 (-678650) -----     40.386
>>>>>> 40.126 (
>>>>>> -0.260)      --
>>>>>> instr delta =      -678650    -1.1602%
>>>>>> time  delta =       -0.260 ms -0.6445%
>>>>>>
>>>>>>
>>>>>> (your patch vs my patch)
>>>>>>
>>>>>>              jiangli ioi                              jiangli ioi
>>>>>>       1:     57716711    57782622 ( 65911) ++++          41.042 
>>>>>> 40.302 (
>>>>>> -0.740)      -----
>>>>>>       2:     57709666    57780196 ( 70530) ++++          40.334 
>>>>>> 40.965 (
>>>>>> 0.631)  ++++
>>>>>>       3:     57716074    57803315 ( 87241) +++++          40.239 
>>>>>> 39.823 (
>>>>>> -0.416)      ---
>>>>>>       4:     57725152    57782719 ( 57567) +++          40.430 
>>>>>> 39.805 (
>>>>>> -0.625)      ----
>>>>>>       5:     57719799    57787187 ( 67388) ++++          40.138 
>>>>>> 40.003 (
>>>>>> -0.135)      -
>>>>>>       6:     57721922    57769193 ( 47271) +++          40.324 
>>>>>> 40.207 (
>>>>>> -0.117)      -
>>>>>>       7:     57716438    57785212 ( 68774) ++++          39.978 
>>>>>> 40.149 (
>>>>>> 0.171)     +
>>>>>>       8:     57713834    57778797 ( 64963) ++++          40.359 
>>>>>> 40.210 (
>>>>>> -0.149)      -
>>>>>>       9:     57711272    57786376 ( 75104) ++++          40.575 
>>>>>> 40.724 (
>>>>>> 0.149)     +
>>>>>>      10:     57711660    57780548 ( 68888) ++++          40.291 
>>>>>> 40.091 (
>>>>>> -0.200)      -
>>>>>> ============================================================
>>>>>>              57716252    57783615 ( 67363) ++++          40.370 
>>>>>> 40.226 (
>>>>>> -0.144)      -
>>>>>> instr delta =        67363     0.1167%
>>>>>> time  delta =       -0.144 ms -0.3560%
>>>>>>
>>>>>>
>>>>>> These numbers show that the majority of the time spent (678650
>>>>>> instructions) inside InstanceKlass::link_class_impl is spent from 
>>>>>> the
>>>>>> PerfClassTraceTime. Walking of the class hierarchy and taking the
>>>>>> h_init_lock only takes about 67363 instructions).
>>>>>>
>>>>>> Due to this finding, I filed two more RFEs:
>>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8246019
>>>>>> PerfClassTraceTime slows down VM start-up
>>>>>>
>>>>> It's related to JDK-8246020, and I've commented on the bug (see
>>>>> JDK-8246020 comments). UsePerfData for perf data collection is common
>>>>> in cloud usages. It's better to keep UsePerfData enabled by default.
>>>>>
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8246015
>>>>>> Method::link_method is called twice for CDS methods
>>>>>
>>>>> That was addressed as part of the initial change for JDK-8232222:
>>>>> http://cr.openjdk.java.net/~jiangli/8232222/weberv.02/src/hotspot/share/oops/instanceKlass.cpp.frames.html 
>>>>>
>>>>>
>>>>>
>>>>> It's cleaner to handle it separately, so I removed it from the latest
>>>>> version. I've assigned JDK-8246015 to myself and will address it
>>>>> separately. Thanks for recording the separate bug.
>>>>>
>>>>> Thanks!
>>>>> Jiangli
>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> - Ioi
>>>>



More information about the hotspot-runtime-dev mailing list