RFR JDK-8232222: Set state to 'linked' when an archived class is restored at runtime

David Holmes david.holmes at oracle.com
Mon Jun 1 23:22:05 UTC 2020


Hi Ioi,

On 2/06/2020 8:56 am, Ioi Lam wrote:
> 
> 
> On 5/31/20 11:14 PM, David Holmes wrote:
>> Hi Jiangli,
>>
>> On 29/05/2020 9:02 am, Jiangli Zhou wrote:
>>> (Looping in serviceability-dev at openjdk.java.net ...)
>>>
>>> Hi David and Ioi,
>>>
>>> On Wed, May 27, 2020 at 11:15 PM David Holmes 
>>> <david.holmes at oracle.com> wrote:
>>>>
>>>> Hi Jiangli,
>>>>
>>>> On 28/05/2020 11:35 am, Ioi Lam wrote:
>>>>>
>>>>>
>>>>> On 5/27/20 6:17 PM, Jiangli Zhou wrote:
>>>>>> On Wed, May 27, 2020 at 1:56 PM Ioi Lam <ioi.lam at oracle.com> wrote:
>>>>>>> On 5/26/20 6:21 PM, Jiangli Zhou wrote:
>>>>>>>
>>>>>>>> Focusing on the link state for archived classes in this thread, I
>>>>>>>> updated the webrev to only set archived boot classes to 'linked' 
>>>>>>>> state
>>>>>>>> at restore time. More investigations can be done for archived 
>>>>>>>> classes
>>>>>>>> for other builtin loaders.
>>>>>>>>
>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8232222
>>>>>>>> http://cr.openjdk.java.net/~jiangli/8232222/webrev.02/
>>>>>>>>
>>>>>>>> Please let me know if there is any additional concerns to the 
>>>>>>>> change.
>>>>>>>>
>>>>>>>> Best regards,
>>>>>>>> Jiangli
>>>>>>>>
>>>>>>> Hi Jiangli,
>>>>>>>
>>>>>>> I think the change is fine. I am wondering if this
>>>>>>>
>>>>>>> 2530   if (!BytecodeVerificationLocal &&
>>>>>>> 2531 loader_data->is_the_null_class_loader_data()) {
>>>>>>> 2532     _init_state = linked;
>>>>>>> 2533   }
>>>>>>>
>>>>>>>
>>>>>>> can be changed to
>>>>>>>
>>>>>>>           if (!BytecodeVerificationLocal &&
>>>>>>> loader_data->is_the_null_class_loader_data() &&
>>>>>>>               !JvmtiExport::should_post_class_prepare())
>>>>>>>
>>>>>>> That way, there's no need to change systemDictionary.cpp.
>>>>>>>
>>>>>>>
>>>>>> I was going to take the suggestion, but realized that it would add
>>>>>> unnecessary complications for archived boot classes with class
>>>>>> pre-initialization support. Some agents may set
>>>>>> JvmtiExport::should_post_class_prepare(). It's worthwhile to support
>>>>>> class pre-init uniformly for archived boot classes with
>>>>>> JvmtiExport::should_post_class_prepare() enabled or disabled.
>>>>>
>>>>> This would introduce behavioral changes when JVMTI is enabled:
>>>>>
>>>>> + The order of JvmtiExport::post_class_prepare is different than 
>>>>> before
>>>>> + JvmtiExport::post_class_prepare may be called for a class that 
>>>>> was not
>>>>> called before (if the class is never linked during run time)
>>>>> + JvmtiExport::post_class_prepare was called inside the init_lock, now
>>>>> it's called outside of the init_lock
>>>>
>>>> I have to say I share Ioi's concerns here. This change will impact JVM
>>>> TI agents in a way we can't be sure of. From a specification 
>>>> perspective
>>>> I think we are fine as linking can be lazy or eager, so there's no
>>>> implied order either. But this would be a behavioural change that will
>>>> be observable by agents. (I'm less concerned about the init_lock
>>>> situation as it seems potentially buggy to me to call out to an agent
>>>> with the init_lock held in the first place! I find it hard to 
>>>> imagine an
>>>> agent only working correctly if the init_lock is held.)
>>>>
>>>
>>> Totally agree that we need to be very careful here (that's also part
>>> of the reason why I separated this into an individual RFE for the
>>> dedicated discussion). David, thanks for the analysis from the spec
>>> perspective! Agreed with the init_lock comment also. In the future, I
>>> think we can even get rid of the needs for init_lock completely for
>>> some of the pre-initialized classes.
>>>
>>> This change has gone through extensive testing since the later part of
>>> last year and has been in use (with the default CDS) with agents that
>>> do post_class_prepare. Hopefully that would ease some of the concerns.
>>
>> That is good to know, but that is just one sample of a set of agents.
>>
>>>> This would need a CSR request and involvement of the serviceabilty 
>>>> folk,
>>>> to work through any potential issues.
>>>>
>>>
>>> I've looped in serviceability-dev at openjdk.java.net for this
>>> discussion. Chris or Serguei could you please take a look of the
>>> change, http://cr.openjdk.java.net/~jiangli/8232222/webrev.02/,
>>> specifically the JvmtiExport::post_class_prepare change in
>>> systemDictionary.cpp.
>>>
>>> Filing a CSR request sounds good to me. The CSR looks after source,
>>> binary, and behavioral compatibility. From a behavior point of view,
>>> the change most likely does not cause any visible effects to a JVMTI
>>> agent (based on what's observed in testing and usages). What should be
>>> included in the CSR?
>>
>> The CSR request should explain the behavioural change that will be 
>> observable by agents, and all of the potential compatibility issues 
>> that might arise from that - pointing out of course that as the spec 
>> (JVMS 5.4**) allows for eager or lazy linking, agents shouldn't be 
>> relying on the exact timing or order of events.
>>
>> ** I note this section has some additional constraints regarding 
>> dynamically computed constants that might also come into play with 
>> this pre-linking for CDS classes.
>>
> I think the CSR should also include the benefit of doing this. It's not 
> a lot of code change, but now we have to maintain two different code 
> paths for post_class_prepare to be called.

The CSR is concerned only with the compatibility aspects of a change. 
The cost:benefit ratio is an engineering decision that should be 
discussed here in the RFR.

David
-----

> JVMTI agents will typically introduce quite a bit of overhead in 
> start-up, so a reduction in the range of 0.2~0.4ms seems a drop to the 
> bucket. I'd rather keep the VM simple unless we have a strong reason to 
> make it more complicated.
> 
> Thanks
> - Ioi
> 
>> Cheers,
>> David
>> -----
>>
>>>> Ioi's suggestion avoids this problem, but, as you note, at the expense
>>>> of disabling this optimisation if an agent is attached and wants class
>>>> prepare events.
>>>>
>>>
>>> Right, if we handle that case conditionally, we would alway need to
>>> store the cached static field values separately since the dump time
>>> cannot foresee if the runtime can set boot classes in 'linked' state
>>> (and 'fully_initialized' state with the planned changes) at restore
>>> time. As a result, we need to handle all pre-initialized static fields
>>> like what we are doing today, which is storing them in the archived
>>> class_info_records then installing them to the related fields at
>>> runtime. That causes both unwanted memory and CPU overhead at runtime.
>>>
>>> I also updated the webrev.02 in place with typo fixes. Thanks!
>>>
>>> Best regards,
>>> Jiangli
>>>
>>>> Thanks,
>>>> David
>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>>>
>>>>>>> BTW, I was wondering where the performance came from, so I wrote an
>>>>>>> investigative patch:
>>>>>>>
>>>>>>> diff -r 0702191777c9 src/hotspot/share/oops/instanceKlass.cpp
>>>>>>> --- a/src/hotspot/share/oops/instanceKlass.cpp    Thu May 21 
>>>>>>> 15:56:27
>>>>>>> 2020 -0700
>>>>>>> +++ b/src/hotspot/share/oops/instanceKlass.cpp    Wed May 27 
>>>>>>> 10:48:57
>>>>>>> 2020 -0700
>>>>>>> @@ -866,6 +866,13 @@
>>>>>>>         return true;
>>>>>>>       }
>>>>>>>
>>>>>>> +  if (UseSharedSpaces && !BytecodeVerificationLocal &&
>>>>>>> is_shared_boot_class()) {
>>>>>>> +    Handle h_init_lock(THREAD, init_lock());
>>>>>>> +    ObjectLocker ol(h_init_lock, THREAD, h_init_lock() != NULL);
>>>>>>> +    set_init_state(linked);
>>>>>>> +    return true;
>>>>>>> +  }
>>>>>>> +
>>>>>>>       // trace only the link time for this klass that includes
>>>>>>>       // the verification time
>>>>>>>       PerfClassTraceTime 
>>>>>>> vmtimer(ClassLoader::perf_class_link_time(),
>>>>>>>
>>>>>>>
>>>>>>> Benchmarking results (smaller numbers are better):
>>>>>>>
>>>>>>> (baseline vs your patch)
>>>>>>>
>>>>>>>              baseline jiangli                           baseline
>>>>>>> jiangli
>>>>>>>       1:     58514375    57755638 (-758737) -----     40.266
>>>>>>> 40.135 (
>>>>>>> -0.131)      -
>>>>>>>       2:     58506426    57754623 (-751803) -----     40.367
>>>>>>> 39.417 (
>>>>>>> -0.950)      -----
>>>>>>>       3:     58498554    57759735 (-738819) -----     40.513
>>>>>>> 39.970 (
>>>>>>> -0.543)      ---
>>>>>>>       4:     58491265    57751296 (-739969) -----     40.439
>>>>>>> 40.268 (
>>>>>>> -0.171)      -
>>>>>>>       5:     58500588    57750975 (-749613) -----     40.569
>>>>>>> 40.080 (
>>>>>>> -0.489)      --
>>>>>>>       6:     58497015    57744418 (-752597) -----     41.097
>>>>>>> 40.147 (
>>>>>>> -0.950)      -----
>>>>>>>       7:     58494335    57749909 (-744426) -----     39.983 40.214
>>>>>>> (  0.231)     +
>>>>>>>       8:     58500401    57750305 (-750096) -----     40.235 40.417
>>>>>>> (  0.182)     +
>>>>>>>       9:     58490728    57767463 (-723265) -----     40.354
>>>>>>> 39.928 (
>>>>>>> -0.426)      --
>>>>>>>      10:     58497858    57746557 (-751301) -----     40.756
>>>>>>> 39.706 (
>>>>>>> -1.050)      -----
>>>>>>> ============================================================
>>>>>>>              58499154    57753091 (-746062) -----     40.457
>>>>>>> 40.027 (
>>>>>>> -0.430)      --
>>>>>>> instr delta =      -746062    -1.2753%
>>>>>>> time  delta =       -0.430 ms -1.0619%
>>>>>>>
>>>>>>>
>>>>>>> (baseline vs my patch)
>>>>>>>
>>>>>>>              baseline    ioi baseline  ioi
>>>>>>>       1:     58503574    57821124 (-682450)      ----- 40.554 
>>>>>>> 39.783 (
>>>>>>> -0.771)      -----
>>>>>>>       2:     58499325    57819459 (-679866) -----     40.092 40.325
>>>>>>> (  0.233)    ++
>>>>>>>       3:     58492362    57811978 (-680384) -----     40.546
>>>>>>> 39.826 (
>>>>>>> -0.720)      -----
>>>>>>>       4:     58488655    57828878 (-659777) -----     40.270 40.550
>>>>>>> (  0.280)    ++
>>>>>>>       5:     58501567    57830179 (-671388) -----     40.382
>>>>>>> 40.145 (
>>>>>>> -0.237)      --
>>>>>>>       6:     58496552    57808774 (-687778) -----     40.702
>>>>>>> 40.527 (
>>>>>>> -0.175)      -
>>>>>>>       7:     58482701    57808925 (-673776) -----     40.268
>>>>>>> 39.849 (
>>>>>>> -0.419)      ---
>>>>>>>       8:     58493831    57807810 (-686021) -----     40.396
>>>>>>> 39.940 (
>>>>>>> -0.456)      ---
>>>>>>>       9:     58489388    57811354 (-678034) -----     40.575
>>>>>>> 40.078 (
>>>>>>> -0.497)      ---
>>>>>>>      10:     58482512    57795489 (-687023) -----     40.084 40.247
>>>>>>> (  0.163)     +
>>>>>>> ============================================================
>>>>>>>              58493046    57814396 (-678650) -----     40.386
>>>>>>> 40.126 (
>>>>>>> -0.260)      --
>>>>>>> instr delta =      -678650    -1.1602%
>>>>>>> time  delta =       -0.260 ms -0.6445%
>>>>>>>
>>>>>>>
>>>>>>> (your patch vs my patch)
>>>>>>>
>>>>>>>              jiangli ioi                              jiangli ioi
>>>>>>>       1:     57716711    57782622 ( 65911) ++++          41.042 
>>>>>>> 40.302 (
>>>>>>> -0.740)      -----
>>>>>>>       2:     57709666    57780196 ( 70530) ++++          40.334 
>>>>>>> 40.965 (
>>>>>>> 0.631)  ++++
>>>>>>>       3:     57716074    57803315 ( 87241) +++++          40.239 
>>>>>>> 39.823 (
>>>>>>> -0.416)      ---
>>>>>>>       4:     57725152    57782719 ( 57567) +++          40.430 
>>>>>>> 39.805 (
>>>>>>> -0.625)      ----
>>>>>>>       5:     57719799    57787187 ( 67388) ++++          40.138 
>>>>>>> 40.003 (
>>>>>>> -0.135)      -
>>>>>>>       6:     57721922    57769193 ( 47271) +++          40.324 
>>>>>>> 40.207 (
>>>>>>> -0.117)      -
>>>>>>>       7:     57716438    57785212 ( 68774) ++++          39.978 
>>>>>>> 40.149 (
>>>>>>> 0.171)     +
>>>>>>>       8:     57713834    57778797 ( 64963) ++++          40.359 
>>>>>>> 40.210 (
>>>>>>> -0.149)      -
>>>>>>>       9:     57711272    57786376 ( 75104) ++++          40.575 
>>>>>>> 40.724 (
>>>>>>> 0.149)     +
>>>>>>>      10:     57711660    57780548 ( 68888) ++++          40.291 
>>>>>>> 40.091 (
>>>>>>> -0.200)      -
>>>>>>> ============================================================
>>>>>>>              57716252    57783615 ( 67363) ++++          40.370 
>>>>>>> 40.226 (
>>>>>>> -0.144)      -
>>>>>>> instr delta =        67363     0.1167%
>>>>>>> time  delta =       -0.144 ms -0.3560%
>>>>>>>
>>>>>>>
>>>>>>> These numbers show that the majority of the time spent (678650
>>>>>>> instructions) inside InstanceKlass::link_class_impl is spent from 
>>>>>>> the
>>>>>>> PerfClassTraceTime. Walking of the class hierarchy and taking the
>>>>>>> h_init_lock only takes about 67363 instructions).
>>>>>>>
>>>>>>> Due to this finding, I filed two more RFEs:
>>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8246019
>>>>>>> PerfClassTraceTime slows down VM start-up
>>>>>>>
>>>>>> It's related to JDK-8246020, and I've commented on the bug (see
>>>>>> JDK-8246020 comments). UsePerfData for perf data collection is common
>>>>>> in cloud usages. It's better to keep UsePerfData enabled by default.
>>>>>>
>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8246015
>>>>>>> Method::link_method is called twice for CDS methods
>>>>>>
>>>>>> That was addressed as part of the initial change for JDK-8232222:
>>>>>> http://cr.openjdk.java.net/~jiangli/8232222/weberv.02/src/hotspot/share/oops/instanceKlass.cpp.frames.html 
>>>>>>
>>>>>>
>>>>>>
>>>>>> It's cleaner to handle it separately, so I removed it from the latest
>>>>>> version. I've assigned JDK-8246015 to myself and will address it
>>>>>> separately. Thanks for recording the separate bug.
>>>>>>
>>>>>> Thanks!
>>>>>> Jiangli
>>>>>>
>>>>>>>
>>>>>>> Thanks
>>>>>>> - Ioi
>>>>>
> 


More information about the hotspot-runtime-dev mailing list