RFR(S) 8246019 PerfClassTraceTime slows down VM start-up
David Holmes
david.holmes at oracle.com
Wed Jun 17 09:42:10 UTC 2020
On 17/06/2020 1:54 pm, Yumin Qi wrote:
> product(bool, UsePerfData,
> true, \
> "Flag to disable jvmstat instrumentation for performance
> testing "\
> "and problem isolation
> purposes") \
>
>
> The flag default value set to true --- should we change that? If the
> flag set to false at default, performance can benefit from that. If
> users want to collect performance data, should explicitly turn it on.
See comments in JDK-8246020.
David
>
> Thanks
>
> Yumin
>
>
> On 6/16/20 8:19 PM, Ioi Lam wrote:
>>
>>
>> On 6/16/20 6:20 PM, David Holmes wrote:
>>> Hi Ioi,
>>>
>>> On 17/06/2020 6:14 am, Ioi Lam wrote:
>>>> https://bugs.openjdk.java.net/browse/JDK-8246019
>>>> http://cr.openjdk.java.net/~iklam/jdk16/8246019-avoid-PerfClassTraceTime.v01/
>>>>
>>>>
>>>> PerfClassTraceTime is (a rarely used feature) for measuring the time
>>>> spent during class linking and initialization.
>>>
>>> "A special command jcmd <process id/main class> PerfCounter.print
>>> prints all performance counters in the process."
>>>
>>> How do you know this is a "rarely used feature"?
>> Hi David,
>>
>> Sure, the counter will be dumped, but by "rarely used" -- I mean no
>> one will find this particular counter useful, and no one will be
>> actively looking at it.
>>
>> I changed two parts of the code -- class init and class linking.
>>
>> For class initialization, the counter may be useful for people who
>> want to know how much time is spent in their <clinit> functions, and
>> my patch doesn't change that. It only avoids using the counter when a
>> class has no <clinit>, i.e., we know that the counter counts nothing
>> (except for a logging statement).
>>
>> =====
>>
>> For class linking, no user code is executed, so it only measures VM
>> code. If it's useful for anyone, that would be VM engineers like me
>> who are trying to optimize the speed of class loading. However, due to
>> the overhead of the counter vs what it's trying to measure, the
>> results are pretty meaningless.
>>
>> Note that I've not disabled the counter altogether. Instead, I disable
>> it only when linking a CDS shared class, and we know that very little
>> is happening for this class (e.g., no verification).
>>
>> I think the class linking timer might have been useful 15 years ago
>> when it was introduced, or it might be useful today when CDS is
>> disabled. But with CDS enabled, we are paying a constant price that
>> seems to benefit no one.
>>
>> I think we should short-circuit it when it seems appropriate. If this
>> indeed causes problems for our users, it's easy to re-enable it.
>> That's better than just keeping this forever just because we're afraid
>> to touch anything.
>>
>>>
>>> I find it hard to evaluate whether this short-circuiting of the time
>>> tracing is reasonable or not. Obviously any monitoring mechanism
>>> should impose minimal overhead compared to what is being measured,
>>> and these timers fall short in that regard. But if these stats become
>>> meaningless then they may as well be removed.
>>>
>>> I think the serviceability folk (cc'd) need to evaluate this in the
>>> context of the M&M tools.
>>>
>>>> However, it's quite expensive and it needs to start and stop a bunch
>>>> of timers. With CDS, it's quite often for the overhead of the timer
>>>> itself to be much more than the time it's trying to measure, giving
>>>> unreliable measurement.
>>>>
>>>> In this patch, when it's clear that the init and linking will be
>>>> very quick, I disable the timer and count only the number of
>>>> invocations. This shows a small improvement in start-up
>>>
>>> I'm curious if you tried to forcing EagerInitialization to be true to
>>> see how that improves the baseline. I've always noticed eager_init in
>>> the code, but hadn't realized it is disabled by default.
>>>
>>
>> I think it cannot be done by default, as it will violate the JLS. A
>> class can be initialized only when it's touched by bytecodes.
>>
>> It can also backfire as we may load many classes without initializing
>> them. E.g., during bytecode verification, we load many classes and
>> just check that one is a supertype of another.
>>
>> Thanks
>> - Ioi
>>
>>> Cheers,
>>> David
>>> -----
>>>
>>>> Results of " perf stat -r 100 bin/java -Xshare:on
>>>> -XX:SharedArchiveFile=jdk2.jsa -Xint -version "
>>>>
>>>> 59623970 59341935 (-282035) ----- 41.774 41.591 ( -0.183) -
>>>> 59623495 59331646 (-291849) ----- 41.696 41.165 ( -0.531) --
>>>> 59627148 59329526 (-297622) ----- 41.249 41.094 ( -0.155) -
>>>> 59612439 59340760 (-271679) ---- 41.773 40.657 ( -1.116) -----
>>>> 59626438 59335681 (-290757) ----- 41.683 40.901 ( -0.782) ----
>>>> 59618436 59338953 (-279483) ----- 41.861 41.249 ( -0.612) ---
>>>> 59608782 59340173 (-268609) ---- 41.198 41.508 ( 0.310) +
>>>> 59614612 59325177 (-289435) ----- 41.397 41.738 ( 0.341) ++
>>>> 59615905 59344006 (-271899) ---- 41.921 40.969 ( -0.952) ----
>>>> 59635867 59333147 (-302720) ----- 41.491 40.836 ( -0.655) ---
>>>> ================================================
>>>> 59620708 59336100 (-284608) ----- 41.604 41.169 ( -0.434) --
>>>> instruction delta = -284608 -0.4774%
>>>> time delta = -0.434 ms -1.0435%
>>>>
>>>> The number of PerfClassTraceTime's used is reduced from 564 to 116
>>>> (so we have an overhead of about 715 instructions per use, yikes!).
>>
More information about the serviceability-dev
mailing list