Why does call_site_target keep changing for a Nashorn method?
Vladimir Ivanov
vladimir.x.ivanov at oracle.com
Wed Jan 30 02:19:29 UTC 2019
Thanks for experimenting with a fix!
> What's about this change? It can pass hotspot-tier1 tests.
> From my observation, my application sticks with C1 because it's MDO::_invocation_counter can't increment.
Can you elaborate a bit here? Is it because invocation counter decays
faster than C2 gets a change to compile it?
> diff -r d02f1f4ff3a6 src/hotspot/share/ci/ciEnv.cpp
> --- a/src/hotspot/share/ci/ciEnv.cpp Thu Jan 24 14:22:50 2019 -0800
> +++ b/src/hotspot/share/ci/ciEnv.cpp Thu Jan 24 23:35:52 2019 -0800
> @@ -939,6 +939,11 @@
> if (result != Dependencies::end_marker) {
> if (result == Dependencies::call_site_target_value) {
> _inc_decompile_count_on_failure = false;
> +
> + MethodData* mdo = target->get_Method()->method_data();
> + if (mdo != NULL) {
> + mdo->invocation_counter()->decay();
> + }
> record_failure("call site target change");
> } else if (Dependencies::is_klass_type(result)) {
> record_failure("concurrent class loading");
>
I believe MethodHandles::flush_dependent_nmethods() should do something
similar.
> I am not sure if I should update ciMethodData::_invocation_counter as well. There's no mutator function for it.
> It looks that nobody consumes ciMethodDate::_invocation_counter.
It should be updated by ciMethodData::load_data().
> I believe I can shake off the unstable dynamic functions in inliner. Do you think it can solve JDK-8147550?
> In Dependencies::check_call_site_target_value, we have the exact callsite oop.
> We can add a new node type called 'DynamicCallData" in MethodData.hpp. Each dynamic call node attaches a profiling node. It contains information like "call site target changes".
> Inliner uses the profiling data to determine inline or not.
I'm not too optimistic about solving JDK-8147550 in such vein. As
comment from John says:
"There is a serious design tension here, though: Some users apparently
are willing to endure an infinite series of recompilations as part of
the cost of doing business; JDK-7177745 addresses this need by turning
off the fail-safe against (accidental, buggy) infinite recompilation for
unstable CSs. Other users might find that having a percentage of machine
time devoted to recompilation is a problem."
Tweaking inlining will definitely make some users unhappy.
The only viable option I see is to reduce the overhead of excessive
recompilation by delaying recompilations and thus increasing warmup period.
For example, in tiered mode:
* tier 2-3 can conservatively avoid inlining through unstable CallSites;
* tier 4 always inline through CallSites, but use exponential (?)
backoff to guide recompilations (it's worth to consider lower & upper
limits for periods between recompilations);
With such design, execution always stays in tier2-3 (C1) in the worst
case, but there's always a chance left for the code to jump into tier4
once execution stabilizes.
> Btw, this patch in 2011 seems to abort inlining based on the profiling data of callsite. my idea is same.
> http://cr.openjdk.java.net/~twisti/7087838/src/share/vm/opto/callGenerator.cpp.udiff.html
Unfortunately, it doesn't fit some important use cases as stated in
JDK-7087838:
"The consensus among language runtime implementors is that they want
control over switch points (and thus call sites) and so it's their
responsibility to handle extensive invalidation of such."
Best regards,
Vladimir Ivanov
> On 1/18/19, 5:06 PM, "Vladimir Ivanov" <vladimir.x.ivanov at oracle.com> wrote:
>
>
> > Thank you for the response. After reading your email and associated RFEs, now I got the background story.
> > I understand the design decision in hotspot.
> >
> > In my case, compiler thread crowds out the app thread because we run application in docker with 1 CPU.
> > Is it good idea that we decay the invocation counts of the methods if they fail due to 'call_site_target value change?'
>
> Yes, sounds reasonable. I believe compilation bailed out due to
> invalidated call_site_target dependency should be treated as if it were
> a deoptimization with Action_reinterpret, but resetting invocation
> counts may be too much. So, decaying counters instead sounds reasonable.
>
> Also, it's hard to tell what method to act on: problematic CallSite may
> be located somewhere deep in inline tree, but only root method is known.
>
> Best regards,
> Vladimir Ivanov
>
> > On 1/17/19, 2:36 PM, "Vladimir Ivanov" <vladimir.x.ivanov at oracle.com> wrote:
> >
> > C1/C2 optimistically inline through CallSite instances even if those are
> > mutable (MutableCallSite/VolatileCallSite). It requires a nmethod
> > dependency and once CallSite target changes, all dependent nmethods
> > should be invalidated. If such change happens during compilation,
> > nmethod installation fails.
> >
> > That's exactly what you observe: the dependency is recorded during
> > inlining, but failed verification during installation.
> >
> > Regarding the observed behavior, it is well-known [1] [2] and was a
> > deliberate choice. As JDK-7087838 [1] states:
> >
> > "The consensus among language runtime implementors is that they want
> > control over switch points (and thus call sites) and so it's their
> > responsibility to handle extensive invalidation of such."
> >
> > So, such pathological behavior is treated as a bug in user code (Nashorn
> > in this particular case).
> >
> > There's an RFE filed [3] to consider alternative options for unstable
> > calls.
> >
> > Best regards,
> > Vladimir Ivanov
> >
> > [1] https://bugs.openjdk.java.net/browse/JDK-7087838
> > [2] https://bugs.openjdk.java.net/browse/JDK-7177745
> > [3] https://bugs.openjdk.java.net/browse/JDK-8147550
> >
> > On 16/01/2019 14:04, Liu, Xin wrote:
> > > In one of our applications, C1/C2 keeps compiling a Javascript method
> > > generated by Nashorn but the code fails a dependency check right before
> > > installing in the code cache. This is with JDK tip.
> > >
> > > It can’t pass ‘Dependencies::check_call_site_target_value’.
> > >
> > > [C2 Parsing]
> > >
> > > <bc code='182' bci='1'/>
> > >
> > > <dependency type='unique_concrete_method' ctxk='1131' x='1141'/>
> > >
> > > <call method='1141' count='878838' prof_factor='0.024122' inline='1'/>
> > >
> > > <inline_success reason='accessor'/>
> > >
> > > <parse method='1141' uses='21249.000000' stamp='1112.538'>
> > >
> > > <bc code='180' bci='1'/>
> > >
> > > <unknown id='1556'/>
> > >
> > > <unknown id='1866'/>
> > >
> > > <dependency type='call_site_target_value' x0='1556' x='1866'/>
> > >
> > > <parse_done nodes='41636' live='12226' memory='12130984' stamp='1112.538'/>
> > >
> > > </parse>
> > >
> > > [Validating compilation dependencies]
> > >
> > > <dependency type='call_site_target_value' x0='1132' x='1143'/>
> > >
> > > <dependency type='call_site_target_value' x0='1334' x='1337'/>
> > >
> > > <dependency type='call_site_target_value' x0='1424' x='1425'/>
> > >
> > > <dependency type='call_site_target_value' x0='1437' x='1438'/>
> > >
> > > <dependency type='call_site_target_value' x0='1454' x='1455'/>
> > >
> > > <dependency type='call_site_target_value' x0='1465' x='1466'/>
> > >
> > > <dependency type='call_site_target_value' x0='1482' x='1483'/>
> > >
> > > <dependency type='call_site_target_value' x0='1498' x='1499'/>
> > >
> > > <dependency type='call_site_target_value' x0='1509' x='1510'/>
> > >
> > > <dependency type='call_site_target_value' x0='1526' x='1576'/>
> > >
> > > <dependency type='call_site_target_value' x0='1528' x='1667'/>
> > >
> > > <dependency type='call_site_target_value' x0='1536' x='1692'/>
> > >
> > > <dependency type='call_site_target_value' x0='1537' x='1707'/>
> > >
> > > <dependency type='call_site_target_value' x0='1538' x='1730'/>
> > >
> > > <dependency type='call_site_target_value' x0='1539' x='1746'/>
> > >
> > > <dependency type='call_site_target_value' x0='1540' x='1787'/>
> > >
> > > <dependency type='call_site_target_value' x0='1550' x='1804'/>
> > >
> > > <dependency type='call_site_target_value' x0='1553' x='1820'/>
> > >
> > > <dependency type='call_site_target_value' x0='1554' x='1836'/>
> > >
> > > <dependency type='call_site_target_value' x0='1555' x='1849'/>
> > >
> > > <dependency type='call_site_target_value' x0='1556' x='1866'/>
> > >
> > > <dependency_failed type='call_site_target_value' x0='1556' x='1866'
> > > witness='jdk/nashorn/internal/runtime/linker/LinkerCallSite'
> > > stamp='1113.578'/>
> > >
> > > It’s related to the GWT methodHandle. The 2 mismatched methodhandles
> > > are very similar except for argL3, which is an int[2].
> > >
> > > Even though arg0-2 are not identical objects, their contents are same.
> > >
> > > (gdb)call java_lang_invoke_CallSite::target(call_site)->print()
> > >
> > > java.lang.invoke.BoundMethodHandle$Species_LLLL
> > >
> > > {0x00000000f586ca98}-
> > > klass:'java/lang/invoke/BoundMethodHandle$Species_LLLL'
> > >
> > > - ---- fields(total size 6 words):
> > >
> > > -'customizationCount''B'@12 0
> > >
> > > - private final'type''Ljava/lang/invoke/MethodType;'@16
> > > a'java/lang/invoke/MethodType'{0x00000000e21e2878}=(Ljava/lang/Object;Ljdk/nashorn/internal/runtime/Undefined;Ljava/lang/Object;)Ljava/lang/Object;(e21e2878)
> > >
> > > - final'form''Ljava/lang/invoke/LambdaForm;'@20
> > > a'java/lang/invoke/LambdaForm'{0x00000000e1e4a670}=>a'java/lang/invoke/MemberName'{0x00000000e1e4a938}={method}{0x00007fffa512cb68}'guard''(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;'in'java/lang/invoke/LambdaForm$MH'(e1e4a670)
> > >
> > > -'asTypeCache''Ljava/lang/invoke/MethodHandle;'@24 NULL(0)
> > >
> > > - final'argL0''Ljava/lang/Object;'@28
> > > a'java/lang/invoke/BoundMethodHandle$Species_LL'{0x00000000f586c9e8}(f586c9e8)
> > >
> > > - final'argL1''Ljava/lang/Object;'@32
> > > a'java/lang/invoke/MethodHandleImpl$CountingWrapper'{0x00000000f586ca28}(f586ca28)
> > >
> > > - final'argL2''Ljava/lang/Object;'@36
> > > a'java/lang/invoke/MethodHandleImpl$CountingWrapper'{0x00000000f586ca60}(f586ca60)
> > >
> > > - final'argL3''Ljava/lang/Object;'@40 [I{0x00000000f586ca10}(f586ca10)
> > >
> > > (gdb)call method_handle->print()
> > >
> > > java.lang.invoke.BoundMethodHandle$Species_LLLL
> > >
> > > {0x00000000f6b18500}-
> > > klass:'java/lang/invoke/BoundMethodHandle$Species_LLLL'
> > >
> > > - ---- fields(total size 6 words):
> > >
> > > -'customizationCount''B'@12 0
> > >
> > > - private final'type''Ljava/lang/invoke/MethodType;'@16
> > > a'java/lang/invoke/MethodType'{0x00000000e21e2878}=(Ljava/lang/Object;Ljdk/nashorn/internal/runtime/Undefined;Ljava/lang/Object;)Ljava/lang/Object;(e21e2878)
> > >
> > > - final'form''Ljava/lang/invoke/LambdaForm;'@20
> > > a'java/lang/invoke/LambdaForm'{0x00000000e1e4a670}=>a'java/lang/invoke/MemberName'{0x00000000e1e4a938}={method}{0x00007fffa512cb68}'guard''(Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;'in'java/lang/invoke/LambdaForm$MH'(e1e4a670)
> > >
> > > -'asTypeCache''Ljava/lang/invoke/MethodHandle;'@24 NULL(0)
> > >
> > > - final'argL0''Ljava/lang/Object;'@28
> > > a'java/lang/invoke/BoundMethodHandle$Species_LL'{0x00000000f6b18450}(f6b18450)
> > >
> > > - final'argL1''Ljava/lang/Object;'@32
> > > a'java/lang/invoke/MethodHandleImpl$CountingWrapper'{0x00000000f6b18490}(f6b18490)
> > >
> > > - final'argL2''Ljava/lang/Object;'@36
> > > a'java/lang/invoke/MethodHandleImpl$CountingWrapper'{0x00000000f6b184c8}(f6b184c8)
> > >
> > > - final'argL3''Ljava/lang/Object;'@40 [I{0x00000000f6b18478}(f6b18478)
> > >
> > > My guess is argL3 is counters in Java.lang.invoke.MethodHandleImpl.
> > >
> > > // Intrinsified by C2. Counters are used during parsing to calculate
> > > branch frequencies.
> > > @LambdaForm.Hidden
> > > @jdk.internal.HotSpotIntrinsicCandidate
> > > static
> > > boolean profileBoolean(boolean result, int[] counters) {
> > > // Profile is int[2] where [0] and [1] correspond to false and true
> > > occurrences respectively.
> > > int idx = result ? 1 : 0;
> > > try {
> > > counters[idx] = Math./addExact/(counters[idx], 1);
> > > } catch (ArithmeticException e) {
> > > // Avoid continuous overflow by halving the problematic count.
> > > counters[idx] = counters[idx] / 2;
> > > }
> > > return result;
> > > }
> > >
> > > I am still struggling to understand the source code in
> > > java.lang.invoke.*. Could anybody enlighten me why the target of the
> > > callsite changes every time here? it is relative to this profiling thing?
> > >
> > > In validation log, it has validated the dep “dependency
> > > type='call_site_target_value' x0='1556' x='1866'” above. Why it can’t
> > > pass it after then? My guess is one MH object has been changed by
> > > another Java thread.
> > >
> > > One interesting fact that compiler thread can’t pass 22^th dep. My
> > > tuition is it goes over an unknown threshold.
> > >
> > > The 2nd question is about ciEnv:: validate_compile_task_dependencies.
> > > Why does failure of call_site_target_value_changed not count as a deopt?
> > >
> > > The flag _inc_decompile_count_on_failure =false stops MDO to mark this
> > > method ‘not_compileable’. C2 doesn’t set the flag, so C2 ends up
> > > compiling it over and over, which makes C2 a cpu hog. Here’s the code in
> > > validate_compile_task_dependencies
> > >
> > > bool counter_changed = system_dictionary_modification_counter_changed();
> > >
> > > Dependencies::DepType result =
> > > dependencies()->validate_dependencies(_task, counter_changed);
> > >
> > > if (result != Dependencies::end_marker) {
> > >
> > > if (result == Dependencies::call_site_target_value) {
> > >
> > > _inc_decompile_count_on_failure = false;
> > >
> > > record_failure("call site target change");
> > >
> > > Maybe the right thing to do is to count this as a deopt and change the
> > > deopt limit computation to take into account the size of the method in
> > > nodes, just as done for abandoning compilation if the graph is too big.
> > >
> > > Thanks,
> > >
> > > --lx
> > >
> >
> >
>
>
More information about the hotspot-compiler-dev
mailing list