RFR(L): JDK-8032463 VirtualDispatch test timeout with DeoptimizeALot
Igor Veresov
igor.veresov at oracle.com
Fri May 9 02:37:02 UTC 2014
Thanks, Chris!
Alright, thanks for the hint: http://cr.openjdk.java.net/~iveresov/8032463/webrev.03/
igor
On May 8, 2014, at 7:25 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> src/share/vm/c1/c1_Runtime1.cpp
> + nmethod* nm = (nmethod*)caller_frame.cb();
> + assert(nm != NULL && nm->is_nmethod(), "Sanity check");
> There is CodeBlob::as_nmethod_or_null().
>
> The C1 code looks good.
>
> On May 8, 2014, at 3:28 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>
>> Looks good.
>>
>> Thanks,
>> Vladimir
>>
>> On 5/8/14 12:30 PM, Igor Veresov wrote:
>>> Alright, here is the update:
>>> http://cr.openjdk.java.net/~iveresov/8032463/webrev.02/
>>>
>>> The change is that I added a trap counter in the MDO that is used to
>>> scale up the number of stack walks that are necessary for the tenured
>>> method to revert back to the standard flushing policy. This required
>>> some changes in c1_Runtime.cpp and deoptimization.cpp to handle these
>>> deopts and allocate MDOs (in C1). The counter increment is guarded by
>>> the make_not_entrant() call that guarantees that the increment happens
>>> only once per method state change.
>>> Also fixed the comments in sweeper.cpp and indent in globals.hpp that I
>>> forgot to do the last time.
>>>
>>> Thanks!
>>> igor
>>>
>>>
>>> On May 7, 2014, at 2:27 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com
>>> <mailto:vladimir.kozlov at oracle.com>> wrote:
>>>
>>>> On 5/7/14 12:37 AM, Igor Veresov wrote:
>>>>> Vladimir,
>>>>>
>>>>> Thanks for the review!
>>>>> Please find the reply inline.
>>>>>
>>>>> igor
>>>>>
>>>>> On May 6, 2014, at 5:05 PM, Vladimir Kozlov
>>>>> <vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>> wrote:
>>>>>
>>>>>> I think we need to let SAP know about this change (counter code in
>>>>>> C2) since it may affect their tiered compilation. It use the same
>>>>>> Deoptimization::Reason_age.
>>>>>>
>>>>> Oh, I didn’t know that. I’ll introduce another constant.
>>>>>
>>>>>> Can you separate C1 trap_request newa code in a separate RFE and
>>>>>> push it first?
>>>>>>
>>>>>
>>>>> Do you think this is necessary? It so happens that the DeoptimizeStub
>>>>> was not used before this by any code. So the aging scheme is the only
>>>>> user of it so far.
>>>>
>>>> I am fine to keep it in this fix but I can't review it. Ask Christian
>>>> to look.
>>>>
>>>>>
>>>>>> Can you add the field to MDO and not to MethodCounters? We doing
>>>>>> this only for methods which were compiled already and have MDO.
>>>>>> MethodCounters small size is important.
>>>>>
>>>>> It was in MDO originally, unfortunately I have to make it work for
>>>>> pure C1 and we usually don’t allocate MDOs in that configuration.
>>>>> Although, we need to only allocate MDOs for methods with aging, a set
>>>>> of which may be would be smallish.. But MDOs are kind of big.. Would
>>>>> that be better?
>>>>
>>>> I forgot about pure C1.
>>>> Chris Plummer pointed me, for RTM changes, MethodCounters are
>>>> allocated for all executed methods which may not be compiled. About
>>>> 30% methods are executed.
>>>>
>>>>>
>>>>>> Why the change in methodData.hpp and nmethod.cpp? '||' should be
>>>>>> '&&' in nmethod::inc_decompile_count(), I think.
>>>>>
>>>>> You’re right, for pure C1 that won’t work anyways, because there are
>>>>> no MDOs. I’ll remove these for now. I probably need another decompile
>>>>> counter to put a cap on the number of recompiles, see the idea below.
>>>>>
>>>>>> In globals.hpp flags description indention is off.
>>>>>>
>>>>>
>>>>> Fixed.
>>>>
>>>> You moved it to opposite direction :) . Indention of all comments are
>>>> the same in the file. They are not aligned to a flag's declaration line.
>>>>
>>>>>
>>>>>> sweeper.cpp:
>>>>>>
>>>>>> // The stack-scanning low-cost detection may not see the method was
>>>>>> used (which can happen for
>>>>>>
>>>>>> // previous MinPassesBeforeFlush sweeps. Reset the counter. Stay in
>>>>>> the existing
>>>>
>>>> You did not change above comments as I suggested.
>>>>
>>>>>>
>>>>>> Why do you need to repeat? For small leaf nmethods (with no
>>>>>> safepoint, for example) it could trigger continues recompilation:
>>>>>>
>>>>>> + if (time_since_reset > MinPassesBeforeFlush * 2) {
>>>>>> + // It's been long enough, we still haven't seen it
>>>>>> on stack.
>>>>>> + // Try to flush it, but enable counters the next time.
>>>>>> + mc->reset_nmethod_age();
>>>>>>
>>>>>
>>>>> That’s a good question. Before this change we would’ve recompiled
>>>>> warm methods continuously (given some code cache pressure), so the
>>>>> number of such cases is probably small. The options here with the
>>>>> code aging are:
>>>>> - don’t ever again flush a method that we’ve seen deopt once;
>>>>> - allow the flusher machinery to get rid of it if it becomes idle again.
>>>>>
>>>>> Currently I decided to stick with the latter but also give the method
>>>>> twice the standard time initially to be noticed by the stack-walker.
>>>>> You are right that there is a problem for small methods. Technically
>>>>> there is always one safepoint in the epilog though, right? I’m good
>>>>> with either option. Vladimir, what are your thoughts on this? Albert,
>>>>> if you skimming through this, what do you think?
>>>>>
>>>>> Here is an idea... We could count the number of decompiles for this
>>>>> particular reason and use it to scale the number of stack-walks we
>>>>> have to do before we flush the method again. For example we can wait
>>>>> MinPassesBeforeFlush * number_of_aging_deopts() stack-walks.
>>>>> Essentially this makes it unflushable in the limit - a combination of
>>>>> the two options above.
>>>>> I’ll need to allocate an MDO or add another counter to methodCounters
>>>>> for that though, or pack a small value info method_age itself, at the
>>>>> expense of a slightly bigger code. And there is an MT problem, as
>>>>> usual, with counting the number of deopts.
>>>>
>>>> As we discussed the check will be scaled by number of recompilation.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>>>
>>>>> Here is the updated (with the changes addressed so far) webrev:
>>>>> http://cr.openjdk.java.net/~iveresov/8032463/webrev.01/
>>>>>
>>>>> Thanks,
>>>>> igor
>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 5/5/14 7:44 PM, Igor Veresov wrote:
>>>>>>> The current implementation of the code cache flusher tries to
>>>>>>> determine the working set of methods by doing periodic stack walks
>>>>>>> during safepoints. Methods that are seen on the stack are
>>>>>>> considered to be used by the program and are excluded from the set
>>>>>>> of flush candidates. Such sampling is a great zero-overhead
>>>>>>> approach to collect the working set of super-hot methods without
>>>>>>> instrumentation. However it doesn’t work that good for flat
>>>>>>> profiles, which the test in the bug report happens to expose
>>>>>>> (thousands of methods calls sequentially one after another in the
>>>>>>> loop). The solution I ended up implementing uses conservative
>>>>>>> on-demand instrumentation to obtain information about the method
>>>>>>> usage. The algorithms is as follows:
>>>>>>>
>>>>>>> 1. Initially a method is compiled as usual (no instrumentation, no
>>>>>>> overhead).
>>>>>>> 2. When the sampling (stack-walking) scheme fails to detect
>>>>>>> activity we flush the method (again, as usual). However before the
>>>>>>> flush we set the age counter (an int added to MethodCounters) to a
>>>>>>> value that instructs the compilers to instrument the code.
>>>>>>> 3. If we ever request to compile this method again the aging code
>>>>>>> is inserted, which decrements the counter.
>>>>>>> 4. The value of the counter is then used by the sweeper to
>>>>>>> determine is the method is in fact used.
>>>>>>> 5. If the counter reaches zero before the sweeper is able examine
>>>>>>> and reset the counter we deopt and recompile the method without the
>>>>>>> counters, basically switching it back to use the sampling scheme again.
>>>>>>>
>>>>>>> Webrev: http://cr.openjdk.java.net/~iveresov/8032463/webrev.00/
>>>>>>>
>>>>>>> Testing:
>>>>>>> - jprt,
>>>>>>> - aurora-perf No performance effects with the default code cache,
>>>>>>> since we have enough headroom and the flusher is not exercised that
>>>>>>> much. No statistically significant effects after warmup (for spec
>>>>>>> benchmarks) in stress mode (-XX:+StressCodeAging) either, that is
>>>>>>> if all methods are compiled initially with counters deopt and
>>>>>>> switch to sampling later after 100000 invocations.
>>>>>>>
>>>>>>> igor
>
-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/attachments/20140508/6e6562b1/attachment-0001.html>
More information about the ppc-aix-port-dev
mailing list