RFR(L): JDK-8032463 VirtualDispatch test timeout with DeoptimizeALot

Fri May 9 02:43:05 UTC 2014

Good.

On May 8, 2014, at 4:37 PM, Igor Veresov <igor.veresov at oracle.com> wrote:

> Thanks, Chris!
> 
> Alright, thanks for the hint: http://cr.openjdk.java.net/~iveresov/8032463/webrev.03/
> 
> igor
> 
> On May 8, 2014, at 7:25 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:
> 
>> src/share/vm/c1/c1_Runtime1.cpp
>> +   nmethod* nm = (nmethod*)caller_frame.cb();
>> +   assert(nm != NULL && nm->is_nmethod(), "Sanity check");
>> There is CodeBlob::as_nmethod_or_null().
>> 
>> The C1 code looks good.
>> 
>> On May 8, 2014, at 3:28 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>> 
>>> Looks good.
>>> 
>>> Thanks,
>>> Vladimir
>>> 
>>> On 5/8/14 12:30 PM, Igor Veresov wrote:
>>>> Alright, here is the update:
>>>> http://cr.openjdk.java.net/~iveresov/8032463/webrev.02/
>>>> 
>>>> The change is that I added a trap counter in the MDO that is used to
>>>> scale up the number of stack walks that are necessary for the tenured
>>>> method to revert back to the standard flushing policy. This required
>>>> some changes in c1_Runtime.cpp and deoptimization.cpp to handle these
>>>> deopts and allocate MDOs (in C1). The counter increment is guarded by
>>>> the make_not_entrant() call that guarantees that the increment happens
>>>> only once per method state change.
>>>> Also fixed the comments in sweeper.cpp and indent in globals.hpp that I
>>>> forgot to do the last time.
>>>> 
>>>> Thanks!
>>>> igor
>>>> 
>>>> 
>>>> On May 7, 2014, at 2:27 PM, Vladimir Kozlov <vladimir.kozlov at oracle.com
>>>> <mailto:vladimir.kozlov at oracle.com>> wrote:
>>>> 
>>>>> On 5/7/14 12:37 AM, Igor Veresov wrote:
>>>>>> Vladimir,
>>>>>> 
>>>>>> Thanks for the review!
>>>>>> Please find the reply inline.
>>>>>> 
>>>>>> igor
>>>>>> 
>>>>>> On May 6, 2014, at 5:05 PM, Vladimir Kozlov
>>>>>> <vladimir.kozlov at oracle.com <mailto:vladimir.kozlov at oracle.com>> wrote:
>>>>>> 
>>>>>>> I think we need to let SAP know about this change (counter code in
>>>>>>> C2) since it may affect their tiered compilation. It use the same
>>>>>>> Deoptimization::Reason_age.
>>>>>>> 
>>>>>> Oh, I didn’t know that. I’ll introduce another constant.
>>>>>> 
>>>>>>> Can you separate C1 trap_request newa code in a separate RFE and
>>>>>>> push it first?
>>>>>>> 
>>>>>> 
>>>>>> Do you think this is necessary? It so happens that the DeoptimizeStub
>>>>>> was not used before this by any code. So the aging scheme is the only
>>>>>> user of it so far.
>>>>> 
>>>>> I am fine to keep it in this fix but I can't review it. Ask Christian
>>>>> to look.
>>>>> 
>>>>>> 
>>>>>>> Can you add the field to MDO and not to MethodCounters? We doing
>>>>>>> this only for methods which were compiled already and have MDO.
>>>>>>> MethodCounters small size is important.
>>>>>> 
>>>>>> It was in MDO originally, unfortunately I have to make it work for
>>>>>> pure C1 and we usually don’t allocate MDOs in that configuration.
>>>>>> Although, we need to only allocate MDOs for methods with aging, a set
>>>>>> of which may be would be smallish..  But MDOs are kind of big.. Would
>>>>>> that be better?
>>>>> 
>>>>> I forgot about pure C1.
>>>>> Chris Plummer pointed me, for RTM changes, MethodCounters are
>>>>> allocated for all executed methods which may not be compiled. About
>>>>> 30% methods are executed.
>>>>> 
>>>>>> 
>>>>>>> Why the change in methodData.hpp and nmethod.cpp? '||' should be
>>>>>>> '&&' in nmethod::inc_decompile_count(), I think.
>>>>>> 
>>>>>> You’re right, for pure C1 that won’t work anyways, because there are
>>>>>> no MDOs. I’ll remove these for now. I probably need another decompile
>>>>>> counter to put a cap on the number of recompiles, see the idea below.
>>>>>> 
>>>>>>> In globals.hpp flags description indention is off.
>>>>>>> 
>>>>>> 
>>>>>> Fixed.
>>>>> 
>>>>> You moved it to opposite direction :) . Indention of all comments are
>>>>> the same in the file. They are not aligned to a flag's declaration line.
>>>>> 
>>>>>> 
>>>>>>> sweeper.cpp:
>>>>>>> 
>>>>>>> // The stack-scanning low-cost detection may not see the method was
>>>>>>> used (which can happen for
>>>>>>> 
>>>>>>> // previous MinPassesBeforeFlush sweeps. Reset the counter. Stay in
>>>>>>> the existing
>>>>> 
>>>>> You did not change above comments as I suggested.
>>>>> 
>>>>>>> 
>>>>>>> Why do you need to repeat? For small leaf nmethods (with no
>>>>>>> safepoint, for example) it could trigger continues recompilation:
>>>>>>> 
>>>>>>> +             if (time_since_reset > MinPassesBeforeFlush * 2) {
>>>>>>> +               // It's been long enough, we still haven't seen it
>>>>>>> on stack.
>>>>>>> +               // Try to flush it, but enable counters the next time.
>>>>>>> +               mc->reset_nmethod_age();
>>>>>>> 
>>>>>> 
>>>>>> That’s a good question. Before this change we would’ve recompiled
>>>>>> warm methods continuously (given some code cache pressure), so the
>>>>>> number of such cases is probably small. The options here with the
>>>>>> code aging are:
>>>>>> - don’t ever again flush a method that we’ve seen deopt once;
>>>>>> - allow the flusher machinery to get rid of it if it becomes idle again.
>>>>>> 
>>>>>> Currently I decided to stick with the latter but also give the method
>>>>>> twice the standard time initially to be noticed by the stack-walker.
>>>>>> You are right that there is a problem for small methods. Technically
>>>>>> there is always one safepoint in the epilog though, right? I’m good
>>>>>> with either option. Vladimir, what are your thoughts on this? Albert,
>>>>>> if you skimming through this, what do you think?
>>>>>> 
>>>>>> Here is an idea... We could count the number of decompiles for this
>>>>>> particular reason and use it to scale the number of stack-walks we
>>>>>> have to do before we flush the method again. For example we can wait
>>>>>> MinPassesBeforeFlush * number_of_aging_deopts() stack-walks.
>>>>>> Essentially this makes it unflushable in the limit - a combination of
>>>>>> the two options above.
>>>>>> I’ll need to allocate an MDO or add another counter to methodCounters
>>>>>> for that though, or pack a small value info method_age itself, at the
>>>>>> expense of a slightly bigger code. And there is an MT problem, as
>>>>>> usual, with counting the number of deopts.
>>>>> 
>>>>> As we discussed the check will be scaled by number of recompilation.
>>>>> 
>>>>> Thanks,
>>>>> Vladimir
>>>>> 
>>>>>> 
>>>>>> Here is the updated (with the changes addressed so far) webrev:
>>>>>> http://cr.openjdk.java.net/~iveresov/8032463/webrev.01/
>>>>>> 
>>>>>> Thanks,
>>>>>> igor
>>>>>> 
>>>>>>> Thanks,
>>>>>>> Vladimir
>>>>>>> 
>>>>>>> On 5/5/14 7:44 PM, Igor Veresov wrote:
>>>>>>>> The current implementation of the code cache flusher tries to
>>>>>>>> determine the working set of methods by doing periodic stack walks
>>>>>>>> during safepoints. Methods that are seen on the stack are
>>>>>>>> considered to be used by the program and are excluded from the set
>>>>>>>> of flush candidates. Such sampling is a great zero-overhead
>>>>>>>> approach to collect the working set of super-hot methods without
>>>>>>>> instrumentation. However it doesn’t work that good for flat
>>>>>>>> profiles, which the test in the bug report happens to expose
>>>>>>>> (thousands of methods calls sequentially one after another in the
>>>>>>>> loop). The solution I ended up implementing uses conservative
>>>>>>>> on-demand instrumentation to obtain information about the method
>>>>>>>> usage. The algorithms is as follows:
>>>>>>>> 
>>>>>>>> 1. Initially a method is compiled as usual (no instrumentation, no
>>>>>>>> overhead).
>>>>>>>> 2. When the sampling (stack-walking) scheme fails to detect
>>>>>>>> activity we flush the method (again, as usual). However before the
>>>>>>>> flush we set the age counter (an int added to MethodCounters) to a
>>>>>>>> value that instructs the compilers to instrument the code.
>>>>>>>> 3. If we ever request to compile this method again the aging code
>>>>>>>> is inserted, which decrements the counter.
>>>>>>>> 4. The value of the counter is then used by the sweeper to
>>>>>>>> determine is the method is in fact used.
>>>>>>>> 5. If the counter reaches zero before the sweeper is able examine
>>>>>>>> and reset the counter we deopt and recompile the method without the
>>>>>>>> counters, basically switching it back to use the sampling scheme again.
>>>>>>>> 
>>>>>>>> Webrev: http://cr.openjdk.java.net/~iveresov/8032463/webrev.00/
>>>>>>>> 
>>>>>>>> Testing:
>>>>>>>> - jprt,
>>>>>>>> - aurora-perf No performance effects with the default code cache,
>>>>>>>> since we have enough headroom and the flusher is not exercised that
>>>>>>>> much. No statistically significant effects after warmup (for spec
>>>>>>>> benchmarks) in stress mode (-XX:+StressCodeAging) either, that is
>>>>>>>> if all methods are compiled initially with counters deopt and
>>>>>>>> switch to sampling later after 100000 invocations.
>>>>>>>> 
>>>>>>>> igor
>> 
> 

-------------- next part --------------
An HTML attachment was scrubbed...
URL: <http://mail.openjdk.java.net/pipermail/ppc-aix-port-dev/attachments/20140508/6179c43b/attachment.html>