RFR(M): 8031752: Failed speculative optimization should be reattempted when root of compilation is different

Fri Jan 17 03:49:26 PST 2014

Thanks for taking another look.

Comments inlined...

On Jan 16, 2014, at 8:53 PM, Christian Thalinger <christian.thalinger at oracle.com> wrote:

> 
> On Jan 16, 2014, at 8:32 AM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
> 
>> Thanks for reviewing this, Chris.
>> 
>> Here is a new webrev:
>> 
>> http://cr.openjdk.java.net/~roland/8031752/webrev.01/
> 
> ! bool MethodData::speculative_trap_bytecode(Bytecodes::Code code) {
> 
> It would be good if this method would have an “is” in the name.  Something like is_speculative_trap_bytecode or is_bytecode_speculative_trap.

Ok.

> 
>> 
>> And comments below:
>> 
>>> All these print_data_on changes are a pain:
>>> 
>>> !   virtual void print_data_on(outputStream* st, const char* extra) const {
>>> 
>>> I see you actually pass something in only in one place.  How does the output look like?  Can’t you print the “extra” output in ProfileData::print_data_on directly after the other output?
>> 
>> I want the speculative traps to be listed together with the byte code they affect. Diagnosing a missed optimization is much easier that way. Here is an example output:
>> 
>> 6 ldc_w TestTrapSpeculate$B
>> 9 if_acmpne 19
>> 64  bci: 9    BranchData          trap(class_check recompiled) trap/ TestTrapSpeculate::m2(class_check recompiled) trap/ TestTrapSpeculate::m1(class_check recompiled) 
>>                                   not taken(10)
>> 12 fast_aload_0
>> 
>> But the speculative traps are not stored with the ProfileData for each byte code. So I need to pass something to each ProfileData::print_data_on() that can be used to print the speculative traps. I decided to pass a formatted string with the trap stuff. Which is then passed to ProfileData::print_shared() where the per byte code line is printed.
> 
> Ok.  Oh, one other thing I forgot to mention:  should we have a default value for extra?
> 
> !   virtual void print_data_on(outputStream* st, const char* extra = NULL) const {

Ok.

> 
>> 
>>> src/share/vm/oops/methodData.cpp:
>>> 
>>> +   case Bytecodes::_invokestatic:
>>> +     return UseTypeSpeculation;
>>> +   }
>>> +   return false;
>>> 
>>> Could you move the return false into a default case of the switch?  At some point I would like to enable the compiler warnings about switches not having all cases (-Wswitch).
>> 
>> Ok.
> 
> +   default:
> +     return false;
> +   }
> +   return false;
> + }
> 
> Isn’t the compiler warning about unreached code here?

I would expect the compiler to complain because of the lack of final return if I remove it. Pretty hard to figure out which is good given the number of platforms we support and compilers that change the way they behave from one version to another.

> 
>> 
>>> 
>>>  int empty_bc_count = 0;  // number of bytecodes lacking data
>>> +   bool has_spec_bc = false;
>>> 
>>> That name is hard to understand for people new to the code.  At least the first ugly name has a comment.
>> 
>> Would needs_speculative_traps be better?
> 
> Yes, thanks.
> 
>> 
>>> 
>>> + DataLayout* MethodData::next_extra(DataLayout* dp) {
>>> +   int nb_cells = 0;
>>> +   if (dp->tag() == DataLayout::bit_data_tag || dp->tag() == DataLayout::no_tag) {
>>> +     nb_cells = BitData::static_cell_count();
>>> +   } else if (dp->tag() == DataLayout::speculative_trap_data_tag) {
>>> +     nb_cells = SpeculativeTrapData::static_cell_count();
>>> +   } else {
>>> +     fatal(err_msg("unexpected tag %d", dp->tag()));
>>> +   }
>>> 
>>> It would be easier to read with a switch instead of if-else.  Maybe also in MethodData::print_data_on.  The assert:
>>> 
>>>      assert(dp->tag() == DataLayout::arg_info_data_tag, "must be BitData or ArgInfo”);
>>> 
>>> is now out-of-date and the code could have a fatal instead.  Same in MethodData::clean_method_data.
>> 
>> Ok. So I tried to use switches when iterating over the extra data everywhere. The problem is that break, breaks out of the switch but not out of the loop anymore.
> 
> Can’t you use continue in that case?  Oh, I just saw you did.
> 
>> The new MethodData::next_extra() doesn’t know how to move past a arg_info_data_tag because it’s a static method. It’s a static method because it’s used by ciMethodData as well. So I need to break out of the loop before calling the last MethodData::next_extra(). Anyway, I refactored the code so that the loops are in their own methods and then I can return from them to jump out of the switch and loop.
> 
> Thanks.  Looks much better now.
> 
>> 
>>> 
>>> src/share/vm/opto/compile.cpp:
>>> 
>>> !   if (md->has_trap_at(bci, Deoptimization::reason_is_speculate(reason) ? this->method() : NULL, reason) != 0) {
>>> 
>>> Can you factor this to a local variable as you do later:
>>> 
>>> +   ciMethod* m = Deoptimization::reason_is_speculate(reason) ? this->method() : NULL;
>>>  if ((per_bc_reason == Deoptimization::Reason_none
>>> !        || md->has_trap_at(bci, m, reason) != 0)
>>> 
>>> ?
>>> 
>> 
>> Ok.
>> 
>>> src/share/vm/runtime/globals.hpp:
>>> 
>>> +   product(intx, PerMethodSpecTrapLimit,  5000,                              \
>>> +           "Limit on speculative traps (of one kind) in a method (includes inlines)") \
>>> 
>>> +   product(intx, SpecTrapLimitExtraEntries,  3,                              \
>>> +           "Extra method data trap entries for speculation")                 \
>>> 
>>> Do we need these new flags to be product flags?
>> 
>> At this point, it’s unclear whether those values are going to be good enough so I think we need to be able to adjust the parameters in benchmark runs.
> 
> I have mentioned this in a previous review but we need something like diagnostic flags but for non-diagnostic ones.  Vladimir and I just talked to Mikael about this and we should use experimental for such flags; the flag is used for experiments (as you say for benchmark runs or similar things) and can either go away or be made a develop flag in the future.

Ok. What do you suggest I do with the existing:

  product(intx, PerMethodTrapLimit,  100,                                   \
          "Limit on traps (of one kind) in a method (includes inlines)")    \
                                                                            \

  product(intx, PerBytecodeTrapLimit,  4,                                   \
          "Limit on traps (of one kind) at a particular BCI")               \
                                                                            \

?

Roland.

> 
>> 
>>> src/share/vm/ci/ciMethodData.cpp:
>>> 
>>> ! void ciParametersTypeData::print_data_on(outputStream* st, const char* extra) const {
>>> !   st->print_cr("Parametertypes");
>>> !   st->print_cr("ciParameterTypesData”);
>>> 
>>> Typo replaced with a typo :-)
>> 
>> Indeed.
>> 
>> Roland.
>> 
>>> 
>>> On Jan 15, 2014, at 3:19 AM, Roland Westrelin <roland.westrelin at oracle.com> wrote:
>>> 
>>>> http://cr.openjdk.java.net/~roland/8031752/webrev.00/
>>>> 
>>>> When C2 optimizes the code based on type profiling, it emits a guard. If the guard fails, the code "traps" and the location of the trap (method, bci) and cause (class check) are recorded. When C2 generates code for that method again, it doesn't perform any optimization of that type (class check) at that location (bci). 
>>>> 
>>>> If, say, the trap occurs at (m, bci) when m is inlined in m1. When m2 that inlines m is compiled, no class check optimization is performed at (m, bci) because of the trap triggered in m1. With type speculation, type information is flowing from many profiling points in the application code. So in the example above, the profile data used at (m, bci) may come from m1 when the compiled method is m1 and it may be come from m2 when the compiled method is m2. And so the trap at (m,bci) in compiled method m1 doesn't say much about the validity of a speculative optimization at (m,bci) when the compiled method is m2. 
>>>> 
>>>> When a trap occurs from a speculative optimization, the trap should record: (method, bci, compiled method) so that in the example above: erroneous optimization at (m, bci) is not reattempted if m1 is re-compiled but is attempted when m2 is compiled.
>>>> 
>>>> With nashorn, peak performance varies a lot from one run to another for some benchmarks. This is one of the causes. Depending on the order of compilations, what’s compiled or not, what’s inlined or not, some optimizations may or may not be performed.
>>>> 
>>>> This change adds a new type of traps (SpeculativeTrapData). They record the nmethod’s method where the trap occurs. They are allocated in the extra data space of the MDO. If no more space is available in the extra data space, the VM fallbacks to standard traps.
>>>> 
>>>> I also added a PerMethodSpecTrapLimit similar to PerMethodTrapLimit but for speculative traps. Its value is much higher. That appears to be required as well to have reproducible peak performance results from one run to another.
>>>> 
>>>> I have a couple more changes that help reduce performance variation with nashorn and that I will send for review later.
>>>> 
>>>> Roland.