RFR(L): 8024070: C2 needs some form of type speculation

Tue Oct 22 17:40:53 PDT 2013

Thanks for the Doxygen comments!  These are the best ones we have so far!

src/share/vm/opto/phaseX.cpp:

+ /**
+  *Remove the speculative part of all types that we know of
+  */

Add a space after "*" and get rid of the comment in the header file.

Otherwise this looks good.

On Oct 21, 2013, at 1:10 PM, Roland Westrelin <roland.westrelin at oracle.com> wrote:

> Here is a new webrev that should address all comments:
> 
> http://cr.openjdk.java.net/~roland/8024070/webrev.02/
> 
> Roland.
> 
> 
> On Oct 19, 2013, at 1:04 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
> 
>> Roland,
>> 
>> I looked on whole changes.
>> 
>> Please, switch off UseTypeSpeculation by default as we discussed.
>> 
>> I don't see where you check that profiled type is better than current type to avoid unneeded speculative type generation. For example, type you get from arguments profiling for call in current method could be better then a type from parameters profiling which is used ehrn method is inlined. You will have 2 casts in such case. I think one could be avoided (from parameters profiling) if it is not better then previous (speculative or current) type.
>> 
>> ciMethod.cpp can you use simple 'i' instead of 'nb' as we do in other places? And change comments accordingly, "for i'th argument":
>> 
>> + // Return true if profiling provides a type for the argument nb to the
>> 
>> I think *_profiled_type() should return type* instead of passing it through *& parameter. You can return NULL for cases when you return 'false' now.
>> 
>> It would be nice if you can factor out asserts in separate methods since they are the same to avoid duplication and also print bytecode when assert fails.
>> 
>> compile.cpp next comment is wrong since you keep using CastPP nodes:
>> 
>> +   // Remove the speculative part of types and clean up the graph from
>> +   // the extra CheckCastPP nodes whose only purpose is to carry
>> +   // them. Do that early so that optimizations are not disrupted by
>> +   // the extra CheckCastPP nodes.
>> 
>> 'tmp' is not good name for a 'list':
>> +     Unique_Node_List tmp;
>> 
>> Split next line:
>> 
>> +      if (n->is_Type() && n->as_Type()->type()->isa_oopptr() != NULL && n->as_Type()->type()->is_oopptr()->speculative() != NULL) {
>> 
>> 
>> doCall.cpp you can store profile.morphism() in local var and set it to 1 if you get spec_rcvr_type. It will simplify the code in call_generator().
>> 
>> graphKit.cpp please, put all new record_*() methods together preferably in the same order as declarations in graphKit.hpp.
>> 
>> library_call.cpp split next line:
>> 
>> +     ciKlass* src_k = top_src->klass(), *dest_k = top_dest->klass();
>> 
>> 
>> parse2.cpp I think you need to create new LoadKlass/DecodeNKlass if new obj is different (has cast). replace_in_map() will not replace inputs to address (AddP) of LoadKlass.
>> 
>> Thanks,
>> Vladimir
>> 
>> On 10/18/13 10:24 AM, Vladimir Kozlov wrote:
>>> Roland,
>>> 
>>>> tap->const_oop() would be non NULL for a Constant only and xk is set
>>> correctly if klass() == tap->klass()
>>> 
>>> Lets leave NULL for now because in general meet(con, bot) = bot. It may
>>> change code generation more then other changes and we don't want it now.
>>> 
>>>> 
>>>> I've been trying to write simple tests to reproduce those failures
>>> without much success. Any suggestion what simple java code I can write
>>> to have something of ptr type AnyNull?
>>> 
>>> AnyNull is ptr_dual for NotNull. Meets with it is checked in
>>> Type::meet(). So it should work for tests only with NotNull.
>>> 
>>> Vladimir
>>> 
>>> On 10/18/13 9:19 AM, Roland Westrelin wrote:
>>>> Hi Vladimir,
>>>> 
>>>> On Oct 18, 2013, at 5:21 AM, Vladimir Kozlov
>>>> <vladimir.kozlov at oracle.com> wrote:
>>>> 
>>>>> On 10/17/13 10:55 AM, Roland Westrelin wrote:
>>>>>> Here is a new webrev for this:
>>>>>> 
>>>>>> http://cr.openjdk.java.net/~roland/8024070/webrev.01/
>>>>>> 
>>>>>> I applied your suggestions below.
>>>>>> I kept the "speculative" name. "Profiling" in the code indicates
>>>>>> profiling data that we use right away so reusing profiling for this,
>>>>>> I find confusing.
>>>>> 
>>>>> Okay.
>>>>> 
>>>>>> I kept the CastPPNodes. I tried the CheckCastPPNodes but I had some
>>>>>> problems that need investigation. I did extensive testing with the
>>>>>> CastPPNodes. Can we keep them for now and reconsider that later?
>>>>> 
>>>>> Yes, it is fine.
>>>>> 
>>>>>> 
>>>>>> type.cpp has several generic type system fixes:
>>>>>> 
>>>>>> 3184       if (klass()->equals(ciEnv::current()->Object_klass()) &&
>>>>>> !klass_is_exact()) {
>>>>>> 3185         return TypeAryPtr::make(ptr, tp->ary(), tp->klass(),
>>>>>> tp->klass_is_exact(), offset, instance_id, speculative);
>>>>>> 3186       } else {
>>>>> 
>>>>> Looks correct but, please, make sure it works for multi-dimensional
>>>>> object arrays. And run full CTW tests and all jtregs. I am concern
>>>>> that sometimes exact Object klass may be not really exact.
>>>> 
>>>> Ok.
>>>> 
>>>>> 
>>>>>> 
>>>>>> mimics what we do for instance klass with subclass/superclass when
>>>>>> both subclass and superclass are exact
>>>>>> 
>>>>>> 3315     if( this_klass->is_interface() &&
>>>>>> !(tinst_klass->is_interface() ||
>>>>>> 3316                                         tinst_klass ==
>>>>>> ciEnv::current()->Object_klass())) {
>>>>>> 
>>>>>> is required because otherwise if tinst_klass is of an object class,
>>>>>> we swap tinst_* and this_* but don't enter the next if() and we use
>>>>>> this->_ptr and tinst->_ptr which were not swapped.
>>>>> 
>>>>> Seems reasonable.
>>>>> 
>>>>>> 
>>>>>> I also had to make the cases:
>>>>>> 
>>>>>> 3818     case AnyNull:
>>>>>> 3819     case TopPTR:
>>>>>> 
>>>>>> and
>>>>>> 
>>>>>> 3847     case NotNull:
>>>>>> 3848     case BotPTR:
>>>>>> 
>>>>>> symmetrical which they are not currently.
>>>>> 
>>>>> Yes, the check "else if (above_centerline(tap->_ptr))" was in wrong
>>>>> place (it was always false).
>>>>> With your changes the code is still not symmetrical. I think in last
>>>>> case (NotNull or BotPTR) when klass() == tap->klass() we need to pass
>>>>> const_oop() to constructor. This is what we do when tap is Const and
>>>>> 'this' is below.
>>>> 
>>>> Wouldn't this be sufficient as a fix?
>>>> 
>>>> diff --git a/src/share/vm/opto/type.cpp b/src/share/vm/opto/type.cpp
>>>> --- a/src/share/vm/opto/type.cpp
>>>> +++ b/src/share/vm/opto/type.cpp
>>>> @@ -3851,7 +3851,7 @@
>>>>             xk = tap->_klass_is_exact;
>>>>       else  xk = (tap->_klass_is_exact & this->_klass_is_exact) &&
>>>>               (klass() == tap->klass()); // Only precise for
>>>> identical arrays
>>>> -      return TypeAryPtr::make(ptr, NULL, tary, lazy_klass, xk, off,
>>>> instance_id, speculative);
>>>> +      return TypeAryPtr::make(ptr, tap->const_oop(), tary,
>>>> lazy_klass, xk, off, instance_id, speculative);
>>>>     default: ShouldNotReachHere();
>>>>     }
>>>>   }
>>>> 
>>>> tap->const_oop() would be non NULL for a Constant only and xk is set
>>>> correctly if klass() == tap->klass()
>>>> 
>>>> I've been trying to write simple tests to reproduce those failures
>>>> without much success. Any suggestion what simple java code I can write
>>>> to have something of ptr type AnyNull?
>>>> 
>>>> Thanks,
>>>> Roland.
>>>> 
>>>>> 
>>>>> I need to look on whole changes later. I will send additional comments.
>>>>> 
>>>>> Thanks,
>>>>> Vladimir
>>>>> 
>>>>>> 
>>>>>> Roland.
>>>>>> 
>>>>>>> General notes:
>>>>>>> 
>>>>>>> I think 'speculative' is not right word. We get these types from
>>>>>>> profiling when we see only one type. And we use it only after
>>>>>>> generated checks. 'profiled' will be more accurate and clear where
>>>>>>> it comes from. Or other more appropriate name.
>>>>>>> 
>>>>>>> You need to add jtreg tests to verify these changes for different
>>>>>>> cases (especially merge points with different profile types).
>>>>>>> 
>>>>>>> Please, explain why we need CastPP. In merge points it will leak to
>>>>>>> Phi nodes and further anyway.
>>>>>>> 
>>>>>>> Small notes:
>>>>>>> 
>>>>>>> c2_globals.hpp:
>>>>>>> 
>>>>>>> TypeSpeculation --> UseTypeSpeculation
>>>>>>> 
>>>>>>> drop_speculative* --> remove_speculative*
>>>>>>> 
>>>>>>> Why change all TypeInstPtr::make() if you have default value set
>>>>>>> for last parameters.
>>>>>>> 
>>>>>>> In a lot of places you use:
>>>>>>> t->isa_oopptr() != NULL && t->is_oopptr()->speculative_type()
>>>>>>> 
>>>>>>> add Type::has_speculative_type() and Type::speculative_type() to
>>>>>>> use in such cases.
>>>>>>> 
>>>>>>> library_call.cpp: typo:
>>>>>>> 
>>>>>>> ciKlass* src_k, *dest_k;
>>>>>>>               ^
>>>>>>> regards,
>>>>>>> Vladimir
>>>>>>> 
>>>>>>> On 9/6/13 3:11 PM, Vladimir Kozlov wrote:
>>>>>>>> Roland,
>>>>>>>> 
>>>>>>>> Thank you for doing this experiment. But Richards is faster ;)
>>>>>>>> Regression could be also caused by different inlining
>>>>>>>> (InlineSmallCode)
>>>>>>>> because more uncommon traps are generated.
>>>>>>>> Anyway, you convince me. Let me go through your changes again. I
>>>>>>>> am also
>>>>>>>> looking on your extra type profiling changes.
>>>>>>>> 
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>> 
>>>>>>>> On 9/6/13 11:16 AM, Roland Westrelin wrote:
>>>>>>>>>>> Can you try an other experiment (may be you did it already)?
>>>>>>>>>>> Always
>>>>>>>>>>> (under flag) generate klass check (type_check_receiver()) when
>>>>>>>>>>> unique type from profiling is more precise than type from static
>>>>>>>>>>> analysis. Yes, we may get performance regression in cases where
>>>>>>>>>>> more
>>>>>>>>>>> precise type is not needed. We need to test it and see how bad
>>>>>>>>>>> it is.
>>>>>>>>>> 
>>>>>>>>>> This change in itself doesn't bring any performance improvement.
>>>>>>>>>> It's
>>>>>>>>>> here so that if we add more profiling points (8023657) we know what
>>>>>>>>>> to do with the data from profiling. Type speculation + extra
>>>>>>>>>> profiling show a perf improvement on nashorn (10-20% on some
>>>>>>>>>> benchmarks with the nashorn.jar from jdk b105). To me the
>>>>>>>>>> experiment
>>>>>>>>>> that makes sense it to enable profiling from 8023657, do the type
>>>>>>>>>> checks at the profiling points systematically and see what
>>>>>>>>>> impact we
>>>>>>>>>> see on perf. I did that with type checks on entry to the root
>>>>>>>>>> method
>>>>>>>>>> of the compilation for incoming arguments and at returns from calls
>>>>>>>>>> that were not inlined and I had a clear regression on specjvm98. I
>>>>>>>>>> don't remember the exact numbers. I can run the experiment again.
>>>>>>>>> 
>>>>>>>>> This is the kind of results I get with nashorn:
>>>>>>>>> deltablue:
>>>>>>>>> current: 2345
>>>>>>>>> with type speculation and extra profiling: 2662
>>>>>>>>> with extra profiling and systematic type checks: 2285
>>>>>>>>> 
>>>>>>>>> RayTrace:
>>>>>>>>> current: 1978
>>>>>>>>> with type speculation and extra profiling: 2113
>>>>>>>>> with extra profiling and systematic type checks: 1795
>>>>>>>>> 
>>>>>>>>> Richards:
>>>>>>>>> current: 2222
>>>>>>>>> with type speculation and extra profiling: 2489
>>>>>>>>> with extra profiling and systematic type checks: 2563
>>>>>>>>> 
>>>>>>>>> "extra profiling and systematic type checks" is:
>>>>>>>>> - type checks for incoming arguments on method entry for the root
>>>>>>>>> method of the compilation
>>>>>>>>> - type check on return from non inlined calls
>>>>>>>>> 
>>>>>>>>> so a limited subset of type checks but that gives us type for value
>>>>>>>>> flowing in the compiled method. Eventhought the number of checks is
>>>>>>>>> limited, it is sufficient to cause a regression on standard
>>>>>>>>> benchmarks:
>>>>>>>>> 
>>>>>>>>> ============================================================================
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> t
>>>>>>>>>  Benchmark         Samples        Mean     Stdev
>>>>>>>>>  specjvm98              10      663.84      0.01
>>>>>>>>>    javac                10      385.70      0.05
>>>>>>>>>    db                   10      444.51      0.01
>>>>>>>>>    jess                 10      699.05      0.00
>>>>>>>>>    jack                 10      603.98      0.01
>>>>>>>>>    compress             10      534.99      0.01
>>>>>>>>>    mtrt                 10     1694.45      0.01
>>>>>>>>>    mpegaudio            10      866.77      0.00
>>>>>>>>> ============================================================================
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> t2
>>>>>>>>>  Benchmark         Samples        Mean     Stdev   %Diff    P
>>>>>>>>> Significant
>>>>>>>>>  specjvm98              10      654.39      0.01   -1.42
>>>>>>>>> 0.014            *
>>>>>>>>>    javac                10      356.46      0.07   -7.58
>>>>>>>>> 0.013            *
>>>>>>>>>    db                   10      444.87      0.01    0.08
>>>>>>>>> 0.875            *
>>>>>>>>>    jess                 10      681.66      0.01   -2.49
>>>>>>>>> 0.000          Yes
>>>>>>>>>    jack                 10      601.90      0.01   -0.34
>>>>>>>>> 0.425            *
>>>>>>>>>    compress             10      538.06      0.02    0.57
>>>>>>>>> 0.329            *
>>>>>>>>>    mtrt                 10     1722.77      0.03    1.67
>>>>>>>>> 0.181            *
>>>>>>>>>    mpegaudio            10      854.38      0.01   -1.43
>>>>>>>>> 0.001          Yes
>>>>>>>>> ============================================================================
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> 
>>>>>>>>> Roland.
>>>>>>>>> 
>>>>>> 
>>>> 
>