RFR(L): 8024070: C2 needs some form of type speculation

Thu Oct 17 20:21:12 PDT 2013

On 10/17/13 10:55 AM, Roland Westrelin wrote:
> Here is a new webrev for this:
>
> http://cr.openjdk.java.net/~roland/8024070/webrev.01/
>
> I applied your suggestions below.
> I kept the "speculative" name. "Profiling" in the code indicates profiling data that we use right away so reusing profiling for this, I find confusing.

Okay.

> I kept the CastPPNodes. I tried the CheckCastPPNodes but I had some problems that need investigation. I did extensive testing with the CastPPNodes. Can we keep them for now and reconsider that later?

Yes, it is fine.

>
> type.cpp has several generic type system fixes:
>
> 3184       if (klass()->equals(ciEnv::current()->Object_klass()) && !klass_is_exact()) {
> 3185         return TypeAryPtr::make(ptr, tp->ary(), tp->klass(), tp->klass_is_exact(), offset, instance_id, speculative);
> 3186       } else {

Looks correct but, please, make sure it works for multi-dimensional 
object arrays. And run full CTW tests and all jtregs. I am concern that 
sometimes exact Object klass may be not really exact.

>
> mimics what we do for instance klass with subclass/superclass when both subclass and superclass are exact
>
> 3315     if( this_klass->is_interface() && !(tinst_klass->is_interface() ||
> 3316                                         tinst_klass == ciEnv::current()->Object_klass())) {
>
> is required because otherwise if tinst_klass is of an object class, we swap tinst_* and this_* but don't enter the next if() and we use this->_ptr and tinst->_ptr which were not swapped.

Seems reasonable.

>
> I also had to make the cases:
>
> 3818     case AnyNull:
> 3819     case TopPTR:
>
> and
>
> 3847     case NotNull:
> 3848     case BotPTR:
>
> symmetrical which they are not currently.

Yes, the check "else if (above_centerline(tap->_ptr))" was in wrong 
place (it was always false).
With your changes the code is still not symmetrical. I think in last 
case (NotNull or BotPTR) when klass() == tap->klass() we need to pass 
const_oop() to constructor. This is what we do when tap is Const and 
'this' is below.

I need to look on whole changes later. I will send additional comments.

Thanks,
Vladimir

>
> Roland.
>
>> General notes:
>>
>> I think 'speculative' is not right word. We get these types from profiling when we see only one type. And we use it only after generated checks. 'profiled' will be more accurate and clear where it comes from. Or other more appropriate name.
>>
>> You need to add jtreg tests to verify these changes for different cases (especially merge points with different profile types).
>>
>> Please, explain why we need CastPP. In merge points it will leak to Phi nodes and further anyway.
>>
>> Small notes:
>>
>> c2_globals.hpp:
>>
>> TypeSpeculation --> UseTypeSpeculation
>>
>> drop_speculative* --> remove_speculative*
>>
>> Why change all TypeInstPtr::make() if you have default value set for last parameters.
>>
>> In a lot of places you use:
>> t->isa_oopptr() != NULL && t->is_oopptr()->speculative_type()
>>
>> add Type::has_speculative_type() and Type::speculative_type() to use in such cases.
>>
>> library_call.cpp: typo:
>>
>> ciKlass* src_k, *dest_k;
>>                 ^
>> regards,
>> Vladimir
>>
>> On 9/6/13 3:11 PM, Vladimir Kozlov wrote:
>>> Roland,
>>>
>>> Thank you for doing this experiment. But Richards is faster ;)
>>> Regression could be also caused by different inlining (InlineSmallCode)
>>> because more uncommon traps are generated.
>>> Anyway, you convince me. Let me go through your changes again. I am also
>>> looking on your extra type profiling changes.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 9/6/13 11:16 AM, Roland Westrelin wrote:
>>>>>> Can you try an other experiment (may be you did it already)? Always
>>>>>> (under flag) generate klass check (type_check_receiver()) when
>>>>>> unique type from profiling is more precise than type from static
>>>>>> analysis. Yes, we may get performance regression in cases where more
>>>>>> precise type is not needed. We need to test it and see how bad it is.
>>>>>
>>>>> This change in itself doesn't bring any performance improvement. It's
>>>>> here so that if we add more profiling points (8023657) we know what
>>>>> to do with the data from profiling. Type speculation + extra
>>>>> profiling show a perf improvement on nashorn (10-20% on some
>>>>> benchmarks with the nashorn.jar from jdk b105). To me the experiment
>>>>> that makes sense it to enable profiling from 8023657, do the type
>>>>> checks at the profiling points systematically and see what impact we
>>>>> see on perf. I did that with type checks on entry to the root method
>>>>> of the compilation for incoming arguments and at returns from calls
>>>>> that were not inlined and I had a clear regression on specjvm98. I
>>>>> don't remember the exact numbers. I can run the experiment again.
>>>>
>>>> This is the kind of results I get with nashorn:
>>>> deltablue:
>>>> current: 2345
>>>> with type speculation and extra profiling: 2662
>>>> with extra profiling and systematic type checks: 2285
>>>>
>>>> RayTrace:
>>>> current: 1978
>>>> with type speculation and extra profiling: 2113
>>>> with extra profiling and systematic type checks: 1795
>>>>
>>>> Richards:
>>>> current: 2222
>>>> with type speculation and extra profiling: 2489
>>>> with extra profiling and systematic type checks: 2563
>>>>
>>>> "extra profiling and systematic type checks" is:
>>>> - type checks for incoming arguments on method entry for the root
>>>> method of the compilation
>>>> - type check on return from non inlined calls
>>>>
>>>> so a limited subset of type checks but that gives us type for value
>>>> flowing in the compiled method. Eventhought the number of checks is
>>>> limited, it is sufficient to cause a regression on standard benchmarks:
>>>>
>>>> ============================================================================
>>>>
>>>> t
>>>>    Benchmark         Samples        Mean     Stdev
>>>>    specjvm98              10      663.84      0.01
>>>>      javac                10      385.70      0.05
>>>>      db                   10      444.51      0.01
>>>>      jess                 10      699.05      0.00
>>>>      jack                 10      603.98      0.01
>>>>      compress             10      534.99      0.01
>>>>      mtrt                 10     1694.45      0.01
>>>>      mpegaudio            10      866.77      0.00
>>>> ============================================================================
>>>>
>>>> t2
>>>>    Benchmark         Samples        Mean     Stdev   %Diff    P
>>>> Significant
>>>>    specjvm98              10      654.39      0.01   -1.42
>>>> 0.014            *
>>>>      javac                10      356.46      0.07   -7.58
>>>> 0.013            *
>>>>      db                   10      444.87      0.01    0.08
>>>> 0.875            *
>>>>      jess                 10      681.66      0.01   -2.49
>>>> 0.000          Yes
>>>>      jack                 10      601.90      0.01   -0.34
>>>> 0.425            *
>>>>      compress             10      538.06      0.02    0.57
>>>> 0.329            *
>>>>      mtrt                 10     1722.77      0.03    1.67
>>>> 0.181            *
>>>>      mpegaudio            10      854.38      0.01   -1.43
>>>> 0.001          Yes
>>>> ============================================================================
>>>>
>>>>
>>>> Roland.
>>>>
>