RFR(L): 8023657: New type profiling points: arguments to call

Mon Oct 7 12:46:23 PDT 2013

On 10/5/13 2:58 AM, Roland Westrelin wrote:
>
> On Oct 5, 2013, at 6:30 AM, Vladimir Kozlov <vladimir.kozlov at oracle.com> wrote:
>
>>
>>
>> On 10/4/13 5:59 PM, Vladimir Kozlov wrote:
>>> Roland,
>>>
>>> I need more time to go through methoddate.?pp changes.
>>> Here some preliminary comments.
>>>
>>> I think you need to use orptr(mdo_adr, obj) when storing klass first time to preserve sticky bit null_seen which could
>>> be set by other thread during small window. But it is still not atomic.
>>
>> We can't use orptr for klass record because other thread may store an other klass. As result we will get garbage value. So need to use mov. How expensive will be cmpxchg? I am just worried that we can't trust this type info due to concurrency.
>
> Why is it such concern in the case? Other profiling may also miss events due to concurrency (counter not updated, lost receiver at a call/type check). In the new code, we set the null bit only once, the recorded class only once, the unknown bit only once. It's true we can miss one of those events, but most likely the event will occur again and it can't be lost more than twice, I would say (because each of the events is recorded only once). So only a very rare event can be lost for good. We start profiling after several invocations of the method, so in practice we already risk loosing rare events. And in any case, we risk making a wrong assumption at compile time that we can recover from but not generate erroneous code.
> To me, concurrency when counters are incremented must be much more of an issue. We risk repeatedly loosing events and end up with wrong relative frequencies.

I agree with you about klass recording because we will generate klass 
check so the code will be correct even if recording is wrong (we go slow 
path). The same about unknown bit.
I looked on never_see_null code in GraphKit::null_check_oop() and we 
generate uncommon trap otherwise merged result could be NULL. So it 
seems we will hit uncommon trap if we lost null_seen bit.

Looks like we fine.

Thanks,
Vladimir

>
> Roland.
>
>>
>> Vladimir
>>
>>> It would be even better if you set these sticking flags in a separate field (may be in stack_slot_entry which does not
>>> change, may be split in 2 field). Or may be use atomic cmpxchg.
>>> Then we would not worry about concurrent updates from different threads.
>>>
>>> c1_LIRAssembler_x86.cpp: could you clean up ifdefs?:
>>>
>>> +     if (do_update) {
>>> + #ifndef ASSERT
>>> +       __ jmpb(next);
>>> + #else
>>> +       __ jmp(next);
>>> + #endif
>>> +     }
>>> +   } else {
>>> + #ifdef ASSERT
>>> +     __ testptr(tmp, tmp);
>>> +     __ jccb(Assembler::notZero, update);
>>> +     __ stop("unexpect null obj");
>>> + #endif
>>> +   }
>>>
>>> for example:
>>>
>>> +     if (do_update) {
>>> + #ifndef ASSERT
>>> +       __ jmpb(next);
>>> +     }
>>> + #else
>>> +       __ jmp(next);
>>> +     }
>>> +   } else {
>>> +     __ testptr(tmp, tmp);
>>> +     __ jccb(Assembler::notZero, update);
>>> +     __ stop("unexpect null obj");
>>> + #endif
>>> +   }
>>>
>>> Indention in c1_LIRGenerator.cpp in sig_next_klass():
>>>
>>> +   if (first_sig_k != NULL) {
>>> +      sig_k = first_sig_k;
>>> +     first_sig_k = NULL;
>>>
>>>
>>> java.cpp can you also print mdo size per method before (or after) invocation counters? I assume you want to print more
>>> statistic with new flag later. If it is only size then we may not need a new flag (we have too many of them already).
>>>
>>> TypeProfile is very general name and can be confusing. Current code already does type profiling: UseTypeProfile.
>>> TypeProfileLeveL?
>>>
>>> Also add new flags into /* compiler */ section of globals.hpp where other TypeProfile flags are defined.
>>>
>>> Thanks,
>>> Vladimir
>>>
>>> On 10/2/13 10:02 AM, Roland Westrelin wrote:
>>>> Here is a new webrev for the new profiling points, only profiling of arguments at calls this time. This one includes
>>>> tiered support on 32bit and 64bit x86.
>>>>
>>>> http://cr.openjdk.java.net/~roland/8023657/webrev.01/
>>>>
>>>> The output of -XX:+PrintMethodData is:
>>>>
>>>> 6 invokevirtual 2 <TestProfiling.m2(LTestProfiling$C;Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;>
>>>>    0   bci: 6    VirtualCallData     count(0) entries(1)
>>>>                                      'TestProfiling'(4746 1.00)
>>>>                  argument types      0: stack(1) 'TestProfiling$C'
>>>>                                      1: stack(2) unknown (null seen)
>>>>
>>>> Profiling can either be off, on only for jsr292 related calls (invokedynamic, invokehandle or all invokes in a lambda
>>>> form) or on for all methods and invokes. -XX:TypeProfile={0,1,2} is used to pick a level of profiling.
>>>> Once all profiling is in (arguments, parameters, return value), in order to limit the number of options my plan is to
>>>> use TypeProfile=xyz, with x,y,z in {0,1,2} to select the level of profiling for each. TypeProfile=111, Typeprofile=222
>>>> etc.
>>>> Profiling is by default on only for jsr292 related calls in order to not impact startup/footprint of standard java apps.
>>>>
>>>> Roland.
>>>>
>>>>
>