RFC: 8023657: Extra type profiling points could help C2 code generation

Mon Sep 9 10:21:25 PDT 2013

On 9/9/13 2:21 AM, Roland Westrelin wrote:
> Hi Vladimir,
>
> Thanks for taking another look at this.
>
>> Do you have new webrev for this?
>
> Not yet but I'll work on it.
>
>> Can you do initial changes with just one new profiling - passed arguments at call sites? It would be much easier to review. And I need more time to recall how MDO works.
>
> Ok.
>
> This is currently only implemented on x86 (32bit) in the interpreter and we need interpreter and c2 code for x86 (32 bit & 64bit), sparc. What is the minimum requirement before this can be pushed? I think we need interpreter+c1 on x86 (32 bit and 64 bit) to make this useful.

Agree, "interpreter+c1 on x86 (32 bit and 64 bit)" is good starting point. Other platforms we can do later.

>
> Also, this may have an impact on startup and it won't be known until we can run it in tiered so again we would need c1 code.

It depends how much time/work you need to implement it in C1. May be push Interpreter first (32- and 64-bits), then push 
speculative types which will use this info and let it be tested (we have Nightly without TieredComp) while you work on 
C1. Because you may need to adjust something based on testing.

>
>> Was the reason to put MethodArgs at the end of MDO?
>
> If MethodArgs is first, it needs to be associated with a bci (a u2) but it doesn't have one. Other data entries that don't have a specific bci are at then end of the MDO so it looked like the way to go.

Thank you for explanation.

Thanks,
Vladimir

>
> I'll try to follow you other suggestions below.
>
> Roland.
>
>>
>> On 8/26/13 9:52 AM, Roland Westrelin wrote:
>>> Vladimir,
>>>
>>> Thanks for taking a look at this.
>>>
>>>> Can you show an example of new mdo data (from print_codes())?
>>>
>>> incoming arguments:
>>>
>>> spec/benchmarks/_213_javac/ConstantPool.put(Ljava/lang/Object;)V
>>>    interpreter_invocation_count:  3245628
>>>    invocation_counter:            3245628
>>>    backedge_counter:                    0
>>>
>>>    MethodArgsTypeData args(2)
>>>                                      arg off 0 'spec/benchmarks/_213_javac/ConstantPool'
>>>                                      arg off 1 unknown
>>
>> As I suggested it could be better to name them 'parm' (as we do in C2 ideal graph) to separate from 'arg' for invoke bytecodes:
>>
>>      parameters:   ArgumentsTypeData   entries(2)
>>                                        0: stack(0) 'spec/benchmarks/_213_javac/ConstantPool'
>>                                        1: stack(1) unknown
>>
>>>
>>> arguments to a call and return value:
>>>
>>> 31 invokevirtual 153 <java/util/Hashtable.put(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;>
>>>    100 bci: 31   VirtualCallArgsTypeData count(0) entries(1)
>>>                                      'java/util/Hashtable'(70 1.00)
>>>                                      args(2)
>>>                                      arg off 1 'spec/benchmarks/_213_javac/Identifier'
>>>                                      arg off 2 'spec/benchmarks/_213_javac/ClassDeclaration'
>>>                                      ret null
>>
>> This is confusing since you mixing 2 data types (escpessily if you get 2 receiver's types). Can you preserve original VirtualCallData info (with new added ratio) since you kept separate profile_call() call:
>>
>>   31 invokevirtual 153 <java/util/Hashtable.put(Ljava/lang/Object;Ljava/lang/Object;)Ljava/lang/Object;>
>>      100 bci: 31   VirtualCallData     count(70) entries(1)
>>                                        0: stack(0) 'java/util/Hashtable'(70 1.00)
>>                    ArgumentsTypeData   entries(2)
>>                                        1: stack(1) 'spec/benchmarks/_213_javac/Identifier'
>>                                        2: stack(2) 'spec/benchmarks/_213_javac/ClassDeclaration'
>>                    ReturnedValueData   null
>>
>>
>> And for invokestatic first line still be CounterData followed by ArgumentsData:
>>
>>      76  bci: 37   CounterData         count(2)
>>                    ArgumentsTypeData   entries(1)
>>                                        0: stack(0) 'spec/benchmarks/_213_javac/Identifier'
>>
>>>
>>>> You need to describe where new info is recorded so we don't need to search for it.
>>>
>>> I should have given more details, you're right.
>>> My change records types which can either be none (no profile), unknown (conflicting profile data), null (always null reference) or the actual klass.
>>> There's no counter associated with the type and it keeps track of a single type.
>>> It records types for all incoming reference arguments of a method, up to TypeProfileNbCallArgs outgoing arguments at calls and the return type at calls if that makes sense.
>>>
>>> So if the signature is void m(Object a), it only records type of a. If it is m(Object a, Object b), it records type of a and b. If it is m(int i, Object a, long l, Object b, Object c) it records type for a and b.
>>>
>>
>> All above should be in the comments. Each new methodData class needs comment description.
>>
>>>>  From what I see MethodArgs are recorded at the end of MDO. CallArgs and CallRet are recorded together with Call's receiver klass data (receiver is simple first argument). Right?
>>>
>>> Yes.
>>
>> Was the reason to put MethodArgs at the end of MDO?
>>
>>>
>>>> Why do you need 2 cells per arg when you record only one type without counter? What is arg_off_rel_off for?
>>>
>>> It's an offset on the interpreter stack so that the interpreter can locate each argument that needs profiling.
>>
>> Okay, add comments what values mean.
>>
>>>
>>>> 737   enum {
>>>> 738     arg_off_rel_off,
>>>> 739     arg_type_rel_off,
>>>> 740     per_arg_cell_count
>>>> 741   };
>>
>> You don't need arg_ prefix since it is inside ArgsTypeData class:
>>
>>    enum {
>>      stack_slot_offset,
>>      type_offset,
>>      per_arg_cell_count
>>    };
>>
>>>>
>>>> Why not reuse ReceiverTypeData for all new data (you used it only for VirtualCallArgs)?
>>>
>>> Keeping track of 2 types, and 2 counters when all we need is a single class looks like a waste of space especially if we want to collect a lot more type data. It also leads to some runtime overhead to keep track of the counters and the other type that we never use.
>>
>> Okay.
>>
>>>
>>>> What next fields are for? I don't see constructor is called.
>>>>
>>>> TypeDataAtCall {
>>>>    const int _base_off;
>>>>    ProfileData* _pd;
>>>
>>> I tried to share as much code as possible among the various type profiling I wanted to do and I used composition. For instance, VirtualCallArgsTypeData has a RetArgsTypeData _args_type_data member which itself has members ArgsTypeData _args and RetTypeData _ret which themselves are subclasses of TypeDataAtCall. The code for the layout of the data is all shared in ArgsTypeData, RetTypeData, TypeDataAtCall and they need a pointer back to the ProfileData of VirtualCallArgsTypeData and the offset where their own piece of memory starts to do updates to VirtualCallArgsTypeData itself.
>>
>> It is too complicated for me, sorry.  I don't see why you need separate VirtualCallArgsTypeData and CallArgsTypeData if you keep VirtualCallData.
>>
>>>
>>>> "Parameters refers to the list of variables in a method declaration. Arguments are the actual values that are passed in when the method is invoked."
>>>>
>>>> May be:
>>>>
>>>> TypeProfileCallArgs --> TypeProfileArgs
>>>> TypeProfileCallRet  --> TypeProfileRet
>>>> TypeProfileMethodArgs --> TypeProfileParams
>>
>> Actually I think it should be one flag (we already have ton of them).
>>
>> Thanks,
>> Vladimir
>>
>>>>
>>>> Definitely:
>>>>
>>>> TypeProfileNbCallArgs --> TypeProfileArgsLimit
>>>
>>> Thanks for the suggestions. I'll keep them in mind.
>>>
>>> Roland.
>>>
>