RFR/RFC calling convention change for value types in compiled code
Tobias Hartmann
tobias.hartmann at oracle.com
Fri Nov 25 15:30:14 UTC 2016
Hi Roland,
very nice work!
Apart from the points that Zoltan already mentioned, I noticed the following:
- sharedRuntime_x86_64.cpp:
- both branches of the if in line 635 contain the same statement. The comment in the first branch refers to the removed st_off/next_off.
- line 656: shouldn't this be "movq" and can then be merged with the first branch?
- callGenerator.cpp:
- I would move the !ValueTypePassFieldsAsArgs condition in line 157 to the outer if
- There are lots of "domain() -> domain_sig()" changes. Maybe it would make sense to not rename domain() but only add domain_cc().
- Please add messages to asserts like "assert(type->is_valuetype(), "")" or remove the ""
- I saw that you also fixed the invokedirect problem you've mentioned. Was this problem triggered by one of the tests you added?
- The test changes look good, great that you were able to handle both flag settings with the same tests.
[The following thoughts are out of the scope of this but should be considered for future work]
Below code allocates the value type 'v' three times although only one allocation is required:
1: MyValue v = createInline();
2: pass_to_interpreter(v);
3: pass_to_interpreter(v);
4: v = compiled_return(v); // this compiled callee method does "vreturn v;"
Allocation in line 2 is required because we call interpreted code. In line 3, we allocate again because allocation in line 2 happened in the c2i adapter and the oop is not passed back to the caller. In line 4, we have to load the field values from memory, pass them to the compiled callee and re-allocate the value type on vreturn.
My idea is to solve this problem by having read/write oop arguments. Like this, we could pass an additional oop argument for each VT parameter that can be NULL and set the oop in the c2i adapter or the compiled callee after allocation happened the first time.
In line 2, we would then pass NULL as oop argument and the adapter would set the oop after allocating the VT. In line 3, we would re-use that oop and the c2i adapter would not allocate again. In line 4, we would not allocate in the callee because we receive a non-null oop.
This would also solve the problem of how to avoid allocations when returning to compiled code. For example:
MyValue get() {
return createInline();
}
would always allocate the VT even if we return to compiled code. We could now pass an additional oop to an uninitialized VT as argument to get() to hold the to be returned VT. The buffered VT is then initialized in the callee if the oop argument is non-NULL instead of allocating and returning a new VT (this would of course require interpreter support). The advantage is that we can defer allocation to the caller (recursively), potentially increasing optimization opportunities and avoiding allocations in long call/return chains.
We should also consider passing VTs "by value" only if the field values are available as constants or have to be loaded in the callee anyway. Otherwise, we may load field values from memory just to pass them to the callee when it may be more beneficial to just pass an oop and avoid field loads (especially for large value types).
Thanks,
Tobias
On 24.11.2016 15:16, Zoltán Majó wrote:
> Hi Roland,
>
>
> On 11/24/2016 11:25 AM, Roland Westrelin wrote:
>> http://cr.openjdk.java.net/~roland/valhalla/callingconvention/webrev.00/
>>
>> This patch implements a new compiled code calling convention for methods
>> that take value type arguments: instead of passing a reference to the
>> value type as an argument, the method is passed each field of the value
>> type in registers/stack slots.
>>
>> The interpreter continues to use references at calls so when going to
>> compiled from the interpreter (i2c adapter), value type fields are
>> loaded from the value type object and the value type object is
>> discarded. When going from compiled code to the interpreter (c2i
>> adapter), value type objects for every arguments need to be allocated
>> and initialized with the field values passed by the compiler.
>
> the change looks good to me in general. Please find some minor comments below:
>
> = src/cpu/x86/vm/sharedRuntime_x86_64.cpp
>
> Don't we have one T_VOID too much here?
>
> 561 // flattened so, for instance: T_VALUETYPE T_INT T_VALUETYPE
> 562 // T_INT T_LONG T_VOID T_VOID T_VOID is a value type with a
>
>
> 2 minor typos.
>
> 563 // int field an a value type field that itself has 2 fields, a
> 564 // int and a long
>
> =>
>
> 563 // int field and a value type field that itself has 2 fields, an
> 564 // int and a long
>
>
> Maybe it's better to have a more precise condition here:
>
> 692 if (i != sig.length()) {
>
> =>
>
> 692 if (i < sig.length()) {
>
>
> Can you maybe find more suggestive names for j and k (and maybe i) in gen_c2i_adapter_helper()?
>
> 766 for (int i = 0, ignored = 0, j = 0, k = 0; i < sig.length(); i++) {
>
> Maybe next_vt_arg_comp for j, next_arg_int for k and next_arg_comp for k? These are long variable names, I agree, but it would be helpful if the code indicated which variable represents the interpreter's and which the compiler's point of view.
>
> The same could be applied to gen_i2c_adapter()
>
> 1029 for (int i = 0, ignored = 0, k = 0; i < sig.length(); i++) {
>
>
> Can we maybe change the method name to reflect that the number of arguments computed is from the point of view of the interpreter?
>
> 552 static int compute_total_args_passed_int(const GrowableArray<SigEntry>& sig) {
>
> Maybe it would be also good to change the argument name to sig_extended or sig_cc so that it's more obvious that this is the signature used by the calling convention (and not the declared signature).
>
> So in SharedRuntime::gen_i2c_adapter, you could change
>
> 1018 int total_args_passed = compute_total_args_passed(sig);
>
> to
>
> 1018 int total_args_passed_int = compute_total_args_passed_int(sig_extended);
>
>
> = src/share/vm/opto/type.hpp
>
> 1536 const TypeTuple* const _domain_sig; // Domain of inputs
> 1537 const TypeTuple* const _domain_cc; // Domain of inputs
>
> Could you please add a comment to explain what the difference between _domain_sig and _domain_cc is? I see that 'cc' refers to calling convention, but that was not obvious on the first sight. Or maybe you can use _domain_sig_extended instead. (Whatever you may pick, it would be great if it stayed consistent with the terms you use at other places, e.g., in the adapter generation).
>
> The comment in src/share/vm/opto/type.cpp describes very well what is happening, maybe you can reuse part of that text.
>
>
> 1558 static const TypeFunc *make(const TypeTuple* domain, const TypeTuple* domain_cc, const TypeTuple* range);
>
> You could add a version of the make method that takes only one domain argument for the case domain_sig and domain_sig_extended are the same. (There are more than 30 places where the two are equal and only one place where they are not.)
>
>
> 652 static const TypeTuple *make_domain(ciInstanceKlass* recv, ciSignature *sig, bool is_cc = false);
>
> Could you please give a more suggestive name to the 'is_cc'? You don't actually check that we have a calling convention but rather if the special calling convention for VT fields is enabled. Maybe vt_fields_as_args_enabled could work.
>
>
> = src/share/vm/opto/type.cpp
>
> Could you please move the extra_value_fields() and collect_value_fields() functions to ciValueKlass? Both functions operate on a ciValueKlass instance and maybe we want to use them later in some other context. Maybe add a short comment summarizing what those functions do.
>
>
> = src/share/vm/opto/callGenerator.cpp
>
> 375 // FIXME: late inlining of methods that take value type arguments is
> 376 // broken: arguments at the call are set up so fields of value type
> 377 // arguments are passed but code here expects a single argument per
> 378 // value type (a ValueTypeNode) instead.
>
> How difficult do you think it will be to fix late inlining later on? It's fine to do the fix later -- I'm just wondering if there is anything here that would cause a large difficulty for us later on (most likely not, but better to ask).
>
>
> = src/share/vm/runtime/globals.hpp
>
> 4183 experimental(bool, ValueTypePassFieldsAsArgs, true, \
> 4184 "Pass each field as an argument at calls") \
>
> Can you please correct the indentation in the second line?
>
> Thank you!
>
> Best regards,
>
>
> Zoltan
>
>>
>> Roland.
>
More information about the valhalla-dev
mailing list