[9] RFR (X): 8163880: Constant pool caching of fields inhibited/delayed unnecessarily
Coleen Phillimore
coleen.phillimore at oracle.com
Thu Sep 1 00:28:16 UTC 2016
On 8/30/16 7:24 AM, Zoltán Majó wrote:
> Hi Coleen,
>
>
> thank you for the review! Please see details below.
>
> On 08/29/2016 10:31 PM, Coleen Phillimore wrote:
>>
>> http://cr.openjdk.java.net/~zmajo/8163880/webrev.02/src/share/vm/interpreter/interpreterRuntime.cpp.udiff.html
>>
>>
>> I thought we always want to suppress resolving a static putfield, and
>> only want to give the error if >= classfile version 53.
>
> Yes, we throw an IllegalAccessError() only for class file version >=
> 53. For class file version < 53 we do not to throw an error.
> Delaying CP cache resolution is therefore not necessary for class file
> version < 53. That is why we have the classfile version check in place.
I see where my confusion was now.
thanks,
Coleen
>
> If we remove the classfile version check, we'll delay CP caching for
> class file version < 53 as well. That won't hurt correctness and is
> will not hurt performance either (I did some experiments to verify that).
>
> Unfortunately, I pushed the fixed before your mail arrived. But we
> can open a new bug to handle the issue you have raised. Please let me
> know if you'd like that and I can take care of the bug.
>
>>
>> *+bool has_initialized_final_update =
>> info.field_holder()->major_version() >= 53 &&*
>> *+ info.has_initialized_final_update();*
>>
>>
>> So, not check the classfile version here.
>>
>> If I understand correctly, this only inhibits the quickening for when
>> functions other than <clinit> are updating the final field, right?
>
> If inhibits quickening if methods other than <clinit> *or <init>* are
> updating the final field.
>
>> I am surprised this hurt performance because I would have thought
>> initializing final fields were an infrequent operation.
>
> Here are some more details to answer that.
>
> The CP cache does caching for get and put bytecodes individually.
> Normally, the caching for put (in 'put_code') should be set up by put
> instructions, the caching for get (in 'get_code') should be set up by
> get instruction. The optimization is to let get instructions set up
> caching for put bytecodes in some select cases.
>
> 1) The table below shows the way that worked before the regression
> (i.e., before JDK-8160527). The table shows if put_code is set (S) or
> not set (NS) after InterpreterRuntime::resolve_get_put(). The table
> applies to both static and instance fields. Furthermore, the table
> assumes that class initialization has already taken place (i.e.,
> uninitialized_static is false).
>
> | Instruction triggering resolution
> |===================================
> Field type | get | put
> ===================================================
> final | NS | S
> non-final | S | S
>
> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/29ed49c42486#l1.38
>
> 2) Unfortunately, JDK-8160527 changed the table to:
>
> | Instruction triggering resolution
> |===================================
> Field type | get | put
> ===================================================
> final | NS |*NS*
> non-final | *NS* | S
>
> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/29ed49c42486#l1.30
>
> The cause of the regression reported in JDK-8163880 is that
>
> (1) For final fields, setting up the CP cache for put_code is
> inhibited (Row 1). In most class files we have to deal with, only
> initializers (<clinit>, <init>) update final fields. Inhibiting CP
> caching for static final fields will most likely not impact
> performance too much, as <clinit> methods are executed infrequently.
> Inhibiting CP for instance fields has most likely have a larger
> performance impact, as <init> methods are executed more frequently.
>
> (2) For non-final fields, setting up the CP cache for put_code is not
> set if the triggering instruction is a get (i.e., setting up the CP
> cache is delayed until a put instruction attempts to set up the CP
> cache), which can cause a number of extra calls into the runtime.
>
> The regression is due to both (1) and (2). You pointed out (and I
> agree) that (1) has a small effect if the field is static. I think
> the effect of (1) is larger if the field is non-static. Also, there
> might be a large effect due to (2). But we'd need to do some
> experiments to confirm that.
>
> Currently, the system works is as follows.
>
> | Instruction triggering resolution
> |===================================
> Field type | get | put
> ===================================================
> final | NS | S (for fields in class filed version <
> 53) / NS (for field in class file version >= 53 that have
> initialized final updates)
> non-final | S | S
>
> We can change the final/put case to "S for all fields except ones with
> initialized final updates)", as you suggested. That is not necessary
> for correctness, but won't hurt performance either and could make code
> more readable. Please let me know if you'd like a bug to be filed for
> that -- I'd be happy to take care of it.
>
> Thank you!
>
> Best regards,
>
>
> Zoltan
>
>>
>> thanks,
>> Coleen
>>
>>
>> On 8/29/16 1:28 AM, Zoltán Majó wrote:
>>> Thank you, Vladimir, for the review!
>>>
>>> Best regards,
>>>
>>>
>>> Zoltan
>>>
>>>
>>> On 08/26/2016 07:58 PM, Vladimir Kozlov wrote:
>>>> Looks good.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 8/26/16 4:38 AM, Zoltán Majó wrote:
>>>>> Hi Vladimir,
>>>>>
>>>>>
>>>>> thank you for looking at this issue once more.
>>>>>
>>>>> On 08/25/2016 09:05 PM, Vladimir Kozlov wrote:
>>>>>> Okay I think I understand what you are trying to say but it was
>>>>>> not my
>>>>>> point.
>>>>>
>>>>> Oh, I see. Then I misunderstood you.
>>>>>
>>>>>> I was questioning why put_code is set when bytecode could be get
>>>>>> instruction. Reading comments and code around I think I
>>>>>> understand now.
>>>>>> We set both get_code and put_code marking that the field is
>>>>>> resolved.
>>>>>> But in some cases we delay marking to force field's reresolution.
>>>>>
>>>>> Yes, that is also my understanding.
>>>>>
>>>>>>
>>>>>> In next code you can use is_static:
>>>>>>
>>>>>> bool uninitialized_static = ((bytecode ==
>>>>>> Bytecodes::_getstatic ||
>>>>>> bytecode == Bytecodes::_putstatic) &&
>>>>>> !klass->is_initialized());
>>>>>
>>>>> OK, I've updated the code.
>>>>>
>>>>>>
>>>>>> Code could be rearranged to be more compact (similar to original
>>>>>> code):
>>>>>>
>>>>>> Bytecodes::Code put_code = (Bytecodes::Code)0;
>>>>>> Bytecodes::Code get_code = (Bytecodes::Code)0;
>>>>>> if (!uninitialized_static) {
>>>>>> get_code = ((is_static) ? Bytecodes::_getstatic :
>>>>>> Bytecodes::_getfield);
>>>>>> if ((is_put && !has_initialized_final_update) ||
>>>>>> !info.access_flags().is_final()) {
>>>>>> put_code = ((is_static) ? Bytecodes::_putstatic :
>>>>>> Bytecodes::_putfield);
>>>>>> }
>>>>>> }
>>>>>
>>>>> I agree and have updated the code accordingly.
>>>>>
>>>>> Here is the updated webrev:
>>>>> http://cr.openjdk.java.net/~zmajo/8163880/webrev.02/
>>>>>
>>>>> JPRT testing passes (testset hotspot).
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Best regards,
>>>>>
>>>>>
>>>>> Zoltan
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 8/23/16 5:31 AM, Zoltán Majó wrote:
>>>>>>> Hi Vladimir,
>>>>>>>
>>>>>>>
>>>>>>> thank you for the feedback! Please see details below.
>>>>>>>
>>>>>>> On 08/22/2016 08:20 PM, Vladimir Kozlov wrote:
>>>>>>>> Typo (double 'not'):
>>>>>>>> + // and the required IllegalAccessError would not not be
>>>>>>>> thrown.
>>>>>>>
>>>>>>> OK, I've corrected that.
>>>>>>>
>>>>>>>>
>>>>>>>> I think has_initialized_final_update should be part of assert
>>>>>>>> to see
>>>>>>>> it in hs_err:
>>>>>>>>
>>>>>>>> + if (has_initialized_final_update) {
>>>>>>>> + assert(info.access_flags().is_final(), "Fields with
>>>>>>>> initialized
>>>>>>>> final updates must be final");
>>>>>>>> + }
>>>>>>>
>>>>>>> Yes, I've updated that as well.
>>>>>>>
>>>>>>>>
>>>>>>>> Why is_put check is not done in outer check? It is missing for
>>>>>>>> is_final():
>>>>>>>>
>>>>>>>> + if (!uninitialized_static) {
>>>>>>>> + if ((is_put && !has_initialized_final_update) ||
>>>>>>>> !info.access_flags().is_final()) {
>>>>>>>
>>>>>>> I think it's an optimization to do it this way: For
>>>>>>> non-final-fields, we
>>>>>>> want to set up CP caching faster (i.e., we want to reduce the
>>>>>>> number of
>>>>>>> times we call into the runtime to
>>>>>>> LinkResolver::resolve_field_access()).We had that optimization
>>>>>>> in place
>>>>>>> before JDK-8160527, but not afterwards, which caused a performance
>>>>>>> regression. Please see some details below.
>>>>>>>
>>>>>>> The CP cache does caching for get and put bytecodes individually.
>>>>>>> Normally, the caching for put (in 'put_code') should be set up
>>>>>>> by put
>>>>>>> instructions, the caching for get (in 'get_code') should be set
>>>>>>> up by
>>>>>>> get instruction. The optimization is to let get instructions
>>>>>>> set up
>>>>>>> caching for put bytecodes in some select cases. Here is the way
>>>>>>> that
>>>>>>> (not) workedbefore/after JDK-8160527.
>>>>>>>
>>>>>>>
>>>>>>> 1) The table below describes the way resolution worked before
>>>>>>> JDK-8160527. The table shows if put_code was set (S) or not set
>>>>>>> (NS)
>>>>>>> after InterpreterRuntime::resolve_get_put().
>>>>>>>
>>>>>>> | Instruction triggering resolution
>>>>>>> |===================================
>>>>>>> Field type | get | put
>>>>>>> ===================================================
>>>>>>> final | NS |S
>>>>>>> non-final | S | S
>>>>>>>
>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/29ed49c42486#l1.38
>>>>>>>
>>>>>>>
>>>>>>> 2) Unfortunately, JDK-8160527 changed the table to:
>>>>>>>
>>>>>>> | Instruction triggering resolution
>>>>>>> |===================================
>>>>>>> Field type | get | put
>>>>>>> ===================================================
>>>>>>> final | NS |*NS*
>>>>>>> non-final | *NS* | S
>>>>>>>
>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/29ed49c42486#l1.30
>>>>>>>
>>>>>>> So the VM does not set up caching in *2additional cases*, which
>>>>>>> causes a
>>>>>>> performance regression.
>>>>>>>
>>>>>>>
>>>>>>> 3) The current webrev proposes the following:
>>>>>>>
>>>>>>> | Instruction triggering resolution
>>>>>>> |===================================
>>>>>>> Field type | get | put
>>>>>>> ===================================================
>>>>>>> final | NS |*S (unless
>>>>>>> has_initialized_final_update
>>>>>>> is true for the field)*
>>>>>>> non-final | S | S
>>>>>>>
>>>>>>> That table looks the same as what we had before JDK-8160527,
>>>>>>> except for
>>>>>>> fields with initialized final updates. For those fields we need
>>>>>>> to defer
>>>>>>> setting up CP caching so that the appropriate exception can be
>>>>>>> thrown at
>>>>>>> the right place (by LinkResolver::resolve_field_access()).
>>>>>>>
>>>>>>>
>>>>>>> 4) If we would do what you suggested (I hope I understood your
>>>>>>> suggestion correctly)thenwe'd move the is_put condition to the
>>>>>>> outer
>>>>>>> check:
>>>>>>>
>>>>>>> + if (!uninitialized_static && is_put) {
>>>>>>> + if (!has_initialized_final_update ||
>>>>>>> !info.access_flags().is_final()) {
>>>>>>>
>>>>>>> Then the table would look:
>>>>>>>
>>>>>>> | Instruction triggering resolution
>>>>>>> |===================================
>>>>>>> Field type | get | put
>>>>>>> ===================================================
>>>>>>> final | NS |S(unless
>>>>>>> has_initialized_final_update is
>>>>>>> true for the field)
>>>>>>> non-final | *NS* | S
>>>>>>>
>>>>>>> So we would have *still one case* where CP caching is not set up
>>>>>>> and
>>>>>>> that would most likely result in some performance regression
>>>>>>> (relative
>>>>>>> to pre-JDK-8160527).
>>>>>>>
>>>>>>>
>>>>>>> I hope I have not overlooked anythingabove.
>>>>>>>
>>>>>>> Here is the updated webrev:
>>>>>>> http://cr.openjdk.java.net/~zmajo/8163880/webrev.01/
>>>>>>>
>>>>>>> JPRT testing is running.
>>>>>>>
>>>>>>> Thank you!
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>>
>>>>>>> Zoltan
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 8/22/16 4:20 AM, Zoltán Majó wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> please review the fix for JDK-8163880.
>>>>>>>>>
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8163880
>>>>>>>>>
>>>>>>>>> Problem & solution: The performance regression reported by the
>>>>>>>>> bug is
>>>>>>>>> caused by unnecessary delay (or even inhibition) of constant pool
>>>>>>>>> caching for fields targeted by put instructions. For a more
>>>>>>>>> detailed description of the problem and possible solutions,
>>>>>>>>> please
>>>>>>>>> see the following comment in the bug report [1].
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~zmajo/8163880/webrev.00/
>>>>>>>>>
>>>>>>>>> Testing:
>>>>>>>>> - performance evaluation with the affected benchmark shows
>>>>>>>>> that fix
>>>>>>>>> above solves the performance regression;
>>>>>>>>> - all hotspot test with -Xmixed and -Xcomp (running).
>>>>>>>>>
>>>>>>>>> Thank you!
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Zoltan
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8163880?focusedCommentId=13990030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13990030
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>>
>
More information about the hotspot-dev
mailing list