[9] RFR (X): 8163880: Constant pool caching of fields inhibited/delayed unnecessarily

Coleen Phillimore coleen.phillimore at oracle.com
Thu Sep 1 00:28:16 UTC 2016



On 8/30/16 7:24 AM, Zoltán Majó wrote:
> Hi Coleen,
>
>
> thank you for the review!  Please see details below.
>
> On 08/29/2016 10:31 PM, Coleen Phillimore wrote:
>>
>> http://cr.openjdk.java.net/~zmajo/8163880/webrev.02/src/share/vm/interpreter/interpreterRuntime.cpp.udiff.html 
>>
>>
>> I thought we always want to suppress resolving a static putfield, and 
>> only want to give the error if >= classfile version 53.
>
> Yes, we throw an IllegalAccessError() only for class file version >= 
> 53.  For class file version < 53 we do not to throw an error.  
> Delaying CP cache resolution is therefore not necessary for class file 
> version < 53.  That is why we have the classfile version check in place.

I see where my confusion was now.
thanks,
Coleen
>
> If we remove the classfile version check, we'll delay CP caching for 
> class file version < 53 as well.  That won't hurt correctness and is 
> will not hurt performance either (I did some experiments to verify that).
>
> Unfortunately, I pushed the fixed before your mail arrived.  But we 
> can open a new bug to handle the issue you have raised.  Please let me 
> know if you'd like that and I can take care of the bug.
>
>>
>> *+bool has_initialized_final_update = 
>> info.field_holder()->major_version() >= 53 &&*
>> *+ info.has_initialized_final_update();*
>>
>>
>> So, not check the classfile version here.
>>
>> If I understand correctly, this only inhibits the quickening for when 
>> functions other than <clinit> are updating the final field, right?
>
> If inhibits quickening if methods other than <clinit> *or <init>* are 
> updating the final field.
>
>> I am surprised this hurt performance because I would have thought 
>> initializing final fields were an infrequent operation.
>
> Here are some more details to answer that.
>
> The CP cache does caching for get and put bytecodes individually. 
> Normally, the caching for put (in 'put_code') should be set up by put 
> instructions, the caching for get (in 'get_code') should be set up by 
> get instruction.  The optimization is to let get instructions set up 
> caching for put bytecodes in some select cases.
>
> 1) The table below shows the way that worked before the regression 
> (i.e., before JDK-8160527). The table shows if put_code is set (S) or 
> not set (NS) after InterpreterRuntime::resolve_get_put().  The table 
> applies to both static and instance fields.  Furthermore, the table 
> assumes that class initialization has already taken place (i.e., 
> uninitialized_static is false).
>
>                | Instruction triggering resolution
>                |===================================
> Field type     | get          | put
> ===================================================
> final          | NS           | S
> non-final      | S            | S
>
> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/29ed49c42486#l1.38
>
> 2) Unfortunately, JDK-8160527 changed the table to:
>
>                | Instruction triggering resolution
>                |===================================
> Field type     | get          | put
> ===================================================
> final          | NS           |*NS*
> non-final      | *NS*         | S
>
> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/29ed49c42486#l1.30
>
> The cause of the regression reported in JDK-8163880 is that
>
> (1) For final fields, setting up the CP cache for put_code is 
> inhibited (Row 1). In most class files we have to deal with, only 
> initializers (<clinit>, <init>) update final fields. Inhibiting CP 
> caching for static final fields will most likely not impact 
> performance too much, as <clinit> methods are executed infrequently.  
> Inhibiting CP for instance fields has most likely have a larger 
> performance impact, as <init> methods are executed more frequently.
>
> (2) For non-final fields, setting up the CP cache for put_code is not 
> set if the triggering instruction is a get (i.e., setting up the CP 
> cache is delayed until a put instruction attempts to set up the CP 
> cache), which can cause a number of extra calls into the runtime.
>
> The regression is due to both (1) and (2).  You pointed out (and I 
> agree) that (1) has a small effect if the field is static.  I think 
> the effect of (1) is larger if the field is non-static. Also, there 
> might be a large effect due to (2).  But we'd need to do some 
> experiments to confirm that.
>
> Currently, the system works is as follows.
>
>                | Instruction triggering resolution
>                |===================================
> Field type     | get          | put
> ===================================================
> final          | NS           | S (for fields in class filed version < 
> 53) / NS (for field in class file version >= 53 that have
> initialized final updates)
> non-final      | S            | S
>
> We can change the final/put case to "S for all fields except ones with 
> initialized final updates)", as you suggested.  That is not necessary 
> for correctness, but won't hurt performance either and could make code 
> more readable.  Please let me know if you'd like a bug to be filed for 
> that -- I'd be happy to take care of it.
>
> Thank you!
>
> Best regards,
>
>
> Zoltan
>
>>
>> thanks,
>> Coleen
>>
>>
>> On 8/29/16 1:28 AM, Zoltán Majó wrote:
>>> Thank you, Vladimir, for the review!
>>>
>>> Best regards,
>>>
>>>
>>> Zoltan
>>>
>>>
>>> On 08/26/2016 07:58 PM, Vladimir Kozlov wrote:
>>>> Looks good.
>>>>
>>>> Thanks,
>>>> Vladimir
>>>>
>>>> On 8/26/16 4:38 AM, Zoltán Majó wrote:
>>>>> Hi Vladimir,
>>>>>
>>>>>
>>>>> thank you for looking at this issue once more.
>>>>>
>>>>> On 08/25/2016 09:05 PM, Vladimir Kozlov wrote:
>>>>>> Okay I think I understand what you are trying to say but it was 
>>>>>> not my
>>>>>> point.
>>>>>
>>>>> Oh, I see.  Then I misunderstood you.
>>>>>
>>>>>> I was questioning why put_code is set when bytecode could be get
>>>>>> instruction. Reading comments and code around I think I 
>>>>>> understand now.
>>>>>> We set both get_code and put_code marking that the field is 
>>>>>> resolved.
>>>>>> But in some cases we delay marking to force field's reresolution.
>>>>>
>>>>> Yes, that is also my understanding.
>>>>>
>>>>>>
>>>>>> In next code you can use is_static:
>>>>>>
>>>>>>     bool uninitialized_static = ((bytecode == 
>>>>>> Bytecodes::_getstatic ||
>>>>>> bytecode == Bytecodes::_putstatic) &&
>>>>>> !klass->is_initialized());
>>>>>
>>>>> OK, I've updated the code.
>>>>>
>>>>>>
>>>>>> Code could be rearranged to be more compact (similar to original 
>>>>>> code):
>>>>>>
>>>>>>     Bytecodes::Code put_code = (Bytecodes::Code)0;
>>>>>>     Bytecodes::Code get_code = (Bytecodes::Code)0;
>>>>>>     if (!uninitialized_static) {
>>>>>>       get_code = ((is_static) ? Bytecodes::_getstatic :
>>>>>> Bytecodes::_getfield);
>>>>>>       if ((is_put && !has_initialized_final_update) ||
>>>>>> !info.access_flags().is_final()) {
>>>>>>         put_code = ((is_static) ? Bytecodes::_putstatic :
>>>>>> Bytecodes::_putfield);
>>>>>>       }
>>>>>>     }
>>>>>
>>>>> I agree and have updated the code accordingly.
>>>>>
>>>>> Here is the updated webrev:
>>>>> http://cr.openjdk.java.net/~zmajo/8163880/webrev.02/
>>>>>
>>>>> JPRT testing passes (testset hotspot).
>>>>>
>>>>> Thank you!
>>>>>
>>>>> Best regards,
>>>>>
>>>>>
>>>>> Zoltan
>>>>>
>>>>>>
>>>>>> Thanks,
>>>>>> Vladimir
>>>>>>
>>>>>> On 8/23/16 5:31 AM, Zoltán Majó wrote:
>>>>>>> Hi Vladimir,
>>>>>>>
>>>>>>>
>>>>>>> thank you for the feedback! Please see details below.
>>>>>>>
>>>>>>> On 08/22/2016 08:20 PM, Vladimir Kozlov wrote:
>>>>>>>> Typo (double 'not'):
>>>>>>>> +   // and the required IllegalAccessError would not not be 
>>>>>>>> thrown.
>>>>>>>
>>>>>>> OK, I've corrected that.
>>>>>>>
>>>>>>>>
>>>>>>>> I think has_initialized_final_update should be part of assert 
>>>>>>>> to see
>>>>>>>> it in hs_err:
>>>>>>>>
>>>>>>>> +   if (has_initialized_final_update) {
>>>>>>>> +     assert(info.access_flags().is_final(), "Fields with 
>>>>>>>> initialized
>>>>>>>> final updates must be final");
>>>>>>>> +   }
>>>>>>>
>>>>>>> Yes, I've updated that as well.
>>>>>>>
>>>>>>>>
>>>>>>>> Why is_put check is not done in outer check? It is missing for
>>>>>>>> is_final():
>>>>>>>>
>>>>>>>> +   if (!uninitialized_static) {
>>>>>>>> +     if ((is_put && !has_initialized_final_update) ||
>>>>>>>> !info.access_flags().is_final()) {
>>>>>>>
>>>>>>> I think it's an optimization to do it this way: For 
>>>>>>> non-final-fields, we
>>>>>>> want to set up CP caching faster (i.e., we want to reduce the 
>>>>>>> number of
>>>>>>> times we call into the runtime to
>>>>>>> LinkResolver::resolve_field_access()).We had that optimization 
>>>>>>> in place
>>>>>>> before JDK-8160527, but not afterwards, which caused a performance
>>>>>>> regression.  Please see some details below.
>>>>>>>
>>>>>>> The CP cache does caching for get and put bytecodes individually.
>>>>>>> Normally, the caching for put (in 'put_code') should be set up 
>>>>>>> by put
>>>>>>> instructions, the caching for get (in 'get_code') should be set 
>>>>>>> up by
>>>>>>> get instruction.  The optimization is to let get instructions 
>>>>>>> set up
>>>>>>> caching for put bytecodes in some select cases.  Here is the way 
>>>>>>> that
>>>>>>> (not) workedbefore/after JDK-8160527.
>>>>>>>
>>>>>>>
>>>>>>> 1) The table below describes the way resolution worked before
>>>>>>> JDK-8160527.  The table shows if put_code was set (S) or not set 
>>>>>>> (NS)
>>>>>>> after InterpreterRuntime::resolve_get_put().
>>>>>>>
>>>>>>>                | Instruction triggering resolution
>>>>>>> |===================================
>>>>>>> Field type     | get          | put
>>>>>>> ===================================================
>>>>>>> final          | NS           |S
>>>>>>> non-final      | S            | S
>>>>>>>
>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/29ed49c42486#l1.38
>>>>>>>
>>>>>>>
>>>>>>> 2) Unfortunately, JDK-8160527 changed the table to:
>>>>>>>
>>>>>>>                | Instruction triggering resolution
>>>>>>> |===================================
>>>>>>> Field type     | get          | put
>>>>>>> ===================================================
>>>>>>> final          | NS           |*NS*
>>>>>>> non-final      | *NS*         | S
>>>>>>>
>>>>>>> http://hg.openjdk.java.net/jdk9/jdk9/hotspot/rev/29ed49c42486#l1.30
>>>>>>>
>>>>>>> So the VM does not set up caching in *2additional cases*, which 
>>>>>>> causes a
>>>>>>> performance regression.
>>>>>>>
>>>>>>>
>>>>>>> 3) The current webrev proposes the following:
>>>>>>>
>>>>>>>                | Instruction triggering resolution
>>>>>>> |===================================
>>>>>>> Field type     | get          | put
>>>>>>> ===================================================
>>>>>>> final          | NS           |*S (unless 
>>>>>>> has_initialized_final_update
>>>>>>> is true for the field)*
>>>>>>> non-final      | S            | S
>>>>>>>
>>>>>>> That table looks the same as what we had before JDK-8160527, 
>>>>>>> except for
>>>>>>> fields with initialized final updates. For those fields we need 
>>>>>>> to defer
>>>>>>> setting up CP caching so that the appropriate exception can be 
>>>>>>> thrown at
>>>>>>> the right place (by LinkResolver::resolve_field_access()).
>>>>>>>
>>>>>>>
>>>>>>> 4) If we would do what you suggested (I hope I understood your
>>>>>>> suggestion correctly)thenwe'd move the is_put condition to the 
>>>>>>> outer
>>>>>>> check:
>>>>>>>
>>>>>>> +   if (!uninitialized_static && is_put) {
>>>>>>> +     if (!has_initialized_final_update ||
>>>>>>> !info.access_flags().is_final()) {
>>>>>>>
>>>>>>> Then the table would look:
>>>>>>>
>>>>>>>                | Instruction triggering resolution
>>>>>>> |===================================
>>>>>>> Field type     | get          | put
>>>>>>> ===================================================
>>>>>>> final          | NS           |S(unless 
>>>>>>> has_initialized_final_update is
>>>>>>> true for the field)
>>>>>>> non-final      | *NS*         | S
>>>>>>>
>>>>>>> So we would have *still one case* where CP caching is not set up 
>>>>>>> and
>>>>>>> that would most likely result in some performance regression 
>>>>>>> (relative
>>>>>>> to pre-JDK-8160527).
>>>>>>>
>>>>>>>
>>>>>>> I hope I have not overlooked anythingabove.
>>>>>>>
>>>>>>> Here is the updated webrev:
>>>>>>> http://cr.openjdk.java.net/~zmajo/8163880/webrev.01/
>>>>>>>
>>>>>>> JPRT testing is running.
>>>>>>>
>>>>>>> Thank you!
>>>>>>>
>>>>>>> Best regards,
>>>>>>>
>>>>>>>
>>>>>>> Zoltan
>>>>>>>
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks,
>>>>>>>> Vladimir
>>>>>>>>
>>>>>>>> On 8/22/16 4:20 AM, Zoltán Majó wrote:
>>>>>>>>> Hi,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> please review the fix for JDK-8163880.
>>>>>>>>>
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8163880
>>>>>>>>>
>>>>>>>>> Problem & solution: The performance regression reported by the 
>>>>>>>>> bug is
>>>>>>>>> caused by unnecessary delay (or even inhibition) of constant pool
>>>>>>>>> caching for fields targeted by put instructions. For a more
>>>>>>>>> detailed description of the problem and possible solutions, 
>>>>>>>>> please
>>>>>>>>> see the following comment in the bug report [1].
>>>>>>>>>
>>>>>>>>> Webrev:
>>>>>>>>> http://cr.openjdk.java.net/~zmajo/8163880/webrev.00/
>>>>>>>>>
>>>>>>>>> Testing:
>>>>>>>>> - performance evaluation with the affected benchmark shows 
>>>>>>>>> that fix
>>>>>>>>> above solves the performance regression;
>>>>>>>>> - all hotspot test with -Xmixed and -Xcomp (running).
>>>>>>>>>
>>>>>>>>> Thank you!
>>>>>>>>>
>>>>>>>>> Best regards,
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Zoltan
>>>>>>>>>
>>>>>>>>> [1]
>>>>>>>>> https://bugs.openjdk.java.net/browse/JDK-8163880?focusedCommentId=13990030&page=com.atlassian.jira.plugin.system.issuetabpanels:comment-tabpanel#comment-13990030 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>
>>>>>
>>>
>>
>



More information about the hotspot-dev mailing list