RFR: 8074345: Enable RewriteBytecodes when VM runs with CDS

Yumin Qi yumin.qi at oracle.com
Wed Mar 25 21:48:27 UTC 2015


Thanks for your review!

Yumin

On 3/25/2015 2:45 PM, Coleen Phillimore wrote:
>
> Yumin,
>
> This looks great.  You addressed my earlier pre-review comments.
>
> There are two other architectures now, that I don't believe we build 
> and test.   I think it's only cpu code that would be changed so the 
> fixes for these architectures won't need a sponsor.
>
> Zero doesn't support CDS so I think the minimal changes you've put in 
> bytecodeInterpreter.cpp are good.
>
> Thanks!
> Coleen
>
>
> On 3/25/15, 5:24 PM, Yumin Qi wrote:
>> Hi,  Coleen
>>
>>   New version based on Ioi's suggestion is located at:
>> http://cr.openjdk.java.net/~minqi/8074345/webrev03/
>>
>>   Test: JPRT. Manual test on -Xshare:[dump | on ]
>>
>> Thanks
>> Yumin
>>
>> On 3/25/2015 9:58 AM, Coleen Phillimore wrote:
>>>
>>> Yes, this was on my to-do list, sorry I haven't gotten to it yet.
>>> Coleen
>>>
>>> On 3/25/15, 12:38 PM, Ioi Lam wrote:
>>>> Hi Yumin,
>>>>
>>>> The changes look good. Just a few nits:
>>>>
>>>> *src/share/vm/interpreter/bytecodes.hpp:**
>>>> *
>>>>  293     // Rewritten at CDS dump time to | Original bytecode
>>>>  294     // _invoke_virtual rewritten on sparc, will be disabled if 
>>>> UseSharedSpaces turned on.
>>>>  295     // ------------------------------+------------------
>>>>  296     _nofast_getfield      , //  <- _getfield
>>>>  297     _nofast_putfield      , //  <- _putfield
>>>>  298     _nofast_aload_0       , //  <- _aload_0
>>>>  299     _nofast_iload         , //  <- _iload
>>>>
>>>> I think it should be reformatted to line up the columns:
>>>>
>>>>  293     // Rewritten at CDS dump time to | Original bytecode
>>>> 295     // ------------------------------+------------------
>>>>  296     _nofast_getfield      ,         //  <- _getfield
>>>>  297     _nofast_putfield      ,         //  <- _putfield
>>>>  298     _nofast_aload_0       ,         //  <- _aload_0
>>>>  299     _nofast_iload         ,         //  <- _iload
>>>> 230     // NOTE: _invoke_virtual is rewritten only on sparc. This 
>>>> will be disabled if
>>>>          // UseSharedSpaces turned on.
>>>>
>>>> *src/share/vm/interpreter/rewriter.cpp:*
>>>>
>>>> There are many places that modify the Method object. Instead of 
>>>> putting asserts at all the places where an actual modification 
>>>> happens, I think it's better to use only one assert at the Rewriter 
>>>> entry point, and remove the other assets that you added:
>>>>
>>>>  516 void Rewriter::rewrite(instanceKlassHandle klass, TRAPS) {
>>>> +      if (!DumpSharedSpaces) {
>>>> + assert(!MetaspaceShared::is_in_shared_space(klass()), "archive 
>>>> methods must not be rewritten at run time");
>>>> +      }
>>>> 517   ResourceMark rm(THREAD);
>>>>  518   Rewriter     rw(klass, klass->constants(), klass->methods(), 
>>>> CHECK);
>>>>  519   // (That's all, folks.)
>>>>  520 }
>>>>
>>>> Also, I am not sure if the PPC directories in the repo have been 
>>>> 'locked' or not, but I guess you will find out when you do the push.
>>>>
>>>> I am not a Reviewer, so probably Coleen needs to look at this as well.
>>>>
>>>> Thanks
>>>> - Ioi
>>>>
>>>> On 3/20/15, 1:53 PM, Yumin Qi wrote:
>>>>> Hi, Coleen and all
>>>>>
>>>>>   New version with suggested changes can be reviewed at:
>>>>> http://cr.openjdk.java.net/~minqi/8074345/webrev02/
>>>>>
>>>>>   Removed _fast_invokeinvirtual from last version, disable 
>>>>> rewriting _invokevirtual if UseSharedSpaces turned on. Only on 
>>>>> sparc _invokevirtual got rewritten.  Other platforms as 
>>>>> unimplemented.
>>>>>
>>>>>   Thanks
>>>>>   Yumin
>>>>>
>>>>> On 3/11/2015 1:23 PM, Yumin Qi wrote:
>>>>>> Thanks, I will have another webrev after build/test/perf test.
>>>>>>
>>>>>> Yumin
>>>>>>
>>>>>> On 3/11/2015 1:11 PM, Coleen Phillimore wrote:
>>>>>>>
>>>>>>> Hi Yumin,  One comment embedded.
>>>>>>>
>>>>>>> On 3/11/15, 2:04 PM, Yumin Qi wrote:
>>>>>>>> Hi, Coleen
>>>>>>>>
>>>>>>>>   Thanks for the review. See embedded.
>>>>>>>>
>>>>>>>> On 3/10/2015 2:54 PM, Coleen Phillimore wrote:
>>>>>>>>>
>>>>>>>>> Yumin,
>>>>>>>>>
>>>>>>>>> The new bytecode approach came out pretty cleanly, or as 
>>>>>>>>> cleanly as this could be.
>>>>>>>>>
>>>>>>>>> The file templateTable_x86_32 and 64 have just been merged, so 
>>>>>>>>> you'll have to make your change in the new version.
>>>>>>>>>
>>>>>>>>> I also have some comments:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/interpreter/rewriter.cpp.udiff.html 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Typo "rewirting"
>>>>>>>>>
>>>>>>>> Sure.
>>>>>>>>> In these files, can you break up the long lines into three lines?
>>>>>>>>>
>>>>>>>>> -  if (!is_static) { 
>>>>>>>>> patch_bytecode(Bytecodes::_fast_fgetfield, Rbc, Rscratch); }
>>>>>>>>> +  if (!is_static && rc == MAY_REWRITE) { 
>>>>>>>>> patch_bytecode(Bytecodes::_fast_fgetfield, Rbc, Rscratch); }
>>>>>>>>>
>>>>>>>> Sure.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/interpreter/bytecodes.hpp.udiff.html 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> How many bytecodes do we have now?  We're limited to 255 (or 
>>>>>>>>> 256) and there are other new bytecodes being added.
>>>>>>>>>
>>>>>>>> now total is 234 (after the fix). See below answer.
>>>>>>>>> What was the performance benefit to this?   I think if we 
>>>>>>>>> wanted to be conservative, we'd turn off RewriteFrequentPairs 
>>>>>>>>> and only do nofast_getfield and nofast_putfield. I think they 
>>>>>>>>> were the only bytecodes that actually affected performance.
>>>>>>>>>
>>>>>>>>> In this file above, can you remove the last block of comments 
>>>>>>>>> about fast_linearswitch and fast_ldc? I think this confuses 
>>>>>>>>> rewriting in the interpreter and rewriting in the rewriter, or 
>>>>>>>>> rather makes the confusion worse.  I don't think this comment 
>>>>>>>>> is helpful.
>>>>>>>>>
>>>>>>>>> I'd prefer to see the first comment smaller also, like:
>>>>>>>>>
>>>>>>>>> +    // These bytecodes are rewritten at CDS dump time, so 
>>>>>>>>> that we can prevent them from being
>>>>>>>>> +    // rewritten at run time. This way, the ConstMethods can 
>>>>>>>>> be placed in the CDS ReadOnly
>>>>>>>>> +    // section, and RewriteByteCodes/RewriteFrequentPairs can 
>>>>>>>>> rewrite non-CDS bytecodes
>>>>>>>>> +    // at run time.
>>>>>>>>> +    _nofast_getfield      ,
>>>>>>>>> +    _nofast_putfield      ,
>>>>>>>>> +    _nofast_aload_0       ,
>>>>>>>>> +    _nofast_iload         ,
>>>>>>>>> +    _nofast_invokevirtual ,
>>>>>>>>>
>>>>>>>>> It's sort of obvious which bytecode they rewrite. I don't know 
>>>>>>>>> how much performance fast_invokevfinal is worth.  I thought I 
>>>>>>>>> deleted it.  Can we not rewrite this so we don't waste another 
>>>>>>>>> bytecode on it?  Maybe add a RewriteVFinal option and consider 
>>>>>>>>> removing it for the future? x86 doesn't use it and I can't see 
>>>>>>>>> how this would save any significant performance to be worth 
>>>>>>>>> having!
>>>>>>>>>
>>>>>>>> _invokevirtual got rewritten on sparc and ppc. Now ppc is 
>>>>>>>> removed, no need to take care for it. For sparc, it does patch 
>>>>>>>> code. I am thinking of a way if we need to add _nofast_code as 
>>>>>>>> you indicated, we only have 255 codes to use.
>>>>>>>
>>>>>>> PPC isn't removed from the open repository.
>>>>>>>>
>>>>>>>>    bool not_rewrite = UseSharedSpaces && RewriteBytecodes && 
>>>>>>>> RewriteFrequentPair;
>>>>>>>
>>>>>>> I think the conditional would be
>>>>>>>
>>>>>>>    bool not_rewrite = UseSharedSpaces || !RewriteBytecodes;
>>>>>>>>
>>>>>>>>    Can this boolean decide if we not rewrite the bytecode to 
>>>>>>>> fast? If so, I can remove all the _nofast_code and do not patch 
>>>>>>>> code when it is on.
>>>>>>>
>>>>>>> Yes, this would be nice to not add the bytecode.
>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/interpreter/templateTable.hpp.udiff.html 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> +  enum RewriteControl { MAY_REWRITE, MAY_NOT_REWRITE };  // 
>>>>>>>>> control for fast code under CDS
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I don't know what our coding standard is but in the 
>>>>>>>>> templateTable_<cpu>.cpp files these strings look like macros. 
>>>>>>>>> I'd rather see them as MayRewrite or MayNotRewrite.
>>>>>>>>>
>>>>>>>> Agree.
>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/cpu/sparc/vm/templateTable_sparc.cpp.udiff.html 
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> I think there's a java_code() function that returns the 
>>>>>>>>> original bytecode that you could use instead of the case 
>>>>>>>>> statement in resolve_cache_and_index().   The indentation is 
>>>>>>>>> odd in the webrev. This probably applies to the other cpu 
>>>>>>>>> directories.
>>>>>>>>>
>>>>>>>>> One last question below:
>>>>>>>>>
>>>>>>>>> On 3/5/15, 4:21 PM, Yumin Qi wrote:
>>>>>>>>>> Please review:
>>>>>>>>>>
>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8074345
>>>>>>>>>> webrev: http://cr.openjdk.java.net/~minqi/8074345/
>>>>>>>>>>
>>>>>>>>>> Summary: Currently CDS when is disabled, RewriteBytecodes and 
>>>>>>>>>> RewriteFrequentPairs are disabled due to ConstantMethod in 
>>>>>>>>>> CDS are mapped read only. So memory fault will be triggered 
>>>>>>>>>> when RewriteBytecodes turned on. This also disable all method 
>>>>>>>>>> rewritten, leads interpreter run slower. Observed about 2% 
>>>>>>>>>> regression with C2 on some benchmarks, since interpreter 
>>>>>>>>>> speed is important to C2. By enable RewriteBytecodes and 
>>>>>>>>>> RewriteFrequentPairs under CDS enabled, adding _nofast_xxxx 
>>>>>>>>>> for corresponding fast codes at dump time to avoid byte code 
>>>>>>>>>> rewritten at run time, we prevent byte code rewritten and 
>>>>>>>>>> modify the read only shared portion in CDS. Meanwhile other 
>>>>>>>>>> byte codes with fast codes still get speed up.
>>>>>>>>>>
>>>>>>>>>> Tests: JPRT, jtreg, refworkload (20+ benchmarks) on all 
>>>>>>>>>> supported platforms.  Interpreter only tests showed about 3% 
>>>>>>>>>> improvement.
>>>>>>>>>
>>>>>>>>> What performance did you measure?  Is it -Xint -Xshare:on with 
>>>>>>>>> and without your patch?  It was only 3% better?
>>>>>>>>>
>>>>>>>>> What was the difference in performance with -Xint 
>>>>>>>>> -XX:-RewriteBytecodes vs. -Xint 
>>>>>>>>> -XX:+RewriteBytecodes/FrequentPairs? I thought this was around 
>>>>>>>>> 15%.
>>>>>>>>>
>>>>>>>> I will send you a separate email of the links which run with 
>>>>>>>> CDS/NoCDS/CDS+Int
>>>>>>>
>>>>>>> Thanks for the links.  From your experiments, I think your 
>>>>>>> performance improvement with your patch and CDS with -Xmixed is 
>>>>>>> 4%. That's good enough for a couple of bytecodes.
>>>>>>>
>>>>>>> Coleen
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Yumin
>>>>>>>>> Thanks,
>>>>>>>>> Coleen
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Thanks
>>>>>>>>>> Yumin
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list