RFR: 8074345: Enable RewriteBytecodes when VM runs with CDS

Ioi Lam ioi.lam at oracle.com
Wed Mar 25 16:38:48 UTC 2015


Hi Yumin,

The changes look good. Just a few nits:

*src/share/vm/interpreter/bytecodes.hpp:**
*
  293     // Rewritten at CDS dump time to | Original bytecode
  294     // _invoke_virtual rewritten on sparc, will be disabled if 
UseSharedSpaces turned on.
  295     // ------------------------------+------------------
  296     _nofast_getfield      , //  <- _getfield
  297     _nofast_putfield      , //  <- _putfield
  298     _nofast_aload_0       , //  <- _aload_0
  299     _nofast_iload         , //  <- _iload

I think it should be reformatted to line up the columns:

  293     // Rewritten at CDS dump time to | Original bytecode
295     // ------------------------------+------------------
  296     _nofast_getfield      ,         //  <- _getfield
  297     _nofast_putfield      ,         //  <- _putfield
  298     _nofast_aload_0       ,         //  <- _aload_0
  299     _nofast_iload         ,         //  <- _iload
230     // NOTE: _invoke_virtual is rewritten only on sparc. This will 
be disabled if
          // UseSharedSpaces turned on.

*src/share/vm/interpreter/rewriter.cpp:*

There are many places that modify the Method object. Instead of putting 
asserts at all the places where an actual modification happens, I think 
it's better to use only one assert at the Rewriter entry point, and 
remove the other assets that you added:

  516 void Rewriter::rewrite(instanceKlassHandle klass, TRAPS) {
+      if (!DumpSharedSpaces) {
+ assert(!MetaspaceShared::is_in_shared_space(klass()), "archive methods 
must not be rewritten at run time");
+      }
517   ResourceMark rm(THREAD);
  518   Rewriter     rw(klass, klass->constants(), klass->methods(), CHECK);
  519   // (That's all, folks.)
  520 }

Also, I am not sure if the PPC directories in the repo have been 
'locked' or not, but I guess you will find out when you do the push.

I am not a Reviewer, so probably Coleen needs to look at this as well.

Thanks
- Ioi

On 3/20/15, 1:53 PM, Yumin Qi wrote:
> Hi, Coleen and all
>
>   New version with suggested changes can be reviewed at:
>   http://cr.openjdk.java.net/~minqi/8074345/webrev02/
>
>   Removed _fast_invokeinvirtual from last version, disable rewriting 
> _invokevirtual if UseSharedSpaces turned on. Only on sparc 
> _invokevirtual got rewritten.  Other platforms as unimplemented.
>
>   Thanks
>   Yumin
>
> On 3/11/2015 1:23 PM, Yumin Qi wrote:
>> Thanks, I will have another webrev after build/test/perf test.
>>
>> Yumin
>>
>> On 3/11/2015 1:11 PM, Coleen Phillimore wrote:
>>>
>>> Hi Yumin,  One comment embedded.
>>>
>>> On 3/11/15, 2:04 PM, Yumin Qi wrote:
>>>> Hi, Coleen
>>>>
>>>>   Thanks for the review. See embedded.
>>>>
>>>> On 3/10/2015 2:54 PM, Coleen Phillimore wrote:
>>>>>
>>>>> Yumin,
>>>>>
>>>>> The new bytecode approach came out pretty cleanly, or as cleanly 
>>>>> as this could be.
>>>>>
>>>>> The file templateTable_x86_32 and 64 have just been merged, so 
>>>>> you'll have to make your change in the new version.
>>>>>
>>>>> I also have some comments:
>>>>>
>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/interpreter/rewriter.cpp.udiff.html 
>>>>>
>>>>>
>>>>> Typo "rewirting"
>>>>>
>>>> Sure.
>>>>> In these files, can you break up the long lines into three lines?
>>>>>
>>>>> -  if (!is_static) { patch_bytecode(Bytecodes::_fast_fgetfield, 
>>>>> Rbc, Rscratch); }
>>>>> +  if (!is_static && rc == MAY_REWRITE) { 
>>>>> patch_bytecode(Bytecodes::_fast_fgetfield, Rbc, Rscratch); }
>>>>>
>>>> Sure.
>>>>>
>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/interpreter/bytecodes.hpp.udiff.html 
>>>>>
>>>>>
>>>>> How many bytecodes do we have now?  We're limited to 255 (or 256) 
>>>>> and there are other new bytecodes being added.
>>>>>
>>>> now total is 234 (after the fix). See below answer.
>>>>> What was the performance benefit to this?   I think if we wanted 
>>>>> to be conservative, we'd turn off RewriteFrequentPairs and only do 
>>>>> nofast_getfield and nofast_putfield.  I think they were the only 
>>>>> bytecodes that actually affected performance.
>>>>>
>>>>> In this file above, can you remove the last block of comments 
>>>>> about fast_linearswitch and fast_ldc?  I think this confuses 
>>>>> rewriting in the interpreter and rewriting in the rewriter, or 
>>>>> rather makes the confusion worse.  I don't think this comment is 
>>>>> helpful.
>>>>>
>>>>> I'd prefer to see the first comment smaller also, like:
>>>>>
>>>>> +    // These bytecodes are rewritten at CDS dump time, so that we 
>>>>> can prevent them from being
>>>>> +    // rewritten at run time. This way, the ConstMethods can be 
>>>>> placed in the CDS ReadOnly
>>>>> +    // section, and RewriteByteCodes/RewriteFrequentPairs can 
>>>>> rewrite non-CDS bytecodes
>>>>> +    // at run time.
>>>>> +    _nofast_getfield      ,
>>>>> +    _nofast_putfield      ,
>>>>> +    _nofast_aload_0       ,
>>>>> +    _nofast_iload         ,
>>>>> +    _nofast_invokevirtual ,
>>>>>
>>>>> It's sort of obvious which bytecode they rewrite.  I don't know 
>>>>> how much performance fast_invokevfinal is worth.  I thought I 
>>>>> deleted it.  Can we not rewrite this so we don't waste another 
>>>>> bytecode on it?  Maybe add a RewriteVFinal option and consider 
>>>>> removing it for the future?  x86 doesn't use it and I can't see 
>>>>> how this would save any significant performance to be worth having!
>>>>>
>>>> _invokevirtual got rewritten on sparc and ppc. Now ppc is removed, 
>>>> no need to take care for it. For sparc, it does patch code. I am 
>>>> thinking of a way if we need to add _nofast_code as you indicated, 
>>>> we only have 255 codes to use.
>>>
>>> PPC isn't removed from the open repository.
>>>>
>>>>    bool not_rewrite = UseSharedSpaces && RewriteBytecodes && 
>>>> RewriteFrequentPair;
>>>
>>> I think the conditional would be
>>>
>>>    bool not_rewrite = UseSharedSpaces || !RewriteBytecodes;
>>>>
>>>>    Can this boolean decide if we not rewrite the bytecode to fast? 
>>>> If so, I can remove all the _nofast_code and do not patch code when 
>>>> it is on.
>>>
>>> Yes, this would be nice to not add the bytecode.
>>>
>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/interpreter/templateTable.hpp.udiff.html 
>>>>>
>>>>>
>>>>> +  enum RewriteControl { MAY_REWRITE, MAY_NOT_REWRITE }; // 
>>>>> control for fast code under CDS
>>>>>
>>>>>
>>>>> I don't know what our coding standard is but in the 
>>>>> templateTable_<cpu>.cpp files these strings look like macros. I'd 
>>>>> rather see them as MayRewrite or MayNotRewrite.
>>>>>
>>>> Agree.
>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/cpu/sparc/vm/templateTable_sparc.cpp.udiff.html 
>>>>>
>>>>>
>>>>> I think there's a java_code() function that returns the original 
>>>>> bytecode that you could use instead of the case statement in 
>>>>> resolve_cache_and_index().   The indentation is odd in the webrev. 
>>>>> This probably applies to the other cpu directories.
>>>>>
>>>>> One last question below:
>>>>>
>>>>> On 3/5/15, 4:21 PM, Yumin Qi wrote:
>>>>>> Please review:
>>>>>>
>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8074345
>>>>>> webrev: http://cr.openjdk.java.net/~minqi/8074345/
>>>>>>
>>>>>> Summary: Currently CDS when is disabled, RewriteBytecodes and 
>>>>>> RewriteFrequentPairs are disabled due to ConstantMethod in CDS 
>>>>>> are mapped read only. So memory fault will be triggered when 
>>>>>> RewriteBytecodes turned on. This also disable all method 
>>>>>> rewritten, leads interpreter run slower. Observed about 2% 
>>>>>> regression with C2 on some benchmarks, since interpreter speed is 
>>>>>> important to C2. By enable RewriteBytecodes and 
>>>>>> RewriteFrequentPairs under CDS enabled, adding _nofast_xxxx for 
>>>>>> corresponding fast codes at dump time to avoid byte code 
>>>>>> rewritten at run time, we prevent byte code rewritten and modify 
>>>>>> the read only shared portion in CDS. Meanwhile other byte codes 
>>>>>> with fast codes still get speed up.
>>>>>>
>>>>>> Tests: JPRT, jtreg, refworkload (20+ benchmarks) on all supported 
>>>>>> platforms.  Interpreter only tests showed about 3% improvement.
>>>>>
>>>>> What performance did you measure?  Is it -Xint -Xshare:on with and 
>>>>> without your patch?  It was only 3% better?
>>>>>
>>>>> What was the difference in performance with -Xint 
>>>>> -XX:-RewriteBytecodes vs. -Xint 
>>>>> -XX:+RewriteBytecodes/FrequentPairs? I thought this was around 15%.
>>>>>
>>>> I will send you a separate email of the links which run with 
>>>> CDS/NoCDS/CDS+Int
>>>
>>> Thanks for the links.  From your experiments, I think your 
>>> performance improvement with your patch and CDS with -Xmixed is 4%. 
>>> That's good enough for a couple of bytecodes.
>>>
>>> Coleen
>>>
>>>>
>>>> Thanks
>>>> Yumin
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>
>>>>>
>>>>
>>>
>>
>



More information about the hotspot-runtime-dev mailing list