RFR: 8074345: Enable RewriteBytecodes when VM runs with CDS
Yumin Qi
yumin.qi at oracle.com
Wed Mar 25 21:24:51 UTC 2015
Hi, Coleen
New version based on Ioi's suggestion is located at:
http://cr.openjdk.java.net/~minqi/8074345/webrev03/
Test: JPRT. Manual test on -Xshare:[dump | on ]
Thanks
Yumin
On 3/25/2015 9:58 AM, Coleen Phillimore wrote:
>
> Yes, this was on my to-do list, sorry I haven't gotten to it yet.
> Coleen
>
> On 3/25/15, 12:38 PM, Ioi Lam wrote:
>> Hi Yumin,
>>
>> The changes look good. Just a few nits:
>>
>> *src/share/vm/interpreter/bytecodes.hpp:**
>> *
>> 293 // Rewritten at CDS dump time to | Original bytecode
>> 294 // _invoke_virtual rewritten on sparc, will be disabled if
>> UseSharedSpaces turned on.
>> 295 // ------------------------------+------------------
>> 296 _nofast_getfield , // <- _getfield
>> 297 _nofast_putfield , // <- _putfield
>> 298 _nofast_aload_0 , // <- _aload_0
>> 299 _nofast_iload , // <- _iload
>>
>> I think it should be reformatted to line up the columns:
>>
>> 293 // Rewritten at CDS dump time to | Original bytecode
>> 295 // ------------------------------+------------------
>> 296 _nofast_getfield , // <- _getfield
>> 297 _nofast_putfield , // <- _putfield
>> 298 _nofast_aload_0 , // <- _aload_0
>> 299 _nofast_iload , // <- _iload
>> 230 // NOTE: _invoke_virtual is rewritten only on sparc. This
>> will be disabled if
>> // UseSharedSpaces turned on.
>>
>> *src/share/vm/interpreter/rewriter.cpp:*
>>
>> There are many places that modify the Method object. Instead of
>> putting asserts at all the places where an actual modification
>> happens, I think it's better to use only one assert at the Rewriter
>> entry point, and remove the other assets that you added:
>>
>> 516 void Rewriter::rewrite(instanceKlassHandle klass, TRAPS) {
>> + if (!DumpSharedSpaces) {
>> + assert(!MetaspaceShared::is_in_shared_space(klass()), "archive
>> methods must not be rewritten at run time");
>> + }
>> 517 ResourceMark rm(THREAD);
>> 518 Rewriter rw(klass, klass->constants(), klass->methods(),
>> CHECK);
>> 519 // (That's all, folks.)
>> 520 }
>>
>> Also, I am not sure if the PPC directories in the repo have been
>> 'locked' or not, but I guess you will find out when you do the push.
>>
>> I am not a Reviewer, so probably Coleen needs to look at this as well.
>>
>> Thanks
>> - Ioi
>>
>> On 3/20/15, 1:53 PM, Yumin Qi wrote:
>>> Hi, Coleen and all
>>>
>>> New version with suggested changes can be reviewed at:
>>> http://cr.openjdk.java.net/~minqi/8074345/webrev02/
>>>
>>> Removed _fast_invokeinvirtual from last version, disable rewriting
>>> _invokevirtual if UseSharedSpaces turned on. Only on sparc
>>> _invokevirtual got rewritten. Other platforms as unimplemented.
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 3/11/2015 1:23 PM, Yumin Qi wrote:
>>>> Thanks, I will have another webrev after build/test/perf test.
>>>>
>>>> Yumin
>>>>
>>>> On 3/11/2015 1:11 PM, Coleen Phillimore wrote:
>>>>>
>>>>> Hi Yumin, One comment embedded.
>>>>>
>>>>> On 3/11/15, 2:04 PM, Yumin Qi wrote:
>>>>>> Hi, Coleen
>>>>>>
>>>>>> Thanks for the review. See embedded.
>>>>>>
>>>>>> On 3/10/2015 2:54 PM, Coleen Phillimore wrote:
>>>>>>>
>>>>>>> Yumin,
>>>>>>>
>>>>>>> The new bytecode approach came out pretty cleanly, or as cleanly
>>>>>>> as this could be.
>>>>>>>
>>>>>>> The file templateTable_x86_32 and 64 have just been merged, so
>>>>>>> you'll have to make your change in the new version.
>>>>>>>
>>>>>>> I also have some comments:
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/interpreter/rewriter.cpp.udiff.html
>>>>>>>
>>>>>>>
>>>>>>> Typo "rewirting"
>>>>>>>
>>>>>> Sure.
>>>>>>> In these files, can you break up the long lines into three lines?
>>>>>>>
>>>>>>> - if (!is_static) { patch_bytecode(Bytecodes::_fast_fgetfield,
>>>>>>> Rbc, Rscratch); }
>>>>>>> + if (!is_static && rc == MAY_REWRITE) {
>>>>>>> patch_bytecode(Bytecodes::_fast_fgetfield, Rbc, Rscratch); }
>>>>>>>
>>>>>> Sure.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/interpreter/bytecodes.hpp.udiff.html
>>>>>>>
>>>>>>>
>>>>>>> How many bytecodes do we have now? We're limited to 255 (or
>>>>>>> 256) and there are other new bytecodes being added.
>>>>>>>
>>>>>> now total is 234 (after the fix). See below answer.
>>>>>>> What was the performance benefit to this? I think if we wanted
>>>>>>> to be conservative, we'd turn off RewriteFrequentPairs and only
>>>>>>> do nofast_getfield and nofast_putfield. I think they were the
>>>>>>> only bytecodes that actually affected performance.
>>>>>>>
>>>>>>> In this file above, can you remove the last block of comments
>>>>>>> about fast_linearswitch and fast_ldc? I think this confuses
>>>>>>> rewriting in the interpreter and rewriting in the rewriter, or
>>>>>>> rather makes the confusion worse. I don't think this comment is
>>>>>>> helpful.
>>>>>>>
>>>>>>> I'd prefer to see the first comment smaller also, like:
>>>>>>>
>>>>>>> + // These bytecodes are rewritten at CDS dump time, so that
>>>>>>> we can prevent them from being
>>>>>>> + // rewritten at run time. This way, the ConstMethods can be
>>>>>>> placed in the CDS ReadOnly
>>>>>>> + // section, and RewriteByteCodes/RewriteFrequentPairs can
>>>>>>> rewrite non-CDS bytecodes
>>>>>>> + // at run time.
>>>>>>> + _nofast_getfield ,
>>>>>>> + _nofast_putfield ,
>>>>>>> + _nofast_aload_0 ,
>>>>>>> + _nofast_iload ,
>>>>>>> + _nofast_invokevirtual ,
>>>>>>>
>>>>>>> It's sort of obvious which bytecode they rewrite. I don't know
>>>>>>> how much performance fast_invokevfinal is worth. I thought I
>>>>>>> deleted it. Can we not rewrite this so we don't waste another
>>>>>>> bytecode on it? Maybe add a RewriteVFinal option and consider
>>>>>>> removing it for the future? x86 doesn't use it and I can't see
>>>>>>> how this would save any significant performance to be worth having!
>>>>>>>
>>>>>> _invokevirtual got rewritten on sparc and ppc. Now ppc is
>>>>>> removed, no need to take care for it. For sparc, it does patch
>>>>>> code. I am thinking of a way if we need to add _nofast_code as
>>>>>> you indicated, we only have 255 codes to use.
>>>>>
>>>>> PPC isn't removed from the open repository.
>>>>>>
>>>>>> bool not_rewrite = UseSharedSpaces && RewriteBytecodes &&
>>>>>> RewriteFrequentPair;
>>>>>
>>>>> I think the conditional would be
>>>>>
>>>>> bool not_rewrite = UseSharedSpaces || !RewriteBytecodes;
>>>>>>
>>>>>> Can this boolean decide if we not rewrite the bytecode to
>>>>>> fast? If so, I can remove all the _nofast_code and do not patch
>>>>>> code when it is on.
>>>>>
>>>>> Yes, this would be nice to not add the bytecode.
>>>>>
>>>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/interpreter/templateTable.hpp.udiff.html
>>>>>>>
>>>>>>>
>>>>>>> + enum RewriteControl { MAY_REWRITE, MAY_NOT_REWRITE }; //
>>>>>>> control for fast code under CDS
>>>>>>>
>>>>>>>
>>>>>>> I don't know what our coding standard is but in the
>>>>>>> templateTable_<cpu>.cpp files these strings look like macros.
>>>>>>> I'd rather see them as MayRewrite or MayNotRewrite.
>>>>>>>
>>>>>> Agree.
>>>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/cpu/sparc/vm/templateTable_sparc.cpp.udiff.html
>>>>>>>
>>>>>>>
>>>>>>> I think there's a java_code() function that returns the original
>>>>>>> bytecode that you could use instead of the case statement in
>>>>>>> resolve_cache_and_index(). The indentation is odd in the
>>>>>>> webrev. This probably applies to the other cpu directories.
>>>>>>>
>>>>>>> One last question below:
>>>>>>>
>>>>>>> On 3/5/15, 4:21 PM, Yumin Qi wrote:
>>>>>>>> Please review:
>>>>>>>>
>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8074345
>>>>>>>> webrev: http://cr.openjdk.java.net/~minqi/8074345/
>>>>>>>>
>>>>>>>> Summary: Currently CDS when is disabled, RewriteBytecodes and
>>>>>>>> RewriteFrequentPairs are disabled due to ConstantMethod in CDS
>>>>>>>> are mapped read only. So memory fault will be triggered when
>>>>>>>> RewriteBytecodes turned on. This also disable all method
>>>>>>>> rewritten, leads interpreter run slower. Observed about 2%
>>>>>>>> regression with C2 on some benchmarks, since interpreter speed
>>>>>>>> is important to C2. By enable RewriteBytecodes and
>>>>>>>> RewriteFrequentPairs under CDS enabled, adding _nofast_xxxx for
>>>>>>>> corresponding fast codes at dump time to avoid byte code
>>>>>>>> rewritten at run time, we prevent byte code rewritten and
>>>>>>>> modify the read only shared portion in CDS. Meanwhile other
>>>>>>>> byte codes with fast codes still get speed up.
>>>>>>>>
>>>>>>>> Tests: JPRT, jtreg, refworkload (20+ benchmarks) on all
>>>>>>>> supported platforms. Interpreter only tests showed about 3%
>>>>>>>> improvement.
>>>>>>>
>>>>>>> What performance did you measure? Is it -Xint -Xshare:on with
>>>>>>> and without your patch? It was only 3% better?
>>>>>>>
>>>>>>> What was the difference in performance with -Xint
>>>>>>> -XX:-RewriteBytecodes vs. -Xint
>>>>>>> -XX:+RewriteBytecodes/FrequentPairs? I thought this was around 15%.
>>>>>>>
>>>>>> I will send you a separate email of the links which run with
>>>>>> CDS/NoCDS/CDS+Int
>>>>>
>>>>> Thanks for the links. From your experiments, I think your
>>>>> performance improvement with your patch and CDS with -Xmixed is
>>>>> 4%. That's good enough for a couple of bytecodes.
>>>>>
>>>>> Coleen
>>>>>
>>>>>>
>>>>>> Thanks
>>>>>> Yumin
>>>>>>> Thanks,
>>>>>>> Coleen
>>>>>>>
>>>>>>>>
>>>>>>>> Thanks
>>>>>>>> Yumin
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>
More information about the hotspot-runtime-dev
mailing list