RFR: 8074345: Enable RewriteBytecodes when VM runs with CDS

Thomas Stüfe thomas.stuefe at gmail.com
Fri Mar 27 08:58:45 UTC 2015


Hi,

could this break zero? I try to build zero and get the following build
error:

/shared/projects/openjdk/jdk9-hs-rt/sources/hotspot/src/share/vm/interpreter/bytecodeInterpreter.cpp:
In static member function ‘static void
BytecodeInterpreter::run(interpreterState)’:
/shared/projects/openjdk/jdk9-hs-rt/sources/hotspot/src/share/vm/interpreter/bytecodeInterpreter.cpp:581:33:
error: label ‘opc_nofast_getfield’ used but not defined
 /* 0xE8 */ &&opc_invokehandle,&&opc_nofast_getfield,&&opc_nofast_putfield,
&&opc_nofast_aload_0,
                                 ^
/shared/projects/openjdk/jdk9-hs-rt/sources/hotspot/src/share/vm/interpreter/bytecodeInterpreter.cpp:581:55:
error: label ‘opc_nofast_putfield’ used but not defined
 /* 0xE8 */ &&opc_invokehandle,&&opc_nofast_getfield,&&opc_nofast_putfield,
&&opc_nofast_aload_0,
                                                       ^
/shared/projects/openjdk/jdk9-hs-rt/sources/hotspot/src/share/vm/interpreter/bytecodeInterpreter.cpp:581:78:
error: label ‘opc_nofast_aload_0’ used but not defined
 /* 0xE8 */ &&opc_invokehandle,&&opc_nofast_getfield,&&opc_nofast_putfield,
&&opc_nofast_aload_0,

  ^
/shared/projects/openjdk/jdk9-hs-rt/sources/hotspot/src/share/vm/interpreter/bytecodeInterpreter.cpp:582:14:
error: label ‘opc_nofast_iload’ used but not defined
 /* 0xEC */ &&opc_nofast_iload,&&opc_default,        &&opc_default,
&&opc_default,
              ^
make[8]: *** [bytecodeInterpreter.o] Error 1


I build with a very unexciting configure line:

bash ../sources/configure  --with-boot-jdk=/shared/projects/sapjvm_8
--with-jvm-variants=zero

on Ubuntu 14.4.

Reverting back before this change makes the build go through (save for the
printf() format issue in frame_zero.cpp which Colleen fixed in a later
patch).

Am I building wrong?

Kind Regards, Thomas



On Wed, Mar 25, 2015 at 10:48 PM, Yumin Qi <yumin.qi at oracle.com> wrote:

> Thanks for your review!
>
> Yumin
>
>
> On 3/25/2015 2:45 PM, Coleen Phillimore wrote:
>
>>
>> Yumin,
>>
>> This looks great.  You addressed my earlier pre-review comments.
>>
>> There are two other architectures now, that I don't believe we build and
>> test.   I think it's only cpu code that would be changed so the fixes for
>> these architectures won't need a sponsor.
>>
>> Zero doesn't support CDS so I think the minimal changes you've put in
>> bytecodeInterpreter.cpp are good.
>>
>> Thanks!
>> Coleen
>>
>>
>> On 3/25/15, 5:24 PM, Yumin Qi wrote:
>>
>>> Hi,  Coleen
>>>
>>>   New version based on Ioi's suggestion is located at:
>>> http://cr.openjdk.java.net/~minqi/8074345/webrev03/
>>>
>>>   Test: JPRT. Manual test on -Xshare:[dump | on ]
>>>
>>> Thanks
>>> Yumin
>>>
>>> On 3/25/2015 9:58 AM, Coleen Phillimore wrote:
>>>
>>>>
>>>> Yes, this was on my to-do list, sorry I haven't gotten to it yet.
>>>> Coleen
>>>>
>>>> On 3/25/15, 12:38 PM, Ioi Lam wrote:
>>>>
>>>>> Hi Yumin,
>>>>>
>>>>> The changes look good. Just a few nits:
>>>>>
>>>>> *src/share/vm/interpreter/bytecodes.hpp:**
>>>>> *
>>>>>  293     // Rewritten at CDS dump time to | Original bytecode
>>>>>  294     // _invoke_virtual rewritten on sparc, will be disabled if
>>>>> UseSharedSpaces turned on.
>>>>>  295     // ------------------------------+------------------
>>>>>  296     _nofast_getfield      , //  <- _getfield
>>>>>  297     _nofast_putfield      , //  <- _putfield
>>>>>  298     _nofast_aload_0       , //  <- _aload_0
>>>>>  299     _nofast_iload         , //  <- _iload
>>>>>
>>>>> I think it should be reformatted to line up the columns:
>>>>>
>>>>>  293     // Rewritten at CDS dump time to | Original bytecode
>>>>> 295     // ------------------------------+------------------
>>>>>  296     _nofast_getfield      ,         //  <- _getfield
>>>>>  297     _nofast_putfield      ,         //  <- _putfield
>>>>>  298     _nofast_aload_0       ,         //  <- _aload_0
>>>>>  299     _nofast_iload         ,         //  <- _iload
>>>>> 230     // NOTE: _invoke_virtual is rewritten only on sparc. This will
>>>>> be disabled if
>>>>>          // UseSharedSpaces turned on.
>>>>>
>>>>> *src/share/vm/interpreter/rewriter.cpp:*
>>>>>
>>>>> There are many places that modify the Method object. Instead of
>>>>> putting asserts at all the places where an actual modification happens, I
>>>>> think it's better to use only one assert at the Rewriter entry point, and
>>>>> remove the other assets that you added:
>>>>>
>>>>>  516 void Rewriter::rewrite(instanceKlassHandle klass, TRAPS) {
>>>>> +      if (!DumpSharedSpaces) {
>>>>> + assert(!MetaspaceShared::is_in_shared_space(klass()), "archive
>>>>> methods must not be rewritten at run time");
>>>>> +      }
>>>>> 517   ResourceMark rm(THREAD);
>>>>>  518   Rewriter     rw(klass, klass->constants(), klass->methods(),
>>>>> CHECK);
>>>>>  519   // (That's all, folks.)
>>>>>  520 }
>>>>>
>>>>> Also, I am not sure if the PPC directories in the repo have been
>>>>> 'locked' or not, but I guess you will find out when you do the push.
>>>>>
>>>>> I am not a Reviewer, so probably Coleen needs to look at this as well.
>>>>>
>>>>> Thanks
>>>>> - Ioi
>>>>>
>>>>> On 3/20/15, 1:53 PM, Yumin Qi wrote:
>>>>>
>>>>>> Hi, Coleen and all
>>>>>>
>>>>>>   New version with suggested changes can be reviewed at:
>>>>>> http://cr.openjdk.java.net/~minqi/8074345/webrev02/
>>>>>>
>>>>>>   Removed _fast_invokeinvirtual from last version, disable rewriting
>>>>>> _invokevirtual if UseSharedSpaces turned on. Only on sparc _invokevirtual
>>>>>> got rewritten.  Other platforms as unimplemented.
>>>>>>
>>>>>>   Thanks
>>>>>>   Yumin
>>>>>>
>>>>>> On 3/11/2015 1:23 PM, Yumin Qi wrote:
>>>>>>
>>>>>>> Thanks, I will have another webrev after build/test/perf test.
>>>>>>>
>>>>>>> Yumin
>>>>>>>
>>>>>>> On 3/11/2015 1:11 PM, Coleen Phillimore wrote:
>>>>>>>
>>>>>>>>
>>>>>>>> Hi Yumin,  One comment embedded.
>>>>>>>>
>>>>>>>> On 3/11/15, 2:04 PM, Yumin Qi wrote:
>>>>>>>>
>>>>>>>>> Hi, Coleen
>>>>>>>>>
>>>>>>>>>   Thanks for the review. See embedded.
>>>>>>>>>
>>>>>>>>> On 3/10/2015 2:54 PM, Coleen Phillimore wrote:
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Yumin,
>>>>>>>>>>
>>>>>>>>>> The new bytecode approach came out pretty cleanly, or as cleanly
>>>>>>>>>> as this could be.
>>>>>>>>>>
>>>>>>>>>> The file templateTable_x86_32 and 64 have just been merged, so
>>>>>>>>>> you'll have to make your change in the new version.
>>>>>>>>>>
>>>>>>>>>> I also have some comments:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/
>>>>>>>>>> interpreter/rewriter.cpp.udiff.html
>>>>>>>>>>
>>>>>>>>>> Typo "rewirting"
>>>>>>>>>>
>>>>>>>>>>  Sure.
>>>>>>>>>
>>>>>>>>>> In these files, can you break up the long lines into three lines?
>>>>>>>>>>
>>>>>>>>>> -  if (!is_static) { patch_bytecode(Bytecodes::_fast_fgetfield,
>>>>>>>>>> Rbc, Rscratch); }
>>>>>>>>>> +  if (!is_static && rc == MAY_REWRITE) {
>>>>>>>>>> patch_bytecode(Bytecodes::_fast_fgetfield, Rbc, Rscratch); }
>>>>>>>>>>
>>>>>>>>>>  Sure.
>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/
>>>>>>>>>> interpreter/bytecodes.hpp.udiff.html
>>>>>>>>>>
>>>>>>>>>> How many bytecodes do we have now?  We're limited to 255 (or 256)
>>>>>>>>>> and there are other new bytecodes being added.
>>>>>>>>>>
>>>>>>>>>>  now total is 234 (after the fix). See below answer.
>>>>>>>>>
>>>>>>>>>> What was the performance benefit to this?   I think if we wanted
>>>>>>>>>> to be conservative, we'd turn off RewriteFrequentPairs and only do
>>>>>>>>>> nofast_getfield and nofast_putfield. I think they were the only bytecodes
>>>>>>>>>> that actually affected performance.
>>>>>>>>>>
>>>>>>>>>> In this file above, can you remove the last block of comments
>>>>>>>>>> about fast_linearswitch and fast_ldc? I think this confuses rewriting in
>>>>>>>>>> the interpreter and rewriting in the rewriter, or rather makes the
>>>>>>>>>> confusion worse.  I don't think this comment is helpful.
>>>>>>>>>>
>>>>>>>>>> I'd prefer to see the first comment smaller also, like:
>>>>>>>>>>
>>>>>>>>>> +    // These bytecodes are rewritten at CDS dump time, so that
>>>>>>>>>> we can prevent them from being
>>>>>>>>>> +    // rewritten at run time. This way, the ConstMethods can be
>>>>>>>>>> placed in the CDS ReadOnly
>>>>>>>>>> +    // section, and RewriteByteCodes/RewriteFrequentPairs can
>>>>>>>>>> rewrite non-CDS bytecodes
>>>>>>>>>> +    // at run time.
>>>>>>>>>> +    _nofast_getfield      ,
>>>>>>>>>> +    _nofast_putfield      ,
>>>>>>>>>> +    _nofast_aload_0       ,
>>>>>>>>>> +    _nofast_iload         ,
>>>>>>>>>> +    _nofast_invokevirtual ,
>>>>>>>>>>
>>>>>>>>>> It's sort of obvious which bytecode they rewrite. I don't know
>>>>>>>>>> how much performance fast_invokevfinal is worth.  I thought I deleted it.
>>>>>>>>>> Can we not rewrite this so we don't waste another bytecode on it?  Maybe
>>>>>>>>>> add a RewriteVFinal option and consider removing it for the future? x86
>>>>>>>>>> doesn't use it and I can't see how this would save any significant
>>>>>>>>>> performance to be worth having!
>>>>>>>>>>
>>>>>>>>>>  _invokevirtual got rewritten on sparc and ppc. Now ppc is
>>>>>>>>> removed, no need to take care for it. For sparc, it does patch code. I am
>>>>>>>>> thinking of a way if we need to add _nofast_code as you indicated, we only
>>>>>>>>> have 255 codes to use.
>>>>>>>>>
>>>>>>>>
>>>>>>>> PPC isn't removed from the open repository.
>>>>>>>>
>>>>>>>>>
>>>>>>>>>    bool not_rewrite = UseSharedSpaces && RewriteBytecodes &&
>>>>>>>>> RewriteFrequentPair;
>>>>>>>>>
>>>>>>>>
>>>>>>>> I think the conditional would be
>>>>>>>>
>>>>>>>>    bool not_rewrite = UseSharedSpaces || !RewriteBytecodes;
>>>>>>>>
>>>>>>>>>
>>>>>>>>>    Can this boolean decide if we not rewrite the bytecode to fast?
>>>>>>>>> If so, I can remove all the _nofast_code and do not patch code when it is
>>>>>>>>> on.
>>>>>>>>>
>>>>>>>>
>>>>>>>> Yes, this would be nice to not add the bytecode.
>>>>>>>>
>>>>>>>>  http://cr.openjdk.java.net/~minqi/8074345/src/share/vm/
>>>>>>>>>> interpreter/templateTable.hpp.udiff.html
>>>>>>>>>>
>>>>>>>>>> +  enum RewriteControl { MAY_REWRITE, MAY_NOT_REWRITE };  //
>>>>>>>>>> control for fast code under CDS
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> I don't know what our coding standard is but in the
>>>>>>>>>> templateTable_<cpu>.cpp files these strings look like macros. I'd rather
>>>>>>>>>> see them as MayRewrite or MayNotRewrite.
>>>>>>>>>>
>>>>>>>>>>  Agree.
>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~minqi/8074345/src/cpu/sparc/
>>>>>>>>>> vm/templateTable_sparc.cpp.udiff.html
>>>>>>>>>>
>>>>>>>>>> I think there's a java_code() function that returns the original
>>>>>>>>>> bytecode that you could use instead of the case statement in
>>>>>>>>>> resolve_cache_and_index().   The indentation is odd in the webrev. This
>>>>>>>>>> probably applies to the other cpu directories.
>>>>>>>>>>
>>>>>>>>>> One last question below:
>>>>>>>>>>
>>>>>>>>>> On 3/5/15, 4:21 PM, Yumin Qi wrote:
>>>>>>>>>>
>>>>>>>>>>> Please review:
>>>>>>>>>>>
>>>>>>>>>>> bug: https://bugs.openjdk.java.net/browse/JDK-8074345
>>>>>>>>>>> webrev: http://cr.openjdk.java.net/~minqi/8074345/
>>>>>>>>>>>
>>>>>>>>>>> Summary: Currently CDS when is disabled, RewriteBytecodes and
>>>>>>>>>>> RewriteFrequentPairs are disabled due to ConstantMethod in CDS are mapped
>>>>>>>>>>> read only. So memory fault will be triggered when RewriteBytecodes turned
>>>>>>>>>>> on. This also disable all method rewritten, leads interpreter run slower.
>>>>>>>>>>> Observed about 2% regression with C2 on some benchmarks, since interpreter
>>>>>>>>>>> speed is important to C2. By enable RewriteBytecodes and
>>>>>>>>>>> RewriteFrequentPairs under CDS enabled, adding _nofast_xxxx for
>>>>>>>>>>> corresponding fast codes at dump time to avoid byte code rewritten at run
>>>>>>>>>>> time, we prevent byte code rewritten and modify the read only shared
>>>>>>>>>>> portion in CDS. Meanwhile other byte codes with fast codes still get speed
>>>>>>>>>>> up.
>>>>>>>>>>>
>>>>>>>>>>> Tests: JPRT, jtreg, refworkload (20+ benchmarks) on all
>>>>>>>>>>> supported platforms.  Interpreter only tests showed about 3% improvement.
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> What performance did you measure?  Is it -Xint -Xshare:on with
>>>>>>>>>> and without your patch?  It was only 3% better?
>>>>>>>>>>
>>>>>>>>>> What was the difference in performance with -Xint
>>>>>>>>>> -XX:-RewriteBytecodes vs. -Xint -XX:+RewriteBytecodes/FrequentPairs?
>>>>>>>>>> I thought this was around 15%.
>>>>>>>>>>
>>>>>>>>>>  I will send you a separate email of the links which run with
>>>>>>>>> CDS/NoCDS/CDS+Int
>>>>>>>>>
>>>>>>>>
>>>>>>>> Thanks for the links.  From your experiments, I think your
>>>>>>>> performance improvement with your patch and CDS with -Xmixed is 4%. That's
>>>>>>>> good enough for a couple of bytecodes.
>>>>>>>>
>>>>>>>> Coleen
>>>>>>>>
>>>>>>>>
>>>>>>>>> Thanks
>>>>>>>>> Yumin
>>>>>>>>>
>>>>>>>>>> Thanks,
>>>>>>>>>> Coleen
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>>> Thanks
>>>>>>>>>>> Yumin
>>>>>>>>>>>
>>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>
>>>>>>>>
>>>>>>>
>>>>>>
>>>>>
>>>>
>>>
>>
>


More information about the hotspot-runtime-dev mailing list