Review Request: Add fast bytecode support to C++ interpreter
Tom Rodriguez
tom.rodriguez at oracle.com
Wed Mar 31 11:01:44 PDT 2010
I think that makes sense. So this isn't really adding support for fast bytecodes; it provides a fallback path for a system with two different interpreters, one which supports fast bytecodes and one which doesn't. I think it would be best to keep this under #ifdef ZERO. What about this:
diff -r c047da02984c src/share/vm/interpreter/bytecodeInterpreter.cpp
--- a/src/share/vm/interpreter/bytecodeInterpreter.cpp Wed Mar 17 16:40:25 2010 -0700
+++ b/src/share/vm/interpreter/bytecodeInterpreter.cpp Wed Mar 31 11:01:02 2010 -0700
@@ -2328,6 +2328,17 @@ run:
       }
       DEFAULT:
+#ifdef ZERO
+          // Some zero configurations use the C++ interpreter as a
+          // fallback interpreter and have support for platform
+          // specific fast bytecodes which aren't supported here, so
+          // redispatch to the equivalent non-fast bytecode when they
+          // are encountered.
+          if (Bytecodes::is_defined((Bytecodes::Code)opcode)) {
+            opcode = (jubyte)Bytecodes::java_code((Bytecodes::Code)opcode);
+            goto opcode_switch;
+          }
+#endif
           fatal2("\t*** Unimplemented opcode: %d = %s\n",
                  opcode, Bytecodes::name((Bytecodes::Code)opcode));
           goto finish;
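
To show the redispatch idea outside of HotSpot, here is a minimal standalone sketch. Everything in it is hypothetical: the opcode values, the toy stack machine, and the is_defined()/java_code() helpers only stand in for the Bytecodes:: functions used in the patch. The sketch also adds a "maps to a different opcode" guard that is not in the patch, so that an opcode whose standard form is itself cannot redispatch forever.

```cpp
#include <cassert>
#include <cstddef>
#include <cstdint>
#include <stdexcept>
#include <vector>

// Hypothetical opcodes for illustration; not HotSpot's numbering.
enum Op : uint8_t {
  OP_PUSH1      = 0x01,  // push the constant 1
  OP_ADD        = 0x02,  // pop two values, push their sum
  OP_HALT       = 0x03,  // return the top of the stack
  OP_FAST_PUSH1 = 0xcc   // rewritten variant with no handler below
};

// Stand-in for Bytecodes::is_defined(): is this value a known opcode?
static bool is_defined(uint8_t op) {
  return op == OP_PUSH1 || op == OP_ADD || op == OP_HALT ||
         op == OP_FAST_PUSH1;
}

// Stand-in for Bytecodes::java_code(): the standard bytecode that a
// rewritten one was derived from (standard bytecodes map to themselves).
static uint8_t java_code(uint8_t op) {
  return op == OP_FAST_PUSH1 ? static_cast<uint8_t>(OP_PUSH1) : op;
}

int run(const std::vector<uint8_t>& code) {
  std::vector<int> stack;
  std::size_t pc = 0;
  while (pc < code.size()) {
    uint8_t opcode = code[pc++];
  opcode_switch:
    switch (opcode) {
      case OP_PUSH1:
        stack.push_back(1);
        break;
      case OP_ADD: {
        assert(stack.size() >= 2);
        int rhs = stack.back(); stack.pop_back();
        int lhs = stack.back(); stack.pop_back();
        stack.push_back(lhs + rhs);
        break;
      }
      case OP_HALT:
        assert(!stack.empty());
        return stack.back();
      default:
        // Fallback: redispatch a known rewritten opcode as its standard
        // form. The "!= opcode" guard is an addition in this sketch.
        if (is_defined(opcode) && java_code(opcode) != opcode) {
          opcode = java_code(opcode);
          goto opcode_switch;
        }
        throw std::runtime_error("unimplemented opcode");
    }
  }
  throw std::runtime_error("fell off the end of the bytecode");
}
```

Running the sequence OP_PUSH1, OP_FAST_PUSH1, OP_ADD, OP_HALT through run() goes through the default case once and then behaves exactly as if the second opcode had been OP_PUSH1. The extra trip through the switch is the cost of the fallback, which is why the rewritten form ends up slower than the standard one on this path.
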
tom
On Mar 31, 2010, at 3:18 AM, Edward Nevill wrote:
> Hi Tom,
>
> The rewriting is done in
> icedtea6/ports/hotspot/src/cpu/zero/vm/cppInterpreter_arm.S
>
> The following fast/rewritten bytecodes are supported
>
> #define opc_bgetfield 0xcc
> #define opc_cgetfield 0xcd
> #define opc_igetfield 0xd0
> #define opc_lgetfield 0xd1
> #define opc_sgetfield 0xd2
> #define opc_aputfield 0xd3
> #define opc_bputfield 0xd4
> #define opc_cputfield 0xd5
> #define opc_iputfield 0xd8
> #define opc_lputfield 0xd9
> #define opc_iaccess_0 0xdb
> #define opc_iaccess_1 0xdc
> #define opc_iaccess_2 0xdd
> #define opc_iaccess_3 0xde
> #define opc_invokeresolved 0xdf
> #define opc_invokespecialresolved 0xe0
> #define opc_invokestaticresolved 0xe1
> #define opc_invokevfinal 0xe2
> #define opc_iload_iload 0xe3
> #define opc_iload_iload_N 0xe4
> #define opc_dmac 0xe6
> #define opc_iload_0_iconst_N 0xe7
> #define opc_iload_1_iconst_N 0xe8
> #define opc_iload_2_iconst_N 0xe9
> #define opc_iload_3_iconst_N 0xea
> #define opc_iload_iconst_N 0xeb
> #define opc_iadd_istore_N 0xec
> #define opc_isub_istore_N 0xed
> #define opc_iand_istore_N 0xee
> #define opc_ior_istore_N 0xef
> #define opc_ixor_istore_N 0xf0
> #define opc_iadd_u4store 0xf1
> #define opc_isub_u4store 0xf2
> #define opc_iand_u4store 0xf3
> #define opc_ior_u4store 0xf4
> #define opc_ixor_u4store 0xf5
> #define opc_iload_0_iload 0xf6
> #define opc_iload_1_iload 0xf7
> #define opc_iload_2_iload 0xf8
> #define opc_iload_3_iload 0xf9
> #define opc_iload_0_iload_N 0xfa
> #define opc_iload_1_iload_N 0xfb
> #define opc_iload_2_iload_N 0xfc
> #define opc_iload_3_iload_N 0xfd
>
> Under normal execution these bytecodes will be handled by
> cppInterpreter_arm.S.
>
> However, there are a number of cases where it backs out to the C++
> interpreter.
>
> - If JvmtiExport::can_post_interpreter_events is true
>
> - If the ARM interpreter is built in conjunction with the Shark JIT and
> the Shark JIT marks a method as non entrant.
>
> I agree with you that it would be possible / desirable for the C++
> interpreter to support these bytecodes directly, however...
>
> - Performance is not a particular issue in either of the above cases, so
> the fact that the rewritten bytecodes are now slower than the
> non-rewritten variants does not matter.
>
> - There are a large number of bytecodes which are rewritten. This would
> involve a large change to bytecodeInterpreter.cpp which would have to be
> debugged, supported etc.
>
> - The rewritten bytecodes are architecture dependent. Well, at least,
> some of them are, and some of them are not.
>
> The file hotspot/src/share/vm/interpreter/bytecodes.cpp contains the
> definitions of a number of the fast bytecodes (see
> Bytecodes::initialize()) where a set of fast bytecodes is defined
> between _fast_aputfield and _fast_binary_switch.
>
> Bytecodes::initialize() then calls
>
> // platform specific JVM bytecodes
> pd_initialize();
>
> to allow the platform to define its own specific bytecodes. In the case
> of Zero / ARM this is done in
>
> icedtea6/ports/hotspot/src/cpu/zero/vm/bytecodes_zero.cpp
>
> where the rest of the fast bytecodes used by the ARM interpreter are
> defined.
>
> Given that the fast bytecodes can be defined in a platform-dependent
> area of code like this, it would seem wrong to add support for them to
> bytecodeInterpreter.cpp, which is in shared code, because the set of fast
> bytecodes may vary between platforms.
>
> What might be possible is for bytecodeInterpreter.cpp to #include, say,
> "bytecodeInterpreter_pd.cpp". This would then allow rewritten bytecodes
> to be added in a platform-specific fashion.
>
> However, the fix I proposed is that in the default case, before throwing
> a fatal error, the bytecode interpreter should perform a quick check to
> see if the bytecode is defined and, if so, replace it with the original
> (non-rewritten) bytecode and re-execute it.
>
> This would seem to be a fairly safe change to bytecodeInterpreter.cpp
> since it only affects the case where a fatal error was to be thrown in
> any case.
>
> Regards,
> Ed.
>
>
>> I don't understand this. Zero sets both RewriteFrequentPairs and
>> RewriteBytecodes to true but it doesn't appear to have the logic to
>> convert normal bytecodes into their fast variants so how does it run
>> afoul of the fast bytecodes? Is there some code somewhere that I'm not
>> seeing that injects the _fast variants? Strictly speaking
>> RewriteFrequentPairs and RewriteBytecodes should be false if the C++
>> interpreter is being used but since it's all the responsibility of the
>> interpreter itself having wrong settings for those flags doesn't really
>> matter.
>>
>> If you really wanted to support RewriteFrequentPairs and
>> RewriteBytecodes then you'd need to add the rewriting logic to inject
>> the _fast variants and then add cases to handle them. The template
>> interpreter has special implementations for each of the fast
>> variants, including the pairs, so it doesn't convert them back to the
>> normal version and redispatch. Redispatching will make the _fast
>> variants slower than the normal variant, which doesn't seem like a
>> very good optimization.
>>
>> tom
>>
>> On Mar 30, 2010, at 3:07 AM, Gary Benson wrote:
>>
>>> Hi all,
>>>
>>> HotSpot has the capability to rewrite the bytecode stream, for example
>>> to combine common instruction pairs, but the C++ interpreter has no
>>> support for this. This webrev adds support for backing out over
>>> rewritten bytecodes to the C++ interpreter, in much the same way as I
>>> believe the template interpreter does.
>
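
For reference, the two-stage registration Ed describes in the thread (a shared Bytecodes::initialize() that finishes by calling a platform pd_initialize()) can be sketched roughly as below. This is not the HotSpot code: the table layout and the def() helper are made up for illustration. The opcode values 0xd0 and 0xe3 are taken from Ed's list, but the standard bytecodes they are mapped back to here (getfield and iload) are assumptions.

```cpp
#include <array>
#include <cstdint>

namespace pd_sketch {

// One table entry per opcode value; a null name means "not defined".
struct Entry {
  const char* name;
  uint8_t java_code;  // standard bytecode this opcode falls back to
};

static std::array<Entry, 256> table{};  // zero-initialized

// Hypothetical helper: register an opcode and its fallback.
static void def(uint8_t code, const char* name, uint8_t java_code) {
  table[code] = Entry{name, java_code};
}

// Platform-specific part: each port registers its own rewritten
// bytecodes. The fallback targets below are assumptions.
static void pd_initialize() {
  def(0xd0, "igetfield",   0xb4);  // assumed to fall back to getfield
  def(0xe3, "iload_iload", 0x15);  // assumed to fall back to iload
}

// Shared part: standard bytecodes map to themselves, then the
// platform hook adds whatever it needs.
static void initialize() {
  def(0x15, "iload",    0x15);
  def(0xb4, "getfield", 0xb4);
  pd_initialize();
}

static bool is_defined(uint8_t code) { return table[code].name != nullptr; }
static uint8_t java_code(uint8_t code) { return table[code].java_code; }

}  // namespace pd_sketch
```

With a table like this, shared interpreter code can ask is_defined() and java_code() about any opcode without knowing which platform registered it, which is all the fallback in the default case needs.
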