x86 interpreters in hotspot
Xin Tong
xerox.time.tech at gmail.com
Sat May 12 20:55:32 PDT 2012
I am hacking on the interpreter dispatch code and found something I do
not understand.
In the following function:
void InterpreterMacroAssembler::dispatch_base(TosState state,
                                              address* table,
                                              bool verifyoop) {
  ...
  ...
  // load the table address
  lea(rscratch1, ExternalAddress((address)table));
  // jmp based on the table address and bytecode (loaded into rbx)
  jmp(Address(rscratch1, rbx, Address::times_8));
}
However, when I take a profile and look at the generated interpreter code,
I do not see the lea being generated. Instead, r10 is used directly. It
seems that HotSpot optimizes the generated interpreter sequences (maybe
something like peephole optimization).
Address             Offset      Bytes                   Disassembly                    % br_misp_exec
0x00007f2d146fe6ab  0x0000040b  0x488d24dc              LEA RSP,QWORD PTR [RSP+RBX*8]
0x00007f2d146fe6af  0x0000040f  0x410fb65d00            MOVZX RBX,BYTE PTR [R13]
0x00007f2d146fe6b4  0x00000414  0x49ba0052371d2d7f0000  MOV RDX,7F2D1D375200H   // no lea ?
0x00007f2d146fe6be  0x0000041e  0x41ff24da              * JMP DWORD PTR [R10+RBX*8] *
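For reference, the final jmp(Address(rscratch1, rbx, Address::times_8)) is an indexed jump: table base plus bytecode value times the size of one 8-byte entry. A minimal C++ sketch of the same idea follows. The handler names (op_nop, op_inc, op_neg) and the opcode values are made up for illustration, and this is not HotSpot code.

```cpp
#include <cassert>
#include <cstdint>

// Hypothetical stand-ins for interpreter templates; not HotSpot code.
using Handler = int (*)(int);

static int op_nop(int tos) { return tos; }
static int op_inc(int tos) { return tos + 1; }
static int op_neg(int tos) { return -tos; }

// One pointer-sized entry per possible bytecode value (256 slots).
static Handler dispatch_table[256] = {};

// Fill every slot so that any opcode byte has a valid target.
static void init_table() {
  for (auto& h : dispatch_table) h = op_nop;
  dispatch_table[0x01] = op_inc;  // opcode values chosen arbitrarily
  dispatch_table[0x02] = op_neg;
}

// Rough equivalent of: movzx rbx, byte [bcp]; jmp [table + rbx*8]
static int dispatch(const uint8_t* bcp, int tos) {
  uint8_t bytecode = *bcp;                    // zero-extended opcode byte
  Handler target = dispatch_table[bytecode];  // base + index * sizeof(pointer)
  assert(target != nullptr);                  // table must be initialized first
  return target(tos);
}
```

Calling init_table() once and then dispatch(bcp, tos) for each opcode byte selects the handler purely by array indexing, which is what the scaled addressing mode does in the generated code.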
I am also trying to record the current bytecode index into a memory buffer
that I allocated myself. However, the following code gives me a
java.lang.NullPointerException when running one of the test cases.
allocated_channel is malloc-allocated memory.
void InterpreterMacroAssembler::dispatch_base(TosState state,
                                              address* table,
                                              bool verifyoop) {
  ...
  ...
  // the current bytecode pc is kept in r13.
  lea(rscratch2, ExternalAddress((address)allocated_channel));
  movptr(Address(rscratch2, 0), r13);
  lea(rscratch1, ExternalAddress((address)table));
  jmp(Address(rscratch1, rbx, Address::times_8));
}
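To make the intent of the two added instructions concrete, here is a plain C++ sketch of the store they are meant to perform: write the current bytecode pointer into slot 0 of the malloc'ed channel. The helper names (open_channel, record_bcp, close_channel) are hypothetical, and the address typedef is redeclared locally so the sketch stands alone. It only models the store and does not explain the exception.

```cpp
#include <cstdlib>

// Local stand-in for a byte-pointer "address" type; an assumption made
// so that this sketch compiles outside the HotSpot tree.
using address = unsigned char*;

// The channel holds a single pointer-sized slot, allocated with malloc.
static address* allocated_channel = nullptr;

// Allocate the one-slot channel; returns false if malloc fails.
static bool open_channel() {
  allocated_channel = static_cast<address*>(std::malloc(sizeof(address)));
  return allocated_channel != nullptr;
}

// Models movptr(Address(rscratch2, 0), r13): store bcp at offset 0.
// Each call overwrites the previous value; no history is kept.
static void record_bcp(address bcp) {
  *allocated_channel = bcp;
}

// Release the channel and reset the pointer.
static void close_channel() {
  std::free(allocated_channel);
  allocated_channel = nullptr;
}
```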
Thanks
Xin
On Tue, May 8, 2012 at 8:53 AM, Coleen Phillimore
<coleen.phillimore at oracle.com> wrote:
>
> There's a PrintBytecodeHistogram which will tell you how many times each
> bytecode is called.
>
> There's PrintInterpreter which will tell you the size of the template for
> each bytecode.
>
> There's only one tos (top of stack) element, we print two tos elements
> because if the tos is a double or long, it takes two slots.
>
> I'm not sure exactly what you want to do with this information, but
> hopefully this helps.
>
> Coleen
>
>
> On 5/8/2012 2:34 AM, Krystal Mok wrote:
>
> On Tue, May 8, 2012 at 11:17 AM, Xin Tong <xerox.time.tech at gmail.com> wrote:
>>
>> For example, for bipush interpreter code, it is like this in x86_64
>>
>> void TemplateTable::bipush() {
>> transition(vtos, itos);
>> __ load_signed_byte(rax, at_bcp(1));
>> }
>>
>>
>> I would like to know the size of the assembly generated by
>> TemplateTable::bipush, given a bcp.
>
>
> Not sure why you would want that, but here's what you could do:
>
> // Bytecodes::Code code = (Bytecodes::Code) i;
> address ep = Interpreter::dispatch_table()[i]; // or normal_table()
> InterpreterCodelet* codelet = Interpreter::codelet_containing(ep);
> int size = codelet->size(); // or code_size() or code_size_to_size()
>
> Code varies in detail depending on what you really want.
>
>>
>>
>> Also, btw, I traced through trace_bytecode. It calls overloaded
>> traces: one with 1 tos and one with 2 toses. Does that mean Java opcodes
>> can take up to 2 tos elements?
>>
>>
> Short answer: no.
> tos and tos2 are there because on some architectures (e.g. 32-bit x86) the
> top-of-stack value may be stored in two registers (e.g. LTOS on x86_32
> stores the long value in eax:edx).
> On 64-bit architectures, tos2 tends to do nothing, since tos is 64-bit,
> wide enough to hold any TOS value.
>
> - Kris
>
>>
>>
>> Xin
>
>
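Kris's point in the quoted reply about tos and tos2 can be illustrated with a small C++ sketch of a 64-bit long carried in two 32-bit halves, as on 32-bit x86. The TosPair struct and both helper names are hypothetical. The assignment of the low half to eax and the high half to edx follows the usual x86 convention and is an assumption here, not something stated in the thread.

```cpp
#include <cstdint>

// Hypothetical model of a long TOS value on a 32-bit machine:
// lo stands in for eax, hi for edx (assumed register assignment).
struct TosPair {
  uint32_t lo;
  uint32_t hi;
};

// Split a 64-bit value into its two 32-bit register-sized halves.
static TosPair split_long(int64_t v) {
  uint64_t bits = static_cast<uint64_t>(v);
  return { static_cast<uint32_t>(bits), static_cast<uint32_t>(bits >> 32) };
}

// Recombine the halves into the original 64-bit value.
static int64_t join_long(TosPair p) {
  uint64_t bits = (static_cast<uint64_t>(p.hi) << 32) | p.lo;
  return static_cast<int64_t>(bits);
}
```

On a 64-bit machine the whole value fits in one register, so the second half has nothing left to carry.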