Add instrumentation in the TemplateInterpreter

Khanh Nguyen ktruong.nguyen at gmail.com
Thu Feb 18 19:30:53 UTC 2016


The main reason is the performance difference between the
TemplateInterpreter and the BytecodeInterpreter in Zero.
I have not verified the difference myself, but according to this mailing
list it is 10x.

And since we are talking about Zero: do you by any chance know how large the
performance difference is between ZeroShark and standard HotSpot?

Thanks
On Feb 18, 2016 11:14 AM, "Christian Thalinger" <
christian.thalinger at oracle.com> wrote:

> Can you share the reason?
>
> On Feb 18, 2016, at 8:01 AM, Khanh Nguyen <ktruong.nguyen at gmail.com>
> wrote:
>
> Unfortunately it has to be the Template Interpreter.
> On Feb 18, 2016 9:27 AM, "Christian Thalinger" <
> christian.thalinger at oracle.com> wrote:
>
>> Does it have to be the template interpreter or could you do your work
>> with Zero as well?
>>
>> > On Feb 9, 2016, at 11:19 PM, Khanh Nguyen <ktruong.nguyen at gmail.com> wrote:
>> >
>> > Hello,
>> >
>> > I want to add instrumentation to monitor all reads and writes in the
>> > TemplateInterpreter. I think I found the correct place for it in
>> > /cpu/x86/vm/templateTable_x86_64.cpp. Can someone please tell me if I'm
>> > doing it right?
>> >
>> > For writes:
>> > static void do_oop_store(InterpreterMacroAssembler* _masm,
>> >                         Address obj,
>> >                         Register val,
>> >                         BarrierSet::Name barrier,
>> >                         bool precise) {
>> > [...]
>> > case BarrierSet::CardTableModRef:
>> >  case BarrierSet::CardTableExtension:
>> >      {
>> >        if (val == noreg) {
>> >          __ store_heap_oop_null(obj);
>> >        } else {
>> >          __ store_heap_oop(obj, val);
>> >
>> > /*mycodeA*/  __ movptr(c_rarg1, obj.base()); // save this value, otherwise it will be changed?
>> >
>> >          // flatten object address if needed
>> >          if (!precise || (obj.index() == noreg && obj.disp() == 0)) {
>> >            __ store_check(obj.base());
>> > /*mycodeB*/ __ call_VM(noreg, // void
>> >                       CAST_FROM_FN_PTR(address, InterpreterRuntime::write_helper),
>> >                       c_rarg1,  // obj
>> >                       c_rarg1,  // field address, because store check is called on field address
>> >                       val);
>> >          } else {
>> >            __ leaq(rdx, obj);
>> >            __ store_check(rdx);
>> > /*mycodeC*/ __ call_VM(noreg, // void
>> >                         CAST_FROM_FN_PTR(address, InterpreterRuntime::write_helper),
>> >                         c_rarg1,  // obj
>> >                         rdx,      // field address, because store check is called on field address
>> >                         val);
>> >          }
>> >        }
>> >      }
>> >      break;
>> >
>> > For reads:
>> > case Bytecodes::_fast_agetfield:
>> >    __ load_heap_oop(rax, field);
>> >
>> > /*mycodeD*/     __ call_VM(noreg,
>> >               CAST_FROM_FN_PTR(address,
>> >                                InterpreterRuntime::read_barrier_helper),
>> >               rax);
>> >
>> > __ verify_oop(rax);
>> >    break;
>> >
>> > My questions are:
>> >
>> > 1) I thought this represents a putfield a.f = b, where a.f is represented
>> > by the parameter obj of type Address and b is obviously the parameter val
>> > of type Register. In particular, obj has the fields base, index and disp.
>> > But when I run this, it looks like obj is actually the field address (the
>> > mycodeB case). I haven't found a test case that triggers the mycodeC case,
>> > so I could not observe its behavior (i.e., whether rdx gets destroyed and
>> > I get a random value back, or c_rarg1 is the obj address and rdx is the
>> > field address).
>> >
>> > 2) Before this, I tried to insert the same __ call_VM in fast_aputfield
>> > before do_oop_store, but it results in a JVM crash. I don't understand
>> > why. All I do in the call is print the parameters. I did see the values
>> > printed (only the first time it enters the method), but then the VM
>> > crashed. I thought __ call_VM would preserve all registers' values and
>> > restore them properly on return. My instrumentation has no side effects;
>> > I just observe and record the values (actually just printing them to
>> > test).
>> >
>> > 3) Is the line /*mycodeA*/ strictly required? I tried, in the mycodeB
>> > line, to pass obj.base() twice, and I got build errors for "smashed
>> > args".
>> >
>> > I greatly appreciate your time,
>> >
>> > Best,
>> >
>> > Khanh Nguyen
>>
>>
>
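
For reference on question 1 in the quoted message: an x86 memory operand of
the base/index/disp form names the location base + index*scale + disp. The
sketch below is only a toy model of that arithmetic. ToyAddress,
effective_address and base_is_effective are hypothetical names made up for
illustration and are not HotSpot's Address class or any HotSpot API. It may
help explain why, when the posted condition (obj.index() == noreg &&
obj.disp() == 0) holds, the base register alone already is the full address,
while otherwise an lea-style computation is needed to obtain it.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for an x86 memory operand. NOT HotSpot's Address class;
// all names here are hypothetical and exist only for illustration.
struct ToyAddress {
    std::uintptr_t base;   // value held in the base register
    std::uintptr_t index;  // value held in the index register (0 if none)
    unsigned       scale;  // 1, 2, 4 or 8
    std::intptr_t  disp;   // constant displacement
};

// What an lea-style instruction computes: base + index*scale + disp.
constexpr std::uintptr_t effective_address(const ToyAddress& a) {
    return a.base + a.index * a.scale + static_cast<std::uintptr_t>(a.disp);
}

// True when the base alone already equals the effective address, i.e. the
// toy analogue of "no index register and zero displacement".
constexpr bool base_is_effective(const ToyAddress& a) {
    return a.index == 0 && a.disp == 0;
}
```

In this model, an operand such as {0x1000, 0, 1, 16} has base 0x1000 but
names location 0x1010, so the base and the effective address differ; with
zero index and zero displacement the two coincide.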


More information about the hotspot-dev mailing list