Add instrumentation in the TemplateInterpreter
Khanh Nguyen
ktruong.nguyen at gmail.com
Thu Feb 18 23:43:03 UTC 2016
Yes, you are correct. This is for a research project. We try to remember
those references so that later we can do something similar to a GC to
update them. The base assumption is that we only have a small
number of these kinds of objects, so the cost should be acceptable.
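
A minimal sketch of the "remember now, update later" idea described above.
Everything in it is an assumption made for illustration: RememberedSet, Obj
and the forwarding map are hypothetical names, not HotSpot types or the
project's actual design. It records the addresses of reference fields, then
makes one GC-like pass that rewrites every recorded field whose referent has
moved.

```cpp
#include <cstddef>
#include <unordered_map>
#include <unordered_set>

// Hypothetical stand-in for a heap object; not a HotSpot type.
struct Obj { int id; };

// Hypothetical remembered set: stores the addresses of reference fields
// (slots holding an Obj*) so they can be fixed up later.
class RememberedSet {
public:
    // Record a field address; recording the same slot twice is harmless.
    void record(Obj** slot) { slots_.insert(slot); }

    // GC-like pass: rewrite each recorded slot whose referent appears in
    // the forwarding map (old address -> new address).
    // Returns the number of slots that were updated.
    std::size_t update(const std::unordered_map<Obj*, Obj*>& forwarding) {
        std::size_t updated = 0;
        for (Obj** slot : slots_) {
            auto it = forwarding.find(*slot);
            if (it != forwarding.end()) {
                *slot = it->second;
                ++updated;
            }
        }
        return updated;
    }

    std::size_t size() const { return slots_.size(); }

private:
    std::unordered_set<Obj**> slots_;
};
```

The cost of the update pass is proportional to the number of recorded
fields, which is why the small-number assumption matters.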
A side reason for staying with the TemplateInterpreter is that I can't
convince my team to switch to the BytecodeInterpreter in ZeroShark. The unknown
performance of ZeroShark is partially responsible for my inability to
convince them.
On Feb 18, 2016 2:50 PM, "Christian Thalinger" <
christian.thalinger at oracle.com> wrote:
>
> On Feb 18, 2016, at 9:30 AM, Khanh Nguyen <ktruong.nguyen at gmail.com>
> wrote:
>
> The main reason is the performance difference between the
> TemplateInterpreter and the BytecodeInterpreter in Zero.
> I did not verify the difference but I found from this mailing list that
> the difference is 10x.
>
>
> But the instrumentation you are adding is quite expensive and I’m assuming
> this is for research or academia?
>
> And since we are talking about Zero: do you by any chance know how large
> the performance difference between ZeroShark and the standard HotSpot
> is?
>
> Thanks
> On Feb 18, 2016 11:14 AM, "Christian Thalinger" <
> christian.thalinger at oracle.com> wrote:
>
>> Can you share the reason?
>>
>> On Feb 18, 2016, at 8:01 AM, Khanh Nguyen <ktruong.nguyen at gmail.com>
>> wrote:
>>
>> Unfortunately it has to be the Template Interpreter.
>> On Feb 18, 2016 9:27 AM, "Christian Thalinger" <
>> christian.thalinger at oracle.com> wrote:
>>
>>> Does it have to be the template interpreter or could you do your work
>>> with Zero as well?
>>>
>>> > On Feb 9, 2016, at 11:19 PM, Khanh Nguyen <ktruong.nguyen at gmail.com> wrote:
>>> >
>>> > Hello,
>>> >
>>> > I want to add instrumentation to monitor all reads and writes in the
>>> > TemplateInterpreter. I think I found the correct place for it in
>>> > /cpu/x86/vm/templateTable_x86_64.cpp. Can someone please tell me if I'm
>>> > doing it right?
>>> >
>>> > For writes:
>>> > static void do_oop_store(InterpreterMacroAssembler* _masm,
>>> > Address obj,
>>> > Register val,
>>> > BarrierSet::Name barrier,
>>> > bool precise) {
>>> > [...]
>>> > case BarrierSet::CardTableModRef:
>>> > case BarrierSet::CardTableExtension:
>>> > {
>>> > if (val == noreg) {
>>> > __ store_heap_oop_null(obj);
>>> > } else {
>>> > __ store_heap_oop(obj, val);
>>> >
>>> > /*mycodeA*/ __ movptr(c_rarg1, obj.base()); // save this value otherwise it will be changed?
>>> >
>>> > // flatten object address if needed
>>> > if (!precise || (obj.index() == noreg && obj.disp() == 0)) {
>>> > __ store_check(obj.base());
>>> > /*mycodeB*/ __ call_VM(noreg, //void
>>> > CAST_FROM_FN_PTR(address, InterpreterRuntime::write_helper),
>>> > c_rarg1, // obj
>>> > c_rarg1, // field address because store check is called on field address
>>> > val);
>>> > } else {
>>> > __ leaq(rdx, obj);
>>> > __ store_check(rdx);
>>> > /*mycodeC*/ __ call_VM(noreg, //void
>>> > CAST_FROM_FN_PTR(address, InterpreterRuntime::write_helper),
>>> > c_rarg1, // obj
>>> > rdx, // field address, because store check is called on field address
>>> > val);
>>> > }
>>> > }
>>> > break;
>>> >
>>> > For reads:
>>> > case Bytecodes::_fast_agetfield:
>>> > __ load_heap_oop(rax, field);
>>> >
>>> > /*mycodeD*/ __ call_VM(noreg,
>>> > CAST_FROM_FN_PTR(address, InterpreterRuntime::read_barrier_helper),
>>> > rax);
>>> >
>>> > __ verify_oop(rax);
>>> > break;
>>> >
>>> > My questions are:
>>> >
>>> > 1) I thought this represents a putfield a.f=b, where a.f is represented by
>>> > the parameter obj of type Address and b is obviously the parameter val of
>>> > type Register. In particular, obj has the fields base, index and disp. But
>>> > when I run this, it looks like obj is actually the field address (the case
>>> > mycodeB). I haven't found a test case that triggers the case mycodeC to see
>>> > the behavior (i.e., rdx might get destroyed and I get a random value back, or
>>> > c_rarg1 is the obj address and rdx is the field address).
>>> >
>>> > 2) Before this, I tried to insert the same __ call_VM in fast_aputfield
>>> > before do_oop_store, but it results in a JVM crash. I don't understand the
>>> > reason why. All I do in the call is print the parameters. I did see
>>> > the values printed (only the 1st time it goes into the method), but then the
>>> > VM crashed. I thought __ call_VM would preserve all registers' values and
>>> > restore them properly when it comes back. My instrumentation has no side
>>> > effects; I just observe and record the values (actually just printing the
>>> > values to test).
>>> >
>>> > 3) Is it strictly required to have the line /*mycodeA*/? I tried, in the
>>> > mycodeB line, to pass obj.base() twice and got build errors for "smashed
>>> > args".
>>> >
>>> > I greatly appreciate your time,
>>> >
>>> > Best,
>>> >
>>> > Khanh Nguyen
>>
>>
>