Add instrumentation in the TemplateInterpreter
Christian Thalinger
christian.thalinger at oracle.com
Fri Feb 19 18:18:44 UTC 2016
> On Feb 18, 2016, at 1:43 PM, Khanh Nguyen <ktruong.nguyen at gmail.com> wrote:
>
> Yes, you are correct. This is for a research project. We try to remember those references so that later we can do something similar to a GC to update the references. The base assumption is that we only have a small number of these kinds of objects, so the cost should be acceptable.
>
> A side reason for staying with the TemplateInterpreter is that I can't convince my team to switch to the BytecodeInterpreter in ZeroShark. The unknown performance of ZeroShark is partially responsible for my inability to convince them.
>
>
Are you running your experiments interpreted only or in a tiered environment with C1/C2?
> On Feb 18, 2016 2:50 PM, "Christian Thalinger" <christian.thalinger at oracle.com> wrote:
>
>> On Feb 18, 2016, at 9:30 AM, Khanh Nguyen <ktruong.nguyen at gmail.com> wrote:
>>
>> The main reason is the performance difference between the TemplateInterpreter and the BytecodeInterpreter in Zero.
>> I did not verify the difference myself, but according to this mailing list it is 10x.
>>
>>
>
> But the instrumentation you are adding is quite expensive and I’m assuming this is for research or academia?
>
>> And since we are talking about Zero: do you by any chance know how large the performance difference between ZeroShark and standard HotSpot is?
>>
>> Thanks
>>
>> On Feb 18, 2016 11:14 AM, "Christian Thalinger" <christian.thalinger at oracle.com> wrote:
>> Can you share the reason?
>>
>>> On Feb 18, 2016, at 8:01 AM, Khanh Nguyen <ktruong.nguyen at gmail.com> wrote:
>>>
>>> Unfortunately it has to be the Template Interpreter.
>>>
>>> On Feb 18, 2016 9:27 AM, "Christian Thalinger" <christian.thalinger at oracle.com> wrote:
>>> Does it have to be the template interpreter or could you do your work with Zero as well?
>>>
>>> > On Feb 9, 2016, at 11:19 PM, Khanh Nguyen <ktruong.nguyen at gmail.com> wrote:
>>> >
>>> > Hello,
>>> >
>>> > I want to add instrumentation to monitor all reads and writes in the
>>> > TemplateInterpreter. I think I found the correct place for it in
>>> > /cpu/x86/vm/templateTable_x86_64.cpp. Can someone please tell me if I'm
>>> > doing it right?
>>> >
>>> > For writes:
>>> > static void do_oop_store(InterpreterMacroAssembler* _masm,
>>> >                          Address obj,
>>> >                          Register val,
>>> >                          BarrierSet::Name barrier,
>>> >                          bool precise) {
>>> >   [...]
>>> >     case BarrierSet::CardTableModRef:
>>> >     case BarrierSet::CardTableExtension:
>>> >       {
>>> >         if (val == noreg) {
>>> >           __ store_heap_oop_null(obj);
>>> >         } else {
>>> >           __ store_heap_oop(obj, val);
>>> >
>>> >           /*mycodeA*/ __ movptr(c_rarg1, obj.base()); // save this value, otherwise it will be changed?
>>> >
>>> >           // flatten object address if needed
>>> >           if (!precise || (obj.index() == noreg && obj.disp() == 0)) {
>>> >             __ store_check(obj.base());
>>> >             /*mycodeB*/ __ call_VM(noreg, // void
>>> >                           CAST_FROM_FN_PTR(address, InterpreterRuntime::write_helper),
>>> >                           c_rarg1, // obj
>>> >                           c_rarg1, // field address, because store check is called on field address
>>> >                           val);
>>> >           } else {
>>> >             __ leaq(rdx, obj);
>>> >             __ store_check(rdx);
>>> >             /*mycodeC*/ __ call_VM(noreg, // void
>>> >                           CAST_FROM_FN_PTR(address, InterpreterRuntime::write_helper),
>>> >                           c_rarg1, // obj
>>> >                           rdx,     // field address, because store check is called on field address
>>> >                           val);
>>> >           }
>>> >         }
>>> >       }
>>> >       break;
>>> >
>>> > For reads:
>>> > case Bytecodes::_fast_agetfield:
>>> >   __ load_heap_oop(rax, field);
>>> >
>>> >   /*mycodeD*/ __ call_VM(noreg,
>>> >                 CAST_FROM_FN_PTR(address, InterpreterRuntime::read_barrier_helper),
>>> >                 rax);
>>> >
>>> >   __ verify_oop(rax);
>>> >   break;
>>> >
>>> > My questions are:
>>> >
>>> > 1) I thought this represents a putfield a.f=b, where a.f is represented by
>>> > the parameter obj of type Address and b is obviously the parameter val of
>>> > type Register. In particular, obj has the fields base, index and disp. But
>>> > when I run this, it looks like obj is actually the field address (the
>>> > mycodeB case). I haven't found a test case that triggers the mycodeC case
>>> > to see its behavior (i.e., whether rdx gets destroyed and I get a random
>>> > value back, or c_rarg1 is the obj address and rdx is the field address).
>>> >
>>> > 2) Before this, I tried to insert the same __ call_VM in fast_aputfield
>>> > before do_oop_store, but it results in a JVM crash and I don't understand
>>> > why. All I do in the call is print the parameters. I did see the values
>>> > printed (only the first time execution reaches the method), but then the
>>> > VM crashed. I thought __ call_VM would preserve all register values and
>>> > restore them properly on return. My instrumentation has no side effects; I
>>> > just observe and record the values (actually, I just print them as a
>>> > test).
>>> >
>>> > 3) Is the /*mycodeA*/ line strictly required? I tried passing obj.base()
>>> > twice in the mycodeB line instead, and I got build errors for "smashed
>>> > args".
>>> >
>>> > I greatly appreciate your time,
>>> >
>>> > Best,
>>> >
>>> > Khanh Nguyen
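
Question 1 above turns on how an Address operand relates to the field address. The following is a minimal, self-contained C++ sketch of the usual x86 memory-operand form base + index * scale + disp. It is a toy model, not HotSpot's Address class: ToyAddress, effective_address and is_flat are hypothetical names introduced only for illustration. It shows why, when there is no index register and the displacement is zero, the base register alone already holds the full address, which is the case the quoted condition obj.index() == noreg && obj.disp() == 0 tests for.

```cpp
#include <cassert>
#include <cstdint>

// Toy stand-in for an x86 memory operand (hypothetical, not HotSpot code).
struct ToyAddress {
    std::uintptr_t base;   // value currently held in the base register
    std::uintptr_t index;  // value held in the index register, if any
    unsigned scale;        // scaling factor applied to index: 1, 2, 4 or 8
    std::intptr_t disp;    // constant displacement (e.g. a field offset)
    bool has_index;        // false models "index == noreg"
};

// Computes base + index * scale + disp, the address the operand refers to.
std::uintptr_t effective_address(const ToyAddress& a) {
    assert(a.scale == 1 || a.scale == 2 || a.scale == 4 || a.scale == 8);
    std::uintptr_t ea = a.base + static_cast<std::uintptr_t>(a.disp);
    if (a.has_index) {
        ea += a.index * a.scale;
    }
    return ea;
}

// True when the base register by itself equals the effective address,
// mirroring the shape of the "index == noreg && disp == 0" test.
bool is_flat(const ToyAddress& a) {
    return !a.has_index && a.disp == 0;
}
```

In this model, a field at offset 16 in an object at 0x1000 is described by base 0x1000 and disp 16; its effective address is 0x1010 and it is not flat, so the address has to be materialized first (the role leaq plays in the quoted code). An operand with base 0x1010, no index and zero displacement is flat, and its base is already the field address. Whether the callers of do_oop_store pass an already flattened operand is something to check in those callers; the sketch makes no claim about that.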