Request for reviews (L): 7063628: Use cbcond on T4
Tom Rodriguez
tom.rodriguez at oracle.com
Thu Jul 14 11:31:25 PDT 2011
assembler_sparc.hpp:
If we wanted to be clever about the delay slot issue we could add an function like maybe_delayed() that would emit the instruction in a side buffer and place it in the right place for these instructions. Basically you could emit it as if it were not delayed and then move it into the delay slot if you actually have one. So instead of:
__ br_null(G3_scratch, false, __ pt, skip_fixup, false);
__ delayed()->ld_ptr(G5_method, in_bytes(methodOopDesc::interpreter_entry_offset()), G3_scratch);
you do
__ maybe_delayed()->ld_ptr(G5_method, in_bytes(methodOopDesc::interpreter_entry_offset()), G3_scratch);
__ br_null(G3_scratch, false, __ pt, skip_fixup);
It's a little tweaky but it would allow cbcond to be used in more places.
You need some comments explaining when the instruction emits the delay slot for you. I would think this should only be done for names which are clearly not real instruction names. I don't like the extra argument that controls whether the delay slot is emitted or not. It's a bit mysterious in a casual reading. Maybe the false version should be _no_delay? I'm not sure how I'd want to this work but it seems a bit subtle at a first glance.
Get rid of the emit_delayed_nop stuff in ba. It's just a confusing way to emit "br(always, false, pt, L)" so callers should just use that directly. Maybe instead of being called ba is should br_always so it's clear that it's a macro instruction. You can then leave the existing ba alone.
cbc doesn't match the docs which use cbcond to refer to them generically. It's also slightly odd that the real instructions are called cwb and cxb but we don't have any way to emit those directly but I guess the generic forms take care of it. I'm not a huge fan of the instructions that take CC as an argument. I might prefer to see cwb, cxb and cb_ptr to cover the ptr_cc case but given the limited number of uses I can live with what you've done.
Why did you remove the Condition argument from br_zero? It can work with the new encoding.
c1_LIRAssembler_sparc.cpp:
Why isn't this:
! __ ba(false, *update_done);
! __ ba(*update_done, false);
__ delayed()->nop();
simply
__ ba(*update_done);
That actually applies to a bunch of other places too.
sharedRuntime_sparc.cpp:
I think the annulling here is wrong or at least meaningless.
! __ cmp_and_brx(temp_reg, G5_inline_cache_reg, Assembler::equal, true, Assembler::pt, L);
__ brx(Assembler::equal, true, Assembler::pt, L);
__ delayed()->nop();
cmp_and_brx shouldn't even have an annul argument. Actually none of these new macro instructions should.
compile.cpp:
+ if (n->is_Branch() && n->as_Mach()->ideal_Opcode() != Op_Jump) {
+ // Fake label for branch instruction.
+ MacroAssembler masm(&buf);
+ Label fakeL;
+ masm.bind(fakeL);
+ n->as_Mach()->label_set(fakeL, 0);
+ }
What guarantees do we get about the lifetime of the storage for fakeL since it doesn't appear to be used? I would assume that at a minimum it should be declared outside of the if.
You can clean up all the "l ? (l->loc_pos() - (cbuf.insts_size() + 1)) : 0;" code in x86_*.ad after this fix too. We can even use the MacroAssembler for these instrutions now if we want to though that's probably for another day.
Should labelOper::label have an assert that it's non-null when called? That would cover all the other asserts you added though I'm not sure it must be true.
The overall change looks good.
tom
On Jul 13, 2011, at 8:44 PM, Vladimir Kozlov wrote:
> http://cr.openjdk.java.net/~kvn/7063628/webrev
>
> Fixed 7063628: Use cbcond on T4
>
> Added new MacroAssembler instructions cmp_and_br() and old branch instructions are modified to use cbcond on T4 if distance is small (2K bytes). Most of the rest changes are usage of these new branch instructions in Interpreter and C1.
>
> The prototype was done by Tom and I took some of his additional fixes. The formssel.cpp change is a bug fix where the MatchNode equality test wasn't recursing so it would mistakenly return true for complex matches. Added a fake label for branches generated in temp buffer by MachNode::emit_size(), added assert into .ad files to check label.
>
> There was problem in is_in_wdisp16_range() (and in initial implementation of use_cbc()) which calls target(L) before emitting branch. Non-bound (forward branch) labels record current pc() as a branch address and later try to patch instruction (which could be 'cmp') at that address.
>
> Removed unused code in check_klass_subtype_fast_path().
>
> Modified vm_version string: has_ prefix is removed and only v8 or v9 will be printed as well only niagara_plus or niagara.
>
> Tested on T4 with CTW, nsk, VM and java invoke regression tests.
More information about the hotspot-compiler-dev
mailing list