RFR: 8199417: Modularize interpreter GC barriers

Sun Apr 1 03:12:57 UTC 2018

> On Mar 26, 2018, at 5:31 AM, Erik Österlund <erik.osterlund at oracle.com> wrote:
> 
> Hi,
> 
> The GC barriers for the interpreter are not as modular as they could be. They currently use switch statements to check which GC barrier set is being used, and call this or that barrier based on that, in a way that assumes GCs only use write barriers.
> 
> This patch modularizes this by generating accesses in the interpreter with declarative semantics. Accesses to the heap may now use store_at and load_at functions of the BarrierSetAssembler, passing along appropriate arguments and decorators. Each concrete BarrierSetAssembler can override the access completely or sprinkle some appropriate GC barriers as necessary.
> 
> Big thanks go to Martin Doerr and Roman Kennke, who helped plugging this into S390, PPC and AArch64 respectively.
> 
> Webrev:
> http://cr.openjdk.java.net/~eosterlund/8199417/webrev.00/
> 
> Bug:
> https://bugs.openjdk.java.net/browse/JDK-8199417
> 
> Thanks,
> /Erik

I've only looked at the x86 and generic code so far. I have some
comments, but also a bunch of questions. I think I'm missing some
things somewhere. I'm sending what I've got so far anyway, and will
continue studying.

------------------------------------------------------------------------------
src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp
354   // flatten object address if needed
355   // We do it regardless of precise because we need the registers

There is no "precise" in this context.

------------------------------------------------------------------------------
src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp
364 #ifndef _LP64
365   InterpreterMacroAssembler *imasm = static_cast<InterpreterMacroAssembler*>(masm);
366 #endif

In the old code, the function received an InterpreterMacroAssembler*.

What is there about this context that lets us know the downcast is
safe here?  Is oop_store_at only ever called with an IMA*?  If so,
then why isn't that the parameter type.  If not, then what?

------------------------------------------------------------------------------
src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp  
753   bs->load_at(_masm, IN_HEAP | ON_WEAK_OOP_REF, T_OBJECT,
754               rax, field_address, /*tmp1*/ rbx, /*tmp_thread*/ rdx);

This needs to distinguish between WEAK and PHANTOM references, else it
will break ZGC's concurrent reference processing.  (At least so long as
we still have to deal with finalization.)

------------------------------------------------------------------------------
src/hotspot/cpu/x86/templateInterpreterGenerator_x86.cpp 
698 address TemplateInterpreterGenerator::generate_Reference_get_entry(void) {

Almost all the code here (and much of the commentary) was/is specific
to G1 (or SATB collectors).  Shouldn't it all be in the barrier_set's
load_at?  As it is now, if (say) we're using Parallel GC then I think
there's a bunch of additional code being generated and executed here
that wasn't present before.

------------------------------------------------------------------------------
src/hotspot/cpu/x86/methodHandles_x86.cpp  
170   __ load_heap_oop(method_temp, Address(recv, NONZERO(java_lang_invoke_MethodHandle::form_offset_in_bytes())));                      
=>
175   bs->load_at(_masm, IN_HEAP, T_OBJECT, method_temp, Address(recv, NONZERO(java_lang_invoke_MethodHandle::form_offset_in_bytes())), t

This doesn't look like an improvement in code readability to me.  Why
can't bs->load_at be the implementation of "__ load_heap_oop" ?

And the lines are getting really long; so long that webrev is cutting
off the right end, so I can't review the new versions of the lines.

------------------------------------------------------------------------------ 
src/hotspot/cpu/x86/methodHandles_x86.cpp 
363         __ load_heap_oop(temp2_defc, member_clazz);
=>
368         BarrierSetAssembler* bs = BarrierSet::barrier_set()->barrier_set_assembler();
369         Register rthread = LP64_ONLY(r15_thread) NOT_LP64(noreg);
370         bs->load_at(_masm, IN_HEAP, T_OBJECT, temp2_defc, member_clazz, noreg, rthread);

Similarly, this doesn't seem like an improvement to me.  And there are
more like this.

------------------------------------------------------------------------------ 
src/hotspot/cpu/x86/macroAssembler_x86.hpp

It would reduce the clutter in this change to leave the as_Address
declarations where they were, since that place is now public, rather
than moving them to a different public section.

------------------------------------------------------------------------------

Generally, it seems like it might be nice for the MacroAssembler class
to have a reference to the barrier_set, or maybe even the bs's
assembler, rather than looking them up in various places, and in
different ways.  For example, I see each of

  Universe::heap()->barrier_set()
  BarrerSet::barrier_set()

used in various different places *in new code*.  Some consistency
would be nice.

------------------------------------------------------------------------------
src/hotspot/cpu/x86/macroAssembler_x86.cpp
5249   bs->load_at(this, IN_ROOT | ON_PHANTOM_OOP_REF, T_OBJECT,
5250               value, Address(value, -JNIHandles::weak_tag_value), tmp, thread);
5251   verify_oop(value);

This puts the verify_oop after the G1 pre-barrier, where in the old
code verify_oop was before the pre-barrier. I remember the old order
being intentional, though don't off-hand recall how important that
was.  If for no other reason, it seems like the new order could rarely
(due to bad timing) have a bad oop reported somewhere far away during
SATB buffer processing, rather than at the point of the load.

With the new version's order there could be one verify after the done
label.  That's just a minor nit.

------------------------------------------------------------------------------ 
src/hotspot/cpu/x86/gc/g1/g1BarrierSetAssembler_x86.cpp
230 #ifdef _LP64
231     if (c_rarg1 != thread) {
232       __ mov(c_rarg1, thread);
233     }
234     if (c_rarg0 != pre_val) {
235       __ mov(c_rarg0, pre_val);
236     }
237 #else
238     __ push(thread);
239     __ push(pre_val);
240 #endif

Old code used:

  __ pass_arg1(thread);
  __ pass_arg0(pre_val);

Why is this being changed?  Oh, pass_argN are static helpers not
available outside macroAssembler_x86.cpp.  Maybe they should be
exposed in some way, like public helpers in the x86 version of the
MacroAssembler class, rather than copied inline here (and maybe other
places?).

------------------------------------------------------------------------------ 
src/hotspot/cpu/x86/interp_masm_x86.cpp
Deleted these lines:
519   // The resulting oop is null if the reference is not yet resolved.                                                                 
520   // It is Universe::the_null_sentinel() if the reference resolved to NULL via condy.                                                

I don't think those comment lines should be deleted.

------------------------------------------------------------------------------
src/hotspot/cpu/x86/gc/shared/modRefBarrierSetAssembler_x86.cpp
86 void ModRefBarrierSetAssembler::store_at(MacroAssembler* masm, DecoratorSet decorators, BasicType type,
87                                        Address dst, Register val, Register tmp1, Register tmp2) {
88   if (type == T_OBJECT || type == T_ARRAY) {
89     oop_store_at(masm, decorators, type, dst, val, tmp1, tmp2);
90   } else {
91     BarrierSetAssembler::store_at(masm, decorators, type, dst, val, tmp1, tmp2);
92   }
93 }

Doesn't the conditional conversion from store_at to oop_store_at
actually indicate a bug in the caller? That is, calling store_at with
a oop (T_OBJECT or T_ARRAY) value seems like a bug, and should be
asserted against, rather than translating.

And oop_store_at should assert that the value *is* an oop value.

And should there be any verification that the decorators and the type
are consistent?

But maybe I'm not understanding the protocol around oop_store_at?
There being no documentation, it seems entirely possible that I've
misunderstood something here.  Maybe I'm being confused by
similarities in naming with Access:: that don't hold?

------------------------------------------------------------------------------ 
src/hotspot/cpu/x86/gc/shared/cardTableBarrierSetAssembler_x86.cpp
88 void CardTableBarrierSetAssembler::store_check(MacroAssembler* masm, Register obj, Address dst) {
89   // Does a store check for the oop in register obj. The content of
90   // register obj is destroyed afterwards.

The comment states obj is an oop, but when called by the precise case
of oop_store_at, it is not an oop, but rather a derived pointer.  I
think it's the comment that's wrong.

------------------------------------------------------------------------------

I was surprised that oop_store_at is initially declared at the
ModRefBarrierSetAssembler level, rather than at the
BarrierSetAssembler level.

I was also surprised to not see an oop_load_at.

In Access we have _at suffixed operations paired with non-_at suffixed
operations, the difference being the _at suffixed operations have a
base object, to allow dealing with the brooks-ptr. I'm not finding any
such distinction in the xxxAssembler API. Presumably I've missed
something somewhere. How is that handled?

------------------------------------------------------------------------------