RFR: 7143664: Clean up OrderAccess implementations and usage

Fri Feb 20 22:40:48 UTC 2015

> http://cr.openjdk.java.net/~dholmes/7143664/webrev.v4/

I haven't reviewed all of the changes, instead focusing on the
structure and use of templates.  That generally looks fine; I have a
few stylistic comments below, but nothing earth shaking.

Some of the remaining comments, even if technically correct and
acceptable, might not be appropriate for this change set, but should
instead be dealt with in follow-on changes.

------------------------------------------------------------------------------  
Throughout:

I would probably have used some simple macros to avoid near
duplication of a lot of groups of related functions that differ only
in a type parameter.

The present approach of making the groups of related functions
"obviously" similar by putting the entire definition on one line so
there are several nearly identical lines in a row makes me look at
each of the lines carefully to make sure the several parallel
differences are what's expected.  It has the additional defect of
involving very long lines, which makes them harder to review in
side-by-side views.

YMMV.

------------------------------------------------------------------------------   

The ordered access operations are defined on jbyte ... jlong.  The
problem is that there is a type in that sequence that is the same size
as two distinct C/C++ types.  And as a result, one of those C/C++
types is missing atomic support.  This can lead to problems with types
like size_t on some platforms

I also don't see atomic support for char*.

This might be something to consider for a separate change. 

------------------------------------------------------------------------------   
An observation:

If OrderAccess was a real namespace instead of a pseudo-namespace
class, the specialized versions of the specialized_xxx functions could
be ordinary function overloads rather than function template
specializations. 

OTOH, namespaces don't provide private access control.  I often find
myself wishing for such a feature.

No actual change needed here, just pointing out an interesting
variation from the approach taken.

------------------------------------------------------------------------------   
An observation:

load_ptr_acquire isn't strictly necessary, and could be replaced by
load_acquire with some additional work.  But it might be the "_ptr_"
in the name is considered useful documentation.

Given that, load_ptr_acquire can be generalized to support any type of
pointer, not just intptr_t* and void*.  This would eliminate the need
for casting result types in some places, and for some strange value
types in others.  For example MetaspaceGC::_capacity_until_gc is
declared intptr_t rather than size_t.

This might be something to consider for a separate change. 

------------------------------------------------------------------------------   
src/share/vm/runtime/orderAccess.hpp 

I don't see the point of having ScopedFence derive from
ScopedFenceGeneral.  It seems to me that ScopedFenceGeneral could
derive from AllStatic (with corresponding changes to functions) and
ScopedFence could instead derive from StackObj.

------------------------------------------------------------------------------ 
src/share/vm/runtime/orderAccess.hpp 

Why are ScopedFenceType, ScopedFenceGeneral, and ScopedFence defined
at global scope rather than nested in OrderAccess?

------------------------------------------------------------------------------ 
src/share/vm/runtime/orderAccess.hpp 

For all load_acquire variants, why (volatile T* p) rather than
(const volatile* p)?

------------------------------------------------------------------------------
src/share/vm/runtime/orderAccess.inline.hpp 
  75 #ifdef VM_HAS_GENERALIZED_ORDER_ACCESS

I'm not sure what the point of this is.  Is this just to allow some
(perhaps third-party?) platforms to continue to use an old
implementation?

Or is it for incremental development of this set of changes?

Is the plan that it will eventually go away?

------------------------------------------------------------------------------
src/share/vm/runtime/orderAccess.inline.hpp 

I think there needs to be some documentation here about which
specialized_xxx functions can/should be considered for specialization.

------------------------------------------------------------------------------
src/share/vm/runtime/orderAccess.inline.hpp 
 138 // The following methods can be specialized using simple template specialization
 139 // in the platform specific files for optimization purposes. Otherwise the
 140 // generalized variant is used.
 141 template<typename T> inline T    OrderAccess::specialized_load_acquire       (volatile T* p)       { return ordered_load<T, X_ACQUIRE>(p);    }
 142 template<typename T> inline void OrderAccess::specialized_release_store      (volatile T* p, T v)  { ordered_store<T, RELEASE_X>(p, v);       }
 143 template<typename T> inline void OrderAccess::specialized_release_store_fence(volatile T* p, T v)  { ordered_store<T, RELEASE_X_FENCE>(p, v); }

I *think* it doesn't matter in practice, but I would put these
definitions before the potential instantiations.  The relevant
declarations are obviously in scope where those instantiations occur,
and everything is declared inline, but I worry that some
insufficiently clever compiler might end up generating out of line
code in some cases because the definition was after use.

------------------------------------------------------------------------------
src/share/vm/runtime/orderAccess.inline.hpp 
 101 inline jubyte   OrderAccess::load_acquire(volatile jubyte*  p) { return (jubyte) specialized_load_acquire((volatile jbyte*)p);  }
 102 inline jushort  OrderAccess::load_acquire(volatile jushort* p) { return (jushort)specialized_load_acquire((volatile jshort*)p); }
 103 inline juint    OrderAccess::load_acquire(volatile juint*   p) { return (juint)  specialized_load_acquire((volatile jint*)p);   }
 104 inline julong   OrderAccess::load_acquire(volatile julong*  p) { return (julong) specialized_load_acquire((volatile jlong*)p);  }

I would probably define these as calling load_acquire with casts,
rather than directly calling specialized_load_acquire.  Similarly for
store_release and load_ptr_acquire variants.  This would make it
easier to fiddle with the how the specializations are implemented, and
shorten the line lengths by a dozen characters.

I would probably also have used some simple macrology to avoid near
duplication of code.

------------------------------------------------------------------------------
src/os_cpu/linux_ppc/vm/orderAccess_linux_ppc.inline.hpp

I see no uses of the inlasm_eieio macro. 
I see no uses of the inlasm_isync macro.