RFR: 8186166: Generalize Atomic::cmpxchg with templates

Mon Aug 14 21:11:59 UTC 2017

Kim,

It's good of you to break this up.   Some initial comments:

http://cr.openjdk.java.net/~kbarrett/8186166/hotspot.00/src/os_cpu/linux_x86/vm/atomic_linux_x86.hpp.frames.html

Should all of the CmpxchgPlatform<1,4,8> functions have an assert like 
below?  I saw them somewhere.  With template expansion, perhaps it's 
impossible otherwise, but this is good documentation of the function's 
expectations.

231 STATIC_ASSERT(4 == sizeof(T));

Trying to find what happened to:

VM_HAS_SPECIALIZED_CMPXCHG_BYTE

...

http://cr.openjdk.java.net/~kbarrett/8186166/hotspot.00/src/os_cpu/solaris_sparc/vm/solaris_sparc.il.frames.html

Very nice.  After the atomics, we can file an RFE to move the rest into asm.

http://cr.openjdk.java.net/~kbarrett/8186166/hotspot.00/src/os_cpu/solaris_x86/vm/atomic_solaris_x86.hpp.frames.html

I haven't worked out cmpxchg_using_stub yet.  But doing a call like this 
seems a lot simpler.  Why not do this for all the platforms that have 
stub functions rather than inline assembly?  I don't think the cost of 
doing a call is going to be noticeable performance wise, relative to 
doing synchronized operations.   Could avoid another template.

http://cr.openjdk.java.net/~kbarrett/8186166/hotspot.00/src/share/vm/runtime/atomic.hpp.udiff.html

For the second two cmpxchg_ptr functions, can you remove the weird 
formatting since it's not trying to line up with anything.

Can you add a comment just above conditional_store_ptr about what it's 
for? ie. initialization or rather replacing a null value.

This comment about PlatformCmpxchg is really confusing which is D and 
T.   Do you mean T instead of D in the comment below, since the function 
() takes T ?

+ // - dest is of type D*.
+ // - exchange_value and compare_value are of type D.
+ // - order is of type cmpxchg_memory_order.
+ // - platform_cmpxchg is an object of type PlatformCmpxchg<sizeof(D)>.
+ //
+ // Then
+ // platform_cmpxchg()(exchange_value, dest, compare_value, order)
+ // must be a valid expression, returning a result convertible to D.
+ //
+ // A default definition is provided, which declares a function template
+ // T operator()(T, T volatile*, T, cmpxchg_memory_order) const

Can you move the public above the comment so it's clear that the comment 
refers to this CmpxchgByteUsingInt ?  And spell out RMW.

+ // Support platforms that do not provide RMW byte-level atomic access
+ // To use, derive PlatformCmpxchg<1> from this class.
+ // Can't be private: C++03 11.4/2; fixed in C++11.
+public:
+ struct CmpxchgByteUsingInt;

I could look at IsPointerConvertible for hours and not know what it does 
without your previous explanation, and even now I can't recall the 
details.   Can you add comment hints?   Ie why is (&no)[2] declared like 
that, and why aren't there function declarations.   And that you're 
using overload resolution to try to convert From* to To*.   They don't 
have to be long comments because nobody's going to stare at this for a 
while after they get the gist of it.

nit, can you remove 'all' in this sentence.  Or "These need to be".   
I'm reading comments now.

+// specializations of the class. That all needs to be provided by the

So am I reading this right, this doesn't depend on integer promotions, 
all the integer types must be identical in sign and size?  This would be 
good.

+// Handle cmpxchg for integral and enum types.
+//
+// All the involved types must be identical.
+template<typename T>
+struct Atomic::CmpxchgImpl<
+ T, T, T,
+ typename EnableIf<IsIntegral<T>::value || 
IsRegisteredEnum<T>::value>::type>
+ VALUE_OBJ_CLASS_SPEC
+{
+ T operator()(T exchange_value, T volatile* dest, T compare_value,
+ cmpxchg_memory_order order) const {
+ // Forward to the platform handler for the size of T.
+ return PlatformCmpxchg<sizeof(T)>()(exchange_value,
+ dest,
+ compare_value,
+ order);
+ }
+};

Regarding platforms that don't have a cmpxchg for byte.   I see the 
magic now.  Can you add this comment to all the platforms, then also add 
the line that the implementation of () is shared in atomic.hpp ?

+// No direct support for cmpxchg of bytes; emulate using int.
+template<>
+struct Atomic::PlatformCmpxchg<1> : Atomic::CmpxchgByteUsingInt {};
+

Does the name of the function have to be "operator()" ?  It's very C++y 
and maybe standard practice but it's never called as () is it?  It could 
have an english verb for the name, couldn't it? Then one could do some 
sort of find-all pattern match on it.

This is all I can look at for pass 1.   It seems reasonable to me.   I 
haven't looked at IntegerTypes again yet.   I also didn't look at the 
sparc asm for correctness since I can't do that. More below:

On 8/13/17 10:41 PM, Kim Barrett wrote:
> Please review this change to Atomic::cmpxchg, making it a function
> template with a per-platform underlying implementation.
>
> This has been broken out from 8184334: "Generalizing Atomic with
> templates", as the originally proposed change there encountered
> various objections.  This change takes a somewhat different approach
> to the same problem.
>
> One goal is to generalize the API and make it more type-safe,
> eliminating (nearly) all need for casts by callers.  Another is to
> allow more direct per-platform implementations, especially where there
> are appropriate compiler intrinsics available.
>
> If all goes well and this change is accepted, it is intended that the
> remainder of Atomic, and later OrderAccess, will be changed in a
> similar manner as one or more followups.
>
> This change to cmpxchg deprecates cmpxchg_ptr.  Followup changes will
> convert existing uses of cmpxchg_ptr and eventually remove it.
> Followup changes will also update uses of cmpxchg that can be
> simplified because of the generalized API.

Yes, totally agree with the follow-up RFEs.  This is going in the right 
direction.
>
> The cmpxchg function calls private Atomic::CmpxchgImpl, to select
> among several cases, based on the argument types.  Each case performs
> some checking and possibly conversion to reach a standard form of the
> arguments (or fail and report a compile time error), which are then
> passed to private Atomic::PlatformCmpxchg.  The PlatformCmpxchg class
> is provided by each platform, and performs additional case analysis to
> generate the appropriate code for the platform.
>
> This change involves modifications to all of the existing platform
> implementations of cmpxchg.  I've updated all of the platforms, but
> only tested those supported by Oracle.  I'll need help from the
> various platform maintainers to check and test my changes.
>
> Changes to files can be categorized as follows:
>
> New metaprogramming utilities:
> integerTypes.hpp
> test_integerTypes.cpp
> isRegisteredEnum.hpp
> test_isRegisteredEnum.cpp

I haven't reviewed the test code.
>
> Change cmpxchg to a template:
> atomic.hpp
>
> Oracle-supported platforms:
> atomic_bsd_x86.hpp - 64bit only
> atomic_linux_arm.hpp
> atomic_linux_x86.hpp
> atomic_solaris_sparc.hpp
> solaris_sparc.il
> atomic_solaris_x86.hpp
> atomic_windows_x86
>
> Non-Oracle platform changes:
> atomic_aix_ppc.hpp
> atomic_bsd_x86.hpp - 32bit
> atomic_bsd_zero.hpp
> atomic_linux_aarch.hpp
> atomic_linux_ppc.hpp
> atomic_linux_s390.hpp
> atomic_linux_sparc.hpp
> atomic_linux_zero.hpp
>
> Usage changes required by API change:
> os_bsd.cpp
> os_solaris.cpp
> aotCodeHeap.cpp
> aotCodeHeap.hpp
> psParallelCompact.hpp
> workgroup.cpp
> oopsHierarchy.hpp
> os.cpp
>
> Example updates of cmpxchg_ptr to cmpxchg:
> oop.inline.hpp
> bitMap.cpp
> bitMap.inline.hpp
>
> Some specific issues:
>
> - For Solaris (both SPARC and x86), we were using helper functions
> defined in corresponding .il files.  Using the cmpxchg_using_stub
> helper didn't work; passing a "defined in .il" function as a template
> parameter didn't work, leading to linker errors.  The ideal solution
> is to use gcc-style inline assembly, which is now supported by Solaris
> Studio.  That change was made for SPARC.  However, making that change
> for x86 ran into related segfaults, so using the .il definition with
> direct conversions for now, until that problem can be resolved.
>
> - I've added the function template Atomic::condition_store_ptr, which
> is a cmpxchg for pointer types with a NULL compare_value.  That might
> not be the best name for it.

Maybe conditional_store_pointer (not condition).  otherwise seems ok.
>
> - The name IntegerTypes was retained from the original RFR for
> JDK-8184334.  However, that name no longer fits the provided
> functionality very well, and I think a new name is needed.  Something
> involving Bitwise, or ValueRepresentation perhaps?  I'm open to
> suggestions.

I don't mind the IntegerTypes name even though it's not perfect. The 
other two names suggest other implementations of completely different 
things already in the vm, so I don't liek them.  There are no other 
sorts of IntegerTypes.
>
> - Issues with casts/conversions provided by IntegerTypes.
>
> * Conversions between integers and floating point values are
> supported.  It was suggested during the discussion of the RFR for
> JDK-8184334 that this was not needed, as there are no uses of
> atomic/ordered floating point values in Hotspot.  However, I'm told
> VarHandles may include atomic floating point values.
>
> * Conversions between integers and floating point values are presently
> implemented using the memcpy technique.  While this technique is
> correct, it is known to be quite poorly optimized by some compilers.
> It may be necessary to use the "union trick" for performance reasons,
> even though it is technically undefined behavior.  Hotspot already
> does this elsewhere (and we should merge some of the different
> conversion handling).
>
> * In the original RFR for JDK-8184334, conversions between different
> integral types of the same size (along with enums) were performed
> using
>
>    reinterpret_cast<ToType&>(from_value)
>
> This was based on discussion from several months ago, referring to
> C++03 3.10/15.  However, re-reading that section in response to
> discussion of the RFR for JDK-8184334, I now think that implementation
> is not supported in many cases needed here, and indeed invokes
> undefined behavior in those cases.  In particular, where 3.10/15 says
>    "- a type that is the signed or unsigned type corresponding to the
>       dynamic type of the object,"
>
> I interpret this to mean that if the dynamic type of an object is an
> integral type of a given rank, the value may be accessed using an
> lvalue of an integral type of the same rank and the opposite sign.
> (Using the same rank and sign is already permitted, since that's the
> same type.)  I don't think it allows access using an lvalue of an
> integral type with a different rank.  So, for example, a signed int
> value may be accessed via an unsigned int lvalue.  But access via a
> signed long lvalue is not permitted, even if int and long have the
> same size.
>
> I don't think there is a good way to perform the integral and enum
> conversions we need, without invoking implementation-defined,
> unspecified, or undefined behavior.
>
> ** The memcpy technique avoids such problems, but as noted above, is
> known to be quite poorly optimized by some compilers.
>
> ** Using static_cast avoids any undefined behavior, but does have
> cases of implementation-defined or unspecified behavior.  (When
> casting an unsigned value to a signed type, the result may be
> implementation defined; 5.2.9/2 and 4.7/3.  C++11 <type_traits> plus
> the reinterpret_cast to reference seems to provide possibly multi-step
> conversion paths that avoid such qualifications on all of the steps.)
> (When casting an integral value to an enum type, the result may be
> unspecified; 5.2.9/7.  With C++11 <type_traits>, the problem doesn't
> arise so long as the starting value is appropriate, e.g. we're
> reversing an enum to integer conversion.)
>
> ** Again here, we might consider the "union trick".

So the union trick is undefined behavior and reinterpret_cast<> is also 
undefined behavior (sometimes?). If that's the case, I think we should 
pick reinterpret_cast<> then because it's intended behavior is clearer.  
The union trick is really ugly.

Won't these undefined behaviors not cause bugs because we compile with 
the -fno-alias-pattern ?

Thanks,
Coleen
>
> I haven't changed the implementation from the JDK-8184334 RFR, but I'm
> presently leaning toward the "union trick", despite previous strong
> resistance to it.  Right now it seems to me to be the best option in
> practice, despite being disallowed by the standard.  It's really too
> bad the memcpy technique is so poorly supported by some compilers.
>
> CR:
> https://bugs.openjdk.java.net/browse/JDK-8186166
>
> Webrev:
> http://cr.openjdk.java.net/~kbarrett/8186166/hotspot.00/
>
> Testing:
> Ad hoc Hotspot nightly test.
>
>