RFR: 8067790: Better support for native implementations of Atomic::cmpxchg

Tue Jan 6 09:24:31 UTC 2015

Thank you Jesper!

With Jespers help, I have some webrevs of my two proposals:

Code Minimum:
http://cr.openjdk.java.net/~jwilhelm/8067790/webrev.01/
Reuse Maximum:
http://cr.openjdk.java.net/~jwilhelm/8067790/webrev.02/

Thanks,
/Erik

On 5 jan 2015, at 21:05, Jesper Wilhelmsson <jesper.wilhelmsson at oracle.com<mailto:jesper.wilhelmsson at oracle.com>> wrote:

Hi Erik,

Attachments are stripped when sent to the list. If you send me two patches I can upload webrevs for the two versions.
/Jesper

Erik Österlund skrev den 5/1/15 18:41:
Hi David and Kim,

Thank you David for putting this into context.

If we want this to be done with the inheritance/template way (specifically the SelectBaseClass template, oh and other names are welcome, InjectOptionalBase?), then I see two options, so I implemented both so you can compare and pick a favourite. They both have exactly identical semantics: they check for the existence of an optional class to inject to the class hierarchy, and check that the inheritance isn’t fishy (the optional class for specialization must be a subclass of the required class for generalization).

I think it all comes down to reusability vs code size.

Approach 1. “Reusability Maximum”: Build SelectBaseClass in terms of reusable components resembling boost/C++11 as much as possible.
(Note: Since Kim wanted IsBaseOf to require Derived to be complete (defined) to mimic other libraries, I now force a compiler error if parameters are incomplete as per request and instead use her proposal for checking if classes are defined, thanks for that contribution)

Pros: 1) Many reusable metafunctions that can be reused in other contexts (for instance automatic closure specialization), and I believe we will have to include them sooner or later anyway. 2) Since they comply very closely with the C++11 standard, it will be easy to make a transition once all compilers support this some time in the distant past.
Cons: 380 LoC template infrastructure, including testing, for a perceivably small application (if not reused in the future).

Approach 2. “Code Minimum”: Get the same SelectBaseClass, but with minimal dependencies. Only 20 LoC for the necessary metafunction, some more for IsSame used only for testing purposes, and 156 LoC in total for the template infrastructure, making it much lighter.

Pros: Minimal amount of code for the change (less than half).
Cons: No reusable metafunctions for the future which we quite likely will need later.

I have attached the two solutions in this email so you can have a look and tell me which direction you want me to go before making a final webrev to make everyone happy.

Personally I like approach 1 better but if it is concerning to commit too much code and there are strong opinions about this, I’m happy with approach 2 as well.

Appendix with code for minimal implementation with no dependencies:

template <class Optional, class Required>
class SelectBaseClass {
  typedef char yes[1];
  typedef char no[2];

  template<typename T>
  static yes &check(Optional*, T);
  static no &check(Required*, int);

  struct OptionalCheckHost {
    operator Required*() const;
    operator Optional*();
  };

  template<bool optional_valid, typename R, typename O>
  struct OptionalCheck { typedef O type; };

  template<typename R, typename O>
  struct OptionalCheck<false, R, O> { typedef R type; };
public:
  typedef typename OptionalCheck<sizeof(check(OptionalCheckHost(), int())) == sizeof(yes), Required, Optional>::type type;
};

Thanks,
/Erik

On 05 Jan 2015, at 04:34, David Holmes <david.holmes at oracle.com<mailto:david.holmes at oracle.com>> wrote:

Sorry for the top-post but just wanted to reset the context a little here. Originally [1] I made the comment:

"It's fine to have generic shared approaches but there really needs to be a way to allow platform specific "overrides"."

For the actual original RFR I said [2]:

"Can we pause and give some more thought to a clean mechanism for
allowing a shared implementation if desired with the ability to override if desired. I really do not like to see CPU specific ifdefs being added to shared code. (And I would also not like to see all platforms being forced to reimplement this natively).

I'm not saying we will find a simple solution, but it would be nice if
we could get a few folk to think about it before proceeding with the
ifdefs :)"

Erik then proposed three alternative approaches [3] and the simple variant was chosen [4] and presented for review. However Paul Hohensee also made comments about an inheritance-based approach and Erik floated the first template-based variant [5] and there was some discussion with Paul and Kim. But we then ended up with the simple solution, leaving an inheritance-based solution for future work [6] (which is what is what is being discussed now).

This template-based meta-programming is not something I have any familiarity with. My initial thought is this seems overkill for the situation we have with the Atomic class - but as I said I'm ignorant of the technique being used here.

David
-----

[1] http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-September/015292.html
[2] http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-September/015303.html
[3] http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-September/015362.html
[4] http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-September/015401.html
[5] http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-October/015586.html
[6] http://mail.openjdk.java.net/pipermail/hotspot-dev/2014-November/015889.html

On 2/01/2015 12:07 AM, Erik Österlund wrote:
Hi Kim,

On 01 Jan 2015, at 07:42, Kim Barrett <kim.barrett at oracle.com<mailto:kim.barrett at oracle.com>> wrote:

On Dec 31, 2014, at 6:12 AM, Erik Österlund <erik.osterlund at lnu.se<mailto:erik.osterlund at lnu.se>> wrote:

The rest of these comments only matter if, contrary to my suggestion
above, some additional infrastructure is thought necessary.

This is what I originally thought too and hence the current solution. But voices said that using macros for this is, I quote, "that bad".

[…]

This scales with the number of AtomicPlatform definitions, rather than
the number of specialized functions as happens with the existing code.

So your proposal is to still use ugly macros but move it to the class level instead, even though at the moment there is really only one member function that needs it. That feels a bit meh to be honest. IMO if we want to do this right, why go half the way and still rely on ugly macros?

Please lay off with the perjorative "ugly macros".  The preprocessor
is part of C++, and isn't going away.  It's a tool that can be used,
or abused, and I think this is a fine use.  To me, "ugly" is adding
800+ lines of code / tests / comments to eliminate a dozen lines of
flag macro definitions and associated #ifdef’s.

Well people are entitled to have different opinions of course. As I already mentioned, I did this because other reviewers (as well as me) found it ugly to use macros for conditionally specializing, which is the whole motivation of doing this, and was well understood from the beginning. But I’m fine with leaving it as macros if this is in general preferred and opinions have suddenly changed in this matter.

Personally I don’t see how the number of rows of code define uglyness. Reusability, testing, documentation and well definedness in contracts/behaviours all lead to more lines of code, but I don’t think that equals uglyness. So while I disagree more code is uglier because it’s more, it’s all up to opinion and it’s quite pointless to discuss; you are entitled to think that is ugly.

If that additional infrastructure were amortized across more use-cases
then it might be more appealing.  I might even have some additional
use-cases for something along that line, though not in its present
form.  (More on that below.)

Yeah I specifically had in mind the wish to make a split between commercial extensions to open source code actually. This pattern allows easy optional class inheritance injection as a general design pattern to achieve this. And for my own purposes the ability to define asymmetric dekker synchronization to elide storeload fences when the platform (e.g. TSO) supports it but otherwise revert to a generalized implementation (using storeload fences). This is why I think a clean reusable strategy is almost always better because it is widely applicable, even if there would not be obvious other examples where it is useful.

I think the name "IsBaseOfAndDerived" would be better as
"IsBaseAndDerived”.  […]

The current implementation of IsBaseOfAndDerived (and therefore
IsBaseOf) may (silently) provide incorrect answers when called with
incomplete types. […]

The actual usage in this proposed change-set even relies on this
flawed behavior of these metafunctions, since AtomicPlatform isn't
always defined. […]

So what you are saying here is that I should rename IsBaseOfAndDerived to IsBaseAndDerived to more closely resemble other external libraries we do not use. But if I do that my behaviour is "incorrect" because it is not identical to that of the external library. And therefore my change would not work if it was made "correctly”.

The proposed IsBaseOfAndDerived and IsBaseOf are being described and
placed as general utilities.  Their names are strongly reminiscent of
existing facilities in other libraries, including the (C++11) standard
library. Conforming to these external standards will avoid surprises,
especially when the proposed facilities are even described as being
the same (except for a noted corner case that is both hard to deal
with in a portable manner and not actually very interesting, so I'm
not complaining about that) in the review request email.

Additionally, I think quietly providing a potentially incorrect answer
in the case of incomplete types renders the offered code quite broken
for use as a general utility.  I think the requirement for complete
types by the well-known / standard facilities is well justified.

While I disagree it is universally incorrect (as long as it is well defined and tested) to not have the exact same requirement on completeness on the parameters as C++11/boost, who also already have different behaviour in this exact regard as pointed out by yourself, I can try to fix it if it bugs you, but it will make the infrastructure even larger which seems to be a concern.

On the contrary this is a well tested feature on all our supported compilers and I did not intend to engineer it to be identical to some external library if it only causes trouble rather than helping. This is our own class and it can do what we want it to do and be called what we want it to be called, and I think that is fine as long as it is well tested, which it is. At least this is my opinion. Anyone else agree?

Problem is, I think it's well tested code to solve the wrong problem.

I suggest that what we're really looking for is not a base/derived
relationship between a pair of classes, but that we actually want to
determine whether a platform-specific class exists.

As a result, I think there is more infrastructure in this change set
than is actually needed.  The present implementation of
SelectBaseClass uses IsBaseOf, but doesn't need a lot of the
functionality of that metafunction. There is no actual need for a
Base/Derived relationship between the (possibly not defined)
platform-specific class and the default platform-generic class, so the
use of IsBaseOf is an unnecessary restriction that only
serendipitously tests for defined or not, due to the previously
discussed defect in this particular implementation.  I also think the
name SelectBaseClass is perhaps overly general.

The main idea of SelectBaseClass is to inject an optional class into the class hierarchy for specialization of generalized behaviour. It is absolutely critical (and intended) that this class is constrained to be a derived class of the fall-back class - it is not an artifact nor an accidental behaviour. In the hierarchy A is-maybe-a B is-a C, A must be able to rely on the superclass being at least C-like so that the behaviour of the superclass can be used reliably and transparently regardless of the existence of the optional B. If this contract would not hold, I think it could lead to some really weird and unreliable code which should not be encouraged (at least not according to the Liskov substitution principle etc).

So what I need in the contract is a check if the optional class is defined and if it fits in the inheritance hierarchy: they are both important IMO.

Yes it could have been less code. From the beginning I had a single trait class that checked for both the inheritance and whether the class is defined or not but you said you wanted the behaviour to more closely match that of third party libraries. This is why there is now a lot more code to more closely (but still not exactly) match that behaviour.

I might be able to use a similar approach in some code I've been
working on as a background task.  And while writing this reply I've
thought of another place that might benefit from something along those
lines.  Thinking about the code under review in that context, I think
some changes would make these other possible use-cases easier and
clearer.

I believe what is actually needed is a metatfunction something like:

/**
* Metafunction whose nested type member is Default if Specific is
* incomplete, otherwise Specific.
*
* Specific and Default must both be non-cv qualified class types, and
* must not be the same type.
*
* If Specific is incomplete at the point where this metafunction is
* invoked, it must never be defined elsewhere in the program.
*/
template<typename Specific, typename Default>
struct SelectPlatformBase;

[Note: I'm not wedded to that name.]

Now, it turns out to be hard / impossible to write a generic
is_complete<T> metafunction, due to ODR.  Whether a type is complete
can vary between translation units, and even within a translation
unit. Having is_complete<T>::value have different values depending on
where it is instantiated is an ODR violation.  (One can make progress
by using anonymous namespaces and macro wrappers with __LINE__
concatenation, but there is still the fundamental question of whether
is_complete<T> really even makes semantic sense.)

Yes and that makes me wonder if we care enough about the completeness semantics being the same as third party libraries (who are already inconsistent) if we do not need them to be, and on the contrary benefit from them not to be.

However, we don't need a fully general is_complete utility.  We have a
much more restricted use-case, where we want to select an optionally
defined platform-specific class if it exists, and otherwise fall back
to a more generic class.  And we're not using this in arbitrary
contexts, but only to select a platform-specialized implementation if
such exists.

Here's a lightly tested implementation of SelectPlatformBase:

// --- SelectPlatformBase

template<bool b, typename T = void>
struct EnableIf { typedef T type; };

template<typename T>
struct EnableIf<false, T> { };

template<bool Cond, typename Then, typename Else>
struct If { typedef Then type; };

template<typename Then, typename Else>
struct If<false, Then, Else> { typedef Else type; };

// Metafunction whose nested value member is true if T is defined
// (complete), and false if T is incomplete.  T must be a non-cv
// qualified class type.  If T is incomplete at the point where this
// metafunction is invoked, it must never be defined elsewhere in the
// program.
template<typename T>
class IsClassDefined {
 typedef char yes[1];
 typedef char no[2];

 template<typename U>
 static typename EnableIf<sizeof(U) == sizeof(U), yes>::type & check(U*);
 static no& check(...);

public:
 static const bool value = sizeof(check((T*)0)) == sizeof(yes);
};

template<typename Specific, typename Default>
struct SelectPlatformBase {
 typedef typename If<
   IsClassDefined<Specific>::value, Specific, Default>::type type;
};

// --- end SelectPlatformBase

That's ~30 lines of code (+ tests and comments TBD) to do precisely
what we need, replacing ~100 lines (+ tests and comments) that have
various issues.  (If and EnableIf should probably be hoisted out as
separate utilities.)  We don't actually need SelectPlatformBase, and
could instead just directly use the If and IsClassDefined
metafunctions, but giving this functionality a name might be clearer.

As stated, I think it’s dangerous to allow conditional inheritance where there is no inheritance constraint on the optional class. Therefore I do not think this is better, although I agree it is smaller.

I agree that our use case here is smaller. If the amount of code is the problem and less code (and hence fewer reusable components) is preferred, then the original implementation with a simple IsBaseOfAndDerived does the trick (without checking types/cv qualifiers) and could be done in just a few lines, and this is what I originally did before you told me to expand it to more closely resemble external libraries:

template<typename Base, typename Derived>
class IsBaseOfAndDerived {
  typedef char yes[1];
  typedef char no[2];

  template<typename T>
  static yes &check(Derived*, T);
  static no &check(Base*, int);

  template<typename B, typename D>
  struct IsBaseOfHost {
    operator B*() const;
    operator D*();
  };
public:
  static const bool value = sizeof(check(IsBaseOfHost<Base, Derived>(), int())) == sizeof(yes);
};

It checks the inheritance and existence of the Derived class which is exactly the constraints I need.

If you do not want to expose it publicly and remove the reusability because it does not have identical semantics as third party libraries regarding the requirements of the Derived type to be complete, it could be an internal class of SelectBaseClass.

I’m not going to value the reusability vs LoC, I leave that decision to you.

However, while I think there are other use cases for this specific
utility, I still think the proposed change set as a whole is adding a
lot of code just to avoid a a few lines of macro usage.  That seems
like a poor tradeoff to me.

Okay.

In the end, we have to make a choice - what is more important, that the traits resemble some external library or code size. Because last time I proposed a small change and you wanted it to be closer to C++11 behaviour and I made it closer to that at the expense of more code. Now you are saying I have to make it even tighter to C++11 (or boost?), but also don’t want more lines of code which I believe is impossible given that I want to constrain both the class hierarchy to be reliable and check the existence of the optional class. This brings me to a few architectural questions which I may have (strong) opinions about but are not up to me to decide.

1. Do we want to replace the macros with template metaprogramming? 2 reviewers (and me) liked the metaprogramming approach and it was implemented because they (like me) did not want the macros. But you have a different opinion. I can’t judge who is right and wrong here.

2. Do we want to minimize the code size at the cost of reusability by making contracts of dependent components weaker and putting them as private classes in the SelectBaseClass?

3. If we do want template metaprogramming and want reusable components, the question is, is it important that the specification of completeness (which in general is almost impossible to implement correctly in a portable way as you pointed out) of the template arguments in IsBaseOf is identical as C++11/boost even though no code needs it to be and on the contrary some code benefits from it not to be? In that case which one of them do we, because they are already different anyway? There is a tradeoff between code size and complexity vs copying either one of the external libraries regarding requirements of completeness of template parameters.

Thanks,
/Erik