RFR: 8186838: Generalize Atomic::inc/dec with templates
Erik Österlund
erik.osterlund at oracle.com
Fri Sep 1 13:31:24 UTC 2017
Hi Coleen,
On 2017-09-01 14:51, coleen.phillimore at oracle.com wrote:
>
>
> On 9/1/17 4:40 AM, Erik Österlund wrote:
>> Hi Coleen,
>>
>> Thank you for taking your time to review this.
>>
>> On 2017-09-01 02:03, coleen.phillimore at oracle.com wrote:
>>>
>>> Hi, I'm trying to parse the templates to review this but maybe it's
>>> convention but decoding these with parameters that are single
>>> capital letters make reading the template very difficult. There are
>>> already a lot of non-alphanumeric characters. When the letter is
>>> T, that is expected by convention, but D or especially I makes it
>>> really hard. Can these be normalized to all use T when there is
>>> only one template parameter? It'll be clear that T* is a pointer
>>> and T is an integer without having it be P.
>>
>> I apologize the names of the template parameters are hard to
>> understand. For what it's worth, I am only consistently applying
>> Kim's conventions here. It seemed like a bad idea to violate
>> conventions already set up - that would arguably be more confusing.
>>
>> The convention from earlier work by Kim is:
>> D: Type of destination
>> I: Operand type that has to be an integral type
>> P: Operand type that is a pointer element type
>> T: Generic operand type, may be integral or pointer type
>>
>> Personally, I do not mind this convention. It is more specific and
>> annotates things we know about the type into the name of the type.
>>
>> Do you want me to:
>>
>> 1) Keep the convention, now that I have explained what the convention
>> is and why it is your friend
>
> It is not my friend. It's not helpful. I have to go through
> multiple non-alphabetic characters looking for the letter I or the
> letter P to mentally make the substitution of the template type.
Okay. I understand now that the pre-existing naming convention of types
named I and P differentiating integral types from pointer types is not
helpful to you. And if I understand you correctly, you would like to
introduce a new naming convention that you find more helpful that uses
the more general type name T instead, regardless if it refers to an
integral type or a pointer type, and save the exercise of figuring out
whether it is intentionally constrained to be a pointer type or an
integral type to the reader by going to the declaration, and there
reading some kind of comment describing such properties in text instead?
Do we have a consensus that this new convention is indeed more desirable?
>
>> 2) Break the convention for this change only making the naming
>> inconsistent
>
> Break it for this changeset and we'll fix it later for the earlier
> work from Kim. I don't remember P and I in Kim's changeset but
> realized while looking at your changeset, this was one thing that
> makes these templates slower and more difficult to read.
Okay.
> In the case of cmpxchg templates with a source, destination and
> original values, it was necessary to have more than T be the template
> type, although unsatisfying, because it turned out that the types
> couldn't be the same.
Okay.
>
>> 3) Change the convention throughout consistently, including all
>> earlier work from Kim
>>
>>>
>>> +template<typename I>
>>> +struct Atomic::IncImpl<I, typename
>>> EnableIf<IsIntegral<I>::value>::type> VALUE_OBJ_CLASS_SPEC {
>>> + void operator()(I volatile* dest) const {
>>> + typedef IntegralConstant<I, I(1)> Adjustment;
>>> + typedef PlatformInc<sizeof(I), Adjustment> PlatformOp;
>>> + PlatformOp()(dest);
>>> + }
>>> +};
>>>
>>> This one isn't as difficult, because it's short, but it would be
>>> faster to understand with T.
>>>
>>> +template<typename T>
>>> +struct Atomic::IncImpl<T, typename
>>> EnableIf<IsIntegral<I>::value>::type> VALUE_OBJ_CLASS_SPEC {
>>> + void operator()(T volatile* dest) const {
>>> + typedef IntegralConstant<T, T(1)> Adjustment;
>>> + typedef PlatformInc<sizeof(T), Adjustment> PlatformOp;
>>> + PlatformOp()(dest);
>>> + }
>>> +};
>>>
>>> +template<>
>>> +struct Atomic::IncImpl<jshort> VALUE_OBJ_CLASS_SPEC {
>>> + void operator()(jshort volatile* dest) const {
>>> + add(jshort(1), dest);
>>> + }
>>> +};
>>>
>>>
>>> Did I already ask if this could be changed to u2 rather than
>>> jshort? Or is that the follow-on RFE?
>>
>> That is a follow-on RFE.
>
> Good. I think that's the one that I assigned to myself.
Yes, you are right.
>>
>>> +// Helper for platforms wanting a constant adjustment.
>>> +template<size_t byte_size, typename Adjustment>
>>> +struct Atomic::IncUsingConstant VALUE_OBJ_CLASS_SPEC {
>>> + typedef PlatformInc<byte_size, Adjustment> Derived;
>>>
>>>
>>> I can't find the caller of this. Is it really a lot faster than
>>> having the platform independent add(1, T) / add(-1, T) to make all
>>> this code worth having? How is this called? I couldn't parse the
>>> trick. Atomic::inc() is always a "constant adjustment" so I'm
>>> confused about what the comment means and what motivates all the asm
>>> code. Do these platform implementations exist because they don't
>>> have twos complement for integer representation? really?
>>
>> This is used by some x86, PPC and s390 platforms. Personally I
>> question its usefulness for x86. I believe it might be one of those
>> things were we ran some benchmarks a decade ago and concluded that it
>> was slightly faster to have a slimmed path for Atomic::inc rather
>> than reusing Atomic::add.
>
> Yes, there are a lot of optimizations that we slog along in the code
> base because they might have either theoretically or measurably made
> some difference in something we don't have anymore.
I noticed. :)
>
>>
>> I did not initially want to bring this up as it seems like none of my
>> business, but now that the question has been asked about differences,
>> I could not help but notice the advertised "leading sync" convention
>> of Atomic::inc on PPC is not respected. That is, there is no "sync"
>> fence before the atomic increment, as required by the specified
>> semantics. There is not even a leading "lwsync". The corresponding
>> Atomic::add operation though, does have leading lwsync (unlike
>> Atomic::inc). Now this should arguably be reinforced to sync rather
>> than lwsync to respect the advertised semantics of both Atomic::add
>> and Atomic::inc on PPC. Hopefully that statement will not turn into a
>> long unrelated mailing thread...
>
> Could you file an bug with this observation?
Sure.
>>
>> Conclusively though, there is definitely a substantial difference in
>> the fencing comparing the PPC implementation of Atomic::inc to
>> Atomic::add. Whether either one of them conforms to intended
>> semantics or not is a different matter - one that I was hoping not to
>> have to deal with in this RFE as I am merely templateifying what was
>> already there, without judging the existing specializations. And it
>> is my observation that as the code looks now, we would incur a bunch
>> of more fencing compared to what the code does today on PPC.
>>
>
> Completely understand. How are these called exactly though? I
> couldn't figure it out.
They are called like this:
IncImpl::operator() calls PlatformInc::operator(), which has its class
partially specialized by the platform (e.g. atomic_linux_pcc.hpp). Its
operator() is defined by the super class helper,
IncUsingConstant::operator(), that scales the addend accordingly and
subsequently calls the PlatformInc::inc function that is defined in the
PPC-specific atomic header and performs some suitable inline assembly
for the operation.
>
>>> Also, the function name This() is really disturbing and
>>> distracting. Can it be called some verb() representing what it
>>> does? cast_to_derived()?
>>>
>>> + template<typename I>
>>> + void operator()(I volatile* dest) const {
>>> + This()->template inc<I, Adjustment>(dest);
>>> + }
>>>
>>
>> Yes, I will change the name accordingly as you suggest.
>>
>>> I didn't know you could put "template" there.
>>
>> It is required to put the template keyword before the member function
>> name when calling a template member function with explicit template
>> parameters (as opposed to implicitly inferred template parameters) on
>> a template type.
>
> I thought you could just stay inc<type,type>() in the call, but my C++
> template vocabularly is minimal.
>>
>>> What does this call?
>>
>> This calls the platform-defined intrinsic that is defined in the
>> platform files - the one that contains the inline assembly.
>
> How? I don't see how... :(
Hopefully I already explained this above.
>>
>>> Rather than I for integer case, and P for pointer case, can you add
>>> a one line comment above this like:
>>> // Helper for integer types
>>> and
>>> // Helper for pointer types
>>
>> Or perhaps we could do both? Nevertheless, I will add these comments.
>> But as per the discussion above, I would be happy if we could keep
>> the convention that Kim has already set up for the template type names.
>>
>>> Small local comments would be really helpful for many of these
>>> functions. Just to get more english words in there... Since Kim's
>>> on vacation can you help me understand this code and add comments so
>>> I remember the reasons for some of this?
>>
>> Sure - I will decorate the code with some comments to help
>> understanding. I will send an updated webrev when I get your reply
>> regarding the typename naming convention verdict.
>
> That's my opinion anyway. David might have the opposite opinion.
David? I am curious if you have the same opinion. If you both want to
replace the template names I and P with T, then I am happy to do that.
Thanks for the review.
/Erik
> Thanks,
> Coleen
>
>>
>> Thanks for the review!
>>
>> /Erik
>>
>>>
>>> Thanks!
>>> Coleen
>>>
>>>
>>> On 8/31/17 8:45 AM, Erik Österlund wrote:
>>>> Hi everyone,
>>>>
>>>> Bug ID:
>>>> https://bugs.openjdk.java.net/browse/JDK-8186838
>>>>
>>>> Webrev:
>>>> http://cr.openjdk.java.net/~eosterlund/8186838/webrev.00/
>>>>
>>>> The time has come for the next step in generalizing Atomic with
>>>> templates. Today I will focus on Atomic::inc/dec.
>>>>
>>>> I have tried to mimic the new Kim style that seems to have been
>>>> universally accepted. Like Atomic::add and Atomic::cmpxchg, the
>>>> structure looks like this:
>>>>
>>>> Layer 1) Atomic::inc/dec calls an IncImpl()/DecImpl() function
>>>> object that performs some basic type checks.
>>>> Layer 2) IncImpl/DecImpl calls PlatformInc/PlatformDec that can
>>>> define the operation arbitrarily for a given platform. The default
>>>> implementation if not specialized for a platform is to call
>>>> Atomic::add. So only platforms that want to do something different
>>>> than that as an optimization have to provide a specialization.
>>>> Layer 3) Platforms that decide to specialize
>>>> PlatformInc/PlatformDec to be more optimized may inherit from a
>>>> helper class IncUsingConstant/DecUsingConstant. This helper helps
>>>> performing the necessary computation what the increment/decrement
>>>> should be after pointer scaling using CRTP. The
>>>> PlatformInc/PlatformDec operation then only needs to define an
>>>> inc/dec member function, and will then get all the context
>>>> information necessary to generate a more optimized implementation.
>>>> Easy peasy.
>>>>
>>>> It is worth noticing that the generalized Atomic::dec operation
>>>> assumes a two's complement integer machine and potentially sends
>>>> the unary negative of a potentially unsigned type to Atomic::add. I
>>>> have the following comments about this:
>>>> 1) We already assume in other code that two's complement integers
>>>> must be present.
>>>> 2) A machine that does not have two's complement integers may still
>>>> simply provide a specialization that solves the problem in a
>>>> different way.
>>>> 3) The alternative that does not make assumptions about that would
>>>> use the good old IntegerTypes::cast_to_signed metaprogramming
>>>> stuff, and I seem to recall we thought that was a bit too involved
>>>> and complicated.
>>>> This is the reason why I have chosen to use unary minus on the
>>>> potentially unsigned type in the shared helper code that sends the
>>>> decrement as an addend to Atomic::add.
>>>>
>>>> It would also be nice if somebody with access to PPC and s390
>>>> machines could try out the relevant changes there so I do not
>>>> accidentally break those platforms. I have blind-coded the addition
>>>> of the immediate values passed in to the inline assembly in a way
>>>> that I think looks like it should work.
>>>>
>>>> Testing:
>>>> RBT hs-tier3, JPRT --testset hotspot
>>>>
>>>> Thanks,
>>>> /Erik
>>>
>>
>
More information about the hotspot-dev
mailing list