RFR: 8186838: Generalize Atomic::inc/dec with templates

David Holmes david.holmes at oracle.com
Fri Sep 1 21:51:17 UTC 2017


> David? I am curious if you have the same opinion. If you both want to replace the template names I and P with T, then I am happy to do that. 

I don't mind the P, I convention, but probably would not miss it either. 
So I'm on the fence.

David
-----

On 1/09/2017 11:31 PM, Erik Österlund wrote:
> Hi Coleen,
> 
> On 2017-09-01 14:51, coleen.phillimore at oracle.com wrote:
>>
>>
>> On 9/1/17 4:40 AM, Erik Österlund wrote:
>>> Hi Coleen,
>>>
>>> Thank you for taking your time to review this.
>>>
>>> On 2017-09-01 02:03, coleen.phillimore at oracle.com wrote:
>>>>
>>>> Hi, I'm trying to parse the templates to review this. Maybe it's 
>>>> convention, but decoding these with parameters that are single 
>>>> capital letters makes reading the templates very difficult.  There are 
>>>> already a lot of non-alphanumeric characters.   When the letter is 
>>>> T, that is expected by convention, but D or especially I makes it 
>>>> really hard.   Can these be normalized to all use T when there is 
>>>> only one template parameter?  It'll be clear that T* is a pointer 
>>>> and T is an integer without having it be P.
>>>
>>> I apologize that the names of the template parameters are hard to 
>>> understand. For what it's worth, I am only consistently applying 
>>> Kim's conventions here. It seemed like a bad idea to violate 
>>> conventions that are already set up - that would arguably be more 
>>> confusing.
>>>
>>> The convention from earlier work by Kim is:
>>> D: Type of destination
>>> I: Operand type that has to be an integral type
>>> P: Operand type that is a pointer element type
>>> T: Generic operand type, may be integral or pointer type
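>>>
>>> For illustration, this is roughly how the convention reads in a 
>>> declaration (a hypothetical sketch, not the exact webrev code):
>>>
>>> template<typename I, typename D>
>>> static D add(I add_value, D volatile* dest);  // I: integral addend, D: destination
>>>
>>> template<typename P>
>>> static void inc(P* volatile* dest);           // P: pointee type of a pointer destination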
>>>
>>> Personally, I do not mind this convention. It is more specific and 
>>> encodes what we know about the type into the name of the type.
>>>
>>> Do you want me to:
>>>
>>> 1) Keep the convention, now that I have explained what the convention 
>>> is and why it is your friend
>>
>> It is not my friend.  It's not helpful.   I have to go through 
>> multiple non-alphabetic characters looking for the letter I or the 
>> letter P to mentally make the substitution of the template type.
> 
> Okay. I understand now that the pre-existing naming convention - types 
> named I and P to differentiate integral types from pointer types - is 
> not helpful to you. If I understand you correctly, you would like to 
> introduce a new naming convention that you find more helpful, using the 
> more general type name T regardless of whether it refers to an integral 
> type or a pointer type, and leave it to the reader to figure out 
> whether the type is intentionally constrained to be a pointer type or 
> an integral type by going to the declaration and reading a comment 
> describing that property in text instead?
> 
> Do we have a consensus that this new convention is indeed more desirable?
> 
>>
>>> 2) Break the convention for this change only making the naming 
>>> inconsistent
>>
>> Break it for this changeset and we'll fix the earlier work from Kim 
>> later.  I don't remember P and I in Kim's changeset, but while looking 
>> at your changeset I realized this was one thing that makes these 
>> templates slower and more difficult to read.
> 
> Okay.
> 
>> In the case of the cmpxchg templates, with source, destination and 
>> original values, it was necessary to have more than just T as the 
>> template type - unsatisfying, but it turned out that the types 
>> couldn't be the same.
> 
> Okay.
> 
>>
>>> 3) Change the convention throughout consistently, including all 
>>> earlier work from Kim
>>>
>>>>
>>>> +template<typename I>
>>>> +struct Atomic::IncImpl<I, typename EnableIf<IsIntegral<I>::value>::type> VALUE_OBJ_CLASS_SPEC {
>>>> +  void operator()(I volatile* dest) const {
>>>> +    typedef IntegralConstant<I, I(1)> Adjustment;
>>>> +    typedef PlatformInc<sizeof(I), Adjustment> PlatformOp;
>>>> +    PlatformOp()(dest);
>>>> +  }
>>>> +};
>>>>
>>>> This one isn't as difficult, because it's short, but it would be 
>>>> faster to understand with T.
>>>>
>>>> +template<typename T>
>>>> +struct Atomic::IncImpl<T, typename EnableIf<IsIntegral<T>::value>::type> VALUE_OBJ_CLASS_SPEC {
>>>> +  void operator()(T volatile* dest) const {
>>>> +    typedef IntegralConstant<T, T(1)> Adjustment;
>>>> +    typedef PlatformInc<sizeof(T), Adjustment> PlatformOp;
>>>> +    PlatformOp()(dest);
>>>> +  }
>>>> +};
>>>>
>>>> +template<>
>>>> +struct Atomic::IncImpl<jshort> VALUE_OBJ_CLASS_SPEC {
>>>> +  void operator()(jshort volatile* dest) const {
>>>> +    add(jshort(1), dest);
>>>> +  }
>>>> +};
>>>>
>>>>
>>>> Did I already ask if this could be changed to u2 rather than 
>>>> jshort?  Or is that the follow-on RFE?
>>>
>>> That is a follow-on RFE.
>>
>> Good.  I think that's the one that I assigned to myself.
> 
> Yes, you are right.
> 
>>>
>>>> +// Helper for platforms wanting a constant adjustment.
>>>> +template<size_t byte_size, typename Adjustment>
>>>> +struct Atomic::IncUsingConstant VALUE_OBJ_CLASS_SPEC {
>>>> +  typedef PlatformInc<byte_size, Adjustment> Derived;
>>>>
>>>>
>>>> I can't find the caller of this.  Is it really a lot faster than 
>>>> having the platform-independent add(1, T) / add(-1, T) to make all 
>>>> this code worth having?  How is this called?  I couldn't parse the 
>>>> trick.  Atomic::inc() is always a "constant adjustment", so I'm 
>>>> confused about what the comment means and what motivates all the asm 
>>>> code.   Do these platform implementations exist because they don't 
>>>> have two's complement integer representation?  Really?
>>>
>>> This is used by some x86, PPC and s390 platforms. Personally I 
>>> question its usefulness for x86. I believe it might be one of those 
>>> things where we ran some benchmarks a decade ago and concluded that it 
>>> was slightly faster to have a slimmed path for Atomic::inc rather 
>>> than reusing Atomic::add.
>>
>> Yes, there are a lot of optimizations that we drag along in the code 
>> base because they might have, either theoretically or measurably, made 
>> some difference on something we don't have anymore.
> 
> I noticed. :)
> 
>>
>>>
>>> I did not initially want to bring this up as it seems like none of my 
>>> business, but now that the question has been asked about differences, 
>>> I could not help but notice that the advertised "leading sync" 
>>> convention of Atomic::inc on PPC is not respected. That is, there is 
>>> no "sync" fence before the atomic increment, as required by the 
>>> specified semantics. There is not even a leading "lwsync". The 
>>> corresponding Atomic::add operation, though, does have a leading 
>>> lwsync (unlike Atomic::inc). Now this should arguably be strengthened 
>>> to sync rather than lwsync to respect the advertised semantics of 
>>> both Atomic::add and Atomic::inc on PPC. Hopefully that statement 
>>> will not turn into a long unrelated mailing thread...
>>
>> Could you file a bug with this observation?
> 
> Sure.
> 
>>>
>>> In any case, there is definitely a substantial difference in the 
>>> fencing when comparing the PPC implementation of Atomic::inc to 
>>> Atomic::add. Whether either one of them conforms to the intended 
>>> semantics is a different matter - one that I was hoping not to have 
>>> to deal with in this RFE, as I am merely templateifying what was 
>>> already there, without judging the existing specializations. And it 
>>> is my observation that, as the code looks now, we would incur a 
>>> bunch more fencing compared to what the code does today on PPC.
>>>
>>
>> Completely understand.   How are these called exactly though?  I 
>> couldn't figure it out.
> 
> They are called like this:
> IncImpl::operator() calls PlatformInc::operator(), which has its class 
> partially specialized by the platform (e.g. atomic_linux_ppc.hpp). Its 
> operator() is defined by the super class helper, 
> IncUsingConstant::operator(), which scales the addend accordingly and 
> subsequently calls the PlatformInc::inc function that is defined in the 
> PPC-specific atomic header and performs some suitable inline assembly 
> for the operation.
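> 
> In trimmed-down form, the wiring looks roughly like this (a sketch of 
> the CRTP pattern, not the exact webrev code; the scaling of the 
> adjustment for pointer types is omitted):
> 
> // Shared helper; Derived is the platform's PlatformInc specialization.
> template<size_t byte_size, typename Adjustment>
> struct Atomic::IncUsingConstant VALUE_OBJ_CLASS_SPEC {
>   typedef PlatformInc<byte_size, Adjustment> Derived;
> 
>   template<typename I>
>   void operator()(I volatile* dest) const {
>     // Downcast to the platform specialization and call its inc member,
>     // which contains the platform-specific inline assembly.
>     static_cast<const Derived*>(this)->template inc<I, Adjustment>(dest);
>   }
> };
> 
> // Platform specialization, e.g. in the PPC-specific atomic header.
> template<typename Adjustment>
> struct Atomic::PlatformInc<4, Adjustment>
>   : Atomic::IncUsingConstant<4, Adjustment> {
>   template<typename I, typename A>
>   void inc(I volatile* dest) const {
>     // inline assembly for a 4-byte atomic increment goes here
>   }
> };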
> 
>>
>>>> Also, the function name This() is really disturbing and 
>>>> distracting.  Can it be called some verb() representing what it 
>>>> does?  cast_to_derived()?
>>>>
>>>> + template<typename I>
>>>> + void operator()(I volatile* dest) const {
>>>> +   This()->template inc<I, Adjustment>(dest);
>>>> + }
>>>>
>>>
>>> Yes, I will change the name accordingly as you suggest.
>>>
>>>> I didn't know you could put "template" there. 
>>>
>>> It is required to put the template keyword before the member function 
>>> name when calling a template member function with explicit template 
>>> parameters (as opposed to implicitly inferred template parameters) on 
>>> an object whose type depends on a template parameter.
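>>>
>>> A minimal stand-alone example of the rule (hypothetical names, not 
>>> code from the patch):
>>>
>>> struct Helper {
>>>   template<typename T> void inc(T volatile* dest) const { *dest = *dest + 1; }
>>> };
>>>
>>> template<typename D>  // D is a template parameter, so d's type is dependent
>>> void call_inc(const D* d, int volatile* dest) {
>>>   d->template inc<int>(dest);  // without "template", '<' would parse as less-than
>>> }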
>>
>> I thought you could just say inc<type,type>() in the call, but my C++ 
>> template vocabulary is minimal.
>>>
>>>> What does this call?
>>>
>>> This calls the platform-defined intrinsic that is defined in the 
>>> platform files - the one that contains the inline assembly.
>>
>> How?  I don't see how...  :(
> 
> Hopefully I already explained this above.
> 
>>>
>>>> Rather than I for the integer case and P for the pointer case, can 
>>>> you add a one-line comment above these like:
>>>> // Helper for integer types
>>>> and
>>>> // Helper for pointer types
>>>
>>> Or perhaps we could do both? Nevertheless, I will add these comments. 
>>> But as per the discussion above, I would be happy if we could keep 
>>> the convention that Kim has already set up for the template type names.
>>>
>>>> Small local comments would be really helpful for many of these 
>>>> functions.   Just to get more English words in there...  Since Kim's 
>>>> on vacation, can you help me understand this code and add comments so 
>>>> I remember the reasons for some of this?
>>>
>>> Sure - I will decorate the code with some comments to help 
>>> understanding. I will send an updated webrev when I get your reply 
>>> regarding the typename naming convention verdict.
>>
>> That's my opinion anyway.   David might have the opposite opinion.
> 
> David? I am curious if you have the same opinion. If you both want to 
> replace the template names I and P with T, then I am happy to do that.
> 
> Thanks for the review.
> 
> /Erik
> 
>> Thanks,
>> Coleen
>>
>>>
>>> Thanks for the review!
>>>
>>> /Erik
>>>
>>>>
>>>> Thanks!
>>>> Coleen
>>>>
>>>>
>>>> On 8/31/17 8:45 AM, Erik Österlund wrote:
>>>>> Hi everyone,
>>>>>
>>>>> Bug ID:
>>>>> https://bugs.openjdk.java.net/browse/JDK-8186838
>>>>>
>>>>> Webrev:
>>>>> http://cr.openjdk.java.net/~eosterlund/8186838/webrev.00/
>>>>>
>>>>> The time has come for the next step in generalizing Atomic with 
>>>>> templates. Today I will focus on Atomic::inc/dec.
>>>>>
>>>>> I have tried to mimic the new Kim style that seems to have been 
>>>>> universally accepted. Like Atomic::add and Atomic::cmpxchg, the 
>>>>> structure looks like this:
>>>>>
>>>>> Layer 1) Atomic::inc/dec calls an IncImpl()/DecImpl() function 
>>>>> object that performs some basic type checks.
>>>>> Layer 2) IncImpl/DecImpl calls PlatformInc/PlatformDec, which can 
>>>>> define the operation arbitrarily for a given platform. If not 
>>>>> specialized for a platform, the default implementation calls 
>>>>> Atomic::add. So only platforms that want to do something different 
>>>>> than that as an optimization have to provide a specialization.
>>>>> Layer 3) Platforms that decide to specialize 
>>>>> PlatformInc/PlatformDec to be more optimized may inherit from a 
>>>>> helper class IncUsingConstant/DecUsingConstant. This helper performs 
>>>>> the necessary computation of what the increment/decrement should be 
>>>>> after pointer scaling, using CRTP. The PlatformInc/PlatformDec 
>>>>> operation then only needs to define an inc/dec member function, and 
>>>>> gets all the context information necessary to generate a more 
>>>>> optimized implementation. Easy peasy.
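>>>>>
>>>>> For reference, the unspecialized layer 2 default is roughly the 
>>>>> following (a sketch; the webrev has the authoritative version):
>>>>>
>>>>> template<size_t byte_size, typename Adjustment>
>>>>> struct Atomic::PlatformInc VALUE_OBJ_CLASS_SPEC {
>>>>>   template<typename I>
>>>>>   void operator()(I volatile* dest) const {
>>>>>     // No platform specialization: fall back to the generic Atomic::add.
>>>>>     Atomic::add(I(Adjustment::value), dest);
>>>>>   }
>>>>> };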
>>>>>
>>>>> It is worth noting that the generalized Atomic::dec operation 
>>>>> assumes a two's complement integer machine and potentially sends 
>>>>> the unary negation of a possibly unsigned type to Atomic::add. I 
>>>>> have the following comments about this:
>>>>> 1) We already assume in other code that two's complement integers 
>>>>> must be present.
>>>>> 2) A machine that does not have two's complement integers may still 
>>>>> simply provide a specialization that solves the problem in a 
>>>>> different way.
>>>>> 3) The alternative that does not make assumptions about that would 
>>>>> use the good old IntegerTypes::cast_to_signed metaprogramming 
>>>>> stuff, and I seem to recall we thought that was a bit too involved 
>>>>> and complicated.
>>>>> This is the reason why I have chosen to use unary minus on the 
>>>>> potentially unsigned type in the shared helper code that sends the 
>>>>> decrement as an addend to Atomic::add.
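>>>>>
>>>>> As a small illustration of what the helper relies on (plain C++, 
>>>>> assuming a 32-bit unsigned int; not code from the patch):
>>>>>
>>>>> unsigned int x = 10;
>>>>> unsigned int minus_one = -1u;  // wraps to 0xFFFFFFFF (modular arithmetic)
>>>>> x += minus_one;                // x is now 9
>>>>>
>>>>> On a two's complement machine the same all-ones bit pattern also 
>>>>> decrements a signed destination, which is why the shared helper can 
>>>>> simply negate the (possibly unsigned) adjustment and hand it to 
>>>>> Atomic::add.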
>>>>>
>>>>> It would also be nice if somebody with access to PPC and s390 
>>>>> machines could try out the relevant changes there so I do not 
>>>>> accidentally break those platforms. I have blind-coded the addition 
>>>>> of the immediate values passed in to the inline assembly in a way 
>>>>> that I think looks like it should work.
>>>>>
>>>>> Testing:
>>>>> RBT hs-tier3, JPRT --testset hotspot
>>>>>
>>>>> Thanks,
>>>>> /Erik
>>>>
>>>
>>
> 

