RFR: 8186838: Generalize Atomic::inc/dec with templates

Fri Sep 1 08:40:06 UTC 2017

Hi Coleen,

Thank you for taking your time to review this.

On 2017-09-01 02:03, coleen.phillimore at oracle.com wrote:
>
> Hi, I'm trying to parse the templates to review this but maybe it's 
> convention but decoding these with parameters that are single capital 
> letters make reading the template very difficult.  There are already a 
> lot of non-alphanumeric characters.   When the letter is T, that is 
> expected by convention, but D or especially I makes it really hard.   
> Can these be normalized to all use T when there is only one template 
> parameter?  It'll be clear that T* is a pointer and T is an integer 
> without having it be P.

I apologize the names of the template parameters are hard to understand. 
For what it's worth, I am only consistently applying Kim's conventions 
here. It seemed like a bad idea to violate conventions already set up - 
that would arguably be more confusing.

The convention from earlier work by Kim is:
D: Type of destination
I: Operand type that has to be an integral type
P: Operand type that is a pointer element type
T: Generic operand type, may be integral or pointer type

Personally, I do not mind this convention. It is more specific and 
annotates things we know about the type into the name of the type.

Do you want me to:

1) Keep the convention, now that I have explained what the convention is 
and why it is your friend
2) Break the convention for this change only making the naming inconsistent
3) Change the convention throughout consistently, including all earlier 
work from Kim

>
> +template<typename I>
> +struct Atomic::IncImpl<I, typename 
> EnableIf<IsIntegral<I>::value>::type> VALUE_OBJ_CLASS_SPEC {
> + void operator()(I volatile* dest) const {
> + typedef IntegralConstant<I, I(1)> Adjustment;
> + typedef PlatformInc<sizeof(I), Adjustment> PlatformOp;
> + PlatformOp()(dest);
> + }
> +};
>
> This one isn't as difficult, because it's short, but it would be 
> faster to understand with T.
>
> +template<typename T>
> +struct Atomic::IncImpl<T, typename 
> EnableIf<IsIntegral<I>::value>::type> VALUE_OBJ_CLASS_SPEC {
> + void operator()(T volatile* dest) const {
> + typedef IntegralConstant<T, T(1)> Adjustment;
> + typedef PlatformInc<sizeof(T), Adjustment> PlatformOp;
> + PlatformOp()(dest);
> + }
> +};
>
> +template<>
> +struct Atomic::IncImpl<jshort> VALUE_OBJ_CLASS_SPEC {
> + void operator()(jshort volatile* dest) const {
> + add(jshort(1), dest);
> + }
> +};
>
>
> Did I already ask if this could be changed to u2 rather than jshort?  
> Or is that the follow-on RFE?

That is a follow-on RFE.

> +// Helper for platforms wanting a constant adjustment.
> +template<size_t byte_size, typename Adjustment>
> +struct Atomic::IncUsingConstant VALUE_OBJ_CLASS_SPEC {
> + typedef PlatformInc<byte_size, Adjustment> Derived;
>
>
> I can't find the caller of this.  Is it really a lot faster than 
> having the platform independent add(1, T) / add(-1, T) to make all 
> this code worth having?  How is this called?  I couldn't parse the 
> trick.  Atomic::inc() is always a "constant adjustment" so I'm 
> confused about what the comment means and what motivates all the asm 
> code.   Do these platform implementations exist because they don't 
> have twos complement for integer representation?  really?

This is used by some x86, PPC and s390 platforms. Personally I question 
its usefulness for x86. I believe it might be one of those things were 
we ran some benchmarks a decade ago and concluded that it was slightly 
faster to have a slimmed path for Atomic::inc rather than reusing 
Atomic::add.

I did not initially want to bring this up as it seems like none of my 
business, but now that the question has been asked about differences, I 
could not help but notice the advertised "leading sync" convention of 
Atomic::inc on PPC is not respected. That is, there is no "sync" fence 
before the atomic increment, as required by the specified semantics. 
There is not even a leading "lwsync". The corresponding Atomic::add 
operation though, does have leading lwsync (unlike Atomic::inc). Now 
this should arguably be reinforced to sync rather than lwsync to respect 
the advertised semantics of both Atomic::add and Atomic::inc on PPC. 
Hopefully that statement will not turn into a long unrelated mailing 
thread...

Conclusively though, there is definitely a substantial difference in the 
fencing comparing the PPC implementation of Atomic::inc to Atomic::add. 
Whether either one of them conforms to intended semantics or not is a 
different matter - one that I was hoping not to have to deal with in 
this RFE as I am merely templateifying what was already there, without 
judging the existing specializations. And it is my observation that as 
the code looks now, we would incur a bunch of more fencing compared to 
what the code does today on PPC.

> Also, the function name This() is really disturbing and distracting.  
> Can it be called some verb() representing what it does?  
> cast_to_derived()?
>
> + template<typename I>
> + void operator()(I volatile* dest) const {
> + This()->template inc<I, Adjustment>(dest);
> + }
>

Yes, I will change the name accordingly as you suggest.

> I didn't know you could put "template" there. 

It is required to put the template keyword before the member function 
name when calling a template member function with explicit template 
parameters (as opposed to implicitly inferred template parameters) on a 
template type.

> What does this call?

This calls the platform-defined intrinsic that is defined in the 
platform files - the one that contains the inline assembly.

> Rather than I for integer case, and P for pointer case, can you add a 
> one line comment above this like:
> // Helper for integer types
> and
> // Helper for pointer types

Or perhaps we could do both? Nevertheless, I will add these comments. 
But as per the discussion above, I would be happy if we could keep the 
convention that Kim has already set up for the template type names.

> Small local comments would be really helpful for many of these 
> functions.   Just to get more english words in there...  Since Kim's 
> on vacation can you help me understand this code and add comments so I 
> remember the reasons for some of this?

Sure - I will decorate the code with some comments to help 
understanding. I will send an updated webrev when I get your reply 
regarding the typename naming convention verdict.

Thanks for the review!

/Erik

>
> Thanks!
> Coleen
>
>
> On 8/31/17 8:45 AM, Erik Österlund wrote:
>> Hi everyone,
>>
>> Bug ID:
>> https://bugs.openjdk.java.net/browse/JDK-8186838
>>
>> Webrev:
>> http://cr.openjdk.java.net/~eosterlund/8186838/webrev.00/
>>
>> The time has come for the next step in generalizing Atomic with 
>> templates. Today I will focus on Atomic::inc/dec.
>>
>> I have tried to mimic the new Kim style that seems to have been 
>> universally accepted. Like Atomic::add and Atomic::cmpxchg, the 
>> structure looks like this:
>>
>> Layer 1) Atomic::inc/dec calls an IncImpl()/DecImpl() function object 
>> that performs some basic type checks.
>> Layer 2) IncImpl/DecImpl calls PlatformInc/PlatformDec that can 
>> define the operation arbitrarily for a given platform. The default 
>> implementation if not specialized for a platform is to call 
>> Atomic::add. So only platforms that want to do something different 
>> than that as an optimization have to provide a specialization.
>> Layer 3) Platforms that decide to specialize PlatformInc/PlatformDec 
>> to be more optimized may inherit from a helper class 
>> IncUsingConstant/DecUsingConstant. This helper helps performing the 
>> necessary computation what the increment/decrement should be after 
>> pointer scaling using CRTP. The PlatformInc/PlatformDec operation 
>> then only needs to define an inc/dec member function, and will then 
>> get all the context information necessary to generate a more 
>> optimized implementation. Easy peasy.
>>
>> It is worth noticing that the generalized Atomic::dec operation 
>> assumes a two's complement integer machine and potentially sends the 
>> unary negative of a potentially unsigned type to Atomic::add. I have 
>> the following comments about this:
>> 1) We already assume in other code that two's complement integers 
>> must be present.
>> 2) A machine that does not have two's complement integers may still 
>> simply provide a specialization that solves the problem in a 
>> different way.
>> 3) The alternative that does not make assumptions about that would 
>> use the good old IntegerTypes::cast_to_signed metaprogramming stuff, 
>> and I seem to recall we thought that was a bit too involved and 
>> complicated.
>> This is the reason why I have chosen to use unary minus on the 
>> potentially unsigned type in the shared helper code that sends the 
>> decrement as an addend to Atomic::add.
>>
>> It would also be nice if somebody with access to PPC and s390 
>> machines could try out the relevant changes there so I do not 
>> accidentally break those platforms. I have blind-coded the addition 
>> of the immediate values passed in to the inline assembly in a way 
>> that I think looks like it should work.
>>
>> Testing:
>> RBT hs-tier3, JPRT --testset hotspot
>>
>> Thanks,
>> /Erik
>