RFR: 8186838: Generalize Atomic::inc/dec with templates
Erik Österlund
erik.osterlund at oracle.com
Mon Sep 4 07:15:02 UTC 2017
Hi David,
On 2017-09-04 03:24, David Holmes wrote:
> Hi Erik,
>
> On 1/09/2017 8:49 PM, Erik Österlund wrote:
>> Hi David,
>>
>> The shared structure for all operations is the following:
>>
>> An Atomic::something call creates a SomethingImpl function object
>> that performs some basic type checking and then forwards the call
>> straight to a PlatformSomething function object. This
>> PlatformSomething object could decide to do anything. But to make
>> life easier, it may inherit from a shared SomethingHelper function
>> object with CRTP that calls back into the PlatformSomething function
>> object to emit inline assembly.
>
> Right, but! Lets look at some details.
>
> Atomic::add
> AddImpl
> PlatformAdd<size>
> FetchAndAdd
> AddAndFetch
> add_using_helper
>
> Atomic::cmpxchg
> CmpxchgImpl
> PlatformCmpxchg<size>
> cmpxchg_using_helper
>
> Atomic::inc
> IncImpl
> PlatformInc<size, Adjustment>
> IncUsingConstant
>
> Why is it that the simplest operation (inc/dec) has the most complex
> platform template definition? Why do we need Adjustment? You
> previously said "Adjustment represents the increment/decrement value
> as an IntegralConstant - your template friend for passing around a
> constant with both a specified type and value in templates". But add
> passes around values and doesn't need this. Further inc/dec don't need
> to pass anything around anywhere - inc adds 1, dec subtracts 1! This
> "1" does not need to appear anywhere in the API or get passed across
> layers - the only place this "1" becomes evident is in the actual
> platform asm that does the logic of "add 1" or "subtract 1".
>
> My understanding from previous discussions is that much of the
> template machinations was to deal with type management for "dest" and
> the values being passed around. But here, for inc/dec there are no
> values being passed so we don't have to make "dest" type-compatible
> with any value.
Dealing with different types being passed in is one part of the problem
- a problem that almost all operations seems to have. But Atomic::add
and inc/dec have more problems to deal with.
The Atomic::add operation has two more problems that cmpxchg does not have.
1) It needs to scale pointer arithmetic. So if you have a P* and you add
it by 2, then you really add the underlying value by 2 * sizeof(P), and
the scaled addend needs to be of the right type - the type of the
destination for integral types and ptrdiff_t for pointers. This is
similar semantics to ++pointer.
2) It connects backends with different semantics - either fetch_and_add
or add_and_fetch to a common public interface with add_and_fetch semantics.
This is the reason that Atomic::add might appear more complicated than
Atomic::cmpxchg. Because Atomic::cmpxchg only had the different type
problems to deal with - no pointer arithmetics.
The reason why Atomic::inc/dec looks more complicated than Atomic::add
is that it needs to preserve the pointer arithmetic as constants rather
than values, because the scaled addend is embedded in the inline
assembly as immediate values. Therefore it passes around an
IntegralConstant that embeds both the type and size of the addend. And
it is not just 1/-1. For integral destinations the constant used is 1/-1
of the type stored at the destination. For pointers the constant is
ptrdiff_t with a value representing the size of the element pointed to.
Having said that - I am not opposed to simply removing the
specializations of inc/dec if we are scared of the complexity of passing
this constant to the platform layer. After running a bunch of benchmarks
over the weekend, it showed no significant regressions after removal.
Now of course that might not tell the full story - it could have missed
that some critical operation in the JVM takes longer. But I would be
very surprised if that was the case.
Thanks,
/Erik
>
> Cheers,
> David
> -----
>
>> Hope this explanation helps understanding the intended structure of
>> this work.
>>
>> Thanks,
>> /Erik
>>
>> On 2017-09-01 12:34, David Holmes wrote:
>>> Hi Erik,
>>>
>>> I just wanted to add that I would expect the cmpxchg, add and inc,
>>> Atomic API's to all require similar basic structure for manipulating
>>> types/values etc, yet all three seem to have quite different
>>> structures that I find very confusing. I'm still at a loss to fathom
>>> the CRTP and the hoops we seemingly have to jump through just to add
>>> or subtract 1!!!
>>>
>>> Cheers,
>>> David
>>>
>>> On 1/09/2017 7:29 PM, Erik Österlund wrote:
>>>> Hi David,
>>>>
>>>> On 2017-09-01 02:49, David Holmes wrote:
>>>>> Hi Erik,
>>>>>
>>>>> Sorry but this one is really losing me.
>>>>>
>>>>> What is the role of Adjustment ??
>>>>
>>>> Adjustment represents the increment/decrement value as an
>>>> IntegralConstant - your template friend for passing around a
>>>> constant with both a specified type and value in templates. The
>>>> type of the increment/decrement is the type of the destination when
>>>> the destination is an integral type, otherwise if it is a pointer
>>>> type, the increment/decrement type is ptrdiff_t.
>>>>
>>>>> How are inc/dec anything but "using constant" ??
>>>>
>>>> I was also a bit torn on that name (I assume you are referring to
>>>> IncUsingConstant/DecUsingConstant). It was hard to find a name that
>>>> depicted what this platform helper does. I considered calling the
>>>> helper something with immediate in the name because it is really
>>>> used to embed the constant as immediate values in inline assembly
>>>> today. But then again that seemed too specific, as it is not
>>>> completely obvious platform specializations will use it in that
>>>> way. One might just want to specialize this to send it into some
>>>> compiler Atomic::inc intrinsic for example. Do you have any other
>>>> preferred names? Here are a few possible names for IncUsingConstant:
>>>>
>>>> IncUsingScaledConstant
>>>> IncUsingAdjustedConstant
>>>> IncUsingPlatformHelper
>>>>
>>>> Any favourites?
>>>>
>>>>> Why do we special case jshort??
>>>>
>>>> To be consistent with the special case of Atomic::add on jshort. Do
>>>> you want it removed?
>>>>
>>>>> This is indecipherable to normal people ;-)
>>>>>
>>>>> This()->template inc<I, Adjustment>(dest);
>>>>>
>>>>> For something as trivial as adding or subtracting 1 the template
>>>>> machinations here are just mind boggling!
>>>>
>>>> This uses the CRTP (Curiously Recurring Template Pattern) C++
>>>> idiom. The idea is to devirtualize a virtual call by passing in the
>>>> derived type as a template parameter to a base class, and then let
>>>> the base class static_cast to the derived class to devirtualize the
>>>> call. I hope this explanation sheds some light on what is going on.
>>>> The same CRTP idiom was used in the Atomic::add implementation in a
>>>> similar fashion.
>>>>
>>>> I will add some comments describing this in the next round after
>>>> Coleen replies.
>>>>
>>>> Thanks for looking at this.
>>>>
>>>> /Erik
>>>>
>>>>>
>>>>> Cheers,
>>>>> David
>>>>>
>>>>> On 31/08/2017 10:45 PM, Erik Österlund wrote:
>>>>>> Hi everyone,
>>>>>>
>>>>>> Bug ID:
>>>>>> https://bugs.openjdk.java.net/browse/JDK-8186838
>>>>>>
>>>>>> Webrev:
>>>>>> http://cr.openjdk.java.net/~eosterlund/8186838/webrev.00/
>>>>>>
>>>>>> The time has come for the next step in generalizing Atomic with
>>>>>> templates. Today I will focus on Atomic::inc/dec.
>>>>>>
>>>>>> I have tried to mimic the new Kim style that seems to have been
>>>>>> universally accepted. Like Atomic::add and Atomic::cmpxchg, the
>>>>>> structure looks like this:
>>>>>>
>>>>>> Layer 1) Atomic::inc/dec calls an IncImpl()/DecImpl() function
>>>>>> object that performs some basic type checks.
>>>>>> Layer 2) IncImpl/DecImpl calls PlatformInc/PlatformDec that can
>>>>>> define the operation arbitrarily for a given platform. The
>>>>>> default implementation if not specialized for a platform is to
>>>>>> call Atomic::add. So only platforms that want to do something
>>>>>> different than that as an optimization have to provide a
>>>>>> specialization.
>>>>>> Layer 3) Platforms that decide to specialize
>>>>>> PlatformInc/PlatformDec to be more optimized may inherit from a
>>>>>> helper class IncUsingConstant/DecUsingConstant. This helper helps
>>>>>> performing the necessary computation what the increment/decrement
>>>>>> should be after pointer scaling using CRTP. The
>>>>>> PlatformInc/PlatformDec operation then only needs to define an
>>>>>> inc/dec member function, and will then get all the context
>>>>>> information necessary to generate a more optimized
>>>>>> implementation. Easy peasy.
>>>>>>
>>>>>> It is worth noticing that the generalized Atomic::dec operation
>>>>>> assumes a two's complement integer machine and potentially sends
>>>>>> the unary negative of a potentially unsigned type to Atomic::add.
>>>>>> I have the following comments about this:
>>>>>> 1) We already assume in other code that two's complement integers
>>>>>> must be present.
>>>>>> 2) A machine that does not have two's complement integers may
>>>>>> still simply provide a specialization that solves the problem in
>>>>>> a different way.
>>>>>> 3) The alternative that does not make assumptions about that
>>>>>> would use the good old IntegerTypes::cast_to_signed
>>>>>> metaprogramming stuff, and I seem to recall we thought that was a
>>>>>> bit too involved and complicated.
>>>>>> This is the reason why I have chosen to use unary minus on the
>>>>>> potentially unsigned type in the shared helper code that sends
>>>>>> the decrement as an addend to Atomic::add.
>>>>>>
>>>>>> It would also be nice if somebody with access to PPC and s390
>>>>>> machines could try out the relevant changes there so I do not
>>>>>> accidentally break those platforms. I have blind-coded the
>>>>>> addition of the immediate values passed in to the inline assembly
>>>>>> in a way that I think looks like it should work.
>>>>>>
>>>>>> Testing:
>>>>>> RBT hs-tier3, JPRT --testset hotspot
>>>>>>
>>>>>> Thanks,
>>>>>> /Erik
>>>>
>>
More information about the hotspot-dev
mailing list