RFR: 8186838: Generalize Atomic::inc/dec with templates

coleen.phillimore at oracle.com coleen.phillimore at oracle.com
Fri Sep 1 14:42:48 UTC 2017



On 9/1/17 10:15 AM, Erik Österlund wrote:
> Hi Andrew,
>
> On 2017-09-01 15:41, Andrew Haley wrote:
>> On 31/08/17 13:45, Erik Österlund wrote:
>>> Hi everyone,
>>>
>>> Bug ID:
>>> https://bugs.openjdk.java.net/browse/JDK-8186838
>>>
>>> Webrev:
>>> http://cr.openjdk.java.net/~eosterlund/8186838/webrev.00/
>>>
>>> The time has come for the next step in generalizing Atomic with
>>> templates. Today I will focus on Atomic::inc/dec.
>>>
>>> I have tried to mimic the new Kim style that seems to have been
>>> universally accepted. Like Atomic::add and Atomic::cmpxchg, the
>>> structure looks like this:
>>>
>>> Layer 1) Atomic::inc/dec calls an IncImpl()/DecImpl() function object
>>> that performs some basic type checks.
>>> Layer 2) IncImpl/DecImpl calls PlatformInc/PlatformDec that can define
>>> the operation arbitrarily for a given platform. The default
>>> implementation if not specialized for a platform is to call 
>>> Atomic::add.
>>> So only platforms that want to do something different than that as an
>>> optimization have to provide a specialization.
>>> Layer 3) Platforms that decide to specialize PlatformInc/PlatformDec to
>>> be more optimized may inherit from a helper class
>>> IncUsingConstant/DecUsingConstant. This helper helps performing the
>>> necessary computation what the increment/decrement should be after
>>> pointer scaling using CRTP. The PlatformInc/PlatformDec operation then
>>> only needs to define an inc/dec member function, and will then get all
>>> the context information necessary to generate a more optimized
>>> implementation. Easy peasy.
>> I wanted to say something nice, but I honestly can't.  I am dismayed.
>
> Okay.
>
>> I hoped that inc/dec would turn out to be much simpler than the
>> cmpxchg functions: I think they should, because they don't have to
>> deal with the complexity of potentially three different types.
>> Instead we have, again, a large and complex patch.
>>
>> Even on AArch64, which should be the simplest case because Atomic::inc
>> can be defined as
>>
>> template <typename T1>
>> inc(T1 *dest) {
>>    return __sync_add_and_fetch(dest, 1);
>> }
>
> AArch64 is indeed the simplest case. It does not have a specialization 
> in my patch. It simply expresses Atomic::inc in terms of Atomic::add.
>
>> or something similar, we have
>>
>> Atomic::inc<int>
>> Atomic::IncImpl<int, void>::operator()
>> Atomic::PlatformInc<4ul, IntegralConstant<int, 1> >::operator()
>> Atomic::add<int, int>
>> Atomic::AddImpl<int, int, void>::operator()
>> Atomic::AddAndFetch<Atomic::PlatformAdd<4ul> >::operator()<int, int>
>> Atomic::PlatformAdd<4ul>::add_and_fetch<int, int>
>> __sync_add_and_fetch
>>
>> I quite understand that it isn't so easy on some systems, and they
>> need a generic form that explodes into four different calls, one for
>> each size of integer.  I completely accept that it will be more
>> complex for everything else.  But is it necessary to have so much code
>> for something so simple?  This is a 1400 line patch.  Granted, much of
>> it is simply moving stuff around, but despite the potential of
>> template code to simplify the implementation we have a more complex
>> solution than we had before.
>>
>> I ask you, is this the simplest solution that you believe is possible?
>
> It is not the simplest solution I can think of. The simplest solution 
> I can think of is to remove all specialized versions of 
> Atomic::inc/dec and just have it call Atomic::add directly. That would 
> remove the optimizations we have today, for whatever reason we have 
> them. It would lead to slightly more conservative fencing on PPC/S390, 
> and would lead to slightly less optimal machine encoding on x86 
> (without immediate values in the instructions). But it would be 
> simpler for sure. I did not put any judgement into whether our 
> existing optimizations are worthwhile or not. But if you want to 
> prioritize simplicity, removing those optimizations is one possible 
> solution. Would you prefer that?

I wonder if you could remove the linux x86 asm code for inc/dec, recode 
it to use add, and do a dev submit run against your patch? While we're 
discussing this.

thanks,
Coleen

>
> Thanks,
> /Erik



More information about the hotspot-dev mailing list