RFR: 8236778: Add Atomic::fetch_and_add

Fri Jan 10 02:05:01 UTC 2020

Hi Stefan,

On 10/01/2020 8:17 am, Stefan Karlsson wrote:
> Updated webrev:
>   https://cr.openjdk.java.net/~stefank/8236778/webrev.02.delta
>   https://cr.openjdk.java.net/~stefank/8236778/webrev.02

That all seems okay to me (not that I fully understand the details of 
the template code). I'll leave you and Kim to work out the details of 
the commentary.

Thanks,
David

> Comments below:
> 
> On 2020-01-09 01:00, Kim Barrett wrote:
>>> On Jan 8, 2020, at 10:34 AM, Stefan Karlsson 
>>> <stefan.karlsson at oracle.com> wrote:
>>>
>>> Hi all,
>>>
>>> Please review this patch to introduce Atomic::fetch_and_add.
>>>
>>> https://cr.openjdk.java.net/~stefank/8236778/webrev.01
>>> https://bugs.openjdk.java.net/browse/JDK-8236778
>>>
>>> There are a number of places where we have this pattern:
>>> int result = Atomic::add(_index, amount) - amount;
>>>
>>> I'd like to introduce Atomic::fetch_and_add so that we can write:
>>> int result = Atomic::fetch_and_add(_index, amount);
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/atomic.hpp
>> Removed:
>>   240   // - platform_add is an object of type PlatformAdd<sizeof(D)>.
>>   241   //
>>   242   // Then
>>   243   //   platform_add(dest, add_value)
>>   244   // must be a valid expression, returning a result convertible 
>> to D.
>>
>> and
>>
>>   250   // Helper base classes for defining PlatformAdd.  To use, define
>> ...
>>   275   // caller.
>>
>> These comments should have been updated to describe the new protocol
>> for PlatformAdd, rather than simply removed.  Documentation of
>> extension points like this is important.
> 
> The comments seemed to already have started deteriorating. It's seems to 
> be missing an *if* that matches the *then, and it doesn't mention 
> *order*. So, instead of trying to figure out what the comment tried to 
> say, I opted from getting rid of those problems by getting rid of that 
> part of the comment. I thought the code was pretty self-explanatory, and 
> didn't need that part. I'm not opposed to fixing the comments if you 
> have suggestions.
> 
>>    Something like
>>
>>    // - platform_add is an object of type PlatformAdd<sizeof(D)>.
>>    //
>>    // Then both
>>    //   platform_add.add_and_fetch(dest, add_value)
>>    //   platform_add.fetch_and_add(dest, add_value)
>>    // must be valid expressions returning a result convertible to D.
>>    //
>>    // add_and_fetch atomically adds add_value to the value of dest,
>>    // returning the new value.
>>    //
>>    // fetch_and_add atomically adds add_value to the value of dest,
>>    // returning the old value.
>>    //
>>    // When D is a pointer type P*, both add_and_fetch and fetch_and_add
>>    // treat it as if it were an uintptr_t; they do not perform any
>>    // scaling of add_value, as that has already been done by the caller.
> 
> I'll update the code verbatim with what you've suggested.
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/atomic.hpp
>>   679   static I scale_addend(I add_value) {
>>   680     return add_value * sizeof(P);
>>   681   }
>>   682
>>   683   static P* add_and_fetch(P* volatile* dest, I add_value, 
>> atomic_memory_order order) {
>>   684     CI addend = add_value;
>>   685     return PlatformAdd<sizeof(P*)>().add_and_fetch(dest, 
>> scale_addend(addend), order);
>>   686   }
>>
>> This is converting add_value from I to CI then back to I, the latter
>> possibly being a narrowing conversion.  Better would be
>>
>>    static CI scale_addend(CI add_value) {
>>      return add_value * sizeof(P);
>>    }
>>
>> and then either
>>
>>    static P* add_and_fetch(P* volatile* dest, I add_value, 
>> atomic_memory_order order) {
>>      CI addend = scale_addend(add_value);
>>      return PlatformAdd<sizeof(P*)>().add_and_fetch(dest, addend, order);
>>    }
>>
>> or don't bother with the addend variable and just pass the scaled
>> result directly to add_and_fetch / fetch_and_add.
> 
> Thanks. This was an unintentional change from the original code.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp
>>    54 #pragma warning(disable: 4035) // Disables warnings reporting 
>> missing return statement
>>
>> Pre-existing: This warning suppression (and the corresponding
>> restoration) appears to only be needed for the !defined(AMD64) case
>> (where __asm statements are being used), and should be moved accordingly.
> 
> Will you create a bug report for that?
> 
>>
>> ------------------------------------------------------------------------------ 
>>
>>
>>> The current implementation already has support for both "add and 
>>> fetch" and "fetch and add" but it's not exposed to the upper layers.
>>>
>>> Previously, the platform-specific code either implemented "add and 
>>> fetch" or "fetch and add", and then exposed it as an "add and fetch" 
>>> implementation by using CRTP and inheriting from either AddAndFetch 
>>> or FetchAndAdd.
>>>
>>> My first implementation of this continued in this track, but got 
>>> push-back because the code was non-intuitive and/or used 
>>> non-intuitive names. Therefore, I've removed FetchAndAdd/AddAndFetch 
>>> and opted to duplicate the trivial functionality in the platform 
>>> files instead. For example:
>>>
>>> +  template<typename D, typename I>
>>> +  D add_and_fetch(D volatile* dest, I add_value, atomic_memory_order 
>>> order) const {
>>> +    return fetch_and_add(dest, add_value, order) + add_value;
>>> +  }
>> For the record, I would have been fine with (and actually preferred; I
>> dislike roughly a dozen copies of one or the other of the translation
>> functions) CRTP-based AddAndFetch and FetchAndAdd that provided one
>> operation in terms of the other.
>>
>> The move of the scaling of add_value in the pointer case to AddImpl is
>> an improvement (even on the pre-existing code) that might have made
>> such a CRTP approach clearer, possibly with some name changes.  Then
>> the helper CRTP base class just provides one of the functions in terms
>> of Derived's other function, and nothing else.
>>
>> Perhaps the hardest part of that approach is naming the helper base
>> classes.  FetchAndAddUsingAddAndFetch is explicit but quite a
>> mouthful.  Perhaps FetchAndAddHelper, which provides fetch_and_add in
>> terms of Derived's add_and_fetch?
>>
>> Apologies for not having followed the internal pre-review discussion
>> of this and so not commenting there.
>>
>>> There has been some thoughts that maybe we should have:
>>>
>>> void Atomic::add(...)
>>> D    Atomic::add_and_fetch(...)
>>> D    Atomic::fetch_and_add(...)
>>>
>>> Not sure if it's worth changing to this, but if others think this is 
>>> good, I can do that change.
>> I would be okay with renaming "add" to "add_and_fetch", though I don't
>> at the moment have a strong preference either way.  I'm not sure also
>> providing "add" that returns void is really all that useful.  We have
>> "inc" and "dec" that return void; my recollection is that they were at
>> one time thought to provide an opportunity for a more efficient
>> implementation on some platforms, but discussion during the
>> templatization project showed that compilers could eliminate an unused
>> return value anyway.
>>
>> For symmetry it seems like we should have fetch_and_sub, but I don't
>> see any current uses of sub that would need that.
>>
>> If add_and_fetch is added (either instead of or in addition to add),
>> then we should definitely treat the subtraction case similarly.
> 
> I'll interpret your comments above that there might be opportunities to 
> do more changes, but none of those needs to be done for this patch.
> 
>>
>>> Tested with tier123, but only compiled on platforms I have access to.
>> mach5 has the ability to cross-compile to several OpenJDK platforms
>> not otherwise supported by Oracle, though can't run tests.  That can
>> be a useful smoke test to at least avoid things like typos that don't
>> build.  For example, "-b linux-aarch64-debug”.
> 
> Right. I wasn't explicit about it, but I've compiled locally (linux) for 
> aarch64, arm32, ppc64, s390, zero, minimal.
> 
> Thanks,
> StefanK
>>
>