RFR: 8236778: Add Atomic::fetch_and_add

Fri Jan 10 08:57:26 UTC 2020

Thanks for reviewing, David.

StefanK

On 2020-01-10 03:05, David Holmes wrote:
> Hi Stefan,
>
> On 10/01/2020 8:17 am, Stefan Karlsson wrote:
>> Updated webrev:
>>   https://cr.openjdk.java.net/~stefank/8236778/webrev.02.delta
>>   https://cr.openjdk.java.net/~stefank/8236778/webrev.02
>
> That all seems okay to me (not that I fully understand the details of 
> the template code). I'll leave you and Kim to work out the details of 
> the commentary.
>
> Thanks,
> David
>
>> Comments below:
>>
>> On 2020-01-09 01:00, Kim Barrett wrote:
>>>> On Jan 8, 2020, at 10:34 AM, Stefan Karlsson 
>>>> <stefan.karlsson at oracle.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Please review this patch to introduce Atomic::fetch_and_add.
>>>>
>>>> https://cr.openjdk.java.net/~stefank/8236778/webrev.01
>>>> https://bugs.openjdk.java.net/browse/JDK-8236778
>>>>
>>>> There are a number of places where we have this pattern:
>>>> int result = Atomic::add(_index, amount) - amount;
>>>>
>>>> I'd like to introduce Atomic::fetch_and_add so that we can write:
>>>> int result = Atomic::fetch_and_add(_index, amount);
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/runtime/atomic.hpp
>>> Removed:
>>>   240   // - platform_add is an object of type PlatformAdd<sizeof(D)>.
>>>   241   //
>>>   242   // Then
>>>   243   //   platform_add(dest, add_value)
>>>   244   // must be a valid expression, returning a result 
>>> convertible to D.
>>>
>>> and
>>>
>>>   250   // Helper base classes for defining PlatformAdd.  To use, 
>>> define
>>> ...
>>>   275   // caller.
>>>
>>> These comments should have been updated to describe the new protocol
>>> for PlatformAdd, rather than simply removed.  Documentation of
>>> extension points like this is important.
>>
>> The comments seemed to already have started deteriorating. It's seems 
>> to be missing an *if* that matches the *then, and it doesn't mention 
>> *order*. So, instead of trying to figure out what the comment tried 
>> to say, I opted from getting rid of those problems by getting rid of 
>> that part of the comment. I thought the code was pretty 
>> self-explanatory, and didn't need that part. I'm not opposed to 
>> fixing the comments if you have suggestions.
>>
>>>    Something like
>>>
>>>    // - platform_add is an object of type PlatformAdd<sizeof(D)>.
>>>    //
>>>    // Then both
>>>    //   platform_add.add_and_fetch(dest, add_value)
>>>    //   platform_add.fetch_and_add(dest, add_value)
>>>    // must be valid expressions returning a result convertible to D.
>>>    //
>>>    // add_and_fetch atomically adds add_value to the value of dest,
>>>    // returning the new value.
>>>    //
>>>    // fetch_and_add atomically adds add_value to the value of dest,
>>>    // returning the old value.
>>>    //
>>>    // When D is a pointer type P*, both add_and_fetch and fetch_and_add
>>>    // treat it as if it were an uintptr_t; they do not perform any
>>>    // scaling of add_value, as that has already been done by the 
>>> caller.
>>
>> I'll update the code verbatim with what you've suggested.
>>
>>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/share/runtime/atomic.hpp
>>>   679   static I scale_addend(I add_value) {
>>>   680     return add_value * sizeof(P);
>>>   681   }
>>>   682
>>>   683   static P* add_and_fetch(P* volatile* dest, I add_value, 
>>> atomic_memory_order order) {
>>>   684     CI addend = add_value;
>>>   685     return PlatformAdd<sizeof(P*)>().add_and_fetch(dest, 
>>> scale_addend(addend), order);
>>>   686   }
>>>
>>> This is converting add_value from I to CI then back to I, the latter
>>> possibly being a narrowing conversion.  Better would be
>>>
>>>    static CI scale_addend(CI add_value) {
>>>      return add_value * sizeof(P);
>>>    }
>>>
>>> and then either
>>>
>>>    static P* add_and_fetch(P* volatile* dest, I add_value, 
>>> atomic_memory_order order) {
>>>      CI addend = scale_addend(add_value);
>>>      return PlatformAdd<sizeof(P*)>().add_and_fetch(dest, addend, 
>>> order);
>>>    }
>>>
>>> or don't bother with the addend variable and just pass the scaled
>>> result directly to add_and_fetch / fetch_and_add.
>>
>> Thanks. This was an unintentional change from the original code.
>>>
>>> ------------------------------------------------------------------------------ 
>>>
>>> src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp
>>>    54 #pragma warning(disable: 4035) // Disables warnings reporting 
>>> missing return statement
>>>
>>> Pre-existing: This warning suppression (and the corresponding
>>> restoration) appears to only be needed for the !defined(AMD64) case
>>> (where __asm statements are being used), and should be moved 
>>> accordingly.
>>
>> Will you create a bug report for that?
>>
>>>
>>> ------------------------------------------------------------------------------ 
>>>
>>>
>>>> The current implementation already has support for both "add and 
>>>> fetch" and "fetch and add" but it's not exposed to the upper layers.
>>>>
>>>> Previously, the platform-specific code either implemented "add and 
>>>> fetch" or "fetch and add", and then exposed it as an "add and 
>>>> fetch" implementation by using CRTP and inheriting from either 
>>>> AddAndFetch or FetchAndAdd.
>>>>
>>>> My first implementation of this continued in this track, but got 
>>>> push-back because the code was non-intuitive and/or used 
>>>> non-intuitive names. Therefore, I've removed 
>>>> FetchAndAdd/AddAndFetch and opted to duplicate the trivial 
>>>> functionality in the platform files instead. For example:
>>>>
>>>> +  template<typename D, typename I>
>>>> +  D add_and_fetch(D volatile* dest, I add_value, 
>>>> atomic_memory_order order) const {
>>>> +    return fetch_and_add(dest, add_value, order) + add_value;
>>>> +  }
>>> For the record, I would have been fine with (and actually preferred; I
>>> dislike roughly a dozen copies of one or the other of the translation
>>> functions) CRTP-based AddAndFetch and FetchAndAdd that provided one
>>> operation in terms of the other.
>>>
>>> The move of the scaling of add_value in the pointer case to AddImpl is
>>> an improvement (even on the pre-existing code) that might have made
>>> such a CRTP approach clearer, possibly with some name changes.  Then
>>> the helper CRTP base class just provides one of the functions in terms
>>> of Derived's other function, and nothing else.
>>>
>>> Perhaps the hardest part of that approach is naming the helper base
>>> classes.  FetchAndAddUsingAddAndFetch is explicit but quite a
>>> mouthful.  Perhaps FetchAndAddHelper, which provides fetch_and_add in
>>> terms of Derived's add_and_fetch?
>>>
>>> Apologies for not having followed the internal pre-review discussion
>>> of this and so not commenting there.
>>>
>>>> There has been some thoughts that maybe we should have:
>>>>
>>>> void Atomic::add(...)
>>>> D    Atomic::add_and_fetch(...)
>>>> D    Atomic::fetch_and_add(...)
>>>>
>>>> Not sure if it's worth changing to this, but if others think this 
>>>> is good, I can do that change.
>>> I would be okay with renaming "add" to "add_and_fetch", though I don't
>>> at the moment have a strong preference either way.  I'm not sure also
>>> providing "add" that returns void is really all that useful. We have
>>> "inc" and "dec" that return void; my recollection is that they were at
>>> one time thought to provide an opportunity for a more efficient
>>> implementation on some platforms, but discussion during the
>>> templatization project showed that compilers could eliminate an unused
>>> return value anyway.
>>>
>>> For symmetry it seems like we should have fetch_and_sub, but I don't
>>> see any current uses of sub that would need that.
>>>
>>> If add_and_fetch is added (either instead of or in addition to add),
>>> then we should definitely treat the subtraction case similarly.
>>
>> I'll interpret your comments above that there might be opportunities 
>> to do more changes, but none of those needs to be done for this patch.
>>
>>>
>>>> Tested with tier123, but only compiled on platforms I have access to.
>>> mach5 has the ability to cross-compile to several OpenJDK platforms
>>> not otherwise supported by Oracle, though can't run tests. That can
>>> be a useful smoke test to at least avoid things like typos that don't
>>> build.  For example, "-b linux-aarch64-debug”.
>>
>> Right. I wasn't explicit about it, but I've compiled locally (linux) 
>> for aarch64, arm32, ppc64, s390, zero, minimal.
>>
>> Thanks,
>> StefanK
>>>
>>