RFR: 8236778: Add Atomic::fetch_and_add
Stefan Karlsson
stefan.karlsson at oracle.com
Fri Jan 10 08:57:26 UTC 2020
Thanks for reviewing, David.
StefanK
On 2020-01-10 03:05, David Holmes wrote:
> Hi Stefan,
>
> On 10/01/2020 8:17 am, Stefan Karlsson wrote:
>> Updated webrev:
>> https://cr.openjdk.java.net/~stefank/8236778/webrev.02.delta
>> https://cr.openjdk.java.net/~stefank/8236778/webrev.02
>
> That all seems okay to me (not that I fully understand the details of
> the template code). I'll leave you and Kim to work out the details of
> the commentary.
>
> Thanks,
> David
>
>> Comments below:
>>
>> On 2020-01-09 01:00, Kim Barrett wrote:
>>>> On Jan 8, 2020, at 10:34 AM, Stefan Karlsson
>>>> <stefan.karlsson at oracle.com> wrote:
>>>>
>>>> Hi all,
>>>>
>>>> Please review this patch to introduce Atomic::fetch_and_add.
>>>>
>>>> https://cr.openjdk.java.net/~stefank/8236778/webrev.01
>>>> https://bugs.openjdk.java.net/browse/JDK-8236778
>>>>
>>>> There are a number of places where we have this pattern:
>>>> int result = Atomic::add(_index, amount) - amount;
>>>>
>>>> I'd like to introduce Atomic::fetch_and_add so that we can write:
>>>> int result = Atomic::fetch_and_add(_index, amount);
>>> ------------------------------------------------------------------------------
>>>
>>> src/hotspot/share/runtime/atomic.hpp
>>> Removed:
>>> 240 // - platform_add is an object of type PlatformAdd<sizeof(D)>.
>>> 241 //
>>> 242 // Then
>>> 243 // platform_add(dest, add_value)
>>> 244 // must be a valid expression, returning a result
>>> convertible to D.
>>>
>>> and
>>>
>>> 250 // Helper base classes for defining PlatformAdd. To use,
>>> define
>>> ...
>>> 275 // caller.
>>>
>>> These comments should have been updated to describe the new protocol
>>> for PlatformAdd, rather than simply removed. Documentation of
>>> extension points like this is important.
>>
>> The comments seemed to already have started deteriorating. It's seems
>> to be missing an *if* that matches the *then, and it doesn't mention
>> *order*. So, instead of trying to figure out what the comment tried
>> to say, I opted from getting rid of those problems by getting rid of
>> that part of the comment. I thought the code was pretty
>> self-explanatory, and didn't need that part. I'm not opposed to
>> fixing the comments if you have suggestions.
>>
>>> Something like
>>>
>>> // - platform_add is an object of type PlatformAdd<sizeof(D)>.
>>> //
>>> // Then both
>>> // platform_add.add_and_fetch(dest, add_value)
>>> // platform_add.fetch_and_add(dest, add_value)
>>> // must be valid expressions returning a result convertible to D.
>>> //
>>> // add_and_fetch atomically adds add_value to the value of dest,
>>> // returning the new value.
>>> //
>>> // fetch_and_add atomically adds add_value to the value of dest,
>>> // returning the old value.
>>> //
>>> // When D is a pointer type P*, both add_and_fetch and fetch_and_add
>>> // treat it as if it were an uintptr_t; they do not perform any
>>> // scaling of add_value, as that has already been done by the
>>> caller.
>>
>> I'll update the code verbatim with what you've suggested.
>>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>> src/hotspot/share/runtime/atomic.hpp
>>> 679 static I scale_addend(I add_value) {
>>> 680 return add_value * sizeof(P);
>>> 681 }
>>> 682
>>> 683 static P* add_and_fetch(P* volatile* dest, I add_value,
>>> atomic_memory_order order) {
>>> 684 CI addend = add_value;
>>> 685 return PlatformAdd<sizeof(P*)>().add_and_fetch(dest,
>>> scale_addend(addend), order);
>>> 686 }
>>>
>>> This is converting add_value from I to CI then back to I, the latter
>>> possibly being a narrowing conversion. Better would be
>>>
>>> static CI scale_addend(CI add_value) {
>>> return add_value * sizeof(P);
>>> }
>>>
>>> and then either
>>>
>>> static P* add_and_fetch(P* volatile* dest, I add_value,
>>> atomic_memory_order order) {
>>> CI addend = scale_addend(add_value);
>>> return PlatformAdd<sizeof(P*)>().add_and_fetch(dest, addend,
>>> order);
>>> }
>>>
>>> or don't bother with the addend variable and just pass the scaled
>>> result directly to add_and_fetch / fetch_and_add.
>>
>> Thanks. This was an unintentional change from the original code.
>>>
>>> ------------------------------------------------------------------------------
>>>
>>> src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp
>>> 54 #pragma warning(disable: 4035) // Disables warnings reporting
>>> missing return statement
>>>
>>> Pre-existing: This warning suppression (and the corresponding
>>> restoration) appears to only be needed for the !defined(AMD64) case
>>> (where __asm statements are being used), and should be moved
>>> accordingly.
>>
>> Will you create a bug report for that?
>>
>>>
>>> ------------------------------------------------------------------------------
>>>
>>>
>>>> The current implementation already has support for both "add and
>>>> fetch" and "fetch and add" but it's not exposed to the upper layers.
>>>>
>>>> Previously, the platform-specific code either implemented "add and
>>>> fetch" or "fetch and add", and then exposed it as an "add and
>>>> fetch" implementation by using CRTP and inheriting from either
>>>> AddAndFetch or FetchAndAdd.
>>>>
>>>> My first implementation of this continued in this track, but got
>>>> push-back because the code was non-intuitive and/or used
>>>> non-intuitive names. Therefore, I've removed
>>>> FetchAndAdd/AddAndFetch and opted to duplicate the trivial
>>>> functionality in the platform files instead. For example:
>>>>
>>>> + template<typename D, typename I>
>>>> + D add_and_fetch(D volatile* dest, I add_value,
>>>> atomic_memory_order order) const {
>>>> + return fetch_and_add(dest, add_value, order) + add_value;
>>>> + }
>>> For the record, I would have been fine with (and actually preferred; I
>>> dislike roughly a dozen copies of one or the other of the translation
>>> functions) CRTP-based AddAndFetch and FetchAndAdd that provided one
>>> operation in terms of the other.
>>>
>>> The move of the scaling of add_value in the pointer case to AddImpl is
>>> an improvement (even on the pre-existing code) that might have made
>>> such a CRTP approach clearer, possibly with some name changes. Then
>>> the helper CRTP base class just provides one of the functions in terms
>>> of Derived's other function, and nothing else.
>>>
>>> Perhaps the hardest part of that approach is naming the helper base
>>> classes. FetchAndAddUsingAddAndFetch is explicit but quite a
>>> mouthful. Perhaps FetchAndAddHelper, which provides fetch_and_add in
>>> terms of Derived's add_and_fetch?
>>>
>>> Apologies for not having followed the internal pre-review discussion
>>> of this and so not commenting there.
>>>
>>>> There has been some thoughts that maybe we should have:
>>>>
>>>> void Atomic::add(...)
>>>> D Atomic::add_and_fetch(...)
>>>> D Atomic::fetch_and_add(...)
>>>>
>>>> Not sure if it's worth changing to this, but if others think this
>>>> is good, I can do that change.
>>> I would be okay with renaming "add" to "add_and_fetch", though I don't
>>> at the moment have a strong preference either way. I'm not sure also
>>> providing "add" that returns void is really all that useful. We have
>>> "inc" and "dec" that return void; my recollection is that they were at
>>> one time thought to provide an opportunity for a more efficient
>>> implementation on some platforms, but discussion during the
>>> templatization project showed that compilers could eliminate an unused
>>> return value anyway.
>>>
>>> For symmetry it seems like we should have fetch_and_sub, but I don't
>>> see any current uses of sub that would need that.
>>>
>>> If add_and_fetch is added (either instead of or in addition to add),
>>> then we should definitely treat the subtraction case similarly.
>>
>> I'll interpret your comments above that there might be opportunities
>> to do more changes, but none of those needs to be done for this patch.
>>
>>>
>>>> Tested with tier123, but only compiled on platforms I have access to.
>>> mach5 has the ability to cross-compile to several OpenJDK platforms
>>> not otherwise supported by Oracle, though can't run tests. That can
>>> be a useful smoke test to at least avoid things like typos that don't
>>> build. For example, "-b linux-aarch64-debug”.
>>
>> Right. I wasn't explicit about it, but I've compiled locally (linux)
>> for aarch64, arm32, ppc64, s390, zero, minimal.
>>
>> Thanks,
>> StefanK
>>>
>>
More information about the hotspot-dev
mailing list