RFR: 8236778: Add Atomic::fetch_and_add
Kim Barrett
kim.barrett at oracle.com
Thu Jan 9 00:00:34 UTC 2020
> On Jan 8, 2020, at 10:34 AM, Stefan Karlsson <stefan.karlsson at oracle.com> wrote:
>
> Hi all,
>
> Please review this patch to introduce Atomic::fetch_and_add.
>
> https://cr.openjdk.java.net/~stefank/8236778/webrev.01
> https://bugs.openjdk.java.net/browse/JDK-8236778
>
> There are a number of places where we have this pattern:
> int result = Atomic::add(_index, amount) - amount;
>
> I'd like to introduce Atomic::fetch_and_add so that we can write:
> int result = Atomic::fetch_and_add(_index, amount);
------------------------------------------------------------------------------
src/hotspot/share/runtime/atomic.hpp
Removed:
240 // - platform_add is an object of type PlatformAdd<sizeof(D)>.
241 //
242 // Then
243 // platform_add(dest, add_value)
244 // must be a valid expression, returning a result convertible to D.
and
250 // Helper base classes for defining PlatformAdd. To use, define
...
275 // caller.
These comments should have been updated to describe the new protocol
for PlatformAdd, rather than simply removed. Documentation of
extension points like this is important. Something like
// - platform_add is an object of type PlatformAdd<sizeof(D)>.
//
// Then both
// platform_add.add_and_fetch(dest, add_value)
// platform_add.fetch_and_add(dest, add_value)
// must be valid expressions returning a result convertible to D.
//
// add_and_fetch atomically adds add_value to the value of dest,
// returning the new value.
//
// fetch_and_add atomically adds add_value to the value of dest,
// returning the old value.
//
// When D is a pointer type P*, both add_and_fetch and fetch_and_add
// treat it as if it were an uintptr_t; they do not perform any
// scaling of add_value, as that has already been done by the caller.
------------------------------------------------------------------------------
src/hotspot/share/runtime/atomic.hpp
679 static I scale_addend(I add_value) {
680 return add_value * sizeof(P);
681 }
682
683 static P* add_and_fetch(P* volatile* dest, I add_value, atomic_memory_order order) {
684 CI addend = add_value;
685 return PlatformAdd<sizeof(P*)>().add_and_fetch(dest, scale_addend(addend), order);
686 }
This is converting add_value from I to CI then back to I, the latter
possibly being a narrowing conversion. Better would be
static CI scale_addend(CI add_value) {
return add_value * sizeof(P);
}
and then either
static P* add_and_fetch(P* volatile* dest, I add_value, atomic_memory_order order) {
CI addend = scale_addend(add_value);
return PlatformAdd<sizeof(P*)>().add_and_fetch(dest, addend, order);
}
or don't bother with the addend variable and just pass the scaled
result directly to add_and_fetch / fetch_and_add.
------------------------------------------------------------------------------
src/hotspot/os_cpu/windows_x86/atomic_windows_x86.hpp
54 #pragma warning(disable: 4035) // Disables warnings reporting missing return statement
Pre-existing: This warning suppression (and the corresponding
restoration) appears to only be needed for the !defined(AMD64) case
(where __asm statements are being used), and should be moved accordingly.
------------------------------------------------------------------------------
> The current implementation already has support for both "add and fetch" and "fetch and add" but it's not exposed to the upper layers.
>
> Previously, the platform-specific code either implemented "add and fetch" or "fetch and add", and then exposed it as an "add and fetch" implementation by using CRTP and inheriting from either AddAndFetch or FetchAndAdd.
>
> My first implementation of this continued in this track, but got push-back because the code was non-intuitive and/or used non-intuitive names. Therefore, I've removed FetchAndAdd/AddAndFetch and opted to duplicate the trivial functionality in the platform files instead. For example:
>
> + template<typename D, typename I>
> + D add_and_fetch(D volatile* dest, I add_value, atomic_memory_order order) const {
> + return fetch_and_add(dest, add_value, order) + add_value;
> + }
For the record, I would have been fine with (and actually preferred; I
dislike roughly a dozen copies of one or the other of the translation
functions) CRTP-based AddAndFetch and FetchAndAdd that provided one
operation in terms of the other.
The move of the scaling of add_value in the pointer case to AddImpl is
an improvement (even on the pre-existing code) that might have made
such a CRTP approach clearer, possibly with some name changes. Then
the helper CRTP base class just provides one of the functions in terms
of Derived's other function, and nothing else.
Perhaps the hardest part of that approach is naming the helper base
classes. FetchAndAddUsingAddAndFetch is explicit but quite a
mouthful. Perhaps FetchAndAddHelper, which provides fetch_and_add in
terms of Derived's add_and_fetch?
Apologies for not having followed the internal pre-review discussion
of this and so not commenting there.
> There has been some thoughts that maybe we should have:
>
> void Atomic::add(...)
> D Atomic::add_and_fetch(...)
> D Atomic::fetch_and_add(...)
>
> Not sure if it's worth changing to this, but if others think this is good, I can do that change.
I would be okay with renaming "add" to "add_and_fetch", though I don't
at the moment have a strong preference either way. I'm not sure also
providing "add" that returns void is really all that useful. We have
"inc" and "dec" that return void; my recollection is that they were at
one time thought to provide an opportunity for a more efficient
implementation on some platforms, but discussion during the
templatization project showed that compilers could eliminate an unused
return value anyway.
For symmetry it seems like we should have fetch_and_sub, but I don't
see any current uses of sub that would need that.
If add_and_fetch is added (either instead of or in addition to add),
then we should definitely treat the subtraction case similarly.
> Tested with tier123, but only compiled on platforms I have access to.
mach5 has the ability to cross-compile to several OpenJDK platforms
not otherwise supported by Oracle, though can't run tests. That can
be a useful smoke test to at least avoid things like typos that don't
build. For example, "-b linux-aarch64-debug”.
More information about the hotspot-dev
mailing list