RFR: 8247213: G1: Reduce usage of volatile in favour of Atomic operations
coleen.phillimore at oracle.com
coleen.phillimore at oracle.com
Mon Jun 15 12:39:31 UTC 2020
I like the concept of this. I really want the compiler to do to the
work to verify that if something is interestingly concurrent, we
shouldn't use plain loads and stores of it.
The name AtomicValue<> seems as good as any. Can AtomicValue be
declared in its own header file: runtime/atomicValue.hpp please?
On 6/14/20 4:51 AM, Kim Barrett wrote:
>> On Jun 10, 2020, at 8:46 AM, Leo Korinth <leo.korinth at oracle.com> wrote:
>>
>> Hi, could I have a review for this change that adds AtomicValue<> to atomic.hpp and uses it in G1?
>>
>> I am adding an AtomicValue<> to atomic.hpp. This is an opaque type (the "value" is private) and that protect against non-atomic operations being used by mistake. AtomicValue methods are proxied to the corresponding (all-static) Atomic:: methods. All operations are explicitly atomic and in a type safe manner with semantics defined in enum atomic_memory_order
>> (memory_order_conservative by default).
>>
>> Instead of using field variables as volatile, I change them to be of AtomicValue<> type. No need to verify that += (for example) will result in an atomic instruction on all supported compilers.
>>
>> I have some open questions regarding the exported fields in vmStructs_g1.hpp. Today, volatile fields are sometimes "exported" as volatile (and sometimes not, and I guess this is by mistake). I choose to export them all as non-volatile. From what I can see the volatile specific part only casts to void* (only documentation?). Java code is unchanged and still access them as the unwrapped values (static assert in atomic.hpp guarantees that memory layout is the same for T and AtomicValue<T>). I think this is okay, but would like feedback on all this.
The SA doesn't really need anything to be volatile. I think they are
there to get vmStructs to compile for volatile fields but don't have any
meaning for SA since it is used to look at core files or unresponsive
VMs. I don't see the void* cast that you are referring to in the
macros, which doesn't mean it's not there. The SA seems to read these
values as their underlying types so if you don't get errors in
test/hotspot/jtreg/serviceability/sa, the changes are fine. Hopefully
you won't have to add AtomicValue to SA.
thanks,
Coleen
>>
>> The change is presented as a two part change. The first part changes all volatile to AtomicValue, the other part removes the AtomicValue part on non-field accesses. By doing it two part I will not forget to transform some operations by mistake.
>>
>> Copyright years will be updated when all other changes are approved.
>>
>> How about pushing this after 15 is branched off and thus have it for 16?
>>
>> Enhancement:
>> https://bugs.openjdk.java.net/browse/JDK-8247213
>>
>> Webrev:
>> http://cr.openjdk.java.net/~lkorinth/8247213/0/part1
>> http://cr.openjdk.java.net/~lkorinth/8247213/0/part2
>> http://cr.openjdk.java.net/~lkorinth/8247213/0/full
>>
>> Testing:
>> tier1-3
>>
>> Thanks,
>> Leo
> I think having a class for atomic members and variables is good, using
> the compiler to help keep us honest about what we're doing. So I
> generally like this change, despite the number of comments below.
>
> First a few general comments, then a bunch of detailed comments on
> specific changes.
>
> ------------------------------------------------------------------------------
>
> I kind of wish this had been broken up into smaller chunks. Looking
> at a large number of changes involving atomic operations all in one go
> is pretty tiring.
>
> ------------------------------------------------------------------------------
>
> I hope that this will not be a G1-specific usage, nor even a
> GC-specific usage, but rather will eventually be used throughout
> HotSpot. If not, then I'm not sure this change should be made; I
> wouldn't like having the differing styles of volatile members in some
> code and AtomicValue in other code (other than as a temporary state as
> changes are incrementally made). (That doesn't mean that I think all
> atomic operations should involve AtomicValue, and the static Atomic
> functions are no longer needed. I think there are some places where
> forcing things into the former mold is problematic and the Atomic
> functions will still have direct uses.)
>
> ------------------------------------------------------------------------------
>
> The name of the class has generated some discussion. (Full disclosure:
> I suggested AtomicValue.) "AtomicValue" follows the style guide of
> using CamelCase for type names. But there are lots of
> counter-examples. "Atomic" is a bit strange, since then we'd have a
> class that is both a thin wrapper with member functions and an
> AllStatic with similarly named static member functions taking an
> additional explicit argument. The alternative would be to give this
> class the "Atomic" name and rename the existing "Atomic", but that
> would be pretty disruptive. atomic is short and aligns nicely with
> Atomic, but names that differ only in case may be considered a
> problem. Brevity doesn't seem super critical here; the type name is
> mostly used as the type for data members and global variables, and
> rarely appears in function signatures.
>
> ------------------------------------------------------------------------------
>
> There are a number of places where plain arithmetic was being
> performed on relevant variables (mostly pre/post increment/decrement).
> I think most of these were intentionally not using Atomic arithmetic,
> because they are in single-threaded (mutex or safepoint) contexts.
>
> This change currently transforms those all into atomic inc/add/&etc.
> I've been waffling about this, and have noted occurrences in the
> detailed comments below. There is a (probably often negligable, though
> maybe not always) performance cost to that transformation, but that's
> not what concerns me. When I see an atomic operation, I expect there's
> something interesting happening and ordering or atomicity is
> important. Using atomic operations unnecessarily breaks that
> expectation, making the code less clear. That doesn't mean we should
> use pre/post increment/decrement operators and such for "non-atomic"
> uses though; we could have other names that are more explicit. (Note
> that atomic arithmetic operations with relaxed memory order are not
> the same thing as non-atomic arithmetic operations.)
>
> Also note that during review I found one place where I think the use
> of non-atomic arithmetic may have been a mistake, in which case the
> change to atomic arithmetic would be correct and necessary.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/atomic.hpp
> 42 #include "utilities/debug.hpp"
>
> Out of order.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/atomic.hpp
> 434 template<typename D, typename T, typename PlatformOp>
> 435 struct Atomic::StoreImpl<
> 436 D, T,
> 437 PlatformOp,
> 438 typename EnableIf<(IsIntegral<D>::value || IsRegisteredEnum<D>::value) && (IsIntegral<T>::value || IsRegisteredEnum<T>::value)>::type>
>
> ...
> 442 D value = new_value;
>
> Allowing implicit conversions here was discussed and rejected during
> the templatization of Atomic. While I think allowing some conversions
> here would be good, I think this is much too loose. I think a change
> of this type should be dealt with separately.
>
> Also, the line length for the EnableIf line seems rather excessive to me.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/atomic.hpp
> 943 AtomicValue(const AtomicValue<D>& a); // = delete
>
> I think copy-assign should also be forbidden. Why not use NONCOPYABLE?
>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/atomic.hpp
> 946 STATIC_ASSERT(sizeof(AtomicValue<D>) == sizeof(D)); // ensure the same memory layout
>
> I think there are more requirements than same-size.
> But maybe they fall out of usage, since the Atomic operations will
> provide the additional requirements.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/atomic.hpp
> 949 inline void inc(atomic_memory_order order=memory_order_conservative) {
>
> I think there should be whitespace around "=".
>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/atomic.hpp
> 940 class AtomicValue {
>
> I think I'd like the order of the load/store operations rearranged:
>
> load, load_acquire
> store, release_store, release_store_fence
>
> I also think I'd like inc/dec near add/sub.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/atomic.hpp
> 1007 template<typename T>
> 1008 static AtomicValue<T>* cast_to_atomic_ptr(T* nonatomic);
>
> Declared but not defined.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/atomic.hpp
> 1011 template<typename T>
> 1012 static AtomicValue<T>* cast_to_atomic_ptr(T* nonatomic) {
>
> I would really like to not have this. I'd like to see how far we can
> get without it.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/atomic.hpp
> 940 class AtomicValue {
>
> Should it be default constructible? If so, should the default
> constructor guarantee value initialization? I think having a trivial
> default constructor is needed to avoid UB in some uses. See discussion
> of G1BOT's _offset_array below.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/runtime/atomic.hpp
> 957 inline D load() const {
>
> 961 template <typename T>
> 962 inline void store(T store_value) {
>
> I wonder if load_relaxed and relaxed_store or something like that
> might be better names?
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1BlockOffsetTable.cpp
> 45 _offset_array = cast_to_atomic_ptr((u_char*) bot_reserved.start());
>
> This is currently the only use of cast_to_atomic_ptr. It leads to UB
> since the (non-trivial) constructor for AtomicValue is never called
> for any of the objects in the array. I think the call to
> memset_with_concurrent_readers is also UB. I *think* having a trivial
> default constructor solves the UB problem, but that isn't possible
> until C++11.
>
> Rather than just casting to a pointer to AtomicValue<>, this could
> iterate over the array memory, doing placement new construction of the
> elements. If we gave AtomicValue<> a non-trivial but empty default
> constructor (so not initializing the value), then such an iteration
> should be optimized away. We'd still have the UB from memset_with_cr,
> though that could use an iteration with the one-arg constructor instead.
> Or ignore the technical UB and just keep using memset_with_cr.
>
> And that gets rid of the one use (so far) of cast_to_atomic_ptr.
>
> The problem, of course, is that having a trivial/empty default
> constructor means one could unintentionally construct without
> initialization. I think a private trivial default constructor would
> deal with that, though again that requires C++11.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1CodeCacheRemSet.cpp
> 39 AtomicValue<G1CodeRootSetTable*> G1CodeRootSetTable::_purge_list(NULL);
>
> Extra space between type and name.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1CollectedHeap.cpp
> 2023 _old_marking_cycles_started.load() == _old_marking_cycles_completed.load() + 1,
>
> Mis-indented.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1CollectedHeap.cpp
> 2027 _old_marking_cycles_started.inc();
>
> This was _old_marking_cycles_started++. The membars introduced by
> using inc() aren't needed here. It is sufficient here to use
> _old_marking_cycles_started.store(_old_marking_cycles_started.load() + 1)
> This is the only write, and it's in the vmthread in a safepoint.
>
> Similarly in increment_old_marking_cycles_completed, except here we
> have the only write serialized by holding G1OldGCCount_lock.
>
> I also think _old_marking_cycles_started might not need any special
> handling at all.
>
> The additional membars don't really matter for performance, since
> these operations are very rare. But they catch they eye and make one
> wonder what is being ordered here and why. Maybe some commentary
> would suffice?
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1CollectedHeap.cpp
> 4677 void G1CollectedHeap::increase_used(size_t bytes) {
> 4681 void G1CollectedHeap::decrease_used(size_t bytes) {
>
> I think these functions are mutexed, so the additional membars from
> using AtomicValue<>::add/sub aren't needed.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1CollectionSet.cpp
> 129 _collection_set_regions[_collection_set_cur_length.fetch_and_add((size_t)1)] = hr->hrm_index();
>
> I think unnecessary membars.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1CollectionSet.cpp
> 191 size_t len = _collection_set_cur_length.load();
> 192 OrderAccess::loadload();
>
> This could just be a load_acquire().
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1CollectionSet.cpp
> 332 _collection_set_cur_length.inc();
>
> I think unnecessary membars.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ConcurrentMark.cpp
> 179 _chunks_in_chunk_list.inc();
> 199 _chunks_in_chunk_list.dec();
>
> I think unnecessary membars.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ConcurrentMark.hpp
> 466 AtomicValue<HeapWord*>* _top_at_rebuild_starts;
>
> [pre-existing]
> Not sure, but I think this doesn't need to be atomic / didn't need to
> have a volatile qualifier in there.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1FullGCAdjustTask.hpp
> 42 AtomicValue<uint> _references_done; // Atomic counter / bool
>
> Member name no longer aligned with others adjacent.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1ParallelCleaning.hpp
> 32 AtomicValue<int> _cleaning_claimed;
>
> [pre-existing]
> This looks like it ought to be bool rather than int, and change values
> from 1/0 to true/false.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1RegionsOnNodes.cpp
> 44 _count_per_node[node_index].inc();
>
> Previously not atomic; was that a bug? If not a bug, then not clear
> why these array elements were volatile.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/g1SurvivorRegions.cpp
> 67 _used_bytes.add(used_bytes);
>
> I think unnecessary membars.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegionManager.cpp
> 663 for (uint i = 0; i < _n_regions; ++i) {
> 664 new (&_claims[i]) AtomicValue<uint>(Unclaimed);
> 665 }
>
> Even if AtomicValue has a trivial default destructor (as discussed
> above) to avoid UB when using memset, this change is an improvement.
> The old memset code only works because Unclaimed == 0.
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegionRemSet.cpp
> 54 fl = _free_list.load();
>
> `fl = res;` instead?
>
> ------------------------------------------------------------------------------
> src/hotspot/share/gc/g1/heapRegionType.hpp
> 94 AtomicValue<Tag> _tag;
>
> It's not immediately obvious why _tag needs to be atomic. Unless I've
> missed something, there don't appear to be anything other than relaxed
> load/store operations. It's not even obvious that those might occur
> concurrently.
>
> ------------------------------------------------------------------------------
>
More information about the hotspot-dev
mailing list