RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot
Kim Barrett
kim.barrett at oracle.com
Thu Oct 12 23:17:38 UTC 2017
> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote:
>
> Summary: With the new template functions these are unnecessary.
>
> The changes are mostly s/_ptr// and removing the cast to return type. There weren't many types that needed to be improved to match the template version of the function. Some notes:
> 1. replaced CASPTR with Atomic::cmpxchg() in mutex.cpp, rearranging arguments.
> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null. I disliked the first name because it's not explicit from the callers that there's an underlying cas. If people want to fight, I'll remove the function and use cmpxchg because there are only a couple places where this is a little nicer.
> 3. Added Atomic::sub()
>
> Tested with JPRT, mach5 tier1-5 on linux, windows and solaris.
>
> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.01/webrev
> bug link https://bugs.openjdk.java.net/browse/JDK-8188220
>
> Thanks,
> Coleen
I looked harder at the potential ABA problems, and believe they are
okay. There can be multiple threads doing pushes, and there can be
multiple threads doing pops, but not both at the same time.
------------------------------------------------------------------------------
src/hotspot/cpu/zero/cppInterpreter_zero.cpp
279 if (Atomic::cmpxchg(monitor, lockee->mark_addr(), disp) != disp) {
How does this work? monitor and disp seem like they have unrelated
types? Given that this is zero-specific code, maybe this hasn't been
tested?
Similarly here:
423 if (Atomic::cmpxchg(header, rcvr->mark_addr(), lock) != lock) {
------------------------------------------------------------------------------
src/hotspot/share/asm/assembler.cpp
239 dcon->value_fn = cfn;
Is it actually safe to remove the atomic update? If multiple threads
performing the assignment *are* possible (and I don't understand the
context yet, so don't know the answer to that), then a bare non-atomic
assignment is a race, i.e. undefined behavior.
Regardless of that, I think the CAST_FROM_FN_PTR should be retained.
------------------------------------------------------------------------------
src/hotspot/share/classfile/classLoaderData.cpp
167 Chunk* head = (Chunk*) OrderAccess::load_acquire(&_head);
I think the cast to Chunk* is no longer needed.
------------------------------------------------------------------------------
src/hotspot/share/classfile/classLoaderData.cpp
946 ClassLoaderData* old = Atomic::cmpxchg(cld, cld_addr, (ClassLoaderData*)NULL);
947 if (old != NULL) {
948 delete cld;
949 // Returns the data.
950 return old;
951 }
That could instead be
if (!Atomic::replace_if_null(cld, cld_addr)) {
delete cld; // Lost the race.
return *cld_addr; // Use the winner's value.
}
And apparently the caller of CLDG::add doesn't care whether the
returned CLD has actually been added to the graph yet. If that's not
true, then there's a bug here, since a race loser might return a
winner's value before the winner has actually done the insertion.
------------------------------------------------------------------------------
src/hotspot/share/classfile/verifier.cpp
71 static void* verify_byte_codes_fn() {
72 if (OrderAccess::load_acquire(&_verify_byte_codes_fn) == NULL) {
73 void *lib_handle = os::native_java_library();
74 void *func = os::dll_lookup(lib_handle, "VerifyClassCodesForMajorVersion");
75 OrderAccess::release_store(&_verify_byte_codes_fn, func);
76 if (func == NULL) {
77 _is_new_verify_byte_codes_fn = false;
78 func = os::dll_lookup(lib_handle, "VerifyClassCodes");
79 OrderAccess::release_store(&_verify_byte_codes_fn, func);
80 }
81 }
82 return (void*)_verify_byte_codes_fn;
83 }
[pre-existing]
I think this code has race problems; a caller could unexpectedly and
inappropriately return NULL. Consider the case where there is no
VerifyClassCodesForMajorVersion, but there is VerifyClassCodes.
The variable is initially NULL.
Both Thread1 and Thread2 reach line 73, having both seen a NULL value
for the variable.
Thread1 reaches line 80, setting the variable to VerifyClassCodes.
Thread2 reaches line 76, resetting the variable to NULL.
Thread1 reads the now (momentarily) NULL value and returns it.
I think the first release_store should be conditional on func != NULL.
Also, the usage of _is_new_verify_byte_codes_fn seems suspect.
And a minor additional nit: the cast in the return is unnecessary.
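For example, one shape that fix could take (a sketch only, not tested;
it drops the first release_store entirely rather than making it
conditional, and doesn't address the _is_new_verify_byte_codes_fn
question):
  static void* verify_byte_codes_fn() {
    if (OrderAccess::load_acquire(&_verify_byte_codes_fn) == NULL) {
      void* lib_handle = os::native_java_library();
      void* func = os::dll_lookup(lib_handle, "VerifyClassCodesForMajorVersion");
      if (func == NULL) {
        _is_new_verify_byte_codes_fn = false;
        func = os::dll_lookup(lib_handle, "VerifyClassCodes");
      }
      // Single publication point; the variable can no longer flip back
      // to NULL after a non-NULL value has been published.
      OrderAccess::release_store(&_verify_byte_codes_fn, func);
    }
    return _verify_byte_codes_fn;
  }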
------------------------------------------------------------------------------
src/hotspot/share/code/nmethod.cpp
1664 nmethod* observed_mark_link = _oops_do_mark_link;
1665 if (observed_mark_link == NULL) {
1666 // Claim this nmethod for this thread to mark.
1667 if (Atomic::cmpxchg_if_null(NMETHOD_SENTINEL, &_oops_do_mark_link)) {
With these changes, the only use of observed_mark_link is in the if.
I'm not sure that variable is really useful anymore, e.g. just use
if (_oops_do_mark_link == NULL) {
------------------------------------------------------------------------------
src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp
In CMSCollector::par_take_from_overflow_list, if BUSY and prefix were
of type oopDesc*, I think there would be a whole lot fewer casts and
cast_to_oop's. Later on, I think suffix_head, observed_overflow_list,
and curr_overflow_list could also be oopDesc* instead of oop to
eliminate more casts.
And some similar changes in CMSCollector::par_push_on_overflow_list.
And similarly in parNewGeneration.cpp, in push_on_overflow_list and
take_from_overflow_list_work.
As noted in the comments for JDK-8165857, the lists and "objects"
involved here aren't really oops, but rather the shattered remains of
oops. The suggestion there was to use HeapWord* and carry through the
fanout; what was actually done was to change _overflow_list to
oopDesc* to minimize fanout, even though that's kind of lying to the
type system. Now, with the cleanup of cmpxchg_ptr and such, we're
paying the price of doing the minimal thing back then.
------------------------------------------------------------------------------
src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp
7960 Atomic::add(-n, &_num_par_pushes);
Atomic::sub
------------------------------------------------------------------------------
src/hotspot/share/gc/cms/parNewGeneration.cpp
1455 Atomic::add(-n, &_num_par_pushes);
Atomic::sub
------------------------------------------------------------------------------
src/hotspot/share/gc/g1/dirtyCardQueue.cpp
283 void* actual = Atomic::cmpxchg(next, &_cur_par_buffer_node, nd);
...
289 nd = static_cast<BufferNode*>(actual);
Change actual's type to BufferNode* and remove the cast on line 289.
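I.e. something like (illustrative only):
  BufferNode* actual = Atomic::cmpxchg(next, &_cur_par_buffer_node, nd);
  ...
  nd = actual;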
------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1CollectedHeap.cpp
[pre-existing]
3499 old = (CompiledMethod*)_postponed_list;
I think that cast is only needed because
G1CodeCacheUnloadingTask::_postponed_list is incorrectly typed as
"volatile CompiledMethod*", when I think it ought to be
"CompiledMethod* volatile".
I think G1CodeCacheUnloadingTask::_claimed_nmethod is similarly mis-typed,
with a similar cast that should not be needed:
3530 first = (CompiledMethod*)_claimed_nmethod;
and another for _postponed_list here:
3552 claim = (CompiledMethod*)_postponed_list;
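For reference, the distinction (illustrative declarations, not the
actual field names):
  volatile CompiledMethod* _a;  // pointer to volatile CompiledMethod
  CompiledMethod* volatile _b;  // volatile pointer to CompiledMethod;
                                // this is the form Atomic/OrderAccess
                                // operate on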
------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1HotCardCache.cpp
77 jbyte* previous_ptr = (jbyte*)Atomic::cmpxchg(card_ptr,
I think the cast of the cmpxchg result is no longer needed.
------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp
254 char* touch_addr = (char*)Atomic::add(actual_chunk_size, &_cur_addr) - actual_chunk_size;
I think the cast of the add result is no longer needed.
------------------------------------------------------------------------------
src/hotspot/share/gc/g1/g1StringDedup.cpp
213 return (size_t)Atomic::add(partition_size, &_next_bucket) - partition_size;
I think the cast of the add result is no longer needed.
------------------------------------------------------------------------------
src/hotspot/share/gc/g1/heapRegionRemSet.cpp
200 PerRegionTable* res =
201 Atomic::cmpxchg(nxt, &_free_list, fl);
Please remove the line break, now that the code has been simplified.
But wait, doesn't this alloc exhibit classic ABA problems? I *think*
this works because alloc and bulk_free are called in different phases,
never overlapping.
------------------------------------------------------------------------------
src/hotspot/share/gc/g1/sparsePRT.cpp
295 SparsePRT* res =
296 Atomic::cmpxchg(sprt, &_head_expanded_list, hd);
and
307 SparsePRT* res =
308 Atomic::cmpxchg(next, &_head_expanded_list, hd);
I'd rather not have the line breaks in these either.
And get_from_expanded_list also appears to have classic ABA problems.
I *think* this works because add_to_expanded_list and
get_from_expanded_list are called in different phases, never
overlapping.
------------------------------------------------------------------------------
src/hotspot/share/gc/shared/taskqueue.inline.hpp
262 return (size_t) Atomic::cmpxchg((intptr_t)new_age._data,
263 (volatile intptr_t *)&_data,
264 (intptr_t)old_age._data);
This should be
return Atomic::cmpxchg(new_age._data, &_data, old_age._data);
------------------------------------------------------------------------------
src/hotspot/share/interpreter/bytecodeInterpreter.cpp
This doesn't have any casts, which I think is correct.
708 if (Atomic::cmpxchg(header, rcvr->mark_addr(), mark) == mark) {
but these do.
718 if (Atomic::cmpxchg((void*)new_header, rcvr->mark_addr(), mark) == mark) {
737 if (Atomic::cmpxchg((void*)new_header, rcvr->mark_addr(), header) == header) {
I'm not sure how the ones with casts even compile? mark_addr() seems
to be a markOop*, which is a markOopDesc**, where markOopDesc is a
class. void* is not implicitly convertible to markOopDesc*.
Hm, this entire file is #ifdef CC_INTERP. Is this zero-only code? Or
something like that?
Similarly here:
906 if (Atomic::cmpxchg(header, lockee->mark_addr(), mark) == mark) {
and
917 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), mark) == mark) {
935 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), header) == header) {
and here:
1847 if (Atomic::cmpxchg(header, lockee->mark_addr(), mark) == mark) {
1858 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), mark) == mark) {
1878 if (Atomic::cmpxchg((void*)new_header, lockee->mark_addr(), header) == header) {
------------------------------------------------------------------------------
src/hotspot/share/memory/metaspace.cpp
1502 size_t value = OrderAccess::load_acquire(&_capacity_until_GC);
...
1537 return (size_t)Atomic::sub((intptr_t)v, &_capacity_until_GC);
These and other uses of _capacity_until_GC suggest that variable's
type should be size_t rather than intptr_t. Note that I haven't done
a careful check of uses to see if there are any places where such a
change would cause problems.
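For example, assuming v is already a size_t, line 1537 could then become
just:
  return Atomic::sub(v, &_capacity_until_GC);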
------------------------------------------------------------------------------
src/hotspot/share/oops/constantPool.cpp
229 OrderAccess::release_store((Klass* volatile *)adr, k);
246 OrderAccess::release_store((Klass* volatile *)adr, k);
514 OrderAccess::release_store((Klass* volatile *)adr, k);
Casts are not needed.
------------------------------------------------------------------------------
src/hotspot/share/oops/constantPool.hpp
148 volatile intptr_t adr = OrderAccess::load_acquire(obj_at_addr_raw(which));
[pre-existing]
Why is adr declared volatile?
------------------------------------------------------------------------------
src/hotspot/share/oops/cpCache.cpp
157 intx newflags = (value & parameter_size_mask);
158 Atomic::cmpxchg(newflags, &_flags, (intx)0);
This is a nice demonstration of why I wanted to include some value
preserving integral conversions in cmpxchg, rather than requiring
exact type matching in the integral case. There have been some others
that I haven't commented on. Apparently we (I) got away with
including such conversions in Atomic::add, which I'd forgotten about.
And see comment regarding Atomic::sub below.
------------------------------------------------------------------------------
src/hotspot/share/oops/cpCache.hpp
139 volatile Metadata* _f1; // entry specific metadata field
[pre-existing]
I suspect the type should be Metadata* volatile. And that would
eliminate the need for the cast here:
339 Metadata* f1_ord() const { return (Metadata *)OrderAccess::load_acquire(&_f1); }
I don't know if there are any other changes needed or desirable around
_f1 usage.
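The sketch I have in mind for the field and accessor (assuming nothing
else depends on the current declaration):
  Metadata* volatile _f1;       // entry specific metadata field
  ...
  Metadata* f1_ord() const { return OrderAccess::load_acquire(&_f1); }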
------------------------------------------------------------------------------
src/hotspot/share/oops/method.hpp
139 volatile address from_compiled_entry() const { return OrderAccess::load_acquire(&_from_compiled_entry); }
140 volatile address from_compiled_entry_no_trampoline() const;
141 volatile address from_interpreted_entry() const{ return OrderAccess::load_acquire(&_from_interpreted_entry); }
[pre-existing]
The volatile qualifiers here seem suspect to me.
------------------------------------------------------------------------------
src/hotspot/share/oops/oop.inline.hpp
391 narrowOop old = (narrowOop)Atomic::xchg(val, (narrowOop*)dest);
Cast of return type is not needed.
------------------------------------------------------------------------------
src/hotspot/share/prims/jni.cpp
[pre-existing]
copy_jni_function_table should be using Copy::disjoint_words_atomic.
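Something along these lines, perhaps (names and the size computation are
illustrative, not taken from jni.cpp):
  Copy::disjoint_words_atomic((HeapWord*)src_table, (HeapWord*)dst_table,
                              sizeof(struct JNINativeInterface_) / HeapWordSize);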
------------------------------------------------------------------------------
src/hotspot/share/prims/jni.cpp
[pre-existing]
3892 // We're about to use Atomic::xchg for synchronization. Some Zero
3893 // platforms use the GCC builtin __sync_lock_test_and_set for this,
3894 // but __sync_lock_test_and_set is not guaranteed to do what we want
3895 // on all architectures. So we check it works before relying on it.
3896 #if defined(ZERO) && defined(ASSERT)
3897 {
3898 jint a = 0xcafebabe;
3899 jint b = Atomic::xchg(0xdeadbeef, &a);
3900 void *c = &a;
3901 void *d = Atomic::xchg(&b, &c);
3902 assert(a == (jint) 0xdeadbeef && b == (jint) 0xcafebabe, "Atomic::xchg() works");
3903 assert(c == &b && d == &a, "Atomic::xchg() works");
3904 }
3905 #endif // ZERO && ASSERT
It seems rather strange to be testing Atomic::xchg() here, rather than
as part of unit testing Atomic? Fail unit testing => don't try to
use...
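For example, a rough sketch of such a test under the hotspot gtest
framework (test and fixture names are made up):
  // in a test file that includes runtime/atomic.hpp and unittest.hpp
  TEST(AtomicTest, xchg) {
    jint a = 0xcafebabe;
    jint b = Atomic::xchg((jint) 0xdeadbeef, &a);
    EXPECT_EQ((jint) 0xdeadbeef, a);
    EXPECT_EQ((jint) 0xcafebabe, b);

    void* c = &a;
    void* d = Atomic::xchg((void*)&b, &c);
    EXPECT_EQ((void*)&b, c);
    EXPECT_EQ((void*)&a, d);
  }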
------------------------------------------------------------------------------
src/hotspot/share/prims/jvmtiRawMonitor.cpp
130 if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) {
142 if (_owner == NULL && Atomic::cmpxchg_if_null((void*)Self, &_owner)) {
I think these casts aren't needed. _owner is void*, and Self is
Thread*, which is implicitly convertible to void*.
Similarly here, for the THREAD argument:
280 Contended = Atomic::cmpxchg((void*)THREAD, &_owner, (void*)NULL);
283 Contended = Atomic::cmpxchg((void*)THREAD, &_owner, (void*)NULL);
------------------------------------------------------------------------------
src/hotspot/share/prims/jvmtiRawMonitor.hpp
This file is in the webrev, but seems to be unchanged.
------------------------------------------------------------------------------
src/hotspot/share/runtime/atomic.hpp
520 template<typename I, typename D>
521 inline D Atomic::sub(I sub_value, D volatile* dest) {
522 STATIC_ASSERT(IsPointer<D>::value || IsIntegral<D>::value);
523 // Assumes two's complement integer representation.
524 #pragma warning(suppress: 4146)
525 return Atomic::add(-sub_value, dest);
526 }
I'm pretty sure this implementation is incorrect. I think it produces
the wrong result when I and D are both unsigned integer types and
sizeof(I) < sizeof(D).
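A concrete instance of the problem (not from the webrev):
  // I = uint32_t, D = uint64_t, subtracting 1 from 10.
  uint32_t sub_value = 1;
  volatile uint64_t dest = 10;
  // -sub_value is 0xFFFFFFFF as a uint32_t; Atomic::add then widens it
  // to uint64_t (0x00000000FFFFFFFF), so the "subtraction" yields
  // 10 + 0xFFFFFFFF == 0x100000009 rather than 9.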
------------------------------------------------------------------------------
src/hotspot/share/runtime/mutex.cpp
304 intptr_t v = Atomic::cmpxchg((intptr_t)_LBIT, &_LockWord.FullWord, (intptr_t)0); // agro ...
_LBIT should probably be intptr_t, rather than an enum. Note that the
enum type is unused. The old value here is another place where an
implicit widening of same signedness would have been nice. (Such
implicit widening doesn't work for enums, since it's unspecified
whether they default to signed or unsigned representation, and
implementations differ.)
------------------------------------------------------------------------------
src/hotspot/share/runtime/mutex.hpp
[pre-existing]
I think the Address member of the SplitWord union is unused. Looking
at AcquireOrPush (and others), I'm wondering whether it *should* be
used there, or whether just using intptr_t casts and doing integral
arithmetic (as is presently being done) is easier and clearer.
Also the _LSBINDEX macro probably ought to be defined in mutex.cpp
rather than polluting the global namespace. And technically, that
name is a reserved word.
------------------------------------------------------------------------------
src/hotspot/share/runtime/objectMonitor.cpp
252 void * cur = Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL);
409 if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) {
1983 ox = (Thread*)Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL);
I think the casts of Self aren't needed.
------------------------------------------------------------------------------
src/hotspot/share/runtime/objectMonitor.cpp
995 if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) {
1020 if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) {
I think the casts of THREAD aren't needed.
------------------------------------------------------------------------------
src/hotspot/share/runtime/objectMonitor.hpp
254 markOopDesc* volatile* header_addr();
Why isn't this volatile markOop* ?
------------------------------------------------------------------------------
src/hotspot/share/runtime/synchronizer.cpp
242 Atomic::cmpxchg_if_null((void*)Self, &(m->_owner))) {
I think the cast of Self isn't needed.
------------------------------------------------------------------------------
src/hotspot/share/runtime/synchronizer.cpp
992 for (; block != NULL; block = (PaddedEnd<ObjectMonitor> *)next(block)) {
1734 for (; block != NULL; block = (PaddedEnd<ObjectMonitor> *)next(block)) {
[pre-existing]
All calls to next() pass a PaddedEnd<ObjectMonitor>* and cast the
result. How about moving all that behavior into next().
------------------------------------------------------------------------------
src/hotspot/share/runtime/synchronizer.cpp
1970 if (monitor > (ObjectMonitor *)&block[0] &&
1971 monitor < (ObjectMonitor *)&block[_BLOCKSIZE]) {
[pre-existing]
Are the casts needed here? I think PaddedEnd<ObjectMonitor> is
derived from ObjectMonitor, so implicit conversions should apply.
------------------------------------------------------------------------------
src/hotspot/share/runtime/synchronizer.hpp
28 #include "memory/padded.hpp"
163 static PaddedEnd<ObjectMonitor> * volatile gBlockList;
I was going to suggest as an alternative just making gBlockList a file
scoped variable in synchronizer.cpp, since it isn't used outside of
that file. Except that it is referenced by vmStructs. Curses!
------------------------------------------------------------------------------
src/hotspot/share/runtime/thread.cpp
4707 intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, (intptr_t)0);
This and other places suggest LOCKBIT should be defined as intptr_t,
rather than as an enum value. The MuxBits enum type is unused.
And the cast of 0 is another case where implicit widening would be nice.
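With LOCKBIT as an intptr_t constant that line could read (assuming the
current value stays 1; the cast of 0 would still be needed absent
implicit widening):
  static const intptr_t LOCKBIT = 1;
  ...
  intptr_t w = Atomic::cmpxchg(LOCKBIT, Lock, (intptr_t)0);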
------------------------------------------------------------------------------
src/hotspot/share/services/mallocSiteTable.cpp
261 bool MallocSiteHashtableEntry::atomic_insert(const MallocSiteHashtableEntry* entry) {
262 return Atomic::cmpxchg_if_null(entry, (const MallocSiteHashtableEntry**)&_next);
263 }
I think the problem here that is leading to the cast is that
atomic_insert is taking a const T*. Note that its only caller passes
a non-const T*.
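A sketch of that change (assuming _next is declared
MallocSiteHashtableEntry* volatile, and that the const qualification
isn't needed elsewhere):
  bool MallocSiteHashtableEntry::atomic_insert(MallocSiteHashtableEntry* entry) {
    return Atomic::cmpxchg_if_null(entry, &_next);
  }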
------------------------------------------------------------------------------