RFR (L, but tedious) 8188220: Remove Atomic::*_ptr() uses and overloads from hotspot

Fri Oct 13 14:09:42 UTC 2017

On 10/12/17 8:55 PM, David Holmes wrote:
> Hi Kim,
>
> Very detailed analysis! A few things have already been updated by Coleen.
>
> Many of the issues with possibly incorrect/inappropriate types really 
> need to be dealt with separately - they go beyond the basic renaming - 
> by their component teams.

Yes, I fixed up some types that were trivial changes, but agree with you 
and left other types to be dealt with by the component teams.

I filed some RFEs and bugs.

thanks,
Coleen
>
> Similarly any ABA issues - which are likely non-issues but not clearly 
> documented - should be handled separately. And the potential race you 
> highlight below - though to be honest I couldn't match your statements 
> with the code as shown.
>
> Thanks,
> David
>
> On 13/10/2017 9:17 AM, Kim Barrett wrote:
>>> On Oct 10, 2017, at 6:01 PM, coleen.phillimore at oracle.com wrote:
>>>
>>> Summary: With the new template functions these are unnecessary.
>>>
>>> The changes are mostly s/_ptr// and removing the cast to return 
>>> type.  There weren't many types that needed to be improved to match 
>>> the template version of the function.   Some notes:
>>> 1. replaced CASPTR with Atomic::cmpxchg() in mutex.cpp, rearranging 
>>> arguments.
>>> 2. renamed Atomic::replace_if_null to Atomic::cmpxchg_if_null.  I 
>>> disliked the first name because it's not explicit from the callers 
>>> that there's an underlying cas.  If people want to fight, I'll 
>>> remove the function and use cmpxchg because there are only a couple 
>>> places where this is a little nicer.
>>> 3. Added Atomic::sub()
>>>
>>> Tested with JPRT, mach5 tier1-5 on linux,windows and solaris.
>>>
>>> open webrev at http://cr.openjdk.java.net/~coleenp/8188220.01/webrev
>>> bug link https://bugs.openjdk.java.net/browse/JDK-8188220
>>>
>>> Thanks,
>>> Coleen
>>
>> I looked harder at the potential ABA problems, and believe they are
>> okay.  There can be multiple threads doing pushes, and there can be
>> multiple threads doing pops, but not both at the same time.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/cpu/zero/cppInterpreter_zero.cpp
>>   279     if (Atomic::cmpxchg(monitor, lockee->mark_addr(), disp) != 
>> disp) {
>>
>> How does this work?  monitor and disp seem like they have unrelated
>> types?  Given that this is zero-specific code, maybe this hasn't been
>> tested?
>>
>> Similarly here:
>>   423       if (Atomic::cmpxchg(header, rcvr->mark_addr(), lock) != 
>> lock) {
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/asm/assembler.cpp
>>   239         dcon->value_fn = cfn;
>>
>> Is it actually safe to remove the atomic update?  If multiple threads
>> performing the assignment *are* possible (and I don't understand the
>> context yet, so don't know the answer to that), then a bare non-atomic
>> assignment is a race, e.g. undefined behavior.
>>
>> Regardless of that, I think the CAST_FROM_FN_PTR should be retained.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/classLoaderData.cpp
>>   167   Chunk* head = (Chunk*) OrderAccess::load_acquire(&_head);
>>
>> I think the cast to Chunk* is no longer needed.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/classLoaderData.cpp
>>   946     ClassLoaderData* old = Atomic::cmpxchg(cld, cld_addr, 
>> (ClassLoaderData*)NULL);
>>   947     if (old != NULL) {
>>   948       delete cld;
>>   949       // Returns the data.
>>   950       return old;
>>   951     }
>>
>> That could instead be
>>
>>    if (!Atomic::replace_if_null(cld, cld_addr)) {
>>      delete cld;           // Lost the race.
>>      return *cld_addr;     // Use the winner's value.
>>    }
>>
>> And apparently the caller of CLDG::add doesn't care whether the
>> returned CLD has actually been added to the graph yet.  If that's not
>> true, then there's a bug here, since a race loser might return a
>> winner's value before the winner has actually done the insertion.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/classfile/verifier.cpp
>>    71 static void* verify_byte_codes_fn() {
>>    72   if (OrderAccess::load_acquire(&_verify_byte_codes_fn) == NULL) {
>>    73     void *lib_handle = os::native_java_library();
>>    74     void *func = os::dll_lookup(lib_handle, 
>> "VerifyClassCodesForMajorVersion");
>>    75     OrderAccess::release_store(&_verify_byte_codes_fn, func);
>>    76     if (func == NULL) {
>>    77       _is_new_verify_byte_codes_fn = false;
>>    78       func = os::dll_lookup(lib_handle, "VerifyClassCodes");
>>    79 OrderAccess::release_store(&_verify_byte_codes_fn, func);
>>    80     }
>>    81   }
>>    82   return (void*)_verify_byte_codes_fn;
>>    83 }
>>
>> [pre-existing]
>>
>> I think this code has race problems; a caller could unexpectedly and
>> inappropriately return NULL.  Consider the case where there is no
>> VerifyClassCodesForMajorVersion, but there is VerifyClassCodes.
>>
>> The variable is initially NULL.
>>
>> Both Thread1 and Thread2 reach line 73, having both seen a NULL value
>> for the variable.
>>
>> Thread1 reaches line 80, setting the variable to VerifyClassCodes.
>>
>> Thread2 reaches line 76, resetting the variable to NULL.
>>
>> Thread1 reads the now (momentarily) NULL value and returns it.
>>
>> I think the first release_store should be conditional on func != NULL.
>> Also, the usage of _is_new_verify_byte_codes_fn seems suspect.
>> And a minor additional nit: the cast in the return is unnecessary.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/code/nmethod.cpp
>> 1664   nmethod* observed_mark_link = _oops_do_mark_link;
>> 1665   if (observed_mark_link == NULL) {
>> 1666     // Claim this nmethod for this thread to mark.
>> 1667     if (Atomic::cmpxchg_if_null(NMETHOD_SENTINEL, 
>> &_oops_do_mark_link)) {
>>
>> With these changes, the only use of observed_mark_link is in the if.
>> I'm not sure that variable is really useful anymore, e.g. just use
>>
>>    if (_oops_do_mark_link == NULL) {
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp
>>
>> In CMSCollector::par_take_from_overflow_list, if BUSY and prefix were
>> of type oopDesc*, I think there would be a whole lot fewer casts and
>> cast_to_oop's.  Later on, I think suffix_head, observed_overflow_list,
>> and curr_overflow_list could also be oopDesc* instead of oop to
>> eliminate more casts.
>>
>> And some similar changes in CMSCollector::par_push_on_overflow_list.
>>
>> And similarly in parNewGeneration.cpp, in push_on_overflow_list and
>> take_from_overflow_list_work.
>>
>> As noted in the comments for JDK-8165857, the lists and "objects"
>> involved here aren't really oops, but rather the shattered remains of
>> oops.  The suggestion there was to use HeapWord* and carry through the
>> fanout; what was actually done was to change _overflow_list to
>> oopDesc* to minimize fanout, even though that's kind of lying to the
>> type system.  Now, with the cleanup of cmpxchg_ptr and such, we're
>> paying the price of doing the minimal thing back then.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/cms/concurrentMarkSweepGeneration.cpp
>> 7960   Atomic::add(-n, &_num_par_pushes);
>>
>> Atomic::sub
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/cms/parNewGeneration.cpp
>> 1455   Atomic::add(-n, &_num_par_pushes);
>>
>> Atomic::sub
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/dirtyCardQueue.cpp
>>   283     void* actual = Atomic::cmpxchg(next, &_cur_par_buffer_node, 
>> nd);
>> ...
>>   289       nd = static_cast<BufferNode*>(actual);
>>
>> Change actual's type to BufferNode* and remove the cast on line 289.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1CollectedHeap.cpp
>>
>> [pre-existing]
>> 3499         old = (CompiledMethod*)_postponed_list;
>>
>> I think that cast is only needed because
>> G1CodeCacheUnloadingTask::_postponed_list is incorrectly typed as
>> "volatile CompiledMethod*", when I think it ought to be
>> "CompiledMethod* volatile".
>>
>> I think G1CodeCacheUnloading::_claimed_nmethod is similarly mis-typed,
>> with a similar should not be needed cast:
>> 3530       first = (CompiledMethod*)_claimed_nmethod;
>>
>> and another for _postponed_list here:
>> 3552       claim = (CompiledMethod*)_postponed_list;
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1HotCardCache.cpp
>>    77   jbyte* previous_ptr = (jbyte*)Atomic::cmpxchg(card_ptr,
>>
>> I think the cast of the cmpxchg result is no longer needed.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1PageBasedVirtualSpace.cpp
>>   254       char* touch_addr = (char*)Atomic::add(actual_chunk_size, 
>> &_cur_addr) - actual_chunk_size;
>>
>> I think the cast of the add result is no longer needed.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/g1StringDedup.cpp
>>   213   return (size_t)Atomic::add(partition_size, &_next_bucket) - 
>> partition_size;
>>
>> I think the cast of the add result is no longer needed.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/heapRegionRemSet.cpp
>>   200       PerRegionTable* res =
>>   201         Atomic::cmpxchg(nxt, &_free_list, fl);
>>
>> Please remove the line break, now that the code has been simplified.
>>
>> But wait, doesn't this alloc exhibit classic ABA problems?  I *think*
>> this works because alloc and bulk_free are called in different phases,
>> never overlapping.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/g1/sparsePRT.cpp
>>   295     SparsePRT* res =
>>   296       Atomic::cmpxchg(sprt, &_head_expanded_list, hd);
>> and
>>   307     SparsePRT* res =
>>   308       Atomic::cmpxchg(next, &_head_expanded_list, hd);
>>
>> I'd rather not have the line breaks in these either.
>>
>> And get_from_expanded_list also appears to have classic ABA problems.
>> I *think* this works because add_to_expanded_list and
>> get_from_expanded_list are called in different phases, never
>> overlapping.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/gc/shared/taskqueue.inline.hpp
>>   262   return (size_t) Atomic::cmpxchg((intptr_t)new_age._data,
>>   263                                   (volatile intptr_t *)&_data,
>>   264 (intptr_t)old_age._data);
>>
>> This should be
>>
>>    return Atomic::cmpxchg(new_age._data, &_data, old_age._data);
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/interpreter/bytecodeInterpreter.cpp
>> This doesn't have any casts, which I think is correct.
>>   708             if (Atomic::cmpxchg(header, rcvr->mark_addr(), 
>> mark) == mark) {
>>
>> but these do.
>>   718             if (Atomic::cmpxchg((void*)new_header, 
>> rcvr->mark_addr(), mark) == mark) {
>>   737             if (Atomic::cmpxchg((void*)new_header, 
>> rcvr->mark_addr(), header) == header) {
>>
>> I'm not sure how the ones with casts even compile?  mark_addr() seems
>> to be a markOop*, which is a markOopDesc**, where markOopDesc is a
>> class.  void* is not implicitly convertible to markOopDesc*.
>>
>> Hm, this entire file is #ifdef CC_INTERP.  Is this zero-only code?  Or
>> something like that?
>>
>> Similarly here:
>>   906           if (Atomic::cmpxchg(header, lockee->mark_addr(), 
>> mark) == mark) {
>> and
>>   917           if (Atomic::cmpxchg((void*)new_header, 
>> lockee->mark_addr(), mark) == mark) {
>>   935           if (Atomic::cmpxchg((void*)new_header, 
>> lockee->mark_addr(), header) == header) {
>>
>> and here:
>> 1847               if (Atomic::cmpxchg(header, lockee->mark_addr(), 
>> mark) == mark) {
>> 1858               if (Atomic::cmpxchg((void*)new_header, 
>> lockee->mark_addr(), mark) == mark) {
>> 1878               if (Atomic::cmpxchg((void*)new_header, 
>> lockee->mark_addr(), header) == header) {
>>
>> and here:
>> 1847               if (Atomic::cmpxchg(header, lockee->mark_addr(), 
>> mark) == mark) {
>> 1858               if (Atomic::cmpxchg((void*)new_header, 
>> lockee->mark_addr(), mark) == mark) {
>> 1878               if (Atomic::cmpxchg((void*)new_header, 
>> lockee->mark_addr(), header) == header) {
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/memory/metaspace.cpp
>> 1502   size_t value = OrderAccess::load_acquire(&_capacity_until_GC);
>> ...
>> 1537   return (size_t)Atomic::sub((intptr_t)v, &_capacity_until_GC);
>>
>> These and other uses of _capacity_until_GC suggest that variable's
>> type should be size_t rather than intptr_t.  Note that I haven't done
>> a careful check of uses to see if there are any places where such a
>> change would cause problems.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/oops/constantPool.cpp
>>   229   OrderAccess::release_store((Klass* volatile *)adr, k);
>>   246   OrderAccess::release_store((Klass* volatile *)adr, k);
>>   514   OrderAccess::release_store((Klass* volatile *)adr, k);
>>
>> Casts are not needed.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/oops/constantPool.hpp
>>   148     volatile intptr_t adr = 
>> OrderAccess::load_acquire(obj_at_addr_raw(which));
>>
>> [pre-existing]
>> Why is adr declared volatile?
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/oops/cpCache.cpp
>>   157     intx newflags = (value & parameter_size_mask);
>>   158     Atomic::cmpxchg(newflags, &_flags, (intx)0);
>>
>> This is a nice demonstration of why I wanted to include some value
>> preserving integral conversions in cmpxchg, rather than requiring
>> exact type matching in the integral case.  There have been some others
>> that I haven't commented on.  Apparently we (I) got away with
>> including such conversions in Atomic::add, which I'd forgotten about.
>> And see comment regarding Atomic::sub below.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/oops/cpCache.hpp
>>   139   volatile Metadata*   _f1;       // entry specific metadata field
>>
>> [pre-existing]
>> I suspect the type should be Metadata* volatile.  And that would
>> eliminate the need for the cast here:
>>
>>   339   Metadata* f1_ord() const                       { return 
>> (Metadata *)OrderAccess::load_acquire(&_f1); }
>>
>> I don't know if there are any other changes needed or desirable around
>> _f1 usage.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/oops/method.hpp
>>   139   volatile address from_compiled_entry() const   { return 
>> OrderAccess::load_acquire(&_from_compiled_entry); }
>>   140   volatile address from_compiled_entry_no_trampoline() const;
>>   141   volatile address from_interpreted_entry() const{ return 
>> OrderAccess::load_acquire(&_from_interpreted_entry); }
>>
>> [pre-existing]
>> The volatile qualifiers here seem suspect to me.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/oops/oop.inline.hpp
>>   391     narrowOop old = (narrowOop)Atomic::xchg(val, 
>> (narrowOop*)dest);
>>
>> Cast of return type is not needed.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jni.cpp
>>
>> [pre-existing]
>>
>> copy_jni_function_table should be using Copy::disjoint_words_atomic.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jni.cpp
>>
>> [pre-existing]
>>
>> 3892   // We're about to use Atomic::xchg for synchronization. Some Zero
>> 3893   // platforms use the GCC builtin __sync_lock_test_and_set for 
>> this,
>> 3894   // but __sync_lock_test_and_set is not guaranteed to do what 
>> we want
>> 3895   // on all architectures.  So we check it works before relying 
>> on it.
>> 3896 #if defined(ZERO) && defined(ASSERT)
>> 3897   {
>> 3898     jint a = 0xcafebabe;
>> 3899     jint b = Atomic::xchg(0xdeadbeef, &a);
>> 3900     void *c = &a;
>> 3901     void *d = Atomic::xchg(&b, &c);
>> 3902     assert(a == (jint) 0xdeadbeef && b == (jint) 0xcafebabe, 
>> "Atomic::xchg() works");
>> 3903     assert(c == &b && d == &a, "Atomic::xchg() works");
>> 3904   }
>> 3905 #endif // ZERO && ASSERT
>>
>> It seems rather strange to be testing Atomic::xchg() here, rather than
>> as part of unit testing Atomic?  Fail unit testing => don't try to
>> use...
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jvmtiRawMonitor.cpp
>>   130     if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) {
>>   142     if (_owner == NULL && Atomic::cmpxchg_if_null((void*)Self, 
>> &_owner)) {
>>
>> I think these casts aren't needed. _owner is void*, and Self is
>> Thread*, which is implicitly convertible to void*.
>>
>> Similarly here, for the THREAD argument:
>>   280     Contended = Atomic::cmpxchg((void*)THREAD, &_owner, 
>> (void*)NULL);
>>   283     Contended = Atomic::cmpxchg((void*)THREAD, &_owner, 
>> (void*)NULL);
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/prims/jvmtiRawMonitor.hpp
>>
>> This file is in the webrev, but seems to be unchanged.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/atomic.hpp
>>   520 template<typename I, typename D>
>>   521 inline D Atomic::sub(I sub_value, D volatile* dest) {
>>   522   STATIC_ASSERT(IsPointer<D>::value || IsIntegral<D>::value);
>>   523   // Assumes two's complement integer representation.
>>   524   #pragma warning(suppress: 4146)
>>   525   return Atomic::add(-sub_value, dest);
>>   526 }
>>
>> I'm pretty sure this implementation is incorrect.  I think it produces
>> the wrong result when I and D are both unsigned integer types and
>> sizeof(I) < sizeof(D).
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/mutex.cpp
>>   304   intptr_t v = Atomic::cmpxchg((intptr_t)_LBIT, 
>> &_LockWord.FullWord, (intptr_t)0);  // agro ...
>>
>> _LBIT should probably be intptr_t, rather than an enum.  Note that the
>> enum type is unused.  The old value here is another place where an
>> implicit widening of same signedness would have been nice. (Such
>> implicit widening doesn't work for enums, since it's unspecified
>> whether they default to signed or unsigned representation, and
>> implementatinos differ.)
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/mutex.hpp
>>
>> [pre-existing]
>>
>> I think the Address member of the SplitWord union is unused. Looking
>> at AcquireOrPush (and others), I'm wondering whether it *should* be
>> used there, or whether just using intptr_t casts and doing integral
>> arithmetic (as is presently being done) is easier and clearer.
>>
>> Also the _LSBINDEX macro probably ought to be defined in mutex.cpp
>> rather than polluting the global namespace.  And technically, that
>> name is reserved word.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/objectMonitor.cpp
>>   252   void * cur = Atomic::cmpxchg((void*)Self, &_owner, (void*)NULL);
>>   409   if (Atomic::cmpxchg_if_null((void*)Self, &_owner)) {
>> 1983       ox = (Thread*)Atomic::cmpxchg((void*)Self, &_owner, 
>> (void*)NULL);
>>
>> I think the casts of Self aren't needed.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/objectMonitor.cpp
>>   995       if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) {
>> 1020         if (!Atomic::cmpxchg_if_null((void*)THREAD, &_owner)) {
>>
>> I think the casts of THREAD aren't needed.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/objectMonitor.hpp
>>   254   markOopDesc* volatile* header_addr();
>>
>> Why isn't this volatile markOop* ?
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/synchronizer.cpp
>>   242         Atomic::cmpxchg_if_null((void*)Self, &(m->_owner))) {
>>
>> I think the cast of Self isn't needed.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/synchronizer.cpp
>>   992   for (; block != NULL; block = (PaddedEnd<ObjectMonitor> 
>> *)next(block)) {
>> 1734     for (; block != NULL; block = (PaddedEnd<ObjectMonitor> 
>> *)next(block)) {
>>
>> [pre-existing]
>> All calls to next() pass a PaddedEnd<ObjectMonitor>* and cast the
>> result.  How about moving all that behavior into next().
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/synchronizer.cpp
>> 1970     if (monitor > (ObjectMonitor *)&block[0] &&
>> 1971         monitor < (ObjectMonitor *)&block[_BLOCKSIZE]) {
>>
>> [pre-existing]
>> Are the casts needed here?  I think PaddedEnd<ObjectMonitor> is
>> derived from ObjectMonitor, so implicit conversions should apply.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/synchronizer.hpp
>>    28 #include "memory/padded.hpp"
>>   163   static PaddedEnd<ObjectMonitor> * volatile gBlockList;
>>
>> I was going to suggest as an alternative just making gBlockList a file
>> scoped variable in synchronizer.cpp, since it isn't used outside of
>> that file. Except that it is referenced by vmStructs.  Curses!
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/runtime/thread.cpp
>> 4707   intptr_t w = Atomic::cmpxchg((intptr_t)LOCKBIT, Lock, 
>> (intptr_t)0);
>>
>> This and other places suggest LOCKBIT should be defined as intptr_t,
>> rather than as an enum value.  The MuxBits enum type is unused.
>>
>> And the cast of 0 is another case where implicit widening would be nice.
>>
>> ------------------------------------------------------------------------------ 
>>
>> src/hotspot/share/services/mallocSiteTable.cpp
>>   261 bool MallocSiteHashtableEntry::atomic_insert(const 
>> MallocSiteHashtableEntry* entry) {
>>   262   return Atomic::cmpxchg_if_null(entry, (const 
>> MallocSiteHashtableEntry**)&_next);
>>   263 }
>>
>> I think the problem here that is leading to the cast is that
>> atomic_insert is taking a const T*.  Note that it's only caller passes
>> a non-const T*.
>>
>> ------------------------------------------------------------------------------ 
>>
>>