Using C++11+ in hotspot

Tue Aug 7 18:36:40 UTC 2018

In utilities/copy.[ch]pp there’s Copy::conjoint_copy and its close friends, which do support different element sizes and which promise not to tear the words/elements (if the underlying implementation doesn’t do the right thing, it needs to be fixed). They don’t currently allow for configuring/customizing memory ordering requirements, though, and if “extreme” performance is required there may well be some additional specialization needed as well.
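
To make the memory-ordering point concrete, here is a minimal sketch of what an element-wise, non-tearing copy with configurable ordering might look like. It is not the existing Copy API: the name conjoint_copy_atomic and its template parameters are hypothetical, and it assumes the GCC/Clang __atomic builtins and naturally aligned integral or pointer elements no wider than a machine word.

```cpp
#include <cstddef>
#include <cstdint>

// Hypothetical helper, NOT part of HotSpot's Copy class.
// Copies `count` elements one at a time with atomic loads and stores so that
// no individual element is torn. The two ranges may overlap (conjoint copy).
// LoadOrder and StoreOrder are GCC/Clang __ATOMIC_* constants (assumption:
// the compiler provides the __atomic builtins).
template <typename T,
          int LoadOrder = __ATOMIC_RELAXED,
          int StoreOrder = __ATOMIC_RELAXED>
inline void conjoint_copy_atomic(const T* from, T* to, size_t count) {
  static_assert(sizeof(T) <= sizeof(void*),
                "element must fit in a machine word");
  if (reinterpret_cast<uintptr_t>(from) > reinterpret_cast<uintptr_t>(to)) {
    // Destination starts below the source: copy in ascending order so that
    // source elements are read before they can be overwritten.
    for (size_t i = 0; i < count; i++) {
      T val = __atomic_load_n(from + i, LoadOrder);
      __atomic_store_n(to + i, val, StoreOrder);
    }
  } else {
    // Destination starts at or above the source: copy in descending order.
    for (size_t i = count; i-- > 0; ) {
      T val = __atomic_load_n(from + i, LoadOrder);
      __atomic_store_n(to + i, val, StoreOrder);
    }
  }
}
```

A caller that needs stronger ordering could instantiate it as conjoint_copy_atomic<uint64_t, __ATOMIC_ACQUIRE, __ATOMIC_RELEASE>(src, dst, n); the defaults give relaxed, per-element atomicity only. Whether a loop like this can compete with a platform-specific bulk copy is a separate question.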

Cheers,
Mikael

> On Aug 6, 2018, at 11:26 PM, Martin Buchholz <martinrb at google.com> wrote:
> 
> 
> 
> On Mon, Aug 6, 2018 at 11:12 AM, John Rose <john.r.rose at oracle.com> wrote:
> On Aug 5, 2018, at 8:30 AM, Martin Buchholz <martinrb at google.com> wrote:
>> 
>> Thanks to whoever added the comment long ago.
> 
> FTR I think it was Steffen Grarup.  We were just learning about MT safety at the time.
> The copy conjoint/disjoint APIs were not yet in existence.  I think they came around
> 2003, and Paul Hohensee's name is all over the SCCS history there.
> 
> s 00008/00002/00762
> d D 1.147 99/02/17 10:14:36 steffen 235 233
> …
> I 235
> static inline void copy_table(address* from, address* to, int size) {
>   // Copy non-overlapping tables. The copy has to occur word wise for MT safety.
>   while (size-- > 0) *to++ = *from++;
> }
> 
> Today, that loop should be recoded to use copy, and copy in turn needs to
> do whatever magic is required to force word-atomic access on non-atomic data.
> 
> 
> That loop copies address*, while pd_disjoint_words_atomic copies HeapWord, so these are not compatible out of the box.
> 
> We could have atomic relaxed copies like below.  Using compiler builtins also avoids the problem of the underlying type not being declared atomic<>, and is ISA-independent.  OTOH maybe we always want that loop compiled to REP MOVSQ on x64.
> 
> 
> template <typename T>
> static ALWAYSINLINE void copy_atomic_relaxed(const T* from, T* to) {
>   T val;
>   __atomic_load(from, &val, __ATOMIC_RELAXED);
>   __atomic_store(to, &val, __ATOMIC_RELAXED);
> }
> 
> static void pd_disjoint_words_atomic(const HeapWord* from, HeapWord* to, size_t count) {
> #ifdef AMD64
>   // Each case intentionally falls through, so exactly `count` words are copied.
>   switch (count) {
>   case 8:  copy_atomic_relaxed(from + 7, to + 7);
>   case 7:  copy_atomic_relaxed(from + 6, to + 6);
>   case 6:  copy_atomic_relaxed(from + 5, to + 5);
>   case 5:  copy_atomic_relaxed(from + 4, to + 4);
>   case 4:  copy_atomic_relaxed(from + 3, to + 3);
>   case 3:  copy_atomic_relaxed(from + 2, to + 2);
>   case 2:  copy_atomic_relaxed(from + 1, to + 1);
>   case 1:  copy_atomic_relaxed(from + 0, to + 0);
>   case 0:  break;
>   default:
>     while (count-- > 0) {
>       copy_atomic_relaxed(from++, to++);
>     }
>     break;
>   }