RFR: 7143664: Clean up OrderAccess implementations and usage

Sat Feb 21 01:08:46 UTC 2015

Hi Kim,

Thanks for the review! :)

In general: I might be a bit biased against the use of macros in general 
and avoid it whenever it can be avoided as you know. :)
To me it seems a bit over engineered and unnecessarily complex to reduce 
the already quite small (84 LoC) code size of OrderAccess.inline.hpp 
with macros. And for me it's unclear what the benefit of such code size 
decrease would be. A means to reduce complexity, increase 
understandability? Probably not. A means to increase maintainability? 
Perhaps possible but not sure!

But in matters where it's really up to personal taste, I'd like to not 
decide. Kim I will let you call the shots!

Comments:

On 20/02/15 23:41, Kim Barrett wrote:
>> http://cr.openjdk.java.net/~dholmes/7143664/webrev.v4/
>
> I haven't reviewed all of the changes, instead focusing on the
> structure and use of templates.  That generally looks fine; I have a
> few stylistic comments below, but nothing earth shaking.
>
> Some of the remaining comments, even if technically correct and
> acceptable, might not be appropriate for this change set, but should
> instead be dealt with in follow-on changes.
>
> ------------------------------------------------------------------------------
> Throughout:
>
> I would probably have used some simple macros to avoid near
> duplication of a lot of groups of related functions that differ only
> in a type parameter.
>
> The present approach of making the groups of related functions
> "obviously" similar by putting the entire definition on one line so
> there are several nearly identical lines in a row makes me look at
> each of the lines carefully to make sure the several parallel
> differences are what's expected.  It has the additional defect of
> involving very long lines, which makes them harder to review in
> side-by-side views.
>
> YMMV.

If I get you right, you would like to use macros to generate code lines 
that are now instead aligned but longish, in order to make the lines 
shorter?

Hmm, it would indeed become shorter lines then. I think it would make it 
harder to read and understand though as a consequence. Do you think it's 
worth macroing it up to reduce code size? I mean the motivation/point of 
reducing code size is often to decrease complexity, but I feel like here 
it would on the contrary induce unnecessary complexity just for the sake 
of having shorter lines. Hmm!

I don't know, what is the policy in general for sticking to 80 
characters BTW? Personally I was never a fan of this, especially not for 
C++ out of all languages because it sometimes requires long types, 
qualified scopes etc. All lines fit on my screen, so I never considered 
it a problem. I guess it's a matter of taste (and screen size haha). 
Personally I would rank readability in this order (which explains why it 
looks this way now):

1. Long aligned lines
2. Shorter split but not aligned lines
3. Macros

> ------------------------------------------------------------------------------
>
> The ordered access operations are defined on jbyte ... jlong.  The
> problem is that there is a type in that sequence that is the same size
> as two distinct C/C++ types.  And as a result, one of those C/C++
> types is missing atomic support.  This can lead to problems with types
> like size_t on some platforms
>
> I also don't see atomic support for char*.
>
> This might be something to consider for a separate change.

Sorry I don't think I understand what you mean here. :(
If I get you right, you want more responsibilities to move to Atomic 
regarding making atomic accesses? In that case I agree. :)

> ------------------------------------------------------------------------------
> An observation:
>
> If OrderAccess was a real namespace instead of a pseudo-namespace
> class, the specialized versions of the specialized_xxx functions could
> be ordinary function overloads rather than function template
> specializations.
>
> OTOH, namespaces don't provide private access control.  I often find
> myself wishing for such a feature.
>
> No actual change needed here, just pointing out an interesting
> variation from the approach taken.

The neat thing about template specializations that I wanted to exploit 
here is that the definition is the declaration. For overloads, you have 
to somehow declare the function overloads to be specialized somewhere 
(which would be different for every platform as there are different sets 
of specializations). I think it would have been a bit more awkward 
because of that. With the template specializations on the other hand, 
you just stick the definition in there and the compiler will 
automatically know it is available too without any need for a 
declaration of available template specializations.

I don't think you could do that with the namespace+overload approach, right?

> ------------------------------------------------------------------------------
> An observation:
>
> load_ptr_acquire isn't strictly necessary, and could be replaced by
> load_acquire with some additional work.  But it might be the "_ptr_"
> in the name is considered useful documentation.
>
> Given that, load_ptr_acquire can be generalized to support any type of
> pointer, not just intptr_t* and void*.  This would eliminate the need
> for casting result types in some places, and for some strange value
> types in others.  For example MetaspaceGC::_capacity_until_gc is
> declared intptr_t rather than size_t.
>
> This might be something to consider for a separate change.

Interesting you mention this! I had similar thoughts.

My analysis about _ptr_:

The _ptr_ is required because both some kind of pointer type (like 
void*) and also intptr_t was wanted. Now intptr_t will be the same as 
either jint or jlong, which already has overloads. Therefore, this would 
lead to compiler moaning about ambiguity. I have no problem with leaving 
it like that though.

My analysis about generic pointer types:

Ideally we would want generic/parameterized pointer types, like:
release_store(T *volatile *addr, T *value)
And it looks very strange that it is instead
release_store(volatile void *addr, void *value)

The reason it looks like this is because any pointer is implicitly 
casted to void* which has been abused all over the place. So you find 
actual usages like:
OrderAccess::release_store_ptr((Klass* volatile 
*)obj_at_addr_raw(which), k);

Here the user gave an exact pointer type. But there is no such overload, 
so it is implicitly casted to void* like any other pointer. The same 
does *not* work for void**, and that's why void* is there even though it 
seems very wrong.

I tried making a generalized pointer version in pretty much exactly the 
way you suggest using templates. And it didn't work, because since 
everything has been just implicitly casted to void* all over the place 
regardless of actual pointer type, the code is split between sometimes 
giving good exact pointer types and sometimes bad ones not following the 
intended pattern. So while this ideally would be nicer, I gave up on 
that for this change - we need to clean up the actual usage of 
OrderAccess too in order to do this. :(

> ------------------------------------------------------------------------------
> src/share/vm/runtime/orderAccess.hpp
>
> I don't see the point of having ScopedFence derive from
> ScopedFenceGeneral.  It seems to me that ScopedFenceGeneral could
> derive from AllStatic (with corresponding changes to functions) and
> ScopedFence could instead derive from StackObj.

It does have a point to me. The reason is simple: ScopedFence is-a 
ScopedFenceGeneral - and that's the point. Whether the inheritance is 
technically needed or could be removed is IMO not important. Because 
it's more intuitive for my understanding that if it's used like 
inheritance and ScopedFence is-a ScopedFenceGeneral, then they should be 
related by inheritance whether required for the code to compile or not.

It's like AllStatic classes with inheritance: they might not technically 
need the inheritance because calls to super are qualified anyway, but 
they should use inheritance anyway if they are conceptually related by 
inheritance IMO.

If A is-a B, then A should inherit from B.

Would you agree with this?

> ------------------------------------------------------------------------------
> src/share/vm/runtime/orderAccess.hpp
>
> Why are ScopedFenceType, ScopedFenceGeneral, and ScopedFence defined
> at global scope rather than nested in OrderAccess?

I actually had them nested at first. But eh, the lines in definitions 
became pretty long then because of the long nested names and I thought 
I'd make the lines a bit tighter. I think I just contradicted myself 
haha! Anyway, if you prefer them to be nested I'm fine with that: I do 
not feel strongly about this. :)

> ------------------------------------------------------------------------------
> src/share/vm/runtime/orderAccess.hpp
>
> For all load_acquire variants, why (volatile T* p) rather than
> (const volatile* p)?

Did you mean (const T * volatile *p)? I assume so. And yes I had the 
same question. And like before, I tried it and ran into a lot of 
compiler errors because of implicit void* casts abused and incompatible 
with void** throughout hotspot, and gave up on it. ;)

> ------------------------------------------------------------------------------
> src/share/vm/runtime/orderAccess.inline.hpp
>    75 #ifdef VM_HAS_GENERALIZED_ORDER_ACCESS
>
> I'm not sure what the point of this is.  Is this just to allow some
> (perhaps third-party?) platforms to continue to use an old
> implementation?
>
> Or is it for incremental development of this set of changes?
>
> Is the plan that it will eventually go away?

Yeah it's only here while adapting other platforms and smoothly 
transitioning to the new architecture. Then it will be removed (together 
with the store_fence declarations now made private meanhile).

> ------------------------------------------------------------------------------
> src/share/vm/runtime/orderAccess.inline.hpp
>
> I think there needs to be some documentation here about which
> specialized_xxx functions can/should be considered for specialization.

Yes in OrderAccess.inline.hpp I wrote next to the OrderAccess::store 
definitions:
// Generalized atomic volatile accesses valid in OrderAccess
// All other types can be expressed in terms of these.
inline void OrderAccess::store(volatile jbyte*   p, jbyte   v) { *p = v; }
inline void OrderAccess::store(volatile jshort*  p, jshort  v) { *p = v; }
inline void OrderAccess::store(volatile jint*    p, jint    v) { *p = v; }
inline void OrderAccess::store(volatile jlong*   p, jlong   v) { 
Atomic::store(v, p); }
inline void OrderAccess::store(volatile jdouble* p, jdouble v) { 
Atomic::store(jlong_cast(v), (volatile jlong*)p); }
inline void OrderAccess::store(volatile jfloat*  p, jfloat  v) { *p = v; }

...and these are the exact types you can specialize.

If you do not think this is clear enough or would like to move the 
comment somewhere else I'm fine with that. What do you think?

> ------------------------------------------------------------------------------
> src/share/vm/runtime/orderAccess.inline.hpp
>   138 // The following methods can be specialized using simple template specialization
>   139 // in the platform specific files for optimization purposes. Otherwise the
>   140 // generalized variant is used.
>   141 template<typename T> inline T    OrderAccess::specialized_load_acquire       (volatile T* p)       { return ordered_load<T, X_ACQUIRE>(p);    }
>   142 template<typename T> inline void OrderAccess::specialized_release_store      (volatile T* p, T v)  { ordered_store<T, RELEASE_X>(p, v);       }
>   143 template<typename T> inline void OrderAccess::specialized_release_store_fence(volatile T* p, T v)  { ordered_store<T, RELEASE_X_FENCE>(p, v); }
>
> I *think* it doesn't matter in practice, but I would put these
> definitions before the potential instantiations.  The relevant
> declarations are obviously in scope where those instantiations occur,
> and everything is declared inline, but I worry that some
> insufficiently clever compiler might end up generating out of line
> code in some cases because the definition was after use.

As your initial intuition said - it doesn't matter in practice: inline 
function definitions are not instantiations. Instantiation will happen 
when the user actually calls e.g. OrderAccess::load_acquire, at which 
point all template definitions/inline definitions are known and used for 
instantiating them. Why? Because inlined functions are instantiated when 
called, not when defined.

So it won't be a problem. I'm pretty sure any compiler not doing this 
wouldn't work. ;)

> ------------------------------------------------------------------------------
> src/share/vm/runtime/orderAccess.inline.hpp
>   101 inline jubyte   OrderAccess::load_acquire(volatile jubyte*  p) { return (jubyte) specialized_load_acquire((volatile jbyte*)p);  }
>   102 inline jushort  OrderAccess::load_acquire(volatile jushort* p) { return (jushort)specialized_load_acquire((volatile jshort*)p); }
>   103 inline juint    OrderAccess::load_acquire(volatile juint*   p) { return (juint)  specialized_load_acquire((volatile jint*)p);   }
>   104 inline julong   OrderAccess::load_acquire(volatile julong*  p) { return (julong) specialized_load_acquire((volatile jlong*)p);  }
>
> I would probably define these as calling load_acquire with casts,
> rather than directly calling specialized_load_acquire.  Similarly for
> store_release and load_ptr_acquire variants.  This would make it
> easier to fiddle with the how the specializations are implemented, and
> shorten the line lengths by a dozen characters.

I agree it would undoubtably be 12 characters smaller, but consistency 
seems more important to me. I think it's nice they all call the same 
function with different types instead of introducing exceptions to the 
rule in order to save 12 characters of line width for some specific 
overloads. Would you agree?

> I would probably also have used some simple macrology to avoid near
> duplication of code.

Again, not convinced macros would improve readability/understandability 
here as argued before. ;)

> ------------------------------------------------------------------------------
> src/os_cpu/linux_ppc/vm/orderAccess_linux_ppc.inline.hpp
>
> I see no uses of the inlasm_eieio macro.
> I see no uses of the inlasm_isync macro.

No I thought they were there as available tools for the future if they 
will then be wanted as well as to comprehensively list the available 
fences. I don't think eieio was ever used. If you want me to though, I 
will remove them.

Many thanks for reviewing these changes Kim! :)

Cheers,
/Erik