RFR: 7143664: Clean up OrderAccess implementations and usage

David Holmes david.holmes at oracle.com
Sun Feb 22 04:51:46 UTC 2015


My 2c as the sponsor here. If we get into matters of style then we can 
debate endlessly, so let's not do that :) (or at least let's not have that 
hold up getting this pushed through - we can debate and refine later if 
needed). As far as macros are concerned I'd need to see a concrete 
example of what is proposed to evaluate its utility with regard to 
understandability or maintainability - but as a template novice I find 
the current structure quite readable and understandable.

Re Atomics ... seems to be a separate issue. For a while people 
overlooked the fact that 64-bit loads/stores needed special handling on 
32-bit systems. Hence we need specific Atomic handling for jlong and 
jdouble. For all other types (regardless of whether there is an 
Atomic::store variant) we already assume accesses are atomic (which may 
only be true if alignment constraints are met).
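
To illustrate (a sketch only, not from this webrev): on a 32-bit platform
a plain volatile store of a 64-bit value can tear, which is why the
jlong/jdouble paths go through Atomic while the narrower types can rely
on a plain aligned access:

  // Sketch, assuming a 32-bit platform; field and function names made up.
  volatile jlong _counter;

  void update(jlong v) {
    // _counter = v;              // may be done as two 32-bit stores -> can tear
    Atomic::store(v, &_counter);  // 32-bit ports can emit a single atomic
                                  // 64-bit store here
  }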

Re:
> ------------------------------------------------------------------------------
> src/os_cpu/linux_ppc/vm/orderAccess_linux_ppc.inline.hpp
>
> I see no uses of the inlasm_eieio macro.
> I see no uses of the inlasm_isync macro.

This was put there by the ppc64 folk. It is up to them whether it is 
removed or left in place for potential future use.

Thanks,
David

On 21/02/2015 11:08 AM, Erik Österlund wrote:
> Hi Kim,
>
> Thanks for the review! :)
>
> In general: as you know, I might be a bit biased against the use of
> macros and avoid them whenever possible. :)
> To me it seems a bit over-engineered and unnecessarily complex to reduce
> the already quite small (84 LoC) code size of orderAccess.inline.hpp
> with macros. And it's unclear to me what the benefit of such a code-size
> reduction would be. A means to reduce complexity, increase
> understandability? Probably not. A means to increase maintainability?
> Perhaps, but I'm not sure!
>
> But in matters where it's really up to personal taste, I'd like to not
> decide. Kim I will let you call the shots!
>
> Comments:
>
> On 20/02/15 23:41, Kim Barrett wrote:
>>> http://cr.openjdk.java.net/~dholmes/7143664/webrev.v4/
>>
>> I haven't reviewed all of the changes, instead focusing on the
>> structure and use of templates.  That generally looks fine; I have a
>> few stylistic comments below, but nothing earth shaking.
>>
>> Some of the remaining comments, even if technically correct and
>> acceptable, might not be appropriate for this change set, but should
>> instead be dealt with in follow-on changes.
>>
>> ------------------------------------------------------------------------------
>> Throughout:
>>
>> I would probably have used some simple macros to avoid near
>> duplication of a lot of groups of related functions that differ only
>> in a type parameter.
>>
>> The present approach of making the groups of related functions
>> "obviously" similar by putting the entire definition on one line so
>> there are several nearly identical lines in a row makes me look at
>> each of the lines carefully to make sure the several parallel
>> differences are what's expected.  It has the additional defect of
>> involving very long lines, which makes them harder to review in
>> side-by-side views.
>>
>> YMMV.
>
> If I understand you correctly, you would like to use macros to generate
> the lines that are currently long but aligned, in order to make them
> shorter?
>
> Hmm, the lines would indeed become shorter then, but I think they would
> become harder to read and understand as a consequence. Do you think it's
> worth macroing it up to reduce code size? The point of reducing code
> size is usually to decrease complexity, but here I feel it would on the
> contrary introduce unnecessary complexity just for the sake of shorter
> lines. Hmm!
>
> I don't know, what is the policy in general for sticking to 80
> characters BTW? Personally I was never a fan of this, especially not for
> C++ out of all languages because it sometimes requires long types,
> qualified scopes etc. All lines fit on my screen, so I never considered
> it a problem. I guess it's a matter of taste (and screen size haha).
> Personally I would rank readability in this order (which explains why it
> looks this way now):
>
> 1. Long aligned lines
> 2. Shorter split but not aligned lines
> 3. Macros
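>
> Just so we are comparing the same thing - I assume the macro variant you
> have in mind would look roughly like this (hypothetical sketch, not from
> the webrev, names made up):
>
>   #define DEFINE_RELEASE_STORE(Type)                                  \
>     inline void OrderAccess::release_store(volatile Type* p, Type v)  \
>     { specialized_release_store(p, v); }
>
>   DEFINE_RELEASE_STORE(jbyte)
>   DEFINE_RELEASE_STORE(jshort)
>   DEFINE_RELEASE_STORE(jint)
>   DEFINE_RELEASE_STORE(jlong)
>
>   #undef DEFINE_RELEASE_STORE
>
> ...and correspondingly for the other operations.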
>
>> ------------------------------------------------------------------------------
>>
>> The ordered access operations are defined on jbyte ... jlong.  The
>> problem is that there is a type in that sequence that is the same size
>> as two distinct C/C++ types.  And as a result, one of those C/C++
>> types is missing atomic support.  This can lead to problems with types
>> like size_t on some platforms.
>>
>> I also don't see atomic support for char*.
>>
>> This might be something to consider for a separate change.
>
> Sorry, I don't think I fully understand what you mean here. :(
> If I get you right, you want more of the responsibility for making
> accesses atomic to move to Atomic? In that case I agree. :)
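>
> Or are you referring to call sites like this one? (Made-up example,
> just my guess at what you mean.)
>
>   // size_t has no overload of its own, so callers have to cast to a
>   // same-sized j-type:
>   volatile size_t _used_bytes;
>   size_t used = (size_t)OrderAccess::load_acquire((volatile julong*)&_used_bytes);
>
> If so, I agree that is better handled as a separate change.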
>
>> ------------------------------------------------------------------------------
>> An observation:
>>
>> If OrderAccess was a real namespace instead of a pseudo-namespace
>> class, the specialized versions of the specialized_xxx functions could
>> be ordinary function overloads rather than function template
>> specializations.
>>
>> OTOH, namespaces don't provide private access control.  I often find
>> myself wishing for such a feature.
>>
>> No actual change needed here, just pointing out an interesting
>> variation from the approach taken.
>
> The neat thing about template specializations that I wanted to exploit
> here is that the definition is the declaration. With overloads, you
> would have to declare the set of specialized overloads somewhere (and
> that set differs for every platform). I think it would have been a bit
> more awkward because of that. With template specializations, on the
> other hand, you just stick the definition in the platform file and the
> compiler automatically knows it is available, without any separate
> declaration of the available specializations.
>
> I don't think you could do that with the namespace+overload approach, right?
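>
> To be concrete, with specializations a platform file only needs the
> definition itself; something like this (hypothetical example, just to
> illustrate the mechanics):
>
>   // In some platform file; the definition doubles as the declaration,
>   // nothing needs to be declared up front:
>   template<>
>   inline jint OrderAccess::specialized_load_acquire(volatile jint* p) {
>     jint v = *p;
>     // ...platform-specific acquire barrier would go here...
>     return v;
>   }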
>
>> ------------------------------------------------------------------------------
>> An observation:
>>
>> load_ptr_acquire isn't strictly necessary, and could be replaced by
>> load_acquire with some additional work.  But it might be that the
>> "_ptr_" in the name is considered useful documentation.
>>
>> Given that, load_ptr_acquire can be generalized to support any type of
>> pointer, not just intptr_t* and void*.  This would eliminate the need
>> for casting result types in some places, and for some strange value
>> types in others.  For example MetaspaceGC::_capacity_until_gc is
>> declared intptr_t rather than size_t.
>>
>> This might be something to consider for a separate change.
>
> Interesting you mention this! I had similar thoughts.
>
> My analysis about _ptr_:
>
> The _ptr_ suffix is required because both some kind of pointer type
> (like void*) and intptr_t were wanted. Now intptr_t is the same type as
> either jint or jlong, which already have overloads, so dropping the
> suffix would lead to compiler moaning about ambiguity. I have no problem
> with leaving it like that though.
>
> My analysis about generic pointer types:
>
> Ideally we would want generic/parameterized pointer types, like:
> release_store(T *volatile *addr, T *value)
> And it looks very strange that it is instead
> release_store(volatile void *addr, void *value)
>
> The reason it looks like this is that any pointer is implicitly
> converted to void*, which has been abused all over the place. So you
> find actual usages like:
> OrderAccess::release_store_ptr((Klass* volatile*)obj_at_addr_raw(which), k);
>
> Here the user gave an exact pointer type. But there is no such overload,
> so it is implicitly converted to void* like any other pointer. The same
> does *not* work for void**, and that's why void* is there even though it
> seems very wrong.
>
> I tried making a generalized pointer version in pretty much exactly the
> way you suggest, using templates. It didn't work: because everything has
> just been implicitly converted to void* all over the place regardless of
> the actual pointer type, the existing call sites sometimes pass exact
> pointer types and sometimes ones that don't follow the intended pattern.
> So while this would ideally be nicer, I gave up on it for this change -
> we need to clean up the actual usage of OrderAccess too before we can
> do this. :(
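>
> For reference, what I tried looked roughly like this (sketch from
> memory):
>
>   template <typename T>
>   inline void OrderAccess::release_store_ptr(T* volatile* p, T* v) {
>     release_store((volatile intptr_t*)p, (intptr_t)v);
>   }
>
> which works for call sites that pass e.g. a Klass* volatile*, but not
> for the ones that have already cast the address down to void*.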
>
>> ------------------------------------------------------------------------------
>> src/share/vm/runtime/orderAccess.hpp
>>
>> I don't see the point of having ScopedFence derive from
>> ScopedFenceGeneral.  It seems to me that ScopedFenceGeneral could
>> derive from AllStatic (with corresponding changes to functions) and
>> ScopedFence could instead derive from StackObj.
>
> It does have a point to me. The reason is simple: ScopedFence is-a
> ScopedFenceGeneral - and that's the point. Whether the inheritance is
> technically needed or could be removed is IMO not important. To my mind,
> if the relationship is used like inheritance and ScopedFence is-a
> ScopedFenceGeneral, then they should be related by inheritance whether
> or not that is required for the code to compile.
>
> It's like AllStatic classes with inheritance: they might not technically
> need to inherit, because calls to the super class are qualified anyway,
> but they should still use inheritance if they are conceptually related
> that way, IMO.
>
> If A is-a B, then A should inherit from B.
>
> Would you agree with this?
>
>> ------------------------------------------------------------------------------
>> src/share/vm/runtime/orderAccess.hpp
>>
>> Why are ScopedFenceType, ScopedFenceGeneral, and ScopedFence defined
>> at global scope rather than nested in OrderAccess?
>
> I actually had them nested at first. But the lines in the definitions
> became pretty long because of the long nested names, so I thought I'd
> make them a bit tighter. I think I just contradicted myself haha!
> Anyway, if you prefer them to be nested I'm fine with that: I do not
> feel strongly about this. :)
>
>> ------------------------------------------------------------------------------
>> src/share/vm/runtime/orderAccess.hpp
>>
>> For all load_acquire variants, why (volatile T* p) rather than
>> (const volatile* p)?
>
> Did you mean (const T * volatile *p)? I assume so. And yes, I had the
> same question. Like before, I tried it and ran into a lot of compiler
> errors, because implicit void* casts are abused throughout hotspot and
> are incompatible with void**, so I gave up on it. ;)
>
>> ------------------------------------------------------------------------------
>> src/share/vm/runtime/orderAccess.inline.hpp
>>     75 #ifdef VM_HAS_GENERALIZED_ORDER_ACCESS
>>
>> I'm not sure what the point of this is.  Is this just to allow some
>> (perhaps third-party?) platforms to continue to use an old
>> implementation?
>>
>> Or is it for incremental development of this set of changes?
>>
>> Is the plan that it will eventually go away?
>
> Yeah, it's only here while adapting other platforms and smoothly
> transitioning to the new architecture. Then it will be removed (together
> with the store_fence declarations that have been made private in the
> meantime).
>
>> ------------------------------------------------------------------------------
>> src/share/vm/runtime/orderAccess.inline.hpp
>>
>> I think there needs to be some documentation here about which
>> specialized_xxx functions can/should be considered for specialization.
>
> Yes, in orderAccess.inline.hpp I wrote next to the OrderAccess::store
> definitions:
> // Generalized atomic volatile accesses valid in OrderAccess
> // All other types can be expressed in terms of these.
> inline void OrderAccess::store(volatile jbyte*   p, jbyte   v) { *p = v; }
> inline void OrderAccess::store(volatile jshort*  p, jshort  v) { *p = v; }
> inline void OrderAccess::store(volatile jint*    p, jint    v) { *p = v; }
> inline void OrderAccess::store(volatile jlong*   p, jlong   v) { Atomic::store(v, p); }
> inline void OrderAccess::store(volatile jdouble* p, jdouble v) { Atomic::store(jlong_cast(v), (volatile jlong*)p); }
> inline void OrderAccess::store(volatile jfloat*  p, jfloat  v) { *p = v; }
>
> ...and these are the exact types you can specialize.
>
> If you do not think this is clear enough or would like to move the
> comment somewhere else I'm fine with that. What do you think?
>
>> ------------------------------------------------------------------------------
>> src/share/vm/runtime/orderAccess.inline.hpp
>>    138 // The following methods can be specialized using simple template specialization
>>    139 // in the platform specific files for optimization purposes. Otherwise the
>>    140 // generalized variant is used.
>>    141 template<typename T> inline T    OrderAccess::specialized_load_acquire       (volatile T* p)       { return ordered_load<T, X_ACQUIRE>(p);    }
>>    142 template<typename T> inline void OrderAccess::specialized_release_store      (volatile T* p, T v)  { ordered_store<T, RELEASE_X>(p, v);       }
>>    143 template<typename T> inline void OrderAccess::specialized_release_store_fence(volatile T* p, T v)  { ordered_store<T, RELEASE_X_FENCE>(p, v); }
>>
>> I *think* it doesn't matter in practice, but I would put these
>> definitions before the potential instantiations.  The relevant
>> declarations are obviously in scope where those instantiations occur,
>> and everything is declared inline, but I worry that some
>> insufficiently clever compiler might end up generating out of line
>> code in some cases because the definition was after use.
>
> As your initial intuition said - it doesn't matter in practice: inline
> function definitions are not instantiations. Instantiation happens when
> the user actually calls e.g. OrderAccess::load_acquire, at which point
> all template and inline definitions are known and used to instantiate
> it, because inline functions are instantiated when called, not when
> defined.
>
> So it won't be a problem. I'm pretty sure any compiler not doing this
> wouldn't work. ;)
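>
> E.g. (made-up usage):
>
>   // Instantiation of the load_acquire/specialized_load_acquire templates
>   // happens at this call, at which point both the generic definitions
>   // and any platform specialization are visible, regardless of the order
>   // in which they appeared in the headers:
>   jint v = OrderAccess::load_acquire(&_some_flag);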
>
>> ------------------------------------------------------------------------------
>> src/share/vm/runtime/orderAccess.inline.hpp
>>    101 inline jubyte   OrderAccess::load_acquire(volatile jubyte*  p) { return (jubyte) specialized_load_acquire((volatile jbyte*)p);  }
>>    102 inline jushort  OrderAccess::load_acquire(volatile jushort* p) { return (jushort)specialized_load_acquire((volatile jshort*)p); }
>>    103 inline juint    OrderAccess::load_acquire(volatile juint*   p) { return (juint)  specialized_load_acquire((volatile jint*)p);   }
>>    104 inline julong   OrderAccess::load_acquire(volatile julong*  p) { return (julong) specialized_load_acquire((volatile jlong*)p);  }
>>
>> I would probably define these as calling load_acquire with casts,
>> rather than directly calling specialized_load_acquire.  Similarly for
>> store_release and load_ptr_acquire variants.  This would make it
>> easier to fiddle with the how the specializations are implemented, and
>> shorten the line lengths by a dozen characters.
>
> I agree it would undoubtedly be 12 characters shorter, but consistency
> seems more important to me. I think it's nice that they all call the
> same function with different types, instead of introducing exceptions to
> the rule in order to save 12 characters of line width for some specific
> overloads. Would you agree?
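>
> (Just to spell out the alternative as I understand it, it would be e.g.
>
>   inline jubyte OrderAccess::load_acquire(volatile jubyte* p) {
>     return (jubyte)load_acquire((volatile jbyte*)p);
>   }
>
> i.e. the unsigned variants bouncing through the signed public overloads
> instead of calling specialized_load_acquire directly.)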
>
>> I would probably also have used some simple macrology to avoid near
>> duplication of code.
>
> Again, not convinced macros would improve readability/understandability
> here as argued before. ;)
>
>> ------------------------------------------------------------------------------
>> src/os_cpu/linux_ppc/vm/orderAccess_linux_ppc.inline.hpp
>>
>> I see no uses of the inlasm_eieio macro.
>> I see no uses of the inlasm_isync macro.
>
> No, I thought they were there as tools available for future use if
> wanted, as well as to comprehensively list the available fences. I don't
> think eieio was ever used. If you want me to, though, I will remove
> them.
>
> Many thanks for reviewing these changes Kim! :)
>
> Cheers,
> /Erik
>

