RFR: 7143664: Clean up OrderAccess implementations and usage

Thu Feb 19 21:51:32 UTC 2015

On 2/12/15 4:51 PM, David Holmes wrote:
> Updated webrev:
>
> http://cr.openjdk.java.net/~dholmes/7143664/webrev.v3/

src/share/vm/runtime/orderAccess.hpp
   line 74: // used in such a pair, it is adviced to use a membar instead:
     Typo: 'adviced' -> 'advised'

   line 256: // the methods of its superclass using
     superclass? Perhaps subclass?

   line 318: // Give platforms a varation point to specialize.
     Typo: 'varation' -> 'variation'

src/share/vm/runtime/orderAccess.inline.hpp
   No comments.

src/cpu/x86/vm/c1_LIRAssembler_x86.cpp
   No comments.

src/os_cpu/bsd_x86/vm/orderAccess_bsd_x86.inline.hpp
   Do we know if the compiler_barrier() construct works with
   official MacOS X compilers we're using for JDK9?

   Update: I couldn't figure out if there was a clear
   answer about the MacOS X compilers.

src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp
   No comments.

src/os_cpu/solaris_x86/vm/orderAccess_solaris_x86.inline.hpp
   Do we know if the compiler_barrier() construct works with
   official Solaris compilers we're using for JDK9?

   Update: Based on other e-mails in this thread, it sounds
   like this has been tested.

src/os_cpu/solaris_x86/vm/solaris_x86_32.il
   No comments.

src/os_cpu/solaris_x86/vm/solaris_x86_64.il
   No comments.

src/os_cpu/windows_x86/vm/orderAccess_windows_x86.inline.hpp
   No comments.

src/os_cpu/bsd_zero/vm/orderAccess_bsd_zero.inline.hpp
   No comments.

src/os_cpu/linux_zero/vm/orderAccess_linux_zero.inline.hpp
   No comments.

src/os_cpu/linux_sparc/vm/orderAccess_linux_sparc.inline.hpp
   No comments.

src/os_cpu/solaris_sparc/vm/orderAccess_solaris_sparc.inline.hpp
   No comments.

src/os_cpu/solaris_sparc/vm/solaris_sparc.il
   No comments.

src/os_cpu/aix_ppc/vm/orderAccess_aix_ppc.inline.hpp
   No comments.

src/os_cpu/linux_ppc/vm/orderAccess_linux_ppc.inline.hpp
   No comments.

This all looks good and reads well to me. IMHO, the best example
of the value of the generalization is how much the SPARC files
have shrunk... :-)

Dan

>
> Incremental webrev from v2:
>
> http://cr.openjdk.java.net/~dholmes/7143664/webrev.v3-incr/
>
> One response inlined below ...
>
> On 13/02/2015 2:50 AM, Erik Österlund wrote:
>> Hi David,
>>
>> On 12/02/15 01:38, David Holmes wrote:
>>
>>>>> I still think we need to say something about C++ volatile and about
>>>>> compiler barriers in general.
>>>>
>>>> How about something like:
>>>>
>>>> C++ volatile semantics prevent compiler re-ordering between volatile
>>>> memory accesses. However, reordering between non-volatile and volatile
>>>> memory accesses is in general undefined. For compiler reordering
>>>> constraints taking non-volatile memory accesses into consideration, a
>>>> compiler barrier has to be used instead. Note also that both volatile
>>>> semantics and compiler barrier do not prevent hardware reordering.
>>>
>>> Sounds great! Though perhaps add "Some compiler implementations may
>>> choose to enforce additional constraints beyond those required by the
>>> language." ( a reference to MVSC)
>>
>> Sounds good, fixed!
>>
>>>> But since this is nowadays supported by solaris studio too, I joined
>>>> them into one. Not only because it's nicer to unify, but because I 
>>>> need
>>>> the compiler barrier constraints that inline asm can give us. I don't
>>>> know the compiler reordering constraints are guaranteed when calling
>>>> external assembly. So better make it explicit and safe.
>>>
>>> Totally fine with making use of Solaris Studio's modern capabilities,
>>> only minor concern is no way to test gcc on Solaris. But unless someone
>>> else screams about this I'm happy to assume that gcc will still be 
>>> okay.
>>
>> Since solaris studio is now actually using GCC syntax of inline asm
>> which worked well (on all other GCC variants as well), I just took for
>> granted that GCC would handle its own syntax on solaris too. ;)
>> But it never hurts to verify!
>>
>>>> How do you mean they are different? To me they look quite equivalent.
>>>> xchg is always implicitly locked on x86, and they both use xchg to get
>>>> the wanted fence semantics.
>>>
>>> Windows:
>>>
>>>      89 template<>
>>>      90 inline void OrderAccess::specialized_release_store_fence<jint>
>>> (volatile jint*   p, jint   v) {
>>>      91   __asm {
>>>      92     mov edx, p;
>>>      93     mov eax, v;
>>>      94     xchg eax, dword ptr [edx];
>>>      95   }
>>>
>>> Linux:
>>>
>>> ! template<>
>>> ! inline void OrderAccess::specialized_release_store_fence<jint>
>>> (volatile jint*   p, jint   v) {
>>>        __asm__ volatile (  "xchgl (%2),%0"
>>>                          : "=r" (v)
>>>                          : "0" (v), "r" (p)
>>>                          : "memory");
>>>      }
>>>
>>> but I'm guessing the MVSC inline assembler doesn't have the same
>>> capabilities as gcc hence we have to write the code to load the
>>> registers needed by the xchg.
>>
>> Yeah pretty sure the xchg instruction assumes operands are in certain
>> registers. GCC is clever enough to know the constraints of the operands
>> and I'm guessing MSVC inline assembly is not, so it's made explicit.
>> Although who knows, maybe it's smarter nowadays. Add that on the list of
>> things to find out about windows and modern compiler support! ;)
>>
>>>>> Not clear about your change to the float version of 
>>>>> release_store_fence:
>>>>> a) do we need to change it?
>>>>> b) given what we changed it to isn't that the default ?
>>>>
>>>> I didn't want to have default specializations of jfloat/jdouble
>>>> equalling jint/jlong. I thought maybe some architecture might want to
>>>> put it in different registers or something, making a default 
>>>> conversion
>>>> to int/long maybe undesirable. So the default for jfloat/jdouble is to
>>>> use acquire()/release() paired with an atomic memory access. So you 
>>>> have
>>>> to explicitly specialize them to be the same as jint/jlong if wanted,
>>>> which I believe is what is done in the code you refer to.
>>>
>>> Okay. It was the difference between the jfloat and jdouble versions in
>>> the original code that threw me. But the jdouble needs to delegate 
>>> so we
>>> end up doing an Atomic::load.
>>
>> jdouble and jlong is delegated to Atomic::* in orderAccess.inline.hpp as
>> they should be. It would perhaps be preferrable to eventually forward
>> all atomic accesses to Atomic since it's really a different concern and
>> would be good to have in one place. But the methods we need are not
>> there yet, so I elected not to for now.
>>
>> But yeah some jfloat vs jdouble stuff in the original code is very
>> strange and inconsistent. Notably, release_store_fence on linux_x86
>> properly uses compiler barriers for release for all variants except the
>> awkward jfloat, meaning it does not have the protection it needs. Looks
>> to me like this was a human mistake due to the high redundancy of the 
>> code.
>
> You mean this:
>
> OrderAccess::release_store_fence(volatile jfloat*  p, jfloat  v) {
>   *p = v; fence();
> }
>
> technically should be: compiler_barrier(); *p =v; fence();
>
> That's an artifact of the assumption that volatile writes are sequence 
> points and hence a compiler-barrier (something we now know is not the 
> case with respect to preceding non-volatile accesses).
>
> Cheers,
> David
>
>
>
>
>>>> If you think I'm paranoid about floats, I can of course make it a
>>>> default behaviour to use jint/jlong even if it isn't explicit, in the
>>>> same way I handled unsigned variants.
>>>
>>> No I think that is okay.
>>
>> Then we keep it that way. :)
>>
>>> If you can send me an incremental patch I will update the webrev.
>>
>> Will do, thanks for uploading!
>>
>> Thanks,
>> /Erik
>>