RFR: 7143664: Clean up OrderAccess implementations and usage

Thu Feb 19 23:26:13 UTC 2015

Hi Daniel,

On 19/02/15 22:51, Daniel D. Daugherty wrote:
> On 2/12/15 4:51 PM, David Holmes wrote:
>> Updated webrev:
>>
>> http://cr.openjdk.java.net/~dholmes/7143664/webrev.v3/
>
> src/share/vm/runtime/orderAccess.hpp
>     line 74: // used in such a pair, it is adviced to use a membar instead:
>       Typo: 'adviced' -> 'advised'

Fixed.

>     line 256: // the methods of its superclass using
>       superclass? Perhaps subclass?

Actually, this comment is a relic from when I first implemented 
generalization using inheritance. I later found it was unnecessary and a 
bit overengineered, and removed the unnecessary superclasses - but the 
comment apparently stuck, sorry about that.
Thank you for spotting this! I will remove this comment.

>     line 318: // Give platforms a varation point to specialize.
>       Typo: 'varation' -> 'variation'

Fixed.

> src/share/vm/runtime/orderAccess.inline.hpp
>     No comments.
>
>
> src/cpu/x86/vm/c1_LIRAssembler_x86.cpp
>     No comments.
>
> src/os_cpu/bsd_x86/vm/orderAccess_bsd_x86.inline.hpp
>     Do we know if the compiler_barrier() construct works with
>     official MacOS X compilers we're using for JDK9?
>
>     Update: I couldn't figure out if there was a clear
>     answer about the MacOS X compilers.

It works.

I've been using compiler barriers in my own code (and primarily use Mac 
OS X) since jdk8 where I used that certain old horrible apple patch of 
gcc42. And I can confirm that it did save me from reorderings where 
OrderAccess was not enough (my code is pretty sensitive to reorderings 
he he).
Now I use clang for jdk9 and it works fine too.

> src/os_cpu/linux_x86/vm/orderAccess_linux_x86.inline.hpp
>     No comments.
>
> src/os_cpu/solaris_x86/vm/orderAccess_solaris_x86.inline.hpp
>     Do we know if the compiler_barrier() construct works with
>     official Solaris compilers we're using for JDK9?
>
>     Update: Based on other e-mails in this thread, it sounds
>     like this has been tested.

Yes this has been tested, except the GCC-variant of Solaris if memory 
serves right. I think I recall David said Oracle doesn't support GCC on 
solaris x86, correct me if I'm wrong.

Background: Solaris studio previously did not support GCC-style inline 
assembly syntax so there was a GCC variant with inline assembly and a 
solaris variant with .il-files. I found out solaris studio now supports 
gcc-style inline asm (with explicit compiler barriers that I need). 
Therefore I joined them to use the same GCC-style inline assembly code 
both when using solaris studio and GCC. And it would be very strange if 
the GCC-style inline asm worked on solaris studio but not GCC. But as I 
said then, if anyone cares to try it, I guess it would be good.

> src/os_cpu/solaris_x86/vm/solaris_x86_32.il
>     No comments.
>
> src/os_cpu/solaris_x86/vm/solaris_x86_64.il
>     No comments.
>
> src/os_cpu/windows_x86/vm/orderAccess_windows_x86.inline.hpp
>     No comments.
>
>
> src/os_cpu/bsd_zero/vm/orderAccess_bsd_zero.inline.hpp
>     No comments.
>
> src/os_cpu/linux_zero/vm/orderAccess_linux_zero.inline.hpp
>     No comments.
>
>
> src/os_cpu/linux_sparc/vm/orderAccess_linux_sparc.inline.hpp
>     No comments.
>
> src/os_cpu/solaris_sparc/vm/orderAccess_solaris_sparc.inline.hpp
>     No comments.
>
> src/os_cpu/solaris_sparc/vm/solaris_sparc.il
>     No comments.
>
>
> src/os_cpu/aix_ppc/vm/orderAccess_aix_ppc.inline.hpp
>     No comments.
>
> src/os_cpu/linux_ppc/vm/orderAccess_linux_ppc.inline.hpp
>     No comments.
>
>
> This all looks good and reads well to me. IMHO, the best example
> of the value of the generalization is how much the SPARC files
> have shrunk... :-)

Yeah and the zero files. There is almost zero code left. (he-he)

Thanks a lot Daniel for reviewing my changes! :)

/Erik

>
> Dan
>
>
>>
>> Incremental webrev from v2:
>>
>> http://cr.openjdk.java.net/~dholmes/7143664/webrev.v3-incr/
>>
>> One response inlined below ...
>>
>> On 13/02/2015 2:50 AM, Erik Österlund wrote:
>>> Hi David,
>>>
>>> On 12/02/15 01:38, David Holmes wrote:
>>>
>>>>>> I still think we need to say something about C++ volatile and about
>>>>>> compiler barriers in general.
>>>>>
>>>>> How about something like:
>>>>>
>>>>> C++ volatile semantics prevent compiler re-ordering between volatile
>>>>> memory accesses. However, reordering between non-volatile and volatile
>>>>> memory accesses is in general undefined. For compiler reordering
>>>>> constraints taking non-volatile memory accesses into consideration, a
>>>>> compiler barrier has to be used instead. Note also that both volatile
>>>>> semantics and compiler barrier do not prevent hardware reordering.
>>>>
>>>> Sounds great! Though perhaps add "Some compiler implementations may
>>>> choose to enforce additional constraints beyond those required by the
>>>> language." ( a reference to MVSC)
>>>
>>> Sounds good, fixed!
>>>
>>>>> But since this is nowadays supported by solaris studio too, I joined
>>>>> them into one. Not only because it's nicer to unify, but because I
>>>>> need
>>>>> the compiler barrier constraints that inline asm can give us. I don't
>>>>> know the compiler reordering constraints are guaranteed when calling
>>>>> external assembly. So better make it explicit and safe.
>>>>
>>>> Totally fine with making use of Solaris Studio's modern capabilities,
>>>> only minor concern is no way to test gcc on Solaris. But unless someone
>>>> else screams about this I'm happy to assume that gcc will still be
>>>> okay.
>>>
>>> Since solaris studio is now actually using GCC syntax of inline asm
>>> which worked well (on all other GCC variants as well), I just took for
>>> granted that GCC would handle its own syntax on solaris too. ;)
>>> But it never hurts to verify!
>>>
>>>>> How do you mean they are different? To me they look quite equivalent.
>>>>> xchg is always implicitly locked on x86, and they both use xchg to get
>>>>> the wanted fence semantics.
>>>>
>>>> Windows:
>>>>
>>>>       89 template<>
>>>>       90 inline void OrderAccess::specialized_release_store_fence<jint>
>>>> (volatile jint*   p, jint   v) {
>>>>       91   __asm {
>>>>       92     mov edx, p;
>>>>       93     mov eax, v;
>>>>       94     xchg eax, dword ptr [edx];
>>>>       95   }
>>>>
>>>> Linux:
>>>>
>>>> ! template<>
>>>> ! inline void OrderAccess::specialized_release_store_fence<jint>
>>>> (volatile jint*   p, jint   v) {
>>>>         __asm__ volatile (  "xchgl (%2),%0"
>>>>                           : "=r" (v)
>>>>                           : "0" (v), "r" (p)
>>>>                           : "memory");
>>>>       }
>>>>
>>>> but I'm guessing the MVSC inline assembler doesn't have the same
>>>> capabilities as gcc hence we have to write the code to load the
>>>> registers needed by the xchg.
>>>
>>> Yeah pretty sure the xchg instruction assumes operands are in certain
>>> registers. GCC is clever enough to know the constraints of the operands
>>> and I'm guessing MSVC inline assembly is not, so it's made explicit.
>>> Although who knows, maybe it's smarter nowadays. Add that on the list of
>>> things to find out about windows and modern compiler support! ;)
>>>
>>>>>> Not clear about your change to the float version of
>>>>>> release_store_fence:
>>>>>> a) do we need to change it?
>>>>>> b) given what we changed it to isn't that the default ?
>>>>>
>>>>> I didn't want to have default specializations of jfloat/jdouble
>>>>> equalling jint/jlong. I thought maybe some architecture might want to
>>>>> put it in different registers or something, making a default
>>>>> conversion
>>>>> to int/long maybe undesirable. So the default for jfloat/jdouble is to
>>>>> use acquire()/release() paired with an atomic memory access. So you
>>>>> have
>>>>> to explicitly specialize them to be the same as jint/jlong if wanted,
>>>>> which I believe is what is done in the code you refer to.
>>>>
>>>> Okay. It was the difference between the jfloat and jdouble versions in
>>>> the original code that threw me. But the jdouble needs to delegate
>>>> so we
>>>> end up doing an Atomic::load.
>>>
>>> jdouble and jlong is delegated to Atomic::* in orderAccess.inline.hpp as
>>> they should be. It would perhaps be preferrable to eventually forward
>>> all atomic accesses to Atomic since it's really a different concern and
>>> would be good to have in one place. But the methods we need are not
>>> there yet, so I elected not to for now.
>>>
>>> But yeah some jfloat vs jdouble stuff in the original code is very
>>> strange and inconsistent. Notably, release_store_fence on linux_x86
>>> properly uses compiler barriers for release for all variants except the
>>> awkward jfloat, meaning it does not have the protection it needs. Looks
>>> to me like this was a human mistake due to the high redundancy of the
>>> code.
>>
>> You mean this:
>>
>> OrderAccess::release_store_fence(volatile jfloat*  p, jfloat  v) {
>>    *p = v; fence();
>> }
>>
>> technically should be: compiler_barrier(); *p =v; fence();
>>
>> That's an artifact of the assumption that volatile writes are sequence
>> points and hence a compiler-barrier (something we now know is not the
>> case with respect to preceding non-volatile accesses).
>>
>> Cheers,
>> David
>>
>>
>>
>>
>>>>> If you think I'm paranoid about floats, I can of course make it a
>>>>> default behaviour to use jint/jlong even if it isn't explicit, in the
>>>>> same way I handled unsigned variants.
>>>>
>>>> No I think that is okay.
>>>
>>> Then we keep it that way. :)
>>>
>>>> If you can send me an incremental patch I will update the webrev.
>>>
>>> Will do, thanks for uploading!
>>>
>>> Thanks,
>>> /Erik
>>>
>
>