RFR: 7143664: Clean up OrderAccess implementations and usage

Thu Feb 12 23:51:45 UTC 2015

Updated webrev:

http://cr.openjdk.java.net/~dholmes/7143664/webrev.v3/

Incremental webrev from v2:

http://cr.openjdk.java.net/~dholmes/7143664/webrev.v3-incr/

One response inlined below ...

On 13/02/2015 2:50 AM, Erik Österlund wrote:
> Hi David,
>
> On 12/02/15 01:38, David Holmes wrote:
>
>>>> I still think we need to say something about C++ volatile and about
>>>> compiler barriers in general.
>>>
>>> How about something like:
>>>
>>> C++ volatile semantics prevent compiler re-ordering between volatile
>>> memory accesses. However, reordering between non-volatile and volatile
>>> memory accesses is in general undefined. For compiler reordering
>>> constraints taking non-volatile memory accesses into consideration, a
>>> compiler barrier has to be used instead. Note also that both volatile
>>> semantics and compiler barrier do not prevent hardware reordering.
>>
>> Sounds great! Though perhaps add "Some compiler implementations may
>> choose to enforce additional constraints beyond those required by the
>> language." ( a reference to MVSC)
>
> Sounds good, fixed!
>
>>> But since this is nowadays supported by solaris studio too, I joined
>>> them into one. Not only because it's nicer to unify, but because I need
>>> the compiler barrier constraints that inline asm can give us. I don't
>>> know the compiler reordering constraints are guaranteed when calling
>>> external assembly. So better make it explicit and safe.
>>
>> Totally fine with making use of Solaris Studio's modern capabilities,
>> only minor concern is no way to test gcc on Solaris. But unless someone
>> else screams about this I'm happy to assume that gcc will still be okay.
>
> Since solaris studio is now actually using GCC syntax of inline asm
> which worked well (on all other GCC variants as well), I just took for
> granted that GCC would handle its own syntax on solaris too. ;)
> But it never hurts to verify!
>
>>> How do you mean they are different? To me they look quite equivalent.
>>> xchg is always implicitly locked on x86, and they both use xchg to get
>>> the wanted fence semantics.
>>
>> Windows:
>>
>>      89 template<>
>>      90 inline void OrderAccess::specialized_release_store_fence<jint>
>> (volatile jint*   p, jint   v) {
>>      91   __asm {
>>      92     mov edx, p;
>>      93     mov eax, v;
>>      94     xchg eax, dword ptr [edx];
>>      95   }
>>
>> Linux:
>>
>> ! template<>
>> ! inline void OrderAccess::specialized_release_store_fence<jint>
>> (volatile jint*   p, jint   v) {
>>        __asm__ volatile (  "xchgl (%2),%0"
>>                          : "=r" (v)
>>                          : "0" (v), "r" (p)
>>                          : "memory");
>>      }
>>
>> but I'm guessing the MVSC inline assembler doesn't have the same
>> capabilities as gcc hence we have to write the code to load the
>> registers needed by the xchg.
>
> Yeah pretty sure the xchg instruction assumes operands are in certain
> registers. GCC is clever enough to know the constraints of the operands
> and I'm guessing MSVC inline assembly is not, so it's made explicit.
> Although who knows, maybe it's smarter nowadays. Add that on the list of
> things to find out about windows and modern compiler support! ;)
>
>>>> Not clear about your change to the float version of release_store_fence:
>>>> a) do we need to change it?
>>>> b) given what we changed it to isn't that the default ?
>>>
>>> I didn't want to have default specializations of jfloat/jdouble
>>> equalling jint/jlong. I thought maybe some architecture might want to
>>> put it in different registers or something, making a default conversion
>>> to int/long maybe undesirable. So the default for jfloat/jdouble is to
>>> use acquire()/release() paired with an atomic memory access. So you have
>>> to explicitly specialize them to be the same as jint/jlong if wanted,
>>> which I believe is what is done in the code you refer to.
>>
>> Okay. It was the difference between the jfloat and jdouble versions in
>> the original code that threw me. But the jdouble needs to delegate so we
>> end up doing an Atomic::load.
>
> jdouble and jlong is delegated to Atomic::* in orderAccess.inline.hpp as
> they should be. It would perhaps be preferrable to eventually forward
> all atomic accesses to Atomic since it's really a different concern and
> would be good to have in one place. But the methods we need are not
> there yet, so I elected not to for now.
>
> But yeah some jfloat vs jdouble stuff in the original code is very
> strange and inconsistent. Notably, release_store_fence on linux_x86
> properly uses compiler barriers for release for all variants except the
> awkward jfloat, meaning it does not have the protection it needs. Looks
> to me like this was a human mistake due to the high redundancy of the code.

You mean this:

OrderAccess::release_store_fence(volatile jfloat*  p, jfloat  v) {
   *p = v; fence();
}

technically should be: compiler_barrier(); *p =v; fence();

That's an artifact of the assumption that volatile writes are sequence 
points and hence a compiler-barrier (something we now know is not the 
case with respect to preceding non-volatile accesses).

Cheers,
David

>>> If you think I'm paranoid about floats, I can of course make it a
>>> default behaviour to use jint/jlong even if it isn't explicit, in the
>>> same way I handled unsigned variants.
>>
>> No I think that is okay.
>
> Then we keep it that way. :)
>
>> If you can send me an incremental patch I will update the webrev.
>
> Will do, thanks for uploading!
>
> Thanks,
> /Erik
>