Unsafe.putOrdered -> Unsafe.releasePut

David Holmes david.holmes at oracle.com
Tue Mar 10 02:06:54 UTC 2015


On 10/03/2015 12:42 AM, Paul Sandoz wrote:
>
> On Mar 5, 2015, at 1:49 AM, David Holmes <david.holmes at oracle.com> wrote:
>
>> On 2/03/2015 9:54 PM, Paul Sandoz wrote:
>>>
>>> On Mar 2, 2015, at 12:07 PM, David Holmes <david.holmes at oracle.com> wrote:
>>>
>>>> Hi Paul,
>>>>
>>>> This is more a question for core-libs and the users of this Unsafe API.
>>>
>>> I wanted to check here first because of work done in [1] and because this is where there is most experience with the Unsafe implementation. I don't want to change the semantics of existing Unsafe methods.
>>>
>>> Note that the VarHandle work will expose using release/acquire names, where the former implementation defers to Unsafe.putOrdered*, and the latter to relaxed plus explicit load fence.
>>
>> The Unsafe API naming tends to be done by whoever is defining the public API that uses it (e.g. Doug Lea for j.u.c). The implementation of the Unsafe API then has to map through to whatever the VM provides. So given there are three layers, you either match the first two or the last two, but you can rarely make all three the same.
>>
>> So for example, j.u.c.atomic lazySet is defined as providing:
>>
>> *   <li> {@code lazySet} has the memory effects of writing (assigning)
>> *   a {@code volatile} variable except that it permits reorderings with
>> *   subsequent (but not previous) memory actions that do not themselves
>> *   impose reordering constraints with ordinary non-{@code volatile}
>> *   writes.
>>
>> and is implemented by a call to Unsafe.putOrdered, which in turn is implemented by doing a full volatile write, which is OrderAccess::release_store_fence.
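The lazySet contract quoted above can be seen in action with j.u.c.atomic. The following is a minimal, hypothetical sketch (class name and values are invented for illustration); note that the ordering difference between lazySet and set is not observable from a single thread:

```java
import java.util.concurrent.atomic.AtomicInteger;

public class LazySetDemo {
    public static void main(String[] args) {
        AtomicInteger counter = new AtomicInteger(0);
        // lazySet: release-store semantics -- prior writes cannot be
        // reordered after it, but it may be reordered with subsequent
        // unrelated stores (no StoreLoad fence is required).
        counter.lazySet(7);
        // set: full volatile-write semantics, including the StoreLoad
        // barrier that orders it before subsequent volatile reads.
        counter.set(8);
        System.out.println(counter.get());
    }
}
```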
>
> I don't observe that from looking at the C2 code, nor from the generated x86 code. It's implemented like a release_store and used accordingly with that in mind.

I was referring to the non-intrinsified version of Unsafe:

// The non-intrinsified versions of setOrdered just use setVolatile

UNSAFE_ENTRY(void, Unsafe_SetOrderedInt(JNIEnv *env, jobject unsafe, jobject obj, jlong offset, jint x))
   UnsafeWrapper("Unsafe_SetOrderedInt");
   SET_FIELD_VOLATILE(obj, offset, jint, x);
UNSAFE_END

and IIRC that was something Martin complained about a few weeks back.

Okay, in that case aligning the names makes more sense - and we should 
fix the non-intrinsic form. :)
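For reference, the release/acquire naming that the VarHandle work eventually exposed can be sketched as follows (Java 9+; the class and field names here are invented for this example):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

public class ReleaseDemo {
    int value;  // accessed through the VarHandle below

    private static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(ReleaseDemo.class, "value", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    public static void main(String[] args) {
        ReleaseDemo d = new ReleaseDemo();
        VALUE.setRelease(d, 42);   // release store: the putOrdered role
        VALUE.setVolatile(d, 43);  // release store plus StoreLoad fence
        System.out.println((int) VALUE.getAcquire(d));
    }
}
```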

Thanks,
David

> See the following for Java code snippets and corresponding C2 x86 generated code:
>
> static int putOrdered(Object o, long offset, int i) {
>      Unsafer.UNSAFE.putOrderedInt(o, offset, i);
>      return i;
> }
>
> 000   B1: # N1 <- BLOCK HEAD IS JUNK   Freq: 1 IDom: 0/#2 RegPressure: 3 IHRP Index: 23 FRegPressure: 0 FHRP Index: 23
> 000    # stack bang (96 bytes)
>     pushq   rbp    # Save rbp
>     subq    rsp, #16   # Create frame
>
> 00c    MEMBAR-release ! (empty encoding)
> 00c
> 00c    movl    [RSI + RDX], RCX   # int     <---- store
> 00f
> 00f    movl    RAX, RCX   # spill
> 011    addq    rsp, 16    # Destroy frame
>     popq   rbp
>     testl  rax, [rip + #offset_to_poll_page]   # Safepoint: poll for GC
>
> 01c    ret
> 01c
>
>
> static int putVolatile(Object o, long offset, int i) {
>      Unsafer.UNSAFE.putIntVolatile(o, offset, i);
>      return i;
> }
>
> 000   B1: # N1 <- BLOCK HEAD IS JUNK   Freq: 1 IDom: 0/#2 RegPressure: 3 IHRP Index: 27 FRegPressure: 0 FHRP Index: 27
> 000    # stack bang (96 bytes)
>     pushq   rbp    # Save rbp
>     subq    rsp, #16   # Create frame
>
> 00c    MEMBAR-release ! (empty encoding)
> 00c
> 00c    movl    [RSI + RDX], RCX   # int     <---- store
> 00f    lock addl [rsp + #0], 0    ! membar_volatile   <---- fence
> 015
> 015    movl    RAX, RCX   # spill
> 017    addq    rsp, 16    # Destroy frame
>     popq   rbp
>     testl  rax, [rip + #offset_to_poll_page]   # Safepoint: poll for GC
>
> 022    ret
> 022
>
>
> It's the naming that appears to be all over the place (as Andrew's email points out). As a result it's hard to infer what the actual implementation does, which in turn is why we have conversations like this :-)
>
> Paul.
>
>> The implementation provides a stronger barrier than what is required. Depending on your perspective, either putOrdered captures the right "lazySet" semantics but happens to be implemented using stronger ones (and that implementation might be relaxed in the future); or putOrdered is really intended to be release_store_fence (and so could be renamed accordingly), in which case, if you want true lazySet semantics, you have to define a new Unsafe API to capture that. Conversely, if you need release_store_fence, then you either define an API that always provides it, or use an existing one (in this case a normal volatile field write).
>>
>> I'm in the former camp where putOrdered is weaker than a putRelease would be. I think the VarHandle work would be wrong to assume putOrdered must be as strong as it is, and should instead define a stronger Unsafe API.
>>
>
>> But that is why I think this needs to be discussed with the folks at the front-end APIs, not the backend ones :)
>>
>> Cheers,
>> David
>> -----
>>
>>>
>>>> Sometimes the implementation of Unsafe doesn't exactly match the API, so we need to check what the intended API semantics are.
>>>>
>>>
>>> I believe things align naming-wise at least from eyeballing C2 code i.e. Unsafe.putOrdered* is a form of release-store like that of OrderAccess::release_store and Unsafe.storeFence is a form of release-fence like that of OrderAccess::release.
>>>
>>> src/share/vm/opto/library_call.cpp:
>>>
>>> bool LibraryCallKit::inline_unsafe_ordered_store(BasicType type) {
>>>    ...
>>>    insert_mem_bar(Op_MemBarRelease);
>>>    insert_mem_bar(Op_MemBarCPUOrder);
>>>    // Ensure that the store is atomic for longs:
>>>    const bool require_atomic_access = true;
>>>    Node* store;
>>>    if (type == T_OBJECT) // reference stores need a store barrier.
>>>      store = store_oop_to_unknown(control(), base, adr, adr_type, val, type, MemNode::release);
>>>    else {
>>>      store = store_to_memory(control(), adr, val, type, adr_type, MemNode::release, require_atomic_access);
>>>    }
>>>    insert_mem_bar(Op_MemBarCPUOrder);
>>>    return true;
>>> }
>>>
>>> bool LibraryCallKit::inline_unsafe_fence(vmIntrinsics::ID id) {
>>>    // Regardless of form, don't allow previous ld/st to move down,
>>>    // then issue acquire, release, or volatile mem_bar.
>>>    insert_mem_bar(Op_MemBarCPUOrder);
>>>    switch(id) {
>>>      case vmIntrinsics::_loadFence:
>>>        insert_mem_bar(Op_LoadFence);
>>>        return true;
>>>      case vmIntrinsics::_storeFence:
>>>        insert_mem_bar(Op_StoreFence);
>>>        return true;
>>>      case vmIntrinsics::_fullFence:
>>>        insert_mem_bar(Op_MemBarVolatile);
>>>        return true;
>>>      default:
>>>        fatal_unexpected_iid(id);
>>>        return false;
>>>    }
>>> }
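The three fence intrinsics dispatched above map onto the static fence methods that VarHandle later exposed; a rough, hypothetical correspondence (Java 9+) might look like:

```java
import java.lang.invoke.VarHandle;

public class FenceDemo {
    static int a, b;

    public static void main(String[] args) {
        a = 1;
        VarHandle.releaseFence(); // ~ Unsafe.storeFence / Op_StoreFence
        b = 2;
        VarHandle.acquireFence(); // ~ Unsafe.loadFence / Op_LoadFence
        VarHandle.fullFence();    // ~ Unsafe.fullFence / Op_MemBarVolatile
        System.out.println(a + b);
    }
}
```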
>>>
>>>
>>> src/cpu/x86/vm/x86_32.ad:
>>>
>>> instruct membar_release() %{
>>>    match(MemBarRelease);
>>>    match(StoreFence);
>>>    ins_cost(400);
>>>
>>>    size(0);
>>>    format %{ "MEMBAR-release ! (empty encoding)" %}
>>>    ins_encode( );
>>>    ins_pipe(empty);
>>> %}
>>>
>>> ...
>>>
>>> instruct membar_acquire() %{
>>>    match(MemBarAcquire);
>>>    match(LoadFence);
>>>    ins_cost(400);
>>>
>>>    size(0);
>>>    format %{ "MEMBAR-acquire ! (empty encoding)" %}
>>>    ins_encode();
>>>    ins_pipe(empty);
>>> %}
>>>
>>> ...
>>>
>>> instruct membar_volatile(eFlagsReg cr) %{
>>>    match(MemBarVolatile);
>>>    effect(KILL cr);
>>>    ins_cost(400);
>>>
>>>    format %{
>>>      $$template
>>>      if (os::is_MP()) {
>>>        $$emit$$"LOCK ADDL [ESP + #0], 0\t! membar_volatile"
>>>      } else {
>>>        $$emit$$"MEMBAR-volatile ! (empty encoding)"
>>>      }
>>>    %}
>>>    ins_encode %{
>>>      __ membar(Assembler::StoreLoad);
>>>    %}
>>>    ins_pipe(pipe_slow);
>>> %}
>>>
>>>
>>> Paul.
>>>
>


More information about the hotspot-dev mailing list