RFR(M): 8141635: Implement new Unsafe intrinsics on POWER

Volker Simonis volker.simonis at gmail.com
Wed May 4 17:57:28 UTC 2016


Hi Martin,

thanks for the explanation. I think the change looks good.

Aleksey, could you please run the change through JPRT and sponsor it
once your happy with it?

Regards,
Volker


On Wed, May 4, 2016 at 11:03 AM, Doerr, Martin <martin.doerr at sap.com> wrote:
> Hi Aleksey,
>
> thank you for reviewing my change. Here my answers:
>
>> Actually, that seems to mean current CAS implementation on POWER, prior
>> to VarHandles, was also broken.
> Good catch! We do have a MemBarVolatile before CAS in the SAP JVM but somehow forgot to integrate it into OpenJDK :)
>
>> I wonder, however, why you need sync() after CAS?
> The sync is only for the case !support_IRIW_for_not_multiple_copy_atomic_cpu. So it's not used.
>
> With support_IRIW_for_not_multiple_copy_atomic_cpu:
> volatile load: sync+ld+acquire by isync
> volatile store: lwsync+st
> volatile CAS: sync+ld+st+isync
>
> Without support_IRIW_for_not_multiple_copy_atomic_cpu:
> volatile load: ld+acquire by isync
> volatile store: lwsync+st+sync
> volatile CAS: lwsync+ld+st+sync
>
>> Therefore isync() is enough after CAS in ppc.ad? Also, why not fix the
>> CAS in MacroAssembler with these barriers to match C++11 mappings,
>> instead of doing special cases in .ad?
> support_IRIW_for_not_multiple_copy_atomic_cpu is true on PPC64, so we already use isync after CAS with my current version of the change.
> I don't want to change the CAS implementation in MacroAssembler as it is also used for monitor locking etc. which is unrelated to this change.
>
>> I don't think this is about the returned value, but rather about the
>> transitive memory effects. So, I would post the barrier on the exit path
>> for weakCAS{Acquire,Release} too.
> Please note that we do execute the memory barrier before the CAS in both, the failed and the successful case.
> In the failed case, the failure reason may be a reservation loss due to an interrupt. It is not specified which value was loaded,
> so the CAS did not perform any memory access which is observable in Java. So what should be ordered by an additional memory barrier in this case?
> What is your concern about transitive memory effects?
>
> After the weak CAS (maybe after several retries) succeeds, we execute the memory barrier of course.
>
>> P.S. Note that we have weakCompareAndSetVolatile intrinsics in works,
>> which would probably require a few additional matches in .ad...
> Thanks for the heads-up. I think my change is ready for it.
>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com]
> Sent: Dienstag, 3. Mai 2016 21:28
> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(M): 8141635: Implement new Unsafe intrinsics on POWER
>
> * PGP Signature not checked
>
> Hi Martin,
>
> On 05/03/2016 01:16 PM, Doerr, Martin wrote:
>> According to JEP 193, Compare and Swap/Exchange methods need to behave
>> like volatile load + volatile store.
>>
>> Therefore, I had to make a change to the shared library_call
>> implementation (support_IRIW_for_not_multiple_copy_atomic_cpu was not
>> yet included).
>
> Yes, CAS/CAE are supposed to be sequentially consistent. This change
> looks correct, if a hardware CAS is not SC:
>
> 2852   switch (access_kind) {
> ...
> 2856     case Release:
> 2857       insert_mem_bar(Op_MemBarRelease);
> 2858       break;
> 2859     case Volatile:
> 2860       if (support_IRIW_for_not_multiple_copy_atomic_cpu) {
> 2861         insert_mem_bar(Op_MemBarVolatile);
> 2862       } else {
> 2863         insert_mem_bar(Op_MemBarRelease);
> 2864       }
> 2865       break;
>
> Actually, that seems to mean current CAS implementation on POWER, prior
> to VarHandles, was also broken.
>
>> I also assert that GetAndAdd and GetAndSet are only used with seq_cst
>> semantics, because the memory order is not passed to these intrinsics.
>
> Yes, good.
>
>> C++2011 supports specifying weaker semantics for failed cases of
>> atomic_compare_exchange than for successful ones.
>>
>> As I understand JEP 193, Java expects the semantics for failed cases to
>> be the same as for successful cases.
>>
>> That means the memory barrier instruction must be executed even though
>> the store was not executed. So the failed Compare and Swap/Exchange
>> behaves like a volatile load.
>
> Yes, that's the difference between VarHandles and C++11 semantics.
> VarHandles provide a single memory ordering, regardless of success/failure.
>
> I wonder, however, why you need sync() after CAS? C++11 mappings on
> POWER: https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html ...say
> that Cmpxchg SeqCst is:
>
>        hwsync;     // Op_MemBarVolatile
> _loop: lwarx;
>        cmp;
>        bc _exit;
>        stwcx.;
>        bc _loop;
>        isync;
> _exit:
>
> Load Seq Cst, which is similar to failing CAS semantics, is:
>
>        hwsync;
>        ld;
>        cmp;
>        bc;
>        isync
>
> Therefore isync() is enough after CAS in ppc.ad? Also, why not fix the
> CAS in MacroAssembler with these barriers to match C++11 mappings,
> instead of doing special cases in .ad?
>
>> They do not return the loaded value, but only report whether the Swap
>> was performed or not. The failure cause is not clearly identifiable and
>> I don't see any purpose of executing a memory barrier in case of failure.
>
> I don't think this is about the returned value, but rather about the
> transitive memory effects. So, I would post the barrier on the exit path
> for weakCAS{Acquire,Release} too.
>
> Thanks,
> -Aleksey
>
> P.S. Note that we have weakCompareAndSetVolatile intrinsics in works,
> which would probably require a few additional matches in .ad...
>   https://bugs.openjdk.java.net/browse/JDK-8155965
>
>
> * Signature checking is off by policy


More information about the hotspot-compiler-dev mailing list