RFR(M): 8141635: Implement new Unsafe intrinsics on POWER
Volker Simonis
volker.simonis at gmail.com
Wed May 4 17:57:28 UTC 2016
Hi Martin,
thanks for the explanation. I think the change looks good.
Aleksey, could you please run the change through JPRT and sponsor it
once your happy with it?
Regards,
Volker
On Wed, May 4, 2016 at 11:03 AM, Doerr, Martin <martin.doerr at sap.com> wrote:
> Hi Aleksey,
>
> thank you for reviewing my change. Here my answers:
>
>> Actually, that seems to mean current CAS implementation on POWER, prior
>> to VarHandles, was also broken.
> Good catch! We do have a MemBarVolatile before CAS in the SAP JVM but somehow forgot to integrate it into OpenJDK :)
>
>> I wonder, however, why you need sync() after CAS?
> The sync is only for the case !support_IRIW_for_not_multiple_copy_atomic_cpu. So it's not used.
>
> With support_IRIW_for_not_multiple_copy_atomic_cpu:
> volatile load: sync+ld+acquire by isync
> volatile store: lwsync+st
> volatile CAS: sync+ld+st+isync
>
> Without support_IRIW_for_not_multiple_copy_atomic_cpu:
> volatile load: ld+acquire by isync
> volatile store: lwsync+st+sync
> volatile CAS: lwsync+ld+st+sync
>
>> Therefore isync() is enough after CAS in ppc.ad? Also, why not fix the
>> CAS in MacroAssembler with these barriers to match C++11 mappings,
>> instead of doing special cases in .ad?
> support_IRIW_for_not_multiple_copy_atomic_cpu is true on PPC64, so we already use isync after CAS with my current version of the change.
> I don't want to change the CAS implementation in MacroAssembler as it is also used for monitor locking etc. which is unrelated to this change.
>
>> I don't think this is about the returned value, but rather about the
>> transitive memory effects. So, I would post the barrier on the exit path
>> for weakCAS{Acquire,Release} too.
> Please note that we do execute the memory barrier before the CAS in both, the failed and the successful case.
> In the failed case, the failure reason may be a reservation loss due to an interrupt. It is not specified which value was loaded,
> so the CAS did not perform any memory access which is observable in Java. So what should be ordered by an additional memory barrier in this case?
> What is your concern about transitive memory effects?
>
> After the weak CAS (maybe after several retries) succeeds, we execute the memory barrier of course.
>
>> P.S. Note that we have weakCompareAndSetVolatile intrinsics in works,
>> which would probably require a few additional matches in .ad...
> Thanks for the heads-up. I think my change is ready for it.
>
> Best regards,
> Martin
>
>
> -----Original Message-----
> From: Aleksey Shipilev [mailto:aleksey.shipilev at oracle.com]
> Sent: Dienstag, 3. Mai 2016 21:28
> To: Doerr, Martin <martin.doerr at sap.com>; hotspot-compiler-dev at openjdk.java.net
> Subject: Re: RFR(M): 8141635: Implement new Unsafe intrinsics on POWER
>
> * PGP Signature not checked
>
> Hi Martin,
>
> On 05/03/2016 01:16 PM, Doerr, Martin wrote:
>> According to JEP 193, Compare and Swap/Exchange methods need to behave
>> like volatile load + volatile store.
>>
>> Therefore, I had to make a change to the shared library_call
>> implementation (support_IRIW_for_not_multiple_copy_atomic_cpu was not
>> yet included).
>
> Yes, CAS/CAE are supposed to be sequentially consistent. This change
> looks correct, if a hardware CAS is not SC:
>
> 2852 switch (access_kind) {
> ...
> 2856 case Release:
> 2857 insert_mem_bar(Op_MemBarRelease);
> 2858 break;
> 2859 case Volatile:
> 2860 if (support_IRIW_for_not_multiple_copy_atomic_cpu) {
> 2861 insert_mem_bar(Op_MemBarVolatile);
> 2862 } else {
> 2863 insert_mem_bar(Op_MemBarRelease);
> 2864 }
> 2865 break;
>
> Actually, that seems to mean current CAS implementation on POWER, prior
> to VarHandles, was also broken.
>
>> I also assert that GetAndAdd and GetAndSet are only used with seq_cst
>> semantics, because the memory order is not passed to these intrinsics.
>
> Yes, good.
>
>> C++2011 supports specifying weaker semantics for failed cases of
>> atomic_compare_exchange than for successful ones.
>>
>> As I understand JEP 193, Java expects the semantics for failed cases to
>> be the same as for successful cases.
>>
>> That means the memory barrier instruction must be executed even though
>> the store was not executed. So the failed Compare and Swap/Exchange
>> behaves like a volatile load.
>
> Yes, that's the difference between VarHandles and C++11 semantics.
> VarHandles provide a single memory ordering, regardless of success/failure.
>
> I wonder, however, why you need sync() after CAS? C++11 mappings on
> POWER: https://www.cl.cam.ac.uk/~pes20/cpp/cpp0xmappings.html ...say
> that Cmpxchg SeqCst is:
>
> hwsync; // Op_MemBarVolatile
> _loop: lwarx;
> cmp;
> bc _exit;
> stwcx.;
> bc _loop;
> isync;
> _exit:
>
> Load Seq Cst, which is similar to failing CAS semantics, is:
>
> hwsync;
> ld;
> cmp;
> bc;
> isync
>
> Therefore isync() is enough after CAS in ppc.ad? Also, why not fix the
> CAS in MacroAssembler with these barriers to match C++11 mappings,
> instead of doing special cases in .ad?
>
>> They do not return the loaded value, but only report whether the Swap
>> was performed or not. The failure cause is not clearly identifiable and
>> I don't see any purpose of executing a memory barrier in case of failure.
>
> I don't think this is about the returned value, but rather about the
> transitive memory effects. So, I would post the barrier on the exit path
> for weakCAS{Acquire,Release} too.
>
> Thanks,
> -Aleksey
>
> P.S. Note that we have weakCompareAndSetVolatile intrinsics in works,
> which would probably require a few additional matches in .ad...
> https://bugs.openjdk.java.net/browse/JDK-8155965
>
>
> * Signature checking is off by policy
More information about the hotspot-compiler-dev
mailing list