RFR: JDK-8035663 Suspicious failure of test java/util/concurrent/Phaser/FickleRegister.java
David Holmes
david.holmes at oracle.com
Wed Nov 19 12:03:35 UTC 2014
Hi Aleksey,
On 19/11/2014 9:21 PM, Aleksey Shipilev wrote:
> Hi David,
>
> On 11/19/2014 08:17 AM, David Holmes wrote:
>> This test failure exposed a number of issues with the logic in
>> unsafe.cpp for handling atomic updates of Java long fields on platforms
>> without any direct support for a 64-bit CAS operation - platforms for
>> which supports_cx8 is not true. This only impacts our SE Embedded PPC32
>> platform (where we have been using this fix for some time now) but in
>> case other such platforms came along I wanted to get this pushed to
>> mainline.
>
> Does this also apply to "double" handling?
It would in theory, but there's no API for CAS of double fields.
>> What the unsafe code did was to use the object containing the field as a
>> lock object for reading and writing the field. This seems reasonable on
>> the surface but in fact had a fatal flaw - because we were locking a
>> Java-level visible object inside what was considered to be a lock-free
>> code-path by the application and library logic, we could actually induce
>> a deadlock - which is why the test failed.
>
> Ouch.
>
>> In addition the code had two further flaws:
>>
>> 1. Because the field could also be updated via direct assignment in Java
>> code the unsafe code needed to perform an Atomic::load of the field. And
>> for good measure we also employ an Atomic::store to ensure no
>> interference with direct reads of the field in Java code.
>
> I am confused by this explanation. This seems to imply Atomic::store and
> Atomic::load are already providing the access atomicity. Why do we need
> the lock?
The lock provides the atomicity needed for the logical CAS operation.
Atomic loads and stores don't help with that. But atomic loads and
stores are needed to ensure direct assignment of the volatile field
(which is already implemented via Atomic::store) can't interfere with a
concurrent "cas" operation.
> Given the interaction with Java code, I wonder if this should be handled
> in per-platform Atomic::* definitions, not in Unsafe stubs?
Not sure what you mean by this, but this is the implementation of the
Unsafe API - which is done using the runtime facilities of the Atomic
class. This code handles both the supports_cx8 and doesn't_support_cx8
cases.
> Also, if
> Atomic::* provide the atomic support, why do we have this:
>
> 1178 #ifdef SUPPORTS_NATIVE_CX8
> 1179 return (jlong)(Atomic::cmpxchg(x, addr, e)) == e;
> 1180 #else
> 1181 if (VM_Version::supports_cx8())
> 1182 return (jlong)(Atomic::cmpxchg(x, addr, e)) == e;
> 1183 else {
> 1184 jboolean success = false;
> 1185 MutexLockerEx mu(UnsafeJlong_lock,
> Mutex::_no_safepoint_check_flag);
> 1186 jlong val = Atomic::load(addr);
> 1187 if (val == e) { Atomic::store(x, addr); success = true; }
> 1188 return success;
> 1189 }
> 1190 #endif
>
> ...instead of delegating to Atomic::cmpxchg directly? If locking is
> needed to maintain the atomicity in absence of 8-byte CAS on target
> platform, the Atomic::* should IMO maintain the global lock guarding
> *all* load/stores as well as CASes there.
The aim here is not to try to provide an Atomic::cmpxchg(jlong)
implementation for general use, but to provide the atomic operations
needed by the Unsafe class. We've deliberately steered away from using a
64-bit CAS in the runtime precisely because it is not available on all
platforms, and a lock-based solution for the general runtime would
become a bottleneck. Of course if the original implementation had
instead gone down that path we wouldn't have known anything different -
but it didn't.
This patch simply fixes the current broken implementation of Unsafe.
Thanks,
David
> Thanks,
> -Aleksey.
>
More information about the hotspot-dev
mailing list