RFR: 8181085: Race condition in method resolution may produce spurious NullPointerException
Erik Österlund
erik.osterlund at oracle.com
Mon May 29 12:20:26 UTC 2017
Hi Andrew,
I just thought I'd put my opinions in here as I see I have been
mentioned a few times already.
First of all, I find using the volatile keyword on things that are
involved in lock-free protocols meaningful from a readability point of
view. It allows the reader of the code to see that care is needed here.
About the compiler barriers - you are right: volatile should indeed not
be necessary if the compiler barriers do everything right. A correct
barrier must keep the compiler from reordering memory accesses across
it, and from caching values in registers across it, i.e. it must force
reloads.
On windows we rely on the deprecated _ReadWriteBarrier(). According to
MSDN, it guarantees:
"The _ReadWriteBarrier intrinsic limits the compiler optimizations that
can remove or reorder memory accesses across the point of the call."
This should cut it.
The GCC memory clobber is defined as:
"The "memory" clobber tells the compiler that the assembly code performs
memory reads or writes to items other than those listed in the input and
output operands (for example, accessing the memory pointed to by one of
the input parameters). To ensure memory contains correct values, GCC may
need to flush specific register values to memory before executing the
asm. Further, the compiler does not assume that any values read from
memory before an asm remain unchanged after that asm; it reloads them as
needed. Using the "memory" clobber effectively forms a read/write memory
barrier for the compiler."
This seems to guarantee only that accesses will not be re-ordered. But
the Extended Asm documentation also states:
"You will also want to add the volatile keyword if the memory affected
is not listed in the inputs or outputs of the asm, as the `memory'
clobber does not count as a side-effect of the asm."
and
"The volatile keyword indicates that the instruction has important
side-effects. GCC will not delete a volatile asm if it is reachable.
(The instruction can still be deleted if GCC can prove that control-flow
will never reach the location of the instruction.) Note that even a
volatile asm instruction can be moved relative to other code, including
across jump instructions."
This is a bit vague, but seems to suggest that by making the asm
statement volatile and giving it a memory clobber, the compiler will
definitely not cache variables across it - they will be reloaded. As
for not re-ordering non-volatile accesses across it, it shouldn't, but
that is not clearly stated. I have never observed such a re-ordering
across a volatile asm with a memory clobber, but the semantics seem a
bit vague.
As for clang, the closest to a definition of what it does I have seen is:
"A clobber constraint is indicated by a “~” prefix. A clobber does not
consume an input operand, nor generate an output. Clobbers cannot use
any of the general constraint code letters – they may use only explicit
register constraints, e.g. “~{eax}”. The one exception is that a clobber
string of “~{memory}” indicates that the assembly writes to arbitrary
undeclared memory locations – not only the memory pointed to by a
declared indirect output."
Apart from sweeping statements saying clang inline assembly is largely
compatible with GCC's and works similarly, I have not seen clear
guarantees.
And then there are more compilers.
As a conclusion: by using volatile in addition to OrderAccess, you rely
on standardized compiler semantics (at least for volatile-to-volatile
re-orderings and re-loading; volatile-to-non-volatile is another can of
worms). If instead you rely on the OrderAccess memory model doing what
it says it will do, then it should indeed work without volatile, but to
make that work, OrderAccess relies on non-standardized,
compiler-specific barriers. In practice it should work well on all our
supported compilers without volatile. And if it didn't, that would
indeed be a bug in OrderAccess that needs to be fixed in OrderAccess.
Personally though, I am a helmet-on-synchronization kind of person, so I
would take precaution anyway and use volatile whenever possible, because
1) it makes the code more readable, and 2) it provides one extra layer
of safety that is more standardized. It seems that over the years it has
happened multiple times that we assumed OrderAccess is bullet proof, and
then realized that it wasn't and observed a crash that would never have
happened if the code was written in a helmet-on-synchronization way. At
least that's how I feel about it.
Now one might argue that by using C++11 atomics, which are
standardized, all these problems would go away: we would rely on
standardized primitives and simply trust the compiler. But problems
could then arise when the C++ compiler decides to be less conservative
than we want, e.g. by not issuing a fence for sequentially consistent
loads, optimizing for non-multiple-copy-atomic CPUs on the grounds that
IRIW violations of sequential consistency are non-issues in practice.
those loads "almost" sequentially consistent, which might be good
enough. But it feels good to have a choice here to be more conservative.
To have the synchronization helmet on.
Meta summary:
1) Current OrderAccess without volatile:
- should work, but relies on compiler-specific, non-standardized, and
sometimes poorly documented compiler barriers.
2) Current OrderAccess with volatile:
- relies on standardized volatile semantics to guarantee compiler
reordering and reloading issues do not occur.
3) C++11 atomic backend for OrderAccess:
- relies on standardized semantics to guarantee that compiler and
hardware reordering issues do not occur
- nevertheless isn't always flawless, and when it isn't, it gets painful
Hope this sheds some light on the trade-offs.
Thanks,
/Erik
On 2017-05-28 10:45, Andrew Haley wrote:
> On 27/05/17 10:10, Volker Simonis wrote:
>> On Fri, May 26, 2017 at 6:09 PM, Andrew Haley <aph at redhat.com> wrote:
>>> On 26/05/17 17:03, Volker Simonis wrote:
>>>
>>>> Volatile not only prevents reordering by the compiler. It also
>>>> prevents other, otherwise legal transformations/optimizations (like
>>>> for example reloading a variable [1]) which have to be prevented in
>>>> order to write correct, lock free programs.
>>> Yes, but so do compiler barriers.
>> Please correct me if I'm wrong, but I thought "compiler barriers" are
>> to prevent reordering by the compiler. However, this is a question of
>> optimization. If you have two subsequent loads from the same address,
>> the compiler is free to do only the first load and keep the value in a
>> register if the address is not pointing to a volatile value.
> No it isn't: that is precisely what a compiler barrier prevents. A
> compiler barrier (from the POV of the compiler) clobbers all of
> the memory state. Neither reads nor writes may move past a compiler
> barrier.
>
More information about the jdk10-dev
mailing list