RFR (M): 8184334: Generalizing Atomic with templates

Thu Jul 20 13:45:33 UTC 2017

Hi Andrew,

On 2017-07-19 18:21, Andrew Haley wrote:
> On 18/07/17 16:26, Erik Österlund wrote:
>>
>> On 2017-07-18 16:41, Andrew Haley wrote:
>>> On 18/07/17 14:46, Erik Österlund wrote:
>>>> On 2017-07-18 11:57, Andrew Haley wrote:
>>>>> On 18/07/17 10:38, Erik Österlund wrote:
>>> It is more general than that.  If you access the stored value of an
>>> object through a glvalue of other than one of the allowed types, then
>>> your *whole program is undefined*.
>> And I think it is in that list of allowed types. Specifically, it is my
>> interpretation of the standard that intptr_t is "a type that is the
>> signed or unsigned type corresponding to the dynamic type of the object"
>> (cf. 5.2.10) if the dynamic type is e.g. void*.
>>
>>>> and that therefore it is fine to load something (that was stored as
>>>> A*) as intptr_t and subsequently cast it to A* before the A itself
>>>> is being accessed. Am I wrong?
>>> No, that is correct.  The question is whether intptr_t is a type
>>> compatible with a pointer type, and I don't think you will find any
>>> language to the effect that it is.
>> I think 5.2.10 describing reinterpret_cast seems to suggest that they
>> are compatible by saying "A pointer converted to an integer of
>> sufficient size (if any such exists on the implementation) and back to
>> the same pointer type will have its original value". That is precisely
>> what is done.
> This is the core of our disagreement.
>
> In a sense, it doesn't matter which of us is right; what really
> matters is the opinion of the compiler.  GCC does not put intptr_t and
> pointer types into the same alias set, and even if this did turn out
> to be a bug in GCC it wouldn't help us because we often use
> unsupported old versions of GCC to build OpenJDK, so it wouldn't get
> fixed anyway.
>
> So what are we to do?  Do you believe me when I say that GCC does
> treat A* and intptr_t to be compatible types?  If not, what would it
> take to convince you?

It seems like the aliasing problem has multiple levels at which we can 
debate. I will try to put my thoughts down in a reasonably structured 
way... let's see how that goes.

These are the core questions we are debating (and I will answer them in 
order):

1) What does the standard say about aliasing rules?
2) What did concrete implementations like GCC implement the aliasing rules?
3) What are the implications of those aliasing rules being violated? 
Does it matter?
4) Have these changes actually changed anything relating to those 
aliasing rules?
5) How do we proceed?

Regarding #1: We seem to agree that there are various possible 
interpretations of the standard (wouldn't be the first time...); at 
least one interpretation that violates treating intptr_t as compatible 
with A* and at least one interpretation that permits it. I do not know 
if there is any one true interpretation of the C++ standard, so I will 
assume the possibility that either one of us could be right here, and 
hence that a compiler writer might write a compiler in which it is 
either permitted aliasing or not.

Regarding #2: While the simplest most contrived example of what we are 
arguing about being a strict aliasing violation is not caught as a 
strict aliasing violation using the strictest strict aliasing checks 
available in GCC, I feel dubious, but will take your word for it that 
intptr_t and A* are not compatible types. So for the rest of the 
discussion, I will assume this is true.

Regarding #3: We already inherently rely on passed in intptr_t or even 
jlong aliasing A*, possibly today enforced through -fno-strict-aliasing. 
I do not think the JVM will ever be able to turn such optimizations on. 
Our C++ runtime lives in the border lands between Java and C++ that I 
will refer to as "shady land". C++ standards are relatively successful 
at describing well-behaved semantics for a program where C++ code 
interacts with other C++ code. Unfortunately, we are working in "shady 
land" where C++ code also interacts with JIT-compiled machine code from 
Java land which not surprisingly has poorly defined meaning in the 
context of C++ standards. For example, we take pointers from C/C++, send 
them as jlong via JNI into Java that may pass it around and ultimately 
store it in some Java long field of an object (with a store or perhaps 
an Unsafe CompareAndSetLong call that may or may not be implemented via 
an intrinsic), and ultimately be passed in to C++ code that loads that 
long as jlong, and casts it into a pointer and starts poking around in 
values behind that pointer (e.g. Unsafe_SetMemory0(), 
Copy::conjoint_jlongs_atomic(), jni_NewDirectByteBuffer(), etc). In 
fact, the type for "native pointers" in Java is long, a.k.a. jlong, and 
I am fairly certain that the code base is sufficiently full of this that 
pointer sized integers and pointers *have* to be aliased by the compiler 
or we are in deep trouble. I do not think we can ever allow them not to 
be aliased, especially not in "shady land". So if compilers ever started 
assuming that there is no way anything loaded as an intptr_t, or jlong 
for that matter, could point at any C++ types (despite being casted to 
do so), and perform aggressive optimizations utilizing points-to 
analysis, relying on those assumptions, then that would be so 
fundamentally broken that we could arguably not recover. This goes 
without appreciating just how much other code in hotspot already relies 
on this.
While it is possible to try to isolate code in "shady land" from the 
rest of the C++ code, it is precisely in "shady land" that Atomic and 
OrderAccess live. They are first class citizens of "shady land". They 
poke around in memory that is being concurrently accessed either by 
other C++ code or by JIT-compiled Java code (possibly performing atomics 
with jlongs that are in fact native pointers passed in from JNI). Here 
the aliasing must be as conservative as possible.
My conclusion is that if intptr_t/jlong is not considered compatible 
with A* by the compiler, the compiler has to be tamed to think so 
anyway, or the JVM will break horribly. Especially for Atomic and 
OrderAccess that lives in "shady land".

Regarding #4: I have made changes in the connection between the frontend 
and the backend. But I would like to maintain that the backends retain 
the same behaviour as they did before. For example, your xchg_ptr for 
void* casts (C-style) the exchange_value to intptr_t and use the 
intptr_t variant of xchg_ptr that passes intptr_t (a.k.a. int64_t on 
your architecture) into _sync_lock_test_and_set. So whatever 
hypothetical aliasing problem the new Atomic implementation class were 
still there before these changes. I have not introduced any of those 
aliasing issues, only improved the mediation between the frontend to the 
platform dependent backend.

Regarding #5: Now as for how to move on with this, I hope we can agree 
that for code in "shady land", the compiler simply has to treat intptr_t 
as possible aliases to A* and that if it does not, then we need to tame 
it to do so. Because the C++ compiler simply does not know the full 
story of what is going on in our program and can not safely make any 
meaningful points-to analysis and data flow optimizations crossing 
dynamically generated JIT-compiled machine code of a different language.

I propose three different solutions that we can discuss:

Solution A: Do nothing. With this solution we recognize that a) we have 
actually not introduced any undefined behaviour that was not already 
there - the backend used intptr_t before and continues to do so, and b) 
doing so is safe (and inherently has to be safe) due to other levels of 
safeguarding such as turning off strict aliasing and using volatile.

Solution B: If intptr_t does indeed not alias A*, we could find a 
pointer sized type, let's call it CanonicalPointer, that does indeed 
alias a generic A*, and pass pointer types from the frontend as 
CanonicalPointer to the backend. For example char* is explicitly 
described in the standard as an exception of the rules that is 
explicitly aliased to a generic A* (hopefully we agree about that). I 
have put together a small example of how xchg might look with this model 
here:
http://cr.openjdk.java.net/~eosterlund/8184334/webrev.01_alias_pointer_A/
The frontend canonicalizes types and sends them to the backend. The 
canonical type of any pointer is char*, which explicitly aliases to all 
other pointers. Passing pointers through jlong (which is actually done) 
is possibly still not safe in this variant.

Solution C: Go even further (yet with less code) and continue sending 
canonical integer types into the backend, and leave it to the backend to 
decide what measures need to be taken to trick the compiler into not 
performing dangerous optimizations (in the hypothetical scenario that we 
turn on strict aliasing). For example on AArch64 linux treating int64_t 
as char* that explicitly aliases all pointers (3.10), relying on the 
roundtrip guarantees of reinterpret_cast guarantees (5.2.10) between 
pointers and intptr_t. A code example, again for xchg, can be found here:
http://cr.openjdk.java.net/~eosterlund/8184334/webrev.01_alias_pointer_B/
This solution also allows atomically loading a jlong and subsequently 
casting it to a pointer (because its backing load access used char*), 
which I argue inherently needs to be allowed.

My preference is solution A for now (as no new violations are actually 
introduced with these changes, conversely the part of the pipeline from 
the C++ runtime to the backend in "shady land" has been purged from 
evil), and possibly looking into C later, on a platform by platform 
basis, to further strengthen aliasing in "shady land".

I am curious to hear your thoughts about this.

Thanks,
/Erik

>> Conversely, I do not find any language suggesting that intptr_t would
>> not be compatible with pointer types.
>>
>>> Pointer types are distinct from one another and from all integer
>>> types.  People sometimes get confused by this: the fact that you can
>>> cast from A* to e.g. void*, doesn't tell you that you can cast from
>>> A** to void** and use the result: they are different types.
>> Agreed. But I do not see any problem with reinterpret_cast from A* to
>> intptr_t back to A* and using the A from that result. That is explicitly
>> allowed as per 5.2.10 as previously described.
> I don't think that's what you're doing.  I think you're casting from
> A** to intptr_t*, dereferencing the intptr_t*, and casting the result
> to A*.
>
>> That looks like the correct (and indeed expected) output. iptr should be
>> the same as &b, and iptr->_x should be 0, and &a should be different
>> from &b and iptr. I tried this experiment locally too with similar
>> correct result. Am I missing something here?
> No, I was; I misinterpreted the result, and it's a red herring.
> Sorry.
>
> I believe this patch to be dangerous.  We might well get away with it,
> given sufficient larding of volatile, but it's still undefined
> behaviour.  We have far too much UB in HotSpot already, and we should
> not be adding any more.
>
> And I still believe that this patch is more complex than is needed,
> and with many (if not all) targets we could do something much simpler,
> and which does not need to cast pointer types.  It could also map
> cleanly onto C++11.
>