[jmm-dev] Another way to punt OOTA

Thu Oct 30 14:35:03 UTC 2014

If racing access has undefined semantics, are we not going to have the 
same problems in Java as in C/C++11, where undefined behavior generates 
the write that justifies the read that causes the undefined behavior? 
For example (initially x and y are different objects, both with field a=0):

   thread 1: if (x.a == 1) { undefined(); }
   thread 2: y.a = 1;

One possible undefined behaviour is the assignment y = x, after which 
thread 2's write y.a = 1 lands on x's field and can then be used to 
justify the read x.a == 1. Of course, this is an OOTA behavior, but 
ruling out such OOTA occurrences of undefined seems just as difficult as 
ruling out OOTA regular behaviours (and possibly harder).
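
For concreteness, here is a minimal Java sketch of the two threads above 
as they behave under the current model, with the undefined() branch 
reduced to a comment. The names Obj, RacyReadSketch and observe are made 
up for illustration. Since x and y are distinct objects and nothing 
writes x.a, the read can only return 0; the worry is that an "undefined 
behaviour" clause would let y = x be conjured up to change that.

```java
// Illustration only: Obj, RacyReadSketch and observe() are hypothetical names.
final class Obj {
    int a; // plain (non-volatile) field, initially 0
}

final class RacyReadSketch {
    // Runs both threads once and returns the value thread 1 read from x.a.
    static int observe() {
        final Obj x = new Obj();
        final Obj y = new Obj(); // a different object from x
        final int[] seen = new int[1];

        // Thread 1: the read that guards undefined() in the example.
        Thread t1 = new Thread(() -> {
            seen[0] = x.a; // if this were 1, the example would call undefined()
        });
        // Thread 2: writes only y.a, never x.a.
        Thread t2 = new Thread(() -> y.a = 1);

        t1.start();
        t2.start();
        try {
            // join() makes thread 1's write to seen[0] visible to the caller.
            t1.join();
            t2.join();
        } catch (InterruptedException e) {
            Thread.currentThread().interrupt();
            throw new IllegalStateException(e);
        }
        return seen[0];
    }

    public static void main(String[] args) {
        System.out.println("thread 1 read x.a = " + observe());
    }
}
```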

A.

On 10/28/2014 08:39 PM, Hans Boehm wrote:
> Here's another conceivable approach.  This sounds somewhat crazy based on
> our prior assumptions, but it may only be mildly crazy, so here goes.  (I
> ran this past Jeremy who thought it was at least mildly crazy.)
>
> Assumptions/observations:
>
> - Very few people actually care about Java security as originally
> conceived.  We usually either don't allow untrusted Java code into our
> process, or we sandbox the resulting process.  (Android does the latter.)
>
> - The argument for providing well-defined behavior for data races in Java
> was based either on the original model of Java, or on the desire to
> unconditionally maintain secondary properties, like memory safety, even in
> the presence of data races.
>
> - [Dubious, but plausible assumption] We should be able to explicitly
> require those secondary properties, even if we otherwise allow less
> than well-defined behavior for data races.
>
> - We're essentially already making the preceding dubious assumption.  We
> have all sorts of library APIs that don't specify how they behave in the
> presence of races, but only say that you shouldn't do that.  We're somehow
> supposed to conclude that the heap can't be corrupted even in the presence
> of racy library calls, because of the (unstated?) assumption that libraries
> should behave as though they were implemented in Java, and Java programs
> preserve e.g. memory safety.  The only place we actually claim (incorrectly
> as we now know) to have well-defined semantics for data races is at the
> basic type and reference level, which is arguably not even the most
> interesting place in a large code base.
>
> - Most of us don't care that much about slowing down potentially racing
> accesses a bit.  The reason the load->store ordering solution seems so
> dubious for Java is that the compiler doesn't know which accesses might
> race, and thus we need to treat many accesses conservatively.  ARM is much
> more concerned about Java than C++ memory_order_relaxed.
>
>
> Possible long-term approach:
>
> - Switch to a much more C++-like (or Java library-like) model in which data
> races have something like "undefined behavior".  Exactly how to model that
> is an open question.  Ordinary loads can only see stores that happen-before
> them, but racing loads trigger "undefined behavior". "Undefined behavior"
> should be defined to allow reporting an error and producing any type-correct
> answer for the racing load.  Unlike C++, we probably want to be explicit
> that it disallows a VM crash.
>
> - [Probably challenging, others understand the Java constraints better than
> I] Introduce a mechanism for (nearly) unordered, memory_order_relaxed-like
> racing loads.  Require current racing accesses to use that mechanism
> instead.  (Open question: coherence)
>
> - Strengthen this memory_order_relaxed analog to guarantee load->store
> ordering.
>
> - Implementations may specify that they treat all ordinary accesses as
> these "memory_order_relaxed" accesses.  Browser-based implementations, or
> others that require the current security model, would do so.  Whether and
> how to specify this is open.
>
>
> Transition strategy:
>
> - Implementations are not required to break existing code in any way,
> though the spec would allow them to.  In the near term, we would expect
> implementations to continue to support (possibly with a suitable flag) the
> current ill-defined model, along with the new model.  (Which should be
> easy; the only really new support is for memory_order_relaxed and
> load->store ordering.)
>
> - The fact that data race detection is much easier in the new model may
> help to inspire people to move to it.
>
>
> Long term potential advantages:
>
> - No OOTA issues!
>
> - Easier race detection.  A standard conforming JVM should be able to throw
> a data race exception if it finds one.
>
> - Uniform model for basic types and libraries.
>
> - We can delete the special cases for tearing longs and doubles without
> unacceptably slowing down some implementations.
>
>
> Long term disadvantages:
>
> - As for C++, we promise less about behavior of buggy programs.  AFAICT,
> much has been written about this, but it isn't a major practical issue
> along the lines of memory safety.  And again, we already have that problem
> for library clients.
>
> - Security sensitive code will run slower and need a separate JVM flag.
> Again, my impression is there is little of it, so this isn't a big deal.
> (There are probably some mild security advantages in exchange; untrusted
> racy code can induce slightly fewer perverse behaviors.)
>
>
> Short term issues:
>
> - This would declare significant amounts of currently correct code to be
> deprecated and in the future incorrect.  But if we can figure out how to
> cleanly introduce something like memory_order_relaxed, I think the required
> changes would clearly also improve readability.
>
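
As a rough picture of what explicitly marked, memory_order_relaxed-like 
accesses could look like in source code, here is a minimal sketch that 
uses the VarHandle opaque access mode added in Java 9. That API postdates 
this message and is not the mechanism proposed above; it is used here 
only as an existing analog. The class and member names (RelaxedFlagSketch, 
flag, FLAG) are made up for illustration. As far as I know, opaque mode 
does not by itself promise the load->store ordering that the proposal 
would add.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Illustration only: RelaxedFlagSketch, flag and FLAG are hypothetical names.
final class RelaxedFlagSketch {
    private int flag; // accessed only through FLAG below

    private static final VarHandle FLAG;
    static {
        try {
            // Look up a handle for the private int field "flag" of this class.
            FLAG = MethodHandles.lookup()
                    .findVarHandle(RelaxedFlagSketch.class, "flag", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    // Explicitly marked racy store (assumed analog of a relaxed store).
    void set(int v) {
        FLAG.setOpaque(this, v);
    }

    // Explicitly marked racy load (assumed analog of a relaxed load).
    int get() {
        return (int) FLAG.getOpaque(this);
    }

    public static void main(String[] args) {
        RelaxedFlagSketch s = new RelaxedFlagSketch();
        s.set(1);
        System.out.println("flag = " + s.get());
    }
}
```

The point of the sketch is readability: every access that is allowed to 
race is visibly different from an ordinary field access, which is the 
kind of source-level change the "Short term issues" item anticipates.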