[jmm-dev] Another way to punt OOTA

Wed Oct 29 01:39:44 UTC 2014

Here's another conceivable approach.  This sounds somewhat crazy based on
our prior assumptions, but it may only be mildly crazy, so here goes.  (I
ran this past Jeremy who thought it was at least mildly crazy.)

Assumptions/observations:

- Very few people actually care about Java security as originally
conceived.  We usually either don't allow untrusted Java code into our
process, or we sandbox the resulting process.  (Android does the latter.)

- The argument for providing well-defined behavior for data races in Java
was based either on the original model of Java, or on the desire to
unconditionally maintain secondary properties, like memory safety, even in
the presence of data races.

- [Dubious, but plausible assumption] We should be able to explicitly
require those secondary properties, even if we otherwise allow less
than well-defined behavior for data races.

- We're essentially already making the preceding dubious assumption.  We
have all sorts of library APIs that don't specify how they behave in the
presence of races, but only say that you shouldn't do that.  We're somehow
supposed to conclude that the heap can't be corrupted even in the presence
of racy library calls, because of the (unstated?) assumption that libraries
should behave as though they were implemented in Java, and Java programs
preserve e.g. memory safety.  The only place we actually claim (incorrectly
as we now know) to have well-defined semantics for data races is at the
basic type and reference level, which is arguably not even the most
interesting place in a large code base.

- Most of us don't care that much about slowing down potentially racing
accesses a bit.  The reason the load->store ordering solution seems so
dubious for Java is that the compiler doesn't know which accesses might
race, and thus we need to treat many accesses conservatively.  ARM is much
more concerned about the cost for Java than for C++ memory_order_relaxed.

Possible long-term approach:

- Switch to a much more C++-like (or Java library-like) model in which data
races have something like "undefined behavior".  Exactly how to model that
is an open question.  Ordinary loads can only see stores that happen-before
them, but racing loads trigger "undefined behavior". "Undefined behavior"
should be defined to allow both reporting an error and producing any
type-correct answer for the racing load.  Unlike C++, we probably want to
be explicit that it disallows a VM crash.

- [Probably challenging; others understand the Java constraints better than
I do] Introduce a mechanism for (nearly) unordered, memory_order_relaxed-like
racing loads.  Require current racing accesses to use that mechanism
instead.  (Open question: coherence.)

- Strengthen this memory_order_relaxed analog to guarantee load->store
ordering.

- Implementations may specify that they treat all ordinary accesses as
these "memory_order_relaxed" accesses.  Browser-based implementations, or
others that require the current security model, would do so.  Whether and
how to specify this is open.
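
To make the proposal concrete, here is a rough sketch of the classic OOTA
litmus test with every racing access marked explicitly.  It assumes
java.lang.invoke.VarHandle and its opaque access mode, which were added in
Java 9, well after this message; opaque mode is only an approximate stand-in
for the memory_order_relaxed analog discussed above, and it is not specified
to provide the load->store ordering proposed here.  The class and method
names are made up for illustration.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

/** Hypothetical illustration: the OOTA litmus test with explicitly marked racing accesses. */
class RelaxedSketch {
    int x, y; // both start at 0

    static final VarHandle X, Y;
    static {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            X = lookup.findVarHandle(RelaxedSketch.class, "x", int.class);
            Y = lookup.findVarHandle(RelaxedSketch.class, "y", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    /** Runs the litmus test once and returns {r1, r2}. */
    static int[] runOnce() {
        RelaxedSketch s = new RelaxedSketch();
        int[] r = new int[2];
        // Thread 1: r1 = x; y = r1;
        Thread t1 = new Thread(() -> {
            int r1 = (int) X.getOpaque(s); // assumed stand-in for a relaxed load
            Y.setOpaque(s, r1);            // assumed stand-in for a relaxed store
            r[0] = r1;
        });
        // Thread 2: r2 = y; x = r2;
        Thread t2 = new Thread(() -> {
            int r2 = (int) Y.getOpaque(s);
            X.setOpaque(s, r2);
            r[1] = r2;
        });
        t1.start();
        t2.start();
        try {
            t1.join(); // join() orders the writes to r before the read below
            t2.join();
        } catch (InterruptedException e) {
            throw new IllegalStateException(e);
        }
        return r;
    }

    public static void main(String[] args) {
        int[] r = runOnce();
        System.out.println("r1=" + r[0] + " r2=" + r[1]);
    }
}
```

With load->store ordering, each store to x or y is ordered after the load
that feeds it, so neither load can observe a value that depends on itself and
the only outcome is r1 == r2 == 0.  The plain fields x and y would be the
"ordinary" accesses under the proposal; the racing ones are the calls that go
through X and Y.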

Transition strategy:

- Implementations are not required to break existing code in any way,
though the spec would allow them to.  In the near term, we would expect
implementations to continue to support (possibly with a suitable flag) the
current ill-defined model, along with the new model.  (Which should be
easy; the only really new support is for memory_order_relaxed and
load->store ordering.)

- The fact that data race detection is much easier in the new model may
help to inspire people to move to it.

Long term potential advantages:

- No OOTA issues!

- Easier race detection.  A standards-conforming JVM should be able to throw
a data race exception if it finds one.

- Uniform model for basic types and libraries.

- We can delete the special cases for tearing longs and doubles without
unacceptably slowing down some implementations.
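
As a rough illustration of the race detection point above, here is a toy
happens-before checker.  Everything in it is an assumption made for the
sketch: the RaceSketch and DataRaceException names, the explicit thread ids,
and the string names for locks and variables.  It is not how any JVM does
this.  It uses vector clocks, tracks only the most recent write to each
variable (so it misses a racing read followed by a write), and assumes a
thread acquires a lock before releasing it.

```java
import java.util.HashMap;
import java.util.Map;

/** Toy happens-before checker; all names here are hypothetical. */
class RaceSketch {
    /** Hypothetical exception type; nothing like it exists in the JDK. */
    static class DataRaceException extends RuntimeException {
        DataRaceException(String message) { super(message); }
    }

    // clock[t][u]: latest step of thread u known to happen-before thread t's next step
    private final int[][] clock;
    private final Map<String, int[]> lockClock = new HashMap<>();
    // per variable: {writer thread, writer's own clock at the write}
    private final Map<String, int[]> lastWrite = new HashMap<>();

    RaceSketch(int threads) {
        clock = new int[threads][threads];
        for (int t = 0; t < threads; t++) clock[t][t] = 1;
    }

    /** Unlock: publish everything thread t has done so far, then advance t. */
    void release(int t, String lock) {
        lockClock.put(lock, clock[t].clone());
        clock[t][t]++;
    }

    /** Lock: thread t learns everything published by the last release. */
    void acquire(int t, String lock) {
        int[] released = lockClock.get(lock);
        if (released == null) return;
        for (int u = 0; u < released.length; u++) {
            clock[t][u] = Math.max(clock[t][u], released[u]);
        }
    }

    void read(int t, String var) {
        check(t, var);
    }

    void write(int t, String var) {
        check(t, var);
        lastWrite.put(var, new int[] {t, clock[t][t]});
    }

    /** Throws unless the last write to var happens-before thread t's current step. */
    private void check(int t, String var) {
        int[] w = lastWrite.get(var);
        if (w != null && w[1] > clock[t][w[0]]) {
            throw new DataRaceException("race on " + var + " between threads " + w[0] + " and " + t);
        }
    }

    public static void main(String[] args) {
        RaceSketch ordered = new RaceSketch(2);
        ordered.write(0, "x");
        ordered.release(0, "m");
        ordered.acquire(1, "m");
        ordered.read(1, "x"); // ordered by the lock, so no exception

        RaceSketch racy = new RaceSketch(2);
        racy.write(0, "x");
        try {
            racy.read(1, "x"); // nothing orders this read after the write
        } catch (DataRaceException e) {
            System.out.println(e.getMessage());
        }
    }
}
```

The point of the sketch is that the check is a single comparison once ordinary
accesses are only ever allowed to see stores that happen-before them; accesses
made through the relaxed mechanism would simply skip the check.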

Long term disadvantages:

- As for C++, we promise less about behavior of buggy programs.  AFAICT,
much has been written about this, but it isn't a major practical issue
along the lines of memory safety.  And again, we already have that problem
for library clients.

- Security sensitive code will run slower and need a separate JVM flag.
Again, my impression is there is little of it, so this isn't a big deal.
(There are probably some mild security advantages in exchange; untrusted
racy code can induce slightly fewer perverse behaviors.)

Short term issues:

- This would declare significant amounts of currently correct code to be
deprecated and in the future incorrect.  But if we can figure out how to
cleanly introduce something like memory_order_relaxed, I think the required
changes would clearly also improve readability.