[jmm-dev] Implementing JMM on top of LLVM

Evgeniy Moiseenko evgeniy.moiseenko at jetbrains.com
Tue Jun 27 15:16:11 UTC 2023


Hi!

I am working in a team at JetBrains that is currently studying memory models
in the context of the Kotlin/JVM and Kotlin/Native (implemented through LLVM)
compiler backends.
We want to arrive at a common multiplatform memory model, preferably one
close to JMM.

We discovered several issues which, I believe, are not specific to Kotlin
but rather affect any Java implementation on top of LLVM (e.g. the Graal AOT
native-image compiler). They stem from the differences between the Java and
LLVM memory models.

Here is the list of known problems.

1) The compiler backend of Kotlin/Native (and the Graal AOT compiler too)
uses the `NotAtomic` access mode for plain accesses, instead of the
`Unordered` mode recommended by the LLVM documentation itself.
   https://llvm.org/docs/Atomics.html#unordered

   The problem is that races on `NotAtomic` accesses result in reads of
undefined values (`undef`) --- this can violate any safety guarantee of the
Java language (the problem, in fact, is even wider and can affect any "safe"
language compiled through LLVM).
   https://llvm.org/docs/LangRef.html#undefined-values

   For example, in the following program (two threads, separated by `||`),
LLVM semantics allows printing the message "Error".

   x = 1 || t = x; if (t < 0 && t >= 0) { println("Error") }

   As you can imagine, it is easy to construct a similar example that
violates, e.g., memory safety by bypassing an array bounds check.
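   To illustrate, here is a sketch of the shape such a bounds-check bypass
could take (this is our own hypothetical example --- the class, field, and
array names are invented, and as noted below we could not actually reproduce
this behavior with current LLVM):

```java
// Hypothetical sketch: how a duplicated racy load could bypass a
// bounds check if plain accesses are compiled as NotAtomic.
class Racy {
    static int index; // written by another thread without synchronization
    static final int[] DATA = {10, 20, 30};

    static int read() {
        int i = index; // racy plain load
        if (i >= 0 && i < DATA.length) {
            // If the compiler re-reads `index` here (rematerialization)
            // instead of reusing `i`, the re-read value may differ from
            // the one that passed the bounds check, producing an
            // out-of-bounds access.
            return DATA[i];
        }
        return -1;
    }
}
```

   Single-threaded, the code is of course well-behaved; the danger only
arises if the compiler duplicates the racy load between the check and the
use.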

   The particularly problematic compiler transformation that can result in
such behavior is register spilling/rematerialization. However, to the best
of our knowledge, LLVM is currently unable to perform this optimization
(except in cases where the load is trivially rematerializable, e.g. a load
from a constant). We could not reproduce such behaviors, and we are unaware
of any other problematic optimizations. Thus we believe that this problem is
currently mostly theoretical. Yet the question remains: should Java
implementations use the `NotAtomic` or the `Unordered` access mode for plain
accesses?

   We have tried to replace the access mode of all plain accesses emitted by
Kotlin/Native with `Unordered`. However, we observed significant performance
overhead (for example, 10% on average and 20% at most on M1 ARM), which
currently stops us from adopting the `Unordered` access mode in the
Kotlin/Native compiler.

2) It seems that the safe-initialization semantics (and the `final` field
semantics) is formally unsound with respect to the LLVM memory model.
   For example, consider the following program:

   class A {
       final int x;

       static A a;
   }

   void writer() {
       a = new A();
   }

   void reader() {
       if (a != null) {
           int y = a.x; // guaranteed to read 0
       }
   }

   Here JMM guarantees that the field is initialized with its default value,
even in the presence of unsafe publication of the object through a race.
   Similar rules apply to the initialization of `final` fields:

   class A {
       final int x;
       A() { x = 42; }
       static A a;
   }

   void writer() {
       a = new A();
   }

   void reader() {
       if (a != null) {
           int y = a.x; // guaranteed to read 42
       }
   }

   (In addition, JMM also guarantees a happens-before ordering between the
initialization store `x = 42` and the load `int y = a.x`.)

   To achieve this, Java implementations put a release fence (i) after the
default initialization of an object, and (ii) at the end of a constructor of
an object containing `final` fields.
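   In Java library terms, the fence placement for case (ii) can be sketched
with `VarHandle.releaseFence()` (this is our own illustration of where the
fence conceptually sits, not actual compiler-emitted code):

```java
import java.lang.invoke.VarHandle;

// Sketch: where an implementation conceptually places the release
// fence that backs final-field semantics.
class A {
    final int x;

    A() {
        x = 42;                   // initialization store
        VarHandle.releaseFence(); // (ii) fence at the end of the constructor
    }

    static A a; // may be published unsafely through a race
}
```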

   However, according to the LLVM memory model, this code gives no
guarantees (neither on the read value nor on happens-before ordering): in
LLVM, only a pair of release and acquire operations can establish
synchronization. This is quite problematic. While the semantics of `final`
fields is controversial and could be sacrificed in the worst case, the
safe-initialization guarantee still has to be provided, because a "safe"
language cannot tolerate reads of uninitialized "garbage" values.

   In fact, for the read `a != null` we want a semantics similar to that of
the `memory_order_consume` ordering. However, this ordering was abandoned in
the C++ memory model, and it was never part of the LLVM model.

   Moreover, because plain loads are intended to be compiled to `Unordered`
accesses, this `memory_order_consume`-like semantics would have to be
provided for `Unordered` accesses. But then we probably want the access
modes to be monotonically stronger: if we want such behavior for
`Unordered`, we might also want it for `Monotonic` (i.e. relaxed/opaque).
(For `Acquire` and stronger modes we already have it.)


3) As was recently shown in the paper "Compiling Volatile Correctly in
Java" by Liu, Bender, and Palsberg, sequentially consistent atomic accesses
have different semantics in Java and C++ (and thus in LLVM).
   Java has stronger semantics for volatile accesses: it guarantees that
these accesses behave similarly to sequentially consistent fences
(`fullFence`). Because of that, a large class of optimizations that are
sound for C++/LLVM seq_cst atomic accesses is unsound for Java volatile
accesses.
   For example, a Java compiler cannot remove a dead volatile store (more
examples can be found in the paper). Again, according to our preliminary
experiments, the current LLVM implementation does not appear to actually
eliminate seq_cst accesses, but the memory model specification still allows
it.
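   A minimal sketch of the dead-store case (our own example, modelled on the
paper's observation):

```java
// Sketch: eliminating the "dead" first store is legal for C++/LLVM
// seq_cst atomics but unsound for Java volatile, because a volatile
// store also acts like a fence and another thread is allowed to
// observe v == 1.
class Counter {
    static volatile int v;

    static void write() {
        v = 1; // a C++ compiler may drop this store; a Java compiler may not
        v = 2;
    }
}
```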

   This implies that volatile accesses in Java cannot simply be compiled as
accesses with the `SequentiallyConsistent` mode in LLVM, but must instead be
compiled as a pair of an Acquire/Release access and a sequentially
consistent fence.
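Expressed with `VarHandle` operations, one plausible shape of such a mapping
looks as follows (the exact fence placement here is our assumption based on
the description above, not a verbatim compiler scheme):

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;

// Sketch of a fence-based mapping for volatile accesses, written with
// VarHandle operations. The precise fence placement is an assumption
// for illustration; the paper discusses the correct schemes in detail.
class Box {
    int value;

    static final VarHandle VALUE;
    static {
        try {
            VALUE = MethodHandles.lookup()
                    .findVarHandle(Box.class, "value", int.class);
        } catch (ReflectiveOperationException e) {
            throw new ExceptionInInitializerError(e);
        }
    }

    void volatileWrite(int v) {
        VALUE.setRelease(this, v); // Release store ...
        VarHandle.fullFence();     // ... followed by a seq_cst fence
    }

    int volatileRead() {
        VarHandle.fullFence();              // seq_cst fence before ...
        return (int) VALUE.getAcquire(this); // ... an Acquire load
    }
}
```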

I would be really grateful to hear the opinions of other JMM experts on
these problems.

P.S. I intentionally did not touch the out-of-thin-air problem, because it
is orthogonal to the problems described above.


-- 
Best Regards,
Evgenii Moiseenko
