Improve determinism in the Java language
Ioi Lam
ioi.lam at oracle.com
Tue May 31 21:21:33 UTC 2022
A lot of the recent Leyden discussion has been around "what
optimizations can be done ahead of time" (e.g., static field
initialization). However, I think we also need to look at a
lower level.
One reason that Java has been difficult to optimize ahead-of-time
is the tremendous dynamism in the language.
Here are a few things that I think we can do to make Java programs
more deterministic so that ahead-of-time optimizations can
be applied:
1 Deterministic Program Code
A Java program can essentially rewrite itself and even
the libraries it uses. Here's an example:
    class App {
        static {
            if (...) {
                MethodHandles.lookup()
                    .defineClass(.. hacked App$Bar ...);
            }
        }
        static final Bar bar = new Bar();
        static class Bar {
            ....
        }
    }
- We can't effectively AOT-compile the program code because
the native code may not match the runtime generated
bytecodes.
- We can't pre-initialize the App.bar field because its shape
may be different.
One idea is to disallow such code patching when Leyden is enabled.
For example, we can require that to use Leyden, an application
must be "prelinked", which means that as soon as the application
is loaded, the classes App and App$Bar are already loaded. The
defineClass() call would then fail with a LinkageError (duplicate
class definition).
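
To illustrate the failure mode, here is a minimal sketch of what happens
today when a class that is already loaded gets defined a second time
through MethodHandles.Lookup.defineClass(). The names PrelinkDemo and
redefinitionRejected are made up for this example, and the sketch assumes
the compiled class file can be read back as a resource (true for a normal
classpath run, not necessarily for every launcher).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;

// Hypothetical demo class, not part of any Leyden proposal.
class PrelinkDemo {
    static class Bar { }

    // Returns true if defining a second copy of the already-loaded
    // PrelinkDemo$Bar is refused with a LinkageError.
    static boolean redefinitionRejected() {
        // The class literal forces Bar to be loaded (not initialized).
        Class<?> alreadyLoaded = Bar.class;

        // Assumption: the class file is visible as a resource next to
        // PrelinkDemo.class.
        byte[] bytes;
        try (InputStream in =
                 PrelinkDemo.class.getResourceAsStream("PrelinkDemo$Bar.class")) {
            if (in == null) {
                throw new IllegalStateException("class file not readable as a resource");
            }
            bytes = in.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }

        try {
            // Tries to define the same class again in the same loader.
            MethodHandles.lookup().defineClass(bytes);
            return false;   // the duplicate was accepted
        } catch (LinkageError e) {
            return true;    // the duplicate definition was refused
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println("redefinition rejected: " + redefinitionRejected());
    }
}
```

With prelinking, every class of the application would be in the
"already loaded" state that Bar is in here, so a patching attempt like
the one in App's static initializer could not take effect.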
2 Decouple class namespaces from dynamic bytecode generation
This is a corollary of the above item. Java uses
ClassLoader.defineClass() for BOTH setting up class namespaces
and dynamic bytecode generation. I would guess that most users
of Leyden want to do the former and not the latter.
We should have a new API to load a fixed set of classes
into a namespace.
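
As a rough sketch only, such an API could look something like a loader
whose set of classes is closed at construction time. Nothing below is an
existing or proposed JDK API: FixedSetClassLoader and canLoad are
hypothetical names, and a real design would more likely hand the class
set to the JVM directly rather than go through defineClass().

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch, not an existing JDK API: a namespace whose
// contents are fixed when the loader is created.
final class FixedSetClassLoader extends ClassLoader {
    // Binary class name -> class file bytes; never changes after construction.
    private final Map<String, byte[]> classFiles;

    FixedSetClassLoader(Map<String, byte[]> classFiles, ClassLoader parent) {
        super(parent);
        // Copy the map and the arrays so later changes by the caller
        // cannot alter the set.
        Map<String, byte[]> copy = new HashMap<>();
        classFiles.forEach((name, bytes) -> copy.put(name, bytes.clone()));
        this.classFiles = Map.copyOf(copy);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = classFiles.get(name);
        if (bytes == null) {
            // Anything outside the fixed set is simply not in this namespace.
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, bytes, 0, bytes.length);
    }

    // Convenience helper for this example: can the name be resolved
    // through this loader or its parents?
    boolean canLoad(String name) {
        try {
            loadClass(name);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

Because the full class set is known when the loader is created, an
ahead-of-time tool could in principle see every class in the namespace
before the application runs.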
3 <clinit> order
Java allows <clinit>s that recursively depend on each other. The
result depends on the order in which these classes are first
referenced.
    class A { static int a = B.b + 1; }
    class B { static int b = A.a + 1; }
We could have a problem if the application assumes that A is
always initialized before B, but the Leyden optimizer
initializes them in the opposite order.
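
To make the order dependence concrete, here is a small runnable sketch.
It uses two separate pairs of mutually dependent classes (A1/B1 and
A2/B2, names invented for this example) because a class is initialized
only once per JVM; the fields are non-final and use "+ 1" so that the
code compiles.

```java
// Hypothetical demo: the same recursive <clinit> pattern, touched in
// two different orders.
class ClinitOrderDemo {
    static class A1 { static int a = B1.b + 1; }
    static class B1 { static int b = A1.a + 1; }

    static class A2 { static int a = B2.b + 1; }
    static class B2 { static int b = A2.a + 1; }

    // A1 is initialized first. Its <clinit> triggers B1, which sees
    // A1.a still at its default value 0, so b = 1 and then a = 2.
    static String touchAFirst() {
        int a = A1.a;
        int b = B1.b;
        return a + "," + b;   // "2,1"
    }

    // B2 is initialized first. Its <clinit> triggers A2, which sees
    // B2.b still at its default value 0, so a = 1 and then b = 2.
    static String touchBFirst() {
        int b = B2.b;
        int a = A2.a;
        return a + "," + b;   // "1,2"
    }

    public static void main(String[] args) {
        System.out.println("A first: " + touchAFirst());
        System.out.println("B first: " + touchBFirst());
    }
}
```

The two pairs have identical code, yet they end up with different field
values purely because of which class was touched first.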
We could:
- Refuse to optimize classes that have mutually recursive
<clinit>, or
- Change the language spec to give the JVM more freedom to
decide the initialization order.