Improve determinism in the Java language
I'm leaning towards making certain parts of Java stricter if it's being compiled ahead of time, such as the compile-time linking you mention, much like what languages such as C and C++ require you to do when generating binaries (using the rough analogy of object files as compared to classfiles). Many of the dynamic features in the language typically only make sense when running on a JVM anyway, such as using reflection to modify access to fields and methods, something which is significantly harder to do in a standalone executable. Not being able to optimize code based on a certain condition seems like a bit of a waste to me.

Best regards,
Julian

On Wed, Jun 1, 2022 at 5:21 AM Ioi Lam <ioi.lam@oracle.com> wrote:
A lot of the recent Leyden discussion has been around "what optimizations can be done ahead of time" (e.g., static field initialization). However, I think we also need to look at a lower level.
One reason that Java has been difficult to optimize ahead-of-time is the tremendous dynamism in the language.
Here are a few things that I think we can do to make Java programs more deterministic so that ahead-of-time optimizations can be applied:
1 Deterministic Program Code
A Java program can essentially rewrite itself and even the libraries it uses. Here's an example:
    class App {
        static {
            if (...) {
                MethodHandles.lookup()
                    .defineClass(.. hacked App$Bar ...);
            }
        }
        static final Bar bar = new Bar();
        static class Bar { .... }
    }
- We can't effectively AOT-compile the program code because the native code may not match the bytecode generated at runtime.
- We can't pre-initialize the App.bar field because its shape may be different.
One idea is to disallow such code patching when Leyden is enabled. For example, we could require that an application be "prelinked" in order to use Leyden, which means that as soon as the application is loaded, the classes App and App$Bar are already loaded. The defineClass() call would then fail with a LinkageError (duplicate class definition).
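The failure mode described above can already be observed on a current JDK (9 or later) once the target class has been loaded. The following is only a rough sketch: the names PrelinkDemo and redefinitionRejected are made up for illustration, and it assumes the compiled class file of Bar is reachable as a resource through the class loader. It says nothing about how Leyden itself would enforce prelinking.

```java
import java.io.IOException;
import java.io.InputStream;
import java.lang.invoke.MethodHandles;

// Hypothetical demo class; not part of any Leyden proposal.
public class PrelinkDemo {
    static class Bar { }

    /** Returns true if defining Bar a second time is rejected with a LinkageError. */
    public static boolean redefinitionRejected() throws IOException, IllegalAccessException {
        // Evaluating the class literal loads Bar, so it is already defined in this loader.
        String resource = Bar.class.getName().replace('.', '/') + ".class";
        byte[] bytes;
        // Assumption: the class file is visible as a resource of the defining loader.
        try (InputStream in = PrelinkDemo.class.getClassLoader().getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalStateException("class file not found: " + resource);
            }
            bytes = in.readAllBytes();
        }
        try {
            // Attempt to define a class with the same name in the same loader and package.
            MethodHandles.lookup().defineClass(bytes);
            return false;
        } catch (LinkageError e) {
            // The class is already defined, so the duplicate definition is refused.
            return true;
        }
    }

    public static void main(String[] args) throws Exception {
        System.out.println("redefinition rejected: " + redefinitionRejected());
    }
}
```

Here the bytes are an unmodified copy of Bar; a "hacked" variant would be refused in the same way, because the duplicate check is on the class name within the loader.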
2 Decouple class namespaces from dynamic bytecode generation
This is a corollary of the above item. Java uses ClassLoader.defineClass() for BOTH setting up class namespaces and dynamic bytecode generation. I would stipulate that most users of Leyden want to do the former and not the latter.
We should have a new API to load a fixed set of classes into a namespace.
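To make the idea more concrete, here is one way such a namespace could be approximated with today's ClassLoader API. This is purely an illustrative sketch: FixedSetClassLoader and rejectsUnknown are hypothetical names, and nothing here reflects an actual or planned Leyden API. The loader receives its complete set of class files at construction time and never defines anything outside that set.

```java
import java.util.Map;

// Hypothetical sketch of a loader whose set of classes is fixed up front.
public final class FixedSetClassLoader extends ClassLoader {
    private final Map<String, byte[]> classFiles;

    public FixedSetClassLoader(Map<String, byte[]> classFiles, ClassLoader parent) {
        super(parent);
        // The set of class names is frozen here; no classes can be added later.
        this.classFiles = Map.copyOf(classFiles);
    }

    @Override
    protected Class<?> findClass(String name) throws ClassNotFoundException {
        byte[] bytes = classFiles.get(name);
        if (bytes == null) {
            // Anything outside the fixed set is simply not part of this namespace.
            throw new ClassNotFoundException(name);
        }
        return defineClass(name, bytes, 0, bytes.length);
    }

    /** Returns true if a loader with an empty class set refuses to load the given name. */
    public static boolean rejectsUnknown(String name) {
        ClassLoader loader = new FixedSetClassLoader(Map.of(), null);
        try {
            loader.loadClass(name);
            return false;
        } catch (ClassNotFoundException e) {
            return true;
        }
    }
}
```

Because the full set of classes is known before any of them is loaded, a tool could in principle inspect all of them ahead of time. A real API would also have to close other routes into the namespace, such as Lookup.defineClass().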
3 <clinit> order
Java allows <clinit>s that recursively depend on each other. The result depends on the reference order of these classes.
    class A {
        static int a = B.b++;   // not final, so that B's ++ can write it
    }
    class B {
        static int b = A.a++;   // not final, so that A's ++ can write it
    }
We could have a problem if the application assumes that A is always initialized before B, but the Leyden optimizer initializes them in the opposite order.
We could:
- Refuse to optimize classes that have mutually recursive <clinit>, or
- Change the language spec to give the JVM more freedom to decide the initialization order.
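The order dependence is easy to reproduce. The sketch below is a variant of the example above, using + 1 instead of ++ so that the fields can stay final. All class and method names are invented for the demonstration. Since a class is initialized only once per loader, it uses two identical pairs of classes and touches them in opposite orders.

```java
// Hypothetical demo; class and method names are illustrative only.
public class ClinitOrderDemo {
    // First pair: A1 will be touched first.
    static class A1 { static final int a = B1.b + 1; }
    static class B1 { static final int b = A1.a + 1; }

    // Second pair: identical code, but B2 will be touched first.
    static class A2 { static final int a = B2.b + 1; }
    static class B2 { static final int b = A2.a + 1; }

    /** Initializes A1 first and returns {A1.a, B1.b}. */
    public static int[] initAFirst() {
        // Reading A1.a starts A1's initializer, which triggers B1's initializer.
        // B1 then sees A1.a still at its default value 0.
        int a = A1.a;
        return new int[] { a, B1.b };
    }

    /** Initializes B2 first and returns {A2.a, B2.b}. */
    public static int[] initBFirst() {
        // Reading B2.b starts B2's initializer, which triggers A2's initializer.
        // A2 then sees B2.b still at its default value 0.
        int b = B2.b;
        return new int[] { A2.a, b };
    }

    public static void main(String[] args) {
        int[] first = initAFirst();
        int[] second = initBFirst();
        System.out.println("A first: a=" + first[0] + ", b=" + first[1]);   // a=2, b=1
        System.out.println("B first: a=" + second[0] + ", b=" + second[1]); // a=1, b=2
    }
}
```

The same source text yields different field values depending only on which class is referenced first, which is exactly the freedom an ahead-of-time initializer would need to either preserve or be explicitly granted.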
Participants (2): Ioi Lam, Julian Waters