Experimentation with build time and runtime class initialization in qbicc

Fri May 27 16:15:30 UTC 2022

From: Brian Goetz <brian.goetz at oracle.com>
> From reading your notes, it seems that at build time, you start with the root class(es), execute their <clinit>, 
> which will cause loading of more classes, more <clinits>, and you iterate until there are no new classes to initialize.  
> You then treat the statics as roots, and serialize those objects to the initial heap image.  But before doing that, 
> you exclude (zero out) any which are marked as "reinitialize at runtime."  

This is correct.  In addition, as qbicc serializes each object, it also looks for annotations on instance fields that tell it to substitute a different value instead of serializing the field's build-time value.  FileDescriptor is a motivating example: we want to serialize a closed FileDescriptor so that any runtime read or write through it raises the proper exception.
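
To make the substitution idea concrete, here is a minimal sketch in plain Java.  It is not qbicc code: the @SerializedAsInt annotation, the FakeFileDescriptor class, and the snapshot method are all hypothetical names invented for this illustration, and a map stands in for the serialized heap image.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

class SerializeSketch {
    // Hypothetical annotation (not qbicc's): "write this value instead of the build-time value".
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface SerializedAsInt {
        int value();
    }

    // Hypothetical stand-in for java.io.FileDescriptor.
    static class FakeFileDescriptor {
        @SerializedAsInt(-1) // -1 models "closed" in this sketch
        int fd;
        String label;

        FakeFileDescriptor(int fd, String label) {
            this.fd = fd;
            this.label = label;
        }
    }

    // Models serializing one object: each instance field is copied into the image,
    // unless it carries the annotation, in which case the substitute is written.
    static Map<String, Object> snapshot(Object obj) {
        Map<String, Object> image = new LinkedHashMap<>();
        for (Field f : obj.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;
            }
            f.setAccessible(true);
            SerializedAsInt substitute = f.getAnnotation(SerializedAsInt.class);
            try {
                Object value = (substitute != null) ? (Object) substitute.value() : f.get(obj);
                image.put(f.getName(), value);
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return image;
    }
}
```

Note that the build-time object itself is left untouched in this sketch; only the serialized copy sees the substituted value.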

> ... what happens in cases like this:
> 
> class Aliased { 
>        @RuntimeInitialized private static final Socket s = ...;
>        private static final Socket copy = s;
> }

First, I'll say what this code snippet would do with qbicc, then I'll say what the program should be to get the semantics the programmer probably intended.

At build time, qbicc will execute the <clinit> of Aliased.  Presumably the ... expression will allocate a Socket object, and references to that object will be stored in both s and copy.  Any build-time use of either s or copy via a build-time executed getstatic will get a reference to that Socket object in the build-time heap.  The @RuntimeInitialized annotation has no impact on the build-time execution of code.  At the end of compilation, when we serialize the static fields of Aliased, we will write null for s and a reference to the serialized Socket object for copy.

In the generated code, every getstatic of s will be preceded by a check that the <rtinit> method for s has been executed (similar to how a clinit check would be generated in a JVM).  Since copy does not have an <rtinit>, a getstatic of copy in the generated code will not be preceded by any check.  The first time s is accessed at runtime, the <rtinit> method will execute the ... code, and a new Socket object will be created and stored in s.  The fields s and copy will now point to distinct Socket objects.  Uses of the Socket object reachable from copy would likely result in an exception, because the backing FileDescriptor of that Socket would have been modified during serialization so that its instance fields have the values of a closed FileDescriptor.
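
The sequence above can be simulated in ordinary Java.  This is only a model of the behaviour described, not how qbicc is implemented: FakeSocket, StaticSlot, and the buildTime/serialize methods are hypothetical names, and a boolean flag stands in for the closed FileDescriptor.

```java
import java.util.function.Supplier;

class AliasingSketch {
    // Hypothetical stand-in for Socket; 'open' models the state of its FileDescriptor.
    static class FakeSocket {
        boolean open = true;
    }

    // Models one static field, optionally with a runtime initializer.
    static class StaticSlot {
        FakeSocket value;
        final Supplier<FakeSocket> rtinit; // null when the field has no runtime initializer
        boolean rtinitDone;

        StaticSlot(FakeSocket buildTimeValue, Supplier<FakeSocket> rtinit) {
            this.value = buildTimeValue;
            this.rtinit = rtinit;
        }

        // Build-time read: no check, just the build-time heap value.
        FakeSocket buildTimeRead() {
            return value;
        }

        // Runtime read: guarded by an initialization check only if an rtinit exists.
        FakeSocket runtimeRead() {
            if (rtinit != null && !rtinitDone) {
                value = rtinit.get();
                rtinitDone = true;
            }
            return value;
        }
    }

    static StaticSlot s;
    static StaticSlot copy;

    // Models running the <clinit> of Aliased at build time: one object, two references.
    static void buildTime() {
        FakeSocket socket = new FakeSocket();
        s = new StaticSlot(socket, FakeSocket::new); // runtime-initialized field
        copy = new StaticSlot(socket, null);         // ordinary field
    }

    // Models writing the heap image: null for s, a "closed" object for copy.
    static void serialize() {
        s.value = null;
        copy.value.open = false;
    }

    public static void main(String[] args) {
        buildTime();
        System.out.println(s.buildTimeRead() == copy.buildTimeRead());
        serialize();
        System.out.println(s.runtimeRead() == copy.runtimeRead());
    }
}
```

In this model the two fields alias at build time, but after serialization the first runtime read of s creates a fresh object, while copy still refers to the (now closed) build-time one.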

Using the syntax above, one would need to write this code to get the intended aliasing at both build-time and runtime. 
class Aliased { 
        @RuntimeInitialized private static final Socket s = ...;
        @RuntimeInitialized private static final Socket copy = s;
}

The way we would actually write this pattern in qbicc today is a little more indirect, because (1) we don't want to change javac and (2) we don't want to edit OpenJDK source code directly (this makes it easier to consume updates). Therefore, we define a "patch class" with a @RuntimeAspect annotation that qbicc combines with the unmodified bytecodes of the Aliased class to get what we need.  I've added a third field just to emphasize that we need to allow the <rtinit> of a class to cover only a subset of what its <clinit> initializes.

class Aliased { 
        private static final Socket s = ...;
        private static final Socket copy = s;
        private static final Object anotherField = ...;
}

@RuntimeAspect(Aliased.class)
class Aliased_RT {
        private static final Socket s = ...;
        private static final Socket copy = s;
}

The only part of the Aliased_RT class we are interested in is the <clinit> method that javac generated for it.  The qbicc compiler takes Aliased_RT's <clinit> and uses it as the <rtinit> method for the fields s and copy of the Aliased class.  The rest of the Aliased_RT class is ignored.
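
As a rough sketch of the pairing step only, the following plain Java maps a target class to its patch class by reading an annotation.  Everything here is an assumption made for illustration: the RuntimeAspect annotation is a local stand-in declared inside the sketch (not qbicc's own annotation type), the field types are simplified to Object, and deriving the runtime-initialized fields from the static fields the patch class declares is my guess at a plausible rule, not a description of what qbicc does.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.HashMap;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

class PatchLookupSketch {
    // Local stand-in for illustration; not qbicc's annotation type.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.TYPE)
    @interface RuntimeAspect {
        Class<?> value();
    }

    // Simplified versions of the classes from the example.
    static class Aliased {
        static Object s;
        static Object copy;
        static Object anotherField;
    }

    @RuntimeAspect(Aliased.class)
    static class Aliased_RT {
        static Object s;
        static Object copy;
    }

    // Maps each patched class to the patch class that names it.
    static Map<Class<?>, Class<?>> indexPatches(Class<?>... candidates) {
        Map<Class<?>, Class<?>> patches = new HashMap<>();
        for (Class<?> c : candidates) {
            RuntimeAspect aspect = c.getAnnotation(RuntimeAspect.class);
            if (aspect != null) {
                patches.put(aspect.value(), c);
            }
        }
        return patches;
    }

    // Names of the static fields a class declares (assumed rule for picking rtinit fields).
    static Set<String> staticFieldNames(Class<?> c) {
        Set<String> names = new TreeSet<>();
        for (Field f : c.getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) && !f.isSynthetic()) {
                names.add(f.getName());
            }
        }
        return names;
    }
}
```

Under these assumptions, looking up Aliased yields Aliased_RT, whose declared static fields (s and copy) are a strict subset of those of Aliased.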

If one were able to change javac, then the simpler @RuntimeInitialized syntax you used would be better.  From a single class definition, javac could generate both a <clinit> method that initializes s, copy, and anotherField and an <rtinit> method that initializes only s and copy.

Finally, qbicc does not attempt to recognize when an object that is directly referred to by a @RuntimeInitialized static field is also reachable in some other (perhaps deeply nested) way.  As a result, it is certainly possible to write programs in which the build-time and runtime identity (==) of two access paths differs.  So far, this hasn't been an issue for us, but it is one of the ways in which one could detect at runtime that something non-standard has happened.

Hope this explains more clearly without being tediously long,

--dave