Experimentation with build time and runtime class initialization in qbicc
David P Grove
groved at us.ibm.com
Thu May 26 20:22:06 UTC 2022
Hi,
I’ve appended the contents of the referenced wiki page in this email. Apologies in advance if the formatting doesn’t come through as intended.
There is a full implementation of this (GPLv2 + Classpath exception) as part of the qbicc project on GitHub. There is also a GitHub discussion in the qbicc project that links to various GitHub issues that capture the history that led to the current design. I will not hyperlink to those here so that if people have any IP concerns, they can avoid seeing them. They are easily findable.
Regards,
--dave
## Overview
One of the goals of the qbicc project is to explore technical approaches for adapting Java's specification of class initialization to fully support native image compilation. Enabling build-time evaluation of complex class initialization logic is essential for obtaining many of the benefits of native image compilation: reduced memory footprint and fast startup. However, both the core JDK and many frameworks will not primarily be used in native image scenarios. Therefore, it is essential that the approach taken for build-time initialization enables both the existing runtime class initialization and the new build-time class initialization logic to co-exist. Furthermore, for as many cases as possible, the class initialization code should be shared between the two usage scenarios and have non-surprising semantics in both.
## Build-time Initialization
In qbicc, all classes are initialized at build-time. Class initialization at build time is performed according to the existing semantics of Java class initialization driven by build-time execution of the `<clinit>` methods of reachable classes. The set of reachable classes is determined iteratively, starting with the program entrypoints and adding the methods and classes they utilize until no further reachable classes are discovered (a fixed point is reached).
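The fixed-point iteration described above can be sketched as a simple worklist algorithm. This is only an illustration under assumptions, not qbicc's implementation: the class name `ReachabilitySketch`, the `reachable` method, and the `uses` map (each name mapped to the names it references directly) are all hypothetical.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

// Hypothetical model: classes and methods are plain names, and `uses`
// records which names each one references directly.
final class ReachabilitySketch {

    // Returns every name reachable from the entry points. The loop ends
    // when the worklist is empty, i.e. when a fixed point is reached.
    static Set<String> reachable(List<String> entryPoints,
                                 Map<String, List<String>> uses) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> worklist = new ArrayDeque<>(entryPoints);
        while (!worklist.isEmpty()) {
            String current = worklist.removeFirst();
            if (!seen.add(current)) {
                continue; // already processed
            }
            for (String next : uses.getOrDefault(current, List.of())) {
                if (!seen.contains(next)) {
                    worklist.addLast(next);
                }
            }
        }
        return seen;
    }
}
```

In this model, a name that is never referenced from an entry point is never added to the result, so its initializer would never be run at build time.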
After build-time initialization has completed, a build-time heap has been constructed that contains the objects that were created during the build-time execution of the `<clinit>` methods. Using the reachable static fields of the reachable program as roots, this build-time heap is serialized into the native image. This set of objects will form the initial runtime heap of the program when it is executed.
## Runtime Initializers
There are cases where one or more initialization actions of a class **must** be executed at program runtime. Most typically these involve the creation of native resources (open files, threads, etc.) that cannot be successfully serialized into the build-time heap.
Qbicc supports runtime initialization by allowing static fields of a class to be declared as runtime initialized. These fields will be initialized lazily, at first access, by executing a runtime initializer (`<rtinit>`) associated with the accessed field. Runtime initialization is localized: accessing a particular static field will cause its runtime initializer to be executed but has no implications for other runtime initializers defined either in the field's defining class or any superclass or implemented interface of the field's defining class.
When serialized from the build-time heap to the runtime heap, all runtime-initialized fields will be serialized with the zero (uninitialized) value appropriate for their type.
Qbicc allows related static fields in the same class to share a common `<rtinit>` method. The first access to any of the fields will cause the execution of the associated `<rtinit>` method and the initialization of all the fields.
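The lazy, localized behaviour of shared `<rtinit>` methods can be approximated in standard Java with nested holder classes, each of which has its own static initializer. This is an analogy rather than the qbicc mechanism: in qbicc the fields stay in their defining class, whereas here each group is moved into a hypothetical holder (`ClockGroup`, `EnvGroup`), and `LOG` exists only to make the initialization order observable.

```java
import java.util.ArrayList;
import java.util.List;

// Emulation only: each nested holder stands in for one group of
// runtime-initialized fields that share a single initializer.
final class RuntimeInitSketch {
    // Records which group initializers have run, in order.
    static final List<String> LOG = new ArrayList<>();

    // Two related fields sharing one initializer: the first access to
    // either field initializes both.
    static final class ClockGroup {
        static final long START_NANOS;
        static final String LABEL;
        static {
            LOG.add("ClockGroup");
            START_NANOS = System.nanoTime();
            LABEL = "started";
        }
    }

    // An unrelated group: touching ClockGroup does not initialize it.
    static final class EnvGroup {
        static final String TMP_DIR;
        static {
            LOG.add("EnvGroup");
            TMP_DIR = System.getProperty("java.io.tmpdir");
        }
    }
}
```

Reading `RuntimeInitSketch.ClockGroup.LABEL` runs only the `ClockGroup` initializer; `EnvGroup` stays uninitialized until one of its own fields is read.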
## Adjusting Heap Serialization
For some objects it is necessary to initialize them during build-time initialization, but "reset" them before they are used at runtime.
Qbicc supports this by allowing fields to be annotated to be serialized as the type-appropriate zero value or as a primitive constant value. This value replacement happens as the build-time heap is serialized.
One common scenario is to invalidate objects that are wrapping native resources. For example, when a `FileDescriptor` is serialized, its `fd` and `handle` instance fields are serialized as `-1` and its `closed` field is serialized as `true`. Thus, any attempt to use the build-time `FileDescriptor` at runtime will raise the appropriate exception.
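The value replacement can be illustrated with a small reflection-based sketch. Everything here is an assumption made for illustration: the `@ResetTo` annotation, the `Descriptor` class (a stand-in for `FileDescriptor`), and the `snapshot` method are hypothetical and do not reflect qbicc's actual annotations or serializer.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

final class HeapResetSketch {

    // Hypothetical annotation: the value to write in place of the
    // field's build-time value when the object is serialized.
    @Retention(RetentionPolicy.RUNTIME)
    @Target(ElementType.FIELD)
    @interface ResetTo {
        String value();
    }

    // Stand-in for an object that wraps a native resource.
    static final class Descriptor {
        @ResetTo("-1") int fd;
        @ResetTo("-1") long handle;
        @ResetTo("true") boolean closed;
        String note; // not annotated: serialized unchanged

        Descriptor(int fd, long handle, String note) {
            this.fd = fd;
            this.handle = handle;
            this.note = note;
        }
    }

    // "Serializes" an object into a field-name -> value map, replacing
    // the values of annotated fields as it goes.
    static Map<String, Object> snapshot(Object obj) {
        Map<String, Object> image = new LinkedHashMap<>();
        for (Field f : obj.getClass().getDeclaredFields()) {
            if (Modifier.isStatic(f.getModifiers()) || f.isSynthetic()) {
                continue;
            }
            f.setAccessible(true);
            ResetTo reset = f.getAnnotation(ResetTo.class);
            try {
                image.put(f.getName(),
                        reset == null ? f.get(obj) : parse(f.getType(), reset.value()));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException(e);
            }
        }
        return image;
    }

    // Converts the annotation text to the field's primitive type.
    private static Object parse(Class<?> type, String text) {
        if (type == int.class) return Integer.parseInt(text);
        if (type == long.class) return Long.parseLong(text);
        if (type == boolean.class) return Boolean.parseBoolean(text);
        throw new IllegalArgumentException("unsupported field type: " + type);
    }
}
```

Calling `HeapResetSketch.snapshot(new HeapResetSketch.Descriptor(7, 42L, "log"))` yields a map in which `fd` and `handle` are `-1` and `closed` is `true`, while `note` keeps its build-time value.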
## Patching: Migration for Existing Classes
The runtime initialization mechanisms described above are currently enabled via a set of annotations. This allows qbicc to implement the desired semantics without requiring any changes to the Java compiler, class file format, or language specification. In the long term, we believe small modifications to the Java specification, for example defining an `rtinit { ... }` block similar to the existing `static { ... }` construct, could enable a simpler specification.
The primary annotation for runtime initialization is `RuntimeAspect`. This annotation is defined on a class and is interpreted as meaning that the `<clinit>` method of the class should be interpreted as an `<rtinit>` method. This method will not be executed during build-time initialization and instead will be deferred until the first access of one of the static fields defined in the class.
To allow us to "externally" modify JDK core classes for qbicc, we have developed an annotation-driven patcher infrastructure. The patcher allows the declaration of patch classes that add, remove, and modify the methods and fields of an existing class. This modification includes the replacement of the `<clinit>` method and the declaration of multiple `RuntimeAspect` patch classes.
The best way to explore what is possible with the patcher is to examine the java.base/src directory in the qbicc-class-library project. It makes extensive use of the patcher annotations to adapt the core JDK classes to qbicc while still allowing us to consume the upstream OpenJDK code base via an unmodified git submodule.
## Design Alternatives
A number of alternatives were considered before arriving at the final design documented here. The technical discussions and options considered can be explored starting in qbicc discussion #764 on GitHub.
From: Brian Goetz <brian.goetz at oracle.com>
Date: Thursday, May 26, 2022 at 2:21 PM
To: David P Grove <groved at us.ibm.com>, "leyden-dev at openjdk.java.net" <leyden-dev at openjdk.java.net>
Subject: [EXTERNAL] Re: Experimentation with build time and runtime class initialization in qbicc
Hi David;
Would like to understand more about this, but first, from an IP-hygiene perspective, documents linked from this list should be under the OpenJDK terms and conditions. Can you post the contents of that document here, so there are no issues there?
Thanks,
-Brian
On 5/26/2022 12:35 PM, David P Grove wrote:
Hi,
In the qbicc project, we’ve been exploring options for adapting Java’s class initialization semantics for native images. In particular, we are trying to arrive at non-surprising semantics that, in native-image scenarios, allow most initialization to happen at build-time while still enabling runtime initialization of selected static fields.
Our current design and experience is captured here: https://github.com/qbicc/qbicc/wiki/Class-Initialization-in-qbicc. In a nutshell, the idea is to initialize classes via build-time execution of existing <clinit> methods as per normal Java semantics while adding per-static-field <rtinit> methods to provide a capability for runtime-reinitialization of a field before its first access.
--dave
More information about the leyden-dev
mailing list