Experimentation with build time and runtime class initialization in qbicc
Brian Goetz
brian.goetz at oracle.com
Mon Jun 6 17:45:10 UTC 2022
Thanks, Dan, for the detailed information. The other investigation also seems interesting, so I hope some day you’ll find the time to write it up.
There’s lots to unpack here, but I want to focus on a specific aspect, related to the issue of “stale” or “aliased” compile-time values that I raised in my earlier mail. Taking the specific example of caching Runtime.availableProcessors(), let’s ask: WHY are these classes caching R.aP() in a static? There are two possible cases:
- Pure caching. Here, the author has made a choice (right or wrong) that calling R.aP() repeatedly will be too expensive, and so caches the value in a static for later use (say, to allocate arena arrays in the constructor of Striped64 or Exchanger). Instances created in the early phase are still valid in the later phase, and compatible with instances created in the later phase.
- Enforcement of invariant. Here, the author has captured the fact that they require the value to be stable, because (say) they’re going to create multiple arrays and expect them all to be of the same length. Here, early-phase and later-phase instances could not compatibly coexist.
In the first case, reinitializing the cached field at phase change points may be harmless; it’s essentially equivalent to replacing reads of fields with repeated evaluation of the initializer (assuming the initialization is pure); in the second, the runtime has broken an invariant the author had reason to believe is valid.
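To make the distinction concrete, here is a minimal sketch. It is not taken from the JDK or qbicc sources; every name in it (PhaseChangeSketch, simulatedCpus, NCPU, phaseChange, Arena, Paired) is hypothetical. A mutable static stands in for a "static final" that a phase-aware runtime reinitializes, since plain Java cannot reassign a final field.

```java
public class PhaseChangeSketch {
    // Hypothetical stand-in for Runtime.getRuntime().availableProcessors();
    // mutable so that a phase change can be simulated in one JVM.
    static int simulatedCpus = 2;

    // Hypothetical stand-in for a "static final int NCPU" that the runtime
    // reinitializes at a phase change.
    static int NCPU = simulatedCpus;

    // Simulates the phase change: the environment changes and the cached
    // field is recomputed from its initializer.
    static void phaseChange(int newCpus) {
        simulatedCpus = newCpus;
        NCPU = simulatedCpus;
    }

    // Case 1, pure caching: each instance sizes its own array once, so
    // instances built before and after the phase change can coexist.
    static final class Arena {
        final Object[] slots = new Object[NCPU];
    }

    // Case 2, invariant: the author assumes both arrays have the same length
    // because both are sized from the same "constant". The second array is
    // allocated lazily, so it may be sized after the phase change.
    static final class Paired {
        final int[] left = new int[NCPU];
        private int[] right;

        int[] right() {
            if (right == null) {
                right = new int[NCPU];
            }
            return right;
        }

        boolean sameLength() {
            return left.length == right().length;
        }
    }

    public static void main(String[] args) {
        Arena early = new Arena();
        Paired straddler = new Paired();   // left is sized in the early phase
        phaseChange(8);
        Arena late = new Arena();

        // Both arenas remain usable, just with different sizes.
        System.out.println(early.slots.length + " " + late.slots.length);
        // The invariant the author relied on no longer holds.
        System.out.println(straddler.sameLength());
    }
}
```

In this sketch the Arena instances are harmlessly different sizes, while the Paired instance that straddles the phase change ends up with mismatched arrays, which is exactly the kind of breakage the second case describes.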
Without diving into solutions at this point, we can’t escape the following observations:
- This is what happens when you try to reinterpret old code with new semantics: code that had every reason to work properly when it was written becomes retroactively broken when the runtime reinterprets old code in a new way. New semantics require permission from the user.
- If there are N separate desirable (but incompatible) outcomes, such as the two cases cited above, the code that expresses each of them has to be different. Right now, we can’t tell the difference between these cases.
If, as in the “it’s an invariant” case, it would be unacceptable for the value to change (i.e., when the user said “static final”, they were serious), then one of the following has to happen:
- We must be prepared to keep the earlier-phase result in later phases, even if the underlying quantity has changed;
- We must defer evaluation until the later phase (potentially deferring all dependent early evaluations);
- We fail at early-eval time if someone attempts to evaluate the must-be-stable quantity in the early phase, and let the programmer sort it out.
In fact, to the extent we want early evaluation, I suspect that we may want to be able to express *all three* of these in the programming model.
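As a rough illustration of what expressing all three might look like at the library level, here is a sketch under stated assumptions. None of these names (PhasedValueSketch, Phase, Policy, PhasedValue, DeferredToLatePhase) exist in the JDK or in qbicc; the global phase flag and the exception used to signal "defer the reader too" are simplifications chosen only to keep the example self-contained.

```java
import java.util.function.Supplier;

public class PhasedValueSketch {
    enum Phase { EARLY, LATE }

    // The three options listed above, as hypothetical policies.
    enum Policy { KEEP_EARLY, DEFER, FAIL_EARLY }

    // Hypothetical signal that whoever read the value must itself be
    // postponed to the later phase.
    static final class DeferredToLatePhase extends RuntimeException {
        DeferredToLatePhase(String message) {
            super(message);
        }
    }

    // Hypothetical global phase marker; a real runtime would own this.
    static Phase phase = Phase.EARLY;

    static final class PhasedValue<T> {
        private final Policy policy;
        private final Supplier<T> initializer;
        private boolean computed;
        private T value;

        PhasedValue(Policy policy, Supplier<T> initializer) {
            this.policy = policy;
            this.initializer = initializer;
        }

        synchronized T get() {
            if (!computed) {
                if (phase == Phase.EARLY) {
                    switch (policy) {
                        case DEFER:
                            throw new DeferredToLatePhase("reader must move to the later phase");
                        case FAIL_EARLY:
                            throw new IllegalStateException("must-be-stable value read in the early phase");
                        default:
                            break; // KEEP_EARLY: compute now and keep the result
                    }
                }
                value = initializer.get();
                computed = true;
            }
            return value; // once computed, the value never changes
        }
    }

    public static void main(String[] args) {
        int[] cpus = {2}; // stand-in for the underlying quantity
        PhasedValue<Integer> kept = new PhasedValue<>(Policy.KEEP_EARLY, () -> cpus[0]);
        PhasedValue<Integer> deferred = new PhasedValue<>(Policy.DEFER, () -> cpus[0]);
        PhasedValue<Integer> strict = new PhasedValue<>(Policy.FAIL_EARLY, () -> cpus[0]);

        System.out.println(kept.get());
        try {
            deferred.get();
        } catch (DeferredToLatePhase e) {
            System.out.println("deferred: " + e.getMessage());
        }
        try {
            strict.get();
        } catch (IllegalStateException e) {
            System.out.println("failed: " + e.getMessage());
        }

        phase = Phase.LATE;
        cpus[0] = 8; // the underlying quantity changes
        System.out.println(kept.get());     // early result is kept
        System.out.println(deferred.get()); // evaluated only now
        System.out.println(strict.get());   // evaluated only now
    }
}
```

The sketch glosses over the hard part, which is what "deferring all dependent early evaluations" means for code that has already consumed the value; it only shows that the three behaviors are distinguishable once the author states which one they want.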
> On Jun 6, 2022, at 10:36 AM, Dan Heidinga <heidinga at redhat.com> wrote:
>
> On Tue, May 31, 2022 at 12:17 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>>
>> I think Dan is homing in on one of the key questions, which is the nature of the third bucket (static finals that require reinitialization.) It would be useful for everyone following the discussion if we had a more complete list of situations you've encountered where this seems essential, and their notable aspects.
>
> In qbicc, the places we've had to reinitialize static fields are
> captured in the qbicc/qbicc-class-library repo [0] using "$_runtime"
> source files [1]. Many of the cases have to do with capturing the
> build time vs the runtime environment.
>
> The number of available CPUs is captured in several places:
> * j.l.Runtime :
> https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/java/lang/Runtime*24_runtime.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWZcddsUo$
> * j.u.c.Exchanger:
> https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/java/util/concurrent/Exchanger*24_runtime.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWgRvBKQc$
> * j.u.c.Phaser :
> https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/java/util/concurrent/Exchanger*24_runtime.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWgRvBKQc$
> * j.u.c.a.Striped64 :
> https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/java/util/concurrent/atomic/Striped64*24_runtime.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWLhwhUqM$
>
> The environment variables are captured:
> * j.l.ProcessEnvironment :
> https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/java/lang/ProcessEnvironment*24_runtime.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWD8e1uHk$
>
> The in / out / err file descriptors need to be reinitialized:
> * j.io.FileDescriptor :
> https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/java/io/FileDescriptor*24_runtime.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWoWz91ck$
>
> Prevent threads from being created in a static initializer:
> * j.l.ref.Reference :
> https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/java/lang/ref/Reference*24_patch.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWDR1ZEl4$
> * Likely more cases for this we just haven't hit yet
>
> Unsafe pageSize needs to be configured at runtime. As do
> UnsafeConstants like ADDRESS_SIZE0:
> * j.i.m.Unsafe :
> https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/jdk/internal/misc/Unsafe*24_patch.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskW0SpuyLU$
> * j.i.m.UnsafeConstants:
> https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/jdk/internal/misc/UnsafeConstants*24_patch.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskW37nD06M$
> & https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/jdk/internal/misc/UnsafeConstants*24_runtime.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWUc6nR8s$
>
> Capturing the default directory:
> * sun.nio.fs.UnixFileSystem :
> https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/sun/nio/fs/UnixFileSystem*24_runtime.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWplHr18o$
>
> We're still working through detangling the "initPhase" process in
> j.l.System into a build time and runtime ("rtInitPhase") version:
> https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/blob/17.x/java.base/src/main/java/java/lang/System*24_patch.java__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWmsN5BXk$
>
> We also did some investigation of how feasible it would be to rewrite
> SubstrateVM's Substitutions to use the IODH pattern and I can share
> that info as well but it'll take a bit for me to write it up in a
> clear state.
>
> --Dan
>
> [0] https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library__;!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWo-EkTjg$
> [1] https://urldefense.com/v3/__https://github.com/qbicc/qbicc-class-library/search?q=*24_runtime__;JQ!!ACWV5N9M2RV99hQ!KgOr2DVo5L8QdXtEcuC-663xn5kbfHfTsNu4t27jI-AfKCyfQi5GqoKLWA8ImxNCaVeZMOWekskWJRfnDJs$
>
>>
>> As you point out, there are a host of potential "solutions"; while it is surely premature to try to propose a solution, it is never too early to come to a better understanding of the problem.
>>
>>
>>
>> On 5/31/2022 11:50 AM, Dan Heidinga wrote:
>>
>> On Fri, May 27, 2022 at 7:53 AM Kasper Nielsen <kasperni at gmail.com> wrote:
>>
>> Hi David,
>>
>> Thanks for the write-up.
>>
>> One thing that isn't completely clear to me after reading this is why
>> language changes (<rtinit>) are needed?
>>
>> The <rtinit> model was a convenient way for us to explore a model that
>> put all class initialization at build time, while allowing a small set
>> of fields to be reinitialized at runtime. It also minimized the
>> changes we had to make to the core JDK classes which makes maintaining
>> the changes much easier given the rate of JDK updates. SubstrateVM
>> uses a similar approach with their Substitutions for what I assume are
>> similar reasons.
>>
>> Leyden will be able to update the JDK core classes directly and can
>> take a more direct approach to indicating in which phase a static
>> field should be initialized.
>>
>> It seems to me this could be entirely
>> implemented via a standard API. Using ClassValue as the main inspiration you
>> could have something like:
>>
>> abstract class RuntimeLocal<T> {
>>     protected RuntimeLocal() {
>>         checkBuildTime();
>>         VM.registerForRuntimeInitialization(this);
>>     }
>>     protected abstract T computeValue();
>>     public final T get(); // Calls to get are optimized by the vm
>> }
>>
>>
>> Usage would be something similar to:
>>
>> class Usage {
>>
>>     static final LocalDateTime BUILD_TIME = LocalDateTime.now();
>>
>>     static final RuntimeLocal<LocalDateTime> RUNTIME_TIME = new RuntimeLocal<>() {
>>         protected LocalDateTime computeValue() {
>>             return LocalDateTime.now();
>>         }
>>     };
>> }
>>
>> I might be missing some details, but it seems to me that this approach would
>> be strongly favorable to needing to change the language as well as adding
>> new bytecodes.
>>
>> This is a good starting point. I went a fair ways looking at how to
>> group static fields into different classes to decouple their lifetimes
>> and found that I couldn't cleanly split them into two groups. I used
>> the Initialization on demand holder pattern (IODH) rather than your
>> RuntimeLocal but the idea is very similar.
>>
>> The problem is that while it's clear that some fields can be
>> initialized early (build time) and others must be initialized late
>> (runtime), there is a third group that needs to be reinitialized. I
>> list 3 buckets: early, late, and reinit, but that's a minimum number.
>> There may be more than 3. And due to the "soupy" nature of <clinit>,
>> it's not always easy to avoid depending on a field that's in a
>> different bucket. And values in that 3rd bucket - the fields that
>> need to be reinitialized - don't have a clear meaning when their value
>> propagates around the program. Does it need to be cleared everywhere
>> and force reinit of all consumers? Lots to figure out here.
>>
>> We need a better model - whether that's library features or new
>> language features - that makes it easier to express when (which phase)
>> an operation should occur and some way to talk about the dependency
>> chain of that value (all the classes that have to be initialized,
>> values calculated, etc).
>>
>> --Dan
>>
>> /Kasper
>>
>> On Thu, 26 May 2022 at 21:22, David P Grove <groved at us.ibm.com> wrote:
>>
>> Hi,
>> I’ve appended the contents of the referenced wiki page in this email.
>> Apologies in advance if the formatting doesn’t come through as intended.
>>
>> There is a full implementation of this (GPLv2 + Classpath
>> exception) as part of the qbicc project on GitHub. There is also a GitHub
>> discussion in the qbicc project that links to various GitHub issues that
>> capture the history that led to the current design. I will not hyperlink
>> to those here so that if people have any IP concerns, they can avoid seeing
>> them. They are easily findable.
>>
>> Regards,
>>
>> --dave
>>
>>
>>
>
More information about the leyden-dev
mailing list