Project Leyden: Beginnings

Tue May 24 18:00:21 UTC 2022

On 23 May 2022, at 9:10, Ioi Lam wrote:

> On 5/20/2022 9:16 AM, Volker Simonis wrote:
>> … It seems to me that "snapsafety" could be such a constraint and I hope
>> for a fruitful and successful cooperation between the two projects.

A snappy term indeed!  When applied to the existing Java platform, the 
concept (probably) leads to all sorts of complicated considerations 
about remote and hidden side effects and environmental queries.

As Ioi points out, the big new thing here, not possible outside of 
Leyden, is the option to *modify* the Java language specification (and 
standard libraries), if we think it helps clarify or simplify the 
(suitably modified) definition of snapsafety.

>
> I think we have an opportunity in Leyden to improve the language and 
> platform to support such concepts. I don't know the details of 
> "snapsafety", but in general we should have language support to 
> indicate some sort of "immutable" constraints. These constraints can 
> be validated (so that we can use pre-optimized snapshot(s)), or 
> invalidated (so we will go back to the old slow-but-correct 
> initialization).

The part of the language I like to think about changing is not so much 
assertions (maybe `assert`s) about past events (which are those 
“immutable constraints”?) but rather relaxation or modification of the 
rules regarding order of evaluation, for suitably marked expressions and 
statements.

The small scale constant-folding rules which every JIT uses are really 
order of evaluation changes:  An expression like `1+2+x` folding to 
`3+x` takes the expression `1+2` and moves it “back in time” to JIT 
time.  This is safe because the JIT knows there is no way the program 
can give evidence of the difference (unless a debugger single-steps 
through bytecodes).  But I think we should chase after constant-folding 
this sort of thing:

```
Object lookup(String x) {
   // hey, can someone please do this just once, at jlink time?
   var mydata = readHashTable(findResourceFile("mydata.xml"));
   // this depends on x, so cannot be moved back in time:
   return mydata.get(x);
}
```

The standard technique is to put `mydata` in a static final variable.  
And now that’s easy to do inline as well:

```
Object lookup(String x) {
   // like a C++ static, the initializer is executed on first use:
   class Static {
     static final HashMap<String,Object>
     mydata = readHashTable(findResourceFile("mydata.xml"));
     // but still, can someone please do it just once, at jlink time?
   }
   // this depends on x, so cannot be moved back in time:
   return Static.mydata.get(x);
}
```

(Side note: Reading files throws a checked exception.  Does this mean 
that the above method should be amended to throw a possible checked 
exception, but marked as “somewhere in the past”?  If so, then 
time-shifted expressions would need to have associated time-shifted 
exception checking rules.)
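
For comparison, here is a minimal sketch of how today's rules treat a failure inside a static initializer. The class names and the simulated `IOException` are hypothetical stand-ins, not part of any proposal. A static initializer may not throw a checked exception, so the programmer wraps it by hand, and the failure resurfaces at the first use of the class as an `ExceptionInInitializerError`:

```java
import java.io.IOException;
import java.io.UncheckedIOException;

// Hypothetical table whose load always fails with a checked exception.
final class FailingTable {
    // Not a compile-time constant, so reading DATA triggers class initialization.
    static final String DATA = load();

    private static String load() {
        try {
            // Stand-in for a real file read that fails.
            throw new IOException("mydata.xml not found");
        } catch (IOException e) {
            // Static initializers cannot propagate checked exceptions: wrap it.
            throw new UncheckedIOException(e);
        }
    }
}

final class InitFailureDemo {
    // Returns the simple name of the exception that caused initialization to fail.
    static String firstUse() {
        try {
            return FailingTable.DATA;
        } catch (ExceptionInInitializerError e) {
            return e.getCause().getClass().getSimpleName();
        }
    }
}
```

In this sketch the caller of `firstUse` sees the wrapped exception only at the moment of first use. A later use of `FailingTable` in the same JVM would fail with `NoClassDefFoundError` instead, which is part of why moving the initializer in time raises the question of where its exceptions should be reported.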

This is a kind of time-shifting currently under programmer control.  It 
suggests to me that we can and should double down on supporting static 
final state (and also lazy statics as in JDK-8209964), by focusing some 
effort on time-shifting not so much arbitrary expressions and 
statements, but the initialization of classes.  If a programmer could 
mark a *whole class* as time-shiftable in its initialization, then the 
programmer could expect that jlink could make good provisioning 
decisions about that class, rather than the current standard policy of 
initializing a class on first use (of a static or of an instance 
creation).

One more bit of mental framework:  A Java class is initialized no 
earlier and no later than its first initializing use (static or instance 
creation).  Certainly there must be other events that the class 
initialization could be referred to.  “jlink time” is a hazy 
concept, but program startup is not:  A Java program starts just before 
its selected `main` entry point is run.  If a class C could be marked 
(by the programmer) as being initialized no earlier than entry to 
`main`, then the programmer could certify that the class is a candidate 
for pre-initialization, regardless of the change of semantics (relative 
to Java’s current order of class initialization).  And that would 
solve some (not all) of the problems around making valid jlink-time 
evaluations.  I guess I’m suggesting that a language-level proxy for 
“jlink time” is main method entry.
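
The current rule can be observed directly. The following sketch logs events around the first use of a class; all names are hypothetical, and `run` stands in for `main`. Under today's semantics the initializer of `Table` runs after entry to `run`, at the first read of its static field, and not before:

```java
import java.util.ArrayList;
import java.util.List;

// Shared event log, only for observing the order of events.
final class InitLog {
    static final List<String> EVENTS = new ArrayList<>();
}

// Hypothetical class with observable static initialization.
final class Table {
    static final List<String> DATA;
    static {
        InitLog.EVENTS.add("Table initialized");
        DATA = List.of("a", "b");
    }
}

final class InitOrderDemo {
    // Stand-in for main: records when Table's initializer actually runs.
    static List<String> run() {
        InitLog.EVENTS.add("run entered");
        int n = Table.DATA.size(); // first initializing use of Table
        InitLog.EVENTS.add("first use done, size=" + n);
        return List.copyOf(InitLog.EVENTS);
    }
}
```

A class marked as time-shiftable would, presumably, be allowed to produce the "Table initialized" event at a different point, which is only safe when nothing can observe the difference (here the log itself would observe it).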

I suspect that time-shifted class initialization probably needs a 
concept of time-shifted dependency (as well as time-shifted exceptions, 
see above?) so that if class C is marked as “can initialize around 
main entry” C can also be marked as “but no earlier than 
initialization of D”, for some other class D that C’s initialization 
depends on.
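
Under today's rules such a dependency is expressed only implicitly: when C's initializer touches D, the JVM initializes D first. A minimal sketch, with hypothetical field names and a log used only to observe the order:

```java
// Log used only to observe the order of initialization.
final class Order {
    static final StringBuilder LOG = new StringBuilder();
}

final class D {
    static final int BASE;
    static {
        Order.LOG.append("D;");
        BASE = 40;
    }
}

final class C {
    static final int VALUE;
    static {
        // Reading D.BASE forces D to be initialized before C finishes.
        VALUE = D.BASE + 2;
        Order.LOG.append("C;");
    }
}

final class DependencyDemo {
    static String run() {
        int v = C.VALUE; // first use of C, which in turn initializes D
        return Order.LOG + "value=" + v;
    }
}
```

An explicit "no earlier than initialization of D" marking would state in the source what this sketch leaves to be discovered by running C's initializer.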

(The work on lazies, JDK-8209964, is sort of a complementary image of what 
Leyden is after, since a lazy variable is time-shifted *after* its 
containing class is initialized, another change from standard Java 
rules.  The two kinds of time shifting, backward and forward, probably 
deserve a combined treatment of some sort.)

>
> Also, in addition to a single snapshot of an app, perhaps we can also 
> consider multiple snapshots at a lower granularity.
>
> One parallel to draw from is the "constexpr" keyword in C++. However, 
> "constexpr" only deals with language-level constructs. For Java, 
> perhaps we need something that includes a wider set of environmental 
> dependencies. For example, many immutable tables in Java apps are 
> created from external XML files. Do we want a way to snapshot such 
> tables? Maybe we can do that if the XML files are statically stored 
> inside a jlink image?
>
> Again, I don't know what the answer is, but I am excited that we are 
> able to look for solutions at all levels of the language and platform.
>
>
> Thanks
> - Ioi
>
>
>>
>> Thank you and best regards,
>> Volker
>>
>>> We will lean heavily on existing components of the JDK including the
>>> HotSpot JVM, the C2 compiler, application class-data sharing (CDS), and
>>> the `jlink` linking tool.
>>>
>>> In the long run we will likely embrace the full closed-world constraint
>>> in order to produce fully-static images.  Between now and then, however,
>>> we will develop and deliver incremental improvements which developers
>>> can use sooner rather than later.
>>>
>>> Let us begin!
>>>
>>> - Mark
>>>
>>>
>>> [1] https://mail.openjdk.java.net/pipermail/discuss/2020-April/005429.html
>>>
>>> // https://openjdk.java.net/projects/leyden/notes/01-beginnings