Hello,
Early this year we were doing some experiments on the portability of
checkpoints under Project CRaC and prepared a document [0] on our findings.
We feel parts of it are relevant to Project Leyden as well, as it
*may* enable ahead-of-time compilation in the future (I know we are not
there yet!)
For CRaC checkpoints, we found the code generated by the C1/C2 compilers is
not always portable due to the use of architecture-specific instructions.
The same would hold true in the context of AOT compilation as well.
To make the code portable, the C1/C2 compilers should be restricted to a
minimal set of CPU features that they are allowed to exploit during the
codegen phase. However, this can also negatively impact the performance of
the generated code, as it would no longer be utilizing all the features of
the underlying architecture.
So some performance may have to be traded for portability.
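As a rough illustration, HotSpot already exposes flags that cap the CPU
features the JIT may use. The flags and values below are illustrative only,
not a vetted portable baseline; the right set is architecture- and
JVM-version-specific:

```shell
# Illustrative sketch: restrict HotSpot's JIT to an older x86-64 feature
# baseline so generated code avoids newer ISA extensions. Exact flag names
# and safe values vary by architecture and JVM version.
java -XX:UseAVX=0 -XX:UseSSE=2 -XX:-UseFMA MyApp
```

A portable checkpoint or AOT image would presumably need some such baseline
chosen up front, which is exactly where the performance trade-off appears.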
If anyone has thoughts on this aspect of the problem, please share them.
[0] http://cr.openjdk.java.net/~heidinga/crac/Portability_of_checkpoints.pdf
- Ashutosh Mehra
Continuing on the Class init progression discussion....
Why don't we put every static field in its own class?
The obvious answer is that it's too much mental load for developers. But if
we put that aside for a moment, and assume that we have infinitely smart
developers, it might be useful to understand why we don't program like this
now. Or what programming like this might actually look like.
Putting every static field in its own class trivially gives us lazy static
fields (sorry John, no new JEP required in this world) with each static
only being initialized when actually accessed.
It gives each static field a clear initialization point where we can more
easily tell what caused a particular static to be initialized.
It makes it easier to determine the true dependency graph between static
fields rather than today's "soupy" model.
It doesn't solve the "soupy" <clinit> problem, as developers can still do
arbitrary things in <clinit>, but it does reduce the problem by moving a
lot of code out of the common <clinit>, since each static now has its own
<clinit>. Does this make analysis more tractable?
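A minimal sketch of what this might look like (class and field names are
hypothetical): each static field gets its own holder class, so it acquires
its own <clinit> and is initialized only on first access.

```java
import java.util.Map;

// Hypothetical example: instead of declaring the static directly on Config,
// wrap it in its own nested class so it gets an independent, lazy
// initialization point.
class Config {
    private static final class DefaultsHolder {
        // Runs in DefaultsHolder.<clinit>, triggered only by the first
        // call to Config.defaults(), not by loading or initializing Config.
        static final Map<String, String> DEFAULTS = Map.of("mode", "fast");
    }

    static Map<String, String> defaults() {
        return DefaultsHolder.DEFAULTS; // first access triggers the holder's init
    }
}
```

The JVM's class-initialization guarantees make this thread-safe and
exactly-once without any explicit synchronization, which is the property
the per-field-class thought experiment generalizes.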
In our investigation [0], we focused on the underlying JVM physics of
classes and looked at the memory use of this approach, which was estimated
to average out to under 1K per class.
What do other languages do with their equivalent of static state? Are there
different design points for expressing static state we should be
investigating to better enable shifting computation to different points in
time?
--Dan
[0] http://cr.openjdk.java.net/~heidinga/leyden/ClassInitPlan.pdf
Previously, we at Red Hat had done some investigation into static final
fields and how existing classes could be adapted to enable their statics to
be initialized at build time.
I'd like to replay some of that investigation here, both to share our
findings and also to gather different perspectives on the problem space.
Our work had been very focused on moving work from runtime to build time
and on legitimizing build-time initialization, recognizing two phases:
build time and run time. With Leyden's approach to shifting constraints
there may be different outcomes from these kinds of investigations,
especially in light of the recognition that there may be many phases beyond
just build and run time.
I'd like to break the document up into a series of threads, one per topic,
as that often leads to more focused discussion. The full document is
available at [0].
Let's start off with some background on class initialization and a key
question: why don't users group their static fields together in ways that
give them different lifetimes? We have tools to do this today, like the
IODH pattern; why aren't they used more broadly?
Key question:
As we look at shifting work from runtime to build time, we should
understand what drives user behaviour today. What tools are available for
controlling lifecycles of static fields and how popular are those tools?
Broadly used or narrowly? Why?
Background:
Class initialization is a bit of a magic process from the perspective of
users. Users typically believe it happens when the class is loaded rather
than understanding the actual triggers specified in the JVMS { new,
get/putstatic, invokestatic, subclass initialization, reflection }.
Class init is also difficult to reason about because it takes all the
static field initializations (apart from those done using the ConstantValue
attribute) and puts them together into a single “soupy” method. This method
is “soupy” in that it mixes all the interactions and operations - even
those expressed separately in the source - into a single method that allows
unbounded side effects.
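A small sketch of the "soupy" merging (hypothetical class and field names):
the source expresses two independent initializations, but javac compiles
both into a single <clinit> method, conjoining their lifecycles.

```java
// Hypothetical example: two unrelated static finals. Neither is a
// compile-time constant (no ConstantValue attribute), so javac folds both
// initializers, in textual order, into one <clinit> for the class.
class Soup {
    static final long START = System.nanoTime(); // part of Soup.<clinit>
    static final int[] TABLE = buildTable();     // also part of Soup.<clinit>

    static int[] buildTable() {
        return new int[] {1, 2, 3};
    }
    // Touching either field initializes both: START and TABLE share a
    // single initialization point, thread, and duration.
}
```

Even though the two fields are declared separately in the source, any
access to one forces the initialization of the other, which is the
conjoined-lifecycle problem described above.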
To make matters even less clear, the language and VM have slightly
different rules for how static final fields are written: at the language
level, the compiler enforces that static finals are written only once, but
the VM allows them to be written as many times as needed, provided the
writes occur within the <clinit> method (this used to be: if done from
within the defining class).
All this results in the life cycle of every static field in a class being
conjoined, even if they would be better handled in different life cycles.
There is:
* One initialization point
* One method that is called to write the fields
* One duration for all of them (the life of the class or, equivalently, its
classloader)
* One thread that executes the method (good for exactly once semantics)
There is a common case where users recognize these conditions and use them
to produce obviously correct code: the initialization-on-demand holder
idiom. It is used to create singletons, ensuring there is only a single
instance of the class, and occasionally to defer expensive operations
until first use (lazy init).
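For reference, the idiom looks like this (illustrative class names): the
exactly-once, thread-safe semantics of <clinit> make the lazy singleton
obviously correct with no explicit locking.

```java
// Initialization-on-demand holder idiom: Holder.<clinit> runs exactly once,
// on the first call to getInstance(), under the JVM's built-in
// class-initialization locking.
class Registry {
    private Registry() {} // no outside instantiation

    private static final class Holder {
        static final Registry INSTANCE = new Registry(); // created on first access
    }

    static Registry getInstance() {
        return Holder.INSTANCE;
    }
}
```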
--Dan
[0] http://cr.openjdk.java.net/~heidinga/leyden/ClassInitPlan.pdf
For a few months now I’ve been pondering, and brainstorming with a few
of you, how best to frame the work of Project Leyden conceptually.
Here’s my proposal:
https://openjdk.org/projects/leyden/notes/02-shift-and-constrain
Short summary:
The goal of Project Leyden is to improve the startup time, time to
peak performance, and footprint of Java programs. In this note we
propose to work toward that goal by extending the Java programming
model with features for _selectively shifting and constraining
computation_ by means of _condensing code_. We also propose an
initial research and development roadmap.
Comments, questions, and suggestions welcome!
- Mark