Toward Condensers

Dan Heidinga heidinga at redhat.com
Tue Aug 1 20:34:33 UTC 2023


Thanks for sharing this document, and for the work that you, Brian, and
Paul have been putting into this.

A couple of questions / comments about the document based on my reading:

* In the "The condenser pipeline" section, the model shows an Extractor ->
Condenser 1 -> Condenser 2 -> Distiller as an example of the pipeline which
is linear: A, then B, then C, and so on.  Often when applying optimizations
(e.g., in a compiler), there's a virtuous circle where one optimization
exposes new opportunities for another, which triggers more opportunities
for the first.  This leads to running through optimization passes until a
fixpoint is reached or some limit is hit.  Dead code elimination is often a
pass that benefits from being repeatedly run.  This is still early days but
I'll ask anyway:
Have you thought about how the condenser pipeline would benefit from
repeated application?  Or how condensers could opt into repeated
application?
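
To make the question concrete, here is a minimal sketch of what repeated
application could look like.  It is not taken from the design note or the
prototype: FixpointPipeline, runToFixpoint, and the maxRounds limit are
hypothetical names, and the sketch assumes only that the model is immutable
and has value-based equals(), so that "nothing changed" can be detected by
comparing the model before and after a full round of passes.

```java
import java.util.List;
import java.util.function.UnaryOperator;

// Hypothetical sketch: none of these names come from the Leyden design note.
final class FixpointPipeline {
    private FixpointPipeline() {}

    /**
     * Applies the passes in order, then repeats the whole sequence until one
     * full round leaves the model unchanged (a fixpoint) or maxRounds rounds
     * have been run. Relies on M having a value-based equals().
     */
    static <M> M runToFixpoint(M model, List<UnaryOperator<M>> passes, int maxRounds) {
        M current = model;
        for (int round = 0; round < maxRounds; round++) {
            M next = current;
            for (UnaryOperator<M> pass : passes) {
                next = pass.apply(next);
            }
            if (next.equals(current)) {
                return current; // no pass changed anything: fixpoint reached
            }
            current = next;
        }
        return current; // round limit hit before reaching a fixpoint
    }
}
```

In this sketch a condenser would "opt in" simply by being placed in the
list of repeated passes; whether the real pipeline should detect change by
model equality or by some cheaper signal is an open design choice.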

* In "The application model", the first goal states:
> Abstracted away from the representation — Condensers should not
> directly read and write files as they do their work; they should express
> their behavior in terms of changes to the model, and let the tooling
> handle the representation.

I agree with the goal of having the model mediate access to the classfiles
/ jars / modules / resource files, but I think this goal may overstate the
requirement of "not directly read and write files" as condensers that have
offline training runs will want to use the filesystem to access their
training data.  Would phrasing this as "Condensers should only access
classfiles and resources through the data model; they should ....." express
the intent more clearly?

And the last goal states:
> Scrutable — It should be possible to answer the question,
> “what did that condenser do?”

I don't see this explicitly addressed in the rest of the document.  It's a
good goal.  Is the intention here that the ModelUpdater can be interrogated
to analyse the changes?  Are you envisioning a logging mechanism of some
sort here or something else?
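
One possible shape for "something else" is a plain before/after diff of the
model.  The sketch below is only an illustration under strong assumptions:
it reduces the model to a map from entry name to a content digest, and
ModelDiff with its added / removed / changed fields is a hypothetical name,
not part of the ModelUpdater API described in the document.

```java
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: the model is reduced to a map from entry name to a
// content digest; nothing here is taken from the prototype API.
final class ModelDiff {
    final Set<String> added = new TreeSet<>();
    final Set<String> removed = new TreeSet<>();
    final Set<String> changed = new TreeSet<>();

    private ModelDiff() {}

    /** Compares two snapshots of the model taken before and after a condenser ran. */
    static ModelDiff between(Map<String, String> before, Map<String, String> after) {
        ModelDiff diff = new ModelDiff();
        for (Map.Entry<String, String> entry : after.entrySet()) {
            String name = entry.getKey();
            if (!before.containsKey(name)) {
                diff.added.add(name);           // entry exists only afterwards
            } else if (!before.get(name).equals(entry.getValue())) {
                diff.changed.add(name);         // same name, different content
            }
        }
        for (String name : before.keySet()) {
            if (!after.containsKey(name)) {
                diff.removed.add(name);         // entry was dropped
            }
        }
        return diff;
    }

    @Override
    public String toString() {
        return "added=" + added + " removed=" + removed + " changed=" + changed;
    }
}
```

Because the model is immutable, keeping the "before" snapshot around costs
little, and a diff like this could be produced per condenser without the
condenser itself having to log anything.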

* In the "Data model" section, how are duplicate classes on the classpath
handled?  Are Containers representing JARs on the classpath explicitly
ordered so as to linearize them to ensure the earliest definition of a
class wins?  Does the model need to expose the classpath ordering?
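
For illustration, the "earliest definition wins" rule can be sketched as
below.  JarContainer, resolve, and duplicates are hypothetical names chosen
for this example, not types from the Data model section; the only
assumption is that the model exposes its classpath containers as an ordered
list.

```java
import java.util.HashSet;
import java.util.List;
import java.util.Optional;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical sketch: JarContainer and its components are illustrative
// names, not types from the design note.
final class ClasspathOrder {
    private ClasspathOrder() {}

    /** A classpath entry reduced to its name and the classes it defines. */
    record JarContainer(String name, Set<String> classNames) {}

    /** Returns the first container, in classpath order, that defines className. */
    static Optional<JarContainer> resolve(List<JarContainer> classpath, String className) {
        for (JarContainer container : classpath) {
            if (container.classNames().contains(className)) {
                return Optional.of(container);
            }
        }
        return Optional.empty();
    }

    /** Returns the names of classes defined by more than one container. */
    static Set<String> duplicates(List<JarContainer> classpath) {
        Set<String> seen = new HashSet<>();
        Set<String> duplicated = new TreeSet<>();
        for (JarContainer container : classpath) {
            for (String className : container.classNames()) {
                if (!seen.add(className)) {
                    duplicated.add(className);
                }
            }
        }
        return duplicated;
    }
}
```

If the model kept containers in an unordered collection instead, resolve
could not be written this way, which is why the ordering question matters.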

Is ContainerKind missing a "directory" type as well?  It might be possible
to pretend a filesystem directory on the CP is a JAR for model purposes,
but that doesn't feel quite right.  Did you consider directories when
making the model?  If they were deliberately excluded it might help to
expand on why in the document.

Should the Data model include "classloader" as a member?  With modules we
can map which module will be loaded by which classloader and can guess for
most classpath entries.  For more complicated classloading schemes
(including self-first), it might be beneficial to model the classloader
network in the model as well.  This may be something that a non-standard
condenser could augment the model / analysis with.

* The "Example: Lambda forms" section contains the following sentence:
> After condensing we can update the java.base module on the file system
> from the updated application model.

which should probably delegate the updating of the file system to the
Distiller.  Given the prohibition against filesystem access in the "The
application model" section, talking about the filesystem here seems odd.

I'm looking forward to seeing the prototype and trying to port my
pregenerate lambdas jlink plugin to be a condenser.  I think it should be a
fairly smooth process.

--Dan



On Mon, Jul 31, 2023 at 4:31 PM Mark Reinhold <mark.reinhold at oracle.com>
wrote:

> A few of us have been thinking about how condensers might work.  We now
> have a prototype design and implementation of a condenser API and tool.
>
> We’ve deliberately started small, focusing on principles of condenser
> operation and a minimal set of features sufficient for the simplest
> condensers, i.e., those that don’t require additional changes to the
> Platform Specification.
>
> Design note:
> https://openjdk.org/projects/leyden/notes/03-toward-condensers
>
> Summary:
>
>   We elaborate the concept of composable condensers to introduce a
>   simple, abstract, immutable, data-driven model of applications so that
>   condensers can be expressed as transformers of instances of the model.
>   The model is sufficient to express simple condensers; we include two
>   examples.
>
> This is just a starting point; we expect to evolve it considerably going
> forward.  We’ll publish the prototype code shortly after we return from
> the upcoming JVM Language Summit.
>
> - Mark

