Bytecode transformation investigation

Wed Aug 3 18:38:13 UTC 2022

----- Original Message -----
> From: "Dan Heidinga" <heidinga at redhat.com>
> To: "leyden-dev" <leyden-dev at openjdk.java.net>
> Sent: Tuesday, August 2, 2022 10:30:38 PM
> Subject: Bytecode transformation investigation

> When Mark kicked off the project, he wrote about the "spectrum of
> constraints" enabling optimizations that are weaker than those of the
> closed world constraint, but more broadly applicable.  In line with
> that, I've been doing some investigation into bytecode
> transformations.
> 
> While bytecode transformations are strictly less powerful than AOT,
> they provide a way to simplify the program we're running based on
> information available at build / deploy time.  They allow us to move
> (some) dynamic behaviour from one phase (runtime) to an earlier one -
> behaviour such as reflective operations, runtime class generation,
> optional paths, etc can be simplified at the bytecode level based on
> information the author (or deployer) of the software knows without
> having to discover it at runtime.
> 
> Great!  But there's always a catch.  And the primary catch here is
> that bytecode transformation can result in user visible changes.
> Before we go too far down the path of developing transformations, we
> should determine which user-visible changes are legitimate and where
> the lines need to be drawn.
> 
> jlink experiment:
> ----------------------
> As a starting point, I prototyped using jlink to transform Lambda
> expressions to use pre-generated classes rather than runtime generated
> ones.
> 
> Lambda expressions
> * encode the lambda body as a private method in the defining class
> * use an invokedynamic instruction to dynamically pick the strategy
> for creating the lambda instances at runtime, and
> * encode a "recipe" combining MethodHandle, MethodType, Class and int
> arguments passed to the LambdaMetafactory to actually generate the
> required class and create the lambda instance.
> 
> None of the code outside the LambdaMetafactory (LMF) cares how the
> lambda is implemented as long as it meets the contract by implementing
> the correct interfaces and by calling the private implementation
> method.
> 
> I modified the LMF internals to allow a jlink plugin to pre-generate
> the lambda classes [0], but doing so produces user-visible behaviour
> changes:
> 
> 1) Lambda classes are no longer hidden anonymous classes.
> The LMF loads the implementation class as a hidden, anonymous class.
> This means Class.forName() can't find the class, that
> Class::isHidden()[1] returns true, and that the class is specially
> named [2].
> With the pre-generated class, Class.forName() can find the class, it is
> no longer hidden as it is loaded using normal class loading, and the
> name is a normal class name. [3]
> 
> 2) The pre-generated class must be a nest peer of the defining class.
> Since Lambda implementation methods are private on the class that
> defines them, the pre-generated lambda class must be a nest peer of
> the defining class in order to call them.
> Calls to Class::getNestHost on the lambda class may result in different
> answers between the two strategies.  The nest host will also now
> include the pre-generated classes in its list of nest members for the
> pre-generated case.
> Users can observe this difference with the Class::getNestHost &
> Class::getNestMembers calls.
> 
> 3) Stacktraces
> Classes generated by the LMF at runtime are not visible in stack
> traces.  The pre-generated classes are visible.
> Users will be able to observe this with the StackWalker class and may
> notice the difference in any tools they use to process stack traces.
> 
> This is the initial set of user visible changes I've run across in
> this experiment.  There are likely other corner cases that I haven't
> hit yet, and other experiments will reveal other user visible
> differences.
> 
> The key question out of this effort is whether these kinds of
> user-visible differences are "acceptable"?  Where do we draw the line
> and how do we inform users of these differences?
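
To make the three differences above concrete, here is a minimal sketch of how user code could probe them, assuming JDK 15 or later. The class LambdaProbe and its helper methods are hypothetical names made up for illustration; they are not part of the prototype in [0]. On a stock JDK with runtime-generated lambda classes I would expect the class to report itself as hidden and not be findable by name, but the whole point is that these answers depend on the strategy in use.

```java
import java.lang.StackWalker.Option;
import java.util.function.Supplier;

// Hypothetical probe class, for illustration only.
class LambdaProbe {
    /** Returns the runtime class backing a simple lambda. */
    public static Class<?> lambdaClass() {
        Runnable r = () -> {};
        return r.getClass();
    }

    /** True if Class.forName can resolve the given class by its name. */
    public static boolean findableByName(Class<?> c) {
        try {
            Class.forName(c.getName(), false, c.getClassLoader());
            return true;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Counts the frames seen from inside a lambda body. */
    public static long framesInsideLambda(boolean showHidden) {
        StackWalker walker = showHidden
                ? StackWalker.getInstance(Option.SHOW_HIDDEN_FRAMES)
                : StackWalker.getInstance();
        // The walk happens inside the lambda, so the frame of the
        // lambda class itself sits on the stack being walked.
        Supplier<Long> s = () -> walker.walk(frames -> frames.count());
        return s.get();
    }

    public static void main(String[] args) {
        Class<?> c = lambdaClass();
        System.out.println("name:       " + c.getName());
        System.out.println("isHidden:   " + c.isHidden());
        System.out.println("findable:   " + findableByName(c));
        System.out.println("nest host:  " + c.getNestHost().getName());
        System.out.println("frames (default):     " + framesInsideLambda(false));
        System.out.println("frames (show hidden): " + framesInsideLambda(true));
    }
}
```

If the two frame counts differ, the lambda class frame is being filtered as hidden; with a normally loaded, pre-generated class I would expect the counts to match.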

isHidden() returning false is a compatibility issue because I've seen it used as an equivalent of isALambda() (like isAnonymous() was used before isHidden()). GraalVM emulates isHidden() for this reason.

For me, instead of trying to emulate those differences, I think it's easier here to provide a method Class.isLambdaProxy() and add an empty classfile attribute LambdaProxy to the VM spec, so that a lambda proxy behaves mostly the same way whether it is generated through invokedynamic or pre-generated.
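
For illustration, this is roughly the kind of check that stands in for isALambda() in library code today. It is only a sketch: looksLikeLambdaProxy is a hypothetical helper, Class.isLambdaProxy() does not exist, and the "$$Lambda" name test is an assumption based on the names shown in [2] and [3], not a specified contract.

```java
// Hypothetical helper, for illustration only.
class LambdaProxyHeuristic {
    /**
     * Heuristic: treats a class as a lambda proxy if it is hidden and its
     * name contains "$$Lambda". Relies on implementation details of the
     * runtime-generated classes, so a normally loaded, pre-generated class
     * would fail the isHidden() part.
     */
    public static boolean looksLikeLambdaProxy(Class<?> c) {
        return c.isHidden() && c.getName().contains("$$Lambda");
    }

    public static void main(String[] args) {
        Runnable lambda = () -> {};
        Runnable anonymous = new Runnable() {
            @Override
            public void run() {}
        };
        System.out.println(looksLikeLambdaProxy(lambda.getClass()));
        System.out.println(looksLikeLambdaProxy(anonymous.getClass()));
    }
}
```

A dedicated query backed by a classfile attribute would let both kinds of proxy answer the same way, without this sort of guessing.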

I'm afraid that Leyden will be exactly that: see how people are using a dynamic thingy, see how it can be emulated at generation time, provide a way for library developers to see the two the same way by bridging the gap between them, and also try to convince library developers that relying too much on implementation details is not a good idea.

> 
> --Dan
> 
> [0] https://github.com/DanHeidinga/jdk-sandbox/pull/1/files (prototype code)
> [1]
> https://docs.oracle.com/en/java/javase/17/docs/api/java.base/java/lang/Class.html#isHidden()
> [2] ex.mod.Example$$Lambda$23/0x0000000800c019f0
> [3] ex.mod.Example$$Lambda$4

Rémi