Bytecode transformation investigation

Brian Goetz brian.goetz at oracle.com
Fri Aug 5 20:39:40 UTC 2022


Remi;

I think this misses a bigger picture here.  A key goal of Leyden is that 
we be able to _selectively and flexibly constrain and shift dynamism_.  
We don't want to force users to decide at compile time whether they want 
to AOT-compile the program, partially evaluate it, gather profiling data, 
constrain away indy and other classloading, etc.; we want them to be able 
to write their program and run it on a dynamic VM, as well as choosing 
to condense it (perhaps in a series of phases) to shift some behavior 
from runtime to an earlier phase, all completely optionally.

This is why Dan has focused on jlink; because jlink is positioned as the 
thing you run when you're ready to accept some tighter coupling in 
exchange for a smaller or faster deployment unit.  So the choice of 
`jlink` here is entirely appropriate, and has the advantage that 
developers can do their develop-test-run cycle with the lightest 
possible build chain, and spend more cycles to condense the program to a 
smaller/faster one only when spending those cycles has a positive return.
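
To make "shifting dynamism" concrete, here is a minimal sketch of the kind of 
lambda rewriting discussed in the quoted thread below.  It assumes a build-time 
tool replaces an indy-linked lambda with an ordinary class; the names 
`LambdaTranslationSketch`, `Greeter`, `dynamicForm` and `condensedForm` are 
hypothetical, and this is not a description of what any actual jlink plugin emits.

```java
import java.util.function.Supplier;

public class LambdaTranslationSketch {

    // Dynamic form: current javac compiles this lambda to an invokedynamic
    // call site, which is linked at run time through LambdaMetafactory.
    public static Supplier<String> dynamicForm(String name) {
        return () -> "hello, ".concat(name);
    }

    // Hypothetical condensed form: an ordinary class standing in for what a
    // build-time tool might generate in place of the indy call site.
    static final class Greeter implements Supplier<String> {
        private final String name;

        Greeter(String name) {
            this.name = name;
        }

        @Override
        public String get() {
            return "hello, ".concat(name);
        }
    }

    // Same observable behavior as dynamicForm, with no run-time linkage step.
    public static Supplier<String> condensedForm(String name) {
        return new Greeter(name);
    }

    public static void main(String[] args) {
        System.out.println(dynamicForm("leyden").get());
        System.out.println(condensedForm("leyden").get());
    }
}
```

Both methods return a Supplier that yields the same string; only the point at 
which the implementing class comes into existence differs, which is the sort of 
shift a user could opt into at a later phase without changing the source.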



On 8/5/2022 1:49 PM, Remi Forax wrote:
> ----- Original Message -----
>> From: "Dan Heidinga"<heidinga at redhat.com>
>> To: "Brian Goetz"<brian.goetz at oracle.com>
>> Cc: "leyden-dev"<leyden-dev at openjdk.java.net>
>> Sent: Friday, August 5, 2022 4:49:31 PM
>> Subject: Re: Bytecode transformation investigation
>> Responding to one piece of this now as it's important to get everyone
>> on the same page with the requirements.  And I know I've tripped over
>> the "move fast, break things" philosophy multiple times in this space
>> before coming to this conclusion.
>>
>>>  From a specification perspective, there are multiple separate specification
>>> viewpoints to consider: JLS, JDK and JVMS.  From a JLS perspective, I would say
>>> that if the Java *compiler* were to do what your jlink plugin does, this would
>>> be a reasonable way to implement a compiler for the Java language -- the
>>> classfiles emitted would respect the semantics of the language.  There's
>>> nothing that says a Java compiler has to translate lambdas with indy, or with
>>> hidden classes, so if the indy never got generated, that's not a problem.
>>>
>>>  From the JDK+JVMS perspective, it starts to get a little murky, and one of the
>>> goals of Leyden is to bring more clarity to this area.  The compiler emits
>>> certain classfiles with `invokedynamic`, and then some build-time tool rewrites
>>> these classes to be different.  Is this OK?  If the build-time tool is just
>>> "Dan's Magic Unofficial (Not) Java Bytecode Mangler", then this is the sort of
>>> build time mangling people do every day.  But we want this to be an official
>>> part of the platform, so I think there's a little more specification work to be
>>> done to allow (and specify) such transformations.  This is not a deal breaker,
>>> but we need to apply more thought here.  I think there are two categories of
>>> new work here: some specification work to characterize what build-time
>>> transformations like this are allowed to do or not do, and your transformer
>>> will likely want a specification for what it does as well.
>>>
>> What if we doubled down on treating all pre-runtime bytecode
>> transformations as optional behaviours akin to "Dan's Magic Unofficial
>> (Not) Java Bytecode Mangler" despite shipping with the platform?  Each
>> transformation - jlink plugin? - could be self describing so users
>> know what they are opting (the key point!) into when they enable the
>> transformation.  This allows treating these transformations as a
>> pre-step that has significant leeway on what it does provided the
>> modified classfiles run correctly.
> I don't think it should be run by jlink, but rather as a post-processing step of javac, more like annotation processors.
> It will work with anything that uses invokedynamic.
>
> If the transformations are done by jlink, you can do more transformations, resolving Class.forName() / ServiceLoader for example, but you are under a closed-world assumption.
>
>> The JVM's role is then to load / verify / execute the classes as
>> required by the application and defined by the JVM specification.
>> Anything done to the classfiles prior to that is outside the JVM
>> spec's remit.
>>
>> This "user opt-in to transformations" model shrinks the two categories
>> to one: specifying what a transformer does.  The first category,
>> "specification work to characterize what build-time transformations
>> like this are allowed to do or not do", is answered with
>> "whatever they want, provided they generate valid classfiles".  And if
>> the user is opting-in for an application-specific runtime (jlinked),
>> then why not?
>>
>> Although it's kind of satisfying to say we can do what we want here,
>> it doesn't actually work.  Why? Because this model destroys any
>> invariants built into the JDK platform.
>>
>> Don't like how a method operates?  Transform it to do something else!
>> Introduce bugs!  Open security holes!  It's trivially easy to break
>> the platform invariants, get surprising results, or open subtle
>> security holes here. Basically, all the concerns raised with Native
>> Image's Substitution mechanism come into play here.  Though it's
>> possible to do many of these things today with JVMTI agents or even
>> user written jlink plugins (or historically by hand hacking rt.jar),
>> it's less common because it's hard! and because users have been
>> rightfully wary of what this can do to their applications.
> Why do you want the user to be able to opt in to an unbounded set of transformations?
> You can be far more restrictive by saying that you only have one javac flag to opt in to a more "static" view of the world; using a bytecode transformer or not becomes an implementation detail in that case.
>
>> Not to mention that Support Engineers will hate us if we take this
>> approach as it's hard to argue something isn't a supported config if
>> the JDK ships the transformation that breaks the invariant.
>>
>> All that to say, I think the "specification work to characterize what
>> build-time transformations like this are allowed to do or not do" is
>> important to this work actually being successful.
> yes
>
>> --Dan
> Rémi