Bytecode transformation investigation

Brian Goetz brian.goetz at oracle.com
Thu Aug 4 16:36:27 UTC 2022


Yes, sorry for the delay, I've been trying to organize my thoughts on this.

Overall I am very happy to see this investigation.  It is obviously 
relevant to a number of points across the Leyden spectrum.  I had done a 
related thought experiment at one point about behavioral differences, 
and came up with a similar list with respect to LambdaMetafactory:

  - proxy class goes from hidden to non-hidden;
  - perturbs the set of nestmates of both the proxy class and capturing
    class;
  - potentially perturbs the timing of loading the proxy class (though
    this can be controlled);
  - freezing of bootstrap behavior -- if the bootstrap behavior were to
    change between build time and runtime (e.g., different JDK), any
    changes wouldn't be reflected in the execution.
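
To make the first two items concrete, here is a minimal sketch of how a
program could observe them today.  The class and method names
(`LambdaProxyProbe`, `proxyClass`, `describe`) are made up for
illustration, and what gets printed depends on the JDK implementation
in use, which is the whole point: none of these answers are promised by
a specification.

```java
import java.util.Arrays;

// Hypothetical probe class; the names are illustrative only.
class LambdaProxyProbe {

    /** Returns the runtime class of a lambda captured in this class. */
    static Class<?> proxyClass() {
        Runnable r = () -> {};
        return r.getClass();
    }

    /** Summarizes the reflective properties the list above says may change. */
    static String describe() {
        Class<?> proxy = proxyClass();
        boolean hostListsProxy =
            Arrays.asList(LambdaProxyProbe.class.getNestMembers()).contains(proxy);
        return "hidden=" + proxy.isHidden()               // Class::isHidden, JDK 15+
            + " nestHost=" + proxy.getNestHost().getName()
            + " hostListsProxy=" + hostListsProxy;
    }

    public static void main(String[] args) {
        // Values are implementation-dependent; a pre-generated proxy class
        // could legitimately report different answers.
        System.out.println(describe());
    }
}
```

Running this before and after a build-time transformation would be one
way to see which of these properties actually moved.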

Your "stack traces" observation wasn't on my list, so that's a good catch.

The "freezing of behavior" one is likely to be common to a number of 
Leyden techniques, such as AOT.  The answer there likely involves the 
creation of some sort of coupling between the artifact and a specific 
JDK version.  Since the main mission of `jlink` is to create a runtime 
image with both an application and a specific JDK, this seems sensible 
but there is likely additional spec work needed here.
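
As a rough illustration of what such a coupling could look like at its
simplest, the sketch below compares a JDK version string recorded at
build time against the running JDK.  This is an assumption about one
possible shape, not a description of what `jlink` or Leyden does; the
class `JdkCoupling` and its methods are hypothetical.

```java
// Hypothetical sketch: refuse to run a build-time artifact on a different JDK.
class JdkCoupling {

    /** Version string of the running JDK, e.g. as recorded at build time. */
    static String currentJdk() {
        return Runtime.version().toString();   // Runtime.version() exists since JDK 9
    }

    /** True if the version recorded in the artifact matches the running JDK. */
    static boolean matches(String recordedVersion) {
        return recordedVersion.equals(currentJdk());
    }

    /** Fails fast when the artifact was produced against a different JDK. */
    static void requireSameJdk(String recordedVersion) {
        if (!matches(recordedVersion)) {
            throw new IllegalStateException(
                "artifact built for JDK " + recordedVersion
                + " but running on JDK " + currentJdk());
        }
    }
}
```

A real mechanism would presumably need something stronger than a
version string, which is part of the spec work mentioned above.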

Overall, none of these seem like show stoppers, but the devil is in the 
details.  There are categories of details here, too, such as 
implementation vs specification.

The implementation details, such as "is it OK to have the lambda proxy 
class be findable via Class::forName" (even if the lambda is never 
captured!) need to at least be evaluated through the security lens; does 
it allow anyone to instantiate a lambda with bogus captured arguments?  
I'm guessing no, because the constructor/factory is still private to the 
nest, but this is the sort of question we'd have to ask ourselves.  My 
gut feeling says that these behavioral changes can be, as you suggest, 
framed as acceptable implementation variation.
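
For reference, the findability question can be probed with a few lines
of code.  This is a sketch with made-up names (`ProxyLookupProbe`,
`findableByName`); it only reports whether a class can be looked up by
its own name through its defining loader.  With a proxy spun as a
hidden class the lookup is expected to fail; with a pre-generated,
ordinary class it would presumably succeed.

```java
// Hypothetical probe: can a class be found again via Class::forName?
class ProxyLookupProbe {

    /** True if Class.forName resolves c's name back to c itself. */
    static boolean findableByName(Class<?> c) {
        try {
            // initialize=false so the probe does not trigger class initialization
            return Class.forName(c.getName(), false, c.getClassLoader()) == c;
        } catch (ClassNotFoundException | LinkageError e) {
            return false;
        }
    }

    /** Runtime class of a lambda captured in this class. */
    static Class<?> lambdaClass() {
        Runnable r = () -> {};
        return r.getClass();
    }

    public static void main(String[] args) {
        System.out.println("String findable: " + findableByName(String.class));
        System.out.println("lambda proxy findable: " + findableByName(lambdaClass()));
    }
}
```

Being findable is not the same as being instantiable; the security
question is about access to the constructor/factory, which this probe
does not exercise.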

From a specification perspective, there are multiple separate 
specification viewpoints to consider: JLS, JDK and JVMS.  From a JLS 
perspective, I would say that if the Java *compiler* were to do what 
your jlink plugin does, this would be a reasonable way to implement a 
compiler for the Java language -- the classfiles emitted would respect 
the semantics of the language.  There's nothing that says a Java 
compiler has to translate lambdas with indy, or with hidden classes, so 
if the indy never got generated, that's not a problem.

From the JDK+JVMS perspective, it starts to get a little murky, and one 
of the goals of Leyden is to bring more clarity to this area.  The 
compiler emits certain classfiles with `invokedynamic`, and then some 
build-time tool rewrites these classes to be different. Is this OK?  If 
the build-time tool is just "Dan's Magic Unofficial (Not) Java Bytecode 
Mangler", then this is the sort of build time mangling people do every 
day. But we want this to be an official part of the platform, so I think 
there's a little more specification work to be done to allow (and 
specify) such transformations. This is not a deal breaker, but we need 
to apply more thought here.  I think there are two categories of new 
work here: some specification work to characterize what build-time 
transformations like this are allowed to do or not do, and your 
transformer will likely want a specification for what it does as well.

As with related techniques such as intrinsification, we need to ensure 
that there are not going to be observable differences with respect to 
behavior specified by either JVMS or JDK, or that those differences are 
permissible under the specifications.  Some of the things to worry about 
here might be:

  - timing of loading the proxy class
  - observable side-effects of indy linkage
  - observable side-effects of bootstrap execution
  - conformance with LMF specification, not just for the code shapes
    emitted by `javac`, but for any code shape supported by LMF
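
On the last point: LMF is a public API, so code other than `javac`
output can reach it.  The sketch below calls
`LambdaMetafactory.metafactory` directly from ordinary Java code; a
build-time transformer that only pattern-matches `javac`-shaped indy
call sites would not see a use like this.  The class name
`DirectMetafactory` and the helper `twice` are hypothetical.

```java
import java.lang.invoke.CallSite;
import java.lang.invoke.LambdaMetafactory;
import java.lang.invoke.MethodHandle;
import java.lang.invoke.MethodHandles;
import java.lang.invoke.MethodType;
import java.util.function.IntUnaryOperator;

// Hypothetical example of reaching LMF without a javac-emitted invokedynamic.
class DirectMetafactory {

    // Implementation method the generated proxy will delegate to.
    private static int twice(int x) {
        return 2 * x;
    }

    /** Links an IntUnaryOperator backed by twice(int) via LMF. */
    static IntUnaryOperator link() {
        try {
            MethodHandles.Lookup lookup = MethodHandles.lookup();
            MethodType sig = MethodType.methodType(int.class, int.class);
            MethodHandle impl = lookup.findStatic(DirectMetafactory.class, "twice", sig);
            CallSite site = LambdaMetafactory.metafactory(
                lookup,
                "applyAsInt",                                 // interface method name
                MethodType.methodType(IntUnaryOperator.class), // factory type: no captures
                sig,                                          // erased interface method type
                impl,                                         // implementation handle
                sig);                                         // instantiated method type
            return (IntUnaryOperator) site.getTarget().invokeExact();
        } catch (Throwable t) {
            throw new IllegalStateException("linkage failed", t);
        }
    }

    public static void main(String[] args) {
        System.out.println(link().applyAsInt(21));
    }
}
```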

From a side-effects perspective, the answer might well be "there aren't 
any", but the claim "this code has no side-effects" is often both tricky 
to ascertain, and can easily become false over time as the code is 
evolved. As an example (I'm not worried about this one, but it is a good 
illustration), there's a system property, 
`jdk.internal.lambda.dumpProxyClasses`, which causes proxy class files 
to be dumped to the file system for debugging.  That's a side-effect of 
bootstrap execution that would not happen (or would happen at build time 
instead of run time).  As this one turns out, this is an implementation 
detail, not a specified behavior, but this is the sort of line-by-line 
analysis we'd have to do to convince ourselves that what we're doing is 
safe -- and watch how the bootstrap implementation evolves to keep it so.


On 8/4/2022 8:57 AM, Dan Heidinga wrote:
> Hi Brian,
>
> Glad we're on the same page regarding isHidden being an implementation
> detail.  Do the ::getNestHost & ::getNestMembers calls and stacktrace
> differences fall into the same implementation detail bucket in your
> mind?
>
> I'd be happy for the nest mates cases to be implementation details
> but would need to look closer at the intersection of stacktraces,
> @callerSensitive methods, and the SecurityManager to be certain
> stacktrace differences aren't making bigger problems.  Any other areas
> of concern with this kind of approach?
>
> --Dan
>
> On Wed, Aug 3, 2022 at 7:26 PM Brian Goetz <brian.goetz at oracle.com> wrote:
>>
>>> isHidden() returning false is a compatibility issue because I've seen it used as an equivalent of isALambda() (like isAnonymous() was used before isHidden()); GraalVM emulates isHidden() for this reason.
>> I'm not very sympathetic here.  Code that interprets isHidden in this
>> way is just wrong.  There were extensive discussions about "how do I
>> detect whether an object is a lambda" and the answer has consistently
>> been "don't try, you don't need to know, and none of the mechanisms
>> answer the question you are asking."
>>
>>> For me, instead of trying to emulate those differences, I think it's easier here to provide a method Class.isLambdaProxy() and add an empty classfile attribute LambdaProxy in the VM spec so that a lambda proxy, whether generated using invokedynamic or pre-generated, will mostly behave the same way.
>> We made a very clear decision in the JSR 335 EG -- that at runtime,
>> lambdas are not a thing.  The question of "are you a lambda proxy" is no
>> more interesting than "was it a Tuesday when the source file for this
>> class was last changed", and it was a deliberate choice to not provide
>> any sort of reflection support here.  So I would not want to expose
>> this; it's an implementation detail.
>>


More information about the leyden-dev mailing list