Runtime code generation and barriers in migrating away from JVM-internal APIs

Rafael Winterhalter rafael.wth at
Sat Jan 6 23:07:53 UTC 2018


I am the author of Byte Buddy and a maintainer of cglib, two of the major
code generation libraries in the Java ecosystem. Both libraries are
downloaded about 160 million times a year, and I want to report and give my
opinion on the current state of moving away from JVM-internal APIs to safe
and officially supported alternatives.

Code generation tools currently have several alternatives for defining
classes at runtime:

1) Using sun.misc.Unsafe::defineClass to define a class directly. This is a
fairly easy API that even allows defining classes in the bootstrap loader.
2) Accessing protected methods of java.lang.ClassLoader via Java
reflection. This allows more fine-grained access to class loading by
respecting class loading locks etc.
3) Creating a custom class loader as a child of the class loader of a
proxied class. This avoids any use of internal API but limits proxying to
public (and, since Java 9, also exported) classes and their protected and
public methods.
4) Using a Java agent to define classes using the Instrumentation API.
Using this API, it is also possible to gain access to internal APIs as it
becomes possible to open encapsulated APIs.
5) Using JNI to define classes using its APIs or to avoid encapsulation.
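
As an illustration of strategy (3), the following is a minimal sketch and
not how Byte Buddy or cglib implement it. The names ChildLoaderSketch,
DefiningLoader, emptyClass and sample.Generated are hypothetical, and the
class file is written by hand (an empty public class extending Object) only
to keep the example free of any code generation library.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;

final class ChildLoaderSketch {

    /** Hypothetical child loader that exposes the protected defineClass method. */
    static final class DefiningLoader extends ClassLoader {
        DefiningLoader(ClassLoader parent) {
            super(parent);
        }

        Class<?> define(String binaryName, byte[] classFile) {
            return defineClass(binaryName, classFile, 0, classFile.length);
        }
    }

    /** Writes a class file for an empty public class that extends Object. */
    static byte[] emptyClass(String internalName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);               // magic number
            out.writeShort(0);                      // minor version
            out.writeShort(52);                     // major version (Java 8)
            out.writeShort(5);                      // constant pool count (entries + 1)
            out.writeByte(1);                       // #1: Utf8, name of this class
            out.writeUTF(internalName);
            out.writeByte(7);                       // #2: Class -> #1
            out.writeShort(1);
            out.writeByte(1);                       // #3: Utf8, name of the super class
            out.writeUTF("java/lang/Object");
            out.writeByte(7);                       // #4: Class -> #3
            out.writeShort(3);
            out.writeShort(0x0021);                 // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                      // this_class
            out.writeShort(4);                      // super_class
            out.writeShort(0);                      // no interfaces
            out.writeShort(0);                      // no fields
            out.writeShort(0);                      // no methods
            out.writeShort(0);                      // no attributes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Defines an empty class with the given name in a fresh child of the parent loader. */
    static Class<?> defineInChild(ClassLoader parent, String binaryName) {
        byte[] classFile = emptyClass(binaryName.replace('.', '/'));
        return new DefiningLoader(parent).define(binaryName, classFile);
    }

    public static void main(String[] args) {
        ClassLoader parent = ChildLoaderSketch.class.getClassLoader();
        Class<?> generated = defineInChild(parent, "sample.Generated");
        System.out.println(generated.getName());
        System.out.println(generated.getClassLoader() != parent);
    }
}
```

The generated class belongs to the child loader rather than to the loader
of the proxied class, which is why this strategy can only reach members
that are visible across class loader boundaries.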

Of course, strategies (1) and (2) were always discouraged and might no
longer work in a future Java release due to the encapsulation of internal
APIs. Yet, as of Java 9, most code generation tools achieved Java 9
compatibility by migrating from solution (2) to solution (1) thanks to the
jdk.unsupported module. Method (5) is rarely used as it requires the
inclusion of C code for something that can be achieved more easily.

As of Java 9, the JVM offers a new approach to defining classes:

6) Using java.lang.invoke.MethodHandles.Lookup::defineClass
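
A minimal sketch of what using this API might look like is given below. It
assumes Java 9 or later; LookupDefineSketch, emptyClass and the generated
class names are hypothetical, and the class file is again written by hand
(an empty public class extending Object) to avoid a dependency on a code
generation library. The class is defined in the package of the class that
created the lookup.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;

final class LookupDefineSketch {

    /** Writes a class file for an empty public class that extends Object. */
    static byte[] emptyClass(String internalName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);               // magic number
            out.writeShort(0);                      // minor version
            out.writeShort(52);                     // major version (Java 8)
            out.writeShort(5);                      // constant pool count (entries + 1)
            out.writeByte(1);                       // #1: Utf8, name of this class
            out.writeUTF(internalName);
            out.writeByte(7);                       // #2: Class -> #1
            out.writeShort(1);
            out.writeByte(1);                       // #3: Utf8, name of the super class
            out.writeUTF("java/lang/Object");
            out.writeByte(7);                       // #4: Class -> #3
            out.writeShort(3);
            out.writeShort(0x0021);                 // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                      // this_class
            out.writeShort(4);                      // super_class
            out.writeShort(0);                      // no interfaces
            out.writeShort(0);                      // no fields
            out.writeShort(0);                      // no methods
            out.writeShort(0);                      // no attributes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Defines an empty class in the same package and class loader as this class. */
    static Class<?> defineNextToThisClass(String simpleName) {
        String packageName = LookupDefineSketch.class.getPackageName();
        String internalName = packageName.isEmpty()
                ? simpleName
                : packageName.replace('.', '/') + "/" + simpleName;
        try {
            // The lookup is created here, so it has PACKAGE access to this package only.
            return MethodHandles.lookup().defineClass(emptyClass(internalName));
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Class<?> generated = defineNextToThisClass("GeneratedFromMain");
        System.out.println(generated.getName());
    }
}
```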

While Byte Buddy supports this new approach as a user-chosen class
definition strategy, the API is not convenient enough for most use cases.
Code generation is mainly used for the following two purposes:

A) When defining a proxy, the proxy class is normally defined in the same
package as the proxied class. Doing so, a proxy can be created for
package-private classes and it can proxy package-private methods. Using
strategy (6), however, it is not possible to define a class in a package
other than that of the class that created the lookup, as this would require
PACKAGE access for the target package. If the proxy is created by a module
other than that of the proxied class, this access right is never
available, even if the proxied class’s module opens its package to the
module that generates the proxy. In this context, strategy (3) is not an
option either, as the runtime package of a class defined by the child class
loader would differ from that of the proxied class.
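
The package restriction can be observed with a small sketch, assuming Java 9
or later. ForeignPackageSketch, emptyClass and the package name
hypothetical.target.pkg are made up for illustration. The example only
demonstrates the package mismatch for a lookup created in another package;
it does not model modules or opened packages.

```java
import java.io.ByteArrayOutputStream;
import java.io.DataOutputStream;
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.invoke.MethodHandles;

final class ForeignPackageSketch {

    /** Writes a class file for an empty public class that extends Object. */
    static byte[] emptyClass(String internalName) {
        ByteArrayOutputStream bytes = new ByteArrayOutputStream();
        try (DataOutputStream out = new DataOutputStream(bytes)) {
            out.writeInt(0xCAFEBABE);               // magic number
            out.writeShort(0);                      // minor version
            out.writeShort(52);                     // major version (Java 8)
            out.writeShort(5);                      // constant pool count (entries + 1)
            out.writeByte(1);                       // #1: Utf8, name of this class
            out.writeUTF(internalName);
            out.writeByte(7);                       // #2: Class -> #1
            out.writeShort(1);
            out.writeByte(1);                       // #3: Utf8, name of the super class
            out.writeUTF("java/lang/Object");
            out.writeByte(7);                       // #4: Class -> #3
            out.writeShort(3);
            out.writeShort(0x0021);                 // ACC_PUBLIC | ACC_SUPER
            out.writeShort(2);                      // this_class
            out.writeShort(4);                      // super_class
            out.writeShort(0);                      // no interfaces
            out.writeShort(0);                      // no fields
            out.writeShort(0);                      // no methods
            out.writeShort(0);                      // no attributes
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return bytes.toByteArray();
    }

    /** Returns true if the lookup of this class refuses to define the named class. */
    static boolean rejectsForeignPackage(String foreignInternalName) {
        try {
            MethodHandles.lookup().defineClass(emptyClass(foreignInternalName));
            return false;
        } catch (IllegalArgumentException expected) {
            // Thrown when the class file names a package other than the lookup class's package.
            return true;
        } catch (IllegalAccessException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        System.out.println(rejectsForeignPackage("hypothetical/target/pkg/Proxy"));
    }
}
```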

B) When programming a Java agent, a class enhancement sometimes makes it
necessary to define an auxiliary class in the same package as the
instrumented class. This is similar to javac’s need for such classes where
it sometimes defines anonymous classes to provide a certain type for using
an API. Unfortunately, the ClassFileTransformer::transform method does not
provide a method handle lookup for the package of the instrumented class.
At the same time, the Java agent itself typically lives in a different
package than the instrumented class, so it cannot create its own lookup
for that package, which makes (6) inapplicable. Of course, (3) is not an
option in this case either.
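
For reference, the sketch below shows the arguments that a class file
transformer receives. RecordingTransformer is a hypothetical no-op
transformer that only records class names; a real agent would register it
via Instrumentation::addTransformer. The point is that the argument list
contains a class loader, a name, a protection domain and the class file,
but no method handle lookup.

```java
import java.lang.instrument.ClassFileTransformer;
import java.security.ProtectionDomain;
import java.util.ArrayList;
import java.util.List;

final class RecordingTransformer implements ClassFileTransformer {

    /** Internal names (e.g. "sample/Instrumented") of all classes seen so far. */
    final List<String> seen = new ArrayList<>();

    @Override
    public byte[] transform(ClassLoader loader,
                            String className,
                            Class<?> classBeingRedefined,
                            ProtectionDomain protectionDomain,
                            byte[] classfileBuffer) {
        // None of the arguments is a MethodHandles.Lookup for the instrumented
        // class's package, so an auxiliary class cannot be defined via (6) here.
        seen.add(className);
        return null; // null tells the JVM to leave the class file unchanged
    }

    public static void main(String[] args) {
        RecordingTransformer transformer = new RecordingTransformer();
        transformer.transform(null, "sample/Instrumented", null, null, new byte[0]);
        System.out.println(transformer.seen);
    }
}
```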

For scenario (A) one could argue that for many use cases, access to
package-private classes and methods is not necessary as it breaks the Java
programming language's encapsulation model. However, giving such access has
proven useful in the past: the Spring framework, for example, induces a
bean scope when a Java configuration class’s method is declared
package-private. And for the Mockito framework, such access allows for the
creation of package-private mocks, which spares users from widening the
visibility of such classes only for a unit test.

For (B), a Java agent is able to access internal APIs by opening packages.
Providing a method handle lookup for the target class would however offer a
cleaner, more standardized approach. It is however unclear what lookupClass
the method handle lookup would be assigned to as the instrumented class is
not necessarily loaded when the class file transformer is applied.

Additionally, some proxying tools such as Mockito require an API to
instantiate a class without invoking a constructor. This way, a mock can be
created without triggering any user code which might have unwanted
side-effects or throw an exception for invalid inputs that are unknown to
the mocking framework. I understand that such instantiations are frowned
upon as they break the object model. But again, this possibility has proven
to be very useful in the past and it would be too bad if such libraries
could no longer be maintained in the future.

To create instances without invoking a constructor, there are currently
several options:

7) Using sun.misc.Unsafe::allocateInstance or the likewise internal
reflection factory. This is often done via the Objenesis library. If such
access were encapsulated, a Java agent could still open these APIs.
8) Using JNI to avoid encapsulation or to allocate an instance without a
constructor call from JNI.
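
A minimal sketch of option (7) follows; it is not how Objenesis or Mockito
implement it. It assumes that the jdk.unsupported module is available and
reaches sun.misc.Unsafe purely via reflection. NoConstructorSketch, Guarded
and allocate are hypothetical names.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Method;

final class NoConstructorSketch {

    /** Hypothetical user class whose constructor must not run in a test. */
    static final class Guarded {
        int value = 42; // field initializers run as part of the constructor

        Guarded() {
            throw new IllegalStateException("constructor must not run");
        }
    }

    /** Allocates an instance of the given type without invoking any constructor. */
    static <T> T allocate(Class<T> type) {
        try {
            Class<?> unsafeType = Class.forName("sun.misc.Unsafe");
            Field theUnsafe = unsafeType.getDeclaredField("theUnsafe");
            theUnsafe.setAccessible(true);
            Object unsafe = theUnsafe.get(null);
            Method allocateInstance = unsafeType.getMethod("allocateInstance", Class.class);
            return type.cast(allocateInstance.invoke(unsafe, type));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Guarded instance = allocate(Guarded.class);
        // The constructor was skipped, so the field still holds its default value.
        System.out.println(instance.value);
    }
}
```

Because the constructor is skipped entirely, field initializers do not run
either, which is part of why such instantiation breaks the object model.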

Again, (8) is a rarely chosen approach but (7) via the use of Objenesis is
still common.

As a result, even with Java 9 being supported by many popular frameworks, a
migration away from internal APIs has not yet been achieved. I would
therefore like to suggest the following extensions:

C) When a module opens a package, other modules should gain package access
to this package when creating method handle lookups. This way, if a user
opens a package containing Spring beans to the Spring framework, it could
proxy all of these beans as it does today. Since opening a package also
permits reflection on package-private types and methods of this package,
this is not a security concern either.

D) A class file transformer should be provided with an instance of a method
handle lookup for the instrumented class as an argument. This way, Java
agents gain an easy and standardized way of defining auxiliary classes,
which is currently rather cumbersome.

E) There should be a jdk.test module that is not resolved by default and
that is only included in JDK distributions of the JVM. This module should
contain an API that allows for the instantiation of classes without a
constructor invocation. By depending on this module, test libraries that
offer such insecure abilities can also make clear that they are meant for
testing and not for production. With Mockito, we regularly get inquiries
about performance issues when the library is used in production systems,
for which it is not designed. This module could also include an API
for getting hold of an Instrumentation instance for the current JVM
process. This would be useful for many testing libraries such as Mockito
and also for testing Java agents under development. Currently, it is
necessary to self-attach using the attachment API. Since Java 9, it is
additionally required to explicitly allow such self-attachment or to use an
intermediate Java process to avoid the constraint.

With these three extensions, I believe that the many users of code
generation tools could easily migrate away from the use of internal APIs in
a few months, which would allow a full encapsulation of JVM-internal APIs
without any major disruptions.

Thank you for your time and feedback on my proposal!

Best regards, Rafael

More information about the adoption-discuss mailing list