RFC: 8200559: Java agents doing instrumentation need a means to define auxiliary classes
Rafael Winterhalter
rafael.wth at gmail.com
Fri Apr 16 20:18:31 UTC 2021
I am trying to revive issue 8200559 (
https://bugs.openjdk.java.net/browse/JDK-8200559) which was briefly
discussed on this mailing list over three years ago (
http://mail.openjdk.java.net/pipermail/jdk-dev/2018-January/000405.html). I
am hopeful that a solution could be reached prior to releasing Java 17.
With the LTS label, JVMs of that version are likely to be seen for quite
some time and without a fix, many Java agents will still rely on unsafe API
to support that version. With the issue resolved, agents that aim to
support Java 17 as a baseline could be implemented using only official API
which I consider a desirable milestone. This would also allow me to remove
use of unsafe API from a range of popular open source projects that I
maintain once Java 17 becomes their baseline. I have proactively added a
pull request (https://github.com/openjdk/jdk/pull/3546) with what I
consider to be the most complete and satisfactory solution for agent
developers. I implemented this patch to test potential migration for my own
and customer projects and can confirm that this would indeed make any
unsafe API obsolete.
I had already lined out why APIs such as MethodHandles.Lookup::defineClass
do not work for ClassFileTransformers in my original posting. I will try to
avoid reiterating on my original argument. As a result of this discussion,
an API was proposed where the ClassFileTransformer would be overloaded with
an additional parameter that receives an instance of ClassDefiner as its
argument. The instance would allow you to define a class but assert that
the auxiliary class is located in the same package as the class under
transformation. For example, if a class:
package p;
class C {
void m(A a) {
a.callback(new Value());
}
}
would be instrumented to become:
package p;
class C {
void m(A a) {
a.callback(new Replacement()); // changed
}
}
then ClassDefiner would assert that the auxiliary class Replacement is also
defined in package 'p'. In my eyes, this is not a very satisfying solution.
In some cases, a Java agent is for example using existing classes to carry
additional state between two instrumented classes. For instance, class
Replacement might be used as a carrier of additional state to pass to class
A which is itself instrumented as follows:
package p2;
class A {
void callback(Value v) {
if (v instanceof Replacement) {
// read data of v and process
}
}
}
This example is far from constructed, but a common and effective strategy
to implement tracing, security monitoring or code intelligence.
It is of course possible to implement this pattern by using the
package-restricted ClassDefiner. It is however rather costly and
complicated to implement. A ClassFileTransformer is a simple callback that
is not in control of the class loading order but is only invoked whenever a
class is loaded for the first time. The first class to be loaded might of
course be either C or A, it might not even be deterministic for the same
application. If the class Replacement was required to live in the package
of the class that is currently instrumented, Replacement would need to be
defined in either p or p2, depending on the materialized class loading
order. This would require some form of state management in the class file
transformer to keep track of the package that one would need to use when
instrumenting the second class. Due to parallel class loading and with
constellations that involve even more classes, this can get quite
complicated. Using a stable package name is however not only a convenience.
The name of a package can affect reflection-heavy frameworks, debugging
opportunities, logs, stacktraces and more.
Furthermore, one should consider the costs of migrating an existing agent
in the light that an agent can easily open jdk.internal.misc.Unsafe by
using the instrumentation API or use a class file transformer to erase the
explicit package assertion of ClassDefiner. Current agents often use
sun.misc.Unsafe or jdk.internal.misc.Unsafe to invoke the defineClass
method of these types. This API is very performant and easy to use. To some
degree, it is even possible to emulate the Unsafe API using ClassDefiner by
strategically retransforming classes in the right package. The following
code allows to define a class using a witness class by using a
non-operational class file transformer:
static Class<?> defineClass(Instrumentation inst, final Class<?> witness,
final byte[] classFile) throws Exception {
AtomicReference<Class<?>> ref = new AtomicReference<>();
ClassFileTransformer definer = new ClassFileTransformer() {
@Override
public byte[] transform(ClassDefiner classDefiner,
Module module,
ClassLoader loader,
String className,
Class<?> classBeingRedefined,
ProtectionDomain protectionDomain,
byte[] classfileBuffer) {
if (classBeingRedefined == witness) {
ref.set(classDefiner.define(classFile));
}
return null;
}
};
inst.addTransformer(definer, true);
try {
inst.retransformClasses(witness);
} finally {
inst.removeTransformer(definer);
}
return ref.get();
}
In this sense, Java agents already have the possibility to define new
classes within any package. The restriction that ClassDefiner introduces is
therefore merely cosmetic and does not really reduce the actual
capabilities of any owner of an instance of Instrumentation. Beyond that,
JNI already offers a similar API for defining a class which is also used as
there are wrappers that compile the method for any common platform and
architecture.
Finally, many Java agents need to define general infrastructure classes to
function. A tracing framework might for example want to be notified of all
HTTP calls of known frameworks. To accomplish this, some form of common
callback is injected into a generally known class loader, often the
globally visible bootstrap loader:
package com.acme.tracer;
public class AgentCallback {
public static void callback(String framework) { ... }
}
This is typically done before registering a first ClassFileTransformer
which is later instrumenting framework classes to invoke this callback,
none of which is defined by the boot loader which is only used to assure
that the counter can be seen from all application classes and the agent
framework alike. At the moment,
Instrumentation::appendToBootstrapClassLoaderSearch is the most feasible
approach to this. However, the API also has significant downsides. First of
all, it is of course restricted to the boot or system loader by the way the
API is set up. In some scenarios, another class loader would however be
more appropriate, for example if a JVM is segmented as in a JEE container
where the callback should only apply to a specific deployment area and
might therefore better be placed in a different class loader that is root
to the deployment. Additionally, it is rather inefficient to serialize
these auxiliary classes to the file system, only to pass it to the API
which then reads them back into memory to load the contained classes. In
some environments, a file system for such purposes is not even available.
For all these reasons, I still believe that adding
Instrumentation::defineClass is the best solution to address all of these
legitimate needs. Java agents have created excellent tooling for the JVM
which make the platform successful in the enterprise. In my eyes it is
overdue that these legitimate use cases can start relying on a stable API.
Going this route would offer an easy path for migration and agents that
supported older versions than Java 17 could easily replace their usage of
Unsafe by this similar API which promises a quick adoption. Alternatively,
I am afraid many will find ways around, even by emulating unsafe using
official API as I demonstrated, the cost being decreased performance and
convenience.
Thank you for your comments. I really hope this can be addressed before 17
is released, I am hopeful that this would make the use of unsafe API in
agents disappear sooner rather than later.
Best regards, Rafael
More information about the core-libs-dev
mailing list