RFC: 8200559: Java agents doing instrumentation need a means to define auxiliary classes
Ron Pressler
ron.pressler at oracle.com
Sun Apr 18 10:30:26 UTC 2021
Hi Rafael!
The use cases you mention might be legitimate, but why are they legitimate *for
agents*? For example, why can’t the Replacement class be defined in a regular library,
not injected by the agent?
As to the existence of Unsafe hacks and others, those are going away, so “we can do
this anyway,” is not an excuse. I think library authors need to accept that things
they’re able to achieve with various hacks might not be possible to do as
conveniently soon. The security and ecosystem maintenance philosophy is that certain
things are allowed only when the *application* explicitly gives its consent by signifying
it somehow on the command line. Libraries will not be able to take such permissions for
themselves, and “the application gives its consent by choosing my library” is insufficient.
Yes, it will be less convenient, but convenience is not the only important concern.
At this point, that capabilities that might compromise security or upgradability (i.e
reliance on non-API elements) might require explicit consent on the command line, even
when this might be less convenient, should be taken as an axiom for the discussion.
— Ron
> On 16 Apr 2021, at 21:18, Rafael Winterhalter <rafael.wth at gmail.com> wrote:
>
> I am trying to revive issue 8200559 (
> https://bugs.openjdk.java.net/browse/JDK-8200559) which was briefly
> discussed on this mailing list over three years ago (
> http://mail.openjdk.java.net/pipermail/jdk-dev/2018-January/000405.html). I
> am hopeful that a solution could be reached prior to releasing Java 17.
> With the LTS label, JVMs of that version are likely to be seen for quite
> some time and without a fix, many Java agents will still rely on unsafe API
> to support that version. With the issue resolved, agents that aim to
> support Java 17 as a baseline could be implemented using only official API
> which I consider a desirable milestone. This would also allow me to remove
> use of unsafe API from a range of popular open source projects that I
> maintain once Java 17 becomes their baseline. I have proactively added a
> pull request (https://github.com/openjdk/jdk/pull/3546) with what I
> consider to be the most complete and satisfactory solution for agent
> developers. I implemented this patch to test potential migration for my own
> and customer projects and can confirm that this would indeed make any
> unsafe API obsolete.
>
> I had already lined out why APIs such as MethodHandles.Lookup::defineClass
> do not work for ClassFileTransformers in my original posting. I will try to
> avoid reiterating on my original argument. As a result of this discussion,
> an API was proposed where the ClassFileTransformer would be overloaded with
> an additional parameter that receives an instance of ClassDefiner as its
> argument. The instance would allow you to define a class but assert that
> the auxiliary class is located in the same package as the class under
> transformation. For example, if a class:
>
> package p;
> class C {
> void m(A a) {
> a.callback(new Value());
> }
> }
>
> would be instrumented to become:
>
> package p;
> class C {
> void m(A a) {
> a.callback(new Replacement()); // changed
> }
> }
>
> then ClassDefiner would assert that the auxiliary class Replacement is also
> defined in package 'p'. In my eyes, this is not a very satisfying solution.
> In some cases, a Java agent is for example using existing classes to carry
> additional state between two instrumented classes. For instance, class
> Replacement might be used as a carrier of additional state to pass to class
> A which is itself instrumented as follows:
>
> package p2;
> class A {
> void callback(Value v) {
> if (v instanceof Replacement) {
> // read data of v and process
> }
> }
> }
>
> This example is far from constructed, but a common and effective strategy
> to implement tracing, security monitoring or code intelligence.
>
> It is of course possible to implement this pattern by using the
> package-restricted ClassDefiner. It is however rather costly and
> complicated to implement. A ClassFileTransformer is a simple callback that
> is not in control of the class loading order but is only invoked whenever a
> class is loaded for the first time. The first class to be loaded might of
> course be either C or A, it might not even be deterministic for the same
> application. If the class Replacement was required to live in the package
> of the class that is currently instrumented, Replacement would need to be
> defined in either p or p2, depending on the materialized class loading
> order. This would require some form of state management in the class file
> transformer to keep track of the package that one would need to use when
> instrumenting the second class. Due to parallel class loading and with
> constellations that involve even more classes, this can get quite
> complicated. Using a stable package name is however not only a convenience.
> The name of a package can affect reflection-heavy frameworks, debugging
> opportunities, logs, stacktraces and more.
>
> Furthermore, one should consider the costs of migrating an existing agent
> in the light that an agent can easily open jdk.internal.misc.Unsafe by
> using the instrumentation API or use a class file transformer to erase the
> explicit package assertion of ClassDefiner. Current agents often use
> sun.misc.Unsafe or jdk.internal.misc.Unsafe to invoke the defineClass
> method of these types. This API is very performant and easy to use. To some
> degree, it is even possible to emulate the Unsafe API using ClassDefiner by
> strategically retransforming classes in the right package. The following
> code allows to define a class using a witness class by using a
> non-operational class file transformer:
>
> static Class<?> defineClass(Instrumentation inst, final Class<?> witness,
> final byte[] classFile) throws Exception {
> AtomicReference<Class<?>> ref = new AtomicReference<>();
> ClassFileTransformer definer = new ClassFileTransformer() {
> @Override
> public byte[] transform(ClassDefiner classDefiner,
> Module module,
> ClassLoader loader,
> String className,
> Class<?> classBeingRedefined,
> ProtectionDomain protectionDomain,
> byte[] classfileBuffer) {
> if (classBeingRedefined == witness) {
> ref.set(classDefiner.define(classFile));
> }
> return null;
> }
> };
> inst.addTransformer(definer, true);
> try {
> inst.retransformClasses(witness);
> } finally {
> inst.removeTransformer(definer);
> }
> return ref.get();
> }
>
> In this sense, Java agents already have the possibility to define new
> classes within any package. The restriction that ClassDefiner introduces is
> therefore merely cosmetic and does not really reduce the actual
> capabilities of any owner of an instance of Instrumentation. Beyond that,
> JNI already offers a similar API for defining a class which is also used as
> there are wrappers that compile the method for any common platform and
> architecture.
>
> Finally, many Java agents need to define general infrastructure classes to
> function. A tracing framework might for example want to be notified of all
> HTTP calls of known frameworks. To accomplish this, some form of common
> callback is injected into a generally known class loader, often the
> globally visible bootstrap loader:
>
> package com.acme.tracer;
> public class AgentCallback {
> public static void callback(String framework) { ... }
> }
>
> This is typically done before registering a first ClassFileTransformer
> which is later instrumenting framework classes to invoke this callback,
> none of which is defined by the boot loader which is only used to assure
> that the counter can be seen from all application classes and the agent
> framework alike. At the moment,
> Instrumentation::appendToBootstrapClassLoaderSearch is the most feasible
> approach to this. However, the API also has significant downsides. First of
> all, it is of course restricted to the boot or system loader by the way the
> API is set up. In some scenarios, another class loader would however be
> more appropriate, for example if a JVM is segmented as in a JEE container
> where the callback should only apply to a specific deployment area and
> might therefore better be placed in a different class loader that is root
> to the deployment. Additionally, it is rather inefficient to serialize
> these auxiliary classes to the file system, only to pass it to the API
> which then reads them back into memory to load the contained classes. In
> some environments, a file system for such purposes is not even available.
>
> For all these reasons, I still believe that adding
> Instrumentation::defineClass is the best solution to address all of these
> legitimate needs. Java agents have created excellent tooling for the JVM
> which make the platform successful in the enterprise. In my eyes it is
> overdue that these legitimate use cases can start relying on a stable API.
> Going this route would offer an easy path for migration and agents that
> supported older versions than Java 17 could easily replace their usage of
> Unsafe by this similar API which promises a quick adoption. Alternatively,
> I am afraid many will find ways around, even by emulating unsafe using
> official API as I demonstrated, the cost being decreased performance and
> convenience.
>
> Thank you for your comments. I really hope this can be addressed before 17
> is released, I am hopeful that this would make the use of unsafe API in
> agents disappear sooner rather than later.
>
> Best regards, Rafael
More information about the core-libs-dev
mailing list