Re: Disallowing the dynamic loading of agents by default

Tue Mar 21 12:40:03 UTC 2023

Hi Volker.

JEP 261 states: "The dynamic loading of JVM TI agents will be disabled by default in a future release. To prepare for that change we recommend that applications that allow dynamic agents start using the option -XX:+EnableDynamicAgentLoading to enable that loading explicitly." The purpose of my email was to announce that that change will be put into effect in JDK 21 and to give a final reminder to those who have not yet done so to follow the recommendation in JEP 261 to prepare for that change.
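
As a rough illustration of the recommendation above, the following Java sketch checks whether a running JVM was started with the flag set explicitly. The option name -XX:+EnableDynamicAgentLoading comes from JEP 261; the class name DynamicAgentLoadingCheck and the method explicitSetting are hypothetical names chosen for this example. The sketch assumes that when the option is repeated the last occurrence takes effect, and it only inspects the arguments the JVM reports, so treat it as a sanity check rather than a definitive answer.

```java
import java.lang.management.ManagementFactory;
import java.util.List;

// Hypothetical helper for illustration only; not part of the JDK.
class DynamicAgentLoadingCheck {
    static final String ENABLE_FLAG = "-XX:+EnableDynamicAgentLoading";
    static final String DISABLE_FLAG = "-XX:-EnableDynamicAgentLoading";

    /**
     * Returns TRUE or FALSE if the option appears explicitly in the given
     * JVM arguments, or null if it is left to the JDK's default.
     * Assumption: when the option is repeated, the last occurrence wins.
     */
    static Boolean explicitSetting(List<String> jvmArgs) {
        Boolean result = null;
        for (String arg : jvmArgs) {
            if (ENABLE_FLAG.equals(arg)) {
                result = Boolean.TRUE;
            } else if (DISABLE_FLAG.equals(arg)) {
                result = Boolean.FALSE;
            }
        }
        return result;
    }

    public static void main(String[] args) {
        // Arguments passed to this JVM, excluding the arguments to main().
        List<String> jvmArgs =
                ManagementFactory.getRuntimeMXBean().getInputArguments();
        Boolean setting = explicitSetting(jvmArgs);
        if (setting == null) {
            System.out.println("Dynamic agent loading: not set explicitly (JDK default applies)");
        } else {
            System.out.println("Dynamic agent loading explicitly "
                    + (setting ? "enabled" : "disabled"));
        }
    }
}
```

An application that allows dynamic agents would be launched with something like `java -XX:+EnableDynamicAgentLoading -jar app.jar`, where app.jar is a placeholder.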

The Project Jigsaw team made that decision at the time after considering the perspectives of performance experts, security experts, and tooling experts, but unless anyone has some *new* information to present, there is no point in replaying the same discussions. You can revisit at least some of the technical discussions on jigsaw-dev.

I will summarise the salient aspects (all discussed at the time) in the forthcoming JEP but, briefly, dynamically loaded agents -- alongside JNI and Unsafe -- break integrity, the ability to guarantee certain invariants, which has various implications for performance, security, and code evolution. They don't always break integrity in the way a cursory contemplation would suggest, which is why you should study those discussions if you're interested in the subject. Since JEP 261, the JDK has been evolving under the assumption that integrity is preserved unless the application grants explicit consent for it to be broken. As far as security in particular is concerned, the point you made is *not* the one relevant to the implications that Project Jigsaw considered at the time. As a member of the Vulnerability Group you may want to discuss that particular aspect with the appropriate people.

-- Ron

> On 20 Mar 2023, at 12:16, Volker Simonis <volker.simonis at gmail.com> wrote:
> 
> Hi Ron,
> 
> I'm still missing convincing technical arguments for disallowing
> dynamic loading of agents.
> 
> If the argument is security then I can only agree with previous
> answers in that an attacker needs local access with the same
> credentials as the attacked JVM. But once he has that, all bets are
> off anyway.
> 
> If you plan for features/enhancements/optimizations that rely on not
> being able to dynamically load an agent (which I haven't heard of
> yet), I don't understand this change either. Because as long as a
> switch for enabling dynamic loading exists (and I haven't heard that
> you want to completely forbid it) the dynamic loading use case has to
> be supported anyway.
> 
> Dynamic agent loading is one of the features which sets the OpenJDK
> apart from other languages, managed runtimes and even closely related
> platforms like for example GraalVM Native Image which don't support
> such a feature. The mere existence of tools which rely on it and
> which are in widespread productive use, demonstrates its usefulness.
> And it is always good to know you have this possibility in your
> toolbox for the worst case (e.g. our log4j-hotpatcher [1]).
> 
> I also can't buy your argument that "the relatively few sophisticated
> users who know how to write ad-hoc agents can even opt to enable
> dynamic agent loading on all their servers". It is *exactly* not the
> few sophisticated authors of dynamic agents who would need to enable
> them but instead the millions of ingenious end-users and
> administrators who beg for help once they run into trouble. The other
> way round makes much more sense to me - the few sophisticated users
> who know for sure that they will never need the help of dynamic agents
> are free to disable them at startup.
> 
> Given the current arguments, for me the usefulness of dynamic agents
> outweighs their drawbacks by far. Of course every OpenJDK distributor
> is free to change the default settings of command line options at his
> sole discretion, but I don't currently see a compelling reason for
> doing this by default for the whole OpenJDK community. If you have
> future plans which rely on disabling/forbidding dynamic agents please
> let us know.
> 
> Best regards,
> Volker
> 
> [1] https://aws.amazon.com/blogs/opensource/hotpatch-for-apache-log4j/
> 
> On Mon, Mar 20, 2023 at 11:37 AM Jaroslav Bachorik <j.bachorik at gmail.com> wrote:
>> 
>> Hi,
>> 
>> On Mon, Mar 20, 2023 at 11:11 AM Ron Pressler <ron.pressler at oracle.com> wrote:
>>> 
>>> Hi.
>>> 
>>> The majority of serviceability tools don’t require dynamically loading an agent, and the majority of applications never load an agent dynamically.
>> 
>> 
>> The majority of the JDK built-in tools, I would say. What about e.g. the JMC agent?
>> 
>>> 
>>> 
>>> True, there are some tools that will be affected, which is why the decision was to introduce the flag in JDK 9 and to announce this change, but change the default in a later version to give tools ample time to prepare their users. The rationale for this change then hasn’t changed, but will be reiterated in a JEP (we just wanted to announce this ahead of the JEP to give tool authors another reminder more than six months ahead of JDK 21). The only change between then and now is that even fewer use cases require dynamically loaded agents, and so the impact is even smaller.
>> 
>> 
>> As a maintainer of one such tool I can confidently say that this change will either kill the tool, as the ease of use will be gone, or the workaround (e.g. using JAVA_TOOL_OPTIONS) will completely defeat the purpose of this change. Having to put a flag when starting the JVM to allow dynamic loading of agents sounds a bit nonsensical to me - it would be much easier to directly add the agent to the JVM startup and then implement a lightweight control protocol over socket/shared memory to enable/disable the agent features dynamically.
>> 
>>> 
>>> 
>>> It is also true that, when starting an application you don’t know that you *will* need to load an agent, but in most situations you know that you might. E.g. processes that are too critical to bring down even for deep maintenance (although not many of these are written in modern versions of Java anymore) or canary services that are under trial. The relatively few sophisticated users who know how to write ad-hoc agents can even opt to enable dynamic agent loading on all their servers; these users are better equipped to weigh the risks and tradeoffs involved.
>> 
>> 
>> Wouldn't having this enabled system-wide actually defeat the purpose of having this flag? Considering that the dynamic attach can be performed only on the same host under the same user as the target process there seems to be a very small chance of loading agents accidentally. In the end people would set up their systems to enable dynamic agent loading via e.g. JAVA_TOOL_OPTIONS and we will be in the same place as before, with the additional hurdle of setting everything up.
>> 
>>> 
>>> Finally, some tools that require dynamically loaded JVM TI agents, such as profilers that profile native code, are so tied to the VM's internals that the best place for them is in the JDK. If anything, the bigger problem is not that profilers are used too much in production, but too little, including less advanced ones that don’t require an agent. There is plenty of time to enhance the JDK’s built-in profiling capabilities ahead of demand.
>> 
>> 
>> I think this is an overly optimistic view. It is *much more* difficult to enhance the JDK's built-in profiling capabilities than do the same in an external profiling agent.
>> 
>> 
>> Overall, I don't seem to understand the anticipated attack vectors this change is supposed to prevent. AFAIK, in order to do the dynamic agent load one needs to have full access to the target process. That means that there are more convenient and straightforward ways to do anything nefarious than loading a JVMTI agent. Am I missing some other usages where the JVMTI agent would actually give access to something which would be otherwise inaccessible considering that the attacher and attachee must be on the same host and under the same user?
>> 
>> Cheers,
>> 
>> -JB-
>> 
>>> 
>>> 
>>> — Ron
>>> 
>>> On 20 Mar 2023, at 01:21, Andrei Pangin <andrei.pangin at gmail.com> wrote:
>>> 
>>> Hi all,
>>> 
>>> Serviceability has been one of the biggest Java strengths, but the proposed change is going to have a large negative impact on it.
>>> 
>>> Disallowing dynamic agents by default means it will no longer be possible to attach a profiler to a running app at runtime. JFR cannot close this gap because it lacks capabilities that modern Java profilers have (that's a separate topic though).
>>> 
>>> When an issue happens with a live app, it's already too late to add a command line argument. Furthermore, it may not even be feasible to add an agent at startup in containerized applications. Starting a profiler on demand from the host OS or from a sidecar is the only viable solution in these cases.
>>> 
>>> Next, it's hard to predict beforehand what tools exactly might be useful for troubleshooting: e.g., one tool may be better for finding memory leaks, a different one for analyzing CPU performance. Adding all possible tools at startup does not seem a reasonable approach, especially when tools may conflict with each other.
>>> 
>>> The most important aspect of dynamic agents is the possibility to make a special tool just in time for solving a particular problem. A typical example is to get the value of some field in a live app without dumping the entire 60 GB heap. Another common use case is hot patching for fixing trivial bugs or for adding debug logs dynamically. A prominent example is the notorious log4j vulnerabilities CVE-2021-44228 and CVE-2021-45046, where dynamic agents proved an irreplaceable aid.
>>> 
>>> I would be grateful to know more about the reasons why we should give up all the above advantages of dynamic agents in their good and legitimate use cases.
>>> 
>>> Thank you,
>>> Andrei
>>> 
>>> On Thu, 16 Mar 2023 at 18:48, Ron Pressler <ron.pressler at oracle.com> wrote:
>>>> 
>>>> Hi.
>>>> 
>>>> In JDK 21 we intend to disallow the dynamic loading of agents by default. This
>>>> will affect tools that use the Attach API to load an agent into a JVM some time
>>>> after the JVM has started [1]. There is no change to any of the mechanisms that
>>>> load an agent at JVM startup (-javaagent/-agentlib on the command line or the
>>>> Launcher-Agent-Class attribute in the main JAR's manifest).
>>>> 
>>>> This change in default behavior was proposed in 2017 as part of JEP 261 [2][3].
>>>> At that time the consensus was to switch to this default not in JDK 9 but in a
>>>> later release to give tool maintainers sufficient time to inform their users.
>>>> To allow the dynamic loading of agents, users will need to specify
>>>> -XX:+EnableDynamicAgentLoading on the command line.
>>>> 
>>>> I'll post a draft JEP for review shortly.
>>>> 
>>>> -- Ron
>>>> 
>>>> [1]: https://docs.oracle.com/en/java/javase/19/docs/api/jdk.attach/com/sun/tools/attach/package-summary.html
>>>> [2]: https://openjdk.org/jeps/261
>>>> [3]: https://mail.openjdk.org/pipermail/jigsaw-dev/2017-April/012040.html
>>> 
>>>