New candidate JEP: 451: Prepare to Disallow the Dynamic Loading of Agents

Fri May 19 18:40:11 UTC 2023

Hi Ron,

I reviewed the integrity JEPs once again, along with this email thread, and
I think there are several flaws in the proposal that need to be addressed
before implementation.

   1. First, the JEP equates an agent with an instrumenting agent, which is
   not accurate. Instrumentation is just one of the capabilities that an
   agent needs to request explicitly by calling the JVM TI AddCapabilities
   function. There are many other read-only features of JVM TI that
   observability and troubleshooting agents can use without compromising
   application integrity. Disabling all agents by default just to protect
   against the few that modify application code is like cracking a nut with a
   sledgehammer, especially when a more fine-grained approach is already built
   into JVM TI.

   2. The JEP states that most serviceability tools do not require dynamic
   agents. This sounds odd to me. How was that "most" measured? How can half
   a dozen JDK built-in tools be compared to an infinite number of custom
   tools that may be, and already have been, developed using JVM TI?

   3. The JEP assumes that existing JDK tools are enough for troubleshooting.
   I wish they were. How, for example, would you dump an object graph without
   sensitive user data from a live service? With a JVM TI agent, this is
   possible. Which built-in tool allows you to find native memory leaks, find
   sources of long time-to-safepoint pauses, or map perf counters to Java
   code? Unfortunately, none. Even worse, when dynamic agents are disabled,
   development of new custom tools will become pointless.

   4. You emphasized many times that the proposal to disable dynamic agents
   appeared years ago. And that's actually the problem with this JEP. It
   relies on outdated assumptions and has not been adjusted to modern
   trends. Technology has not stood still; new use cases have become popular,
   which this proposal does not take into account. Here are some examples:
      1. Containers have become the standard way to ship and deploy
      applications (btw, a good thing integrity-wise). A container image
      usually has the minimum amount of software required to run the app: no
      additional tools, restricted environment. Now consider that I want to
      monitor the application. Even if I'm allowed to modify the command line,
      I can't simply add -agentpath, since the agent library is not available
      in the container. A typical pattern for using serviceability tools with
      containerized applications is to run a sidecar container that has all
      required tools and capabilities. How would you suggest attaching a tool
      to a running container?
      2. In the last couple of years, with the growing popularity of
      continuous profilers, a number of solutions have appeared for system-wide
      or infrastructure-wide zero-configuration monitoring. The idea is that
      you install the observability software, and it automatically discovers
      all supported processes and starts monitoring/profiling them, regardless
      of how they were deployed. gProfiler, Parca, Pyroscope, just to name a
      few examples. The keyword here is "zero-configuration". Observability by
      default is just as important nowadays as integrity by default.

   5. The JEP presents JFR as a universal solution for profiling, claiming
   it is "far more efficient than anything" in collecting stack traces. This
   is not true. Async-profiler (6K stars on GitHub, 700+ forks, more than a
   million downloads) can collect 1000 execution samples per second per core
   without significant overhead, thanks to hardware performance counters.
   The scalability of the JFR sampling mechanism is inherently poor: it uses
   just one dedicated thread to walk through all Java threads in a loop and
   stop them one by one. JFR does not show non-Java threads in a profile, it
   is blind to native frames, and its notion of thread states is misleading
   (e.g., Socket.read can spend CPU time in the networking stack or just wait
   for incoming data, but JFR has no clue). JFR fails to traverse valid Java
   stacks and silently discards such samples, e.g., you will not see
   arraycopy in a profile, although it's a common performance bottleneck. JFR
   is misleading not only in CPU profiling but also in memory profiling, see
   JDK-8307488. It's utopian to think that JFR can replace external profilers
   anytime soon: there is not even progress on fixing smaller issues. Open
   bugs linger for years (JDK-8252417, JDK-8153167, JDK-8281677), and some are
   closed as will-not-fix (JDK-8191415). Is it fair to disallow valid uses of
   profilers at runtime without providing a viable alternative?

   6. You mentioned two goals: 1) prevent libraries from granting themselves
   superpowers; 2) minimize the impact on serviceability tools that have to be
   started by a human operator. However, what this JEP actually suggests is
   the opposite: disabling dynamic loading of agents does not prevent
   libraries from obtaining superpowers, since they can simply call
   System.load(). At the same time, disabling dynamic loading of agents has a
   huge impact on serviceability, up to the complete inability to use external
   tools at runtime. I understand that the plan is to disallow JNI someday too
   (unless explicitly allowed via a command line option) for the purpose of
   integrity. Following your goals, it would be more logical to disallow JNI
   first, as it is an easier way for libraries to break integrity.
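
For readers less familiar with the mechanisms contrasted in points 4 and 6,
here is a minimal Java sketch, assuming a JDK that includes the jdk.attach
module. The class name AgentLoadingSketch and all paths are hypothetical
placeholders. It shows the startup-time -agentpath flag, dynamic loading
through the Attach API, and in-process native loading via System.load().

```java
import com.sun.tools.attach.VirtualMachine;

// Hypothetical illustration class; the name and all paths are placeholders.
public class AgentLoadingSketch {

    // Startup-time loading: builds the -agentpath flag that would have to be
    // present on the target JVM's command line.
    static String startupFlag(String agentPath, String options) {
        if (options == null || options.isEmpty()) {
            return "-agentpath:" + agentPath;
        }
        return "-agentpath:" + agentPath + "=" + options;
    }

    // Dynamic loading: attaches to an already running JVM by process ID and
    // asks it to load a native (JVM TI) agent library.
    static void attachAndLoad(String pid, String agentPath, String options)
            throws Exception {
        VirtualMachine vm = VirtualMachine.attach(pid);
        try {
            vm.loadAgentPath(agentPath, options);
        } finally {
            vm.detach();
        }
    }

    // In-process native loading: what a library can do from inside the
    // application, without going through the Attach mechanism at all.
    static boolean tryLoadNative(String absolutePath) {
        try {
            System.load(absolutePath);
            return true;
        } catch (UnsatisfiedLinkError e) {
            return false; // library missing or not loadable
        }
    }

    public static void main(String[] args) throws Exception {
        if (args.length < 2) {
            System.err.println(
                "usage: AgentLoadingSketch <pid> <agent-library-path> [options]");
            return;
        }
        String options = args.length > 2 ? args[2] : null;
        System.out.println("Startup equivalent: " + startupFlag(args[1], options));
        attachAndLoad(args[0], args[1], options);
    }
}
```

The first method only formats a command-line flag. The second is the path
that the JEP's warning applies to. The third is the route that stays open to
any library running inside the application.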

To summarize the above, the current proposal does not seem to me mature
enough to be targeted to JDK 21. I would suggest improving it by 1)
updating its assumptions; 2) taking the mentioned use cases into account; 3)
providing ready-to-use alternatives; 4) aligning the plan with the goals.

Thank you,

Andrei Pangin

Fri, 19 May 2023 at 15:44, Ron Pressler <ron.pressler at oracle.com>:

> Because the discussion of this JEP has veered in many directions, let me
> summarise where we are:
>
> This JEP proposes to emit a suppressible warning when a JVM TI or Java
> agent is loaded into a JVM sometime after startup through the Attach
> mechanism.
>
> The warning helps make users aware that an agent has been injected into
> the JVM and identify deployments that may need adjustment in advance of any
> future changes to disallow agents from being dynamically loaded without the
> application's consent. The warning will also let us better judge the impact
> of such a future change.
>
> — Ron
>
> > On 8 May 2023, at 20:17, Mark Reinhold <mark.reinhold at oracle.com> wrote:
> >
> > https://openjdk.org/jeps/451
> >
> >  Summary: Issue warnings when agents are loaded dynamically into a
> >  running JVM. These warnings aim to prepare users for a future release
> >  which disallows the dynamic loading of agents by default in order to
> >  improve integrity by default. Serviceability tools that load agents at
> >  startup will not cause warnings to be issued in any release.
> >
> > - Mark
>
>