Re: JEP draft: Integrity and Strong Encapsulation

Dan Heidinga heidinga at redhat.com
Mon May 8 13:41:14 UTC 2023


Thanks for the response, Ron.

My comments are inline.

On Fri, May 5, 2023 at 8:10 AM Ron Pressler <ron.pressler at oracle.com> wrote:

>
>
> On 4 May 2023, at 21:32, Dan Heidinga <heidinga at redhat.com> wrote:
>
>
> I’ve read this draft a number of times and each time I struggled with the
> framing of the problem given Java’s success over the past almost 30 years.
>
>
> The old regime worked when: 1. Almost all the runtime was written in C++
> (so the fact Java code couldn’t really establish invariants didn’t matter
> as much), 2. The JDK evolved at a slow pace, and 3. Java applications were
> deployed in a particular way. That lasted for a very long time, but all of
> these are now changing: 1. More and more of the runtime is being written
> (or rewritten) in Java, 2. The JDK is evolving faster, and 3. New
> deployment kinds are desired.
>

I agree the old regime worked.  It worked well and enabled Java to flourish
as a stable base for applications built on top of the runtime.  And many of
those applications have chosen to "violate integrity" to achieve business
goals.  Enforcing more constraints on the ecosystem to make JDK development
/ maintenance easier isn't necessarily a winning strategy for the
applications built on top of the runtime, especially given we have
existing tools - such as marking specific classes as "unmodifiable" [0] -
that would allow the VM to enforce invariants on critical implementation
classes that are ported from C++ to Java, and that could be extended to
protect the runtime further.

Can you speak further to the "new deployments" and why integrity
constraints are critical to them?

[0]
https://docs.oracle.com/javase/8/docs/platform/jvmti/jvmti.html#IsModifiableClass


>
> In light of this new situation, problems are arising due to the old
> regime, which isn’t working so well anymore.
>
> As JEP 411 states, the SecurityManager has:
> * Brittle permission model
> * Difficult programming model
> * Poor performance
> This translates into a whole lot of cost both for maintainers of the JDK
> and for all users who must pay the runtime costs related to the
> SecurityManager (high when enabled, but non-zero always).
>
> Although the SecurityManager has high costs, and is infrequently used at
> runtime in production, it provides the only way to limit certain
> capabilities like:
> * JNI (SecurityManager::checkLink)
> * Encapsulation (SecurityManager::checkPackageAccess)
> * Launch new processes (SecurityManager::checkExec)
> * Reflective access (accessDeclaredMembers, etc)
> * and others
>
> Some of those controls need replacements if the SecurityManager will go
> away.  JNI, surprisingly, is a key one here for large corporations.
>
> If I understand correctly, this new Integrity JEP draft aims, amongst
> other things, to replace the hard-to-maintain, expensive runtime checks of
> the SecurityManager with configuration via command line options.  This
> allows those who previously relied on the SecurityManager to continue to
> control the high-order bits of functionality without imposing a cost on the
> rest of the ecosystem.  It also makes it easier to determine which
> libraries are relying on the restricted features.
>
> Overall, this provides a smoother migration path for users, makes the
> intention of users very clear (just read the command line vs auditing
> SecurityManager implementation) and improves performance by shifting these
> decisions to configuration time rather than paying the cost of code
> complexity and stack walks.
>
> I also appreciate the “nudge” being made with this JEP by requiring
> explicit opt-in to disabling protections versus the previous uphill battle
> to enable the SecurityManager.  It makes for an easier conversation to ask
> for, e.g., JNI to be enabled for one library on the command line rather than
> having to deal with all the potential restrictions of the SecurityManager.
>
>
> The relationship between security and integrity is as follows: integrity
> is a prerequisite to robust security (i.e. security that doesn’t require
> full-program analysis). That’s because security depends on maintaining
> security invariants — e.g. a sensitive method is only ever called after an
> access check — and there can be no robust invariants, aka integrity
> invariants, *of any kind* without integrity.
>
> SecurityManager was a security mechanism, and because robust security
> requires integrity, SecurityManager *also* had to offer integrity. But
> strong encapsulation isn’t a security mechanism. It is an integrity
> mechanism. As such, it makes it *possible* to build robust security
> mechanisms, such as an authorisation mechanism, at any layer: the JDK,
> frameworks/libraries, the application. Without integrity, it would be
> impossible to build such security mechanisms at any layer. In a way,
> SecurityManager served as an excuse of sorts: if you really needed
> integrity you could have hypothetically achieved it using SM (though in
> practice it was hard).
>

That's a fair characterization.  I see this JEP draft as a necessary
foundational step towards removing the SecurityManager.  Without the
limitations this JEP proposes, the runtime offers nothing to fill the gap
left by the SecurityManager's removal.  I think it's worth calling out that
this JEP draft is an enabling step towards the complete removal of the
deprecated SecurityManager.


>
> You are right that strong encapsulation’s “permissions” are, by design,
> more coarsely grained than SM’s security permissions, but that’s not the
> only difference, or even the main one. A bigger difference is that it is
> quite normal for an application to give some component/user access to some
> file. On the other hand, it is abnormal and relatively rare for an
> application to grant *any* strong-encapsulation-breaking permissions (those
> that override the permissions in modules’ module-info, that is) with the
> possible exception of --enable-native-access to allow JNI/FFM. Few programs
> should have *any* of --add-exports/--add-opens/--patch-module in production
> (although it’s normal in whitebox testing); these are all red flags. Unlike
> a “reasonable” security policy, which is quite complex, the only reasonable
> integrity configuration is the empty one, again, with the exception of
> --enable-native-access; a *minority* of programs may also have -javaagent.
>

It's a great vision statement, but the unfortunate reality is much messier.
Most programs - especially given the current adoption of modules - will
need --add-exports/--add-opens until their dependencies are all fully
modularized, and even then, if today's setAccessible use is any indication,
will continue to use those options.  Additionally, -javaagent is a key
enabler of Observability tooling.  I'd be surprised if only a minority of
programs were deployed with monitoring agents... in fact, I expect that
given the increasing emphasis on Observability, usage will increase,
especially with these tools needing to switch away from dynamic attach.
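
To be fair, the declarative end state you're describing does exist for
modularized code - a qualified opens instead of a blanket --add-opens
(module names here are hypothetical):

    // Hypothetical module declaration: open one package to one named
    // framework module, leaving the rest of the module encapsulated.
    module com.example.app {
        requires com.example.framework;
        opens com.example.app.model to com.example.framework;
    }

My concern is how long the ecosystem takes to get there.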


>
> So it’s not just fine-grained vs. coarse-grained, opt-in vs. opt out, but
> also: the “right” configuration is the default one or one that’s very close
> to it.
>
>
> So while overall, when viewed through the lens of removing the
> SecurityManager, this approach makes sense, I do want to caution against
> betting against Java’s strengths, particularly its use of speculative
> optimizations.
>
> > Neither a person reading the code nor the platform itself – as it
> compiles and runs it – can fully be assured that the code does what it says
> or that its meaning does not change over time as the program runs.
> …..
> > In the Java runtime, certain optimizations assume that conditions that
> hold at the time the optimization is made hold forever.
>
> This is the basis of all speculative optimization - the platform assumes
> the meaning doesn’t change and compiles as though it won’t.  If the
> application is modified at runtime, the JVM applies the necessary
> compensations such as deoptimization and recompilation.
>
> Java has bet on dynamic features time and again (even when others have
> championed static approaches) and those bets - backed by speculative
> optimizations - have paid off time and again.  So this can’t be what you’re
> arguing against.
>
> If the concern is that the runtime behaviour may appear to be different
> than the intent expressed in the source code due to use of setAccessible or
> changes by agents, then I think the JEP should be more explicit about that
> concern.  The current wording reads as equally applying to many of Java’s
> existing dynamic behaviours (and belies the power of speculation coupled
> with deoptimization!).
>
>
> I’m certainly not arguing against the power of speculative optimisation.
> It has worked time and again for Java… except when it doesn’t. For
> example, Valhalla realised that value objects cannot be *just* a
> speculative optimisation, and that a different user-facing model, with
> stricter integrity invariants, is needed.
>

As a member of the Valhalla EG, I can confidently state that many of the
Valhalla requirements come out of the underlying "vm physics" and need to
reflect those tradeoffs in a way that makes sense to developers who aren't
familiar with the ins-and-outs of the core runtime.  Valhalla still bets
hard on speculation - preferring to assume "this won't be null" for most
values rather than hard-coding that into the underlying runtime (see recent
discussions on removing the "Q" descriptor).


> In this JEP, however, I’m mostly hinting at link-time (or, in any event,
> pre-production-runtime) optimisations that may come in Project Leyden. It’s
> not so much the difference between the source code and what ends up running
> that matters, but what some form of analysis (either static or dynamic
> during a trial run) sees vs. what the application may later do. In some
> cases, speculation that falls back on deopt may suffice, but for other,
> “tighter” link-time/pre-run optimisations, it may prove insufficient. The
> platform would need to know that the meaning of the program does not change
> between the time the optimisations are performed and the time the program
> is run.
>

Or it needs to be able to cheaply and quickly validate that the assumptions
made based on the training runs / static analysis continue to hold in this
new run.  Some of those assumptions will hold by fiat while others will
need to be checked.  Enforcing additional integrity checks may make that
analysis easier but is unlikely to remove the need to validate the
assumptions.


>
> As for dynamic features, we need to separate regular reflection — which
> isn’t affected at all — from deep reflection. The two primary uses for deep
> reflection in production are dependency injection and serialization. But
> dependency injection requires only a very controlled form of deep
> reflection — one that is nicely served by Lookups, and the use of deep
> reflection in serialization is considered a mistake that can and should be
> fixed (
> https://openjdk.org/projects/amber/design-notes/towards-better-serialization).
> Until then, the JDK offers special provisions for serialization libraries
> that wish to serialize JDK objects (
> https://github.com/openjdk/jdk/blob/master/src/jdk.unsupported/share/classes/sun/reflect/ReflectionFactory.java).
> There is no reason --add-opens shouldn’t be rare.
>

Has there been any analysis of how common --add-opens actually is?  Or has
the use of setAccessible (as a proxy for --add-opens) been analyzed?  If
that analysis could be shared, it would help to validate the assumptions
being stated here.  I know we've examined common corpora as part of other
JSRs - e.g. to validate how widespread "_" was as a variable name before
restricting it.  Can the same be done here (if it hasn't already)?
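
As an aside, I read "nicely served by Lookups" as the capability-passing
style sketched below - the target module mints a Lookup itself and hands it
to the framework, no --add-opens needed (Injector and the field name are
made up):

    import java.lang.invoke.MethodHandle;
    import java.lang.invoke.MethodHandles;

    public final class Injector {
        // 'ownerLookup' must be created inside the bean's own module
        // (via MethodHandles.lookup()) and passed in explicitly.
        static MethodHandle setterFor(MethodHandles.Lookup ownerLookup,
                                      Class<?> beanClass)
                throws ReflectiveOperationException {
            MethodHandles.Lookup l =
                MethodHandles.privateLookupIn(beanClass, ownerLookup);
            return l.findSetter(beanClass, "value", String.class);
        }
    }

Correct me if you meant something else.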


>
>
> > For example, every developer assumes that changing the signature of a
> private method, or removing a private field, does not impact the class's
> clients.
>
> Right.  The private modifier defines a *contract* which states that anyone
> depending on the implementation details is on their own and shouldn’t be
> surprised by changes.  I understand that it can be problematic when large
> successful frameworks are broken by such changes, but that doesn’t
> invalidate the contract that’s in place.  The risk is higher for the JDK
> than for other libraries or applications given the common dependency on the
> JDK.
>
>
> True, which is why we’re not forcing libraries to be modularised (although
> they may have to be modularised to enjoy some of the features that Project
> Leyden may end up delivering).
>
> But I’ll also say this. What we know *now* that the designers of Java 1.0
> didn’t know is that that contract — at least as far as the JDK goes —
> wasn’t respected, which ended up giving users a bad upgrade experience,
> especially since the rate of the platform’s evolution started rising. We
> can advise library authors not to do something time and again, but they
> care about their own users, as they should, and so justify doing what they
> do. Even though everyone is justified in pursuing their interests, the end
> result has been a tragedy of the commons. As the maintainers of the
> platform, our user base is the entire Java ecosystem as a whole and, as it
> turned out, some regulatory intervention is needed to stop this tragedy of
> the commons.
>

For applications that made the jump to a version > 9, the upgrade from
release to release has been (to my knowledge) fairly smooth apart from
dealing with --illegal-access=deny becoming mandatory.


>
>
> > However, with deep reflection, doSensitiveOperation could be invoked
> from anywhere without an isAuthorized check, nullifying the intended
> restriction; even worse, an agent could modify the code of the isAuthorized
> method to always return true.
>
> And clearly, these would be bugs.  Not much different than leaking a
> privileged MethodHandles.Lookup object outside a Class’s nest (the boundary
> for private access) for which there is no enhanced integrity check.
>
> We can’t fully protect users from code that does the wrong thing, even
> while undertaking efforts to minimize the attack surface.  “Superpowers”
> are exactly that.  While we support making them opt-in, we should be careful
> not to overstate the risk, as the same principle applies to all code running
> in a process - it must be trusted, as it has the same privileges as the
> process.
>
>
> When it comes to security, such bugs are known as vulnerabilities (though
> not necessarily exploits), and we must differentiate between them depending
> on which side of the encapsulation boundary these vulnerabilities lie. If a
> security-sensitive class has a bug that causes it to leak a capabilities
> object, that’s one thing; but if a bug in a serialization library that uses
> a super-powered deep-reflection library allows its inputs to be manipulated
> so that a security-sensitive class is compromised, that’s a whole other
> story.
>
> Strong encapsulation builds bulkheads that allow a sensitive module to be
> analysed *in isolation*, given its well-defined surface area, and robustly
> protected from vulnerabilities in *other* modules. That’s precisely why
> integrity is required for robust security. Obviously, no security
> mechanism is perfect, but strong encapsulation gives the authors of
> security mechanisms a very valuable tool.
>
> While talking about this subject it’s worth mentioning that the Java
> Platform should provide the necessary integrity, but it can’t by itself
> provide sufficient integrity. Some integrity guarantees must also be
> provided by OS mechanisms (say, filesystem and process isolation) and even
> hardware mechanisms (timing side channels, Rowhammer, etc.). To be as
> secure as possible, a security mechanism must rely on the integrity of all
> layers below it.
>

I think we're in the same ballpark here - there's a balancing act between
what the runtime can provide and the risk of running any code on a system.
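
To keep us honest about the invariant in the quoted JEP example: it has
roughly this shape (Vault is a made-up name; the comments note the two
bypasses described above):

    public final class Vault {
        public void request() {
            if (!isAuthorized()) {
                throw new SecurityException("not authorized");
            }
            doSensitiveOperation();
        }

        private void doSensitiveOperation() {
            // privileged work would go here
        }

        private boolean isAuthorized() {
            return false;  // stand-in for a real access check
        }

        // With deep reflection, setAccessible(true) could invoke
        // doSensitiveOperation directly; an agent could rewrite
        // isAuthorized to always return true. Both void the invariant
        // from outside the class.
    }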


>
>
> > A tool like jlink could remove unused strongly-encapsulated methods at
> link time to reduce image size and class loading time.
>
> Most of the benefit here is not time saved by not loading the methods;
> it’s actually due to avoiding the need to load classes during
> verification.  The verifier needs to validate relationships between classes,
> and every extra method potentially asserts new relationships (such as class
> X subclasses Throwable); it is these extra classes needing to be loaded
> that typically increase startup time.
>
>
> Right. I count that as class loading, or startup time.
>
>
> > The guarantee that code may not change over time even opens the door to
> ahead-of-time compilation (AOT).
>
> AOT doesn’t depend on the code never changing.  OpenJ9 has AOT code that
> is resilient in the face of changes to the underlying Java class files.
> I’m positive HotSpot will be able to develop similarly resilient AOT code.
> The cost of validating the assumptions made while AOT compiling is much
> lower than doing the compile, while still enabling Java’s dynamic features.
>
>
> There are different kinds of AOT compilation, and Leyden may allow
> multiple modes. Some may support deoptimisation, and others may not (or may
> even not have class files available to them at all). Given an application
> configuration, we want to know which modes are possible and what link-time
> transformation is needed or possible.
>

> — Ron
>

Thanks,
--Dan