Minor thoughts (Re: [External] : Re: JEP draft: Prepare to Restrict The Use of JNI

Fri Sep 1 12:08:12 UTC 2023

Hi.

The integrity of the platform is very important to us for the multiple reasons that I mentioned before, one of which is precisely because we’d like code that runs on Java N to also run, unchanged, on Java N+5. One of the main causes for that not happening in the past has been a lack, not excess, of integrity. We want to get to a place where most Java programs would, by default, be assured that they’re portable and enjoy all the safety guarantees that Java can offer.

Yes, unfortunately that means introducing some inconvenience. The inconvenience is not the goal, it’s just a necessity, and we’re sorry we have to impose it. We’re doing our best to keep it as minimal as we can.

Moreover, we’re still in the midst of a transition from an era when integrity was not the default but only opt-in (which, sadly, caused a rather big inconvenience to the Java ecosystem) to a position where integrity is the default. Again, we try to keep this inconvenience to a minimum. We’re encouraged to see the positive effect that strengthened integrity by default has already had on the ecosystem.

I realise that encountering this somewhat changed world may be scary for some, but the big improvements we deliver — including FFM, ZGC, records, patterns, virtual threads, JFR, and some other big ones coming soon — should more than make up for that, not to mention very significant performance improvements that you get for free.

If we want to deliver all that so that Java could thrive by adapting to a changing environment — in terms of hardware, deployment, and security — we must make many changes, and they require platform integrity.

However, our analysis may still be wrong. It’s certainly possible that a flag is problematic enough to negate all the benefits we’re delivering. That is why we usually try not to impose new rules right away but to start with warnings. Those warnings help us evaluate what the effects of some change really are and not what we, or anyone else, speculate they would be. If using the flag to disable the warning turns out to be a serious problem, we’ll be able to reevaluate with actual feedback from the field in hand. We’ve followed this process several times over the past few years, and it ended up working well.

BTW, if you have an example of a program that business administration students write and doesn’t work unchanged on JDK 22, I’d love to know what it is. Making Java easier for beginners is another focus area for us (see https://openjdk.org/jeps/445).

Also, I’m afraid that the JRE ceased to exist in JDK 11. You can talk of the JDK or of a Java runtime, but the JRE was a very particular Java runtime (it was a global Java Runtime ENVIRONMENT) with certain features that simply no longer exist. I believe that some vendors now distribute Java runtimes that they call a JRE just to minimise the surprise that the JRE is gone, but I assure you that those runtimes are not JREs, and that the JRE is, indeed, a thing of the past. Java runtimes, however, still offer some of the things the JRE did, so perhaps you are referring to those. Or perhaps you mean the Java 8 JRE, which is still available.

Now, let me try and answer your specific questions:

1. If the "user" is the "application author" who is one who creates a pure Java application and controls its setup, the JRE to use, why is it then necessary to warn Java "application authors" about something that may not be even true, that the JNI module in use would be dangerous and/or a security risk?

First, let’s define what we mean by “integrity”. It is the ability to make a promise (that is then kept). For example, the JDK makes the promise that all instances of String are immutable. This invariant (a property that always holds true) is, in turn, enforced by the integrity invariant that all private fields are only accessible by their declaring class. Both user code and JDK code depend on the invariant that all Strings are immutable for their correctness. However, any library that uses JNI can decide, nah, I will access private fields and I will mutate Strings. Another example of an integrity invariant (and we can argue over whether it’s as important as the first) is that the JDK would like to promise that the JVM process never crashes or that the GC never hangs the application (short of a bug in the JVM). If any JNI-using library is used, a bug in the library may also cause the process to crash or the GC to hang. Yet another invariant is that a program that runs on one version of Java and on one kind of OS/hardware platform is portable to another. This invariant may also not hold when a JNI library is used.

Therefore, when you use a library that employs JNI, you enter a world where most or even all of the promises that the JDK otherwise makes may not hold. This is a different mode where invariants that give Java code its usual meaning at *any* point in the program (not just in code that directly calls the library) may no longer hold, and we want the application author to know that their application is running in such a mode.

Because there are situations in which the JDK can perform better optimisations when its promises can be trusted, it is also important for the author to know whether the application can enjoy such optimisations or not.
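
To make the String example above a little more concrete, here is a minimal sketch that asks the runtime, through ordinary reflection, to open a private instance field of String. It assumes JDK 17 or later started without any --add-opens options; the class and method names are invented for illustration. Under those assumptions the request is expected to be refused, which is the access-control invariant at work. Native code called through JNI is not subject to this check.

```java
import java.lang.reflect.Field;
import java.lang.reflect.InaccessibleObjectException;
import java.lang.reflect.Modifier;

class StringIntegrityDemo {
    /**
     * Returns true if the runtime refused to open a private instance
     * field of String to this (unnamed-module) code.
     */
    public static boolean privateStringFieldIsProtected() {
        for (Field f : String.class.getDeclaredFields()) {
            int mods = f.getModifiers();
            if (Modifier.isPrivate(mods) && !Modifier.isStatic(mods)) {
                try {
                    // Deep reflection: succeeds only if java.lang is open to us.
                    f.setAccessible(true);
                    return false;
                } catch (InaccessibleObjectException e) {
                    return true;
                }
            }
        }
        throw new IllegalStateException("no private instance field found on String");
    }

    public static void main(String[] args) {
        System.out.println("String internals protected: " + privateStringFieldIsProtected());
    }
}
```

On older runtimes, or when the launcher is given an --add-opens option for java.lang, the same call may succeed, which is exactly the kind of explicit grant the application author is supposed to be aware of.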

2. People know that one can kill with a knife, why would you want to warn everyone forcefully that knives can kill people, if the normal, expected usage of a knife is for cutting meat and vegetables? (If an "application author" uses modules from sources she or he does not trust, then such a human would be expected to check thoroughly such modules, if they should be used nevertheless, wouldn't she or he?)

Because the way Java applications are composed means that it’s very difficult for the application’s author to know what their libraries do and what invariants they may break. Typically, the application author picks a few libraries and lists them in a build-tool configuration, but they then require further “transitive dependencies”. That is, an application that’s set up to use five libraries may actually use 50. Not only that, the number and composition of those transitive dependencies may change when any library is upgraded. So the same application that asks for five libraries may end up using 50 today and 60 tomorrow. Can any of those transitive dependencies break the promises that Java makes and the author wants to trust (because, say, they want to ensure their application is portable)? It’s very hard to tell without a full analysis of the code of all the transitive dependencies.

In other words, it’s extremely hard for the author to know whether their application is carrying knives or not, and if so, where to look to see whether it’s handling them safely.

Furthermore, it is impossible for the Java runtime to know whether the knives are used safely or not. That’s the whole point of various unsafe mechanisms, including native code: that they’re free of further safety checks. So because the vast majority of libraries don’t need knives at all, imposing a conservative policy is efficient (BTW, in my country it’s illegal to carry a knife unless you can explain your use).
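
The way a short list of declared libraries grows into a much longer list of actual ones can be sketched as a simple graph traversal. The library names and the dependency graph below are entirely hypothetical; a real build tool also resolves versions and conflicts, which this sketch ignores.

```java
import java.util.ArrayDeque;
import java.util.Deque;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

class TransitiveDeps {
    /** Returns every library reachable from the directly declared ones. */
    public static Set<String> closure(List<String> direct,
                                      Map<String, List<String>> requires) {
        Set<String> seen = new LinkedHashSet<>();
        Deque<String> pending = new ArrayDeque<>(direct);
        while (!pending.isEmpty()) {
            String lib = pending.removeFirst();
            if (seen.add(lib)) {
                // First visit: queue whatever this library itself requires.
                pending.addAll(requires.getOrDefault(lib, List.of()));
            }
        }
        return seen;
    }

    /** Hypothetical graph: one declared library, several hidden ones. */
    public static Set<String> demo() {
        Map<String, List<String>> requires = Map.of(
            "web-framework", List.of("json-binding", "logging"),
            "json-binding", List.of("fast-reflection"),
            "fast-reflection", List.of("native-helper"));
        return closure(List.of("web-framework"), requires);
    }

    public static void main(String[] args) {
        Set<String> all = demo();
        System.out.println(all.size() + " libraries in use: " + all);
    }
}
```

In this made-up graph the author declared one library and ended up with five, one of which ("native-helper") is the sort of library they would want to know about.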

3. For an application author to learn about JNI being used in some of the modules it would be sufficient to get a warning/information when running a tool that would tell her or him. Such an application author would be able to learn from module-info entries (maybe via a tool) which modules document their use of JNI. In such a scenario it then would be probably acceptable that in the case that if modules at runtime use JNI without having this usage documented in their module-info that then a warning gets issued to make the application author aware of it (and only if the module is at least at the class file level of the Java version that introduced this feature and should have been aware of this newly introduced documentation obligation).

As the JEP states, we would like to offer modular libraries the ability to declare their use of JNI, or, more precisely, to declare that they request such permission, but the application would still need to grant that permission, because the whole point is that a library should not be allowed to decide for itself in which mode the entire application is running: one where promises are kept or one where they may not be. And because of the way applications are assembled as I described above, requiring the user to analyse all of their transitive dependencies over and over is placing too much of a burden on them. The goal of this JEP is to make the life of those who want to enjoy integrity (which is the majority of applications) much easier, and only impose a very slight inconvenience on those who want to grant some libraries superpowers.
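
As a rough illustration of the division of labour described above, the sketch below shows library-style code that loads a JNI library, with the grant itself living on the application's command line rather than in the code. The --enable-native-access spelling is taken from the JEP; the jar name, main class, and library name are hypothetical, and how loudly a given JDK release warns or refuses depends on the release and on how it is launched.

```java
class NativeAccessDemo {
    // Hypothetical launch command in which the application author grants
    // native access to all class-path code:
    //
    //   java --enable-native-access=ALL-UNNAMED -cp app.jar com.example.Main

    /** Tries to load a JNI library by name and reports the outcome. */
    public static String tryLoad(String libraryName) {
        try {
            System.loadLibrary(libraryName);
            return "loaded";
        } catch (UnsatisfiedLinkError e) {
            // No such library on java.library.path.
            return "not found";
        } catch (IllegalCallerException e) {
            // Assumption: only seen on runtimes configured to deny native access.
            return "denied";
        }
    }

    public static void main(String[] args) {
        System.out.println(tryLoad("no_such_library_for_demo"));
    }
}
```

The point of the sketch is only that the library code is identical in both modes; the decision about which mode the application runs in is made once, by whoever launches it.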

3. What integrity guarantees does the JDK give that a JNI author would want to intentionally and forcefully break with the intent to harm the JDK? (And if a JNI author would do that intentionally to harm the JDK then she or he can be traced down and made accountable for it.)

If only things were that simple (and it’s all explained in the integrity JEP, BTW). Let me speak not about JNI specifically but about all unsafe mechanisms in the JDK, because the reasons and effects are the same. Some libraries want, for performance reasons, to do away with the JDK’s invariant that all newly allocated memory, either on heap or off heap, is initialised. A bug in the *use* of such a library may cause your secret keys to be sent over the wire. What’s even more interesting is that such libraries are rarely used directly: you use some high-level library that, under the covers, uses a low-level library that allocates memory without initialising it. Now, note how complicated the situation has become. The low-level library is not at fault: yes, it breaks an invariant, but it warns its users to use the library correctly. The high-level library has a bug in its use of the low-level library, but the high-level library itself never breaks any invariants. Finally, the application never even asked for a library that breaks JDK invariants.
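
For reference, the initialisation invariant mentioned above is easy to observe with the standard APIs. The following is a minimal sketch with invented class and method names: it allocates memory on heap (a byte array) and off heap (a direct ByteBuffer) and checks that every byte starts out as zero, so no stale data from a previous use can leak through.

```java
import java.nio.ByteBuffer;

class ZeroedMemoryDemo {
    /** Checks that a freshly allocated heap array contains only zeros. */
    public static boolean heapIsZeroed(int size) {
        byte[] heap = new byte[size];
        for (byte b : heap) {
            if (b != 0) return false;
        }
        return true;
    }

    /** Checks that a freshly allocated direct (off-heap) buffer contains only zeros. */
    public static boolean offHeapIsZeroed(int size) {
        ByteBuffer direct = ByteBuffer.allocateDirect(size);
        while (direct.hasRemaining()) {
            if (direct.get() != 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println("heap zeroed: " + heapIsZeroed(4096));
        System.out.println("off-heap zeroed: " + offHeapIsZeroed(4096));
    }
}
```

A library that skips this zeroing gains a little speed, and in exchange every caller up the dependency chain has to be careful never to read or transmit bytes it has not written first.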

And if you think that’s the worst of it, well, get a load of this. Some serialization libraries, say, those that can deserialise JSON, need to set private (perhaps even final) fields and instantiate objects without calling a constructor. This, of course, requires breaking access control, the invariant that supports all others. So they use a low-level library with superpowers that break access control. So far, everyone has the best intentions. However, the JSON library deals with input that arrives at the application from the outside world, and that input determines which fields it sets and which objects it instantiates without a constructor. A vulnerability in that innocent library can direct its action toward critical JDK classes, and because the operation is done by using a super-powered low-level library, there are no longer any access checks in its way. So, everyone here is well meaning (the application and both libraries), and the vulnerability is not in the super-powered library. But now a vulnerability in one library can have a catastrophic effect because there is some transitive dependency with superpowers.

It can actually get worse and more complicated than that (the library with the vulnerability need not even use the superpowered library). 
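
The deserialisation scenario above can be sketched in a few lines. This is a deliberately naive, hypothetical binder, not the code of any real JSON library: it lets the input choose which private field to overwrite, and it uses plain reflection on the application's own class, where real libraries rely on more powerful mechanisms to reach places that plain reflection is refused.

```java
import java.lang.reflect.Field;
import java.util.Map;

class NaiveBinderDemo {
    /** A class whose constructor enforces an invariant: balance is never negative. */
    static class Account {
        private long balance;

        Account(long balance) {
            if (balance < 0) throw new IllegalArgumentException("negative balance");
            this.balance = balance;
        }

        long balance() { return balance; }
    }

    /** Copies each input entry into the field of the same name, skipping all validation. */
    public static void bind(Object target, Map<String, Object> input)
            throws ReflectiveOperationException {
        for (Map.Entry<String, Object> entry : input.entrySet()) {
            // The *input* decides which field is touched.
            Field field = target.getClass().getDeclaredField(entry.getKey());
            field.setAccessible(true);
            field.set(target, entry.getValue());
        }
    }

    /** Returns the balance after binding hostile input to a valid Account. */
    public static long demo() {
        Account account = new Account(100);
        try {
            bind(account, Map.of("balance", -1_000_000L));
        } catch (ReflectiveOperationException e) {
            throw new IllegalStateException(e);
        }
        return account.balance();
    }

    public static void main(String[] args) {
        System.out.println("balance after binding: " + demo());
    }
}
```

Nobody in this sketch is malicious, yet the constructor's check no longer means anything once input-driven code can write fields directly.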

4. If a module employs JNI without reporting it that may be regarded as not behaving like a good citizen in the modular Java land and hence reporting the fact that module xyz employs JNI without telling in its module-info at runtime.

First, most libraries aren’t modular. Second, we’ve tried a social approach for many years, and it doesn’t work. I mentioned before that one of the most important invariants is portability: if a program works on JDK N, it would work on JDK N+5 (with very high probability). What happened before we started having integrity by default is that libraries — for their own good reasons — reached into non-standard JDK internals. These may have been a few low-level libraries, but more high level libraries used them, and even more applications used those. The result was that when JDK 9 was released (which made many internal changes but *no* changes to access) a lot of applications broke. All they knew was that a change to Java broke their program. Maybe the information that “using this may make you non-portable” was never published by the low level libraries. Maybe it was but it wasn’t repeated by the high level libraries. Maybe it was but application developers didn’t notice or didn't. Whatever the reason, the result was that “Java broke backward compatibility” even though you may think that everyone made a conscious choice to be not portable (and in reality, the application authors certainly didn’t).

Whoever is to blame, we want to offer a way for this not to happen again, one that for most programs (and for the vast majority of *new* programs) means they need to do nothing. Maybe modules will allow us to do something cleverer for modular libraries someday, but because most libraries and applications aren’t modularised, we need to offer them integrity by default, too.

We fully realise that integrity is a very complex subject, which is why so much work has been put into making it simple, which, in this case, requires nothing more than a flag. The goal is to balance the needs of the majority of applications, which can and should enjoy integrity, and the minority that needs somewhat less integrity. Furthermore, we’d like applications that employ the class path (the majority) and not just the module path to also be able to enjoy integrity. Previously, integrity was opt-in, and it required such complex configuration that few could ever get it right. Now we need it to be the default, and the minority of applications that need to opt out can do so easily. The majority will be able to rest easy that their programs are portable and don't crash.

— Ron