Please stop incrementing the classfile version number when there are no format changes

Mike Hearn mike at plan99.net
Sat Oct 12 12:07:35 UTC 2019


Alan, after Alexey followed up with me off-list, the SSL issues we've hit
have either been marked for backport or turned out to already be in the
process of being backported (my original information was third-hand, via a
vendor). Here they are:

https://bugs.openjdk.java.net/browse/JDK-8225745
https://bugs.openjdk.java.net/browse/JDK-8217610
https://bugs.openjdk.java.net/browse/JDK-8223940

Everyone on this thread seems to understand the problems the ecosystem
faces and the consequent impact on testing of non-LTS releases.
Disagreement is about solutions.

I'd like to propose these incremental changes. Then I'll respond to the
various objections made so far:

   1. (again) Only increment the version if something actually changed.
   2. Make javac emit bytecode with the lowest possible class version for
   what it contains, not the current target version.
   3. Make javac default to a target version of N-1 (or the previous LTS),
   thus requiring developers to opt-in if they want to use features that
   depend on new bytecode features.

These changes are so small they wouldn't create any resourcing problems for
OpenJDK maintainers. But they'd reduce (not eliminate) upgrade pain for
users and make testing OpenJDK EA releases much more feasible. You could
upgrade your JDK and your existing codebases would still compile and run.
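
To make the mechanics concrete, here's a minimal illustrative sketch (not
OpenJDK code): the classfile major version is tied mechanically to the
release number (major = release + 44, so Java 8 -> 52, Java 11 -> 55, Java
13 -> 57), which means the number goes up every six months even when, as in
Java 13, the format itself didn't change. Proposal 1 is simply: don't bump
it in that case.

```java
// Sketch: how the classfile major version tracks the release number.
// The "+ 44" offset is the historical mapping (Java 1.2 -> 46 ... Java 13 -> 57).
public class MajorVersions {
    static int majorVersionFor(int javaRelease) {
        return javaRelease + 44; // e.g. 8 -> 52, 11 -> 55, 13 -> 57
    }

    public static void main(String[] args) {
        System.out.println(majorVersionFor(8));  // 52
        System.out.println(majorVersionFor(13)); // 57
    }
}
```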

Objections so far:

a. The classfile policy hasn't changed. Well, OpenJDK's expectations of its
users *have* changed. Keeping Java 8 era policies will result in users
keeping Java 8 era behaviours and simply ignoring the attempt to release
faster. That makes sense: if the upgrade cost is the same as it used to be
but the benefits of each upgrade are smaller, why not just wait for the
accumulated benefits to start matching the costs?

b. The format will change every release anyway. Combined with proposals (2)
and (3), the cost of a classfile change should be limited to the users who
need it. The only way that's not true is if the translation of basic Java
constructs keeps changing every 6 months such that literally every file is
affected, and the benefits of this ever-shifting translation are so large
that users keep opting in. If so, then congratulations: the velocity of the
ecosystem must be phenomenal.

But it's more likely there'll be other releases like Java 13 where nothing
actually changes, or releases like Java 8, where only files using the new
features need to change. For those sorts of releases users could benefit.

Now let's consider a world where the format does change in important ways
with every release. That's certainly your call to make. But with the
current way the ecosystem works that will just result in the same outcome
as (a): users will reject the faster release cycle. It's a
features/community testing tradeoff.


c. LTS exists for users that need a lot of stability. I don't think this
dichotomy really exists. By day I work for a firm that makes "enterprise"
Java software and by night I do hobbyist Java stuff (when I get the energy
at least). But I use the same tools and libraries for both. The ecosystem
isn't split along slow/fast lines. So when I watch tools like Groovy,
Gradle and ClassGraph break during the day, I'm not filled with desire to
track down compatibility bugs in my limited coding time at night. Nor are
the developers of those tools motivated to do so just for non-paying
hobbyist/fast-stream devs like me. We'll just stick with Java 8 or Java 11
in both contexts as they work well enough. I think most Java developers are
like me in this regard.

Take this example. When Java 10 came out I wanted to try JPMS with Kotlin
in a hobby project. But that didn't work: Kotlin's stdlib wasn't
modularised. I investigated why not and it turned out JetBrains had tried
but hit a critical problem - Android had a tool that barfed when trying to
parse module-info.class, because of course that file isn't a class, it's a
module. So modularising anything used by Android devs was impossible.
Google eventually fixed it, but it was a low-priority issue, Android devs
have long upgrade latency, and nobody wants to release a modular JAR until
most Android devs have upgraded.

In turn that caused JetBrains to punt modularisation of the Kotlin
ecosystem until Kotlin 1.4, their next major release, which added even more
latency. And because modules can only depend on other modules, the entire
Kotlin ecosystem can't easily adopt JPMS, and thus jlink, ensuring there was
mostly no reason to upgrade to Java 9+. Kotlin 1.4 still hasn't happened
nearly two years later:

https://youtrack.jetbrains.com/issue/KT-21266

(there's some experimental stdlib version that has it, but nobody uses it).

The entire thing could have been avoided by simply calling the file
"info.module" instead, which would have avoided exposing old tools to a
classfile format change. How important was the file name
"module-info.class" to you?
Important enough to block adoption of JPMS for over two years? That's the
cost every time this number increments.

d. The right fix is to {add java.lang.bytecode, length-prefix all fields}.
That would fix it, but only because it'd void the current policy by the
back door. Late-bound OOP like Java enabled the industry to scale to
enormous dependency graphs because it enabled a lot of compatibility.
Objects encapsulate data such that old code can be given a reference to a
"new" object, store it, use it, pass it into callbacks etc without the new
semantics or data getting lost along the way. To preserve the current
policy would require an API like this:

class BytecodeProcessor {
    static final int CURRENT_VERSION = 13;
    BytecodeProcessor() {}

    public static BytecodeProcessor instance(int clientVersion) {
        // Reject any client built against an older version outright.
        if (clientVersion < CURRENT_VERSION)
            throw new IllegalStateException("Client too old");
        return new BytecodeProcessor();
    }
}

Nobody writes APIs this way because breaking backwards compatibility on
every release is developer-hostile and hardly ever necessary. New features
can usually be added in ways that don't disturb or confuse old code. For
sure a new bytecode processor API would work like normal, implicitly ending
the current policy, but the policy and format can also be changed to have
the same impact.
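
As a sketch of the length-prefix idea mentioned at the start of this
objection (a hypothetical format, not the real classfile layout): if every
record carries its own byte length, an old reader can skip constructs it
doesn't understand instead of aborting on an unrecognised version, which is
exactly the forward compatibility the current format lacks.

```java
import java.nio.ByteBuffer;

// Sketch of a length-prefixed encoding: each record is (tag, length, payload).
// A reader that doesn't know a tag can still skip its payload safely.
class LengthPrefixedReader {
    // Counts the records present, skipping unknown tags rather than failing.
    static int countRecords(ByteBuffer buf) {
        int count = 0;
        while (buf.remaining() >= 8) {
            int tag = buf.getInt();             // record type, possibly unknown
            int len = buf.getInt();             // byte length of the payload
            buf.position(buf.position() + len); // skip payload either way
            count++;
        }
        return count;
    }
}
```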

So on reflection last night I concluded that whilst a new official bytecode
API would be great, it's orthogonal to the current discussion.

e. Performance. JAR files are very inefficient already. They're one of the
most commonly thrown out parts of the platform (e.g. in jlinked images).
People use Java for productivity first and performance second, so this
doesn't seem especially compelling and yes, I know about the efforts to
micro-optimise startup time ... inability to upgrade hurts more.

f. Can just drop a new ASM into the classpath. Doesn't work because so many
projects fat-JAR or shade ASM, and our project uses ClassGraph too. The
general point isn't actually about ASM; that was just the proximate cause
of pain. The point is that Java tries hard to let code handle code/data
"from the future" by doing lots of late linking, JIT inlining etc, and the
ecosystem depends on this to scale. But the classfile format doesn't follow
this philosophy and is now a scaling bottleneck for software sizes.
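
The failure mode is mechanical. The major version sits at byte offsets 6-7
of every .class file, and parsers like ASM or ClassGraph refuse files whose
major version is newer than they were built for, even if nothing after the
header changed. A minimal sketch of that check (illustrative, not ASM's
actual code):

```java
// Sketch: reading the classfile header that gates tool compatibility.
// Layout: bytes 0-3 magic 0xCAFEBABE, 4-5 minor version, 6-7 major version
// (all big-endian, per the JVM spec's ClassFile structure).
public class ClassfileVersion {
    static final int MAGIC = 0xCAFEBABE;

    static int majorVersion(byte[] classfile) {
        if (classfile.length < 8)
            throw new IllegalArgumentException("truncated header");
        int magic = ((classfile[0] & 0xFF) << 24) | ((classfile[1] & 0xFF) << 16)
                  | ((classfile[2] & 0xFF) << 8)  |  (classfile[3] & 0xFF);
        if (magic != MAGIC)
            throw new IllegalArgumentException("not a classfile");
        return ((classfile[6] & 0xFF) << 8) | (classfile[7] & 0xFF);
    }

    public static void main(String[] args) {
        // Header bytes of a class compiled for Java 13 (major version 57).
        byte[] header = {(byte) 0xCA, (byte) 0xFE, (byte) 0xBA, (byte) 0xBE,
                         0, 0, 0, 57};
        System.out.println(majorVersion(header)); // 57
    }
}
```

A tool hard-coded to accept only majors it knows about breaks on this number
alone, which is why a bump with no format change still breaks the ecosystem.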

g. Why not just jlink the JDK you need? Static linking the JDK works at the
app level but not at the library level, and a significant number of Java
developers are library developers. We need dependency upgrade cost to be
kept under control.

Finally, there's a more general observation to make on backwards
compatibility. In the 1990s Microsoft became famous for bending over
backwards to keep running code, even if it was buggy or totally
unreasonable. With the internet this intensity slackened, because vendors
could now push new versions to users, but only a little bit - companies
still go bankrupt and projects still become unmaintained. Java today is
backwards
compatible in theory but not in practice. Every release breaks something,
often something fundamental like our build systems. If OpenJDK is serious
about wanting users to upgrade every six months then the backwards
compatibility philosophy will need to change to be like Microsoft's: remove
the constructor *after* Groovy accepted a patch to stop using it, not
before.

The alternative is that the ecosystem fragments into isolated islands like
Linux distros: bundles of dependencies+JDKs that are tested to work well
together, but which aren't really compatible with each other. I'd really
hate to see that; it's an avoidable future.


More information about the jdk-dev mailing list