Please stop incrementing the classfile version number when there are no format changes

Mon Oct 14 19:55:10 UTC 2019

On Mon, Oct 14, 2019 at 1:26 PM Mike Hearn <mike at plan99.net> wrote:

> software reliably stop working on schedule 'just in case'. ClassGraph is
> commonly used just to locate classes that have an annotation on them after
> all, and as far as I know all planned format changes would still allow
> this task to be done. Yes, people might annotate new class-like-things such
> as values and confuse old ClassGraphs; that's an ordinary sort of bug and
>

Yes, this is the crux of it. Brian's claims are all correct and unarguable
from a purist point of view; however, most people cannot afford to operate
in the realm of purism because of the realities of the JVM ecosystem.

99% of classfile parsing is for only one of the following two use cases:

(1) Locating classes, fields, or methods, that have a specific annotation.
(2) Locating subclasses of a given class, or classes that implement a given
interface.

The JDK *does not help* people with either of these needs, so libraries
like ClassGraph have been created to provide these functions. ClassGraph
does a lot more than this, but few users do anything more advanced. (The
main reason ClassGraph is often picked these days over other libraries is
that it robustly supports classpath and module path scanning across the
extreme mess of different classpath / module path specification
mechanisms, something that the JVM also does not help programmers come to
grips with:
https://github.com/classgraph/classgraph/wiki/Classpath-Specification-Mechanisms
)

And here is the critical point: Programmers who need these capabilities
*simply need these functionalities to keep working, period* -- they don't
care what other exotic features the JVM might choose to add ("nest mates"?
How did that term even get rubberstamped?) -- they just need to find their
classes, fields or methods using annotations, superclasses or interfaces,
*and they need this to never break*. They don't plan to change their whole
codebase to make use of entirely new paradigms that would break the way the
old code works.

This is where the numbers Brian gave are so out of touch with reality:

> And have also been three versions (5, 7, and 11) that added new _constant
> pool forms_.  That’s almost 25%!  Not rare at all.  (If your website failed
> on 25% of requests, you’d not get away with calling that “rare failures”,
> you’d be out of business.)

We're coming up on 24 years since Java's initial 1.0 release. So three
breakages in 24 years that would break any scanner (like ClassGraph) that
ignores the classfile version number averages out to one breakage (due to
constant pool changes) every 8 years. (I previously asserted this might
happen once every 10 years.) But over the next 24 years, assuming the 6
month cadence continues, any library that simply throws an exception when
it encounters a new classfile version number (as Brian has suggested is the
only reasonable thing to do) will break once every 6 months, or (24 * 2 +
1) = 49 times over the same period. What Brian has proposed would lead to
breakage 49 / 3 = roughly 16x as often on average. So you'd be "out of
business" about 16x faster if you followed this suggestion and simply
balked at new version numbers.
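
The comparison is easy to rerun. Here is a minimal sketch that uses only
the figures quoted in this thread (24 years, three constant-pool changes, a
6 month release cadence) and compares like with like: breakages per 24
years on both sides. The class and method names are made up for
illustration.

```java
public class BreakageRate {
    // Releases in a span of years, counting the release at both ends of the span.
    public static int releases(int years, int releasesPerYear) {
        return years * releasesPerYear + 1;
    }

    // How many times more often the strict strategy breaks than the lenient one.
    public static double ratio(int strictBreakages, int lenientBreakages) {
        return (double) strictBreakages / lenientBreakages;
    }

    public static void main(String[] args) {
        int years = 24;
        int constantPoolChanges = 3;             // versions 5, 7 and 11, per Brian's count
        int newVersions = releases(years, 2);    // one new version number every 6 months

        System.out.println("lenient scanner: one breakage every "
                + (double) years / constantPoolChanges + " years");
        System.out.println("strict scanner:  " + newVersions
                + " breakages in " + years + " years");
        System.out.println("strict / lenient: " + ratio(newVersions, constantPoolChanges));
    }
}
```

Dropping the "+ 1" (counting 48 releases rather than 49) changes the ratio
only slightly, so the conclusion does not hinge on how the endpoints are
counted.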

I stand by my assertion that it is completely reasonable, even desirable,
for a library to parse only the subset of a classfile that it understands,
and ignore everything else, with the understanding that when things do
break, you get to keep the pieces (and the library maintainers must take
responsibility to fix their library and push out a new version when that
does happen). Even the method and field attributes part of the classfile
spec is designed with exactly this in mind: you just skip attributes you're
not interested in. I think it is very shortsighted to outright reject
making a small update to the classfile format to enable this sort of
selective parsing for the entire classfile. But I have said all I think I
can say on the issue.
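
To make the attribute-skipping point concrete, here is a minimal sketch of
a lenient reader that pulls out only the superclass and interface names and
steps over every attribute by its declared length. It is not ClassGraph's
code; the class name SubsetReader and its members are hypothetical, the
constant pool entry sizes are the ones in the current JVM specification,
and Utf8 entries are decoded as plain UTF-8 (an approximation of the
spec's modified UTF-8 that is fine for ordinary class names).

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.List;

// Hypothetical lenient reader: extracts only the superclass and interface names.
public class SubsetReader {
    public final int majorVersion;
    public final String superclass;                 // null only for java/lang/Object
    public final List<String> interfaces = new ArrayList<>();
    public int skippedAttributes;                   // attributes stepped over unread

    public SubsetReader(byte[] classfile) {
        ByteBuffer in = ByteBuffer.wrap(classfile); // big-endian by default
        if (in.getInt() != 0xCAFEBABE) throw new IllegalArgumentException("not a classfile");
        in.getShort();                              // minor_version
        majorVersion = in.getShort() & 0xFFFF;      // recorded, deliberately not checked
        int cpCount = in.getShort() & 0xFFFF;
        String[] utf8 = new String[cpCount];
        int[] classNameIdx = new int[cpCount];
        for (int i = 1; i < cpCount; i++) {
            int tag = in.get() & 0xFF;
            switch (tag) {
                case 1: {                           // Utf8: u2 length + bytes
                    byte[] b = new byte[in.getShort() & 0xFFFF];
                    in.get(b);
                    utf8[i] = new String(b, StandardCharsets.UTF_8);
                    break;
                }
                case 7: classNameIdx[i] = in.getShort() & 0xFFFF; break;  // Class
                case 8: case 16: case 19: case 20: skip(in, 2); break;    // String, MethodType, Module, Package
                case 15: skip(in, 3); break;                              // MethodHandle
                case 3: case 4: case 9: case 10: case 11: case 12:
                case 17: case 18: skip(in, 4); break;                     // Integer, Float, refs, NameAndType, (Invoke)Dynamic
                case 5: case 6: skip(in, 8); i++; break;                  // Long, Double occupy two slots
                default:
                    // Entries carry no length, so an unknown tag cannot be skipped.
                    throw new IllegalArgumentException("unknown constant pool tag " + tag);
            }
        }
        skip(in, 4);                                // access_flags, this_class
        int superIdx = in.getShort() & 0xFFFF;
        superclass = superIdx == 0 ? null : utf8[classNameIdx[superIdx]];
        int interfaceCount = in.getShort() & 0xFFFF;
        for (int i = 0; i < interfaceCount; i++) {
            interfaces.add(utf8[classNameIdx[in.getShort() & 0xFFFF]]);
        }
        skipMembers(in);                            // fields
        skipMembers(in);                            // methods
        skipAttributes(in);                         // class-level attributes
    }

    private void skipMembers(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;
        for (int i = 0; i < count; i++) {
            skip(in, 6);                            // access_flags, name_index, descriptor_index
            skipAttributes(in);
        }
    }

    private void skipAttributes(ByteBuffer in) {
        int count = in.getShort() & 0xFFFF;
        for (int i = 0; i < count; i++) {
            in.getShort();                          // attribute_name_index, never looked up
            skip(in, in.getInt());                  // u4 attribute_length, then that many bytes
            skippedAttributes++;
        }
    }

    private static void skip(ByteBuffer in, int n) {
        in.position(in.position() + n);
    }

    // Loads the bytes of a top-level class; relies on readAllBytes (Java 9+).
    public static byte[] bytesOf(Class<?> c) {
        try (InputStream s = c.getResourceAsStream(c.getSimpleName() + ".class")) {
            if (s == null) throw new IllegalArgumentException("no classfile for " + c);
            return s.readAllBytes();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        SubsetReader r = new SubsetReader(bytesOf(ArrayList.class));
        System.out.println(r.majorVersion + " " + r.superclass + " " + r.interfaces);
    }
}
```

The contrast between the two loops is the whole argument in miniature.
skipAttributes never needs to know what an attribute means, so new
attributes cannot break it. The constant pool switch has no length field to
fall back on, so a new tag is fatal no matter how little of the file the
caller actually wanted.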

Mike's point about only bumping the version number for semantic changes is
orthogonal to my request for "subset-parseability", but I also think it is
an entirely reasonable request: it would reduce the frequency of breakage
for libraries like ASM that must parse 100% of a classfile in a
semantically correct way, and so must throw an exception for unknown
classfile formats.