Please stop incrementing the classfile version number when there are no format changes

Uwe Schindler uschindler at apache.org
Mon Oct 14 21:54:28 UTC 2019


Hi,

I was following this thread for some time and also wanted to give my comment. I am the developer of the static analysis tools "forbidden-apis", which is well known in the Apache community, as it allows to scan class files for calls to classes/methods or fields on a list of forbidden signatures (somehow the inverse of the animal sniffer). You can find it there: https://github.com/policeman-tools/forbidden-apis - The main purpose is to prevent library authors from doing String#toLowerCase() without locale or opening Readers/Writers without charset.

> > software reliably stop working on schedule 'just in case'. ClassGraph is
> > commonly used just to locate classes that have an annotation on them
> after
> > all, and as far as a I know all planned format changes would still allow
> > this task to be done. Yes, people might annotate new class-like-things such
> > as values and confuse old ClassGraphs; that's an ordinary sort of bug and
> >
> 
> Yes, this is the crux of it. Brian's claims are all correct and unarguable
> from a purist point of view; however, most people cannot afford to operate
> in the realm of purism because of the realities of the JVM ecosystem.
> 
> 99% of classfile parsing is for only one of the following two usecases:
> 
> (1) Locating classes, fields, or methods, that have a specific annotation.
> (2) Locating subclasses of a given class, or classes that implement a given
> interface.

forbiddenapis is a classical one that just needs to parse classes and complain when a signature is found somewhere in the bytecode (as invoke, invokedynamic, handle in constant pool,...). This tool does not transform class files, so it's perfectly fine when it does not understand the whole class file.

> And here is the critical point: Programmers who need these capabilities
> *simply need these functionalities to keep working, period* -- they don't
> care what other exotic features the JVM might choose to add ("nest mates"?
> How did that term even get rubberstamped?) -- they just need to find their
> classes, fields or methods using annotations, superclasses or interfaces,
> *and they need this to never break*. They don't plan to change their whole
> codebase to make use of entirely new paradigms that would break the way
> the
> old code works.

I fully agree. The forbiddenapis scanner just needs to detect signatures, if it for some reason misses a signature, then we need an update of the tool, but important: it still works and does not fail horribly.Semantic changes in the static analysis tool were needed when Java 7 came out, because I had to scan the constant pool for Handles to methods (and the code was changed a bit to better handle lambdas, so line numbers were correct and the synthetic methods are not exposed).

But in general the tool still worked fine. When nestmates came - nothing needed to be changed. Update of ASM just to get around the clasfile version limitation was enough (and adding the empty visitor methods for nestmates).

> [...]
> I stand by my assertion that it is completely reasonable, even desirable,
> for a library to parse only the subset of a classfile that it understands,
> and ignore everything else, with the understanding that when things do
> break, you get to keep the pieces (and the library maintainers must take
> responsibility to fix their library and push out a new version when that
> does happen). Even the method and field attributes part of the classfile
> spec is designed with exactly this in mind: you just skip attributes you're
> not interested in. I think it is very shortsighted to outright reject
> making a small update to the classfile format to enable this sort of
> selective parsing for the entire classfile. But I have said all I think I
> can say on the issue.

Actually the problem here is NOT the bumping of classfile versions, the problem we are discussing here is more a problem for the bytecode libries, especially ASM. So I full agree with the JDK people: Raise the classfile version with every version, I have no problem with that. The problem is somethis else: Currently ASM barfs out if it uses ClassReader and the version number is wrong!

The most funny thing with that is that the IllegalArgumentException thrown does not even have a message - which I don't understand at all. People use old ASM versions and then an IllegalArgumentException is thrown without a message! I reported this several times to ASM and the answer by Eric was always like this: "This tool must be small, so we add no text strings not even for error messages." BUMMER.

> Mike's point about only bumping the version number for semantic changes is
> orthogonal to my request for "subset-parseability", but I also think this
> is an entirely reasonable request to make, and a sensible suggestion, in
> order to reduce the frequency of breakage for libraries like ASM that must
> parse 100% of a classfile in a semantically correct way, so must throw an
> exception for unknown classfile formats.

The trick I did for forbiddenapis is to add some wrapper around asm's ClassReader that just patches the bytecode before sending it to ASM:

https://tinyurl.com/y285cm6e

The trick was to just "patch" the class file while ASM reads it and replace the class version by the last one ASM known. This was a workaround I used several times during the lifetime, especially because ASM took very lonk until new version was realeased after new Java version came out.

IMHO, ASM should add an option that you can still parse a class file by ignoring the class file version *optionally* (not by default), so tools like forbiddenapis or classpath scanners extracting annotations can work. If ASM is really not able to parse over unknown attributes (like the constant pool entries) then it will barf out later. But as long as it can parse the file to the end, user is happy. And code works until the next big breaking classfile  change. Maybe once you open a ClassReader with that special flag "ignore unknown class version" the reader should be marked as "tainted" and passing a tainted (+filtered) ClassReader to a ClassWriter should be prevented by ASM. Problem solved, all happy!

Mabye Rémi Forax has some comment about this proposal. I'd like to open an issue on the OW2 bugtracker, but because of my previous experience about that I am now willing to try it again. So I brought the issue up here, where it has more people looking at it.

Finally: ASM currently also violates its own principles: The current ASM version (7.2) has support for the Java 14 classfile version. Now the question is: What happens if during Java 14 development the classfile format gets a change or new feature? Rémi -> you did that change in ASM - please explain! If somebody transforms a class file he would create broken bytecode!

Uwe



More information about the jdk-dev mailing list