Please stop incrementing the classfile version number when there are no format changes

Uwe Schindler uschindler at apache.org
Wed Oct 16 14:51:10 UTC 2019


Hi Rémi,

> > I was following this thread for some time and also wanted to give my
> comment. I
> > am the developer of the static analysis tools "forbidden-apis", which is well
> > known in the Apache community, as it allows to scan class files for calls to
> > classes/methods or fields on a list of forbidden signatures (somehow the
> > inverse of the animal sniffer). You can find it there:
> > https://github.com/policeman-tools/forbidden-apis - The main purpose is
> to
> > prevent library authors from doing String#toLowerCase() without locale or
> > opening Readers/Writers without charset.
> >
> >> > software reliably stop working on schedule 'just in case'. ClassGraph is
> >> > commonly used just to locate classes that have an annotation on them
> >> after
> >> > all, and as far as a I know all planned format changes would still allow
> >> > this task to be done. Yes, people might annotate new class-like-things
> such
> >> > as values and confuse old ClassGraphs; that's an ordinary sort of bug
> and
> >> >
> >>
> >> Yes, this is the crux of it. Brian's claims are all correct and unarguable
> >> from a purist point of view; however, most people cannot afford to
> operate
> >> in the realm of purism because of the realities of the JVM ecosystem.
> >>
> >> 99% of classfile parsing is for only one of the following two usecases:
> >>
> >> (1) Locating classes, fields, or methods, that have a specific annotation.
> >> (2) Locating subclasses of a given class, or classes that implement a given
> >> interface.
> >
> > forbiddenapis is a classical one that just needs to parse classes and
> complain
> > when a signature is found somewhere in the bytecode (as invoke,
> invokedynamic,
> > handle in constant pool,...). This tool does not transform class files, so it's
> > perfectly fine when it does not understand the whole class file.
> 
> 
> forbiddenapis is special because it only need to scan the constant pool, it's
> not a typical application that consumes/produces bytecodes.
> In my opinion, using ASM to implement forbiddenapis is a mistake because
> ASM hides the constant pool and only visit the constant items that has a
> reference in the code.
> If javac does a bytecode transformation like replacing a constant by its value,
> ASM will not see the reference because in the bytecode you only see the
> value.
> And it's may be worst with if JEP 303 is implemented, which is a restricted
> form of a macro system, where you transfer part of the calculation from the
> runtime to the compile time.

These are known issues, but actually forbiddenapis also parses the actual bytecode, not only the constant pool entries, because you would need some connection where the method signatures are used in the code (like presenting line numbers or method names in error output). But sure parsing the whole constant pool for class constants or method handles would be fine to find the violations about forbidden apis.

I agree for forbidden fields we already have this problem. Javac replaces some static final constants like integers and strings already by their values, so you won't see the reference in the class file anymore. But as forbiddenapis is mostly about method calls, that's not an urgent issue.

JEP 303 will be an issue if it's used everywhere, but I don't think it will change standard invokevirtual calls to methods - the main interest of forbiddenapis.

> >> And here is the critical point: Programmers who need these capabilities
> >> *simply need these functionalities to keep working, period* -- they don't
> >> care what other exotic features the JVM might choose to add ("nest
> mates"?
> >> How did that term even get rubberstamped?) -- they just need to find
> their
> >> classes, fields or methods using annotations, superclasses or interfaces,
> >> *and they need this to never break*. They don't plan to change their
> whole
> >> codebase to make use of entirely new paradigms that would break the
> way
> >> the
> >> old code works.
> >
> > I fully agree. The forbiddenapis scanner just needs to detect signatures, if it
> > for some reason misses a signature, then we need an update of the tool,
> but
> > important: it still works and does not fail horribly.Semantic changes in the
> > static analysis tool were needed when Java 7 came out, because I had to
> scan
> > the constant pool for Handles to methods (and the code was changed a bit
> to
> > better handle lambdas, so line numbers were correct and the synthetic
> methods
> > are not exposed).
> >
> > But in general the tool still worked fine. When nestmates came - nothing
> needed
> > to be changed. Update of ASM just to get around the clasfile version
> limitation
> > was enough (and adding the empty visitor methods for nestmates).
> >
> >> [...]
> >> I stand by my assertion that it is completely reasonable, even desirable,
> >> for a library to parse only the subset of a classfile that it understands,
> >> and ignore everything else, with the understanding that when things do
> >> break, you get to keep the pieces (and the library maintainers must take
> >> responsibility to fix their library and push out a new version when that
> >> does happen). Even the method and field attributes part of the classfile
> >> spec is designed with exactly this in mind: you just skip attributes you're
> >> not interested in. I think it is very shortsighted to outright reject
> >> making a small update to the classfile format to enable this sort of
> >> selective parsing for the entire classfile. But I have said all I think I
> >> can say on the issue.
> >
> > Actually the problem here is NOT the bumping of classfile versions, the
> problem
> > we are discussing here is more a problem for the bytecode libries,
> especially
> > ASM. So I full agree with the JDK people: Raise the classfile version with
> > every version, I have no problem with that. The problem is somethis else:
> > Currently ASM barfs out if it uses ClassReader and the version number is
> wrong!
> 
> I beg to differ.

That's fine. I fully agree that this is the right thing by default. My proposal would just be some mode in ASM that allows to parse the file as good as it can to allow "inspections" like finding methods marked with annotations.

> > The most funny thing with that is that the IllegalArgumentException thrown
> does
> > not even have a message - which I don't understand at all. People use old
> ASM
> > versions and then an IllegalArgumentException is thrown without a
> message! I
> > reported this several times to ASM and the answer by Eric was always like
> this:
> > "This tool must be small, so we add no text strings not even for error
> > messages." BUMMER.
> 
> It has a message since two or three versions, this has been fixed
> (and don't forget that ASM was created to do bytecode tranformation on Iot
> devices, so the ASM jar size is something we are monitoring seriously)

Sorry, I was not aware of that. Thanks for the information that this was solved. At least when I looked into the source code of ASM 6.x it was always without error message.

> >> Mike's point about only bumping the version number for semantic
> changes is
> >> orthogonal to my request for "subset-parseability", but I also think this
> >> is an entirely reasonable request to make, and a sensible suggestion, in
> >> order to reduce the frequency of breakage for libraries like ASM that must
> >> parse 100% of a classfile in a semantically correct way, so must throw an
> >> exception for unknown classfile formats.
> >
> > The trick I did for forbiddenapis is to add some wrapper around asm's
> > ClassReader that just patches the bytecode before sending it to ASM:
> >
> > https://tinyurl.com/y285cm6e
> >
> > The trick was to just "patch" the class file while ASM reads it and replace
> the
> > class version by the last one ASM known. This was a workaround I used
> several
> > times during the lifetime, especially because ASM took very lonk until new
> > version was realeased after new Java version came out.
> >
> > IMHO, ASM should add an option that you can still parse a class file by
> ignoring
> > the class file version *optionally* (not by default), so tools like
> > forbiddenapis or classpath scanners extracting annotations can work. If
> ASM is
> > really not able to parse over unknown attributes (like the constant pool
> > entries) then it will barf out later. But as long as it can parse the file to
> > the end, user is happy. And code works until the next big breaking classfile
> > change. Maybe once you open a ClassReader with that special flag "ignore
> > unknown class version" the reader should be marked as "tainted" and
> passing a
> > tainted (+filtered) ClassReader to a ClassWriter should be prevented by
> ASM.
> > Problem solved, all happy!
> 
> No barfing later is not an option. It only means more debugging time for us.

If you allow to parse the file on consumer's own risk, no debgging would be needed. With the current status I can only recommend to all those libraries that just parse through classes to find methods with some runtime annotation to just patch the class files as described before. I would like to see an easier approach without patching class files before passing to ASM.

This would make it easier to use ASM for a lot of analysis tools (like autocompletion in IDEs), forbiddenapis.

> > Mabye Rémi Forax has some comment about this proposal. I'd like to open
> an issue
> > on the OW2 bugtracker, but because of my previous experience about that
> I am
> > now willing to try it again. So I brought the issue up here, where it has more
> > people looking at it.
> 
> Yes, please submit your request to the bug tracker, we are polluting the jdk-
> dev mailing list with an ASM RFE.

I agree with that. Let's do that!

My answer here was more meant as an additional comment about the "increment class file version number of every release". To me it's still something that can be solved without changing Java's policy on incrementing classfile version numbers. If the class file parsing by libraries like ASM (the most common one) could *optionally* (!!!) a bit more relaxed for purposes of pure metadata retrieval from class files, most of the issues with older versions of ASM and newer JDKs can be solved. I just wanted to bring this opinion in. Sorry for digressing into more details about the issues I have seen.

> > Finally: ASM currently also violates its own principles: The current ASM
> version
> > (7.2) has support for the Java 14 classfile version. Now the question is:
> What
> > happens if during Java 14 development the classfile format gets a change or
> new
> > feature? Rémi -> you did that change in ASM - please explain! If somebody
> > transforms a class file he would create broken bytecode!
> 
> yes, i fucked up.
> this issue is fixed in the master, we will release a beta of ASM 7.3 soon.

OK!

Thanks,
Uwe



More information about the jdk-dev mailing list