Scanning multi version jars?

Greg Wilkins gregw at webtide.com
Tue Sep 19 04:37:37 UTC 2017


Stephen,

I think the use-case can be pretty well defined.

There should be an enumeration/iterator/stream available that provides the
contents of a jar file as they would be seen/interpreted by the JVM's
classloader. So if the classloader is doing any processing to handle
versioned classes/resources, then we need an iterator that implements the
exact same logic.

Which raises an interesting point. Take the multi-versioned jar I have
used as an example, which contains:

   - org/example/Foo.class
   - org/example/Foo$Bar.class
   - META-INF/versions/9/org/example/Foo.class

What does the classloader do when asked to load "org.example.Foo$Bar"?
If it loads it OK, then the JarFile enumerator/iterator/stream should also
return it. If it throws a ClassNotFoundException, then the
JarFile enumerator/iterator/stream should skip it.
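
One way to answer that question empirically is to build the example jar and
ask a loader for the resource. The following is only a sketch: the class name
LoaderProbe and its helper methods are made up for illustration, the entries
hold their own path as text rather than real bytecode, and it asks for the
resource rather than defining the class.

```java
import java.io.InputStream;
import java.io.OutputStream;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.jar.Attributes;
import java.util.jar.JarEntry;
import java.util.jar.JarOutputStream;
import java.util.jar.Manifest;

// Hypothetical helper, not part of any JDK or container API.
public class LoaderProbe {

    /** Builds the three-entry example jar; each entry holds its own path as text. */
    static Path buildExampleJar() throws Exception {
        Manifest manifest = new Manifest();
        // Without a manifest version the other attributes are not written out.
        manifest.getMainAttributes().put(Attributes.Name.MANIFEST_VERSION, "1.0");
        manifest.getMainAttributes().put(Attributes.Name.MULTI_RELEASE, "true");
        Path jar = Files.createTempFile("example", ".jar");
        try (OutputStream out = Files.newOutputStream(jar);
             JarOutputStream jos = new JarOutputStream(out, manifest)) {
            for (String name : new String[] {
                    "org/example/Foo.class",
                    "org/example/Foo$Bar.class",
                    "META-INF/versions/9/org/example/Foo.class"}) {
                jos.putNextEntry(new JarEntry(name));
                jos.write(name.getBytes(StandardCharsets.UTF_8));
                jos.closeEntry();
            }
        }
        return jar;
    }

    /** Returns the resource text as the loader sees it, or null if it is not found. */
    static String probe(Path jar, String resource) throws Exception {
        URL[] path = {jar.toUri().toURL()};
        // Parent is null so that only the jar (and the boot loader) is consulted.
        try (URLClassLoader loader = new URLClassLoader(path, null);
             InputStream in = loader.getResourceAsStream(resource)) {
            return in == null ? null : new String(in.readAllBytes(), StandardCharsets.UTF_8);
        }
    }

    public static void main(String[] args) throws Exception {
        Path jar = buildExampleJar();
        try {
            System.out.println(probe(jar, "org/example/Foo.class"));
            System.out.println(probe(jar, "org/example/Foo$Bar.class"));
        } finally {
            Files.deleteIfExists(jar);
        }
    }
}
```

Whatever this prints for Foo$Bar.class is what a consistent iteration would
need to report as well.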

Currently the classloader will happily return a resource for a base inner
class even if its outer class does not refer to it, which suggests that
the iteration should likewise not filter out the inappropriate inner classes.
However, I think it could be argued that the loader should not load it.

Either way, there should be an iteration available that is entirely
consistent with what the classloader does.
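
As a rough sketch of what such an iteration could compute, the code below
takes the raw entry names of a jar and picks, for each logical name, the copy
that wins for a given release. It is an illustration only: VersionedView and
effectiveEntries are hypothetical names, the jar is assumed to declare
Multi-Release: true, and no inner class filtering is attempted.

```java
import java.util.HashMap;
import java.util.List;
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper, not the JDK's own implementation.
public class VersionedView {
    private static final String PREFIX = "META-INF/versions/";

    /**
     * Maps each logical name to the raw entry name that wins for the release:
     * the highest versioned copy not newer than the release, else the base copy.
     */
    static Map<String, String> effectiveEntries(List<String> rawNames, int release) {
        Map<String, String> winners = new TreeMap<>();
        Map<String, Integer> winnerVersion = new HashMap<>();
        for (String raw : rawNames) {
            String logical = raw;
            int version = 0; // 0 stands for the base (unversioned) section
            if (raw.startsWith(PREFIX)) {
                int slash = raw.indexOf('/', PREFIX.length());
                if (slash < 0) continue;
                try {
                    version = Integer.parseInt(raw.substring(PREFIX.length(), slash));
                } catch (NumberFormatException e) {
                    continue; // not a numeric version directory
                }
                // Versioned directories start at 9; newer ones than the release are ignored.
                if (version < 9 || version > release) continue;
                logical = raw.substring(slash + 1);
                if (logical.isEmpty()) continue; // the version directory itself
            }
            Integer current = winnerVersion.get(logical);
            if (current == null || version > current) {
                winnerVersion.put(logical, version);
                winners.put(logical, raw);
            }
        }
        return winners;
    }

    public static void main(String[] args) {
        List<String> names = List.of(
                "org/example/Foo.class",
                "org/example/Foo$Bar.class",
                "META-INF/versions/9/org/example/Foo.class");
        effectiveEntries(names, 9).forEach((k, v) -> System.out.println(k + " -> " + v));
    }
}
```

With the example jar and release 9, Foo.class resolves to the version 9 copy
while the base Foo$Bar.class stays visible, which is the case in question.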

regards




On 19 September 2017 at 13:41, Stephen Felts <stephen.felts at oracle.com>
wrote:

> Thanks for the clarification – I overstated the “any JarEntry”.
>
> I didn’t look at VersionedStream so I now understand the limitations you
> mention.
>
>
>
> In my case, it’s necessary to look at all files in the jar file to
> eliminate the unneeded ordinary/inner classes, so
> JarInputStream.getNextJarEntry() can be used.
>
> By using the versioned JarFile constructor, getting the JarEntry returns
> the right one for processing.  If I needed further filtering on the file
> names, I’d need to return the real file names.
>
>
>
> Maybe the use case isn’t so universal or well defined.
>
>
>
>
>
>
>
>
> *From:* Greg Wilkins [mailto:gregw at webtide.com]
> *Sent:* Monday, September 18, 2017 9:33 PM
> *To:* Stephen Felts <stephen.felts at oracle.com>
> *Cc:* Paul Sandoz <paul.sandoz at oracle.com>; jigsaw-dev <
> jigsaw-dev at openjdk.java.net>; core-libs-dev at openjdk.java.net
>
> *Subject:* Re: Scanning multi version jars?
>
>
>
> Stephen,
>
>
>
> It is not the case that the getName() always returns the path starting
> with "META-INF/versions/". Specifically, if the entry is obtained from
> getJarEntry() API (and not from the enumerator), then the name is that of
> the unversioned file, but the metadata and contents obtained using the
> jarEntry are for the versioned entry.
>
>
>
> For example the following code:
>
>
>
> JarFile jarFile = new JarFile(new File("/tmp/example.jar"),
>                               false,
>                               JarFile.OPEN_READ,
>                               Runtime.version());
> JarEntry entry = jarFile.getJarEntry("org/example/OnlyIn9.class");
> System.err.printf("%s -> %s%n", entry.getName(),
>                   IO.toString(jarFile.getInputStream(entry)));
>
> when run against a jar where the class files contain just the text of
> their full path produces the following output:
>
>
>
> org/example/OnlyIn9.class -> META-INF/versions/9/org/example/OnlyIn9.class
>
>
>
> There is nothing in the public API of the JarEntry so obtained that
> indicates that it is the versioned entry, nor can I distinguish it from an
> entry obtained by iteration that may report the same name (if the entry was
> also in the base), although at least equals does return false.
>
>
>
> Moreover, the proposed stream API, as represented by the current
> implementation of jdk.internal.util.jar.VersionedStream, applies some
> filtering based on the versioning and then converts its enumerated
> JarEntry instances to opaquely versioned JarEntry instances by calling
> map(jf::getJarEntry), which hides the version information and makes
> any additional filtering based on version impossible for users of that
> stream.
>
>
>
> regards
>
>
>
>
>
> On 19 September 2017 at 11:05, Stephen Felts <stephen.felts at oracle.com>
> wrote:
>
> A versioned file name, JarEntry.getName(), starts with
> "META-INF/versions/".
> The version is the following string up to the next "/".
> The version can be parsed with Runtime.Version.parse().
> If not a versioned class file name, then use JarFile.baseVersion().
> That should be sufficient to get the version for any JarEntry.
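
The steps just listed could be sketched roughly as follows. EntryVersion and
versionOf are hypothetical names, and the method takes the raw entry name as
reported by enumeration, so it requires JDK 9 or later and does not use
reflection.

```java
import java.util.jar.JarFile;

// Hypothetical helper illustrating the parsing steps described above.
public class EntryVersion {
    private static final String PREFIX = "META-INF/versions/";

    /**
     * Returns the version encoded in a raw entry name, or the base version
     * when the name is not under META-INF/versions/.
     * Runtime.Version.parse throws IllegalArgumentException for a malformed
     * version directory name.
     */
    static Runtime.Version versionOf(String rawEntryName) {
        if (rawEntryName.startsWith(PREFIX)) {
            int end = rawEntryName.indexOf('/', PREFIX.length());
            if (end > PREFIX.length()) {
                return Runtime.Version.parse(rawEntryName.substring(PREFIX.length(), end));
            }
        }
        return JarFile.baseVersion();
    }

    public static void main(String[] args) {
        System.out.println(versionOf("META-INF/versions/9/org/example/Foo.class"));
        System.out.println(versionOf("org/example/Foo.class"));
    }
}
```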
>
> If it needs to run on pre-JDK9, this needs a lot of reflection.
>
> IMO, a method that behaves as described below is likely to be needed
> for many use cases, and it would be good if someone wrote it and put it in a
> well-known, public jar file.
>
>
>
> -----Original Message-----
> From: Greg Wilkins [mailto:gregw at webtide.com]
> Sent: Monday, September 18, 2017 8:19 PM
> To: Paul Sandoz <paul.sandoz at oracle.com>
> Cc: jigsaw-dev <jigsaw-dev at openjdk.java.net>;
> core-libs-dev at openjdk.java.net
> Subject: Re: Scanning multi version jars?
>
> Paul,
>
> yeh... I guess I concede it's not JarFile's job... as much as that would
> make things easier for containers to reach agreement :(
>
> However, can we at least look at having a new default method on JarEntry
> to query the version. Without that, containers don't have the information
> available to perform the semantic filtering required and thus will not be
> able to use the stream API and will have to work from an unversioned stream.
>
> regards
>
> On 19 September 2017 at 03:04, Paul Sandoz <paul.sandoz at oracle.com> wrote:
>
> > I agree with Alan here, we should not be pushing a semantic
> > understanding of inner classes into JarFile.
> >
> > I do sympathise with the case of annotation class scanning, which has
> > always tunnelled through the class loader view to directly get at
> > class file bytes possibly dealing with various URI schemes, since that
> > is currently the only effective way of accessing the required
> > information in an efficient manner.
> >
> > As Alan mentioned we should add a traversable versioned view of a
> > JarFile, returning a Stream, from which it should be possible to
> > filter according to certain semantics.
> >
> > Paul.
> >
> >
> > > On 17 Sep 2017, at 12:27, Alan Bateman <Alan.Bateman at oracle.com> wrote:
> > >
> > > On 15/09/2017 22:58, Greg Wilkins wrote:
> > >> :
> > >>
> > >>  * I think the stream needs to handle inner classes and only include
> > >>    them if their matching outerclass is available at the same
> > >>    version.  So for example a base Foo$Bar.class will only be
> > >>    included if the stream includes a base Foo.class, and it will not
> > >>    be included if the Foo.class is version 9 or above.  Likewise a
> > >>    version 9 Foo$Bar.class will only be included in the stream if the
> > >>    stream also includes a version 9 Foo.class, and will not be
> > >>    included if the stream has a version 10 or above Foo.class
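
The rule in this bullet could be sketched as a filter over a map from logical
name to the raw entry name selected for the runtime. This is an illustration
under assumptions: InnerClassFilter and its methods are hypothetical names,
any '$' in a simple class name is taken to mark an inner class, and the outer
class is taken to be the top-level one.

```java
import java.util.Map;
import java.util.TreeMap;

// Hypothetical helper; the '$' test is a naming heuristic, not a bytecode check.
public class InnerClassFilter {

    /** Returns the part of the raw name that precedes the logical name ("" for base). */
    static String versionPrefix(String raw, String logical) {
        return raw.substring(0, raw.length() - logical.length());
    }

    /**
     * Keeps every entry except inner classes whose outer class is absent or
     * was selected from a different version directory.
     * Keys are logical names; values are the raw names chosen for the runtime.
     */
    static Map<String, String> dropOrphanedInnerClasses(Map<String, String> view) {
        Map<String, String> kept = new TreeMap<>();
        for (Map.Entry<String, String> e : view.entrySet()) {
            String logical = e.getKey();
            int dollar = logical.indexOf('$', logical.lastIndexOf('/') + 1);
            if (!logical.endsWith(".class") || dollar < 0) {
                kept.put(logical, e.getValue()); // not an inner class by this heuristic
                continue;
            }
            String outer = logical.substring(0, dollar) + ".class";
            String outerRaw = view.get(outer);
            if (outerRaw != null
                    && versionPrefix(outerRaw, outer).equals(versionPrefix(e.getValue(), logical))) {
                kept.put(logical, e.getValue());
            }
        }
        return kept;
    }

    public static void main(String[] args) {
        Map<String, String> view = Map.of(
                "org/example/Foo.class", "META-INF/versions/9/org/example/Foo.class",
                "org/example/Foo$Bar.class", "org/example/Foo$Bar.class");
        System.out.println(dropOrphanedInnerClasses(view).keySet());
    }
}
```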
> > >>
> > >> If you think this last point is possible, then I'll move the
> > >> discussion back to the EE expert groups to try to get agreement on
> > >> the exact stream code that will be used in the mid term until it is
> > >> available in the JRE lib, at which time the specs should be amended
> > >> to say they will defer the decision of which classes to scan to the
> > >> JRE lib, so they will be future-proof for any changes in Java 10,
> > >> 11, etc.
> > >>
> > > > I don't think this should be pushed down to the JarFile API. The
> > > > JarFile API provides the base API for accessing JAR files and should
> > > > not be concerned with the semantics or relationship between entries.
> > > > I agree that annotation scanning tools and libraries need to do
> > > > additional work to deal with orphaned or menacing inner classes in a
> > > > MR JAR but it's not too different to arranging a class path with a
> > > > JAR file containing the "classes for JDK 9" ahead of a JAR file
> > > > containing the version of the library that runs on JDK 8. I do think
> > > > that further checks could be done by the `jar` tool to identify
> > > > issues at packaging time.
> > >
> > > -Alan
> >
> >
>
>
> --
> Greg Wilkins <gregw at webtide.com> CTO http://webtide.com
>



-- 
Greg Wilkins <gregw at webtide.com> CTO http://webtide.com

