Unnamed module and duplicate package

Paul Benedict pbenedict at apache.org
Thu Mar 10 21:40:32 UTC 2016


My apologies for omitting some key qualifiers in my explanation. Read
everything as a proposal to prohibit classpath jars from silently splitting
whatever packages the JDK exports. -- Thanks
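
To make the proposal concrete, here is a minimal sketch of such a check. It
assumes a JDK 9 or later runtime with the java.lang.module API as it
eventually shipped, which may differ from the Jigsaw builds current at the
time of writing. SplitPackageCheck and its method names are hypothetical, not
an existing JDK tool. It lists the packages in a JAR that are also exported
by a module in the system image:

```java
import java.io.IOException;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReference;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.Enumeration;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper, for illustration only.
final class SplitPackageCheck {

    // Packages exported without qualification by any module in the system image.
    static Set<String> jdkExportedPackages() {
        return ModuleFinder.ofSystem().findAll().stream()
                .map(ModuleReference::descriptor)
                .flatMap(d -> d.exports().stream())
                .filter(e -> !e.isQualified())
                .map(ModuleDescriptor.Exports::source)
                .collect(Collectors.toSet());
    }

    // "javax/xml/parsers/FactoryFinder.class" -> "javax.xml.parsers"
    static String packageOf(String entryName) {
        int slash = entryName.lastIndexOf('/');
        return slash < 0 ? "" : entryName.substring(0, slash).replace('/', '.');
    }

    // Packages in the JAR that a system module also exports.
    static Set<String> conflictingPackages(Path jar) throws IOException {
        Set<String> exported = jdkExportedPackages();
        Set<String> conflicts = new TreeSet<>();
        try (JarFile jarFile = new JarFile(jar.toFile())) {
            Enumeration<JarEntry> entries = jarFile.entries();
            while (entries.hasMoreElements()) {
                String name = entries.nextElement().getName();
                if (name.endsWith(".class")) {
                    String pkg = packageOf(name);
                    if (exported.contains(pkg)) {
                        conflicts.add(pkg);
                    }
                }
            }
        }
        return conflicts;
    }

    public static void main(String[] args) throws IOException {
        // Each argument is the path of a JAR that would go on the classpath.
        for (String arg : args) {
            Set<String> conflicts = conflictingPackages(Paths.get(arg));
            if (!conflicts.isEmpty()) {
                System.err.println(arg + " contains packages exported by the JDK: " + conflicts);
            }
        }
    }
}
```

Whether a hit should be a warning or a hard error is the open question; the
sketch only reports.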

On Thu, Mar 10, 2016 at 3:25 PM, Paul Benedict <pbenedict at apache.org> wrote:

> Alex,
>
> For the sake of usability, however, it would be nice if the JDK verified
> that jars do not contain any exported JDK packages. This would be an RFE. I
> understand that in order to avoid split packages between modules and
> classpath, the module version takes precedence. For developer vs. developer
> code, I find that reasoning fine. However, I really would like to treat the
> JDK as "special" (your words) because in my experience, I see developers
> constantly perplexed by NoClassDefFoundError when something like what you
> just detailed occurs.
>
> I'd like to refer you to the Servlet 3.1 spec [1], section 10.7.2, as an
> analogous concern to mine. This is the so-called "prohibited classes"
> violation clause:
>
> "As described in the Java EE license agreement, servlet containers that
> are not part of a Java EE product should not allow the application to
> override Java SE platform classes, such as those in the java.* and javax.*
> namespaces, that Java SE does not allow to be modified. The container
> should not allow applications to override or access the container’s
> implementation classes."
>
> I don't think it's good usability to let JDK packages on the classpath go
> silently unchallenged and unloaded. I recommend that they be reported as
> an error.
>
>
> [1] https://java.net/downloads/servlet-spec/Final/servlet-3_1-final.pdf
>
> Cheers,
> Paul
>
> On Thu, Mar 10, 2016 at 2:30 PM, Alex Buckley <alex.buckley at oracle.com>
> wrote:
>
>> I see xml-apis.jar (2.0.2) contains:
>>
>> - a javax.xml.parsers package, which includes a FactoryFinder class that's
>> not in Java SE, and
>>
>> - a javax.xml.transform package hierarchy, whose types at first glance
>> look identical to those in Java SE except for yet another FactoryFinder
>> class in javax.xml.transform.
>>
>> If you put xml-apis.jar on the classpath, its javax.xml.** packages will
>> be ignored. The unnamed module reads the java.xml module which exports
>> javax.xml.** packages (assuming java.xml in the system image, of course),
>> so the application class loader delegates for javax.xml.** packages to the
>> loader responsible for the java.xml module. User code that tries to access
>> FactoryFinder will get a NoClassDefFoundError.
>>
>> There's nothing special about JDK modules here. The same
>> NoClassDefFoundError would occur if the system image contained a module
>> exporting some package, and a JAR on the classpath contained the same
>> package with extra classes, and some code on the classpath tried to access
>> those extra classes. Since the module in the system image is probably the
>> rightful owner/exporter of the package, hard questions should be asked
>> about the provenance of the JAR on the classpath.
>>
>> There has been some discussion of a jdeps-like tool that detects when a
>> JAR on your classpath is trying to split a JDK package:
>> http://mail.openjdk.java.net/pipermail/jigsaw-dev/2015-November/005227.html
>> .
>>
>> Alex
>>
>> On 3/10/2016 10:27 AM, Paul Benedict wrote:
>>
>>> Alex, there are JARs that contain javax packages. Anyone in the web
>>> development community knows how many people have included xml-apis in
>>> their WEB-INF :-) only to find out it wasn't loaded or it took precedence
>>> over the JDK versions.
>>>
>>> Has Jigsaw introduced any restrictions on this front? Honestly, I
>>> think the JDK should make it illegal for the classpath to contain ANY
>>> packages that the JDK has. Please opine when it is convenient for you.
>>>
>>> Cheers,
>>> Paul
>>>
>>> On Wed, Mar 9, 2016 at 5:13 PM, Alex Buckley <alex.buckley at oracle.com>
>>> wrote:
>>>
>>>     Paul, thank you for asking. The module system's treatment of the
>>>     unnamed module vis-a-vis named modules is probably the biggest
>>>     factor affecting usability of the module system. This is true almost
>>>     by definition because at JDK 9 GA the only named modules in the
>>>     world will be the JDK's while every other class will be in the
>>>     unnamed module of the application class loader.
>>>
>>>     So please, ask more questions about the unnamed module. I am
>>>     especially interested to know if anyone has JARs that contain javax
>>>     packages (or heaven forbid, sun or com.sun packages) found in the
>>>     JDK -- such JARs are a mortal danger to interop between unnamed and
>>>     named modules.
>>>
>>>     Alex
>>>
>>>     On 3/9/2016 1:47 PM, Paul Benedict wrote:
>>>
>>>         Thank you Alex. Since it's roughly the same as JDK 8, then it's
>>>         also not worse. I defer to your explanation on that point.
>>>
>>>         Cheers,
>>>         Paul
>>>
>>>         On Wed, Mar 9, 2016 at 3:37 PM, Alex Buckley
>>>         <alex.buckley at oracle.com> wrote:
>>>
>>>              Presumably you would count the equivalent scenario on JDK 8 -- my
>>>              package A is in Alex.jar on the classpath and your package A is in
>>>              Paul.jar on the classpath -- as a security issue too, because some
>>>              of my classes may substitute for yours (or some of yours for mine,
>>>              depending on how the classpath is constructed).
>>>
>>>              On JDK 9, we do the "substitution" cleanly. Package A is not split.
>>>              That avoids one category of error (ClassCastException). What about
>>>              poor package B that finds itself accessing a different package A
>>>              than it was compiled with? Well, since package A is exported by a
>>>              named module, it's reasonable to assume that the named module "owns"
>>>              package A [*], and that the developer of package B co-bundled some
>>>              version of package A without renaming it. Dangerous in JDK 8,
>>>              dangerous in JDK 9. (We're trying to encapsulate the internals of a
>>>              module, which is different from trying to isolate modules from each
>>>              other.)
>>>
>>>              [*] Advanced scenario: the named module exporting A is actually an
>>>              automatic module which happened to co-bundle package A. By placing
>>>              this JAR on the modulepath to form an automatic module, it dominates
>>>              the JAR left on the classpath which also co-bundled package A.
>>>
>>>              Alex
>>>
>>>              On 3/9/2016 1:17 PM, Paul Benedict wrote:
>>>
>>>                  But isn't what you're proposing a security issue? Let's say my
>>>                  package A is in the unnamed module and your package A is in a
>>>                  named module. You basically took over my code; your classes will
>>>                  be substituted for mine.
>>>
>>>                  Cheers,
>>>                  Paul
>>>
>>>                  On Wed, Mar 9, 2016 at 2:38 PM, Alex Buckley
>>>                  <alex.buckley at oracle.com> wrote:
>>>
>>>                       On 3/9/2016 10:36 AM, Paul Benedict wrote:
>>>
>>>                           From the doc:
>>>                           "If a package is defined in both a named module and the
>>>                           unnamed module then the package in the unnamed module is
>>>                           ignored. This preserves reliable configuration even in the
>>>                           face of the chaos of the class path, ensuring that every
>>>                           module still reads at most one module defining a given
>>>                           package. If, in our example above, a JAR file on the class
>>>                           path contains a class file named
>>>                           com/foo/bar/alpha/AlphaFactory.class then that file will
>>>                           never be loaded, since the com.foo.bar.alpha package is
>>>                           exported by the com.foo.bar module."
>>>
>>>                           I would like some clarification. Correct me if wrong, but
>>>                           I think this entire paragraph is really meant to be about
>>>                           the perspective from a modularized JAR? If a module has
>>>                           package A, and the unnamed module has package A, then of
>>>                           course the module's package A should win.
>>>
>>>                           However, if it is meant to be absolute language, then I
>>>                           disagree.
>>>
>>>                           The unnamed module should be coherent among itself. If the
>>>                           unnamed module has package B and relies on classes from
>>>                           package A, it should still be able to see its own package
>>>                           A. I don't think modules should be able to impact how the
>>>                           unnamed module sees itself. That's a surprising situation.
>>>
>>>
>>>                       The unnamed module is not a root module during resolution.
>>>                       If your main class is in the unnamed module (i.e. you did
>>>                       java -jar MyApp.jar rather than java -m MyApp), then the
>>>                       module graph is created by resolving various root modules
>>>                       (what are they? separate discussion) and only then is the
>>>                       unnamed module hooked up to read every module in the graph.
>>>
>>>                       Hope we're OK so far.
>>>
>>>                       If some named module in the graph exports package A (more
>>>                       than one module exporting A? separate discussion), then
>>>                       since the unnamed module reads that named module, the
>>>                       unnamed module will access A.* types from that named module.
>>>
>>>                       It's hard to imagine the unnamed module NOT accessing A.*
>>>                       types from that named module. Primarily, we need to avoid a
>>>                       split package situation where code in the unnamed module
>>>                       sometimes accesses A.* types from the named module and
>>>                       sometimes from the unnamed module.
>>>
>>>                       You might say, OK, let code in the unnamed module
>>>                       exclusively access A.* in the unnamed module rather than
>>>                       exclusively access A.* in the named module. Then you have
>>>                       two problems:
>>>
>>>                       1. There are issues for named modules in the same class
>>>                       loader as the unnamed module -- such named modules MUST get
>>>                       A.* from the named module rather than the unnamed module,
>>>                       and the class loading mechanism is incapable of switching
>>>                       based on accessor. It'll be common for named modules to
>>>                       exist in the same class loader as the unnamed module, as
>>>                       modular JARs on the modulepath and non-modular JARs on the
>>>                       classpath all end up in the application class loader
>>>                       (modular JARs as named modules; non-modular JARs jointly as
>>>                       the unnamed module).
>>>
>>>                       2. While the module system is sure that package A exists in
>>>                       the named module, how would the module system possibly know
>>>                       that package A exists in the unnamed module? Scanning every
>>>                       class file in every non-modular JAR on the classpath at
>>>                       startup sounds bad.
>>>
>>>                       Alex
>>>
>>>
>>>
>>>
>>>
>
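
The resolution behaviour described in the quoted messages can be observed on
a JDK 9 or later runtime. A minimal sketch, assuming the java.lang.Module and
ModuleLayer API as it eventually shipped (it postdates this thread);
PackageOwner and ownerOf are hypothetical names. It asks which named module
in the boot layer contains a package, and whether the caller's module reads
it:

```java
import java.util.Optional;

// Hypothetical helper, for illustration only.
final class PackageOwner {

    // The named module in the boot layer that contains the package, if any.
    static Optional<Module> ownerOf(String packageName) {
        return ModuleLayer.boot().modules().stream()
                .filter(m -> m.getPackages().contains(packageName))
                .findFirst();
    }

    public static void main(String[] args) {
        // When this class is run from the classpath, its module is the
        // unnamed module of the application class loader.
        Module self = PackageOwner.class.getModule();
        String pkg = args.length > 0 ? args[0] : "javax.xml.parsers";

        Optional<Module> owner = ownerOf(pkg);
        if (owner.isPresent()) {
            System.out.println(pkg + " is in named module " + owner.get().getName());
            System.out.println("caller is named: " + self.isNamed()
                    + ", caller reads owner: " + self.canRead(owner.get()));
        } else {
            System.out.println(pkg + " is not in any named module in the boot layer");
        }
    }
}
```

If a named module is reported for a package, a copy of that package in a
classpath JAR is the one that gets ignored.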

