Unnamed module and duplicate package
Paul Benedict
pbenedict at apache.org
Thu Mar 10 21:40:32 UTC 2016
My apologies for omitting some key qualifiers in my explanation. Read
everything as a proposal to prohibit classpath jars from silently splitting
whatever packages the JDK exports. -- Thanks
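
A minimal sketch of such a check, assuming JDK 9 or later. The class SplitPackageCheck and its method names are hypothetical; they are not part of the JDK or of any tool mentioned in this thread. It only compares the package names of .class entries in a JAR against the packages exported unconditionally by the modules in the run-time image, and it makes no attempt to reproduce how the launcher would actually report an error.

```java
import java.io.IOException;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReference;
import java.util.Collection;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper for illustration only.
class SplitPackageCheck {

    // Packages exported without qualification by modules in the run-time image.
    static Set<String> exportedSystemPackages() {
        return ModuleFinder.ofSystem().findAll().stream()
                .map(ModuleReference::descriptor)
                .flatMap(d -> d.exports().stream())
                .filter(e -> !e.isQualified())
                .map(ModuleDescriptor.Exports::source)
                .collect(Collectors.toCollection(TreeSet::new));
    }

    // "javax/xml/parsers/Foo.class" -> "javax.xml.parsers".
    // Returns null for non-class entries and for classes with no package.
    static String packageOf(String entryName) {
        int slash = entryName.lastIndexOf('/');
        if (!entryName.endsWith(".class") || slash < 0) {
            return null;
        }
        return entryName.substring(0, slash).replace('/', '.');
    }

    // Packages among the given JAR entry names that a system module also exports.
    static Set<String> collisions(Collection<String> entryNames) {
        Set<String> exported = exportedSystemPackages();
        Set<String> hits = new TreeSet<>();
        for (String name : entryNames) {
            String pkg = packageOf(name);
            if (pkg != null && exported.contains(pkg)) {
                hits.add(pkg);
            }
        }
        return hits;
    }

    // Usage: java SplitPackageCheck some.jar other.jar
    public static void main(String[] args) throws IOException {
        for (String path : args) {
            try (JarFile jar = new JarFile(path)) {
                List<String> names = jar.stream()
                        .map(JarEntry::getName)
                        .collect(Collectors.toList());
                System.out.println(path + " overlaps with: " + collisions(names));
            }
        }
    }
}
```

Run against a JAR that bundles javax.xml.** classes, this would list those packages. Whether an overlap should be a warning or a hard error is the policy question raised above; the sketch only shows that the overlap is cheap to detect for a single JAR.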
On Thu, Mar 10, 2016 at 3:25 PM, Paul Benedict <pbenedict at apache.org> wrote:
> Alex,
>
> For the sake of usability, however, it would be nice if the JDK verified
> that jars do not contain any exported JDK packages. This would be an RFE. I
> understand that in order to avoid split packages between modules and
> classpath, the module version takes precedence. For developer vs. developer
> code, I find that reasoning fine. However, I really would like to treat the
> JDK as "special" (your words) because in my experience, I see developers
> constantly perplexed by NoClassDefFoundError when something occurred like
> you just detailed.
>
> I'd like to refer you to the Servlet 3.1 spec [1], section 10.7.2, as an
> analogous concern to mine. This is the so-called "prohibited classes"
> violation clause:
>
> "As described in the Java EE license agreement, servlet containers that
> are not part of a Java EE product should not allow the application to
> override Java SE platform classes, such as those in the java.* and javax.*
> namespaces, that Java SE does not allow to be modified. The container
> should not allow applications to override or access the container’s
> implementation classes."
>
> I don't think it's good usability to let JDK packages in the classpath go
> silently unchallenged and unloaded. I recommend they are reported as an
> error.
>
>
> [1] https://java.net/downloads/servlet-spec/Final/servlet-3_1-final.pdf
>
> Cheers,
> Paul
>
> On Thu, Mar 10, 2016 at 2:30 PM, Alex Buckley <alex.buckley at oracle.com>
> wrote:
>
>> I see xml-apis.jar (2.0.2) contains:
>>
>> - a javax.xml.parsers package, which includes a FactoryFinder class that's
>> not in Java SE, and
>>
>> - a javax.xml.transform package hierarchy, whose types at first glance
>> look identical to those in Java SE except for yet another FactoryFinder
>> class in javax.xml.transform.
>>
>> If you put xml-apis.jar on the classpath, its javax.xml.** packages will
>> be ignored. The unnamed module reads the java.xml module which exports
>> javax.xml.** packages (assuming java.xml in the system image, of course),
>> so the application class loader delegates for javax.xml.** packages to the
>> loader responsible for the java.xml module. User code that tries to access
>> FactoryFinder will get a NoClassDefFoundError.
>>
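
The delegation described above can be observed from code on the classpath. The following is a minimal sketch, assuming JDK 9 or later and the java.xml module in the run-time image; the class name WhoOwnsThisPackage and the helper owningModule are hypothetical. It asks which module a javax.xml.parsers type was defined in and whether the caller's own module reads it.

```java
import javax.xml.parsers.DocumentBuilderFactory;

// Hypothetical helper for illustration only.
class WhoOwnsThisPackage {

    // Name of the module that defined the given type, or "(unnamed)".
    static String owningModule(Class<?> type) {
        Module m = type.getModule();
        return m.isNamed() ? m.getName() : "(unnamed)";
    }

    public static void main(String[] args) {
        Module mine = WhoOwnsThisPackage.class.getModule();
        Module xml = DocumentBuilderFactory.class.getModule();

        // When this class is run from the classpath, 'mine' is the unnamed
        // module of the application class loader.
        System.out.println("caller module:  " + owningModule(WhoOwnsThisPackage.class));
        System.out.println("javax.xml.parsers.DocumentBuilderFactory module: "
                + owningModule(DocumentBuilderFactory.class));
        System.out.println("caller reads it: " + mine.canRead(xml));
    }
}
```

Putting a JAR with its own javax.xml.parsers classes on the classpath should not change what this reports for DocumentBuilderFactory, which is the behaviour under discussion.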
>> There's nothing special about JDK modules here. The same
>> NoClassDefFoundError would occur if the system image contained a module
>> exporting some package, and a JAR on the classpath contained the same
>> package with extra classes, and some code on the classpath tried to access
>> those extra classes. Since the module in the system image is probably the
>> rightful owner/exporter of the package, hard questions should be asked
>> about the provenance of the JAR on the classpath.
>>
>> There has been some discussion of a jdeps-like tool that detects when a
>> JAR on your classpath is trying to split a JDK package:
>> http://mail.openjdk.java.net/pipermail/jigsaw-dev/2015-November/005227.html
>> .
>>
>> Alex
>>
>> On 3/10/2016 10:27 AM, Paul Benedict wrote:
>>
>>> Alex, there are JARs that contain javax packages. Anyone in the web
>>> development community knows how many people have included xml-apis in
>>> their WEB-INF :-) only to find out it wasn't loaded or it took precedence
>>> over the JDK versions.
>>>
>>> Has Jigsaw introduced any restrictions here on this front? Honestly, I
>>> think the JDK should make it illegal for the classpath to contain ANY
>>> packages that the JDK has. Please opine when it is convenient for you.
>>>
>>> Cheers,
>>> Paul
>>>
>>> On Wed, Mar 9, 2016 at 5:13 PM, Alex Buckley <alex.buckley at oracle.com> wrote:
>>>
>>> Paul, thank you for asking. The module system's treatment of the
>>> unnamed module vis-a-vis named modules is probably the biggest
>>> factor affecting usability of the module system. This is true almost
>>> by definition because at JDK 9 GA the only named modules in the
>>> world will be the JDK's while every other class will be in the
>>> unnamed module of the application class loader.
>>>
>>> So please, ask more questions about the unnamed module. I am
>>> especially interested to know if anyone has JARs that contain javax
>>> packages (or heaven forbid, sun or com.sun packages) found in the
>>> JDK -- such JARs are a mortal danger to interop between unnamed and
>>> named modules.
>>>
>>> Alex
>>>
>>> On 3/9/2016 1:47 PM, Paul Benedict wrote:
>>>
>>> Thank you Alex. Since it's roughly the same as JDK 8, then it's also
>>> not worse. I defer to your explanation on that point.
>>>
>>> Cheers,
>>> Paul
>>>
>>> On Wed, Mar 9, 2016 at 3:37 PM, Alex Buckley <alex.buckley at oracle.com> wrote:
>>>
>>> Presumably you would count the equivalent scenario on JDK 8 -- my
>>> package A is in Alex.jar on the classpath and your package A is in
>>> Paul.jar on the classpath -- as a security issue too, because some
>>> of my classes may substitute for yours (or some of yours for mine,
>>> depending on how the classpath is constructed).
>>>
>>> On JDK 9, we do the "substitution" cleanly. Package A is not split.
>>> That avoids one category of error (ClassCastException). What about
>>> poor package B that finds itself accessing a different package A
>>> than it was compiled with? Well, since package A is exported by a
>>> named module, it's reasonable to assume that the named module "owns"
>>> package A [*], and that the developer of package B co-bundled some
>>> version of package A without renaming it. Dangerous in JDK 8,
>>> dangerous in JDK 9. (We're trying to encapsulate the internals of a
>>> module, which is different from trying to isolate modules from each
>>> other.)
>>>
>>> [*] Advanced scenario: the named module exporting A is actually an
>>> automatic module which happened to co-bundle package A. By placing
>>> this JAR on the modulepath to form an automatic module, it dominates
>>> the JAR left on the classpath which also co-bundled package A.
>>>
>>> Alex
>>>
>>> On 3/9/2016 1:17 PM, Paul Benedict wrote:
>>>
>>> But isn't what you're proposing a security issue? Let's say my
>>> package A is in the unnamed module and your package A is in a named
>>> module. You basically took over my code; your classes will be
>>> substituted for mine.
>>>
>>> Cheers,
>>> Paul
>>>
>>> On Wed, Mar 9, 2016 at 2:38 PM, Alex Buckley <alex.buckley at oracle.com> wrote:
>>>
>>> On 3/9/2016 10:36 AM, Paul Benedict wrote:
>>>
>>> From the doc:
>>> "If a package is defined in both a named module and the unnamed
>>> module then the package in the unnamed module is ignored. This
>>> preserves reliable configuration even in the face of the chaos of
>>> the class path, ensuring that every module still reads at most one
>>> module defining a given package. If, in our example above, a JAR
>>> file on the class path contains a class file named
>>> com/foo/bar/alpha/AlphaFactory.class then that file will never be
>>> loaded, since the com.foo.bar.alpha package is exported by the
>>> com.foo.bar module."
>>>
>>> I would like some clarification. Correct me if wrong, but I think
>>> this entire paragraph is really meant to be about the perspective
>>> from a modularized JAR? If a module has package A, and the unnamed
>>> module has package A, then of course the module's package A should
>>> win.
>>>
>>> However, if it is meant to be absolute language, then I disagree.
>>>
>>> The unnamed module should be coherent among itself. If the unnamed
>>> module has package B and relies on classes from package A, it should
>>> still be able to see its own package A. I don't think modules should
>>> be able to impact how the unnamed module sees itself. That's a
>>> surprising situation.
>>>
>>>
>>> The unnamed module is not a root module during resolution. If your
>>> main class is in the unnamed module (i.e. you did java -jar
>>> MyApp.jar rather than java -m MyApp), then the module graph is
>>> created by resolving various root modules (what are they? separate
>>> discussion) and only then is the unnamed module hooked up to read
>>> every module in the graph.
>>>
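
The end result of that last step (the unnamed module reading every module in the graph) can be checked at run time. This is a minimal sketch, assuming the java.lang.Module and ModuleLayer API as it exists in JDK 9 and later; the class name UnnamedReadsAll and the helper unreadBootModules are hypothetical. It inspects only the finished boot layer and says nothing about the order in which resolution happened.

```java
import java.util.Set;
import java.util.TreeSet;

// Hypothetical helper for illustration only.
class UnnamedReadsAll {

    // Names of boot-layer modules that 'from' does not read.
    // Expected to be empty when 'from' is an unnamed module.
    static Set<String> unreadBootModules(Module from) {
        Set<String> unread = new TreeSet<>();
        for (Module m : ModuleLayer.boot().modules()) {
            if (!from.canRead(m)) {
                unread.add(m.getName());
            }
        }
        return unread;
    }

    public static void main(String[] args) {
        // The unnamed module of the application class loader, i.e. the
        // module that classpath code ends up in.
        Module unnamed = ClassLoader.getSystemClassLoader().getUnnamedModule();
        System.out.println("is named: " + unnamed.isNamed());
        System.out.println("modules in boot layer: " + ModuleLayer.boot().modules().size());
        System.out.println("not readable from the unnamed module: "
                + unreadBootModules(unnamed));
    }
}
```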
>>> Hope we're OK so far.
>>>
>>> If some named module in the graph exports package A (more than one
>>> module exporting A? separate discussion), then since the unnamed
>>> module reads that named module, the unnamed module will access A.*
>>> types from that named module.
>>>
>>> It's hard to imagine the unnamed module NOT accessing A.* types from
>>> that named module. Primarily, we need to avoid a split package
>>> situation where code in the unnamed module sometimes accesses A.*
>>> types from the named module and sometimes from the unnamed module.
>>>
>>> You might say, OK, let code in the unnamed module exclusively access
>>> A.* in the unnamed module rather than exclusively access A.* in the
>>> named module. Then you have two problems:
>>>
>>> 1. There are issues for named modules in the same class loader as
>>> the unnamed module -- such named modules MUST get A.* from the named
>>> module rather than the unnamed module, and the class loading
>>> mechanism is incapable of switching based on accessor. It'll be
>>> common for named modules to exist in the same class loader as the
>>> unnamed module, as modular JARs on the modulepath and non-modular
>>> JARs on the classpath all end up in the application class loader
>>> (modular JARs as named modules; non-modular JARs jointly as the
>>> unnamed module).
>>>
>>> 2. While the module system is sure that package A exists in the
>>> named module, how would the module system possibly know that package
>>> A exists in the unnamed module? Scanning every class file in every
>>> non-modular JAR on the classpath at startup sounds bad.
>>>
>>> Alex
>>>
>>>
>>>
>>>
>>>
>