Comments on the proof
Jaroslav Tulach
jaroslav.tulach at oracle.com
Sun Dec 25 06:30:59 PST 2011
Hello Mark,
thanks for your reply.
>## 23. 12. 2011 21:27:34 ##<
> run-time behavior. It's critical that we understand the complexity of
> what we're trying to build, even if we ultimately decide to live with
> it rather than reduce it.
OK, that is fair approach.
> I've skimmed your proof (I won't have time to study it in detail until
> after the holidays),
What I had in mind was: Every symbol (e.g. class name) has exactly one meaning
in a each classloader space. It must not happen that a single fully qualified
class name could be resolved to two different classes in any classloader in
the system.
The rationale for the above is: during compilation classes are referred by
FQNs. Each FQN refers to a single class[2]. I believe it is essential to
enforce that during runtime for each classloader as well.
How can a single class name be resolved to two classes? Not by a direct
dependency. That would require my module to have dependency on org.w3c.dom at 1.0
and org.w3c.dom at 3.0 at once which is a non-sense and compiler would reject
that. During compilation there is always one exact dependency on
a module to compile against.
> You use the term "implicit re-export", but your proof of
> course doesn't actually consider what types are or are not re-exported,
> so I'm a bit confused by this terminology.
However some modules re-export types. In the jigsaw terminology use "requires
public". Still, each classloader should have just a single class for a FQN in
its namespace.
> and one question I have so far is this: Is saying
> "supports implicit re-export on an implicit base" just another way of
> saying "only one version of any particular module of a given name is
> allowed"?
Not necessarily. If there is a "master" classloader in each application that
sees all classes of all available modules (which is sort of true in NetBeans
Runtime Container, but not at all in OSGi), then the requirement that each FQN
in this "master" classloader represents at most one class naturally leads to
requirement that "only one version of any particular module of a given name is
allowed".
On the other hand, there does not need to be such "master" classloader. If one
module requires (without "requires public", e.g. without re-export)
org.w3c.dom at 1.0 and another module requires org.w3c.dom at 2.0, then it should be
OK to load the org.w3c.dom in two versions as no classloader will see the
duplicated classes from these modules at once[3].
> On your LibraryWithoutImplicitExportIsPolynomial [1] page you define the
> concept of a "complete" repository, in which the transitive dependencies
> of all modules are explicitly enumerated. You suggest that the set of a
> module's transitive dependencies could be generated at compile time, but
> doesn't that mean that you'd still have to run a general -- and possibly
> NP-complete -- resolution algorithm at compile time in order to compute
> that set?
I have a feeling that during compilation the NP-completeness is avoided. Here
is a sketch of the proof using mathematical induction:
1. java.base can be resolved in finite time.
2. when compiling a module M with (direct) dependencies on M1, ..., Mn we can
assume that modules M1,...,Mn have all their transitive dependencies
enumerated. Let's now collect all transitive dependencies of M1, ..., Mn into
a set and verify whether they are all consistent (as described in [1] section
"Consistency"). If not, reject the compilation (per arguments in [2]). If they
are consistent, select highest requested version of each module in the
transitive set and use it as dependencies of M.
Seems polynomial to me.
> Plenty of existing systems seem to work just fine in the face of this
> theoretical complexity (e.g., OSGi, Debian, RPM), and in many scenarios
> Jigsaw will never do resolution at run time anyway.
Debian and RPM workaround this by having control over whole distribution. When
you have it you usually avoid many incompatibilities before you ship. Also
each distribution tests the upgrade scenario. Thus it is generally OK to use
repositories for current+1 version and mix the modules.
However once I tried to update a 2006 Linux RPM distribution with 2009
version. It failed miserably - it has not even started. I am not claiming this
was due to the NP-compete problem described above, but it seems to me that
unless you control and verify state of the whole repository (as Glassfish,
Eclipse, Fedora, Ubuntu and NetBeans do), you'll face incompatibilities far
often and having polynomial time resolution algorithm will pay off.
> (I suppose this places me firmly on the Western side of your Ocean, but
> with a keen eye toward the East.)
One of the important minds among European rationalistic computer scientists -
Niklaus Wirth (the inventor of Pascal, Oberon, etc.) once noted: Simple,
elegant solutions are more effective, but they are harder to find and require
more time.
JDK has to ship on time and unless we find simple and elegant solution by
then, we'll have to accept the NP-complete one. That is completely acceptable
decision from my point of view.
> Finally, to the big question of whether Jigsaw should try to avoid
> NP-completeness: I'd prefer to avoid it, but I'm willing to accept it.
I am glad avoiding NP-completeness is preferred. It increases my enthusiasm to
seek for a dependency system that would be powerful enough and avoid it.
-jt
[1]
http://wiki.apidesign.org/index.php?title=LibraryWithoutImplicitExportIsPolynomial&useskin=monobook
[2] In current classpath system this is not fully true, there can be multiple
classes with the same FQN and depending on the order of classpath roots, the
first one is selected. I believe in the "modulepath" mode, this is not
desirable, there is no order as transitive dependencies are just partially
ordered, thus I believe compiler should detect if two classes representing the
same FQN are available and emit a compilation error in such case. Correct?
[3] Btw. the module system should check that types are not re-exported without
using "requires public". E.g. if there is an exported class that extends non-
public type or returns non-public type from a visible method, the compiler
should emit an error. Correct?
More information about the jigsaw-dev
mailing list