Use of modules by build/launch tools

Mon Oct 5 21:12:39 UTC 2015

Hi.
I'm posting to this mailing list on Alan Bateman's suggestion, after
discussing it on the jigsaw-dev list.

The module system leaves the task of version selection, version conflict
resolution and other version-related issues to build tools, but as one of
the maintainers of a launch tool, Capsule <http://www.capsule.io/> (which
is similar to a build tool in many respects), I can’t quite see how this
should be done, and would appreciate some guidance.

The problem arises with version conflicts of a transitive dependency on a
library that has breaking changes between versions. This is a common
occurrence. One notable example is ASM. Suppose I have an application
module, A, with the following dependencies:

A —> B
     C —> L at v2
     D
     E
     M —> L at v1 —> B
     L at v2 —> B

When I build the application, the build tool detects the version conflict,
L at v1/v2, and emits an error/warning. At this point the developer tells the
build tool to find a solution such that library M will resolve the
dependency on L at v1, but all other modules will use L at v2. This is done
today using shadowing, namely merging L at v1 into the same JAR file as M, and
transforming all classes both in M and L so that L’s packages are renamed.
This solution is obviously brittle. For example, it may not work when
reflection is involved, and in any event, it is excessive. There is no
reason why the same solution (shadowing) would not work with Jigsaw
modules, but it seems like Jigsaw could offer a better way, except I see
two problems that I’d like to ask for advice on:

   1. I don’t see a way other than the module path to inform the JVM of the
   module graph configuration (such as passing the name of a class that will
   create the configuration etc.). A workaround may be to inject an agent that
   will be listed first on the command line, have its premain method never
   return, and let it parse the rest of the command line, loading the main
   module through a custom layer. This may sound like a terrible hack, but
   it’s the kind of thing tools do, and it might still be less brittle and
   hacky than shadowing. My question is: is there a better way?
   2. Suppose there was a way to create a configuration more complex than a
   single module path. How would I construct the one I described above? My
   first thought was to have one layer consisting of all modules, and then a
   child-layer with M —> L at v1. But this doesn’t work as A —> M, and you
   can’t resolve a Configuration that’s missing a dependency. My next guess
   was to put all dependencies except M in one layer, then M —> L at v1 in
   another, and then A in yet another, but this doesn’t work either as A —>
   L at v2, and the L at v1 would override that. Then I thought of putting M —>
   L at v1 —> B in the first layer, and then everything else in another,
   overriding L, but this isn’t quite right as it would result in two versions
   of B. The only solution, then, is putting M —> L at v1 —> B in the first
   layer, and then everything else, taking care to override L but not B. This
   can work, but it may be quite annoying to compute, especially given that my
   intentions are very simple — resolve everything as you would except resolve
   L to L at v1 when accessed through M. Shouldn’t there perhaps be a way to
   modify the graph selectively before final resolution (or some other form of
   fine-grained control over the graph’s construction)?
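
As a rough illustration of the last arrangement in item 2, the sketch below
computes which modules would go into each of the two layers. It is plain Java
and does not call any module-system API; the class LayerPlan, its methods, and
the "name@version" ids (so "L@v1" stands for L at v1) are all hypothetical
names chosen for this example. It assumes the build tool has already picked a
default version for every module name and that one override is requested:
resolve L to L at v1 when reached through M.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.LinkedHashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;

/** Hypothetical planning sketch; not part of Jigsaw or any build tool. */
final class LayerPlan {
    /** Module id -> names of the modules it requires (versions not yet chosen). */
    static Map<String, List<String>> exampleGraph() {
        Map<String, List<String>> g = new HashMap<>();
        g.put("A", Arrays.asList("B", "C", "D", "E", "M", "L"));
        g.put("C", Arrays.asList("L"));
        g.put("M", Arrays.asList("L"));
        g.put("L@v1", Arrays.asList("B"));
        g.put("L@v2", Arrays.asList("B"));
        return g;
    }

    /** Collects every module id reachable from root, choosing versions via selection. */
    static Set<String> closure(String root, Map<String, List<String>> graph,
                               Map<String, String> selection) {
        Set<String> seen = new LinkedHashSet<>();
        collect(root, graph, selection, seen);
        return seen;
    }

    private static void collect(String id, Map<String, List<String>> graph,
                                Map<String, String> selection, Set<String> seen) {
        if (!seen.add(id)) return; // already visited
        for (String name : graph.getOrDefault(id, Collections.<String>emptyList())) {
            // Names without an explicit selection are unversioned and map to themselves.
            collect(selection.getOrDefault(name, name), graph, selection, seen);
        }
    }

    /** Returns [firstLayer, secondLayer] for "pin name to pinnedId under pinRoot". */
    static List<Set<String>> plan(String appRoot, String pinRoot, String name, String pinnedId,
                                  Map<String, List<String>> graph, Map<String, String> defaults) {
        Map<String, String> pinned = new HashMap<>(defaults);
        pinned.put(name, pinnedId);
        Set<String> first = closure(pinRoot, graph, pinned);
        Set<String> second = closure(appRoot, graph, defaults);
        second.removeAll(first); // shared modules such as B stay in the first layer only
        List<Set<String>> layers = new ArrayList<>();
        layers.add(first);
        layers.add(second);
        return layers;
    }

    public static void main(String[] args) {
        Map<String, String> defaults = new HashMap<>();
        defaults.put("L", "L@v2"); // the build tool's default choice for L
        List<Set<String>> layers = plan("A", "M", "L", "L@v1", exampleGraph(), defaults);
        System.out.println("first layer:  " + layers.get(0));
        System.out.println("second layer: " + layers.get(1));
    }
}
```

For the example graph this puts M, L@v1 and B in the first layer and A, C,
L@v2, D and E in the second, so L is overridden in the second layer while B is
not duplicated. The sketch only plans the split; whether the module system
would accept such a pair of layers is exactly the open question above.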

So, the module system leaves the task of version selection and conflict
resolution to build tools — and rightly so — but I don’t see it simplifying
the tools’ job or helping them avoid terrible hacks, even though it seems
like it could.
Shadowing is a solution for the problem of dependency hiding (or dependency
encapsulation), which is something the module system is intended to solve.
Shouldn't it help with this problem? The requirements document is a bit
unclear on the issue; it delegates version selection to build tools --
where it belongs -- and it mentions that multiple versions are supported
via dynamic configuration which is intended for containers. Indeed, the way
module configurations are arranged -- i.e. with a hierarchy of layers --
makes sense for multi-application containers, but less sense for the
problem of dependency version conflict (as explained in the example).

But the problem of transitive-dependency version conflict is not discussed,
except by hinting that it may not be important enough, because "most
applications are not containers and, since they currently rely upon the class
path, do not require the ability to load multiple versions of a module".
That is not quite accurate. Most applications rely on the classpath
*plus* shadowing. A GitHub search finds over 37K *direct* uses of shadowing
<https://github.com/search?l=maven-pom&p=1&q=maven-shade-plugin&type=Code&utf8=%E2%9C%93>,
and that is a very, very small part of the picture (even if many of those
results are projects that don't really require it) because shadowing is used by
libraries to hide their dependencies, and those libraries are then used by
many more projects that don't use shadowing directly. Any project that has
even a single transitive dependency that makes use of shadowing is facing
the problem of version conflicts (and handles it with this hack).

Ron