Module descriptions versus module descriptors

Thu Dec 10 15:02:38 UTC 2015

On 12/09/2015 07:26 PM, mark.reinhold at oracle.com wrote:
> 2015/10/20 4:27 -0700, david.lloyd at redhat.com:
>> I see no logical path that leads from the requirements as specified
>> exclusively to the assumption that the descriptor must be bytecode, let
>> alone part of the JVM and/or language specification. All the reasons
>> given appear to be self-justifying or based on abstract assumptions,
>> e.g. "modules are... a new kind of Java program component... therefore
>> [Jigsaw] treats them as such".
>
> I agree that there are other ways to represent module descriptions.
> I've never stated otherwise.
>
> If module boundaries, however they're expressed, are to be enforced by
> the compiler and the VM -- as you've agreed elsewhere -- then their
> manner of expression is inevitably going to be a topic for the Java
> language and VM specifications.  This is not avoidable.

Actually that does not logically follow.  I could, for example, 
"express" module boundaries in the VM by creating a hard equivalence 
between modules and class loaders, and in the compiler via command-line 
arguments.  In this way, no language or VM specifications must change; 
in fact only the specification of the compiler tool itself *must* 
change.  Granted it is *likely* that in such a case, one would go 
further and maybe make some small modifications to the javax.tools API 
as well to facilitate the provision of resources on a per-module basis.
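
As a minimal sketch of the module/class loader equivalence (the names 
ModuleClassLoader, importedPackages and canSee are hypothetical and not 
part of any existing API, and a real loader would also define the 
module's own classes, which is omitted here):

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical sketch: one class loader per module. A class is visible only
// if its package was explicitly wired in from an exporting module's loader.
final class ModuleClassLoader extends ClassLoader {
    // package name -> class loader of the module exporting that package
    private final Map<String, ClassLoader> importedPackages;

    ModuleClassLoader(Map<String, ClassLoader> importedPackages) {
        super(null); // parent is unused; every lookup goes through the table below
        this.importedPackages = new HashMap<>(importedPackages);
    }

    @Override
    protected Class<?> loadClass(String name, boolean resolve)
            throws ClassNotFoundException {
        int dot = name.lastIndexOf('.');
        String pkg = dot < 0 ? "" : name.substring(0, dot);
        ClassLoader exporter = pkg.equals("java.lang")
                ? ClassLoader.getSystemClassLoader() // core classes are always visible
                : importedPackages.get(pkg);
        if (exporter == null) {
            throw new ClassNotFoundException(name + " is not visible to this module");
        }
        Class<?> c = exporter.loadClass(name);
        if (resolve) {
            resolveClass(c);
        }
        return c;
    }

    // Convenience probe, for illustration only.
    boolean canSee(String className) {
        try {
            loadClass(className);
            return true;
        } catch (ClassNotFoundException e) {
            return false;
        }
    }
}
```

Here the boundary lives entirely in the wiring table handed to the 
loader, so nothing in the class files themselves has to describe it.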

>>            ... it seems to me that a significant part of the Jigsaw
>> design justification for its handling of module metadata hinges around
>> the conflation of the description of a module, and the descriptor used
>> by the static module loading implementation.  This raises a red flag for
>> me because it fundamentally locks the capabilities of module
>> descriptions to whatever makes sense to express in the descriptor, and
>> then in turn constrains these things to the language and JVM specification.
>>
>> In our (JBoss) module system, these concepts are decoupled: a filesystem
>> module's descriptor is read and parsed into a description which is then
>> consumed by the module system to create the module.  ...
>
> If module boundaries are to be enforced by the compiler and the VM then,
> in such a decoupled system, where and in what form does the compiler
> locate the information that describes the module being compiled, and any
> other modules required in order to compile that module?

You (or rather, the build tool) would tell it.

In real nontrivial systems, the universe of compiled and packaged (and 
versioned) artifacts is what you use when you're building another piece 
of the system.  The installed set of modules is necessarily a subset of 
this total universe.  Because of this, if I'm building something I 
generally cannot say "I have a build dependency upon 
org.apache.commons.collections".  I must say "Build against 
org.apache.commons.collections, version 3.2.2", and furthermore I might 
have to also specify whose build of it I use, and where to get it 
from.  And, I commonly must build different pieces of my universe 
against different versions of the same thing, especially when pieces 
come from a third party.  I will likely not use the same version I built 
against in the final system, especially in an evolving system, which I 
believe is a critical use case.  Finally, I may use the final built 
artifact in a variety of systems, whereupon it is necessary to modify 
the run time name and/or linkage information depending on the 
requirements of the target system.  I will almost certainly not even 
test the two together until a later integration testing phase, which (in 
probable addition to an ABI compatibility checking tool) will be the 
last stage before accepting the new module into the installed set.
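
To illustrate "the build tool would tell it", here is a rough sketch in 
which the build tool resolves versioned dependencies to artifact files 
and hands them to the compiler as ordinary arguments.  BuildDependency 
and CompilerArgs are hypothetical names and the artifact path is made 
up; only the javac options -classpath and -d are real:

```java
import java.io.File;
import java.util.ArrayList;
import java.util.List;

// Hypothetical: a build-time dependency, pinned to a version and a location.
final class BuildDependency {
    final String name;
    final String version;
    final String artifactPath; // where the build tool found this exact build

    BuildDependency(String name, String version, String artifactPath) {
        this.name = name;
        this.version = version;
        this.artifactPath = artifactPath;
    }
}

// Hypothetical: turns the build tool's resolved dependencies into javac arguments.
final class CompilerArgs {
    static List<String> forModule(String outputDir, List<BuildDependency> deps) {
        List<String> paths = new ArrayList<>();
        for (BuildDependency dep : deps) {
            paths.add(dep.artifactPath);
        }
        List<String> args = new ArrayList<>();
        if (!paths.isEmpty()) {
            args.add("-classpath");
            args.add(String.join(File.pathSeparator, paths));
        }
        args.add("-d");
        args.add(outputDir);
        return args;
    }
}
```

The compiler never needs to know which version will be installed later; 
it only sees what this particular build was told to use.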

In other words, it is often necessary to enforce different module 
boundaries depending on circumstances related to build, install, and 
run, independently of the rest of the source code.  Much of this 
information may not be available at all when the original artifact is 
being built.  Any bundling of module descriptor information inside of 
the module contents will have the effect of rendering that module 
useless in any module environment that is sufficiently unlike the one 
for/in which the module was built (unless you modify the module itself 
after the fact).  It is not realistic to assume that the version of a 
module that is compiled against is the version that will be run against, 
and it's misleading to "newbies" who might expect some kind of global 
cohesion between modules without any kind of limiting context (e.g. a 
module distribution, or a specific application).

The power that a modular Java gives you is that you can leverage what is 
arguably Java's best feature - the ability to easily provide perfect ABI 
stability between versions, and therefore the ability to substitute 
modules in the run-time environment - in order to maintain a growing and 
evolving ecosystem, or indeed many such ecosystems.  This is not unlike 
what many OS distributors can do, and for similar reasons in a similar 
way (though we have the sizable advantage of Java's more powerful 
linking capabilities at our disposal, which makes some hard things easy, 
and some impossible things possible).  I think that the module system 
should do what it can to enable this, and to me that means: external 
module descriptions, i.e. the compiled artifact (which consumes and/or 
provides specific ABI(s)) should be separate from the install/run 
linkage (which specifies how to tie modules together in such a way that 
all necessary ABI requirements are fulfilled).  This is because the 
artifact build dependency information is superficially similar to, but 
fundamentally not the same as, the run time linkage information - and 
this will be true no matter what we do or what constraints we place. 
The only thing we can influence in this regard is how hard we make 
these use cases to achieve.
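
A small sketch of that separation, under the simplifying assumption 
that an ABI can be identified by a plain string name.  Artifact, 
InstallLinkage and unmetRequirements are hypothetical names; the point 
is only that the wiring is supplied at install/run time and checked 
against what each artifact consumes and provides:

```java
import java.util.HashSet;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical: a compiled artifact that only knows which ABIs it consumes and provides.
final class Artifact {
    final String name;
    final Set<String> consumedAbis;
    final Set<String> providedAbis;

    Artifact(String name, Set<String> consumedAbis, Set<String> providedAbis) {
        this.name = name;
        this.consumedAbis = new HashSet<>(consumedAbis);
        this.providedAbis = new HashSet<>(providedAbis);
    }
}

// Hypothetical: install/run-time linkage, kept outside the artifacts themselves.
final class InstallLinkage {
    // wiring maps each consumed ABI name to the installed artifact chosen to satisfy it.
    // Returns the ABI names that the chosen wiring fails to satisfy.
    static Set<String> unmetRequirements(Artifact consumer, Map<String, Artifact> wiring) {
        Set<String> unmet = new TreeSet<>();
        for (String abi : consumer.consumedAbis) {
            Artifact provider = wiring.get(abi);
            if (provider == null || !provider.providedAbis.contains(abi)) {
                unmet.add(abi);
            }
        }
        return unmet;
    }
}
```

Substituting a different build or version of a provider is then a 
change to the wiring only; the consuming artifact is untouched.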

There are other reasons that I prefer this approach as well, but this is 
the reason that directly answers your question.  And there are possibly 
other approaches that would also solve the same problems - for example, 
a hybrid approach where exported and non-exported packages are listed 
and annotated within the module, but the dependence information is 
external and provided separately at build time and run time (in a manner 
suitable to each situation).
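
The hybrid variant could look roughly like this, where the export list 
travels with the packaged module and the dependency list is supplied 
from outside.  PackagedModule, ModuleDescription and 
HybridDescriptions.describe are hypothetical names chosen for 
illustration:

```java
import java.util.ArrayList;
import java.util.List;
import java.util.Set;
import java.util.TreeSet;

// Hypothetical: what is recorded inside the packaged module itself.
final class PackagedModule {
    final String name;
    final Set<String> exportedPackages;

    PackagedModule(String name, Set<String> exportedPackages) {
        this.name = name;
        this.exportedPackages = new TreeSet<>(exportedPackages);
    }
}

// Hypothetical: the full description the module system actually consumes.
final class ModuleDescription {
    final String name;
    final Set<String> exports;
    final List<String> dependencies;

    ModuleDescription(String name, Set<String> exports, List<String> dependencies) {
        this.name = name;
        this.exports = exports;
        this.dependencies = dependencies;
    }
}

final class HybridDescriptions {
    // Exports come from the module; dependencies come from the build or
    // run-time environment and may differ between the two.
    static ModuleDescription describe(PackagedModule module,
                                      List<String> externalDependencies) {
        return new ModuleDescription(
                module.name,
                new TreeSet<>(module.exportedPackages),
                new ArrayList<>(externalDependencies));
    }
}
```

The same packaged module can then be described once for the build and 
again, with different dependencies, for each target system.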

-- 
- DML