Configurations... (Was: Re: Module-system requirements)

Tue Mar 17 13:02:21 UTC 2015

On 03/16/2015 03:44 PM, mark.reinhold at oracle.com wrote:
> 2015/3/11 6:38 -0700, david.lloyd at redhat.com:
>> ...
>>
>> Maybe what we need is a clearer definition of "Configuration" as it is
>> only defined as a side-effect of Resolution: "The resulting
>> configuration contains all of the code and data required to compile or
>> run the initial module."   And Linking is AFAICT the first section that
>> implies that Configurations actually contain modules in any way.  This
>> might help me understand what a tree topology actually means.
>
> A configuration is the result of resolution and service binding.  You
> can think of it as a graph, with modules for nodes, edges for module
> dependences, and edges (of a different color) for service-provider
> bindings.
>
> I'm pretty sure that a configuration, in this sense, is not analogous
> to what you call a module loader in your system.  A configuration can
> be loaded by a single class loader, multiple class loaders, or even one
> class loader per module, but the module system shouldn't mandate any
> particular approach.

In our case the module loader is the instigator (not the result) of
resolution.  Resolution happens lazily, very much like class loading,
so its result is completely abstract and is never represented
programmatically.  This also means that there is no single resolve step
at run time.  Instead, each module is linked as it is loaded, using a
relatively simple traversal algorithm that works regardless of which
module loader contains the dependency being linked in; this is what
allows the system to support an unbounded number of modules.  Any
expensive or complex resolution processes are part of build or
distribution.
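
To make the lazy approach concrete, here is a minimal sketch in Java.
All of the names (LazyModuleLoader, ModuleSpec, LinkedModule) are
hypothetical and this is not our actual API; it only assumes that a
module names its dependencies and that loading a module is what
triggers linking.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical description of a module that *could* be loaded.
final class ModuleSpec {
    final String name;
    final List<String> dependencies;

    ModuleSpec(String name, List<String> dependencies) {
        this.name = name;
        this.dependencies = dependencies;
    }
}

// Hypothetical module that has been loaded and linked.
final class LinkedModule {
    final String name;
    final List<LinkedModule> dependencies = new ArrayList<>();

    LinkedModule(String name) {
        this.name = name;
    }
}

final class LazyModuleLoader {
    private final Map<String, ModuleSpec> specs = new HashMap<>();
    private final Map<String, LinkedModule> loaded = new HashMap<>();

    void define(String name, String... deps) {
        specs.put(name, new ModuleSpec(name, Arrays.asList(deps)));
    }

    // There is no up-front resolve step: a module is linked when it is
    // first requested, by walking only its own dependencies.
    LinkedModule load(String name) {
        LinkedModule existing = loaded.get(name);
        if (existing != null) {
            return existing;
        }
        ModuleSpec spec = specs.get(name);
        if (spec == null) {
            throw new IllegalArgumentException("No such module: " + name);
        }
        LinkedModule module = new LinkedModule(name);
        loaded.put(name, module); // register first so cycles terminate
        for (String dep : spec.dependencies) {
            module.dependencies.add(load(dep));
        }
        return module;
    }

    int loadedCount() {
        return loaded.size();
    }
}

final class LazyLinkingDemo {
    public static void main(String[] args) {
        LazyModuleLoader loader = new LazyModuleLoader();
        loader.define("app", "util");
        loader.define("util");
        loader.define("unused", "util");
        loader.load("app");
        // "unused" is defined but never requested, so it is never linked.
        System.out.println(loader.loadedCount()); // prints 2
    }
}
```

The cost of loading "app" depends only on what "app" reaches, not on
how many modules are defined, which is the property I care about.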

> If you build a dynamic configuration on top of the initial configuration,
> and then another on top of that, as you might do in something like an
> application server, then the union of their graphs will be a DAG (at
> least) but the configurations themselves are just in a list, somewhat
> like a search path.  You can imagine having multiple such lists sharing
> prefixes, i.e., a tree of configurations, but to resolve a dependence in
> any particular configuration you only need to consult its parent
> configurations.

OK, but if (say) a dynamic configuration can override service-provider 
bindings, then how does a parent configuration know to use the 
overridden bindings for services it provides?  I can't imagine any 
scheme at all which would necessitate that the entire configuration be a 
list.  Maybe conflating module dependencies and service provider 
configuration into one "thing" is an error?
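
Here is a small Java sketch of the problem as I understand it.  The
Configuration class and its methods are hypothetical, and the lookup
rule (own bindings first, then the parent chain) is my assumption about
what "consult its parent configurations" means; nothing in the
requirements document specifies it.

```java
import java.util.HashMap;
import java.util.Map;

// Hypothetical configuration that only knows about its parent chain.
final class Configuration {
    private final Configuration parent;
    // service name -> name of the module bound as its provider
    private final Map<String, String> serviceBindings = new HashMap<>();

    Configuration(Configuration parent) {
        this.parent = parent;
    }

    void bind(String service, String providerModule) {
        serviceBindings.put(service, providerModule);
    }

    // Assumed rule: look here first, then walk up the parents.
    String providerFor(String service) {
        for (Configuration c = this; c != null; c = c.parent) {
            String provider = c.serviceBindings.get(service);
            if (provider != null) {
                return provider;
            }
        }
        return null;
    }
}

final class BindingOverrideDemo {
    public static void main(String[] args) {
        Configuration initial = new Configuration(null);
        initial.bind("logging", "jdk.logger");

        Configuration dynamic = new Configuration(initial);
        dynamic.bind("logging", "app.logger"); // override in the child

        // Code resolved against the child sees the override...
        System.out.println(dynamic.providerFor("logging")); // app.logger
        // ...but code resolved against the parent cannot see it.
        System.out.println(initial.providerFor("logging")); // jdk.logger
    }
}
```

Under that assumption a module in the initial configuration never
learns about the child's binding, because a parent has no reference to
its children.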

> If configurations can be related in a more general way, i.e., in a DAG,
> then the resolution algorithm becomes more complex.  Do we actually need
> that?  I've yet to see any use cases.

I can't speak to framing this in terms of configurations, because we
don't do that.  But I think that modules from one system should be able
to express dependencies on other modules from any other system
regardless of their relationship, at least on a programmatic basis.
This is necessary if you want two completely different module systems
to interact: you cannot always assume that (for example) one namespace
can just be polluted with names from another (parent) namespace, nor
can you assume that a given configuration would only ever want to
interact (link) with one other configuration.  This is akin to the
logic of moving away from a strict class loader hierarchy.

We made this mistake early on, and fixing it resulted in a much better
system, with a more logical arrangement between OSGi, EE, the JDK,
distribution modules, extension modules, and filesystem JAR modules,
free of what ended up being a pretty arbitrary constraint.  Each module
loader interacted with the others in specific and completely separate
ways (which also greatly helps in terms of letting the container bind
module dependencies with service dependencies, to ensure that things
are actually started in the right order).

For example, if I want to link in a filesystem JAR, I use Class-Path
(in the MANIFEST).  However, the Extension-List mechanism is totally
different (and has a different namespace), our distribution module
Dependencies mechanism is different again, and OSGi is *completely*
different in every possible way.  So there was really no reason to
force all of these resolution strategies to go down the same pipe.
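
As a rough illustration of what I mean by loaders that are not
arranged in a hierarchy, here is a Java sketch.  The types
(ModuleLoader, NamespaceLoader, Dependency, LoadedModule) and the
"dist"/"ext" namespaces are made up for this example and are not our
real API; the only point is that a dependency names both a loader and
a module, so neither loader has to be the other's parent.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

// Hypothetical: anything that can produce a module by name.
interface ModuleLoader {
    LoadedModule load(String name);
}

final class LoadedModule {
    final String name;
    final List<LoadedModule> dependencies = new ArrayList<>();

    LoadedModule(String name) {
        this.name = name;
    }
}

// A dependency points at a specific loader, not at "the parent".
final class Dependency {
    final ModuleLoader loader;
    final String name;

    Dependency(ModuleLoader loader, String name) {
        this.loader = loader;
        this.name = name;
    }
}

// One loader per namespace; each could use its own strategy.
final class NamespaceLoader implements ModuleLoader {
    private final String namespace;
    private final Map<String, List<Dependency>> specs = new HashMap<>();
    private final Map<String, LoadedModule> loaded = new HashMap<>();

    NamespaceLoader(String namespace) {
        this.namespace = namespace;
    }

    void define(String name, Dependency... deps) {
        specs.put(name, Arrays.asList(deps));
    }

    @Override
    public LoadedModule load(String name) {
        LoadedModule existing = loaded.get(name);
        if (existing != null) {
            return existing;
        }
        List<Dependency> deps = specs.get(name);
        if (deps == null) {
            throw new IllegalArgumentException(namespace + ": no module " + name);
        }
        LoadedModule module = new LoadedModule(namespace + ":" + name);
        loaded.put(name, module);
        for (Dependency d : deps) {
            // Delegate to whichever loader owns the dependency.
            module.dependencies.add(d.loader.load(d.name));
        }
        return module;
    }
}

final class CrossLoaderDemo {
    public static void main(String[] args) {
        NamespaceLoader dist = new NamespaceLoader("dist");
        NamespaceLoader ext = new NamespaceLoader("ext");
        dist.define("base");
        ext.define("codec", new Dependency(dist, "base"));
        dist.define("app", new Dependency(ext, "codec"));

        LoadedModule app = dist.load("app");
        LoadedModule codec = app.dependencies.get(0);
        System.out.println(codec.name);                     // ext:codec
        System.out.println(codec.dependencies.get(0).name); // dist:base
    }
}
```

Here "dist" depends on "ext" and "ext" depends back on "dist", which a
strict parent/child list cannot express without merging the two
namespaces.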

In any event, it's not really clear to me what makes the resolution
algorithm more complex in graph vs. list form; as yet we don't really
have a description of what the algorithm you have in mind actually
does, and how and why it does it.  So there's probably a more in-depth
discussion to be had.  But if there's a way for any configuration to
contain module dependencies on any other configuration, then maybe the
constraint in practice applies only to service resolution and I'm
worried about nothing.

-- 
- DML