Concerns mapping existing dynamic module systems to a 1-1 Module->Layer

Wed Mar 8 14:09:11 UTC 2017

There have been several topics recently discussing concerns in one context 
or another when trying to use JPMS with an existing module system (like 
OSGi or JBoss Modules) where the existing module system supports dynamic 
modules.

At a high level, the solution involves a 1-1 mapping for Module->Layer.
This allows such a dynamic module system to make use of dynamic Layer
creation to support the existing module system.

From my experience there are some performance concerns that come in three 
different categories when using a 1-1 mapping between the Module and 
Layer.

1) The need to discover private packages for the ModuleDescriptor.  Not 
specifically related to the 1-1 mapping but required to correctly create 
the Module for the Layer.  The extra work to scan for private packages 
could be handled outside of JPMS with some kind of caching within the 
dynamic module system.

2) The need to aggressively create every class loader for every module 
when creating a layer.  The Layer API requires the layer management code 
to supply a function to map a module name to a class loader and this 
function is called for each module included in the Layer at construction 
time.  This eliminates any possibility of lazy class loader creation.
Creating a class loader is not as expensive as it used to be on older
VMs, but it is measurable, particularly if you need to aggressively create 
1000s of them all up front.  In Equinox we delay class loader creation 
until just before the first class load for the bundle.  This layer 
function eliminates that possibility.

3) 1-1 mapping of Module->Layer means that we have to do a separate
JPMS resolution for each and every module in our system.  This JPMS 
resolution has no value to our existing module systems because we already 
did the resolution outside of JPMS.  We are just trying to map our 
resolution into JPMS.  This design forces us to double resolve.  I need to 
adjust my POC to see how expensive it is to do the resolution in many 
Layers vs one Layer (at the cost of no dynamics).  This would not be a
real solution; it would just pinpoint whether the overhead can be
attributed to the extra JPMS resolution we have to do for each Layer we
have to create.  Right now resolution of the Layer is taking nearly 1ms
per Layer (that is not counting the time it takes to create the class
loaders).
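
To make points 2 and 3 concrete, here is a minimal sketch of the
layer-per-module pattern.  It is written against the API as it later
shipped in Java 9, where Layer became java.lang.ModuleLayer.  The class
name, the helper names and the empty synthetic modules are my own
assumptions for illustration; this is not the Equinox code.  It records
which module names the class loader function has been asked about by the
time the layers have been defined, before any class is loaded.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReader;
import java.lang.module.ModuleReference;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;
import java.util.concurrent.ConcurrentHashMap;

// Hypothetical sketch: one layer per dynamic module, recording when JPMS
// asks for each module's class loader.
final class LayerPerModuleSketch {

    // Minimal in-memory finder for a single synthetic module with no content.
    static ModuleFinder finderFor(ModuleDescriptor descriptor) {
        ModuleReference ref = new ModuleReference(descriptor, null) {
            @Override
            public ModuleReader open() {
                throw new UnsupportedOperationException("no content in this sketch");
            }
        };
        return new ModuleFinder() {
            @Override
            public Optional<ModuleReference> find(String name) {
                return name.equals(descriptor.name()) ? Optional.of(ref) : Optional.empty();
            }

            @Override
            public Set<ModuleReference> findAll() {
                return Set.of(ref);
            }
        };
    }

    // Creates one layer per name and returns the names whose class loader
    // had been requested once all layers were defined.
    static Set<String> loadersRequestedAtLayerCreation(List<String> names) {
        Map<String, ClassLoader> loaders = new ConcurrentHashMap<>();
        ModuleLayer boot = ModuleLayer.boot();
        for (String name : names) {
            ModuleDescriptor descriptor =
                    ModuleDescriptor.newModule(name).requires("java.base").build();
            // A separate JPMS resolution for every dynamic module.
            Configuration cf = boot.configuration().resolve(
                    finderFor(descriptor), ModuleFinder.of(), Set.of(name));
            // The mapping function runs during defineModules, so the loader
            // has to exist here; it cannot wait for the first class load.
            ModuleLayer.defineModules(cf, List.of(boot),
                    n -> loaders.computeIfAbsent(n, k -> new ClassLoader() { }));
        }
        return Set.copyOf(loaders.keySet());
    }

    public static void main(String[] args) {
        System.out.println(
                loadersRequestedAtLayerCreation(List.of("bundle.a", "bundle.b")));
    }
}
```

The map is there because the function may be applied more than once for
the same module name, so the sketch hands back the same loader each time
rather than creating a new one per call.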

Altogether the overhead appears to be 4-5 ms per module (on a pretty 
snappy machine).  Caching the discovery of private packages does cut that 
by at least half.  Most of the remaining overhead can be attributed to the 
fact that we are using 1-1 Module->Layer mappings in order to support 
dynamics as well as hiding our own resolution wires that do not fit into 
the strict rules of JPMS when dealing with split packages or cycles.
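
As a rough illustration of the private package caching, here is a sketch
that scans an exploded bundle directory once and reuses the result when
building the ModuleDescriptor.  Everything in it (the class name, keying
the cache on the bundle root path, keeping the cache in memory) is an
assumption made for the example; a real implementation would presumably
persist the cache and key it on something like bundle id and version.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.lang.module.ModuleDescriptor;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Map;
import java.util.Objects;
import java.util.Set;
import java.util.StringJoiner;
import java.util.concurrent.ConcurrentHashMap;
import java.util.stream.Collectors;
import java.util.stream.Stream;

// Hypothetical sketch: discover all packages of a bundle once and cache them.
final class PackageScanCache {

    private final Map<Path, Set<String>> cache = new ConcurrentHashMap<>();

    // Returns the cached package set, scanning only on the first request.
    Set<String> packagesOf(Path bundleRoot) {
        return cache.computeIfAbsent(bundleRoot, PackageScanCache::scan);
    }

    // Walks the directory tree and derives a package name from the
    // directory of every .class file.
    static Set<String> scan(Path bundleRoot) {
        try (Stream<Path> files = Files.walk(bundleRoot)) {
            return files
                    .filter(Files::isRegularFile)
                    .filter(p -> p.toString().endsWith(".class"))
                    .map(p -> bundleRoot.relativize(p).getParent())
                    .filter(Objects::nonNull) // skip files in the root directory
                    .map(PackageScanCache::toPackageName)
                    .collect(Collectors.toUnmodifiableSet());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    static String toPackageName(Path relativeDir) {
        StringJoiner joiner = new StringJoiner(".");
        relativeDir.forEach(part -> joiner.add(part.toString()));
        return joiner.toString();
    }

    // Builds a descriptor that exports the given packages and also declares
    // every other (private) package found in the bundle.
    ModuleDescriptor describe(String name, Path bundleRoot, Set<String> exported) {
        ModuleDescriptor.Builder builder =
                ModuleDescriptor.newModule(name).requires("java.base");
        exported.forEach(pn -> builder.exports(pn));
        builder.packages(packagesOf(bundleRoot));
        return builder.build();
    }
}
```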

Most of the additional methods David Lloyd is suggesting are there to get
the functionality needed to do the complete mapping of existing module
systems
into JPMS.  Essentially I think we are trying to work with JPMS in order 
to solve the following requirement [1]:

Interoperation — It must be possible for another module system, such as 
OSGi, to locate Java modules and resolve them using its own resolver, 
except possibly for core system modules.

While the solution we have today achieves this to some extent it is not 
obvious to me that we have done this in the most useful of ways.  In 
particular, all the additions David is suggesting are being resisted 
because they potentially are exposing internal implementation details of 
the JVM which is not desired by the JVM team.  To be a complete solution 
for the JBoss case all (most?) of the methods David suggests are needed.
As I stated already, OSGi really only needs the addPackage and
addExportsToAll methods to be complete.  But this still requires double
resolution work to get us back to the functionality we had with Java 8,
and that is only if the suggested methods make it into Java 9.

A certain interpretation of the above requirement seems to imply
that we should be able to have a solution that does not require double
resolution at the JPMS Layer.  It seems to imply that we should be able
to use our own resolver to wire the modules for our own module system
however we see fit.  But I don't think that is really what we got with the
current solution, which requires JPMS module resolution and then us using
the controller to provide the read edges required by our module system
class loaders.  I was not in on the early discussions of JPMS, but was
there any thread of discussion about an approach that provided a 
pre-resolved graph to JPMS?  It seems like such a solution would naturally 
fit into the spirit of the interoperation requirement.  I acknowledge that
such an idea is a drastic turn from the current design and I don't see how
the changes could be contained in any reasonable way.

I think we are left with the Layer Controller now for better or worse.  I 
hope the overhead of creating and resolving 100s of Layers can be 
optimized in the future so we can avoid having to do this largely 
unnecessary resolution work when dealing with existing module systems.
But if the Controller is what we are left with I think the solution needs
to be as complete as possible for the use cases it is used for, namely
mapping JBoss Modules and OSGi into JPMS.

The next suggestion could be to implement this by always hiding our module
system requirements; that should speed up resolution because there is
nothing for JPMS to resolve.  I would have to confirm this actually helps,
but I would rather not do this when OSGi resolution wires can fit into the
JPMS rules (i.e. do not contain any cycles or split packages).  Here I
would rather have a correctly reflected resolution of the Modules within 
the hierarchical Layer graph when it is possible, and only do the addReads 
edges trick post Layer resolution when we actually have cycles or split 
packages to deal with.  The reason this is important is because it allows 
us to more naturally fit into JPMS when it is possible.  For example, I 
think that is the easy way for us to ensure transitive dependencies work 
when we are mapping require-bundle (with visibility reexport) into JPMS 
modules which other real JPMS modules can depend on.  This way JPMS module 
resolution for the regular JPMS layers works as usual for requires 
transitive.
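
For reference, the addReads trick for the cyclic case looks roughly like
the sketch below, again against the shipped Java 9 names
(ModuleLayer.Controller.addReads).  The class and helper names and the
two empty synthetic modules are assumptions for illustration.  Each module
hides its requirements from JPMS, gets its own Layer, and the read edges
(here a cycle that JPMS resolution would reject) are added afterwards.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReader;
import java.lang.module.ModuleReference;
import java.util.List;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: hide the real wires from JPMS resolution and add
// them as read edges through the layer Controller afterwards.
final class ReadEdgeSketch {

    // Defines an empty module that only requires java.base in its own layer.
    static ModuleLayer.Controller defineInOwnLayer(String name) {
        ModuleDescriptor descriptor =
                ModuleDescriptor.newModule(name).requires("java.base").build();
        ModuleReference ref = new ModuleReference(descriptor, null) {
            @Override
            public ModuleReader open() {
                throw new UnsupportedOperationException("no content in this sketch");
            }
        };
        ModuleFinder finder = new ModuleFinder() {
            @Override
            public Optional<ModuleReference> find(String n) {
                return n.equals(name) ? Optional.of(ref) : Optional.empty();
            }

            @Override
            public Set<ModuleReference> findAll() {
                return Set.of(ref);
            }
        };
        ModuleLayer boot = ModuleLayer.boot();
        Configuration cf = boot.configuration()
                .resolve(finder, ModuleFinder.of(), Set.of(name));
        ClassLoader loader = new ClassLoader() { };
        return ModuleLayer.defineModules(cf, List.of(boot), n -> loader);
    }

    // Wires bundle.a <-> bundle.b after both layers exist.
    static Module[] wireCycle() {
        ModuleLayer.Controller ca = defineInOwnLayer("bundle.a");
        ModuleLayer.Controller cb = defineInOwnLayer("bundle.b");
        Module a = ca.layer().findModule("bundle.a")
                .orElseThrow(IllegalStateException::new);
        Module b = cb.layer().findModule("bundle.b")
                .orElseThrow(IllegalStateException::new);
        if (a.canRead(b) || b.canRead(a)) {
            throw new IllegalStateException("no read edges expected before wiring");
        }
        // Each controller may only update modules in its own layer.
        ca.addReads(a, b);
        cb.addReads(b, a);
        return new Module[] { a, b };
    }
}
```

The downside discussed above still applies: the Configuration of each
Layer knows nothing about these edges, so the resolved graph does not
reflect the real wiring.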

My TODO here is still to prove that the double resolution and required
extra Layer per Module are contributing to the extra overhead.  I suspect
that they are, but I guess it could be possible that just creating and
resolving 100s of Modules in one Layer also has the same overhead.
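
The comparison I have in mind could be sketched along these lines, using
the shipped Java 9 names.  This is only an outline under my own
assumptions (class name, helper names, empty synthetic modules with no
packages and no wires between them), so any numbers it prints will
understate what a real set of bundles costs; it is not the POC itself.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReader;
import java.lang.module.ModuleReference;
import java.util.ArrayList;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

// Hypothetical sketch: time N modules in one layer vs one layer per module.
final class ResolutionCostSketch {

    static ModuleFinder finderFor(List<String> names) {
        Map<String, ModuleReference> refs = new HashMap<>();
        for (String name : names) {
            ModuleDescriptor descriptor =
                    ModuleDescriptor.newModule(name).requires("java.base").build();
            refs.put(name, new ModuleReference(descriptor, null) {
                @Override
                public ModuleReader open() {
                    throw new UnsupportedOperationException("no content in this sketch");
                }
            });
        }
        return new ModuleFinder() {
            @Override
            public Optional<ModuleReference> find(String name) {
                return Optional.ofNullable(refs.get(name));
            }

            @Override
            public Set<ModuleReference> findAll() {
                return new HashSet<>(refs.values());
            }
        };
    }

    // Resolves the given modules and defines them in a single new layer,
    // with one class loader per module.
    static ModuleLayer define(List<String> names) {
        ModuleLayer boot = ModuleLayer.boot();
        Configuration cf = boot.configuration()
                .resolve(finderFor(names), ModuleFinder.of(), names);
        Map<String, ClassLoader> loaders = new HashMap<>();
        return ModuleLayer.defineModules(cf, List.of(boot),
                n -> loaders.computeIfAbsent(n, k -> new ClassLoader() { })).layer();
    }

    static long nanosForOneLayer(List<String> names) {
        long start = System.nanoTime();
        define(names);
        return System.nanoTime() - start;
    }

    static long nanosForLayerPerModule(List<String> names) {
        long start = System.nanoTime();
        for (String name : names) {
            define(List.of(name));
        }
        return System.nanoTime() - start;
    }

    public static void main(String[] args) {
        List<String> names = new ArrayList<>();
        for (int i = 0; i < 500; i++) {
            names.add("bundle.n" + i);
        }
        System.out.println("one layer:        " + nanosForOneLayer(names) + " ns");
        System.out.println("layer per module: " + nanosForLayerPerModule(names) + " ns");
    }
}
```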

Tom

[1] 
http://openjdk.java.net/projects/jigsaw/spec/reqs/2015-04-01#interoperation