Concerns mapping existing dynamic module systems to a 1-1 Module->Layer
Thomas Watson
tjwatson at us.ibm.com
Wed Mar 8 14:09:11 UTC 2017
There have been several topics recently discussing concerns in one context
or another when trying to use JPMS with an existing module system (like
OSGi or JBoss Modules) where the existing module system supports dynamic
modules.
At a high level, the solution involves a 1-1 mapping for Module->Layer.
This allows such a dynamic module system to make use of the dynamic Layer
creation to support the existing module system.
From my experience there are performance concerns that fall into three
categories when using a 1-1 mapping between the Module and
Layer.
1) The need to discover private packages for the ModuleDescriptor. Not
specifically related to the 1-1 mapping but required to correctly create
the Module for the Layer. The extra work to scan for private packages
could be handled outside of JPMS with some kind of caching within the
dynamic module system.
2) The need to aggressively create every class loader for every module
when creating a layer. The Layer API requires the layer management code
to supply a function to map a module name to a class loader and this
function is called for each module included in the Layer at construction
time. This eliminates any possibility of lazy class loader creation.
Creating a class loader is not as expensive as it used to be on older
VMs, but it is measurable, particularly if you need to aggressively create
1000s of them all up front. In Equinox we delay class loader creation
until just before the first class load for the bundle. This layer
function eliminates that possibility.
3) 1-1 mapping of Module->Layer means that we have to do a separate
JPMS resolution for each and every module in our system. This JPMS
resolution has no value to our existing module systems because we already
did the resolution outside of JPMS. We are just trying to map our
resolution into JPMS. This design forces us to double resolve. I need to
adjust my POC to see how expensive it is to do the resolution in many
Layers vs one Layer (at the cost of no dynamics). This would not be a
real solution, it would just pinpoint that the overhead can be attributed
to the extra JPMS resolution we have to do for each Layer we have to
create. Right now resolution of the Layer is taking nearly 1ms per Layer
(that is not counting the time it takes to create the class loaders).
Altogether the overhead appears to be 4-5 ms per module (on a pretty
snappy machine). Caching the discovery of private packages does cut that
by at least half. Most of the remaining overhead can be attributed to the
fact that we are using 1-1 Module->Layer mappings in order to support
dynamics as well as hiding our own resolution wires that do not fit into
the strict rules of JPMS when dealing with split packages or cycles.
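To make points 2 and 3 concrete, here is a minimal sketch (not the Equinox
POC) of a 1-1 Module->Layer mapping. It assumes the final Java 9 class
names (java.lang.ModuleLayer, known as java.lang.reflect.Layer in earlier
builds). The module names, the ".internal" package and the helper methods
are hypothetical, and the "bundles" are empty in-memory descriptors. Each
module gets its own Configuration.resolve call and its own defineModules
call, and the loader function has to return a real class loader during
that call, so nothing can be deferred until the first class load.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReader;
import java.lang.module.ModuleReference;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Optional;
import java.util.Set;

class EagerLoaderSketch {

    // Hypothetical stand-in for a bundle: a descriptor that lists one
    // private package and has no class files behind it.
    static ModuleReference emptyModule(String name) {
        ModuleDescriptor descriptor = ModuleDescriptor.newModule(name)
                .packages(Set.of(name + ".internal"))
                .build();
        return new ModuleReference(descriptor, null) {
            @Override
            public ModuleReader open() {
                throw new UnsupportedOperationException("sketch has no content");
            }
        };
    }

    // In-memory finder over the given module names.
    static ModuleFinder finderOf(List<String> names) {
        Map<String, ModuleReference> refs = new HashMap<>();
        for (String name : names) {
            refs.put(name, emptyModule(name));
        }
        return new ModuleFinder() {
            @Override
            public Optional<ModuleReference> find(String requested) {
                return Optional.ofNullable(refs.get(requested));
            }

            @Override
            public Set<ModuleReference> findAll() {
                return new HashSet<>(refs.values());
            }
        };
    }

    // Creates one layer per module and returns how many class loaders
    // exist afterwards, before any class has been loaded.
    static int loadersCreatedUpFront(int moduleCount) {
        ModuleLayer boot = ModuleLayer.boot();
        Map<String, ClassLoader> loaders = new HashMap<>();
        for (int i = 0; i < moduleCount; i++) {
            String name = "sketch.bundle" + i;
            // One JPMS resolution per module (the "double resolve").
            Configuration cf = boot.configuration()
                    .resolve(finderOf(List.of(name)), ModuleFinder.of(), Set.of(name));
            // The loader function is invoked while the layer is defined,
            // so a loader has to exist right now.
            ModuleLayer.defineModules(cf, List.of(boot),
                    n -> loaders.computeIfAbsent(n, k -> new ClassLoader() { }));
        }
        return loaders.size();
    }

    public static void main(String[] args) {
        System.out.println(loadersCreatedUpFront(100));
    }
}
```

The loaders are cached with computeIfAbsent because the API does not
promise to call the function only once per module. Timing the loop body
with and without the resolve call would be one way to separate the
resolution cost from the loader creation cost.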
Most of the additional methods David Lloyd is suggesting are to get the
functionality needed to do the complete mapping of existing module systems
into JPMS. Essentially I think we are trying to work with JPMS in order
to solve the following requirement [1]:
Interoperation — It must be possible for another module system, such as
OSGi, to locate Java modules and resolve them using its own resolver,
except possibly for core system modules.
While the solution we have today achieves this to some extent it is not
obvious to me that we have done this in the most useful of ways. In
particular, all the additions David is suggesting are being resisted
because they potentially are exposing internal implementation details of
the JVM which is not desired by the JVM team. To be a complete solution
for the JBoss case all (most?) of the methods David suggests are needed.
As I stated already, OSGi really only needs the addPackage and
addExportsToAll methods to be complete. But this still requires double
resolution work to get us back to the functionality we were at with Java 8
and that is only if the suggested methods make it into Java 9.
Taking a certain interpretation of the above requirement seems to imply
that we should be able to have a solution that does not require double
resolution at the JPMS Layer. It seems to imply that we should be able
to use our own resolver to wire the modules for our own module system
however we see fit. But I don't think that is really what we got with the
current solution which requires JPMS module resolution and then us using
the controller to provide the read edges required by our module system
class loaders. I was not in on the early discussions of JPMS, but was
there any thread of discussion about an approach that provided a
pre-resolved graph to JPMS? It seems like such a solution would naturally
fit into the spirit of the interoperation requirement. I acknowledge that
such an idea is a drastic departure from the current design and I don't see
how the required changes could be contained in any reasonable way.
I think we are left with the Layer Controller now for better or worse. I
hope the overhead of creating and resolving 100s of Layers can be
optimized in the future so we can avoid having to do this largely
unnecessary resolution work when dealing with existing module systems.
But if the Controller is what we are left with I think the solution needs
to be as complete as possible for the use cases it is used for, namely
mapping JBoss Modules and OSGi into JPMS.
The next suggestion could be to implement this by always hiding our module
system requirements. That should speed up resolution because there is
nothing for JPMS to resolve. I would have to confirm this actually helps,
but I would rather not do this when OSGi resolution wires can fit into the
JPMS rules (i.e. do not contain any cycles or split packages). Here I
would rather have a correctly reflected resolution of the Modules within
the hierarchical Layer graph when it is possible, and only do the addReads
edges trick post Layer resolution when we actually have cycles or split
packages to deal with. The reason this is important is because it allows
us to more naturally fit into JPMS when it is possible. For example, I
think that is the easy way for us to ensure transitive dependencies work
when we are mapping require-bundle (with visibility reexport) into JPMS
modules which other real JPMS modules can depend on. This way JPMS module
resolution for the regular JPMS layers works as usual for requires
transitive.
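As a rough illustration of the "addReads edges trick", the sketch below
hides the requirements of two modules from JPMS (neither descriptor
requires the other), puts each module in its own Layer, and then adds a
cyclic pair of read edges through the Layer Controllers after resolution.
It assumes the final Java 9 class names (java.lang.ModuleLayer and
ModuleLayer.Controller); the module names sketch.a and sketch.b and the
helper methods are hypothetical, and the modules have no content.

```java
import java.lang.module.Configuration;
import java.lang.module.ModuleDescriptor;
import java.lang.module.ModuleFinder;
import java.lang.module.ModuleReader;
import java.lang.module.ModuleReference;
import java.util.List;
import java.util.Optional;
import java.util.Set;

class HiddenWiresSketch {

    // Finder for a single empty module whose descriptor declares no
    // requirements other than the implicit java.base.
    static ModuleFinder singleModule(String name) {
        ModuleDescriptor descriptor = ModuleDescriptor.newModule(name).build();
        ModuleReference ref = new ModuleReference(descriptor, null) {
            @Override
            public ModuleReader open() {
                throw new UnsupportedOperationException("sketch has no content");
            }
        };
        return new ModuleFinder() {
            @Override
            public Optional<ModuleReference> find(String requested) {
                return name.equals(requested) ? Optional.of(ref) : Optional.empty();
            }

            @Override
            public Set<ModuleReference> findAll() {
                return Set.of(ref);
            }
        };
    }

    // Resolves the module on its own and defines it in its own layer.
    static ModuleLayer.Controller defineOwnLayer(String name) {
        ModuleLayer boot = ModuleLayer.boot();
        Configuration cf = boot.configuration()
                .resolve(singleModule(name), ModuleFinder.of(), Set.of(name));
        ClassLoader loader = new ClassLoader() { };
        return ModuleLayer.defineModules(cf, List.of(boot), n -> loader);
    }

    // Returns { anyReadEdgeBefore, cycleReadableAfter }.
    static boolean[] cycleBeforeAndAfter() {
        ModuleLayer.Controller ca = defineOwnLayer("sketch.a");
        ModuleLayer.Controller cb = defineOwnLayer("sketch.b");
        Module a = ca.layer().findModule("sketch.a").get();
        Module b = cb.layer().findModule("sketch.b").get();

        // JPMS resolved nothing between a and b, so no read edge exists yet.
        boolean before = a.canRead(b) || b.canRead(a);

        // The external resolver's wires, including the cycle, are added
        // after the fact through each layer's controller.
        ca.addReads(a, b);
        cb.addReads(b, a);
        boolean after = a.canRead(b) && b.canRead(a);

        return new boolean[] { before, after };
    }

    public static void main(String[] args) {
        boolean[] result = cycleBeforeAndAfter();
        System.out.println("before=" + result[0] + " after=" + result[1]);
    }
}
```

When the wires contain no cycles or split packages, the alternative
preferred above would be to put real requires clauses in the descriptors
and let JPMS resolution produce the read edges itself.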
My TODO here is still to prove that the double resolution and required
extra Layer per Module are contributing to the extra overhead. I suspect
that they are, but I guess it could be possible that just creating and
resolving 100s of Modules in one Layer also has the same overhead.
Tom
[1]
http://openjdk.java.net/projects/jigsaw/spec/reqs/2015-04-01#interoperation