Proposal: #NonHierarchicalLayers (+ #LayerPrimitives)

Mon Mar 6 20:35:49 UTC 2017

2017/3/6 8:08:19 -0800, tjwatson at us.ibm.com:
> 2017/3/5 18:02:44 -0800, mark.reinhold at oracle.com:
>> 2017/2/27 17:47:25 +0000, tjwatson at us.ibm.com:
>>> ...
>>> 
>>> The addPackage method would be beneficial for other reasons when adapting
>>> an existing module system to a JPMS Layer.
>>> 
>>>  - addPackage would allow existing module systems to avoid aggressive
>>> discovery of all private packages for its own modules when building a JPMS
>>> ModuleDescriptor.  For the existing module systems I know about the class
>>> loader is used extensively as the module primitive.  Requirements wire up
>>> the class loaders for proper class loader delegation and typically the
>>> APIs are declared capabilities to indicate what the modular class loaders
>>> can load and expose to another module class loader.  But nowhere was it
>>> required to aggressively discover all private packages and resources up
>>> front.  Having an addPackage would allow such module systems to grow the
>>> list of private packages lazily as it is defining classes in the packages
>>> that are private to the module.
>> 
>> Understood, but for context: I previously suggested that you could use
>> the `Private-Package` manifest header added by bndtools, when present,
>> in order to avoid scanning a bundle for private packages [1].  In reply
>> you wrote that you planned to use that approach for bundles built by
>> bndtools, and that "even for the bundles where we have to scan for all
>> packages the framework can easily cache the private-package information
>> so that upon restart the scan is not needed again" [2].  So, I see how an
>> `addPackage` method could help OSGi interoperation, but it doesn't appear
>> to be a hard requirement.
>> 
>> (As a side note, scanning a JAR file to determine its private packages
>> is pretty fast these days, even without caching.  That's what we do for
>> automatic modules in the JPMS implementation, and so far it hasn't been
>> a significant performance problem.)
> 
> The Private-Package cannot be relied on from BND.  A very large number of
> existing bundles are not built with BND.  The option to have
> Private-Package header in BND can be turned off.  In many cases the
> exhaustive scan of the jars will be needed.  As a data point, to scan my
> current Eclipse installation (~360 bundles at well over 310 MB of jars) it
> is taking over 1000 ms to do the full scan (on a pretty new MacBook Pro).
> 
> Some optimization may speed this up.  In OSGi the artifact the bundle is
> stored in on disk is abstracted away.  There is no independent way to get
> hold of the jar file directly.  I am using OSGi APIs to do the scanning of
> the bundle wiring.  It is possible this is not as speedy as it can be, but
> the cost will always be measurable and it will be linear, O(n), in the
> number of bundles installed.  Without caching the results this also pays an
> upfront cost of having to open each and every jar file.  In Equinox we
> delay the opening of the jars until a class or resource is actually going
> to be loaded from the bundle.  This upfront scanning forces us to open all
> installed bundle jars every startup even though no classes or resources
> may be loaded from the jar.
> 
> The caching comes with its own price in code complication and performance
> cost to load.  Doing this lazily would greatly simplify this and should
> result in better performance for really large installations.  We have
> common cases with 1000s of bundles installed.

Thanks for the additional background, but this still sounds like a
performance problem that can be solved at a higher level rather than
in the JVM itself.

(Perhaps, in the long run, a future revision of the OSGi specification
 should add an API to support the caching of bundle package lists.)
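
For concreteness, the scan-and-cache step discussed above might look
roughly like the following sketch.  The class `BundlePackageScanner` and
its methods are hypothetical, not part of any OSGi or JPMS API, and the
sketch assumes the bundle is available as a plain JAR file on disk, which,
as noted above, OSGi does not guarantee.  It collects every package that
contains a class; the private packages are then that set minus the
bundle's exported packages.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.util.Collection;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;
import java.util.concurrent.ConcurrentHashMap;
import java.util.jar.JarEntry;
import java.util.jar.JarFile;
import java.util.stream.Collectors;

// Hypothetical helper for illustration; not an OSGi or JPMS API.
final class BundlePackageScanner {

    // In-memory cache keyed by JAR path.  A real framework would more
    // likely persist this alongside its other bundle metadata.
    private static final Map<String, Set<String>> CACHE = new ConcurrentHashMap<>();

    // Derives package names from entry names such as "com/example/impl/Foo.class".
    static Set<String> packagesOf(Collection<String> entryNames) {
        Set<String> packages = new TreeSet<>();
        for (String name : entryNames) {
            int slash = name.lastIndexOf('/');
            // Ignore non-class entries, top-level classes (for example
            // module-info.class), and anything under META-INF.
            if (name.endsWith(".class") && slash > 0 && !name.startsWith("META-INF/")) {
                packages.add(name.substring(0, slash).replace('/', '.'));
            }
        }
        return packages;
    }

    // Scans the JAR once per path; later calls are answered from the cache.
    static Set<String> scan(String jarPath) {
        return CACHE.computeIfAbsent(jarPath, path -> {
            try (JarFile jar = new JarFile(path)) {
                return packagesOf(jar.stream()
                        .map(JarEntry::getName)
                        .collect(Collectors.toList()));
            } catch (IOException e) {
                throw new UncheckedIOException(e);
            }
        });
    }
}
```

A caller would invoke `BundlePackageScanner.scan(path)` once per bundle
and subtract the exported packages to obtain the private ones.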

>>> - In my prototype I also have a boot strapping issue that addPackage would
>>> be helpful for.  In my prototype I have a launcher that assembles a module
>>> Layer which then holds the OSGi framework implementation.  In this case
>>> the launcher is using Java 9 Layer API and then is loading up a standard
>>> OSGi framework implementation using only OSGi standard API.  The launcher
>>> then assigns the framework implementation a ModuleDescriptor to represent
>>> the OSGi framework.  In OSGi the framework is represented by an OSGi
>>> bundle called the "system.bundle".  So in this case the ModuleDescriptor
>>> also has a name "system.bundle".  And the ModuleDescriptor has a well
>>> known list of OSGi APIs it exports.  But the launcher does not know the
>>> private implementation packages of the framework implementation.  In my
>>> prototype I just hack in the list of private packages for the Equinox
>>> framework, but this is far from ideal.
>> 
>> Aren't the solutions for the previous problem also applicable here?  If
>> such a launcher must be able to launch an arbitrary OSGi framework
>> implementation then can't it use the `Private-Package` manifest header,
>> when available, or else scan the framework's bundle for private packages
>> and cache that result for later use?
> 
> Perhaps, but there is still a complication with system bundle fragments. I
> mentioned this in a past thread about how important it is to support
> dynamic fragment attachment.  In general I don't think there is a large
> number of OSGi users depending on dynamic fragment attachment to normal
> host bundles.  But system bundle fragments are a different story.  These
> are dynamically attached to the framework implementation class loader.
> These require special support by the launcher in order to provide the
> framework the ability to add content to its own class loader.
> 
> In fact, this is how my JPMS POC is implemented, in its own system bundle
> framework fragment.  To make this work the list of equinox framework
> packages and the list of packages for framework fragment must be specified
> for the framework module descriptor by the launcher.  For my POC I just
> hardcoded these packages directly into the launcher.  In the general case
> the launcher will not know what framework fragments will be installed
> during the lifetime of the framework instance.  The addPackage and
> addExports method would allow such a launcher to handle when new packages
> are added from a framework fragment.

How critical is this general case?  How often are framework fragments
actually used in a way in which they cannot be identified when the
framework is launched?

If I understand §3.15 of the OSGi Core R6 specification [1] correctly
then a framework implementation is not required to resolve framework
fragments (a.k.a. "extension bundles") dynamically, though it is allowed
to do so (see steps 2 and 3 of the lifecycle description at the bottom of
p. 85).  If an OSGi implementation can always restart when a framework
fragment is resolved or refreshed then isn't that a natural time to scan
the fragment's bundle for its packages, possibly caching that information
for later use?
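
As a rough illustration of that restart-time approach, a launcher could
rebuild the framework's descriptor from the packages known at launch.
The sketch below assumes the `ModuleDescriptor.Builder` API in the form
it took in the Java SE 9 release (`newModule`, `exports`, `packages`);
the class `SystemBundleDescriptor` and its parameters are hypothetical,
and the two package sets are assumed to come from the framework's
declared API and from a scan of the framework and its fragments.

```java
import java.lang.module.ModuleDescriptor;
import java.util.Set;

// Hypothetical launcher helper for illustration only.
final class SystemBundleDescriptor {

    // exported: the well-known OSGi API packages of the framework.
    // scanned:  every package found in the framework and its fragments
    //           at launch time (may overlap with the exported set).
    static ModuleDescriptor build(Set<String> exported, Set<String> scanned) {
        ModuleDescriptor.Builder builder = ModuleDescriptor.newModule("system.bundle");
        for (String pkg : exported) {
            builder.exports(pkg);       // unqualified export
        }
        builder.packages(scanned);      // remaining packages stay concealed
        return builder.build();
    }
}
```

With this shape the descriptor is fixed once built, so a fragment that
arrives later is picked up only on the next launch.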

Do framework fragments typically add exported packages to the system
bundle, or do they only carry non-exported implementation code?

>> A higher-level question is, how important is it for an OSGi-on-JPMS
>> launcher to be completely independent of the OSGi framework that it's
>> launching?  If every framework provides its own such launcher then, in
>> each one, a little bit of build logic could be used to embed a list of
>> the system bundle's private packages in the launcher itself, so that
>> there's no need to duplicate that list manually.
> 
> There are framework independent launchers that only use OSGi APIs (BND
> tools has one such example).  It would be beneficial if these launchers
> could do the right thing when running on JPMS.  The addPackage and
> addExportToAll method would make this possible.

Yes, but could this problem be addressed in some other way?  Would modest
additions to the OSGi API allow a launcher to do a bit of introspection
on the framework being launched?

>> ...
>> 
>> The main problem with the remainder of the proposed methods is that they
>> vastly increase the space of situations in which the run-time system must
>> be prepared to update the definitions of modules.  That can, in turn,
>> limit both the space of potential implementations and the bounds of
>> practical performance.
> 
> I obviously don't know the internal details of the Java implementation, but
> I can understand your potential concerns.  I do wonder if the performance
> improvements you speak of can still be done in the vast majority of cases
> where the controller is not exposed at all.  The controller is only made
> available when using one of the static defineModule methods on the Layer
> API.  The static defineModule methods are useful for cases where an
> existing module system is trying to adapt its modules to JPMS and the
> additional dynamics are needed by the controller.  But for the typical
> case where it is JPMS only there is no need for the controller at all, so
> you should be able to count on things remaining static.
> 
> I say this knowing it does not simplify the spec or RI for the controller
> case, but it should allow for you to optimize the common case more at some
> point.

This approach could allow optimization of the common case but, as you
note, it would not simplify the specification or its implementations.
It would, in fact, likely make implementations even more complex since
optimizing only in some cases, even if they are common, is more complex
than optimizing all of the time.

- Mark

[1] https://osgi.org/download/r6/osgi.core-6.0.0.pdf