Exporting things (Was: Re: Module-system requirements)

Mon Mar 9 20:59:17 UTC 2015

2015/2/16 5:15 -0800, david.lloyd at redhat.com:
> On 02/15/2015 07:24 PM, mark.reinhold at oracle.com wrote:
>> 2015/2/11 11:27 -0800, david.lloyd at redhat.com:
>>> I notice that under Exports, it is defined that only packages (i.e.
>>> classes within those packages) are considered to be exportable, and not
>>> resources.  I'm not certain how I feel about that...
>>> 
>>> ...  To sum up though, I'd be glad to see (from my perspective) this
>>> behavior go away in favor of the simpler model of restricting exports to
>>> packages (and isolating classes and resources),
>> 
>> I tend to agree.
>> 
>> Exposing all of a module's resources, as JAR files do today, is wrong
>> from the perspective of encapsulation.  ...
>> 
>> This suggests an additional requirement:
>> 
>> - _Resource encapsulation_ --- The run-time system must ensure that the
>> static resource files within a module are directly accessible only by
>> code within that module.
> 
> What you are saying here implies (to me) inspecting the call stack to 
> ensure that the caller actually exists within the module being 
> referenced.

Yes.

>              I'm not certain we really want to go this far, if my 
> interpretation is accurate.  Such a rule is reasonable on the face of 
> it, but it does more or less break all the code in existence which 
> currently uses resources as a sort of API into a library, and this 
> includes very common patterns for loading configuration (among many 
> other things).

True.  To what degree should it be a goal to preserve such mechanisms?
Must Java modules be forever constrained to look like JAR files?

>>> but only if we can be
>>> reasonably sure that the use cases solved thereby are somehow sorted
>>> out, which include:
>>> 
>>> • Supporting this case (or determining that it is not relevant): A
>>> module privately uses services provided by one or more specific peer
>>> module(s) (via ServiceLoader), the arrangement of which is determined
>>> statically by distribution.
>> 
>> One way to accomplish this is to make the service interface accessible
>> only by the relevant modules, via qualified exports.  Then only those
>> modules can provide or use implementations of the service.
>> 
>> Another possibility is to add a requirement that the binding of providers
>> to service interfaces can be controlled manually during the configuration
>> process.  Is that more along the lines of what you mean when you say
>> "determined statically by the distribution"?
> 
> Yes it *could* be, assuming that configurations end up being 
> sufficiently flexible (e.g. I have one distribution configuration, and 
> then I have N application configurations for each installed application, 
> which is in turn an extension of the distribution configuration).

Yes, that's exactly the intent of nested configurations.

To ensure that a distributor of a set of modules can control how service
dependences are resolved amongst those modules, here's a new requirement
for the Fundamentals section:

  - _Selective binding_ --- It must be possible to control the binding
    process so that specific services are provided only by specific
    providers, and in a specific order.

Does that make sense?
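
One way to picture what selective binding asks for is the following
minimal sketch.  It is only an illustration, not a proposed API: the
class SelectiveBinding, the method bind, and the idea of naming
providers by class name are all assumptions made for this example.
Since java.util.ServiceLoader is an Iterable, the result of
ServiceLoader.load(...) could be passed as the candidates.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;

// Hypothetical helper, not part of any module-system design.
final class SelectiveBinding {

    /**
     * Keeps only the candidate providers whose class names are listed in
     * allowedInOrder, and returns them in the order given by that list.
     */
    static <S> List<S> bind(Iterable<S> candidates, List<String> allowedInOrder) {
        List<S> bound = new ArrayList<>();
        for (String name : allowedInOrder) {
            for (S candidate : candidates) {
                if (candidate.getClass().getName().equals(name)) {
                    bound.add(candidate);
                }
            }
        }
        return bound;
    }

    public static void main(String[] args) {
        // Stand-in "providers": ordinary objects, selected by class name.
        List<Object> discovered = Arrays.<Object>asList("a", 1, 2L);
        List<String> allowed = Arrays.asList("java.lang.Long", "java.lang.String");
        System.out.println(bind(discovered, allowed));
    }
}
```

Providers that are discovered but not listed are simply never bound,
which is the "only by specific providers" half of the requirement; the
outer loop over the list gives the "specific order" half.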

> It's a question, I guess, of how the configurations work and how much 
> power the application configuration has to customize/mess up the 
> distribution configuration.

The assumption is that a nested configuration, of itself, cannot change
the configuration(s) in which it's nested.  That's worth making explicit
in the existing requirement for

  - _Isolated dynamic configurations_ --- An application must be able to
    isolate the code in different dynamic configurations at least as well
    as is possible today, where this is typically done by using multiple
    class loaders.  ADD: An application must be able to ensure that code
    in a dynamic configuration does not modify other configurations.

>                              In a setup like we have, where one module 
> can just statically import another's services, it's pretty simple for 
> the distributor to set it up and really difficult for an application to 
> mess it up.

So a JBoss module can specify that it depends upon service providers from
one or more specific modules rather than whichever modules happen to
provide implementations of that service?  That seems contrary to the
nature of services, if I understand you correctly.

>>> • Having (generally) the same visibility to .class files as to the
>>> classes themselves, from a given resource-loading entry point (Class or
>>> ClassLoader), i.e. if I have visibility to
>>> classLoader.loadClass("SomeClass"), I also have visibility to
>>> classLoader.getResource("SomeClass.class"), and vice-versa (any security
>>> restrictions aside).
>> 
>> Can you describe some use cases that require this property?
> 
> Just doing a quick spin through libraries used or referenced by WildFly 
> and its dependencies: Libraries such as CGLIB and BCEL use this to 
> modify existing classes.  Ant apparently uses it to find JavaCC, though 
> this odd use is probably better done another way.  Hibernate uses it to 
> get bytes to instrument in its instrumenting class loader, as does our 
> JPA SPI.  We also use this trick to generate various fast indexes of 
> classes without actually loading the classes themselves for various 
> tools and deployment operations.  I believe we have some tooling using 
> this as well to do various types of analysis on loaded code.

When an application is assembled by putting JAR files on the class path
then yes, you definitely need CL::getResource("SomeClass.class"), or
something like it, in order to do things like instrumentation.  That's
because the class-path mechanism (mostly) hides the identities of the
original JAR files -- there's no way to find the JAR file that will contain
a particular class file and then load that file yourself, rewriting its
bytecodes along the way.
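
For reference, the class-path idiom in question looks roughly like the
sketch below.  The class name ClassFileReader and the method
readClassBytes are invented for this example; only
ClassLoader::getResourceAsStream is an existing API.  The example reads
java.lang.String through the system class loader merely because that
class is always present.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper illustrating CL::getResource-style class-file access.
final class ClassFileReader {

    /** Reads the bytes of a class file as a resource visible to the given loader. */
    static byte[] readClassBytes(ClassLoader loader, String binaryName) {
        // "com.example.Foo" becomes the resource name "com/example/Foo.class".
        String resource = binaryName.replace('.', '/') + ".class";
        try (InputStream in = loader.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IllegalArgumentException("not visible as a resource: " + resource);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            for (int n; (n = in.read(buf)) != -1; ) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        byte[] bytes = readClassBytes(ClassLoader.getSystemClassLoader(), "java.lang.String");
        // Every class file starts with the magic number 0xCAFEBABE.
        System.out.printf("read %d bytes, magic = %02X%02X%02X%02X%n", bytes.length,
                bytes[0] & 0xFF, bytes[1] & 0xFF, bytes[2] & 0xFF, bytes[3] & 0xFF);
    }
}
```

An instrumenting tool would hand the returned bytes to its bytecode
rewriter and then define the class itself; note that nothing here tells
the caller which JAR file the bytes came from.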

When, in the future, an application is assembled by resolving a set of
actual modules into a configuration, then we won't necessarily have to
use class loaders to mediate resource access, so I'd rather not bake that
into the requirements.  A configuration ought to be able to relate
classes, modules, and module artifacts such that, given a module and a
class name, you can read the corresponding class file (or any other kind
of resource) from the module's defining artifact and load it however you
want.

To capture this as a requirement, in the Development section:

  - _Access to class files and resources_ --- If a module is defined by
    an artifact that contains class files and resources then it must be
    possible, once that module is added to a configuration, for
    suitably-privileged code to access those class files and resources.

(The "if a module is defined by an artifact" precondition allows for
 ahead-of-time-compilation scenarios.)

This requirement also addresses the problem of how to locate and read EE
deployment descriptors and similar kinds of configuration files.

> ...
> 
>>> • A consistent ruling on the right behavior of general resources which
>>> are co-packaged with classes that use them (typically loaded via e.g.
>>> getClass().getResourceAsStream("MyResource.properties") or similar).
>> 
>> If we take the view that resources are module-internal then the answer
>> is that code within a module can access that module's resources but code
>> outside of the module cannot (with the usual exception for reflection,
>> which would require a security check).
> 
> This brings us close to a previous point of contention regarding having 
> the implementation require a class loader for each module (or not); the 
> problem never came up for us (even in security manager environments) 
> because acquiring a class loader already entails a security check.

As I wrote earlier, I don't think we can mandate one class loader per
module.

>>> • A consistent ruling on whether it makes sense generally to be able to
>>> export resources to dependents.
>> 
>> If we take the view that resources are module-internal then the answer is
>> no, this does not make sense.  A module can export packages of classes
>> and interfaces, and can publish service providers; it cannot also export
>> files in a string-based hierarchical namespace.
> 
> I guess the salient underlying question is, "why not?".  I'm being (for 
> the most part) a devil's advocate here, but I think this question really 
> has to be answered succinctly, because we're really trying to change a 
> basic (de facto) principle of Java library interoperation (that they 
> often use resources as a sort of API, even if only primarily for service 
> loader-style and configuration use cases).
> 
> I can easily imagine an "export resources" feature which specifies 
> resource directories to make exportable (due in no small part to the 
> fact that such a feature has been used extensively by us with good 
> results).

Yes, we could do this, but it doesn't seem necessary.  Applications that
need to access a module's class files and resources can do so directly,
as described above.  I don't see a need to complicate the module system
itself with a notion of exportable resources.

>            Another use case beyond those mentioned above that is 
> satisfied by such a capability could be bundling supplementary i18n/l10n 
> or time zone data or similar as a module.

The approach we've taken in the JDK for things like time-zone and locale
data is to define appropriate services and providers thereof.  That way
all the information goes through a proper statically-checked interface,
and the layout and format of the actual data can evolve without having to
update its consumers.
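
As a small illustration of the consumer side, here is a sketch that
uses the existing java.time.zone.ZoneRulesProvider service type, which
appears to follow this pattern.  The class ZoneDataDemo and the method
offsetAt are made up for the example; the caller sees only typed
ZoneRules objects and never the underlying data files.

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.zone.ZoneRules;
import java.time.zone.ZoneRulesProvider;

// Hypothetical consumer of time-zone data obtained through a service type.
final class ZoneDataDemo {

    /** Looks up the UTC offset in force for a zone at a given instant. */
    static ZoneOffset offsetAt(String zoneId, Instant when) {
        // The rules come from whichever provider has registered this zone ID;
        // the layout and format of the underlying data stay hidden from the caller.
        ZoneRules rules = ZoneRulesProvider.getRules(zoneId, false);
        return rules.getOffset(when);
    }

    public static void main(String[] args) {
        Instant when = Instant.parse("2015-03-09T20:59:17Z");
        System.out.println(offsetAt("Europe/Paris", when));
    }
}
```

If the data format changes, only the provider needs to change; code
written against ZoneRules keeps compiling and running as before.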

- Mark