Exporting things (Was: Re: Module-system requirements)

Tue Mar 10 15:08:39 UTC 2015

On 3/9/15 3:59 PM, mark.reinhold at oracle.com wrote:
> 2015/2/16 5:15 -0800, david.lloyd at redhat.com:
>> On 02/15/2015 07:24 PM, mark.reinhold at oracle.com wrote:
>>> 2015/2/11 11:27 -0800, david.lloyd at redhat.com:
>>>> I notice that under Exports, it is defined that only packages (i.e.
>>>> classes within those packages) are considered to be exportable, and not
>>>> resources.  I'm not certain how I feel about that...
>>>>
>>>> ...  To sum up though, I'd be glad to see (from my perspective) this
>>>> behavior go away in favor of the simpler model of restricting exports to
>>>> packages (and isolating classes and resources),
>>>
>>> I tend to agree.
>>>
>>> Exposing all of a module's resources, as JAR files do today, is wrong
>>> from the perspective of encapsulation.  ...
>>>
>>> This suggests an additional requirement:
>>>
>>> - _Resource encapsulation_ --- The run-time system must ensure that the
>>> static resource files within a module are directly accessible only by
>>> code within that module.
>>
>> What you are saying here implies (to me) inspecting the call stack to
>> ensure that the caller actually exists within the module being
>> referenced.
>
> Yes.
>
>>               I'm not certain we really want to go this far, if my
>> interpretation is accurate.  Such a rule is reasonable on the face of
>> it, but it does more or less break all the code in existence which
>> currently uses resources as a sort of API into a library, and this
>> includes very common patterns for loading configuration (among many
>> other things).
>
> True.  To what degree should it be a goal to preserve such mechanisms?

I really feel that not having this compatibility will strongly diminish 
adoption of this system, since a substantial body of existing work is 
unlikely to function properly as modules.

> Must Java modules be forever constrained to look like JAR files?

No, but I think that isn't really related to the point at hand, which is
more about solving specific problems than about change for its own
sake.  If solving those problems means that modules have visible
resources, does that mean we've tripped an invisible failure criterion by
resembling JARs too closely in some specific way?

Or, to put it more nicely: is there a specific problem or set of problems
that is solved by strictly encapsulating resources, or is it more of a
way to try a new general approach?

>>>> but only if we can be
>>>> reasonably sure that the use cases solved thereby are somehow sorted
>>>> out, which include:
>>>>
>>>> • Supporting this case (or determining that it is not relevant): A
>>>> module privately uses services provided by one or more specific peer
>>>> module(s) (via ServiceLoader), the arrangement of which is determined
>>>> statically by distribution.
>>>
>>> One way to accomplish this is to make the service interface accessible
>>> only by the relevant modules, via qualified exports.  Then only those
>>> modules can provide or use implementations of the service.
>>>
>>> Another possibility is to add a requirement that the binding of providers
>>> to service interfaces can be controlled manually during the configuration
>>> process.  Is that more along the lines of what you mean when you say
>>> "determined statically by the distribution"?
>>
>> Yes it *could* be, assuming that configurations end up being
>> sufficiently flexible (e.g. I have one distribution configuration, and
>> then I have N application configurations for each installed application,
>> which is in turn an extension of the distribution configuration).
>
> Yes, that's exactly the intent of nested configurations.
>
> To ensure that a distributor of a set of modules can control how service
> dependences are resolved amongst those modules, here's a new requirement,
> for the Fundamentals section:
>
>    - _Selective binding_ --- It must be possible to control the binding
>      process so that specific services are provided only by specific
>      providers, and in a specific order.
>
> Does that make sense?

In this requirement, what party is doing the controlling?

>> It's a question, I guess, of how the configurations work and how much
>> power the application configuration has to customize/mess up the
>> distribution configuration.
>
> The assumption is that a nested configuration, of itself, cannot change
> the configuration(s) in which it's nested.  That's worth making explicit
> in the existing requirement for
>
>    - _Isolated dynamic configurations_ --- An application must be able to
>      isolate the code in different dynamic configurations at least as well
>      as is possible today, where this is typically done by using multiple
>      class loaders.  ADD: An application must be able to ensure that code
>      in a dynamic configuration does not modify other configurations.
>
>>                               In a setup like we have, where one module
>> can just statically import another's services, it's pretty simple for
>> the distributor to set it up and really difficult for an application to
>> mess it up.
>
> So a JBoss module can specify that it depends upon service providers from
> one or more specific modules rather than whichever modules happen to
> provide implementations of that service?  That seems contrary to the
> nature of services, if I understand you correctly.

Right; this is, however, functionally similar to having the configuration
determine the service bindings without any sub-configurations being able
to override it, though at a whole-module granularity rather than a
per-module-service granularity.

I did consider adding an additional global registry for service
providers (because use cases are fairly easy to imagine); however, the
feature has never been requested, nor have any *actual* use cases been
raised within our domain that might call for it, so it has always
slipped down the list.  Functionally, though, I expect that the cases
where a module would statically connect to provider implementation(s)
would generally not overlap with the cases for doing so at the
configuration level.

In the configuration-level case, though, I think establishment and
predictability of order (at least on a per-distribution basis) is almost
certainly going to be needed, so I'd still maintain that it should be
worked into the requirements somehow.
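
To make "only specific providers, in a specific order" concrete, here is a
minimal sketch of one possible shape.  It is not an API proposed anywhere in
this thread: the names SelectiveBinding, Greeter and bind are hypothetical,
and the allow-list stands in for whatever a distribution configuration would
actually supply.  In practice the discovered providers would come from
ServiceLoader.load(Greeter.class); the sketch passes instances directly so
that it runs without any META-INF/services registration.

```java
import java.util.ArrayList;
import java.util.Arrays;
import java.util.HashMap;
import java.util.List;
import java.util.Map;

public class SelectiveBinding {
    /** Hypothetical service interface, for illustration only. */
    public interface Greeter { String greet(); }

    public static final class English implements Greeter {
        public String greet() { return "hello"; }
    }
    public static final class French implements Greeter {
        public String greet() { return "bonjour"; }
    }
    public static final class German implements Greeter {
        public String greet() { return "hallo"; }
    }

    /**
     * Keeps only the providers whose class names appear in allowedInOrder,
     * and returns them in exactly that order, regardless of discovery order.
     */
    static <S> List<S> bind(Iterable<? extends S> discovered, List<String> allowedInOrder) {
        Map<String, S> byName = new HashMap<>();
        for (S provider : discovered) {
            byName.putIfAbsent(provider.getClass().getName(), provider);
        }
        List<S> bound = new ArrayList<>();
        for (String name : allowedInOrder) {
            S provider = byName.get(name);
            if (provider != null) {
                bound.add(provider);   // names with no matching provider are skipped
            }
        }
        return bound;
    }

    public static void main(String[] args) {
        // Stand-in for ServiceLoader.load(Greeter.class), in arbitrary discovery order.
        List<Greeter> discovered = Arrays.asList(new English(), new German(), new French());
        // The distributor's choice: French first, then English; German is not bound.
        List<String> allowed = Arrays.asList(French.class.getName(), English.class.getName());
        for (Greeter g : bind(discovered, allowed)) {
            System.out.println(g.greet());
        }
    }
}
```

The point of the sketch is only that the order comes from the allow-list
rather than from discovery, which is what makes it predictable on a
per-distribution basis.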

>>>> • Having (generally) the same visibility to .class files as to the
>>>> classes themselves, from a given resource-loading entry point (Class or
>>>> ClassLoader), i.e. if I have visibility to
>>>> classLoader.loadClass("SomeClass"), I also have visibility to
>>>> classLoader.getResource("SomeClass.class"), and vice-versa (any security
>>>> restrictions aside).
>>>
>>> Can you describe some use cases that require this property?
>>
>> Just doing a quick spin through libraries used or referenced by WildFly
>> and its dependencies: Libraries such as CGLIB and BCEL use this to
>> modify existing classes.  Ant apparently uses it to find JavaCC, though
>> this odd use is probably better done another way.  Hibernate uses it to
>> get bytes to instrument in its instrumenting class loader, as does our
>> JPA SPI.  We also use this trick to generate various fast indexes of
>> classes without actually loading the classes themselves for various
>> tools and deployment operations.  I believe we have some tooling using
>> this as well to do various types of analysis on loaded code.
>
> When an application is assembled by putting JAR files on the class path
> then yes, you definitely need CL::getResource("SomeClass.class"), or
> something like it, in order to do things like instrumentation.  That's
> because the class-path mechanism (mostly) hides the identities of the
> original JAR files -- there's no way to find the JAR file that will contain
> a particular class file and then load that file yourself, rewriting its
> bytecodes along the way.

Well, I think this goes beyond the JAR to the class loader itself, which
can use arbitrary policies for locating classes.  Getting the class
contents in this way is the only method that is very likely to work
across most or all class loaders.  But I guess that's a trivial point.
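
For reference, the pattern under discussion looks roughly like the following
sketch.  The names ClassBytes and readClassBytes are made up for
illustration; the only real API involved is ClassLoader.getResourceAsStream.
It maps a binary class name to its resource path and reads the bytes through
the loader, without defining the class itself:

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;

public class ClassBytes {
    /** Reads the class file for className as a resource of the given loader. */
    static byte[] readClassBytes(ClassLoader loader, String className) throws IOException {
        // "java.lang.String" -> "java/lang/String.class"
        String resource = className.replace('.', '/') + ".class";
        try (InputStream in = loader.getResourceAsStream(resource)) {
            if (in == null) {
                throw new IOException("not visible as a resource: " + resource);
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[8192];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = readClassBytes(ClassLoader.getSystemClassLoader(), "java.lang.String");
        // Every class file starts with the magic number 0xCAFEBABE.
        System.out.printf("%d bytes, magic=%02x%02x%02x%02x%n", bytes.length,
                bytes[0] & 0xff, bytes[1] & 0xff, bytes[2] & 0xff, bytes[3] & 0xff);
    }
}
```

Because the lookup goes through the loader rather than a JAR path, the same
code works whatever policy the loader uses to locate classes, which is why
instrumentation and indexing tools lean on it.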

> When, in the future, an application is assembled by resolving a set of
> actual modules into a configuration, then we won't necessarily have to
> use class loaders to mediate resource access, so I'd rather not bake that
> into the requirements.  A configuration ought to be able to relate
> classes, modules, and module artifacts such that, given a module and a
> class name, you can read the corresponding class file (or any other kind
> of resource) from the module's defining artifact and load it however you
> want.
>
> To capture this as a requirement, in the Development section:
>
>    - _Access to class files and resources_ --- If a module is defined by
>      an artifact that contains class files and resources then it must be
>      possible, once that module is added to a configuration, for
>      suitably-privileged code to access those class files and resources.

I like this requirement; I think it hits a good percentage of use cases.

> (The "if a module is defined by an artifact" precondition allows for
>   ahead-of-time-compilation scenarios.)
>
> This requirement also addresses the problem of how to locate and read EE
> deployment descriptors and similar kinds of configuration files.

Great.

>>>> • A consistent ruling on the right behavior of general resources which
>>>> are co-packaged with classes that use them (typically loaded via e.g.
>>>> getClass().getResourceAsStream("MyResource.properties") or similar).
>>>
>>> If we take the view that resources are module-internal then the answer
>>> is that code within a module can access that module's resources but code
>>> outside of the module cannot (with the usual exception for reflection,
>>> which would require a security check).
>>
>> This brings us close to a previous point of contention regarding having
>> the implementation require a class loader for each module (or not); the
>> problem never came up for us (even in security manager environments)
>> because acquiring a class loader already entails a security check.
>
> As I wrote earlier, I don't think we can mandate one class loader per
> module.

I hope we can discuss this a bit further (without belaboring it unduly), 
because the functional overlap between modules and class loaders really 
rubs me the wrong way.  But we can discuss further on the Interoperation 
sub-thread.

>>>> • A consistent ruling on whether it makes sense generally to be able to
>>>> export resources to dependents.
>>>
>>> If we take the view that resources are module-internal then the answer is
>>> no, this does not make sense.  A module can export packages of classes
>>> and interfaces, and can publish service providers; it cannot also export
>>> files in a string-based hierarchical namespace.
>>
>> I guess the salient underlying question is, "why not?".  I'm being (for
>> the most part) a devil's advocate here, but I think this question really
>> has to be answered succinctly, because we're really trying to change a
>> basic (de facto) principle of Java library interoperation (that they
>> often use resources as a sort of API, even if only primarily for service
>> loader-style and configuration use cases).
>>
>> I can easily imagine an "export resources" feature which specifies
>> resource directories to make exportable (due in no small part to the
>> fact that such a feature has been used extensively by us with good
>> results).
>
> Yes, we could do this, but it doesn't seem necessary.  Applications that
> need to access a module's class files and resources can do so directly,
> as described above.  I don't see a need to complicate the module system
> itself with a notion of exportable resources.

I think a lot of existing applications expect to have a single class
loader which "sees" all the resources and configuration it needs.
However, I think that what you're describing in the Interoperation
sub-thread could provide a means to solve this problem without requiring
that resource exporting be a part of the core module system, if
class-loader-based systems can still manufacture class loaders which map
to modules but offer additional "legacy" behavior.  Let's continue there.

>
>>             Another use case beyond those mentioned above that is
>> satisfied by such a capability could be bundling supplementary i18n/l10n
>> or time zone data or similar as a module.
>
> The approach we've taken in the JDK for things like time-zone and locale
> data is to define appropriate services and providers thereof.  That way
> all the information goes through a proper statically-checked interface,
> and the layout and format of the actual data can evolve without having to
> update its consumers.

That seems like a good approach.
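
One existing instance of this pattern, offered as an illustration rather
than as the specific mechanism referred to above, is java.time's
ZoneRulesProvider service (available since Java 8).  A consumer asks for
rules by zone ID through a typed interface and never sees the layout of the
underlying time-zone data.  The class and method names ZoneDataViaService
and offsetAt are made up for this sketch:

```java
import java.time.Instant;
import java.time.ZoneOffset;
import java.time.zone.ZoneRules;
import java.time.zone.ZoneRulesProvider;

public class ZoneDataViaService {
    /** Looks up the UTC offset in force for a zone at a given instant. */
    static ZoneOffset offsetAt(String zoneId, Instant when) {
        // The provider behind this call decides how and where the data is stored.
        ZoneRules rules = ZoneRulesProvider.getRules(zoneId, false);
        return rules.getOffset(when);
    }

    public static void main(String[] args) {
        Instant when = Instant.parse("2015-03-10T15:08:39Z");
        System.out.println(offsetAt("Europe/Paris", when));
        System.out.println(ZoneRulesProvider.getAvailableZoneIds().size() + " zone IDs available");
    }
}
```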

-- 
- DML