Exporting things (Was: Re: Module-system requirements)

Mon Feb 16 13:15:56 UTC 2015

On 02/15/2015 07:24 PM, mark.reinhold at oracle.com wrote:
> 2015/2/11 11:27 -0800, david.lloyd at redhat.com:
>> I notice that under Exports, it is defined that only packages (i.e.
>> classes within those packages) are considered to be exportable, and not
>> resources.  I'm not certain how I feel about that... in our
>> implementation we flattened the concepts of packages and directories.
>> This was mainly to solve a few specific difficulties of supporting plain
>> JARs as module content; a JAR has a mix of classes and resources, and it
>> is hard to deterministically tell the difference, especially when you
>> can do such (commonly done) things as:
>> classLoader.getResource("org/foo/bar/SomeClass.class").
>>
>> ...
>>
>> However this approach is not without its difficulties: ...
>>
>>     ...  To sum up though, I'd be glad to see (from my perspective) this
>> behavior go away in favor of the simpler model of restricting exports to
>> packages (and isolating classes and resources),
>
> I tend to agree.
>
> Exposing all of a module's resources, as JAR files do today, is wrong
> from the perspective of encapsulation.  Extending the "exports"
> requirement to allow control over which resources are exported leads to
> both conceptual and implementation complexity (e.g., what would it mean
> to export both a class, as a type, as well as the corresponding .class
> file, or the latter and not the former?)  It seems sanest to treat
> resources as strictly internal to modules.
>
> This suggests an additional requirement:
>
>    - _Resource encapsulation_ --- The run-time system must ensure that the
>      static resource files within a module are directly accessible only by
>      code within that module.

What you are saying here implies (to me) inspecting the call stack to 
ensure that the caller actually exists within the module being 
referenced.  I'm not certain we really want to go this far, if my 
interpretation is accurate.  Such a rule is reasonable on the face of
it, but it would break much of the existing code that uses resources as
a sort of API into a library, and this includes very common patterns for
loading configuration (among many other things).
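
To illustrate the kind of pattern I mean, here is a minimal sketch in
which a caller loads a library's configuration purely by resource name.
The class name ResourceConfig and the resource path in the note below
are made up for the example; they do not come from any real library.

```java
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;
import java.util.Properties;

// Hypothetical helper: treats a named resource as a configuration "API".
final class ResourceConfig {

    // Loads a properties file that is visible through the given class loader.
    // Returns an empty Properties object if the resource cannot be found.
    static Properties load(ClassLoader loader, String resourceName) {
        Properties props = new Properties();
        try (InputStream in = loader.getResourceAsStream(resourceName)) {
            if (in != null) {
                props.load(in);
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return props;
    }
}
```

If resources were strictly module-internal, a call such as
ResourceConfig.load(loader, "org/foo/bar/defaults.properties") made from
outside the owning module would presumably come back empty, which is the
breakage I am worried about.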

>>                                                  but only if we can be
>> reasonably sure that the use cases solved thereby are somehow sorted
>> out, which include:
>>
>> • Supporting this case (or determining that it is not relevant): A
>> module privately uses services provided by one or more specific peer
>> module(s) (via ServiceLoader), the arrangement of which is determined
>> statically by distribution.
>
> One way to accomplish this is to make the service interface accessible
> only by the relevant modules, via qualified exports.  Then only those
> modules can provide or use implementations of the service.
>
> Another possibility is to add a requirement that the binding of providers
> to service interfaces can be controlled manually during the configuration
> process.  Is that more along the lines of what you mean when you say
> "determined statically by the distribution"?

Yes it *could* be, assuming that configurations end up being
sufficiently flexible (e.g. I have one distribution configuration, and
then I have N application configurations, one for each installed
application, each of which is in turn an extension of the distribution
configuration).

It's a question, I guess, of how the configurations work and how much 
power the application configuration has to customize/mess up the 
distribution configuration.  In a setup like we have, where one module 
can just statically import another's services, it's pretty simple for 
the distributor to set it up and really difficult for an application to 
mess it up.

>> • Having (generally) the same visibility to .class files as to the
>> classes themselves, from a given resource-loading entry point (Class or
>> ClassLoader), i.e. if I have visibility to
>> classLoader.loadClass("SomeClass"), I also have visibility to
>> classLoader.getResource("SomeClass.class"), and vice-versa (any security
>> restrictions aside).
>
> Can you describe some use cases that require this property?

Just doing a quick spin through libraries used or referenced by WildFly 
and its dependencies: Libraries such as CGLIB and BCEL use this to 
modify existing classes.  Ant apparently uses it to find JavaCC, though 
this odd use is probably better done another way.  Hibernate uses it to 
get bytes to instrument in its instrumenting class loader, as does our 
JPA SPI.  We also use this trick to generate fast indexes of classes,
without actually loading the classes themselves, for various tools and
deployment operations.  I believe we also have some tooling that uses
it to do various types of analysis on loaded code.
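
As a rough sketch of the trick these libraries rely on (not the code of
any of them), the following reads the bytes of a class through the
resource API without ever loading the class.  The names ClassBytes, read
and hasClassFileMagic are invented for the example.

```java
import java.io.ByteArrayOutputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper: fetches a class's bytes as a resource, not as a type.
final class ClassBytes {

    // Maps a binary class name such as "org.foo.Bar" to "org/foo/Bar.class"
    // and reads that resource. Returns null if the resource is not visible.
    static byte[] read(ClassLoader loader, String className) {
        String path = className.replace('.', '/') + ".class";
        try (InputStream in = loader.getResourceAsStream(path)) {
            if (in == null) {
                return null;
            }
            ByteArrayOutputStream out = new ByteArrayOutputStream();
            byte[] buf = new byte[4096];
            int n;
            while ((n = in.read(buf)) != -1) {
                out.write(buf, 0, n);
            }
            return out.toByteArray();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Checks for the 0xCAFEBABE magic number that starts every class file.
    static boolean hasClassFileMagic(byte[] b) {
        return b != null && b.length >= 4
                && (b[0] & 0xFF) == 0xCA && (b[1] & 0xFF) == 0xFE
                && (b[2] & 0xFF) == 0xBA && (b[3] & 0xFF) == 0xBE;
    }
}
```

An indexer or instrumenting class loader would hand the returned bytes
to a bytecode library rather than to defineClass, which is why it needs
the .class resource to be visible wherever the class itself is.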

> What if a module is compiled to native code ahead-of-time, and hence
> doesn't necessarily contain a .class file for every class?

Then things like instrumentation will fail, in what is I guess a
self-evident manner.  Things like JPA which may count on instrumenting
classes just won't work in such environments.

>> • A consistent ruling on the right behavior of general resources which
>> are co-packaged with classes that use them (typically loaded via e.g.
>> getClass().getResourceAsStream("MyResource.properties") or similar).
>
> If we take the view that resources are module-internal then the answer
> is that code within a module can access that module's resources but code
> outside of the module cannot (with the usual exception for reflection,
> which would require a security check).

This brings us close to a previous point of contention regarding having 
the implementation require a class loader for each module (or not); the 
problem never came up for us (even in security manager environments) 
because acquiring a class loader already entails a security check.

>> • A consistent ruling on whether it makes sense generally to be able to
>> export resources to dependents.
>
> If we take the view that resources are module-internal then the answer is
> no, this does not make sense.  A module can export packages of classes
> and interfaces, and can publish service providers; it cannot also export
> files in a string-based hierarchical namespace.

I guess the salient underlying question is, "why not?".  I'm being (for 
the most part) a devil's advocate here, but I think this question really 
has to be answered succinctly, because we're really trying to change a 
basic (de facto) principle of Java library interoperation (that they 
often use resources as a sort of API, even if only primarily for service 
loader-style and configuration use cases).

I can easily imagine an "export resources" feature which specifies 
resource directories to make exportable (due in no small part to the 
fact that such a feature has been used extensively by us with good 
results).  Another use case beyond those mentioned above that is 
satisfied by such a capability could be bundling supplementary i18n/l10n 
or time zone data or similar as a module.
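
To make the idea concrete, here is a small sketch of how such a check
might look, assuming exports are declared per directory and, like
package exports, do not extend to subdirectories.  The ResourceExports
class and the paths in the note below are purely illustrative; this is
not how any existing module system is known to implement it.

```java
import java.util.Arrays;
import java.util.HashSet;
import java.util.Set;

// Hypothetical model of an "export resources" declaration for one module.
final class ResourceExports {

    // Directories (slash-separated, no trailing slash) whose files are exported.
    private final Set<String> exportedDirs;

    ResourceExports(String... dirs) {
        this.exportedDirs = new HashSet<>(Arrays.asList(dirs));
    }

    // A resource is exported only if its immediate parent directory is listed.
    // Subdirectories are not exported implicitly (an assumption of this sketch).
    boolean isExported(String resourcePath) {
        int slash = resourcePath.lastIndexOf('/');
        String dir = slash < 0 ? "" : resourcePath.substring(0, slash);
        return exportedDirs.contains(dir);
    }
}
```

With new ResourceExports("org/foo/i18n"), a dependent could see
"org/foo/i18n/messages_de.properties" but not
"org/foo/impl/internal.properties", so the module still decides which
part of its string-based namespace is public.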

>> A minor, somewhat related point that this raises...
>>
>> The term "dependences" (rarely-used pl. of "dependence") is used quite a
>> lot in the document (whether intentional or otherwise), but I think that
>> the term "dependencies" (pl. of "dependency") is probably a better term,
>> and is definitely a more ubiquitous one.  The inverse of "dependency" is
>> "dependent", which forms a concise term pair that we use quite a lot
>> internally (i.e. if A is a dependency of B, B is a dependent of A).
>
> You're not the first to ask me about this ...
>
> "Dependence" and "dependency" name distinct concepts.  A dependence is a
> relationship, potentially unfulfilled; a dependency is a thing which can
> fulfill a relationship of dependence.  When thinking, speaking, or
> writing about modules I find it useful to keep this distinction clear.
>
> A module, standing alone, can have dependences upon some other,
> yet-to-be-identified modules.  All that is known is that they have
> specific names, or that they provide specific services.  The module's
> definition, in other words, merely describes relationships and
> constraints upon how those relationships can be fulfilled; it does not
> identify the specific modules that will fulfill those relationships.
>
> The processes of resolution and service binding identify the actual
> modules, i.e., the dependencies, that will satisfy a module's
> dependences.

OK, that makes sense to me.

> I agree that "dependent" is also a useful term.  In ASCII-art form the
> three concepts are related thus:
>
>                   Dependence
>      Dependent  --------------> Dependency
>
> - Mark
>

-- 
- DML