Proposal: #ResourceEncapsulation and #ClassFilesAsResources

Paul Benedict pbenedict at apache.org
Wed Jun 29 14:54:38 UTC 2016


A reply to the "observers" list...

Would a decent compromise be that resources in META-INF should be
unconditionally public? I am just saying that perhaps the problem needs to
be framed in a way where library producers *know* where to put their public
resources. I am in no way able to say if dropping all encapsulation for
resources is the right thing -- but if you wanted to save encapsulation in
any form, I'd be interested in hearing opinions on what I just proposed.
Hopefully allowing META-INF to be publicly readable is in line with
existing SE and EE standards/specs anyway, and this would enforce
compliance.

Cheers,
Paul

On Wed, Jun 29, 2016 at 8:21 AM, David M. Lloyd <david.lloyd at redhat.com>
wrote:

> Responses inline & at bottom.
>
> On 06/28/2016 04:20 PM, Mark Reinhold wrote:
>
>> Issue summaries
>> ---------------
>>
>>    #ClassFilesAsResources --- If a type is visible and was loaded from a
>>    class file then it should be possible to read that file by invoking the
>>    `getResourceAsStream` method of the type's class loader, as it is in
>>    earlier releases. [1]
>>
>>    #ResourceEncapsulation --- The `Module::getResourceAsStream` method can
>>    be used to read the resources of any named module, without restriction,
>>    which violates the resource-encapsulation requirement [2].  This method
>>    should be restricted somehow so that only "suitably-privileged" code
>>    (for some definition of that term) can access resources in a named
>>    module other than its own.  An alternative is to drop this
>>    requirement. [3]
>>
>> Proposal
>> --------
>>
>> Drop the agreed resource-encapsulation requirement, which reads:
>>
>>    Resource encapsulation --- The run-time system must ensure that the
>>    static resource files within a module are directly accessible only by
>>    code within that module.  The existing resource-access APIs should
>>    continue to work as they do today when used to access a module's own
>>    resource files. [2]
>>
>
> OK so far.
>
>> Make the following changes to the various `getResource*` methods:
>>
>>    - The `ClassLoader::getResource*` methods will first delegate to the
>>      loader's parent, if any, and will then search for resources in the
>>      named modules defined to the loader, and will then search in the
>>      loader's unnamed module, typically defined by a class path.  In the
>>      case of the `getResources` method the order of the resources in the
>>      returned enumeration will be unspecified, except that the values of
>>      `cl.getResources(name).nextElement()` and `cl.getResource(name)`
>>      will be equal.
>>
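
The one ordering guarantee stated above (the first element of `getResources` equals the result of `getResource`) can be checked mechanically. This is a minimal sketch, assuming a hypothetical helper class `FirstResourceCheck` whose names appear nowhere in the proposal. It tests only the first-element guarantee and says nothing about the order of the remaining elements.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.util.Enumeration;

// Hypothetical helper for illustration; not part of the proposal text.
final class FirstResourceCheck {

    // Returns true when the first element of getResources(name) names the
    // same location as getResource(name), or when both report "not found".
    static boolean firstMatchesSingle(ClassLoader cl, String name) {
        try {
            URL single = cl.getResource(name);
            Enumeration<URL> all = cl.getResources(name);
            if (single == null) {
                return !all.hasMoreElements();
            }
            // Compare string forms to avoid the host lookups URL.equals may do.
            return all.hasMoreElements()
                    && single.toString().equals(all.nextElement().toString());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader cl = ClassLoader.getSystemClassLoader();
        System.out.println(firstMatchesSingle(cl, "java/lang/Object.class"));
    }
}
```

A check like this could run against any loader and resource name of interest; a `false` result would indicate that the two lookups disagree about which copy of the resource comes first.
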
>
> I am really not a fan of unspecified resource order, especially when you
> can easily derive a stable order from the graph topology.
>
> Resources however should always load from the child-most loader first.
> This isn't a problem in Jigsaw for classes because packages cannot be
> duplicated between modules (in a visible manner), but it becomes a problem
> with resources, especially if/when run-time cycles are permitted: you can
> run into an awkward situation where two mutually dependent modules each may
> see the other's resource "first" before their own, which is almost always
> undesirable.
>
>>    - The `Class::getResource*` methods will only search for resources in
>>      the module that defines the class.  In the case of a class in an
>>      unnamed module this will typically result in searching the class
>>      path of the corresponding class loader.
>>
>
> Does this also apply to automatic modules?
>
>> The `java.lang.reflect.Module::getResourceAsStream` method will remain,
>> so that it's possible to look up a resource in a module without having a
>> reference to a class defined in that module.  It will always be the case
>> that, for a given class `c`, `c.getResourceAsStream(name)` will be
>> equivalent to `c.getModule().getResourceAsStream(name)`.
>>
>
> I believe that in the former case, the "name" was historically relative to
> the location of the class.  Is that behavior changing?
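
For reference, the existing name resolution in `Class::getResource` treats a name without a leading slash as relative to the class's package and a name with a leading slash as absolute. This is a minimal sketch of that existing behavior, assuming a hypothetical class `RelativeNameDemo`. It does not show how the proposed `Module::getResourceAsStream` would treat the same names.

```java
import java.net.URL;

// Hypothetical demo class for illustration only.
final class RelativeNameDemo {

    // No leading slash: the name is resolved against the anchor's package.
    static URL relative(Class<?> anchor, String simpleName) {
        return anchor.getResource(simpleName);
    }

    // Leading slash: the rest of the name is used as an absolute resource path.
    static URL absolute(Class<?> anchor, String fullPath) {
        return anchor.getResource("/" + fullPath);
    }

    public static void main(String[] args) {
        // Both lookups are expected to name the class file of java.lang.Object.
        System.out.println(relative(Object.class, "Object.class"));
        System.out.println(absolute(Object.class, "java/lang/Object.class"));
    }
}
```
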
>
>> Rationale
>> ---------
>>
>> The encapsulation of resources was intended to be part of the overall
>> encapsulation story, on the view that a module's resources are just
>> internal implementation details that need not, and should not, be
>> available to code in other modules.  Allowing external code to depend
>> upon a module's internal resources is just as problematic as allowing
>> such code to depend upon internal APIs.
>>
>> The reality of existing code today, however, is rather different, with
>> two common use cases:
>>
>>    - Resources are often used intentionally to convey information to
>>      external code, e.g., as a way to publish configuration files such as
>>      `persistence.xml`.  This sort of thing should, ideally, be done via
>>      services [4][5], but that is not the current practice and it will
>>      take years to migrate to it.
>>
>
> In many cases services are too limiting.  It is useful for users to have a
> more powerful, general facility available that can use the same basic
> mechanism as services, but for arbitrary resources.  For example, one case
> that came up quite recently (on Jigsaw-dev I think) was that it is useful
> to acquire not the service instance, but the class name, so that it could
> be constructed in various ways.  Services as they are now require another
> front-end class to encapsulate this kind of variation. It is also not
> uncommon to need many classes for variant services that share the same
> basic behavior and differ only in details, which is awkward and can
> result in large quantities of similar classes.
>
> It would be nice if service loading and resource loading continued to
> follow a similar set of rules.  When Jigsaw switched ServiceLoader from a
> convenience API over resources to a separate concept, it lost the concept
> of using resources *as* an API component or participant, which is a
> powerful feature.  This change will restore that to some degree, but it is
> still strange that loading a META-INF/services/com.foo.Service file will
> have different search semantics than using ServiceLoader. Basically we are
> taking this capability away from users unless they do it the way that
> Jigsaw wants them to: via ServiceLoader, and no other way. It almost seems
> like Jigsaw is attempting to take on the characteristics of a container,
> though not a very powerful one, because it only supports linking via
> services (a pretty thin concept when you think about it) and not by, for
> example, annotations or anything else.
>
> Anyway I hope you can extract some kind of cogency from all that.
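
The "class name rather than instance" use case mentioned above can be served by reading the provider-configuration files directly as resources. This is a minimal sketch, assuming a hypothetical helper `ProviderNames` and the usual file format (one class name per line, `#` starts a comment). Whether such a lookup sees the same providers as `ServiceLoader` is exactly the open question raised in the discussion.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.StringReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper for illustration; not an API from the proposal.
final class ProviderNames {

    // Extracts provider class names from the text of one configuration file.
    static List<String> parse(String fileContents) {
        try (BufferedReader in = new BufferedReader(new StringReader(fileContents))) {
            return readNames(in, new ArrayList<>());
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    // Collects names from every META-INF/services/<service> file the loader finds.
    static List<String> namesFor(ClassLoader cl, String serviceName) {
        List<String> names = new ArrayList<>();
        try {
            Enumeration<URL> urls = cl.getResources("META-INF/services/" + serviceName);
            while (urls.hasMoreElements()) {
                try (BufferedReader in = new BufferedReader(new InputStreamReader(
                        urls.nextElement().openStream(), StandardCharsets.UTF_8))) {
                    readNames(in, names);
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }

    // Strips comments and blank lines; keeps the first occurrence of each name.
    private static List<String> readNames(BufferedReader in, List<String> names)
            throws IOException {
        String line;
        while ((line = in.readLine()) != null) {
            int hash = line.indexOf('#');
            if (hash >= 0) {
                line = line.substring(0, hash);
            }
            line = line.trim();
            if (!line.isEmpty() && !names.contains(line)) {
                names.add(line);
            }
        }
        return names;
    }
}
```

A caller could then decide for itself how to construct each named class, for example `ProviderNames.namesFor(loader, "com.foo.Service")` followed by its own instantiation logic.
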
>
>>    - Various popular tools and byte-code manipulation frameworks expect to
>>      be able to load class files as resources, i.e., to access the class
>>      file that defines a class `p.q.C` by invoking
>>      `getResource("p/q/C.class")` on that class's loader.
>>
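
The class-file lookup that such tools rely on looks roughly like the following. This is a minimal sketch, assuming a hypothetical helper `ClassFileLookup`. It maps a binary class name to its resource name and confirms that the stream begins with the class-file magic number `0xCAFEBABE`.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper for illustration only.
final class ClassFileLookup {

    // Maps a binary name such as "p.q.C" to the resource name "p/q/C.class".
    static String resourceName(String binaryName) {
        return binaryName.replace('.', '/') + ".class";
    }

    // Returns true if the loader exposes the class file and it starts with
    // the class-file magic number; false if the resource is not found.
    static boolean hasClassFileMagic(ClassLoader cl, String binaryName) {
        try (InputStream in = cl.getResourceAsStream(resourceName(binaryName))) {
            if (in == null) {
                return false;
            }
            return new DataInputStream(in).readInt() == 0xCAFEBABE;
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        ClassLoader cl = ClassLoader.getSystemClassLoader();
        System.out.println(hasClassFileMagic(cl, "java.lang.Object"));
    }
}
```
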
>> With the API as it stands today these use cases can still be supported,
>> but the code that looks up the resource must be converted to use the
>> `Module::getResourceAsStream` method.  This is a barrier to migration,
>> since it means that fewer existing frameworks will work out-of-the-box
>> as automatic modules.
>>
>> We should not drop agreed requirements lightly, but now that we have
>> some practical experience with an implementation of this one I think
>> it's clear that the pain outweighs the gain.
>>
>
> Overall I think this is a step in the right direction, but I think the
> discussion needs to carry on a bit longer.
>
> --
> - DML
>

