Proposal: #ResourceEncapsulation and #ClassFilesAsResources
Rafael Winterhalter
rafael.wth at gmail.com
Wed Jun 29 20:28:07 UTC 2016
Paul, this would still break millions of applications, as being able to view
resources is a crucial assumption of many libraries. For example, Apache Wicket
needs to read HTML pages from the folder (package) of the class that
defines the dispatcher of this web page. And shadowing the view would
also not allow obtaining a class's byte code, which is common practice today.
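
To illustrate the byte-code practice mentioned above, here is a minimal sketch of how a tool might read a class file through the class loader. The class and method names (`ClassFileReader`, `readClassFile`) are made up for illustration, and real frameworks differ in detail:

```java
import java.io.IOException;
import java.io.InputStream;

final class ClassFileReader {

    // Reads the class file that defines the given type through its class loader.
    static byte[] readClassFile(Class<?> type) throws IOException {
        // Resource names use '/' separators and the ".class" suffix.
        String resourceName = type.getName().replace('.', '/') + ".class";
        ClassLoader loader = type.getClassLoader();
        // getClassLoader() returns null for bootstrap classes such as String,
        // so fall back to the system class loader lookup in that case.
        try (InputStream in = loader == null
                ? ClassLoader.getSystemResourceAsStream(resourceName)
                : loader.getResourceAsStream(resourceName)) {
            if (in == null) {
                throw new IOException("Class file not found: " + resourceName);
            }
            return in.readAllBytes(); // InputStream.readAllBytes requires Java 9+
        }
    }

    public static void main(String[] args) throws IOException {
        byte[] bytes = readClassFile(String.class);
        // A valid class file starts with the magic number 0xCAFEBABE.
        System.out.printf("%d bytes, magic %02X%02X%02X%02X%n",
                bytes.length, bytes[0], bytes[1], bytes[2], bytes[3]);
    }
}
```

If class files were hidden from such lookups, the stream would be null and code of this shape would fail.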
I currently contract for an APM vendor, and my job is basically to write
plugins for all sorts of Java frameworks and libraries, which is why I think
I have a pretty good overview of how this would affect a lot of code. One
reason I was so explicit about this issue being problematic is that it
would really break a wide range of applications.
In this context, I think Mark's suggestion is the best possible idea; I also
do not see a problem with the randomness of the resources, as even today
it depends on the order in which jar files are added to the class path, which
is by no means stable. As for named modules, package names should be exclusive
anyway, and for the unnamed module the behavior does not change.
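
The dependence on class-path order can be observed with a small sketch like the following. The class name `ResourceOrderSketch` and the helper `locateAll` are made up for illustration, and the URLs printed depend entirely on the class path the program is started with:

```java
import java.io.IOException;
import java.net.URL;
import java.util.Collections;
import java.util.List;

final class ResourceOrderSketch {

    // Collects every URL the loader reports for the given resource name,
    // in the order the loader returns them.
    static List<URL> locateAll(ClassLoader loader, String name) throws IOException {
        return Collections.list(loader.getResources(name));
    }

    public static void main(String[] args) throws IOException {
        ClassLoader loader = ClassLoader.getSystemClassLoader();
        // Assumption: a manifest is a resource that several jars usually contain.
        String name = args.length > 0 ? args[0] : "META-INF/MANIFEST.MF";
        System.out.println("getResource  -> " + loader.getResource(name));
        for (URL url : locateAll(loader, name)) {
            System.out.println("getResources -> " + url);
        }
    }
}
```

Running this with the same jars listed in a different order on the class path would typically change which URL comes first.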
I would still like to see an introduction of some form of ResourceLocatable
interface that defines two methods:
interface ResourceLocatable {
    URL getResource(String name);
    InputStream getResourceAsStream(String name);
}
This would add some structure to the API. As for the different lookup
contexts, I think the suggestion made is sound, too:
1. ClassLoader: Exposes any resource reachable by this class loader,
therefore an Enumeration<URL> getResources(String) method makes sense, too.
2. Module: Only exposes resources within this module from an absolute
location.
3. Class: Only exposes resources within the class's module from a relative
location.
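
As a rough sketch of how such an interface could sit on top of the existing lookups, the following adapts `ClassLoader` and `Class`. `ResourceLocatable` is the hypothetical interface from this message, not a JDK type, and the `of` factory methods are made up for illustration. No `Module` adapter is shown because, as released in Java 9, `java.lang.Module` only offers `getResourceAsStream`:

```java
import java.io.InputStream;
import java.net.URL;

final class ResourceLookupSketch {

    // Hypothetical interface as proposed in this message; not part of the JDK.
    interface ResourceLocatable {
        URL getResource(String name);
        InputStream getResourceAsStream(String name);
    }

    // Class loader context: names are absolute, e.g. "java/lang/Object.class".
    static ResourceLocatable of(ClassLoader loader) {
        return new ResourceLocatable() {
            @Override public URL getResource(String name) {
                return loader.getResource(name);
            }
            @Override public InputStream getResourceAsStream(String name) {
                return loader.getResourceAsStream(name);
            }
        };
    }

    // Class context: names without a leading '/' are relative to the class's package.
    static ResourceLocatable of(Class<?> type) {
        return new ResourceLocatable() {
            @Override public URL getResource(String name) {
                return type.getResource(name);
            }
            @Override public InputStream getResourceAsStream(String name) {
                return type.getResourceAsStream(name);
            }
        };
    }

    public static void main(String[] args) {
        // Both lookups should point at the class file of java.lang.Object.
        System.out.println(of(String.class).getResource("Object.class"));
        System.out.println(of(ClassLoader.getSystemClassLoader())
                .getResource("java/lang/Object.class"));
    }
}
```

The relative versus absolute naming is the main behavioral difference between the two adapters; everything else is plain delegation.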
Great work on this, I am really glad this proposal was made in this way.
Cheers, Rafael
2016-06-29 16:54 GMT+02:00 Paul Benedict <pbenedict at apache.org>:
> A reply to the "observers" list...
>
> Would a decent compromise be that resources in META-INF should be
> unconditionally public? I am just saying that perhaps the problem needs to
> be framed in a way where library producers *know* where to put their public
> resources. I am in no way able to say if dropping all encapsulation for
> resources is the right thing -- but if you wanted to save encapsulation in
> any form, I'd be interested in hearing opinions on what I just proposed.
> Hopefully allowing META-INF to be publicly readable is in line with
> existing SE and EE standards/specs anyway, and this would enforce
> compliance.
>
> Cheers,
> Paul
>
> On Wed, Jun 29, 2016 at 8:21 AM, David M. Lloyd <david.lloyd at redhat.com>
> wrote:
>
> > Responses inline & at bottom.
> >
> > On 06/28/2016 04:20 PM, Mark Reinhold wrote:
> >
> >> Issue summaries
> >> ---------------
> >>
> >> #ClassFilesAsResources --- If a type is visible and was loaded from a
> >> class file then it should be possible to read that file by invoking the
> >> `getResourceAsStream` method of the type's class loader, as it is in
> >> earlier releases. [1]
> >>
> >> #ResourceEncapsulation --- The `Module::getResourceAsStream` method can
> >> be used to read the resources of any named module, without restriction,
> >> which violates the resource-encapsulation requirement [2]. This method
> >> should be restricted somehow so that only "suitably-privileged" code
> >> (for some definition of that term) can access resources in a named
> >> module other than its own. An alternative is to drop this
> >> requirement. [3]
> >>
> >> Proposal
> >> --------
> >>
> >> Drop the agreed resource-encapsulation requirement, which reads:
> >>
> >> Resource encapsulation --- The run-time system must ensure that the
> >> static resource files within a module are directly accessible only by
> >> code within that module. The existing resource-access APIs should
> >> continue to work as they do today when used to access a module's own
> >> resource files. [2]
> >>
> >
> > OK so far.
> >
> >> Make the following changes to the various `getResource*` methods:
> >>
> >> - The `ClassLoader::getResource*` methods will first delegate to the
> >> loader's parent, if any, and will then search for resources in the
> >> named modules defined to the loader, and will then search in the
> >> loader's unnamed module, typically defined by a class path. In the
> >> case of the `getResources` method the order of the resources in the
> >> returned enumeration will be unspecified, except that the values of
> >> `cl.getResources(name).nextElement()` and `cl.getResource(name)`
> >> will be equal.
> >>
> >
> > I am really not a fan of unspecified resource order, especially when you
> > can easily derive a stable order from the graph topology.
> >
> > Resources however should always load from the child-most loader first.
> > This isn't a problem in Jigsaw for classes because packages cannot be
> > duplicated between modules (in a visible manner), but it becomes a problem
> > with resources, especially if/when run-time cycles are permitted: you can
> > run into an awkward situation where two mutually dependent modules each may
> > see the other's resource "first" before their own, which is almost always
> > undesirable.
> >
> >> - The `Class::getResource*` methods will only search for resources in
> >> the module that defines the class. In the case of a class in an
> >> unnamed module this will typically result in searching the class
> >> path of the corresponding class loader.
> >>
> >
> > Does this also apply to automatic modules?
> >
> >> The `java.lang.reflect.Module::getResourceAsStream` method will remain,
> >> so that it's possible to look up a resource in a module without having a
> >> reference to a class defined in that module. It will always be the case
> >> that, for a given class `c`, `c.getResourceAsStream(name)` will be
> >> equivalent to `c.getModule().getResourceAsStream(name)`.
> >>
> >
> > I believe that in the former case, the "name" was historically relative to
> > the location of the class. Is that behavior changing?
> >
> >> Rationale
> >> ---------
> >>
> >> The encapsulation of resources was intended to be part of the overall
> >> encapsulation story, on the view that a module's resources are just
> >> internal implementation details that need not, and should not, be
> >> available to code in other modules. Allowing external code to depend
> >> upon a module's internal resources is just as problematic as allowing
> >> such code to depend upon internal APIs.
> >>
> >> The reality of existing code today, however, is rather different, with
> >> two common use cases:
> >>
> >> - Resources are often used intentionally to convey information to
> >> external code, e.g., as a way to publish configuration files such as
> >> `persistence.xml`. This sort of thing should, ideally, be done via
> >> services [4][5], but that is not the current practice and it will
> >> take years to migrate to it.
> >>
> >
> > In many cases services are too limiting. It is useful for users to have a
> > more powerful, general facility available that can use the same basic
> > mechanism as services, but for arbitrary resources. For example, one case
> > that came up quite recently (on Jigsaw-dev I think) was that it is useful
> > to acquire not the service instance, but the class name, so that it could
> > be constructed in various ways. Services as they are now require another
> > front-end class to encapsulate this kind of variation. Also it is not
> > uncommon for a situation to arise where many classes must be produced for
> > variant services which all have the same basic behavior but differ only in
> > details, which is awkward as well as it can result in large quantities of
> > similar classes.
> >
> > It would be nice if service loading and resource loading continued to
> > follow a similar set of rules. When Jigsaw switched ServiceLoader from a
> > convenience API over resources to a separate concept, it lost the concept
> > of using resources *as* an API component or participant, which is a
> > powerful feature. This change will restore that to some degree, but it is
> > still strange that loading a META-INF/services/com.foo.Service file will
> > have different search semantics than using ServiceLoader. Basically we are
> > taking this capability from users unless they do it the way that Jigsaw
> > wants you to: via ServiceLoader, and no other way; it almost seems like
> > Jigsaw is attempting to take on the characteristics of a container (not a
> > very powerful one though because it only supports linking via services (a
> > pretty thin concept when you think about it), not by (for example)
> > annotations or anything else).
> >
> > Anyway I hope you can extract some kind of cogency from all that.
> >
> >> - Various popular tools and byte-code manipulation frameworks expect to
> >> be able to load class files as resources, i.e., to access the class
> >> file that defines a class `p.q.C` by invoking `getResource("p.q.C")`
> >> on that class's loader.
> >>
> >> With the API as it stands today these use cases can still be supported,
> >> but the code that looks up the resource must be converted to use the
> >> `Module::getResourceAsStream` method. This is a barrier to migration,
> >> since it means that fewer existing frameworks will work out-of-the-box
> >> as automatic modules.
> >>
> >> We should not drop agreed requirements lightly, but now that we have
> >> some practical experience with an implementation of this one I think
> >> it's clear that the pain outweighs the gain.
> >>
> >
> > Overall I think this is a step in the right direction, but I think the
> > discussion needs to carry on a bit longer.
> >
> > --
> > - DML
> >
>