Proposal: #ResourceEncapsulation and #ClassFilesAsResources

Rafael Winterhalter rafael.wth at gmail.com
Wed Jun 29 21:06:42 UTC 2016


Paul, I agree with you that this is desirable, and this is why access
should be modularized when locating a resource via a Class or Module
instance, which already provides a form of encapsulation.

Obtaining a ClassLoader is already guarded by a security manager check so
plugins cannot break through here once a library maintainer chooses to
encapsulate a resource.

Cheers, Rafael

2016-06-29 22:53 GMT+02:00 Paul Benedict <pbenedict at apache.org>:

> Rafael, you are right: my suggestion would break such applications. You
> couldn't automatically use them as Java 9 modules. You would have to move
> those resources into META-INF.
>
> With that said, what about private resources? This is not a matter of
> reflection, which is where I advocate "jailbreaking" to see all types no
> matter what module. I consider getResource() to be a normal API that should
> respect module boundaries. If I want to provide a file with a secret key in
> my module, must it be exposed to everyone? I say no, it shouldn't. That's
> why I think encapsulation should not be dropped in this case.
>
> Cheers,
> Paul
>
> On Wed, Jun 29, 2016 at 3:28 PM, Rafael Winterhalter
> <rafael.wth at gmail.com> wrote:
>
>> Paul, this would still break millions of applications, as being able to
>> view resources is a crucial assumption of many libraries. For example,
>> Apache Wicket needs to read HTML pages from the folder (package) of the
>> class that defines the dispatcher of the web page. Shadowing the view
>> would also make it impossible to obtain a class's byte code, which is
>> common practice today. I currently contract for an APM vendor and my job
>> is basically to write plugins for all sorts of Java frameworks and
>> libraries, which is why I think I have a pretty good overview of how this
>> would affect a lot of code. One reason I was so explicit about this issue
>> being problematic is that it would really break a wide range of
>> applications.
>>
>> In this context, I think Mark's suggestion is the best possible idea. I
>> do not see a problem with the randomness of the resources either, as even
>> today the result depends on the order in which jar files are added to the
>> class path, which is by no means stable. As for named modules, package
>> names should be exclusive anyway, and for the unnamed module the behavior
>> does not change.
>>
>> I would still like to see the introduction of some form of
>> ResourceLocatable interface that defines two methods:
>>
>> import java.io.InputStream;
>> import java.net.URL;
>> interface ResourceLocatable {
>>   URL getResource(String name);
>>   InputStream getResourceAsStream(String name);
>> }
>>
>> This would add some structure to the API. As for the different lookup
>> contexts, I think the suggestion made is sound, too:
>> 1. ClassLoader: Exposes any resource reachable by this class loader;
>> therefore an Enumeration<URL> getResources(String) method makes sense, too.
>> 2. Module: Only exposes resources within this module from an absolute
>> location.
>> 3. Class: Only exposes resources within the class's module from a relative
>> location.
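
The ResourceLocatable idea and the lookup contexts listed above can be sketched as follows. This is only an illustration under stated assumptions: `ResourceLocators`, `forLoader` and `forClass` are hypothetical names that do not exist in the JDK, and the Module context is left out because the thread only mentions `getResourceAsStream` for Module.

```java
import java.io.InputStream;
import java.net.URL;

// Hypothetical interface taken from the message above; parameter names added.
interface ResourceLocatable {
    URL getResource(String name);
    InputStream getResourceAsStream(String name);
}

// Hypothetical adapters for illustration; not part of any JDK API.
final class ResourceLocators {
    private ResourceLocators() {}

    // ClassLoader context: names are absolute and '/'-separated.
    static ResourceLocatable forLoader(final ClassLoader loader) {
        return new ResourceLocatable() {
            @Override public URL getResource(String name) {
                return loader.getResource(name);
            }
            @Override public InputStream getResourceAsStream(String name) {
                return loader.getResourceAsStream(name);
            }
        };
    }

    // Class context: names without a leading '/' are resolved relative to
    // the package of the given class.
    static ResourceLocatable forClass(final Class<?> type) {
        return new ResourceLocatable() {
            @Override public URL getResource(String name) {
                return type.getResource(name);
            }
            @Override public InputStream getResourceAsStream(String name) {
                return type.getResourceAsStream(name);
            }
        };
    }
}
```

With such adapters, calling code could accept a ResourceLocatable without caring whether the lookup is loader-wide or scoped to a single class.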
>>
>> Great work on this, I am really glad this proposal was made in this way.
>> Cheers, Rafael
>>
>> 2016-06-29 16:54 GMT+02:00 Paul Benedict <pbenedict at apache.org>:
>>
>> > A reply to the "observers" list...
>> >
>> > Would a decent compromise be that resources in META-INF should be
>> > unconditionally public? I am just saying that perhaps the problem needs
>> > to be framed in a way where library producers *know* where to put their
>> > public resources. I am in no way able to say if dropping all
>> > encapsulation for resources is the right thing -- but if you wanted to
>> > save encapsulation in any form, I'd be interested in hearing opinions on
>> > what I just proposed. Hopefully allowing META-INF to be publicly
>> > readable is in line with existing SE and EE standards/specs anyway, and
>> > this would enforce compliance.
>> >
>> > Cheers,
>> > Paul
>> >
>> > On Wed, Jun 29, 2016 at 8:21 AM, David M. Lloyd
>> > <david.lloyd at redhat.com> wrote:
>> >
>> > > Responses inline & at bottom.
>> > >
>> > > On 06/28/2016 04:20 PM, Mark Reinhold wrote:
>> > >
>> > >> Issue summaries
>> > >> ---------------
>> > >>
>> > >>    #ClassFilesAsResources --- If a type is visible and was loaded
>> > >>    from a class file then it should be possible to read that file by
>> > >>    invoking the `getResourceAsStream` method of the type's class
>> > >>    loader, as it is in earlier releases. [1]
>> > >>
>> > >>    #ResourceEncapsulation --- The `Module::getResourceAsStream`
>> > >>    method can be used to read the resources of any named module,
>> > >>    without restriction, which violates the resource-encapsulation
>> > >>    requirement [2].  This method should be restricted somehow so that
>> > >>    only "suitably-privileged" code (for some definition of that term)
>> > >>    can access resources in a named module other than its own.  An
>> > >>    alternative is to drop this requirement. [3]
>> > >>
>> > >> Proposal
>> > >> --------
>> > >>
>> > >> Drop the agreed resource-encapsulation requirement, which reads:
>> > >>
>> > >>    Resource encapsulation --- The run-time system must ensure that
>> > >>    the static resource files within a module are directly accessible
>> > >>    only by code within that module.  The existing resource-access
>> > >>    APIs should continue to work as they do today when used to access
>> > >>    a module's own resource files. [2]
>> > >>
>> > >
>> > > OK so far.
>> > >
>> > >> Make the following changes to the various `getResource*` methods:
>> > >>
>> > >>    - The `ClassLoader::getResource*` methods will first delegate to
>> > >>      the loader's parent, if any, and will then search for resources
>> > >>      in the named modules defined to the loader, and will then search
>> > >>      in the loader's unnamed module, typically defined by a class
>> > >>      path.  In the case of the `getResources` method the order of the
>> > >>      resources in the returned enumeration will be unspecified, except
>> > >>      that the values of `cl.getResources(name).nextElement()` and
>> > >>      `cl.getResource(name)` will be equal.
>> > >>
>> > >
>> > > I am really not a fan of unspecified resource order, especially when
>> > > you can easily derive a stable order from the graph topology.
>> > >
>> > > Resources however should always load from the child-most loader first.
>> > > This isn't a problem in Jigsaw for classes because packages cannot be
>> > > duplicated between modules (in a visible manner), but it becomes a
>> > > problem with resources, especially if/when run-time cycles are
>> > > permitted: you can run into an awkward situation where two mutually
>> > > dependent modules each may see the other's resource "first" before
>> > > their own, which is almost always undesirable.
>> > >
>> > >>    - The `Class::getResource*` methods will only search for resources
>> > >>      in the module that defines the class.  In the case of a class in
>> > >>      an unnamed module this will typically result in searching the
>> > >>      class path of the corresponding class loader.
>> > >>
>> > >
>> > > Does this also apply to automatic modules?
>> > >
>> > >> The `java.lang.reflect.Module::getResourceAsStream` method will
>> > >> remain, so that it's possible to look up a resource in a module
>> > >> without having a reference to a class defined in that module.  It will
>> > >> always be the case that, for a given class `c`,
>> > >> `c.getResourceAsStream(name)` will be equivalent to
>> > >> `c.getModule().getResourceAsStream(name)`.
>> > >>
>> > >
>> > > I believe that in the former case, the "name" was historically
>> > > relative to the location of the class.  Is that behavior changing?
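
To make the question about relative names concrete, here is a minimal sketch of the name-resolution rule documented for `Class.getResource`: a leading '/' marks an absolute name, and any other name is prefixed with the class's package path. `RelativeNames` and `resolve` are hypothetical helpers for illustration, and array classes are deliberately not handled.

```java
// Hypothetical helper that mirrors the documented Class.getResource naming
// rule; it performs no lookup and does not handle array classes.
final class RelativeNames {
    private RelativeNames() {}

    static String resolve(Class<?> type, String name) {
        if (name.startsWith("/")) {
            // Absolute name: drop the leading slash and use it as given.
            return name.substring(1);
        }
        String className = type.getName();
        int lastDot = className.lastIndexOf('.');
        if (lastDot < 0) {
            // Class in the default package: nothing to prepend.
            return name;
        }
        // Relative name: prepend the package with '.' replaced by '/'.
        return className.substring(0, lastDot).replace('.', '/') + "/" + name;
    }
}
```

For example, `resolve(String.class, "String.class")` gives the absolute name that a class loader lookup would need, while the same string passed unchanged to a lookup that expects absolute names would refer to a file at the root.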
>> > >
>> > >> Rationale
>> > >> ---------
>> > >>
>> > >> The encapsulation of resources was intended to be part of the overall
>> > >> encapsulation story, on the view that a module's resources are just
>> > >> internal implementation details that need not, and should not, be
>> > >> available to code in other modules.  Allowing external code to depend
>> > >> upon a module's internal resources is just as problematic as allowing
>> > >> such code to depend upon internal APIs.
>> > >>
>> > >> The reality of existing code today, however, is rather different,
>> > >> with two common use cases:
>> > >>
>> > >>    - Resources are often used intentionally to convey information to
>> > >>      external code, e.g., as a way to publish configuration files
>> > >>      such as `persistence.xml`.  This sort of thing should, ideally,
>> > >>      be done via services [4][5], but that is not the current practice
>> > >>      and it will take years to migrate to it.
>> > >>
>> > >
>> > > In many cases services are too limiting.  It is useful for users to
>> > > have a more powerful, general facility available that can use the same
>> > > basic mechanism as services, but for arbitrary resources.  For example,
>> > > one case that came up quite recently (on Jigsaw-dev I think) was that
>> > > it is useful to acquire not the service instance, but the class name,
>> > > so that it could be constructed in various ways.  Services as they are
>> > > now require another front-end class to encapsulate this kind of
>> > > variation. Also it is not uncommon for a situation to arise where many
>> > > classes must be produced for variant services which all have the same
>> > > basic behavior but differ only in details, which is awkward and can
>> > > result in large quantities of similar classes.
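
The "class name, not instance" use case can be sketched with plain resource loading. The code below assumes providers are declared in class-path `META-INF/services/<service>` files in the usual provider-configuration format (one class name per line, '#' starting a comment); `ServiceFiles`, `parse` and `providerNames` are hypothetical names, and providers declared only in a module descriptor would not be seen this way.

```java
import java.io.BufferedReader;
import java.io.IOException;
import java.io.InputStreamReader;
import java.io.UncheckedIOException;
import java.net.URL;
import java.nio.charset.StandardCharsets;
import java.util.ArrayList;
import java.util.Enumeration;
import java.util.List;

// Hypothetical helper: collects provider class names without instantiating them.
final class ServiceFiles {
    private ServiceFiles() {}

    // Strips a trailing '#' comment and whitespace; returns null for empty lines.
    private static String parseLine(String line) {
        int hash = line.indexOf('#');
        String trimmed = (hash >= 0 ? line.substring(0, hash) : line).trim();
        return trimmed.isEmpty() ? null : trimmed;
    }

    // Parses the text of one provider-configuration file.
    static List<String> parse(String content) {
        List<String> names = new ArrayList<>();
        for (String line : content.split("\r?\n")) {
            String name = parseLine(line);
            if (name != null) {
                names.add(name);
            }
        }
        return names;
    }

    // Reads every META-INF/services/<serviceName> file visible to the loader.
    static List<String> providerNames(ClassLoader loader, String serviceName) {
        List<String> names = new ArrayList<>();
        try {
            Enumeration<URL> files = loader.getResources("META-INF/services/" + serviceName);
            while (files.hasMoreElements()) {
                try (BufferedReader reader = new BufferedReader(new InputStreamReader(
                        files.nextElement().openStream(), StandardCharsets.UTF_8))) {
                    String line;
                    while ((line = reader.readLine()) != null) {
                        String name = parseLine(line);
                        if (name != null) {
                            names.add(name);
                        }
                    }
                }
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
        return names;
    }
}
```

The caller then decides how to construct each named class, which is the flexibility the paragraph above asks for.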
>> > >
>> > > It would be nice if service loading and resource loading continued to
>> > > follow a similar set of rules.  When Jigsaw switched ServiceLoader
>> > > from a convenience API over resources to a separate concept, it lost
>> > > the concept of using resources *as* an API component or participant,
>> > > which is a powerful feature.  This change will restore that to some
>> > > degree, but it is still strange that loading a
>> > > META-INF/services/com.foo.Service file will have different search
>> > > semantics than using ServiceLoader. Basically we are taking this
>> > > capability from users unless they do it the way that Jigsaw wants you
>> > > to: via ServiceLoader, and no other way; it almost seems like Jigsaw
>> > > is attempting to take on the characteristics of a container (not a
>> > > very powerful one though because it only supports linking via services
>> > > (a pretty thin concept when you think about it), not by (for example)
>> > > annotations or anything else).
>> > >
>> > > Anyway I hope you can extract some kind of cogency from all that.
>> > >
>> > >>    - Various popular tools and byte-code manipulation frameworks
>> > >>      expect to be able to load class files as resources, i.e., to
>> > >>      access the class file that defines a class `p.q.C` by invoking
>> > >>      `getResource("p/q/C.class")` on that class's loader.
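
The class-files-as-resources pattern described above can be sketched as follows, assuming the conventional resource name for a class `p.q.C` is `p/q/C.class`. `ClassFileResources`, `resourceNameOf` and `magicOf` are hypothetical names used only for this illustration; array and primitive types are not handled.

```java
import java.io.DataInputStream;
import java.io.IOException;
import java.io.InputStream;
import java.io.UncheckedIOException;

// Hypothetical helper showing how tools typically locate a class file.
final class ClassFileResources {
    private ClassFileResources() {}

    // Binary class name to resource name, e.g. "p.q.C" becomes "p/q/C.class".
    static String resourceNameOf(String binaryClassName) {
        return binaryClassName.replace('.', '/') + ".class";
    }

    // Reads the first four bytes of the class file as a big-endian int.
    // A well-formed class file starts with 0xCAFEBABE.
    static int magicOf(Class<?> type) {
        String name = resourceNameOf(type.getName());
        ClassLoader loader = type.getClassLoader();
        // Classes defined by the bootstrap loader report a null class loader,
        // so fall back to the system class loader lookup in that case.
        try (InputStream in = loader != null
                ? loader.getResourceAsStream(name)
                : ClassLoader.getSystemResourceAsStream(name)) {
            if (in == null) {
                throw new IllegalStateException("not visible as a resource: " + name);
            }
            return new DataInputStream(in).readInt();
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }
}
```

A byte-code tool would read the whole stream rather than just the header; the point here is only that the lookup goes through the loader with a '/'-separated name.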
>> > >>
>> > >> With the API as it stands today these use cases can still be
>> > >> supported, but the code that looks up the resource must be converted
>> > >> to use the `Module::getResourceAsStream` method.  This is a barrier to
>> > >> migration, since it means that fewer existing frameworks will work
>> > >> out-of-the-box as automatic modules.
>> > >>
>> > >> We should not drop agreed requirements lightly, but now that we have
>> > >> some practical experience with an implementation of this one I think
>> > >> it's clear that the pain outweighs the gain.
>> > >>
>> > >
>> > > Overall I think this is a step in the right direction, but I think the
>> > > discussion needs to carry on a bit longer.
>> > >
>> > > --
>> > > - DML
>> > >
>> >
>>
>
>

