Feedback on proposal for #ReflectiveAccessToNonExportedTypes

Thu Jul 14 03:27:45 UTC 2016

> On Jul 13, 2016, at 4:47 PM, mark.reinhold at oracle.com wrote:
> 
> Jason -- thanks for your feedback on this topic.

Hi Mark,

Thanks for your reply! My thoughts are inline. I apologize in advance for the length/verbosity. Also, as a general disclaimer, I realize that you are all experts; in many of my arguments I restate certain concepts that I know you are all intimately aware of, in order to frame the argument. Corrections are, as always, welcome.

> 
> To put what Alex wrote in a somewhat different way, I'd say that the
> tension here is between explicit configuration (as one finds today in,
> e.g., the Maven world) and implicit configuration (IoC).

Just a small nit: IoC can also be explicit; it’s just that the explicitness is decoupled from the module and controlled by another party, allowing for more flexibility in the assembled system.

> Both approaches
> are important.  The former is typical of standalone Java SE applications
> while the latter is typical of Java EE applications, though the two
> approaches are often intermixed.

I agree they are certainly intermixed elements of a system, but I’d also argue IoC is pervasive in SE applications as well (e.g. the inclusion of JSR 330 and JSR 250 in SE is an example of a desire for SE usage). I can’t refute that it has greater usage in EE, since it’s part of the spec, and thus of effectively every EE application.

I think a better use case categorization of this problem is static linkage vs dynamic invocation. In static linkage an explicit symbol mapping resolved by the language itself is ideal as it avoids ambiguity, and by definition is static. On the other hand with dynamic invocation it’s common for the caller to utilize introspection and discovery as part of the natural flow of executing a dynamic call. Resolving ambiguity is not an issue in this case, since it is already handled by the caller as part of introspection.
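
To make the distinction concrete, here is a minimal sketch using plain reflection. The class, method, and annotation names (LinkageSketch, Service, Callback) are made up for illustration and do not come from any framework; the point is only that the dynamic caller finds its target by introspection rather than by a compile-time symbol.

```java
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.reflect.Method;

public class LinkageSketch {
    // Hypothetical marker annotation, standing in for an IoC-style annotation.
    @Retention(RetentionPolicy.RUNTIME)
    @interface Callback {}

    public static class Service {
        @Callback
        public String start() { return "started"; }

        public String stop() { return "stopped"; }
    }

    // Static linkage: Service.start() is a symbol resolved at compile time.
    static String staticCall(Service service) {
        return service.start();
    }

    // Dynamic invocation: the caller discovers the target method by
    // introspecting the object it was handed, then invokes it reflectively.
    static String dynamicCall(Object target) {
        for (Method m : target.getClass().getMethods()) {
            if (m.isAnnotationPresent(Callback.class)) {
                try {
                    return (String) m.invoke(target);
                } catch (ReflectiveOperationException e) {
                    throw new IllegalStateException(e);
                }
            }
        }
        throw new IllegalStateException("no @Callback method found");
    }

    public static void main(String[] args) {
        Service service = new Service();
        System.out.println(staticCall(service));
        System.out.println(dynamicCall(service));
    }
}
```

Both calls reach the same method; only the second depends on the target type being reflectively accessible to the caller at run time.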

> 
> What we have in the design today seems to support the explicit approach
> pretty well, but we're still trying to figure out how best to support the
> implicit approach.

(Thanks for trying to address this concern!)

> 
> If I understand correctly, your view of the present proposal is that:
> 
> (1) It induces too much boilerplate, requiring developers to write
>     `exports dynamic P` for every single package `P` that's subject
>     to reflection by a framework, and

That’s an accurate summary of this point. From the user’s perspective, they have already expressed their intention by including an annotation on their class, or by expressing it in configuration. The proposal requires that they restate that intention in another place. It’s also a potential source of errors/confusion if a user unintentionally expresses an inconsistency.
> 
> (2) It weakens encapsulation too much, by making the types in such a
>     package available for reflection at run time by any module in the
>     system.

Sorry for the confusion; what I was trying to say on this point was a bit different:

(2') It weakens encapsulation by forcing the introduction of exports, which in turn introduce potential conflicts that break applications.

As an example, assume I have three modules with classloader-per-module isolation (A, B, and Victim)

- A exports foo, and has a non-exported package “bar”

- B exports bar

- Victim has a module-info with requires A; requires B

Now A decides to use IoC on some of its classes in bar, so its definition is changed to:
{ exports foo; exports dynamic bar; }

Since exports dynamic is internally a normal export at runtime, module resolution fails when loading Victim, because it now includes a duplicate package (bar, exported by both A and B), even though A had no intention of publishing its internal bar package for linkage.
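
The failure mode can be illustrated with a toy model of the duplicate-package check. This is only a sketch under the assumption stated above (that a dynamic export counts as a normal export at run time); it is not the actual resolver, and the class and method names (ExportConflictSketch, conflicts, setOf) are invented for the example.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.List;
import java.util.Map;
import java.util.Set;
import java.util.TreeSet;

public class ExportConflictSketch {
    // Returns every package exported by more than one of the required modules.
    static Set<String> conflicts(Map<String, Set<String>> exportsByModule,
                                 List<String> requires) {
        Map<String, String> exporterOf = new HashMap<>();
        Set<String> clashes = new TreeSet<>();
        for (String module : requires) {
            Set<String> exported =
                exportsByModule.getOrDefault(module, Collections.<String>emptySet());
            for (String pkg : exported) {
                String earlier = exporterOf.putIfAbsent(pkg, module);
                if (earlier != null && !earlier.equals(module)) {
                    clashes.add(pkg);
                }
            }
        }
        return clashes;
    }

    static Set<String> setOf(String... items) {
        return new HashSet<>(Arrays.asList(items));
    }

    public static void main(String[] args) {
        Map<String, Set<String>> exports = new HashMap<>();
        exports.put("A", setOf("foo"));
        exports.put("B", setOf("bar"));
        List<String> victimRequires = Arrays.asList("A", "B");

        // Before: A keeps bar internal, so Victim sees no duplicate.
        System.out.println(conflicts(exports, victimRequires));

        // After: A adds a dynamic export of bar, modeled here as a plain export.
        exports.put("A", setOf("foo", "bar"));
        System.out.println(conflicts(exports, victimRequires));
    }
}
```

Nothing in Victim or B changed between the two checks; the conflict appears purely because A had to surface an internal package.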

Qualified exports could in theory address this problem, but they are problematic in a dynamic environment since the module is simply not in a position to know all of the various modules which would/could enhance it. Notably, a key aspect of IoC is that modules are decoupled. For example, a container might employ dynamic selection of multiple versions of the code performing the enhancement, and/or it might segment its implementation into multiple implementation modules. Additionally, the code behind a module that uses qualified exports is less reusable, and has to be updated according to everything which may or may not reflectively access it. Ultimately you would end up needing to dynamically generate them, as you mention later on, but there are still problems there as well (I’ll expand further below).

> 
> These observations lead to your suggestion to allow declarative module
> boundaries to be overridden by "trusted" framework code.  It's far from
> clear how to define such a facility in a way that would still allow us to
> achieve one of our primary goals, namely strong encapsulation, i.e., the
> ability of the author of a module to declare which types are accessible
> by other components, and which are not.

My understanding was the underlying driver was a security concern (that’s where I was going with that suggestion). Is that accurate?

I think the goal you list above is laudable but I’m hoping there is room for nuance.

One of the other points I was making in my earlier writeup is that you have two different access roles at play here. There is interaction with a module through an API/contract, and there are runtimes which enhance/augment the implementation of a module. Encapsulation is very important for the former, but it’s counterproductive for the latter. For example, the JDK itself needs to break encapsulation for some of its own actions, in order to implement rich functionality (e.g. Java serialization on non-exported packages), but it also needs to be able to do so without weakening the contract between end-user modules. Containers and frameworks need the same ability to be able to extend the platform with the same powerful capabilities that have already been delivered on Java through Java 8, and to further innovate in the future.

> I'd therefore like to explain
> and explore how the above issues can be addressed with the present
> design.
> 
>                                 * * *
> 
> To point (1), we all know that the most common way for developers to
> write Java code today is with a rich and powerful IDE.  These tools
> already have plenty of built-in cleverness for generating POJO classes,
> deriving precise `import` directives, and ameliorating other kinds of
> boilerplate.  I don't think it would be at all a stretch for such tools
> to generate precise `exports dynamic` directives on demand, based upon
> the presence of IoC-style annotations, and maintain their consistency
> over time.  Just as precise `import` directives in class and interface
> declarations document dependences upon specific types, so precise
> `exports dynamic` directives in a module declaration would document
> the exposition of specific types for reflection at run time.
> 
> If we think it likely that some modules will need to export dozens or
> hundreds of packages, leading to extremely long module declarations, then
> one possible refinement would be to allow a wildcard: `exports dynamic *`
> would export all of a module's packages for reflection at run time.  This
> would likely be straightforward.

It’s certainly true that tooling can address, and even negate, the issue. However, it does require that the IDE understand the specific framework in use. For large standards like Java EE, that’s a reasonable expectation. There are many different frameworks and runtimes that do this, though, so coverage will likely be incomplete and/or lag. I suppose an IDE could just default to exporting everything, but then that makes conflicts more likely.

> 
>                                 * * *
> 
> To point (2), if some packages in a user module need to be exported for
> reflection at run time, and a container wishes to ensure that only select
> "trusted" framework modules can access the types in those packages, then
> that's already expressible today.  We can also ensure that the set of
> packages exported by a module is the same whether it's used standalone
> on Java SE versus inside a container, which as you observe elsewhere in
> this thread [1] could be problematic.
> 
> Suppose, e.g., we have an application module that's written against JPA,
> rather than any specific JPA implementation, and exports the package
> containing its entity classes for reflection at run time:
> 
>   module com.foo.data {
>       requires java.persistence;
>       exports dynamic com.foo.data.model;
>   }
> 
> When used standalone, outside of a container, this module will export the
> package containing its entity classes for reflection at run time.  The
> classes will be accessible to every other module, but from a security and
> integrity standpoint we assume that whoever invokes the run-time system,
> i.e., whoever provides the command-line arguments to the `java` launcher
> or its equivalent, is trusted to ensure that no adversarial modules are
> present.
> 
> When used inside a container, the container already has the power to
> prevent an adversarial module from accessing the module's entity classes.
> That's because we expect containers to load every application into a
> unique layer [2], and a container can rewrite module descriptors when
> configuring a layer.  

Right, for the reasons I listed above, this is really only workable if the container rewrites these values and/or adds them itself. However, there are still some challenges with this. Another capability could come online in a running system in a hot fashion, after the qualified list was computed during initialization. So the set would need to be expanded to cover not just what is required now, but all possible consumers that could be required, and these may not be known yet. As an example, the container administrator hot deploys a new service which snapshots internal state. The module implementation is not known until that code is deployed, forcing the container to bounce all deployments just to recompute the export list.
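
A rough model of the hot-deploy problem, assuming qualified targets are fixed when the layer is configured. This only simulates the bookkeeping; it is not the module system API, and QualifiedExportSketch, exportTo, canReflect, and the module name snapshot.service are hypothetical.

```java
import java.util.Arrays;
import java.util.Collections;
import java.util.HashMap;
import java.util.HashSet;
import java.util.Map;
import java.util.Set;

public class QualifiedExportSketch {
    // package name -> modules the package is (dynamically) exported to
    private final Map<String, Set<String>> targetsByPackage = new HashMap<>();

    // Records a qualified export; in this model it happens once, at initialization.
    void exportTo(String pkg, String... targetModules) {
        targetsByPackage.computeIfAbsent(pkg, k -> new HashSet<>())
                        .addAll(Arrays.asList(targetModules));
    }

    // A module may reflect on a package only if it was named as a target.
    boolean canReflect(String module, String pkg) {
        return targetsByPackage
            .getOrDefault(pkg, Collections.<String>emptySet())
            .contains(module);
    }

    public static void main(String[] args) {
        QualifiedExportSketch layer = new QualifiedExportSketch();

        // Export list computed from the consumers known at initialization.
        layer.exportTo("com.foo.data.model", "hibernate.core", "hibernate.entitymanager");
        System.out.println(layer.canReflect("hibernate.core", "com.foo.data.model"));

        // A service deployed later was not known when the list was computed,
        // so it is locked out until the list is recomputed.
        System.out.println(layer.canReflect("snapshot.service", "com.foo.data.model"));
    }
}
```

In this model the only remedy for the late arrival is to rebuild the export list, which is the redeploy cost described above.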

Perhaps this problem could be addressed by an indirection that represents a role or an actor (e.g. something like “export dynamic * to role runtime”). I haven’t thought that through though.

> This is nothing to be ashamed of -- we fully expect
> it to become a common practice.
> 
> If the container is set up to provide, e.g., Hibernate to this particular
> application, then it could narrow the accessibility of the entity classes
> by rewriting the above module declaration to refine the `exports dynamic`
> directive:
> 
>   module com.foo.data {
>       requires java.persistence;
>       exports dynamic com.foo.data.model
>            to hibernate.core, hibernate.entitymanager;
>   }
> 
> (This is one of the very few use cases for qualified dynamic exports.)
> 
> Whether standalone or in a container the same set of packages is exported
> by the module; the only difference is that, inside the container, the
> exports are qualified.
> 
>                                 * * *
> 
> To sum up, for (1) I agree that unnecessary boilerplate is a bad thing.
> Asking the author of a module to be explicit about which packages are
> exported for reflection at run time, however, is of high value when
> trying to understand how the module fits into a larger system.  The cost
> of such explicitness can, moreover, easily be reduced by the tools that
> almost all Java developers already use.
> 
> For (2), I share your concern and I think it can be addressed within the
> scope of the present design.  At this point I don't see a strong need to
> introduce a way to enable framework code to violate module boundaries
> arbitrarily at run time, and I don't know how to do that without,
> essentially, giving up on one of our primary goals.

Thanks for sharing your perspective. I can respect pushing hard on an ideal. IMHO tweaking, or perhaps slightly reinterpreting, this goal doesn’t mean it wasn’t achieved; it’s just adapting to accommodate a very valuable set of use cases. Another way to look at it is that once you have containers modifying and generating descriptors, you have already transferred authority from the module to the runtime, so why not formalize that in a mechanism that best enables the use case? I’m hopeful we can find a way to do so.

> 
> We could make it easier to rewrite module descriptors, by providing an
> API for that purpose rather than expecting container developers to use
> libraries such as ASM, and perhaps that's worth doing, but it's a
> different issue.

It would be nice if there were a way to provide and/or alter a module definition in something other than bytecode, mainly for optimal generation reasons, but I certainly understand that some things will have to wait as future enhancements.

> 
> - Mark
> 
> 

Thanks again

--
Jason T. Greene
WildFly Lead / JBoss EAP Platform Architect
JBoss, a division of Red Hat