Feedback on proposal for #ReflectiveAccessToNonExportedTypes

Thu Jul 21 17:01:11 UTC 2016

> On Jul 18, 2016, at 4:30 PM, mark.reinhold at oracle.com wrote:
> 

Hi Mark, Thanks for the reply. I have snipped out portions to make it easier to follow the thread.

>> 
>> I agree they are certainly intermixed elements of a system, but I’d also
>> argue IoC is pervasive in SE applications as well (e.g. inclusion of 330
>> and 250 in SE are examples of a desire for SE usage). I can’t refute that
>> it has greater usage in EE, since it’s part of the spec, and thus
>> effectively every EE application.
> 
> FYI, JSR 330 (DI annotations) is not in Java SE, though it's certainly
> used in Java SE applications in combination with various DI frameworks.
> 
> JSR 250 ("common" annotations) specifies 14 annotations, but just five
> of them are in Java SE.  They're really only there to support JAX-WS, a
> component shared with Java EE.  So far as I know they're not used much
> in SE applications except in conjunction with JAX-WS.

Ah yes, of course, thanks for the correction. Not sure why I had it in my head that 330 was included.

-snip-

>> 
>> Sorry for the confusion, what I was trying to say on this point was a bit
>> different. What I was trying to say was:
>> 
>> (2') It weakens encapsulation by forcing the introduction of exports
>>    introducing potential conflicts that break applications.
>> 
>> As an example, assume I have three modules with classloader-per-module
>> isolation (A, B, and Victim)
>> 
>> - A exports foo, and has a non-exported package “bar”
>> 
>> - B exports bar
>> 
>> - Victim has a module-info with requires A; requires B
>> 
>> Now A decides to use IoC on some of its classes in bar, so its
>> definition is changed to:
>> 
>> { exports foo; exports dynamic bar; }
>> 
>> Since exports dynamic is internally a normal export at runtime, module
>> resolution fails when loading Victim, because it’s now including a
>> duplicate package, even though A had no intention of publishing its
>> internal bar package for linkage.
> 
> Got it.  Thanks for clarifying this -- I agree that it's a problem.
> 
> Fortunately I think we can address it simply by revising the semantics of
> `exports dynamic p` to omit the package-conflict constraint.  This would
> allow split packages to occur more readily at run time, though still
> really only in fairly obscure situations involving poorly-written class
> loaders.

That would help, but there are also class visibility issues that would need to be addressed.

Example 1 (Ambiguous class names):

Both A and B export “bar”, and both define “bar.MyClass”, with differing definitions. Victim could load A’s MyClass, which was supposed to be hidden, instead of the intended MyClass from B.

There is also a variant of this where the conflict is between Victim and A if A also exports another hidden package that is present in Victim itself.

Example 2 (Unintentional discovery):

Victim uses ClassLoader.getResources (plural), looking for a standard configuration file or class name, and receives entries for both A and B. A’s entry was not intended to be discovered by Victim, and leads to a failure state. As an example, perhaps the configuration file in B specifies a class name in B’s dependency, which is not visible to Victim. Or perhaps A’s config leads to duplicate runtime actions being configured (since the file was really only intended for A, which also processes it).
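
To make Example 2 concrete, here is a minimal sketch of the discovery behaviour. It does not use the module system at all: a plain URLClassLoader over two temporary directories stands in for a loader that can see both A and B. The resource name META-INF/framework.properties and the class and method names are hypothetical.

```java
import java.io.IOException;
import java.io.UncheckedIOException;
import java.net.URL;
import java.net.URLClassLoader;
import java.nio.charset.StandardCharsets;
import java.nio.file.Files;
import java.nio.file.Path;
import java.util.Collections;
import java.util.List;

final class DuplicateResourceDemo {

    // Hypothetical name; stands in for any "standard" configuration file.
    static final String CONFIG = "META-INF/framework.properties";

    /** Creates a temp directory that stands in for one module's contents. */
    static Path fakeModuleRoot(String owner) throws IOException {
        Path root = Files.createTempDirectory(owner);
        Path file = root.resolve(CONFIG);
        Files.createDirectories(file.getParent());
        Files.write(file, ("owner=" + owner).getBytes(StandardCharsets.UTF_8));
        return root;
    }

    /** Returns every copy of CONFIG visible through a loader that sees both A and B. */
    static List<URL> discover() {
        try {
            URL[] roots = {
                fakeModuleRoot("A").toUri().toURL(),
                fakeModuleRoot("B").toUri().toURL()
            };
            // Parent is null so only the two fake roots can contribute entries.
            try (URLClassLoader loader = new URLClassLoader(roots, null)) {
                return Collections.list(loader.getResources(CONFIG));
            }
        } catch (IOException e) {
            throw new UncheckedIOException(e);
        }
    }

    public static void main(String[] args) {
        // Both A's and B's copies are listed; the caller cannot tell which was intended.
        discover().forEach(System.out::println);
    }
}
```

The caller asked for one well-known name and got two entries back, which is the situation Victim would be in if A’s hidden package became visible as a side effect of a dynamic export.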

You can potentially address 1 with precedence, but not 2. 

I think you would need to say that export dynamic is only usable for reflection permissions and has no other similarity with “export” (although perhaps that’s what you meant?).

If you combine that approach with a wildcard capability like you mentioned earlier, then I’ll admit it’s very hard for me to quibble over a one-line additional requirement in module-info.java.

Although, for completeness, let me (re?)introduce one other consideration that was briefly mentioned (although with sparing details) earlier in the thread:

If you have a custom serialization framework that is supposed to be identical to Java serialization in contract, then it becomes impossible to mirror that contract using the only available standard means (core reflection), since that mechanism disallows non-exported packages. Currently a custom serialization framework only needs to handle one non-standard case (missing no-arg constructor). Going forward it would need to use Unsafe for everything.
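
As a rough illustration of what such a framework does with core reflection today, the sketch below captures an object’s non-static, non-transient declared fields, which is roughly the default field set Java serialization would write. The Account type and the snapshot method are hypothetical, and superclass fields are ignored for brevity. Under the proposal being discussed, the setAccessible call is the step that would fail for a type in a non-exported package.

```java
import java.lang.reflect.Field;
import java.lang.reflect.Modifier;
import java.util.LinkedHashMap;
import java.util.Map;

final class ReflectiveSnapshot {

    /** Hypothetical application type whose state is only reachable reflectively. */
    static final class Account {
        private final String owner;
        private transient int cachedHash = 42; // skipped, as Java serialization would

        Account(String owner) {
            this.owner = owner;
        }
    }

    /** Captures the non-static, non-transient fields declared by the object's class. */
    static Map<String, Object> snapshot(Object target) {
        Map<String, Object> state = new LinkedHashMap<>();
        for (Field f : target.getClass().getDeclaredFields()) {
            int mods = f.getModifiers();
            if (Modifier.isStatic(mods) || Modifier.isTransient(mods)) {
                continue;
            }
            try {
                f.setAccessible(true); // the call that depends on reflective access being allowed
                state.put(f.getName(), f.get(target));
            } catch (IllegalAccessException e) {
                throw new IllegalStateException("cannot read " + f, e);
            }
        }
        return state;
    }

    public static void main(String[] args) {
        System.out.println(snapshot(new Account("jason")));
    }
}
```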
> 
>> Qualified exports could in theory address this problem, but they are
>> problematic in a dynamic environment since the module is simply not in a
>> position to know all of the various modules which would/could enhance
>> it. ...
> 
> I completely agree.  In general I think it's inappropriate for the author
> of a module to write qualified exports except in some very special cases,
> and this is most definitely not one of them.
> 
>>> These observations lead to your suggestion to allow declarative module
>>> boundaries to be overridden by "trusted" framework code.  It's far from
>>> clear how to define such a facility in a way that would still allow us to
>>> achieve one of our primary goals, namely strong encapsulation, i.e., the
>>> ability of the author of a module to declare which types are accessible
>>> by other components, and which are not.
>> 
>> My understanding was the underlying driver was a security concern (that’s
>> where I was going with that suggestion). Is that accurate?
> 
> It's a question of both security (i.e., preventing vulnerabilities) and
> integrity (i.e., respecting the intent of a module's author).
> 
> For security, the problem with introducing an explicit notion of trust
> into any system, or in this case an additional explicit notion of trust,
> is that you then have to figure out how to prevent it from being used as
> an attack vector.
> 
> Yes, we could define a notion of "trusted" modules whose code can reflect
> arbitrarily.  Even if we implement and use it completely correctly in the
> JDK, however, there would be an API for it, and that API would be used by
> external library and application code, and eventually somebody, somewhere,
> would make a mistake that leads to a CVSS 10 in some production code.

I think it could be done in a fairly clean manner. For example, you could utilize something like JCE providers, and code which needs to obtain this trust has to be signed by a framework provider with a certificate that’s either been signed by a central JDK CA, or has been explicitly deployed as a trust within the JVM. There would be a burden for framework developers, but I assume they would prefer this over a limited model.

You could also reuse the security manager infrastructure, and just add a new permission. That would mean giving up on protection when there is no security manager, but if the user isn’t using a security manager then that implies they are on a platform with all trusted code.

> 
> In the Jigsaw design so far, therefore, we've tried instead to leverage
> existing implicit notions of trust wherever possible.
> 
> We already trust whoever has control over the invocation of the Java
> run-time environment.  If you can edit the `java` command line, or its
> equivalent, then you can already do pretty much anything you want, so
> there's little additional risk in providing command line options (as
> we do [1]) to break module-encapsulation boundaries.
> 
> We also already trust whoever writes systems that explicitly load
> classes, i.e., container developers.  If you can modify bytecodes on
> their way into the JVM then you already have significant power, so
> there's little additional risk in giving you the ability to control how
> modules are defined and related in the class loaders that you create.
> (In fact you already have this ability, since you can rewrite module
> descriptors prior to the configuration of a layer, so there's no need
> to create a special API for it.)

So the main issue with these solutions is really the problems we list above. 

The command line approach has compatibility issues with all of the various launch mechanisms. For example, you can’t just bundle a script for the user, because IDEs launch the VM too, and ensuring those are all in sync is brittle. The command line is also too early in the boot process (unless it’s just export everything in all modules).

The class loading approach also has the problem of requiring a particular launch mechanism; you can’t, for example, support a standard SE launch unless you require a particular agent on the command line.

> A natural consequence of this approach is that we need not place total
> trust in a framework in order to use it.  A container can arrange for a
> framework to have reflective access to just the packages of just the
> modules that the framework is going to support, by adding qualified
> dynamic exports as needed, rather than grant it full reflective access
> to all code in the system.
> 
> This approach does mean that container developers have a bit more work
> to do, but that seems a reasonable tradeoff if in return we're able to
> keep the platform simpler, and thus both easier to understand and easier
> to secure.

I think it’s fair to require a few more steps for an advanced capability used by a framework developer. My concern is really more about when that burden spills over onto the end user.

> 
> As to integrity, if the author of a module decides not to export a
> specific package then they should be able to expect that decision to be
> respected, and not overridden lightly by any random code in the system
> that uses reflection.  Whoever controls the `java` command line or sets
> up the container in which the module ultimately runs can override that
> encapsulation decision, and they may have very good and legitimate
> reasons to do so.  At that point, however, they take on the burden of
> bearing the consequences of any violation of the module author's intent.
> If a future version of the module, e.g., removes an encapsulated package
> in a way that causes reflective code to fail then that is not the fault
> of the module author but, rather, of whoever decided to break into the
> package.

Wouldn’t you agree though that we already have this balance today? You have to use setAccessible and a security permission to override the accessibility contract expressed by the developer. 
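
A small sketch of that existing balance, using hypothetical Vault and AccessContractDemo classes: a plain reflective read of a private field is refused, and the caller has to opt in with setAccessible(true) to override the declared accessibility. When a security manager is installed, that call is additionally checked against ReflectPermission("suppressAccessChecks"); the sketch assumes none is installed.

```java
import java.lang.reflect.Field;

/** Hypothetical type whose author declared the field private. */
final class Vault {
    private final String secret = "s3cret";
}

final class AccessContractDemo {

    /** Reads Vault.secret reflectively; returns "denied" if the access check refuses. */
    static String tryRead(Vault vault, boolean override) {
        try {
            Field f = Vault.class.getDeclaredField("secret");
            if (override) {
                f.setAccessible(true); // explicit opt-in to bypass the access check
            }
            return (String) f.get(vault);
        } catch (IllegalAccessException e) {
            return "denied";
        } catch (NoSuchFieldException e) {
            throw new IllegalStateException(e);
        }
    }

    public static void main(String[] args) {
        Vault v = new Vault();
        System.out.println(tryRead(v, false));
        System.out.println(tryRead(v, true));
    }
}
```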

-snip-

>> 
>> Perhaps this problem could be addressed by an indirection
>> that represents a role or an actor (e.g. something like "export dynamic *
>> to role runtime”).  I haven’t thought that through though.
> 
> Happily, I don't think we need to go that far.

I agree that modifying exports dynamic to avoid side effects is a superior solution to the indirection idea.

> 
> A container is in complete control of the class loaders that it creates,
> so it already has the power to load arbitrary classes into arbitrary
> modules.  If a container needs to add a dynamic export at run time then
> it can synthesize a tiny class whose static initializer invokes the
> java.lang.Module::addExports method as needed, and then define that class
> in the target module.
> 
> We already use this technique in the JDK's Nashorn JavaScript engine,
> which loads script classes into modules that are defined completely at
> run time.  (As Alex mentioned, there will be a talk on this at the
> upcoming JVM Language Summit, and the video will be available shortly
> thereafter.  We'll post a link here when it's available.)

Thanks, that would be useful to see.

--
Jason T. Greene
WildFly Lead / JBoss EAP Platform Architect
JBoss, a division of Red Hat