Suggestion: allow accessible reflection on protected methods of exported types.

Tue Jan 3 18:30:26 UTC 2017

Hi,

> From: jigsaw-dev [mailto:jigsaw-dev-bounces at openjdk.java.net] On Behalf Of Alan Bateman
> On 29/12/2016 16:24, Rafael Winterhalter wrote:
> 
> > :
> >
> > I argue that this module system should not be responsible to assert the
> > usage of such protected methods on exported types. Such methods are not
> > really encapsulated as they are considered official API by subclasses.
> Right, they are intended to be used by sub-classes and this hasn't
> changed, meaning that you should have no issue accessing a protected
> member defined in the super class with bytecode or core reflection as
> before (no setAccessible in the picture).
> 
> With strong encapsulation then having setAccessible treat protected
> members as public would mean we would be back to the hole that is
> #AwkwardStrongEncapsulation. It would mean anyone could access
> protected
> members that are not intended to be accessed in this way. Also in the
> case of the core modules then it conflicts with the goal to mprove
> platform integrity.

I strongly agree with this! This is one reason why the Lucene/Solr/Elasticsearch team spends much work to have everything in shape for Java 9. Especially for Elasticsearch, security has a high priority. The worst for projects like Elasticsearch, but also Solr, is the fact that they depend on hundreds of external libraries that behave wrong - really wrong! And trust me, not any of those libraries uses doPrivileged or setAccessible in a correct way. They even try to do setAccessible in static initializers of classes!!!! This brings huge problems, as you don't really can predict, when the classes are initialized and their clinit is executed. So it is impossible to make them work with a SecurityManager at all. With Java 9 we are now in a good shape, because people are in fact now facing the issue what their "hacks" are doing. In Elasticsearch there is code that "touches" many classes from foreign libraries just to execute their clinit (example workaround: https://goo.gl/STNjv6). This is also done before the security manager is installed. After that everything is closed down. But it would be much simpler if those libraries would not do "security relevant" stuff in clinit and (on top) without doPrivileged and without any fallback, if the stuff they try to do is not allowed. We have seen Amazon AWS libs that just call setAccessible on some javax.security classes and fail completely (ExceptionInInitializer). Just to make some small performance improvement or hack around some "optional" security feature - just horrible. We know: All those libs would work with Java 9 (maybe some optional features are not available), but this makes them useless. With Java 9 we hope somebody fixes it - otherwise it wouldn't work at all. This also lead to the fact that we reduced all dependencies (removal of Guava and commons-lang/beans/...) and slimmed down our deps. Elasticsearch now runs with a security manager that forbids almost everything. Most funny is that any script code or plugin cannot call System.exit() or similar. Only exactly one place can shut down the server (we inspect stack trace in SecurityManager and allow exit of JVM only under one stack trace of official shutdown method...).

> As regards the change you are seeing now then it has been in the Jigsaw
> EA builds for some time but only merged into the JDK 9 builds at
> jdk-9+148. There is further spec work required to deal with
> protected-static and protected constructors but that is a topic for
> another thread. There is of course a compatibility impact with this
> change. It will expose many hacks, a number of them have come up on this
> mailing list already. It will also impact code that relies on being able
> to use setAccessible to get at protected-instance methods that it would
> not otherwise have access to in bytecode.
> 
> For mitigation then the `--add-opens` command line option can be used to
> open any package of any module and so can be used to keep existing code
> working. Ideally of course this would not be needed but it requires
> effort from the maintainers of many important libraries and tools.
> There has been an outreach effort going on for the last 2+ years to try
> to get an many projects and libraries working with us so that the eco
> system will be in reasonable shape by the time that JDK 9 ships. In
> general then JDK 9 is very different to past releases in that it comes
> with many disruptive changes and requires updates across many areas of
> the eco system before it is generally usable by the average developer.
> 
> As regards defineClass then I see you've found Unsafe::defineClass. Many
> important libraries in the eco system rely on Unsafe and it will
> continue to be accessible in JDK 9. All the details on that are in JEP
> 260. Long term then Unsafe needs to go away of course and that will
> require working on some challenging use-cases. The VarHandles work in
> JEP 193 is a huge step forward but there are several other areas that
> will need effort. Using Unsafe::defineClass might feel like a regression
> but I'm not aware of any current efforts that would result in a "safe"
> way to do this type of injection (say a restricted defineClass that
> would only allow injecting into open packages for example).

About the defineClass problem: In Elasticseach and Lucene we don't ever do this. It has shown that scripts (Lucene Expressions module or Elasticsearch Painless language) are loaded in a private class loader, execute exactly one query and then are free for garbage collection. We have seen no significant overhead at all, sorry! Most people complaining do wrong micro-benchmarks. You have to look at the whole system!

Furthermore, a single-class classloader has significant advantages:
- cheap to create (subclass plain stupid ClassLoader, so expensive stuff like URLClassLoader is avoided)
- The code is garbage collected ASAP (e.g., after the Lucene/Elasticsearch query is done).
- Needs no problematic access rights like setAccessible
- No need for unique class names (all Lucene Expressions or Elasticsearch Painless scripts have same class name and the script source code is part of the bytecode's "filename", so you have meaningful stack traces), e.g.: https://goo.gl/PUyY51 and https://goo.gl/gjSxoF

Thanks to Rafael for adding Unsafe to the game in Mocking libraries (CGLIB, Byte Buddy). As injecting classes into foreign classloaders is really security relevant and unsafe, the place in sun.misc.Unsafe is correct! But I'd suggest that all those mocking libraries please offer the way to do both ways (separate loader instead of injecting). I agree sometimes injecting is needed - but not by default (sorry! That's my personal opinion!). In Apache Solr we have to disable all mocking tests for the time being, if Java 9 is detected. There is an issue about this already: https://issues.apache.org/jira/browse/SOLR-9893 - Another option would be to add an "add-opens to java.lang" when test runner spawns randomized test framework sub process for executing Solr tests (with ANT this should be easy). I have not yet done this (Lucene testing is still on build 147 because we wait for Groovy 2.4.8 to be released).

But Lucene also uses "horrible" stuff, but correctly wrapped with security code...: In Lucene we also adopted the ByteBuffer unmapping code to use the new sun.misc.Unsafe unmapper (until an official way to do this is supported with Java 10, looking forward to Andrew Haley's ideas and next FOSDEM): https://goo.gl/EzXeUq; as this is very unsafe (can lead to SIGSEGV/SIGBUS or other bad things™, we are happy that it now lives in sun.misc.Unsafe, too. Our code uses MethodHandles to early bind everything early, so we are sure that it actually works. It wraps this early binding with doPrivilged, but keeps the actual method handle fully isolated (others may stiull get it with reflection, but that's not our problem...!). Nothing else in Lucene ever calls setAccessible in runtime classes. We use Forbidden-Apis to prevent method signatures in bytecode which we don't want to have anywhere in Lucene, Solr, or Elasticsearch (list of forbidden signatures is here: https://goo.gl/1mx8rm - and also the defaults of  https://github.com/policeman-tools/forbidden-apis, e.g. https://goo.gl/VEyXwq).

> One final point about ClassLoader::defineClass is that java agents and
> tools doing bytecode instrumentation have the power to redefine modules
> and classes and so can break into all modules. I guess you are very
> familiar with this already.

Uwe