Jigsaw EA feedback for elasticsearch

Alan Bateman Alan.Bateman at oracle.com
Fri Sep 11 10:09:59 UTC 2015


Thanks for the great write-up! A few comments below.

On 11/09/2015 06:07, Robert Muir wrote:
> 2. we have a "jar hell detector" that threw an
> UnsupportedOperationException, because classloader is no longer a
> URLClassLoader, so we can't get the list of urls. This caused all
> tests to fail. I changed the code to parse java.class.path.
Right, code should never assume that the application class loader is
implemented as a URLClassLoader (more on this in JEP 261).
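
For illustration, here is a minimal sketch of the java.class.path approach
Robert describes. The class and method names are hypothetical and are not
taken from the Elasticsearch code base. It assumes every class path entry
is a plain file or directory path, and it skips empty entries.

```java
import java.io.File;
import java.io.UncheckedIOException;
import java.net.MalformedURLException;
import java.net.URL;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper: derives URLs from a class path string instead of
// casting the application class loader to URLClassLoader.
final class ClassPathUrls {
    private ClassPathUrls() {}

    /** Splits a class path on the platform separator and converts each entry to a URL. */
    static List<URL> parse(String classPath) {
        List<URL> urls = new ArrayList<>();
        if (classPath == null || classPath.isEmpty()) {
            return urls;
        }
        for (String entry : classPath.split(Pattern.quote(File.pathSeparator))) {
            if (entry.isEmpty()) {
                continue; // assumption: empty entries are ignored in this sketch
            }
            try {
                urls.add(Paths.get(entry).toUri().toURL());
            } catch (MalformedURLException e) {
                throw new UncheckedIOException(e);
            }
        }
        return urls;
    }

    public static void main(String[] args) {
        // Print the URLs derived from the running JVM's class path.
        for (URL url : parse(System.getProperty("java.class.path"))) {
            System.out.println(url);
        }
    }
}
```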


> 3. we have a "jvm info" api that provides information about the jvm,
> e.g. to assist our engineers in debugging different nodes in the
> cluster. it was not prepared to handle UnsupportedOperationException
> from RuntimeMXBean.getBootClassPath: I fixed it to fall back to
> sun.boot.class.path, otherwise fall back to "unknown".
This is another behavior change. RuntimeMXBean.getBootClassPath() has
always been specified to allow throwing UOE, but the JDK has not needed
to do this until now. The alternative here would be to return an empty
string, but that might cause issues too.
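
A minimal sketch of the fallback chain described in the quoted text,
assuming the caller only needs a display string. The class name is
hypothetical. The sun.boot.class.path property is the one named in the
quoted text and may simply be absent, in which case the sketch returns
"unknown".

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

// Hypothetical helper that tolerates a JVM without a boot class path.
final class BootClassPathInfo {
    private BootClassPathInfo() {}

    static String bootClassPath() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        try {
            // isBootClassPathSupported() tells us whether the call below may throw UOE.
            if (runtime.isBootClassPathSupported()) {
                return runtime.getBootClassPath();
            }
        } catch (UnsupportedOperationException | SecurityException e) {
            // Fall through to the system property.
        }
        String fromProperty = System.getProperty("sun.boot.class.path");
        return fromProperty != null ? fromProperty : "unknown";
    }

    public static void main(String[] args) {
        System.out.println(bootClassPath());
    }
}
```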

> 4. exception serialization tests failed, because we manually serialize
> exceptions. We previously used java serialization, but it causes
> serious trouble because of backwards compatibility breaks between even
> minor jdk versions: this would strike when users try to upgrade their
> jvms for nodes in their cluster with a rolling restart. The tests fail
> because the stacktrace "loses" stuff after deserialization (the module
> version). For now i just disabled the tests on java 9, because I don't
> know how we can support e.g. java 8 and java 9 and populate this stuff
> "optionally" yet without more digging.
Stack traces have been updated to optionally include the module and
version, but this should be a compatible change (except maybe for code
that parses the String representation). As you mention, these tests
would need to be updated any time new fields are added to the serial
form (for standard/Java SE types this should only happen in major
releases).
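
One possible way to read the module information "optionally" from code
that must also run on Java 8 is a reflective lookup, sketched below. The
accessor names getModuleName and getModuleVersion are an assumption based
on the StackTraceElement API as it shipped in Java 9; the EA build
discussed in this thread may have exposed them differently. The class
name is hypothetical.

```java
import java.lang.reflect.Method;

// Hypothetical helper: reads module data from a stack frame when the
// running JDK offers it, and returns null otherwise.
final class StackFrameModuleInfo {
    // Assumed accessor names (Java 9 and later); null when not present.
    private static final Method GET_MODULE_NAME = lookup("getModuleName");
    private static final Method GET_MODULE_VERSION = lookup("getModuleVersion");

    private StackFrameModuleInfo() {}

    private static Method lookup(String name) {
        try {
            return StackTraceElement.class.getMethod(name);
        } catch (NoSuchMethodException e) {
            return null; // older JDK: no module information on stack frames
        }
    }

    static String moduleName(StackTraceElement frame) {
        return invoke(GET_MODULE_NAME, frame);
    }

    static String moduleVersion(StackTraceElement frame) {
        return invoke(GET_MODULE_VERSION, frame);
    }

    private static String invoke(Method method, StackTraceElement frame) {
        if (method == null) {
            return null;
        }
        try {
            return (String) method.invoke(frame);
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }

    public static void main(String[] args) {
        StackTraceElement frame = new Throwable().getStackTrace()[0];
        System.out.println(frame + " module=" + moduleName(frame)
                + " version=" + moduleVersion(frame));
    }
}
```

A custom serializer could write these two values as optional fields and
treat null as "not present" on either side of the wire.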


> 5. we have monitoring apis that provide basic system information,
> similar to #3, for debugging purposes, and to feed monitoring tools so
> people can track the health of the cluster. previously, we used the
> sigar library (JNI) for this, but it has bugs that caused users
> crashes. So we were forced to limit ourselves to what is provided with
> java management apis: which is much less, but we figure it has the
> basics. For some very basic stats, this means we also look for
> com.sun.management apis
> (https://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/package-summary.html)
> and if they are available, we provide the stuff available there too,
> like how much ram is on the machine, swap in use, number of open/max
> file descriptors, and so on. We test what is available and what is not
> based on platform so we can detect if something changes in the JDK,
> like what happens with jigsaw, where they all become unavailable.
I'm not sure that I understand the issue here, but just to say that the
com.sun.management API is a documented/supported API and it is exported
by module jdk.management:

$ java -listmods:jdk.management

jdk.management at 9.0
   requires public java.management
   requires mandated java.base
   exports com.sun.management
   conceals com.sun.management.internal
   provides sun.management.spi.PlatformMBeanProvider with com.sun.management.internal.PlatformMBeanProviderImpl
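
A sketch of probing the exported com.sun.management API without a
compile-time dependency on it, assuming the goal is to report a few basic
statistics only when they are available. The class name and map keys are
hypothetical. Methods are looked up on the exported interface rather than
on the bean's implementation class, since the implementation lives in a
concealed package.

```java
import java.lang.management.ManagementFactory;
import java.lang.reflect.Method;
import java.util.LinkedHashMap;
import java.util.Map;

// Hypothetical helper: collects optional OS statistics via reflection.
final class OsStats {
    private static final String OS_BEAN = "com.sun.management.OperatingSystemMXBean";
    private static final String UNIX_BEAN = "com.sun.management.UnixOperatingSystemMXBean";

    private OsStats() {}

    /** Returns the value of a no-arg numeric getter, or null if unavailable. */
    static Long readLong(Object bean, String interfaceName, String methodName) {
        try {
            Class<?> iface = Class.forName(interfaceName);
            if (!iface.isInstance(bean)) {
                return null; // this platform's bean does not implement the interface
            }
            Method getter = iface.getMethod(methodName);
            Object value = getter.invoke(bean);
            return (value instanceof Number) ? ((Number) value).longValue() : null;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return null; // API missing or inaccessible: report nothing
        }
    }

    static Map<String, Long> collect() {
        Object os = ManagementFactory.getOperatingSystemMXBean();
        Map<String, Long> stats = new LinkedHashMap<>();
        put(stats, "total_physical_memory_bytes", readLong(os, OS_BEAN, "getTotalPhysicalMemorySize"));
        put(stats, "free_swap_bytes", readLong(os, OS_BEAN, "getFreeSwapSpaceSize"));
        put(stats, "open_file_descriptors", readLong(os, UNIX_BEAN, "getOpenFileDescriptorCount"));
        put(stats, "max_file_descriptors", readLong(os, UNIX_BEAN, "getMaxFileDescriptorCount"));
        return stats;
    }

    private static void put(Map<String, Long> stats, String key, Long value) {
        if (value != null) {
            stats.put(key, value); // only record what the platform actually provides
        }
    }

    public static void main(String[] args) {
        collect().forEach((key, value) -> System.out.println(key + " = " + value));
    }
}
```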


>
> 6. cluster snapshot/restore to amazon s3 does not work, because of
> their use of internal ssl libraries. I've tried to get them to fix it
> for a while now (https://github.com/aws/aws-sdk-java/pull/432). This
> is also a serious loss of functionality, if they wont fix it, I guess
> we have to fork the aws sdk.
You can work around this with
-XaddExport:java.base/sun.security.ssl=ALL-UNNAMED of course, but it
would be much better if they could understand and remove the dependency
on these internal classes.


>
> 8. during testing I hit some kind of bug, where the thai break
> iterator returned wrong information. This might be hotspot-related or
> something else, and it never reproduced again. We use this check
> (https://github.com/apache/lucene-solr/blob/trunk/lucene/analysis/common/src/java/org/apache/lucene/analysis/th/ThaiTokenizer.java#L37-L47)
> to see if we can "really" tokenize thai, otherwise we throw an
> exception. For some IBM JVM versions at least in the past, they did
> not have a breakiterator for thai. I guess it just goes to show the EA
> build is really a prototype, and not yet ready to be added to our CI
> servers and so on... which is the only way I can ensure this huge
> codebase stays working with jigsaw.
Just so I understand, the Thai break iterator issue was with the jigsaw
EA builds and not the regular JDK 9 builds, right? And it only happened
once and you can't reproduce it. This is a bit worrisome. All I can say
is that there are a lot of changes in this area; a lot of technical debt
related to the split between the java.base and jdk.localedata modules
had to be addressed. Offhand, I can't think of anything that would
lead to an intermittent issue. If you find out more on this then please
send mail.
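
For anyone trying to reproduce this, a small probe in the spirit of the
check linked in the quoted text might look like the sketch below. It is
not the Lucene code. The class name is hypothetical, and the sample word
and the boundary offset of 4 are assumptions about what a dictionary-based
Thai break iterator should report.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical probe: checks whether the JDK's Thai word break iterator
// finds the expected word boundary inside a short Thai phrase.
final class ThaiBreakIteratorCheck {
    // Two Thai words written with Unicode escapes (4 + 3 characters), so the
    // source file does not depend on the compiler's default encoding.
    private static final String SAMPLE = "\u0E20\u0E32\u0E29\u0E32\u0E44\u0E17\u0E22";

    private ThaiBreakIteratorCheck() {}

    static boolean thaiWordBreaksAvailable() {
        BreakIterator words = BreakIterator.getWordInstance(Locale.forLanguageTag("th"));
        words.setText(SAMPLE);
        // Assumed: a working dictionary-based iterator reports a boundary
        // between the two words, at offset 4.
        return words.isBoundary(4);
    }

    public static void main(String[] args) {
        // Repeat the probe to look for intermittent differences.
        for (int i = 0; i < 5; i++) {
            System.out.println("run " + i + ": " + thaiWordBreaksAvailable());
        }
    }
}
```

Running the probe in a loop, or across fresh JVM launches, would show
whether the answer ever changes on a given build.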

-Alan

