Jigsaw EA feedback for elasticsearch
Alan Bateman
Alan.Bateman at oracle.com
Fri Sep 11 10:09:59 UTC 2015
Thanks for the great write-up! A few comments below.
On 11/09/2015 06:07, Robert Muir wrote:
> 2. we have a "jar hell detector" that threw an
> UnsupportedOperationException, because classloader is no longer a
> URLClassLoader, so we can't get the list of urls. This caused all
> tests to fail. I changed the code to parse java.class.path.
Right, code should never assume that the application class loader is
implemented as a URLClassLoader (more on this in JEP 261).
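
For illustration only, reading the entries from java.class.path rather
than casting the class loader might look roughly like the sketch below.
The class and method names are hypothetical (this is not the
elasticsearch code), and the sketch skips empty path elements, which a
real detector may want to treat as the current directory.

```java
import java.io.File;
import java.nio.file.Path;
import java.nio.file.Paths;
import java.util.ArrayList;
import java.util.List;
import java.util.regex.Pattern;

// Hypothetical helper, named for this example only.
final class ClassPathEntries {
    private ClassPathEntries() {}

    /** Splits a class path string on the given separator, skipping empty elements. */
    static List<Path> parse(String classPath, String separator) {
        List<Path> entries = new ArrayList<>();
        if (classPath == null || classPath.isEmpty()) {
            return entries;
        }
        // Quote the separator so that it is never interpreted as a regex.
        for (String element : classPath.split(Pattern.quote(separator))) {
            if (!element.isEmpty()) {
                entries.add(Paths.get(element));
            }
        }
        return entries;
    }

    /** Reads the entries from the java.class.path property of the running JVM. */
    static List<Path> fromSystemProperty() {
        return parse(System.getProperty("java.class.path"), File.pathSeparator);
    }
}
```

This makes no assumption about the type of the application class
loader, so it behaves the same whether or not it is a URLClassLoader.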
> 3. we have a "jvm info" api that provides information about the jvm,
> e.g. to assist our engineers in debugging different nodes in the
> cluster. it was not prepared to handle UnsupportedOperationException
> from RuntimeMXBean.getBootClassPath: I fixed it to fall back to
> sun.boot.class.path, otherwise fall back to "unknown".
This is another behavior change. RuntimeMXBean.getBootClassPath() has
always specified that it can throw UOE but the JDK has not needed to do
this until now. The alternative choice here is to return an empty string
but that might cause issues too.
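
A minimal sketch of the defensive pattern described in the quoted text,
assuming the same fallback order (the MXBean first, then the
sun.boot.class.path property, then "unknown"). The class name is made
up for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.RuntimeMXBean;

// Hypothetical helper, named for this example only.
final class BootClassPathInfo {
    private BootClassPathInfo() {}

    /** Returns the boot class path if the JVM reports one, otherwise a fallback value. */
    static String describe() {
        RuntimeMXBean runtime = ManagementFactory.getRuntimeMXBean();
        // isBootClassPathSupported() lets callers check before calling the getter.
        if (runtime.isBootClassPathSupported()) {
            try {
                return runtime.getBootClassPath();
            } catch (UnsupportedOperationException e) {
                // Specified as possible; fall through to the property lookup.
            }
        }
        String property = System.getProperty("sun.boot.class.path");
        return property != null ? property : "unknown";
    }
}
```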
> 4. exception serialization tests failed, because we manually serialize
> exceptions. We previously used java serialization, but it causes
> serious trouble because of backwards compatibility breaks between even
> minor jdk versions: this would strike when users try to upgrade their
> jvms for nodes in their cluster with a rolling restart. The tests fail
> because the stacktrace "loses" stuff after deserialization (the module
> version). For now i just disabled the tests on java 9, because I don't
> know how we can support e.g. java 8 and java 9 and populate this stuff
> "optionally" yet without more digging.
Stack traces have been updated to optionally include the module and
version but this should be a compatible change (except maybe for code
that parses the String representation). As you mention, these tests
would need to be updated any time that there are new fields added to the
serial form (for standard/Java SE types then this should only be major
releases).
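
One way code that has to run on both Java 8 and Java 9 could read the
module information "optionally" is via reflection, as in the sketch
below. It assumes the accessors are named getModuleName() and
getModuleVersion() on StackTraceElement, which are the names in the
final Java 9 API and may differ in an EA build. The helper class is
hypothetical.

```java
import java.lang.reflect.Method;

// Hypothetical helper, named for this example only.
final class StackFrameModules {
    // Assumed accessor names; null when the running JDK does not have them.
    private static final Method GET_MODULE_NAME = lookup("getModuleName");
    private static final Method GET_MODULE_VERSION = lookup("getModuleVersion");

    private StackFrameModules() {}

    private static Method lookup(String name) {
        try {
            return StackTraceElement.class.getMethod(name);
        } catch (NoSuchMethodException e) {
            return null; // JDK without module-aware stack frames
        }
    }

    /** Module name of the frame, or null if absent or unsupported. */
    static String moduleName(StackTraceElement frame) {
        return invoke(GET_MODULE_NAME, frame);
    }

    /** Module version of the frame, or null if absent or unsupported. */
    static String moduleVersion(StackTraceElement frame) {
        return invoke(GET_MODULE_VERSION, frame);
    }

    private static String invoke(Method accessor, StackTraceElement frame) {
        if (accessor == null) {
            return null;
        }
        try {
            return (String) accessor.invoke(frame);
        } catch (ReflectiveOperationException e) {
            return null;
        }
    }
}
```

A custom serializer could then write these two values as optional
fields and tolerate their absence when reading.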
> 5. we have monitoring apis that provide basic system information,
> similar to #3, for debugging purposes, and to feed monitoring tools so
> people can track the health of the cluster. previously, we used the
> sigar library (JNI) for this, but it has bugs that caused users
> crashes. So we were forced to limit ourselves to what is provided with
> java management apis: which is much less, but we figure it has the
> basics. For some very basic stats, this means we also look for
> com.sun.management apis
> (https://docs.oracle.com/javase/7/docs/jre/api/management/extension/com/sun/management/package-summary.html)
> and if they are available, we provide the stuff available there too,
> like how much ram is on the machine, swap in use, number of open/max
> file descriptors, and so on. We test what is available and what is not
> based on platform so we can detect if something changes in the JDK,
> like what happens with jigsaw, where they all become unavailable.
I'm not sure that I understand the issue here but just to say that the
com.sun.management API is a documented/supported API and it is exported by
module jdk.management:
$ java -listmods:jdk.management
jdk.management@9.0
  requires public java.management
  requires mandated java.base
  exports com.sun.management
  conceals com.sun.management.internal
  provides sun.management.spi.PlatformMBeanProvider with com.sun.management.internal.PlatformMBeanProviderImpl
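
As a rough sketch of probing the exported interface without a
compile-time dependency on it: the code below looks up
com.sun.management.OperatingSystemMXBean by name and invokes a getter
through the interface rather than through the concealed implementation
class. The helper class and the -1 "unavailable" convention are
assumptions made for the example.

```java
import java.lang.management.ManagementFactory;
import java.lang.management.OperatingSystemMXBean;
import java.lang.reflect.Method;

// Hypothetical helper, named for this example only.
final class ExtendedOsStats {
    private ExtendedOsStats() {}

    /**
     * Calls a no-arg getter declared on com.sun.management.OperatingSystemMXBean
     * and returns its long value, or -1 if the interface or getter is unavailable.
     */
    static long probe(String getterName) {
        OperatingSystemMXBean os = ManagementFactory.getOperatingSystemMXBean();
        try {
            Class<?> extended = Class.forName("com.sun.management.OperatingSystemMXBean");
            if (!extended.isInstance(os)) {
                return -1;
            }
            // Resolve the method on the exported interface, not on os.getClass().
            Method getter = extended.getMethod(getterName);
            Object value = getter.invoke(os);
            return value instanceof Long ? (Long) value : -1;
        } catch (ReflectiveOperationException | RuntimeException e) {
            return -1;
        }
    }
}
```

For example, probe("getTotalPhysicalMemorySize") or
probe("getFreeSwapSpaceSize") returns the value where the platform
provides it and -1 otherwise.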
>
> 6. cluster snapshot/restore to amazon s3 does not work, because of
> their use of internal ssl libraries. I've tried to get them to fix it
> for a while now (https://github.com/aws/aws-sdk-java/pull/432). This
> is also a serious loss of functionality, if they wont fix it, I guess
> we have to fork the aws sdk.
You can work around this with
-XaddExport:java.base/sun.security.ssl=ALL-UNNAMED of course but much
better if they could understand and remove the dependency on these
internal classes.
>
> 8. during testing I hit some kind of bug, where the thai break
> iterator returned wrong information. This might be hotspot-related or
> something else, and it never reproduced again. We use this check
> (https://github.com/apache/lucene-solr/blob/trunk/lucene/analysis/common/src/java/org/apache/lucene/analysis/th/ThaiTokenizer.java#L37-L47)
> to see if we can "really" tokenize thai, otherwise we throw an
> exception. For some IBM JVM versions at least in the past, they did
> not have a breakiterator for thai. I guess it just goes to show the EA
> build is really a prototype, and not yet ready to be added to our CI
> servers and so on... which is the only way I can ensure this huge
> codebase stays working with jigsaw.
Just so I understand, the Thai break iterator issue was with the jigsaw
EA builds and not the regular JDK 9 builds, right? And it only happened
once, you can't reproduce. This is a bit worrisome. All I can say is
that there are a lot of changes in this area, a lot of technical debt
related to the split with the java.base and the jdk.localedata module
had to be addressed. Off-hand then I can't think of anything that would
lead to an intermittent issue. If you find out more on this then please
send mail.
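
For anyone trying to reproduce this outside of Lucene, a standalone
check along the lines of the one linked above might look like the
sketch below. It assumes the same probe as the quoted ThaiTokenizer
code: a word boundary at offset 4 of a short Thai string. The class
name is made up, and the result depends on the locale data available to
the runtime.

```java
import java.text.BreakIterator;
import java.util.Locale;

// Hypothetical helper, named for this example only.
final class ThaiBreakCheck {
    // Thai text written with Unicode escapes to avoid source-encoding issues;
    // it is two words of four and three characters.
    private static final String SAMPLE = "\u0E20\u0E32\u0E29\u0E32\u0E44\u0E17\u0E22";

    private ThaiBreakCheck() {}

    /** True if the Thai word iterator reports a boundary between the two words. */
    static boolean dictionaryBasedBreaksAvailable() {
        BreakIterator words = BreakIterator.getWordInstance(Locale.forLanguageTag("th"));
        words.setText(SAMPLE);
        return words.isBoundary(4);
    }
}
```

Calling this in a loop on the affected build could help show whether
the answer ever changes within a single JVM run.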
-Alan