JDK 9 Build 111 seems to miss some locale data, Lucene tests fail with Farsi and Thai language
Alan Bateman
Alan.Bateman at oracle.com
Sat Mar 26 14:10:18 UTC 2016
On 26/03/2016 11:56, Uwe Schindler wrote:
> Hi,
>
> after also testing the separate "Jigsaw" build on jdk9.java.net I see the same problems. So both builds 111 are wrong.
>
> To me it looks like the Unicode data files are missing some information - which could again be a packaging bug. As said before, build 110 does not have this problem, so it seems to be a side-effect of Jigsaw merging.
>
> The following stuff does not work:
>
> (1) The Thai locale does not have a working dictionary-based BreakIterator available. The following "check" in Lucene fails because it cannot detect a boundary correctly:
>
> /**
>  * True if the JRE supports a working dictionary-based breakiterator for Thai.
>  * If this is false, this tokenizer will not work at all!
>  */
> public static final boolean DBBI_AVAILABLE;
> private static final BreakIterator proto = BreakIterator.getWordInstance(new Locale("th"));
> static {
>   // check that we have a working dictionary-based break iterator for thai
>   proto.setText("ภาษาไทย");
>   DBBI_AVAILABLE = proto.isBoundary(4);
> }
>
> After this static initializer, DBBI_AVAILABLE is false. This causes some tests to be ignored, but 2 fail because of this (which might be an oversight on our side). Nevertheless, this is a bug in build 111.
I just tried to duplicate this on OSX and Linux without success. The log
you linked to suggests this is Linux, is that right? Is this the JDK
bundle? I haven't checked the JRE bundle but would be surprised if anything
is missing. The JDK has several tests for Thai, so if it were completely
broken I would have expected it to have been seen. I have no doubt
that it is not working in your environment; we just need to figure out
what is different.
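
To compare environments, a small standalone program along these lines might help. It is only a sketch, not part of Lucene or the JDK tests: the class name ThaiBreakCheck and the helper boundaries() are made up for illustration. It prints the runtime in use and every word boundary the Thai BreakIterator reports for the same string as the Lucene check, so the output from a working and a failing installation can be placed side by side.

```java
import java.text.BreakIterator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical reproducer; the names are illustrative only.
class ThaiBreakCheck {

    // Collects every boundary offset the word BreakIterator reports for the text.
    static List<Integer> boundaries(String text, Locale locale) {
        BreakIterator it = BreakIterator.getWordInstance(locale);
        it.setText(text);
        List<Integer> result = new ArrayList<>();
        for (int b = it.first(); b != BreakIterator.DONE; b = it.next()) {
            result.add(b);
        }
        return result;
    }

    public static void main(String[] args) {
        String text = "ภาษาไทย"; // same sample string as the Lucene check
        Locale thai = Locale.forLanguageTag("th");

        // Identify which runtime image is actually being used.
        System.out.println("java.version = " + System.getProperty("java.version"));
        System.out.println("java.home    = " + System.getProperty("java.home"));

        // Is "th" among the locales BreakIterator claims to support?
        boolean listed = Arrays.asList(BreakIterator.getAvailableLocales()).contains(thai);
        System.out.println("th listed    = " + listed);

        // The Lucene check expects offset 4 to be reported as a boundary.
        System.out.println("boundaries   = " + boundaries(text, thai));
    }
}
```

If offset 4 is missing from the printed list, the iterator is most likely falling back to rule-based behaviour without the Thai dictionary, which is the condition the Lucene check is designed to detect.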
>
> (2) The collator for Arabic (Farsi) language fails to work correctly. This also looks like missing data.
>
> Collator collator = Collator.getInstance(new Locale("ar"));
>
Are there any exceptions or anything here? Or maybe it tests the
collator with compare?
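
For reference, this is the kind of check that would answer the question. It is a hedged sketch only: the class name ArabicCollatorCheck, the helper sortWith() and the sample strings are assumptions for illustration and are not taken from the Lucene tests. It reports which Collator implementation is returned for "ar" and how it orders a few single-letter strings, which should make an exception or a silent fallback easy to tell apart.

```java
import java.text.Collator;
import java.text.RuleBasedCollator;
import java.util.ArrayList;
import java.util.Arrays;
import java.util.List;
import java.util.Locale;

// Hypothetical reproducer; the names and sample data are illustrative only.
class ArabicCollatorCheck {

    // Returns a copy of the words sorted with the collator for the given locale.
    static List<String> sortWith(Locale locale, List<String> words) {
        Collator collator = Collator.getInstance(locale);
        List<String> sorted = new ArrayList<>(words);
        sorted.sort(collator); // Collator is a Comparator<Object>
        return sorted;
    }

    public static void main(String[] args) {
        Locale arabic = Locale.forLanguageTag("ar");
        Collator collator = Collator.getInstance(arabic);

        // Which implementation did we get, and does it carry any rules?
        System.out.println("collator class = " + collator.getClass().getName());
        if (collator instanceof RuleBasedCollator) {
            int len = ((RuleBasedCollator) collator).getRules().length();
            System.out.println("rules length   = " + len);
        }

        // Is "ar" among the locales Collator claims to support?
        boolean listed = Arrays.asList(Collator.getAvailableLocales()).contains(arabic);
        System.out.println("ar listed      = " + listed);

        // Sort a few Arabic letters so the order can be compared across builds.
        List<String> words = Arrays.asList("ت", "ا", "ب");
        System.out.println("sorted         = " + sortWith(arabic, words));
    }
}
```

Running this on build 110 and build 111 and comparing the output should show whether the locale-specific collation data is being picked up at all.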
-Alan