<i18n dev> RFR JDK-7071819: To support Extended Grapheme Clusters in Regex

Peter Levart peter.levart at gmail.com
Wed Feb 3 11:07:20 UTC 2016

Hi Sherman,

I don't currently have any idea how to squeeze the hashtable any more 
further. It is already very compact in my opinion. But I noticed a data 
race that could result in navigating the half-initialized data 
structure. It is a classical unsafe publication bug. It has been present 
before in get(int cp) and it is present now in both getName(int cp) and 
getCodePoint(String name). For example:

  151     static int getCodePoint(String name) {
  152         byte[] strPool = null;
  153         if (refStrPool == null || (strPool = refStrPool.get()) == 
null) {
  154             strPool = initNamePool();
  155         }


111             refStrPool = new SoftReference<>(strPool);

...the static refStrPool field is not marked volatile.

One way to fix this is to mark field volatile and then rearrange code in 
getName/getCodePoint to only read from it once by introducing a local 
var. The other would be to change the line 111 into something like:

SoftReference<byte[]> rsp = new SoftReference<>(strPool);
refStrPool = rsp;

...*and* also rearrange code in getName/getCodePoint to only read from 
field once by introducing a local var.

Regards, Peter

On 02/02/2016 10:25 PM, Xueming Shen wrote:
> Hi,
> Have not heard any feedback on this one so far. I'm adding
> a little more to make it attractive for reviewers :-)
> On top of the \N now the webrev includes the proposal to add
> two more matchers, \X for unicode extended grapheme cluster
> and \b{g} for the corresponding boundary.
> Issue: https://bugs.openjdk.java.net/browse/JDK-7071819
> Issue: https://bugs.openjdk.java.net/browse/JDK-8147531
> webrev: http://cr.openjdk.java.net/~sherman/8147531_7071819/webrev/
> Thanks!
> Sherman
> On 01/18/2016 11:52 PM, Xueming Shen wrote:
>> Hi,
>> Please help review the change to add \N support in regex.
>> Issue: https://bugs.openjdk.java.net/browse/JDK-8147531
>> webrev: http://cr.openjdk.java.net/~sherman/8147531/webrev
>> This is one of the items we were planning to address via JEP111
>> http://openjdk.java.net/jeps/111
>> https://bugs.openjdk.java.net/browse/JDK-8046101
>> Some of the constructs had been added already in early release. I'm
>> planning to address the rest as individual rfe separately.
>> Thanks,
>> Sherman

More information about the i18n-dev mailing list