Java regex vs. Unicode TR#18 vs. ICU

Steven R. Loomis steven.loomis at oracle.com
Wed Mar 6 22:01:19 UTC 2013


On 3/6/13 1:06 PM, Xueming Shen wrote:
> On 03/06/2013 12:44 PM, Steven R. Loomis wrote:
>> Hello,
>>  Someone on the ICU team recently compared the use of "\w" between 
>> ICU, Java, and Unicode TR#18
>> <http://www.unicode.org/reports/tr18/#Compatibility_Properties> .
>> The results are in the following ICU bug 
>> <http://bugs.icu-project.org/trac/ticket/10006>.
>>
>> A question for core-libs-dev is, does Java plan to change the 
>> semantics of \w to match TR#18's list?
>
> It appears the "standard" added one more entry, \p{Join_Control},
> during its last update :-( I may consider updating the spec/impl to
> match that. I would assume no jdk7 application really depends
> on the updated \w (in jdk7).
>
> -Sherman
>
>
Thanks, Sherman

Do you want to open a bug to track this? You can reference the above URLs.
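
For anyone who wants to see what a particular JDK does with \w, here is a
minimal sketch. The class name WordClassCheck and the sample code points are
only illustrative. It compares the default \w with \w under
Pattern.UNICODE_CHARACTER_CLASS for the two Join_Control characters
(U+200C and U+200D). The Unicode-mode result for those two depends on the
JDK version being run, so no expected output is given.

```java
import java.util.regex.Pattern;

public class WordClassCheck {
    // Default \w in java.util.regex is ASCII-only: [a-zA-Z_0-9].
    static final Pattern ASCII_WORD = Pattern.compile("\\w");

    // With UNICODE_CHARACTER_CLASS, \w uses the JDK's Unicode definition.
    // Whether that definition includes Join_Control depends on the JDK.
    static final Pattern UNICODE_WORD =
            Pattern.compile("\\w", Pattern.UNICODE_CHARACTER_CLASS);

    // True if the single code point, taken as a whole string, matches p.
    static boolean matches(Pattern p, int codePoint) {
        String s = new String(Character.toChars(codePoint));
        return p.matcher(s).matches();
    }

    public static void main(String[] args) {
        // 'a', U+00E9 (e with acute), then the two Join_Control characters.
        int[] samples = { 'a', 0x00E9, 0x200C, 0x200D };
        for (int cp : samples) {
            System.out.printf("U+%04X ascii=%b unicode=%b%n",
                    cp, matches(ASCII_WORD, cp), matches(UNICODE_WORD, cp));
        }
    }
}
```

If the last two lines print unicode=false, that JDK's \w does not cover
Join_Control and so differs from the current TR#18 list.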

Steven



