Unicode script support in Regex and Character class
Xueming Shen
xueming.shen at oracle.com
Thu Apr 22 22:38:49 UTC 2010
Ulf Zibis wrote:
>> (3) the syntax for script constructs. In addition to the "normal"
>> \p{InScriptName} and \P{InScriptName} for the script support
>> I'm also adding
>> \p{script=ScriptName} \P{script=ScriptName} for the new script
>> support
>> \p{block=BlockName} \P{block=BlockName} for the "existing" block
>> support
>> \p{general_category=CategoryName}
>> \P{general_category=CategoryName} for the "existing" gc
>> Perl recently also started to accept this \p{propName=propValue}
>> Unicode style.
>> It opens the door for future "expanding", for example \p{name=XYZ}
>> :-)
> (2) the piggyback method j.l.c.getName() :-)
>
> I'm missing \p{InScriptName} in Pattern javadoc.
>
I meant to say
\p{IsScriptName} and \P{IsScriptName}, not \p{InScriptName}.
So the "recommended" usage would be
Script:
    \p{IsScriptName} \P{IsScriptName} or \p{script=ScriptName} \P{script=ScriptName}
Block:
    \p{InBlockName} \P{InBlockName} or \p{block=BlockName} \P{block=BlockName}
Category:
    \p{CategoryName} \P{CategoryName} or \p{general_category=CategoryName} \P{general_category=CategoryName}
For compatibility reasons, we also accept \p{IsCategoryName} and \P{IsCategoryName}.
So far there appears to be no conflict between the category names and the
script names.
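
As a rough illustration of the three forms above, here is a minimal sketch. It assumes a JDK in which this syntax has shipped (Java 7 or later). The class name ScriptSyntaxDemo and the helper matches() are made up for the example and are not part of the proposal.

```java
import java.util.regex.Pattern;

public class ScriptSyntaxDemo {
    // Hypothetical helper: true when the whole input matches the regex.
    static boolean matches(String regex, String input) {
        return Pattern.matches(regex, input);
    }

    public static void main(String[] args) {
        // GREEK SMALL LETTER ALPHA: script Greek, block Greek, category Ll
        String alpha = "\u03B1";

        // Script: "Is" prefix or script= keyword
        System.out.println(matches("\\p{IsGreek}", alpha));
        System.out.println(matches("\\p{script=Greek}", alpha));

        // Block: "In" prefix or block= keyword
        System.out.println(matches("\\p{InGreek}", alpha));
        System.out.println(matches("\\p{block=Greek}", alpha));

        // Category: bare name, general_category= keyword,
        // or the "Is" prefix kept for compatibility
        System.out.println(matches("\\p{Ll}", alpha));
        System.out.println(matches("\\p{general_category=Ll}", alpha));
        System.out.println(matches("\\p{IsLl}", alpha));

        // Negated form: a Latin letter is not in the Greek script
        System.out.println(matches("\\P{script=Greek}", "a"));
    }
}
```

Each line should print true if the syntax behaves as described in this message.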
My apologies for the inconvenience.
Sherman
More information about the core-libs-dev
mailing list