<i18n dev> RL1.1 Hex Notation
Xueming Shen
xueming.shen at oracle.com
Thu Jan 27 12:50:40 PST 2011
Mark,
The Character.highSurrogate(int)/lowSurrogate(int) pair has already been added in JDK 7.
http://download.java.net/jdk7/docs/api/java/lang/Character.html#highSurrogate(int)
http://download.java.net/jdk7/docs/api/java/lang/Character.html#lowSurrogate(int)
I submitted CR#7015408 for the third (formatting) one.
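For illustration, here is a minimal sketch of how the hex-pattern workaround quoted below might look with the two new methods. The class name HexPattern and the method name hexPattern are hypothetical, chosen only for this example; Character.isBmpCodePoint(int) is also a JDK 7 addition.

```java
class HexPattern {
    // Hypothetical helper: formats one code point as a UTF-16 escape,
    // using a surrogate pair for supplementary code points.
    static String hexPattern(int codePoint) {
        if (Character.isBmpCodePoint(codePoint)) {
            return String.format("\\u%04x", codePoint);
        }
        // Supplementary code point: emit the high and low surrogates.
        return String.format("\\u%04x\\u%04x",
                (int) Character.highSurrogate(codePoint),
                (int) Character.lowSurrogate(codePoint));
    }

    public static void main(String[] args) {
        System.out.println(hexPattern(0x41));     // BMP code point
        System.out.println(hexPattern(0x10400));  // supplementary code point
    }
}
```

This avoids calling Character.toChars(codePoint) twice and reads closer to the ICU4J lead/trail surrogate calls.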
-Sherman
On 01/26/2011 01:36 PM, Mark Davis ☕ wrote:
> Ok, now I understand. With that change, the situation is much better.
> It doesn't fully satisfy RL1.1, because you can't use hex codepoint
> numbers -- you have to use the fairly ugly workaround of
>
> String hexPattern = codePoint <= 0xFFFF
>     ? String.format("\\u%04x", codePoint)
>     : String.format("\\u%04x\\u%04x",
>         (int) Character.toChars(codePoint)[0],
>         (int) Character.toChars(codePoint)[1]);
>
> BTW, in plain Java I really miss a few of the ICU4J routines, like:
>
> * char c1 = UTF16.getLeadSurrogate(codePoint);
> * char c2 = UTF16.getTrailSurrogate(codePoint);
> * String s = UTF16.valueOf(codePoint);
>
> You can do them in plain Java, as in the above expression, but they're
> awkward and not as clear to read. And instead of the third one, the
> best I see in plain Java is the following, which is really pretty ugly
> (is there any better way?).
>
>
> String s = new StringBuilder().appendCodePoint(codePoint).toString();
>
>
> Mark
>
>
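On the "is there any better way?" question above, two plain-Java alternatives to the StringBuilder idiom may be worth noting. This is only a sketch; the class and method names (CodePointStrings, viaToChars, viaIntArray) are hypothetical, while Character.toChars(int) and the String(int[], int, int) constructor are existing Java 5 APIs.

```java
class CodePointStrings {
    // Converts the code point to its UTF-16 char array, then wraps it.
    static String viaToChars(int codePoint) {
        return new String(Character.toChars(codePoint));
    }

    // Uses the String constructor that takes an array of code points.
    static String viaIntArray(int codePoint) {
        return new String(new int[] { codePoint }, 0, 1);
    }

    public static void main(String[] args) {
        String s = viaToChars(0x10400);
        // A supplementary code point occupies two chars in the String.
        System.out.println(s.length());
        System.out.println(s.equals(viaIntArray(0x10400)));
    }
}
```

Neither is as readable as UTF16.valueOf(codePoint), but both avoid the intermediate StringBuilder.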