review request for 6798511/6860431: Include functionality of Surrogate in Character
Ulf Zibis
Ulf.Zibis at gmx.de
Thu Aug 27 16:03:19 UTC 2009
Sherman,
thanks for your detailed review.
Am 26.08.2009 20:02, Xueming Shen schrieb:
> Ulf,
>
> "get rid of sun.nio.cs.Surrogate" itself is not a sufficient enough
> reason to add bunch of new supplementary
> support related APIs into Character and CharBuffer classes. You
> probably need to demonstrate more use
> scenarios to show/prove why these new APIs are needed, why they can
> not be easily achieved by using existing
> APIs and especially you should ask yourself first if these APIs
> will/should be used/needed by "general public"
> or they are just "specifically" needed by your current
> application/project.
Well yes, most of the use scenarios I see in charset processing. ;-) I
guess one can find more use scenarios in existing JDK e.g. visualization
and keyboard input of surrogate chars / supplementary code points in
Swing package. On the other hand I have total code base and footprint
size in mind.
Alternatively subclassing sun.nio.cs.Surrogate from Character would be
straight forward. (Unfortunately Character class is final. Is that
mandatory?)
... and this would not cover my enhancements in CharBuffer.
>
> For example, the isBMP(int), it might be convenient, but it can be
> easily archived by the one line code
>
> (int)(char)codePoint == codePoint;
We still have isSupplementaryCodePoint(int), so IMHO additionally having
isBMPCodePoint(int) would be a good pair.
On the other hand, there are still many public methods in the API witch
can be "easily archived by one line code", especially in class
Character, even plenty of duplicate (or XYZ+1 valued) public constants.
e.g:
public static UnicodeBlock of(char c) {
return of((int)c);
}
>
> or more readable form
>
> codePoint < Character.MIN_SUPPLEMENTARY_COE_POINT;
correction: ;-)
codePoint >= MIN_CODE_POINT) && codePoint <
MIN_SUPPLEMENTARY_CODE_POINT
>
> I'm not saying we can NOT add isBMP() (I know icu4j's UCharacter class
> does have one), just
> believe it's arguably not necessary.
>
> Same as the pair
>
> -- static char highSurrogate(int codePoint);
> -- static char lowSurrogate(int codePoint);
> -- CharBuffer.putCodePoint(int)
>
> In "normal use case", the Character.toChars(int codePoint) is good
> enough cover them all.
Hm, yes, but what's about performance ?
- instantiating a new char[]
- boxing the char(s) into
- invoking *bulk* method for just 1 (potentially, but rare 2) char(s)
(resolves to put(char[], int, int) including checkBounds(int, int, int) )
And yes, we can make CharBuffer#putCodePoint(int) using /Charset
helpers/ sun.nio.cs.Surrogate#high(int) + sun.nio.cs.Surrogate#low(int)
if there is no correspondent in Character class. But is that good design ?
-Ulf
More information about the core-libs-dev
mailing list