review request for 6798511/6860431: Include functionality of Surrogate in Character

Ulf Zibis Ulf.Zibis at gmx.de
Thu Aug 27 16:03:19 UTC 2009


Sherman,

thanks for your detailed review.

Am 26.08.2009 20:02, Xueming Shen schrieb:
> Ulf,
>
> "get rid of sun.nio.cs.Surrogate" itself is not a sufficient enough 
> reason to add bunch of new supplementary
> support related APIs into Character and CharBuffer classes.  You 
> probably need to demonstrate more use
> scenarios to show/prove why these new APIs are needed, why they can 
> not be easily achieved by using existing
> APIs and especially you should ask yourself first if these APIs 
> will/should be used/needed by "general public"
> or they are just "specifically" needed by your current 
> application/project.

Well yes, most of the use scenarios I see in charset processing.  ;-) I 
guess one can find more use scenarios in existing JDK e.g. visualization 
and keyboard input of  surrogate chars / supplementary code points in 
Swing package. On the other hand I have total code base and footprint 
size in mind.

Alternatively subclassing sun.nio.cs.Surrogate from Character would be 
straight forward. (Unfortunately Character class is final. Is that 
mandatory?)
... and this would not cover my enhancements in CharBuffer.

>
> For example, the isBMP(int), it might be convenient, but it can be 
> easily archived by the one line code
>
> (int)(char)codePoint == codePoint;

We still have isSupplementaryCodePoint(int), so IMHO additionally having 
isBMPCodePoint(int) would be a good pair.

On the other hand, there are still many public methods in the API witch 
can be "easily archived by one line code", especially in class 
Character, even plenty of duplicate (or XYZ+1 valued) public constants.

e.g:
        public static UnicodeBlock of(char c) {
            return of((int)c);
        }


>
> or more readable form
>
>    codePoint < Character.MIN_SUPPLEMENTARY_COE_POINT;

correction: ;-)
     codePoint >= MIN_CODE_POINT) && codePoint < 
MIN_SUPPLEMENTARY_CODE_POINT

>
> I'm not saying we can NOT add isBMP() (I know icu4j's UCharacter class 
> does have one), just
> believe it's arguably not necessary.
>
> Same as the pair
>
> -- static char highSurrogate(int codePoint);
> -- static char lowSurrogate(int codePoint);
> -- CharBuffer.putCodePoint(int)
>
> In "normal use case", the Character.toChars(int codePoint) is good 
> enough cover them all.

Hm, yes, but what's about performance ?
- instantiating a new char[]
- boxing the char(s) into
- invoking *bulk* method for just 1 (potentially, but rare 2) char(s)
   (resolves to put(char[], int, int) including checkBounds(int, int, int) )

And yes, we can make CharBuffer#putCodePoint(int) using /Charset 
helpers/ sun.nio.cs.Surrogate#high(int) + sun.nio.cs.Surrogate#low(int) 
if there is no correspondent in Character class. But is that good design ?


-Ulf





More information about the core-libs-dev mailing list