review request for 6798511/6860431: Include functionality of Surrogate in Character
Ulf Zibis
Ulf.Zibis at gmx.de
Mon Mar 22 13:57:46 UTC 2010
On 21.03.2010 17:23, Martin Buchholz wrote:
> On Sun, Mar 21, 2010 at 04:28, Ulf Zibis<Ulf.Zibis at gmx.de> wrote:
>
>> On Sat, Mar 20, 2010 at 17:13, Ulf Zibis<Ulf.Zibis at gmx.de> wrote:
>>
>
>> I don't think it's a performance problem in the real world.
>>
>>
>> Hm, if someone uses:
>>     if (Character.isBMPCodePoint(codePoint))
>>         ...;
>>     else if (Character.isSupplementaryCodePoint(codePoint)) // instead of isValidCodePoint()
>>         ...;
>>     else
>>         ...;
>> he will lose up to 50 % performance, as you can see in my benchmark of
>> isSuppCPAlaMartin().
>>
> Only if their data is full of supplementary characters.
>
Yes, but we don't know anything about the purpose of the code written out
there in the world, so why not provide the best performance, or at least
give a hint in the docs, if it doesn't cost anything?
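
To make the point about the chain concrete, here is a minimal sketch of the
two variants. The class and method names are made up for illustration, and
isBmp() is only a stand-in for the proposed isBMPCodePoint(), not the code
under review. Once the BMP test has failed, the cheaper isValidCodePoint()
is enough to tell supplementary from invalid:

```java
final class CodePointChain {
    // Hypothetical stand-in for the proposed Character.isBMPCodePoint().
    static boolean isBmp(int codePoint) {
        return codePoint >>> 16 == 0;
    }

    // Variant from the quoted snippet: second test is isSupplementaryCodePoint().
    static String viaSupplementary(int codePoint) {
        if (isBmp(codePoint))
            return "BMP";
        else if (Character.isSupplementaryCodePoint(codePoint))
            return "supplementary";
        else
            return "invalid";
    }

    // Alternative: BMP is already excluded, so a validity check suffices.
    static String viaValid(int codePoint) {
        if (isBmp(codePoint))
            return "BMP";
        else if (Character.isValidCodePoint(codePoint))
            return "supplementary";
        else
            return "invalid";
    }
}
```

Both variants give the same answer for every int; they differ only in how
much work the second test has to do.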
>
>> We don't usually put such performance information in the javadoc.
>>
>>
>> In class StringBuilder:
>> "Where possible, it is recommended that this class be used in preference to
>> StringBuffer as it will be faster under most implementations."
>>
>> java.util.List:
>> Note that these operations may execute in time proportional to the index
>> value for some implementations (the LinkedList class, for example).
>>
>> ByteBuffer#get(byte[],int,int):
>> In other words, an invocation of this method of the form
>> src.get(dst, off, len) has exactly the same effect as the loop
>>
>>     for (int i = off; i < off + len; i++)
>>         dst[i] = src.get();
>>
>> except that it first checks that there are sufficient bytes in this buffer
>> and it is potentially much more efficient.
>>
> In the above, the performance is a Raison d'être of the API,
> that real users should consider when choosing API.
>
Oh, so we are speaking French. I like it very much.
>
>> Anyway, even if isSupplementaryCodePoint() is used in isolation, my code
>> will help the JIT to use 2-byte shifted addressing and a shorter 2-byte
>> immediate value for the compare, but yes, the JIT should be able to catch
>> that without this help. But in that case, we could stay with the old
>> implementations of isBMPCodePoint and isValidCodePoint too.
>>
> Again, performance with BMP characters is infinitely more important
> than performance with supplementary characters.
>
You are right. But I can't see any reason why the fast supplementary
version would harm BMP performance.
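
For illustration only, a sketch of what a shift-based supplementary test
could look like next to the plain range test. The class and method names
are invented here, the constants mirror the documented values of
Character.MIN_SUPPLEMENTARY_CODE_POINT and Character.MAX_CODE_POINT, and
this is not necessarily the exact code in the webrev. Whether the JIT
really emits better code for it is something only a benchmark can show:

```java
final class SupplementaryCheck {
    static final int MIN_SUPPLEMENTARY_CODE_POINT = 0x010000;
    static final int MAX_CODE_POINT = 0x10FFFF;

    // Plain range test: two compares against 3-byte constants.
    static boolean byRange(int codePoint) {
        return codePoint >= MIN_SUPPLEMENTARY_CODE_POINT
            && codePoint <= MAX_CODE_POINT;
    }

    // Shift-based test: reduce to the plane number first, then compare
    // against small constants. Planes 1..16 are supplementary.
    static boolean byPlane(int codePoint) {
        int plane = codePoint >>> 16;
        return plane != 0 && plane < ((MAX_CODE_POINT + 1) >>> 16);
    }

    // BMP test in the same style: plane 0 only. Negative values have a
    // nonzero upper half after the unsigned shift, so they are rejected.
    static boolean isBmp(int codePoint) {
        return codePoint >>> 16 == 0;
    }
}
```

The BMP test is a single shift and compare in either case, so the
supplementary variant chosen does not touch the BMP path at all.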
-Ulf