review request for 6798511/6860431: Include functionality of Surrogate in Character

Ulf Zibis Ulf.Zibis at gmx.de
Sun Mar 21 11:28:57 UTC 2010


>
> On Sat, Mar 20, 2010 at 17:13, Ulf Zibis<Ulf.Zibis at gmx.de>  wrote:
>    
>> On 20.03.2010 01:13, Martin Buchholz wrote:
>> Don't you think we should add a hint to javadoc to inform the user about the
>> implementation difference between isSupplementaryCodePoint and
>> isValidCodePoint?
>>      
> No.
>
>    
>> It's likely that the user would use isBMPCodePoint and isSupplementaryCodePoint
>> as a pair, not knowing about the performance problem.
>>      
> I don't think it's a performance problem in the real world.
>    

Hm, if someone uses:
      if (Character.isBMPCodePoint(codePoint))
          ...;
      else if (Character.isSupplementaryCodePoint(codePoint)) // instead of isValidCodePoint()
          ...;
      else
          ...;
he will lose up to 50 % performance, as you can see from my benchmark of 
isSuppCPAlaMartin().
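
To make the alternative concrete, here is a minimal sketch of the same 
three-way branch with isValidCodePoint() in the second test. The class and 
method names (CodePointKind, classify) are only for illustration, and it 
assumes Character.isBMPCodePoint(int) is available as proposed in this 
review. Once the BMP case has been excluded, any remaining valid code 
point must be supplementary, so the cheaper validity check is enough:

```java
// Illustrative sketch only; CodePointKind and classify are hypothetical names.
final class CodePointKind {

    // Classifies a code point as BMP, supplementary, or invalid.
    static String classify(int codePoint) {
        if (Character.isBMPCodePoint(codePoint)) {
            return "BMP";
        } else if (Character.isValidCodePoint(codePoint)) {
            // Not in the BMP but still valid, so it must be supplementary.
            return "supplementary";
        } else {
            return "invalid";
        }
    }

    public static void main(String[] args) {
        System.out.println(classify('A'));       // a BMP character
        System.out.println(classify(0x1F600));   // a supplementary code point
        System.out.println(classify(0x110000));  // one past MAX_CODE_POINT
        System.out.println(classify(-1));        // negative values are invalid
    }
}
```

Whether this ordering is measurably faster depends on the JIT and the 
input distribution, so it should be benchmarked rather than assumed.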


> We don't usually put such performance information in the javadoc.
>    

In class StringBuilder:
"Where possible, it is recommended that this class be used in preference 
to StringBuffer as it will be faster under most implementations."

java.util.List:
Note that these operations may execute in time proportional to the index 
value for some implementations (the LinkedList class, for example).

ByteBuffer#get(byte[],int,int):
In other words, an invocation of this method of the form 
src.get(dst, off, len) has exactly the same effect as the loop

      for (int i = off; i < off + len; i++)
          dst[i] = src.get();

except that it first checks that there are sufficient bytes in this 
buffer *and it is potentially much more efficient*.


> Can you demonstrate a performance advantage
> of your implementation of isSupplementaryCodePoint
> for BMP characters, when there is no call to
> isBMPCodePoint?  (Such a demonstration typically
> requires testing on a large variety of systems and JITs)
>    

I'm not sure I understand correctly.
I think my benchmark of isSuppCPAlaMartin() demonstrates that.

Anyway, even if isSupplementaryCodePoint() is used in isolation, my code 
will help the JIT to use 2-byte shifted addressing and a shorter 2-byte 
immediate value for the compare, but yes, the JIT should be able to catch 
that without this help. But in that case, we could also stay with the old 
implementations of isBMPCodePoint and isValidCodePoint.
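
For readers who want to see the two styles side by side, the following is 
a rough sketch of plane-based (shifted) checks next to a plain range 
comparison. It is not the code from the webrev; the class and method names 
(PlaneChecks, isBmpByShift, and so on) are made up for illustration. The 
idea is that after shifting right by 16 bits only the plane number 
(0 to 0x10 for valid code points) has to be compared, which needs only 
small constants:

```java
// Illustrative sketch only; these are hypothetical helpers, not the JDK sources.
final class PlaneChecks {

    // Plane 0 is the Basic Multilingual Plane.
    static boolean isBmpByShift(int codePoint) {
        return codePoint >>> 16 == 0;
    }

    // Valid code points lie in planes 0x00 .. 0x10.
    static boolean isValidByShift(int codePoint) {
        return codePoint >>> 16 < 0x11;
    }

    // Supplementary code points lie in planes 0x01 .. 0x10.
    static boolean isSupplementaryByShift(int codePoint) {
        int plane = codePoint >>> 16;
        return plane != 0 && plane < 0x11;
    }

    // The same test written as a range comparison on the full 32-bit value.
    static boolean isSupplementaryByRange(int codePoint) {
        return codePoint >= 0x10000 && codePoint <= 0x10FFFF;
    }

    public static void main(String[] args) {
        int[] samples = {-1, 0, 0xFFFF, 0x10000, 0x10FFFF, 0x110000, Integer.MAX_VALUE};
        for (int cp : samples) {
            // Both variants must agree for every input, including invalid ones.
            System.out.println(Integer.toHexString(cp) + ": "
                    + isSupplementaryByShift(cp) + " / " + isSupplementaryByRange(cp));
        }
    }
}
```

Both supplementary variants return the same result for every int. Which 
one the JIT compiles to tighter machine code is exactly the open question 
here, and would have to be measured on several VMs.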


-Ulf