Sponsor for 6666666: A better implementation of Character.isSupplementaryCodePoint

Tue Mar 16 19:48:07 UTC 2010

Here are my additions:

On 13.03.2010 00:04, Martin Buchholz wrote:
>> - Why don't you like using the new isBMPCodePoint() for
>> isSupplementaryCodePoint() and toUpperCaseCharArray() ?
>>      
> I now use it for the assert in toUpperCaseCharArray()
>    

     return !isBMPCodePoint() && isValidCodePoint();
gives the same result as the current code.
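
To make the comparison concrete, here is a minimal sketch of both forms side by side. The class and helper names are hypothetical stand-ins, not the code in the webrev, and I assume isBMPCodePoint() is a plain test for "upper 16 bits are zero".

```java
// Hypothetical stand-alone sketch; not the actual java.lang.Character source.
final class CodePointSketch {
    static final int MIN_SUPPLEMENTARY = 0x10000;
    static final int MAX_CODE_POINT = 0x10FFFF;

    // Assumption: a BMP code point fits in 16 bits, so the upper bits are zero.
    static boolean isBMPCodePoint(int cp) {
        return cp >>> 16 == 0;
    }

    // Valid code points lie in the range [0, 0x10FFFF].
    static boolean isValidCodePoint(int cp) {
        return cp >= 0 && cp <= MAX_CODE_POINT;
    }

    // The form proposed in this mail.
    static boolean isSupplementaryViaBMP(int cp) {
        return !isBMPCodePoint(cp) && isValidCodePoint(cp);
    }

    // A plain range check, used here as the reference form.
    static boolean isSupplementaryViaRange(int cp) {
        return cp >= MIN_SUPPLEMENTARY && cp <= MAX_CODE_POINT;
    }

    public static void main(String[] args) {
        // Compare both forms over a range that includes invalid values.
        for (int cp = -0x20000; cp <= 0x120000; cp++) {
            if (isSupplementaryViaBMP(cp) != isSupplementaryViaRange(cp)) {
                throw new AssertionError("mismatch at " + cp);
            }
        }
        System.out.println("both forms agree");
    }
}
```

This only shows that the two forms return the same values; whether HotSpot compiles them to the same machine code would have to be checked separately.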

>    
>> - Same shift magic would enhance isISOControl(), isHighSurrogate(),
>> isLowSurrogate(), in particular if latter occur consecutive.
>>      
> isISOControl - yes, others - I am not convinced.
>    

If the value is (virtually) shifted right by 8, HotSpot could use a
cheaper 1-byte compare on the high byte.
Additionally, those methods are often called consecutively, so all 4
compares would benefit.
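
A rough sketch of what I mean for the surrogate tests follows. The class and method names are made up for illustration, and whether HotSpot really emits 1-byte immediates for these compares is an assumption that would need a look at the generated code.

```java
// Hypothetical sketch of the "shift by 8" idea; not the JDK implementation.
final class SurrogateShiftSketch {
    // High surrogates are U+D800..U+DBFF, so the high byte is 0xD8..0xDB.
    static boolean isHighSurrogateShift(char ch) {
        int hi = ch >>> 8;
        return hi >= 0xD8 && hi <= 0xDB;
    }

    // Low surrogates are U+DC00..U+DFFF, so the high byte is 0xDC..0xDF.
    static boolean isLowSurrogateShift(char ch) {
        int hi = ch >>> 8;
        return hi >= 0xDC && hi <= 0xDF;
    }

    public static void main(String[] args) {
        // Check every char value against the existing Character methods.
        for (int i = 0; i <= 0xFFFF; i++) {
            char ch = (char) i;
            if (isHighSurrogateShift(ch) != Character.isHighSurrogate(ch)
                    || isLowSurrogateShift(ch) != Character.isLowSurrogate(ch)) {
                throw new AssertionError("mismatch at " + Integer.toHexString(i));
            }
        }
        System.out.println("shifted compares match Character.isXxxSurrogate");
    }
}
```

When both tests are used back to back, the shifted high byte is computed once and all four bounds are small constants.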

>>   8-bit shift + compare would allow HotSpot to compile to smart 1-byte
>> immediate op-codes.
>> In encodeBufferLoop() you could use putChar(), putInt() instead put().
>> Should perform better.
>>      
> I'm not convinced.  You would need to assemble bytes into an
> int, and then break them apart into bytes on the other side?
>    

Some time ago I disassembled such code. I could see that the int was
copied directly to memory by a single 32-bit move instruction.
With put(byte), I saw four 8-bit move instructions.
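
For reference, a small sketch of the two variants I compared. The class and method names are hypothetical, and it assumes the buffer uses the default big-endian byte order. It only demonstrates that both produce the same bytes; it says nothing about the generated machine code.

```java
import java.nio.ByteBuffer;
import java.util.Arrays;

// Hypothetical sketch; names are for illustration only.
final class PutIntSketch {
    // One putInt() call; a new ByteBuffer is big-endian by default.
    static byte[] writeWithPutInt(int v) {
        ByteBuffer bb = ByteBuffer.allocate(4);
        bb.putInt(v);
        return bb.array();
    }

    // Four put(byte) calls, most significant byte first.
    static byte[] writeWithFourPuts(int v) {
        ByteBuffer bb = ByteBuffer.allocate(4);
        bb.put((byte) (v >>> 24));
        bb.put((byte) (v >>> 16));
        bb.put((byte) (v >>> 8));
        bb.put((byte) v);
        return bb.array();
    }

    public static void main(String[] args) {
        int v = 0x01020304;
        // Both variants should leave identical bytes in the buffer.
        if (!Arrays.equals(writeWithPutInt(v), writeWithFourPuts(v))) {
            throw new AssertionError("byte layouts differ");
        }
        System.out.println(Arrays.toString(writeWithPutInt(v)));
    }
}
```

The shifts in the second variant are what Martin's objection is about; in an encoder the bytes may already be available separately, so the cost of assembling the int has to be weighed against the single store.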

I have not disassembled the case where a 3-byte value is first collected
in a 3-byte byte[] and then copied by put(byte[]). Maybe HotSpot could
optimize here too.

Try it out. Two pairs of eyes see more than one. Maybe I was mistaken.

BTW: for the same optimization, I would like to have putInt() and
putLong() in CharBuffer and ShortBuffer, and putLong() in IntBuffer.

-Ulf