Sponsor for 6666666: A better implementation of Character.isSupplementaryCodePoint

Fri Mar 12 23:04:06 UTC 2010

On Thu, Mar 11, 2010 at 13:14, Ulf Zibis <Ulf.Zibis at gmx.de> wrote:
> On 11.03.2010 20:38, Martin Buchholz wrote:
> - Maybe better:  "... using a single {@code char}".

Done.

> - Why don't you like using the new isBMPCodePoint() for
> isSupplementaryCodePoint() and toUpperCaseCharArray()?

I now use it for the assert in toUpperCaseCharArray().

> - The same shift magic would improve isISOControl(), isHighSurrogate(), and
> isLowSurrogate(), in particular if the latter two are called consecutively.

isISOControl - yes, others - I am not convinced.
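
For readers following along, here is a minimal sketch of what "shift magic" could look like for these checks. The class and method names are made up for illustration, and this is not necessarily what the webrev contains; it only shows how a range test can collapse into a shift plus one compare.

```java
// Hypothetical illustration; names are not taken from the actual patch.
final class ShiftChecks {

    // BMP code points fit in 16 bits, so every bit above bit 15 must be zero.
    // Negative inputs fail because >>> is an unsigned shift.
    static boolean isBmpCodePoint(int codePoint) {
        return codePoint >>> 16 == 0;
    }

    // ISO control characters are U+0000..U+001F and U+007F..U+009F.
    // The low range is exactly the set of values with no bits above bit 4.
    static boolean isIsoControlShift(int codePoint) {
        return codePoint <= 0x9F
            && (codePoint >= 0x7F || (codePoint >>> 5 == 0));
    }

    // High surrogates U+D800..U+DBFF all share the same upper six bits.
    static boolean isHighSurrogateShift(char ch) {
        return ch >>> 10 == 0xD800 >>> 10;
    }

    public static void main(String[] args) {
        System.out.println(isBmpCodePoint(0x1F600));
        System.out.println(isIsoControlShift(0x7F));
        System.out.println(isHighSurrogateShift('\uD83D'));
    }
}
```

Whether the shifted form is actually faster than two compares depends on the JIT; the sketch only demonstrates that the results are the same.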

> An 8-bit shift plus compare would allow HotSpot to compile to smart 1-byte
> immediate opcodes.
> - Don't you think my notes on validity are worth adding? (Or a separate bug?)

I agree something could be done here - separate bug.

> - Changing ch <= MAX_SURROGATE to ch < MAX_SURROGATE + 1 would allow the
> HotSpot compiler to optimize away one branch if those methods are used
> consecutively.

Done.
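
A minimal sketch of the idea, using a made-up class name and not copied from the webrev: since MAX_HIGH_SURROGATE + 1 has the same value as MIN_LOW_SURROGATE, writing the upper bound as "< MAX + 1" makes back-to-back high/low checks compare against the same constant. Whether HotSpot then merges the branches is the claim under discussion, not something this example demonstrates.

```java
// Hypothetical illustration of the "< MAX + 1" form of the surrogate checks.
final class SurrogateBounds {

    static boolean isHighSurrogate(char ch) {
        // Upper bound 0xDBFF + 1 == 0xDC00, a compile-time constant.
        return ch >= Character.MIN_HIGH_SURROGATE
            && ch < (Character.MAX_HIGH_SURROGATE + 1);
    }

    static boolean isLowSurrogate(char ch) {
        // Lower bound 0xDC00 is the same constant as the upper bound above.
        return ch >= Character.MIN_LOW_SURROGATE
            && ch < (Character.MAX_LOW_SURROGATE + 1);
    }

    public static void main(String[] args) {
        System.out.println(isHighSurrogate('\uD83D'));
        System.out.println(isLowSurrogate('\uDE00'));
    }
}
```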

> - And lastly, I would like to make the constants complete (i.e. add
> MAX_SUPPLEMENTARY_CODE_POINT).

I have no objection to adding those, but I am not in favor either.
You'll need to convince someone else.

>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings/
>>
>
> This reminds me that some months ago I prepared a beautified version of
> Character's source (things like the above, replacing <code> with {@code},
> fixing indentation inconsistencies, etc.). Would there be interest in such a
> patch?

Please provide the URL of the patch.

>> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/malformed-utf8/

> In encodeBufferLoop() you could use putChar() or putInt() instead of put().
> That should perform better.

I'm not convinced.  You would need to assemble bytes into an
int, and then break them apart into bytes on the other side?
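
To make the trade-off concrete, here is a rough sketch for the 4-byte UTF-8 case. It is not the encoder's actual code; the class and method names are invented, and it assumes the buffer uses big-endian byte order (the default for a freshly allocated ByteBuffer).

```java
import java.nio.ByteBuffer;

// Hypothetical comparison of byte-wise put() and a single putInt().
final class Utf8PutSketch {

    // One put() per UTF-8 byte of a supplementary code point.
    static void putBytes(ByteBuffer dst, int cp) {
        dst.put((byte) (0xF0 | (cp >> 18)));
        dst.put((byte) (0x80 | ((cp >> 12) & 0x3F)));
        dst.put((byte) (0x80 | ((cp >> 6) & 0x3F)));
        dst.put((byte) (0x80 | (cp & 0x3F)));
    }

    // The same four bytes, first assembled into an int and then written
    // with putInt(). Assumes dst.order() is BIG_ENDIAN.
    static void putPacked(ByteBuffer dst, int cp) {
        int packed = ((0xF0 | (cp >> 18)) << 24)
                   | ((0x80 | ((cp >> 12) & 0x3F)) << 16)
                   | ((0x80 | ((cp >> 6) & 0x3F)) << 8)
                   | (0x80 | (cp & 0x3F));
        dst.putInt(packed);
    }

    public static void main(String[] args) {
        ByteBuffer a = ByteBuffer.allocate(4);
        ByteBuffer b = ByteBuffer.allocate(4);
        putBytes(a, 0x1F600);
        putPacked(b, 0x1F600);
        System.out.println(java.util.Arrays.equals(a.array(), b.array()));
    }
}
```

The packed variant saves three bounds checks but adds shifts and ORs to build the int, and the buffer implementation may split it back into bytes anyway, so only a benchmark would settle which is faster.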

Martin