review request for 6798511/6860431: Include functionality of Surrogate in Character

Martin Buchholz martinrb at google.com
Tue Mar 16 22:00:34 UTC 2010


I am recanting my previous support for any change to
isSupplementaryCodePoint.

I think my brain (or maybe Ulf's brain)
tricked me into thinking that
the considerations for isValidCodePoint and
isBMPCodePoint also apply to
isSupplementaryCodePoint.

Sorry.

I renamed my patch file from isSupplementaryCodePoint to isValidCodePoint.

6934268: Better implementation of Character.isValidCodePoint
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isValidCodePoint
6934265: Add public method Character.isBMPCodePoint
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint
6934270: Remove javac warnings from Character.java
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings
6934271: Better handling of longer utf-8 sequences
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/utf8-twiddling
6935172: Optimize bit-twiddling in Bits.java
http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Bits.java

Martin

On Tue, Mar 16, 2010 at 15:35, Xueming Shen <Xueming.Shen at sun.com> wrote:
> Martin Buchholz wrote:
>>
>> On Tue, Mar 16, 2010 at 13:06, Xueming Shen <Xueming.Shen at sun.com> wrote:
>>
>>>
>>> Martin Buchholz wrote:
>>>
>>>>
>>>> Therefore the existing implementation
>>>>
>>>>>>
>>>>>>  return codePoint>= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>>           &&  codePoint<= MAX_CODE_POINT;
>>>>>>
>>>>>> will almost always perform just one comparison against a constant,
>>>>>> which is hard to beat.
>>>>>>
>>>>>>
>>>>>>
>>>>>
>>>>> 1. Wondering: I think there are TWO comparisons.
>>>>> 2. Those comparisons need to load 32 bit values from machine code,
>>>>> against
>>>>> only 8 bit values in my case.
>>>>>
>>>>>
>>>>
>>>> It's a good point.  In the machine code, shifts are likely to use
>>>> immediate values, and so will be a small win.
>>>>
>>>> int x = codePoint >>> 16;
>>>> return x != 0 && x < 0x11;
>>>>
>>>> (On modern hardware, these optimizations
>>>> are less valuable than they used to be;
>>>> ordinary integer arithmetic is almost free)
>>>>
>>>>
>>>>
>>>
>>> I'm not convinced if the proposed code is really better...a "small win".
>>>
>>
>> The primary theory here is that branches are expensive,
>> and we are reducing them by one.
>>
>>
>
> There are still two branches in new impl, if you count the "ifeq" and
> "if_icmpge"(?)
>
> We are trying to "optimize" this piece of code with the assumption that the
> new impl MIGHT help certain vm (hotspot?)
> to optimize certain use scenario (some consecutive usages), if the compiler
> and/or the vm are both smart enough at certain
> point, with no supporting benchmark data?
>
> My concern is that the reality might be that this optimization might even
> hurt the BMP use
> case (the majority of the possible real world use scenarios) with a 10%
> bigger bytecode size.
>
> -Sherman
>
>
>
> public class Character extends java.lang.Object {
>  public static final int MIN_SUPPLEMENTARY_CODE_POINT = 65536;
>
>  public static final int MAX_CODE_POINT = 1114111;
>
>  public Character();
>   Code:
>      0: aload_0             1: invokespecial #1                  // Method
> java/lang/Object."<init>":()V
>      4: return
>  public static boolean isSupplementaryCodePoint(int);
>   Code:
>      0: iload_0             1: ldc           #2                  // int
> 65536
>      3: if_icmplt     16
>      6: iload_0             7: ldc           #3                  // int
> 1114111
>      9: if_icmpgt     16
>     12: iconst_1           13: goto          17
>     16: iconst_0           17: ireturn
>  public static boolean isSupplementaryCodePoint_new(int);
>   Code:
>      0: iload_0             1: bipush        16
>      3: iushr               4: istore_1
>      5: iload_1             6: ifeq          19
>      9: iload_1            10: bipush        17
>     12: if_icmpge     19
>     15: iconst_1           16: goto          20
>     19: iconst_0           20: ireturn       }
>
>
>
>
>
>



More information about the core-libs-dev mailing list