review request for 6798511/6860431: Include functionality of Surrogate in Character
Xueming Shen
Xueming.Shen at Sun.COM
Fri Mar 19 19:56:46 UTC 2010
Martin Buchholz wrote:
> I renamed my patch file from isSupplementaryCodePoint to isValidCodePoint.
>
> 6934268: Better implementation of Character.isValidCodePoint
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/isValidCodePoint
>
It's fine. But if I was you I would not "optimize" it.
> 6934265: Add public method Character.isBMPCodePoint
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/public-isBMPCodePoint
>
Looks fine. I will let you know when the CCC is approved.
> 6934270: Remove javac warnings from Character.java
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Character-warnings
>
Looks fine.
> 6934271: Better handling of longer utf-8 sequences
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/utf8-twiddling
>
Looks good, though the code style looks really really...strange:-)
> 6935172: Optimize bit-twiddling in Bits.java
> http://cr.openjdk.java.net/~martin/webrevs/openjdk7/Bits.java
>
Looks fine. I was surprised the javac compiler really generates the code
for expr + 0 and expr << 0.
I kinda remember the gcc compiler cat optimize this kind situation to
just expr (If my memory is correct,
or maybe that was kinda of optimization I was planning to do in one of
my projects :-) ).
-Sherman
> Martin
>
> On Tue, Mar 16, 2010 at 15:35, Xueming Shen <Xueming.Shen at sun.com> wrote:
>
>> Martin Buchholz wrote:
>>
>>> On Tue, Mar 16, 2010 at 13:06, Xueming Shen <Xueming.Shen at sun.com> wrote:
>>>
>>>
>>>> Martin Buchholz wrote:
>>>>
>>>>
>>>>> Therefore the existing implementation
>>>>>
>>>>>
>>>>>>> return codePoint>= MIN_SUPPLEMENTARY_CODE_POINT
>>>>>>> && codePoint<= MAX_CODE_POINT;
>>>>>>>
>>>>>>> will almost always perform just one comparison against a constant,
>>>>>>> which is hard to beat.
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>> 1. Wondering: I think there are TWO comparisons.
>>>>>> 2. Those comparisons need to load 32 bit values from machine code,
>>>>>> against
>>>>>> only 8 bit values in my case.
>>>>>>
>>>>>>
>>>>>>
>>>>> It's a good point. In the machine code, shifts are likely to use
>>>>> immediate values, and so will be a small win.
>>>>>
>>>>> int x = codePoint >>> 16;
>>>>> return x != 0 && x < 0x11;
>>>>>
>>>>> (On modern hardware, these optimizations
>>>>> are less valuable than they used to be;
>>>>> ordinary integer arithmetic is almost free)
>>>>>
>>>>>
>>>>>
>>>>>
>>>> I'm not convinced if the proposed code is really better...a "small win".
>>>>
>>>>
>>> The primary theory here is that branches are expensive,
>>> and we are reducing them by one.
>>>
>>>
>>>
>> There are still two branches in new impl, if you count the "ifeq" and
>> "if_icmpge"(?)
>>
>> We are trying to "optimize" this piece of code with the assumption that the
>> new impl MIGHT help certain vm (hotspot?)
>> to optimize certain use scenario (some consecutive usages), if the compiler
>> and/or the vm are both smart enough at certain
>> point, with no supporting benchmark data?
>>
>> My concern is that the reality might be that this optimization might even
>> hurt the BMP use
>> case (the majority of the possible real world use scenarios) with a 10%
>> bigger bytecode size.
>>
>> -Sherman
>>
>>
>>
>> public class Character extends java.lang.Object {
>> public static final int MIN_SUPPLEMENTARY_CODE_POINT = 65536;
>>
>> public static final int MAX_CODE_POINT = 1114111;
>>
>> public Character();
>> Code:
>> 0: aload_0 1: invokespecial #1 // Method
>> java/lang/Object."<init>":()V
>> 4: return
>> public static boolean isSupplementaryCodePoint(int);
>> Code:
>> 0: iload_0 1: ldc #2 // int
>> 65536
>> 3: if_icmplt 16
>> 6: iload_0 7: ldc #3 // int
>> 1114111
>> 9: if_icmpgt 16
>> 12: iconst_1 13: goto 17
>> 16: iconst_0 17: ireturn
>> public static boolean isSupplementaryCodePoint_new(int);
>> Code:
>> 0: iload_0 1: bipush 16
>> 3: iushr 4: istore_1
>> 5: iload_1 6: ifeq 19
>> 9: iload_1 10: bipush 17
>> 12: if_icmpge 19
>> 15: iconst_1 16: goto 20
>> 19: iconst_0 20: ireturn }
>>
>>
>>
>>
>>
>>
>>
More information about the core-libs-dev
mailing list