Rewrite of IBM doublebyte charsets

Thu May 14 23:48:49 UTC 2009

Ulf Zibis wrote:
> On 15.05.2009 00:03, Xueming Shen wrote:
>>
>> What is the "XOR approach"? I might have missed it. I'm happy to try it 
>> out. OK, the "3 times computing" and "compare <0" are good hints 
>> to improve, the
>> latest one looks like
>
> I'm afraid, you really missed it. See my post from 12.05.2009 20:25 
> CEST ;-)
> I mean:
>
>           int cnsPlane = sa[sp +1] ^ 0xa0;
>           if (cnsPlane > 0xf || (cnsPlane = cnspToIndex[cnsPlane]) < 0)
>               return CoderResult.malformedForLength(2);
>
> or maybe (to force LoadUB (Bug ID 6797305), which may be faster than 
> sign extension to int):
>           int cnsPlane = (sa[sp +1] ^ 0xa0) & 0xff;
>
> or maybe use byte[] (to force LoadUB, which may be faster than sign 
> extension to int):
>           byte cnsPlane = (byte)(sa[sp +1] ^ 0xa0);
> or
>           byte cnsPlane = (byte)((sa[sp +1] & 0xff) ^ (0xa0 & 
> 0xff)); // don't know if this makes a difference
>
> But anyway, I think,
>       static final byte[] cnspToIndex = new byte[0x100];
>       ...
>
> would be the fastest.
In the hope of getting you to move on to the IBM charsets as quickly as 
possible :-) I've tried both the XOR and the byte[256] approaches.
Since we have to do "int cnsPlane = (sa[sp +1] ^ 0xa0) & 0xff;" 
(without the & 0xff you get a negative value from sign extension), the 
XOR version is not faster than the existing one; actually my "not that 
accurate" benchmark shows it is slower...
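
To illustrate the sign-extension point in isolation, here is a minimal
sketch. The class and method names are made up for this example and are
not taken from the webrev; it only shows what happens to a byte >= 0x80
when it is XORed with 0xa0, with and without the mask.

```java
public class SignExtensionDemo {
    // Hypothetical helper: XOR without a mask. The byte is promoted to int
    // with sign extension first, so bytes >= 0x80 give a negative result.
    static int xorUnmasked(byte b) {
        return b ^ 0xa0;
    }

    // Hypothetical helper: XOR followed by & 0xff keeps only the low 8 bits.
    static int xorMasked(byte b) {
        return (b ^ 0xa0) & 0xff;
    }

    public static void main(String[] args) {
        byte b = (byte) 0xa2;
        System.out.println(xorUnmasked(b)); // prints -254
        System.out.println(xorMasked(b));   // prints 2
    }
}
```

Without the mask, the "cnsPlane > 0xf" test alone would let such a
negative value through, which is why the extra & 0xff is needed.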

The "static final byte[] cnspToIndex = new byte[0x100];" version is 
slightly faster: my measurement shows about a 3-5% improvement in 
decoding. So the webrev has been updated to go with this approach. 
Thanks!
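
For readers following along, a rough sketch of the 256-entry table idea
is below. The table contents are an assumption for illustration only
(bytes 0xa1..0xa7 mapped to indexes 0..6), and the class and method
names are hypothetical; the real mapping is whatever the webrev defines.

```java
import java.util.Arrays;

public class PlaneTableDemo {
    // 256-entry table indexed by the unsigned byte value; -1 marks a
    // byte that does not name a supported plane.
    static final byte[] cnspToIndex = new byte[0x100];
    static {
        Arrays.fill(cnspToIndex, (byte) -1);
        // Assumption for illustration only: 0xa1..0xa7 map to 0..6.
        for (int i = 0; i < 7; i++) {
            cnspToIndex[0xa1 + i] = (byte) i;
        }
    }

    // Hypothetical helper: one masked load replaces the XOR and the
    // range check. A negative result means the byte is malformed.
    static int planeIndex(byte b) {
        return cnspToIndex[b & 0xff];
    }

    public static void main(String[] args) {
        System.out.println(planeIndex((byte) 0xa1)); // prints 0
        System.out.println(planeIndex((byte) 0xa8)); // prints -1
    }
}
```

In the decoder loop the caller would then only need a single "< 0"
check before returning CoderResult.malformedForLength(2).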

Sherman