Rewrite of IBM doublebyte charsets

Thu May 14 22:03:18 UTC 2009

Ulf Zibis wrote:
> On 12.05.2009 20:31, Xueming Shen wrote:
>> Ulf Zibis wrote:
>>> Sherman,
>>> thanks for verifying my suggestions.
>>>
>>> > (1) simplify the "plane number" byte check by adding a new static array of cnspToIndex[16] for decoder
>>>
>>> or simply:
>>>
>>>       static final byte[] cnspToIndex = new byte[0x100];
>>>       static {
>>>           Arrays.fill(cnspToIndex, (byte) -1);
>>>           cnspToIndex[0xa2] = 1; cnspToIndex[0xa3] = 2; cnspToIndex[0xa4] = 3;
>>>           cnspToIndex[0xa5] = 4; cnspToIndex[0xa6] = 5; cnspToIndex[0xa7] = 6;
>>>           cnspToIndex[0xaf] = 7;
>>>       }
>>>
>>>            if ((cnsPlane = cnspToIndex[sa[sp + 1] & 0xff]) < 0)
>>>                return CoderResult.malformedForLength(2);
>>>
>> I considered that, but ended up thinking it might not really be worth the
>> 0x100 bytes :-) at least based on my measurement.
>
> But I wonder why you didn't consider my XOR approach. The XOR needs to
> be computed only once, in contrast to computing AND 3 times plus
> comparing against -1 (which needs a load operation, in contrast to a compare '< 0').
>
>
What is the "XOR approach"? I might have missed it. I'm happy to try it
out. OK, the "3 times computing" and "compare < 0" points are good hints
for improvement. The latest version looks like

    int cnsPlane = sa[sp + 1];
    if ((cnsPlane & 0xf0) != 0xa0 ||
        (cnsPlane = cnspToIndex[cnsPlane & 0x0f]) < 0)
        return CoderResult.malformedForLength(2);

And it definitely is better than the previous one, thanks! :-) Now you might want to take a look at the
IBM db webrev; any change there would benefit several charsets :-)
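
For anyone following along, here is a minimal standalone sketch of the two lookups discussed in this thread: the 0x100-entry table indexed by the whole byte, and the 16-entry table indexed by the low nibble after checking that the high nibble is 0xa. The class and method names are hypothetical, and the table values are simply copied from the snippet quoted above, so treat it as an illustration rather than the actual decoder code.

```java
import java.util.Arrays;

// Hypothetical demo class; not part of the real charset sources.
final class CnsPlaneCheck {
    // 0x100-entry variant: indexed directly by the unsigned plane byte.
    static final byte[] FULL = new byte[0x100];
    // 16-entry variant: indexed by the low nibble of the plane byte.
    static final byte[] NIBBLE = new byte[16];

    static {
        Arrays.fill(FULL, (byte) -1);
        Arrays.fill(NIBBLE, (byte) -1);
        // Plane bytes and their indexes 1..7, as in the quoted snippet.
        int[] planeBytes = {0xa2, 0xa3, 0xa4, 0xa5, 0xa6, 0xa7, 0xaf};
        for (int i = 0; i < planeBytes.length; i++) {
            FULL[planeBytes[i]] = (byte) (i + 1);
            NIBBLE[planeBytes[i] & 0x0f] = (byte) (i + 1);
        }
    }

    // One AND to undo sign extension, one load, then a '< 0' test by the caller.
    static int viaFullTable(byte b) {
        return FULL[b & 0xff];
    }

    // High-nibble check first, then a load from the small table.
    static int viaNibbleTable(byte b) {
        if ((b & 0xf0) != 0xa0)
            return -1;
        return NIBBLE[b & 0x0f];
    }

    public static void main(String[] args) {
        // Both variants should classify every possible byte value identically.
        for (int v = 0; v < 0x100; v++) {
            byte b = (byte) v;
            if (viaFullTable(b) != viaNibbleTable(b))
                throw new AssertionError("mismatch at 0x" + Integer.toHexString(v));
        }
        System.out.println("both lookups agree for all 256 byte values");
    }
}
```

A negative return value corresponds to the malformedForLength(2) case in the decoder; which variant is faster would still have to be measured.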

Sherman