Rewrite of IBM doublebyte charsets

Tue May 12 19:00:24 UTC 2009

On 12.05.2009 20:31, Xueming Shen wrote:
> Ulf Zibis wrote:
>> Sherman,
>> thanks for verifying my suggestions.
>>
>> > (1) simplify the "plane number" byte check by adding a new static
>> > array of cnspToIndex[16] for decoder
>>
>> or simply:
>>
>>       static final byte[] cnspToIndex = new byte[0x100];
>>       static {
>>           Arrays.fill(cnspToIndex, (byte) -1);
>>           cnspToIndex[0xa2] = 1; cnspToIndex[0xa3] = 2; cnspToIndex[0xa4] = 3;
>>           cnspToIndex[0xa5] = 4; cnspToIndex[0xa6] = 5; cnspToIndex[0xa7] = 6;
>>           cnspToIndex[0xaf] = 7;
>>       }
>>
>>            if ((cnsPlane = cnspToIndex[sa[sp + 1] & 0xff]) < 0)
>>                return CoderResult.malformedForLength(2);
>>
> considered that, but ended up thinking it might not really be worth the
> 0x100 bytes :-) at least based on my measurement.

Well, even int[16] takes 64 bytes, and loop performance should matter
more here than the extra 192 bytes, considering the ~227,000 bytes used
for the mapping tables.
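
To make the table lookup concrete, here is a minimal, self-contained
sketch. The class name CnsPlaneLookup and the helper planeIndex are
hypothetical and exist only for illustration; only cnspToIndex and its
seven entries come from the code quoted above. In the real decoder the
lookup would sit inline in the decode loop.

```java
import java.util.Arrays;

// Hypothetical wrapper class, for illustration only.
final class CnsPlaneLookup {
    // One entry per possible unsigned byte value; -1 marks "not a plane byte".
    static final byte[] cnspToIndex = new byte[0x100];
    static {
        // The cast is required: Arrays.fill(byte[], byte) does not accept an int literal.
        Arrays.fill(cnspToIndex, (byte) -1);
        cnspToIndex[0xa2] = 1; cnspToIndex[0xa3] = 2; cnspToIndex[0xa4] = 3;
        cnspToIndex[0xa5] = 4; cnspToIndex[0xa6] = 5; cnspToIndex[0xa7] = 6;
        cnspToIndex[0xaf] = 7;
    }

    // Hypothetical helper: returns the plane index, or -1 if the byte is not valid.
    static int planeIndex(byte planeByte) {
        // "& 0xff" maps the signed byte to 0..255 so it can be used as an array index.
        return cnspToIndex[planeByte & 0xff];
    }

    public static void main(String[] args) {
        System.out.println(planeIndex((byte) 0xa2)); // valid plane byte
        System.out.println(planeIndex((byte) 0xa8)); // not a plane byte
    }
}
```

A negative result is what the decoder would turn into
CoderResult.malformedForLength(2).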

>
>>
>> *** Question: Why do you code:
>>
>>                   } else if ((byte1 & MSB) == 0) {  // ASCII  G0
>>
>> instead of:
>>
>>                   } else if (byte1 >= 0) {  // ASCII  G0
>>
> I believe this line was written 10 years ago, so I have no idea (or 
> forgot) why we picked this one, my guess is the code might
> be a little easier to read with "MSB"... do you think the >= 0 is better or
> faster?

Yes, I think it's also faster, since it saves loading "MSB" and the AND.
I'm not sure whether HotSpot would detect the shortcut by itself!
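
For reference, a small sketch showing that the two tests agree for every
possible byte value. It assumes byte1 is a (signed) Java byte and that
MSB is 0x80; neither is stated in the quoted code, and the class and
method names are hypothetical.

```java
// Hypothetical class, for illustration only.
final class MsbCheck {
    static final int MSB = 0x80; // assumed value of the MSB constant

    // The existing form: the top bit is clear.
    static boolean isAsciiMasked(byte byte1) {
        return (byte1 & MSB) == 0;
    }

    // The suggested form: a signed byte is non-negative exactly when its top bit is clear.
    static boolean isAsciiSigned(byte byte1) {
        return byte1 >= 0;
    }

    // Compares both forms over all 256 byte values.
    static boolean agreeOnAllBytes() {
        for (int i = Byte.MIN_VALUE; i <= Byte.MAX_VALUE; i++) {
            byte b = (byte) i;
            if (isAsciiMasked(b) != isAsciiSigned(b)) {
                return false;
            }
        }
        return true;
    }

    public static void main(String[] args) {
        System.out.println(agreeOnAllBytes());
    }
}
```

Note that the equivalence would not hold if byte1 were an int already
masked with 0xff, because then byte1 >= 0 is always true.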