Rewrite of IBM doublebyte charsets
Ulf Zibis
Ulf.Zibis at gmx.de
Mon May 18 13:54:46 UTC 2009
On 17.05.2009 23:00, Xueming Shen wrote:
> Ulf Zibis wrote:
>> *** Encoder-Suggestions:
>>
>> (26) Why copy the Strings to char[] in initC2B()? String access should
>> be just as fast:
>> - char[] sb = b2cSB.toCharArray();
>> - char[] db = b2c[i].toCharArray();
>>
>> -Ulf
>>
>>
>
> Because the b2c tables need to be updated before they are used to
> generate the c2b tables if there is a b2cNR table. (This means multiple
> "bytes" are mapped to the same "char"; when doing c->b we need to know
> which "bytes" to map to, and this is specified in the .nr map.) In
> theory we need to do that only if b2cNR is present, but I don't want to
> keep two paths. A possible optimization is to pass in char[] instead of
> String, and then make a copy only when necessary.
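The optimization mentioned at the end of the quote could look roughly like the sketch below. It is not the JDK code: the class name, the nrBytes parameter (the byte values listed in *.nr) and the use of U+FFFD as the "unmappable" marker are assumptions made for illustration. The caller's table is cloned only when there is something to patch.

```java
import java.util.Arrays;

final class B2cNrSketch {
    // Assumed marker for "no mapping"; hypothetical, not taken from the thread.
    static final char UNMAPPABLE = '\uFFFD';

    // b2c: decode table indexed by byte value.
    // nrBytes: byte values from a hypothetical *.nr table that must not
    // take part in c->b (may be null or empty).
    static int[] buildC2B(char[] b2c, int[] nrBytes) {
        char[] src = b2c;
        if (nrBytes != null && nrBytes.length > 0) {
            src = b2c.clone();              // copy only when a patch is needed
            for (int b : nrBytes) {
                src[b] = UNMAPPABLE;        // hide the non-roundtrip byte
            }
        }
        int[] c2b = new int[0x10000];
        Arrays.fill(c2b, -1);               // -1 means "char not encodable"
        for (int b = 0; b < src.length; b++) {
            if (src[b] != UNMAPPABLE) {
                c2b[src[b]] = b;            // on duplicates the later byte wins
            }
        }
        return c2b;
    }
}
```

Without a *.nr table the original array is read in place; with one, the caller's array stays untouched because only the clone is patched.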
Oops, yes, it was late, after hours of thinking digitally.
Which made me wonder why I didn't have this problem in my code...
I didn't have to manipulate the b2c map, because I moved all the NRs
into the *.irregularities map file, which you call *.c2b and which in
fact overrides entries of the c2b map generated from b2c. (BTW, in *.nr
the 2nd value is redundant and could be dropped.)
So if we have
15 --> 000A
25 --> 000A
in *.map, instead of
25 (--> 000A)
in *.nr, we could have
15 <-- 000A
in *.c2b
So avoiding the copy of the whole b2c map is an additional serious
argument for my suggestion (21), which I must correct:
(21) merge the *.nr files into the *.c2b files (25->000a becomes 000a->15):
Benefit[21]: reduces the number of files
Benefit[22]: simplifies initC2B() (avoids 2 loops and saves copying the
whole b2c map)
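
A minimal sketch of what suggestion (21) amounts to, using the 15/25 -> 000A example from above. The class name, the int[][] form of the *.c2b overrides and the U+FFFD marker are assumptions for illustration, not the actual initC2B() signature. The b2c String is only read, never copied.

```java
import java.util.Arrays;

final class C2bOverrideSketch {
    // Assumed marker for "no mapping"; hypothetical, not taken from the thread.
    static final char UNMAPPABLE = '\uFFFD';

    // b2c: decode table as a String, indexed by byte value.
    // overrides: {char, byte} pairs as they would appear in *.c2b.
    static int[] buildC2B(String b2c, int[][] overrides) {
        int[] c2b = new int[0x10000];
        Arrays.fill(c2b, -1);                   // -1 means "char not encodable"
        for (int b = 0; b < b2c.length(); b++) {
            char c = b2c.charAt(b);             // direct String access, no copy
            if (c != UNMAPPABLE) {
                c2b[c] = b;                     // on duplicates the later byte wins
            }
        }
        for (int[] o : overrides) {
            c2b[o[0]] = o[1];                   // *.c2b entry overrides the generated one
        }
        return c2b;
    }

    public static void main(String[] args) {
        char[] table = new char[256];
        Arrays.fill(table, UNMAPPABLE);
        table[0x15] = (char) 0x000A;            // 15 --> 000A
        table[0x25] = (char) 0x000A;            // 25 --> 000A
        String b2c = new String(table);
        int[] c2b = buildC2B(b2c, new int[][] { { 0x000A, 0x15 } });  // 15 <-- 000A
        System.out.printf("000A -> %02X%n", c2b[0x000A]);
    }
}
```

With an empty override list 000A would encode to 25, because the later duplicate wins; the single *.c2b entry is enough to select 15 instead.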
-Ulf