Rewrite of IBM doublebyte charsets

Thu May 21 19:30:34 UTC 2009

On 21.05.2009 01:48, Xueming Shen wrote:
> Thanks for the 5 minutes:-)
>
> Your FindXYZcoderBugs tests are indeed very helpful to catch most of
> the "inconsistent" behaviors between different paths by feeding the
> "random" inputs.
>
> The TestIBMDB.java is diffing the behaviors of the old implementation
> and the new implementation with all "decode-able" bytes and
> "encode-able" chars... so it gives us some guarantee.

Why do we *try* to stick to the old behaviour for malformed and/or
unmappable input, if we don't diff the new implementation against the
old one for such input?
In that case we could just as well *try* to treat malformed and/or
unmappable input as accurately as possible.
As you mentioned, most users don't distinguish between the two, so they
won't be affected. On the other hand, users who do make this distinction
would probably be happy to get more accurate results, even if they are
not identical to the results so far.
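
To illustrate the distinction I mean, here is a minimal sketch of how a caller can tell malformed from unmappable input via CoderResult. It uses US-ASCII only as a stand-in, because it is available on every JDK; the class and method names (CoderResultDemo, decodeOnce, encodeOnce) are hypothetical and not part of the IBM doublebyte rewrite.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CoderResult;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class; US-ASCII stands in for the IBM charsets.
class CoderResultDemo {

    // Decode the bytes once and return the raw CoderResult instead of
    // replacing or ignoring bad input.
    static CoderResult decodeOnce(byte[] input) {
        CharsetDecoder dec = StandardCharsets.US_ASCII.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return dec.decode(ByteBuffer.wrap(input), CharBuffer.allocate(16), true);
    }

    // Encode the chars once and return the raw CoderResult.
    static CoderResult encodeOnce(String input) {
        CharsetEncoder enc = StandardCharsets.US_ASCII.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return enc.encode(CharBuffer.wrap(input), ByteBuffer.allocate(16), true);
    }

    static String describe(CoderResult cr) {
        if (cr.isMalformed())  return "malformed, length " + cr.length();
        if (cr.isUnmappable()) return "unmappable, length " + cr.length();
        return cr.toString(); // UNDERFLOW or OVERFLOW
    }

    public static void main(String[] args) {
        // 0x80 is not a legal byte sequence in US-ASCII.
        System.out.println(describe(decodeOnce(new byte[] { (byte) 0x80 })));
        // U+00E9 is a valid char, but US-ASCII has no mapping for it.
        System.out.println(describe(encodeOnce("\u00e9")));
        // Plain ASCII input is consumed completely.
        System.out.println(describe(encodeOnce("abc")));
    }
}
```

A caller who only checks isError() sees no difference between the first two cases; a caller who checks isMalformed() versus isUnmappable(), or the reported length, is the one who would notice a change in behaviour.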

-Ulf