fastpath for StringCoding.java
Ulf Zibis
Ulf.Zibis at gmx.de
Sat Mar 21 14:57:56 UTC 2009
Sherman,
sorry, I hadn't realized that you filtered for the mappable code points
in this benchmark.
-Ulf
On 21.03.2009 08:03, Xueming Shen wrote:
>
> Ulf,
>
> The charsets being benchmarked here are all single-byte charsets built
> on top of SingleByte.De/Encoder (excluding ASCII and 8859-1), which
> all share the same lookup mechanism: a b2c[] table for decoding and a
> c2bIndex+c2b table for encoding. For these charsets, which code points
> are used to benchmark the performance is actually irrelevant as long as
> they are "mappable"; the benchmark collects only "mappable" code points
> (canEncode() returns true) as the test data.
> The rationale behind this is that the "most likely" real-life use
> scenario of decoding/encoding is one where the "characters" in and out
> are all "mappable" characters, which is where the performance really
> matters.
>
>
> Ulf Zibis wrote:
>> On 20.03.2009 00:37, Xueming Shen wrote:
>>> Webrev has been updated to address your comments. I'm building the
>>> binaries and will run the benchmark again
>>> (up to 2048 this time) tonight.
>>
>> As I see it, you take a seed from Random to create the test patterns.
>> This approach would be good for testing the stability of the coders, but
>> for a benchmark it wouldn't take into account the differing probability
>> distribution of the code points in the particular charsets.
>> For example, see TIS-620. The code points from U+0E01 to U+0E5B would
>> probably be used more often than the Latin chars and the unmappables.
>> I accommodate this with different coders. See:
>> https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/trunk/src/sun/nio/cs/
>>
>>
>> -Ulf
>>
>>
>
>
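
For readers following the thread, the test-data selection described in
Sherman's reply (keep only code points for which canEncode() returns true,
then draw seeded random samples from them) might look roughly like the
sketch below. This is an illustration only, not the benchmark from the
webrev: the class name MappableSample, the method names, and the default
charset are assumptions made here, and ISO-8859-1 is used only because it
is guaranteed to be present on every Java platform, even though the
benchmark itself excluded it.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Random;

// Hypothetical helper, not part of the JDK or of the webrev under review.
public class MappableSample {

    // Collect every BMP char that the charset's encoder reports as encodable.
    // Lone surrogates are rejected by canEncode(char), so they are skipped.
    static char[] mappableChars(Charset cs) {
        CharsetEncoder enc = cs.newEncoder();
        StringBuilder sb = new StringBuilder();
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            if (enc.canEncode((char) c)) {
                sb.append((char) c);
            }
        }
        return sb.toString().toCharArray();
    }

    // Build a test string of the given length by drawing uniformly from the
    // mappable chars. A fixed seed keeps the pattern reproducible.
    // Assumes mappable is non-empty.
    static String randomInput(char[] mappable, int length, long seed) {
        Random rnd = new Random(seed);
        char[] out = new char[length];
        for (int i = 0; i < length; i++) {
            out[i] = mappable[rnd.nextInt(mappable.length)];
        }
        return new String(out);
    }

    public static void main(String[] args) {
        // Charset name taken from the command line; the default is an assumption.
        Charset cs = Charset.forName(args.length > 0 ? args[0] : "ISO-8859-1");
        char[] mappable = mappableChars(cs);
        String input = randomInput(mappable, 2048, 42L);
        byte[] encoded = input.getBytes(cs);
        System.out.println(cs.name() + ": " + mappable.length
                + " mappable chars, " + encoded.length + " bytes encoded");
    }
}
```

Note that the uniform draw in randomInput is exactly what Ulf's earlier
message questions: it treats every mappable code point as equally likely,
whereas text in a charset such as TIS-620 would weight some ranges more
heavily than others.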