fastpath for StringCoding.java

Ulf Zibis Ulf.Zibis at gmx.de
Sat Mar 21 14:57:56 UTC 2009


Sherman,

sorry, I hadn't noticed that you filtered for the mappable code points in 
this benchmark.
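
For the record, filtering test data down to mappable characters could look 
roughly like the sketch below. This is not the code from the webrev; the 
class and method names (MappableSample, mappableChars, randomTestData) are 
hypothetical, and it only assumes the public CharsetEncoder.canEncode(char) 
check mentioned in the quoted mail.

```java
import java.nio.charset.Charset;
import java.nio.charset.CharsetEncoder;
import java.util.Random;

// Hypothetical helper, not part of the JDK or the webrev.
public class MappableSample {

    // Collects every BMP char for which the charset's encoder reports
    // canEncode() == true, i.e. the "mappable" characters.
    static char[] mappableChars(Charset cs) {
        CharsetEncoder enc = cs.newEncoder();
        StringBuilder sb = new StringBuilder();
        for (int c = Character.MIN_VALUE; c <= Character.MAX_VALUE; c++) {
            if (enc.canEncode((char) c)) {
                sb.append((char) c);
            }
        }
        return sb.toString().toCharArray();
    }

    // Draws len chars uniformly at random from the mappable set.
    // Assumes the charset has at least one mappable char.
    static char[] randomTestData(Charset cs, int len, long seed) {
        char[] pool = mappableChars(cs);
        Random rnd = new Random(seed);
        char[] out = new char[len];
        for (int i = 0; i < len; i++) {
            out[i] = pool[rnd.nextInt(pool.length)];
        }
        return out;
    }
}
```

Note that this draws uniformly from the mappable set, so it says nothing 
about how often each character occurs in real text.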

-Ulf


On 21.03.2009 08:03, Xueming Shen wrote:
>
> Ulf,
>
> The charsets being benchmarked here are all single-byte charsets built 
> on top of SingleByte.De/Encoder (excluding ASCII and 8859-1), which all 
> use exactly the same lookup mechanism: a b2c[] table for decoding and a 
> c2bIndex+c2b table for encoding. For these charsets, which code point(s) 
> are used to benchmark the performance is actually irrelevant, as long as 
> they are "mappable". The benchmark collects only those "mappable" code 
> points (canEncode() returns true) as its test data. The rationale behind 
> this is that in the "most likely" real-life use scenario of 
> decoding/encoding, the characters going in and out are all "mappable" 
> characters, and that is where the performance really matters.
>
>
> Ulf Zibis wrote:
>> On 20.03.2009 00:37, Xueming Shen wrote:
>>> Webrev has been updated to address your comments. I'm building the 
>>> binaries and will run the benchmark again
>>> (up to 2048 this time) tonight.
>>
>> As I see, you take a seed from Random to create the test patterns. 
>> This approach would be good for testing the stability of the coders, but 
>> for a benchmark it doesn't take into account the differing probability 
>> distribution of the code points in the particular charsets.
>> For example, see TIS-620: the code points from U+0e01 to U+0e5b would 
>> probably be used more often than the Latin chars and the unmappables.
>> I accommodate this with different coders. See:
>> https://java-nio-charset-enhanced.dev.java.net/source/browse/java-nio-charset-enhanced/trunk/src/sun/nio/cs/ 
>>
>>
>> -Ulf
>>
>>
>
>



