JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

Thu Apr 10 18:38:53 UTC 2014

On Apr 10 2014, at 11:08 , Chris Hegarty <chris.hegarty at oracle.com> wrote:

> 
>> On 10 Apr 2014, at 18:40, Mike Duigou <mike.duigou at oracle.com> wrote:
>> 
>> 
>>> On Apr 10 2014, at 03:21 , Chris Hegarty <chris.hegarty at oracle.com> wrote:
>>> 
>>>> On 10 Apr 2014, at 11:03, Ulf Zibis <Ulf.Zibis at CoSoCo.de> wrote:
>>>> 
>>>> Hi Chris,
>>>> 
>>>> On 10.04.2014 11:04, Chris Hegarty wrote:
>>>>> Trivially, you could ( but do not have to ) use java.nio.charset.StandardCharsets.ISO_8859_1 to avoid the cost of the String to Charset lookup.
>>>> 
>>>> In earlier tests, Sherman and I found that the cost of initializing a new charset object is higher than looking up an existing object in the cache.
>>>> And it's even better to use the same String instance for the lookup that was used to cache the charset.
>>> 
>>> Interesting… thanks for letting me know.  Presumably, there is an assumption that StandardCharsets is not initialized elsewhere, by another dependency.
>> 
>> Generally it's safe to assume that StandardCharsets will already be initialized. If it isn't initialized we should consider it an amortized cost.
> 
> In which case, why would the string version be more performant than the version that already takes the Charset? Doesn't the string version need to do a lookup?

There is a cache in StringCoding that is used only by byte[] getBytes(String charsetName), not by byte[] getBytes(Charset charset). The rationale in StringCoding::decode(Charset cs, byte[] ba, int off, int len) may need to be revisited, as it is certainly surprising that passing the charset name as a string constant is faster than passing the Charset constant.
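
For anyone who wants to compare the two getBytes overloads locally, here is a minimal sketch. The class name GetBytesOverloads, the helper names, and the sample string are invented for illustration and are not part of the patch under review. The timing loop is deliberately crude: it is not a proper benchmark, and the numbers will vary with the JDK version and JIT warm-up, so treat them as a rough indication only.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

public class GetBytesOverloads {
    // Overload taking a charset name (the path discussed above as using a cache).
    static byte[] byName(String s) {
        try {
            return s.getBytes("ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // ISO-8859-1 is one of the charsets every Java platform must support.
            throw new AssertionError("ISO-8859-1 must be available", e);
        }
    }

    // Overload taking the Charset constant directly.
    static byte[] byCharset(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String sample = "QmFzZTY0LWxpa2UgaW5wdXQgbGluZQ=="; // hypothetical input
        // Both overloads must produce identical bytes.
        System.out.println("same bytes: "
                + Arrays.equals(byName(sample), byCharset(sample)));

        int iterations = 1_000_000;
        long sink = 0; // accumulated so the JIT cannot discard the calls

        // Warm-up pass so both paths are compiled before timing.
        for (int i = 0; i < iterations; i++) {
            sink += byName(sample).length + byCharset(sample).length;
        }

        long t0 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += byName(sample).length;
        }
        long t1 = System.nanoTime();
        for (int i = 0; i < iterations; i++) {
            sink += byCharset(sample).length;
        }
        long t2 = System.nanoTime();

        System.out.println("by name:    " + (t1 - t0) / 1_000_000 + " ms");
        System.out.println("by Charset: " + (t2 - t1) / 1_000_000 + " ms");
        System.out.println("(sink " + sink + ")");
    }
}
```

A JMH harness would give more trustworthy numbers; the sketch only shows which two call paths are being compared.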

Mike