JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)

Xueming Shen xueming.shen at oracle.com
Thu Apr 10 18:50:19 UTC 2014


On 04/10/2014 11:38 AM, Mike Duigou wrote:
> On Apr 10 2014, at 11:08 , Chris Hegarty<chris.hegarty at oracle.com>  wrote:
>
>>> On 10 Apr 2014, at 18:40, Mike Duigou<mike.duigou at oracle.com>  wrote:
>>>
>>>
>>>> On Apr 10 2014, at 03:21 , Chris Hegarty<chris.hegarty at oracle.com>  wrote:
>>>>
>>>>> On 10 Apr 2014, at 11:03, Ulf Zibis<Ulf.Zibis at CoSoCo.de>  wrote:
>>>>>
>>>>> Hi Chris,
>>>>>
>>>>> Am 10.04.2014 11:04, schrieb Chris Hegarty:
>>>>>> Trivially, you could (but do not have to) use java.nio.charset.StandardCharsets.ISO_8859_1 to avoid the cost of the String to Charset lookup.
>>>>> In earlier tests Sherman and I found that the cost of initializing a new charset object is higher than the cost of looking up an existing object in the cache.
>>>>> And it's even better to use the same String instance for the lookup which was used to cache the charset.
>>>> Interesting… thanks for letting me know.  Presumably, the assumption is that StandardCharsets is not initialized elsewhere, by another dependency.
>>> Generally it's safe to assume that StandardCharsets will already be initialized. If it isn't initialized we should consider it an amortized cost.
>> In which case why would the string version be more performant than the version that already takes the Charset? Doesn't the string version need to do a lookup?
> There is a cache in StringCoding that is only used in the byte[] getBytes(String charsetName) case but not in the byte[] getBytes(Charset charset) case. The rationale in StringCoding::decode(Charset cs, byte[] ba, int off, int len) may need to be revisited, as it is certainly surprising that using the string constant charset name is faster than using the Charset constant.

It is surprising :-) In theory you can't cache the de/encoder of a charset from
the external world, as the same charset might return a different de/encoder next
time. So it was decided back then not to cache the de/encoder for a charset
passed in by the caller. It might be reasonable to cache those from
StandardCharsets though.

-Sherman
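
For reference, a small sketch of the two call forms compared in this
thread. Latin1Bytes, byName and byCharset are hypothetical names used only
for illustration; the sketch shows that both forms produce the same bytes
and makes no claim about their relative speed.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical helper class, for illustration only.
class Latin1Bytes {
    // Name-based overload: resolves the charset from its name and
    // declares a checked exception for unknown names.
    static byte[] byName(String s) {
        try {
            return s.getBytes("ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // ISO-8859-1 must be supported by every Java platform.
            throw new AssertionError(e);
        }
    }

    // Charset-based overload: no name lookup, no checked exception.
    static byte[] byCharset(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String s = "decodeBuffer \u00e9\u00ff";
        System.out.println(Arrays.equals(byName(s), byCharset(s)));
    }
}
```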




> Mike



