JDK 9 RFR of 8039474: sun.misc.CharacterDecoder.decodeBuffer should use getBytes(iso8859-1)
Mike Duigou
mike.duigou at oracle.com
Thu Apr 10 18:38:53 UTC 2014
On Apr 10 2014, at 11:08 , Chris Hegarty <chris.hegarty at oracle.com> wrote:
>
>> On 10 Apr 2014, at 18:40, Mike Duigou <mike.duigou at oracle.com> wrote:
>>
>>
>>> On Apr 10 2014, at 03:21 , Chris Hegarty <chris.hegarty at oracle.com> wrote:
>>>
>>>> On 10 Apr 2014, at 11:03, Ulf Zibis <Ulf.Zibis at CoSoCo.de> wrote:
>>>>
>>>> Hi Chris,
>>>>
>>>> On 10.04.2014 11:04, Chris Hegarty wrote:
>>>>> Trivially, you could (but do not have to) use java.nio.charset.StandardCharsets.ISO_8859_1 to avoid the cost of the String to Charset lookup.
>>>>
>>>> In earlier tests, Sherman and I found that the cost of initializing a new Charset object is higher than the lookup of an existing object in the cache.
>>>> And it's even better to use the same String instance for the lookup which was used to cache the charset.
>>>
>>> Interesting… thanks for letting me know. Presumably, the assumption is that StandardCharsets is not initialized elsewhere, by another dependency.
>>
>> Generally it's safe to assume that StandardCharsets will already be initialized. If it isn't initialized we should consider it an amortized cost.
>
> In which case, why would the String version be more performant than the version that already takes the Charset? Doesn't the String version need to do a lookup?
There is a cache in StringCoding that is used only in the byte[] getBytes(String charsetName) case, not in the byte[] getBytes(Charset charset) case. The rationale in StringCoding::decode(Charset cs, byte[] ba, int off, int len) may need to be revisited, as it is certainly surprising that using the string constant charset name is faster than using the Charset constant.
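For reference, a minimal sketch of the two call paths being compared. The class and method names (GetBytesVariants, viaName, viaConstant) are made up for illustration and are not part of the webrev; the sketch only shows that both overloads produce the same bytes and makes no claim about their relative speed.

```java
import java.io.UnsupportedEncodingException;
import java.nio.charset.StandardCharsets;
import java.util.Arrays;

// Hypothetical illustration class; not taken from the patch under review.
public class GetBytesVariants {

    // Path 1: charset identified by name. This overload resolves the name
    // at call time and declares a checked UnsupportedEncodingException.
    static byte[] viaName(String s) {
        try {
            return s.getBytes("ISO-8859-1");
        } catch (UnsupportedEncodingException e) {
            // ISO-8859-1 is required on every Java platform, so this
            // branch should be unreachable in practice.
            throw new AssertionError("ISO-8859-1 must be supported", e);
        }
    }

    // Path 2: charset passed as an object. No name lookup and no
    // checked exception.
    static byte[] viaConstant(String s) {
        return s.getBytes(StandardCharsets.ISO_8859_1);
    }

    public static void main(String[] args) {
        String sample = "caf\u00e9"; // every char fits in one ISO-8859-1 byte
        byte[] a = viaName(sample);
        byte[] b = viaConstant(sample);
        System.out.println(Arrays.equals(a, b));
    }
}
```

Which of the two is faster depends on the JDK's internal caching, as discussed above, so any claim either way would need to be measured on the build in question.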
Mike