Codereview request: CR 7040220 java/char_encodin Optimize UTF-8 charset for String.getBytes()/toCharArray()

Fri Apr 29 00:11:41 UTC 2011

On 04-28-2011 3:46 PM, Ulf Zibis wrote:
>
>> It's safe to say that java.nio.cs.StandardCharset is not for
>> String.getBytes()/toCharArray() only, so the fact that the "cs"
>> variant of String.getBytes()/toCharArray() is "slower" than its
>> "csn" variant is arguably not very strong supporting material for
>> that discussion :-)
> So what prevents us from applying the same caching optimization in
> ZipCoder and similar classes?
>

What do you want to cache in ZipCoder? Each ZipFile object holds one
ZipCoder object and uses it for its coding needs throughout its
lifetime. And ZipCoder does "remember" its de/encoder.

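To illustrate what "remember" means here, a minimal sketch follows. It is
not the actual ZipCoder source: the class name NameCoder and its methods
are hypothetical, and it only shows the general pattern of one coder
object per owner that creates its CharsetDecoder/CharsetEncoder lazily
and then reuses them for every call.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.Charset;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;

// Hypothetical stand-in for a per-ZipFile coder; NOT the JDK's ZipCoder.
final class NameCoder {
    private final Charset cs;
    private CharsetDecoder dec; // created on first use, then reused
    private CharsetEncoder enc; // created on first use, then reused

    NameCoder(Charset cs) {
        this.cs = cs;
    }

    private CharsetDecoder decoder() {
        if (dec == null) {
            dec = cs.newDecoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT);
        }
        return dec;
    }

    private CharsetEncoder encoder() {
        if (enc == null) {
            enc = cs.newEncoder()
                    .onMalformedInput(CodingErrorAction.REPORT)
                    .onUnmappableCharacter(CodingErrorAction.REPORT);
        }
        return enc;
    }

    // Decodes the first 'length' bytes of 'ba' with the remembered decoder.
    String decode(byte[] ba, int length) {
        try {
            // CharsetDecoder.decode(ByteBuffer) resets the decoder itself,
            // so the same instance can be reused across calls.
            return decoder().decode(ByteBuffer.wrap(ba, 0, length)).toString();
        } catch (CharacterCodingException x) {
            throw new IllegalArgumentException("MALFORMED", x);
        }
    }

    // Encodes 's' with the remembered encoder.
    byte[] encode(String s) {
        try {
            ByteBuffer bb = encoder().encode(CharBuffer.wrap(s));
            byte[] out = new byte[bb.remaining()];
            bb.get(out);
            return out;
        } catch (CharacterCodingException x) {
            throw new IllegalArgumentException("UNMAPPABLE", x);
        }
    }
}
```

Since the owner keeps the same coder for its whole lifetime, there is no
per-call charset lookup left to cache. Note that this sketch is not
thread-safe; a real implementation would need to guard the shared
de/encoder.
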
>
> - ZipCoder.isutf8 is unreadable. Better: isUTF8
>

Updated to isUTF8 as suggested.

> - ArrayDecoder.decode(ba, 0, length, ca) could throw
> MalformedInput/UnmappableCharacterException instead of returning -1.
> Benefits:
> -- avoids translating -1 to
> IllegalArgumentException("MALFORMED") in ZipCoder etc.
> -- more precise exception
>

That is something we might consider for jdk8 or a jdk7 update. At this
stage, though, I don't want to change the ArrayDecoder/Encoder
interface. Otherwise I would also have to touch the SingleByte charsets
and the StringCoding class: the SingleByte charsets currently
handle/assume only the "replace" action, and StringCoding does not
expect MalformedInputException or UnmappableCharacterException.
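
For readers following the thread, the two styles under discussion can be
sketched roughly as below. This is only an illustration: the interface
SketchArrayDecoder, the ASCII-only decoder, and the DecodeStyles helper
are all hypothetical names modelled on the decode(ba, 0, length, ca)
call shape quoted above, not the real sun.nio.cs classes.

```java
import java.nio.charset.MalformedInputException;

// Hypothetical interface modelled on the quoted call shape;
// NOT the JDK-internal ArrayDecoder.
interface SketchArrayDecoder {
    // Returns the number of chars written to dst, or -1 on malformed input.
    int decode(byte[] src, int off, int len, char[] dst);
}

// Hypothetical decoder that accepts 7-bit ASCII only.
final class AsciiSketchDecoder implements SketchArrayDecoder {
    public int decode(byte[] src, int off, int len, char[] dst) {
        for (int i = 0; i < len; i++) {
            byte b = src[off + i];
            if (b < 0) {
                return -1; // high bit set: not ASCII, signal with the sentinel
            }
            dst[i] = (char) b;
        }
        return len;
    }
}

final class DecodeStyles {
    // Sentinel style: the caller has to translate -1 into an exception,
    // and can only report a generic message.
    static String viaSentinel(SketchArrayDecoder d, byte[] ba, int length) {
        char[] ca = new char[length];
        int n = d.decode(ba, 0, length, ca);
        if (n == -1) {
            throw new IllegalArgumentException("MALFORMED");
        }
        return new String(ca, 0, n);
    }

    // Exception style: the decoding code itself throws a typed, checked
    // exception that records the length of the offending input.
    static String viaException(byte[] ba, int length)
            throws MalformedInputException {
        char[] ca = new char[length];
        for (int i = 0; i < length; i++) {
            if (ba[i] < 0) {
                throw new MalformedInputException(1);
            }
            ca[i] = (char) ba[i];
        }
        return new String(ca, 0, length);
    }
}
```

The second style changes the method's contract (a checked exception in
the signature), which is why every existing implementer and caller of
the interface would have to be touched.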

-Sherman

>
> -Ulf
>