Codereview request: CR 7040220 java/char_encodin Optimize UTF-8 charset for String.getBytes()/toCharArray()
Alan Bateman
Alan.Bateman at oracle.com
Thu Apr 28 11:01:33 UTC 2011
Xueming Shen wrote:
> Hi
>
> This is motivated by Neil's request to optimize common-case UTF8 path
> for native ZipFile.getEntry calls [1].
> As I said in my reply [2], I believe a better approach might
> be to "patch" the UTF8 charset directly to
> implement the sun.nio.cs.ArrayDecoder/Encoder interfaces to speed up the
> coding operation for array-based
> encoding/decoding under certain circumstances, as we did for all single
> byte charsets in #6636323 [3]. I
> have an old blog [4] that has some data for this optimization.
>
> The original plan was to do the same thing for our new UTF8 [5] as
> well in JDK7, but then (excuse, excuse)
> I was just too busy to come back to this topic till 2 days ago. After
> two days of small tweaking here and there
> and testing those possible corner cases I can think of, I'm happy with
> the result and think it might be
> worth sending it out for a codereview for JDK7, knowing we only have
> a couple of days left.
I skimmed through the webrev and I agree this is a better approach. I
will try to do a detailed review before Monday. It would be great if
others on the list could jump in and help too as we are running out of time.
Neil - I don't know if you've had a chance to look at Sherman's changes
but I think it's better than checking if mUTF-8 can be used. If you
agree then would you have time to run the tests that alerted you to
this performance regression? There's a patch file in the webrev.
-Alan.
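
For readers who have not seen the array-based fast path mentioned in the quoted text, here is a minimal sketch of the general idea. It is not the code in the webrev and does not implement the internal sun.nio.cs.ArrayDecoder interface; the class name Utf8ArrayDecodeSketch and the shape of its decode method are hypothetical, chosen only for illustration. The sketch copies leading ASCII bytes straight into a char array and falls back to the regular UTF-8 decoder for whatever remains.

```java
import java.nio.charset.StandardCharsets;

// Hypothetical illustration only; not the JDK's UTF-8 implementation.
final class Utf8ArrayDecodeSketch {

    // Decodes src[off, off + len) into dst and returns the number of chars written.
    // Assumes dst.length >= len, which is enough because UTF-8 never yields
    // more chars than it has input bytes.
    static int decode(byte[] src, int off, int len, char[] dst) {
        int end = off + len;
        int dp = 0;

        // Fast path: a non-negative byte is ASCII and maps one-to-one to a char.
        while (off < end && src[off] >= 0) {
            dst[dp++] = (char) src[off++];
        }

        // Slow path: hand the remainder to the general-purpose decoder.
        // Splitting here is safe because ASCII bytes are always whole characters.
        if (off < end) {
            String rest = new String(src, off, end - off, StandardCharsets.UTF_8);
            rest.getChars(0, rest.length(), dst, dp);
            dp += rest.length();
        }
        return dp;
    }

    public static void main(String[] args) {
        byte[] in = "zip/entry-name.txt".getBytes(StandardCharsets.UTF_8);
        char[] out = new char[in.length];
        int n = decode(in, 0, in.length, out);
        System.out.println(new String(out, 0, n));
    }
}
```

Working directly on arrays avoids wrapping the input and output in ByteBuffer/CharBuffer objects for each call, which is presumably where the saving comes from for short, mostly-ASCII inputs such as zip entry names.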