RFR [10]: 8186517: sun.nio.cs.StandardCharsets$Aliases and ClassMap can be lazily loaded

Xueming Shen xueming.shen at oracle.com
Mon Aug 21 19:45:44 UTC 2017


On 8/21/17, 12:04 PM, Martin Buchholz wrote:
> OK, but ...
>
> I'd like to see further improvements here later, like switching to 
> upper case.

What's the benefit of switching to upper case? I would assume the original
reasoning was that people tend to use lower-case charset names in their
code, in which case (if that assumption is correct) "toLower()" has
nothing to do.

The alias and class mappings are generated at build time, so it
does not matter whether they are lower case or upper case.
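
To illustrate the "needs to do nothing" point, here is a minimal sketch of a
lower-casing helper that returns its argument unchanged when no ASCII
upper-case letter is found. The class name CharsetNameSketch is made up for
this example, and the code is not copied from sun.nio.cs.StandardCharsets; it
only shows the kind of fast path being discussed.

```java
final class CharsetNameSketch {
    // Hypothetical helper (not the JDK source): lower-cases ASCII letters only.
    // If the input has no 'A'..'Z', the same String instance is returned,
    // so an already lower-case name costs one scan and no allocation.
    static String toLower(String s) {
        int n = s.length();
        int firstUpper = -1;
        for (int i = 0; i < n; i++) {
            char c = s.charAt(i);
            if (c >= 'A' && c <= 'Z') {
                firstUpper = i;
                break;
            }
        }
        if (firstUpper < 0) {
            return s; // nothing to do
        }
        StringBuilder sb = new StringBuilder(n);
        sb.append(s, 0, firstUpper); // copy the prefix that was already lower case
        for (int i = firstUpper; i < n; i++) {
            char c = s.charAt(i);
            sb.append((c >= 'A' && c <= 'Z') ? (char) (c + ('a' - 'A')) : c);
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        String lower = "utf-8";
        System.out.println(toLower(lower) == lower); // same instance, no copy made
        System.out.println(toLower("UTF-8"));
    }
}
```

With upper-case keys the same trick would favor callers who pass "UTF-8"
instead, so which direction wins depends on what names callers actually use.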

>
> I just realized we have
> java/nio/charset/StandardCharsets.java
> sun/nio/cs/StandardCharsets.java
>
> and they both have a UTF_8 field !
>
>
>
> On Mon, Aug 21, 2017 at 11:53 AM, Claes Redestad
> <claes.redestad at oracle.com> wrote:
>
>
>     On 2017-08-21 20:05, Martin Buchholz wrote:
>
>         I agree we should optimize for common charset names, in part
>         to help the world move to UTF-8.
>
>
>     Agreed.
>
>
>         It's *weird* to canonicalize to lower case, when the canonical
>         charset names are all uppercase ("UTF-8" instead of "utf-8").
>
>
>     A pre-existing weirdness, and it goes deep enough that I haven't
>     dared changing it.
>
>
>         ---
>            62     public static final String UTF_8 = "UTF-8";
>         Is this still used?
>
>         Maybe the very first thing lookup() should do is check
>         charsetName == UTF_8
>
>
>     Subsequent lookups are very likely to hit the two-element cache in
>     Charset, so I've not seen this add up.
>
>
>         ---
>
>         Is switching from char[] to StringBuilder really an
>         improvement?  Charset names are all short, so the cost of
>         copying the char[] to a byte[] is negligible.
>
>
>     This allows us to not load and touch the code to deflate a char[]
>     to a byte[] (StringUTF16), so a tiny, tiny startup win.
>     Throughput-wise it's likely no different.
>
>     /Claes
>
>
>
>         On Mon, Aug 21, 2017 at 6:46 AM, Claes Redestad
>         <claes.redestad at oracle.com> wrote:
>
>             Hi,
>
>             the Aliases and Classes inner classes in StandardCharsets can be
>             lazily-loaded by restructuring how we check for the three
>             default-loaded charsets. This removes some classloading and
>             work from happening during critical phases of the VM startup,
>             as well as a net gain on any systems that default to any of the
>             three standard charsets (UTF-8, Latin-1, ASCII).
>
>             Webrev: http://cr.openjdk.java.net/~redestad/8186517/jdk.00/
>             Bug: https://bugs.openjdk.java.net/browse/JDK-8186517
>
>             I'm not sure if the pre-existing optimization to allow
>             StandardCharsets.charsets() unsynchronized access to internals
>             is really necessary (or even 100% correct), but by ensuring we
>             retrieve the Aliases and Classes instances in a synchronized block
>             we should be no worse off semantically here.
>
>             /Claes
>
>
>
>
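
For readers skimming the thread, the lazy-loading idea in the original
proposal can be sketched roughly as follows. This is not the webrev code:
LazyTablesSketch, canonicalName, the tableLoads counter and the tiny alias
table are all invented for illustration. It uses the ordinary holder-class
idiom, where the JVM initializes the nested class only on first access, and
checks the three standard charsets before touching the table.

```java
import java.util.HashMap;
import java.util.Map;

final class LazyTablesSketch {
    // Counts how many times the alias table has been built (demo only).
    static int tableLoads = 0;

    // Holder class: initialized by the JVM on first use of MAP, not before.
    private static final class Aliases {
        static final Map<String, String> MAP = new HashMap<>();
        static {
            tableLoads++;
            // Stand-in for the build-time generated table; keys are lower case.
            MAP.put("utf8", "UTF-8");
            MAP.put("latin1", "ISO-8859-1");
            MAP.put("ascii", "US-ASCII");
            MAP.put("cp1252", "windows-1252");
        }
    }

    // Hypothetical lookup: resolves the three standard charsets without
    // touching Aliases, and falls back to the table for everything else.
    static String canonicalName(String name) {
        switch (name) {
            case "UTF-8":
            case "utf-8":
                return "UTF-8";
            case "ISO-8859-1":
            case "iso-8859-1":
                return "ISO-8859-1";
            case "US-ASCII":
            case "us-ascii":
                return "US-ASCII";
            default:
                return Aliases.MAP.get(name); // null when the name is unknown
        }
    }

    public static void main(String[] args) {
        System.out.println(canonicalName("utf-8") + ", table loads: " + tableLoads);
        System.out.println(canonicalName("cp1252") + ", table loads: " + tableLoads);
    }
}
```

A program that only ever asks for one of the three standard names never pays
for building the table, which is the startup saving the proposal is after.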
