RFR: JDK-8021560,(str) String constructors that take ByteBuffer

Xueming Shen xueming.shen at oracle.com
Fri Feb 16 23:19:33 UTC 2018


On 02/16/2018 03:04 PM, Richard Warburton wrote:
> Hi gents,
>
>         public String(ByteBuffer bytes, Charset cs);
>         public String(ByteBuffer bytes, String csname);
>
>
>     I think these constructors make good sense. They avoid an extra copy to an intermediate byte[].
>
>     One issue (also mentioned by Stephen Colebourne) is whether we need the csname overload. Arguably it's not needed if we have the Charset overload. And the csname overload throws UnsupportedEncodingException, which is checked. But the csname overload is apparently faster, since the decoder can be cached, and it's unclear when this can be remedied for the Charset case....
>
>     I could go either way on this one.
>
>
> Yes - the original motivation for the csname overload was the ability to cache the decoder and also for consistency with other String encoding methods that normally have both the String name and also the Charset. If memory serves me correctly it was concluded that it was possible to cache the decoder iff the decoder could be trusted to not hold a reference to the formerly char[], now byte[], that sits within the String. Otherwise a rogue decoder could create a mutable string by holding a reference to its internal state for some malicious purpose. JDK charsets aren't rogue so it would be possible to check whether a charset was a JDK one and perform the caching, should that be a desirable + worthwhile optimization.
>
>
Yes, we don't cache external cs now (and yes, we still have to make some defensive
copy her and there :-))

The StringCoder.java included in the v2 webrev is trying to cache the de/encoder
from jdk's own charset. So if thing goes well as expected (rounds of jmh measurement)
there should be no performance difference between csn and cs. In that case, should
we drop the public String(ByteBuffer, String) constructor, as Stephen suggested?
Though the caller then needs to get the charset somewhere, Charset.forName()
for example, themselves.

http://cr.openjdk.java.net/~sherman/8021560/StringCoder.java <http://cr.openjdk.java.net/%7Esherman/8021560/StringCoder.java>

-sherman


More information about the core-libs-dev mailing list