Lower overhead String encoding/decoding

Richard Warburton richard.warburton at gmail.com
Mon Nov 24 16:45:21 UTC 2014


Hi all,

I'm sure everyone is very busy, but is there any chance that someone take a
look at the latest iteration of this patch?

Thanks for taking the time to look at this - most appreciated. I've pushed
>>>> the latest iteration to http://cr.openjdk.java.net/~
>>>> rwarburton/string-patch-webrev-8/ <http://cr.openjdk.java.net/%
>>>> 7Erwarburton/string-patch-webrev-8/>.
>>>>
>>>>  I think this is looking good.
>>>
>>
> Thanks - I've pushed a new iteration to http://cr.openjdk.java.net/~
> rwarburton/string-patch-webrev-9.
>
> For the constructor then the words "decoding the specified byte buffer",
>>> it might be a bit clearer as "decoding the remaining bytes in the ...".
>>>
>>> For getBytes(ByteBuffer, Charset) then the position is advanced by the
>>> bytes written, no need to mention the number of chars read here.
>>>
>>> In the constructor then you make it clear that malformed/unmappable
>>> sequences use the default replacement. This is important to state in the
>>> getBytes methods too because the encoding can fail.
>>
>>
> I've made all these changes.
>
> Hi Richard, hi all,
>> Two comments,
>> You replace the nullcheck in getBytes() by a requireNonNull and at the
>> same time, you don"t use requireNonNull in String(ByteBuffer,Charset),
>> no very logical isn't it ?
>>
>
> Thanks for spotting this Remi - I've fixed this issue in my latest
> iteration.
>
> I think you need a supplementary constructor that takes a ByteBuffer and a
>> charset name as a String,
>> i may be wrong, but it seems that every constructor of String that takes
>> a Charset has a dual constructor that takes a Charset as a String.
>> As far as I remember, a constructor that takes a Charset as a String may
>> be faster because you can share the character decoder
>> instead of creating a new one.
>
>
> A good observation. I've added variants which take a String for the
> charset name as well the charset variants.
>
> Having said that - wouldn't it also be a good idea to replicate the
> caching on the charset versions as well as the charset name? I don't see
> any obvious reason why this isn't possible but perhaps there's something
> I'm missing here. Probably cleaner as a separate patch either way.
>

regards,

  Richard Warburton

  http://insightfullogic.com
  @RichardWarburto <http://twitter.com/richardwarburto>



More information about the core-libs-dev mailing list