WebSocket client API

Joakim Erdfelt joakim.erdfelt at gmail.com
Tue Oct 20 12:50:03 UTC 2015


You know that CharBuffer doesn't actually do UTF-8, right?
It's just a ByteBuffer split into equal 2-byte segments.
CharBuffer is a way to obtain a char (the 2-byte UTF-16 code unit, not the
character) from the ByteBuffer or String you created it from.
CharBuffer is functionally no different from ShortBuffer.
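
To illustrate the point, here is a minimal sketch, assuming the JDK's standard
CharsetEncoder. The class and method names (CharBufferDemo, encodeStrict,
encodeLenient, isEncodable) are hypothetical and not part of any proposed API.
It shows that a CharBuffer can hold a lone surrogate, which has no UTF-8 form:
a strict encoder rejects it, while String.getBytes silently substitutes a byte.

```java
import java.nio.ByteBuffer;
import java.nio.CharBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetEncoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical demo class, for illustration only.
final class CharBufferDemo {

    // Strict path: report bad input as an exception instead of replacing it.
    static ByteBuffer encodeStrict(CharBuffer chars) throws CharacterCodingException {
        CharsetEncoder encoder = StandardCharsets.UTF_8.newEncoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return encoder.encode(chars);
    }

    // Lenient path: String.getBytes never throws; it substitutes the
    // charset's replacement bytes for anything it cannot encode.
    static byte[] encodeLenient(String text) {
        return text.getBytes(StandardCharsets.UTF_8);
    }

    // True if the char sequence can be represented as UTF-8 without changes.
    static boolean isEncodable(CharSequence text) {
        try {
            encodeStrict(CharBuffer.wrap(text));
            return true;
        } catch (CharacterCodingException e) {
            return false;
        }
    }
}
```

With this sketch, isEncodable("a\uD800b") returns false, whereas
encodeLenient("a\uD800b") returns three bytes with the surrogate changed.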


On Tue, Oct 20, 2015 at 1:39 AM, Peter Levart <peter.levart at gmail.com>
wrote:

>
>
> On 10/18/2015 12:08 AM, Pavel Rappo wrote:
>
>> Hi Joakim,
>>
>> On 17 Oct 2015, at 22:42, Joakim Erdfelt <joakim.erdfelt at gmail.com> wrote:
>>>
>>> You are required, per the RFC 6455 spec, to validate that incoming and
>>> outgoing TEXT messages are valid UTF-8.
>>> (also Handshake and Close Reason Messages)
>>>
>>> http://tools.ietf.org/html/rfc6455#section-8.1
>>>
>>> Relying on the JVM built-in replacement character behavior for invalid
>>> UTF-8 sequences will cause many bugs.
>>> If you rely on the CharsetEncoder and CharBuffer you'll wind up with
>>> situations where you are changing the data.
>>>
>>> You need to rely on an implementation that does not use replacement
>>> characters and throws exceptions on bad Write,
>>> and on bad received TEXT messages you MUST close the connection with a
>>> 1007 error code.
>>>
>> The only thing I was trying to say is that in my opinion there's no extra
>> confidence in UTF-8 representability that CharSequence or even String
>> gives us compared to what CharBuffer does. On the other hand, compared to
>> any other implementation of CharSequence or String, CharBuffer is the most
>> charset-friendly thing we have: CharsetEncoder/CharsetDecoder speak in
>> CharBuffers.
>>
>> Sorry, but I don't believe I proposed relying on JDK built-in replacement
>> characters. Moreover, being able to tell the decoder/encoder to throw
>> exceptions (e.g. UnmappableCharacterException) on incorrect input was one
>> of the main reasons to use CharsetEncoder/Decoder, and not, say,
>> String.getBytes(StandardCharsets.UTF_8).
>>
>> Thanks.
>>
>>
> Hi,
>
> Just to clear things up... The onText(..., CharBuffer cb, ...) callback
> method receives a CharBuffer with content that is already UTF-8 decoded
> from the wire message bytes, right? If it were otherwise, it would not be
> right! So decoding is performed by the WebSocket implementation, not by the
> user, and can therefore be performed per the RFC 6455 spec. CharBuffer,
> CharSequence, String - those objects all represent characters, and their
> APIs have nothing to do with UTF-8 or any other encoding.
>
> Regards, Peter
>
>
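
The flow discussed in the quoted messages (the implementation decodes TEXT
payload bytes strictly and closes with 1007 on invalid UTF-8) could be
sketched roughly as follows. This is only an illustration under stated
assumptions: TextFrameDecoder, decodeStrict and closeCodeFor are hypothetical
names, the payload is assumed to be a complete, unfragmented message, and the
return value 0 is an arbitrary marker for "no close needed". Only the
CharsetDecoder calls are standard JDK API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.CharacterCodingException;
import java.nio.charset.CharsetDecoder;
import java.nio.charset.CodingErrorAction;
import java.nio.charset.StandardCharsets;

// Hypothetical helper, for illustration only.
final class TextFrameDecoder {

    // RFC 6455 status code for data inconsistent with the message type,
    // e.g. non-UTF-8 data within a text message.
    static final int CLOSE_INVALID_PAYLOAD = 1007;

    // Decodes a complete TEXT message payload, throwing on invalid UTF-8
    // rather than inserting U+FFFD replacement characters.
    static String decodeStrict(byte[] payload) throws CharacterCodingException {
        CharsetDecoder decoder = StandardCharsets.UTF_8.newDecoder()
                .onMalformedInput(CodingErrorAction.REPORT)
                .onUnmappableCharacter(CodingErrorAction.REPORT);
        return decoder.decode(ByteBuffer.wrap(payload)).toString();
    }

    // Returns 0 when the payload is valid UTF-8, otherwise the close code
    // the endpoint would send before dropping the connection.
    static int closeCodeFor(byte[] payload) {
        try {
            decodeStrict(payload);
            return 0;
        } catch (CharacterCodingException e) {
            return CLOSE_INVALID_PAYLOAD;
        }
    }
}
```

A real implementation would also have to handle fragmented messages, where a
multi-byte sequence may be split across frames; that requires the incremental
decode(ByteBuffer, CharBuffer, boolean) form rather than the one-shot call
used here.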