RFR: 8351339: WebSocket::sendBinary assume that user supplied buffers are BIG_ENDIAN [v7]
Volkan Yazici
vyazici at openjdk.org
Wed Mar 26 09:40:14 UTC 2025
On Wed, 26 Mar 2025 08:09:54 GMT, Michael McMahon <michaelm at openjdk.org> wrote:
>> should the Endian check be in the genesis of the processing i.e. the applyMask method ?
>>
>> static void applyMask(ByteBuffer src, ByteBuffer dst, int mask) {
>>
>> // endian check here then other checks are superfluous ?
>>
>> if (src.remaining() > dst.remaining()) {
>> throw new IllegalArgumentException(dump(src, dst));
>> }
>> new Masker().setMask(mask).applyMask(src, dst);
>> }
>>
>> Additionally, if the WebSocket spec (RFC 6455) expects that data is sent (received) in network byte order (Big Endian), then both src and dst should be ByteOrder.BIG_ENDIAN rather than just the same byte order ?
>
>> should the Endian check be in the genesis of the processing i.e. the applyMask method ?
>>
>> ```
>> static void applyMask(ByteBuffer src, ByteBuffer dst, int mask) {
>> ```
>>
>> // endian check here then other checks are superfluous ?
>>
>> ```
>> if (src.remaining() > dst.remaining()) {
>> throw new IllegalArgumentException(dump(src, dst));
>> }
>> new Masker().setMask(mask).applyMask(src, dst);
>> }
>> ```
>>
>> Additionally, if the WebSocket spec (RFC 6455) expects that data is sent (received) in network byte order (Big Endian), then both src and dst should be ByteOrder.BIG_ENDIAN rather than just the same byte order ?
>
> A stream of data bytes is just a stream of bytes. The big/little endian question only comes into it when you try to interpret bytes as 32 bit (or 64 bit) integers, and that's only an implementation artifact, supposedly to optimize performance. All this code relating to byte order would disappear if we removed this "optimized" code path.
>
> Afaict, the RFC refers to byte order itself only for framing data that is itself 16 bit or 32 bit. The payload length is encoded in network byte order but application data is just a stream of bytes.
I agree with @Michael-Mc-Mahon. I will try to address remarks by @liach and @msheppar:
> Because `maskBytes` is big-endian, if `dst` is little-endian (which is not the case at all right now because trusted callers are all using BE dst ByteBuffer), we should use `maskBytes[4 - offset]`, right?
> ...
> the endianness of maskBytes (currently BE) must match that of dst (all internal usages of dst is currently BE)
@liach, AFAICT, no. Mask is received as a `byte[4]`, we pass it to `Masker` as an `int` (read as BE, which is the correct order here to preserve the initial order in `byte[4]`), and its octets XOR'ed to the input octets independent of the endianness. See [the relevant section in the RFC](https://datatracker.ietf.org/doc/html/rfc6455#section-5.3):
Octet i of the transformed data ("transformed-octet-i") is the XOR of
octet i of the original data ("original-octet-i") with octet at index
i modulo 4 of the masking key ("masking-key-octet-j"):
j = i MOD 4
transformed-octet-i = original-octet-i XOR masking-key-octet-j
Consider that the mask is `0x0A0B0C0D` and the input buffer is `{0x1, 0x2, 0x3, 0x4}`. Independent of the endianness of the input and/or output buffer, the output buffer should be `{0x1 ^ 0xA, 0x2 ^ 0xB, 0x3 ^ 0xC, 0x4 ^ 0xD}`. I've added further tests for this in 41b40436adc5cd9abb46b0f2f9536734470d9507.
> endian check here then other checks are superfluous ?
> ...
> Additionally, if the WebSocket spec (RFC 6455) expects that data is sent (received) in network byte order (Big Endian), then both src and dst should be ByteOrder.BIG_ENDIAN rather than just the same byte order ?
@msheppar, endianness only matter in the context of vectorized passes. There we read `long`, XOR it with the mask matching the `src` order, and write `long`. The only constraint is input and output endiannesses must match. Consider the following example:
var maskLongBe = 0x0A0B0C0D0A0B0C0DL;
var maskLongLe = 0x0D0C0B0A0D0C0B0AL;
var srcBe = ByteBuffer.wrap(new byte[]{1, 2, 3, 4, 5, 6, 7, 8});
var srcLe = ByteBuffer.wrap(new byte[]{1, 2, 3, 4, 5, 6, 7, 8}).order(ByteOrder.LITTLE_ENDIAN);
var dstBe = ByteBuffer.allocate(8);
var dstLe = ByteBuffer.allocate(8).order(ByteOrder.LITTLE_ENDIAN);
dstBe.putLong(0, srcBe.getLong(0) ^ maskLongBe); dstBe.array();
// srcBe->dstBe: RIGHT: byte[8] { 11, 9, 15, 9, 15, 13, 11, 5 }
dstLe.putLong(0, srcBe.getLong(0) ^ maskLongBe); dstLe.array();
// srcBe->dstLe: WRONG: byte[8] { 5, 11, 13, 15, 9, 15, 9, 11 }
dstBe.putLong(0, srcLe.getLong(0) ^ maskLongLe); dstBe.array();
// srcLe->dstBe: WRONG: byte[8] { 5, 11, 13, 15, 9, 15, 9, 11 }
dstLe.putLong(0, srcLe.getLong(0) ^ maskLongLe); dstLe.array();
// srcLe->dstLe: RIGHT: byte[8] { 11, 9, 15, 9, 15, 13, 11, 5 }
-------------
PR Review Comment: https://git.openjdk.org/jdk/pull/24033#discussion_r2013728091
More information about the net-dev
mailing list