RFR 8251989: Hex encoder and decoder utility

Fri Aug 28 14:22:01 UTC 2020

Hi Raffaello,

Missed sending this yesterday, the changes are in the updated webrev.

On 8/23/20 10:42 AM, Raffaello Giulietti wrote:

...
> Hi Roger,
>
> here are my notes about the Decoder.
>
> (1) The only serious issue I could find is on L.548. It should read
> L.548   CharBuffer cb = CharBuffer.wrap(chars, index, index + length);
> as method wrap() takes a start and an end.
> (It's quite confusing and error prone to have both (start, length) and
> (start, end) conventions on ranges all over the OpenJDK :-( Nobody's
> fault, just annoying.)
Well spotted, fixed.
>
>
> The rest are minor notes.
>
> (2) While I'm pretty sure that a JIT compiler can remove the index range
> check on L.465 when len is replaced by bytes.length on L.463, I'm less
> sure that it can do it as currently coded, despite len and bytes.length
> having the same value. Readability wouldn't suffer and len could be 
> removed.
The JIT doesn't have trouble following those values.
>
>
> (3) The expressions preLen > origLen - sufLen on L.492 and sufLen >
> origLen - preLen on L.496 are equivalent. The latter is thus useless on
> L.496 because it is always false when execution reaches here.
true; updated to optimize the case of empty prefix and/or suffix.
>
>
> (4) The code starting at L.588 assumes that delimiter.length() > 0,
> which is ensured by the optimization in L.585-586.
> Without this assumption, however, neither L.509 nor L.512 are correct
> when delimiter.length() == 0.
> To make the code to work even when, for some reason, delimiter.length()
> == 0) the lines can be replaced by the slightly more involved yet more
> robust
> L.509   if ((subcs.length() - 2) % stride != 0)
> and
> L.512   final int len = (subcs.length() - 2) / stride + 1;
fixed
>
>
> (5) I guess that the rationale for not failing fast on illegal
> characters in the loop on L.516-521, and accumulate and postpone the
> check only after the whole cs has been processed, is that the code
> optimistically assumes that cs is legal, so it would be wasteful to
> check at each iteration.
> But since each iteration already needs a quite more expensive check for
> the delimiter, I wonder if the postponing makes much economical sense.
> The check could be directly on v before assignment to bytes and the var
> illegal could be removed.
The code is easier to read with the inline checs and throws.
>
>
> (6) What about moving L.517 at the end of the loop body so that memory
> accesses are issued with strictly increasing and hardware predictable
> addresses? This move would also better reveal the "left-to-right"
> processing intent.
ok, though with a lot of reordering and caching in hardware it won't 
make a difference.

>
>
> (7) On L.561, delimiter.isEmpty() could be removed as checkDelimiter()
> is not invoked when the delimiter is empty. But it's perhaps safer to
> have it.
It could be an assert, but asserts aren't enabled always
>
>
> (8) You may want to wrap escapeNL() around the message on L.567-569
ok
>
>
> (9) fromHexPair() and fromHex() can be declared static.
>
>
> (10) equals() and hashCode() are missing despite the class being
> value-based. This holds for Encoder as well.
Added
>
>
> I look forward for the next iteration of the code.

Thanks  for the comments,

Roger

>
>
> Greetings
> Raffaello
>