RFR: 8316426: Optimization for HexFormat.formatHex [v4]

Tue Sep 19 09:28:41 UTC 2023

On Tue, 19 Sep 2023 09:03:35 GMT, 温绍锦 <duke at openjdk.org> wrote:

>> The discussion of @cl4es's improvement in PR #15591 covered the advantages of not using a lookup table.
>> 
>> But if the input is a byte[], using a lookup table can improve performance.
>> 
>> For HexFormat#formatHex(Appendable, byte[]) and HexFormat#formatHex(byte[]), the longer the byte[] is, the more noticeable the performance gain from the table lookup becomes.
>
> 温绍锦 has updated the pull request incrementally with one additional commit since the last revision:
> 
>   "-" -> "& ~"

src/java.base/share/classes/jdk/internal/util/HexDigits.java line 103:

> 101:         short v = DIGITS[i & 0xff];
> 102:         return ucase
> 103:                 ? (short) (v - ((v & 0b0100_0000_0100_0000) >> 1)) // really: v - ((v >= 'a' && v <= 'f') ? 32 : 0)

I think this logic is complicated enough that explaining it directly is better:
`0b0100_0000_0100_0000` is a mask that selects bit 6 (1 << 6) of each packed character, which is set for letters of either case but not for digits; shifting the result right by 1 bit conveniently turns it into the bit that differs between the two cases (1 << 5).

I think you can keep the original `& ~`, which is clearer (`-` would not work if the case difference were not aligned to a single bit). But you should explain what `>> 1` does, since it is a major hurdle to understanding this function.
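To illustrate the point, here is a minimal standalone sketch of the trick. The class `HexCaseDemo` and the helper `packLower` are hypothetical and only stand in for the `DIGITS` table; the byte order they use (high nibble in the high byte) is an assumption and may not match the actual layout in `HexDigits`. The sketch shows that `-` and `& ~` agree here only because bit 5 is always set in a lowercase letter.

```java
class HexCaseDemo {
    // Hypothetical stand-in for DIGITS[i & 0xff]: packs the two lowercase hex
    // characters of b into one short (assumed layout: high nibble in the high byte).
    static short packLower(int b) {
        char hi = Character.forDigit((b >> 4) & 0xf, 16);
        char lo = Character.forDigit(b & 0xf, 16);
        return (short) ((hi << 8) | lo);
    }

    // Bit 6 (0x40) of each byte is set for 'a'..'f' (0x61..0x66) and clear for
    // '0'..'9' (0x30..0x39). Shifting that bit right by 1 yields 0x20, the
    // ASCII distance between a lowercase letter and its uppercase form.
    static short toUpperSub(short v) {
        return (short) (v - ((v & 0b0100_0000_0100_0000) >> 1));
    }

    // Same result by clearing bit 5 instead of subtracting 32.
    static short toUpperMask(short v) {
        return (short) (v & ~((v & 0b0100_0000_0100_0000) >> 1));
    }

    static String render(short v) {
        return "" + (char) (v >> 8) + (char) (v & 0xff);
    }

    public static void main(String[] args) {
        for (int b : new int[] {0x09, 0x1f, 0xab}) {
            short v = packLower(b);
            System.out.println(render(v) + " -> " + render(toUpperSub(v))
                    + " / " + render(toUpperMask(v)));
        }
    }
}
```

Both variants give the same result for all 256 byte values, so the choice between them is about readability rather than behavior.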

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15768#discussion_r1329835145