RFR: 8316426: Optimization for HexFormat.formatHex [v4]

Tue Sep 19 10:29:43 UTC 2023

On Tue, 19 Sep 2023 09:34:56 GMT, Claes Redestad <redestad at openjdk.org> wrote:

>> src/java.base/share/classes/jdk/internal/util/HexDigits.java line 103:
>> 
>>> 101:         short v = DIGITS[i & 0xff];
>>> 102:         return ucase
>>> 103:                 ? (short) (v - ((v & 0b0100_0000_0100_0000) >> 1)) // really: v - ((v >= 'a' && v <= 'f') ? 32 : 0)
>> 
>> I think this logic is complicated enough that explaining it directly is better:
>> `0b0100_0000_0100_0000` is a mask that selects bit 6 (1 << 6) of each byte, which is set only for letters, uppercase or not; shifting the result right by 1 bit happens to yield the bit that differs between cases (1 << 5).
>> 
>> I think you can keep the original `& ~`, which is clearer (`-` would not work had the differences not been aligned bit-wise). But you should explain what `>> 1` does, since it is the major hurdle to understanding this function.
>
> If we changed DIGITS to be encoded with the uppercase digits then the expression could be simplified to
> ```
> return ucase ? v : (short) (v | 0b0010_0000_0010_0000); // or 0x2020
> ```

Some performance-sensitive callers, such as UUID.toString, produce lowercase output, so I think DIGITS should stay encoded in lowercase; that keeps the lowercase path a plain table lookup.
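
For reference, here is a minimal sketch of the case conversion discussed above. It assumes a table that packs two lowercase ASCII hex characters into one short. The class name, helper names, and byte order are hypothetical and are not taken from the actual HexDigits source.

```java
final class HexCaseSketch {
    // Hypothetical table (an assumption, not the real HexDigits layout):
    // entry i packs the two lowercase hex chars of byte i,
    // high-nibble char in the high byte, low-nibble char in the low byte.
    static final short[] DIGITS = new short[256];

    static {
        String hex = "0123456789abcdef";
        for (int i = 0; i < 256; i++) {
            DIGITS[i] = (short) ((hex.charAt(i >> 4) << 8) | hex.charAt(i & 0xf));
        }
    }

    // Subtraction form, using the expression from the quoted line.
    static short digitPair(int i, boolean ucase) {
        short v = DIGITS[i & 0xff];
        // (v & 0x4040) keeps bit 6 of each byte, which is set only for 'a'..'f';
        // >> 1 moves it to bit 5 (0x20), the ASCII distance between the cases.
        return ucase ? (short) (v - ((v & 0x4040) >> 1)) : v;
    }

    // Mask form ("& ~"): same result, because 'a'..'f' always have bit 5 set,
    // so clearing the bit and subtracting 0x20 are equivalent.
    static short digitPairMasked(int i, boolean ucase) {
        short v = DIGITS[i & 0xff];
        return ucase ? (short) (v & ~((v & 0x4040) >> 1)) : v;
    }

    // The alternative from the quoted suggestion: start from an uppercase pair
    // and OR in bit 5. '0'..'9' already have bit 5 set, so they are unchanged.
    static short toLower(short upperPair) {
        return (short) (upperPair | 0x2020);
    }

    // Unpacks a pair into a two-character string, high byte first.
    static String asString(short v) {
        return "" + (char) ((v >> 8) & 0xff) + (char) (v & 0xff);
    }

    public static void main(String[] args) {
        System.out.println(asString(digitPair(0xab, true)));  // AB
        System.out.println(asString(digitPair(0x3f, false))); // 3f
    }
}
```

With a lowercase table, the lowercase path is a plain lookup and only the uppercase path pays for the extra mask, shift, and subtract.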

-------------

PR Review Comment: https://git.openjdk.org/jdk/pull/15768#discussion_r1329908384