RFR: 8251989: Hex formatting and parsing utility [v10]
Roger Riggs
rriggs at openjdk.java.net
Fri Nov 27 17:29:09 UTC 2020
On Fri, 27 Nov 2020 10:46:09 GMT, Chris Hegarty <chegar at openjdk.org> wrote:
>> Roger Riggs has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 19 additional commits since the last revision:
>>
>> - Clarified that suffix() and prefix() methods do not return null, instead the empty string is returned.
>> - Merge branch 'master' into 8251989-hex-formatter
>> - Merge branch 'master' into 8251989-hex-formatter
>> - Merge branch 'master' into 8251989-hex-formatter
>> - The HexFormat API indexing model for array and string ranges is changed
>> to describe the range using 'fromIndex (inclusive)' and 'toIndex (exclusive)'.
>>
>> Initially, it was specified as 'index' and 'length'. However, both byte arrays
>> and strings used in the HexFormat API typically use fromIndex and toIndex
>> to describe ranges. Using the same indexing model can prevent mistakes.
>>
>> The change affects the methods and corresponding tests:
>>
>> formatHex(byte[] bytes, int fromIndex, int toIndex)
>> formatHex(A out, byte[] bytes, int fromIndex, int toIndex)
>> parseHex(char[] chars, int fromIndex, int toIndex)
>> parseHex(CharSequence string, int fromIndex, int toIndex)
>> fromHexDigits(CharSequence string, int fromIndex, int toIndex)
>> fromHexDigitsToLong(CharSequence string, int fromIndex, int toIndex)
>> - - Added @see and @link references to Integer.toHexString and Long.toHexString
>> - Clarified parsing is case insensistive in various parse and fromXXX methods
>> - Source level cleanup based on review comments
>> - Expanded some javadoc tag text to make it more descriptive
>> - Consistent use of 'hexadecimal' vs 'hex'
>> - Review comment updates to class javadoc
>> - Review comment updates, in the example code, and to describe the characters used to convert to hexadecimal
>> - Correct length of StringBuilder in formatHex;
>> Correct bug in formatHex(char[], 2, 3) and add test for subranges of char[]
>> - Merge branch 'master' into 8251989-hex-formatter
>> - ... and 9 more: https://git.openjdk.java.net/jdk/compare/dcf4db97...b19d2827
>
> src/java.base/share/classes/java/util/HexFormat.java line 47:
>
>> 45: * or choice of {@link #withUpperCase()} or {@link #withLowerCase()} parameters.
>> 46: * <p>
>> 47: * For primitive to hexadecimal string conversions the {@code toHexDigits}
>
> It took me a little while to grok the naming convention being used here. It might be helpful to ensure that all the terms are clearly defined and used consistently throughout.
>
> "hexadecimal string" - consists of just 0-9, a-f, A-F (i.e. no delimiter, prefix, suffix)
> "formatted hexadecimal string" - consists of hexadecimal string, and prefix, delimiter, suffix.
>
> The `parseXXX` and `formatXXX` are logically related - and operate on "formatted hexadecimal strings"
>
> The `toXXX` and `fromXXX` are logically related - one is kind of the inverse of the other. And these operate (even when accepting/producing chars) on "hexadecimal strings" - ignoring almost all properties of the HexFormat, except the toXXX cares about upper/lower case. Does `fromXXX` care about upper/lower ?
Yes, naming is important and there are two dimensions in the api that are significant.
Primitives vs byte arrays, and decoding and encoding.
Primitives ByteArrays
encode: toXXX format
decode: fromXXX parse
Both fromXXX or parseXXX accept either upper or lower case.
The class javadoc and each method specifies whether it uses the prefix, suffix, delimiter, and upper/lower case choice.
The choice of 'formatted byte array' was used to make a strong distinction between formatting of byte arrays vs converting primitives with limited/no formatting.
-------------
PR: https://git.openjdk.java.net/jdk/pull/482
More information about the core-libs-dev
mailing list