RFR: 8251989: Hex formatting and parsing utility [v10]

Thu Nov 26 00:14:20 UTC 2020

On Wed, 25 Nov 2020 22:51:44 GMT, Roger Riggs <rriggs at openjdk.org> wrote:

>> java.util.HexFormat utility:
>> 
>>  - Format and parse hexadecimal strings, with parameters for delimiter, prefix, suffix and upper/lowercase
>>  - Static factories and builder methods to create HexFormat copies with modified parameters.
>>  - Consistent naming of methods for conversion of byte arrays to formatted strings and back: formatHex and parseHex
>>  - Consistent naming of methods for conversion of primitive types: toHexDigits... and fromHexDigits...
>>  - Prefixes and suffixes now apply to each formatted value, not the string as a whole
>>  - Using java.lang.Appendable as a target for buffered conversions, so output to Writers and PrintStreams
>>    like System.out is supported in addition to StringBuilder. (IOExceptions are converted to unchecked exceptions.)
>>  - Immutable and thread safe, a "value-based" class
>> 
>> See the [HexFormat javadoc](http://cr.openjdk.java.net/~rriggs/8251989-hex-formatter/java.base/java/util/HexFormat.html) for details.
>> 
>> Review comments and suggestions welcome.
>
> Roger Riggs has updated the pull request with a new target base due to a merge or a rebase. The incremental webrev excludes the unrelated changes brought in by the merge/rebase. The pull request contains 19 additional commits since the last revision:
> 
>  - Clarified that the suffix() and prefix() methods do not return null; the empty string is returned instead.
>  - Merge branch 'master' into 8251989-hex-formatter
>  - Merge branch 'master' into 8251989-hex-formatter
>  - Merge branch 'master' into 8251989-hex-formatter
>  - The HexFormat API indexing model for array and string ranges is changed
>    to describe the range using 'fromIndex (inclusive)' and 'toIndex (exclusive)'.
>    
>    Initially, it was specified as 'index' and 'length'. However, both byte arrays
>    and strings used in the HexFormat API typically use fromIndex and toIndex
>    to describe ranges.  Using the same indexing model can prevent mistakes.
>    
>    The change affects the methods and corresponding tests:
>    
>        formatHex(byte[] bytes, int fromIndex, int toIndex)
>        formatHex(A out, byte[] bytes, int fromIndex, int toIndex)
>        parseHex(char[] chars, int fromIndex, int toIndex)
>        parseHex(CharSequence string, int fromIndex, int toIndex)
>        fromHexDigits(CharSequence string, int fromIndex, int toIndex)
>        fromHexDigitsToLong(CharSequence string, int fromIndex, int toIndex)
>  - - Added @see and @link references to Integer.toHexString and Long.toHexString
>    - Clarified that parsing is case insensitive in various parse and fromXXX methods
>    - Source level cleanup based on review comments
>    - Expanded some javadoc tag text to make it more descriptive
>    - Consistent use of 'hexadecimal' vs 'hex'
>  - Review comment updates to class javadoc
>  - Review comment updates, in the example code, and to describe the characters used to convert to hexadecimal
>  - Correct length of StringBuilder in formatHex;
>    Correct bug in formatHex(char[], 2, 3) and add test for subranges of char[]
>  - Merge branch 'master' into 8251989-hex-formatter
>  - ... and 9 more: https://git.openjdk.java.net/jdk/compare/d1f1e8b7...b19d2827
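
For reference, the 'fromIndex (inclusive)' / 'toIndex (exclusive)' range model and the per-value prefix described above can be sketched as follows. This is only an illustration, assuming the `HexFormat` API as it later shipped in JDK 17; the revision under review may differ in details, and the class and helper names (`HexRangeSketch`, `formatRange`, `parseRange`) are made up for the example.

```java
import java.util.HexFormat;

public class HexRangeSketch {
    // Formats bytes[fromIndex, toIndex): fromIndex is inclusive, toIndex is exclusive.
    // The prefix is applied to each formatted byte, not to the string as a whole.
    public static String formatRange(byte[] bytes, int fromIndex, int toIndex) {
        HexFormat fmt = HexFormat.ofDelimiter(", ").withPrefix("0x");
        return fmt.formatHex(bytes, fromIndex, toIndex);
    }

    // Parses the hexadecimal digits in hex[fromIndex, toIndex) back into bytes.
    public static byte[] parseRange(String hex, int fromIndex, int toIndex) {
        return HexFormat.of().parseHex(hex, fromIndex, toIndex);
    }

    public static void main(String[] args) {
        byte[] data = {0x0a, 0x1b, 0x2c, 0x3d};
        // Indices 1 and 2 are formatted; index 3 is excluded.
        System.out.println(formatRange(data, 1, 3));
        // Characters 2..5 ("1b2c") are parsed; the surrounding text is ignored.
        System.out.println(parseRange("xx1b2cxx", 2, 6).length);
    }
}
```

With the same convention on both the byte array and the string side, a caller can reuse the index arithmetic already used with `String.substring` or `Arrays.copyOfRange`.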

Hi Roger,
Other than these few comments, there are some files that need copyright year updates.

src/java.base/share/classes/java/util/HexFormat.java line 42:

> 40:  * <p>
> 41:  * There are two factories of {@code HexFormat} with preset parameters {@link #of()} and
> 42:  * {@link #ofDelimiter(String) of(delimiter)}. For other parameter combinations

Should the link text be `ofDelimiter(delimiter)`?

src/java.base/share/classes/java/util/HexFormat.java line 408:

> 406:      * @param fromIndex the initial index of the range, inclusive
> 407:      * @param toIndex the final index of the range, exclusive.
> 408:      * @return a String formatting or null for non-single byte formatting

Should this read `non-single byte delimiter`?

src/java.base/share/classes/java/util/HexFormat.java line 853:

> 851:      */
> 852:     public int fromHexDigit(int ch) {
> 853:         int value = Character.digit(ch, 16);

Do we need to limit parsing of the hex digit to only [0-9a-fA-F]? As written, this would return `0` for other digits, such as `fullwidth digit zero` (U+FF10).
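
To illustrate the concern, here is a minimal sketch of what `Character.digit` accepts and of one possible ASCII-only alternative. The class and method names (`StrictHexDigit`, `fromAsciiHexDigit`) are hypothetical and not part of the patch, and whether rejecting non-ASCII digits is the desired behavior is the open question here.

```java
public class StrictHexDigit {
    // Hypothetical strict variant: accepts only [0-9a-fA-F] and
    // throws NumberFormatException for every other code point.
    public static int fromAsciiHexDigit(int ch) {
        if (ch >= '0' && ch <= '9') {
            return ch - '0';
        }
        if (ch >= 'a' && ch <= 'f') {
            return ch - 'a' + 10;
        }
        if (ch >= 'A' && ch <= 'F') {
            return ch - 'A' + 10;
        }
        throw new NumberFormatException("not a hexadecimal digit: " + ch);
    }

    public static void main(String[] args) {
        // Character.digit treats any Unicode decimal digit as a digit,
        // so FULLWIDTH DIGIT ZERO (U+FF10) is accepted and maps to 0.
        System.out.println(Character.digit(0xFF10, 16));
        // The strict variant accepts ASCII hexadecimal digits...
        System.out.println(fromAsciiHexDigit('F'));
        // ...and rejects U+FF10.
        try {
            fromAsciiHexDigit(0xFF10);
        } catch (NumberFormatException e) {
            System.out.println("rejected: " + e.getMessage());
        }
    }
}
```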

-------------

PR: https://git.openjdk.java.net/jdk/pull/482