RFR: 8170769 Provide a simple hexdump facility for binary data

Roger Riggs Roger.Riggs at oracle.com
Mon Nov 26 19:49:46 UTC 2018


Hi,

Thanks for updating this proposal.  Some comments on the javadoc/spec.

The class name Hex isn't very evocative of its function.
HexFormat would convey a more complete idea as to its function and be 
similar to
to the existing DecimalFormat and NumberFormat classes though they do 
not share
any of the APIs or framework.

- The 2nd example using Files.newInputStream is not recommended
since the stream to the file may not be closed.
The recommended form would use try-with-resources.

|try (InputStream is = |||Files.newInputStream(Paths.get("mydata"))) {| Hex.dumpAsStream(is, 64, 
(offset, chunk, fromIndex, toIndex) -> String.format("%d %s", offset / 
64 + 1, Hex.toPrintableString(chunk, fromIndex, toIndex))) 
.forEachOrdered(System.out::println); } catch (IOException ioe) { ... } |

The 'forEachOrdered' should not be necessary and may raise questions 
about why.
if there's no good reason, use 'forEach'.


- The HEXDUMP_FORMATTER should be explicit about the format it generates.
   Though it may declare itself compatible with hexdump(1) it should specify
   the format it produces to avoid an external reference that might 
change unexpectedly.

- toFormattedHexString:
References to the line-separator should be javadoc linked to 
System#lineSeparator.

Alternatively, should this class follow the lead of String.align/indent 
and use
the normalized line terminator "\n" consistently? In which case, it 
should be explicit.

Clarify exactly the behavior:

"If the final chunk is less than 16 +bytes+ then +the result+ is padded 
with spaces to match the length of the preceding chunks. "

Only the built-in formatter asserts that every result is the same 
length.  A provided formatter
may not conform.  soften "will be" -> "may be" in "the final chunk ..."

- everywhere: "binary buffer" -> "byte buffer"  or byte array

- toPrintableString:  The ASCII limitation seems quite dated, 8-bit 
clean is the norm and
except for control characters  and 0x7f, the rest of the range is 
printable in most Locales.
Strict conformance to hexdump is not a requirement.

- dumpAsStream(InputStream)
   It is inconsistent (and possibly wrong) that the final chunk may be 
shorter since the HEXDUMP_FORMATTER
   uses the formatter that always pads.  (Btw, the implementation 
appends \n and should be using lineSeparator.)
    "If the input is not a multiple of 16 bytes then the final chunk 
will be shorter than the preceding chunks."

- dumpAsStream(bytes, from, to, chunk, formatter)

  Can it more explicit about the unchecked exception?
  "If an error occurs in the |formatter| then an unchecked exception 
will be thrown from the underlying |Stream| method."

- Another thought about the methods returning Stream<String>.
   They require a concrete string to be created for each chunk.
   There may be some efficiency and flexibility to be gained if 
formatters can return a CharSequence.
   In many cases it would be a string, but it would allow the formatter 
to return a StringBuilder
   that could be used directly without needing to construct a short 
lived string.

Thanks, Roger


p.s.
I wrote code for the use case of formatting bytes in the form that can 
be compiled with javac.
it took rather more code than I expected but I don't have any specific 
suggestions.


         byte[] b = ...;
         final String indent = " ".repeat(8);
         final StringBuilder sb = new StringBuilder(16 * 5);
         sb.append(indent);

         System.out.print("    byte[] bytes = {\n");
         try (ByteArrayInputStream is = new ByteArrayInputStream(b)) {
             dumpAsStream(is, 16, (o, c, s, e) -> {
                 sb.setLength(indent.length());
                 for (int i = s; i < e; i++) {
                     sb.append(c[i]);
                     sb.append(", ");
                 }
                 sb.append(System.lineSeparator());
                 return sb.toString();
             }).forEach(s -> System.out.print(s));
         }
         System.out.print("    }\n");


On 11/23/2018 09:51 AM, Vincent Ryan wrote:
> Hello,
>
> Please review this proposal for a new API to conveniently generate and 
> display binary data using hexadecimal string representation.
> It supports both bulk and stream operations and it can also generate 
> the well-known hexdump format [1].
>
> This latest revision addresses review comments provided earlier.
> These include adding methods to support for data supplied in a 
> ByteBuffer, exposing the default formatter implementation as a public
> static and dropping the superfluous 'Hex' term from several method names.
>
> Thanks
>
>
> Bug: https://bugs.openjdk.java.net/browse/JDK-8170769
> API: 
> http://cr.openjdk.java.net/~vinnie/8170769/javadoc.06/api/java.base/java/util/Hex.html
> Webrev: http://cr.openjdk.java.net/~vinnie/8170769/webrev.06/
>
>
> ____
> [1] https://docs.oracle.com/cd/E88353_01/html/E37839/hexdump-1.html



More information about the core-libs-dev mailing list