[foreign-memaccess+abi] RFR: 8291639: Improve the ability to visualize a MemorySegment in human readable forms

Maurizio Cimadamore mcimadamore at openjdk.org
Wed Aug 3 15:22:58 UTC 2022


On Tue, 2 Aug 2022 15:22:13 GMT, Per-Åke Minborg <duke at openjdk.org> wrote:

> This PR imrproves the ability to visualize a `MemorySegment` in human-readable forms.
> 
> It exposes two new methods by adding them directly to the MemorySegment interface. There are other ways to add the methods (e.g. by exposing them via a utility class).
> 
> The methods
> 
> 
>     /**
>      * Returns a Stream of human-readable, lines with hexadecimal values for this memory segment.
>      * <p>
>      * The exact format of the stream elements is unspecified and should not
>      * be acted upon programmatically. Loosely speaking, this method renders
>      * a format similar to the *nix command "hexdump -C".
>      * <p>
>      * As an example, a memory segment created, initialized and used as follows
>      * {@snippet lang = java:
>      * MemorySegment memorySegment = memorySession.allocate(64 + 4);
>      * memorySegment.setUtf8String(0, "The quick brown fox jumped over the lazy dog\nSecond line\t:here");
>      * memorySegment.hexStream()
>      *     .forEach(System.out::println);
>      *}
>      * might print to something like this:
>      * {@snippet lang = text:
>      * 0000000000000000  54 68 65 20 71 75 69 63  6B 20 62 72 6F 77 6E 20  |The quick brown |
>      * 0000000000000010  66 6F 78 20 6A 75 6D 70  65 64 20 6F 76 65 72 20  |fox jumped over |
>      * 0000000000000020  74 68 65 20 6C 61 7A 79  20 64 6F 67 0A 53 65 63  |the lazy dog.Sec|
>      * 0000000000000030  6F 6E 64 20 6C 69 6E 65  09 3A 68 65 72 65 00 00  |ond line.:here..|
>      * 0000000000000040  00 00 00 00                                       |....|
>      *}
>      * <p>
>      * Use a {@linkplain MemorySegment#asSlice(long, long) slice} to inspect a specific region
>      * of a memory segment.
>      *
>      * @return a Stream of human-readable, lines with hexadecimal values
>      */
>      default Stream<String> hexStream() {
>         return MemorySegmentRenderUtil.hexStream(this);
>     }
> ``` 
> 
> and
> 
> 
>     /**
>      * Returns a human-readable view of this memory segment viewed through
>      * the provided memory layout {@code lens}.
>      * <p>
>      * Lines are separated with the system-dependent line separator {@link System#lineSeparator() }.
>      * Otherwise, the exact format of the returned view is unspecified and should not
>      * be acted upon programmatically.
>      * <p>
>      * As an example, a memory segment viewed though the following memory layout lens
>      * {@snippet lang = java:
>      * var lens = MemoryLayout.structLayout(
>      *         ValueLayout.JAVA_INT.withName("x"),
>      *         ValueLayout.JAVA_INT.withName("y")
>      * ).withName("Point");
>      *}
>      * might be rendered to something like this:
>      * {@snippet lang = text:
>      * Point {
>      *   x=1,
>      *   y=2
>      * }
>      *}
>      * <p>
>      * This method is intended to view memory segments through small and medium-sized memory layout
>      * lenses and is, in all cases, restricted by the inherent String capacity limit.
>      *
>      * @param lens  to use as a lens when viewing the memory segment
>      * @return a view of the memory segment viewed through a memory layout lens
>      * @throws OutOfMemoryError if the view exceeds the array size VM limit
>      */
>     default String viewThrough(MemoryLayout lens) {
>         Objects.requireNonNull(lens);
>         return MemorySegmentRenderUtil.viewThrough(this, lens);
>     }

The new functionalities seem helpful. From an API perspective I'm a bit concerned in that we are probably not able to say a lot in terms of specification. For instance, how many String are returned in the hex dump stream? Or what is the format of the `viewThrough` string? While you call some of these out, part of me is a bit wary of adding methods whose semantics cannot be fully described in the javadoc, as that creates an issue when it comes to testing and conformance testing of the API. Are there similar analogue functionalities in public JDK APIs (e.g. ByteBuffer) ?

Separately, I think viewThrough is more segment/layout specific, whereas hexDump really is a general functionality that just so happens to be defined by MemorySegment. I'm worried about this because it leaves users of ByteBuffer, String, arrays and what not in the cold, while we provide a specific functionality for memory segments. And, if we had a more general functionality available somewhere else (e.g. that worked on a byte[]) I'm not sure we would be adding this method?

For these reason I'd veer towards keeping the layout method in the API, but make the hex dump method private for now.

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1903:

> 1901:      * @return a Stream of human-readable, lines with hexadecimal values
> 1902:      */
> 1903:     default Stream<String> hexStream() {

I think `hexDump` or `hexDumpStream` would be a better name

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1908:

> 1906: 
> 1907:     /**
> 1908:      * Returns a human-readable view of this memory segment viewed through

I strongly suggest dropping all reference to "lenses". I see what you mean, but I think the term seems to exotic. E.g. "viewing a memory segment through a memory layout" is fine.

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1915:

> 1913:      * be acted upon programmatically.
> 1914:      * <p>
> 1915:      * As an example, a memory segment viewed though the following memory layout lens

As an example, consider the following segment (show a segment initialized to (1, 2)).
We can view the above segment through the layout defined as follows (show layout you have now).
This will render the segment as follows (the exact contents are not specified):...

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1930:

> 1928:      *}
> 1929:      * <p>
> 1930:      * This method is intended to view memory segments through small and medium-sized memory layout

I would leave this out, since it's covered in `@throws`

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1935:

> 1933:      * @param lens  to use as a lens when viewing the memory segment
> 1934:      * @return a view of the memory segment viewed through a memory layout lens
> 1935:      * @throws OutOfMemoryError if the view exceeds the array size VM limit

we have some verbiage for this in `MemorySegment::getUtf8String`:


     * @throws IllegalArgumentException if the size of the UTF-8 string is greater than the largest string supported by the platform.
     ```

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1937:

> 1935:      * @throws OutOfMemoryError if the view exceeds the array size VM limit
> 1936:      */
> 1937:     default String viewThrough(MemoryLayout lens) {

I think using `toString(MemoryLayout)` is fine here

src/java.base/share/classes/java/lang/foreign/PaddingLayout.java line 38:

> 36:  * This class is immutable, thread-safe and <a href="{@docRoot}/java.base/java/lang/doc-files/ValueBased.html">value-based</a>.
> 37:  */
> 38: public final class PaddingLayout extends AbstractLayout implements MemoryLayout {

I don't think this change should be part of this patch. Whether that's something we want o pursue, it's an orthogonal question (e.g. to make use of pattern matching more nice).

src/java.base/share/classes/jdk/internal/foreign/MemorySegmentRenderUtil.java line 156:

> 154:     }
> 155: 
> 156:     public static void renderView(MemorySegment memorySegment,

Suggestion, instead of keeping track of an explicit index, it would be perhaps better to construct a layout path using `MemoryLayout.PathElement`. That way, access to the correct element can be made w/o index. It will be slower, as it will compute a new VH each time, but at least it's guaranteed to be correct. Another approach could be to construct the layout path, and then use it to compute the offset, and then use the offset as you are doing now. I'm worried about duplicating offsetting logic.

src/java.base/share/classes/jdk/internal/foreign/MemorySegmentRenderUtil.java line 207:

> 205:         if (memoryLayout instanceof GroupLayout groupLayout) {
> 206: 
> 207:             // Strictly, we should provide all permutations of nested unions.

Not sure I get this comment

src/java.base/share/classes/jdk/internal/foreign/MemorySegmentRenderUtil.java line 262:

> 260:     }
> 261: 
> 262:     static final class HexStreamState {

While this class is simple - I note that there are many others hex utils in the JDK like src/java.base/share/classes/sun/security/util/HexDumpEncoder.java

src/java.base/share/classes/jdk/internal/foreign/MemorySegmentRenderUtil.java line 367:

> 365:     }
> 366: 
> 367: }

Note missing newline here

-------------

PR: https://git.openjdk.org/panama-foreign/pull/695


More information about the panama-dev mailing list