[foreign-memaccess+abi] RFR: 8292034: Improve javadoc after memory segment/memory address unification [v2]

Jorn Vernee jvernee at openjdk.org
Fri Sep 23 14:15:44 UTC 2022


On Wed, 21 Sep 2022 11:52:14 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:

>> This is a rather big patch which overhauls the javadoc of the `MemorySegment` class.
>> A big thanks to Alex Buckley, who helped review the API javadoc and contributed many of the changes you see here.
>> 
>> The main things touched in this PR are:
>> 
>> * the section on unsafe segments is gone from the package-level javadoc. In its place, there's a new section on zero-length memory segments in the `MemorySegment` javadoc
>> * the class javadoc for `MemorySegment` has been overhauled greatly. It now focusses on two kinds of segments (native segments, and heap segments) and define how address, size, alignment of both is handled by the API.
>> * we introduce a distinction between memory segment and "the region of memory" which backs the segment, which helps spelling out a lot of these properties in a cleaner way
>> * perhaps the biggest change in the javadoc is the section on memory segment alignment, which now is much more clearly defined, and provides lots of examples (as that's a complex topic).
>> * some of the javadoc, in both this class and other classes have been tweaked to reflect the new terminology (for instance, references to the term `base address` are gone)
>> 
>> Other javadoc changes might follow, at some point later. The `Linker` class might need some attention, as we don't really spell out how zero-length segments interacts with downcalls and upcalls.
>> But, given the size of the changes, I'd rather deal with that separately.
>
> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:
> 
>   Replace "alignment constraints" with "alignment constraint", in accordance to what done in `MemorySegment` javadoc

Mostly minor nits. I think my main gripe is the "physical address" terminology. I think depending on the situation, the "physical" could maybe be left out. Or maybe replaced with another term, like off-heap, native, or maybe real/underlying. (if you do decide to change it, you might want to search, as I'm not sure I tagged all the occurrences with a comment).

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 71:

> 69:  * Native segments can be obtained by calling one of the {@link MemorySegment#allocateNative(long, long)}
> 70:  * factory methods, which return a memory segment backed by a newly allocated off-heap region with the given size
> 71:  * and alignment constraint. Alternatively, native segments can be obtained by

I think it would be more correct here to say: "with the given size, and aligned to the given alignment constraint". i.e. it is not the memory region that _has_ an alignment constraint.
Suggestion:

 * factory methods, which return a memory segment backed by a newly allocated off-heap region with the given size
 * and aligned to the given alignment constraint. Alternatively, native segments can be obtained by

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 91:

> 89:  * A heap segment obtained from one of the {@link #ofArray(int[])} factory methods has an address of zero.</li>
> 90:  * <li>The address of a native segment (including mapped segments) denotes the physical address of the region of
> 91:  * memory which backs the segment.</li>

I wonder if "physical address" could be confusing here. Since, a process in the operating system is typically assigned a virtual address space that maps to physical memory, and it's the virtual addresses we expose.

Maybe:
Suggestion:

 * <li>The address of a native segment (including mapped segments) denotes the address in native memory of the region of
 * memory which backs the segment.</li>

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 201:

> 199:  * Access operations on a memory segment are constrained not only by the spatial and temporal bounds of the segment,
> 200:  * but also by the <em>alignment constraint</em> of the value layout specified to the operation. An access operation can
> 201:  * access only those offsets in the segment that denote addresses in physical memory which are <em>aligned</em> according

Again, "physical" might be confusing here. I suggest maybe just dropping it.

Suggestion:

 * access only those offsets in the segment that denote addresses in memory which are <em>aligned</em> according

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 202:

> 200:  * but also by the <em>alignment constraint</em> of the value layout specified to the operation. An access operation can
> 201:  * access only those offsets in the segment that denote addresses in physical memory which are <em>aligned</em> according
> 202:  * to the layout. An address in physical memory is <em>aligned</em> according to a layout if the address is an integer

Suggestion:

 * to the layout. An address in memory is <em>aligned</em> according to a layout if the address is an integer

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 214:

> 212:  * <p>
> 213:  * If the segment being accessed is a native segment, then its {@linkplain #address() address} in physical memory can be
> 214:  * combined with the offset to obtain the <em>target address</em> in physical memory. The pseudo-function below demonstrates this:

Suggestion:

 * If the segment being accessed is a native segment, then its {@linkplain #address() address} can be
 * combined with the offset to obtain the <em>target address</em> in native memory. The pseudo-function below demonstrates this:

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 224:

> 222:  * For example:
> 223:  * <ul>
> 224:  * <li>A native segment at address 1000 can be accessed at offsets 0, 8, 16, 24, etc under an 8-byte alignment constraint,

I suggest maybe changing 'at address' -> 'with address' here, and below. Since 'native segment' seems to refer to the Java object, and it is the backing memory region that is _at_ a certain address.

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 245:

> 243:  * </ul>
> 244:  * <!-- Under a 1-byte alignment constraint, any offset address in a native segment can be accessed. -->
> 245:  * <!-- In effect, every target address is at least 1-byte aligned. -->

This seems to be an html comment, and doesn't show up in the rendered javadoc. Not sure if that's intentional?

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 257:

> 255:  * The address of the segment in physical memory is not known, and is not even fixed (it may change when the segment
> 256:  * is relocated during garbage collection). This means that the address cannot be combined with the specified offset to
> 257:  * determine a target address in physical memory. Since the alignment constraint <em>always</em> refer to alignment of

Suggestion:

 * determine a target address. Since the alignment constraint <em>always</em> refer to alignment of

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 257:

> 255:  * The address of the segment in physical memory is not known, and is not even fixed (it may change when the segment
> 256:  * is relocated during garbage collection). This means that the address cannot be combined with the specified offset to
> 257:  * determine a target address in physical memory. Since the alignment constraint <em>always</em> refer to alignment of

Suggestion:

 * determine a target address in physical memory. Since the alignment constraint <em>always</em> refers to alignment of

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 258:

> 256:  * is relocated during garbage collection). This means that the address cannot be combined with the specified offset to
> 257:  * determine a target address in physical memory. Since the alignment constraint <em>always</em> refer to alignment of
> 258:  * addresses in physical memory, it is not possible in principle to determine if any offset in a heap segment is aligned.

This also uses "physical memory", but I'm not sure what to change it to here.

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 364:

> 362:  * Zero-length memory segments obtained when interacting with foreign functions are associated with the
> 363:  * {@link MemorySession#global() global} memory session. This is because the Java runtime, in addition to having no insight
> 364:  * into the size of the region of memory associated with a pointer returned from a foreign function, it also has no insight

I think the 'it' is redundant here.
Suggestion:

 * into the size of the region of memory associated with a pointer returned from a foreign function, also has no insight

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 379:

> 377:  *}
> 378:  *
> 379:  * Alternatively, clients can obtain, <em>unsafely</em>, an {@linkplain java.lang.foreign.ValueLayout.OfAddress#asUnbounded() unbound}

Suggestion:

 * Alternatively, clients can obtain, <em>unsafely</em>, an {@linkplain java.lang.foreign.ValueLayout.OfAddress#asUnbounded() unbounded}

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 381:

> 379:  * Alternatively, clients can obtain, <em>unsafely</em>, an {@linkplain java.lang.foreign.ValueLayout.OfAddress#asUnbounded() unbound}
> 380:  * address value layout. When an access operation, or a function descriptor that is passed to a downcall method handle,
> 381:  * uses one or more unbound address value layouts, the API will wrap any corresponding raw addresses with native segments

Suggestion:

 * uses one or more unbounded address value layouts, the API will wrap any corresponding raw addresses with native segments

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 928:

> 926:      * If the buffer is {@linkplain Buffer#isReadOnly() read-only}, the resulting segment will also be
> 927:      * {@linkplain ByteBuffer#isReadOnly() read-only}. Moreover, if the buffer is a {@linkplain Buffer#isDirect() direct buffer},
> 928:      * the return segment is a native segment; otherwise the returned memory segment is a heap segment.

Suggestion:

     * the returned segment is a native segment; otherwise the returned memory segment is a heap segment.

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 935:

> 933:      *     <li>if the buffer has been obtained by calling {@link #asByteBuffer()} on a memory segment whose session
> 934:      *     is {@code S'}, then {@code S = S'}; or</li>
> 935:      *     <li>if the buffer is an heap buffer, then {@code S} is the {@linkplain MemorySession#global() global session}; or

Suggestion:

     *     <li>if the buffer is a heap buffer, then {@code S} is the {@linkplain MemorySession#global() global session}; or

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 943:

> 941:      *
> 942:      * @param buffer the buffer instance to be wrapped by a new memory segment.
> 943:      * @return a memory segment, wrapping the given buffer instance.

Since there's not always a wrapping happening, maybe:
Suggestion:

     * @param buffer the buffer instance to be turned into a new memory segment.
     * @return a memory segment, derived from the given buffer instance.

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1138:

> 1136:      * }
> 1137:      * <p>
> 1138:      * The region of off-heap region of memory associated with the returned native segment is initialized to zero.

Suggestion:

     * The off-heap region of memory associated with the returned native segment is initialized to zero.

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1161:

> 1159:      * <p>
> 1160:      * Native segments featuring deterministic deallocation can be obtained using the
> 1161:      * {@link MemorySession#allocate(MemoryLayout)} method.

Is this the right method to reference? Since this method takes a `long`, I'd expect the referenced method to be `allocate(long)` as well.

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1195:

> 1193:      * <p>
> 1194:      * Native segments featuring deterministic deallocation can be obtained using the
> 1195:      * {@link MemorySession#allocate(MemoryLayout)} method.

Same here.

src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 2063:

> 2061:      *     the same on-heap Java array;
> 2062:      *     <li>{@code s1.address() == s2.address()}, that is, the address of the two segments should be the same.
> 2063:      *     This means that the two segments either refer at the same location in some off-heap region, or they refer

Suggestion:

     *     This means that the two segments either refer to the same location in some off-heap region, or they refer

src/java.base/share/classes/java/lang/foreign/MemorySession.java line 228:

> 226:      *
> 227:      * @param byteSize the size (in bytes) of the off-heap memory block backing the native memory segment.
> 228:      * @param byteAlignment the alignment constraint (in bytes) of the off-heap region of memory backing the native memory segment.

I notice that in this section there are 2 ways of referring to a memory segment's underlying memory region:
- "The off-heap region of memory associated with the"
- "the off-heap region of memory backing the"

I wonder if it's worth it to align these to one or the other. i.e. either "associated with", or "backing"

src/java.base/share/classes/java/lang/foreign/VaList.java line 238:

> 236:      * @param address the address of the variable argument list.
> 237:      * @param session the memory session to be associated with the returned variable argument list.
> 238:      * @return a new variable argument list backed by an off-heap region of memory starting at the given address value.

Here, the terminology "backed" is also used. That seems like the better of the 2 terms to me as well.

-------------

PR: https://git.openjdk.org/panama-foreign/pull/730


More information about the panama-dev mailing list