[foreign-memaccess+abi] RFR: 8292034: Improve javadoc after memory segment/memory address unification [v2]
Jorn Vernee
jvernee at openjdk.org
Fri Sep 23 14:15:44 UTC 2022
On Wed, 21 Sep 2022 11:52:14 GMT, Maurizio Cimadamore <mcimadamore at openjdk.org> wrote:
>> This is a rather big patch which overhauls the javadoc of the `MemorySegment` class.
>> A big thanks to Alex Buckley, who helped review the API javadoc and contributed many of the changes you see here.
>>
>> The main things touched in this PR are:
>>
>> * the section on unsafe segments is gone from the package-level javadoc. In its place, there's a new section on zero-length memory segments in the `MemorySegment` javadoc
>> * the class javadoc for `MemorySegment` has been overhauled greatly. It now focusses on two kinds of segments (native segments, and heap segments) and define how address, size, alignment of both is handled by the API.
>> * we introduce a distinction between memory segment and "the region of memory" which backs the segment, which helps spelling out a lot of these properties in a cleaner way
>> * perhaps the biggest change in the javadoc is the section on memory segment alignment, which now is much more clearly defined, and provides lots of examples (as that's a complex topic).
>> * some of the javadoc, in both this class and other classes have been tweaked to reflect the new terminology (for instance, references to the term `base address` are gone)
>>
>> Other javadoc changes might follow, at some point later. The `Linker` class might need some attention, as we don't really spell out how zero-length segments interacts with downcalls and upcalls.
>> But, given the size of the changes, I'd rather deal with that separately.
>
> Maurizio Cimadamore has updated the pull request incrementally with one additional commit since the last revision:
>
> Replace "alignment constraints" with "alignment constraint", in accordance to what done in `MemorySegment` javadoc
Mostly minor nits. I think my main gripe is the "physical address" terminology. I think depending on the situation, the "physical" could maybe be left out. Or maybe replaced with another term, like off-heap, native, or maybe real/underlying. (if you do decide to change it, you might want to search, as I'm not sure I tagged all the occurrences with a comment).
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 71:
> 69: * Native segments can be obtained by calling one of the {@link MemorySegment#allocateNative(long, long)}
> 70: * factory methods, which return a memory segment backed by a newly allocated off-heap region with the given size
> 71: * and alignment constraint. Alternatively, native segments can be obtained by
I think it would be more correct here to say: "with the given size, and aligned to the given alignment constraint". i.e. it is not the memory region that _has_ an alignment constraint.
Suggestion:
* factory methods, which return a memory segment backed by a newly allocated off-heap region with the given size
* and aligned to the given alignment constraint. Alternatively, native segments can be obtained by
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 91:
> 89: * A heap segment obtained from one of the {@link #ofArray(int[])} factory methods has an address of zero.</li>
> 90: * <li>The address of a native segment (including mapped segments) denotes the physical address of the region of
> 91: * memory which backs the segment.</li>
I wonder if "physical address" could be confusing here. Since, a process in the operating system is typically assigned a virtual address space that maps to physical memory, and it's the virtual addresses we expose.
Maybe:
Suggestion:
* <li>The address of a native segment (including mapped segments) denotes the address in native memory of the region of
* memory which backs the segment.</li>
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 201:
> 199: * Access operations on a memory segment are constrained not only by the spatial and temporal bounds of the segment,
> 200: * but also by the <em>alignment constraint</em> of the value layout specified to the operation. An access operation can
> 201: * access only those offsets in the segment that denote addresses in physical memory which are <em>aligned</em> according
Again, "physical" might be confusing here. I suggest maybe just dropping it.
Suggestion:
* access only those offsets in the segment that denote addresses in memory which are <em>aligned</em> according
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 202:
> 200: * but also by the <em>alignment constraint</em> of the value layout specified to the operation. An access operation can
> 201: * access only those offsets in the segment that denote addresses in physical memory which are <em>aligned</em> according
> 202: * to the layout. An address in physical memory is <em>aligned</em> according to a layout if the address is an integer
Suggestion:
* to the layout. An address in memory is <em>aligned</em> according to a layout if the address is an integer
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 214:
> 212: * <p>
> 213: * If the segment being accessed is a native segment, then its {@linkplain #address() address} in physical memory can be
> 214: * combined with the offset to obtain the <em>target address</em> in physical memory. The pseudo-function below demonstrates this:
Suggestion:
* If the segment being accessed is a native segment, then its {@linkplain #address() address} can be
* combined with the offset to obtain the <em>target address</em> in native memory. The pseudo-function below demonstrates this:
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 224:
> 222: * For example:
> 223: * <ul>
> 224: * <li>A native segment at address 1000 can be accessed at offsets 0, 8, 16, 24, etc under an 8-byte alignment constraint,
I suggest maybe changing 'at address' -> 'with address' here, and below. Since 'native segment' seems to refer to the Java object, and it is the backing memory region that is _at_ a certain address.
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 245:
> 243: * </ul>
> 244: * <!-- Under a 1-byte alignment constraint, any offset address in a native segment can be accessed. -->
> 245: * <!-- In effect, every target address is at least 1-byte aligned. -->
This seems to be an html comment, and doesn't show up in the rendered javadoc. Not sure if that's intentional?
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 257:
> 255: * The address of the segment in physical memory is not known, and is not even fixed (it may change when the segment
> 256: * is relocated during garbage collection). This means that the address cannot be combined with the specified offset to
> 257: * determine a target address in physical memory. Since the alignment constraint <em>always</em> refer to alignment of
Suggestion:
* determine a target address. Since the alignment constraint <em>always</em> refer to alignment of
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 257:
> 255: * The address of the segment in physical memory is not known, and is not even fixed (it may change when the segment
> 256: * is relocated during garbage collection). This means that the address cannot be combined with the specified offset to
> 257: * determine a target address in physical memory. Since the alignment constraint <em>always</em> refer to alignment of
Suggestion:
* determine a target address in physical memory. Since the alignment constraint <em>always</em> refers to alignment of
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 258:
> 256: * is relocated during garbage collection). This means that the address cannot be combined with the specified offset to
> 257: * determine a target address in physical memory. Since the alignment constraint <em>always</em> refer to alignment of
> 258: * addresses in physical memory, it is not possible in principle to determine if any offset in a heap segment is aligned.
This also uses "physical memory", but I'm not sure what to change it to here.
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 364:
> 362: * Zero-length memory segments obtained when interacting with foreign functions are associated with the
> 363: * {@link MemorySession#global() global} memory session. This is because the Java runtime, in addition to having no insight
> 364: * into the size of the region of memory associated with a pointer returned from a foreign function, it also has no insight
I think the 'it' is redundant here.
Suggestion:
* into the size of the region of memory associated with a pointer returned from a foreign function, also has no insight
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 379:
> 377: *}
> 378: *
> 379: * Alternatively, clients can obtain, <em>unsafely</em>, an {@linkplain java.lang.foreign.ValueLayout.OfAddress#asUnbounded() unbound}
Suggestion:
* Alternatively, clients can obtain, <em>unsafely</em>, an {@linkplain java.lang.foreign.ValueLayout.OfAddress#asUnbounded() unbounded}
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 381:
> 379: * Alternatively, clients can obtain, <em>unsafely</em>, an {@linkplain java.lang.foreign.ValueLayout.OfAddress#asUnbounded() unbound}
> 380: * address value layout. When an access operation, or a function descriptor that is passed to a downcall method handle,
> 381: * uses one or more unbound address value layouts, the API will wrap any corresponding raw addresses with native segments
Suggestion:
* uses one or more unbounded address value layouts, the API will wrap any corresponding raw addresses with native segments
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 928:
> 926: * If the buffer is {@linkplain Buffer#isReadOnly() read-only}, the resulting segment will also be
> 927: * {@linkplain ByteBuffer#isReadOnly() read-only}. Moreover, if the buffer is a {@linkplain Buffer#isDirect() direct buffer},
> 928: * the return segment is a native segment; otherwise the returned memory segment is a heap segment.
Suggestion:
* the returned segment is a native segment; otherwise the returned memory segment is a heap segment.
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 935:
> 933: * <li>if the buffer has been obtained by calling {@link #asByteBuffer()} on a memory segment whose session
> 934: * is {@code S'}, then {@code S = S'}; or</li>
> 935: * <li>if the buffer is an heap buffer, then {@code S} is the {@linkplain MemorySession#global() global session}; or
Suggestion:
* <li>if the buffer is a heap buffer, then {@code S} is the {@linkplain MemorySession#global() global session}; or
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 943:
> 941: *
> 942: * @param buffer the buffer instance to be wrapped by a new memory segment.
> 943: * @return a memory segment, wrapping the given buffer instance.
Since there's not always a wrapping happening, maybe:
Suggestion:
* @param buffer the buffer instance to be turned into a new memory segment.
* @return a memory segment, derived from the given buffer instance.
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1138:
> 1136: * }
> 1137: * <p>
> 1138: * The region of off-heap region of memory associated with the returned native segment is initialized to zero.
Suggestion:
* The off-heap region of memory associated with the returned native segment is initialized to zero.
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1161:
> 1159: * <p>
> 1160: * Native segments featuring deterministic deallocation can be obtained using the
> 1161: * {@link MemorySession#allocate(MemoryLayout)} method.
Is this the right method to reference? Since this method takes a `long`, I'd expect the referenced method to be `allocate(long)` as well.
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 1195:
> 1193: * <p>
> 1194: * Native segments featuring deterministic deallocation can be obtained using the
> 1195: * {@link MemorySession#allocate(MemoryLayout)} method.
Same here.
src/java.base/share/classes/java/lang/foreign/MemorySegment.java line 2063:
> 2061: * the same on-heap Java array;
> 2062: * <li>{@code s1.address() == s2.address()}, that is, the address of the two segments should be the same.
> 2063: * This means that the two segments either refer at the same location in some off-heap region, or they refer
Suggestion:
* This means that the two segments either refer to the same location in some off-heap region, or they refer
src/java.base/share/classes/java/lang/foreign/MemorySession.java line 228:
> 226: *
> 227: * @param byteSize the size (in bytes) of the off-heap memory block backing the native memory segment.
> 228: * @param byteAlignment the alignment constraint (in bytes) of the off-heap region of memory backing the native memory segment.
I notice that in this section there are 2 ways of referring to a memory segment's underlying memory region:
- "The off-heap region of memory associated with the"
- "the off-heap region of memory backing the"
I wonder if it's worth it to align these to one or the other. i.e. either "associated with", or "backing"
src/java.base/share/classes/java/lang/foreign/VaList.java line 238:
> 236: * @param address the address of the variable argument list.
> 237: * @param session the memory session to be associated with the returned variable argument list.
> 238: * @return a new variable argument list backed by an off-heap region of memory starting at the given address value.
Here, the terminology "backed" is also used. That seems like the better of the 2 terms to me as well.
-------------
PR: https://git.openjdk.org/panama-foreign/pull/730
More information about the panama-dev
mailing list