Propose 2 new methods for MemorySegment
leerho
leerho at gmail.com
Wed Jun 16 01:37:17 UTC 2021
In working on https://github.com/openjdk/panama-foreign/pull/555, which is
the PR for Memory Segment Efficient Array Handling, I discovered that there
are two methods that would be very useful, not only for copying arrays but
also for other types of data movement operations between MemorySegments.
I'd like to draw attention to the opening Javadoc of the
*MemorySegment::copyFrom(MemorySegment)* method:
> 1. Performs a bulk copy from given source segment to this segment. More
> specifically, the bytes at offset 0 through src.byteSize() - 1 in the
> source segment are copied into this segment at offset 0 through src.byteSize()
> - 1. If the source segment overlaps with this segment, then the copying
> is performed as if the bytes at offset 0 through src.byteSize() - 1 in
> the source segment were first copied into a temporary segment with size
> bytes, and then the contents of the temporary segment were copied into
> this segment at offset 0 through src.byteSize() - 1.
>
> 2. The result of a bulk copy is unspecified if, in the uncommon case, the
> source segment and this segment do not overlap, but refer to overlapping
> regions of the same backing storage using different addresses. For example,
> this may occur if the same file is mapped (MemorySegment::mapFile) to
> two segments.
>
The first paragraph guarantees that the copy operation works properly even
when the source and destination overlap, for example when two descendant
segments cover a common region of a parent segment. This is similar to the
guarantee of System.arraycopy().
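To make the System.arraycopy() guarantee concrete, here is a minimal sketch on a plain int[] (not on MemorySegments). The class and method names are made up for illustration. It contrasts the JDK call with a naive front-to-back loop over the same overlapping range:

```java
import java.util.Arrays;

// Illustration only: OverlapArraycopyDemo and its methods are hypothetical names.
final class OverlapArraycopyDemo {

    /** Shifts the first len elements right by shift positions using System.arraycopy. */
    static int[] shiftRight(int[] a, int len, int shift) {
        // System.arraycopy behaves as if the source range were first copied
        // to a temporary array, so overlapping ranges are handled correctly.
        System.arraycopy(a, 0, a, shift, len);
        return a;
    }

    /** Same shift with a naive front-to-back loop; overwrites source elements before reading them. */
    static int[] naiveForwardShift(int[] a, int len, int shift) {
        for (int i = 0; i < len; i++) {
            a[i + shift] = a[i];
        }
        return a;
    }

    public static void main(String[] args) {
        int[] good = {1, 2, 3, 4, 5, 0, 0};
        int[] bad = {1, 2, 3, 4, 5, 0, 0};
        System.out.println(Arrays.toString(shiftRight(good, 5, 2)));       // [1, 2, 1, 2, 3, 4, 5]
        System.out.println(Arrays.toString(naiveForwardShift(bad, 5, 2))); // [1, 2, 1, 2, 1, 2, 1]
    }
}
```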
The second paragraph refers to memory-mapped files. However, let's examine
the following scenario:
- A hierarchy of Memory Segments where two descendant segments may
overlap a region of the parent segment.
- The elements of the segments are more complex than Java primitives
(thus, PR 555 doesn't apply).
- The user wishes to copy a region of elements from one of the
descendant segments to the other descendant segment.
- The user only has the two descendant segments in hand and does not
have access to the parent segment.
With the current MemorySegment API, the descendant segments are blind to
the overlap, to wit:
- The user cannot determine if an overlap exists.
- If an overlap does exist, the user cannot determine where it lies with
respect to the two segments in hand.
In order to ensure that corruption doesn't occur during the copy, the user
must create a temporary duplicate of the destination segment, copy the data
into the duplicate, then copy the duplicate into the original destination
segment. This can be expensive in time and space.
If, however, the user can determine that an overlap exists, and where the
overlap occurs, the copy operation can be done safely, with no additional
storage, by properly choosing the direction of the iterative copy.
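As a rough sketch of what "choosing the direction" means, the following models the parent segment as a plain byte[] and the two descendant segments as offsets into it. This is an assumption made for illustration; it does not use the MemorySegment API, and the class and method names are hypothetical. Each element has its byte order reversed during the copy, and the only extra storage is one element-sized scratch buffer:

```java
// Illustration only: models the shared backing storage as a byte[].
final class DirectionalCopy {

    /**
     * Copies count elements of elemBytes bytes each from srcOff to dstOff
     * inside mem, reversing the byte order of every element. The source and
     * destination ranges may overlap.
     */
    static void copySwapped(byte[] mem, int srcOff, int dstOff, int count, int elemBytes) {
        byte[] scratch = new byte[elemBytes]; // one element, not a whole temporary segment
        // If the destination starts after the source, walk from the last
        // element to the first so no unread source element is overwritten.
        boolean backward = dstOff > srcOff;
        for (int n = 0; n < count; n++) {
            int i = backward ? count - 1 - n : n;
            int s = srcOff + i * elemBytes;
            int d = dstOff + i * elemBytes;
            System.arraycopy(mem, s, scratch, 0, elemBytes); // read the element fully first
            for (int b = 0; b < elemBytes; b++) {
                mem[d + b] = scratch[elemBytes - 1 - b];     // write it byte-reversed
            }
        }
    }
}
```

Choosing the direction requires knowing the relative positions of the two ranges, which is exactly the information the two descendant segments cannot provide today.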
To solve this, the user doesn't need access to the parent segment (this
could be for security reasons), but could use these two methods:
- MemorySegment::boolean isSameBaseResource(MemorySegment other);
The intent is to reveal if *this* segment and the *other* segment share
a common ancestor segment. It could also be extended to determine if the
two segments share the same memory-mapped file (a true resource), thus
possibly removing the caveat in paragraph 2 above.
- MemorySegment::long baseResourceOffsetBytes();
This would return the offset in bytes of the start of this segment from
the start of the highest common segment (or resource).
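To show how the two proposed methods might be used together, here is a toy model. Region is a hypothetical stand-in for a MemorySegment backed by a shared byte[]; only the names isSameBaseResource and baseResourceOffsetBytes come from the proposal above, and everything else (ofRoot, slice, overlapBytes, mustCopyBackward) is assumed for the sake of the sketch:

```java
// Illustration only: Region is a hypothetical stand-in, not the MemorySegment API.
final class Region {
    private final byte[] base;  // stands in for the root segment or mapped resource
    private final long offset;  // start of this region relative to base
    private final long size;    // length of this region in bytes

    private Region(byte[] base, long offset, long size) {
        this.base = base;
        this.offset = offset;
        this.size = size;
    }

    static Region ofRoot(byte[] base) {
        return new Region(base, 0, base.length);
    }

    /** Returns a descendant region; its offset accumulates relative to the root. */
    Region slice(long off, long len) {
        if (off < 0 || len < 0 || off + len > size) {
            throw new IndexOutOfBoundsException();
        }
        return new Region(base, offset + off, len);
    }

    long byteSize() { return size; }

    /** Proposed method 1: do both regions descend from the same root? */
    boolean isSameBaseResource(Region other) {
        return base == other.base;
    }

    /** Proposed method 2: offset of this region from the start of the root. */
    long baseResourceOffsetBytes() {
        return offset;
    }

    /** Number of bytes shared by a and b, or 0 if they do not overlap. */
    static long overlapBytes(Region a, Region b) {
        if (!a.isSameBaseResource(b)) {
            return 0;
        }
        long start = Math.max(a.baseResourceOffsetBytes(), b.baseResourceOffsetBytes());
        long end = Math.min(a.baseResourceOffsetBytes() + a.byteSize(),
                            b.baseResourceOffsetBytes() + b.byteSize());
        return Math.max(0, end - start);
    }

    /** True when an element-wise copy from src to dst must run from last element to first. */
    static boolean mustCopyBackward(Region src, Region dst) {
        return overlapBytes(src, dst) > 0
            && dst.baseResourceOffsetBytes() > src.baseResourceOffsetBytes();
    }
}
```

Note that the caller only needs the two descendant regions; the root itself is never exposed.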
With this information, the user can easily design a safe, efficient, and
fast data copy method for moving arbitrary elements from one segment to
another with the same guarantee as System.arraycopy().
*Evidence*
See copySwap(...)
<https://github.com/leerho/PanamaLocal/blob/main/src/main/java/org/apache/datasketches/panama/MemoryCopy.java#L667-L703>.
Before I had access to the new MemorySegment::void copyFrom(MemorySegment,
MemoryLayout, MemoryLayout), I had to design a proxy routine that would do
the copy (with swap) correctly, especially in the case where the two
segments overlapped. Note lines 682, 683 where I create a temporary
segment. If I had the above two methods, this extra copy operation would
not be needed.
For exactly the above reasons, some years ago we implemented similar
methods in our DataSketches Memory Component
<https://datasketches.apache.org/api/memory/snapshot/apidocs/index.html>.
Specifically, see the methods *getRegionOffset()* and *isSameResource(that)*
in the class *WritableMemory*.
Lee.