Propose 2 new methods for MemorySegment
leerho
leerho at gmail.com
Wed Jun 16 19:06:10 UTC 2021
Maurizio,
Well, I learned yet another corner of the API I hadn't found:
*MemoryAddress::segmentOffset()* :)
However, having the boolean *isSameBaseResource(MemorySegment other)* would
still be very useful!
Having to catch exceptions in order to understand some basic properties of
a segment (or a pair of them) is a real nuisance. As the API stands
currently, given two segments:
1. If they are both independently allocated on-heap, segmentOffset()
throws an exception.
2. If one is on-heap and the other off-heap, segmentOffset() throws an
exception.
3. If they are both independently allocated *off-heap*, segmentOffset() *does
not* throw an exception!
If you had the method *isSameBaseResource(MemorySegment other)*, then for the
same cases:
1. It would return false.
2. It would return false.
3. It would return true (since segmentOffset() works in this case).
4. Also, if both segments are descendants of a common ancestor segment,
it would return true.
This would make handling of moving data between segments so much more
straightforward.
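
To illustrate the kind of data movement this would simplify, here is a minimal sketch of an overlap-safe element copy. It does not use MemorySegment: a shared byte[] stands in for the common ancestor, and the two offsets play the role that offsets from a common base would play. The names OverlapCopySketch, copyElements and copyOne are hypothetical.

```java
// Sketch only: a shared byte[] stands in for the common ancestor segment.
// OverlapCopySketch and its methods are hypothetical names, not JDK API.
final class OverlapCopySketch {

    /**
     * Copies count elements of elemBytes bytes each from srcBase to dstBase.
     * Both offsets are measured in bytes from the start of the shared array.
     */
    static void copyElements(byte[] base, int srcBase, int dstBase,
                             int count, int elemBytes) {
        if (dstBase <= srcBase) {
            // Destination starts at or below the source: walk forward so each
            // source element is read before any write can reach it.
            for (int i = 0; i < count; i++) {
                copyOne(base, srcBase + i * elemBytes, dstBase + i * elemBytes, elemBytes);
            }
        } else {
            // Destination starts above the source: walk backward instead.
            for (int i = count - 1; i >= 0; i--) {
                copyOne(base, srcBase + i * elemBytes, dstBase + i * elemBytes, elemBytes);
            }
        }
    }

    // Moves a single element. System.arraycopy is overlap-safe within one
    // array, which covers regions that are less than one element apart.
    // A per-element transform (for example a byte swap through a scratch
    // buffer of one element) would take the place of this call.
    private static void copyOne(byte[] base, int src, int dst, int len) {
        System.arraycopy(base, src, base, dst, len);
    }

    public static void main(String[] args) {
        byte[] base = new byte[16];
        for (int i = 0; i < base.length; i++) {
            base[i] = (byte) i;
        }
        // Copy three 4-byte elements from offset 0 to offset 4 (overlapping).
        copyElements(base, 0, 4, 3, 4);
        System.out.println(java.util.Arrays.toString(base));
    }
}
```

The only extra storage is at most one element, instead of a temporary duplicate of the whole destination region. Choosing the direction requires knowing both offsets relative to the same base, which is exactly the information the two segments in hand do not expose today.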
I revise my request to just add the first method:
- MemorySegment::boolean isSameBaseResource(MemorySegment other);
The intent is to reveal if *this* segment and the *other* segment share
a common ancestor segment.
["It could also be extended to determine if the two segments share the
same memory-mapped file (a true resource), thus possibly removing the
caveat in paragraph 2 above." This may not be possible.]
Now whether this removes your paragraph 2 caveat (at the top), I'm not
sure. Perhaps the caveat is because memory regions of a memory-mapped file
can be swapped out at any time, making any assumptions about sub-regions
and offsets rather meaningless? Are there other reasons?
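
As a rough analogy for the common-ancestor check requested above, here is a minimal sketch using heap-backed java.nio.ByteBuffer views instead of MemorySegment. It assumes that "same base resource" can be modeled as "same backing array", which only holds for writable heap buffers. The class SameBaseSketch and its methods are hypothetical helpers; only the ByteBuffer calls are real API.

```java
import java.nio.ByteBuffer;

// Sketch only: models the proposed queries on heap ByteBuffer views.
// SameBaseSketch and its method names are hypothetical, not JDK API.
final class SameBaseSketch {

    // True when both views are backed by the very same accessible array.
    // Direct and read-only buffers report hasArray() == false, so they yield false.
    static boolean isSameBaseResource(ByteBuffer a, ByteBuffer b) {
        return a.hasArray() && b.hasArray() && a.array() == b.array();
    }

    // Offset in bytes of the view's index 0 from the start of the backing array.
    // Throws UnsupportedOperationException if the view has no accessible array.
    static long baseResourceOffsetBytes(ByteBuffer view) {
        return view.arrayOffset();
    }

    // Creates a child view covering [offset, offset + length) of the parent,
    // without disturbing the parent's own position and limit.
    static ByteBuffer region(ByteBuffer parent, int offset, int length) {
        ByteBuffer dup = parent.duplicate();
        dup.limit(offset + length);
        dup.position(offset);
        return dup.slice();
    }

    public static void main(String[] args) {
        ByteBuffer parent = ByteBuffer.allocate(64);
        ByteBuffer a = region(parent, 8, 16);
        ByteBuffer b = region(parent, 16, 16);
        ByteBuffer unrelated = ByteBuffer.allocate(64);

        System.out.println(isSameBaseResource(a, b));         // siblings of one parent
        System.out.println(isSameBaseResource(a, unrelated)); // independent allocation
        System.out.println(baseResourceOffsetBytes(a));
        System.out.println(baseResourceOffsetBytes(b));
    }
}
```

With both answers available, the caller can compute where two views overlap without ever holding the parent and without catching exceptions. The sketch says nothing about off-heap memory or memory-mapped files.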
Lee.
On Wed, Jun 16, 2021 at 1:43 AM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:
> I see what you mean.
>
> I wonder if this use case isn't already partially covered by
> MemoryAddress::segmentOffset.
>
> E.g. can you do:
>
> long otherOffset = segment.address().segmentOffset(otherSegment);
>
> Then it should be easy to check if the offset is within the bounds of
> "otherSegment" ?
>
> (Note that the method already throws if you try to compare addresses and
> segments that are mismatched - e.g. on-heap vs. off-heap).
>
> Not saying a more direct API is ruled out, just pointing out what we
> have to see if it can be used.
>
> Maurizio
>
>
> On 16/06/2021 02:37, leerho wrote:
> > In working on https://github.com/openjdk/panama-foreign/pull/555, which is
> > the PR for Memory Segment Efficient Array Handling, I discovered that there
> > are two methods that would be very useful, not only for copying arrays but
> > also for other types of data movement operations between MemorySegments.
> >
> > I'd like to draw attention to the opening Javadoc of the
> > *MemorySegment::copyFrom(MemorySegment)* method:
> >
> >> 1. Performs a bulk copy from given source segment to this segment. More
> >> specifically, the bytes at offset 0 through src.byteSize() - 1 in the
> >> source segment are copied into this segment at offset 0 through
> >> src.byteSize() - 1. If the source segment overlaps with this segment,
> >> then the copying is performed as if the bytes at offset 0 through
> >> src.byteSize() - 1 in the source segment were first copied into a
> >> temporary segment with size bytes, and then the contents of the temporary
> >> segment were copied into this segment at offset 0 through
> >> src.byteSize() - 1.
> >>
> >> 2. The result of a bulk copy is unspecified if, in the uncommon case, the
> >> source segment and this segment do not overlap, but refer to overlapping
> >> regions of the same backing storage using different addresses. For
> >> example, this may occur if the same file is mapped
> >> <#mapFile(java.nio.file.Path,long,long,java.nio.channels.FileChannel.MapMode,jdk.incubator.foreign.ResourceScope)>
> >> to two segments.
> >>
> > The first paragraph guarantees that even if two descendant segments
> > overlap a region of a parent segment, the copy operation will work
> > properly. This is similar to the guarantee of System.arraycopy().
> >
> > The second paragraph refers to memory-mapped files. However, let's examine
> > the following scenario:
> >
> > - A hierarchy of Memory Segments where two descendant segments may
> > overlap a region of the parent segment.
> > - The elements of the segments are more complex than Java primitives
> > (thus, PR 555 doesn't apply).
> > - The user wishes to copy a region of elements from one of the
> > descendant segments to the other descendant segment.
> > - The user only has the two descendant segments in hand and does not
> > have access to the parent segment.
> >
> > With the current MemorySegment API, the descendant segments are blind to
> > the overlap, to wit:
> >
> > - The user cannot determine if an overlap exists.
> > - Or, if an overlap exists, where the overlap lies with respect to the two
> > segments in hand.
> >
> > In order to ensure that corruption doesn't occur during the copy, the user
> > must create a temporary duplicate of the destination segment, copy the data
> > into the duplicate, then copy the duplicate into the original destination
> > segment. This can be expensive in time and space.
> >
> > If, however, the user can determine that an overlap exists, and where the
> > overlap occurs, the copy operation can be done safely, with no additional
> > storage, by properly choosing the direction of the iterative copy.
> >
> > To solve this, the user doesn't need access to the parent segment (this
> > could be for security reasons), but could use these two methods:
> >
> > - MemorySegment::boolean isSameBaseResource(MemorySegment other);
> > The intent is to reveal if *this* segment and the *other* segment share
> > a common ancestor segment. It could also be extended to determine if the
> > two segments share the same memory-mapped file (a true resource), thus
> > possibly removing the caveat in paragraph 2 above.
> >
> > - MemorySegment::long baseResourceOffsetBytes();
> > This would return the offset in bytes of the start of this segment from
> > the start of the highest common segment (or resource).
> >
> > With this information, the user can easily design a safe, efficient, and
> > fast data copy method for moving arbitrary elements from one segment to
> > another with the same guarantee as System.arraycopy().
> >
> > *Evidence*
> > See copySwap(...)
> > <https://github.com/leerho/PanamaLocal/blob/main/src/main/java/org/apache/datasketches/panama/MemoryCopy.java#L667-L703>.
> > Before I had access to the new MemorySegment::void copyFrom(MemorySegment,
> > MemoryLayout, MemoryLayout), I had to design a proxy routine that would do
> > the copy (with swap) correctly, especially in the case where the two
> > segments overlapped. Note lines 682, 683 where I create a temporary
> > segment. If I had the above two methods, this extra copy operation would
> > not be needed.
> >
> > For exactly the above reasons, some years ago we implemented similar
> > methods in our DataSketches Memory Component
> > <https://datasketches.apache.org/api/memory/snapshot/apidocs/index.html>.
> > Specifically, in the class *WritableMemory*, the methods *getRegionOffset()*
> > and *isSameResource(that)*.
> >
> > Lee.
>