Propose 2 new methods for MemorySegment

leerho leerho at gmail.com
Wed Jun 16 19:06:10 UTC 2021


Maurizio,
  Well, I learned yet another corner of the API I hadn't found:
*MemoryAddress::segmentOffset()* :)

However, having the boolean *isSameBaseResource(MemorySegment other)* would
still be very useful!

Having to catch exceptions in order to understand some basic properties of
a segment (or a pair of them) is a real nuisance.  As the API stands
currently, given two segments:

   1. If they are both independently allocated on-heap, segmentOffset()
   throws an exception.
   2. If one is on-heap and the other off-heap, segmentOffset() throws an
   exception.
   3. If they are both independently allocated *off-heap*, segmentOffset()
   *does not* throw an exception!

If you had the method *isSameBaseResource(MemorySegment other)*, then for the
same three cases it:

   1. would return false
   2. would return false
   3. would return true (since segmentOffset() works in this case).
   4. Also, if both segments are descendants of a common ancestor segment,
   it would return true.

This would make handling of moving data between segments so much more
straightforward.
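
To make the point concrete, here is a minimal sketch of an overlap-safe copy
that needs no temporary buffer. It does not use the Panama API: Region is a
hypothetical stand-in for a segment (a view over a plain byte[]), and its
isSameBaseResource() only models the proposed method. Bounds checks are
omitted for brevity.

```java
/** Hypothetical stand-in for a segment: a view over a backing byte[] (not the Panama API). */
final class Region {
    final byte[] base;  // shared backing storage, i.e. the "base resource"
    final int offset;   // start of this view within base
    final int length;   // size of this view in bytes

    Region(byte[] base, int offset, int length) {
        this.base = base;
        this.offset = offset;
        this.length = length;
    }

    /** Models the proposed method: true only if both views share one backing array. */
    boolean isSameBaseResource(Region other) {
        return this.base == other.base;
    }

    /** Copies count bytes from src (at srcPos) to dst (at dstPos) without a temporary buffer. */
    static void copy(Region src, int srcPos, Region dst, int dstPos, int count) {
        int s = src.offset + srcPos;  // absolute source start in the backing array
        int d = dst.offset + dstPos;  // absolute destination start in the backing array
        if (src.isSameBaseResource(dst) && s < d && d < s + count) {
            // The destination starts inside the source range: copy backward so
            // that no source byte is overwritten before it has been read.
            for (int i = count - 1; i >= 0; i--) {
                dst.base[d + i] = src.base[s + i];
            }
        } else {
            // Different resources, no overlap, or destination before source:
            // a forward copy is safe.
            for (int i = 0; i < count; i++) {
                dst.base[d + i] = src.base[s + i];
            }
        }
    }
}
```

The only information the copy needs is whether the two views share a base and
where each one starts relative to it, which is exactly what the two views
cannot tell the user today.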

I revise my request to just add the first method:

   - MemorySegment::boolean isSameBaseResource(MemorySegment other);
   The intent is to reveal if *this* segment and the *other* segment share
   a common ancestor segment.
   ["It could also be extended to determine if the two segments share the
   same memory-mapped file (a true resource), thus possibly removing the
   caveat in paragraph 2 above." This may not be possible.]
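
Once two segments are known to share a base, a relative offset of the kind
segmentOffset() reports is enough to locate any overlap. A minimal sketch
using plain long arithmetic follows; Overlap, overlapOf and the parameter
names are hypothetical and not part of any API.

```java
final class Overlap {
    /**
     * relOffset is the start of segment B measured from the start of segment A,
     * in bytes (it may be negative if B starts before A).
     * Returns {startInA, length} for the overlapping range, or null if the
     * two segments do not overlap.
     */
    static long[] overlapOf(long relOffset, long sizeA, long sizeB) {
        long start = Math.max(0L, relOffset);           // first shared byte, as an offset into A
        long end = Math.min(sizeA, relOffset + sizeB);  // one past the last shared byte, in A
        return end > start ? new long[] {start, end - start} : null;
    }
}
```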

Now whether this removes your paragraph 2 caveat (at the top), I'm not
sure.  Perhaps the caveat is because memory regions of a memory-mapped file
can be swapped out at any time, making any assumptions about sub-regions
and offsets rather meaningless?  Are there other reasons?

Lee.

On Wed, Jun 16, 2021 at 1:43 AM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:

> I see what you mean.
>
> I wonder if this use case isn't already partially covered by
> MemoryAddress::segmentOffset.
>
> E.g. can you do:
>
> long otherOffset = segment.address().segmentOffset(otherSegment);
>
> Then it should be easy to check if the offset is within the bounds of
> "otherSegment" ?
>
> (Note that the method already throws if you try to compare addresses and
> segments that are mismatched - e.g. on-heap vs. off-heap).
>
> Not saying a more direct API is ruled out, just pointing out what we
> have to see if it can be used.
>
> Maurizio
>
>
> On 16/06/2021 02:37, leerho wrote:
> > In working on https://github.com/openjdk/panama-foreign/pull/555, which is
> > the PR for Memory Segment Efficient Array Handling, I discovered that there
> > are two methods that would be very useful, not only for copying arrays but
> > also for other types of data movement operations between MemorySegments.
> >
> > I'd like to draw attention to the opening Javadoc of the
> > *MemorySegment::copyFrom(MemorySegment)* method:
> >
> >> 1. Performs a bulk copy from given source segment to this segment. More
> >> specifically, the bytes at offset 0 through src.byteSize() - 1 in the
> >> source segment are copied into this segment at offset 0 through
> >> src.byteSize() - 1. If the source segment overlaps with this segment,
> >> then the copying is performed as if the bytes at offset 0 through
> >> src.byteSize() - 1 in the source segment were first copied into a
> >> temporary segment with size bytes, and then the contents of the
> >> temporary segment were copied into this segment at offset 0 through
> >> src.byteSize() - 1.
> >>
> >> 2. The result of a bulk copy is unspecified if, in the uncommon case, the
> >> source segment and this segment do not overlap, but refer to overlapping
> >> regions of the same backing storage using different addresses. For
> >> example, this may occur if the same file is mapped to two segments.
> >>
> > The first paragraph guarantees that the copy operation will work properly
> > even if two descendant segments overlap within a region of the parent
> > segment.  This is similar to the guarantee of System.arraycopy().
> >
> > The second paragraph refers to memory-mapped files.  However, let's
> > examine the following scenario:
> >
> >     - A hierarchy of Memory Segments where two descendant segments may
> >     overlap a region of the parent segment.
> >     - The elements of the segments are more complex than Java primitives
> >     (thus, PR 555 doesn't apply).
> >     - The user wishes to copy a region of elements from one of the
> >     descendant segments to the other descendant segment.
> >     - The user only has the two descendant segments in hand and does not
> >     have access to the parent segment.
> >
> > With the current MemorySegment API, the descendant segments are blind to
> > the overlap, to wit:
> >
> >     - The user cannot determine if an overlap exists.
> >     - Or, if an overlap exists, where it lies with respect to the two
> >     segments in hand.
> >
> > In order to ensure that corruption doesn't occur during the copy, the user
> > must create a temporary duplicate of the destination segment, copy the data
> > into the duplicate, then copy the duplicate into the original destination
> > segment.  This can be expensive in time and space.
> >
> > If, however, the user can determine that an overlap exists, and where the
> > overlap occurs, the copy operation can be done safely, with no additional
> > storage, by properly choosing the direction of the iterative copy.
> >
> > To solve this, the user doesn't need access to the parent segment (this
> > could be for security reasons), but could use these two methods:
> >
> >     - MemorySegment::boolean isSameBaseResource(MemorySegment other);
> >     The intent is to reveal if *this* segment and the *other* segment
> >     share a common ancestor segment.  It could also be extended to
> >     determine if the two segments share the same memory-mapped file (a
> >     true resource), thus possibly removing the caveat in paragraph 2 above.
> >
> >
> >     - MemorySegment::long baseResourceOffsetBytes();
> >     This would return the offset in bytes of the start of this segment
> >     from the start of the highest common segment (or resource).
> >
> > With this information, the user can easily design a safe, efficient, and
> > fast data copy method for moving arbitrary elements from one segment to
> > another with the same guarantee as System.arraycopy().
> >
> > *Evidence*
> > See copySwap(...)
> > <https://github.com/leerho/PanamaLocal/blob/main/src/main/java/org/apache/datasketches/panama/MemoryCopy.java#L667-L703>.
> > Before I had access to the new MemorySegment::void copyFrom(MemorySegment,
> > MemoryLayout, MemoryLayout), I had to design a proxy routine that would do
> > the copy (with swap) correctly, especially in the case where the two
> > segments overlapped. Note lines 682, 683 where I create a temporary
> > segment. If I had the above two methods, this extra copy operation would
> > not be needed.
> >
> > For exactly the above reasons, some years ago we implemented similar
> > methods in our DataSketches Memory Component
> > <https://datasketches.apache.org/api/memory/snapshot/apidocs/index.html>.
> > Specifically, in the class *WritableMemory*, the methods
> > *getRegionOffset()* and *isSameResource(that)*.
> >
> > Lee.
>

