Propose 2 new methods for MemorySegment
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Wed Jun 16 21:06:55 UTC 2021
A few observations:
* I think there is space to add a method which checks if two segments
_overlap_
* This doesn't mean reasoning in terms of structure, like you are
suggesting (e.g. two slices of the same parent), but merely checking for
address overlap
* I don't think that, in the general case, we carry around the mapped
file on which a segment/buffer is based. And even if we did, with
symbolic links etc. it would be pretty hard to detect these issues uniformly
Given the above, the benefit of the proposed API seems rather slim
relative to its complexity.
If the general feeling is that a _simple_ address overlap test would be
useful, we can add that - but compared with the other things we're
discussing, it seems like low priority.
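To make the idea concrete, a simple address overlap test boils down to an interval intersection check. The following is only a sketch under stated assumptions: the class name OverlapSketch and the method overlaps are hypothetical, and it works on raw base addresses and byte sizes passed as longs rather than on MemorySegment itself. It is not something that exists in the API today, and it says nothing about two mappings of the same file at different addresses.

```java
// Hypothetical illustration only: not part of jdk.incubator.foreign.
public final class OverlapSketch {
    private OverlapSketch() {}

    /**
     * Returns true if the half-open byte ranges [baseA, baseA + sizeA) and
     * [baseB, baseB + sizeB) share at least one byte.
     * Assumes base + size does not overflow a long.
     */
    public static boolean overlaps(long baseA, long sizeA, long baseB, long sizeB) {
        if (sizeA <= 0 || sizeB <= 0) {
            return false; // an empty range cannot overlap anything
        }
        // Two ranges intersect when each one starts before the other one ends.
        return baseA < baseB + sizeB && baseB < baseA + sizeA;
    }

    public static void main(String[] args) {
        System.out.println(overlaps(100, 16, 108, 16)); // prints "true"
        System.out.println(overlaps(100, 16, 116, 16)); // prints "false" (adjacent, no shared byte)
    }
}
```

Such a check only compares addresses, so it would presumably apply only to pairs of segments whose addresses are comparable in the first place (e.g. both off-heap).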
Cheers
Maurizio
On 16/06/2021 20:06, leerho wrote:
> Maurizio,
> Well, I learned yet another corner of the API I hadn't found:
> /MemoryAddress::segmentOffset()/ :)
>
> However, having the boolean isSameBaseResource(MemorySegment other)
> would still be very useful!
>
> Having to catch exceptions in order to understand some basic
> properties of a segment (or a pair of them) is a real nuisance. As
> the API stands currently, given two segments:
>
> 1. If they are both independently allocated on-heap,
> segmentOffset() throws an exception.
> 2. If one is on-heap and the other off-heap, segmentOffset() throws an
> exception.
> 3. If they are both independently allocated *off-heap*,
> segmentOffset() *does not* throw an exception!
>
> If you had the method isSameBaseResource(MemorySegment other):
>
> 1. would return false
> 2. would return false
> 3. would return true (since the segmentOffset works in this case).
> 4. Also, if both segments are descendants of a common ancestor
> segment, it would return true
>
> This would make handling of moving data between segments so much more
> straightforward.
>
> I revise my request to just add the first method:
>
> * MemorySegment::boolean isSameBaseResource(MemorySegment other);
> The intent is to reveal if *this* segment and the
> *other* segment share a common ancestor segment.
> ["It could also be extended to determine if the two segments share
> the same memory-mapped file (a true resource), thus possibly
> removing the caveat in paragraph 2 above" -- this may not be
> possible.]
>
> Now whether this removes your paragraph 2 caveat (at the top), I'm not
> sure. Perhaps the caveat is because memory regions of a memory-mapped
> file can be swapped out at any time, making any assumptions about
> sub-regions and offsets rather meaningless? Are there other reasons?
>
> Lee.
>
> On Wed, Jun 16, 2021 at 1:43 AM Maurizio Cimadamore
> <maurizio.cimadamore at oracle.com> wrote:
>
> I see what you mean.
>
> I wonder if this use case isn't already partially covered by
> MemoryAddress::segmentOffset.
>
> E.g. can you do:
>
> long otherOffset = segment.address().segmentOffset(otherSegment);
>
> Then it should be easy to check if the offset is within the bounds of
> "otherSegment" ?
>
> (Note that the method already throws if you try to compare addresses and
> segments that are mismatched - e.g. on-heap vs. off-heap).
>
> Not saying a more direct API is ruled out, just pointing out what we
> have to see if it can be used.
>
> Maurizio
>
>
> On 16/06/2021 02:37, leerho wrote:
> > In working on https://github.com/openjdk/panama-foreign/pull/555, which is
> > the PR for Memory Segment Efficient Array Handling, I discovered that there
> > are two methods that would be very useful, not only for copying arrays but
> > also for other types of data movement operations between MemorySegments.
> >
> > I'd like to draw attention to the opening Javadoc of the
> > *MemorySegment::copyFrom(MemorySegment)* method:
> >
> >> 1. Performs a bulk copy from given source segment to this segment. More
> >> specifically, the bytes at offset 0 through src.byteSize() - 1 in the
> >> source segment are copied into this segment at offset 0 through
> >> src.byteSize() - 1. If the source segment overlaps with this segment, then
> >> the copying is performed as if the bytes at offset 0 through
> >> src.byteSize() - 1 in the source segment were first copied into a temporary
> >> segment with size bytes, and then the contents of the temporary segment
> >> were copied into this segment at offset 0 through src.byteSize() - 1.
> >>
> >> 2. The result of a bulk copy is unspecified if, in the uncommon case, the
> >> source segment and this segment do not overlap, but refer to overlapping
> >> regions of the same backing storage using different addresses. For example,
> >> this may occur if the same file is mapped to two segments.
> >>
> > The first paragraph guarantees that even if two descendant segments
> > overlap within a region of a parent segment, the copy operation
> > will work properly. This is similar to the guarantee of System.arraycopy().
> >
> > The second paragraph refers to memory-mapped files. However, let's examine
> > the following scenario:
> >
> > - A hierarchy of Memory Segments where two descendant segments may
> > overlap a region of the parent segment.
> > - The elements of the segments are more complex than Java primitives
> > (thus, PR 555 doesn't apply).
> > - The user wishes to copy a region of elements from one of the
> > descendant segments to the other descendant segment.
> > - The user only has the two descendant segments in hand and does not
> > have access to the parent segment.
> >
> > With the current MemorySegment API, the descendant segments are blind to
> > the overlap, to wit:
> >
> > - The user cannot determine if an overlap exists.
> > - Or, if an overlap exists, where the overlap lies with respect to the two
> > segments in hand.
> >
> > In order to ensure that corruption doesn't occur during the copy, the user
> > must create a temporary duplicate of the destination segment, copy the data
> > into the duplicate, then copy the duplicate into the original destination
> > segment. This can be expensive in time and space.
> >
> > If, however, the user can determine that an overlap exists, and where the
> > overlap occurs, the copy operation can be done safely, with no additional
> > storage, by properly choosing the direction of the iterative copy.
> >
> > To solve this, the user doesn't need access to the parent segment (this
> > could be for security reasons), but could use these two methods:
> >
> > - MemorySegment::boolean isSameBaseResource(MemorySegment other);
> > The intent is to reveal if *this* segment and the *other* segment share
> > a common ancestor segment. It could also be extended to determine if the
> > two segments share the same memory-mapped file (a true resource), thus
> > possibly removing the caveat in paragraph 2 above.
> >
> > - MemorySegment::long baseResourceOffsetBytes();
> > This would return the offset in bytes of the start of this segment from
> > the start of the highest common segment (or resource).
> >
> > With this information, the user can easily design a safe, efficient, and
> > fast data copy method for moving arbitrary elements from one segment to
> > another with the same guarantee as System.arraycopy().
> >
> > *Evidence*
> > See copySwap(...)
> > <https://github.com/leerho/PanamaLocal/blob/main/src/main/java/org/apache/datasketches/panama/MemoryCopy.java#L667-L703>.
> > Before I had access to the new MemorySegment::void copyFrom(MemorySegment,
> > MemoryLayout, MemoryLayout), I had to design a proxy routine that would do
> > the copy (with swap) correctly, especially in the case where the two
> > segments overlapped. Note lines 682, 683 where I create a temporary
> > segment. If I had the above two methods, this extra copy operation would
> > not be needed.
> >
> > For exactly the above reasons, some years ago we implemented similar
> > methods in our DataSketches Memory Component
> > <https://datasketches.apache.org/api/memory/snapshot/apidocs/index.html>.
> > Specifically, in the class *WritableMemory*, the methods *getRegionOffset()*
> > and *isSameResource(that)*.
> >
> > Lee.
>
More information about the panama-dev
mailing list