Propose 2 new methods for MemorySegment
leerho
leerho at gmail.com
Wed Jun 16 21:36:37 UTC 2021
I think that will work, and it does more than I wanted. If two segments
overlap (either both on-heap or both off-heap), I can then use
segmentOffset() to work out how to do a direct copy without creating an
extra buffer and without having to catch exceptions.
I assume this would work even if the two segments are buried in a deep
hierarchy, i.e., two great-grandchildren of a parent segment.
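
A minimal sketch of the direction choice I have in mind, under some loud
assumptions: a plain byte[] stands in for the memory shared by the two
segments, the offsets are already known (e.g. derived from segmentOffset()),
and OverlapCopy / copyElements are hypothetical names, not part of any API.

```java
// Sketch only: a byte[] stands in for the memory shared by two segments.
// OverlapCopy and copyElements are hypothetical names, not part of any JDK API.
final class OverlapCopy {

    /**
     * Copies count elements of elemSize bytes from srcOff to dstOff within the
     * same backing array. The iteration direction is chosen so that no source
     * element is overwritten before it has been read, so no temporary copy of
     * the whole region is needed.
     */
    static void copyElements(byte[] mem, int srcOff, int dstOff, int count, int elemSize) {
        if (dstOff <= srcOff) {
            // Destination starts at or before the source: walk forward.
            for (int i = 0; i < count; i++) {
                copyOne(mem, srcOff + i * elemSize, dstOff + i * elemSize, elemSize);
            }
        } else {
            // Destination starts after the source: walk backward.
            for (int i = count - 1; i >= 0; i--) {
                copyOne(mem, srcOff + i * elemSize, dstOff + i * elemSize, elemSize);
            }
        }
    }

    // Stand-in for a per-element operation (a real one might byte-swap here).
    // System.arraycopy is safe even if source and destination ranges overlap.
    private static void copyOne(byte[] mem, int from, int to, int elemSize) {
        System.arraycopy(mem, from, mem, to, elemSize);
    }
}
```

The only extra state is one element at a time, rather than a duplicate of the
whole destination region.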
Lee.
On Wed, Jun 16, 2021 at 2:07 PM Maurizio Cimadamore <
maurizio.cimadamore at oracle.com> wrote:
> A few observations:
>
> * I think there is space to add a method which checks if two segments
> _overlap_
>
> * This doesn't mean reasoning in terms of structure, like you are
> suggesting (e.g. two slices of the same parent), but merely checking for
> address overlap
>
> * I don't think that, in the general case, we carry around the mapped file
> on which a segment/buffer is based. And even if we did, with symbolic
> links etc. it would be pretty hard to uniformly detect these issues
>
> Given the above, the benefit of the proposed API seems rather slim
> relative to its complexity.
>
> If the general feeling is that a _simple_ address overlap test would be
> useful, we can add that - but compared with other things we're discussing
> seems like low priority.
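
For illustration only, a plain address-overlap test of the kind described
above might look like the sketch below. It assumes each segment can be
reduced to a start address and a byte size held in longs, and that
start + size does not overflow; AddressOverlap and overlaps are hypothetical
names, not an existing or proposed MemorySegment method.

```java
// Sketch only: addresses and sizes are plain longs here, not MemorySegment values.
// AddressOverlap and overlaps are hypothetical names, not part of any JDK API.
final class AddressOverlap {

    /**
     * Returns true if the half-open ranges [startA, startA + sizeA) and
     * [startB, startB + sizeB) share at least one byte.
     * Assumes start + size does not overflow a long.
     */
    static boolean overlaps(long startA, long sizeA, long startB, long sizeB) {
        if (sizeA <= 0 || sizeB <= 0) {
            return false; // an empty range overlaps nothing
        }
        return startA < startB + sizeB && startB < startA + sizeA;
    }
}
```

This says nothing about shared ancestry or mapped files; it only compares
address ranges.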
>
> Cheers
> Maurizio
>
>
> On 16/06/2021 20:06, leerho wrote:
>
> Maurizio,
> Well, I learned yet another corner of the API I hadn't found:
> *MemoryAddress::segmentOffset()* :)
>
> However, having the boolean *isSameBaseResource(MemorySegment other)* would
> still be very useful!
>
> Having to catch exceptions in order to understand some basic properties of
> a segment (or a pair of them) is a real nuisance. As the API stands
> currently, given two segments:
>
> 1. If they are both independently allocated on-heap, segmentOffset()
> throws an exception.
> 2. If one is on-heap and the other off-heap, segmentOffset() throws an
> exception.
> 3. If they are both independently allocated *off-heap*, segmentOffset()
> *does not* throw an exception!
>
> If you had the method *isSameBaseResource(MemorySegment other)*:
>
> 1. would return false
> 2. would return false
> 3. would return true (since the segmentOffset works in this case).
> 4. Also, if both segments are descendants of a common ancestor
> segment, it would return true
>
> This would make handling of moving data between segments so much more
> straightforward.
>
> I revise my request to just add the first method:
>
> - MemorySegment::boolean isSameBaseResource(MemorySegment other);
> The intent is to reveal if *this* segment and the *other* segment
> share a common ancestor segment.
> ["It could also be extended to determine if the two segments share the
> same memory-mapped file (a true resource), thus possibly removing the
> caveat in paragraph 2 above". -- this may not be possible ]
>
> Now whether this removes your paragraph 2 caveat (at the top), I'm not
> sure. Perhaps the caveat is because memory regions of a memory-mapped file
> can be swapped out at any time, making any assumptions about sub-regions
> and offsets rather meaningless? Are there other reasons?
>
> Lee.
>
> On Wed, Jun 16, 2021 at 1:43 AM Maurizio Cimadamore <
> maurizio.cimadamore at oracle.com> wrote:
>
>> I see what you mean.
>>
>> I wonder if this use case isn't already partially covered by
>> MemoryAddress::segmentOffset.
>>
>> E.g. can you do:
>>
>> long otherOffset = segment.address().segmentOffset(otherSegment);
>>
>> Then it should be easy to check if the offset is within the bounds of
>> "otherSegment" ?
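
A rough sketch of that bounds check, for illustration. It models the offset
arithmetic with plain longs rather than calling the incubator API, so the
class and method names (OffsetCheck, offsetWithin, startsInside) are
hypothetical, and it assumes both addresses live in the same address space.

```java
// Sketch only: models the offset-then-bounds-check idea with plain longs.
// OffsetCheck, offsetWithin and startsInside are hypothetical names.
final class OffsetCheck {

    /** Offset of an address relative to the base address of another segment. */
    static long offsetWithin(long address, long otherBase) {
        return address - otherBase;
    }

    /** True if the address falls inside [otherBase, otherBase + otherByteSize). */
    static boolean startsInside(long address, long otherBase, long otherByteSize) {
        long offset = offsetWithin(address, otherBase);
        return offset >= 0 && offset < otherByteSize;
    }
}
```

Note that this only tells whether one segment starts inside the other; to
detect every overlap the check would have to be run in both directions.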
>>
>> (Note that the method already throws if you try to compare addresses and
>> segments that are mismatched - e.g. on-heap vs. off-heap).
>>
>> Not saying a more direct API is ruled out, just pointing out what we
>> have to see if it can be used.
>>
>> Maurizio
>>
>>
>> On 16/06/2021 02:37, leerho wrote:
>> > In working on https://github.com/openjdk/panama-foreign/pull/555, which is
>> > the PR for Memory Segment Efficient Array Handling, I discovered that there
>> > are two methods that would be very useful, not only for copying arrays but
>> > also for other types of data movement operations between MemorySegments.
>> >
>> > I'd like to draw attention to the opening Javadoc of the
>> > *MemorySegment::copyFrom(MemorySegment)* method:
>> >
>> >> 1. Performs a bulk copy from given source segment to this segment. More
>> >> specifically, the bytes at offset 0 through src.byteSize() - 1 in the
>> >> source segment are copied into this segment at offset 0 through
>> >> src.byteSize() - 1. If the source segment overlaps with this segment,
>> >> then the copying is performed as if the bytes at offset 0 through
>> >> src.byteSize() - 1 in the source segment were first copied into a
>> >> temporary segment with size bytes, and then the contents of the temporary
>> >> segment were copied into this segment at offset 0 through
>> >> src.byteSize() - 1.
>> >>
>> >> 2. The result of a bulk copy is unspecified if, in the uncommon case, the
>> >> source segment and this segment do not overlap, but refer to overlapping
>> >> regions of the same backing storage using different addresses. For
>> >> example, this may occur if the same file is mapped to two segments.
>> >>
>> > The first paragraph is a guarantee that even if two descendant segments
>> > share an overlapping region of a parent segment, the copy operation
>> > will work properly. This is similar to the guarantee of
>> > System.arraycopy().
>> >
>> > The second paragraph refers to memory-mapped files. However, let's examine
>> > the following scenario:
>> >
>> > - A hierarchy of Memory Segments where two descendant segments may
>> > overlap a region of the parent segment.
>> > - The elements of the segments are more complex than Java primitives
>> > (thus, PR 555 doesn't apply).
>> > - The user wishes to copy a region of elements from one of the
>> > descendant segments to the other descendant segment.
>> > - The user only has the two descendant segments in hand and does not
>> > have access to the parent segment.
>> >
>> > With the current MemorySegment API, the descendant segments are blind to
>> > the overlap, to wit:
>> >
>> > - The user cannot determine whether an overlap exists.
>> > - Nor, if an overlap exists, where it lies with respect to the two
>> > segments in hand.
>> >
>> > In order to ensure that corruption doesn't occur during the copy, the user
>> > must create a temporary duplicate of the destination segment, copy the data
>> > into the duplicate, then copy the duplicate into the original destination
>> > segment. This can be expensive in time and space.
>> >
>> > If, however, the user can determine that an overlap exists, and where the
>> > overlap occurs, the copy operation can be done safely, with no additional
>> > storage, by properly choosing the direction of the iterative copy.
>> >
>> > To solve this, the user doesn't need access to the parent segment (this
>> > could be for security reasons), but could use these two methods:
>> >
>> > - MemorySegment::boolean isSameBaseResource(MemorySegment other);
>> > The intent is to reveal if *this* segment and the *other* segment share
>> > a common ancestor segment. It could also be extended to determine if the
>> > two segments share the same memory-mapped file (a true resource), thus
>> > possibly removing the caveat in paragraph 2 above.
>> >
>> >
>> > - MemorySegment::long baseResourceOffsetBytes();
>> > This would return the offset in bytes of the start of this segment from
>> > the start of the highest common segment (or resource).
>> >
>> > With this information, the user can easily design a safe, efficient, and
>> > fast data copy method for moving arbitrary elements from one segment to
>> > another with the same guarantee as System.arraycopy().
>> >
>> > *Evidence*
>> > See copySwap(...)
>> > <https://github.com/leerho/PanamaLocal/blob/main/src/main/java/org/apache/datasketches/panama/MemoryCopy.java#L667-L703>.
>> > Before I had access to the new MemorySegment::void copyFrom(MemorySegment,
>> > MemoryLayout, MemoryLayout), I had to design a proxy routine that would do
>> > the copy (with swap) correctly, especially in the case where the two
>> > segments overlapped. Note lines 682, 683 where I create a temporary
>> > segment. If I had the above two methods, this extra copy operation would
>> > not be needed.
>> >
>> > For exactly the above reasons, some years ago we implemented similar
>> > methods in our DataSketches Memory Component
>> > <https://datasketches.apache.org/api/memory/snapshot/apidocs/index.html>.
>> > Specifically, see the methods *getRegionOffset()* and
>> > *isSameResource(that)* in the class *WritableMemory*.
>> >
>> > Lee.
>>
>