Arena/Segment allocator and zero initialized MemorySegment

Wed Jun 5 17:51:44 UTC 2024

On 05/06/2024 18:25, Pedro Lamarão wrote:

> Also, as this discussion suggests, what appears to be a universal 
> interface may actually not be universal at all, and perhaps it is best 
> to have libraries define their own allocator interfaces and see what 
> emerges from that.

This is the thing we keep (re)learning when thinking about allocators in 
FFM. It is tempting to hardwire something to cater to specific needs, but 
in reality specific libraries will work with a very specific set of 
constraints, so our best bet is to make sure such things can be built 
on top.

Now, this is orthogonal to the original question, which was about 
advertising the zeroing properties of an allocator. What we’re seeing there 
is the (unavoidable) consequence of the (unavoidable) decision of not 
exposing (by default, at least) non-zeroing memory allocation 
primitives. E.g. it is easy to write a zeroing allocator if you have a 
non-zeroing one (as the latter is clearly more primitive than the 
former). But it wouldn’t have been safe to expose non-zeroing allocators 
in the front door allocation API. This set of constraints has led us to 
SegmentAllocator being the way it is today, where we make no assumption 
on what the “correct” behavior for SegmentAllocator::allocate should be. 
For Arenas that are provided by the JDK, the answer has to be “zeroing”. 
But users are (and should be!) able to define allocators/arenas that do 
not zero, if they have compelling use cases where zeroing would add too 
much overhead.
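
To make the "zeroing on top of non-zeroing" direction concrete, here is a minimal sketch, assuming JDK 22 or later. The class name `ZeroingDemo` and the helper `zeroing` are made up for illustration. It uses `SegmentAllocator.slicingAllocator` over a deliberately dirtied segment as a stand-in for a non-zeroing allocator, since slices are handed out with whatever contents the backing segment already has.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;
import static java.lang.foreign.ValueLayout.JAVA_BYTE;

final class ZeroingDemo {
    // Hypothetical helper: wraps any allocator so that every segment it
    // returns is explicitly zeroed with MemorySegment::fill.
    static SegmentAllocator zeroing(SegmentAllocator raw) {
        return (byteSize, byteAlignment) ->
                raw.allocate(byteSize, byteAlignment).fill((byte) 0);
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            // Dirty the backing memory so stale contents are visible.
            MemorySegment backing = arena.allocate(64).fill((byte) 0x7F);
            SegmentAllocator raw = SegmentAllocator.slicingAllocator(backing);

            MemorySegment dirty = raw.allocate(16, 1);          // stale bytes
            MemorySegment clean = zeroing(raw).allocate(16, 1); // zeroed bytes

            System.out.println("raw slice, first byte:    " + dirty.get(JAVA_BYTE, 0));
            System.out.println("zeroed slice, first byte: " + clean.get(JAVA_BYTE, 0));
        }
    }
}
```

Going the other way (getting a non-zeroing allocator out of a zeroing one) is not possible, which is why the non-zeroing variant is the more primitive of the two.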

Should we have an extra method in SegmentAllocator which states what’s 
the zeroing policy for the allocator?

It’s certainly cheap to add, but:

  * users will forget to override that (esp. given SegmentAllocator is a
    functional interface)
  * it only addresses /one/ dimension in which a given allocator is
    special - are there others? (e.g. disjointness of returned segments,
    etc)
  * another approach could be for clients that do want guaranteed
    zeroing to always call MS::fill, and rely on C2 to eliminate the
    redundant memset (after all, Unsafe::setMemory is an intrinsic).
    This is hard to achieve, but would probably lead to a better
    long-term path, where we can stop worrying about duplicate zeroing
    (as happens for on-heap arrays). At that point, having the predicate
    on |SegmentAllocator| wouldn’t add much.
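
The first point in the list above can be sketched as follows, assuming JDK 22 or later. Everything named here (`PolicyAwareAllocator`, `isZeroing`, `PolicyDemo`) is hypothetical and not part of the JDK; a sub-interface is used only because the real `SegmentAllocator` cannot be modified in an example. The sketch shows that an allocator written as a lambda silently inherits the default answer, whatever the underlying memory actually looks like.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;

final class PolicyDemo {
    // Hypothetical interface, NOT part of the JDK: an allocator that
    // advertises whether the segments it returns are zeroed.
    interface PolicyAwareAllocator extends SegmentAllocator {
        default boolean isZeroing() { return false; } // conservative default
    }

    // A client that trusts the flag and only zeroes when it has to.
    static MemorySegment allocateZeroed(PolicyAwareAllocator allocator, long size) {
        MemorySegment segment = allocator.allocate(size);
        return allocator.isZeroing() ? segment : segment.fill((byte) 0);
    }

    // A lambda cannot override the default method, so the flag says
    // "not zeroing" even though a JDK arena does zero its segments.
    static boolean lambdaReportsZeroing() {
        try (Arena arena = Arena.ofConfined()) {
            PolicyAwareAllocator viaLambda =
                    (byteSize, byteAlignment) -> arena.allocate(byteSize, byteAlignment);
            return viaLambda.isZeroing();
        }
    }

    public static void main(String[] args) {
        System.out.println("lambda reports zeroing: " + lambdaReportsZeroing());
    }
}
```

With a conservative default the result is still correct, but the client ends up zeroing twice, which is the cost the flag was meant to avoid. With an optimistic default the same mistake would expose stale memory.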

For all these reasons, I’m slightly against adding new methods to 
SegmentAllocator. Of course, if this request turned out to be a very 
common one, I could reconsider.

Last, note that there’s a workaround, even though it might not look too 
pretty. Let’s say you know that you only ever allocate 1024 bytes at most. 
Then you can do something like this:

    MemorySegment allocateAndZero(SegmentAllocator allocator, long size) {
        class Holder {
            static MemorySegment ZERO = MemorySegment.ofArray(new byte[1024]);
        }
        // todo: check that `size <= 1024`
        return allocator.allocateFrom(JAVA_BYTE, Holder.ZERO, JAVA_BYTE, 0, size);
    }

Note that this uses |allocateFrom| to allocate a memory segment of the given 
size, and then bulk copies (a part of) the “zero segment” on top of it. 
This won’t perform double zeroing, because 
|SegmentAllocator::allocateFrom| also tries to avoid that. The only 
difference is that we’re using “Unsafe::copyMemory” instead of 
“Unsafe::setMemory” to zero the memory (but both should perform roughly 
the same, as they are both intrinsics that support vectorized 
instructions). It’s not pretty, but maybe it could be OK while we wait for 
deeper “duplicate zero memset” avoidance?
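
For anyone who wants to try the workaround, here is a self-contained sketch, assuming JDK 22 or later. The class name `ZeroCopyDemo`, the `MAX` constant, the bounds check and the `isAllZero` helper are additions for illustration. A slicing allocator over a dirtied segment again stands in for a custom non-zeroing allocator.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;
import static java.lang.foreign.ValueLayout.JAVA_BYTE;

final class ZeroCopyDemo {
    static final long MAX = 1024;
    // Shared on-heap "zero segment": a fresh byte[] is always zeroed.
    static final MemorySegment ZERO = MemorySegment.ofArray(new byte[(int) MAX]);

    // Allocates `size` bytes and initializes them by copying from ZERO.
    static MemorySegment allocateAndZero(SegmentAllocator allocator, long size) {
        if (size < 0 || size > MAX) {
            throw new IllegalArgumentException("size must be in [0, " + MAX + "]: " + size);
        }
        return allocator.allocateFrom(JAVA_BYTE, ZERO, JAVA_BYTE, 0, size);
    }

    // Simple byte-by-byte check, only meant for the demonstration.
    static boolean isAllZero(MemorySegment segment) {
        for (long i = 0; i < segment.byteSize(); i++) {
            if (segment.get(JAVA_BYTE, i) != 0) return false;
        }
        return true;
    }

    public static void main(String[] args) {
        try (Arena arena = Arena.ofConfined()) {
            // Backing memory filled with 0xFF, so missing zeroing would show.
            MemorySegment backing = arena.allocate(256).fill((byte) 0xFF);
            SegmentAllocator nonZeroing = SegmentAllocator.slicingAllocator(backing);

            MemorySegment segment = allocateAndZero(nonZeroing, 100);
            System.out.println(segment.byteSize() + " bytes, all zero: " + isAllZero(segment));
        }
    }
}
```

The fixed upper bound is the main limitation: the zero segment has to be at least as large as the biggest request, so this only fits cases where that maximum is known up front.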

Maurizio
