[foreign-memaccess] musing on the memory access API

Uwe Schindler uschindler at apache.org
Mon Jan 4 23:53:29 UTC 2021


Hi Maurizio,

> Thanks for the feedback Uwe, and for the bug reports. We'll do our best
> to address some of them quickly (the NPE and the error in
> Unmapper::address). As for adding an overload for mapping a segment from
> a FileChannel I'm totally open to it, but I think it's late-ish now to
> add API changes, since we are in stabilization.

Hi, this was only a suggestion to improve the whole thing. My idea is rather to wait with this until a closer integration into the FileSystem API is done. The main issue we had was that we can only pass a path from the default file system provider (I have a workaround for that: during our test suite we "unwrap" all the layers on top). But properly, the FileSystem implementation should provide the way to get a MemorySegment from a FileChannel; the current cast to the internal class is ... hacky! I know why it is like that (the API is still incubating and not part of java.base, so the FileSystem interface in java.base can't return a MemorySegment). But when Panama graduates, the file system integration is a must: FileChannel should be extended by one "default" method throwing UnsupportedOperationException, only implemented by the default provider: "MemorySegment FileChannel.mapSegment(long offset, long size, MapMode mode)". See the sketch below.
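
A minimal sketch of that proposal (mapSegment is hypothetical and does not exist in the JDK; shown as a standalone interface because FileChannel is an abstract class with many members; the import assumes the JDK 16 incubator package):

    import java.nio.channels.FileChannel.MapMode;
    import jdk.incubator.foreign.MemorySegment;

    public interface SegmentMappable {
        // Non-default providers inherit this UOE-throwing default; only the
        // default file system provider would override it and return a segment.
        default MemorySegment mapSegment(long offset, long size, MapMode mode) {
            throw new UnsupportedOperationException(
                "mapping to a MemorySegment is not supported by this provider");
        }
    }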

> Also, thanks for the thoughts on the API in general - I kind of expected
> (given our discussions) that shared segments were 90% of what you needed
> - and that you are not much interested in using confinement. I agree
> that, when working from that angle, the API looks mostly ok. But not all
> clients have same requirements and some would like to take advantage of
> confinement more - also, note that if we just drop support for confined
> segments (which is something we also thought about) and just offered
> shared access, _all_ clients will be stuck with a very slow close()
> operation.

Hi, yes, I agree. I just said: switching between those modes is unlikely, but a confined default for short-lived segments is correct, and shared for long-lived ones (this is also the usage pattern: something that lives very long is very likely also used by many threads, like a database file or some database off-heap cache). Allocated memory used in Netty is of course often short-lived, but in most cases it is not really used concurrently (or you can avoid that).

I'd give the user the option at construction time, but not allow changing it later.
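
A hypothetical shape for that (the OwnershipMode parameter is made up; in the current incubator API you allocate a confined segment and call share() afterwards, which is exactly the transition I would drop):

    import jdk.incubator.foreign.MemorySegment;

    class OwnershipAtConstruction {
        static void example() {
            // Today (JDK 16 incubator): segments start confined to the
            // allocating thread and may be switched to shared later.
            MemorySegment confined = MemorySegment.allocateNative(1024);
            MemorySegment shared = MemorySegment.allocateNative(1024).share();

            // Proposed (hypothetical, not a real API): pick the mode once
            // at allocation, with no later transition:
            // MemorySegment s = MemorySegment.allocateNative(1024, OwnershipMode.SHARED);
        }
    }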

> There are very different ways to use a memory segment; sometimes (as in
> your case) a memory segment is long-lived, and you don't care if closing
> it takes 1 us. But there are other cases where segments are created (and
> disposed) more frequently. To me, the interesting fact that emerged from
> the Netty experiment (thanks guys!) was that using handoff AND shared
> segment, while nice on paper it's not going to work performance-wise,
> because you need to do an expensive close at each hand-off. This might
> be rectified, for instance by making the API more complex, and have a
> state where a segment has no owner (e.g. so that instead of confined(A)
> -> shared -> confined(B) you do confined(A) -> detached -> confined(B)
> ), but the risk is that to add a lot of API complexity ("detached" is a
> brand new segment state in which the segment is not accessible, but
> where memory is not yet deallocated) for what might be perceived as a
> corner case.
> So, the big question here is - given that there are defo different modes
> to interact with this API (short lived vs. long lived segment), what API
> allows us to capture the use cases we want in the simplest way possible?
> While dynamic ownership changes look like a cool idea on paper, it also
> add complexity - so I think now it's the right time to ask ourself if we
> should scale back on that a bit and have a more "static" set of flavors
> to pick from (e.g. { confined, shared } x { explicit, cleaner }
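
To make the hand-off shapes discussed above concrete, a rough sketch against the current incubator API as I understand it (each ownership change invalidates the old segment and returns a new one):

    import jdk.incubator.foreign.MemorySegment;

    class HandoffShapes {
        // confined(A) -> shared -> confined(B): the second step has to close
        // the shared segment, which needs the expensive thread-local handshake.
        static MemorySegment viaShared(MemorySegment confinedOnA, Thread b) {
            return confinedOnA.share().handoff(b);
        }
        // The proposed alternative, confined(A) -> detached -> confined(B),
        // would skip that handshake; no such "detached" state exists today.
    }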

I think, when "allocating" a segment (by reserving memory, mapping a file, or supplying some external MemoryAddress and length), you should set confined or shared from the beginning, with no possibility to change it later. This would indeed simplify many things. I got new benchmarks a minute ago from my Lucene colleagues: the current MemorySegment API seems 40% slower than ByteBuffer for some use cases, but equal in speed or faster for others (I assume it is still the long vs. int index/looping problem; a for loop using a long index is not optimized as well as a for loop with an int index -- correct?). But without diving too deep, it might also come from the fact that memory segments *may* change their state, so HotSpot is not able to do all optimizations.
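
For illustration, the two loop shapes in question (a sketch assuming the JDK 16 jdk.incubator.foreign API; class and method names are mine):

    import jdk.incubator.foreign.MemoryAccess;
    import jdk.incubator.foreign.MemorySegment;

    class LoopShapes {
        // int induction variable: HotSpot treats this as a counted loop and
        // can unroll it and eliminate bounds checks.
        static long sumArray(byte[] a) {
            long sum = 0;
            for (int i = 0; i < a.length; i++) {
                sum += a[i];
            }
            return sum;
        }

        // long induction variable: fewer of these loop optimizations apply.
        static long sumSegment(MemorySegment seg) {
            long sum = 0;
            for (long i = 0, n = seg.byteSize(); i < n; i++) {
                sum += MemoryAccess.getByteAtOffset(seg, i);
            }
            return sum;
        }
    }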

> Cheers
> Maurizio

P.S.: From my first analysis: access using the long index seems slower than MappedByteBuffer with the int position/index. Memory copies of byte blocks between a memory segment and a heap array seem fast, although I would have expected overhead from constructing the views/slices on the byte[] and long[] on heap just to call copyFrom(): <https://github.com/apache/lucene-solr/blob/50d93004ae895b00149985ab32c644633c983ec6/lucene/core/src/java/org/apache/lucene/store/MemorySegmentIndexInput.java#L171-L175>; I will keep you informed about what I find out and why (and when) it is slower than MappedByteBuffer.
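
For reference, the copy pattern in the linked code boils down to roughly this (a sketch assuming the JDK 16 incubator API; the method name is mine):

    import jdk.incubator.foreign.MemorySegment;

    class CopySketch {
        // Bulk-copies len bytes starting at offset from a (mapped) segment
        // into a fresh heap array, via a segment view over the array.
        static byte[] readBytes(MemorySegment mapped, long offset, int len) {
            byte[] dst = new byte[len];
            MemorySegment.ofArray(dst)
                         .copyFrom(mapped.asSlice(offset, len));
            return dst;
        }
    }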


