[foreign-memaccess] API refinements

Mon Jan 6 15:25:37 UTC 2020

On 18/12/2019 19:41, Maurizio Cimadamore wrote:
> Hi,
> during the code review for the foreign memory access API (see [1]), 
> several issues were raised; these are now tracked in [2].
>
> Before doing any code change, I'd like to get some shared consensus on 
> the direction we should take. Here is a list of the issues:
>
> * Should MemoryAddress implement Comparable?
>
> I believe the answer here should be NO. In general comparing addresses 
> is tricky - and it seems to me that, given that MemoryAddress has a 
> well-defined equals() behavior, you can still implement e.g. loops 
> where you want to keep offsetting an address until you reach a certain 
> limit.
>
> As to whether MemorySegment should implement Comparable, I believe 
> answer should also be NO. I know that ByteBuffer implements 
> Comparable, and it has method to compare the contents, and to find the 
> first index at which two byte buffers mismatch. But I think this is a 
> very marginal use case for the memory access API, and there's always 
> an escape hatch: project segments down to a byte buffer and take it 
> from there.
MemoryAddresses that point into the same segment could be ordered, but 
addresses pointing to different segments can not if they are heap 
segments. Making addresses Comparable seems to open up a corner case 
where the comparison would throw an exception, while the benefits don't 
seem all that clear to me.
> * should we merge MemoryLayouts and MemoryLayout?
>
> I'm on the fence on this one. The fact that we have a MemoryLayouts is 
> very much driven by the fact that the API lives in an incubating 
> module, and that some layouts cannot live where they should (e.g. 
> JAVA_INT should really be Integer::LAYOUT). Since we also have a 
> MemoryHandles (which will be replaced by some methods inside 
> MethodHandles), I really don't see a big reason as to why we should 
> refactor the API at this point. E.g. this seems to be more a question 
> that we should ask again when the API is finalized, and exits incubation.
Agree.
> * MemorySegment::acquire() -- replace with MemorySegment::fork?
>
> I came up with that - and I think I still like it. Acquire seems to 
> suggest that it has something to do with threads and ownership - which 
> it does, but in reality you can acquire a segment even from same 
> owner. Fork seems a more neutral term, and, to me helps better 
> visualizing what's really going on (e.g. a tree of segments is being 
> created).
I don't really see that much difference between these two. FWIW, 
acquire() sounds like you're acquiring access to the memory resource, 
which intuitively has to be released again before the resource can be 
cleaned up (kinda like when acquiring a lock, you're locking out 
somebody else, in this case from cleaning up the resource).
> * should we have a way to ask a Layout if its size is specified ?
>
> This is a tricky one. On the one hand, if not all layouts have a size, 
> one could argue that Layout::bitSize should return an optional. But 
> that seems excessively punitive for the vast majority of layouts that 
> are never going to use unbound sequences. This might leads down to two 
> conclusions:
>
> - maybe unbound sequences do not carry enough weight - perhaps replace 
> them with a zero-length sequence, and be done - or even get rid of 
> zero-length sequences and let layout only be for things whose size is 
> defined. (and maybe come back with a better mechanism to speak about 
> "size holes" later, and variable-length layouts in general)
>
> - if we keep unbound sequences, let's just add an hasSize() predicate
>
> Of the two I'm warming up with just ditch unbound sequences period 
> (e.g. all layouts must have a well-defined size).

If we have a general map function for layouts where we can replace part 
of the layout, maybe a more general mechanism for layout holes could be 
something reminiscent of the Unresolved layout type on the foreign 
branch; some placeholder layout that can be referenced by name (and thus 
easily replaced using map). Then unbounded sequences could be expressed 
using some wrapper object, which wraps a layout with a layout hole for 
the sequence, and has some withSize(long) method to plug in the layout 
hole with a bounded sequence.

Though, I don't think that actually solves the pain points with 
unbounded sequences, since such a layout hole doesn't have a size 
either, although it could maybe be a more general solution.

> * There doesn't seem to be a predicate for "can I call close without 
> an exception?" (isAlive && isAccessible) isn't sufficient.
>
> I think the answer here is - and that's FINE. I'm of the opinion that 
> if an exception occurs when calling close(), 90% it signals a bug in 
> the code, not just something that you want to proactively check 
> against. E.g. somebody else forked your segment w/o you knowing. On 
> top of that, I don't think that, even if we wanted to, there would be 
> a way to do a 'canClose()' check which would mean anything - that is, 
> the predicate could return true, but then, what if, just before you 
> call close() another thread re-acquires the segment? I think this is a 
> lost cause.
Agreed. Calling close() redundantly seems like a bug. For instance, 
there is no 'canClose()'  on java.util.stream.Stream either. Worst case 
the user can catch the resulting exception.
> * MemoryAddress::offset() vs. MemoryAddress::offset(long) -- not much 
> distance between these two semantically different operations
>
> I think the solution that I like here is to drop 
> MemoryAddress::offset() completely, and to provide a new functionality 
> to create a slice based on a given address (which is the primary 
> motivation for having the offset() method). Here we have two options:
>
> - MemoryAddress::asSlice(long bytes) -> MemorySegment
> - MemorySegment::asSlice(MemoryAddress base, long bytes)
>
> The former leads to more compact code, but it feels a bit dirtier - 
> anyway, I don't feel too strongly here.
I vote for the latter, since it matches the API for creating unchecked 
segments more.
> * MemorySegment::isAccessible() -- as the A* word is overloaded, some 
> other name should be found?
>
> Yes, the A-word is a mess - but some of the proposals I've see (e.g. 
> "isOwnedByCurrentThread") look equally terrible. If somebody has a 
> better alternative I'd be happy to hear it - if not, I think keeping 
> the A-word is tolerable.

Maybe this can be replaced with a `Thread owner()` accessor, which would 
serve a more general purpose?

Jorn

>
> Thoughts?
>
> Maurizio
>
> [1] - 
> https://mail.openjdk.java.net/pipermail/core-libs-dev/2019-December/063836.html
> [2] - https://bugs.openjdk.java.net/browse/JDK-8235837
>