[foreign] "implicit" conversions in the memory access API

Maurizio Cimadamore maurizio.cimadamore at oracle.com
Thu Mar 25 17:54:40 UTC 2021


Hi,
I'd like to call attention to an aspect of the memory access API 
which, I believe, requires a bit more thinking.

We have two common conversions that code often wants to apply, one old, 
one brand new:

1. from segment to address (e.g. segment.address())
2. from resource scope to allocator (e.g. SegmentAllocator.scoped(scope))

Last year, we decided that (1) was so common that it "deserved" an 
interface, namely Addressable, which is implemented both by 
MemorySegment and MemoryAddress, and can be used to pass one or the 
other where an address is expected.

Similarly, a few weeks ago, we decided to add overloads to address (2) 
after seeing how common it is in native interop code (e.g. allocate a 
scoped C string and pass to a native function).

There is a discrepancy here, in that we decided to solve the 
"conversion" problems in two different ways. Now, I think the overload 
strategy we picked for (2) just doesn't scale to (1). Imagine a native 
function; such a function can take 1, 2, 5, 10 ... different structs 
(hence memory segments). Should we add 2 overloads for each segment 
argument? That quickly turns into a combinatorial explosion of 
overloads, which is why the Addressable interface approach was considered.
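
To make the scaling argument concrete, here is a minimal sketch. It does not use the real API: Addressable, Address and Segment below are simplified, hypothetical stand-ins, and the count assumes that each pointer-like parameter would have to be offered both as a segment and as an address.

```java
public class AddressableSketch {
    /** Hypothetical stand-in for the common interface: anything that can yield an address. */
    interface Addressable {
        Address address();
    }

    /** Hypothetical stand-in for MemoryAddress: a raw pointer value. */
    static final class Address implements Addressable {
        final long raw;
        Address(long raw) { this.raw = raw; }
        @Override public Address address() { return this; }
    }

    /** Hypothetical stand-in for MemorySegment: a base address plus a size. */
    static final class Segment implements Addressable {
        final long base;
        final long size;
        Segment(long base, long size) { this.base = base; this.size = size; }
        @Override public Address address() { return new Address(base); }
    }

    /**
     * Models a native call taking two pointers. One signature accepts any mix
     * of segments and addresses; each argument is projected to its raw address.
     */
    static long call(Addressable a, Addressable b) {
        return a.address().raw + b.address().raw;
    }

    /**
     * Overloads needed without a common interface, assuming each of the n
     * pointer parameters must be accepted both as Segment and as Address.
     */
    static long overloadsNeeded(int n) {
        return 1L << n;
    }

    public static void main(String[] args) {
        // A segment and an address passed to the same signature.
        System.out.println(call(new Segment(16, 8), new Address(4)));
        // Overload count for a function with 10 pointer parameters.
        System.out.println(overloadsNeeded(10));
    }
}
```

With the interface, the function keeps a single signature no matter how many pointer arguments it takes; under the stated assumption, the overload-based alternative doubles with each added argument.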

One drawback of Addressable is that, now that we have a more powerful 
API, where implicit deallocation with Cleaner is more front and center, 
the implicit conversion (via the interface) also raises some questions: 
sometimes the user could pass a segment where an Addressable is 
expected, w/o thinking that the segment will be thrown away by the 
linker (which will just project the segment into a long value - its 
address). In other words, there can be cases where passing a segment to 
a native function expecting a MemoryAddress can result in premature 
deallocation - and the Addressable interface makes this conversion 
"implicit", hiding it from the user code, which might be an issue. A 
similar issue is present for VaList, which also implements Addressable.

Stepping back from the specifics, to accommodate the need for these 
conversions from S to T, I think there is only a limited number of 
options available to us (using the language we have):

A) do nothing: that is, user has to manually convert S into T by calling 
a well-known method
B) like (A) but provide usability overloads
C) make S <: T. This means I can use an S where a T is expected.
D) make S and T implement some common interface U, and then use U where 
both S and T could be passed
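
The four options can be sketched side by side in Java. Everything in this sketch is hypothetical: S, T, U and the SC/SD/TD variants are placeholder types invented for illustration, and the useX methods stand for any API method that wants a T.

```java
public class ConversionOptions {
    /** T: the type the API expects (hypothetical stand-in). */
    interface T {
        String describe();
    }

    /** S: the type the caller holds; convertible to T via a well-known method. */
    static final class S {
        T toT() { return () -> "T converted from S"; }
    }

    // (A) Only T is accepted; the caller has to write s.toT() explicitly.
    static String useA(T t) { return t.describe(); }

    // (B) Same as (A), plus a usability overload that converts on the caller's behalf.
    static String useB(T t) { return t.describe(); }
    static String useB(S s) { return useB(s.toT()); }

    // (C) S <: T: the subtype SC can be passed wherever a T is expected.
    static final class SC implements T {
        @Override public String describe() { return "SC used as T"; }
    }
    static String useC(T t) { return t.describe(); }

    // (D) S and T both implement a common interface U; the API is written against U.
    interface U {
        String describe();
    }
    static final class SD implements U {
        @Override public String describe() { return "SD via U"; }
    }
    static final class TD implements U {
        @Override public String describe() { return "TD via U"; }
    }
    static String useD(U u) { return u.describe(); }

    public static void main(String[] args) {
        S s = new S();
        System.out.println(useA(s.toT()));  // (A) explicit conversion at the call site
        System.out.println(useB(s));        // (B) conversion hidden in the overload
        System.out.println(useC(new SC())); // (C) conversion via subtyping
        System.out.println(useD(new SD())); // (D) conversion via the shared interface
        System.out.println(useD(new TD()));
    }
}
```

Only (A) keeps the conversion visible at the call site; in (B), (C) and (D) it happens without the caller writing it, which is the "implicit" behavior discussed above.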

So, after having discussed these options abstractly, let's go back to 
how we move forward; I think these are some possible options:

* we do nothing and leave the API as is (1=D, 2=B)
* we go back, remove the Addressable interface (1=A, 2=B)
* as before, but we also remove the ResourceScope overloads - no magic 
conversions, everything is always explicit (1=A, 2=A)
* we see if there's an interface name we like for ResourceScope -> 
SegmentAllocator, and remove the overloads, which would then be no 
longer necessary (1=D, 2=D)
* make ResourceScope <: SegmentAllocator, but leave Addressable (1=D, 2=C)

Of course we could do (A) for both 1/2, by removing extra overloads and 
also removing the Addressable interface; while the result would be 
consistent, I'm not sure we would have improved the API. After all the 
conversions were precisely added to address usability issues.

More broadly, I think consistency should never become a goal in its 
own right. As such, "less is better" options don't seem to be particularly 
helpful here. What I'm more worried about is that, in the current state, 
users will have to write 2 overloads of any segment-producing function, 
one accepting a scope and another accepting an allocator; this is more 
than just jextract or the CLinker API (although initially the cost is 
perceived there) - I believe this split will quickly spread to user 
code too.

This leaves us, I think (and assuming we want to do anything at all), 
with the last two bullets; if we go for a common super-interface, the 
main issue is what name to give it. More specifically, the same 
XYZ-able trick which works with Addressable, doesn't work with 
SegmentAllocator - "Allocatable", really? At this point I think 
Supplier<SegmentAllocator> would probably be a better choice and/or 
convey more meaning - although it might be hard to tell, just by looking at the 
javadoc, how the user is supposed to obtain such a supplier from the API 
(a new named interface has an advantage, in that the user can click on 
its javadoc and look for all the implementations there).

If we go for a direct subtyping relationship (e.g. ResourceScope <: 
SegmentAllocator), the issue is what to do with the allocation routines 
in MemorySegment (e.g. MemorySegment.allocateNative(... , scope) - which 
seems less useful).
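
As a rough illustration of the direct-subtyping option, the sketch below models a scope that is itself an allocator, so a segment-producing function needs only one signature. Allocator and Scope are simplified, hypothetical stand-ins (not the real SegmentAllocator and ResourceScope), "segments" are modelled as plain strings, and toCString is an invented helper that assumes ASCII input.

```java
public class ScopeAsAllocator {
    /** Hypothetical stand-in for SegmentAllocator; a "segment" is just a label here. */
    interface Allocator {
        String allocate(long bytes);
    }

    /** Hypothetical stand-in for ResourceScope under option (C): the scope is an allocator. */
    static final class Scope implements Allocator {
        private final String name;
        private long allocated;

        Scope(String name) { this.name = name; }

        @Override public String allocate(long bytes) {
            allocated += bytes; // track what this scope would have to release
            return name + ":" + bytes;
        }

        long allocatedBytes() { return allocated; }
    }

    /**
     * A segment-producing function with a single signature. It accepts a plain
     * allocator or a scope, so no (..., scope) overload is needed.
     */
    static String toCString(String s, Allocator allocator) {
        return allocator.allocate(s.length() + 1); // +1 for the NUL terminator
    }

    public static void main(String[] args) {
        Scope scope = new Scope("scope");
        Allocator arena = bytes -> "arena:" + bytes;

        System.out.println(toCString("hello", scope)); // scope passed directly
        System.out.println(toCString("hi", arena));    // ordinary allocator
        System.out.println(scope.allocatedBytes());
    }
}
```

The sketch only shows the call-site effect of the subtyping relationship; it says nothing about what should happen to the existing scope-accepting factories on MemorySegment.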

No matter whether we use common interface or direct subtyping, in both 
cases ResourceScope seems to become less of an independent 
abstraction, and more geared towards supporting the Foreign API (which 
isn't necessarily a bad thing, just noting).

Thoughts?

Maurizio




More information about the panama-dev mailing list