[foreign] "implicit" conversions in the memory access API
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Thu Mar 25 17:54:40 UTC 2021
Hi,
I'd like to call attention to an aspect of the memory access API
which, I believe, requires a bit more thinking.
We have two common conversions that code often wants to apply, one old,
one brand new:
1. from segment to address (e.g. segment.address())
2. from resource scope to allocator (e.g. SegmentAllocator.scoped(scope))
Last year, we decided that (1) was so common that it "deserved" an
interface, namely Addressable, which is implemented both by
MemorySegment and MemoryAddress, and can be used to pass one or the
other where an address is expected.
Similarly, a few weeks ago, we decided to add overloads to address (2)
after seeing how common it is in native interop code (e.g. allocate a
scoped C string and pass to a native function).
There is a discrepancy here, in that we decided to solve the
"conversion" problems in two different ways. Now, I think the overload
strategy we picked for (2) just doesn't scale to (1). Imagine a native
function; such a function can take 1, 2, 5, 10 ... different structs
(hence memory segments). Should we add 2 overloads for each segment
argument? That quickly turns into a combinatorial explosion of
overloads. Which is why the Addressable interface approach was considered.
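To make the overload count concrete, here is a minimal sketch. The
types ToyAddressable, ToyAddress and ToySegment are stand-ins invented
for illustration; they are not the jdk.incubator.foreign classes. With
overloads, a function taking n "segment or address" parameters needs
2^n signatures, while a shared interface needs one:

```java
// Toy stand-ins (assumptions for illustration); NOT the real foreign API types.
interface ToyAddressable { ToyAddress address(); }

final class ToyAddress implements ToyAddressable {
    final long raw;
    ToyAddress(long raw) { this.raw = raw; }
    public ToyAddress address() { return this; }
}

final class ToySegment implements ToyAddressable {
    final long base;
    ToySegment(long base) { this.base = base; }
    public ToyAddress address() { return new ToyAddress(base); }
}

final class OverloadExplosion {
    // Overload strategy: two pointer-like parameters already need 2^2 = 4 signatures.
    static long sum(ToyAddress a, ToyAddress b) { return a.raw + b.raw; }
    static long sum(ToySegment a, ToyAddress b) { return sum(a.address(), b); }
    static long sum(ToyAddress a, ToySegment b) { return sum(a, b.address()); }
    static long sum(ToySegment a, ToySegment b) { return sum(a.address(), b.address()); }

    // Interface strategy: one signature covers every combination.
    static long sumAny(ToyAddressable a, ToyAddressable b) {
        return a.address().raw + b.address().raw;
    }

    // Number of overloads needed when n parameters each accept two types.
    static long overloadsNeeded(int n) { return 1L << n; }

    public static void main(String[] args) {
        ToySegment seg = new ToySegment(16);
        ToyAddress addr = new ToyAddress(4);
        System.out.println(sum(seg, addr));       // picks the (ToySegment, ToyAddress) overload
        System.out.println(sumAny(seg, addr));    // same result through the interface
        System.out.println(overloadsNeeded(10));  // signatures needed for 10 such parameters
    }
}
```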
One drawback of Addressable is that, now that we have a more powerful
API, where implicit deallocation with Cleaner is more front and center,
the implicit conversion (via the interface) also raises some questions:
sometimes the user could pass a segment where an Addressable is
expected, w/o thinking that the segment will be thrown away by the
linker (which will just project the segment into a long value - its
address). In other words, there can be cases where passing a segment to
a native function expecting a MemoryAddress can result in premature
deallocation - and the Addressable interface makes this conversion
"implicit", hiding it from the user code, which might be an issue. A
similar issue is present for VaList, which also implements Addressable.
Stepping back from the specifics, to accommodate the need for these
conversions from S to T, I think there is only a limited number of
options we could pursue (using the language we have):
A) do nothing: that is, user has to manually convert S into T by calling
a well-known method
B) like (A) but provide usability overloads
C) make S <: T. This means I can use an S where a T is expected.
D) make S and T implement some common interface U, and then use U where
both S and T could be passed
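As a rough sketch of what the four options look like at a call site,
consider the following. All names here (Source for S, Target for T,
Convertible for U, SubSource, and the use* methods) are hypothetical
and exist only to show the shape of each option:

```java
// Hypothetical types: Source plays S, Target plays T, Convertible plays U.
interface Convertible { Target asTarget(); }          // the common interface of (D)

class Target implements Convertible {
    final String name;
    Target(String name) { this.name = name; }
    public Target asTarget() { return this; }
}

final class Source implements Convertible {
    final String name;
    Source(String name) { this.name = name; }
    Target toTarget() { return new Target(name); }    // the "well-known method" of (A)
    public Target asTarget() { return toTarget(); }
}

// (C) models S <: T with a separate class, since Source above is not a Target.
final class SubSource extends Target {
    SubSource(String name) { super(name); }
}

final class ConversionOptions {
    // (A) and (C): the API accepts Target only.
    static String useTarget(Target t) { return t.name; }

    // (B): same as (A), plus a usability overload that converts for the caller.
    static String useWithOverload(Target t) { return t.name; }
    static String useWithOverload(Source s) { return useWithOverload(s.toTarget()); }

    // (D): the API accepts the common interface.
    static String useCommon(Convertible c) { return c.asTarget().name; }

    public static void main(String[] args) {
        Source s = new Source("seg");
        System.out.println(useTarget(s.toTarget()));          // (A) explicit conversion
        System.out.println(useWithOverload(s));               // (B) overload converts
        System.out.println(useTarget(new SubSource("seg")));  // (C) subtype passed directly
        System.out.println(useCommon(s));                     // (D) via common interface
    }
}
```

In (A) the conversion is visible at the call site; in (B), (C) and (D)
it happens without the caller writing it.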
So, after having discussed these options abstractly, let's go back to
how we move forward; I think these are some possible options:
* we do nothing and leave the API as is (1=D, 2=B)
* we go back, remove the Addressable interface (1=A, 2=B)
* as before, but we also remove the ResourceScope overloads - no magic
conversions, everything is always explicit (1=A, 2=A)
* we see if there's an interface name we like for ResourceScope ->
SegmentAllocator, and remove the overloads, which would then be no
longer necessary (1=D, 2=D)
* make ResourceScope <: SegmentAllocator, but leave Addressable (1=D, 2=C)
Of course we could do (A) for both 1/2, by removing extra overloads and
also removing the Addressable interface; while the result would be
consistent, I'm not sure we would have improved the API. After all the
conversions were precisely added to address usability issues.
More broadly, I think consistency should never become a goal in its own
right. As such, "less is better" options don't seem to be particularly
helpful here. What I'm more worried about is that, in the current state,
users will have to write 2 overloads of any segment-producing function,
one accepting a scope and another accepting an allocator; this is more
than just jextract or the CLinker API (although initially the cost is
perceived there) - I believe this split will quickly spread to user
code too.
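Here is a small sketch of how that split could show up in user code.
ToyAllocator, ToyScope and UserLibrary are made-up stand-ins (they are
not SegmentAllocator or ResourceScope), and a byte[] plays the role of
a segment:

```java
import java.nio.charset.StandardCharsets;

// Made-up stand-ins for illustration; NOT the real SegmentAllocator/ResourceScope.
interface ToyAllocator { byte[] allocate(int size); }

final class ToyScope {
    int allocations;  // how many blocks were allocated through this scope

    // The explicit scope -> allocator conversion.
    ToyAllocator allocator() {
        return size -> { allocations++; return new byte[size]; };
    }
}

final class UserLibrary {
    // The real work is written against the allocator...
    static byte[] toCString(String s, ToyAllocator allocator) {
        byte[] src = s.getBytes(StandardCharsets.US_ASCII);
        byte[] dst = allocator.allocate(src.length + 1);
        System.arraycopy(src, 0, dst, 0, src.length);
        return dst;  // the final byte stays 0, acting as the NUL terminator
    }

    // ...and a second overload exists only to perform the conversion.
    static byte[] toCString(String s, ToyScope scope) {
        return toCString(s, scope.allocator());
    }

    public static void main(String[] args) {
        ToyScope scope = new ToyScope();
        byte[] viaScope = toCString("hi", scope);
        byte[] viaAllocator = toCString("hi", size -> new byte[size]);
        System.out.println(viaScope.length + " " + viaAllocator.length + " " + scope.allocations);
    }
}
```

Every segment-producing helper a library author writes would need the
same pair of signatures.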
This leaves us, I think (and assuming we want to do anything at all),
with the last two bullets; if we go for a common super-interface, the
main issue is what name do we give it. More specifically, the same
XYZ-able trick which works with Addressable, doesn't work with
SegmentAllocator - "Allocatable", really? At this point I think
Supplier<SegmentAllocator> would probably be a better choice and/or
convey more meaning - although it might be hard to tell, just by looking
at the javadoc, how the user is supposed to obtain such a supplier from the API
(a new named interface has an advantage, in that the user can click on
its javadoc and look for all the implementations there).
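For illustration, this is roughly what the Supplier-based variant
could look like. MiniAllocator, MiniScope and SingleSignature are
hypothetical names, and a byte[] stands in for a segment; only
java.util.function.Supplier is a real type here:

```java
import java.util.function.Supplier;

// Hypothetical stand-in for an allocator; it supplies itself.
interface MiniAllocator extends Supplier<MiniAllocator> {
    byte[] allocate(int size);

    @Override
    default MiniAllocator get() { return this; }
}

// Hypothetical stand-in for a scope; it supplies an allocator bound to itself.
final class MiniScope implements Supplier<MiniAllocator> {
    int allocations;  // how many blocks were allocated through this scope

    @Override
    public MiniAllocator get() {
        return size -> { allocations++; return new byte[size]; };
    }
}

final class SingleSignature {
    // One signature accepts both a scope and an allocator.
    static byte[] zeroed(int size, Supplier<MiniAllocator> source) {
        return source.get().allocate(size);
    }

    public static void main(String[] args) {
        MiniScope scope = new MiniScope();
        MiniAllocator heap = size -> new byte[size];
        System.out.println(zeroed(8, scope).length);  // scope passed directly
        System.out.println(zeroed(8, heap).length);   // allocator passed directly
        System.out.println(scope.allocations);
    }
}
```

Note that nothing in the signature of zeroed tells the reader that a
scope or an allocator are the expected arguments, which is the
discoverability concern mentioned above.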
If we go for a direct subtyping relationship (e.g. ResourceScope <:
SegmentAllocator), the issue is what to do with the allocation routines
in MemorySegment (e.g. MemorySegment.allocateNative(... , scope) - which
seems less useful)
No matter whether we use common interface or direct subtyping, in both
cases ResourceScope seems to become less of an independent
abstraction, and more geared towards supporting the Foreign API (which
isn't necessarily a bad thing, just noting).
Thoughts?
Maurizio