[nicl] RFR: Undefined struct and void*

Tue May 29 04:55:58 UTC 2018

On May 24, 2018, at 5:12 AM, Maurizio Cimadamore <maurizio.cimadamore at oracle.com> wrote:
> 
> In terms of the new implementation in the foreign branch - these pointers do not have any LayoutType associated with them - so they're much more similar to void* than the C syntax would suggest. I agree that many C APIs do use this style in what Sundar calls poor-man OO style (:-)), and that having a 'real' interface name improves readability somewhat - but I also think that _not_ using real interface names seems a more honest approach which more accurately reflects what's under the hood.

We have to reflect the semantics of the source language (C in this case)
as well as "what's under the hood".  The "poor man OO" style amounts to
using forward-declared struct types as named abstract types.  In the C
programming experience, "struct S1*" and "struct S2*" are distinct types
which cannot be converted to each other silently, and they are used
as place-holders for struct types which may be hidden inside the library,
or even types which are never defined.  In the latter case, the library
casts the machine word of a "struct S1*" to some other type.  In either
case, the forward-declared type functions as an existential type, whose
contract is "keep this type distinct from other types, even though I won't
tell you what it contains".

Throwing away type information by flattening to void* is not in the
Panama philosophy, which seeks to retain as much as possible of
the experience of the foreign type system when translating to the
Java carrier types.  We can choose to do this as a conscious trade-off,
as in the case where we were weighing a parameter-free "Pointer"
type against a richer "Pointer<T>" type, but in general we have to
try hard to find a way to represent distinctions from the source language
in the extracted Java APIs.  If we can't represent them in the Java
static type system, we must at least record them in the runtime
metadata, so we can perform runtime checks.

An empty Java interface (as Henry proposes) is probably the best way
to choose a carrier type to represent such an existential type in C.
It supports both static and dynamic checks in the Java APIs.
Erasing the static type of "struct S1*" to "Pointer<Void>" probably
also entails dropping the runtime distinction between "void" and
"S1", which removes from the Java API an important aspect of type
safety that was designed into the C API.
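
To make the static and dynamic checks concrete, here is a minimal
sketch.  Every name in it (Ptr, Lib, useS1, and the carriers S1 and S2)
is a hypothetical stand-in, not the actual Panama API; it only shows
how an empty interface can keep "struct S1*" and "struct S2*" apart,
both at compile time and at run time.

```java
import java.util.Objects;

// Hypothetical carriers for two forward-declared C structs.
// They are deliberately empty: the C API never reveals the bodies.
interface S1 {}
interface S2 {}

// Hypothetical pointer type (an assumption, not the real Pointer<T>).
final class Ptr<T> {
    private final long address;      // the machine word
    private final Class<T> pointee;  // runtime metadata for the foreign type

    Ptr(long address, Class<T> pointee) {
        this.address = address;
        this.pointee = Objects.requireNonNull(pointee);
    }

    long address() { return address; }
    Class<T> pointee() { return pointee; }

    // Dynamic check: succeeds only if the recorded pointee type matches.
    <U> Ptr<U> cast(Class<U> target) {
        if (!pointee.equals(target)) {
            throw new ClassCastException(
                pointee.getSimpleName() + "* is not " + target.getSimpleName() + "*");
        }
        @SuppressWarnings("unchecked")
        Ptr<U> same = (Ptr<U>) this;
        return same;
    }
}

final class Lib {
    // Models a C function "void use_s1(struct S1*)".
    // Static check: passing a Ptr<S2> here does not compile.
    static long useS1(Ptr<S1> p) { return p.address(); }
}
```

If both carriers were erased to Ptr<Void>, neither the parameter type
of useS1 nor the check in cast could tell S1 from S2.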

Since C allows the type to be declared multiple times in separate
header files, it is necessary for jextract to issue multiple empty
interfaces, one per jextract task (which may collect the contents
of several headers).  The runtime has to record the fact that all
of those empty interfaces carry the same foreign type.

If in fact a jextract run encounters a struct body for such a type,
it can generate a non-empty Java interface to carry the struct.
In the worst case, there may be several identical bodies for
the same struct, imported several times.  All of the Java interfaces
extracted from the same struct type, whether forward-declared
or actually defined, must be recorded by the binder runtime
as the same foreign type.

This implies, of course, that various interfaces (empty or not)
must be able to convert between each other, using a runtime
check to determine that the various instances of the struct
type are, in fact, the same struct type.  A simple string check
is reasonable:  You can safely convert Pointer<foo_h.S1>
to Pointer<bar_h.S1> if the simple-name of the interface
(empty or not) is the same, and if both interfaces are for
structs, and both interfaces were extracted for the same
library (a loose notion, at present).  If the binder was able
to see all of the interfaces which translate S1, then perhaps
there is an object which implements both foo_h.S1 and
bar_h.S1.  In that happy case, no new instance is needed,
just a checkcast.  In some cases, if the binder can't see all the
types when it starts building pointers, then it may have to
re-wrap the same machine address under different metadata.
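
The string check described above could look roughly like the sketch
below.  The @StructCarrier annotation, the SameStruct helper, the
S1Impl class, and the library names are all assumptions made up for
illustration; only the holder names foo_h and bar_h come from the
discussion.  The rule it encodes is: same simple name, both interfaces
are struct carriers, and both come from the same library.

```java
import java.lang.annotation.ElementType;
import java.lang.annotation.Retention;
import java.lang.annotation.RetentionPolicy;
import java.lang.annotation.Target;

// Hypothetical marker: "this interface carries a C struct from this library".
@Retention(RetentionPolicy.RUNTIME)
@Target(ElementType.TYPE)
@interface StructCarrier { String library(); }

// Two extraction tasks that each saw "struct S1" (illustrative only).
final class foo_h {
    @StructCarrier(library = "libdemo") interface S1 {}            // forward-declared
    @StructCarrier(library = "libdemo") interface S2 {}
}
final class bar_h {
    @StructCarrier(library = "libdemo") interface S1 { int x(); }  // body was visible
}
final class other_h {
    @StructCarrier(library = "libother") interface S1 {}           // unrelated library
}

final class SameStruct {
    // True if a pointer to 'from' may be converted to a pointer to 'to'.
    static boolean convertible(Class<?> from, Class<?> to) {
        StructCarrier a = from.getAnnotation(StructCarrier.class);
        StructCarrier b = to.getAnnotation(StructCarrier.class);
        return a != null && b != null                         // both are struct carriers
            && a.library().equals(b.library())                // same library
            && from.getSimpleName().equals(to.getSimpleName()); // same struct name
    }
}

// Happy case: one object implements every interface that translates S1,
// so switching views is a plain checkcast and no re-wrapping is needed.
final class S1Impl implements foo_h.S1, bar_h.S1 {
    public int x() { return 42; }
}
```

When no such common implementation exists, a binder following this
scheme would call convertible first and then wrap the same address
under the target interface's metadata.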

You can also safely convert from Pointer<foo_h.S1> to
Pointer<Void>, but not vice versa.
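
A minimal sketch of that one-way rule, again with hypothetical names
(TypedPtr, OpaqueS1, toVoid, and cast are assumptions, not the binder's
real API): widening to Void always succeeds, while going back is
refused because the metadata no longer records what the address
points to.

```java
import java.util.Objects;

// Hypothetical carrier for a forward-declared "struct S1".
interface OpaqueS1 {}

final class TypedPtr<T> {
    private final long address;      // the machine word
    private final Class<T> pointee;  // runtime metadata for the foreign type

    private TypedPtr(long address, Class<T> pointee) {
        this.address = address;
        this.pointee = Objects.requireNonNull(pointee);
    }

    static <T> TypedPtr<T> of(long address, Class<T> pointee) {
        return new TypedPtr<>(address, pointee);
    }

    long address() { return address; }
    Class<T> pointee() { return pointee; }

    // Safe direction: any typed pointer can be viewed as a void pointer.
    TypedPtr<Void> toVoid() {
        return new TypedPtr<>(address, Void.class);
    }

    // Checked conversion: succeeds only when the recorded type matches,
    // so a TypedPtr<Void> cannot be turned back into a TypedPtr<OpaqueS1>.
    <U> TypedPtr<U> cast(Class<U> target) {
        if (!pointee.equals(target)) {
            throw new ClassCastException(
                "cannot convert " + pointee.getSimpleName()
                + "* to " + target.getSimpleName() + "*");
        }
        @SuppressWarnings("unchecked")
        TypedPtr<U> same = (TypedPtr<U>) this;
        return same;
    }
}
```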

— John