[foreign-jextract] RFR 8237577 minimal jextract tool on jextract API
Maurizio Cimadamore
maurizio.cimadamore at oracle.com
Wed Jan 22 12:05:41 UTC 2020
On 22/01/2020 08:00, Ty Young wrote:
>
> On 1/22/20 12:46 AM, John Rose wrote:
>> On Jan 21, 2020, at 6:31 PM, Ty Young <youngty1997 at gmail.com
>> <mailto:youngty1997 at gmail.com>> wrote:
>>>
>>> Without a higher level abstraction I don't think I'm personally
>>> going to be using this as-is. Hopefully the jextract API really lets
>>> me improve things otherwise I think I'm going to do everything
>>> myself so that the bindings more closely resemble the old API.
>>
>> This is a fair observation. It reminds me of the distinction in the
>> git tools between porcelain and plumbing[1]. Maybe we’re still working
>> on the plumbing? But even if that’s the case it’s not too early to
>> collect observations and requests for “porcelain”.
>
>
> "plumbing" is a very apt way to describe Memory Access.
>
>
> To be clear, I've been advocating for "porcelain" for
> awhile(relatively speaking). Like I said, I kinda knew it would turn
> out bad and had(if it wasn't evident before) the intention to wrap it
> in a higher level API of my own but uh... what jextract generates
> right now is a bit of an octopus.
>
>
> It has:
>
>
> - constants(good)
>
>
> - var handles - even for struct fields(bad)
Var handles allow you to access a field with whatever atomic access you
want - in case the base getter/setter is not good enough.
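A minimal sketch of what "whatever atomic access you want" can look like,
using only the standard java.lang.invoke API and a direct ByteBuffer as a
stand-in for the off-heap struct. This is not what jextract generates: the
class name, the X_OFFSET constant and the two-int struct shape are
assumptions made up for the illustration.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class VarHandleAccessSketch {
    // Hypothetical byte offset of an int field "x" in struct { int x; int y; }.
    static final int X_OFFSET = 0;

    // Views a ByteBuffer as ints; coordinates are (ByteBuffer, int byteIndex).
    static final VarHandle INT_VIEW =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());

    static int demo() {
        ByteBuffer point = ByteBuffer.allocateDirect(8);          // off-heap, zeroed
        INT_VIEW.set(point, X_OFFSET, 10);                        // plain write
        int seen = (int) INT_VIEW.getVolatile(point, X_OFFSET);   // volatile read
        // Atomic compare-and-set: replace the value only if it is still 'seen'.
        boolean swapped = INT_VIEW.compareAndSet(point, X_OFFSET, seen, 42);
        return swapped ? (int) INT_VIEW.get(point, X_OFFSET) : -1;
    }

    public static void main(String[] args) {
        System.out.println(demo());
    }
}
```

The same handle serves plain, volatile and compare-and-set access, which a
fixed getter/setter pair cannot offer.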
>
>
> - memory layouts - not what I want/need(bad)
Memory layouts are necessary to instantiate structs - and also to answer
questions like 'what is the layout of field XYZ'
>
>
> - method handles - functions themselves already exist(bad)
This probably can be hidden
>
>
> - functions - generates(good) but also generates struct field
> setters/getters?(bad)
So... if VarHandles for struct fields are bad, but static struct field
getters/setters are also bad... what the heck should the tool generate? (I
obviously know what your answer is)
>
>
> So it seems kinda broken. The structs especially should probably be
> put into a class. That alone would clean things up a bit, I think. The
> exposure of memory method/var handles should be made an optional, non
> default jextract switch I think.
Yes - I was waiting for you to say that :-). "We need structs!"
You have gone through the list of generated stuff with a very specific
question in mind - which is: is this artifact something I'd like to use
in my code? And I understand if the answer to many of those is "no, sorry".
But there's another question to be asked - which is missing from all the
analysis here - is the set of generated plumbing _complete_ ? E.g. can
you take what is being generated and use it to interact with the native
library? And the answer there is "yes". Is it low level? Yes, of course
- but at least it is still Java. Being able to develop against a library
using your IDE, without having to jump from javac to gcc and back to
fix things up, is IMHO a big step forward.
All the high-level moves that have been listed over the last few weeks
are relatively obvious - but they all have a cost which you are
unwilling to see. Let's say that we create a new class for each new
struct - and that all static function wrappers do
marshalling/unmarshalling of the incoming structs (to convert them to
memory segments and back). Is it more usable? Yes, of course. Is it
free? No. And same thing for Pointer.
To be 100% clear - here we are arguing about the difference between:
MemorySegment point = MemorySegment.allocateNative(POINT$LAYOUT);
point$x$set(point, 10);
point$y$set(point, 11);
and:
Point p = new Point();
p.x$set(10);
p.y$set(11);
Modulo cosmetic differences (e.g. in one case setters are static methods,
in the other they are instance methods), the main difference here is that
the latter appeals to nominal types, while the former does not.
E.g. in the second example you know that the thing you are working on is a
point and which accessors it has, so, from a Java perspective, it is harder
to make silly mistakes (like instantiating a Point and then using the
Circle accessors to get its fields).
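To make the "silly mistake" concrete, here is a small sketch that uses a
direct ByteBuffer as a stand-in for the memory segment. None of these names
come from jextract: allocatePoint, the pointSetX/circleGetRadius helpers and
the Circle { int radius; } struct are hypothetical, chosen only to show the
two styles side by side.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

final class NominalVsStaticSketch {
    // Stand-in for allocating a native Point { int x; int y; }: 8 off-heap bytes.
    static ByteBuffer allocatePoint() {
        return ByteBuffer.allocateDirect(8).order(ByteOrder.nativeOrder());
    }

    // Static accessors in the style of point$x$set: they accept any buffer.
    static void pointSetX(ByteBuffer seg, int v) { seg.putInt(0, v); }
    static int pointGetX(ByteBuffer seg) { return seg.getInt(0); }

    // Hypothetical accessor for an unrelated struct Circle { int radius; }.
    static int circleGetRadius(ByteBuffer seg) { return seg.getInt(0); }

    // Nominal wrapper: the Java type records that the memory holds a Point.
    static final class Point {
        private final ByteBuffer seg = allocatePoint();
        void setX(int v) { seg.putInt(0, v); }
        int getX() { return seg.getInt(0); }
    }

    static int staticStyleMistake() {
        ByteBuffer point = allocatePoint();
        pointSetX(point, 10);
        // Compiles fine, and silently reads the point's x as a circle radius.
        return circleGetRadius(point);
    }

    static int nominalStyle() {
        Point p = new Point();
        p.setX(10);
        // There is no way to apply a Circle accessor to p; javac would reject it.
        return p.getX();
    }
}
```

With static accessors the mix-up is only caught by the reader; with the
nominal wrapper it is caught by the compiler, at the price of one extra
Java object per struct.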
<sidebar>
The first version also has a more subtle advantage: it makes it dead
simple to see that you are effectively dealing with native code and
off-heap allocation, which the second version completely hides. Given
that we are essentially adding native programming abilities to the Java
platform, and that we will see Java programs starting to adopt them,
perhaps having the places where native interconnect happens look
"different" might not be such a bad idea
</sidebar>
That said, what if somebody wants structs to be modeled in a slightly
different way than what jextract does (I have, for example, zero
confidence that we can find a getter/setter naming convention that will
convince more than 50% of the users :-) )? Again, define an even
higher-level abstraction, and marshal/unmarshal it into a jextract
struct. Do you see the problem here?
With this, I'm not necessarily closing the door on "porcelain" - but
I'd like to stress that chasing porcelain leads you onto a very
slippery slope. Say you have structs like Point above - then it becomes
sad that you can have Point, but as soon as C code does Point* you are
back to a MemoryAddress... so... let's add pointers!
Adding pointers is way harder than adding structs as it comes with the
usual caveats and limitations of Java generics - you'd like to have
something which captures the spirit of the C type (e.g.
Pointer<Pointer<Foo>>) - but there's no good way to do that in a fully
type-safe way. Either you go full blown Panama/foreign style and bake
your own type tokens (LayoutType objects), or you are left with two choices:
* have raw pointers (which JNR and Graal do, for instance) - not much
better than MemoryAddress IMHO - the C signature is still not reflected
in the Java binding
* have "unsafe" generic pointers (I believe JavaCPP goes down this path
- with its PointerPointer class) - where you can fool the static type system
into thinking that a Pointer<A> is a Pointer<B> (so that a get()
operation can misbehave, or not do what you expect)
(and then there's the whole discussion about native arrays - should we
have Array<Foo> or just use pointers...). Stuff like this seems a lot
less of a slam dunk than structs might seem at first. And, if a developer
doesn't care about all these seemingly high-level abstractions (because
they are wrapping them into a higher-level API anyway), isn't all this stuff
just... in the way?
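The "unsafe generic pointer" option above can be sketched in plain Java. The
Pointer class below is purely hypothetical (it is not JavaCPP's API, and it
holds a Java object rather than a native address); it only illustrates how
erasure lets a Pointer<A> pass for a Pointer<B> until get() is used.

```java
final class UnsafeGenericPointerSketch {
    // Hypothetical generic pointer; the pointee is a plain object stand-in.
    static final class Pointer<T> {
        private final Object pointee;
        Pointer(Object pointee) { this.pointee = pointee; }

        @SuppressWarnings("unchecked")
        T get() { return (T) pointee; }   // unchecked: T is erased at runtime

        @SuppressWarnings("unchecked")
        <U> Pointer<U> cast() { return (Pointer<U>) this; } // always "succeeds"
    }

    // Returns true if reading through the re-typed pointer blows up.
    static boolean misbehaves() {
        Pointer<Integer> pi = new Pointer<>(42);
        Pointer<String> ps = pi.cast();   // static types now disagree with reality
        try {
            String s = ps.get();          // compiler-inserted cast to String fails
            return s == null;
        } catch (ClassCastException e) {
            return true;
        }
    }
}
```

The failure shows up far from the cast that caused it, which is the sense in
which such pointers are not really type-safe.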
Maybe when value types make object creation cheap enough, adding
things like structs and pointers will be a no-brainer - right now it's
not, and we should at least be honest about the cost of the things that
are being proposed as possible "extensions". And, again, as I said
several other times, I'd like to evaluate the need for such "extensions"
after we take a good look at a sizeable corpus of extracted libraries.
>
>
> also, NVML uses a Pointer to an empty struct to represent a GPU:
>
>
> typedef struct nvmlDevice_st* nvmlDevice_t;
>
>
> which seems to be omitted from the generated class. I'm not entirely
> sure how to handle this manually either.
AFAIK jextract never modeled these things directly (in fact with old
jextract you got Pointer<Pointer<nvmlDevice_st>>, not
Pointer<nvmlDevice_t>).
> I'm guessing I need to use ForeignUnsafe to get the Pointer from
> C(hence why, in the old API it's Pointer<Pointer<nvmlDevice_st>>). I
> guess the fact that a struct is being used is irrelevant from the eyes
> of Memory Access, since it doesn't have field and is therefore in an
> undefined state?
If this is, as I think, an opaque pointer, then this is just a
MemoryAddress. You get back an nvmlDevice_t as a MemoryAddress from
functions and you can pass it on to other functions.
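A toy sketch of the opaque-handle pattern, with the native side simulated in
Java. Everything here is made up for the illustration (fakeDeviceByIndex,
fakeDeviceName, the 0x1000 address and the use of a long in place of
MemoryAddress); no real NVML binding is shown.

```java
import java.util.HashMap;
import java.util.Map;

final class OpaqueHandleSketch {
    // Simulated "native" state, keyed by an address the caller never dereferences.
    private static final Map<Long, String> DEVICES = new HashMap<>();
    static { DEVICES.put(0x1000L, "gpu-0"); }

    // Hypothetical binding: hands back an opaque handle as a raw address.
    static long fakeDeviceByIndex(int index) { return 0x1000L + index; }

    // Hypothetical binding: takes the handle back; only the library interprets it.
    static String fakeDeviceName(long handle) { return DEVICES.get(handle); }
}
```

The client code only stores the handle and passes it along, so the layout of
the struct behind it never needs to be known on the Java side.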
>
>
>>
>> — John
>>
>> [1]: https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain
More information about the panama-dev
mailing list