[foreign-jextract] RFR 8237577 minimal jextract tool on jextract API
Ty Young
youngty1997 at gmail.com
Thu Jan 23 01:19:57 UTC 2020
On 1/22/20 6:05 AM, Maurizio Cimadamore wrote:
>
> On 22/01/2020 08:00, Ty Young wrote:
>>
>> On 1/22/20 12:46 AM, John Rose wrote:
>>> On Jan 21, 2020, at 6:31 PM, Ty Young <youngty1997 at gmail.com
>>> <mailto:youngty1997 at gmail.com>> wrote:
>>>>
>>>> Without a higher level abstraction I don't think I'm personally
>>>> going to be using this as-is. Hopefully the jextract API really
>>>> lets me improve things otherwise I think I'm going to do everything
>>>> myself so that the bindings more closely resemble the old API.
>>>
>>> This is a fair observation. It reminds me of the distinction in the
>>> git tools between porcelain and plumbing[1]. Maybe we’re still working
>>> on the plumbing? But even if that’s the case it’s not too early to
>>> collect observations and requests for “porcelain”.
>>
>>
>> "plumbing" is a very apt way to describe Memory Access.
>>
>>
>> To be clear, I've been advocating for "porcelain" for a while
>> (relatively speaking). Like I said, I kinda knew it would turn
>> out bad and had (if it wasn't evident before) the intention to wrap it
>> in a higher level API of my own but uh... what jextract generates
>> right now is a bit of an octopus.
>>
>>
>> It has:
>
>>
>>
>> - constants(good)
>>
>>
>> - var handles - even for struct fields(bad)
>
> Var handles allow you to access fields with whatever atomic access you
> want - in case the base getter/setter is not good.
That's fine *as an option* but *by default* I don't think it should
expose them. Or maybe put them into their own class file?
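As a rough illustration of the "atomic access" point quoted above, here is a minimal sketch using only the standard library. It makes several assumptions: a direct ByteBuffer stands in for the native struct, the layout is two 32-bit ints (x at byte offset 0, y at byte offset 4), and the names FieldHandleSketch, INT_AT, X_OFFSET and Y_OFFSET are made up for this example. It is not what jextract generates.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical sketch: one VarHandle gives plain, volatile and CAS access
// to an int field, which a simple getter/setter pair cannot express.
final class FieldHandleSketch {
    // Views a ByteBuffer as native-order ints, addressed by byte offset.
    static final VarHandle INT_AT =
            MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());

    // Assumed layout: struct { int x; int y; }
    static final int X_OFFSET = 0;
    static final int Y_OFFSET = 4;

    // Direct (off-heap) buffer, so the int fields are suitably aligned.
    static ByteBuffer allocatePoint() {
        return ByteBuffer.allocateDirect(8);
    }

    public static void main(String[] args) {
        ByteBuffer point = allocatePoint();

        INT_AT.set(point, X_OFFSET, 10);          // plain write, like a setter
        INT_AT.setVolatile(point, Y_OFFSET, 11);  // volatile write

        // Atomic compare-and-set on the x field.
        boolean swapped = INT_AT.compareAndSet(point, X_OFFSET, 10, 12);

        int x = (int) INT_AT.getVolatile(point, X_OFFSET);
        int y = (int) INT_AT.get(point, Y_OFFSET);
        System.out.println("swapped=" + swapped + " x=" + x + " y=" + y);
    }
}
```

The plain get/set calls are what a generated getter/setter would cover; the volatile and compare-and-set calls are the extra modes that only the handle exposes.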
>
>
>>
>>
>> - memory layouts - not what I want/need(bad)
> Memory layouts are necessary to instantiate structs - or also to ask
> questions like 'what is the layout of field XYZ'
>>
>>
>> - method handles - functions themselves already exist(bad)
> This probably can be hidden
>>
>>
>> - functions - generates(good) but also generates struct field
>> setters/getters?(bad)
> So... if VarHandles for struct fields are bad, but static struct field
> getters/setters are also bad... what the heck should the tool generate?
> (I obviously know what your answer is)
>>
>>
>> So it seems kinda broken. The structs especially should probably be
>> put into a class. That alone would clean things up a bit, I think.
>> The exposure of memory method/var handles should be made an optional,
>> non default jextract switch I think.
>
> Yes - I was waiting for you to say that :-). "We need structs!"
>
> You have gone through the list of generated stuff with a very specific
> question in mind - which is: is this artifact something I'd like to
> use in my code? And I understand if the answer to many of those is
> "no, sorry".
I have over 75 class files that would need to use this, so yes I very
much am asking this question. Right now, with the old Pointer API, it
only takes about 4 lines per class to read a value from the native
function and return it. I'd very much like to keep it that way.
Also, there are like 200 entries in the NetBeans completion menu
whenever I type "nvml_h." so... yeah.
>
> But there's another question to be asked - which is missing from all
> the analysis here - is the set of generated plumbing _complete_ ? E.g.
> can you take what is being generated and use it to interact with the
> native library? And the answer there is "yes". Is it low level? Yes,
> of course - but at least it is still Java. Being able to develop against
> a library using your IDE and not having to jump from javac to gcc and
> back to fixup things IMHO is a big step forward.
If I were interested in dealing with low level boilerplate I'd just make
the bindings myself, TBH.
>
> All the high-level moves that have been listed over the last few weeks
> are relatively obvious - but they all have a cost which you are
> unwilling to see. Let's say that we create a new class for each new
> struct - and that all static function wrappers do
> marshalling/unmarshalling of the incoming structs (to convert them to
> memory segments and back). Is it more usable? Yes, of course. Is it
> free? No. And same thing for Pointer.
In the vast majority of cases all these higher level abstractions just
wrap repetitive lower level API code for convenience. The vast majority
of the time the variables are even final. If the JVM can't optimize
something like that then it's a failure on the JVM (which, IIRC, is in
part being fixed with Valhalla).
>
> To be 100% clear - here we are arguing about the difference between:
>
> MemorySegment point = MemorySegment.allocateNative(POINT$LAYOUT);
> point$x$set(point, 10);
> point$y$set(point, 11);
>
> and:
>
> Point p = new Point();
> p.x$set(10);
> p.y$set(11);
>
> Modulo cosmetic differences (e.g. in one case setters are static
> methods, in the other they are instance methods), the main difference
> here is the fact that the latter appeals to nominal types, while the
> former does not. E.g. in the second example you know that the thing you
> are working on is a point, and which accessors it has, so, from a Java
> perspective, it is harder to make silly mistakes (like instantiating a
> Point and then using the Circle accessors to get its fields).
To call something that is so against the OOP spirit of Java and its
decades' worth of OOP API(s) "cosmetic" is absurd. Of course there are
going to be compromises, C isn't an OOP language, but you can do things
to improve that.
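The two styles being compared can be sketched side by side in plain Java. This is only an illustration under stated assumptions: a direct ByteBuffer stands in for MemorySegment, the struct is assumed to be two 32-bit ints, and PointStatics, Point and PointDemo are hypothetical names rather than jextract output.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Style 1 (hypothetical): static accessors over untyped memory.
// Nothing stops a caller from passing some other struct's buffer.
final class PointStatics {
    static final int SIZE = 8; // assumed layout: int x at 0, int y at 4

    static ByteBuffer allocate() {
        return ByteBuffer.allocateDirect(SIZE).order(ByteOrder.nativeOrder());
    }
    static void setX(ByteBuffer p, int v) { p.putInt(0, v); }
    static void setY(ByteBuffer p, int v) { p.putInt(4, v); }
    static int getX(ByteBuffer p) { return p.getInt(0); }
    static int getY(ByteBuffer p) { return p.getInt(4); }
}

// Style 2 (hypothetical): a nominal wrapper around the same memory.
// The compiler now knows this is a Point and which accessors it has.
final class Point {
    private final ByteBuffer mem;

    Point() { this(PointStatics.allocate()); }
    Point(ByteBuffer mem) { this.mem = mem; }

    void setX(int v) { PointStatics.setX(mem, v); }
    void setY(int v) { PointStatics.setY(mem, v); }
    int getX() { return PointStatics.getX(mem); }
    int getY() { return PointStatics.getY(mem); }

    // Escape hatch for code that still wants the raw memory.
    ByteBuffer memory() { return mem; }
}

final class PointDemo {
    public static void main(String[] args) {
        ByteBuffer raw = PointStatics.allocate();
        PointStatics.setX(raw, 10);
        PointStatics.setY(raw, 11);

        Point p = new Point();
        p.setX(10);
        p.setY(11);

        System.out.println(PointStatics.getX(raw) + "," + PointStatics.getY(raw));
        System.out.println(p.getX() + "," + p.getY());
    }
}
```

The wrapper here is a thin layer over the static accessors, so the two styles can coexist; the cost is one extra object per struct instance.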
>
> <sidebar>
> the first version has also a more subtle advantage: it makes it dead
> simple to see that you are effectively dealing with native code and
> off-heap allocation, which the second version completely hides. Given
> that we are essentially adding native programming abilities to the
> Java platform, and that we will see Java programs starting to adopt
> them, perhaps having the places where native interconnect happens look
> "different" might not be such a bad idea
> </sidebar>
Java doesn't have unions or structs, so I don't see the issue. You can
also tell that native memory access is being used from the class imports.
>
> That said, what if somebody wants structs to be modeled in a slightly
> different way than what jextract does (I have, for example, zero
> confidence that we can find a getter/setter naming convention that
> will convince more than 50% of the users :-) )? Again, define an even
> higher-level abstraction, and marshal/unmarshal it into a jextract
> struct. Do you see the problem here?
That hasn't prevented, nor will it ever prevent, horrible new Java APIs
with methods that aren't prefixed with "get", now has it?
Memory Access itself has methods named like "segment()" instead of
"getSegment()". Am I getting something or am I performing an action? I
can't really tell.
And this has been a long standing trend that flies in the face of legacy
Java APIs. There is literally a method in ModuleReader called "list()"
that returns a Stream. Who thought that was a good method name!?!?
>
> With this, I'm not necessarily closing the door on "porcelain" - but
> I'd like to stress that following porcelain leads you to a very
> slippery slope. Say you have structs like Point above - then it
> becomes sad that you can have Point, but as soon as C code does Point*
> you are back to a MemoryAddress... so... let's add pointers!
>
> Adding pointers is way harder than adding structs as it comes with the
> usual caveats and limitations of Java generics - you'd like to have
> something which captures the spirit of the C type (e.g.
> Pointer<Pointer<Foo>>) - but there's no good way to do that in a fully
> type-safe way. Either you go full blown Panama/foreign style and bake
> your own type tokens (LayoutType objects)
I don't see the issue. Yes, it's a good deal of work but once everything
is defined you don't have to touch anything else.
> , or you are left with two choices:
>
> * have raw pointers (which JNR and Graal do, for instance) - not much
> better than MemoryAddress IMHO - the C signature is still not
> reflected in the Java binding
> * have "unsafe" generic pointers (I believe JavaCPP goes down this
> path - with its PointerPointer class) - where you can fool static type
> system into thinking that a pointer<A> is a pointer<B> (so that a
> get() operation can misbehave, or not do what you expect)
>
> (and then there's the whole discussion about native arrays - should we
> have Array<Foo> or just use pointers...). Stuff like this seems a lot
> less of a slam dunk than structs might seem at first. And, if a
> developer doesn't care about all these seemingly high-level
> abstractions (because they are wrapping into a higher level API
> anyway), isn't all this stuff just... in the way?
I think, in hindsight, the way it was before in the old Pointer API was
the right way. That is, a Pointer by default with an instance method to
get an Array.
Of course, for plain off heap use, maybe static methods to create an
array of bytes would be good too?
And you don't need to force anyone to use the high level abstraction. I
don't understand where this black and white mentality is coming from,
but it doesn't have to be the case. Jextract could provide both via a
switch.
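The "unsafe generic pointer" problem quoted above comes from type erasure, and can be sketched as follows. This is a hypothetical illustration only: Ptr and Ptrs are made-up names (not the old Panama Pointer API and not JavaCPP), and a direct ByteBuffer stands in for native memory.

```java
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical generic pointer. T exists only at compile time, so the
// runtime cannot check that the pointee really is a T.
final class Ptr<T> {
    private final ByteBuffer mem;

    Ptr(ByteBuffer mem) { this.mem = mem; }

    ByteBuffer memory() { return mem; }

    // Unchecked cast: compiles with a warning and never fails at runtime.
    @SuppressWarnings("unchecked")
    <U> Ptr<U> cast() { return (Ptr<U>) (Ptr<?>) this; }
}

// Typed readers that trust the static type of the pointer.
final class Ptrs {
    static Ptr<Integer> allocateInt(int value) {
        ByteBuffer mem = ByteBuffer.allocateDirect(4).order(ByteOrder.nativeOrder());
        mem.putInt(0, value);
        return new Ptr<>(mem);
    }
    static int getInt(Ptr<Integer> p) { return p.memory().getInt(0); }
    static float getFloat(Ptr<Float> p) { return p.memory().getFloat(0); }

    public static void main(String[] args) {
        Ptr<Integer> pi = allocateInt(0x3f800000);
        Ptr<Float> pf = pi.cast(); // the type system is now "fooled"
        // Same four bytes, silently reinterpreted as a float.
        System.out.println(getInt(pi) + " vs " + getFloat(pf));
    }
}
```

Closing this hole needs a runtime type token carried by the pointer, which is roughly the role LayoutType played in the old API.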
>
> Maybe when value types will make object creation cheap enough, adding
> things like structs and pointers will be a no brainer - right now it's
> not, and we should at least be honest about the cost of the things
> that are being proposed as possible "extensions". And, again, as I
> said several other times, I'd like to evaluate the need for such
> "extensions" after we take a good look at a sizeable corpus of
> extracted libraries.
>
>
>>
>>
>> also, NVML uses a Pointer to an empty struct to represent a GPU:
>>
>>
>> typedef struct nvmlDevice_st* nvmlDevice_t;
>>
>>
>> which seems to be omitted from the generated class. I'm not entirely
>> sure how to handle this manually either.
> AFAIK jextract never modeled these things directly (in fact with old
> jextract you got Pointer<Pointer<nvmlDevice_st>>, not
> Pointer<nvmlDevice_t>).
Yes. nvmlDevice_st was basically just an Object.
>> I'm guessing I need to use ForeignUnsafe to get the Pointer from
>> C (hence why, in the old API it's Pointer<Pointer<nvmlDevice_st>>). I
>> guess the fact that a struct is being used is irrelevant in the
>> eyes of Memory Access, since it doesn't have fields and is therefore
>> in an undefined state?
> If this is, as I think, an opaque pointer, then this is just a
> MemoryAddress. You get back nvmlDevice_t as MemoryAddress from
> functions and you can pass them on to other functions.
That's what I thought. How does the new jextract handle it though?
>>
>>
>>>
>>> — John
>>>
>>> [1]:
>>> https://git-scm.com/book/en/v2/Git-Internals-Plumbing-and-Porcelain