some thoughts on panama/jextract
Michael Zucchi
notzed at gmail.com
Fri Jan 10 02:11:08 UTC 2020
On 9/1/20 9:37 pm, Maurizio Cimadamore wrote:
>
>
> To dereference you need to construct a VarHandle (of the right type)
> and use it against a MemoryAddress. See an example here of how you
> would implement struct accessors in this way:
>
> http://hg.openjdk.java.net/panama/dev/file/foreign-abi/test/jdk/java/foreign/StdLibTest.java#l261
>
>
Yes of course, I had all that working, but sorry, my fault: I missed the
"ForeignUnsafe" bit in the constructor and only saw cases where you
allocated the memory segment from Java.
Thanks to the info from Jorn I solved that problem.
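For readers without the Panama build to hand, the VarHandle-as-struct-accessor idea can be sketched with plain JDK classes. This is only an illustration under assumptions: it uses MethodHandles.byteBufferViewVarHandle over a direct ByteBuffer as a stand-in for a MemoryAddress, and the struct (struct point { int x; int y; }), its offsets and the PointAccess class are hypothetical, not taken from StdLibTest.java.

```java
import java.lang.invoke.MethodHandles;
import java.lang.invoke.VarHandle;
import java.nio.ByteBuffer;
import java.nio.ByteOrder;

// Hypothetical C struct being mirrored: struct point { int x; int y; };
public class PointAccess {
    // View a ByteBuffer as ints; access coordinates are (ByteBuffer, int byteOffset).
    static final VarHandle INT =
        MethodHandles.byteBufferViewVarHandle(int[].class, ByteOrder.nativeOrder());

    // Assumed layout: two 4-byte ints with no padding.
    static final int X_OFFSET = 0;
    static final int Y_OFFSET = 4;
    static final int SIZE = 8;

    static int getX(ByteBuffer p) { return (int) INT.get(p, X_OFFSET); }
    static int getY(ByteBuffer p) { return (int) INT.get(p, Y_OFFSET); }
    static void setX(ByteBuffer p, int v) { INT.set(p, X_OFFSET, v); }
    static void setY(ByteBuffer p, int v) { INT.set(p, Y_OFFSET, v); }

    public static void main(String[] args) {
        // Off-heap memory allocated from Java, so its size is known and checked.
        ByteBuffer p = ByteBuffer.allocateDirect(SIZE);
        setX(p, 3);
        setY(p, 4);
        System.out.println(getX(p) + "," + getY(p));
    }
}
```

Every access here is bounds-checked against the buffer, which is exactly the part that goes missing when the pointer comes back from C with no size attached.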
>>
>> So take these comments wrt not knowing how it works now.
>>
>> I know it's an attempt to provide some managed safety but as you're
>> already calling C that horse has already largely bolted. For example
>> you're trusting that the Java definition of a structure size and
>> layout matches the C compiler when you call C, but aren't trusting
>> the same information when it comes back (at least not from java). C
>> also supports pointers you cannot dereference so both cases are
>> necessary.
>>
>> So what you call "not so rosy" to me is "fundamental and basic c".
>> That really should have first-class support and not be hidden by any
>> complexity that will make it harder to use and thus more prone to
>> mistakes.
>
> As I said, we have at least two experiments (the only ones we've done so
> far with the minimal extract): OpenGL and LibClang - in both cases,
> the number of times in which it was actually required to break the escape
> hatch was small - none in OpenGL, and 2 in libclang.
>
> I think this is a topic where it's easy to get biased by the specific
> library one is looking at.
>
Well I mean, it is a C library, and this project purports to support
"calling C libraries without JNI", so it's not a matter of 'bias' as
such. This is also a public project and requests 'community' feedback
so here we are.
And to follow your argument, indeed OpenGL is a very specific case that
uses integer handles for everything. This isn't particularly common.
Vulkan mostly uses opaque pointers and user-supplied output buffers but
vkMapMemory is the only way to move application data to/from the
device. And that returns a pointer. Similarly for OpenCL and its
memory mapping function, although it has memory copy functions too.
And then there's the case of string pointers which I've already
mentioned. Some APIs will copy them but many won't because it's clumsy
to write, clumsy to use, and needlessly inefficient. They're probably
even worse than structures because as of now you have to: create an
unsafe big segment that can hold the potential size, walk the bytes to
find its length, then create another unsafe segment for the actual
length. Then copy it to a byte array.
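The steps just listed can be sketched as follows. This is a rough illustration only: a direct ByteBuffer stands in for the oversized unsafe segment wrapped around a char *, and the CStrings class and its method names are hypothetical rather than part of any Panama API.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CStrings {
    // 'big' plays the oversized segment; walk its bytes to find the NUL terminator.
    static int strlen(ByteBuffer big, int maxLen) {
        int bound = Math.min(maxLen, big.limit());
        for (int i = 0; i < bound; i++) {
            if (big.get(i) == 0) {
                return i;
            }
        }
        throw new IllegalArgumentException("no NUL terminator within " + bound + " bytes");
    }

    static String toJavaString(ByteBuffer big, int maxLen) {
        int len = strlen(big, maxLen);
        // Second view over the same memory, restricted to the actual length.
        ByteBuffer exact = big.duplicate();
        exact.clear();
        exact.limit(len);
        // Finally copy the bytes out to the heap and decode them.
        byte[] bytes = new byte[len];
        exact.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        ByteBuffer buf = ByteBuffer.allocateDirect(64);
        buf.put("hello".getBytes(StandardCharsets.UTF_8)).put((byte) 0);
        System.out.println(toJavaString(buf, 64));
    }
}
```

Note that the maxLen guess is the only thing standing between the scan and memory it should never touch, so the "safe" wrapper is still taking the caller's word for it.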
And although it isn't super common some libraries have their own
allocation functions that you must or should use instead of malloc and
friends.
> I'm not saying "you should always use the safe idiom, if you don't
> want" - I'm saying there should be a choice, so that well-behaved
> libraries can provide safe-by-default bindings.
>
I don't really understand this argument. They will just automatically
have this "safety" if that's the way they're written. But if they're not
then you simply don't have any choice in the matter anyway. I mean what
are you going to do, patch the upstream library for a java binding?
Given such a restriction either you can bind the library or you can't
and throwing an error at the developer and forcing them to add another
'now obviously know what you're doing!' argument isn't going to win any
friends. JNI has none of these restrictions remember.
I think your definition of "well behaved" is too narrow and is basically
"friendly to panama as it is now/or as I hope". I don't get the
impression you've actually written much C, particularly from the bizarre
Scoped Pointer stuff.
>>
>> I don't really see the justification of having
>> MemoryAddress::ofLong(p, size) not being available, or the
>> MemoryAddress varhandle .get method not taking a length parameter (i
>> think better than being able to change the MemoryAddress size as this
>> keeps the size immutable). Particularly if there's some
>> less-convenient work-around to get the same functionality anyway.
>
> Eventually, it is possible we could add a ofLong(p, size) - right now
> we're focusing on the set of primitive moves. But we're trying very
> hard to distinguish between operations that are safe and operations in
> which the user is essentially saying "you have to trust me". Maybe you
> always want to run in the "I know what I'm doing"-mode, but that
> assumption might not be valid for all libraries/users.
>
> In other words, there is a distinction - and I'd like that distinction
> not to be lost under the "bbbut C is a mess" assertion.
>
Hah, I think C is great, so don't quote the opinion of other list users
here! C++ on the other hand can go die in a fire, we've had plenty of
those around here lately but alas it remains unburnt.
I understand you want to make the distinction, I just don't think it's
worth making. You're not getting much (if any) safety by only having
native pointers unchecked, because they are never checked before
invoking C anyway (they can't even be closed). And you're not losing
much (if any) safety by assigning a pointer a valid access range and
letting the caller manage its lifetime (and at least then they can be
closed). The first just makes them unusable from Java.
An example. Without using any classes or methods marked 'unsafe' I
wrote a varhandle-like class, it just uses MemoryAddress,
MemorySegment.allocNative() and one MethodHandle to memcpy(3). It lets
me read or write any value at any arbitrary offset from any arbitrary
MemoryAddress with absolutely no checking. And I didn't have to use any
'scary' compiler or jvm flags.
Another less obtuse one. You have to just 'trust the user' that they
allocated a memory segment with valid alignment. And of the right size.
Anyway there's a workable solution (because there has to be) and I think
I've pushed about as far as I should on why I think it should just be in
the base API. The real key is foreign providing Var and Mem handles,
the rest is sort of sugar.
I know it's still work in progress.
I'll continue evaluating the platform, mostly around
developer-effort-to-use but also performance and overheads.
Cheers,
Michael