some thoughts on panama/jextract

Michael Zucchi notzed at gmail.com
Fri Jan 10 02:11:08 UTC 2020


On 9/1/20 9:37 pm, Maurizio Cimadamore wrote:
>
>
> To dereference you need to construct a VarHandle (of the right type) 
> and use it against a MemoryAddress. See an example here of how you 
> would implement struct accessors in this way:
>
> http://hg.openjdk.java.net/panama/dev/file/foreign-abi/test/jdk/java/foreign/StdLibTest.java#l261 
>
>


Yes of course, I had all that working, but sorry, my fault: I missed the 
"ForeignUnsafe" bit in the constructor and only saw cases where you 
allocated the memory segment from Java.

Thanks to the info from Jorn I solved that problem.

>>
>> So take these comments wrt not knowing how it works now.
>>
>> I know it's an attempt to provide some managed safety but as you're 
>> already calling C that horse has already largely bolted. For example 
>> you're trusting that the Java definition of a structure size and 
>> layout matches the C compiler when you call C, but aren't trusting 
>> the same information when it comes back (at least not from java).  C 
>> also supports pointers you cannot dereference so both cases are 
>> necessary.
>>
>> So what you call "not so rosy" to me is "fundamental and basic c". 
>> That really should have first-class support and not be hidden by any 
>> complexity that will make it harder to use and thus more prone to 
>> mistakes.
>
> As I said, we have at least two experiments (the only ones we've done 
> so far with the minimal extract): OpenGL and LibClang - in both cases, 
> the number of times in which it was actually required to break the 
> escape hatch was small - none in OpenGL, and 2 in libclang.
>
> I think this is a topic where it's easy to get biased by the specific 
> library one is looking at.
>
Well I mean, it is a C library, and this project purports to support 
"calling C libraries without JNI", so it's not a matter of 'bias' as 
such.  This is also a public project and requests 'community' feedback 
so here we are.

And to follow your argument, indeed OpenGL is a very specific case that 
uses integer handles for everything.  This isn't particularly common.

Vulkan mostly uses opaque pointers and user-supplied output buffers but 
vkMapMemory is the only way to move application data to/from the 
device.  And that returns a pointer.  Similarly for OpenCL and its 
memory mapping function, although it has memory copy functions too.

And then there's the case of string pointers which I've already 
mentioned.  Some APIs will copy them but many won't because it's clumsy 
to write, clumsy to use, and needlessly inefficient.  They're probably 
even worse than structures because as of now you have to: create an 
unsafe big segment that can hold the potential size, walk the bytes to 
find its length, then create another unsafe segment for the actual 
length.  Then copy it to a byte array.
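
To make the steps above concrete, here is a minimal sketch of the same 
dance.  It does not use the Panama API: a direct java.nio.ByteBuffer stands 
in for the oversized unsafe segment, and the class and method names 
(CStringSketch, readCString) are made up for illustration.

```java
import java.nio.ByteBuffer;
import java.nio.charset.StandardCharsets;

public class CStringSketch {
    // Hypothetical helper: 'window' stands in for the oversized unsafe
    // segment wrapped around a char* that came back from C.
    static String readCString(ByteBuffer window) {
        // Step 1: walk the bytes to find the NUL terminator.
        int len = 0;
        while (len < window.limit() && window.get(len) != 0) {
            len++;
        }
        // Step 2: narrow the view to the actual length
        // (the "second segment" in the description above).
        ByteBuffer exact = window.duplicate();
        exact.position(0);
        exact.limit(len);
        // Step 3: copy into a Java byte array and decode it.
        byte[] bytes = new byte[len];
        exact.get(bytes);
        return new String(bytes, StandardCharsets.UTF_8);
    }

    public static void main(String[] args) {
        // Simulate native memory holding a NUL-terminated string.
        ByteBuffer window = ByteBuffer.allocateDirect(64);
        window.put("hello".getBytes(StandardCharsets.UTF_8)).put((byte) 0);
        System.out.println(readCString(window));
    }
}
```

That is three separate steps and one copy for what is a single strlen-style 
scan on the C side.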

And although it isn't super common some libraries have their own 
allocation functions that you must or should use instead of malloc and 
friends.

> I'm not saying "you should always use the safe idiom, if you don't 
> want" - I'm saying there should be a choice, so that well-behaved 
> libraries can provide safe-by-default bindings.
>
I don't really understand this argument.  They will just automatically 
have this "safety" if that's the way they're written. But if they're not 
then you simply don't have any choice in the matter anyway.  I mean what 
are you going to do, patch the upstream library for a Java binding?  
Given such a restriction either you can bind the library or you can't 
and throwing an error at the developer and forcing them to add another 
'now obviously know what you're doing!' argument isn't going to win any 
friends.  JNI has none of these restrictions, remember.

I think your definition of "well behaved" is too narrow and is basically 
"friendly to panama as it is now/or as I hope".  I don't get the 
impression you've actually written much C, particularly from the bizarre 
Scoped Pointer stuff.

>>
>> I don't really see the justification of having 
>> MemoryAddress::ofLong(p, size) not being available, or the 
>> MemoryAddress varhandle .get method not taking a length parameter (I 
>> think this is better than being able to change the MemoryAddress size 
>> as it keeps the size immutable). Particularly if there's some 
>> less-convenient work-around to get the same functionality anyway.
>
> Eventually, it is possible we could add a ofLong(p, size) - right now 
> we're focusing on the set of primitive moves. But we're trying very 
> hard to distinguish between operations that are safe and operations in 
> which the user is essentially saying "you have to trust me". Maybe you 
> always want to run in the "I know what I'm doing"-mode, but that 
> assumption might not be valid for all libraries/users.
>
> In other words, there is a distinction - and I'd like that distinction 
> not to be lost under the "bbbut C is a mess" assertion.
>

Hah, I think C is great, so don't quote the opinion of other list users 
here!  C++ on the other hand can go die in a fire, we've had plenty of 
those around here lately but alas it remains unburnt.

I understand you want to make the distinction; I just don't think it's 
worth making.  This is because you're both: not getting much (if any) 
safety by only having native pointers unchecked, because they are never 
checked before invoking C anyway (they can't even be closed), and not 
losing much (if any) safety by assigning a pointer a valid access range 
and letting the caller manage its lifetime (and at least they can be 
closed).  The first just makes them unusable from Java.

An example.  Without using any classes or methods marked 'unsafe' I 
wrote a varhandle-like class; it just uses MemoryAddress, 
MemorySegment.allocNative() and one MethodHandle to memcpy(3).  It lets 
me read or write any value at any arbitrary offset from any arbitrary 
MemoryAddress with absolutely no checking.  And I didn't have to use any 
'scary' compiler or JVM flags.

Another less obtuse one.  You have to just 'trust the user' that they 
allocated a memory segment with valid alignment.  And of the right size.

Anyway there's a workable solution (because there has to be) and I think 
I've pushed about as far as I should on why I think it should just be in 
the base API.  The real key is foreign providing Var and Mem handles, 
the rest is sort of sugar.

I know it's still work in progress.

I'll continue evaluating the platform, mostly around 
developer-effort-to-use but also performance and overheads.

Cheers,
  Michael





