some thoughts on panama/jextract

Michael Zucchi notzed at gmail.com
Thu Jan 2 06:23:11 UTC 2020


On 2/1/20 2:16 pm, Ty Young wrote:
>
> Some thoughts as someone who has wrapped a few native libraries using 
> jextract's API...
>
>
> On 1/1/20 5:06 PM, Michael Zucchi wrote:
>>
>> Morning,
>>
>> First the not-very nice.  The naming conventions are just ugly. 
>> Amongst other things, getters/setters unlike any other in the java 
>> world, no doubt for a reason but it still sux. Pointer.ofNull(), 
>> sigh.  And why the ugly 'x86_64' when 'amd64' is used everywhere else 
>> in java-land.  In general I can't see why one would want to export any of 
>> these interfaces "as is", either for simplicity or to make an oo api 
>> - so you will almost always need to write substantial boilerplate 
>> anyway.
>
>
> Agreed... but AFAIK using the jextract bindings "as is" isn't the 
> intended use. What you get is just the glue that connects the C code 
> to the Java world. I don't see how this could be any other way.
>
There are always more ways to skin a cat!  The syntax doesn't need to be 
so ugly.  The extraction doesn't need to be so automatic, or at least it 
could be more tunable.  Most other automated code generators will at 
least dump functions in one class and allow grouping by arbitrary criteria.

Since someone still needs to write the glue, making it nicer to use is 
not an unimportant consideration.
>
>>
>> Now the real problems.  Having the package/class names based on the 
>> filename?  How is that going to work?
>
>
> IMO, even though it isn't correct in the C world, I think using domain 
> names for the package layout of bindings is the correct way. For 
> Nvidia's Management Library I've just decided to do "org.nvidia.nvml" 
> wherein nvml_h resides. It isn't perfect but it's a lot better than 
> the alternative.
>

It's not the package names that are the problem, it's the class names 
derived from the header files.

Ok here's an example:
$ cat bob.h

#include "bob-data.h"
#include "bob-func.h"
$ cat bob-data.h

#include <stdint.h>

struct bob {
         struct bob *next;

         uint64_t bob;
};

$ cat bob-func.h

void bob_init(void);
struct bob *bob_new(uint64_t bob);
$ cat bob-lib.c

#include "bob.h"

void bob_init(void) {
}

struct bob *bob_new(uint64_t bob) {
         return 0;
}

$ gcc -c bob-lib.c -fPIC
$ gcc -shared -o libbob.so bob-lib.o
$ ~/src/openjdk-panama-14/bin/jextract -t au.bob -L. -lbob \
      --include-symbols "bob" bob.h
$ unzip -l bob.h.jar
Archive:  bob.h.jar
   Length      Date    Time    Name
---------  ---------- -----   ----
       347  01-02-2020 15:06   au/bob/bob_data_h.class
       365  01-02-2020 15:06   usr/include/bits/types_h.class
       816  01-02-2020 15:06   usr/include/bits/types_lib.class
      1180  01-02-2020 15:06   usr/include/bits/types_h$__fsid_t.class
      1363  01-02-2020 15:06   au/bob/bob_data_h$bob.class
       798  01-02-2020 15:06   au/bob/bob_data_lib.class
       170  01-02-2020 15:06   META-INF/jextract.properties
---------                     -------
      5039                     7 files

This is the minimum size I could make it; by default it would include 
dozens of things from /usr/include/.  I have no idea why __fsid_t gets 
generated, so maybe just put that down to a bug.

You've got the functions in au.bob.bob_func_h and the struct in 
au.bob.bob_data_h.bob, and any rearrangement of those will completely 
break your 'glue' Java.  This is a very common design, and libraries move 
stuff around in headers all the time, so this isn't just some 
hypothetical.  It's a maintenance burden a JNI library doesn't have to 
deal with, as the C compiler hides it.

At a minimum it needs options to 'dump these functions here, those 
functions there, these data structures in that namespace or embedded 
class', and so on.
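
Until such options exist, one way to contain the churn is to keep every 
reference to generated classes behind a single hand-written interface.  
The sketch below is only an illustration under assumptions: BobApi, 
StubBobBackend and BobGlue are made-up names, and the stub stands in for 
the jextract output (it mirrors the empty functions in bob-lib.c) so the 
example runs without the generated jar.

```java
// Hypothetical facade; none of these names come from jextract.
interface BobApi {
    void init();
    long newBob(long value);   // raw address of the new struct bob, 0 for NULL
}

// The only class that would mention header-derived names such as bob_func_h.
// If a declaration moved to another header, only this class would change.
// Here it is a stub mirroring bob-lib.c, so no generated classes are needed.
final class StubBobBackend implements BobApi {
    int initCalls = 0;

    @Override public void init() { initCalls++; }            // bob_init() does nothing
    @Override public long newBob(long value) { return 0L; }  // bob_new() returns 0
}

// Glue code sees only BobApi, never the generated package layout.
final class BobGlue {
    private final BobApi api;

    BobGlue(BobApi api) {
        this.api = api;
        api.init();
    }

    // True if the backend handed back a non-NULL pointer.
    boolean create(long value) {
        return api.newBob(value) != 0L;
    }

    public static void main(String[] args) {
        BobGlue glue = new BobGlue(new StubBobBackend());
        System.out.println("created: " + glue.create(42L));
    }
}
```

A real backend would replace the stub bodies with calls into whatever 
classes jextract produced for the current header layout.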

>
>>   Even assuming that wasn't a problem, now you've got a 
>> usr.include.bits package in your module so you have to rename it or 
>> the module won't play well with others (just leading to redundancy and 
>> difficult code reuse across projects).   I tried various jextract 
>> args to whittle down the generated classes but it still wants to grab 
>> a few things from /usr/include/bits and that's just from including 
>> stdint.h.  By default jextract on libavformat.h generates a 500K jar.  
>> FFmpeg also has the problem that many of the structure fields are 
>> read only or 'not public', so wrapping everything creates 
>> unnecessarily large classes.
>
> The only solutions that I can think of are to:
>
>
> A. Fix jextract so that you can specify already generated bindings as 
> dependencies. For example, in my particular case, you would first 
> generate the X server bindings and then point to those bindings when 
> attempting to generate the Nvidia X ctrl (nvxctrl) API (which needs the 
> X server APIs).
>
>
> B. Create some kind of standard repo which contains an agreed upon way 
> to generate every binding.
>
I mean B could just be module based, and needn't be globally managed.

You could manually play with the jar file to get around A, and you're 
probably going to do that already if your output is a module.
>
>>
>> My first thought would be to wrap these "ugly" api's in 
>> self-contained ones but that seems to defeat the purpose.  I suppose 
>> it depends on whether panama is designed to completely replace jni or 
>> just some of the common "easy" cases.
>
>
> If you need to do anything more than reading a value then you are 
> basically going to want to wrap it in some form or another. In the 
> context of Nvidia's API(s), there are GPU specific quirks that need to 
> be hammered out that Nvidia themselves don't handle or resolve.
>
I meant wrapping the C API in a cleaner C API so that the generated 
stuff isn't so messy to work with.  Or just a simpler header file so it 
goes in one place and doesn't drag in half the operating system by 
accident.  Even if just to provide some stability to reduce maintenance 
costs.

>
>>
>> Even if you ignore jextract and roll your own via the annotations you 
>> run into some of the same problems: e.g. structures can change 
>> between platforms, so now you need to include platform specific stuff 
>> in your java, yet it provides no simple mechanism to deal with it.  
>> This is pretty much a show-stopper on its own.
>
>
> Java 9 modules can be used to fix this via the "requires *static*" 
> module declaration. You would then have modules such as:
>
>
> foo.base
>
> foo.windows
>
> foo.mac
>
> foo.linux
>
> etc. 

And 32-bit if anyone still bothers with that (I don't).  But yeah sure, I 
didn't say it wasn't solvable.  One just hopes that in the case of using 
jextract the generated class names are the same so you don't also need 
platform/bitsize specific glue code just for that purpose as well.
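
Since "requires static" makes a dependency optional at run time, the 
base module still has to decide which platform module to use.  A minimal 
sketch of that decision, assuming the foo.* module names from the list 
above and the standard os.name/os.arch system properties; the class and 
method names are made up for illustration:

```java
import java.util.Locale;

final class PlatformModules {
    // Map an os.name value to one of the foo.* module names quoted above.
    static String moduleFor(String osName) {
        String os = osName.toLowerCase(Locale.ROOT);
        if (os.contains("linux")) return "foo.linux";
        // Check mac/darwin before windows: "darwin" contains "win".
        if (os.contains("mac") || os.contains("darwin")) return "foo.mac";
        if (os.startsWith("windows")) return "foo.windows";
        throw new UnsupportedOperationException("no binding module for " + osName);
    }

    // "amd64" and "x86_64" name the same architecture; fold them together
    // so that glue code does not need one branch per spelling.
    static String canonicalArch(String osArch) {
        String arch = osArch.toLowerCase(Locale.ROOT);
        if (arch.equals("amd64") || arch.equals("x86_64")) return "amd64";
        return arch;
    }

    public static void main(String[] args) {
        System.out.println(moduleFor(System.getProperty("os.name"))
                + " / " + canonicalArch(System.getProperty("os.arch")));
    }
}
```

This only picks a name.  Whether the generated classes inside each 
platform module share names is exactly the open question above.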


> Also keep in mind that you don't need to be on a valid platform to 
> *build*, only *run*.
>
But header files are quite likely to include platform specific parts, so 
they have to come from somewhere at build time; otherwise your structure 
layouts and function signatures might not match those available on the 
runtime platform.  For my JNI code I use cross compile environments and 
that stuff is just included.
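
To illustrate why the layout has to come from the target platform, here 
is a small sketch that computes the layout of struct bob from bob-data.h 
for a given pointer size and uint64_t alignment.  The class name is made 
up, pointers are assumed to be naturally aligned, and the ABI values in 
main are the ones I would expect for each target (check the relevant 
ABI document before relying on them):

```java
final class BobLayoutSketch {
    // Round offset up to the next multiple of alignment.
    static long alignUp(long offset, long alignment) {
        return (offset + alignment - 1) / alignment * alignment;
    }

    // Layout of: struct bob { struct bob *next; uint64_t bob; };
    // Returns { offset of next, offset of bob, sizeof(struct bob) }.
    static long[] layout(long pointerSize, long u64Align) {
        long nextOffset = 0;
        long bobOffset = alignUp(nextOffset + pointerSize, u64Align);
        long structAlign = Math.max(pointerSize, u64Align);
        long size = alignUp(bobOffset + 8, structAlign);   // uint64_t is 8 bytes
        return new long[] { nextOffset, bobOffset, size };
    }

    public static void main(String[] args) {
        // Assumed values: 64-bit targets use 8-byte pointers and 8-byte
        // uint64_t alignment; 32-bit x86 System V aligns uint64_t to 4.
        long[] lp64 = layout(8, 8);
        long[] i386 = layout(4, 4);
        System.out.println("64-bit: bob at " + lp64[1] + ", size " + lp64[2]);
        System.out.println("i386:   bob at " + i386[1] + ", size " + i386[2]);
    }
}
```

Same header, different offsets and sizes, so Java-side layout 
information generated on one platform cannot simply be reused on another.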

I suppose my main beef is really with jextract: it works great for some 
APIs (obviously the ones in the examples) and looks like a mess for 
others.  Sure it's still in its infancy so these things will be 
addressed at some point, but it's the big selling point when you see 
panama talked about and the canned examples make it look so effortless.  
Don't misunderstand, I want it to be good and usable and preferable to JNI!

For jjmpeg I don't think jextract will be practical yet, because no 
matter what options I use I get tons of /usr/include junk added, and the 
result is going to be about 10x the size of my current implementation 
before I even get started on the glue.

Actually for zcl (OpenCL) I don't think project panama itself can even 
be made to work in its current state.  When you use OpenCL you only 
link (-lOpenCL) with a few functions and the rest of the entry points 
must be resolved using one of those (possibly object-specific too!).  
jextract doesn't support this, and unless there's some way to provide a 
custom resolution function for symbol lookup this seems impossible using 
the panama runtime too.
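
To make the missing piece concrete, here is a toy model of such a 
two-stage lookup.  It is not panama API: the class and its methods are 
hypothetical, lookups are modelled as plain functions from symbol name 
to address, and the addresses are placeholders.  The OpenCL names are 
real; clGetExtensionFunctionAddressForPlatform is the entry point that 
would back the second stage.

```java
import java.util.Map;
import java.util.Optional;
import java.util.function.Function;

final class ChainedLookupSketch {
    // Model a symbol source as a lookup over a fixed name-to-address table.
    static Function<String, Optional<Long>> fromTable(Map<String, Long> table) {
        return name -> Optional.ofNullable(table.get(name));
    }

    // Try the link-time symbols first, then fall back to asking the library.
    static Function<String, Optional<Long>> chain(
            Function<String, Optional<Long>> linked,
            Function<String, Optional<Long>> viaLibrary) {
        return name -> {
            Optional<Long> hit = linked.apply(name);
            return hit.isPresent() ? hit : viaLibrary.apply(name);
        };
    }

    // Demo wiring with placeholder addresses (not real ones).
    static Optional<Long> demoResolve(String name) {
        // Stage 1: what -lOpenCL exports directly.
        Function<String, Optional<Long>> linked = fromTable(Map.of(
                "clGetPlatformIDs", 0x1000L,
                "clGetExtensionFunctionAddressForPlatform", 0x1010L));
        // Stage 2: stands in for calling the lookup function at run time.
        Function<String, Optional<Long>> viaLibrary = fromTable(Map.of(
                "clGetGLContextInfoKHR", 0x2000L));
        return chain(linked, viaLibrary).apply(name);
    }

    public static void main(String[] args) {
        System.out.println(demoResolve("clGetGLContextInfoKHR").isPresent());
    }
}
```

The point is only the shape: the runtime would need a hook where the 
second function can be supplied by the binding author.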

  Z


