Building CUDA bindings for Windows with jextract
Marco Hutter
panama at jcuda.org
Mon Feb 18 19:50:02 UTC 2019
On 18.02.2019 at 19:35, Maurizio Cimadamore wrote:
> Is this an issue (as in 'bug') or is this just a note on how to read
> the jextract command line above? Is there an alternate command line
> that you would have liked better but that doesn't work?
Not a "bug", of course; just a note for people who might wonder or want
to try it out on a different OS.
E.g. on Linux, the default path for CUDA 10 is "/usr/local/cuda-10.0"
> Yep - System.loadLibrary has two modes:
>
> 1) library name -> as in "cublas" this would be expanded into a shared
> library name - e.g. on linux libcublas.so
> 2) full absolute path to lib
To my understanding, System#loadLibrary uses the library name and does
the prefix/suffix magic and scans the java.library.path for that lib. In
contrast to that, System#load (without 'Library') takes the full path.
But sure, in both cases, the names have to match, so this is also not an
issue with Panama. Just something to keep in mind for people who want to
use it. (Maybe I'll set up a README.md somewhere, summarizing some of
these points).
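For illustration, the prefix/suffix expansion that System#loadLibrary
performs can be observed directly via System#mapLibraryName; a minimal
sketch (the CUDA path in the comment is only an illustrative assumption,
not an actual install location):

```java
public class LoadDemo {
    public static void main(String[] args) {
        // System.loadLibrary("cublas") first expands the bare name with
        // the platform prefix/suffix, then scans java.library.path:
        System.out.println(System.mapLibraryName("cublas"));
        // "libcublas.so" on Linux, "cublas.dll" on Windows

        // System.load (without 'Library') does no such expansion and
        // takes a full absolute path instead, e.g. (path assumed):
        // System.load("/usr/local/cuda-10.0/lib64/libcublas.so");
    }
}
```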
> The main issue here is what we have been calling 'library-centric'
> approach - that is, instead of generating many separate classes -
> generate a single root class which has all the required member
> functions (the ones that appear in the shared library), and have all
> required dependencies added in as inner classes. That would indicate
> more clearly where you have to look for.
>
> Note also that, for the purpose of looking inside classes (w/o using a
> jar) you can also use jextract with "-d <dirname>" and output classes
> into a folder, uncompressed.
Even if there are class files: some of the CUDA-based libraries contain
>4000 functions, some of them taking 30 parameters. Generating even
"empty" JavaDoc, just to be able to browse through the API easily, could
be helpful. (Not an issue of jextract either, though; using a decompiler
was fine for me.)
> The long term solution would be the ability to reuse jextract runs by
> pointing jextract at a previous extracted library.
>
> That said, I believe you should be able to extract all three libraries
> in a single shot, by giving the three headers as input to jextract
> (and the three libraries...); that will generate only one version of
> everything. I used this approach for OpenGL which also relies on a
> number of dependent headers - and generated a single jar.
For the first, basic tests, one could pack all libraries into one. But
in terms of modularity (and proper package names), having the option to
declare already generated dependencies would be preferable. I'm also
thinking of the case where new libraries are published later. (This was
the case for CUDA, but it certainly happens for other libraries as well.)
I don't have the slightest idea of how jextract works internally
(although I have probably run into some related issues: some of the
JCuda code is auto-generated, and I'm using the Eclipse CDT to
internally parse the header files into an AST, from which some of the
JNI bindings are generated). But I could imagine that it is tricky to
figure out the required "mapping" between existing headers/JARs, and the
headers that are included by others. In fact, this can become
arbitrarily complicated (or even impossible) when the preprocessor comes
into play...
> Again, I believe this situation will be much improved when we'll move
> from an header-centric view (which of course expose all levels of
> brittleness) towards a more library-centric view of the extraction
> process.
I'm not entirely sure whether I understood this correctly. It sounds
like the move to the "library-centric" approach was on the agenda...?
And wouldn't that mean that for headers like
    exampleDataType.h
    libA.h (including exampleDataType.h)
    libB.h (also including exampleDataType.h)
the class structure would be roughly like this:
    class LibA {
        private static class ExampleDataType {}
    }
    class LibB {
        private static class ExampleDataType {}
    }
making the two "ExampleDataType" bindings different types, and thus
incompatible?
(BTW: If any of my dumb questions have already been discussed, feel free
to ignore them (or maybe point me to the respective thread).)
> enums are currently translated away as annotations - Java enums are
> not an option because in C enums are much closer to ints than to Java
> enums.
>
> That said, the current translation scheme could be improved at least
> by grouping the methods of the enum constants under a common interface
> (which can also define the annotation). That would make it easier to
> see which constants are defined by an enum.
>
> This looks like a bug or something that can be improved.
Yes, sorry, I overlooked this: There is one header ("driver_types.h")
that declares several enums, and it is translated into a "driver_types"
interface. The enums are translated into annotations, as inner types.
But the interface also offers methods for obtaining the constant values
for ALL enums from this file. So the main point, namely having a way to
access the constants, is solved. Having them grouped as in
"enumName.methodForConstantX()" would be nice, though.
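To illustrate why Java enums are not an option here, a minimal sketch:
the constant names and values below mimic the CUDA "cudaError" enum, but
the generated method names and the mapping scheme are only assumptions
for illustration, not actual jextract output.

```java
public class EnumAsIntDemo {
    // In C, an enum constant is essentially an int:
    //   enum cudaError { cudaSuccess = 0, cudaErrorMemoryAllocation = 2, ... };
    // so a binding typically exposes plain int values (names assumed):
    static int cudaSuccess() { return 0; }
    static int cudaErrorMemoryAllocation() { return 2; }

    public static void main(String[] args) {
        int status = cudaErrorMemoryAllocation();
        // C code freely compares and mixes enum values with plain ints
        // (flags, switch cases, arithmetic), which a closed Java enum
        // type could not model directly:
        System.out.println(status == 2);  // prints "true"
    }
}
```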
> Well, yes - accessing non DRAM memory via Unsafe is bound to fail in
> mysterious ways. I'm less positive than you are re. gracefulness, in
> the sense that I don't know under which condition the VM is able to
> recover after a bad Unsafe::put/get. I would expect the mileage might
> vary here - but I'm no VM engineer and I'll leave this specific point
> to others.
Indeed, the details of the various (mis)use scenarios will depend on the
OS, libraries, drivers etc. I just wanted to point out that this is
something that can easily happen accidentally with CUDA, and that the VM
managed to recover at least in my first test.
bye
Marco