RFR: JDK-8319516 - Native library suffix impact on the library loading in AIX- Java Class Loader [v4]
Maurizio Cimadamore
mcimadamore at openjdk.org
Sat Mar 23 00:40:33 UTC 2024
On Fri, 22 Mar 2024 19:56:30 GMT, Martin Doerr <mdoerr at openjdk.org> wrote:
> The symbols get found and the JVM can really call into the `libclang.a(libclang.so.16)`. Impressive! (That doesn't mean that jextract is working. clang crashes the way we are calling it. Maybe because of a thread stack size or other memory management problem.) So, there is basically a valid solution for loading such libraries. Only not very smooth for Java programmers.
Note that the jextract code itself depends on bindings generated via jextract (!!). So, I wonder if there might be some incompatibility in the generated layouts/descriptors, which is then causing the crash. The generated bindings are here:
https://github.com/openjdk/jextract/tree/master/src/main/java/org/openjdk/jextract/clang/libclang
These bindings are effectively shared across Linux/x64, macOS/x64, macOS/arm64 and Win/x64 - that is, all the layouts in there are valid and portable on these platforms (except for `C_LONG` on Windows, but libclang is quite disciplined and only uses `long long`).
I'd try a few steps:
1. check that your libclang.so is not broken, by calling `clang_version` in a C program and making sure that works w/o crashing
2. then try to do the same (w/o jextract) using FFM, from Java, and see if that still works
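For step 2, a minimal FFM sketch could look like the following. It assumes JDK 22 or later and a 64-bit ABI where `CXString` is `{ const void *data; unsigned private_flags; }` plus 4 bytes of tail padding (exactly the kind of assumption worth double-checking on AIX). I'm using `clang_getClangVersion`, `clang_getCString` and `clang_disposeString` as the libclang entry points; the class name and the library path argument are placeholders, not anything jextract ships.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SegmentAllocator;
import java.lang.foreign.StructLayout;
import java.lang.foreign.SymbolLookup;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;
import java.util.Optional;

public class ClangVersionProbe {
    // Assumed CXString layout for a typical 64-bit ABI; verify against the C header on your platform.
    static final StructLayout CX_STRING = MemoryLayout.structLayout(
            ValueLayout.ADDRESS.withName("data"),
            ValueLayout.JAVA_INT.withName("private_flags"),
            MemoryLayout.paddingLayout(4));

    // Returns the libclang version string, or empty if the library cannot be loaded.
    static Optional<String> probe(String libPath) {
        try (Arena arena = Arena.ofConfined()) {
            SymbolLookup lookup;
            try {
                lookup = SymbolLookup.libraryLookup(libPath, arena);
            } catch (IllegalArgumentException e) {
                return Optional.empty(); // library missing or not loadable
            }
            Linker linker = Linker.nativeLinker();
            MethodHandle getVersion = linker.downcallHandle(
                    lookup.find("clang_getClangVersion").orElseThrow(),
                    FunctionDescriptor.of(CX_STRING));
            MethodHandle getCString = linker.downcallHandle(
                    lookup.find("clang_getCString").orElseThrow(),
                    FunctionDescriptor.of(ValueLayout.ADDRESS, CX_STRING));
            MethodHandle dispose = linker.downcallHandle(
                    lookup.find("clang_disposeString").orElseThrow(),
                    FunctionDescriptor.ofVoid(CX_STRING));

            // A struct-returning downcall takes a SegmentAllocator as its first argument.
            MemorySegment cxString = (MemorySegment) getVersion.invoke((SegmentAllocator) arena);
            MemorySegment chars = (MemorySegment) getCString.invoke(cxString);
            // Copy the C string into a Java String before the CXString is disposed.
            String version = chars.reinterpret(Long.MAX_VALUE).getString(0);
            dispose.invoke(cxString);
            return Optional.of(version);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        String path = args[0]; // e.g. the full path to your libclang
        System.out.println(probe(path).orElse("could not load " + path));
    }
}
```

If this prints a version string, loading and a by-value struct return both work outside jextract; if it crashes, the problem is below jextract.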
If even (1) fails, you might just have a bad libclang, or one that is not for your system (not all the binary downloads on the LLVM website worked on my machine, even if they were supposedly compatible).
If (1) succeeds, but (2) fails, that would indicate some general issue with libclang and the JVM. There is an issue that we currently have to work around, where libclang tries to install its own signal handlers, which mess up the JVM's signal handler to deal with NPEs (and that causes a random JVM crash). This is documented here:
https://reviews.llvm.org/D23662
We try to call "setenv" to disable that logic, but that might fail or not be supported on your platform, so worth checking that.
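To check whether setting the variable from Java works at all on your platform, something like this sketch might help. It assumes JDK 22 or later, a POSIX libc that exposes `setenv` and `getenv` through the linker's default lookup, and that the variable libclang looks at is `LIBCLANG_DISABLE_CRASH_RECOVERY` (that is my understanding; please double-check against your libclang version). The class name is made up for illustration. Note that `System.getenv` is not suitable for the read-back, since the JVM snapshots the environment at startup.

```java
import java.lang.foreign.Arena;
import java.lang.foreign.FunctionDescriptor;
import java.lang.foreign.Linker;
import java.lang.foreign.MemorySegment;
import java.lang.foreign.SymbolLookup;
import java.lang.foreign.ValueLayout;
import java.lang.invoke.MethodHandle;

public class NativeEnv {
    private static final Linker LINKER = Linker.nativeLinker();
    private static final SymbolLookup LIBC = LINKER.defaultLookup();

    // Calls int setenv(const char *name, const char *value, int overwrite); returns 0 on success.
    static int setenv(String name, String value) {
        try (Arena arena = Arena.ofConfined()) {
            MethodHandle handle = LINKER.downcallHandle(
                    LIBC.find("setenv").orElseThrow(),
                    FunctionDescriptor.of(ValueLayout.JAVA_INT,
                            ValueLayout.ADDRESS, ValueLayout.ADDRESS, ValueLayout.JAVA_INT));
            // setenv copies both strings, so a confined arena is sufficient here.
            return (int) handle.invoke(arena.allocateFrom(name), arena.allocateFrom(value), 1);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    // Calls char *getenv(const char *name); returns null if the variable is not set.
    static String getenv(String name) {
        try (Arena arena = Arena.ofConfined()) {
            MethodHandle handle = LINKER.downcallHandle(
                    LIBC.find("getenv").orElseThrow(),
                    FunctionDescriptor.of(ValueLayout.ADDRESS, ValueLayout.ADDRESS));
            MemorySegment result = (MemorySegment) handle.invoke(arena.allocateFrom(name));
            return result.address() == 0
                    ? null
                    : result.reinterpret(Long.MAX_VALUE).getString(0);
        } catch (Throwable t) {
            throw new RuntimeException(t);
        }
    }

    public static void main(String[] args) {
        int rc = setenv("LIBCLANG_DISABLE_CRASH_RECOVERY", "1");
        System.out.println("setenv returned " + rc
                + ", native getenv sees: " + getenv("LIBCLANG_DISABLE_CRASH_RECOVERY"));
    }
}
```

If `setenv` cannot be found or returns non-zero, libclang's own signal handling is probably still active and worth ruling out before anything else.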
Finally, if (1) and (2) both succeed, but you get spurious JVM crashes with jextract, then I'd start looking at jextract's bindings (the ones in the folder above). Pick a struct (e.g. CXCursor, or CXString), inspect its layout, and make sure it corresponds to what the layout should be on AIX. It is possible that the AIX compiler inserts some extra padding, and then passing structs with the wrong size in and out of libclang would explain the issues.
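One way to do that comparison is to dump the size and field offsets of a layout from Java and put them next to `sizeof`/`offsetof` output from a small C program built with the AIX compiler. The sketch below assumes JDK 22 or later and hand-writes a `CXCursor` layout as `{ enum kind; int xdata; const void *data[3]; }`; it is not the generated jextract binding, and the class and helper names are made up.

```java
import java.lang.foreign.MemoryLayout;
import java.lang.foreign.MemoryLayout.PathElement;
import java.lang.foreign.StructLayout;
import java.lang.foreign.ValueLayout;

public class LayoutDump {
    // Hand-written approximation of CXCursor; compare it with the generated binding and the C header.
    static final StructLayout CX_CURSOR = MemoryLayout.structLayout(
            ValueLayout.JAVA_INT.withName("kind"),
            ValueLayout.JAVA_INT.withName("xdata"),
            MemoryLayout.sequenceLayout(3, ValueLayout.ADDRESS).withName("data"));

    // Renders total size, alignment, and the offset and size of every named member.
    static String describe(StructLayout layout) {
        StringBuilder sb = new StringBuilder();
        sb.append("size=").append(layout.byteSize())
          .append(" align=").append(layout.byteAlignment()).append('\n');
        for (MemoryLayout member : layout.memberLayouts()) {
            if (member.name().isEmpty()) {
                continue; // unnamed members are padding
            }
            String name = member.name().get();
            long offset = layout.byteOffset(PathElement.groupElement(name));
            sb.append("  ").append(name)
              .append(" offset=").append(offset)
              .append(" size=").append(member.byteSize()).append('\n');
        }
        return sb.toString();
    }

    public static void main(String[] args) {
        System.out.print(describe(CX_CURSOR));
    }
}
```

Any difference between these numbers and what the C compiler reports for the same struct would point at a layout mismatch rather than at libclang itself.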
Hope this helps!
-------------
PR Comment: https://git.openjdk.org/jdk/pull/17945#issuecomment-2016231211
More information about the core-libs-dev mailing list