JMOD, native libraries and the packaging of JavaFX

Thu May 10 15:05:32 UTC 2018

I couldn't find any support in FreeBSD, although there is "fdlopen", which
opens a shared library direct from a file descriptor. I haven't tried it.

Loading a library from a file or memory region is an obvious use case
that'd be helpful for anyone who wants to distribute programs in the form
of single files (whether jars or exes or elf binaries), but it's not well
supported. Here's a bit of background on why not.

The blame can mostly be laid at the feet of the ubiquitous performance
optimisation mmap/MapViewOfFile. The idea is, map your shared library into
memory and let the kernel lazily load only what's needed instead of the
whole thing. In the days when memory was very scarce and disks were very
slow this could be a big help. Unfortunately it imposes some strict limits
on what you can do. Kernels really want mmaps to be page-aligned at every
level, so you can't tell the kernel to map a shared library starting from
some arbitrary offset in the file. This *could* be supported, but isn't,
presumably to simplify kernel code.
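
To make the alignment restriction concrete, here's a minimal Python sketch
that tries to map part of a file at a given offset. The helper name
try_map_at is made up for illustration. It assumes a platform where Python's
mmap module is available and relies on the documented rule that the offset
must be a multiple of mmap.ALLOCATIONGRANULARITY.

```python
import mmap


def try_map_at(path, offset, length):
    """Return True if `length` bytes of `path` can be mapped read-only
    starting at `offset`, or False if the mapping is refused."""
    with open(path, "rb") as f:
        try:
            with mmap.mmap(f.fileno(), length,
                           access=mmap.ACCESS_READ, offset=offset):
                return True
        except (ValueError, OSError):
            # A misaligned offset is rejected, either by Python's own
            # argument checks or by the underlying mmap/MapViewOfFile call.
            return False
```

With a file a little over two granules long, an offset of
mmap.ALLOCATIONGRANULARITY should be accepted and an offset of 1 should be
refused. That refusal is the problem for a library embedded at some arbitrary
position inside a jar or exe.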

I was curious if it's still the case that mmap is so important. Putting
aside the question of OS support, would you lose a lot of performance by
just loading the file into memory all at once with regular file IO and then
adjusting the page permissions afterwards (with mprotect, or VirtualProtect
on Windows)?

Shared libraries are not very large by modern standards. HotSpot libjvm is
13 MB on macOS, and the largest DSO I found in my Linux
/usr/lib/x86_64-linux-gnu directory was libicudata.so.57.1 which weighs in
at a generous 25 megabytes. ICU is unusual (it's mostly Unicode data tables
which are enormous). The next largest is libgs (ghostscript) which is 16 MB.
So it seems plausible that 15-20 MB is about the largest shared library Java
users are likely to want to load (that's a LOT of C++!).

Running a simple benchmark on a cheap Linode VM:

root@plan99:/usr/lib/x86_64-linux-gnu# echo 3 > /proc/sys/vm/drop_caches
root@plan99:/usr/lib/x86_64-linux-gnu# time cat libicudata.so.57.1 >/dev/null
real 0m0.046s
user 0m0.000s
sys 0m0.013s

46 ms to load a 25 MB DSO into memory from disk? 8 ms to do it again when
hot in the cache? mmap is surely instant, but it's not clear to me that
mmap matters much anymore if you're already paying the cost of
interpreting/JIT compiling. In an age where people routinely ship apps as
*entire operating systems* (Docker images), it feels like we're being held
back here by obsolete optimisations.
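
For anyone who wants to repeat the comparison from inside a process rather
than with cat, here's a rough Python sketch. It is not the benchmark I ran
above. The function names are hypothetical, and the timings will depend
heavily on the machine and on whether the file is already in the page cache.
One function reads the whole file with ordinary file IO. The other maps it
and touches one byte per page, so the kernel has to fault every page in.

```python
import mmap
import time


def time_full_read(path):
    """Read the whole file with ordinary file I/O.
    Returns (bytes_read, elapsed_seconds)."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        data = f.read()
    return len(data), time.perf_counter() - start


def time_mmap_touch(path):
    """Map the file read-only and touch one byte in every page.
    Returns (bytes_mapped, elapsed_seconds). The file must not be empty."""
    start = time.perf_counter()
    with open(path, "rb") as f:
        with mmap.mmap(f.fileno(), 0, access=mmap.ACCESS_READ) as m:
            size = len(m)
            checksum = 0
            for off in range(0, size, mmap.PAGESIZE):
                checksum += m[off]  # forces the page to be faulted in
    return size, time.perf_counter() - start
```

Touching every page is the worst case for mmap. A real library load would
normally fault in only the pages it uses, so this overstates mmap's cost and
is only useful as an upper bound.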

Unfortunately on most platforms the system dynamic linker has special
privileges. Debuggers handshake with it, and on Windows only the OS linker
can produce an HMODULE even though HMODULE is just a pointer to the base
address of the mapped image. HMODULEs are in turn required by a few old
Windows APIs. So, using a custom linker imposes some small sacrifices.

I'll leave the topic here.