PING: RFR: 8241439: jdk.NativeLibraryEvent hooks all opened regular files

Yasumasa Suenaga suenaga at oss.nttdata.com
Tue May 26 07:04:35 UTC 2020


Hi Thomas,

Thanks for your comment!

On 2020/05/26 14:48, Thomas Stüfe wrote:
> .. another point occurring to me. I wonder how robust this API is in comparison with reading the proc fs. Since we rely on it in crash situations when writing the hs-error file. I'd be afraid that we run into situations where e.g. ELF information is corrupted  in some library and we'd get secondary crashes during error reporting. Reading /proc/pid/maps is pretty robust, it does not rely on data structures of the process itself. So, if we change this - and I'm not sure we should - I'd really like to see a new error reporting step explicitly printing out proc/pid/maps to the error file too, as a fallback.

This change does not affect to hs_err.
Memory mappings for error reporting would be generated at os::print_dll_info() in os_linux.cpp, and it still uses /proc/<PID>/maps.

> Cheers, Thomas
> 
> On Tue, May 26, 2020 at 7:37 AM Thomas Stüfe <thomas.stuefe at gmail.com <mailto:thomas.stuefe at gmail.com>> wrote:
> 
>     Hi Yasumasa,
> 
>     Sorry, I share Davids reservations on this one.
> 
>     For one, the bug description is misleading, maybe that is why you don't get feedback: you are proposing to switch the underlying implementation of os::get_loaded_module_info() from /proc/pid/maps - which it has been for ages - to dl_iterate_phdr(3). The bug description should clearly say so, that would lead the discussion into the right channels.

I've written why it is a problem - it does not match the spec of os::get_loaded_modules_info().
I added a comment that we can resolve this issue to use dl_iterate_phdr(3), and it affects NativeLibrary event only.

>     Switching this implementation affects more callers than just jdk.NativeLibraryEvent. For instance, it is used behind os::address_is_in_vm(), and behind a number of monitoring facilities (jfr, error reporting etc).

I grep'ed HotSpot source, but I could not find other place to use it excepting NativeLibrary event in JFR.
os::address_is_in_vm() for Linux uses dladdr(3).

>     So, how exactly would be new list differ? Do you have example printouts? And, do we even want this to change?
> 
>     While it is true that os::get_loaded_module_info() on Linux always returned the whole process mapping instead of just the list of shared libraries, this was a "bug" I always secretly liked. E.g. in hs-err files, full mapping is valuable information. I would hate to see it go. So, if you propose to change this, we should discuss how to replace that information.

As I said in above, it is not affect hs_err, so we can still get memory mappings which come from /proc/<PID>/maps.

>     Other questions which pop in mind are:
> 
>     - would using dl_iterate_phdr(3) make us dependent on ELF versioning of the loaded libraries?

In case of glibc, dl_iterate_phdr(3) seems to use the space for dynamic linker (see comments in link.h in glibc source).
So I think the result of dl_iterate_phdr(3) is valid for library path and addresses even if the shared library has symbol versioning.

>     - does it rely on glibc or will it work with muslc too (Alpine Linux, see JEP 386)

musl libc has already added dl_iterate_phdr(3):
   http://git.musl-libc.org/cgit/musl/commit/?id=18c0e02e2bd53ceedbb843b06ff90890f1c734b0


Cheers,

Yasumasa


>     The beauty of using the proc file system is that it is dead simple, always works, and can be simply verified manually.
> 
>     I think if you your intent is to just fix jdk.NativeLibraryEvent, I would do the filtering up there.
> 
>     Cheers, Thomas
> 
> 
> 
> 
> 
> 
>     On Tue, May 26, 2020 at 2:26 AM Yasumasa Suenaga <suenaga at oss.nttdata.com <mailto:suenaga at oss.nttdata.com>> wrote:
> 
>         Hi all,
> 
>         I've sent review request last month, but nobody has reviewed. Could you review it?
> 
>          >    JBS: https://bugs.openjdk.java.net/browse/JDK-8241439
>          >    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8241439/webrev.01/
> 
>         os::get_loaded_modules_info() has introduced in JDK-8056242, and it says this function is for dll/so/dylibs.
>         In fact, os::get_loaded_modules_info() uses EnumProcessModules() in Windows.
> 
>         In Linux, get_loaded_modules_info() depends on /proc/<PID>/maps, so it hooks regular files (e.g. JAR, CDS archive), and hooks multiple memory regions by one shared library.
>         We should merge same library to one event, and should not hook regular files (not shared library).
> 
>         All comments are welcome.
> 
> 
>         Thanks,
> 
>         Yasumasa
> 
> 
>         On 2020/04/16 9:30, Yasumasa Suenaga wrote:
>          > PING: Could you review it?
>          >
>          >    JBS: https://bugs.openjdk.java.net/browse/JDK-8241439
>          >    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8241439/webrev.01/
>          >
>          >
>          > Yasumasa
>          >
>          >
>          > On 2020/04/09 16:00, Yasumasa Suenaga wrote:
>          >> Hi all,
>          >>
>          >> Please review this change:
>          >>
>          >>    JBS: https://bugs.openjdk.java.net/browse/JDK-8241439
>          >>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8241439/webrev.00/
>          >>
>          >> HotSpot has os::get_loaded_modules_info() to get all loaded native libraries, however it would call callback in twice on same library on Linux. Also it would call callback even if the file is not a library.
>          >>
>          >> We can see this problem in jdk.NativeLibrary JFR event as below:
>          >>
>          >>
>          >> ```
>          >> jdk.NativeLibrary {
>          >>     startTime = 16:59:06.621
>          >>     name = "/usr/lib64/libc-2.30.so <http://libc-2.30.so>"
>          >>     baseAddress = 0x7FDB06ACC000
>          >>     topAddress = 0x7FDB06AF1000
>          >> }
>          >>
>          >> jdk.NativeLibrary {
>          >>     startTime = 16:59:06.621
>          >>     name = "/usr/lib64/libc-2.30.so <http://libc-2.30.so>"
>          >>     baseAddress = 0x7FDB06AF1000
>          >>     topAddress = 0x7FDB06C40000
>          >> }
>          >>
>          >>      :
>          >>
>          >> ```
>          >>
>          >> ```
>          >> jdk.NativeLibrary {
>          >>     startTime = 16:59:06.621
>          >>     name = "/home/ysuenaga/OpenJDK/jdk/build/linux-x86_64-server-fastdebug/images/jdk/lib/server/classes.jsa"
>          >>     baseAddress = 0x800000000
>          >>     topAddress = 0x800007000
>          >> }
>          >> ```
>          >>
>          >> This change uses dl_iterate_phdr() to get them, so these problem has gone.
>          >>
>          >>
>          >> Thanks,
>          >>
>          >> Yasumasa
> 


More information about the hotspot-jfr-dev mailing list