PING: RFR: 8241439: jdk.NativeLibraryEvent hooks all opened regular files

Thomas Stüfe thomas.stuefe at gmail.com
Tue May 26 05:48:10 UTC 2020


.. another point occurring to me. I wonder how robust this API is in
comparison with reading the proc fs. Since we rely on it in crash
situations when writing the hs-error file. I'd be afraid that we run into
situations where e.g. ELF information is corrupted  in some library and
we'd get secondary crashes during error reporting. Reading /proc/pid/maps
is pretty robust, it does not rely on data structures of the process
itself. So, if we change this - and I'm not sure we should - I'd really
like to see a new error reporting step explicitly printing out
proc/pid/maps to the error file too, as a fallback.

Cheers, Thomas

On Tue, May 26, 2020 at 7:37 AM Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:

> Hi Yasumasa,
>
> Sorry, I share Davids reservations on this one.
>
> For one, the bug description is misleading, maybe that is why you
> don't get feedback: you are proposing to switch the underlying
> implementation of os::get_loaded_module_info() from /proc/pid/maps - which
> it has been for ages - to dl_iterate_phdr(3). The bug description should
> clearly say so, that would lead the discussion into the right channels.
>
> Switching this implementation affects more callers than
> just jdk.NativeLibraryEvent. For instance, it is used
> behind os::address_is_in_vm(), and behind a number of monitoring facilities
> (jfr, error reporting etc).
>
> So, how exactly would be new list differ? Do you have example printouts?
> And, do we even want this to change?
>
> While it is true that os::get_loaded_module_info() on Linux always
> returned the whole process mapping instead of just the list of shared
> libraries, this was a "bug" I always secretly liked. E.g. in hs-err files,
> full mapping is valuable information. I would hate to see it go. So, if you
> propose to change this, we should discuss how to replace that information.
>
> Other questions which pop in mind are:
>
> - would using dl_iterate_phdr(3) make us dependent on ELF versioning of
> the loaded libraries?
> - does it rely on glibc or will it work with muslc too (Alpine Linux, see
> JEP 386)
>
> The beauty of using the proc file system is that it is dead simple, always
> works, and can be simply verified manually.
>
> I think if you your intent is to just fix jdk.NativeLibraryEvent, I would
> do the filtering up there.
>
> Cheers, Thomas
>
>
>
>
>
>
> On Tue, May 26, 2020 at 2:26 AM Yasumasa Suenaga <suenaga at oss.nttdata.com>
> wrote:
>
>> Hi all,
>>
>> I've sent review request last month, but nobody has reviewed. Could you
>> review it?
>>
>> >    JBS: https://bugs.openjdk.java.net/browse/JDK-8241439
>> >    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8241439/webrev.01/
>>
>> os::get_loaded_modules_info() has introduced in JDK-8056242, and it says
>> this function is for dll/so/dylibs.
>> In fact, os::get_loaded_modules_info() uses EnumProcessModules() in
>> Windows.
>>
>> In Linux, get_loaded_modules_info() depends on /proc/<PID>/maps, so it
>> hooks regular files (e.g. JAR, CDS archive), and hooks multiple memory
>> regions by one shared library.
>> We should merge same library to one event, and should not hook regular
>> files (not shared library).
>>
>> All comments are welcome.
>>
>>
>> Thanks,
>>
>> Yasumasa
>>
>>
>> On 2020/04/16 9:30, Yasumasa Suenaga wrote:
>> > PING: Could you review it?
>> >
>> >    JBS: https://bugs.openjdk.java.net/browse/JDK-8241439
>> >    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8241439/webrev.01/
>> >
>> >
>> > Yasumasa
>> >
>> >
>> > On 2020/04/09 16:00, Yasumasa Suenaga wrote:
>> >> Hi all,
>> >>
>> >> Please review this change:
>> >>
>> >>    JBS: https://bugs.openjdk.java.net/browse/JDK-8241439
>> >>    webrev: http://cr.openjdk.java.net/~ysuenaga/JDK-8241439/webrev.00/
>> >>
>> >> HotSpot has os::get_loaded_modules_info() to get all loaded native
>> libraries, however it would call callback in twice on same library on
>> Linux. Also it would call callback even if the file is not a library.
>> >>
>> >> We can see this problem in jdk.NativeLibrary JFR event as below:
>> >>
>> >>
>> >> ```
>> >> jdk.NativeLibrary {
>> >>     startTime = 16:59:06.621
>> >>     name = "/usr/lib64/libc-2.30.so"
>> >>     baseAddress = 0x7FDB06ACC000
>> >>     topAddress = 0x7FDB06AF1000
>> >> }
>> >>
>> >> jdk.NativeLibrary {
>> >>     startTime = 16:59:06.621
>> >>     name = "/usr/lib64/libc-2.30.so"
>> >>     baseAddress = 0x7FDB06AF1000
>> >>     topAddress = 0x7FDB06C40000
>> >> }
>> >>
>> >>      :
>> >>
>> >> ```
>> >>
>> >> ```
>> >> jdk.NativeLibrary {
>> >>     startTime = 16:59:06.621
>> >>     name =
>> "/home/ysuenaga/OpenJDK/jdk/build/linux-x86_64-server-fastdebug/images/jdk/lib/server/classes.jsa"
>> >>     baseAddress = 0x800000000
>> >>     topAddress = 0x800007000
>> >> }
>> >> ```
>> >>
>> >> This change uses dl_iterate_phdr() to get them, so these problem has
>> gone.
>> >>
>> >>
>> >> Thanks,
>> >>
>> >> Yasumasa
>>
>


More information about the hotspot-jfr-dev mailing list