RFC: JEP 386 patch - minor cleanup
Magnus Ihse Bursie
magnus.ihse.bursie at oracle.com
Thu Aug 13 10:05:55 UTC 2020
On 2020-08-12 18:58, Aleksei Voitylov wrote:
> Hi Magnus,
>
> thank you for looking into it. Here is a webrev that addresses your
> comments:
>
> http://cr.openjdk.java.net/~avoitylov/webrev.8247589.02/
Thank you, looks much better! (If I may ask of anything more, it is to
keep a consistent ordering OS -> CPU -> LIBC where they all occur
together. It helps keep your sanity up when reading the code)
/Magnus
>
> As far as code changes are concerned, I'd still like to seek consensus
> within this project if the patch is overall good and has reached a point
> to submit for pre-review upstream. Meanwhile, it was concluded that
> JDK-8250630 is a generic issue that will be submitted upstream
> separately - it has to do with linux networking stack configuration.
>
> -Aleksei
>
> On 10/08/2020 15:52, Magnus Ihse Bursie wrote:
>> On 2020-07-28 19:20, Aleksei Voitylov wrote:
>>> Hi Mikael,
>>>
>>> thank you for looking into it. Here is an updated version of the webrev
>>> with your comments addressed:
>>>
>>> http://cr.openjdk.java.net/~avoitylov/webrev.8247589.01/
>> Some build comments:
>>
>> make/autoconf/build-aux/config.sub:
>> I think you could match on just "linux-musl", to facilitate future
>> ports to other architectures.
>>
>> make/autoconf/platform.m4:
>> The new clib matching should move to a separate
>> PLATFORM_EXTRACT_VARS_FROM_CLIB function. (Maybe these should have
>> been named "FOR" instead of "FROM"...)
>>
>> Apart from this, it looks good from a build perspective.
>>
>> /Magnus
>>> Please see inline for details.
>>>
>>> On 08/07/2020 21:52, Mikael Vidstedt wrote:
>>>>> On Jun 26, 2020, at 5:00 AM, Aleksei Voitylov
>>>>> <aleksei.voitylov at bell-sw.com> wrote:
>>>>>
>>>>> Hi,
>>>>>
>>>>> As JEP 386 is now Candidate [1] I'd like to request comments on the
>>>>> patch [2] before moving the JEP to PTT and submitting for review
>>>>> upstream. The patch is rebased to 59791:a22af2c3d969 and minor cleanup
>>>>> of the changes from the original portola branch [3] was made.
>>>>>
>>>>> Webrev [2]: http://cr.openjdk.java.net/~avoitylov/webrev.8247589
>>>>> Original [3]: http://cr.openjdk.java.net/~avoitylov/webrev.portola
>>>> So glad to see that this is making progress, thanks a lot for
>>>> picking it up! I’ve had a quick look at the changes and have a few
>>>> initial comments, inlined below. I’ll also mention that I recently
>>>> started the work of bringing the portola repo up-to-date again after
>>>> the move to GitHub earlier this year. You’ll see that some of the
>>>> changes I’m making there are not the same as the ones you have in
>>>> this patch. Don’t read anything into that please - I started making
>>>> those changes before I realized that you had this patch out for
>>>> review and it wouldn’t surprise me if your version is preferable in
>>>> many/most cases. Let’s figure out how to resolve that in the portola
>>>> repo a bit later.
>>>>
>>>> Disclaimer: this is *not* a complete review, and should not be
>>>> considered as blessing the changes for upstream/mainline. For that
>>>> you’ll need to get the changes sent out to the respective lists
>>>> (build-dev, hotspot-dev, …).
>>>>
>>>> High level comment: There are a handful of JBS issues with Fix
>>>> Version=repo-portola which would probably be good to have a look at
>>>> and possibly close if they’re no longer relevant.
>>>>
>>> I did some bug scrubbing, here is the list of open issues:
>>>
>>> JDK-8241092 make test-bundles fails on Alpine Linux because of
>>> TestTLS.java test
>>> is fixed as part of the proposed patch.
>>> JDK-8240187 Test
>>> hotspot/jtreg/runtime/jni/terminatedThread/TestTerminatedThread.java
>>> fails on Alpine Linux
>>> the test is excluded as part of the proposed patch.
>>> JDK-8241053 Hotspot
>>> runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java test
>>> fails on Alpine Linux with debug build
>>> - runtime/CommandLine/OptionsValidation/TestOptionsWithRanges.java
>>> - runtime/Thread/TestThreadStackSizes.java
>>> tests are fixed as part of the proposed patch, David suggests to fix
>>> it in mainline as it does not seem to be specific to the port. I tend to
>>> agree with the evaluation.
>>> JDK-8178692 Clean up RequiresSetenv / LD_LIBRARY_PATH
>>> please see below
>>> JDK-8250630 test/jdk/com/sun/jdi/JdwpListenTest.java fails on Alpine
>>> Linux
>>> - test/jdk/com/sun/jdi/JdwpAllowTest.java
>>> - test/jdk/com/sun/jdi/JdwpListenTest.java
>>> Alex is looking into the reason of the issue. It could be because of
>>> the network settings on the host or the way the getaddrinfo(3) returns
>>> addresses in musl. Currently it looks to be not very specific to the
>>> port, though.
>>>
>>> These issues reflect the current testing status of the port.
>>>>> Build System, Toolchains:
>>>>>
>>>>> - make/autoconf/build-aux/config.guess - replaced the "head -1 | cut
>>>>> -f1" construct with sed to be slightly more robust
>>>>> - make/autoconf/build-aux/config.sub - style brought in line with the
>>>>> current file style
>>>>> - make/autoconf/hotspot.m4 - HOTSPOT_SETUP_SA is no longer required
>>>>> after JDK-8239450 [4]
>>>>> - make/modules/jdk.jdwp.agent/Lib.gmk, make/test/JtregNativeJdk.gmk,
>>>>> hotspot/share/gc/shared/genCollectedHeap.cpp - disabling GCC warnings
>>>>> and modifying initialization is no longer required, Alpine Linux
>>>>> toolchain is quite recent
>>>> Great cleanups! Happy to see that the SA special casing is no longer
>>>> needed - https://bugs.openjdk.java.net/browse/JDK-8181313 is also to
>>>> thank for that.
>>>>
>>>> * ReleaseFile.gmk
>>>>
>>>> The choice to use “LIBC” as the key name in the releases file was
>>>> not something I/we spent a lot of time on, so we’ll want to revisit
>>>> that and make sure it’s the right name or come up with a better one.
>>> For ReleaseFile.gmk I think LIBC is good, as it is used as a handle for
>>> the actual OPENJDK_TARGET_LIBC. For source code, MUSL_LIBC is what came
>>> up as the least intrusive indication that musl is a kind of libc.
>>> Leaving it just as MUSL could be confusing.
>>>>> Desktop:
>>>>>
>>>>> - src/java.desktop/unix/native/libawt_xawt/xawt/XToolkit.c -
>>>>> changes are
>>>>> present upstream and no longer required
>>>> Yay!
>>>>
>>>>> HotSpot:
>>>>>
>>>>> - src/hotspot/os/linux/os_linux.cpp - checks for MUSL can be made
>>>>> static. Defined stub dlvsym for MUSL instead of dlvsym_if_available
>>>>> logic
>>>>> - src/jdk.hotspot.agent/linux/native/libsaproc/ps_proc.c - using
>>>>> strerror is cleaner and more portable, the code originally present
>>>>> there
>>>>> is not thread-safe so it's ok to use it
>>>>> - src/jdk.jdwp.agent/share/native/libjdwp/util.h - including a header
>>>>> instead
>>>> * os_linux.cpp
>>>>
>>>> Out of curiosity, have you tested with/without the various libnuma
>>>> library versions?
>>> Please see the updated version of the patch which is tested to work with
>>> (and without) libnuma v1.2 on Alpine Linux. Note that musl can't
>>> recognize library versions, and it only allows to install v1.2 in the
>>> distribution. It's good there is a single place where a change was
>>> required.
>>>> set_glibc_version -> set_libc_version?
>>> Fixed, thanks.
>>>> * globalDefinitions_gcc.hpp
>>>>
>>>> Did you verify that the change to use isnan instead of isnanf is
>>>> correct for all versions of linux/bsd and the respective (supported)
>>>> toolchain versions? See:
>>>> https://bugs.openjdk.java.net/browse/JDK-8178689. I linked it to the
>>>> JEP in JBS.
>>> I updated JDK-8178689 with proof that such a replacement is safe for all
>>> practical combinations of gcc/glibc and llvm versions.
>>>> * ps_proc.c
>>>>
>>>> Hm.. What makes you say that the original code is not thread-safe?
>>> I checked LinuxDebuggerLocal.java and all attach0 calls are guarded by
>>> corresponding syncronized constructs.
>>>
>>> That's because nobody claims that ptrace_attach is thread-safe or
>>> re-enterant. It depends to kernel implementation and attempt to use it
>>> from multiple threads could lead to completely unpredictable results. If
>>> we run ptrace_attach from different threads but with the same PID, it
>>> either crash the target process or fail. And it doesn't have any sense
>>> from serviceability agent point of view.
>>>
>>> Running ptrace_attach from multiple threads with different PIDs should
>>> be more or less OK - I trust modern kernel, but I don't see any usecase
>>> or command line support for using hotspot agent with multiple pids.
>>>
>>> So using strerror instead of strerror_r to mitigate compatibility issues
>>> should not be a problem.
>>>>> java.base:
>>>>>
>>>>> - src/java.base/linux/native/libnet/linux_close.c - made (SIGRTMAX
>>>>> - 2)
>>>>> a define
>>>>> - src/java.base/unix/native/libjava/childproc.c - suggest to skip this
>>>>> for now. Also, if the OS does not handle scripts without shebang it
>>>>> should not be emulated.
>>>> * jdk_util_md.h
>>>>
>>>> The same isnan comment as above.
>>> This is no longer relevant, fixed upstream.
>>>> * INTERRUPT_SIGNAL
>>>>
>>>> Can the definition of INTERRUPT_SIGNAL (perhaps then with a better
>>>> name) be moved to something like net_util_md.h so that it is shared
>>>> for both files using it? I’d also remove the spaces around the
>>>> arguments to pthread_kill in linux_close.c (pre-existing).
>>> Done, renamed INTERRUPT_SIGNAL to WAKEUP_SIGNAL.
>>>> * java_md.c
>>>>
>>>> Did you look into alternatives ways to resolve the issue? As covered
>>>> by https://bugs.openjdk.java.net/browse/JDK-8178692 I was hoping we
>>>> could find some other way of dealing with that to avoid the the
>>>> LD_LIBRARY_PATH dance and the ugly/fragile fallout from it.
>>> Yes, but neither of them looked better [1, 2].
>>>> * childproc.c
>>>>
>>>> Remind me: what happens without the fix? I’m thinking there was some
>>>> reason I made it to start with..?
>>> Without the fix we rely on the OS execve mechanism. On Alpine it does
>>> not currently handle files without shebang [3].
>>>>> Tests:
>>>>>
>>>>> - added method isBusyBox() to test/lib/jdk/test/lib/Platform.java
>>>>> - added WB_InternalVMInfo method to
>>>>> hotspot/share/prims/whitebox.cpp for
>>>>> better test handling
>>>>> - refactored affected tests to cater for the above
>>>> * VMProps.java
>>>>
>>>> If I read it correctly I think WB.internalVMInfo().contains("musl”)
>>>> may incorrectly return true if the VM info strings happens to
>>>> contain “musl” in, say, HOTSPOT_BUILD_USER or whatever, so you’ll
>>>> probably want something more targeted for the check. Also, the
>>>> pattern used in other places in the file to convert the boolean to a
>>>> String seems to be “" + p.
>>> Yes, converted the check into
>>> WhiteBox.getWhiteBox().getLibcName().contains("musl") - this should be
>>> more robust.
>>>> * exeinvoke.c
>>>>
>>>> I’m don’t know the "complicated interaction of signal handlers” well
>>>> enough to understand if the changes are correct, but assuming they are:
>>>>
>>>> I’d feel better having the whole JDK1_1InitArgs jdk_args struct
>>>> initialized (memset).
>>>>
>>>> size_t stacksize seems to be unused. Perhaps get_java_stacksize
>>>> should return a size_t?
>>>>
>>>> I’d also remove the spaces between pthread_ and the argument list
>>>> while your at it (also pre-existing), not exactly obvious why only
>>>> the pthread_ calls have them..
>>> This is addressed in the updated webrev.
>>>> * exestack-tls.c
>>>>
>>>> As a potential alternative, base the decision on a define set by the
>>>> build system (MUSL_LIBC). Just something to consider. What you have
>>>> should work.
>>> I'd suggest to keep it as is - there is no other precedent of this in
>>> tests to support altering the build system.
>>>> * TestTerminatedThread.java
>>>>
>>>> I don’t know where we stand on adding comments, but personally I’d
>>>> prefer a comment explaining why the test is excluded when running
>>>> with musl (not unlike the one you have in NativeLibraryTest.java).
>>> Fixed in the updated webrev.
>>>> * Test7029048.java
>>>>
>>>> Can VMProps::isMusl be reused? I’m assuming that moving the
>>>> predicate definition to TestHelper would require a slew of tests to
>>>> use -XX:+WhiteBox?
>>> Yes, this is exactly the problem we try to avoid here. Another option is
>>> to introduce another helper class just for these couple of tests, but
>>> there would be only a couple of places where it would be used.
>>>> Also, I know I looked at this before but don’t remember the
>>>> conclusion: why is the check at the end “passed < expected” rather
>>>> than exact match - passed == expected? (pre-existing)
>>> The test was refactored upstream and does not have any magic numbers any
>>> more [4]. The webrev is updated.
>>>> Finally a question: What level of testing have you done of this and
>>>> on which versions of Alpine/…?
>>> JTReg, VM TestBase on all platforms (Linux x86_64, Linux x86, Linux Arm,
>>> Linux AArch64, Linux PPC, Windows x86_64, Windows x86, Mac) with no
>>> regressions. A downport of these changes to 14 passes JCK 14 on these
>>> platforms.
>>>
>>> The port was tested on Alpine 3.8 and 3.11 with JTReg, VM TestBase. With
>>> the exception of the issues listed above and SA which is not supported
>>> as per the JEP, there are no differences with Linux x86_64. A downport
>>> of these changes to 14 passes JCK 14 on Alpine Linux.
>>>
>>> -Aleksei
>>>
>>> [1] https://www.openwall.com/lists/musl/2020/05/28/5
>>> [2] https://www.openwall.com/lists/musl/2020/05/25/3
>>> [3] https://www.openwall.com/lists/musl/2020/02/12/9
>>> [4] http://hg.openjdk.java.net/jdk/jdk/rev/57e7f837e6d9
>>>> Cheers,
>>>> Mikael
>>>>
>>>>> Thanks,
>>>>> -Aleksei
>>>>>
>>>>> [1] https://openjdk.java.net/jeps/386
>>>>> [2] http://cr.openjdk.java.net/~avoitylov/webrev.8247589
>>>>> (https://bugs.openjdk.java.net/browse/JDK-8247589)
>>>>> [3] http://cr.openjdk.java.net/~avoitylov/webrev.portola
>>>>> [4] https://bugs.openjdk.java.net/browse/JDK-8239450
>>>>>
>>>>>
More information about the portola-dev
mailing list