RFR(s): 8078513: [linux] Clean up code relevant to LinuxThreads implementation

Thomas Stüfe thomas.stuefe at gmail.com
Mon Jun 15 12:57:04 UTC 2015


Hi Volker, David,

Volker, thanks for the review!

new version:

http://cr.openjdk.java.net/~stuefe/webrevs/8078513/webrev.04/webrev/

I added all changes Volker wanted.

Oh, and I forgot to add Volker as a Reviewer, please add him when pushing.

Kind Regards, Thomas







On Mon, Jun 15, 2015 at 11:23 AM, Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:

> Hi,
>
> sorry, dropped the ball on this one. Will post an updated version soon.
>
> Thomas
>
> On Fri, Jun 12, 2015 at 4:02 AM, David Holmes <david.holmes at oracle.com>
> wrote:
>
>> Hi Thomas,
>>
>> If you can address Volker's comments we can move forward with this. I'll
>> be primary Reviewer and Volker a second reviewer.
>>
>> Thanks,
>> David
>>
>> On 3/06/2015 3:52 PM, Thomas Stüfe wrote:
>>
>>> Ping...
>>>
>>> Still need a second reviewer.
>>>
>>> Thanks!
>>>
>>> On Wednesday, May 20, 2015, Thomas Stüfe <thomas.stuefe at gmail.com
>>> <mailto:thomas.stuefe at gmail.com>> wrote:
>>>
>>>     David, thanks!
>>>
>>>     to all: could I have a second reviewer, please?
>>>
>>>     Thanks & Kind Regards, Thomas
>>>
>>>     On Wed, May 20, 2015 at 10:42 AM, David Holmes
>>>     <david.holmes at oracle.com
>>>     <javascript:_e(%7B%7D,'cvml','david.holmes at oracle.com');>> wrote:
>>>
>>>         Hi Thomas,
>>>
>>>         Summary: ok - need second reviewer ( I think Coleen is away)
>>>
>>>         More inline ...
>>>
>>>         On 19/05/2015 10:26 PM, Thomas Stüfe wrote:
>>>
>>>             Hi David,
>>>
>>>             new webrev:
>>>
>>> http://cr.openjdk.java.net/~stuefe/webrevs/8078513/webrev.02/webrev/
>>>
>>>             Only two things changed. I reverted my comment change in the
>>>             SA and in
>>>             os_linux.cpp I massaged the "expand-thread-stack" comment
>>>             text a bit.
>>>
>>>             More details inline.
>>>
>>>             On Thu, May 14, 2015 at 4:40 AM, David Holmes
>>>             <david.holmes at oracle.com
>>>             <javascript:_e(%7B%7D,'cvml','david.holmes at oracle.com');>
>>>             <mailto:david.holmes at oracle.com
>>>             <javascript:_e(%7B%7D,'cvml','david.holmes at oracle.com');>>>
>>>             wrote:
>>>
>>>                  Hi Thomas,
>>>
>>>                  On 14/05/2015 2:32 AM, Thomas Stüfe wrote:
>>>
>>>                      Hi David,
>>>
>>>                      here the new version:
>>>
>>> http://cr.openjdk.java.net/~stuefe/webrevs/8078513/webrev.01/webrev/
>>>
>>>                      Differences to the old version are just comment
>>>             changes, and I also
>>>                      modified some comments in the SA as Staffan
>>> requested.
>>>
>>>
>>>                  agent/src/os/linux/libproc.h
>>>
>>>                  Modified comments still refer to "both thread
>>>             libraries" but now the
>>>                  two libraries have not been mentioned, so there is no
>>>             context to
>>>                  know what the two libraries are.
>>>
>>>
>>>             I reverted my comment change to the original version. I
>>>             think I do not
>>>             know enough about the SA internals to make the comment clear
>>> and
>>>             concise. Someone with more SA knowledge may be better suited
>>>             to change
>>>             that comment.
>>>
>>>
>>>         Ok. I think you were pretty close anyway, just need to get rid
>>>         of the "both thread libraries" references.
>>>
>>>
>>>
>>>                  More commentary inline below (with trimming)...
>>>
>>>
>>>                      See remarks inline:
>>>
>>>                           os_linux.cpp:
>>>
>>>                           The whole MAP_GROWSDOWN discussion and code
>>>             also seems to be no
>>>                           longer relevant.
>>>
>>>
>>>                      I spent a lot of time digging around in the history
>>>             of this
>>>                      thing. I am
>>>                      unsure whether that stack expansion coding can be
>>>             removed. There are
>>>                      some rare case where they may still be needed:
>>>
>>>                      That whole coding revolves around the question
>>>             whether thread stacks
>>>                      need to be expanded manually.
>>>
>>>                      Nowadays, normal threads allocated with a modern
>>>             pthread library
>>>                      (NPTL)
>>>                         just allocate the whole stack with mmap(without
>>>                      MAP_NORESERVE), so the
>>>                      memory is committed right away. See glibc sources,
>>>                      nptl/allocate_stack.c.
>>>
>>>                      In former times (LinuxThreads) thread stacks were
>>>             allocated using
>>>                      mmap(MAP_GROWSDOWN), which needed cooperation from
>>>             the linux kernel:
>>>                      kernel caught faults caused by expanding the stack
>>>             and transparently
>>>                      mapped the next page. It needed to distinguish real
>>>             page faults from
>>>                      stack expansion page faults, and seems the logic
>>>             did not always
>>>                      work, so
>>>                      manual expansion was needed - see os_linux.cpp,
>>>                      "os::Linux::manually_expand_stack".
>>>
>>>                      I think there may still be cases where we run on
>>>             threads whose
>>>                      stacks
>>>                      are allocated with mmap(MAP_GROWSDOWN), even though
>>>             LinuxThreads
>>>                      is gone:
>>>
>>>                      - with primordial threads (not sure, did not test).
>>>             We do not run on
>>>                      primordial thread but someone may write a launcher
>>>             and start a
>>>                      VM from
>>>                      the primordial thread or attach the primordial
>>>             thread to the VM.
>>>
>>>
>>>                  This is already a very grey area in the VM. See:
>>>
>>>             https://bugs.openjdk.java.net/browse/JDK-6516512
>>>
>>>                  "HotSpot:thread terminology should be clearer or
>>>             cleaned up."
>>>
>>>                  So until that is investigated and cleaned up this code
>>>             can't really
>>>                  be touched.
>>>
>>>
>>>             Interesting and well written bug report. Unfortunately, the
>>>             issue seems
>>>             to be in deep sleep.
>>>
>>>
>>>         Sadly often true :(
>>>
>>>             At SAP, we explicitly forbid all our internal users to
>>>             invoke the VM on
>>>             the primordial thread, and have done so since a long time.
>>>             We tell all
>>>             our users to spawn off an own pthread for VM creation. I
>>>             think the AIX
>>>             port will not even come up, we just assert.
>>>
>>>             But this is more of a political question. If we pull support
>>>             for running
>>>             on the primordial thread in the OpenJDK, there may be
>>>             applications which
>>>             break. On the other hand, does this scenario - running on
>>>             the primordial
>>>             thread - get tested? Would we even notice if it were not to
>>>             work anymore?
>>>
>>>
>>>         There may be some JNI tests that use the primordial thread but I
>>>         can't say for sure.
>>>
>>>                      - someone may write its own thread implementation
>>>             using clone and
>>>                      mmap(MAP_GROWSDOWN) allocated stacks.
>>>
>>>                      These cases are probably extremely rare, and I
>>>             would have no qualms
>>>                      removing support for them, but I do not want to
>>>             decide that.
>>>
>>>
>>>                  Fair enough. As I said I think this expands into a
>>>             cleanup up of the
>>>                  whole "initial thread" business anyway.
>>>
>>>                      Sidenote: These cases are inherently unsafe because
>>>                      mmap(MAP_GROWSDOWN)
>>>                      is too. Ulrich Drepper tried without success to
>>>             remove support
>>>                      for these
>>>                      flags from the linux kernel:
>>>             https://lwn.net/Articles/294001/
>>>
>>>                      I added some comment to os_linux.cpp to explain
>>>             this a bit better.
>>>
>>>
>>>                  The two comments don't rely flow together due to the
>>>             "Force Linux
>>>                  kernel to expand current thread stack." lead-in of the
>>>             second
>>>                  comment (which isn't actually a grammatically correct
>>>             sentence). But
>>>                  I can't really think of anything to do that doesn't
>>>             start making
>>>                  numerous changes to what is already a confusing and
>>>             disjointed
>>>                  comment block.
>>>
>>>
>>>             I tried to improve the comment a bit, to clearly distinguish
>>>             it from the
>>>             older comment and to provide enough information for whenever
>>>             we decide
>>>             to clean up this expand-thread-stack coding.
>>>
>>>
>>>         Revised comment is much clearer - thanks!
>>>
>>>         David
>>>         -----
>>>
>>>
>>>             Kind Regards, Thomas
>>>
>>>
>>>
>>>
>>>                           The gcc 2.x workarounds can also be removed I
>>>             think.
>>>
>>>                           I wonder if we can also eradicate
>>>                      WorkAroundNPTLTimedWaitHang :)
>>>
>>>
>>>                      I do not dare to do that - I do not know enough
>>>             about this issue.
>>>
>>>
>>>                  Separate RFE. I doubt we care about the old systems
>>>             that had this
>>>                  problem.
>>>
>>>
>>>                           Aside: The createThread_lock (now unused)
>>>             seems to have
>>>                      been copied
>>>                           into src/os/aix/vm/os_aix.hpp as part of the
>>>             AIX port and
>>>                      should be
>>>                           removed at some point.
>>>
>>>
>>>                      Yes I know :-) It is not the only Linux-specific
>>>             code which
>>>                      crept into
>>>                      our AIX port. We actually have a bug open to deal
>>>             with that:
>>>             https://bugs.openjdk.java.net/browse/JDK-8079125
>>>
>>>
>>>                  Good to know :)
>>>
>>>                  Thanks,
>>>                  David
>>>                  -----
>>>
>>>
>>>                           ---
>>>
>>>                           src/os/linux/vm/os_linux.hpp
>>>
>>>                           Copyright needs updating.
>>>
>>>
>>>                      done
>>>
>>>                           ---
>>>
>>>                           src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>
>>>                           Copyright needs updating.
>>>
>>>                      done
>>>
>>>                             os::Linux::supports_variable_stack_size() is
>>>             now dead
>>>                      code (value
>>>                           is true for all arch's AFAICS) so can be
>>>             removed from all
>>>                           src/os_cpu/linux_XXX/os_linux_XXX.cpp.
>>>
>>>                           I think  supports_variable_stack_size() is
>>>             also always true
>>>                      on other
>>>                           platforms (AIX, BSD) and so could be cleaned
>>>             up in that
>>>                      code too
>>>                           (separate clean-up CR okay for that).
>>>
>>>
>>>                      I agree, lets do this in a different RFR because
>>>             this would touch
>>>                      non-linux sources and I want to keep this linux
>>>             specific. I opened
>>>             https://bugs.openjdk.java.net/browse/JDK-8080298 for this.
>>>
>>>
>>>                  Thanks for filing.
>>>
>>>
>>>
>>>                           ---
>>>
>>>
>>> src/os_cpu/linux_x86/vm/threadLS_linux_x86.cpp
>>>
>>>                           Copyright needs updating.
>>>
>>>                      done
>>>
>>>                           ---
>>>
>>>                           I'm happy to sponsor this for you.
>>>
>>>                           Thanks,
>>>                           David
>>>                           -----
>>>
>>>
>>>
>>>                               This change removes (I hope all) remnants
>>>             of code
>>>                      dealing with
>>>                               LinuxThreads
>>>                               from the hotspot.
>>>
>>>                               Discussion of the changes:
>>>
>>>                               1) on LinuxThreads, each thread had an own
>>>             pid, so getpid()
>>>                               would return a
>>>                               unique pid for every thread it was invoked
>>>             in. On NPTL,
>>>                      this is
>>>                               not an
>>>                               issue anymore; getpid() works as expected
>>>             returning a
>>>                      global pid.
>>>                               For LinuxThreads, there was a workaround
>>>             where at VM
>>>                      startup the
>>>                               pid was
>>>                               captured (on the thread invoking
>>>             CreateJavaVM) and this
>>>                      pid was
>>>                               stored in
>>>                               static variable _initial_pid and returned
>>> in
>>>                               os::current_process_id().
>>>                               Later another workaround was added because
>>>             CreateJavaVM
>>>                      was not
>>>                               invoked on
>>>                               the primordial thread anymore, so the pid
>>>             was handed in
>>>                      via a
>>>                               define (see
>>>                               Arguments::sun_java_launcher_pid()).
>>>
>>>                               Both workaround were removed and
>>>                      os::current_process_id() was
>>>                               changed to
>>>                               just return getpid().
>>>
>>>
>>>                           We should remove the code on the JDK side as
>>>             well -
>>>                      possibly as a
>>>                           follow up clean up (but via the same forest as
>>>             this will be
>>>                      pushed):
>>>
>>>
>>> ./java.base/unix/native/libjli/java_md_solinux.c
>>>
>>>                           void SetJavaLauncherPlatformProps() {
>>>                               /* Linux only */
>>>                           #ifdef __linux__
>>>                                const char *substr =
>>>             "-Dsun.java.launcher.pid=";
>>>                                char *pid_prop_str = (char
>>>                      *)JLI_MemAlloc(JLI_StrLen(substr) +
>>>                           MAX_PID_STR_SZ + 1);
>>>                                sprintf(pid_prop_str, "%s%d", substr,
>>>             getpid());
>>>                                AddOption(pid_prop_str, NULL);
>>>                           #endif /* __linux__ */
>>>
>>>                           }
>>>
>>>
>>>                      Unfortunately, sun.launcher.pid gets used in the
>>>             wild, e.g.:
>>>
>>>
>>> http://stackoverflow.com/questions/35842/how-can-a-java-program-get-its-own-process-id
>>>
>>>                      Kind Regards, Thomas
>>>
>>>
>>>
>>>
>>>                               2) os::Linux::gettid(): Linux uses the
>>>             kernel thread id for
>>>                               OSThread thread
>>>                               id - I did not change that. But by now the
>>>             system call
>>>                      gettid
>>>                               should be
>>>                               available in every Linux kernel, so I
>>>             removed the
>>>                      fallback handling.
>>>
>>>                               3) A number of changes dealt with the way
>>>             LinuxThreads
>>>                      allocated
>>>                               thread
>>>                               stacks (with mmap and the MAP_GROWSDOWN
>>>             flag). On
>>>                      LinuxThreads,
>>>                               it was not
>>>                               possible to change the size of thread
>>>             stacks (to start
>>>                      a new
>>>                               thread with a
>>>                               different stack size). NPTL can do this
>>>             and all the
>>>                      places where
>>>                               we dealt
>>>                               with fixed stacks I changed.
>>>
>>>                               4) On LinuxThreads, there is the problem
>>>             that the
>>>                      thread stacks
>>>                               - which
>>>                               grow down and are allocated with MAP_FIXED
>>>             - may at
>>>                      some point
>>>                               trash the
>>>                               java heap. To prevent this, every
>>>             allocation done with
>>>                               os::reserve_memory()
>>>                               and friends recorded a
>>>             "_highest_vm_reserved_address".
>>>                      There was
>>>                               a function
>>>                               called "_thread_safety_check" which
>>>             prevented start of new
>>>                               threads if
>>>                               thread stacks came dangerously close to
>>>             the highest
>>>                      reserved
>>>                               address + a
>>>                               gap; latter gap could be set via parameter
>>>                      ThreadSafetyMargin.
>>>
>>>                               All this coding was removed. Note that
>>>             this coding
>>>                      probably was
>>>                               already
>>>                               broken since a long time, because there
>>>             are many
>>>                      allocations
>>>                               which were not
>>>                               tracked.
>>>
>>>                               5) Recognition of glibc version and
>>>             pthread library
>>>                      version were
>>>                               very
>>>                               elaborate to deal with recognition of
>>>             LinuxThreads
>>>                               implementation. Those
>>>                               were dumbed down and now just assert in
>>>             the highly
>>>                      unlikely case
>>>                               that we
>>>                               encounter a LinuxThreads implementation.
>>>
>>>                               The rest of the stuff is more
>>>             straight-forward.
>>>
>>>                               ---
>>>
>>>                               I built the changes on Linux x64, x86 and
>>>             ppc64 and ran
>>>                      jtreg tests
>>>                               (hotspot/test) on these platforms too. I
>>>             did not see
>>>                      any issues,
>>>                               but I'd
>>>                               like to get a couple of reviews for this.
>>>
>>>                               Kind Regards,
>>>                               Thomas
>>>
>>>
>>>
>>>
>>>
>


More information about the hotspot-dev mailing list