RFR(s): 8078513: [linux] Clean up code relevant to LinuxThreads implementation
Thomas Stüfe
thomas.stuefe at gmail.com
Mon Jun 15 12:57:04 UTC 2015
Hi Volker, David,
Volker, thanks for the review!
new version:
http://cr.openjdk.java.net/~stuefe/webrevs/8078513/webrev.04/webrev/
I added all changes Volker wanted.
Oh, and I forgot to add Volker as a Reviewer, please add him when pushing.
Kind Regards, Thomas
On Mon, Jun 15, 2015 at 11:23 AM, Thomas Stüfe <thomas.stuefe at gmail.com>
wrote:
> Hi,
>
> sorry, dropped the ball on this one. Will post an updated version soon.
>
> Thomas
>
> On Fri, Jun 12, 2015 at 4:02 AM, David Holmes <david.holmes at oracle.com>
> wrote:
>
>> Hi Thomas,
>>
>> If you can address Volker's comments we can move forward with this. I'll
>> be primary Reviewer and Volker a second reviewer.
>>
>> Thanks,
>> David
>>
>> On 3/06/2015 3:52 PM, Thomas Stüfe wrote:
>>
>>> Ping...
>>>
>>> Still need a second reviewer.
>>>
>>> Thanks!
>>>
>>> On Wednesday, May 20, 2015, Thomas Stüfe <thomas.stuefe at gmail.com
>>> <mailto:thomas.stuefe at gmail.com>> wrote:
>>>
>>> David, thanks!
>>>
>>> to all: could I have a second reviewer, please?
>>>
>>> Thanks & Kind Regards, Thomas
>>>
>>> On Wed, May 20, 2015 at 10:42 AM, David Holmes
>>> <david.holmes at oracle.com
>>> <javascript:_e(%7B%7D,'cvml','david.holmes at oracle.com');>> wrote:
>>>
>>> Hi Thomas,
>>>
>>> Summary: ok - need second reviewer ( I think Coleen is away)
>>>
>>> More inline ...
>>>
>>> On 19/05/2015 10:26 PM, Thomas Stüfe wrote:
>>>
>>> Hi David,
>>>
>>> new webrev:
>>>
>>> http://cr.openjdk.java.net/~stuefe/webrevs/8078513/webrev.02/webrev/
>>>
>>> Only two things changed. I reverted my comment change in the
>>> SA and in
>>> os_linux.cpp I massaged the "expand-thread-stack" comment
>>> text a bit.
>>>
>>> More details inline.
>>>
>>> On Thu, May 14, 2015 at 4:40 AM, David Holmes
>>> <david.holmes at oracle.com
>>> <javascript:_e(%7B%7D,'cvml','david.holmes at oracle.com');>
>>> <mailto:david.holmes at oracle.com
>>> <javascript:_e(%7B%7D,'cvml','david.holmes at oracle.com');>>>
>>> wrote:
>>>
>>> Hi Thomas,
>>>
>>> On 14/05/2015 2:32 AM, Thomas Stüfe wrote:
>>>
>>> Hi David,
>>>
>>> here the new version:
>>>
>>> http://cr.openjdk.java.net/~stuefe/webrevs/8078513/webrev.01/webrev/
>>>
>>> Differences to the old version are just comment
>>> changes, and I also
>>> modified some comments in the SA as Staffan
>>> requested.
>>>
>>>
>>> agent/src/os/linux/libproc.h
>>>
>>> Modified comments still refer to "both thread
>>> libraries" but now the
>>> two libraries have not been mentioned, so there is no
>>> context to
>>> know what the two libraries are.
>>>
>>>
>>> I reverted my comment change to the original version. I
>>> think I do not
>>> know enough about the SA internals to make the comment clear
>>> and
>>> concise. Someone with more SA knowledge may be better suited
>>> to change
>>> that comment.
>>>
>>>
>>> Ok. I think you were pretty close anyway, just need to get rid
>>> of the "both thread libraries" references.
>>>
>>>
>>>
>>> More commentary inline below (with trimming)...
>>>
>>>
>>> See remarks inline:
>>>
>>> os_linux.cpp:
>>>
>>> The whole MAP_GROWSDOWN discussion and code
>>> also seems to be no
>>> longer relevant.
>>>
>>>
>>> I spent a lot of time digging around in the history
>>> of this
>>> thing. I am
>>> unsure whether that stack expansion coding can be
>>> removed. There are
>>> some rare case where they may still be needed:
>>>
>>> That whole coding revolves around the question
>>> whether thread stacks
>>> need to be expanded manually.
>>>
>>> Nowadays, normal threads allocated with a modern
>>> pthread library
>>> (NPTL)
>>> just allocate the whole stack with mmap(without
>>> MAP_NORESERVE), so the
>>> memory is committed right away. See glibc sources,
>>> nptl/allocate_stack.c.
>>>
>>> In former times (LinuxThreads) thread stacks were
>>> allocated using
>>> mmap(MAP_GROWSDOWN), which needed cooperation from
>>> the linux kernel:
>>> kernel caught faults caused by expanding the stack
>>> and transparently
>>> mapped the next page. It needed to distinguish real
>>> page faults from
>>> stack expansion page faults, and seems the logic
>>> did not always
>>> work, so
>>> manual expansion was needed - see os_linux.cpp,
>>> "os::Linux::manually_expand_stack".
>>>
>>> I think there may still be cases where we run on
>>> threads whose
>>> stacks
>>> are allocated with mmap(MAP_GROWSDOWN), even though
>>> LinuxThreads
>>> is gone:
>>>
>>> - with primordial threads (not sure, did not test).
>>> We do not run on
>>> primordial thread but someone may write a launcher
>>> and start a
>>> VM from
>>> the primordial thread or attach the primordial
>>> thread to the VM.
>>>
>>>
>>> This is already a very grey area in the VM. See:
>>>
>>> https://bugs.openjdk.java.net/browse/JDK-6516512
>>>
>>> "HotSpot:thread terminology should be clearer or
>>> cleaned up."
>>>
>>> So until that is investigated and cleaned up this code
>>> can't really
>>> be touched.
>>>
>>>
>>> Interesting and well written bug report. Unfortunately, the
>>> issue seems
>>> to be in deep sleep.
>>>
>>>
>>> Sadly often true :(
>>>
>>> At SAP, we explicitly forbid all our internal users to
>>> invoke the VM on
>>> the primordial thread, and have done so since a long time.
>>> We tell all
>>> our users to spawn off an own pthread for VM creation. I
>>> think the AIX
>>> port will not even come up, we just assert.
>>>
>>> But this is more of a political question. If we pull support
>>> for running
>>> on the primordial thread in the OpenJDK, there may be
>>> applications which
>>> break. On the other hand, does this scenario - running on
>>> the primordial
>>> thread - get tested? Would we even notice if it were not to
>>> work anymore?
>>>
>>>
>>> There may be some JNI tests that use the primordial thread but I
>>> can't say for sure.
>>>
>>> - someone may write its own thread implementation
>>> using clone and
>>> mmap(MAP_GROWSDOWN) allocated stacks.
>>>
>>> These cases are probably extremely rare, and I
>>> would have no qualms
>>> removing support for them, but I do not want to
>>> decide that.
>>>
>>>
>>> Fair enough. As I said I think this expands into a
>>> cleanup up of the
>>> whole "initial thread" business anyway.
>>>
>>> Sidenote: These cases are inherently unsafe because
>>> mmap(MAP_GROWSDOWN)
>>> is too. Ulrich Drepper tried without success to
>>> remove support
>>> for these
>>> flags from the linux kernel:
>>> https://lwn.net/Articles/294001/
>>>
>>> I added some comment to os_linux.cpp to explain
>>> this a bit better.
>>>
>>>
>>> The two comments don't rely flow together due to the
>>> "Force Linux
>>> kernel to expand current thread stack." lead-in of the
>>> second
>>> comment (which isn't actually a grammatically correct
>>> sentence). But
>>> I can't really think of anything to do that doesn't
>>> start making
>>> numerous changes to what is already a confusing and
>>> disjointed
>>> comment block.
>>>
>>>
>>> I tried to improve the comment a bit, to clearly distinguish
>>> it from the
>>> older comment and to provide enough information for whenever
>>> we decide
>>> to clean up this expand-thread-stack coding.
>>>
>>>
>>> Revised comment is much clearer - thanks!
>>>
>>> David
>>> -----
>>>
>>>
>>> Kind Regards, Thomas
>>>
>>>
>>>
>>>
>>> The gcc 2.x workarounds can also be removed I
>>> think.
>>>
>>> I wonder if we can also eradicate
>>> WorkAroundNPTLTimedWaitHang :)
>>>
>>>
>>> I do not dare to do that - I do not know enough
>>> about this issue.
>>>
>>>
>>> Separate RFE. I doubt we care about the old systems
>>> that had this
>>> problem.
>>>
>>>
>>> Aside: The createThread_lock (now unused)
>>> seems to have
>>> been copied
>>> into src/os/aix/vm/os_aix.hpp as part of the
>>> AIX port and
>>> should be
>>> removed at some point.
>>>
>>>
>>> Yes I know :-) It is not the only Linux-specific
>>> code which
>>> crept into
>>> our AIX port. We actually have a bug open to deal
>>> with that:
>>> https://bugs.openjdk.java.net/browse/JDK-8079125
>>>
>>>
>>> Good to know :)
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>
>>> ---
>>>
>>> src/os/linux/vm/os_linux.hpp
>>>
>>> Copyright needs updating.
>>>
>>>
>>> done
>>>
>>> ---
>>>
>>> src/os_cpu/linux_x86/vm/os_linux_x86.cpp
>>>
>>> Copyright needs updating.
>>>
>>> done
>>>
>>> os::Linux::supports_variable_stack_size() is
>>> now dead
>>> code (value
>>> is true for all arch's AFAICS) so can be
>>> removed from all
>>> src/os_cpu/linux_XXX/os_linux_XXX.cpp.
>>>
>>> I think supports_variable_stack_size() is
>>> also always true
>>> on other
>>> platforms (AIX, BSD) and so could be cleaned
>>> up in that
>>> code too
>>> (separate clean-up CR okay for that).
>>>
>>>
>>> I agree, lets do this in a different RFR because
>>> this would touch
>>> non-linux sources and I want to keep this linux
>>> specific. I opened
>>> https://bugs.openjdk.java.net/browse/JDK-8080298 for this.
>>>
>>>
>>> Thanks for filing.
>>>
>>>
>>>
>>> ---
>>>
>>>
>>> src/os_cpu/linux_x86/vm/threadLS_linux_x86.cpp
>>>
>>> Copyright needs updating.
>>>
>>> done
>>>
>>> ---
>>>
>>> I'm happy to sponsor this for you.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>
>>>
>>> This change removes (I hope all) remnants
>>> of code
>>> dealing with
>>> LinuxThreads
>>> from the hotspot.
>>>
>>> Discussion of the changes:
>>>
>>> 1) on LinuxThreads, each thread had an own
>>> pid, so getpid()
>>> would return a
>>> unique pid for every thread it was invoked
>>> in. On NPTL,
>>> this is
>>> not an
>>> issue anymore; getpid() works as expected
>>> returning a
>>> global pid.
>>> For LinuxThreads, there was a workaround
>>> where at VM
>>> startup the
>>> pid was
>>> captured (on the thread invoking
>>> CreateJavaVM) and this
>>> pid was
>>> stored in
>>> static variable _initial_pid and returned
>>> in
>>> os::current_process_id().
>>> Later another workaround was added because
>>> CreateJavaVM
>>> was not
>>> invoked on
>>> the primordial thread anymore, so the pid
>>> was handed in
>>> via a
>>> define (see
>>> Arguments::sun_java_launcher_pid()).
>>>
>>> Both workaround were removed and
>>> os::current_process_id() was
>>> changed to
>>> just return getpid().
>>>
>>>
>>> We should remove the code on the JDK side as
>>> well -
>>> possibly as a
>>> follow up clean up (but via the same forest as
>>> this will be
>>> pushed):
>>>
>>>
>>> ./java.base/unix/native/libjli/java_md_solinux.c
>>>
>>> void SetJavaLauncherPlatformProps() {
>>> /* Linux only */
>>> #ifdef __linux__
>>> const char *substr =
>>> "-Dsun.java.launcher.pid=";
>>> char *pid_prop_str = (char
>>> *)JLI_MemAlloc(JLI_StrLen(substr) +
>>> MAX_PID_STR_SZ + 1);
>>> sprintf(pid_prop_str, "%s%d", substr,
>>> getpid());
>>> AddOption(pid_prop_str, NULL);
>>> #endif /* __linux__ */
>>>
>>> }
>>>
>>>
>>> Unfortunately, sun.launcher.pid gets used in the
>>> wild, e.g.:
>>>
>>>
>>> http://stackoverflow.com/questions/35842/how-can-a-java-program-get-its-own-process-id
>>>
>>> Kind Regards, Thomas
>>>
>>>
>>>
>>>
>>> 2) os::Linux::gettid(): Linux uses the
>>> kernel thread id for
>>> OSThread thread
>>> id - I did not change that. But by now the
>>> system call
>>> gettid
>>> should be
>>> available in every Linux kernel, so I
>>> removed the
>>> fallback handling.
>>>
>>> 3) A number of changes dealt with the way
>>> LinuxThreads
>>> allocated
>>> thread
>>> stacks (with mmap and the MAP_GROWSDOWN
>>> flag). On
>>> LinuxThreads,
>>> it was not
>>> possible to change the size of thread
>>> stacks (to start
>>> a new
>>> thread with a
>>> different stack size). NPTL can do this
>>> and all the
>>> places where
>>> we dealt
>>> with fixed stacks I changed.
>>>
>>> 4) On LinuxThreads, there is the problem
>>> that the
>>> thread stacks
>>> - which
>>> grow down and are allocated with MAP_FIXED
>>> - may at
>>> some point
>>> trash the
>>> java heap. To prevent this, every
>>> allocation done with
>>> os::reserve_memory()
>>> and friends recorded a
>>> "_highest_vm_reserved_address".
>>> There was
>>> a function
>>> called "_thread_safety_check" which
>>> prevented start of new
>>> threads if
>>> thread stacks came dangerously close to
>>> the highest
>>> reserved
>>> address + a
>>> gap; latter gap could be set via parameter
>>> ThreadSafetyMargin.
>>>
>>> All this coding was removed. Note that
>>> this coding
>>> probably was
>>> already
>>> broken since a long time, because there
>>> are many
>>> allocations
>>> which were not
>>> tracked.
>>>
>>> 5) Recognition of glibc version and
>>> pthread library
>>> version were
>>> very
>>> elaborate to deal with recognition of
>>> LinuxThreads
>>> implementation. Those
>>> were dumbed down and now just assert in
>>> the highly
>>> unlikely case
>>> that we
>>> encounter a LinuxThreads implementation.
>>>
>>> The rest of the stuff is more
>>> straight-forward.
>>>
>>> ---
>>>
>>> I built the changes on Linux x64, x86 and
>>> ppc64 and ran
>>> jtreg tests
>>> (hotspot/test) on these platforms too. I
>>> did not see
>>> any issues,
>>> but I'd
>>> like to get a couple of reviews for this.
>>>
>>> Kind Regards,
>>> Thomas
>>>
>>>
>>>
>>>
>>>
>
More information about the hotspot-dev
mailing list