RFR: 8189170: Add option to disable stack overflow checking in primordial thread for use with JNI_CreateJavaJVM
Tomas Kalibera
tomas.kalibera at gmail.com
Mon Nov 20 14:37:21 UTC 2017
I have tested with R and it seems to be working fine.
Specifically, I've tested with jdk/hs 47868 as baseline, with R-devel
73756, rJava 0.9-9 (modified to print VM args). I've run R recursive
calls after initializing the jvm via .jinit(), using
RJAVA_JVM_STACK_WORKAROUND=1 (detect inserted guard pages, but only
print a message when they appear).
The baseline inserted guard pages as expected following -Xss (default
and when specified 1m..9m) and following the implied cap when rlimit
stack is unlimited, and R consequenlty crashed in the baseline. The
modified version with the fix activated did not insert guard pages.
Tested with rlimit stack of 8m (Ubuntu default) and unlimited.
From R, I activated the fix via
options(java.parameters=c("-XX:+UnlockExperimentalVMOptions",
"-XX:+DisablePrimordialThreadGuardPages"))
Tested on Ubuntu 17.04, x86_64. jdk/hs built with gcc 4.9.4 (but still
needed --disable-warnings-as-errors)
Thanks a lot for the new option!
Tomas
On 11/20/2017 02:06 PM, David Holmes wrote:
> Thank you both.
>
> David
>
> On 20/11/2017 11:05 PM, Daniel D. Daugherty wrote:
>> I'm also good with this change.
>>
>> Dan
>>
>>
>> On 11/20/17 4:04 AM, Thomas Stüfe wrote:
>>> Hi David,
>>>
>>> looks fine.
>>>
>>> Thanks, Thomas
>>>
>>> On Mon, Nov 20, 2017 at 4:44 AM, David Holmes
>>> <david.holmes at oracle.com <mailto:david.holmes at oracle.com>> wrote:
>>>
>>> Hi Dan,
>>>
>>> Thanks for the review. I've addressed all of yours and Thomas's
>>> comments.
>>>
>>> http://cr.openjdk.java.net/~dholmes/8189170/webrev.v2/
>>> <http://cr.openjdk.java.net/%7Edholmes/8189170/webrev.v2/>
>>>
>>> The main change is:
>>>
>>> + static bool is_primordial_thread(void)
>>> + #if defined(_WINDOWS) || defined(BSD)
>>> + // No way to identify the primordial thread.
>>> + { return false; }
>>> + #else
>>> + ;
>>> + #endif
>>>
>>>
>>> Also to clarify this is logically a "product" flag for the systems
>>> that need it, not a diagnostic one. But as we don't want this flag
>>> to be in common use and it is of such narrow applicability, we
>>> have made it experimental. This was discussed previously in the
>>> CSR review thread for this:
>>>
>>> http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024859.html
>>> <http://mail.openjdk.java.net/pipermail/hotspot-runtime-dev/2017-October/024859.html>
>>>
>>> Once Tomas K. has been able to test this, and you and Thomas
>>> sign-off on the updates, I'll get this pushed. But as I'm away
>>> from Thursday it may not be until next week.
>>>
>>> Thanks,
>>> David
>>> -----
>>>
>>>
>>>
>>> On 18/11/2017 1:40 AM, Daniel D. Daugherty wrote:
>>>
>>> On 11/14/17 5:39 AM, David Holmes wrote:
>>>
>>> Bug: https://bugs.openjdk.java.net/browse/JDK-8189170
>>> <https://bugs.openjdk.java.net/browse/JDK-8189170>
>>>
>>> webrev:
>>> http://cr.openjdk.java.net/~dholmes/8189170/webrev/
>>> <http://cr.openjdk.java.net/%7Edholmes/8189170/webrev/>
>>>
>>>
>>> src/hotspot/os/aix/os_aix.cpp
>>> No comments.
>>>
>>> src/hotspot/os/aix/os_aix.hpp
>>> No comments.
>>>
>>> src/hotspot/os/bsd/os_bsd.cpp
>>> L880: // thread in any special way - so just report
>>> 'false'
>>> Nit - needs a period after 'false' to make it a
>>> sentence.
>>>
>>> L882: return false;
>>> Nit - indent should be 2 spaces.
>>>
>>> src/hotspot/os/bsd/os_bsd.hpp
>>> No comments.
>>>
>>> src/hotspot/os/linux/os_linux.cpp
>>> L947: if (stack_size >= (size_t)(3 * page_size()))
>>> Nit - needs '{' and '}' on the if-statement body.
>>>
>>> L3070: // always place it right after end of the mapped
>>> region
>>> Nit - needs a period after 'region' to make it a
>>> sentence.
>>> (Not your problem, but you did update the sentence...)
>>>
>>> src/hotspot/os/linux/os_linux.hpp
>>> No comments.
>>>
>>> src/hotspot/os/solaris/os_solaris.cpp
>>> No comments other than I learned something new about
>>> thr_main() :-)
>>> Thanks for cleaning up that mess.
>>>
>>> src/hotspot/os/windows/os_windows.cpp
>>> L452: // thread in any special way - so just report
>>> 'false'
>>> Nit - needs a period after 'false' to make it a
>>> sentence.
>>>
>>> L454: return false;
>>> Nit - indent should be 2 spaces.
>>>
>>> src/hotspot/share/gc/shared/threadLocalAllocBuffer.cpp
>>> No comments.
>>>
>>> src/hotspot/share/runtime/globals.hpp
>>> L1181: experimental(bool,
>>> DisablePrimordialThreadGuardPages, false,
>>> Experimental or diagnostic?
>>>
>>> src/hotspot/share/runtime/os.hpp
>>> L450: // that loads/creates the JVM via JNI_CreateJavaVM
>>> Nit - needs a period after 'JNI_CreateJavaVM'.
>>>
>>> L453: // launcher nevers uses the primordial thread as
>>> the main thread, but
>>> Typo - 'nevers' -> 'never'
>>>
>>> L455: // have to special-case handling of the
>>> primordial thread if it attaches
>>> Typo - 'have to' -> 'have'
>>>
>>> src/hotspot/share/runtime/thread.cpp
>>> I like the new log message.
>>>
>>> src/hotspot/share/runtime/thread.inline.hpp
>>> L159: if (os::uses_stack_guard_pages() &&
>>> L160: !(DisablePrimordialThreadGuardPages &&
>>> os::is_primordial_thread())) {
>>> I'm not sure why, but I had to read these two lines
>>> several
>>> times before I convinced myself the logic is right.
>>>
>>> I'm good with the code changes. Your call on whether to fix the
>>> bits or not.
>>>
>>> Regarding this part of Thomas' review:
>>>
>>> Or, something like this:
>>> static bool is_primordial_thread(void)
>>> WINDOWS_ONLY({return false;}) BSD_ONLY({return false;})
>>>
>>>
>>> I can do that. I'll see if anyone else has any comments on
>>> this.
>>>
>>>
>>> I'm assuming that Thomas means for the above to be at the
>>> platform independent layer (runtime/os.hpp?) so it is more
>>> obvious to the reader that the function doesn't do anything
>>> on those two platforms... I know we dislike such things at
>>> the platform independent layer, but it does feel appropriate
>>> to make this limitation more obvious.
>>>
>>> I would be okay if you made the change that Thomas is
>>> suggesting.
>>>
>>> Thumbs up!
>>>
>>> Dan
>>>
>>>
>>>
>>>
>>> There are three parts to this which made it somewhat more
>>> complicated than I had envisaged:
>>>
>>> 1. Add the flag and disable the guards.
>>>
>>> This is relatively straight-forward:
>>>
>>> void JavaThread::create_stack_guard_pages() {
>>> if (!os::uses_stack_guard_pages() ||
>>> _stack_guard_state != stack_guard_unused ||
>>> (DisablePrimordialThreadGuardPages &&
>>> os::is_primordial_thread())) {
>>> log_info(os, thread)("Stack guard page creation for
>>> thread "
>>> UINTX_FORMAT " disabled",
>>> os::current_thread_id());
>>> return;
>>>
>>> with a tweak in JavaThread::stack_guards_enabled as well.
>>>
>>> 2. Promote os::XXX::is_initial_thread to
>>> os::is_primordial_thread()
>>>
>>> We have three platforms that already implement this
>>> functionality: Linux, Solaris and AIX. For BSD/OSX and
>>> Windows we don't attempt to do anything different for the
>>> primordial thread, and we don't have a way to detect the
>>> primordial thread - so is_primordial_thread() just returns
>>> false.
>>>
>>> I also tidied up some comments regarding terminology.
>>>
>>> 3. Thomas noticed that we unconditionally subtract
>>> 2*page_size() from the rlimit stack size, without checking
>>> it was bigger than 2*page_size() to start with. I was
>>> unsure how to tackle this. It's no good leaving stack_size
>>> at zero so I opted to ensure we would have at least one
>>> page left. Of course in such cases we would hit the bug in
>>> libc (if it still exists, which seems unlikely but time
>>> consuming to establish).
>>>
>>> Testing:
>>> - Tier 1 (JPRT) testing with a modified launcher that
>>> uses the primordial thread
>>> - with guard pages enabled
>>> - with guard pages disabled
>>> - Tier 1 (JPRT) normal JDK build (which should be
>>> unaffected by this change)
>>>
>>> The testing was primarily to ensure that disabling the
>>> stack guards on the primordial thread wasn't totally
>>> broken. A number of tests fail:
>>> - setting -Xss32m fails for the primordial thread when
>>> guards are enabled and the rlimit is 8m
>>> - tests that would hit StackOverflowError crash the VM
>>> when guards are disabled (as you would expect).
>>> - runtime/execstack/Testexecstack.java crashes when the
>>> guards are disabled
>>>
>>> Thanks,
>>> David
>>>
>>>
>>>
>>
More information about the hotspot-runtime-dev
mailing list