RFR(S): JDK-8042155 Tests for stack guard pages have to be cleaned up

Wed May 21 12:30:42 UTC 2014

Question, memory layout:

On Linux, the process main stack is usually mapped last, and address 
space is potentially randomized.

How is libjvm guaranteed to be mapped after stack ?

On 05/21/2014 02:09 PM, Dmitry Samersoff wrote:
> Coleen,
>
> Please, see updated webrev:
>
> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.03/
>
> only invoke.c is changed (added -Xint and limited recursion depth for
> second run).
>
>>> It's guaranteed. We always have libjvm mapped after stack of initial
>>> thread.
>> We do?   Really?
> Sure! Memory layout *of this test* is always the same. Unless we change
> JNI_CreateJavaVM() implementation drastically.
>
> -Dmitry
>
>
>
> On 2014-05-20 18:59, Coleen Phillimore wrote:
>> On 5/20/14, 10:16 AM, Dmitry Samersoff wrote:
>>> Coleen,
>>>
>>>> Is this guaranteed?  How many recursions of do_overflow will this
>>>> take?   I still prefer my suggestion of limiting the recursions in the
>>>> second case of do_overfflow so the test doesn't rely on some
>>>> environmental setting that may not be true.
>>> It's guaranteed. We always have libjvm mapped after stack of initial
>>> thread.
>> We do?   Really?
>>> Typically we crashes after 136000 recursive calls and it takes 1.071 sec.
>>>
>>> But it's worth to limit recursion in second case. I'll do it.
>> Yes, please do this!
>>
>> Coleen
>>> -Dmitry
>>>
>>>
>>> On 2014-05-20 17:54, Coleen Phillimore wrote:
>>>> Dmitry, some comments below.
>>>>
>>>> On 5/20/14, 8:04 AM, Dmitry Samersoff wrote:
>>>>> Coleen,
>>>>>
>>>>> Thank you for the review!
>>>>>
>>>>> Please see below.
>>>>>
>>>>> On 2014-05-20 06:19, Coleen Phillimore wrote:
>>>>>> I have reviewed the test.  I'm happy that you are fixing this test
>>>>>> because it's been really problematic, but I have a couple of concerns.
>>>>>> First the java program overflow doesn't allocate anything on the stack
>>>>>> so when it gets compiled it might take a really long time to run.  Can
>>>>>> you pass -Xint in invoke.c in the option string?
>>>>> Sure. Will do.
>>>>>
>>>>>> Second, the second do_overflow() in run_native_overflow must hit some
>>>>>> sort of protected page causing it to terminate, why isn't it
>>>>>> SIG_ACCERR?   Maybe the second do_overflow should only recurse the
>>>>>> same
>>>>>> number of times as the first + some small number so it doesn't keep
>>>>>> running up the stack (until ???)
>>>>> Both first and second call to do_overflow can:
>>>>>
>>>>> 1. *hit a guard page* - we get SIGSEGV with SIG_ACCERR. (OK in first
>>>>> case, ERR in second)
>>>>>
>>>>> 2. *hit non-mapped page* - we get a SIGSEGV with code other than
>>>>> SIG_ACCERR. (ERR in first case, OK in second)
>>>> Is this guaranteed?  How many recursions of do_overflow will this
>>>> take?   I still prefer my suggestion of limiting the recursions in the
>>>> second case of do_overfflow so the test doesn't rely on some
>>>> environmental setting that may not be true.
>>>>> There is a case that is not covered by this test explicitly - if test
>>>>> guard page was never been installed and stack starts immediately after
>>>>> non-stack page it's possible that native do_verflow overwrite whole
>>>>> non-stack page and hit another protected page and we get SIGSEGV with
>>>>> SIG_ACCERR.
>>>>>
>>>>> Most probably, hotspot would crash in this case.
>>>>>
>>>>> Let me know if you like me to check it explicitly. We can read
>>>>> /proc/self/map and check what page cause SIGSEGV.
>>>> This seems more complicated and more likely to be affected by the
>>>> environment, so I think you should not do this for now.
>>>>
>>>> thanks,
>>>> Coleen
>>>>
>>>>> -Dmitry
>>>>>
>>>>>
>>>>>> Thanks,
>>>>>> Coleen
>>>>>>
>>>>>> On 5/14/14, 5:42 AM, Dmitry Samersoff wrote:
>>>>>>> Dan,
>>>>>>>
>>>>>>> Fixed (in-place, press shift-reload)
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>>>>
>>>>>>>>         One thing I did notice is that some of the error messages are
>>>>>>>>         identical which would make it more difficult for someone
>>>>>>>>         investigating a future failure to know which failure
>>>>>>>> happened:
>>>>>>> It shouldn't be an issue because a) test prints banner to stdout with
>>>>>>> the name of the function it's running b) test terminates after first
>>>>>>> error
>>>>>>>
>>>>>>> -Dmitry
>>>>>>>
>>>>>>>
>>>>>>> On 2014-05-13 21:12, Daniel D. Daugherty wrote:
>>>>>>>> On 5/13/14 4:18 AM, Dmitry Samersoff wrote:
>>>>>>>>> Dan,
>>>>>>>>>
>>>>>>>>> Thank you for the review!
>>>>>>>>>
>>>>>>>>> 1. I use exit(7) to make the situation when the test fails
>>>>>>>>> because of
>>>>>>>>> some error condition clear visible.
>>>>>>>>>
>>>>>>>>> 2. I'm not sure it's a good idea to pass test if compilation
>>>>>>>>> fails, but
>>>>>>>>> I make this changes to handle it the same way as other tests does
>>>>>>>>>
>>>>>>>>> 3. Agree with rest of comments. See new webrev.
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>>>>>        test/runtime/StackGuardPages/DoOverflow.java
>>>>>>>>         No comments.
>>>>>>>>
>>>>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>>>>         I'm good with most of the changes.
>>>>>>>>
>>>>>>>>         One thing I did notice is that some of the error messages are
>>>>>>>>         identical which would make it more difficult for someone
>>>>>>>>         investigating a future failure to know which failure
>>>>>>>> happened:
>>>>>>>>
>>>>>>>>         102     fprintf(stderr, "Test ERROR. Can't load class
>>>>>>>> DoOverflow\n");
>>>>>>>>        144     printf("Test ERROR. Can't load class DoOverflow\n");
>>>>>>>>             You can make them unique by including the function
>>>>>>>>             name that generated them
>>>>>>>>
>>>>>>>>         I just noticed that some error messages still go to stdout,
>>>>>>>>         e.g., line 144 above.
>>>>>>>>           Looks like this comment from the first round was also
>>>>>>>> missed
>>>>>>>>         (the line number is now 243):
>>>>>>>>
>>>>>>>>>         line 225: You'll end up here if no arguments are passed or
>>>>>>>>> if the
>>>>>>>>>             wrong arguments are passed. You should have a better
>>>>>>>>> usage
>>>>>>>>>             message, e.g.:
>>>>>>>>>
>>>>>>>>>         void usage() {
>>>>>>>>>             fprintf(stderr, "Usage: invoke test_java_overflow\n");
>>>>>>>>>             fprintf(stderr, "       invoke test_native_overflow\n");
>>>>>>>>>             exit(1);
>>>>>>>>>         }
>>>>>>>>       test/runtime/StackGuardPages/testme.sh
>>>>>>>>         No comments.
>>>>>>>>
>>>>>>>> test/runtime/6929067/T.java
>>>>>>>> test/runtime/6929067/Test6929067.sh
>>>>>>>> test/runtime/6929067/invoke.c
>>>>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>>>>         No comments on the deleted files.
>>>>>>>>
>>>>>>>>      Dan
>>>>>>>>
>>>>>>>>> -Dmitry
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> On 2014-05-13 01:51, Daniel D. Daugherty wrote:
>>>>>>>>>> On 5/1/14 8:21 AM, Dmitry Samersoff wrote:
>>>>>>>>>>> Hi Everyone,
>>>>>>>>>>>
>>>>>>>>>>> Please, review test changes:
>>>>>>>>>>>
>>>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.01/
>>>>>>>>>> test/runtime/StackGuardPages/testme.sh
>>>>>>>>>>         You need to check the command status after running gcc_cmd
>>>>>>>>>>         and if it is not zero, then gratuitously pass the test.
>>>>>>>>>>         Jerry made a similar test fix last week:
>>>>>>>>>>
>>>>>>>>>> +if [ $? -ne 0 ] ; then
>>>>>>>>>> +    echo "Compile failed, Ignoring failed compilation and
>>>>>>>>>> forcing the
>>>>>>>>>> test to pass"
>>>>>>>>>> +    exit 0
>>>>>>>>>> +fi
>>>>>>>>>>
>>>>>>>>>>         This will catch the case where a 64-bit Linux test machine
>>>>>>>>>>         has a compiler, but not the 32-bit build env.
>>>>>>>>>>
>>>>>>>>>> test/runtime/StackGuardPages/DoOverflow.java
>>>>>>>>>>         No comments.
>>>>>>>>>>
>>>>>>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>>>>>>         line 28:  * than detaching from vm thread and overflow
>>>>>>>>>> stack once
>>>>>>>>>> again.
>>>>>>>>>>             Typo?: 'than detaching' -> 'Then we detach'
>>>>>>>>>>
>>>>>>>>>>         line 92: (*_jvm)->AttachCurrentThread(_jvm, (void**)&env,
>>>>>>>>>> NULL);
>>>>>>>>>>         line 125: (*_jvm)->AttachCurrentThread(_jvm, (void **)&env,
>>>>>>>>>> NULL);
>>>>>>>>>>             What happens if the JNI AttachCurrentThread() call
>>>>>>>>>> fails?
>>>>>>>>>>
>>>>>>>>>>         line 96: printf("Test ERROR. Can't load class
>>>>>>>>>> DoOverflow\n");
>>>>>>>>>>         line 102: printf("Test ERROR. Can't find method
>>>>>>>>>> DoOverflow.printIt\n");
>>>>>>>>>>         line 129: printf("Test ERROR. Can't load class
>>>>>>>>>> DoOverflow\n");
>>>>>>>>>>         line 135: printf("Test ERROR. Can't find method
>>>>>>>>>> DoOverflow.printAlive\n");
>>>>>>>>>>         line 192: printf("Test ERROR. Can't create JavaVM\n");
>>>>>>>>>>         line 224: printf("Test ERROR. Unknown parameter should be
>>>>>>>>>> test_java_overflow or test_native_overflow\n");
>>>>>>>>>>             Don't errors usually get printed on stderr?
>>>>>>>>>>
>>>>>>>>>>         line 106: (*env)->CallStaticVoidMethod(env, class_id,
>>>>>>>>>> method_id, NULL);
>>>>>>>>>>             What happens if this call fails? Other than the
>>>>>>>>>> intentional
>>>>>>>>>>             StackOverflowError, but wait, you catch and swallow
>>>>>>>>>> that
>>>>>>>>>> error...
>>>>>>>>>>
>>>>>>>>>>         line 108: (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>>>>         line 150   (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>>>>             What happens if this call fails?
>>>>>>>>>>
>>>>>>>>>>         line 139: (*env)->CallStaticVoidMethod (env, class_id,
>>>>>>>>>> method_id,
>>>>>>>>>> NULL);
>>>>>>>>>>             What happens if this call fails?
>>>>>>>>>>
>>>>>>>>>>         line 153: // For non-initial thread we doesn't unmap
>>>>>>>>>>             Typo: "we doesn't unmap"
>>>>>>>>>>                -> "we don't unmap"
>>>>>>>>>>
>>>>>>>>>>         line 154: ...same SEGV_ACCERR tring to access it
>>>>>>>>>>             Typos, perhaps: same SEGV signal trying to access it
>>>>>>>>>>
>>>>>>>>>>         line 156: // We have no ways to check this, so bail out,
>>>>>>>>>> marking
>>>>>>>>>> test as succeded
>>>>>>>>>>             Typo: 'no ways' -> 'no way'
>>>>>>>>>>             Typo: 'as succeded' -> 'as succeeded'
>>>>>>>>>>
>>>>>>>>>>         line 157: printf("Test OK. ...
>>>>>>>>>>         line 171: printf("Test OK. ...
>>>>>>>>>>             Should "OK" be "PASSED"?
>>>>>>>>>>
>>>>>>>>>>         line 168: printf("Test FAILED. Stack guard page is still
>>>>>>>>>> there\n");
>>>>>>>>>>             Don't failures usually get printed on stderr?
>>>>>>>>>>
>>>>>>>>>>         line 225: You'll end up here if no arguments are passed or
>>>>>>>>>> if the
>>>>>>>>>>             wrong arguments are passed. You should have a better
>>>>>>>>>> usage
>>>>>>>>>>             message, e.g.:
>>>>>>>>>>
>>>>>>>>>>         void usage() {
>>>>>>>>>>             fprintf(stderr, "Usage: invoke test_java_overflow\n");
>>>>>>>>>>             fprintf(stderr, "       invoke
>>>>>>>>>> test_native_overflow\n");
>>>>>>>>>>             exit(1);
>>>>>>>>>>         }
>>>>>>>>>>
>>>>>>>>>>         I noticed that all the exit() calls pass '7'. Why '7'?
>>>>>>>>>>
>>>>>>>>>>         Also, why 'return 0' or 'return 1' instead of 'exit(0)' or
>>>>>>>>>> 'exit(1)'?
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> test/runtime/6929067/T.java
>>>>>>>>>> test/runtime/6929067/Test6929067.sh
>>>>>>>>>> test/runtime/6929067/invoke.c
>>>>>>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>>>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>>>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>>>>>>         OK I can see why you want to combine the two tests.
>>>>>>>>>>         I'm guessing that 'InitialThreadOverflow' was
>>>>>>>>>>         derived from the much older '6929067'...
>>>>>>>>>>
>>>>>>>>>>
>>>>>>>>>> Dan
>>>>>>>>>>
>>>>>>>>>>> The fix combine tests for 6929067 and 8009062 into single, more
>>>>>>>>>>> sophisticated test.
>>>>>>>>>>>
>>>>>>>>>>> -Dmitry
>>>>>>>>>>>
>