RFR(S): JDK-8042155 Tests for stack guard pages have to be cleaned up

Dmitry Samersoff dmitry.samersoff at oracle.com
Wed May 21 12:09:55 UTC 2014


Coleen,

Please, see updated webrev:

http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.03/

only invoke.c is changed (added -Xint and limited recursion depth for
second run).

>> It's guaranteed. We always have libjvm mapped after stack of initial
>> thread.
> We do?   Really?

Sure! Memory layout *of this test* is always the same. Unless we change
JNI_CreateJavaVM() implementation drastically.

-Dmitry



On 2014-05-20 18:59, Coleen Phillimore wrote:
> 
> On 5/20/14, 10:16 AM, Dmitry Samersoff wrote:
>> Coleen,
>>
>>> Is this guaranteed?  How many recursions of do_overflow will this
>>> take?   I still prefer my suggestion of limiting the recursions in the
>>> second case of do_overfflow so the test doesn't rely on some
>>> environmental setting that may not be true.
>> It's guaranteed. We always have libjvm mapped after stack of initial
>> thread.
> 
> We do?   Really?
>>
>> Typically we crashes after 136000 recursive calls and it takes 1.071 sec.
>>
>> But it's worth to limit recursion in second case. I'll do it.
> 
> Yes, please do this!
> 
> Coleen
>>
>> -Dmitry
>>
>>
>> On 2014-05-20 17:54, Coleen Phillimore wrote:
>>> Dmitry, some comments below.
>>>
>>> On 5/20/14, 8:04 AM, Dmitry Samersoff wrote:
>>>> Coleen,
>>>>
>>>> Thank you for the review!
>>>>
>>>> Please see below.
>>>>
>>>> On 2014-05-20 06:19, Coleen Phillimore wrote:
>>>>> I have reviewed the test.  I'm happy that you are fixing this test
>>>>> because it's been really problematic, but I have a couple of concerns.
>>>>> First the java program overflow doesn't allocate anything on the stack
>>>>> so when it gets compiled it might take a really long time to run.  Can
>>>>> you pass -Xint in invoke.c in the option string?
>>>> Sure. Will do.
>>>>
>>>>> Second, the second do_overflow() in run_native_overflow must hit some
>>>>> sort of protected page causing it to terminate, why isn't it
>>>>> SIG_ACCERR?   Maybe the second do_overflow should only recurse the
>>>>> same
>>>>> number of times as the first + some small number so it doesn't keep
>>>>> running up the stack (until ???)
>>>> Both first and second call to do_overflow can:
>>>>
>>>> 1. *hit a guard page* - we get SIGSEGV with SIG_ACCERR. (OK in first
>>>> case, ERR in second)
>>>>
>>>> 2. *hit non-mapped page* - we get a SIGSEGV with code other than
>>>> SIG_ACCERR. (ERR in first case, OK in second)
>>> Is this guaranteed?  How many recursions of do_overflow will this
>>> take?   I still prefer my suggestion of limiting the recursions in the
>>> second case of do_overfflow so the test doesn't rely on some
>>> environmental setting that may not be true.
>>>> There is a case that is not covered by this test explicitly - if test
>>>> guard page was never been installed and stack starts immediately after
>>>> non-stack page it's possible that native do_verflow overwrite whole
>>>> non-stack page and hit another protected page and we get SIGSEGV with
>>>> SIG_ACCERR.
>>>>
>>>> Most probably, hotspot would crash in this case.
>>>>
>>>> Let me know if you like me to check it explicitly. We can read
>>>> /proc/self/map and check what page cause SIGSEGV.
>>> This seems more complicated and more likely to be affected by the
>>> environment, so I think you should not do this for now.
>>>
>>> thanks,
>>> Coleen
>>>
>>>> -Dmitry
>>>>
>>>>
>>>>> Thanks,
>>>>> Coleen
>>>>>
>>>>> On 5/14/14, 5:42 AM, Dmitry Samersoff wrote:
>>>>>> Dan,
>>>>>>
>>>>>> Fixed (in-place, press shift-reload)
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>>>
>>>>>>>        One thing I did notice is that some of the error messages are
>>>>>>>        identical which would make it more difficult for someone
>>>>>>>        investigating a future failure to know which failure
>>>>>>> happened:
>>>>>> It shouldn't be an issue because a) test prints banner to stdout with
>>>>>> the name of the function it's running b) test terminates after first
>>>>>> error
>>>>>>
>>>>>> -Dmitry
>>>>>>
>>>>>>
>>>>>> On 2014-05-13 21:12, Daniel D. Daugherty wrote:
>>>>>>> On 5/13/14 4:18 AM, Dmitry Samersoff wrote:
>>>>>>>> Dan,
>>>>>>>>
>>>>>>>> Thank you for the review!
>>>>>>>>
>>>>>>>> 1. I use exit(7) to make the situation when the test fails
>>>>>>>> because of
>>>>>>>> some error condition clear visible.
>>>>>>>>
>>>>>>>> 2. I'm not sure it's a good idea to pass test if compilation
>>>>>>>> fails, but
>>>>>>>> I make this changes to handle it the same way as other tests does
>>>>>>>>
>>>>>>>> 3. Agree with rest of comments. See new webrev.
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>>>>       test/runtime/StackGuardPages/DoOverflow.java
>>>>>>>        No comments.
>>>>>>>
>>>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>>>        I'm good with most of the changes.
>>>>>>>
>>>>>>>        One thing I did notice is that some of the error messages are
>>>>>>>        identical which would make it more difficult for someone
>>>>>>>        investigating a future failure to know which failure
>>>>>>> happened:
>>>>>>>
>>>>>>>        102     fprintf(stderr, "Test ERROR. Can't load class
>>>>>>> DoOverflow\n");
>>>>>>>       144     printf("Test ERROR. Can't load class DoOverflow\n");
>>>>>>>            You can make them unique by including the function
>>>>>>>            name that generated them
>>>>>>>
>>>>>>>        I just noticed that some error messages still go to stdout,
>>>>>>>        e.g., line 144 above.
>>>>>>>          Looks like this comment from the first round was also
>>>>>>> missed
>>>>>>>        (the line number is now 243):
>>>>>>>
>>>>>>>>        line 225: You'll end up here if no arguments are passed or
>>>>>>>> if the
>>>>>>>>            wrong arguments are passed. You should have a better
>>>>>>>> usage
>>>>>>>>            message, e.g.:
>>>>>>>>
>>>>>>>>        void usage() {
>>>>>>>>            fprintf(stderr, "Usage: invoke test_java_overflow\n");
>>>>>>>>            fprintf(stderr, "       invoke test_native_overflow\n");
>>>>>>>>            exit(1);
>>>>>>>>        }
>>>>>>>      test/runtime/StackGuardPages/testme.sh
>>>>>>>        No comments.
>>>>>>>
>>>>>>> test/runtime/6929067/T.java
>>>>>>> test/runtime/6929067/Test6929067.sh
>>>>>>> test/runtime/6929067/invoke.c
>>>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>>>        No comments on the deleted files.
>>>>>>>
>>>>>>>     Dan
>>>>>>>
>>>>>>>> -Dmitry
>>>>>>>>
>>>>>>>>
>>>>>>>> On 2014-05-13 01:51, Daniel D. Daugherty wrote:
>>>>>>>>> On 5/1/14 8:21 AM, Dmitry Samersoff wrote:
>>>>>>>>>> Hi Everyone,
>>>>>>>>>>
>>>>>>>>>> Please, review test changes:
>>>>>>>>>>
>>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.01/
>>>>>>>>> test/runtime/StackGuardPages/testme.sh
>>>>>>>>>        You need to check the command status after running gcc_cmd
>>>>>>>>>        and if it is not zero, then gratuitously pass the test.
>>>>>>>>>        Jerry made a similar test fix last week:
>>>>>>>>>
>>>>>>>>> +if [ $? -ne 0 ] ; then
>>>>>>>>> +    echo "Compile failed, Ignoring failed compilation and
>>>>>>>>> forcing the
>>>>>>>>> test to pass"
>>>>>>>>> +    exit 0
>>>>>>>>> +fi
>>>>>>>>>
>>>>>>>>>        This will catch the case where a 64-bit Linux test machine
>>>>>>>>>        has a compiler, but not the 32-bit build env.
>>>>>>>>>
>>>>>>>>> test/runtime/StackGuardPages/DoOverflow.java
>>>>>>>>>        No comments.
>>>>>>>>>
>>>>>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>>>>>        line 28:  * than detaching from vm thread and overflow
>>>>>>>>> stack once
>>>>>>>>> again.
>>>>>>>>>            Typo?: 'than detaching' -> 'Then we detach'
>>>>>>>>>
>>>>>>>>>        line 92: (*_jvm)->AttachCurrentThread(_jvm, (void**)&env,
>>>>>>>>> NULL);
>>>>>>>>>        line 125: (*_jvm)->AttachCurrentThread(_jvm, (void **)&env,
>>>>>>>>> NULL);
>>>>>>>>>            What happens if the JNI AttachCurrentThread() call
>>>>>>>>> fails?
>>>>>>>>>
>>>>>>>>>        line 96: printf("Test ERROR. Can't load class
>>>>>>>>> DoOverflow\n");
>>>>>>>>>        line 102: printf("Test ERROR. Can't find method
>>>>>>>>> DoOverflow.printIt\n");
>>>>>>>>>        line 129: printf("Test ERROR. Can't load class
>>>>>>>>> DoOverflow\n");
>>>>>>>>>        line 135: printf("Test ERROR. Can't find method
>>>>>>>>> DoOverflow.printAlive\n");
>>>>>>>>>        line 192: printf("Test ERROR. Can't create JavaVM\n");
>>>>>>>>>        line 224: printf("Test ERROR. Unknown parameter should be
>>>>>>>>> test_java_overflow or test_native_overflow\n");
>>>>>>>>>            Don't errors usually get printed on stderr?
>>>>>>>>>
>>>>>>>>>        line 106: (*env)->CallStaticVoidMethod(env, class_id,
>>>>>>>>> method_id, NULL);
>>>>>>>>>            What happens if this call fails? Other than the
>>>>>>>>> intentional
>>>>>>>>>            StackOverflowError, but wait, you catch and swallow
>>>>>>>>> that
>>>>>>>>> error...
>>>>>>>>>
>>>>>>>>>        line 108: (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>>>        line 150   (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>>>            What happens if this call fails?
>>>>>>>>>
>>>>>>>>>        line 139: (*env)->CallStaticVoidMethod (env, class_id,
>>>>>>>>> method_id,
>>>>>>>>> NULL);
>>>>>>>>>            What happens if this call fails?
>>>>>>>>>
>>>>>>>>>        line 153: // For non-initial thread we doesn't unmap
>>>>>>>>>            Typo: "we doesn't unmap"
>>>>>>>>>               -> "we don't unmap"
>>>>>>>>>
>>>>>>>>>        line 154: ...same SEGV_ACCERR tring to access it
>>>>>>>>>            Typos, perhaps: same SEGV signal trying to access it
>>>>>>>>>
>>>>>>>>>        line 156: // We have no ways to check this, so bail out,
>>>>>>>>> marking
>>>>>>>>> test as succeded
>>>>>>>>>            Typo: 'no ways' -> 'no way'
>>>>>>>>>            Typo: 'as succeded' -> 'as succeeded'
>>>>>>>>>
>>>>>>>>>        line 157: printf("Test OK. ...
>>>>>>>>>        line 171: printf("Test OK. ...
>>>>>>>>>            Should "OK" be "PASSED"?
>>>>>>>>>
>>>>>>>>>        line 168: printf("Test FAILED. Stack guard page is still
>>>>>>>>> there\n");
>>>>>>>>>            Don't failures usually get printed on stderr?
>>>>>>>>>
>>>>>>>>>        line 225: You'll end up here if no arguments are passed or
>>>>>>>>> if the
>>>>>>>>>            wrong arguments are passed. You should have a better
>>>>>>>>> usage
>>>>>>>>>            message, e.g.:
>>>>>>>>>
>>>>>>>>>        void usage() {
>>>>>>>>>            fprintf(stderr, "Usage: invoke test_java_overflow\n");
>>>>>>>>>            fprintf(stderr, "       invoke
>>>>>>>>> test_native_overflow\n");
>>>>>>>>>            exit(1);
>>>>>>>>>        }
>>>>>>>>>
>>>>>>>>>        I noticed that all the exit() calls pass '7'. Why '7'?
>>>>>>>>>
>>>>>>>>>        Also, why 'return 0' or 'return 1' instead of 'exit(0)' or
>>>>>>>>> 'exit(1)'?
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> test/runtime/6929067/T.java
>>>>>>>>> test/runtime/6929067/Test6929067.sh
>>>>>>>>> test/runtime/6929067/invoke.c
>>>>>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>>>>>        OK I can see why you want to combine the two tests.
>>>>>>>>>        I'm guessing that 'InitialThreadOverflow' was
>>>>>>>>>        derived from the much older '6929067'...
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Dan
>>>>>>>>>
>>>>>>>>>> The fix combine tests for 6929067 and 8009062 into single, more
>>>>>>>>>> sophisticated test.
>>>>>>>>>>
>>>>>>>>>> -Dmitry
>>>>>>>>>>
>>
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.


More information about the hotspot-runtime-dev mailing list