RFR(S): JDK-8042155 Tests for stack guard pages have to be cleaned up

Coleen Phillimore coleen.phillimore at oracle.com
Tue May 20 14:59:59 UTC 2014


On 5/20/14, 10:16 AM, Dmitry Samersoff wrote:
> Coleen,
>
>> Is this guaranteed?  How many recursions of do_overflow will this
>> take?   I still prefer my suggestion of limiting the recursions in the
>> second case of do_overfflow so the test doesn't rely on some
>> environmental setting that may not be true.
> It's guaranteed. We always have libjvm mapped after stack of initial thread.

We do?   Really?
>
> Typically we crashes after 136000 recursive calls and it takes 1.071 sec.
>
> But it's worth to limit recursion in second case. I'll do it.

Yes, please do this!

Coleen
>
> -Dmitry
>
>
> On 2014-05-20 17:54, Coleen Phillimore wrote:
>> Dmitry, some comments below.
>>
>> On 5/20/14, 8:04 AM, Dmitry Samersoff wrote:
>>> Coleen,
>>>
>>> Thank you for the review!
>>>
>>> Please see below.
>>>
>>> On 2014-05-20 06:19, Coleen Phillimore wrote:
>>>> I have reviewed the test.  I'm happy that you are fixing this test
>>>> because it's been really problematic, but I have a couple of concerns.
>>>> First the java program overflow doesn't allocate anything on the stack
>>>> so when it gets compiled it might take a really long time to run.  Can
>>>> you pass -Xint in invoke.c in the option string?
>>> Sure. Will do.
>>>
>>>> Second, the second do_overflow() in run_native_overflow must hit some
>>>> sort of protected page causing it to terminate, why isn't it
>>>> SIG_ACCERR?   Maybe the second do_overflow should only recurse the same
>>>> number of times as the first + some small number so it doesn't keep
>>>> running up the stack (until ???)
>>> Both first and second call to do_overflow can:
>>>
>>> 1. *hit a guard page* - we get SIGSEGV with SIG_ACCERR. (OK in first
>>> case, ERR in second)
>>>
>>> 2. *hit non-mapped page* - we get a SIGSEGV with code other than
>>> SIG_ACCERR. (ERR in first case, OK in second)
>> Is this guaranteed?  How many recursions of do_overflow will this
>> take?   I still prefer my suggestion of limiting the recursions in the
>> second case of do_overfflow so the test doesn't rely on some
>> environmental setting that may not be true.
>>> There is a case that is not covered by this test explicitly - if test
>>> guard page was never been installed and stack starts immediately after
>>> non-stack page it's possible that native do_verflow overwrite whole
>>> non-stack page and hit another protected page and we get SIGSEGV with
>>> SIG_ACCERR.
>>>
>>> Most probably, hotspot would crash in this case.
>>>
>>> Let me know if you like me to check it explicitly. We can read
>>> /proc/self/map and check what page cause SIGSEGV.
>> This seems more complicated and more likely to be affected by the
>> environment, so I think you should not do this for now.
>>
>> thanks,
>> Coleen
>>
>>> -Dmitry
>>>
>>>
>>>> Thanks,
>>>> Coleen
>>>>
>>>> On 5/14/14, 5:42 AM, Dmitry Samersoff wrote:
>>>>> Dan,
>>>>>
>>>>> Fixed (in-place, press shift-reload)
>>>>>
>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>>
>>>>>>        One thing I did notice is that some of the error messages are
>>>>>>        identical which would make it more difficult for someone
>>>>>>        investigating a future failure to know which failure happened:
>>>>> It shouldn't be an issue because a) test prints banner to stdout with
>>>>> the name of the function it's running b) test terminates after first
>>>>> error
>>>>>
>>>>> -Dmitry
>>>>>
>>>>>
>>>>> On 2014-05-13 21:12, Daniel D. Daugherty wrote:
>>>>>> On 5/13/14 4:18 AM, Dmitry Samersoff wrote:
>>>>>>> Dan,
>>>>>>>
>>>>>>> Thank you for the review!
>>>>>>>
>>>>>>> 1. I use exit(7) to make the situation when the test fails because of
>>>>>>> some error condition clear visible.
>>>>>>>
>>>>>>> 2. I'm not sure it's a good idea to pass test if compilation
>>>>>>> fails, but
>>>>>>> I make this changes to handle it the same way as other tests does
>>>>>>>
>>>>>>> 3. Agree with rest of comments. See new webrev.
>>>>>>>
>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>>>       test/runtime/StackGuardPages/DoOverflow.java
>>>>>>        No comments.
>>>>>>
>>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>>        I'm good with most of the changes.
>>>>>>
>>>>>>        One thing I did notice is that some of the error messages are
>>>>>>        identical which would make it more difficult for someone
>>>>>>        investigating a future failure to know which failure happened:
>>>>>>
>>>>>>        102     fprintf(stderr, "Test ERROR. Can't load class
>>>>>> DoOverflow\n");
>>>>>>       144     printf("Test ERROR. Can't load class DoOverflow\n");
>>>>>>            You can make them unique by including the function
>>>>>>            name that generated them
>>>>>>
>>>>>>        I just noticed that some error messages still go to stdout,
>>>>>>        e.g., line 144 above.
>>>>>>          Looks like this comment from the first round was also missed
>>>>>>        (the line number is now 243):
>>>>>>
>>>>>>>        line 225: You'll end up here if no arguments are passed or
>>>>>>> if the
>>>>>>>            wrong arguments are passed. You should have a better usage
>>>>>>>            message, e.g.:
>>>>>>>
>>>>>>>        void usage() {
>>>>>>>            fprintf(stderr, "Usage: invoke test_java_overflow\n");
>>>>>>>            fprintf(stderr, "       invoke test_native_overflow\n");
>>>>>>>            exit(1);
>>>>>>>        }
>>>>>>      test/runtime/StackGuardPages/testme.sh
>>>>>>        No comments.
>>>>>>
>>>>>> test/runtime/6929067/T.java
>>>>>> test/runtime/6929067/Test6929067.sh
>>>>>> test/runtime/6929067/invoke.c
>>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>>        No comments on the deleted files.
>>>>>>
>>>>>>     Dan
>>>>>>
>>>>>>> -Dmitry
>>>>>>>
>>>>>>>
>>>>>>> On 2014-05-13 01:51, Daniel D. Daugherty wrote:
>>>>>>>> On 5/1/14 8:21 AM, Dmitry Samersoff wrote:
>>>>>>>>> Hi Everyone,
>>>>>>>>>
>>>>>>>>> Please, review test changes:
>>>>>>>>>
>>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.01/
>>>>>>>> test/runtime/StackGuardPages/testme.sh
>>>>>>>>        You need to check the command status after running gcc_cmd
>>>>>>>>        and if it is not zero, then gratuitously pass the test.
>>>>>>>>        Jerry made a similar test fix last week:
>>>>>>>>
>>>>>>>> +if [ $? -ne 0 ] ; then
>>>>>>>> +    echo "Compile failed, Ignoring failed compilation and
>>>>>>>> forcing the
>>>>>>>> test to pass"
>>>>>>>> +    exit 0
>>>>>>>> +fi
>>>>>>>>
>>>>>>>>        This will catch the case where a 64-bit Linux test machine
>>>>>>>>        has a compiler, but not the 32-bit build env.
>>>>>>>>
>>>>>>>> test/runtime/StackGuardPages/DoOverflow.java
>>>>>>>>        No comments.
>>>>>>>>
>>>>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>>>>        line 28:  * than detaching from vm thread and overflow
>>>>>>>> stack once
>>>>>>>> again.
>>>>>>>>            Typo?: 'than detaching' -> 'Then we detach'
>>>>>>>>
>>>>>>>>        line 92: (*_jvm)->AttachCurrentThread(_jvm, (void**)&env,
>>>>>>>> NULL);
>>>>>>>>        line 125: (*_jvm)->AttachCurrentThread(_jvm, (void **)&env,
>>>>>>>> NULL);
>>>>>>>>            What happens if the JNI AttachCurrentThread() call fails?
>>>>>>>>
>>>>>>>>        line 96: printf("Test ERROR. Can't load class DoOverflow\n");
>>>>>>>>        line 102: printf("Test ERROR. Can't find method
>>>>>>>> DoOverflow.printIt\n");
>>>>>>>>        line 129: printf("Test ERROR. Can't load class DoOverflow\n");
>>>>>>>>        line 135: printf("Test ERROR. Can't find method
>>>>>>>> DoOverflow.printAlive\n");
>>>>>>>>        line 192: printf("Test ERROR. Can't create JavaVM\n");
>>>>>>>>        line 224: printf("Test ERROR. Unknown parameter should be
>>>>>>>> test_java_overflow or test_native_overflow\n");
>>>>>>>>            Don't errors usually get printed on stderr?
>>>>>>>>
>>>>>>>>        line 106: (*env)->CallStaticVoidMethod(env, class_id,
>>>>>>>> method_id, NULL);
>>>>>>>>            What happens if this call fails? Other than the
>>>>>>>> intentional
>>>>>>>>            StackOverflowError, but wait, you catch and swallow that
>>>>>>>> error...
>>>>>>>>
>>>>>>>>        line 108: (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>>        line 150   (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>>            What happens if this call fails?
>>>>>>>>
>>>>>>>>        line 139: (*env)->CallStaticVoidMethod (env, class_id,
>>>>>>>> method_id,
>>>>>>>> NULL);
>>>>>>>>            What happens if this call fails?
>>>>>>>>
>>>>>>>>        line 153: // For non-initial thread we doesn't unmap
>>>>>>>>            Typo: "we doesn't unmap"
>>>>>>>>               -> "we don't unmap"
>>>>>>>>
>>>>>>>>        line 154: ...same SEGV_ACCERR tring to access it
>>>>>>>>            Typos, perhaps: same SEGV signal trying to access it
>>>>>>>>
>>>>>>>>        line 156: // We have no ways to check this, so bail out,
>>>>>>>> marking
>>>>>>>> test as succeded
>>>>>>>>            Typo: 'no ways' -> 'no way'
>>>>>>>>            Typo: 'as succeded' -> 'as succeeded'
>>>>>>>>
>>>>>>>>        line 157: printf("Test OK. ...
>>>>>>>>        line 171: printf("Test OK. ...
>>>>>>>>            Should "OK" be "PASSED"?
>>>>>>>>
>>>>>>>>        line 168: printf("Test FAILED. Stack guard page is still
>>>>>>>> there\n");
>>>>>>>>            Don't failures usually get printed on stderr?
>>>>>>>>
>>>>>>>>        line 225: You'll end up here if no arguments are passed or
>>>>>>>> if the
>>>>>>>>            wrong arguments are passed. You should have a better usage
>>>>>>>>            message, e.g.:
>>>>>>>>
>>>>>>>>        void usage() {
>>>>>>>>            fprintf(stderr, "Usage: invoke test_java_overflow\n");
>>>>>>>>            fprintf(stderr, "       invoke test_native_overflow\n");
>>>>>>>>            exit(1);
>>>>>>>>        }
>>>>>>>>
>>>>>>>>        I noticed that all the exit() calls pass '7'. Why '7'?
>>>>>>>>
>>>>>>>>        Also, why 'return 0' or 'return 1' instead of 'exit(0)' or
>>>>>>>> 'exit(1)'?
>>>>>>>>
>>>>>>>>
>>>>>>>> test/runtime/6929067/T.java
>>>>>>>> test/runtime/6929067/Test6929067.sh
>>>>>>>> test/runtime/6929067/invoke.c
>>>>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>>>>        OK I can see why you want to combine the two tests.
>>>>>>>>        I'm guessing that 'InitialThreadOverflow' was
>>>>>>>>        derived from the much older '6929067'...
>>>>>>>>
>>>>>>>>
>>>>>>>> Dan
>>>>>>>>
>>>>>>>>> The fix combine tests for 6929067 and 8009062 into single, more
>>>>>>>>> sophisticated test.
>>>>>>>>>
>>>>>>>>> -Dmitry
>>>>>>>>>
>



More information about the hotspot-runtime-dev mailing list