RFR(S): JDK-8042155 Tests for stack guard pages have to be cleaned up

Dmitry Samersoff dmitry.samersoff at oracle.com
Tue May 20 14:16:48 UTC 2014


Coleen,

> Is this guaranteed?  How many recursions of do_overflow will this
> take?   I still prefer my suggestion of limiting the recursions in the
> second case of do_overfflow so the test doesn't rely on some
> environmental setting that may not be true.

It's guaranteed. We always have libjvm mapped after stack of initial thread.

Typically we crashes after 136000 recursive calls and it takes 1.071 sec.

But it's worth to limit recursion in second case. I'll do it.

-Dmitry


On 2014-05-20 17:54, Coleen Phillimore wrote:
> 
> Dmitry, some comments below.
> 
> On 5/20/14, 8:04 AM, Dmitry Samersoff wrote:
>> Coleen,
>>
>> Thank you for the review!
>>
>> Please see below.
>>
>> On 2014-05-20 06:19, Coleen Phillimore wrote:
>>> I have reviewed the test.  I'm happy that you are fixing this test
>>> because it's been really problematic, but I have a couple of concerns.
>>> First the java program overflow doesn't allocate anything on the stack
>>> so when it gets compiled it might take a really long time to run.  Can
>>> you pass -Xint in invoke.c in the option string?
>> Sure. Will do.
>>
>>> Second, the second do_overflow() in run_native_overflow must hit some
>>> sort of protected page causing it to terminate, why isn't it
>>> SIG_ACCERR?   Maybe the second do_overflow should only recurse the same
>>> number of times as the first + some small number so it doesn't keep
>>> running up the stack (until ???)
>> Both first and second call to do_overflow can:
>>
>> 1. *hit a guard page* - we get SIGSEGV with SIG_ACCERR. (OK in first
>> case, ERR in second)
>>
>> 2. *hit non-mapped page* - we get a SIGSEGV with code other than
>> SIG_ACCERR. (ERR in first case, OK in second)
> 
> Is this guaranteed?  How many recursions of do_overflow will this
> take?   I still prefer my suggestion of limiting the recursions in the
> second case of do_overfflow so the test doesn't rely on some
> environmental setting that may not be true.
>>
>> There is a case that is not covered by this test explicitly - if test
>> guard page was never been installed and stack starts immediately after
>> non-stack page it's possible that native do_verflow overwrite whole
>> non-stack page and hit another protected page and we get SIGSEGV with
>> SIG_ACCERR.
>>
>> Most probably, hotspot would crash in this case.
>>
>> Let me know if you like me to check it explicitly. We can read
>> /proc/self/map and check what page cause SIGSEGV.
> 
> This seems more complicated and more likely to be affected by the
> environment, so I think you should not do this for now.
> 
> thanks,
> Coleen
> 
>>
>> -Dmitry
>>
>>
>>> Thanks,
>>> Coleen
>>>
>>> On 5/14/14, 5:42 AM, Dmitry Samersoff wrote:
>>>> Dan,
>>>>
>>>> Fixed (in-place, press shift-reload)
>>>>
>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>
>>>>>       One thing I did notice is that some of the error messages are
>>>>>       identical which would make it more difficult for someone
>>>>>       investigating a future failure to know which failure happened:
>>>> It shouldn't be an issue because a) test prints banner to stdout with
>>>> the name of the function it's running b) test terminates after first
>>>> error
>>>>
>>>> -Dmitry
>>>>
>>>>
>>>> On 2014-05-13 21:12, Daniel D. Daugherty wrote:
>>>>> On 5/13/14 4:18 AM, Dmitry Samersoff wrote:
>>>>>> Dan,
>>>>>>
>>>>>> Thank you for the review!
>>>>>>
>>>>>> 1. I use exit(7) to make the situation when the test fails because of
>>>>>> some error condition clear visible.
>>>>>>
>>>>>> 2. I'm not sure it's a good idea to pass test if compilation
>>>>>> fails, but
>>>>>> I make this changes to handle it the same way as other tests does
>>>>>>
>>>>>> 3. Agree with rest of comments. See new webrev.
>>>>>>
>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.02/
>>>>>      test/runtime/StackGuardPages/DoOverflow.java
>>>>>       No comments.
>>>>>
>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>       I'm good with most of the changes.
>>>>>
>>>>>       One thing I did notice is that some of the error messages are
>>>>>       identical which would make it more difficult for someone
>>>>>       investigating a future failure to know which failure happened:
>>>>>
>>>>>       102     fprintf(stderr, "Test ERROR. Can't load class
>>>>> DoOverflow\n");
>>>>>      144     printf("Test ERROR. Can't load class DoOverflow\n");
>>>>>           You can make them unique by including the function
>>>>>           name that generated them
>>>>>
>>>>>       I just noticed that some error messages still go to stdout,
>>>>>       e.g., line 144 above.
>>>>>         Looks like this comment from the first round was also missed
>>>>>       (the line number is now 243):
>>>>>
>>>>>>       line 225: You'll end up here if no arguments are passed or
>>>>>> if the
>>>>>>           wrong arguments are passed. You should have a better usage
>>>>>>           message, e.g.:
>>>>>>
>>>>>>       void usage() {
>>>>>>           fprintf(stderr, "Usage: invoke test_java_overflow\n");
>>>>>>           fprintf(stderr, "       invoke test_native_overflow\n");
>>>>>>           exit(1);
>>>>>>       }
>>>>>     test/runtime/StackGuardPages/testme.sh
>>>>>       No comments.
>>>>>
>>>>> test/runtime/6929067/T.java
>>>>> test/runtime/6929067/Test6929067.sh
>>>>> test/runtime/6929067/invoke.c
>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>       No comments on the deleted files.
>>>>>
>>>>>    Dan
>>>>>
>>>>>> -Dmitry
>>>>>>
>>>>>>
>>>>>> On 2014-05-13 01:51, Daniel D. Daugherty wrote:
>>>>>>> On 5/1/14 8:21 AM, Dmitry Samersoff wrote:
>>>>>>>> Hi Everyone,
>>>>>>>>
>>>>>>>> Please, review test changes:
>>>>>>>>
>>>>>>>> http://cr.openjdk.java.net/~dsamersoff/JDK-8042155/webrev.01/
>>>>>>> test/runtime/StackGuardPages/testme.sh
>>>>>>>       You need to check the command status after running gcc_cmd
>>>>>>>       and if it is not zero, then gratuitously pass the test.
>>>>>>>       Jerry made a similar test fix last week:
>>>>>>>
>>>>>>> +if [ $? -ne 0 ] ; then
>>>>>>> +    echo "Compile failed, Ignoring failed compilation and
>>>>>>> forcing the
>>>>>>> test to pass"
>>>>>>> +    exit 0
>>>>>>> +fi
>>>>>>>
>>>>>>>       This will catch the case where a 64-bit Linux test machine
>>>>>>>       has a compiler, but not the 32-bit build env.
>>>>>>>
>>>>>>> test/runtime/StackGuardPages/DoOverflow.java
>>>>>>>       No comments.
>>>>>>>
>>>>>>> test/runtime/StackGuardPages/invoke.c
>>>>>>>       line 28:  * than detaching from vm thread and overflow
>>>>>>> stack once
>>>>>>> again.
>>>>>>>           Typo?: 'than detaching' -> 'Then we detach'
>>>>>>>
>>>>>>>       line 92: (*_jvm)->AttachCurrentThread(_jvm, (void**)&env,
>>>>>>> NULL);
>>>>>>>       line 125: (*_jvm)->AttachCurrentThread(_jvm, (void **)&env,
>>>>>>> NULL);
>>>>>>>           What happens if the JNI AttachCurrentThread() call fails?
>>>>>>>
>>>>>>>       line 96: printf("Test ERROR. Can't load class DoOverflow\n");
>>>>>>>       line 102: printf("Test ERROR. Can't find method
>>>>>>> DoOverflow.printIt\n");
>>>>>>>       line 129: printf("Test ERROR. Can't load class DoOverflow\n");
>>>>>>>       line 135: printf("Test ERROR. Can't find method
>>>>>>> DoOverflow.printAlive\n");
>>>>>>>       line 192: printf("Test ERROR. Can't create JavaVM\n");
>>>>>>>       line 224: printf("Test ERROR. Unknown parameter should be
>>>>>>> test_java_overflow or test_native_overflow\n");
>>>>>>>           Don't errors usually get printed on stderr?
>>>>>>>
>>>>>>>       line 106: (*env)->CallStaticVoidMethod(env, class_id,
>>>>>>> method_id, NULL);
>>>>>>>           What happens if this call fails? Other than the
>>>>>>> intentional
>>>>>>>           StackOverflowError, but wait, you catch and swallow that
>>>>>>> error...
>>>>>>>
>>>>>>>       line 108: (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>       line 150   (*_jvm)->DetachCurrentThread(_jvm);
>>>>>>>           What happens if this call fails?
>>>>>>>
>>>>>>>       line 139: (*env)->CallStaticVoidMethod (env, class_id,
>>>>>>> method_id,
>>>>>>> NULL);
>>>>>>>           What happens if this call fails?
>>>>>>>
>>>>>>>       line 153: // For non-initial thread we doesn't unmap
>>>>>>>           Typo: "we doesn't unmap"
>>>>>>>              -> "we don't unmap"
>>>>>>>
>>>>>>>       line 154: ...same SEGV_ACCERR tring to access it
>>>>>>>           Typos, perhaps: same SEGV signal trying to access it
>>>>>>>
>>>>>>>       line 156: // We have no ways to check this, so bail out,
>>>>>>> marking
>>>>>>> test as succeded
>>>>>>>           Typo: 'no ways' -> 'no way'
>>>>>>>           Typo: 'as succeded' -> 'as succeeded'
>>>>>>>
>>>>>>>       line 157: printf("Test OK. ...
>>>>>>>       line 171: printf("Test OK. ...
>>>>>>>           Should "OK" be "PASSED"?
>>>>>>>
>>>>>>>       line 168: printf("Test FAILED. Stack guard page is still
>>>>>>> there\n");
>>>>>>>           Don't failures usually get printed on stderr?
>>>>>>>
>>>>>>>       line 225: You'll end up here if no arguments are passed or
>>>>>>> if the
>>>>>>>           wrong arguments are passed. You should have a better usage
>>>>>>>           message, e.g.:
>>>>>>>
>>>>>>>       void usage() {
>>>>>>>           fprintf(stderr, "Usage: invoke test_java_overflow\n");
>>>>>>>           fprintf(stderr, "       invoke test_native_overflow\n");
>>>>>>>           exit(1);
>>>>>>>       }
>>>>>>>
>>>>>>>       I noticed that all the exit() calls pass '7'. Why '7'?
>>>>>>>
>>>>>>>       Also, why 'return 0' or 'return 1' instead of 'exit(0)' or
>>>>>>> 'exit(1)'?
>>>>>>>
>>>>>>>
>>>>>>> test/runtime/6929067/T.java
>>>>>>> test/runtime/6929067/Test6929067.sh
>>>>>>> test/runtime/6929067/invoke.c
>>>>>>> test/runtime/InitialThreadOverflow/DoOverflow.java
>>>>>>> test/runtime/InitialThreadOverflow/invoke.c
>>>>>>> test/runtime/InitialThreadOverflow/testme.sh
>>>>>>>       OK I can see why you want to combine the two tests.
>>>>>>>       I'm guessing that 'InitialThreadOverflow' was
>>>>>>>       derived from the much older '6929067'...
>>>>>>>
>>>>>>>
>>>>>>> Dan
>>>>>>>
>>>>>>>> The fix combine tests for 6929067 and 8009062 into single, more
>>>>>>>> sophisticated test.
>>>>>>>>
>>>>>>>> -Dmitry
>>>>>>>>
>>
> 


-- 
Dmitry Samersoff
Oracle Java development team, Saint Petersburg, Russia
* I would love to change the world, but they won't give me the sources.


More information about the hotspot-runtime-dev mailing list